Showing 337 to 340

Occam's Razor is not some fuzzy rule of thumb

Not everyone is raised to revere Occam's Razor. To someone who wasn't, the statement "it's the simplest explanation" isn't a knockdown argument for anything. Why couldn't a complex Non-Occam explanation be correct?

So it bears explaining.


Occam's Razor is not just some fuzzy rule of thumb. It has a formalism: minimum message length (MML)!

"The woman down the street is a witch; she did it"

The above sentence looks like a short and simple theory for whatever happened, but it's far from simple. Several of the words used, such as "witch", require a lot of explanation for an AI or alien who knows nothing about any of the words. The length of the resulting complete message, which contains all the data needed to interpret the sentence in addition to the sentence itself, is the true MML of your theory.

If you then represent the message as binary code, you can describe its complexity in terms of bits (a log2 number).
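As a rough illustration only (not a real MML computation), the sketch below counts the bits in a raw UTF-8 encoding, which is just a crude upper bound on the shortest possible message. The `definitions` text is a made-up placeholder standing in for everything a reader with no background would need, and `message_bits` is a name invented for this example.

```python
def message_bits(text: str) -> int:
    """Bits in the raw UTF-8 encoding of text (a crude upper bound on message length)."""
    return 8 * len(text.encode("utf-8"))


theory = "The woman down the street is a witch; she did it"

# Hypothetical placeholder for the background knowledge the sentence relies on.
definitions = "witch: a person who causes events by supernatural means; supernatural: ..."

bare_bits = message_bits(theory)
full_bits = message_bits(theory + "\n" + definitions)

# The completed message is always longer than the bare sentence.
print(bare_bits, full_bits)
```

The point is only the direction of the comparison: once the required definitions are counted, the "short" theory stops being short.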

(To be fair, though it doesn't make a difference for this page, here we handwave away an important detail: the "message" would actually be a computer program (Turing machine code, I think), as that is the shortest possible way—and a language-neutral way—to express a theory.)


Slightly longer messages must be taken by an ideal reasoner as exponentially less likely to match reality.

Even if this feels hard to justify on a given occasion, it's simply math: if you have the habit of believing messages just one bit longer than the shortest message available, you'll be wrong twice as often as otherwise. To say nothing of when the message is ten bits longer, where on average you must expect your first thousand (because 2^10 = 1024) theories of the same length to be proven false.
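To make the arithmetic concrete, here is a minimal sketch, assuming a prior in which every extra bit of message length halves the probability (a 2^-length weighting). The function name `relative_prior` is made up for this example.

```python
def relative_prior(extra_bits: int) -> float:
    """Prior weight of a message relative to one that is extra_bits shorter,
    assuming each additional bit halves the prior."""
    return 2.0 ** -extra_bits


for extra_bits in (1, 10, 20):
    weight = relative_prior(extra_bits)
    # 1 / weight is roughly how many equally long rivals share the same slice of prior.
    print(f"{extra_bits} extra bits: weight {weight}, about 1 in {round(1 / weight)}")
```

Under this assumption, ten extra bits already puts a theory at about 1 in 1024 relative to the shortest message.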

And though there's technically a way out here to save your pet theory, if you were motivated to argue it into a defensible position… it's not valid to hope along the lines of "there's still a chance, right?" for the longer message to happen by luck to describe reality more closely. No one can feel a probability that small, so it's more human-psychologically correct (in the sense of that famous essay by Asimov, "…wronger than both of them put together") to say that it's simply zero, i.e. to say that we actually know that the simpler explanation is correct (technically, just the most accurate by far – until someone thinks of a theory with an even shorter MML).

This is why physicists strive so hard to find simple theories – the simplicity is as good as proof it's correct!

(Why do physicists run any experiments at all then when they could just sit in an armchair crafting ever simpler theories? Excellent question! There's one constraint on your theory-making: you need the simplest theory that still fits all the facts at hand. Otherwise you could just propose a zero-length message as explanation for everything, right? If a theory fails to explain just one fact, it's already disproven and the answer has to be in a different theory, even if that one must be longer. They just discount anything that's longer than necessary. And run experiments to differentiate between theories of equal length.)

Once a simpler theory is found that fits, everyone acts like we know this theory is true, because… we essentially do know it.

The word "know", if it's to mean anything useful, is shorthand for a sufficiently high probability – large percentages like 99.9976%, the number of decimals passing beyond the realm where it's psychologically realistic to keep track of the probability as a mental entity at all. We throw it away, and that's the point where we say we "know" the attached proposition. For agents with unbounded computing power, though, the number would always remain.

As Dennis Lindley (1923–2013) said, our theories must always allow for the possibility, however tiny, that the moon is made of green cheese (Cromwell's Rule). Most people alive today would assign such a proposition about the moon too tiny a probability to bother keeping track of – in other words, they know perfectly well it's not made of green cheese! If this bothers you, the issue is that the word "know" is a bit of an abomination, a shorthand for a probability hugging up against 0% or 100% with many decimals. The word "know" serves a pragmatic purpose as such a shorthand, but the vast majority of people don't think of it that way; they just hear it as absolute, so be wary.

Anyway, just as you won't bother to do an experiment to check if the moon is made of green cheese, as it's so improbable as to be not worth your time, for the same reason you don't bother to test or even consider any other hypotheses with long MML – they're so improbable as to be not worth your time.

To nevertheless privilege a long-MML hypothesis and insist it be tested, you must likewise argue for checking whether the moon is cheese, and decillions of other improbable hypotheses, and then humanity has no time to do anything else.

But… is it so bad to privilege a hypothesis "just this once"? From www.greaterwrong.com/posts/X2AD2LgtKgkRNPj2a/privileging-the-hypothesis:

In the minds of human beings, if you can get them to think about this particular hypothesis rather than the trillion other possibilities that are no more complicated or unlikely, you really have done a huge chunk of the work of persuasion. Anything thought about is treated as “in the running,” and if other runners seem to fall behind in the race a little, it’s assumed that this runner is edging forward or even entering the lead.

What if you have special knowledge that implies it's worth testing? Well, that's allowed and totally OK! Science doesn't pick sides. But your knowledge has to have a large evidential weight to offset the long MML. Without such weight, we're back to the previous reasoning – it's overwhelmingly likely to just waste our time.
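A minimal sketch of what "large evidential weight" means here, assuming the same halving-per-bit prior as above and a plain Bayesian odds update. The function names and the example numbers are made up for illustration.

```python
import math


def posterior_odds(extra_bits: float, likelihood_ratio: float) -> float:
    """Odds of the longer hypothesis against the shorter one after the evidence.

    Assumes prior odds of 2**-extra_bits, then multiplies by the likelihood
    ratio (how much better the longer hypothesis predicts the evidence).
    """
    return (2.0 ** -extra_bits) * likelihood_ratio


def evidence_bits(likelihood_ratio: float) -> float:
    """Evidence expressed in bits, so it can be compared with extra message length."""
    return math.log2(likelihood_ratio)


# A hypothesis 10 bits longer needs evidence favoring it about 1024:1 just to break even.
print(posterior_odds(10, 1024), evidence_bits(1024))
# Evidence of only 8:1 (3 bits) leaves it far behind.
print(posterior_odds(10, 8), evidence_bits(8))
```

Put differently, special knowledge has to supply at least as many bits of evidence as the hypothesis costs in extra message length.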


If the explicit probability argument doesn't persuade you, how about track record?

Contrary to how it's often presented, the Copernican revolution, where we transitioned from a geocentric to a heliocentric model, wasn't straightforward! Read The Copernican Revolution From the Inside. In the beginning, the data fit the theory worse!

Yet people insisted on trying to make heliocentrism true.

Why? They liked its philosophical simplicity. And in the end, that bore fruit. That's why we're now so confident in Occam's Razor: when you find a simple theory, it tends to be worth insisting on it for a while, more than any other butterfly idea. If you don't have that policy, you may get stuck on theories that fit the facts better right now and miss out on the truth.

Science would have discovered almost nothing by now if the scientists weren't thinking about hypotheses according to Occam's Razor.

There are infinitely many possible explanations for any phenomenon, and every time you test one and it fails, you can rule out a large segment of the space of possible explanations similar to the one you just tested. Thus you quickly narrow down to the most correct explanations, which results in technology that works. That phone in your hand was crafted by the invisible hand of Occam.

Created (2 years ago)

Nullius in verba

The Royal Society in 1660 had the slogan nullius in verba – "Take nobody's word for it". We can see it as representing a fundamental shift in mindset that we call the Enlightenment. It used to be near-universal among human cultures to believe in some sort of Fall From Grace: everything was better before, and the most solid knowledge comes from authorities like the church or someone who lived earlier who wrote something, the older the better.

Mapmakers everywhere used to fill in the regions they didn't know well or didn't know anything about. Perhaps they just tried to hide the holes in their knowledge in order to sell, but I read in Sapiens: A Brief History of Humankind that it also reflected a different mindset: they acted as if knowledge couldn't progress, so what we had was as good as we were ever gonna have. Starting around this time, though, we see maps with blank areas clearly marked as unexplored, which invited curiosity.

Admitting what we didn't know led to the desire to find out.

But why was truth from established authority no longer satisfactory?

  • Sapiens: A Brief History of Humankind has an explanation for the post-1600 Europeans' odd (enterprising & exploratory) outlook.
    • Counterexample of the Chinese general who once took fleets as far as Madagascar but whose expeditions ended due to lack of interest from the throne. This is the norm for most societies. China didn't believe there was anything of interest far away. Thus they never discovered Polynesia or Australia.
    • Science was supported by empires with ample funds. Every Royal Navy ship brought a scientist or two just because, to document what they found. From the empire's perspective, it was also a way to buy legitimacy for colonialism, "white man's burden".
    • Dutch East India Company. Early stock market.
    • The tiny Netherlands defeated Spain because investors trusted NL finances. Spanish king unable to get loans, while NL could get all the loans they wanted. Early example of the fact that credit rating wins wars.

Created (2 years ago)

Terminal values ≠ instrumental values

  1. "I want my sister to live" – terminal value
  2. "I want to administer CPR on her" – instrumental value

If #1 disappears, you will not continue wanting to do #2 for itself.

Yet English uses the word "want" for both, which leads to confusion in discussions about AI.

Created (2 years ago)

Why do almost all species birth 50/50 males/females?

Take a thought experiment, where there's a species with a sex ratio of 10/90 for some reason – maybe some researchers edited the genes – that is, 10% of the time a sperm meets an egg, the result is male, and 90% of the time, the result is female.

Will it stay on 10/90 in the long run?

Well, what happens for an individual who makes a bit more males than females? That individual's genes will be more successful. It's the same old story. The only stable state is 50/50, which is where the ratio ends up after some number of generations. But why 50/50, not, say, 60% males 40% females?
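One way to see why the balance point sits at exactly 50/50 (this is usually called Fisher's principle) is to count parentage: every offspring has exactly one father and one mother, so the males as a group get the same total number of offspring as the females as a group. The sketch below works that out per individual, assuming sons and daughters cost the same to produce. The function name is made up for this example.

```python
def son_vs_daughter_payoff(male_fraction: float) -> float:
    """Average offspring per male divided by average offspring per female.

    Assumes every offspring has exactly one father and one mother, so both
    sexes share the same total; the rarer sex gets more per individual.
    """
    if not 0.0 < male_fraction < 1.0:
        raise ValueError("male_fraction must be strictly between 0 and 1")
    return (1.0 - male_fraction) / male_fraction


for male_fraction in (0.1, 0.5, 0.6):
    payoff = son_vs_daughter_payoff(male_fraction)
    print(f"{male_fraction:.0%} males: a son is worth {payoff:.2f} daughters")
```

At 10/90 a son yields roughly nine times the grandchildren of a daughter, so genes that skew toward sons spread. At 60/40 the payoff drops below 1, so daughters become the better bet and the ratio is pushed back. Only at 50/50 is neither sex worth more, which is why it's the stable point under these assumptions.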

I want examples. I think some species make a skewed adult sex ratio by eating some of their young? Or?

Created (2 years ago)