Showing 289 to 292

Write predictions in advance

Why?

  • Helps you respect conservation of expected evidence. That is hard when you see only the one branch that plays out, not the whole tree of possibilities.
  • Blinding (same rationale as pre-registered research)
  • Assures others that you did not commit the Texas sharpshooter fallacy, dredge the data, etc.
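
As a small numeric sketch of conservation of expected evidence: before looking, the probability-weighted average of the beliefs you could end up with must equal the belief you hold now. The three input probabilities below are made-up illustrative numbers, not taken from any real case.

```python
# Illustrative numbers only (assumptions, not from the note).
prior_h = 0.30          # P(H): belief in the hypothesis before looking
p_e_given_h = 0.80      # P(E | H): chance of seeing the evidence if H is true
p_e_given_not_h = 0.20  # P(E | not H): chance of seeing it if H is false

# Total probability of seeing the evidence at all.
p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)

# Bayes' rule for each branch of the tree.
posterior_if_e = p_e_given_h * prior_h / p_e
posterior_if_not_e = (1 - p_e_given_h) * prior_h / (1 - p_e)

# Weighting each branch by how likely it is recovers the prior.
expected_posterior = p_e * posterior_if_e + (1 - p_e) * posterior_if_not_e

print(f"if E:     {posterior_if_e:.3f}")
print(f"if not E: {posterior_if_not_e:.3f}")
print(f"expected: {expected_posterior:.3f} (prior was {prior_h:.3f})")
```

Writing both branches down beforehand makes it visible that one outcome has to move the belief up and the other down; you cannot expect to be confirmed either way.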

Created (2 years ago)

Prosocial free software

For a piece of free software to benefit society, we tend to hope it fulfils the following.

  • bazaar-developed, not cathedral
  • either a high bus factor, or simple enough that bus factor isn't a concern
    • in other words: if a codebase has only one developer, others must be able and willing to take over when this developer gets hit by a bus – the more lines of code, the more important it is that they're well-written
    • if the codebase is very big, many people should be involved in it: that's called a high bus factor
  • well-written, which eases auditing and forking

By contrast, these features are not prosocial:

  • cathedral-developed
  • "open core"
  • SaaS

Created (2 years ago)

How to put in your own words something the source expressed perfectly?

OK, you have at least these two purposes in taking a note:

  1. To give your future self the best possible explanation
  2. To learn-by-explaining

When the source has expressed something perfectly, and you feel you can't express it better, it would seem you can only satisfy one of the above purposes.

But you can satisfy both. One tactic:

  1. Forget the original phrasing (wait a day), then try again to explain
  2. After that feat of production, quote the original as a second explanation

Created (2 years ago)

Standard deviation

#statistics

A Gaussian distribution is fully defined by its mean and SD, but a mean and SD can be computed for any data, regardless of how it is distributed.

In the case of fat-tailed distributions (meaning a fatter tail than the exponential distribution), or at least in the case of the Cauchy distribution (t-distribution with df=1), the mean is not well-defined. (What is the mean wage in a corporation? It depends a lot on whether there's a billionaire among the staff.) Then the SD might not be well-defined either, since its calculation involves the mean.

But how about the binomial distribution? It is not (necessarily) Gaussian, yet it definitely has a mean and an SD. Now what does the SD tell us about the data?

It tells us less than it would, were the data Gaussian-distributed. We cannot say, for example, that 68% of the observations fall within one SD of the mean, since that property arises only from the definition of the Gaussian distribution. So what is the SD, with this property removed?

It is a measure of spread around the mean. It is not special if the data is not Gaussian. There are a lot of alternatives, like Mean Absolute Deviation, which are just as informative (or not), much like there are several ways of taking the "average" (mean, median, mode). In non-Gaussian data, it is less useful to know the SD, but it is not meaningless.
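
To make this concrete, here is a minimal sketch that computes the mean, the SD, the mean absolute deviation, and the probability mass within one SD of the mean for a skewed binomial distribution, next to the Gaussian reference figure. The parameters n=10 and p=0.05 and the helper names are arbitrary choices for illustration.

```python
from math import comb, erf, sqrt


def binomial_pmf(n, p):
    """Exact probabilities P(X = k) for k = 0..n."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def spread_summary(pmf):
    """Return mean, SD, mean absolute deviation, and mass within one SD."""
    mean = sum(x * w for x, w in pmf.items())
    sd = sqrt(sum((x - mean) ** 2 * w for x, w in pmf.items()))
    mean_abs_dev = sum(abs(x - mean) * w for x, w in pmf.items())
    within_one_sd = sum(w for x, w in pmf.items() if abs(x - mean) <= sd)
    return mean, sd, mean_abs_dev, within_one_sd


# n and p are illustrative assumptions that give a clearly skewed shape.
mean, sd, mean_abs_dev, within_one_sd = spread_summary(binomial_pmf(10, 0.05))

# For any Gaussian, the mass within one SD of the mean is erf(1/sqrt(2)).
gaussian_within_one_sd = erf(1 / sqrt(2))

print(f"mean={mean:.3f}  SD={sd:.3f}  mean abs dev={mean_abs_dev:.3f}")
print(f"within one SD: binomial={within_one_sd:.3f}, "
      f"Gaussian={gaussian_within_one_sd:.3f}")
```

With these parameters roughly 91% of the mass lies within one SD of the mean, against about 68% for a Gaussian. The SD and the mean absolute deviation also come out as different numbers for the same distribution.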

But suppose I am told simply that "the SD is 3". What have I been told?

Created (2 years ago)