The chain of subsets goes like this:

Sampling unit < Sample < Sampling frame < Population


The target population
all subjects of interest (to which you want to be able to apply a conclusion) e.g. "all cows in the world".
The sampling frame
the subset of the population, from which you draw your sample, e.g. "all cows in Skane county".
Sampling design or sampling procedure
the whole plan for how the sample gets drawn from the frame; may involve techniques such as stratified sampling.

Population definition

Be specific!

  • All current customers -> Everyone who's bought from us in the last year.
  • SDSU students loyal to the football team -> SDSU students who were registered in Fall 2019 and have attended at least one game while a student
  • Unsatisfied customers -> Customers with an active paid account who have either scored us 3 or lower in the last year or registered a complaint by e-mail

A good population definition uses clear, unambiguous, objective criteria to distinguish between members and non-members of the population.

A good population definition suits the needs of the audience requesting the research.

  • It is common for the client you're working with to have only a hazy idea of what makes someone a member of the population of interest. Before you undertake the study, get verification from them so that all parties agree on what the population definition is!

A good population definition matches previous research, so that the present study can be directly compared with it.

Sampling frame

Leslie Kish posited four basic problems of sampling frames:

  1. "Errors of omission": Missing elements: Some members of the population are not included in the frame.
  2. "Errors of inclusion": Foreign elements: Non-members of the population are included in the frame.
  3. Duplicate entries: A member of the population is surveyed more than once.
  4. Groups or clusters: The frame lists clusters instead of individuals.
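
The first three of these problems can be checked mechanically when you happen to have identifiers for both the population and the frame. Below is a minimal sketch under that (rarely realistic) assumption; `audit_frame` and the toy names are hypothetical, and the fourth problem (clusters) is not covered.

```python
def audit_frame(population, frame):
    """Compare a sampling frame (list of IDs) against the population (set of IDs)."""
    frame_set = set(frame)
    seen, duplicates = set(), set()
    for unit in frame:
        if unit in seen:
            duplicates.add(unit)
        seen.add(unit)
    return {
        "missing": population - frame_set,   # errors of omission
        "foreign": frame_set - population,   # errors of inclusion
        "duplicates": duplicates,            # listed more than once in the frame
    }


# Toy data, made up for illustration only.
population = {"ann", "bo", "cy", "dee"}
frame = ["ann", "bo", "bo", "eve"]
report = audit_frame(population, frame)
print(report)
```

In real studies the full population list is usually exactly what you lack, so this kind of audit tends to be possible only for a subsample or against an external register.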

Example 1: errors of omission

Pop. of interest: all customers who have been on our tele-therapy platform in the last 30 days

Sampling frame: A list of 5,000 customer phone numbers, which were collected from customers after their 90-day free trial expired.

OK, there is going to be some, and probably sizable, overlap between these sets of individuals (many individuals are in both). However, some error sources:

  • Maybe the company didn't provide a complete customer list, just a list of 5,000. This kind of error of omission is okay – if the 5,000 were selected randomly, it won't affect conclusions.
  • People still in the trial period haven't had to submit phone numbers yet. This kind of error of omission poses a problem: the cause of the omission systematically differentiates your in-frame pop from your out-of-frame pop.

Potential remedies

  • If some pop characteristics are known, measure your sample and compare to see if they are approx equivalent, showing that the sample is representative.
    • E.g. if you KNEW from other research that the target pop is 65% 60+ years old, but your frame only has 5% 60+ years old, there's an issue!
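
A minimal sketch of that comparison, using the numbers from the example above. The function name and the 5-percentage-point tolerance are assumptions chosen for illustration, not an established threshold.

```python
def looks_representative(known_share, observed_share, tolerance=0.05):
    """True if the observed share is within `tolerance` of the known population share."""
    return abs(observed_share - known_share) <= tolerance


# Target population is known to be 65% aged 60+, but the frame has only 5%.
frame_ok = looks_representative(known_share=0.65, observed_share=0.05)
print(frame_ok)
```

A proper analysis would also account for sampling variability (for instance with a confidence interval on the observed share), which this sketch ignores.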

Rhetorical analysis

Rhetoric of the underprivileged

First, let's see what's typical of classical rhetoric:

  1. Public: there is a space (agora) for adopting a stance together on general questions. Directed towards the citizens with the power to act.
  2. Doxical: can base arguments on the listeners' doxa (commonly held beliefs)
  3. Agonistic: confrontational. Less so in the genus demonstrativum, but even there the virtues of the person praised are contrasted with the flaws of others, and the specter of the cultural outsider is cast as bad.
  4. Authoritative: the speaker typically appears as a role model showing the way forward
  5. Closed: there is a preset thesis
  6. Monological: speaks to listeners, not with conversants. When not monologuing, the ancients were into dialectics, a style of debate that pits two viewpoints against each other to see which one "wins", and the purpose then is to find truth, not to convince.
  7. Persuasive

The anti-rhetorical style Karlyn Kohrs Campbell observed inside women's groups (The Rhetoric of Women's Liberation: An Oxymoron):

  1. Private
  2. Not doxical: can't support arguments with a shared doxa, because there is none
  3. No antagonism: to speak in the group is to realize a sisterhood. Any antagonism is directed outwards.
  4. Non-authoritative: the only authority is held by someone speaking of their own experiences in their own domain, but no one in the group has the power to say what's the best way forward, nor are they trying to win it
  5. Open: participants have no aim except to question what they already think of their situation, by joining a group where all speak with equal agency. Joining the group is to "risk your very self" in a rhetoric that doesn't wrap a fixed program.
  6. Non-monological: nobody is standing on a podium
  7. No persuasion: The goal is a process, not a final belief. You serve up your insights and questions, and leave it to others to see any lessons in your words.
  • First observed in women's rhetoric.
  • A classic trope used by women persuading men: the antistrephon, generally useful when you're in a lower position
  • Classical rhetoric is useful for a society with many internally conflicting interests, where there is a need for one unifying force that decides what's what
  • Women's rhetoric is forced to revolt against the ruling patriarchal system. "The choice to be moderate and reformist doesn't exist for those who propose the liberation of woman". Here it must develop an antagonistic and confrontative strategy. But it cannot support itself on a counter-doxa, because whatever such a counter-doxa may say, it has been corrupted by the current system that outwardly supports the same values as the women's movement (democracy, equality, personal independence) but in practice betrays them.
    • Therefore, women's rhetoric must work with a more intentional method that reveals the contradiction in male-dominated society between its principally democratic ideas and its unfair treatment of people who happen to be women.
    • It may be done through incongruous perspectives ("Is God a black woman?") or symbolic reversals (filling a term like "whore" with positive meaning). As Campbell says, the purpose is to "do violence against the structure of reality". It's a demonstrative genre that dissolves doxa instead of confirming it.
    • How would a social structure look, where the women's group discussion rhetoric could be a pattern for the public discourse? We can suppose that on each point it must differ from the patriarchal system, leaving it entirely in splinters. It would mean a social order where the struggle between groups has been replaced by peaceful cooperation (presupposing that systemic injustice and all other objective reasons for conflict have disappeared), where general equality and respect for others have spread and traditional positions of authority have been destroyed, where the preference for dogmatic and ideological thinking has been destroyed and a free search for truth and value is encouraged and respected in everyone, where opinions don't spread from centers of power via megaphones but are always forming and being aired in small and large groups that stand in dialogical cooperation, and where, finally, the desire to convince others of one's own opinion has been replaced with a curiosity for other perspectives, a willingness to try other positions and to recognize their situation-bound value.
      • Which tendencies in current society push towards that social structure and which push in the opposite direction? This becomes important to recognize for whoever would like to see this form of rhetoric in political discourse.

Bitzer's points

  1. Exigence. A compelling circumstance that invites rhetorical action
  2. Audience. An audience with the capacity to mediate the rhetorical action and thereby resolve the critical circumstance
  3. Constraints. All the restrictions or compelling limitations that the speaker has to take into account if she or he wants to persuade the audience to view or tackle the circumstance in the way the speaker wishes.

Rhetorical audience

The one you are actually addressing, who fulfills Bitzer's point about audience.

Ullen: Åkesson's rhetorical audience was all future voters.


  • No situation is objective. Situations exist on the basis of a description of them that builds on our subjective experience of them.
  • The way we choose to characterize a situation always affects how we experience it. So rhetoric helps determine what the situation even is.
  • Situational rhetoric is a myth
  • ☑ Creating salience


  • ☑ Says that Bitzer and Vatz are best seen as completing each other.
  • Bitzer has the approach of the analyzing rhetorician
  • Vatz has the approach of the speaker
  • ☑ Bitzer views the rhetoric as an answer to the situation, in a similar way that an answer answers a question.
  • Bitzer says that the rhetor chooses a response dependent on how they perceive the situation, although he also says the situation basically prescribes only one correct response, and it is on the speaker to decode the situation accurately.
  • ☑ Link between ideology and deed. When someone commits an act of terror inspired by an ideology, do others who share that ideology bear responsibility for the act?


  • Credibility as an author rests on deep and broad knowledge of the fandom.


The best style is the one that doesn't get noticed.

Tegnér's maxim: what is obscurely said is obscurely thought. Isocrates writes: The ability to speak well and to think well go hand in hand, and it is style that unites the two.

Eva Östlund-Stjärnegårdh has studied the relationship between style and grades in upper-secondary students' national tests in Swedish. It turned out that varied sentence length and varied sentence openings (fundament) mattered most, that is, having both long and short sentences, and having few sentences that begin with "Det är…" ("It is…").

The standard definition of irony is to say one thing but mean the opposite. But this is not a good definition. It can be said to rest on a psycholinguistic misunderstanding of how language comprehension works: the notion that we first grasp a literal meaning and only then start looking for another, intended meaning.

The biggest reason this definition is not good is that it excludes many linguistic actions.

Transient Mark is a nothing mode


You like transient-mark-mode? You feel that turning it off is a totally different experience? Here's what blew my mind! There's no difference! The difference is visual!

The region always exists. It's simply a term for "the text between mark and point". Since the point always exists and there's almost always at least one mark in the mark ring, the region likewise always exists. Unless you've turned on delete-selection-mode, nothing reacts any differently when the region is "activated". It may be clearer to think of it as the region being "highlighted", not "activated".

Okay… I lied. There is one difference brought about by transient-mark-mode: a number of commands, such as flush-lines, change their behavior when the region is "activated". That behavior is often useful.

What behavior is this? In the majority of cases, they'll behave as if you had carried out C-x n n (narrow-to-region) prior to calling the command.

That's all there is to it.

It's narrow-to-region minus the visual consequence of narrowing, and skips having to C-x n w (widen) afterwards.

I wish I could say it was more automated than manually calling narrow-to-region, but it's actually not, as you either have to use C-SPC (set-mark-command) beforehand, or use C-x C-x (exchange-point-and-mark) after, which activates the preexisting region. So what's the benefit?

I mean… you either call something that calls activate-mark, or you call narrow-to-region. You have to call something either way! Aside from skipping widen, what keystrokes are saved?

And… I like the idea of activating the region afterwards, instead of before, because it requires less foresight. Plus if you never internalize the fact that the region always exists, you probably do a lot of unnecessary C-SPC and grunt cursor work to define a region to operate on, as if you were in a non-Emacs editor. Turns out the region often has close to the correct boundaries anyway, so it's better to think of C-x C-x long before you even think of C-SPC.

A few sane approaches for "defining a new region":

  • When point is already at one end of the region you want. In this case, use C-x C-x if the mark is at the other end, otherwise use avy (which will set the mark). Or you can use C-x C-x and then adjust with avy, possibly popping the most recent mark.
  • When point isn't at either end of the region you want. In this case, use avy twice.
  • When the region you want is semantically simple, e.g. suitable for M-h (mark-paragraph). Note that you can repeat M-h to extend to several paragraphs.
  • When the region you want is semantically simple, but you forgot the hotkey for this semantic unit. Let's say it's a sentence. Now you either need C-SPC M-e, or to use avy.

To put it differently… C-SPC isn't for such trivial use cases as "defining a region to operate on". C-SPC is for interesting use cases like Multiple Cursors' mc/mark-pop, and only for those.

Did you ever accidentally deactivate a region and have to move the cursor around as you re-define the boundaries for a new region to get back to the state you were in? Don't do that! The region is never lost, only deactivated; type C-x C-x.

(transient-mark-mode 0)
(setq set-mark-command-repeat-pop t)
(defun toggle-region ()
  "If region is inactive, activate it, otherwise the opposite.
If you have transient-mark-mode disabled, this will temporarily turn it on
until you do something to deactivate the region.

If you have transient-mark-mode enabled, this command is not really necessary,
as \\[exchange-point-and-mark] (`exchange-point-and-mark') is similar."
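  (interactive)
  ;; The original snippet was cut off after the condition; the two branches
  ;; below are a reconstruction based on the docstring.
  (if (region-active-p)
      (deactivate-mark)
    ;; With transient-mark-mode off, `activate-mark' enables it temporarily.
    (activate-mark)))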

I believe that everyone benefits from some time without transient-mark-mode, and with visible-mark-mode. It helps you approach TMM with a less limited mindset.

Productive mindset for TMM

  • You're not so aware of the active region, if there is one. You never go to the extent of deactivating it with keyboard-quit, you just do what you were going to do anyway, potentially letting the window get covered with the region highlight face for no reason.
  • You use C-SPC to set marks, not to activate region. The fact this happens to activate the region is like the fact that your car smells faintly of dog urine: try to ignore it and just drive.
  • When you need an active region, then you check if it is. Because you actually didn't know. It's like how novice drivers pay mental bandwidth to many immobile objects in their field of view, such as parked cars, failing to see a pedestrian up ahead, while for experienced drivers the pedestrian is the main object they're aware of. The region is like a parked car. You needn't be that aware of it.
  • Delete-selection-mode is anathema.

Book: Anthropic Bias

Nick Bostrom

The whole book is hosted online, along with a FAQ for the Doomsday argument.

The name

In 1983, Carter expressed regret about naming it "anthropic principle", suggesting better names would be "the psychocentric principle", "the cognizability principle" or "the observer self-selection principle". <2023-Jan-12> I like "observer selection principle".

The principle has nothing to do with homo sapiens, and this name has misled some authors (according to Bostrom: Gale 1981, Gould 1985 and Worrall 1996 among others). If you talk about it under this name, emphasize that it concerns intelligent observers in general.

Why nitpick? It's paramount if we want to apply anthropic reasoning to hypotheses about other possible life forms, and it's crucial when we discuss the Doomsday Argument. Also, the name makes the principle look "as if it were part of a project to restitute Homo sapiens into the glorious role of Pivot of Creation. For example, Stephen Jay Gould's 1985 criticism is based on this misconception."

Different authors' formulations of the anthropic principle

  • Carter (1974): vague. "…what we can expect to observe must be restricted by the conditions necessary for our presence as observers."
  • Gott: underdeveloped
  • Leslie
  • Barrow & Tipler: dodgy baloney

Anthropic principles

  • Weak anthropic principle (WAP)
    • Superweak anthropic principle
  • Strong anthropic principle (SAP)
  • Participatory (PAP)
  • Final (FAP)
  • Completely Ridiculous (CRAP)
  • Copernican

Carter's WAP: "we must be prepared to take account of the fact that our location in the universe is necessarily privileged to the extent of being compatible with our existence as observers."

Carter's SAP: "the universe (and hence the fundamental parameters on which it depends) must be such as to admit the creation of observers within it at some stage"

These formulations have been attacked as tautological (the WAP in particular), speculative (the SAP in particular), and vague.

Leslie 1989 argues that AP, WAP and SAP can all be understood as tautologies and the difference is purely verbal.

Leslie's Superweak AP: "If intelligent life's emergence, NO MATTER HOW HOSPITABLE THE ENVIRONMENT, always involves very improbable happenings, then any intelligent living beings that there are evolved where such improbable happenings happened."

Michael Hart 1982 has given a figure of 1 in 10^3000 as his most optimistic estimate of how likely it is that the right molecules would just happen to bump into each other to form a short DNA string capable of self-replication, although it's possible there exists some as yet unknown abiotic process bridging the gap between amino acids and full replicators. He stresses that we shouldn't assume that the evolution of life on an Earth-like planet is even remotely probable, just because it happened on our Earth-like planet.

Does it matter which Big World?

  • Let B : we are in a big world
  • Let T : some theory that is compatible with B
  • Let E : some proposition asserting that some specific observation is made

Then by Bayes' theorem,

Pr(T|E&B) = Pr(E|T&B) Pr(T|B) / Pr(E|B)

In order to know that observing E lets us judge T as any truer (or less true) than before we observed it, we'd need the expression Pr(T|E&B) – Pr(T|B) to evaluate to something nonzero, right? With algebra, we can see that:

Pr(T|E&B) - Pr(T|B) ≈ 0 if and only if Pr(E|T&B) ≈ Pr(E|B)
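
To make the equivalence concrete, here is a small numeric sketch. The probabilities are made up for illustration, and `posterior` is a hypothetical helper; everything is implicitly conditional on B.

```python
from fractions import Fraction


def posterior(prior_t, p_e_given_t, p_e):
    """Pr(T|E&B) = Pr(E|T&B) * Pr(T|B) / Pr(E|B)."""
    return p_e_given_t * prior_t / p_e


prior = Fraction(3, 10)

# Case 1: Pr(E|T&B) equals Pr(E|B), so observing E tells us nothing about T.
shift_uninformative = posterior(prior, Fraction(4, 5), Fraction(4, 5)) - prior

# Case 2: E is twice as likely under T as it is overall, so T gains credence.
shift_informative = posterior(prior, Fraction(4, 5), Fraction(2, 5)) - prior

print(shift_uninformative, shift_informative)
```

The first shift is exactly zero, which is the situation Bostrom worries about: in a sufficiently big world, Pr(E|B) is close to 1 for any observation E, whatever T says.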

Ok, what Bostrom seems to have been saying is that there's something wrong in this methodology, and that if we take it as given, it doesn't matter which Big World theory is actually true, for our judgments of any T. New empirical findings would be pointless except to provide further support for the hypothesis that we are living in some Big World, for instance by showing that the universe is open.

Also the leaky connection between theory and observation spills over into other domains than cosmology; since nothing hinges on how we define T, we can extend the argument to prove that no observation has any bearing on any empirical scientific question(!) so long as we are living in a Big World. And I'll call that absurd – though it's startling to realize that even Newtonian mechanics and all its proof could be something I dreamed up just now, and it's a fascinating perspective on provability of theories (in addition to Occam's Razor discussions, Duhem-Quine thesis etc), and a principled way to reason about it seems called for.

Self-Sampling Assumption (SSA)

(SSA) One should reason as if one were a random sample from the set of all observers in one's reference class.

Strong Self-Sampling Assumption (SSSA)

Self-Indication Assumption (SIA)

(SIA) Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.

Some of the more profound criticisms of specific anthropic inferences rely implicitly on SIA. In particular, adopting SIA annihilates the DA.

Bostrom rejects SIA in Chapter 7.


  • The fine-tuning observation
    • 20 or so physical constants, any of which if they were different would make life impossible

Thought experiments

Gedankens around fine-tuning

  • Drawing shortest straw
    • difference from fine-tuning: participant exists before experiment
      • variation: railway tracks to each straw, track length irrelevant, which track picked by automaton, one person in gestation to be woken only if shortest straw picked
  • Winning three 1 in 1k lotteries makes us suspect foul play vs winning a 1 in 1B lottery, which is unsurprising ('someone had to win')
    • if someone says "I'm going to win the 1B lottery" and then does, that's also foul play

Straw-lottery 25-28

Dungeon 59-62, 69, 150, 156, 164

The world consists of a dungeon that has one hundred cells. In each cell there is one prisoner. Ninety of the cells are painted blue on the outside and the other ten are painted red. Each prisoner is asked to guess whether he is in a blue or a red cell. (Everybody knows all this.) You find yourself in one of the cells. What color should you think it is?—Answer: Blue, with 90% probability.

Most people agree with this. The general principle seems to be that your credence of having property P should be equal to the fraction of observers who have P; this agrees with SSA.

If everyone accepts SSA and bets blue, 90% of prisoners will win their bets. If they reject it and consider that one is no more likely to be in blue than red, only 50% will win their bets.
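
The two win rates can be checked with a short calculation. This is a sketch that assumes "rejecting SSA" means each prisoner bets blue or red with equal probability; the function name is made up.

```python
from fractions import Fraction

# 90 of the 100 cells are blue, as in the thought experiment.
SHARE_BLUE = Fraction(90, 100)


def expected_win_rate(p_bet_blue):
    """Expected fraction of prisoners who win, if each bets blue with this probability."""
    return p_bet_blue * SHARE_BLUE + (1 - p_bet_blue) * (1 - SHARE_BLUE)


everyone_bets_blue = expected_win_rate(Fraction(1))
coin_flip_bets = expected_win_rate(Fraction(1, 2))
print(everyone_bets_blue, coin_flip_bets)
```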

There's a class of rules that would do even better than SSA, like this rule, let's call it A: "If you are Geraldine Truman, bet that you are in a blue cell; if you are Harry Smith, bet that you are in a red cell; …"

Intuitively, rules like A are cheating, and we can show that by putting it in the context of its rival permutations A', A'', A''' etc, which map names to recommendations in different ways. Many of these rules will do badly, certainly worse than SSA, and on average, it'll be like a coin toss. I guess deciding whether to follow the advice would mainly be about what you think most likely about who put you in this universe and what their motivations might be.

If we want to find a betting rule that can be used by all participants (which we do because they all individually have access to the same evidence), a probability of 90% is the only one that makes it impossible to bet against them in such a way that they are collectively guaranteed to lose money. In other words, setting your degree of belief equal to what SSA recommends is the only way not to be a collective sucker.

Emeralds 62-63

Imagine an experiment planned as follows. At some point in time, three humans would each be given an emerald. Several centuries afterwards, when a completely different set of humans was alive, five thousand humans would each be given an emerald. Imagine next that you have yourself been given an emerald in the experiment. You have no knowledge, however of whether your century is the earlier century in which just three people were to be in this situation, or in the later century in which five thousand were to be in it…

Suppose you in fact betted that you lived [in the earlier century]. If every emerald-getter in the experiment betted this way, there would be five thousand losers and only three winners. The sensible bet, therefore, is that yours is instead the later century of the two. (Leslie 1996)

The arguments made in Dungeon can be recycled here.

Two Batches

Variant of Emeralds.

Incubator I

The gedankens Dungeon, Emeralds and Two Batches were all pretty simple, with a fixed number of observers. Now, what if the number of observers differs depending on which hypothesis is true?

"Incubator": A coin flip determines whether to create one room with one observer inside, or two rooms with one observer in each. If the coin falls tails, we get 1 black-bearded man. If the coin falls heads, we get 1 black-bearded man and 1 white-bearded man. Because the rooms are dark, nobody knows the color of their beard.

All observers created are informed about this. You are created, and find yourself in a dark room. Now, two questions:

(a) What probability do you assign to the coin having fallen tails?

(b) The light turns on, and you see that your beard is black. Now what probability do you assign to the coin having fallen tails?

Bostrom mentions three ways of reasoning about this: naive, SSA, and SSA+SIA. Each implies different priors and conditionals, and therefore different posteriors.

Naive: "At neither stage do I have relevant information on how the coin landed." Therefore, at stage (a), assign 1/2, and at stage (b), assign 1/2 again. This violates probability theory, so forget it.
SSA: Simple Bayesian reasoning, the one where you look at what probabilities you would've assigned to different outcomes before observing the black beard and then snip the outcome tree. At stage (a) assign 1/2. We know Pr(Black | Tails) = 1 and Pr(Black | ~Tails) = 1/2. Then by Bayes' theorem, at stage (b) assign 2/3.
SSA+SIA: It's twice as likely you should exist if two observers exist than if only one does. So change up the priors so it's not half-half whether the coin fell heads or tails, but rather Pr(Tails) = 1/3, Pr(~Tails) = 2/3. The rest follows from Bayes' theorem. At stage (a) assign 1/3, and at stage (b) assign 1/2.
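
The stage (b) numbers for the two Bayesian approaches can be reproduced with a few lines. This is only a sketch of the arithmetic; `p_tails_given_black` is a made-up helper, and the two likelihoods are the ones stated above.

```python
from fractions import Fraction


def p_tails_given_black(prior_tails):
    """Posterior Pr(Tails | Black) by Bayes' theorem, for a given prior on tails."""
    p_black_given_tails = Fraction(1)      # the only observer has a black beard
    p_black_given_heads = Fraction(1, 2)   # one of two observers has a black beard
    joint_tails = prior_tails * p_black_given_tails
    joint_heads = (1 - prior_tails) * p_black_given_heads
    return joint_tails / (joint_tails + joint_heads)


ssa = p_tails_given_black(Fraction(1, 2))      # SSA keeps the fair-coin prior
ssa_sia = p_tails_given_black(Fraction(1, 3))  # SIA shifts the prior towards heads
print(ssa, ssa_sia)
```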

Incubator II 70-72

Incubator III 166-177

Old-evidence problem 96-105

Presumptuous Philosopher 124-126, 161, 205

Serpent's Advice 142-150, 165

Lazy Adam 143-144, 156, 160-161, 165, 171, 180-181, 204-205

Eve's Card Trick 144, 165

UN++ 150-205

Quantum Joe 154-205

Meta-Newcomb problem 157-158

Mr. Amnesiac 162-165

Blackbeards & Whitebeards 179-182

Sleeping Beauty problem 185-198

Absent-Minded Driver 194

Choice of reference class

  • If R is your reference class, it must be between these extremes: Rε ⊂ R ⊂ RU.
  • To be rational, must not use a reference class that's too broad
  • To be rational, must not use a reference class that's too narrow (all people born since my own birth) nor a gerrymandered reference class (the class that includes myself and all observers born after 2020).

Similar to what Eliezer said in 2008 about finding the simplest generalizations, AKA carving thingspace well:

Now there are gruesome questions about the proper generalization that covers all these tiny cases. Call an object “grue” if it appears green before January 1, 2020 and appears blue thereafter. If all emeralds examined so far have appeared green, is the proper generalization, “Emeralds are green” or “Emeralds are grue”?

The answer is that the proper generalization is “Emeralds are green.” I’m not going to go into the arguments at the moment. It is not the subject of this essay, and the obvious answer in this case happens to be correct. The true Way is not stupid: however clever you may be with your logic, it should finally arrive at the right answer rather than a wrong one.

In a similar sense, the simplest generalizations that would cover observed microscopic phenomena alone take the form of “All electrons have spin 1/2” and not “All electrons have spin 1/2 before January 1, 2020” or “All electrons have spin 1/2 unless they are part of an entangled system that weighs more than 1 gram.”

In ordinary situations, there is no reference class problem.

Suppose that in Dungeon, 10 of the blue cells contain a polar bear instead of a human. Then you simply count them out of the possibilities since you know you are human and therefore cannot be in one of the cells with a polar bear, and the ratio is now 80 blue cells vs 10 red cells, so you should assign 8/9 (about 89%) to being in a blue cell.

Reference class becomes an issue when the number of observers is unknown and correlated with the hypotheses under consideration.

Edge cases for who could be part of your reference class:

  1. Intellectual limitations. (a gifted chimpanzee, a Neanderthal, a mentally disabled human, persons who can't understand SSA and the probabilistic reasoning involved in using it in the application in question)
  2. Insufficient information (persons who don't know about the experimental setup)
  3. Lack of some occurrent thoughts (persons who, as it happens, don't think of applying SSA)
  4. Exotic mentality (angels, superintelligent computers, posthumans)

Bostrom indicates we need something beyond the basic principle of indifference. Anyway, for many purposes, these details do not matter much. In thought experiments, we stipulate no edge cases, and real-world applications will approximate this ideal closely enough that the results one derives are robust under variation in those zones of uncertainty we have flagged.

Principle of indifference

"Assign equal credence to any two hypotheses if you don't have any reason to prefer one to the other."

A problem: It balances between vacuity and inconsistency. One can make it go either way depending on how strong an interpretation one gives of "reason". If reasons can include any subjective inclination, the principle loses most of its content. If having a reason requires one to have objectively significant statistical data, the principle can be shown to be inconsistent.

Anyway, Bostrom refines it in a later chapter.

A conclusion's insensitivity to reference class indicates its strength

pg. 202

Doomsday Argument (DA)

Assume that as time passes, the number of observers increases superlinearly, up to some cutoff point such as a doomsday event. This would imply that the largest share of observers are born just before that cutoff point. Therefore if you could've been born as any person at any time, it's overwhelmingly likely you're one of the people in that time slice just before the end times.
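
A small sketch of why superlinear growth concentrates observers near the end. It assumes, purely for illustration, discrete generations whose birth counts grow geometrically; the function name and the growth factor are not from the book.

```python
from fractions import Fraction


def share_in_last_generation(generations, growth=2):
    """Fraction of all observers ever born who belong to the final generation."""
    births = [growth ** g for g in range(generations)]
    return Fraction(births[-1], sum(births))


doubling = share_in_last_generation(10)          # births double each generation
constant = share_in_last_generation(10, growth=1)  # no growth, for comparison
print(doubling, constant)
```

With doubling, just over half of everyone who ever lives is in the last generation before the cutoff, however many generations came before. With constant births the share is only one in ten.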


  • Gott pg. 90-94
  • Carter-Leslie pg. 94-96, 105-107
  • Bostrom's explication pg. 96-105

Ways to invalidate the DA:

  • By accepting SIA—which you shouldn't per Bostrom
  • Something about applying the Observer Equation (I didn't follow the reasoning near the end of the book)


Some paradoxes go away when you stop thinking of observers as a contiguous entity extruded over the time-axis, a whole with a single identity as if they had a label like "person #4346921040" somewhere in a grand record outside Time, and instead slice the observer up by instants of time: observer-moments.

These quantities may be different:

  1. How many observers can expect to find themselves as a Boltzmann brain? (Some fraction of total observers)
  2. How many observer-moments can expect to be experienced by a Boltzmann brain? (A strictly lower fraction of total observer-moments?)

(Unrelated note:) Continuity of consciousness

The 1mm animal Trichoplax can regenerate a full body after having most of it chopped off, and notably, when chopped apart, the pieces will try to find each other and join back together. That invites some questions about continuity of consciousness. When you split in two, there are two observers, and if you merge with your split-buddy… is one murdered?

(The Star Trek Voyager episode "Tuvix" plays with this conundrum. In reality, I think merging of consciousness might be given a very different ethical frame than we'd expect, if done in a principled way the entrants regularly agree to, especially if the merging mechanism works differently such that the entrants can have more reliable expectations about the succeeding observer-moment)

It seems to me (as of [2023-01-13 Fri]) that continuity of consciousness is an illusion, and accepting that dissolves many such conundrums.

I am a series of observer-moments, and you could say I die every second (or rather every Planck interval).

It resolves the moral conundrum of the Star Trek transporter. No need to worry about whether it kills me and creates a copy when I am anyways dying and creating a copy every second. You could say I am in fact wrong about expecting to experience the next moment – that just won't be me in "this consciousness" doing that.

(Hell of an argument for valuing other people's QALYs equal to your own)

When we think of death as a tragedy, it's because it ends a series of observer-moments, each of which expects there to be another observer-moment like itself immediately afterwards, and in fact develops long-term ambitions and desires because of the apparent reliability of this process (it would not bother to have any ambitions if it could only remember a series of random experiences none of which had ever affected the next).

With the Star Trek transporter, there is no tragedy: the series of observer-moments is not broken. Does it matter if the spacetime coordinates of the observer-moment-generator jump sharply? Even without teleporting, the brain is different in every moment than in the preceding moment, and the spacetime coordinates of each neuron have changed slightly, and some neurons are in different states and may even have exchanged a molecule with the environment.

(Unrelated note:) Cardinality

Can you take a fraction of infinity? If we have two quantities A and B, where A is some huge number and B must be strictly less than A, let's say 0.01A, can we still think of B as strictly smaller if we increase A to infinity? I ask for the sake of probabilistic estimates such as "how likely is it that my universe belongs to the A category rather than the B category?". Or is it suddenly equally likely that my universe is in A vs. B?

Bostrom speaks as if B is meaningfully less than A.

Math has stuff to say about infinities, specifically figuring out whether some quantity is "aleph-0 infinite" or "aleph-1 infinite", but this seems a totally different topic.

(Unrelated note:) EY's perspective from Book: Inadequate Equilibria

(Unrelated note:) Fungibility of work, from Book: Doing Good Better

It feels related somehow. Maybe just because it tickles the same "counterfactual thinking" pathways. … "if I'm not here working as a doctor in Britain, someone else would be in my office doing my job, so what's my actual total effect on society?" feels similar to thinking about indexical evidence.

Observer Equation (OE)

Given on pages 172-173.


  • α : an observer-moment
  • Pα : the observer-moment α's subjective probability function
  • Ωα : all possible observer-moments in α's reference class
  • wα : the possible world in which α is located
  • e : some evidence
  • h : some hypothesis
  • Ωe : class of observer-moments "about whom" e is true
  • Ωh : class of observer-moments "about whom" h is true
    • h can be indexical like "this is an observer-moment that has a black beard", or if h is non-indexical, it is true about all and only those possible observer-moments that live in possible worlds where h holds true
  • Ω(w) : the class of observer-moments in the possible world w

The Observer Equation:

\[ P_{\alpha}(h|e) = \frac{1}{\gamma} \sum_{\sigma \in \Omega_{h} \cap \Omega_{e}} \frac{ P_{\alpha}(w_{\sigma}) }{ |\Omega_{\sigma} \cap \Omega(w_{\sigma})| } \]

where γ is a normalization constant (note the thing being summed is the same as above):

\[ \gamma = \sum_{\sigma \in \Omega_{e}} \frac{ P_{\alpha}(w_{\sigma}) }{ |\Omega_{\sigma} \cap \Omega(w_{\sigma})| } \]

I observe that this equation takes the form of the basic \( \Pr(h|e) = \Pr(h \cap e) / \Pr(e) \).
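
As a sanity check, the equation can be applied to the Incubator setup. The sketch below makes simplifying assumptions that are mine, not Bostrom's: each observer is a single observer-moment, and the reference class contains every observer-moment, so the denominator reduces to the number of observer-moments in the world. All names are made up.

```python
from fractions import Fraction

# Each possible observer-moment is (world, beard colour).
observer_moments = [("tails", "black"), ("heads", "black"), ("heads", "white")]
world_prior = {"tails": Fraction(1, 2), "heads": Fraction(1, 2)}


def weight(om):
    """Prior of the observer-moment's world, split evenly among that world's observer-moments."""
    world = om[0]
    n_in_world = sum(1 for other in observer_moments if other[0] == world)
    return world_prior[world] / n_in_world


def observer_equation(h, e):
    """P(h|e): summed weight where h and e both hold, divided by gamma (summed weight where e holds)."""
    gamma = sum(weight(om) for om in observer_moments if e(om))
    return sum(weight(om) for om in observer_moments if h(om) and e(om)) / gamma


is_tails = lambda om: om[0] == "tails"
before_light = observer_equation(h=is_tails, e=lambda om: True)
after_black_beard = observer_equation(h=is_tails, e=lambda om: om[1] == "black")
print(before_light, after_black_beard)
```

Under these assumptions the result agrees with the SSA answers to Incubator I: 1/2 before the light turns on and 2/3 after seeing a black beard.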

Quantum-generalized Observer Equation (QOE)

Are you facing different outcomes for which you know the quantum measure (probability of observing)? Then you can use the QOE. Note this is rare!

The QOE is just the same as the OE but includes a factor for the quantum measures, μ.

Observer-relative chances

Principal Principle

A very banal principle, but philosophers had to discuss it. The principle is that when you know some objective chance, you set your subjective credence to the same, unless you have other relevant information.

