Saturday, March 2, 2013

Infinity, probability and disagreement

Consider the following sequence of events:

  1. You roll a fair die and it rolls out of sight.
  2. An angel appears to you and informs you that you are one of a countable infinity of almost identical twins who independently rolled a fair die that rolled out of sight, and that similar angels are appearing to them all and telling them all the same thing. The twins all reason by the same principles and their past lives have been practically indistinguishable.
  3. The angel adds that infinitely many of the twins rolled six and infinitely many didn't.
  4. The angel then tells you that the angels have worked out a list of pairs of identifiers of you and your twins (you're not exactly alike), such that each twin who rolled six is paired with a twin who didn't roll six.
  5. The angel then informs you that each pair of paired twins will be transported into a room for themselves. And, poof!, it is so. You are sitting across from someone who looks very much like you, and you each know that you rolled six if and only if the other did not.
Let H be the event that you did not roll six. How does the probability of H evolve?

After step 1, presumably your probability of H is 5/6. But after step 5, it would be very odd if it was still 5/6. For if it is still 5/6 after step 5, then you and your twin know that exactly one of you rolled six, and each of you assigns 5/6 to the probability that it was the other person who rolled six. But you have the same evidence, and being almost identical twins, you have the same principles of judgment. So how could you disagree like this, each thinking the other was probably the one who rolled six?

Thus, it seems that after step 5, you should either assign 1/2 or assign no probability to the hypothesis that you didn't get six. And analogously for your twin.

But at which point does the change from 5/6 to 1/2-or-no-probability happen? Surely merely physically being in the same room with the person one was paired with shouldn't have made a difference once the list was prepared. So a change didn't happen in step 5.

And given 3, that such a list was prepared doesn't seem at all relevant. Infinitely many abstract pairings are possible given 3. So it doesn't seem that a change happened in step 4. (I am not sure about this supplementary argument: If it did happen after step 4, then you could imagine having preferences as to whether the angels should make such a list. For instance, suppose that you get a goodie if you rolled six. Then you should want the angels to make the list as it'll increase the probability of your having got six. But it's absurd that you increase your chances of getting the goodie through the list being made. A similar argument can be made about the preceding step: surely you have no reason to ask the angels to transport you! These supplementary arguments come from a similar argument Hud Hudson offered me in another infinite probability case.)

Maybe a change happened in step 3? But while you did gain genuine information in step 3, it was information that you already had almost certain knowledge of. By the law of large numbers, with probability 1, infinitely many of the rolls will be sixes and infinitely many won't. Simply learning something that has probability 1 shouldn't change the probability from 5/6 to 1/2-or-no-probability. Indeed, if it should make any difference, it should be an infinitesimal difference. If the change happens at step 3, Bayesian update is violated and diachronic Dutch books loom.

So it seems that the change had to happen all at once in step 2. But this has serious repercussions: it undercuts probabilistic reasoning if we live in a multiverse with infinitely many near-duplicates. In particular, it shows that any scientific theory that posits such a multiverse is self-defeating, since scientific theories have a probabilistic basis.

I think the main alternative to this conclusion is to think that your probability is still 5/6 after step 5. That could have interesting repercussions for the disagreement literature.

Fun variant: All of the twins are future and past selves of yours (whose memory will be wiped after the experiment is over).

I'm grateful to Hud Hudson for a discussion in the course of which I came to this kind of an example (and some details are his).


Martin Cooke said...

It would be very nice if you had an argument against the atheist invocation of multiverses, but I wonder if the argument would work against the mere possibility of countably many multiverses? If so, then is it not more of an argument against a set of all and only the natural numbers? If so, then there could still be some other sort of infinity of parallel universes (e.g. hash many, where hash times zero equals zero divided by zero).

Regarding when the change had to happen, is the situation not analogous to the following? You buy a lottery ticket. On the day of the lottery you are not allowed to see the draw, but you are instead kidnapped by the secret services and put in a room with someone else, determined by them as follows: he or she is the lottery winner, if you did not win (if you won, then he or she is not the winner, you are). Now, you have the same chance of winning all the way through, of course, so it is very likely that the other person in the room with you won the lottery.

Martin Cooke said...

Woops, the last word of my first line should have been 'universes?'

Incidentally, is this not essentially Levy's paradox? (I described Levy's paradox in the May 2010 issue of The Reasoner.)

Alexander R Pruss said...

1. I don't know that this is Levy's paradox, but it does have a very similar concluding part.

2. In my story, I was assuming a countable multiverse. But if there is an uncountably infinite one, the angels could perhaps just select any countably infinite subset of twins.

3. There are a number of ways of filling out your lottery story. For it to be parallel, your situation and that of the other person need to be symmetric. Here are two ways of doing that:

A. They kidnap two lottery entrants at random. Then they find out that one is a winner and announce it.

B. They have decided to kidnap one lottery loser at random and one lottery winner at random.

In case A, their announcement that one of the two of you is the winner is evidence that you're a winner. In case B, their decision to kidnap you is evidence that you're a winner. In both cases, I expect your probability of being the winner goes unproblematically up to 1/2.
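
As a rough check of cases A and B, here is a small sketch using exact fractions. It assumes a hypothetical lottery with n entrants and exactly one winner chosen uniformly; the function names and the choice of n are illustrative assumptions and not part of the story above.

```python
from fractions import Fraction

def case_a(n):
    """Case A: I (entrant 0) and one other entrant are kidnapped at random,
    then it is announced that one of the two of us is the winner."""
    p_me_and_e = Fraction(0)
    p_e = Fraction(0)
    for winner in range(n):            # winner is uniform over all entrants
        for other in range(1, n):      # my fellow captive is uniform over the rest
            weight = Fraction(1, n) * Fraction(1, n - 1)
            if winner in (0, other):   # evidence: one of us two won
                p_e += weight
                if winner == 0:
                    p_me_and_e += weight
    return p_me_and_e / p_e

def case_b(n):
    """Case B: one loser and one winner are kidnapped at random,
    and my evidence is that I was kidnapped."""
    prior_win = Fraction(1, n)
    like_win = Fraction(1)             # the sole winner is always taken
    like_lose = Fraction(1, n - 1)     # one loser is drawn uniformly
    numerator = prior_win * like_win
    return numerator / (numerator + (1 - prior_win) * like_lose)

print(case_a(1000), case_b(1000))
```

Under these assumptions both functions return 1/2 for any n of at least 2, which is the unproblematic update described for the two symmetric lottery stories.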

But in my scenario, it's hard to see what relevant evidence has been gained.

Anonymous said...

Isn't the problem trying to apply a probability to a single event? That is, saying the odds of rolling a 1 through 5 are 5/6 is really shorthand for saying that over an infinite, or rather, indefinite number of repetitions, the proportion of non-sixes will approach 5/6. But if we expand this scenario across an infinite number of attempts or across infinite possible worlds, it becomes clear that it is not symmetrical: if we count all the infinite times you roll the die, non-six will happen in ~83% of the cases; but only a subset of your twin's rollings are used, namely the ones in which he's opposite (all the cases where he was the same are ignored).

So if we take care to spell out what "probability" we're referring to, there's no paradox: if the scenario is repeated from your point of view, then you have an overall probability of having rolled non-six of 5/6, and your specially-selected twin has a probability of 5/6 of having rolled a six. This is still symmetrical, in that if we are instead considering the (hypothetical) repeated series from his point of view, the probabilities will be the same, just the other way around.

Alexander R Pruss said...

Let's work out the probability at the final stage. Let D1 be my die roll. Let D100 be the die roll of the person I'm paired with. The information E that I've been given is: (D1=6 iff ~(D100=6)).

Now: P(D1=6 and E) = P(D1=6 and ~(D100=6)) = (1/6)(5/6)=5/36.
But P(E) = 10/36 (there are exactly ten combinations of the two dice such that E holds: 1-6, 2-6, 3-6, ..., 5-6, 6-1, 6-2, 6-3, ..., 6-5).
So P(D1=6 | E) = P(D1=6 and E)/P(E) = (5/36)/(10/36) = 1/2.
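
The three lines above can be checked by brute force. The following is a minimal sketch that enumerates the 36 equally likely combinations of two independent fair dice; the function name is an illustrative assumption.

```python
from fractions import Fraction
from itertools import product

def posterior_six_given_opposite():
    """Enumerate (my roll, partner's roll) and condition on E:
    I rolled six if and only if my partner did not."""
    outcomes = list(product(range(1, 7), repeat=2))
    weight = Fraction(1, 36)  # each combination of two fair dice
    consistent = [(a, b) for a, b in outcomes if (a == 6) != (b == 6)]
    p_e = weight * len(consistent)
    p_six_and_e = weight * sum(1 for a, b in consistent if a == 6)
    return p_six_and_e, p_e, p_six_and_e / p_e

p_joint, p_e, p_post = posterior_six_given_opposite()
print(p_joint, p_e, p_post)  # 5/36 5/18 1/2
```

Note that Python prints 10/36 in lowest terms as 5/18.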

I am a bit suspicious of this argument, though. There may be a subtle distinction between two kinds of evidence: evidence where one asked about the value of some random variable, went to the lab, and found the value, versus evidence where it was a matter of chance that one found the value of THAT random variable. I am reminded of Sleeping Beauty here somehow.

Alexander R Pruss said...

I think I may have it! Let's suppose a concrete procedure for how the angels pair people up. Here's the simplest one I can think of. Start with an ordering of people. They then start at the top of the list. They take the first person, and then they go down the list to find the next person down who has the opposite result (where by "result" I just mean whether the person got six). They pair these two, remove them from the list, and continue.
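
Here is a rough sketch of that pairing procedure. It has to work on a finite list, which is an assumption the story does not make: with infinitely many sixes and non-sixes everyone eventually gets a partner, whereas a finite list can leave some people unpaired. Positions are 0-based, and the function name is illustrative.

```python
def pair_opposites(rolled_six):
    """Greedy pairing over a finite list of booleans (True = rolled six).

    Repeatedly take the first unpaired person, scan down the list for the
    next unpaired person with the opposite result, pair them, and remove both.
    Returns (pairs, leftover), both using 0-based list positions.
    """
    unpaired = list(range(len(rolled_six)))
    pairs = []
    while unpaired:
        first = unpaired[0]
        partner = next(
            (j for j in unpaired[1:] if rolled_six[j] != rolled_six[first]),
            None,
        )
        if partner is None:
            # Everyone left has the same result, so no further pairs exist.
            break
        pairs.append((first, partner))
        unpaired.remove(first)
        unpaired.remove(partner)
    return pairs, unpaired

pairs, leftover = pair_opposites([False, False, True, False, True, False])
print(pairs, leftover)  # [(0, 2), (1, 4)] [3, 5]
```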

OK, now suppose that they followed this procedure, and furthermore suppose that every twin is given a badge with his number. Given your and your paired twin's badge number, you can both use Bayes' Theorem to work out the respective probabilities of each having rolled six. And these probabilities will be mutually consistent and need not be symmetric.

For example, suppose I'm lucky enough to bear badge #1, and I find myself paired with someone bearing badge #10. What did I learn? Well, I learned that people with badges 1-9 have the same result, while the person with badge 10 has a different result. Let E be this evidence and let Dn be what person #n's die shows. Given my own result, E fixes the results of the eight people with badges 2-9 and of the person with badge 10. Then: P(E | D1=6) = (1/6)^8 (5/6). P(E | ~(D1=6)) = (5/6)^8 (1/6). Plug these two into Bayes' theorem with the priors 1/6 and 5/6 and you will find: P(D1=6 | E) = 1/(1+5^8), or about 0.0000026, and P(D10 = 6 | E) is about 0.9999974.
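
The badge example can be checked with exact arithmetic. This is a sketch under one stated reading of E: given badge #1's result, the eight twins with badges 2 through 9 match it and the twin with badge #10 differs. The function name and its parameter are illustrative assumptions.

```python
from fractions import Fraction

def posterior_first_rolled_six(partner_badge):
    """P(badge #1 rolled six | badge #1 is paired with partner_badge),
    assuming the first-in-list pairing procedure and independent fair dice."""
    same = partner_badge - 2            # twins strictly between #1 and the partner
    p_six = Fraction(1, 6)
    p_not = 1 - p_six
    like_six = p_six**same * p_not      # the others rolled six, the partner did not
    like_not = p_not**same * p_six      # the others did not, the partner rolled six
    numerator = p_six * like_six        # prior times likelihood
    return numerator / (numerator + p_not * like_not)

post = posterior_first_rolled_six(10)
print(post, float(post), float(1 - post))
```

Under that reading the posterior is 1/(1 + 5^8) = 1/390626, roughly 0.0000026, and for a partner bearing badge #2 the function returns 1/2, in line with the earlier two-dice calculation.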

In other words, when you know the procedure by which the angels picked the pairs, you've got perfectly well-defined and unparadoxical results. As soon as you learn whom you're paired with, you do a normal Bayesian update and all is well.

But in the story as I originally gave it, you don't know the procedure and there are no helpful badges. Let's suppose the angels give you some information that's symmetric between everybody involved, such as that they are following the above procedure, based on some ordering. To keep symmetry, however, they don't disclose the ordering.

More later...

Alexander R Pruss said...

... To resume.

But now you're a little clever. You think to yourself: Suppose I learned they gave me badge #N and my twin badge #M. Then I could calculate the probability p(N,M) that I rolled a six, and of course I would assign probability 1-p(N,M) to my twin's having rolled a six. Moreover, 1-p(N,M) = p(M,N) by the symmetry of the situation (or by a complex probability calculation).

Suppose, now, I learned that they gave me and my twin badges #N and #M, but didn't tell me who got which badge. Then it's just as likely that they'd give me #N and him #M as the other way around. So, in this case, my probability of having rolled six is: (1/2)p(N,M) + (1/2)p(M,N) = (1/2)p(N,M) + (1/2)(1-p(N,M)) = 1/2.

So, while I don't know which badge numbers they gave me and my twin, no matter which badge numbers I found them to have given me and my twin, as long as I didn't know which was which, I would assign 1/2 to my having rolled six. So I should assign 1/2 to my having rolled six.

But this clever line of thought fails. It fails because there is no good way of probabilistically modeling the implicit infinite fair lottery involved in the badge number assignments.

So, here's a way to put a criticism of the argument I made two comments up that P(D1=6 | E) = 1/2. While the calculation was correct, E isn't my total relevant evidence. I also have the additional information that I am paired with precisely this twin across from me. This additional information, however, cannot be integrated into the probabilistic framework, since it involves an implicit infinite fair lottery.

It seems that I now have three choices. The first is to say that this additional information infects my credence for D1=6, and I can no longer conclude either that it's 1/6 or that it's 1/2, but I must simply say it's got no probability.

The second is to say that whatever information we can't handle within the probability calculus should simply be ignored when there is information that we can handle in the vicinity. If we do this, then we stick to the calculation I did above, and assign 1/2 as the probability at the end. (I think this is unacceptable, if only because I can come up with variants of my original case where the natural probability will be other than 1/2, say because instead of being paired with one twin who has a different result, I get paired with two twins who have different results from me, which will lead to 1/3.)

The third option is to defend sticking to 1/6 as the probability of having rolled six, on the grounds that how I got to E involved a biased sampling process--I didn't just run across another twin and learn that he had the opposite result from mine (if I did, then 1/2 would be the right probability). Rather, I was shown a carefully chosen twin. The way to handle biased sampling in a Bayesian system is by simply including in one's evidence all the information one has about how the sampling works. But in this case, the information on how the sampling works is information that cannot be handled in a probabilistic framework, because probabilistic frameworks cannot handle infinite fair lotteries.

But where one has biased data and cannot correct for the bias, then one needs to drop the data. And so one assigns 1/6, as does the other person in the room, and we have an argument for a controversial thesis in the disagreement literature: that two people can share all the evidence, evaluate the evidence by the same rational principles, and come to incompatible conclusions.

I am not that happy with this way out.

Mark said...

Define a p/q-cohort (for 0 < p < q) as a group of q epistemically symmetric individuals, of whom exactly p have rolled a six in the experiment. The intuition that the probability I rolled a 6 is .5 comes from the fact that the set of twins can be partitioned into a set of 1/2-cohorts. But that is not my total evidence; my total evidence is that the set of twins can be partitioned into a set of p/q-cohorts for any 0 < p < q. And it's not clear to me why this much more extensive piece of evidence should shift my prior.

Jeremy Gwiazda said...

Alex, at one point you write:

“So it seems that the change had to happen all at once in step 2. But this has serious repercussions: it undercuts probabilistic reasoning if we live in a multiverse with infinitely many near-duplicates.”

There is a simple reply: you’re not talking about the correct conception of infinitely many. Here are further details ;)

Alexander R Pruss said...

There is an interesting discussion of the argument here.

Alexander R Pruss said...


Your approach is similar to the following approach. Deny the Axiom of Choice, including the Axiom of Countable Choice. Deny that it's possible to have an actual countable infinity. All that's possible is an actual uncountable infinity, maybe of the sort you talk about. (Note that the set {1,...,N}, where N is an infinite number, will be uncountable in extensions of the reals that have hyperintegers. And it's sets like that that you're talking about, I think.)

And I can't run my argument in an uncountable setting where there is no countable subset.

Jeremy Gwiazda said...


I definitely deny that there is any sort of actual determined countable infinity. I haven’t thought much about the axiom of choice. My concern here is that if, e.g., the natural numbers are properly understood (as a potential infinity, not actual/present-all-at-once), then couldn’t there be an Axiom of Countable Choice that even I might support? At least, it's not clear to me that there could not be.

Also – I look forward to working through the details of your argument that any such number is uncountable.