Alexander Pruss's Blog

Wednesday, July 31, 2013

More fun with conditional probabilities

Let X and Y be independent random variables uniformly distributed over [0,1), and suppose our setup is symmetric between X and Y (i.e., any probabilities, conditional or not, are symmetric under interchange of X and Y). Let Z be the point in the plane with polar coordinates (r,θ)=(√X,2πY). It is easy to see that Z is uniformly distributed over the unit disc D (not including the boundary).
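
The uniformity claim can be checked by simulation. The following is only a rough sketch: the function names, the sample size and the two test squares are illustrative assumptions, not part of the argument. It draws Z from the polar construction above and compares how often Z lands in two equal-sized squares at different places in the disc.

```python
import math
import random

def sample_z(rng):
    """Draw Z with polar coordinates (sqrt(X), 2*pi*Y), X and Y uniform on [0,1)."""
    x, y = rng.random(), rng.random()
    r, theta = math.sqrt(x), 2 * math.pi * y
    return r * math.cos(theta), r * math.sin(theta)

def fraction_in_square(n, half_side, cx, cy, seed=0):
    """Fraction of n samples landing in an axis-aligned square centred at (cx, cy)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n):
        zx, zy = sample_z(rng)
        if abs(zx - cx) <= half_side and abs(zy - cy) <= half_side:
            hits += 1
    return hits / n

# A square of side 0.2 has area 0.04, so uniformity over the disc (area pi)
# predicts a hit frequency of about 0.04 / pi wherever the square sits inside D.
expected = 0.04 / math.pi
near_centre = fraction_in_square(200_000, 0.1, 0.0, 0.0)
off_centre = fraction_in_square(200_000, 0.1, 0.6, 0.3)
print(expected, near_centre, off_centre)
```

If the square root were dropped from the radius, the samples would bunch up near O and the two frequencies would come apart.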

Let A be the horizontal line segment from the center O of the disc to the right edge.

Question: What is P(Z=O|Z∈A)?

The obvious answer is zero or infinitesimal. After all, Z is uniformly distributed over the disc D, and O is just one of the infinitely many points on A.

But the obvious answer seems to be mistaken. Here's why. We have Z on the line segment A if and only if either X=0 (in which case, no matter what Y is, Z=O) or Y=0. We have Z=O if and only if X=0 (it doesn't matter what Y is). Let E be the event that X=0 or Y=0, i.e., that Z∈A. Then P(Z=O|Z∈A)=P(X=0|E). But surely P(X=0|E)=P(Y=0|E) by symmetry. So 1=P(X=0 or Y=0|E)≤P(X=0|E)+P(Y=0|E)≤2P(X=0|E), and so P(X=0|E)≥1/2. Thus, P(Z=O|Z∈A)≥1/2.
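
One way to get a feel for the bound is to thicken the null events slightly. This is an assumption of the sketch below, not something the argument above relies on: replace X=0 and Y=0 by X<ε and Y<ε, compute the ordinary conditional probability exactly, and let ε shrink. The function name is hypothetical.

```python
from fractions import Fraction

def p_x_small_given_either(eps):
    """P(X < eps | X < eps or Y < eps) for independent X, Y uniform on [0,1)."""
    p_x = eps
    p_either = eps + eps - eps * eps  # inclusion-exclusion, using independence
    return p_x / p_either

# Exact values for eps = 1/10, 1/100, ..., 1/100000.
values = [p_x_small_given_either(Fraction(1, 10**k)) for k in range(1, 6)]
for v in values:
    print(v, float(v))
```

Under this particular thickening the conditional probability is 1/(2−ε), which stays above 1/2 and approaches it as ε shrinks. A different thickening, say a thin planar neighborhood of A, would weight the points of A differently, so the sketch illustrates the symmetry argument without settling which answer is right.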

Many of these posts on conditional probabilities, infinitesimals and uniform distribution should be going into a paper which may be entitled "In search of true uniformity."

The simple religious perception argument against naturalism

Every natural perceptual faculty we have sometimes functions veridically.
We have a natural faculty of religious perception.
Religious perception is always perception as of something supernatural.
If a perception is as of an F, and the perception is veridical, then there is an F.
Therefore, there is something supernatural.

One can always also try a probabilistic version of the argument: it is very unlikely that a faculty should never function veridically, so probably there is something supernatural.

Tuesday, July 30, 2013

Uniform probabilities and Borel paradox

Question 1: I uniformly choose a random number X from the interval [0,1] (all points from 0 to 1, inclusive). What is the conditional probability that X is 1/9, given that X is either 1/9 or 4/9? I.e., if I receive the information that either 1/9 or 4/9 was picked, how confident should I be that 1/9 was picked?

Answer? Surely, the right answer is: 1/2. Both points are equally likely.

Question 2: I shoot a dart at a circular target of radius 1, with uniform distribution over the target, and I measure the distance R between where the dart hits and the center of the target. What is the conditional probability that R is 1/3, given that R is 1/3 or 2/3?

Answer? We expect this to be less than one half, because the circle of radius 2/3 is bigger. More precisely, the circle of radius R has circumference 2πR. The conditional probability of being on some circle, given that one is on one of two circles, is presumably proportional to the circumference of the relevant circle. Thus: P(R=1/3|R=1/3 or R=2/3)=2π(1/3)/(2π(1/3)+2π(2/3))=1/3.

But now let Y=R². Observe that Y is uniformly distributed over the interval [0,1]. Here's why. Y is in the interval [a,b] (where b≥a) provided that R is in the interval [√a,√b]. But the probability of R being in that interval is equal to the probability that the dart lands between √a and √b units away from the center of the target. The region where this happens has area π(√b)²−π(√a)². The total area of the circle is π(1)². So the fraction of the area of the circle where Y is in [a,b] is equal to (π(√b)²−π(√a)²)/π=b−a. Thus, the probability that Y is in [a,b] is equal to b−a, which is exactly what we have in the case of a uniform distribution.[note 1]
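
The area computation can also be checked empirically. Here is a small sketch, with the sampling method (rejection sampling from the enclosing square), the sample size and the test interval all chosen just for illustration:

```python
import random

def sample_y(rng):
    """Throw a dart uniformly at the unit disc (rejection sampling) and return Y = R^2."""
    while True:
        u, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = u * u + v * v  # squared distance from the center
        if y <= 1:         # keep only darts that actually hit the target
            return y

def fraction_between(a, b, n=200_000, seed=0):
    """Empirical frequency of a <= Y <= b over n darts."""
    rng = random.Random(seed)
    return sum(a <= sample_y(rng) <= b for _ in range(n)) / n

estimate = fraction_between(0.2, 0.5)  # should be close to 0.5 - 0.2 = 0.3
print(estimate)
```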

Let's now go back to Question 1. The only thing I stipulated was that X is uniformly distributed over [0,1]. Well, we've seen that Y is uniformly distributed over [0,1]. So, what we said about X's conditional probability should hold for Y. Thus, the conditional probability of Y being 1/9 given that it's 1/9 or 4/9 should be 1/2. But let's see: P(Y=1/9|Y=1/9 or Y=4/9)=P(R=√(1/9)|R=√(1/9) or R=√(4/9))=P(R=1/3|R=1/3 or R=2/3), by definition of Y. But we've already worked out the latter conditional probability as the answer to Question 2: it's 1/3.

(This is of course a version of the Borel paradox.)
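
The clash between 1/2 and 1/3 can be reproduced with ordinary, non-null conditioning. The sketch below assumes, purely for illustration, that "R is 1/3" is read as "R lies in a band of width ε around 1/3" and that "Y is 1/9" is read as "Y lies in a band of width ε around 1/9". All function names are hypothetical.

```python
from fractions import Fraction

def prob_r_band(r, eps):
    """P(r - eps/2 <= R <= r + eps/2) for a dart uniform on the unit disc."""
    lo, hi = r - eps / 2, r + eps / 2
    return hi**2 - lo**2  # annulus area divided by the disc area pi

def prob_y_band(y, eps):
    """P(y - eps/2 <= Y <= y + eps/2) for Y uniform on [0,1], band inside [0,1]."""
    return eps  # just the length of the band

def cond(p_first, p_second):
    """P(first | first or second) for two disjoint events."""
    return p_first / (p_first + p_second)

eps = Fraction(1, 1000)
by_radius = cond(prob_r_band(Fraction(1, 3), eps), prob_r_band(Fraction(2, 3), eps))
by_square = cond(prob_y_band(Fraction(1, 9), eps), prob_y_band(Fraction(4, 9), eps))
print(by_radius, by_square)
```

Equal-width bands in R and equal-width bands in Y pick out different regions of the target, so the two conditional probabilities differ for every ε even though both pairs of bands shrink to the same two circles.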

So what is going on?

Well, we're conditioning on events of zero probability. That's fishy. One thing we could learn from this story is that saying that some measurement is uniformly distributed in the sense in which that's normally understood does not convey all the relevant information about that measurement. For to compute conditional probabilities on null sets, one needs more information on how the "uniform" measurement was generated. For if it was generated as the square of the distance from the center of our target of unit radius, the conditional probabilities will be different than if it is generated in a more truly uniform manner.

It is tempting to say that true uniformity of a number in [0,1] requires that the Popper function associated with the process be invariant under isometries, in this case translations. That's fine in one dimension, but in three dimensions such strong isometric invariance cannot hold.

A different move is simply to admit that the notion of conditional probability just doesn't make sense when we're conditioning on sets of measure zero. Sure, we can sometimes talk about what credences it would be rational to give in some condition, where that condition has null probability. But that is a matter of rationality, not of formal probability theory.

I've previously made this point with infinitesimals.

Sunday, July 28, 2013

Rationality, value and presentism

If presentism is true, future events are not real.
It is not rational to trade a real good event for a non-real event.
It is rational to trade a small present good event for a great future good event.
Present good events are real.
So, presentism is not true.

I wonder if a rejection of presentism wasn't implicit in the traditional Catholic condemnation of usury. For if presentism is true, then present cash sure seems worth more than future cash, since the latter doesn't exist. But the argument that usury may be charged because present cash is worth more than future cash has been explicitly condemned by Pope Innocent XI (1679). (Parenthetically, this leads to the question of whether, and if so why, lending at interest is permitted in our day. I think what has happened is that the nature of what is denoted by the word "money" has changed, from being something largely constituted by the value of concrete stuff, like some metals, to being entirely a matter of shifting and always future-directed social agreement. Thus, the word "money" means something different in the medieval texts and in contemporary usage. Of course 1679 isn't medieval, but the switchover was a temporally extended event with vague boundaries.)

Thursday, July 25, 2013

YouTube talk on sexual ethics

Franciscan University of Steubenville has posted my talk on sexual ethics on YouTube. The talk gives some of the central ideas of my One Body book.

Wednesday, July 24, 2013

Culpability and reasons

Suppose I have (both objectively and subjectively) a morally decisive reason R to refrain from doing A. But nonetheless I do A for some reason S. This reason S is a bad reason. Notice that how poor a reason S is tends to contribute to my culpability. ("What profit it a man to gain the whole world at the cost of his own soul? But Wales!?") Moreover, S's being less deeply entrenched in me makes me more culpable. I don't have even the excuse of habit. On the other hand, the more deeply entrenched R is in me, the worse I am for neglecting R. This suggests that Hume, in insisting that what is crucial for culpability is that a wrong action flow from and reflect one's character, gets the matter reversed. It is the reasons against the action that make for culpability. (This is perhaps most clear in cases of wrongful omissions.)

Monday, July 22, 2013

Fine-tuning and best-systems accounts of laws

According to best-systems accounts of laws, the laws are the theorems of the best system correctly describing our world. The best system, roughly, is one that optimizes for informativeness (telling us as much as possible about our world) and brevity of expression.

Now, suppose that there is some dimensionless constant α, say the fine-structure constant, which needs to be in some narrowish range to have a universe looking like ours in terms of whether stars form, etc. Simplify to suppose that there is only one such constant (in our world, there are probably more). Suppose also, as might well be the case, that this constant is a typical real number in that it is not capable of a finite description (in the way that e, π, 1, −8489/91907^(4/7) are): to express it, one needs something like an infinite decimal expansion. The best system will then not contain a statement of the exact value for α. An exact value would require an infinitely long statement, and that would destroy the brevity of the best system. But specifying no value at all would militate against informativeness. By specifying a value to sufficient precision to ensure fine-tuning, the best system thereby also specifies that there are stars, etc.

Suppose the correct value of α is 0.0029735.... That's too much precision to include in the best system, since it goes against brevity. But including in the best system that 0.0029<α<0.0030 might be very informative: suppose, for instance, that it implies fine-tuning for stars.

But then on the best-systems account of laws, it would be required by law that the first four digits of α after the decimal point be 0029, but there would be no law for the further digits. But surely that is wrong. Surely either all the digits of α are law-required or none of them are.

Friday, July 19, 2013

Symmetry and Indifference

Suppose we have some situation where either event A or event B occurred, but not both, and the two events are on par: our epistemic situation is symmetric between them. Surely:

(1) One should not assign a different probability to A than to B.

After all, such a difference in probability would be unsupported by the evidence. It is tempting to conclude that:

(2) One should assign the same probability to A as to B.

From (2), the Principle of Indifference follows: if it's certain that exactly one of A₁,...,A_n happened, and the epistemic situation is symmetric between them all, then by applying (2) to the different pairs, we conclude that they all have equal probability, and since the probabilities must add up to one, it follows that P(A_i)=1/n for all i.

But while (1) is very plausible (notwithstanding subjective Bayesianism), (2) does not follow from (1), and likewise Indifference does not follow. For (1) is compatible with not assigning any probability to either A or B. And sometimes that is just the right thing to do. For instance, in this post, A and D are on par, but the argument of the post shows that no probability can be assigned to either.

In fact, we can generalize (1):

(3) One should treat A probabilistically on par with B.

If one of the two has a probability, the other should have a probability, and the same one. If one of the two has an imprecise probability, the other should have one, and the same one. If one is taken as maximally nonmeasurable, so should the other one be. And even facts about conditional probabilities should be parallel.

Nonetheless, there is a puzzle. It is very intuitive that sometimes Indifference is correct. Sometimes, we correctly go from the fact that A and B are on par to the claim that they have the same probability. Given (1) (or (3)), to make that move, we need the auxiliary premise that at least one of A and B has a probability.

So the puzzle now is: Under what circumstances do we know of an event that it has a probability? (Cf. this post.)

Thursday, July 18, 2013

Justification and subjective Bayesianism

According to subjective Bayesianism, the only constraints on prior probabilities are that they be consistent and, for contingent events, strictly between 0 and 1. But this makes it too easy to be within one's full epistemic rights in believing really silly stuff with no evidence whatsoever, just by having assigned it a high prior.

Perhaps what the subjective Bayesian needs to do is distinguish epistemic permissibility, which one can have without evidence, from epistemic justification, which requires some evidence. I doubt that the subjective Bayesian is going to be able to run a good story here that's consistent with the subjective elements. After all, there are many silly things that we are justified in disbelieving precisely because of their low priors and despite there being evidence for them, such as the law of gravity that says F=Gmm'/r^2+a, where a=10^−1000000, a law that we actually have a lot of evidence for—any evidence we have for Newton's law of gravitation is also evidence for this law, since the two laws are experimentally indistinguishable—but which we rightly disbelieve precisely because of low priors.

The Bayesian needs non-subjective priors.

Wednesday, July 17, 2013

Mereological universalism and Platonism

(1) If mereological universalism holds, then for any predicate F that is satisfied by an object, there is a mereological sum of all the objects that satisfy F.
(2) If Platonism is true, then for any object x there is a property U_x such that (a) x has U_x, (b) nothing else has U_x, and (c) U_x does not have mereological parts.

Call any such property an "identity of x". Condition (c) is probably satisfied by all properties, but I could imagine someone thinking that conjunctive properties are mereological sums of their conjuncts, so (c) would be needed to rule out conjunctive properties (and if V_x is a conjunctive property satisfying (a) and (b), then we can let U_x be the property of having V_x or being a unicorn, and this won't have mereological parts).

Assume Platonism and mereological universalism. For any predicate G(x), say that I_G(P) if and only if P is an identity of an object x such that G(x). For any object S, say that x∈S if and only if an identity of x is a mereological part of S. Then by (1) and (2), for any predicate G that has at least one satisfier there is an object M such that x∈M if and only if G(x). Just let M be any mereological sum of all the objects satisfying I_G (to get that if x∈M, then G(x), note that it follows from (c) that no identity can overlap more than one identity).
But now we have a contradiction. Let G(x) be the Russell predicate not(x∈x). Then G has at least one satisfier: e.g., you satisfy G. So there is an M such that x∈M if and only if G(x). So, M∈M if and only if not(M∈M), a contradiction.

So, our Platonist mereological universalist needs to restrict her universalism to concrete objects or something like that. Lewis's Parts of Classes is no doubt relevant, and what I said above may well overlap with Lewis (I haven't actually read that book--just ordered from Interlibrary Loan).

Tuesday, July 16, 2013

Trees and limits of probabilistic reasoning

Suppose you're one of the nodes of this infinite tree, cut off for the purposes of the diagram, but you have no information whatsoever on which node you are. Region A is exactly like region B. And region B is exactly like region C.

So that you're in D must be at least as likely as that you're in B or C, but that you're in B is just as likely as that you're in A, and ditto for being in C. Hence that you're in D is at least twice as likely as that you're in A. But that you're in D obviously has the same probability as that you're in A.

Thus, P(A)=P(D)≥2P(A). Hence P(A)=0. But of course the whole tree is equal to three copies of A, plus the point 0. So if you can assign probabilities, then you're certain to be at 0. Which is absurd, especially since you can recenter the graph at another point and run the argument again.

Philosophically, this is a nice illustration of the severe limits to probabilistic reasoning. My eight-year-old son is looking at what I'm posting and says: "It's just probabilities and I'm not going to use probabilities on this tree. It's obvious." He wonders why I am posting such obvious things.

Mathematically, all that this displays is that of course there is a paradoxical decomposition of a regular tree, and hence that there is no finitely-additive symmetry-invariant probability measure on a regular tree.

Saturday, July 13, 2013

YouTube talks on Principle of Sufficient Reason and on the measure problem in cosmology

For most of the week I was at the Santa Cruz Philosophy of Cosmology institute. Thursday I gave a talk on the Principle of Sufficient Reason and on the measure problem in cosmology for multiverses. These talks and many talks by smart people are available here. Click on "Program", then scroll down close to the bottom to Thursday 7/11 if you want mine.

Or just click right here for part 1 and part 2 of my talk on YouTube.

Here are the slides to part 1 (cosmological argument) and part 2 (measure problem).

Friday, July 12, 2013

Why prefer simple and elegant theories?

Why should we prefer simple and elegant theories, the empirical evidence being equal? There are two standard answers (besides the answer that no, we shouldn't):

(1) truth: beautifully simple theories are more likely, all other things being equal, to be true
(2) pragmatic: simple theories are easier to work with.

Here's a fun alternative to (2):

(3) aesthetic: just as it's better to hang around beautiful places, it's better to hang around beautiful theories.

This is mainly a tongue-in-cheek suggestion. I definitely think, in part for theistic reasons, that truth is the right answer.

But I think there is a bit of truth to the aesthetic suggestion. When you have multiple equivalent formulations of the same theory, why not spend more time with the more beautiful ones, simply because they are more beautiful? Of course, this could be defeated by more pedestrian pragmatic concerns, because a more beautiful formulation can be harder to actually work with. (Actually, that last point suggests that (2) can't be the whole story about preference for elegant theories.)

Wednesday, July 10, 2013

Quasi-deterministic causation

Say that:

C quasi-deterministically causes E iff necessarily either C causes E or C does not cause anything.

Now, quasi-deterministic causation of actions by reasons is compatible with libertarian views of freedom. Moreover, quasi-deterministic causation provides something like contrastive explanation of the effect. Thus, libertarian views of freedom are compatible with something like contrastive explanation of actions.

Tuesday, July 9, 2013

More on comparing zero probability sets

There are two devices, A and B, each of which generates an independent uniformly distributed number between 0 and 1. You have a choice between two games.

Game 1: You win if B generates the number 1/2.

Game 2: You win if B generates the number generated by A.

Perhaps you say: "I don't care. I have infinitesimal or zero probability of winning." To make you care, suppose the game is free but the payoff is infinite.

Intuitively, you're equally likely to win either game. There is no reason to choose one over the other, given that the two devices are independent. It's no easier or harder to get 1/2 than to get whatever number A generates.

But there is another way of seeing the situation. Graph the state space of the game with the x-coordinate corresponding to A and the y-coordinate corresponding to B. The state space is then the unit square. But on Game 1, the victory region is a horizontal line y=1/2 while on Game 2, it is the diagonal line y=x. But the diagonal line has the square root of two, approximately 1.414, as its length, while the horizontal line has unit length. So you should choose Game 2.

Really? I still think it makes no difference. (And that it makes no difference shows that rotation invariance does not apply to all cases of uniform distribution in the square, since the horizontal line when rotated becomes obviously shorter than the diagonal one.)
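
The two ways of seeing the situation line up with two ways of thickening the winning lines into regions of positive area. The sketch below is only an illustration under that assumption, with hypothetical function names. The first ratio lets B miss its target by the same tolerance t in both games. The second gives both lines a band of the same perpendicular width in the square.

```python
import math

def area_game1(t):
    """Area of {|B - 1/2| < t} in the unit square, for 0 <= t <= 1/2."""
    return 2 * t

def area_game2(t):
    """Area of {|B - A| < t} in the unit square, for 0 <= t <= 1."""
    return 1 - (1 - t) ** 2

def ratio_same_tolerance(t):
    """Game 2 area over Game 1 area when B may miss its target by t in both games."""
    return area_game2(t) / area_game1(t)

def ratio_same_distance(d):
    """Game 2 area over Game 1 area when both bands have perpendicular half-width d."""
    # Points within perpendicular distance d of y = x satisfy |B - A| < d * sqrt(2).
    return area_game2(d * math.sqrt(2)) / area_game1(d)

for t in (1e-1, 1e-2, 1e-3):
    print(t, ratio_same_tolerance(t), ratio_same_distance(t))
```

With equal tolerances the ratio tends to 1, matching the intuition that the games are on a par. With equal perpendicular widths it tends to the square root of two, matching the length comparison. Which thickening, if either, is the right one is exactly what is in dispute.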