Monday, June 17, 2013

Blowguns and their darts

In case this heated discussion has made you desire a blowgun, here are my instructions for blowgun and dart making. :-) Stay safe!

Saturday, June 15, 2013

Spoof arguments against the Axiom of Choice

As a first year graduate student, I wrote this pseudonymous spoof, inspired by the Sokal affair. It's rather immature in places, but enjoy!

Friday, June 14, 2013

The Banach-Tarski Paradox and the Axiom of Choice

The Banach-Tarski theorem (BTT) says that, given the Axiom of Choice, a continuous ball can be decomposed into a finite number of pieces that can be rearranged to form two balls of equal size. That's weird, and is taken by some to be an argument against the Axiom of Choice.

I don't think we should take it as such an argument. Sure, BTT is paradoxical. But when one looks at the proof, one notes that the proof makes use of paradoxical results that do not depend on the Axiom of Choice. For instance, a lemma in standard proofs of BTT is the surprising fact that you can take any circle that's missing a countable number of points, decompose that circle into two disjoint (messy!) pieces, and reassemble the pieces, without overlap, to get a complete circle. That lemma is about as weird as BTT, but it doesn't use the Axiom of Choice at all.[note 1] Moreover, the proof uses paradoxical decompositions of various countable sets, and these are, well, paradoxical, but do not involve the Axiom of Choice. Using the Axiom of Choice lets you put all the paradoxicality together into a neat package, but when I think about the proof of the result, I just don't see Choice as the source of paradoxicality. In fact, once one sees all the other ingredients of the proof of BTT, the Axiom of Choice step seems quite intuitive.

Another way to put the point is this: Once one reflects enough on all the pieces of the proof of BTT that do not use Choice and accepts them, BTT no longer seems very surprising. When you cut things up into strangely scattered pieces, it's not that surprising that you can put them back together in various ways.

Thursday, June 13, 2013

Popper functions and null sets

Let's go back to the problem that I keep on thinking about: How to distinguish possibilities that are classically of null probability. For instance, given a uniform choice of a point on some nice set (say, a ball) in Euclidean space, we want to say something like P({x,y})>P({x}), when x and y are distinct: it's more likely that one would hit one of two points than that one would hit a particular point. A series of my blog posts (and at least one article) showed that infinitesimals aren't the way. What about conditional probabilities?

For various reasons, instead of taking unconditional probabilities to be fundamental and defining conditional probabilities in terms of them, one may want to take conditional probabilities as fundamental. The standard method is to use Popper functions (I'll assume the linked axioms below). One might hope to use Popper functions to do things like make sense of the difference between the probability of two points in a continuous case (say, where a point is uniformly chosen in some nice subset of a Euclidean space) and the probability of a single point. For instance, one might hope that P({x,y}|{x,y})>P({x}|{x,y}) whenever x and y are distinct.

This won't work, however. Instead of working with propositions, I will work with sets—the definitions of Popper functions neatly adapt. Let Ω be a solid three-dimensional ball. Assume that P(A|B) is defined for all A and B in some algebra F of subsets of Ω.

Say that a set A in F is P-trivial provided that P(C|A)=1 for all C (including for the empty set). The empty set is trivial, of course. In order for us to have any hope of saying things like P({x,y}|{x,y})>P({x}|{x,y}), we better have sets with two points be non-trivial. Now, it's not hard to show that a finite union of trivial sets is trivial, so in order for any finite sets to be non-trivial, singletons need to be non-trivial. Moreover, it's easy to see that any subset of a trivial set is trivial.

Moreover, we want rotational symmetry. Say that F and P are rotationally symmetric provided that for any rotation r around the origin and A and B in F, rA and rB are in F, and P(rA|rB)=P(A|B).

Theorem. If P is rotationally symmetric and F includes all countable subsets of Ω, then there is at least one P-trivial countably infinite set and all finite sets are trivial.

If we are to be extending something like classical probabilities, we do want countable subsets of Ω to be in F. The triviality of finite sets follows from the fact that some singleton is trivial (since any subset of a trivial set is trivial) and if one singleton is trivial, then by rotational invariance, they all are, so all we need to prove is the existence of that trivial countably infinite set.

The proof is easily based on standard ideas from the proof of the Banach-Tarski Paradox.

Proof. Let SO(3) be the rotation group around the origin. Famously, there is a subgroup G that is isomorphic to the free group F2 on two generators. Choose a point ω in Ω such that, for ρ∈G, ρ(ω)=ω only if ρ is the identity rotation. (This is a counting argument: G is a countable group, and each non-identity member of G has two fixed points on any given sphere around the center, so for any fixed sphere there will be only countably many fixed points of non-identity members of G on it, and hence there will be a point on the sphere that isn't a fixed point of any non-identity member.) Let H={ρ(ω):ρ∈G}. Then H is a countable subset of Ω.

For a reductio ad absurdum, suppose H is non-trivial. Then P(−|H) is a finitely additive probability measure on H. Moreover, ρH=H for any ρ∈G, so by rotational invariance P(ρK|H)=P(ρK|ρH)=P(K|H) for any ρ∈G, and so P(−|H) is a finitely additive G-invariant measure on H. Using the bijection between G and H given by f(ρ)=ρ(ω) (we use the choice of ω to see that f is one-to-one), we can then get a finitely additive G-left-invariant measure on G. But G is isomorphic to F2 and hence is not amenable and hence has no such invariant measure. (One could also neatly demonstrate a paradoxical decomposition here.) That's a contradiction, so H must be trivial. QED

Even though this argument uses ideas from the proof of Banach-Tarski, and famously the latter uses the Axiom of Choice, this argument does not use the Axiom of Choice.

One can perhaps get out of this by having a more onerous requirement on the sets A and B that are on either side of the bar in "P(A|B)" than that they are fit in a single algebra F. We want any countable set to be an acceptable A, but perhaps we don't want to allow every countable set as an acceptable B. I don't know what natural requirement could be put here.

Wednesday, June 12, 2013

Yet another account of omnipotence

The following account of omnipotence runs into the McEar objection:

  1. x is omnipotent iff x can do anything whose doing is consistent with the nature of x.
For suppose McEar has the essential property of doing nothing other than scratching his ear, and suppose he can scratch his ear. Then (1) counts McEar as omnipotent. That's no good.

The Pearce-Pruss account of omnipotence escapes this. But so does this minor twist on (1):

  2. x is omnipotent iff x can do anything whose doing is consistent with the nature of a perfect being.
There are things consistent with the nature of a perfect being that McEar can't do, say create a pebble.

Perhaps, though, there is a circularity problem. For a perfect being has all perfections. And one of the perfections is omnipotence. However, I do not know that this is fatal. Compare:

  3. a fully self-knowledgeable person is one who knows all her mental attributes.
This seems a perfectly reasonable definition, even if one of the mental attributes of such a person is being fully self-knowledgeable.

Tuesday, June 11, 2013

A funny uniform distribution?

Let X and Y be independent random variables uniformly distributed over [0,1]. Let Z=max(X²,Y²). Then it's easy to check that Z is also uniformly distributed over [0,1].

But now suppose we think that uniform random variables have equal infinitesimal probabilities of hitting every point. Thus, P(X=a)=P(Y=a)=α for every a, where α is some infinitesimal. What, then, is P(Z=a)? Well Z=a if and only if one of three mutually exclusive possibilities occurs:

  1. X<a^(1/2) and Y=a^(1/2)
  2. X=a^(1/2) and Y<a^(1/2)
  3. X=Y=a^(1/2).
Now, P(X<a^(1/2))=a^(1/2)+O(α) (the last term is due to end-point effects: P(X<1)=1−α) and P(Y=a^(1/2))=α. Thus, P((1))=αa^(1/2)+O(α²). By the same token, P((2))=αa^(1/2)+O(α²). And P(X=Y=a^(1/2))=α² by independence. Thus, P(Z=a)=P((1))+P((2))+P((3))=2αa^(1/2)+O(α²).

In other words, Z is a uniformly distributed random variable by standard probabilistic criteria, but the probability of Z hitting different points is different: P(Z=a) is basically an infinitesimal multiple of √a.
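As an illustration, the same effect shows up in a finite stand-in for the infinitesimal setup. The sketch below assumes X and Y are uniform on the grid {0, 1/n, ..., 1}, so that α=1/(n+1) is a small real number rather than an infinitesimal. The function name point_probabilities and the choice n=1000 are hypothetical, picked only for this example.

```python
from fractions import Fraction

def point_probabilities(n):
    """Exact P(Z = a) for Z = max(X^2, Y^2), where X and Y are independent
    and uniform on the grid {0, 1/n, ..., 1} (an assumed finite stand-in
    for the uniform distribution on [0,1])."""
    alpha = Fraction(1, n + 1)       # P(X = a) = P(Y = a) at each grid point
    probs = {}
    for m in range(n + 1):
        a = Fraction(m, n) ** 2      # Z = a exactly when max(X, Y) = m/n
        below = m * alpha            # P(X < m/n) = m / (n + 1)
        # The three mutually exclusive cases: X below, Y below, both equal.
        probs[a] = below * alpha + alpha * below + alpha * alpha
    return alpha, probs

alpha, probs = point_probabilities(1000)
assert sum(probs.values()) == 1

# Distribution check: P(Z <= 1/4) is close to 1/4, as for a uniform variable.
print(float(sum(p for a, p in probs.items() if a <= Fraction(1, 4))))

# Pointwise check: P(Z = a) / alpha tracks 2 * sqrt(a) instead of being constant.
for m in (250, 500, 1000):
    a = Fraction(m, 1000) ** 2
    print(float(a), float(probs[a] / alpha))
```

On the grid, the cumulative probabilities of Z approximate those of a uniform variable while the point probabilities grow in proportion to the square root of a, which is the discrete shadow of the calculation above.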

What is happening here is that if one attempts to attach infinitesimal probabilities to the individual outcomes of bona fide classical probabilities, the infinitesimal individual outcome probabilities float free from the distribution. You can have the same individual outcome probabilities and different distributions or, as in this post, different (nonuniform) individual outcome probabilities and the same (uniform!) distribution.

Thursday, June 6, 2013

Uniform distributions

Consider two random variables, X and Y, whose probability densities pX and pY are shown in the following graph, with pX(x)=1, in blue, and pY(x)=2x, in red.


Looking at the graph, it is tempting to say things like this: X is a uniform distribution and has equal probability of having any value between 0 and 1, while values closer to 0 are much less likely than values close to 1 for Y. We might even look at the graph and say things like: P(X=0.1)=P(X=0.2) while P(Y=0.2)>P(Y=0.1).

Of course, with these continuous distributions, classical probability theory assigns equal zero probability to every value: P(X=a)=P(Y=a)=0 for all a. But this seems wrong, and so we may want to bring in infinitesimals to remedy this, assigning to P(Y=0.2) an infinitesimal twice as big as the one we assign to P(Y=0.1), while P(X=0.2)=P(X=0.1).

Or we might attempt to express the pointwise non-uniformity of Y by using conditional probability P(Y=0.2|Y=0.1 or Y=0.2)=2/3 and P(Y=0.1|Y=0.1 or Y=0.2)=1/3, while P(X=0.2|X=0.1 or X=0.2)=1/2=P(X=0.1|X=0.1 or X=0.2).

In other words, it is tempting to say: X is pointwise uniform while Y is not.

Such pointwise thinking is problematic, however. For I could have generated Y by taking our uniformly distributed random variable X and setting Y=X^(1/2). (It's an easy exercise to see that if X is uniform then the probability density of X^(1/2) is given by p(x)=2x.) Suppose that I am right in what I said about the uniformity of pointwise and conditional probabilities for X. Then P(Y=0.1)=P(X=0.01)=P(X=0.04)=P(Y=0.2). And P(Y=0.2|Y=0.1 or Y=0.2)=P(X=0.04|X=0.01 or X=0.04)=1/2=P(X=0.01|X=0.01 or X=0.04)=P(Y=0.1|Y=0.1 or Y=0.2), since Y=0.1 if and only if X=0.01 and Y=0.2 if and only if X=0.04.

So in fact, Y could have the nonuniform distribution of the red line in the graph and yet be just as pointwise uniform as X.
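The exercise mentioned above, that the square root of a uniform variable has density 2x, can be checked numerically. The following is a minimal simulation sketch, not a proof. The sample size, the seed and the helper name interval_frequency are assumptions made for illustration.

```python
import random

def interval_frequency(samples, lo, hi):
    """Fraction of samples falling in the half-open interval [lo, hi)."""
    return sum(lo <= s < hi for s in samples) / len(samples)

rng = random.Random(0)                        # fixed seed, only for repeatability
xs = [rng.random() for _ in range(200_000)]   # X uniform on [0, 1)
ys = [x ** 0.5 for x in xs]                   # Y is the square root of X

# For density 2x, P(lo <= Y < hi) equals hi^2 - lo^2.
for lo, hi in [(0.1, 0.2), (0.5, 0.6), (0.8, 0.9)]:
    print(lo, hi, interval_frequency(ys, lo, hi), hi ** 2 - lo ** 2)
```

Each sampled value of Y is paired with exactly one value of X, so any pointwise story told about X carries over to Y even though the interval frequencies differ.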

Lesson 1: It is a mistake to describe a uniform distribution on a continuous set as one "where every outcome is equally likely". For even if one finds a way of making nontrivial sense of this, by infinitesimals or conditional probabilities say (and I think similar arguments will work for any other plausible characterization), a nonuniform distribution can satisfy this constraint just as happily.

Lesson 2: One cannot characterize continuous distributions by facts about pointwise probabilities. It is tempting to characterize the uniform distribution by P(X=a)=P(X=b) (infinitesimal version, but similarly for conditional probabilities) and the nonuniform one by P(Y=a)=(a/b)P(Y=b). But in fact both could have the same pointwise properties. I find this lesson deeply puzzling. Intuitively, it seems that chances of aggregate outcomes (like the chance that X is between 0.1 and 0.2) should come out of pointwise chances. But no.

The converse characterization would also be problematic: pointwise facts can't be derived from the distribution facts. For imagine a random variable Z which is such that Z=X unless X=1/2, and Z=1/4 if X=1/2 (cf. this paper). This variable has the same distribution as X, but it has obviously different pointwise probability facts.

Wednesday, June 5, 2013

Simple and full induction

A followup on the previous post.

Simple induction: F₁ is G, F₂ is G, ..., Fₙ is G, so probably: Fₙ₊₁ is G.

Full induction: F₁ is G, F₂ is G, ..., Fₙ is G, so probably: Fₖ is G for all k.

Intuitively, simple induction always seems to be a better inference than full induction. Indeed, in cases where there are rare exceptions that didn't occur for Fₖ where k≤n, simple induction typically gives the right answer but full induction gives the wrong answer. Moreover, the conclusion of the full induction is logically stronger (modulo the existence of Fₙ₊₁), so it seems clear that simple induction is the better inference.

But no! Let's say that I, Jon and Trent (and a number of others!) entered a raffle held for a charity where there is only one prize. That Jon and Trent didn't win is some weak evidence that nobody won the raffle—namely, that the charity raffle was crooked. So we do have some evidence for the full inductive conclusion. But that Jon and Trent didn't win is also some evidence that I won. This is true even if we admit the possibility that nobody won, as long as we insist that it is certain that there is only one prize, and hence at most one person won. For P(Jon and Trent didn't win | I won) = 1, but P(Jon and Trent didn't win | I didn't win) < 1, and so that they didn't win supports that I won.
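The raffle case can be made concrete with exact arithmetic. This is a sketch under an assumed model: with some prior probability the raffle is crooked and nobody wins, and otherwise the single prize goes to a uniformly chosen entrant. The 100 entrants and the 1/20 prior are hypothetical numbers, as is the function name raffle_posteriors.

```python
from fractions import Fraction

def raffle_posteriors(entrants, p_crooked):
    """Prior and posterior probabilities of two claims, given the evidence
    that two named entrants (Jon and Trent) did not win.

    Assumed model: with probability p_crooked nobody wins; otherwise the
    one prize goes to an entrant chosen uniformly at random."""
    p_i_win = (1 - p_crooked) / entrants
    # Evidence E: Jon and Trent did not win.
    p_e = p_crooked + (1 - p_crooked) * Fraction(entrants - 2, entrants)
    return {
        "nobody won": (p_crooked, p_crooked / p_e),
        "I did not win": (1 - p_i_win, 1 - p_i_win / p_e),
    }

for claim, (prior, posterior) in raffle_posteriors(100, Fraction(1, 20)).items():
    print(f"{claim}: prior {float(prior):.5f}, posterior {float(posterior):.5f}")
```

Under these assumptions the evidence raises the probability of the universal claim (nobody won) while lowering the probability of the next-case claim (I did not win).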

On Bayesian grounds, if the existence of all the Fₖ is in the background, that F₁ is G, F₂ is G, ..., Fₙ is G will never be evidence against the claim that all the Fₖ are G, and in contingent regular cases will be evidence for the universal claim. But it could well be evidence against the claim that Fₙ₊₁ is G.

Lightbulbs and induction

My colleague Trent Dougherty brought to me the very interesting question of how we inductively confirm that the sun will rise tomorrow given background knowledge that the sun one day won't rise.

This makes me think of an oddity. If I know that a lightbulb worked yesterday, that gives me reason to think it will work today. But if I know that it worked for the hundred preceding days, that gives me less reason to think it will work today, because it also gives me evidence that a burnout is due.

So given appropriate background knowledge—in this case, that lightbulbs burn out—more inductive cases do not necessarily raise the probability of the outcome, but can even lower it.
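A toy model shows how more successes can lower the probability of the next one. The sketch assumes, purely for illustration, that the bulb's last working day is uniformly distributed over days 1 to 200. Neither that distribution nor the name p_works_next_day comes from the discussion above.

```python
from fractions import Fraction

def p_works_next_day(days_worked, max_life=200):
    """P(works on day days_worked + 1 | worked on days 1..days_worked).

    Assumed model (hypothetical): the last working day T is uniform on
    {1, ..., max_life}, so the bulb works on day d exactly when T >= d."""
    if not 1 <= days_worked < max_life:
        raise ValueError("days_worked must be between 1 and max_life - 1")
    survived = Fraction(max_life - days_worked + 1, max_life)    # P(T >= days_worked)
    survives_next = Fraction(max_life - days_worked, max_life)   # P(T >= days_worked + 1)
    return survives_next / survived

for days in (1, 100, 199):
    print(days, p_works_next_day(days))
```

With this lifetime model the conditional probability of one more working day falls as the run of working days gets longer, because each success uses up part of a bounded lifetime.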

Burnout cases aren't the only ones like this. If I bought a lottery ticket, the more people I learn did not win, the more likely it is that I won.

Nothing greatly exciting here, except that we need to be careful to avoid flatfooted statements of how induction works.

A probabilistic argument against panexperientialism

Let panexperientialism be the view that all fundamental particles have fundamental experiential or protoexperiential properties.

There is good reason to doubt this. Fundamental particles differ as to whether they have fundamental properties like mass, charge and spin. Thus, we should expect them to differ as to whether they have experiential or protoexperiential properties, and hence we should not expect all fundamental particles to have such properties.

A variant argument. For any subset S of types of fundamental particles, there is S-experientialism, which holds that all and only the fundamental particles from S have the fundamental experiential or protoexperiential properties. Panexperientialism then is S-experientialism where S contains all fundamental particle types. But there are many values of S for which S-experientialism explains our consciousness as well (or as badly) as panexperientialism—for instance, S might be all fermions, or all leptons. So what reason do we have to think that of all these, panexperientialism is true? Well, we might think it's the simplest version. Yes, but the simplicity argument is defeated by the inductive considerations of the previous paragraph.

Tuesday, June 4, 2013

Two thoughts about the surprise exam paradox

The standard version of the surprise exam paradox is that the teacher announces that next week there will be a surprise exam: an exam whose occurrence will surprise the students. The students are smart and reason: it can't be on Friday, since if Monday through Thursday pass without an exam, they'd know that it would be on Friday and wouldn't be surprised. But likewise it can't be on Thursday, since they know it can't be on Friday, and so if Monday through Wednesday pass without an exam, they'll know by Thursday that it's on Thursday, and won't be surprised. Repeating this reasoning, the exam has to be on Monday, but then that won't be a surprise either. So a surprise exam is impossible, which is paradoxical. Besides, then, despite all the reasoning, the exam happens on, say, Tuesday and the students really are surprised.

The above version is over a span of 5 days. Generalize by supposing a span of n days for the surprise exam. I really don't know the surprise exam literature, so there may be nothing new here.

First thought: The n=1 case is already a bit paradoxical. The teacher announces: "We will have a surprise exam on Monday." Students are puzzled. Since they know that it will be on Monday, how can it be a surprise to them when it happens on Monday? Should they conclude that the teacher has just told them something obviously false? But if so, then they don't know that there will be an exam on Monday. And then when Monday rolls around, they are quite open to being surprised by the occurrence of the exam. So maybe what the teacher told them isn't obviously false. So charity suggests that they believe the teacher—there really will be a surprise exam on Monday. But if so, then once again they won't be surprised, and so the teacher is telling them something false, and so they should dismiss it. And so on: They keep on thinking this through, and Monday rolls around, and they fail the exam because instead of studying for it, they were thinking about whether there would be an exam! So the n=1 case is paradoxical. It is interesting to ask: Does the n=1 case contain all of the paradoxicality of the n=5 case?

Second thought. Suppose we have some probability cutoffs that define assertibility and surprise: something is assertible provided it has probability at least α and it's surprising provided it had probability less than β. Maybe the values are α=0.9 and β=0.1. I'll do the examples with those. Suppose now that the teacher genuinely will set up an exam at a random date, with some probability distribution on the days 1,2,...,n. First, suppose the distribution is known to the students to be uniform. If the exam is on day n, there is no surprise. But that isn't enough to undercut the assertibility of the teacher's statement. For the probability that the exam would end up on day n is only 1/n, and as long as 1−α≥1/n, the teacher might not be taking an undue risk. But we do get a constraint here: n≥1/(1−α). With our sample numbers, this means n≥1/(1−0.9)=10. So we don't have assertibility in the original version where n=5.

Let's keep ploughing through and see what other constraints there are. On the 1/β (rounded down) last days, the probability each day that there would be an exam on that day, if we get to that day examless, is greater than or equal to β, and so there is no surprise if the exam is then. So for assertibility, we better have approximately (I am ignoring rounding) (1/β)/n≤1−α, or n≥1/(β(1−α)). And if we have that approximately, then the teacher has assertibility in saying "There will be a surprise exam over the next n class days." With our sample numbers, the constraint is n≥100. So on our probabilistic understanding of surprise and assertibility (and rounding issues don't come up since in our case 1/β=10 exactly), you can honestly announce a surprise exam if there are at least 100 days that the exam might be on, and then just choose a day uniformly randomly, and even tell the students that's how you're choosing.
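The two constraints can be checked by brute force for the uniform case. The sketch below uses exact fractions and the sample cutoffs α=0.9 and β=0.1. The function names are hypothetical, and the search bound of 1000 days is an arbitrary assumption.

```python
from fractions import Fraction

def unsurprising_days(n, beta):
    """Days d (1-based) on which the exam would not be a surprise: if days
    1..d-1 pass examless, a uniform draw puts the exam on day d with
    probability 1/(n-d+1), and the exam is unsurprising when that is >= beta."""
    return [d for d in range(1, n + 1) if Fraction(1, n - d + 1) >= beta]

def announcement_assertible(n, alpha, beta):
    """True if the exam lands on a surprising day with probability >= alpha."""
    p_surprise = 1 - Fraction(len(unsurprising_days(n, beta)), n)
    return p_surprise >= alpha

alpha, beta = Fraction(9, 10), Fraction(1, 10)
smallest_n = next(n for n in range(1, 1000) if announcement_assertible(n, alpha, beta))
print(smallest_n)  # 100
```

With these cutoffs the last ten days are the unsurprising ones, and the announcement first becomes assertible at n=100, matching n≥1/(β(1−α)).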

It would be fun to see if other distributions than the uniform one might not allow one to bring down the n≥100 constraint.

Charity filter

  1. Do not attribute to malice, selfishness or incompetence what you can attribute to a reasonable but mistaken judgment.
  2. Do not attribute to malice or selfishness what you can attribute to incompetence.
  3. Do not attribute to malice what you can attribute to selfishness.

Sunday, June 2, 2013

Salmon's argument against S4

Start with:

  1. If x originates from chunk α of matter and β is a non-overlapping chunk of matter, then x couldn't have originated from β.
  2. If x originates from chunk α of matter and α' is a chunk of matter that almost completely overlaps α, then x could have originated from α'.
Iterating (2), and assuming a finite sequence of almost completely overlapping chunks between α and β, we conclude that an object x that originates from α possibly possibly ... possibly (with a finite number of possiblys) originates from β. By S4, we conclude that x could have originated from β, contrary to (1). Nathan Salmon uses this as an argument against S4.

But this is a mistaken line of thought. For (2) is not significantly more plausible than:

  3. If x could have originated from chunk α of matter and α' is a chunk of matter that almost completely overlaps α, then x could have originated from α'.
Both (2) and (3) embody the same prima facie plausible small-variation intuition. If one thought (3) was false, one would have little reason to think (2) is true.

But given (3), Salmon's argument can be run without S4—all we need is T (what is actually true is possible). Iterating uses of (3) and modus ponens, we conclude that (1) is false. In other words, we cannot hold both (1) and (3). And since (2) has little plausibility apart from (3), we shouldn't hold both (1) and (2). Thus, Salmon's argument is not an argument against S4, but an argument against the conjunction of (1) and (2). And I say we should reject (2).

Saturday, June 1, 2013

Material beings

What is a material being? Suggestion:

  1. x is material if and only if x is in space.
A minor problem: The philosophical tradition has it that materiality is some kind of a negative status. But (1) makes materiality be a positive status. A more serious problem: God is omnipresent, so (1) makes God material. Revision:
  2. x is material if and only if x occupies a proper part of space.
This takes care of the case of God and shows how materiality is a limitation on omnipresence. But imagine a world all of whose space is filled by a walnut. Surely, the walnut would still be material. So:
  3. x is material if and only if possibly x occupies a proper part of space.
This is better. Whether it is adequate will depend on one's intuitions about things like the electromagnetic field. Suppose one thinks both that (a) the electromagnetic field is not a material being and (b) God could miraculously make it occupy a proper part of space. (I assume that normally it occupies all of space, even the places where its value is zero.) Then (3) is inadequate. I am happy to count the electromagnetic field as material myself. But if you're not, then:
  4. x is material if and only if necessarily it is abnormal for x to occupy only a proper part of space.

Friday, May 31, 2013

A Cosmological Argument based on the Empire State Building

Assume:

  1. Necessarily, every exact duplicate of the Empire State Building has a cause.
  2. Necessarily, if an exact duplicate of the Empire State Building never changes, then neither it nor any of its parts cause its existence.
  3. Possibly, the only contingent beings ever are an unchanging duplicate of the Empire State Building and parts thereof.
Let w be a world where the only contingent beings ever are the unchanging duplicate of the Empire State Building and its parts. By (1), it has a cause. By (2), this cause cannot be the Empire State Building or a part thereof. Since those are all the contingent beings, the cause must be a necessary being, or include a necessary being as a part. So, possibly, there is a necessary being. By S5:
  4. There is a necessary being.

Premise (1) is a version of the Causal Principle specialized to the sorts of entities that we are most confident have causes. One might wonder why one needs the "never changes" in (2). But there is reason for it. Some objects can perhaps be caused by their parts. Imagine a bunch of trees that grow together to form a tower. We could likewise imagine a bunch of moving stony and metallic beings that come together to form an exact duplicate of the Empire State Building. This is ruled out by the "never changes".