Wednesday, August 19, 2020

Product spaces for hyperreal and full conditional probabilities

I think the following is a consequence of a hyperreal variant of the Horn-Tarski extension theorem for measures on boolean algebras:

Claim: Suppose that, for each i ∈ I, <Ω_i, F_i, P_i> is a finitely additive probability space with values in some field R* of hyperreals. Then, assuming the Axiom of Choice, there is a hyperreal-valued finitely additive probability space <Ω, 2^Ω, P>, where Ω = ∏_{i ∈ I} Ω_i, such that the Ω_i-valued random variables π_i given by the natural projections of Ω to Ω_i are independent and have the distributions given by the P_i.

Note that the values of P might be in a hyperreal field larger than R*.

Given the Claim, and given the well-known correspondences between hyperreal-valued probabilities and full conditional real-valued probabilities, it follows that we can define meaningful product-space conditional real-valued probabilities.

It would be really nice if the product-space conditional probabilities were unique in the special case where F_i is the power set of Ω_i, or at least if they were close enough to uniqueness to define the same real-valued conditional probabilities.

For a particularly interesting case, consider the case where X and Y are generated by uniform throws of a dart at the interval [0, 1], and we have a regular finitely additive hyperreal-valued probability on [0, 1] (regular meaning that all non-empty sets have positive measure). Let Z be the point (X, Y) in the unit square.

Looking at how the proof of the Horn-Tarski extension theorem works, it seems to me that for any positive real number r, and any non-trivial line segment L along the x = y diagonal in the square [0, 1]^2, there is a product measure P satisfying the conditions of the Claim (where P_1 and P_2 are the uniform measures on [0, 1]) such that P(L) = r P(H), where H is the horizontal line segment {(x, 0) : x ∈ [0, 1]}. For instance, if L is the full diagonal, we would intuitively expect P(L) = 2^(1/2) P(H), but in fact we can make P(L) = 100000 P(H) or P(L) = P(H)/100000 if we like. It is clear that such a discrepancy will generate different conditional probabilities.
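To make the last sentence concrete, here is a minimal sketch of the arithmetic. It rests on my own reading rather than anything checked above: L and H meet in at most one point, and by independence that point has infinitesimal measure relative to H, so the standard part of P(Z ∈ L | Z ∈ L ∪ H) should be r/(1 + r). The function name is made up for illustration.

```python
def cond_prob_on_diagonal(r: float) -> float:
    """Real-valued conditional probability that Z lies on L, given Z lies on L or H.

    Assumption (not from the post): P(L ∩ H)/P(H) is infinitesimal, so
    P(L | L ∪ H) = P(L) / (P(L) + P(H) - P(L ∩ H)) has standard part r / (1 + r)
    whenever P(L) = r * P(H) for a positive real r.
    """
    if r <= 0:
        raise ValueError("r must be a positive real number")
    return r / (1.0 + r)


# The "intuitive" ratio sqrt(2), and the two extreme ratios mentioned above.
for r in (2 ** 0.5, 100000.0, 1 / 100000.0):
    print(f"r = {r:g}: P(L | L or H) is about {cond_prob_on_diagonal(r):.6f}")
```

Different choices of r give visibly different real-valued conditional probabilities, which is the sense in which the product measures disagree.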

I haven’t checked all the details yet, so this could be all wrong.

But if it is right, here is a philosophical upshot. We would expect there to be a unique canonical product probability for independent random variables. However, if we insist on probabilities that are so fine-grained as to tell infinitesimal differences apart, then we do not at present have any such unique canonical product probability. If we are to have one, we need some condition going beyond independence.

This is part of a larger set of claims, namely that we do not at present have a clear notion of what “uniform probability” means once we make our probabilities more fine-grained than classical real-valued probability.

Putative Sketch of Proof of Claim: Embedding R* in a larger field if necessary, we may assume that R* is |2^Ω|-saturated. Define a product measure on the cylinder subsets of Ω as usual. The proof of the Horn-Tarski extension theorem for measures on boolean algebras looks to me like it works for |B|-saturated hyperreal-valued probability measures, where B is the boolean algebra, and applying it to this product measure would complete the proof of our Claim.

4 comments:

Andrew Dabrowski said...

Are your "hyperreals" synonymous with Robinson's nonstandard reals?

Alexander R Pruss said...

Maybe. Mine are constructed via an ultraproduct of the reals like his. But from what I recall in Robinson's textbook, his were constructed via a different ultrafilter, namely one over the naturals.

Andrew Dabrowski said...

OK, thanks.

IanS said...

Here is an example of different joint hyperreal probabilities with the same (strictly, isomorphic) pairs of independent marginal probabilities. Usual warning: amateur at work.

The example uses hyperreal probabilities constructed by the sequence-and-ultrafilter method. To form a product of such probabilities, we need a cross product ultrafilter. The usual product W of an ultrafilter U on the powerset of A and V on the powerset of B can be described like this: a subset C of AxB is in the product ultrafilter W iff the set of ‘a’ in A such that (the set of ‘b’ in B such that (‘a’ x ‘b’ is in C) is in V) is in U.

Note (this is the key point) that this product treats the factors differently. We would get a different product ultrafilter if we said ‘… iff the set of ‘b’ in B such that (the set of ‘a’ in A such that (‘a’ x ‘b’ is in C) is in U) is in V.’

Here is the construction. Define a probability P on the rationals in [0, 1) as follows. Define a sequence of sets S_n (n = 1, 2, 3 …) by S_n = {0/n!, 1/n!, …, (n!-1)/n!}. For any set A of rationals in [0, 1) define P_n(A) = (size of A intersect S_n)/n!. Use the P_n and a (free) ultrafilter U on n to define a hyperreal probability P in the usual way. P is finitely additive, regular and invariant under (folded) rational rotations.
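A minimal sketch of the finite-level probabilities, restricted to finite sets A for simplicity. The helper names (grid, p_n, rotate) are made up for illustration, and the ultrafilter step is not modelled at all; the code only checks what happens at individual values of n.

```python
from fractions import Fraction
from math import factorial


def grid(n):
    """The n-th grid {0/n!, 1/n!, ..., (n!-1)/n!} as exact fractions."""
    f = factorial(n)
    return {Fraction(j, f) for j in range(f)}


def p_n(A, n):
    """Finite-level probability: (size of A intersect the n-th grid) / n!."""
    return Fraction(len(set(A) & grid(n)), factorial(n))


def rotate(A, t):
    """Folded rotation: shift each point by t and wrap back into [0, 1)."""
    return {(a + t) % 1 for a in A}


A = {Fraction(0), Fraction(1, 3), Fraction(1, 2)}
quarter = Fraction(1, 4)
print(p_n(A, 4), p_n(rotate(A, quarter), 4))  # equal: 1/4 is a grid point for n = 4
print(p_n(A, 3), p_n(rotate(A, quarter), 3))  # differ: 1/4 is not a grid point for n = 3
```

Rotation by a rational t preserves the finite-level value only once n! t is an integer, which holds for all but finitely many n. That is why the invariance of P itself depends on the ultrafilter being free. Likewise any single rational q gets the positive value 1/n! for all large n, which is the finite-level source of regularity.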

Define a second probability Q, again on the rationals in [0, 1), in a similar way, with sets T_m, probabilities Q_m, and a possibly different ultrafilter V on m.

Define a joint probability R on the rationals in [0, 1) x [0, 1) like this: for each pair n, m define R_{n,m}(A) as (size of A intersect (S_n x T_m))/(n!*m!). Use the R_{n,m} and the product ultrafilter W of U and V as described above to define a hyperreal probability R. R is finitely additive, regular and invariant under (folded) rational rotations on both coordinates.

The marginal distribution R((.) x [0, 1)) is isomorphic to P. (Because R_{n,m}(A x [0, 1)) = P_n(A) for all A, n, and m, and any set u in U corresponds to a set u x {1, 2, 3, …} in W.) Similarly, the marginal R([0, 1) x (.)) is isomorphic to Q. Also, R makes its marginals independent. (Because R_{n,m}(A x B) = R_{n,m}(A x [0, 1)) * R_{n,m}([0, 1) x B) for all n and m.)
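The two finite-level identities used here can be checked exactly for small n and m. This is only a sketch: it assumes the sets T_m have the same form as the S_n (the construction says only "in a similar way"), it handles finite sets of pairs, and it stands in for A x [0, 1) by A x T_m, since only grid points are counted at level (n, m). The helper names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def grid(n):
    """The n-th grid {0/n!, ..., (n!-1)/n!}; assumed to serve for both S_n and T_n."""
    f = factorial(n)
    return {Fraction(j, f) for j in range(f)}


def r_nm(C, n, m):
    """Finite-level joint value: (size of C intersect (S_n x T_m)) / (n! * m!)."""
    s_n, t_m = grid(n), grid(m)
    hits = sum(1 for (x, y) in set(C) if x in s_n and y in t_m)
    return Fraction(hits, factorial(n) * factorial(m))


def rectangle(A, B):
    """The finite rectangle A x B as a set of pairs."""
    return set(product(A, B))


n, m = 3, 4
A = {Fraction(0), Fraction(1, 3), Fraction(5, 7)}  # 5/7 is not a grid point for n = 3
B = {Fraction(1, 8), Fraction(1, 2)}               # both are grid points for m = 4

marg_x = r_nm(rectangle(A, grid(m)), n, m)  # plays the role of A x [0, 1)
marg_y = r_nm(rectangle(grid(n), B), n, m)  # plays the role of [0, 1) x B
joint = r_nm(rectangle(A, B), n, m)
print(marg_x, marg_y, joint, joint == marg_x * marg_y)
```

The first marginal depends only on n and the second only on m, which is what lets each one be matched with P and Q respectively after passing to the product ultrafilter.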

Here is the interesting bit. Associate the random variable X with the first coordinate, Y with the second. Then R(Y=0) is strictly less than R(X=0). In fact, R(Y=0)/R(X=0) is infinitesimal. To see this, note that, for any natural k, the set of all pairs (n, m) with n = 1, 2, 3 … and m strictly greater than n+k is in the product ultrafilter W. (Because all free ultrafilters on the naturals contain the Fréchet filter. Note also, this is where the asymmetry of the product ultrafilter comes in: a similar set with n and m swapped would not be in W.) On this set, R_{n,m}(Y=0)/R_{n,m}(X=0) = n!/m!, which is less than 1/k!. Then take k arbitrarily large.
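The finite-level ratio and the 1/k! bound can be checked with exact arithmetic. A small sketch, again assuming the sets T_m have the same form as the S_n, so that both contain 0; the function names are illustrative, and the check covers only a finite window of pairs (n, m), not the ultrafilter argument itself.

```python
from fractions import Fraction
from math import factorial


def r_y0(n, m):
    """Finite-level value of (Y=0): the n! points (x, 0) out of n! * m! grid points."""
    return Fraction(factorial(n), factorial(n) * factorial(m))


def r_x0(n, m):
    """Finite-level value of (X=0): the m! points (0, y) out of n! * m! grid points."""
    return Fraction(factorial(m), factorial(n) * factorial(m))


def ratio(n, m):
    """Finite-level ratio of (Y=0) to (X=0)."""
    return r_y0(n, m) / r_x0(n, m)


def bound_holds(k, n_max, extra):
    """Check ratio == n!/m! and ratio < 1/k! for 1 <= n <= n_max, n+k < m <= n+k+extra."""
    for n in range(1, n_max + 1):
        for m in range(n + k + 1, n + k + extra + 1):
            q = ratio(n, m)
            if q != Fraction(factorial(n), factorial(m)):
                return False
            if not q < Fraction(1, factorial(k)):
                return False
    return True


print(ratio(2, 5))           # n!/m! with n = 2, m = 5
print(bound_holds(3, 6, 4))  # checks the bound for k = 3 on a small window
```

Swapping the roles of n and m in the window (n greater than m+k) would make the same ratio larger than k!, which is the finite-level shadow of the other product ultrafilter.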

We can construct another joint probability R´ similarly, but using the alternate product ultrafilter mentioned above. R´ will have all the properties mentioned above, except that R´(Y=0) will be greater than R´(X=0) and R´(Y=0)/R´(X=0) will be infinite (the reciprocal of an infinitesimal). So R´ will be genuinely different from R.

So we have two different joint distributions, both with the same (strictly, isomorphic) independent marginal distributions. Neither is more obviously natural. As the post suggests, independence alone is not enough to force a unique joint distribution.