For a number of years I’ve been interested in what one might call “the credence of a random proposition”. Today, I saw that once precisely formulated, this is pretty easy to work out in a special case, and it has some interesting consequences.
The basic idea is this: Fix a particular rational agent and a subject matter the agent thinks about, and then ask what can be said about the credence of a uniformly randomly chosen proposition on that subject matter. The mean value of the credence will be, of course, 1/2, since for every proposition p, its negation is just as likely to be chosen, and the credences of p and its negation sum to 1.
It turned out that, on the simplifying assumption that all the situations (or worlds) talked about have equal priors, the distribution of the posterior credence among the randomly chosen propositions is binomial, and hence approximately normal. This was very easy to show once I saw how to formulate the question. But it still wasn't intuitive to me why the distribution of the credences is approximately normal.
Now, however, I see it. Let μ be any probability measure on a finite set Ω—say, the posterior credence function on the set of all situations. Let p be a uniformly chosen random proposition, where one identifies propositions with subsets of Ω. We want to know the distribution of μ(p).
Let the distinct members (“situations”) of Ω be ω1, ..., ωn. A proposition q can be identified with a sequence q1, ..., qn of zeroes and/or ones, where qi is 1 if and only if ωi ∈ q (“q is true in situation ωi”). If p is a uniformly chosen random proposition, then p1, ..., pn will be independent identically distributed random variables with P(pi = 0)=P(pi = 1)=1/2, and p will be the set of the ωi for which pi is 1.
Then we have this nice formula:
- μ(p)=μ(ω1)p1 + ... + μ(ωn)pn. (1)
This formula shows that μ(p) is the sum of independent random variables, with the ith variable taking on the possible values 0 and μ(ωi) with equal probability.
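This can be checked directly by brute force. Here is a small sketch with a made-up three-situation measure μ: it enumerates all 2^n propositions as 0/1 indicator sequences, computes μ(p) by the formula above, and confirms that the mean credence over all propositions is 1/2.

```python
from itertools import product
from fractions import Fraction

# Hypothetical example: a 3-situation space with posterior measure mu.
mu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # sums to 1

# Enumerate all 2^n propositions as indicator sequences p1, ..., pn
# and compute mu(p) = mu(w1)*p1 + ... + mu(wn)*pn.
credences = [sum(m * bit for m, bit in zip(mu, bits))
             for bits in product([0, 1], repeat=len(mu))]

# Each situation lies in exactly half of the propositions, so the
# mean credence over all propositions is 1/2.
mean = sum(credences) / len(credences)
print(mean)  # 1/2
```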
The special case in my first post today was one where the priors for all the ωi are equal, and hence the non-zero posteriors are all equal. Thus, as long as there are lots of non-zero posteriors (i.e., as long as there is a lot we don't know), the posterior credence is, by (1), a rescaling of a sum of many independent identically distributed Bernoulli random variables. That is, of course, a binomial distribution and approximately a normal distribution.
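The binomial claim in the equal-priors case can also be verified exhaustively. A sketch, assuming m = 10 situations with equal nonzero posteriors 1/m: the number of propositions with credence k/m should be the binomial coefficient C(m, k).

```python
from itertools import product
from collections import Counter
from math import comb

# Assumed setup: m situations, each with posterior 1/m, so mu(p) = k/m
# where k is the number of situations in the proposition p.
m = 10
counts = Counter(sum(bits) for bits in product([0, 1], repeat=m))

# The count of propositions with credence k/m is C(m, k), i.e. the
# credences follow a (rescaled) binomial(m, 1/2) distribution.
assert all(counts[k] == comb(m, k) for k in range(m + 1))
print("binomial check passed")
```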
But what if we drop the assumption that all the situations have equal priors? Let's suppose, for simplicity, that our empirical data rules out precisely the situations ωm+1, ..., ωn (otherwise, renumber the situations). Let ν be the prior probability measure on Ω. Then μ is directly proportional to ν on {ω1, ..., ωm} and is zero outside of it, and:
- μ(p)=c(ν(ω1)p1 + ... + ν(ωm)pm)
where c = 1/(ν(ω1) + ... + ν(ωm)). Thus, μ(p) is a sum of m independent but perhaps no longer identically distributed random variables. Nonetheless, the mean of μ(p) will still be 1/2, since each pi has mean 1/2 and c(ν(ω1) + ... + ν(ωm)) = 1. Moreover, if the ν(ωi) do not differ too radically from one another (say, they are of the same order of magnitude), and m is large, we will still be close to a normal distribution by the Berry–Esseen inequality and its refinements.
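A quick Monte Carlo sketch of the unequal-priors case, with made-up priors ν of the same order of magnitude: sample random propositions by including each surviving situation independently with probability 1/2, and check that the sample mean of μ(p) is near 1/2 and that roughly the normal-distribution share of samples falls within one standard deviation of the mean.

```python
import random

# Hypothetical non-uniform priors on the m = 10 surviving situations,
# all of the same order of magnitude.
nu = [1.0, 1.3, 0.8, 1.1, 0.9, 1.2, 1.0, 0.7, 1.4, 1.05]
c = 1.0 / sum(nu)
post = [c * v for v in nu]  # posterior mu, proportional to nu

# Sample mu(p) for uniformly random propositions p: each situation is
# in p independently with probability 1/2.
random.seed(0)
samples = [sum(w for w in post if random.random() < 0.5)
           for _ in range(100_000)]

mean = sum(samples) / len(samples)  # close to 1/2 for large samples

# Rough normality check: near-normal data puts about 68% of samples
# within one standard deviation of the mean.
sd = (sum(w * w for w in post) / 4) ** 0.5
frac = sum(abs(s - 0.5) < sd for s in samples) / len(samples)
print(f"mean={mean:.3f}, within one sd: {frac:.2f}")
```

For small m the distribution is still visibly discrete, so the one-sd fraction only roughly matches the normal 68%; the Berry–Esseen bound quantifies how the gap shrinks as m grows.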
In other words, as long as our priors are not too far from uniform and there is a lot we don't know (i.e., m is large), the distribution of credences among randomly chosen propositions is approximately normal. And to get estimates on that distribution, we can draw on the vast mathematical literature on sums of independent random variables, which applies even without the "approximate uniformity" condition on the priors (which I haven't bothered to formulate precisely).