Sunday, August 4, 2019

The credence distribution among rational beliefs

Let’s say that there are n initially epistemically relevant possible situations, w1, ..., wn, one of which (say, w1) is the true one. We can identify the propositions about these situations with subsets of the set W of possible situations. Additionally, we have some evidence concerning the possible situations. Here is a question that interests me:

  1. What is the distribution of (posterior) credences among the propositions like?

My intuition says that most of our credences are in the middle range, between 1/4 and 3/4, and that very few will be close to 0 and 1. I would speculate that the distribution of credences would look like a bell curve (when I told him about the problem, my son also had the same speculation).

Can we make some progress on the question? I think so.

For simplicity, let’s suppose our priors are uniform on W: each possible situation is equally likely.

Our evidence basically restricts the set of initially epistemically relevant situations W to a subset e of W, presumably a subset containing the true situation. Let m = |e| be the number of elements of e.

Consider an arbitrary proposition about the situations. This proposition can be identified with a subset p of W. Then the posterior P(p|e) will equal |p ∩ e|/|e| = |p ∩ e|/m, because the priors are uniform.
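
To make the formula concrete, here is a minimal Python sketch. It is an illustration only: the function name `posterior` and the toy sets are hypothetical, and the uniform prior from above is assumed.

```python
from fractions import Fraction

def posterior(p, e):
    """P(p | e) under a uniform prior: the share of e's situations that lie in p."""
    if not e:
        raise ValueError("the evidence set e must be non-empty")
    return Fraction(len(p & e), len(e))

# Hypothetical toy case: n = 6 situations labelled 1..6; the evidence leaves m = 4 in play.
e = frozenset({1, 2, 3, 4})
p = frozenset({1, 2, 5})   # a proposition true in situations 1, 2 and 5

print(posterior(p, e))  # 1/2
```

Situation 5 lies outside e, so it does not affect the posterior at all: only the part of p inside e matters.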

So now our question is:

  2. What is the distribution of |p ∩ e|/m amongst the propositions p?

There are 2^n propositions, and the possible values of |p ∩ e|/m are 0/m, 1/m, ..., m/m. We can thus just ask about the distribution of |p ∩ e| among the propositions p. Here is one easy fact:

  3. If k > m, then there are zero propositions p such that |p ∩ e| = k.

So let’s now consider 0 ≤ k ≤ m and ask how many propositions there are such that |p ∩ e| = k. This is not so hard. Such a proposition consists of a subset of e of size k together with an arbitrary subset of W − e. There are C(m, k) subsets of e of size k (where C(m, k) is the binomial coefficient) and there are 2^(n − m) subsets of W − e. Thus:

  4. There are 2^(n − m) · C(m, k) propositions p such that |p ∩ e| = k.

Since C(m, k)=0 unless 0 ≤ k ≤ m, claim (4) is true even without restricting the values of k.
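
For small n the count can be checked by brute force. The sketch below is only a sanity check under illustrative assumptions: the helper name `count_by_overlap` is made up, and W and e are arbitrarily taken to be {0, ..., n − 1} and {0, ..., m − 1}. It enumerates every subset of W and tallies the size of its overlap with e.

```python
from collections import Counter
from itertools import combinations
from math import comb

def count_by_overlap(n, m):
    """Tally all propositions p ⊆ W by |p ∩ e|, with W = {0..n-1} and e = {0..m-1}."""
    e = set(range(m))
    counts = Counter()
    for size in range(n + 1):
        for p in combinations(range(n), size):   # every subset of W of this size
            counts[len(e.intersection(p))] += 1
    return counts

n, m = 8, 5
counts = count_by_overlap(n, m)
for k in range(m + 1):
    # Compare the brute-force tally with 2^(n - m) * C(m, k).
    assert counts[k] == 2 ** (n - m) * comb(m, k)
assert sum(counts.values()) == 2 ** n   # all 2^n propositions are accounted for
```

The enumeration is exponential in n, so it is useful only for checking the formula on small cases.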

Lesson learned:

  5. The distribution of |p ∩ e| is a symmetric binomial distribution on 0, ..., m.

This binomial distribution has standard deviation √m/2. Now, we can answer (2):

  6. The distribution of |p ∩ e|/m, i.e., the posterior probability, amongst the propositions p is a symmetric binomial distribution scaled to have range from 0 to 1, mean 1/2 and standard deviation σ = 1/(2√m).
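
The mean and standard deviation can be confirmed numerically. In this sketch (the function name `posterior_stats` and the choice m = 100 are illustrative assumptions), each credence k/m is weighted by C(m, k)/2^m, the share of propositions that receive it.

```python
from math import comb, isclose, sqrt

def posterior_stats(m):
    """Mean and standard deviation of the credence k/m across all propositions."""
    # Share of propositions with |p ∩ e| = k; the factor 2^(n - m) cancels out.
    weights = [comb(m, k) / 2 ** m for k in range(m + 1)]
    mean = sum(w * k / m for k, w in enumerate(weights))
    variance = sum(w * (k / m - mean) ** 2 for k, w in enumerate(weights))
    return mean, sqrt(variance)

m = 100
mean, sd = posterior_stats(m)
assert isclose(mean, 0.5)
assert isclose(sd, 1 / (2 * sqrt(m)))   # 0.05 when m = 100
```

Note that n drops out entirely: only m, the number of situations still in play, determines the shape of the distribution.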

Since the binomial distribution is close to a normal distribution for large m (and in life m will often be large: it is the number of situations that remain epistemically relevant given our evidence), my conjecture that most credences are in the middle range and that we have a bell curve turns out correct. (By the way, I’ve been doing the math while writing this post. So I wrote down the speculation before I knew that it was correct.)
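
To get a feel for how strongly the credences concentrate, one can compute the exact share of propositions whose credence lies between 1/4 and 3/4. This is a rough sketch, not part of the argument above: the function name `share_in_middle` and the sample values of m are arbitrary choices, and m is assumed to be at least 1.

```python
from fractions import Fraction
from math import comb

def share_in_middle(m, lo=Fraction(1, 4), hi=Fraction(3, 4)):
    """Exact share of propositions whose credence k/m lies in [lo, hi] (m >= 1)."""
    inside = sum(comb(m, k) for k in range(m + 1) if lo <= Fraction(k, m) <= hi)
    return Fraction(inside, 2 ** m)

assert share_in_middle(4) == Fraction(7, 8)   # k = 1, 2, 3 out of 16 subsets of e

for m in (4, 20, 100):
    print(m, float(share_in_middle(m)))
```

With m = 100, the interval from 1/4 to 3/4 spans five standard deviations on either side of 1/2, so almost every proposition falls inside it.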

Note that in practice we can restrict our situations to concern some particular subject matter, say the order of cards in a deck, the outcomes of die throws, the possible scientific hypotheses about the evolution of bipedality, etc. And as long as we are sufficiently fine-grained that the number of still-in-play hypotheses is large, the above result applies.

So:

  7. We would expect the vast majority of a rational agent’s credences to cluster around 1/2. A very small minority will have credences near 0 and 1, and we have fast decay in the number of propositions with a given credence as we get closer to 0 or 1.
