Suppose all you know about n is that it is a positive integer. What probabilities should you assign to the values of n? Intuitively, you should assign equal probability to each value. But that probability will have to be zero if the probabilities are to add up to one (infinitesimals won't help, by Tim McGrew's argument). But then, by countable additivity, the probability of the whole space will be zero rather than one. We can drop countable additivity, but then there will no longer be a unique canonical measure: there are many finitely additive measures to choose from.
So, here's a suggestion. The details are not yet worked out and may well overlap with the literature. The following is for an ideal agent with evidence that is certain. I don't know how to generalize it from there.
Step 1: Drop the normalization to one. Instead, talk of an epistemic possibility measure (epm) m, and say that m(p) is the degree of epistemic possibility of p (I am not calling it probability, but probability measures will be a special case; I am following Trent Dougherty's idea that given classical probabilities, the degree of epistemic possibility of p is equal to its degree of epistemic probability). An epm takes values from zero to infinity (both may be included) and is countably additive. Depending on context, I'll go back and forth between taking it as assigning values to propositions or to sets (in the latter case, it'll just be a measure in the Lebesgue sense). The case where the total measure (i.e., the measure of a tautology or of the whole set) is one shall be referred to as classical. I will say that p is epistemically possible if and only if the epm of p is greater than zero.
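To make this concrete, here is a rough Python sketch of an epm on a finite algebra, with a set's measure given by summing nonnegative (possibly infinite) atom weights; the weights and names are purely illustrative, and the finite setting is only a stand-in for the general countably additive case.

```python
from math import inf

def make_epm(weights):
    """Build an epm over subsets of the keys of `weights`.
    Weights are nonnegative and may be inf; additivity holds by construction."""
    return lambda event: sum(weights[x] for x in event)

# A non-classical epm: the total measure is 3, not 1.
m = make_epm({"a": 1.0, "b": 1.0, "c": 1.0})
omega = {"a", "b", "c"}          # the whole set, playing the role of a tautology
print(m(omega))                  # 3.0: the normalization to one has been dropped
print(m({"a"}) > 0)              # True: {"a"} is epistemically possible
print(m(set()))                  # 0: the impossible event gets measure zero

# Infinite values are allowed too:
m2 = make_epm({"a": 1.0, "b": inf})
print(m2({"a", "b"}))            # inf
```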
Step 2: Instead of modeling the degree of belief in p with a single number, P(p), as in the classical theory, we model it with the pair of numbers <m(p),m(~p)>, which I will call the degree of belief in p. The agent is certain of p provided that ~p is epistemically impossible, i.e., provided the degree of belief in p is of the form <x,0>. This means that there is a distinction between maximal epistemic possibility and certainty: maximal epistemic possibility is when the degree of epistemic possibility of p is equal to that of a tautology, while certainty is when the degree of epistemic possibility of the negation of p is zero. The axioms (see Step 4) will ensure that when the total measure is finite, certainty and maximal epistemic possibility come together. (Here is the example which leads me to this. If N is the set of positive integers, m is counting measure and E is the set of even integers, then m(E)=m(N)=infinity, but obviously if all one knows about a number is that it is in N, one isn't certain that it is in E. Here, m(~E)=infinity as well, so both E and ~E have maximal epistemic possibility, and hence there is no certainty.) We say that the agent has a greater degree of belief in q than in p provided that either m(q)>m(p) or m(~q)<m(~p).
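Here is a small sketch of the pair representation, using the counting-measure example with the relevant cardinalities written in by hand (infinity is just float('inf'), and the helper names are purely illustrative):

```python
from math import inf

def certain(belief):
    """Certainty: the second coordinate m(~p) is zero."""
    return belief[1] == 0

def greater_belief(belief_q, belief_p):
    """Greater belief in q than in p: m(q) > m(p) or m(~q) < m(~p)."""
    return belief_q[0] > belief_p[0] or belief_q[1] < belief_p[1]

# Degrees of belief <m(p), m(~p)> under counting measure on the positive integers,
# with the cardinalities filled in by hand rather than computed:
whole_space  = (inf, 0)      # N itself: |N| = inf, |~N| = 0
evens        = (inf, inf)    # |E| = |~E| = inf
over_hundred = (inf, 100)    # |{n > 100}| = inf, |{n <= 100}| = 100

print(certain(whole_space))                 # True
print(certain(evens))                       # False: maximal possibility, no certainty
print(greater_belief(over_hundred, evens))  # True, as in the example below
```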
Step 3: The agent's doxastic state is not just modeled by the degrees of epistemic possibility assigned to all the (relevant) propositions, but by all the conditional degrees of epistemic possibility assigned to all the (relevant) propositions on all the (relevant) conditions. More precisely, for each proposition q whose negation isn't a tautology, there is a "conditional" epm m(−|q). The unconditional epm, which measures the degree of epistemic possibility, is m(p)=m(p|T) where T is a tautology. These assignments are dynamic, which I will sometimes indicate by a time subscript, and are updated by the very simple update rule that when evidence E comes in, and t is a time just before the evidence and t' is just after, then mt'(p|q)=mt(p|q&E).
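A sketch of the update rule, using the conditional epm of the example below, m(F|G) = |F ∩ G|, restricted to finite sets so that everything can actually be computed:

```python
def m(f, g):
    """Conditional epm from the example below: m(F|G) = |F ∩ G| (finite sets only here)."""
    return len(f & g)

def update(old_m, e):
    """Update rule: on learning E, the new m'(p|q) is the old m(p | q & E)."""
    return lambda p, q: old_m(p, q & e)

omega = set(range(1, 21))          # a finite stand-in for the positive integers
evens = {n for n in omega if n % 2 == 0}
evidence = {1, 2, 3, 4, 5, 6}      # we learn that the number lies in this finite set

m_new = update(m, evidence)
# Conditioning on the whole space now gives a finite total measure, so dividing
# by it recovers a classical uniform probability, as claimed in the example below:
print(m_new(evens, omega))                           # 3
print(m_new(evens, omega) / m_new(omega, omega))     # 0.5
```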
Step 4: Consistency is forced by an appropriate set of axioms, over and above the condition that m(−|q) is an epm for every q whose negation isn't a tautology. For instance, it will follow from the axioms that m(p&q|q)=m(p|q), and that m(p|q&r)m(q|r)=m(p&q|r)m(T|q&r) whenever both sides are defined (stipulation: the product xy is defined if and only if it is not the case that one of x and y is zero and the other is infinity) and T is a tautology. Maybe these are the only axioms needed; maybe the second is all we need; or maybe a little more is required.
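For what it's worth, both candidate axioms can be checked by brute force for the counting-measure epm of the example below on a small finite space (where no zero-times-infinity cases arise); this is only a sanity check, not a proof:

```python
from itertools import combinations

def m(p, q):
    """m(P|Q) = |P ∩ Q|, the conditional epm of the example below."""
    return len(p & q)

omega = frozenset(range(5))
T = omega                          # the tautology, read as a set
events = [frozenset(c) for r in range(len(omega) + 1)
          for c in combinations(omega, r)]

# Check m(p&q|q) = m(p|q) and m(p|q&r) m(q|r) = m(p&q|r) m(T|q&r) exhaustively.
for p in events:
    for q in events:
        assert m(p & q, q) == m(p, q)
        for r in events:
            assert m(p, q & r) * m(q, r) == m(p & q, r) * m(T, q & r)
print("both axioms hold for the counting-measure epm on this finite space")
```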
Step 5: To a first approximation, it is more decision-theoretically rational to do A than B iff the Lebesgue integral of (1_A(x)−1_B(x))p(x) is greater than zero, where p is the payoff function on our sample space, 1_S is the indicator function equal to 1 on S and 0 elsewhere, and the integral is taken with respect to m(−|do(A) or do(B)). Various qualifications are needed, and something needs to be said about cases where the integrals are undefined, and maybe about the case where either do(A) or do(B) has zero epm conditionally on (do(A) or do(B)). This is going to be hard.
Example: Suppose we're working with the positive integers N (i.e., with a positive integer about which we know nothing). Let m(F|G) be the cardinality of the intersection of F and G. Then we're certain of N, but of no proper subset of N. We have the same degree of belief in the evens, in the odds, in the primes, etc., since they all have the same cardinality. However, we have a greater degree of belief in the number being greater than 100 than we do in the evens, and that is how it should be. Suppose we get as evidence some finite set (i.e., the proposition that the number is in some finite set). Then, quite correctly, we get a classical uniform probability measure out of the update rule. Moreover, in the infinite case, we still get correct conclusions, such as that it is more decision-theoretically rational to bet on the numbers divisible by two than on the numbers divisible by four, even though the degree of belief is the same for both.
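The betting claim at the end can be seen on finite truncations of N, which is one way to read the counting-measure comparison (the truncation to {1, ..., N} is just an illustration, not part of the proposal):

```python
def winnings(bet_on, outcomes):
    """Total payoff of a 1-unit bet that pays when the outcome lands in `bet_on`."""
    return sum(1 for n in outcomes if n in bet_on)

N = 1000
outcomes = range(1, N + 1)         # finite truncation of the positive integers
div2 = {n for n in outcomes if n % 2 == 0}
div4 = {n for n in outcomes if n % 4 == 0}

print(winnings(div2, outcomes))    # 500
print(winnings(div4, outcomes))    # 250: betting on divisibility by two does better,
                                   # even though div2 and div4 have the same cardinality
                                   # (and so the same degree of belief) in the infinite case
```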
Step 5 is completely wrong.
Here are two ways of fixing it.
Way one: Think of alternative behaviors as choosing alternative payoff functions; this nicely models betting cases where your decision is independent of the stuff in the sample space. Suppose you are choosing between an action with a payoff function u and an action with a payoff function v. Then it is rational to choose u over v iff the integral of u−v with respect to the unconditional epm is positive.
Way two: Suppose you're choosing between two actions A and B, and there is a single payoff function U. Then it's rational to choose A over B iff m(do(B))I(U,m(−|do(A))) > m(do(A))I(U,m(−|do(B))), where I(f,r) is the integral of f with respect to the measure r.
Unfortunately, there are cases where the first way produces a better answer than the second, and vice versa.
What one wants is a way of combining the two. This is going to be tricky.
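To see how the two rules are computed, here is a toy finite sketch (counting measure throughout, with integrals reducing to sums; on this particular toy space the two ways happen to agree, so it doesn't exhibit the divergence just mentioned):

```python
# Way one: compare payoff functions u and v by integrating u - v against the
# unconditional epm (counting measure on a six-point space).
space = range(1, 7)
u = {n: (1 if n % 2 == 0 else 0) for n in space}    # pays on even outcomes
v = {n: (1 if n % 3 == 0 else 0) for n in space}    # pays on multiples of three
print(sum(u[n] - v[n] for n in space) > 0)          # True: choose u over v

# Way two: a single payoff function U on a space whose points record which act was done.
outcomes = [(act, n) for act in ("A", "B") for n in space]
def U(point):
    act, n = point
    return 1 if (act == "A" and n % 2 == 0) or (act == "B" and n % 3 == 0) else 0
def m(event):                          # unconditional epm: counting measure
    return sum(1 for p in outcomes if p in event)
def I(f, cond):                        # integral of f against m(-|cond): a sum over cond
    return sum(f(p) for p in outcomes if p in cond)
do_A = {p for p in outcomes if p[0] == "A"}
do_B = {p for p in outcomes if p[0] == "B"}
print(m(do_B) * I(U, do_A) > m(do_A) * I(U, do_B))  # True: choose A, agreeing with way one
```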
It seems to me that whenever we talk about purely abstract situations we have the problem of lacking the contexts that are implicit in more typical scenarios. For example, if we are asked to pick a natural number between 1 and 10, then we have in mind the human ability to choose, essentially with equiprobability, one of those numbers, perhaps with the aid of a balanced die or other tool/device. But when we ask the probability of choosing any of the whole set of natural numbers, what mechanism have we in mind? How are we to interpret whatever probability assignment we choose to endow that set with? I would not care to venture into an analysis of the probability until it is properly related to some system.
In the end, we can assign any probability distribution we wish to the set, so long as it satisfies the rules laid out in probability theory. The trick is endowing that assignment with real-world meaning.
Cases like this seem relevant. There are infinitely many people, including you, and some of them are wearing red hats and some aren't. Everybody is blindfolded. How likely is it that your hat is red?
This story doesn't seem to require any randomization procedure.
Hmm. If you have a particular probability model/interpretation in mind, then I'd be curious to hear it. Without knowing what you mean, however, I am unable to answer.
To illustrate what I mean by a model, suppose I were able to learn the hat colors of the ten blindfolded subjects nearest to me. In that case, I could use that data to predict whether or not my hat was red, with the natural probability assignment one would expect in such a situation. The fact that there are infinitely many people wouldn't be relevant to making predictions about a particular locale. However, I wouldn't expect to make accurate predictions about the whole infinite set.
Hatsoff would be as likely to have a red hat as not, for all he knows in that situation, which might be expressed like this: "Hatsoff's hat is 50% likely to be red." That "50" does not have to satisfy the standard probability axioms because it does not have to satisfy the standard axioms of mathematics. It signifies only the obviously rational way to bet in such a situation.
If all I know about n is that it is a positive integer, then I should not try to assign a probability. Any natural number is as likely as any other, for all I know, so I should bet as though I were 0% likely to get any particular number out of all of them, but also as though no number were impossible. I should also avoid the implicit contradiction, e.g. by using 50% as a default. After all, from the hat scenario I know that I should bet at 50% for the number being even, or prime, or divisible by four, and so forth. I just have to be careful, in any scenario that tries to use 'even' and 'divisible by four' against me, to use my default at the most basic level of my actions. Ultimately, I won't have any idea what might happen, and should bet accordingly (if I can't avoid betting).
Developing a theory of epistemic possibility measures is all very well, but we won't need it in finite cases (such as those that arise in science), and in infinite cases there are almost bound to be surprises (e.g. new paradoxes), which suggests that we just use the default (in such hypothetical scenarios). After all, simple actual infinities may be metaphysically impossible for all we know. So in any such scenario it is 50% likely that you should not try to bet but should try to wake up!