Monday, February 3, 2014

An argument for expected utility maximization

Until very recently, I thought there was only one argument for the idea that, barring deontic concerns and the like, rationality is connected to the maximization of expected utilities, namely the long-run advantage argument based on the Law of Large Numbers. But there is another: an argument from a plausible set of axioms for rational preferability. Fix a probability space Ω. Say that a gamble is a bounded real-valued random variable on Ω. Suppose that there is a rational preferability ordering < on gambles, where we write A<B if B is preferable to A. Here are some plausible axioms for <:

  1. Transitivity: < is transitive
  2. Domination: If A(ω)≤B(ω) for all ω∈Ω, then for all C, if C<A, then C<B, and if B<C, then A<C.
  3. Sure Thing: If A and B are gambles that have certainty of getting payoffs a and b respectively, with a<b, then A<B.
  4. Additivity: If A<C and B<D, then A+B<C+D.
  5. Equivalence: If A and B are probabilistically equivalent (i.e., P(A∈U)=P(B∈U) for every measurable U), then for all C we have A<C if and only if B<C, and C<A if and only if C<B.

The most controversial will be, I suppose, Additivity. But there is a very simple argument for it: If you should choose C over A, and D over B, then you should choose the combination of C and D over the combination of A and B.

Add a handy technical assumption:

  6. Continuity: There is a collection of events E_a, for 0<a<1, such that P(E_a)=a and E_a is a subset of E_b when a<b.

To get Continuity, all we need to do is suppose we've got some irrelevant continuous random process going on in our probability space, like the decay of a radioactive sample, or else suppose that we've got an infinite sequence of independent identically distributed coin flips, etc. Even if our world doesn't contain such a process, surely the same preferences would be rational in a world where some irrelevant-to-us such process takes place. So we can assume Continuity.

Theorem: Assume (1)-(6). If E(A)<E(B) for gambles A and B, then A<B.

The proof is given in this footnote: [note 1].

Personally, I am suspicious of transitivity in general, but I am less suspicious of it in the case of real-valued bounded gambles.

19 comments:

IanS said...

Here is another angle. You are no doubt familiar with the von Neumann – Morgenstern representation theorem, the basis of decision theory? It is usually framed in terms of “lotteries” (i.e. probability distributions) over a number of outcomes. It says that (given the assumptions), consistent rational choice between lotteries is equivalent to assigning each outcome a (real) utility and maximizing expected utility. Only one extra step is needed: show that the only assignment of utilities to the real numbers consistent with additivity is linear. This seems highly plausible. Of course, the vN-M premises are not quite the same as yours, and I’m leaving aside the analytical issues.

Alexander R Pruss said...

I think it's a different approach. Representation theorems start with decision behavior and derive probabilities. I start with probabilities and derive decision behavior.

I think the whole idea, common as it is, of basing decision theory in representation theorems gets things backwards and is a kind of leftover of behaviorism. Belief is directly aimed at truth.

That said, the reminder of the relevance of the representation theorem is quite helpful to me. One way to put what I am saying is that if one starts with probabilities and a real number assignment on the outcomes and thinks decisions are based on them (my Equivalence and Domination axioms), one can get a result similar to the representation theorem but with somewhat different axioms. For instance, my additivity axiom is rather like the independence axiom, but I think more plausible.

Alexander R Pruss said...

OK, not all representation theorems do what I just said. So, yes, this is much closer to that known stuff.

IanS said...

The proof seems to work. It also shows why some people might reject the axioms.

This simple example follows the line of the proof. I have two options: 1/2 chance of $2, or certainty of $1. Which should I choose? Both options can be seen as the sum of 2 gambles, each paying $1 with probability 1/2. If both gambles always pay out (or not) together, I get 1/2 chance of $2. If one always pays when the other doesn’t, I get $1 for certain. So both alternatives are the sum of (pairwise) probabilistically equivalent gambles, so I should be indifferent.

But if I’m an adventurous type, the positive correlation of the $1 gambles makes me prefer the 1/2 chance of $2. If I like to play safe, the negative correlation makes me prefer the safe $1. The additivity axiom requires me to ignore the correlations. So I may not feel obliged to accept it.
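The two-gamble example in this comment can be checked by brute-force enumeration. The following is a minimal sketch, assuming a single fair coin as the sample space; the names `X`, `Y_same`, `Y_opp`, `law` and `add` are illustrative and not from the post.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
OMEGA = ["H", "T"]  # assumed sample space: one fair coin flip

# Each gamble maps an outcome of the coin to a payoff in dollars.
X = {"H": 1, "T": 0}        # pays $1 on heads
Y_same = {"H": 1, "T": 0}   # pays exactly when X pays
Y_opp = {"H": 0, "T": 1}    # pays exactly when X does not pay


def law(gamble):
    """Return the distribution of a gamble as {payoff: probability}."""
    dist = {}
    for w in OMEGA:
        dist[gamble[w]] = dist.get(gamble[w], 0) + HALF
    return dist


def add(a, b):
    """Pointwise sum of two gambles on the same sample space."""
    return {w: a[w] + b[w] for w in OMEGA}


def mean(gamble):
    """Expected payoff of a gamble."""
    return sum(HALF * gamble[w] for w in OMEGA)


print(law(Y_same) == law(Y_opp))  # the summands are probabilistically equivalent
print(law(add(X, Y_same)))        # half chance of $2, half chance of $0
print(law(add(X, Y_opp)))         # $1 for certain
```

Both sums have expected payoff 1, yet their distributions differ, which is exactly the correlation information that Additivity together with Equivalence asks one to ignore.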

Alexander R Pruss said...

Yeah, people who like or avoid risk will want to reject the additivity axiom.

And I now think that my quick argument for the additivity axiom may have been badly mistaken--Lara Buchak has helped me see this. I was thinking something like this: First I have a choice between A and C, and then between B and D. It would be bad if by choosing rationally in each case, namely C over A and then D over B, I ended up making an overall choice of C+D over A+B that isn't rational (something like a diachronic prisoner's dilemma, I guess).

But I shouldn't consider the two successive choices in isolation from each other. I shouldn't really be choosing between A and C first, and then between B and D. Suppose I know that I am first going to get a choice between A and C and then a choice between B and D. Suppose also I know that no matter how I choose between A and C, I will end up choosing D over B in the second choice (if I don't know that, things are even more complicated). Then I shouldn't really be choosing between A and C now, but between A+D and C+D. But in theories that take risk into account (e.g., Buchak's or classic utility maximization with a non-linear utility function) the choice between A+D and C+D will in general be different from the choice between A and C. (I think that having such a difference is mistaken. It is a violation of conglomerability. It leads to the unhappy result that one may have self-interested reasons to adjust one's credences in a way not guided by evidence. But this is a further discussion.)

Alexander R Pruss said...

I may be able to replace the axioms with:
1. Transitivity
2. Sure Thing
3. Modified Additivity: If A<C and D strictly dominates B in the sense that there is an a>0 such that D(w)>a+B(w) for all w, then A+B<C+D.
4. Equivalence.

IanS said...

To be clear, the point of my previous comment is that additivity is too strong to be plausibly taken as an axiom. We do want to be able to prove that the two options in my example rank equal in preference. But an axiom that asserts it directly would beg the question.

I suggest instead Independent Additivity : If A and B are both (statistically) independent of C, and A is preferred to B, then A + C is preferred to B + C. Note that this can also be phrased in terms of convolutions of frequency distributions.

The independence requirement may (or may not) address your concern about weird interactions in sequences of choices. Is it strong enough to give the desired result? I think I can prove it is, if I add the usual vN-M axioms and a bit of analytical handwaving. Would it work without the vN-M axioms? One can’t help feeling it must – isn’t the mean the only summary statistic that is consistent with addition? – but the analysis looks tricky.

Alexander R Pruss said...

Very quick comment. I wonder if you can get something here if you add completeness of the order and use the central limit theorem. If you average enough independent copies of A you get close to a Gaussian centered on the mean and if you add enough independent copies of B you get another Gaussian. The Gaussians get narrower and narrower as you average more. If the mean of B is higher, you might have a shot at proving domination between the approximate Gaussians. But this may require not just the CLT but theorems about rates of convergence in the CLT. (And the ones I sort of remember don't seem to do the job.)

Alexander R Pruss said...

Nope, my suggestion can't work (nothing deep).

Alexander R Pruss said...

Both maximin and maximax satisfy my original axioms with Independent Additivity (the maxima and minima are additive when one has independence); in fact they satisfy the stronger version of Independent Additivity you get when you take my *original* Additivity, and add the requirements that A is independent of B and C is independent of D. And they satisfy all the other axioms in my post if we tweak the calculation of maxima and minima to neglect sets of measure zero (e.g., instead of defining the maximum of A as the greatest value of A, define it as the supremum of { t : P(A>t)>0 }; there must be a name for this quantity; if A is non-negative, it's equal to the L^infinity norm).

So if one's going to use Independent Additivity, one will need to bring in axioms that rule out maximin and maximax. vNM rule it out with their Archimedean/continuity axiom. But since what led me to thinking about this stuff was Pascal's Wager, where there is no hope for an Archimedean axiom, I don't want to go there.
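A small numerical illustration of the additivity claim, as a sketch only: it assumes finitely supported gambles represented as `{payoff: probability}` dictionaries, and the helper names (`ess_sup`, `ess_inf`, `independent_sum`) and example distributions are hypothetical. The supremum and infimum ignore payoffs of probability zero, in the spirit of the measure-zero tweak described above.

```python
from fractions import Fraction
from itertools import product


def ess_sup(dist):
    """Largest payoff that occurs with positive probability."""
    return max(x for x, p in dist.items() if p > 0)


def ess_inf(dist):
    """Smallest payoff that occurs with positive probability."""
    return min(x for x, p in dist.items() if p > 0)


def independent_sum(d1, d2):
    """Distribution of A+B when A and B are independent (a convolution)."""
    out = {}
    for (x, p), (y, q) in product(d1.items(), d2.items()):
        out[x + y] = out.get(x + y, 0) + p * q
    return out


def mean(dist):
    """Expected payoff of a finitely supported distribution."""
    return sum(x * p for x, p in dist.items())


# Illustrative gambles; the payoff 1000 in A has probability zero and is ignored.
A = {0: Fraction(99, 100), 10: Fraction(1, 100), 1000: Fraction(0)}
B = {5: Fraction(1)}
C = {-1: Fraction(1, 2), 3: Fraction(1, 2)}

S = independent_sum(A, C)
print(ess_sup(S) == ess_sup(A) + ess_sup(C))  # maximax is additive here
print(ess_inf(S) == ess_inf(A) + ess_inf(C))  # maximin is additive here
print(mean(A) < mean(B), ess_sup(A) > ess_sup(B))  # maximax disagrees with the mean
```

The last line shows why extra axioms are needed: maximax ranks A above B even though E(A) is far below E(B).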

Alexander R Pruss said...

Modified Additivity has something intuitively going for it. Assume *Constant* Additivity: if A<B then c+A<c+B. Freebies don't change things. (This won't work if the variables are denominated in dollars rather than utilities and there is a non-linear utility function. But I am denominating in utilities.)

If you have Constant Additivity but deny Modified Additivity you get the following paradox: You know for sure you're about to learn a piece of information. You also know for sure that no matter what you learn, that piece of information will make it rational for you to prefer game 2 over game 1. But nonetheless you do not yet prefer game 2 over game 1. (In other words, you have a violation of something that I think is called Extended Domination: if game 2 is preferable everywhere on a partition, it's preferable.)

Why? Well, suppose A<B but not C+A<C+B. Suppose game 2 has payoff C+B and game 1 has payoff C+A, and you're about to learn the value of C. Once you learn the value of C, by Constant Additivity game 2 will be preferable. But before you learn the value of C, game 2 is not preferable.

More: You've just heard the value of C, and so you prefer game 2. You're about to play game 2, but then the exact value of C slips from your mind. So now game 2 is no longer preferable. So do you need to keep the exact value of C *right before your mind's eye* for game 2 to be preferable?

I also suspect that if you add completeness and something like an Archimedean axiom, then you will end up with cases where you will be willing to pay not to receive information (you don't want to learn the value of C, because you know that when you do you will choose the other way, and by your lights that's a bad choice), as well as cases where by your lights it will pay to change your credences without evidence.

ockraz said...

So, is this an effort to address your December 31 scenario? When I said that one could have a rule which wasn't a rule about how to behave (eg, never lie), but a rule in the sense of a formula for calculating a value (eg, 4πR²)- this is the sort of thing I had in mind.

IanS said...

Alex:

Good point on min and max. To rule them out, I suggest replacing Domination with something like this:

If for all x, P(B ≤ x) ≤ P(A ≤ x) and for some x, P(B ≤ x) < P(A ≤ x), then A < B.

Note that this is actually a statement about the cumulative distribution functions. [As an aside, it still makes sense if you include +∞ and -∞, as in your Pascal’s wager post.]
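For finitely supported gambles the proposed condition can be checked mechanically. This is a sketch under that assumption; the function names and example distributions are illustrative, not from the thread. Because the cumulative distribution functions are step functions, it is enough to compare them at the payoffs that actually occur in either gamble.

```python
from fractions import Fraction


def cdf(dist, x):
    """P(gamble <= x) for a distribution given as {payoff: probability}."""
    return sum(p for value, p in dist.items() if value <= x)


def strictly_below(a, b):
    """True if P(B<=x) <= P(A<=x) for all x, with strict inequality for some x."""
    points = sorted(set(a) | set(b))  # the CDFs only change at these payoffs
    weak = all(cdf(b, x) <= cdf(a, x) for x in points)
    strict = any(cdf(b, x) < cdf(a, x) for x in points)
    return weak and strict


A = {0: Fraction(1, 2), 1: Fraction(1, 2)}
B = {0: Fraction(1, 4), 1: Fraction(3, 4)}
risky = {0: Fraction(1, 2), 2: Fraction(1, 2)}
safe = {1: Fraction(1)}

print(strictly_below(A, B))         # B shifts probability toward the higher payoff
print(strictly_below(B, A))
print(strictly_below(risky, safe))  # neither of these two dominates the other
print(strictly_below(safe, risky))
```

Here A and B share the same maximum and minimum, so a pure maximax or maximin rule would be indifferent between them, while the condition forces A<B. The risky and safe gambles are left unranked by the condition alone.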

Is the desired result about expectation still true without vNM? As I said before, it seems plausible but tricky. At a minimum, any proof would have to invoke the strict inequality bit of the above condition. And without an Archimedean condition, an inequality could be strict without being usefully big.

How much stronger is Modified Additivity than Independent Additivity? Can’t we do a 2-step replacement ( A + B < C + B < C + D ) , if not with the random variables themselves, then with probabilistically equivalent substitutes? OK, the order may not be total, so the middle term may not be comparable with the others. Is there more?

On the paradox, it’s above my pay grade (kinda like this whole discussion). But some quick googling shows that Buchak’s book has chapters on Consistency and Diachronic Choice. I haven’t read them, but I think the general idea is that you can be inconsistent without being irrational.

IanS said...

The above is not enough to rule out combinations like mean + max. Minor tweaks probably won't fix this. Perhaps an additional requirement of vNM independence (which uses mixing, not adding) might.

Alexander R Pruss said...

I was thinking of distribution conditions like that--I should then be able to remove my Equivalence condition--but the worry with such conditions is that they are going to be unintuitive to most philosophers.

(I've used analogues of the nonstrict version of this kind of distribution comparison very heavily in some of my math work.)

In any case, I still don't see how to use the distribution requirement plus the independent additivity to get the expectation condition.

IanS said...

Apologies, Independent Additivity looks like a dead end, at least for your purposes. Here’s why. Combinations like mean + max or mean + min are consistent with Independent Additivity and my modified Dominance. To rule them out we need more restrictions. The obvious way is to require some sort of continuity (vNM Archimedean, or similar) – note max and min are discontinuous in the probabilities. But this is just what you don’t want for Pascal’s wager. So it looks hopeless.

Alexander R Pruss said...

One might be able to use Independent Additivity together with completeness and a strengthening of Sure Thing. The strengthening of Sure Thing would say that if A and B are approximate Gaussians with EA<EB and with standard deviations that are much, much narrower than EB-EA, then A<B. After all, it would be crazy to choose an approximate Gaussian with a mean of 0 and standard deviation 0.001 over an approximate Gaussian with a mean of 10 and a standard deviation 0.001.
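To give a sense of how lopsided the Gaussian example is, here is a rough calculation. It is only a sketch: it assumes exact (not approximate) Gaussians and, as an extra assumption not made in the comment, that the two gambles are independent. The function name is hypothetical.

```python
from math import erfc, hypot, sqrt


def prob_first_exceeds(mu_a, sd_a, mu_b, sd_b):
    """P(A > B) for independent A ~ N(mu_a, sd_a^2) and B ~ N(mu_b, sd_b^2)."""
    # A - B is Gaussian with mean mu_a - mu_b and standard deviation hypot(sd_a, sd_b).
    z = (mu_b - mu_a) / hypot(sd_a, sd_b)
    # Upper tail of the standard normal: P(Z > z) = erfc(z / sqrt(2)) / 2.
    return 0.5 * erfc(z / sqrt(2))


# The example from the comment: means 0 and 10, both standard deviations 0.001.
print(prob_first_exceeds(0.0, 0.001, 10.0, 0.001))
```

With these numbers the means are several thousand standard deviations apart, so the probability that the lower-mean gamble pays more is too small to represent in double precision.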

The degree of approximation in "approximate Gaussian" can then be chosen to match the closeness we get from some estimate on the rate of convergence in the Central Limit Theorem for bounded random variables.

IanS said...

Is this thread still live, and are you still watching it?

I came across an unpublished paper Dominance-Based Decision Theory by Easwaran (apparently a conference presentation). His approach is like yours in using dominance and probabilistic equivalence. Note especially sections 4.2 (Utility Shifting) and 4.3 (Relative Expectation). Easwaran gets expectation from dominance by re-shaping distributions, as in your proof, but he justifies it by moving bits of utility between equally likely outcomes. His Decision Theory without Representation Theorems (unpublished draft) gives more detail. I have only scanned it, not studied it. It leads up to Section 3.5.4 Expected Utility, which again is similar to your proof.

Easwaran’s theorem, like yours, requires bounded utility. To make it work for distributions with unbounded utility but finite expectation, some new principle would be needed, some sort of tail-taming and limiting.

Easwaran’s draft is still a draft – does this indicate that he has abandoned the approach? Certainly the technical details are more technical than one would have hoped.

Alexander R Pruss said...

That's interesting. I suppose moving bits of utility between equally likely outcomes is not very different from Additivity.