Suppose we have a probability space *Ω* with algebra *F* of events, and a distinguished
subalgebra *H* of events on *Ω*. My interest here is in accuracy *H*-scoring rules,
which take a (finitely-additive) probability assignment *p* on *H* and assign to it an *H*-measurable score function *s*(*p*) on *Ω*, with values in [−∞,*M*] for some finite *M*. I will take the score
of a probability assignment *p* to represent the epistemic utility or
accuracy of *p*.

For a probability *p* on
*F*, I will take the score of
*p* to be the score of the
restriction of *p* to *H*. (Note that any finitely-additive
probability on *H* extends to a
finitely-additive probability on *F* by the Hahn-Banach theorem, assuming
Choice.)

The scoring rule *s* is
*proper* provided that *E*_{p}*s*(*q*) ≤ *E*_{p}*s*(*p*)
for all *p* and *q*, and strictly so if the inequality
is strict whenever *p* ≠ *q*. Propriety says that
one never expects a different probability from one’s own to have a
better score (if one did, wouldn’t one have switched to it?).

Say that the scoring rule *s*
is *open-minded* provided that for any probability *p* on *F* and any finite partition *V* of *Ω* into events in *F* with non-zero *p*-probability, the *p*-expected score of finding out
where in *V* we are and
conditionalizing on that is at least as big as the current *p*-expected score. If the scoring
rule is open-minded, then a Bayesian conditionalizer is never precluded
from accepting free information. Say that the scoring rule *s* is *strictly* open-minded
provided that the *p*-expected
score of finding out where in *V* we are and conditionalizing
is strictly greater than the current *p*-expected score whenever there is at least one event *E* in *V* such that *p*(⋅|*E*) differs from *p* on *H* and *p*(*E*) > 0.

Given a scoring rule *s*, let
the expected score function *G*_{s} on the
probabilities on *H* be defined
by *G*_{s}(*p*) = *E*_{p}*s*(*p*),
with the same extension to probabilities on *F* as scores had.

It is well-known that:

1. The (strict) propriety of *s* entails the (strict) convexity of *G*_{s}.

It is easy to see that:

2. The (strict) convexity of *G*_{s} implies the (strict) open-mindedness of *s*.

Neither implication can be reversed. To see this, consider the
single-proposition case, where *Ω* has two points, say 0 and 1, and
*H* and *F* are the powerset of *Ω*, and we are interested in the
proposition that one of these points, say 1, is the actual truth. The scoring rule
*s* is then equivalent to a pair
of functions *T* and *F* on [0,1] where *T*(*x*) = *s*(*p*_{x})(1)
and *F*(*x*) = *s*(*p*_{x})(0)
where *p*_{x}
is the probability that assigns *x* to the point 1. Then *G*_{s} corresponds
to the function *x**T*(*x*) + (1−*x*)*F*(*x*),
and each is convex if and only if the other is.
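As a concrete illustration of the single-proposition setup, here is a minimal sketch that assumes the logarithmic score as the example rule (my choice, not something fixed by the argument). The names `expected_score` and `G` are illustrative. It checks on a grid that the rule is proper and that the corresponding *G*_{s} is midpoint convex, which is what (1) leads us to expect.

```python
import math

def T(x):
    """Score when point 1 is actual, given credence x in point 1 (log score)."""
    return math.log(x)

def F(x):
    """Score when point 0 is actual, given credence x in point 1 (log score)."""
    return math.log(1 - x)

def expected_score(x, y):
    """E_{p_x} s(p_y): expectation by the lights of credence x of the score of credence y."""
    return x * T(y) + (1 - x) * F(y)

def G(x):
    """G_s(p_x) = x T(x) + (1 - x) F(x)."""
    return expected_score(x, x)

# Interior grid only, to avoid log(0) at the endpoints.
grid = [i / 100 for i in range(1, 100)]

# Propriety on the grid: no other credence is expected to score better than one's own.
is_proper = all(expected_score(x, y) <= G(x) + 1e-12 for x in grid for y in grid)

# Midpoint convexity of G on the grid.
is_midpoint_convex = all(
    G((a + b) / 2) <= (G(a) + G(b)) / 2 + 1e-12 for a in grid for b in grid
)

print(is_proper, is_midpoint_convex)
```

A grid check like this is only a sanity test, of course, not a proof.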

To see that the non-strict version of (1) cannot be reversed, suppose
(*T*,*F*) is a
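non-trivial proper scoring rule with the limit of *F*(*x*)/*x* as *x* goes to 0 finite. Now form a new scoring rule by
letting *T**(*x*) = *T*(*x*) + (1−*x*)*F*(*x*)/*x*.
Consider the scoring rule (*T**,0). The corresponding function
*x**T**(*x*) equals *x**T*(*x*) + (1−*x*)*F*(*x*), the expected score function of the proper rule (*T*,*F*), and so is
going to be convex. But the expected score of *p*_{y} by the lights of *p*_{x} under (*T**,0) is *x**T**(*y*), which for *x* > 0 is maximized wherever *T** is largest, so (*T**,0)
isn’t going to be proper unless *T** is constant, which isn’t going to
be true in general. The strict version is similar.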
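The construction can be checked numerically. The sketch below assumes the Brier score as the starting proper rule (*T*,*F*); that choice and the helper names are mine. For the Brier score, *F*(*x*)/*x* = −*x*, so the limit at 0 is finite, and *T** works out to *x* − 1.

```python
def T(x):
    """Brier score when point 1 is actual."""
    return -(1 - x) ** 2

def F(x):
    """Brier score when point 0 is actual."""
    return -x ** 2

def T_star(x):
    """T*(x) = T(x) + (1 - x) F(x) / x, evaluated only for x > 0."""
    return T(x) + (1 - x) * F(x) / x

def G(x):
    """Expected score function of the original rule (T, F)."""
    return x * T(x) + (1 - x) * F(x)

def G_star(x):
    """Expected score function of the new rule (T*, 0)."""
    return x * T_star(x)

def expected_star(x, y):
    """Expectation by the lights of credence x of the (T*, 0) score of credence y."""
    return x * T_star(y)

grid = [i / 100 for i in range(1, 101)]  # (0, 1], avoiding division by zero

# The two rules have the same expected score function.
same_G = all(abs(G(x) - G_star(x)) < 1e-12 for x in grid)

# That shared function is midpoint convex on the grid.
is_midpoint_convex = all(
    G_star((a + b) / 2) <= (G_star(a) + G_star(b)) / 2 + 1e-12
    for a in grid for b in grid
)

# But (T*, 0) is not proper: at credence 0.5 one expects credence 1 to score better.
improper_witness = expected_star(0.5, 1.0) > G_star(0.5)

print(same_G, is_midpoint_convex, improper_witness)
```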

To see that (2) cannot be reversed, note that the only non-trivial
partition is {{0}, {1}}. If our current
probability for 1 is *x*, the expected score upon learning
where we are is *x**T*(1) + (1−*x*)*F*(0).
Strict open-mindedness thus requires precisely that *x**T*(*x*) + (1−*x*)*F*(*x*) < *x**T*(1) + (1−*x*)*F*(0)
whenever *x* is neither 0 nor 1. It
is clear that this is not enough for convexity: we can have wild
oscillations of *T* and *F* on (0,1) as long as *T*(1) and *F*(0) are large enough.
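Here is one such oscillating rule as a sketch. The particular choice (a sine wave on the interior, with the value 2 for *T*(1) and *F*(0)) is an arbitrary assumption made for illustration. It is strictly open-minded on the grid, yet a search over the grid finds midpoint-convexity violations.

```python
import math

def T(x):
    """Score when point 1 is actual: large at x = 1, oscillating elsewhere."""
    return 2.0 if x == 1 else math.sin(20 * x)

def F(x):
    """Score when point 0 is actual: large at x = 0, oscillating elsewhere."""
    return 2.0 if x == 0 else math.sin(20 * x)

def G(x):
    """Current expected score x T(x) + (1 - x) F(x)."""
    return x * T(x) + (1 - x) * F(x)

def after_learning(x):
    """Expected score after learning which point is actual: x T(1) + (1 - x) F(0)."""
    return x * T(1) + (1 - x) * F(0)

grid = [i / 100 for i in range(1, 100)]  # credences strictly between 0 and 1

# Strict open-mindedness for the only non-trivial partition {{0}, {1}}.
strictly_open_minded = all(G(x) < after_learning(x) for x in grid)

# Pairs (a, b) where G at the midpoint exceeds the average of G(a) and G(b).
convexity_violations = [
    (a, b) for a in grid for b in grid
    if G((a + b) / 2) > (G(a) + G(b)) / 2 + 1e-9
]

print(strictly_open_minded, len(convexity_violations) > 0)
```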

Nonetheless, (2) can be reversed (both in the strict and non-strict versions) on the following technical assumption:

3. There is an event *Z* in *F* such that *Z* ∩ *A* is a non-empty proper subset of *A* for every non-empty member *A* of *H*.

This technical assumption basically says that there is a non-trivial
event that is logically independent of everything in *H*. In real life, the technical
assumption is always satisfied, because there will always be something
independent of the algebra *H*
of events we are evaluating probability assignments to (e.g., in many
cases *Z* can be the event that
the next coin toss by the investigator’s niece will be heads). I will
prove that (2) can be reversed in the Appendix.

It is easy to see that adding (3) to our assumptions doesn’t help reverse (1).

Since open-mindedness is pretty plausible to people of a Bayesian
persuasion, this means that convexity of *G*_{s} can be
motivated independently of propriety. Perhaps instead of focusing on
propriety of *s* as much as the
literature has done, we should focus on the convexity of *G*_{s}?

Let’s think about this suggestion. One of the most important uses of
scoring rules could be to evaluate the expected value of an experiment
prior to doing the experiment, and hence decide which experiment we
should do. If we think of an experiment as a finite partition *V* of the probability space with each
cell having non-zero probability by one’s current lights *p*, then the expected value of the
experiment is:

4. ∑_{A ∈ V} *p*(*A*)*E*_{p_A}*s*(*p*_{A}) = ∑_{A ∈ V} *p*(*A*)*G*_{s}(*p*_{A}),

where *p*_{A} is the result
of conditionalizing *p* on *A*. In other words, to evaluate the
expected values of experiments, all we care about is *G*_{s}, not *s* itself, and so the convexity of
*G*_{s} is a
very natural condition: we are never obligated to refuse to know the
results of free experiments.
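To make the formula for the expected value of an experiment concrete, here is a minimal sketch on a four-point space. It assumes that *H* = *F* is the full powerset and that the rule is the Brier score, for which *G*_{s}(*p*) is the sum of the squared point probabilities minus 1. Both assumptions, and all the names, are mine.

```python
def brier_G(p):
    """G_s(p) for the Brier score on a finite space: sum of p(w)^2, minus 1."""
    return sum(v * v for v in p.values()) - 1.0

def conditionalize(p, cell):
    """Return p conditionalized on the event `cell` (a set of points with non-zero mass)."""
    mass = sum(p[w] for w in cell)
    return {w: (p[w] / mass if w in cell else 0.0) for w in p}

def experiment_value(p, partition, G):
    """Sum over cells A of p(A) * G(p_A), where p_A is p conditionalized on A."""
    total = 0.0
    for cell in partition:
        mass = sum(p[w] for w in cell)
        total += mass * G(conditionalize(p, cell))
    return total

p = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}
V = [{"a", "b"}, {"c", "d"}]  # the experiment tells us which cell we are in

value = experiment_value(p, V, brier_G)
current = brier_G(p)
print(value, current, value >= current)
```

Only `brier_G` enters the computation; the score function itself never has to be evaluated, which is the point made above.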

However, at least in the case where *Ω* is finite, it
is known that any (strictly) convex function (maybe subject to some
growth conditions?) is equal to *G*_{u} for some
(strictly) proper scoring rule *u*. So we don’t really gain much
generality by moving from propriety of *s* to convexity of *G*_{s}. Indeed, the
above observations show that for finite *Ω*, a (strictly) open-minded way of
evaluating the expected epistemic values of experiments in a setting
rich enough to satisfy (3) is always generatable by a (strictly) proper
scoring rule.

In other words, if we have a scoring rule that is open-minded but not proper, we can find a proper scoring rule that generates the same prospective evaluations of the value of experiments (assuming no special growth conditions are needed).

**Appendix:** We now prove the converse of (2) assuming
(3).

Assume open-mindedness. Let *p*_{1} and *p*_{2} be two distinct
probabilities on *H* and let
*t* ∈ (0,1). We must show that
if *p* = *t**p*_{1} + (1−*t*)*p*_{2},
then

*G*_{s}(*p*) ≤ *t**G*_{s}(*p*_{1}) + (1−*t*)*G*_{s}(*p*_{2})

with the inequality strict if the open-mindedness is strict. Let
*Z* be as in (3). Define

*p*′(*A*∩*Z*) = *t**p*_{1}(*A*),

*p*′(*A*∩*Z*^{c}) = (1−*t*)*p*_{2}(*A*),

*p*′(*A*) = *p*(*A*)

for any *A* ∈ *H*.
Then *p*′ is a probability on
the algebra generated by *H* and
*Z* extending *p*. Extend it to a probability on
*F* by Hahn-Banach. By
open-mindedness:

*G*_{s}(*p*′) ≤ *p*′(*Z*)*E*_{p′_Z}*s*(*p*′_{Z}) + *p*′(*Z*^{c})*E*_{p′_{Z^c}}*s*(*p*′_{Z^c}).

But *p*′(*Z*) = *p*′(*Ω*∩*Z*) = *t**p*_{1}(*Ω*) = *t*
and *p*′(*Z*^{c}) = 1 − *t*.
Moreover, *p*′_{Z} = *p*_{1}
on *H* and *p*′_{Z^c} = *p*_{2}
on *H*. Since *H*-scores don’t care what the
probabilities are doing outside of *H*, we have *s*(*p*′_{Z}) = *s*(*p*_{1})
and *s*(*p*′_{Z^c}) = *s*(*p*_{2})
and *G*_{s}(*p*′) = *G*_{s}(*p*).
Moreover, our scores are *H*-measurable, so *E*_{p′_Z}*s*(*p*_{1}) = *E*_{p_1}*s*(*p*_{1})
and *E*_{p′_{Z^c}}*s*(*p*_{2}) = *E*_{p_2}*s*(*p*_{2}).
Thus the open-mindedness inequality becomes:

*G*_{s}(*p*) ≤ *t**G*_{s}(*p*_{1}) + (1−*t*)*G*_{s}(*p*_{2}).

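Hence we have convexity. And given strict open-mindedness, the inequality will be strict, since *p*_{1} ≠ *p*_{2} guarantees that *p*′_{Z} = *p*_{1} differs from *p* on *H* while *p*′(*Z*) = *t* > 0, and we get strict convexity.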
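The construction of *p*′ can be sanity-checked on a small finite model. In this sketch I assume that *Ω* consists of pairs (*h*, *z*) with *h* in {0, 1, 2} and *z* in {0, 1}, that *H* contains the events depending only on *h*, and that *Z* is the event *z* = 1, so that the technical assumption holds. The specific numbers and names are illustrative.

```python
t = 0.3
p1 = {0: 0.5, 1: 0.5, 2: 0.0}  # a probability on H, given by its values on the h-atoms
p2 = {0: 0.1, 1: 0.2, 2: 0.7}  # a second, distinct probability on H

# p'({(h, 1)}) = t * p1(h) and p'({(h, 0)}) = (1 - t) * p2(h).
p_prime = {}
for h in p1:
    p_prime[(h, 1)] = t * p1[h]
    p_prime[(h, 0)] = (1 - t) * p2[h]

def marginal_on_H(q):
    """Restrict a probability on the pairs (h, z) to H by summing out z."""
    out = {}
    for (h, z), mass in q.items():
        out[h] = out.get(h, 0.0) + mass
    return out

def conditional_on_z(q, z_value):
    """Conditionalize q on the event that the second coordinate equals z_value."""
    total = sum(m for (h, z), m in q.items() if z == z_value)
    return {k: (m / total if k[1] == z_value else 0.0) for k, m in q.items()}

# The mixture p = t p1 + (1 - t) p2 that p' is supposed to extend.
p = {h: t * p1[h] + (1 - t) * p2[h] for h in p1}

prob_Z = sum(m for (h, z), m in p_prime.items() if z == 1)
p_on_H = marginal_on_H(p_prime)
p_Z_on_H = marginal_on_H(conditional_on_z(p_prime, 1))
p_Zc_on_H = marginal_on_H(conditional_on_z(p_prime, 0))

print(prob_Z, p_on_H, p_Z_on_H, p_Zc_on_H)
```

Up to floating-point error, `prob_Z` is *t*, `p_on_H` is the mixture *p*, and the two conditional probabilities restrict to *p*_{1} and *p*_{2} on *H*, as the proof requires.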
