Alexander Pruss's Blog: An interesting epistemic scoring rule

Friday, March 7, 2014

An interesting epistemic scoring rule

A forecast p is an assignment of probabilities to events in some space Ω. A proper score is an assignment of a random variable s_p to each forecast on that space, with the property that E_p(s_p) ≤ E_p(s_q) whenever p is a consistent forecast (one that satisfies the axioms of probability) and q is any other forecast. Here, E_p is expectation with respect to the probability function p. Propriety basically says that if we have a consistent forecast, then by our own lights no other forecast is expected to have a better score. The scores are thought of as penalties or distances from truth: smaller is better.

One thing proper scoring rules have been used for is to argue that our credences should be consistent. For instance, under a simplifying assumption, Predd et al. have basically shown that the proper score for an inconsistent forecast is always dominated (from below) by a proper score for some consistent forecast. The simplifying assumption is that scores are computed for individual events and added.

Now, here is a curious proper score that does not satisfy this simplifying assumption. Suppose we're working with a finite space Ω with n points. Suppose p is consistent. Let m(p) be a point of Ω at which p is maximized. (Use any tie-breaking method you like if that point isn't unique.) Then let s_p be 0 at m(p) and 1 everywhere else. Then if p and q are consistent, E_p(s_q) = 1 − p(m(q)) (where p(ω) = p({ω})). Since p(m(p)) ≥ p(m(q)) by definition of m, it follows that E_p(s_p) ≤ E_p(s_q). Observe that p(m(p)) ≥ 1/n, since the n point probabilities sum to 1. Thus, E_p(s_p) ≤ 1 − 1/n. Finally, if q is inconsistent, let s_q be 1 − 1/n everywhere. Then for any consistent p we have E_p(s_q) = 1 − 1/n ≥ E_p(s_p), and so s is a proper score.
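Here is a minimal sketch of the construction above, assuming a consistent forecast on an n-point space is represented by its list of point probabilities and that ties are broken by taking the first maximizer. The function names (`best_guess_score`, `constant_score`, `expected_score`, `check_propriety`) are mine, chosen for illustration. The check samples random forecasts, so it is a sanity test of the inequalities, not a proof.

```python
from fractions import Fraction
import random


def best_guess_score(p):
    """Score s_p for a consistent forecast p: 0 at m(p), 1 everywhere else.

    Assumption: ties are broken by taking the first maximizer.
    """
    m = max(range(len(p)), key=lambda i: p[i])
    return [0 if i == m else 1 for i in range(len(p))]


def constant_score(n):
    """Score assigned to any inconsistent forecast: 1 - 1/n at every point."""
    return [Fraction(n - 1, n)] * n


def expected_score(p, s):
    """E_p(s): expectation of the score s under point probabilities p."""
    return sum(pi * si for pi, si in zip(p, s))


def random_consistent_forecast(n, rng):
    """A random probability vector with exact rational entries."""
    weights = [rng.randint(1, 100) for _ in range(n)]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]


def check_propriety(n=4, trials=1000, seed=0):
    """Check E_p(s_p) <= E_p(s_q) for sampled consistent p, q,
    and against the constant score used for inconsistent forecasts."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = random_consistent_forecast(n, rng)
        q = random_consistent_forecast(n, rng)
        own = expected_score(p, best_guess_score(p))
        if own > expected_score(p, best_guess_score(q)):
            return False
        if own > expected_score(p, constant_score(n)):
            return False
    return True
```

Exact fractions are used so that the comparisons are not affected by floating-point rounding.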

For consistent forecasts, our s is a best guess score: a forecast's maximum point (with whatever tie breaker one likes) counts as the forecast's "best guess", and we get the perfect score 0 if we guessed right, and we get 1 otherwise. And for inconsistent forecasts, I just assigned a value that makes the score proper and, well, that makes what I am about to say true.

Namely: the above score s does not have the domination property that I talked about earlier. Let q be any inconsistent forecast. Then s_q is 1 − 1/n everywhere. If p is any consistent forecast, however, then s_p is 1 at all but one point. As long as n ≥ 2, there is at least one such point, and there s_p = 1 > 1 − 1/n = s_q. So s_p does not dominate s_q from below.
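The failure of domination can be checked by brute force, because s_p depends on a consistent p only through m(p), so there are just n possible score vectors to try. The sketch below assumes that "dominates from below" means being no larger at every point and strictly smaller at some point. The helper names are illustrative.

```python
from fractions import Fraction


def dominates_from_below(a, b):
    """True if a <= b at every point and a < b at some point."""
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def consistent_score_vectors(n):
    """All possible s_p for consistent p: 0 at one point, 1 elsewhere."""
    return [[0 if i == m else 1 for i in range(n)] for m in range(n)]


def some_consistent_score_dominates(n):
    """Does any consistent forecast's score dominate the constant 1 - 1/n?"""
    s_q = [Fraction(n - 1, n)] * n
    return any(
        dominates_from_below(s_p, s_q) for s_p in consistent_score_vectors(n)
    )
```

Calling `some_consistent_score_dominates(n)` for small n ≥ 2 returns `False`, matching the argument above.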

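Now, our score s is not a strictly proper score (Predd et al. actually work with strictly proper scores): for a strictly proper score s, whenever q differs from p and p is consistent, we will have E_p(s_p) < E_p(s_q). But we can make our score strictly proper. Fix a small positive constant c. Then s+cb, where b is the standard Brier score, will be strictly proper, since it is the sum of a proper score and a strictly proper one. But if c is small enough, s+cb will also fail to have the domination property: for a fixed inconsistent q, once c times the largest value of b_q is less than 1/n, the score of q is below 1 everywhere, while the score of any consistent p is at least 1 at every point other than m(p).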
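Here is a small sketch of s+cb on a three-point space. It assumes that the "standard Brier score" is the additive one, summing (forecast(A) − 1_A(ω))² over all 2^n events A; that reading, the value c = 1/100, and the particular forecasts are my choices for illustration. It shows a pair p, q where s alone gives a tie but s+cb strictly prefers p, and an inconsistent forecast (1/2 on every event, including the empty set and Ω) whose score stays below 1 everywhere.

```python
from fractions import Fraction
from itertools import combinations


def all_events(n):
    """All subsets of {0, ..., n-1}, as frozensets."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]


def forecast_from_points(p):
    """Consistent forecast on all events induced by point probabilities p."""
    return {A: sum((p[i] for i in A), Fraction(0)) for A in all_events(len(p))}


def brier(forecast, n):
    """Additive Brier penalty at each outcome w (assumed form of b)."""
    return [
        sum((v - (1 if w in A else 0)) ** 2 for A, v in forecast.items())
        for w in range(n)
    ]


def best_guess(p):
    """s_p: 0 at the first maximizer of p, 1 elsewhere."""
    m = max(range(len(p)), key=lambda i: p[i])
    return [0 if i == m else 1 for i in range(len(p))]


def combined(p, c):
    """s_p + c*b_p for a consistent forecast given by point probabilities."""
    b = brier(forecast_from_points(p), len(p))
    return [s + c * bi for s, bi in zip(best_guess(p), b)]


def expected(p, score):
    """E_p(score) under point probabilities p."""
    return sum(pi * si for pi, si in zip(p, score))


c = Fraction(1, 100)
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
q = [Fraction(2, 5), Fraction(3, 10), Fraction(3, 10)]

# s alone cannot separate p from q, since both have the same best guess.
tie = expected(p, best_guess(p)) == expected(p, best_guess(q))
# Adding c*b breaks the tie in favor of p's own forecast.
strict = expected(p, combined(p, c)) < expected(p, combined(q, c))

# An inconsistent forecast: 1/2 on every event. Its score is 1 - 1/n + c*b.
bad = {A: Fraction(1, 2) for A in all_events(3)}
bad_score = [Fraction(2, 3) + c * bi for bi in brier(bad, 3)]
```

Both `tie` and `strict` come out `True`. Every entry of `bad_score` is below 1, while `combined(p, c)` is at least 1 at each point other than m(p), so p does not dominate this inconsistent forecast.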

We should already have been suspicious of the argument for consistency based on proper scores and domination when proper scores were defined: the definition treated consistent forecasts in a special way (i.e., E_p(s_p) ≤ E_p(s_q) was only required when p is consistent; of course, it's hard to define E_p when p is inconsistent, so there is some excuse). But now we have even more reason to be suspicious: it is only some proper scores that have the property that scores of inconsistent forecasts are dominated by scores of consistent ones. Now, if we had some philosophical reason to think that the right way to score forecasts is by adding up scores for individual events, the domination argument would be in better shape. But I don't know of such a philosophical reason.
