Accuracy scoring rules measure the value of your probability
assignment’s closeness to the truth. A scoring rule for a single
proposition *p* can be thought
of as a pair of functions, *T*
and *F* on the interval [0,1] where *T*(*x*) tells us the score for
assigning *x* to *p* when *p* is true and *F*(*x*) tells us the score for
assigning *x* to *p* when *p* is false. The scoring rule is
proper provided that:

*x* *T*(*x*) + (1−*x*) *F*(*x*) ≥ *x* *T*(*y*) + (1−*x*) *F*(*y*)

for all *x* and *y*. If you assign probability *x* to *p*, then *x* *T*(*y*) + (1−*x*) *F*(*y*)
is your expected value of the score of someone who assigns *y*. The propriety condition thus says
that by your lights there isn’t a better probability to assign. After
all, if there were, wouldn’t you assign it?
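The propriety condition can be spot-checked numerically on a grid. Here is a minimal Python sketch; the helper name `is_proper` and the two test rules are my own illustrative choices, and a grid check can only refute propriety, not prove it.

```python
def is_proper(T, F, grid, tol=1e-12):
    """Check x*T(x) + (1-x)*F(x) >= x*T(y) + (1-x)*F(y) for all grid pairs."""
    for x in grid:
        own = x * T(x) + (1 - x) * F(x)
        for y in grid:
            other = x * T(y) + (1 - x) * F(y)
            if other > own + tol:
                return False
    return True

grid = [i / 100 for i in range(101)]

# Brier-style rule (higher score is better).
brier_T = lambda x: -(1 - x) ** 2
brier_F = lambda x: -x ** 2

# Linear rule, a standard example of an improper rule.
linear_T = lambda x: x
linear_F = lambda x: 1 - x

print(is_proper(brier_T, brier_F, grid))    # True
print(is_proper(linear_T, linear_F, grid))  # False
```

The linear rule fails because, for instance, someone with credence 0.7 expects a higher score from reporting 1 than from reporting 0.7.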

I’ve been playing with how to construct proper scoring rules for a
single proposition, and I found two nice ways that are probably in the
literature but that I haven’t seen stated explicitly. First, let *F* be any monotone (not necessarily
strictly) decreasing function on [0,1]
that is finite except perhaps at 1.
Then let:

*T*_{F}(*x*) = *F*(1/2) − ((1−*x*)/*x*) *F*(*x*) − ∫_{1/2}^{x} *u*^{−2} *F*(*u*) *du*.

I think we then have the following:

**Fact 1:** The pair (*T*_{F},*F*)
is a proper scoring rule.
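As a sanity check on the formula for *T*_{F}, here is a rough Python sketch that builds *T*_{F} numerically from a given *F*. The names `integrate` and `make_T` are hypothetical helpers of mine, and the midpoint rule is just a convenient quadrature choice. For *F*(*x*) = −*x*^{2}, working the integral by hand gives *T*_{F}(*x*) = 1/4 − (1−*x*)^{2}, which is the Brier score up to an additive constant.

```python
def integrate(f, a, b, n=2000):
    """Composite midpoint rule; a negative step handles b < a."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def make_T(F):
    """Return T_F for a given F, valid for 0 < x <= 1."""
    def T(x):
        integral = integrate(lambda u: F(u) / u ** 2, 0.5, x)
        return F(0.5) - (1 - x) / x * F(x) - integral
    return T

F = lambda x: -x ** 2
T = make_T(F)

# Compare the numerical T_F with the hand-computed closed form.
for x in (0.1, 0.5, 0.9):
    print(x, T(x), 0.25 - (1 - x) ** 2)
```

The two printed columns should agree up to rounding. The sketch avoids *x* = 0, where the formula divides by *x*.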

Second, let *T* be any
monotone increasing function on [0,1]
that is finite except perhaps at 0.
Let:

*F*_{T}(*x*) = *T*(1/2) − (*x*/(1−*x*)) *T*(*x*) + ∫_{1/2}^{x} (1−*u*)^{−2} *T*(*u*) *du*.

I think then we have the following:

**Fact 2:** The pair (*T*,*F*_{T})
is a proper scoring rule.
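The same kind of numerical check works for *F*_{T}. In this Python sketch (again with my own hypothetical helper names `integrate` and `make_F`), I take *T*(*x*) = log *x*. Integrating by parts by hand gives *F*_{T}(*x*) = log(1−*x*) + log 2, so the construction recovers the logarithmic rule up to an additive constant.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite midpoint rule; a negative step handles b < a."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def make_F(T):
    """Return F_T for a given T, valid for 0 < x < 1."""
    def F(x):
        integral = integrate(lambda u: T(u) / (1 - u) ** 2, 0.5, x)
        return T(0.5) - x / (1 - x) * T(x) + integral
    return F

T = math.log
F = make_F(T)

# Compare the numerical F_T with the hand-computed closed form.
for x in (0.1, 0.5, 0.9):
    print(x, F(x), math.log(1 - x) + math.log(2))
```

The columns should match to several decimal places; the residual difference is quadrature error, which grows as *x* approaches 1.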

In other words, to generate a proper scoring rule, we just need to choose one of the two functions making up the scoring rule, make sure it is monotone in the right direction, and then we can generate the other function.

Here’s a sketch of the proof of Fact 1. Note first that if *F* = *c* is constant, then
*T*_{F}(*x*) = *c* − ((1−*x*)/*x*)*c* + *c*(*x*^{−1}−(1/2)^{−1}) = *c* + *c* − 2*c* = 0
for all *x*. Since the map *F* ↦ *T*_{F}
is linear, it follows that if *F* and *H* differ by a constant, then *T*_{F} and *T*_{H} are the same.
Thus, subtracting a constant from *F* if necessary (for instance *F*(0), which is finite and is the largest value of *F*), we can assume without loss of
generality that *F* is
non-positive.

We can then approximate *F*
by functions of the form ∑_{i} *c*_{i} 1_{[*a*_{i},1]}
with *c*_{i}
non-positive (here I have to confess to not having checked all the
details of the approximation). Since *F* ↦ *T*_{F} is linear and non-negative combinations of proper scoring rules are proper, we only need to check
propriety for *F* = −1_{[*a*,1]}.
If *a* = 0, then *F* is constant and *T*_{F} will be zero,
and we will trivially have propriety. So suppose *a* > 0. Let *T*(*x*) = −((1−*x*)/*x*) *F*(*x*) − ∫_{0}^{x} *u*^{−2} *F*(*u*) *du*.
This differs by a constant from *T*_{F}, so (*T*_{F},*F*)
will be proper if and only if (*T*,*F*) is. Note that *T*(*x*) = 0 for *x* < *a* and for *x* ≥ *a* we have:

*T*(*x*) = ((1−*x*)/*x*) + ∫_{a}^{x} *u*^{−2} *du* = ((1−*x*)/*x*) − (*x*^{−1} − *a*^{−1}) = *a*^{−1} − 1.

Thus, *T* = ((1−*a*)/*a*) ⋅ 1_{[*a*,1]}.
Now let’s check if we have the propriety condition:

*x* *T*(*y*) + (1−*x*) *F*(*y*) ≤ *x* *T*(*x*) + (1−*x*) *F*(*x*).   (1)

Suppose first that *x* ≥ *a*. Then the
right-hand-side is *x*(1−*a*)/*a* − (1−*x*).
This is non-negative for *x* ≥ *a*, and the
left-hand-side of (1) is zero if *y* < *a*, so we are done if
*y* < *a*. Since *T* and *F* are constant on [*a*,1], the two sides of (1) are
equal for *y* ≥ *a*.

Now suppose that *x* < *a*. Then the
right-hand-side is zero. And the left-hand-side is zero unless *y* ≥ *a*. So suppose *y* ≥ *a*. Since *T* and *F* are constant on [*a*,1], we only need to check (1) at
*y* = 1. At *y* = 1, the left-hand-side of (1) is
*x*(1−*a*)/*a* − (1−*x*) ≤ 0
if *x* < *a*.
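The case analysis for the step functions can also be checked by brute force. Here is a small Python sketch, with `step_pair` and `worst_gap` as illustrative names of my own: for several thresholds *a* it computes the largest amount by which the expected score of some other *y* exceeds the expected score of *x* on a grid.

```python
def step_pair(a):
    """F = -1 on [a,1] and T = (1-a)/a on [a,1]; both are zero below a."""
    F = lambda x: -1.0 if x >= a else 0.0
    T = lambda x: (1 - a) / a if x >= a else 0.0
    return T, F

def worst_gap(T, F, grid):
    """Largest (expected score of y) minus (expected score of x) over the grid."""
    return max(
        (x * T(y) + (1 - x) * F(y)) - (x * T(x) + (1 - x) * F(x))
        for x in grid
        for y in grid
    )

grid = [i / 200 for i in range(201)]
for a in (0.1, 0.25, 0.5, 0.9):
    T, F = step_pair(a)
    print(a, worst_gap(T, F, grid))
```

If the argument is right, each printed gap is zero up to floating-point rounding: the maximum is attained at *y* = *x*, and no *y* does better.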

Fact 2 follows from Fact 1 together with the observation that (*T*,*F*) is proper if and only
if (*F*^{∗},*T*^{∗}) is proper,
where *T*^{∗}(*x*) = *T*(1−*x*)
and *F*^{∗}(*x*) = *F*(1−*x*).

## 13 comments:

"Second, let T be any monotone increasing function on [0,1] that is finite except perhaps at − 1."

Perhaps you meant

"Second, let T be any monotone increasing function on [0,1] that is finite except perhaps at 1." Is -1 in the domain of T?

Also, if F is assumed to be monotone decreasing, the case F=c is not consistent with that hypothesis.

Thanks for this post, I found the idea of proposition scoring to be very interesting.

Since F is monotone decreasing, we already know that F is differentiable almost everywhere in [0,1].

Sure, the possibilities are infinite.

I hadn't heard of the name Brier before, but that rule is equivalent to

T(x) = -(1-x)^2,

obtained when the kernel is

g(x) = x(1-x).

I'm not a philosopher, but causal finitism would seem to be a different beast from "possibility" finitism.

Kerry:

I meant 0, not -1. Fixed. Thanks!

And by "increasing" I meant "non-strictly". I.e., "non-decreasing".

Andrew:

The symmetry assumption is natural, but not inevitable. Here is one way to construct proper accuracy scoring rules. You are going to be given a choice between a number of different games, each of which has an outcome that depends on the proposition q. You choose the game with the highest expected outcome according to your probability you assigned to q. Your score then is the actual outcome of the game. (You can get different scoring rules for different tie-breaking procedures for equal expected outcomes.) This kind of a scoring rule tends not to satisfy the symmetry condition. For instance, suppose you have a choice between two games, one which pays $1 on q and $2 on ~q and one which pays $3 on q and $1 on ~q. Let p be your probability of q. Then you will choose game 1 if p is less than 1/3 and game 2 if p is more than 1/3. Then T(p)=1 if p<1/3 and T(p)=3 if p>1/3, while F(p)=2 if p<1/3 and F(p)=1 if p>1/3. (And what we have at 1/3 depends on the tie-breaking choice.) Then we don't have your symmetry condition.
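The two-game example can be sketched in Python as follows. The names `GAMES` and `choose` are mine, and I break ties in favor of the first game, which is just one of the possible tie-breaking choices.

```python
# Each game is (payoff if q is true, payoff if q is false).
GAMES = [(1, 2), (3, 1)]

def choose(p, games=GAMES):
    """Pick the game with the highest expected payoff at credence p.
    Python's max returns the first maximal item, so ties go to game 1."""
    return max(games, key=lambda g: p * g[0] + (1 - p) * g[1])

def T(p):
    """Score when q is true: the chosen game's payoff on q."""
    return choose(p)[0]

def F(p):
    """Score when q is false: the chosen game's payoff on ~q."""
    return choose(p)[1]

for p in (0.2, 0.5):
    print(p, T(p), F(p))
# prints "0.2 1 2" then "0.5 3 1"

# Propriety on a grid: no other credence y has a higher expected score by x's lights.
grid = [i / 60 for i in range(61)]
proper = all(
    x * T(y) + (1 - x) * F(y) <= x * T(x) + (1 - x) * F(x)
    for x in grid
    for y in grid
)
print(proper)  # True
```

Propriety holds here by construction: your own credence picks the game that maximizes expected payoff, so any other credence can only pick a game that is no better by your lights.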

Another family of cases of asymmetry is like this. If there is objective morality (or God or a lawlike universe, etc.), there is a lot of value in being nearly certain that there is objective morality (etc.) But if there is no objective morality (or no God or no lawlike universe, etc.), there is little value in being nearly certain of that.

By the way, I suspect that all single-proposition scoring rules are generated by the procedure in my post, up to an additive constant, but I don't have a proof yet. I can show that the differences have to have zero derivative almost everywhere, but that's not enough.

Andrew:

You should make the integral defining F go from something other than 0, as the integral from 0 to x of 1/x is infinite.

I wasn't attempting a general theory of scoring functions, just wanted to provide an easy way to generate them, including the common ones.

I misunderstood your intentions: you want custom scoring rules for a particular P; I thought you were looking for general scoring rules that could be used for any P.

For example, if you want to evaluate your weatherman, you probably want to evaluate his ability on all kinds of weather not just e.g. snow.

Also, you seem to want to derive scoring rules from a RV V dependent on the event P.

The scheme is:

1. estimate x as Pr(P);

2. use estimate x to calculate the optimal strategy;

3. use the actual payoff V as the score.

That should be guaranteed to form a proper scoring rule,

since the optimization is built in at step 2.

Is that right?

I was going to offer a response to Zsolt Nagy, but I think it's just for the best that you blocked him, Dr. Pruss. Hopefully, you can block his IP from commenting. You might have to block the IP of VPNs as well. I don't know if that's possible, though.

Btw, please remove my first comment - it was superseded by a correction.

It looks like my method does generate every proper scoring rule. There is an old paper of Schervish which in Theorem 4.2 and Lemma A.7 gives a characterization of all scoring rules. One can use the characterization to show that for every T there is a unique (up to additive constant) F and vice versa. https://projecteuclid.org/journals/annals-of-statistics/volume-17/issue-4/A-General-Method-for-Comparing-Probability-Assessors/10.1214/aos/1176347398.full
