[This post uses the wrong concept of a strictly proper score. See the comments.]
A scoring rule for a credence assignment is a measure of the inaccuracy of the credences: the lower the value, the better.
A proper scoring rule is a scoring rule with the property that for each probabilistically consistent credence assignment P, the expected value according to P of the score of a credence assignment is minimized when that assignment is P itself (recall that lower is better). If the minimum is attained uniquely at P, the scoring rule is said to be strictly proper.
A scoring rule is additive provided that it is the sum of scoring rules each of which depends only on the credence assigned to a single proposition and the truth value of that proposition.
The formal epistemology literature has a lot of discussion of a strict domination theorem: given an additive strictly proper scoring rule, you will do better to have a credence assignment that is probabilistically consistent. Indeed, for any inconsistent credence assignment, another credence assignment will give a better score in every possible world.
The assumption of strict propriety gets a fair amount of discussion. Not so the assumption of additivity.
It turns out that if you drop additivity, the theorem fails. Indeed, this is trivial. Consider any strictly proper scoring rule s, and modify it to a rule s* that assigns the score −∞ to any inconsistent credence. Then any inconsistent credence receives the best possible score in every possible world. Moreover, the definition of strict propriety only involves the behavior of the scoring rule as applied to consistent credences, and hence s* is strictly proper if and only if s is. And, of course, s* is not additive.
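The s* construction can be sketched in a few lines. This is only an illustration under stated assumptions: s is taken to be the Brier score, and a credence assignment is represented just by its values on the singletons, so that "consistent" means non-negative and summing to 1. Neither choice comes from the argument above, and the function names are hypothetical.

```python
import math

def brier(cred, world):
    """Brier inaccuracy of singleton credences `cred` when point `world` is actual."""
    return sum((c - (1.0 if i == world else 0.0)) ** 2 for i, c in enumerate(cred))

def is_consistent(cred, tol=1e-9):
    """Simplified consistency check: non-negative singleton credences summing to 1."""
    return all(c >= 0 for c in cred) and abs(sum(cred) - 1.0) <= tol

def s_star(cred, world):
    """Modified rule: agree with `brier` on consistent credences, -inf otherwise."""
    return brier(cred, world) if is_consistent(cred) else -math.inf

if __name__ == "__main__":
    for cred in [(0.5, 0.5), (0.6, 0.6)]:
        print(cred, [s_star(cred, w) for w in range(len(cred))])
```

Since lower is better, the inconsistent assignment (0.6, 0.6) gets the best possible score in both worlds, so no other assignment can do strictly better than it anywhere.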
But of course my rule s* is very much ad hoc, and it is gerrymandered to reward inconsistency. Can we make a non-additive scoring rule for which the domination theorem fails that lacks such gerrymandering and is somewhat natural?
I think so. Consider a finite probability space Ω, with n points $\omega_1, \dots, \omega_n$ in it. Now, consider a scoring rule generated as follows.
Say that a simple gamble g on Ω is an assignment of values to the n points. Let G be a set of simple gambles. Imagine an agent who decides which simple gamble g in G to take by the following natural method: she calculates $\sum_i P(\{\omega_i\})\, g(\omega_i)$, where P is her credence assignment, and chooses the gamble g that maximizes this sum. If there is a tie, she has some tie-resolution mechanism. Then, we can say that the G-score of her credences is the negative of the utility gained from the gamble she chose. In other words, her G-score at location $\omega_i$ is $-g(\omega_i)$, where g is a maximally auspicious gamble according to her credences.
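The procedure just described can be sketched as follows for a finite set G. This is a minimal sketch, not a definitive implementation: gambles and credences are represented as tuples indexed by the points of Ω, the tie-resolution mechanism is assumed to be "take the first maximizer in list order", and the names `choose_gamble` and `g_score` are hypothetical.

```python
from typing import Sequence

def choose_gamble(credences: Sequence[float],
                  gambles: Sequence[Sequence[float]]) -> Sequence[float]:
    """Pick the gamble g maximizing sum_i P({omega_i}) * g(omega_i).

    Assumed tie-break: Python's max() returns the first maximizer in list order.
    """
    def auspiciousness(g: Sequence[float]) -> float:
        return sum(p * x for p, x in zip(credences, g))
    return max(gambles, key=auspiciousness)

def g_score(credences: Sequence[float],
            gambles: Sequence[Sequence[float]],
            world: int) -> float:
    """G-score at point `world`: the negative of the chosen gamble's payoff there."""
    return -choose_gamble(credences, gambles)[world]

if __name__ == "__main__":
    # Illustrative gambles on a two-point space: bet on point 0, bet on point 1, abstain.
    G = [(1.0, -1.0), (-1.0, 1.0), (0.0, 0.0)]
    P = (0.7, 0.3)
    print([g_score(P, G, w) for w in range(2)])
```

Only the credences in the singletons enter the calculation, which is why a rule of this kind is not sensitive to whether the rest of the credence assignment is consistent.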
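It is easy to see that the G-score is a proper score: by the lights of a consistent P, the gamble chosen on the basis of P has an expected utility at least as great as that of any other gamble in G, including the gamble chosen on the basis of any other credence assignment. Moreover, if there are never any ties in choosing the maximally auspicious gamble, the score is strictly proper.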
This is a very natural way to generate a score: we generate a score by looking at how well you would do when acting on the credences in the face of a practical decision. But any score generated in this way will fail to satisfy the domination theorem. Here’s why: the scoring rule scores any inconsistent non-negative credence P that is non-zero on some singleton the same way as it scores the consistent credence $P^*$ defined by $P^*(A) = \sum_{\omega \in A} P(\{\omega\}) / \sum_{\omega \in \Omega} P(\{\omega\})$. Thus, the domination theorem will fail to apply to any scoring rule generated in the above way, since a consistent credence is never dominated under a proper rule: a credence that scored strictly better than $P^*$ in every world would also have a strictly better expected score by the lights of $P^*$.
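The reason P and P* are scored alike can be written out in one line. Here c is shorthand (not used elsewhere in the post) for the total credence on the singletons, which is positive because P is non-negative and non-zero on some singleton:

```latex
\[
\sum_i P^*(\{\omega_i\})\, g(\omega_i)
  \;=\; \frac{1}{c} \sum_i P(\{\omega_i\})\, g(\omega_i),
\qquad
c \;=\; \sum_{\omega \in \Omega} P(\{\omega\}) \;>\; 0 .
\]
```

Dividing by a positive constant does not change which g in G maximizes the sum, so the same gambles are maximally auspicious under P and under P*. Assuming the tie-resolution mechanism also treats the two alike, the chosen gamble, and hence the score, is the same.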
The only thing that remains is to check that there is some natural strictly proper rule that can be generated using the above method. Here’s one. Let $G_n$ be the set of simple gambles whose values at the n points of Ω form a vector lying in the n-dimensional unit ball. In other words, each simple gamble $g \in G_n$ is such that $\sum_i g(\omega_i)^2 \le 1$.
A bit of easy constrained maximization using Lagrange multipliers shows that if P is a credence assignment on Ω such that $P(\{\omega_i\}) \ne 0$ for at least one point $\omega_i \in \Omega$, then there is a unique maximally auspicious gamble g and it is given by $g(\omega_j) = P(\{\omega_j\}) / \bigl(\sum_i P(\{\omega_i\})^2\bigr)^{1/2}$. Because of the uniqueness, we have a strictly proper scoring rule.
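For readers who want the maximization spelled out, here is one way it can go. The shorthand $p_i = P(\{\omega_i\})$ and $g_i = g(\omega_i)$ is introduced only for this sketch. Since the objective is linear and not identically zero, the maximum lies on the boundary of the ball, so the constraint can be taken as an equality:

```latex
\begin{align*}
\mathcal{L}(g, \lambda)
  &= \sum_i p_i g_i - \lambda \Bigl( \sum_i g_i^2 - 1 \Bigr), \\
\frac{\partial \mathcal{L}}{\partial g_j}
  &= p_j - 2 \lambda g_j = 0
  \quad\Longrightarrow\quad g_j = \frac{p_j}{2\lambda}, \\
\sum_j g_j^2 = 1
  &\quad\Longrightarrow\quad 2\lambda = \Bigl( \sum_i p_i^2 \Bigr)^{1/2}
  \quad\Longrightarrow\quad
  g_j = \frac{p_j}{\bigl( \sum_i p_i^2 \bigr)^{1/2}} .
\end{align*}
```

The positive root is the one that gives the maximum. Uniqueness can also be seen from the Cauchy-Schwarz inequality: $\sum_i p_i g_i \le \|p\|\,\|g\| \le \|p\|$, with equality only when $g = p / \|p\|$.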
The $G_n$-score of a credence assignment P is then $s(P, \omega_j) = -P(\{\omega_j\}) / \bigl(\sum_i P(\{\omega_i\})^2\bigr)^{1/2}$.
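The formula is easy to experiment with numerically. The sketch below is an illustration only: credences are represented by their values on the singletons, and the names `gn_score` and `expected_score` are hypothetical helpers, not anything from the literature.

```python
import math
from typing import Sequence

def gn_score(credences: Sequence[float], world: int) -> float:
    """G_n-score at point `world`: -P({w_j}) / sqrt(sum_i P({w_i})^2)."""
    norm = math.sqrt(sum(p * p for p in credences))
    if norm == 0.0:
        raise ValueError("score is undefined when every singleton credence is zero")
    return -credences[world] / norm

def expected_score(p: Sequence[float], q: Sequence[float]) -> float:
    """Expected G_n-score of credences q, computed by the lights of probabilities p."""
    return sum(p[i] * gn_score(q, i) for i in range(len(p)))

if __name__ == "__main__":
    p = (0.7, 0.3)        # consistent credences
    q = (0.5, 0.5)        # a rival consistent assignment
    doubled = (1.4, 0.6)  # inconsistent: singleton credences sum to 2
    print(expected_score(p, p), expected_score(p, q))
    print([gn_score(doubled, w) for w in range(2)], [gn_score(p, w) for w in range(2)])
```

By the lights of p, the expected score of p is lower (better) than that of q, as propriety requires, and the inconsistent assignment `doubled` is scored the same as p in every world, up to floating-point rounding.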
This looks fairly natural. The choice of $G_n$ seems fairly natural as well. There is no gerrymandering going on. And yet the domination theorem fails for the $G_n$-score. (I think any strictly convex set of simple gambles works in place of $G_n$, actually.)
Thus, absent some good argument for why the $G_n$-score is a bad way to score credences, it seems that the scoring rule domination argument isn’t persuasive.
More generally, consider any credence-based procedure for deciding between finite sets of gambles that has the following two properties:
The procedure yields a gamble that maximizes expected utility in the case of consistent credences, and
The procedure never recommends a gamble that is dominated by another gamble.
There are such procedures that apply to interesting classes of inconsistent credences and that are nonetheless pretty natural. Given any such procedure, we can extend it arbitrarily to apply to all inconsistent credences and assign a score to a credence assignment as the negative of the value of the selected gamble. We then have a proper score to which the domination theorem doesn’t apply. And if we make our set of gambles be the n-ball $G_n$, then the score is strictly proper.