Conciliationism holds that in cases of peer disagreement the two peers should move to a credence somewhere between their individual credences. In a recent post I presented a toy model of error of reasoning on which conciliationism was in general false. In this post, I will present another toy model with the same property.

Bayesian evidence is additive when instead of probability *p* one works with log-odds *λ*(*p*)=log(*p*/(1 − *p*)). From that point of view, it is natural to model error in the evaluation of the force of evidence as the addition of a normally-distributed term with mean zero to the log-odds.
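As a quick sanity check, here is a minimal Python sketch of the additivity (function names and likelihood-ratio values are mine, purely illustrative):

```python
import math

def logodds(p):
    """The transform lambda(p) = log(p / (1 - p))."""
    return math.log(p / (1 - p))

def prob(l):
    """Inverse transform: recover the probability from log-odds."""
    return 1 / (1 + math.exp(-l))

# Bayes' theorem in odds form: posterior odds = prior odds * likelihood
# ratio.  Taking logs, each independent piece of evidence simply *adds*
# its log likelihood ratio to the current log-odds.
prior = 0.5
lr1, lr2 = 3.0, 2.0  # illustrative likelihood ratios of two pieces of evidence
posterior = prob(logodds(prior) + math.log(lr1) + math.log(lr2))

# The same update done multiplicatively on the odds agrees:
odds = (prior / (1 - prior)) * lr1 * lr2
assert abs(posterior - odds / (1 + odds)) < 1e-12
```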

Suppose now that Alice and Bob evaluate their first-order evidence, which they know they have in common, and come to the individual conclusions that the probability of some *Q* is *α* and *β* respectively. Moreover, both Alice and Bob have the above additive model of their own error-proneness in the evaluation of first-order evidence, and in fact they assign the same standard deviation *σ* to the normal distribution. Finally, we assume that Alice and Bob know that their errors are independent.

Alice and Bob are good Bayesians. They will next apply a discount for their errors to their first-order estimates. You might think: “No discount needed. After all, the error could just as well be negative as positive, and the positive and negative possibilities cancel out, leaving a mean error of zero.” That’s mistaken: while the normal distribution is symmetric, what we are interested in is not the expected error in the log-odds, which is indeed zero, but the expected error in the probabilities. And once one transforms back from log-odds to probabilities, the distribution becomes asymmetric. A couple of weeks back, I worked out some formulas which can be numerically integrated with Derive.
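For readers without Derive, the discount can be reproduced by straightforward numerical integration. A sketch in Python (function names are mine; I assume a flat prior on the true log-odds, so the discounted probability is the expected value of the logistic transform of the estimated log-odds plus a mean-zero normal error):

```python
import math

def logit(p):
    """Log-odds transform lambda(p) = log(p / (1 - p))."""
    return math.log(p / (1 - p))

def logistic(x):
    """Inverse of logit."""
    return 1 / (1 + math.exp(-x))

def discount(p, sigma, half_width=8.0, steps=2000):
    """Second-order probability: E[logistic(logit(p) + sigma * Z)] with
    Z standard normal, computed by composite Simpson's rule."""
    lam = logit(p)
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z = -half_width + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * logistic(lam + sigma * z) * math.exp(-z * z / 2)
    return total * h / (3 * math.sqrt(2 * math.pi))
```

For instance, `discount(0.85, 1.0)` comes out near 0.81 and `discount(0.90, 1.0)` near 0.87, matching the table that follows.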

| First-order probability | σ | Second-order probability |
|---|---|---|
| 0.80 | 1.00 | 0.76 |
| 0.85 | 1.00 | 0.81 |
| 0.90 | 1.00 | 0.87 |
| 0.95 | 1.00 | 0.93 |
| 0.80 | 0.71 | 0.78 |
| 0.85 | 0.71 | 0.83 |
| 0.90 | 0.71 | 0.88 |
| 0.95 | 0.71 | 0.94 |

So, for instance, if Alice has a first-order estimate of 0.90 and Bob has a first-order estimate of 0.95, and they both have *σ* = 1 in their error models, they will discount to 0.87 and 0.93.

Let the discounted credences, after evaluation of the second-order evidence, be *α*^{*} and *β*^{*} respectively (the values depend on *σ*).

Very good. Now, Alice and Bob get together and aggregate their final credences. Let’s suppose they do so completely symmetrically, having all information in common. Here’s what they will do. The correct log-odds for *Q*, based on the correct evaluation of the evidence, equals Alice’s pre-discount log-odds log(*α*/(1 − *α*)) plus an unknown error term with mean zero and standard deviation *σ*, as well as equalling Bob’s pre-discount log-odds log(*β*/(1 − *β*)) plus an unknown error term with mean zero and standard deviation *σ*.

Now, there is a statistical technique we learn in grade school which takes a number of measurements of an unknown quantity, each with the same normally distributed error, and returns a measurement with a smaller normally distributed error. The technique is known as the *arithmetic mean*. The standard deviation of the error in the resulting averaged data point is *σ*/*n*^{1/2}, where *n* is the number of samples. So, Alice and Bob apply this technique. They back-calculate *α* and *β* from their final individual credences *α*^{*} and *β*^{*}, then calculate the log-odds, average them, and transform back to probabilities. And then they model the fact that there is still a normally-distributed error term, albeit one with standard deviation *σ*/2^{1/2}, so they adjust for that to get a final credence *α*^{**} = *β*^{**}.
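The whole procedure can be sketched in self-contained Python (function names are mine; the discount is the Simpson-rule expectation of the logistic of the log-odds plus normal noise, under a flat prior on the true log-odds):

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def logistic(x):
    return 1 / (1 + math.exp(-x))

def discount(p, sigma, half_width=8.0, steps=500):
    """Second-order probability E[logistic(logit(p) + sigma * Z)],
    Z ~ N(0, 1), computed by composite Simpson's rule."""
    lam = logit(p)
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z = -half_width + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * logistic(lam + sigma * z) * math.exp(-z * z / 2)
    return total * h / (3 * math.sqrt(2 * math.pi))

def undiscount(p_star, sigma):
    """Invert discount(., sigma) by bisection (it is increasing in p)."""
    lo, hi = 1e-9, 1 - 1e-9
    for _ in range(80):
        mid = (lo + hi) / 2
        if discount(mid, sigma) < p_star:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def aggregate(alpha_star, beta_star, sigma):
    """Undo each agent's discount, average the pre-discount log-odds,
    and re-discount with the reduced deviation sigma / sqrt(2)."""
    avg = (logit(undiscount(alpha_star, sigma)) +
           logit(undiscount(beta_star, sigma))) / 2
    return discount(logistic(avg), sigma / math.sqrt(2))
```

With *σ* = 1, `aggregate(0.81, 0.81, 1.0)` comes out near 0.83, higher than either input.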

So what do we get? Do we get conciliationism, so that their aggregated credence *α*^{**} = *β*^{**} is in between their individual credences? Sometimes, of course, we do. *But not always*.

Observe first what happens if *α*^{*} = *β*^{*}. “But then there is no disagreement and nothing to conciliate!” True, but there is still data to aggregate. If *α*^{*} = *β*^{*}, the error discount is computed with a standard deviation smaller by a factor of the square root of two. In fact, the table above shows what will happen, because (not by coincidence) 0.71 is approximately the reciprocal of the square root of two. Suppose *σ* = 1. If *α*^{*} = *β*^{*} = 0.81, this came from pre-correction values *α* = *β* = 0.85. When corrected with the smaller standard deviation of 0.71, we get a corrected value *α*^{**} = *β*^{**} = 0.83. In other words, by aggregating each other’s data, Alice and Bob raise their credence in *Q* from 0.81 to 0.83.

But all the formulas here are quite continuous. So if *α*^{*} = 0.8099 and *β*^{*} = 0.8101, the aggregation will still yield a final credence of approximately 0.83 (I am not bothering with the calculation at this point). So, when conciliating 0.8099 and 0.8101, you get a final credence that is *higher* than either one. Conciliationism is thus false.

The intuition here is this. When the two credences are reasonably close, the amount by which averaging reduces error overcomes the downward movement in the higher credence.

Of course, there will also be cases where aggregation of data does generate something in between the two data points. I conjecture that on this toy model, as on my previous one, this will be the case whenever the two credences are on opposite sides of 1/2.

## 9 comments:

I wonder if an increased standard deviation (from summing variances when combining two measures of data) would change anything here? You appear to use the same sigma for all the standard deviations.

The standard deviation sigma is the one that affects the log-odds of each individual agent. But the standard deviation in the *average* of the two agents' log-odds estimates is smaller, by a factor of the square root of two (precisely because variances add).

The calculations in this post unacceptably neglect the priors on the evidential force.

Is higher-order Bayesianism (i.e. ordinary Bayesianism plus cognitive error model) really Bayesian? (I take it that this was the point of Heath White’s comment on your earlier post.)

Here is a simple, strictly Bayesian, model. There are two similar-looking but biased coins. Coin A has Pr(Heads) = 0.9. Coin B has Pr(Heads) = 0.1. A third party picks one coin randomly with probability 0.5, flips it once, and briefly shows you the outcome. You see Heads. But there is a catch. Your vision is unreliable. The probability that you see Heads or Tails correctly is 0.5 + Δ (i.e. Pr(See Heads | Heads) = 0.5 + Δ, etc.). What is your posterior credence that the coin is A? You apply Bayes and get Pr(A | See Heads) = 0.5 + 0.8Δ. Now suppose that a second person with similarly but independently unreliable vision also sees the outcome. If she also sees Heads (which you can infer from her stated posterior credence), you can again apply Bayes to get Pr(A | both see Heads) = 0.5 + (0.8Δ) / (0.5 + 2Δ²). For 0 < Δ < 0.5, this is clearly greater than your original credence. If she sees Tails, your new credence would be 1/2. These results are entirely intuitive. The second person is in effect just a second independent pair of eyes. If you agree, you have more confidence in what you saw, so your credence is more extreme. If you disagree, less.

But note, this is not second order Bayesianism. Your cognitive faculties are taken to be perfect: it’s only your vision that is faulty. You are treating what you see as ordinary evidence that can be modelled probabilistically. It seems to me that any truly Bayesian approach must work like this, i.e. it must ‘top out’ in perfect cognitive faculties, with everything else taken as evidence that can be probabilistically modelled.

For what it’s worth, I doubt that real people work like this. That’s one reason I have trouble with Bayesian approaches, except in special, textbook-style circumstances.
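The coin model above can be checked by brute-force enumeration. A Python sketch (the function name is illustrative):

```python
from itertools import product

def coin_posteriors(delta):
    """Pr(coin A | you see Heads) and Pr(A | both observers see Heads),
    by enumerating coin x true outcome x each observer's perception."""
    p_heads = {"A": 0.9, "B": 0.1}
    correct = 0.5 + delta  # probability of perceiving the outcome correctly
    num1 = den1 = num2 = den2 = 0.0
    for coin, outcome, see1, see2 in product("AB", "HT", "HT", "HT"):
        p = 0.5  # prior on the coin
        p *= p_heads[coin] if outcome == "H" else 1 - p_heads[coin]
        p *= correct if see1 == outcome else 1 - correct
        p *= correct if see2 == outcome else 1 - correct
        if see1 == "H":
            den1 += p
            num1 += p if coin == "A" else 0.0
            if see2 == "H":
                den2 += p
                num2 += p if coin == "A" else 0.0
    return num1 / den1, num2 / den2

# With delta = 0.25: 0.5 + 0.8 * 0.25 = 0.7 after one observer, and
# 0.5 + (0.8 * 0.25) / (0.5 + 2 * 0.25**2) = 0.82 after both see Heads.
```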

Regarding my worries about priors, as long as the priors on the evidential force (measured as additive in log-odds space) are flat within several sigmas of Alice's and Bob's first order estimates, what I said works out.

Ian:

That's interesting. I guess you could modify the story by introducing two new characters, Alice* and Bob*. Alice* and Bob* don't have access to Alice and Bob's first order evidence. All they have available are Alice and Bob's respective first-order evaluations, which they simply take to be evidence in a straightforward conditionalization. And then when you bring Alice* and Bob* together, you don't have peerhood, since Alice* and Bob* have different evidence: Alice* knows only about Alice's evaluation and Bob* knows only about Bob's evaluation. Alice* and Bob* aggregate their data in a standard Bayesian way, and get the results I mention.

And then my original story merges people with their starred versions.

"But the standard deviation in the *average* of the two agents' log-odds estimates is smaller, by a factor of the square root of two"

Convergence of the standard error of the mean assumes a consistent estimator. But if half the estimates are biased in a uniformly random way, I am dubious that the estimator is consistent enough for it to converge.

This may mean that with larger sample sizes a cynical estimator that sets all estimates (that are not the same with Alice and Bob) to 0.5 may do just about as well as any other formula.

William:

That's right, but only regarding my other model.

You can model people that way, but then they are not strictly Bayesian. As you say above and in your newer post, A* and B* implicitly use priors on the evidential force. In a strictly Bayesian approach, these priors would be derived from a probability model.

Here is a sketch of a strictly Bayesian model that works much like the one in the post, but requires no numerical integration.

One of two hypotheses about a random variable X is true. Under Hypothesis A, X is distributed as N(μ, V); under Hypothesis B, as N(−μ, V). Each hypothesis has prior probability 1/2.

You observe X. The log-odds for A, calculated from the Normal density, is 2μX/V. Suppose there is an additive error, distributed as N(0, W), in reporting the log-odds to the 2nd level. Then under A, the reported log-odds is distributed as N(2μ²/V, 4μ²/V + W); under B, it is distributed as N(−2μ²/V, 4μ²/V + W). So the 2nd level log-odds for A, taking into account the reporting error, is (2μX/V) / (1 + VW/(4μ²)). Call this R. The 2nd level posterior probability can be calculated from R as exp(R) / (1 + exp(R)). This is an exact Bayesian result.

If W = 0 (i.e. no reporting error), R is the same as the first order result (as it should be). As W increases, R shrinks towards 0, again as it should. If two people share their 2nd level info as in the post, W should be replaced by W/2. This expands R away from zero. So, as in the post, if Alice and Bob agree on a posterior probability greater than 0.5, their merged probability will be higher.

Now for the point. Note that the adjustment factor 1/(1 + VW/(4μ²)) depends not only on W (the variance of the reporting error), but also on all the other parameters through μ²/V. So no generic 2nd level prior on evidential force can match the strict Bayesian result.
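The algebra above can be confirmed numerically. A Python sketch (the parameter values μ = 1, V = 2, W = 0.5 and the reported log-odds are arbitrary test values):

```python
import math

def normpdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

mu, V, W = 1.0, 2.0, 0.5  # arbitrary test parameters
L = 1.3                   # an arbitrary reported first-level log-odds

# Direct Bayes at the 2nd level: the reported log-odds is distributed
# N(+2mu^2/V, 4mu^2/V + W) under A and N(-2mu^2/V, 4mu^2/V + W) under B,
# with equal priors, so the posterior log-odds is a log-density ratio.
m, S = 2 * mu * mu / V, 4 * mu * mu / V + W
direct = math.log(normpdf(L, m, S) / normpdf(L, -m, S))

# Closed form from the comment: shrink the reported log-odds
# by the factor 1 / (1 + V*W/(4*mu^2)).
formula = L / (1 + V * W / (4 * mu * mu))
assert abs(direct - formula) < 1e-9
```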

Thanks, Ian, for your help.

On reflection, a serious problem with all these models is that it's not clear that Alice and Bob can be said to have the same evidence. For 2nd-level-Alice has available to her the data from 1st-level-Alice, while 2nd-level-Bob has available to him the data from 1st-level-Bob. And this is different data, hence different evidence.

I am not sure, though, that the conciliationist can make this complaint. For it seems that something like *this* difference of evidence will be present whenever two peers evaluate things differently.
