Suppose we have a group of perfect Bayesian agents with the same evidence who nonetheless disagree. By definition of “perfect Bayesian agent”, the disagreement must be rooted in differences in priors between these peers. Here is a natural-sounding recipe for conciliating their disagreement: the agents go back to their priors, replace them with the arithmetic average of the priors within the group, and then re-update on all the evidence they had previously received. (And in so doing, they lose their status as perfect Bayesian agents, since this procedure is not a Bayesian update.)
Since the average of consistent probability functions is a consistent probability function, we maintain consistency. Moreover, the recipe is a conciliation in the following sense: whenever the agents previously all agreed on some posterior, they still agree on it after the procedure, and with the same credence as before. Whenever the agents disagreed on something, they now agree, and their new credence is strictly between the lowest and highest posteriors that the group assigned prior to conciliation.
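To see why, note what happens when the averaged prior is updated on the shared evidence E (a sketch, writing P_1, …, P_n for the agents' priors):

$$P_{\mathrm{avg}}(q \mid E) = \frac{\tfrac{1}{n}\sum_i P_i(q \wedge E)}{\tfrac{1}{n}\sum_i P_i(E)} = \sum_i \frac{P_i(E)}{\sum_j P_j(E)}\, P_i(q \mid E),$$

a convex combination of the individual posteriors, with weights proportional to how likely each prior made the evidence. It therefore preserves any unanimous posterior, and where the posteriors differ it lands strictly between the lowest and the highest.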
Here is a theory that can give a justification for this natural-sounding procedure. Start with natural law Bayesianism, an Aristotelian theory that holds that human nature sets constraints on what priors count as natural to human beings. Thus, just as it is unnatural for a human being to be ten feet tall, it is unnatural for a human being to have a prior of 10⁻¹⁰⁰ for there being mathematically elegant laws of nature. And just as there is a range of heights that is natural for a mature human being, there is a range of priors that is natural for the proposition that there are mathematically elegant laws.
Aristotelian natures, however, are connected with the actual propensities of the beings that have them. Thus, humans have a propensity to develop a natural height. Because of this propensity, an average height is likely to be a natural height. More generally, for any numerical attribute governed by a nature of kind K, the average value of that attribute amongst the Ks is likely to be within the natural range. Likely, but not certain. It is possible, for instance, to have a species whose average weight is too high or too low. But it’s unlikely.
Consequently, we would expect that if we average the values of the prior for a given proposition q over the human population, the average would be within the natural range for that prior. Moreover, as the size of a group increases, we expect the average value of an attribute over the group to approach the average value the attribute has in the full population. So if I am a member of the group of disagreeing evidence-sharing Bayesians, it is more likely that the group's average prior for q lies within the natural human range for that prior than it is that my own prior for q does: it is more likely that I have an unnatural height or weight than that the average in a larger group falls outside the natural range.
Thus, the prior-averaging recipe is likely to replace priors that are defectively outside the normal human range with priors within the normal human range. And that’s to the good rationally speaking, because on a natural law epistemology, the rational way for humans to reason is the same as the normal way for humans to reason.
It’s an interesting question how this procedure compares to the procedure of simply averaging the posteriors. Philosophically, there does not seem to be a good justification of the latter. It turns out, however, that typically the two procedures give nearly the same result. For instance, I had my computer randomly generate 100,000 pairs of priors on a four-point probability space, and compare the result of prior- to posterior-averaging. The average of the absolute value of the difference in the outputs was 0.028. So the intuitive, but philosophically unjustified, averaging of posteriors is close to what I think is the more principled averaging of priors.
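The post doesn't say exactly how the evidence and the compared proposition were generated, so here is one way such a simulation might look (a minimal sketch, not the original experiment: the random priors, the evidence as a random nonempty proper subset of the four points, the proposition q as a random subset, and names like random_prior and conditionalize are all my assumptions):

```python
import random

def random_prior(n=4):
    """A random probability vector over n points (normalized uniform weights)."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

def conditionalize(p, event):
    """Condition a probability vector p on an event (a set of point indices)."""
    pe = sum(p[i] for i in event)
    return [p[i] / pe if i in event else 0.0 for i in range(len(p))]

def trial(n=4):
    p1, p2 = random_prior(n), random_prior(n)
    # Evidence E: a random nonempty proper subset of the points.
    while True:
        e = {i for i in range(n) if random.random() < 0.5}
        if 0 < len(e) < n:
            break
    # Proposition q: a random subset of the points.
    q = {i for i in range(n) if random.random() < 0.5}
    # (a) Average the priors, then update on E.
    avg_prior = [(a + b) / 2 for a, b in zip(p1, p2)]
    post_avg_prior = conditionalize(avg_prior, e)
    # (b) Update each prior on E, then average the posteriors.
    post1, post2 = conditionalize(p1, e), conditionalize(p2, e)
    post_avg_post = [(a + b) / 2 for a, b in zip(post1, post2)]
    # Compare the two credences in q.
    return abs(sum(post_avg_prior[i] for i in q) -
               sum(post_avg_post[i] for i in q))

diffs = [trial() for _ in range(100_000)]
print(sum(diffs) / len(diffs))  # mean absolute difference between the two recipes
```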
The procedure also has an obvious generalization from the case where the agents share the same evidence to the case where they do not. What’s needed is for the agents to make a collective list of all their evidence, replace their priors by averaged priors, and then update on all the items in the collective list.
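A minimal sketch of this generalized recipe, assuming each item of pooled evidence can be represented as an event on a common finite space (the helper names are mine; updating on the events one at a time is equivalent to updating on their conjunction):

```python
def conditionalize(p, event):
    """Condition a probability vector p on an event (a set of point indices)."""
    pe = sum(p[i] for i in event)
    return [p[i] / pe if i in event else 0.0 for i in range(len(p))]

def pooled_update(priors, evidence):
    """Average the group's priors, then update on every item of pooled evidence."""
    n = len(priors[0])
    avg = [sum(p[i] for p in priors) / len(priors) for i in range(n)]
    for event in evidence:  # the collective evidence list
        avg = conditionalize(avg, event)
    return avg

# Two agents on a four-point space, pooling two pieces of evidence:
print(pooled_update([[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]],
                    [{0, 1, 2}, {1, 2, 3}]))  # -> [0.0, 0.5, 0.5, 0.0]
```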
Humans lived for millennia without even having the idea that there are mathematically elegant laws of nature (at least as we moderns understand them), let alone having priors for it.
If dolphins were intelligent enough to come up with the idea of such laws, or if there were sufficiently intelligent Martians, should their priors be different from ours?
It is not so easy to see how there could be natural, human-specific priors for the vast number of propositions we can come up with. It seems more plausible that there are non-Bayesian approaches to learning that apply to all sufficiently rational and intelligent beings.
The priors for things people don't think about can be defined by counterfactuals: if you were to form the concept of a proposition p, what probability should you assign to it?
It wouldn't surprise me if dolphins and Martians should have different priors, because their different environment might make it a good idea that they, say, jump to inductive conclusions faster or slower than we do.
The procedure described in this post was earlier described by Berntson and Isaacs: https://www.yoaavisaacs.com/uploads/6/9/2/0/69204575/a_new_prospect_for_epistemic_aggregation.pdf