
Tuesday, February 25, 2025

Being known

The obvious analysis of “p is known” is:

  1. There is someone who knows p.

But this obvious analysis doesn’t seem correct, or at least there is an interesting use of “is known” that doesn’t fit (1). Imagine a mathematics paper that says: “The necessary and sufficient conditions for q are known (Smith, 1967).” But what if the conditions are long and complicated, so that no one can keep them all in mind? What if no one who read Smith’s 1967 paper remembers all the conditions? Then no one knows the conditions, even though it is still true that the conditions “are known”.

Thus, (1) is not necessary for a proposition to be known. Nor is this a rare case. I expect that more than half of the mathematics articles from half a century ago contain some theorem or at least lemma that is known but which no one knows any more.

I suspect that (1) is not sufficient either. Suppose Alice is dying of thirst on a desert island. Someone, namely Alice, knows that she is dying of thirst, but it doesn’t seem right to say that it is known that she is dying of thirst.

So if it is neither necessary nor sufficient for p to be known that someone knows p, what does it mean to say that p is known? Roughly, I think, it has something to do with accessibility. Very roughly:

  2. Somebody has known p, and the knowledge is accessible to anyone who has appropriate skill and time.

It’s really hard to specify the appropriateness condition, however.

Does all this matter?

I suspect so. There is a value to something being known. When we talk of scientists advancing “human knowledge”, it is something like this “being known” that we are talking about.

Imagine that a scientist discovers p. She presents p at a conference where 20 experts learn p from her. Then she publishes it in a journal, whereupon 100 more people learn it. Then a YouTuber picks it up and now a million more people know it.

If we understand the value of knowledge as something like the sum of epistemic utilities across humankind, then the successive increments in value go like this: first, we have a move from zero to some positive value V when the scientist discovers p. Then at the conference, the value jumps from V to 21V. Then after publication it goes from 21V to 121V. Then given YouTube, it goes from 121V to 1000121V. The jump at initial discovery is by far the smallest, and the biggest leap is when the discovery is publicized. This strikes me as wrong. The big leap in value is when p becomes known, which happens either when the scientist discovers it or when it is presented at the conference. The rest is valuable, but not so big in terms of the value of “human knowledge”.
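
Just to make the bookkeeping explicit, here is the arithmetic as a tiny sketch (Python; the stage counts are the ones from the story above, with the YouTube audience taken to be a million additional people):

    # Epistemic value modeled as V per person who knows p, summed across people.
    knowers = {"discovery": 1, "conference": 21, "publication": 121, "YouTube": 1_000_121}
    previous = 0
    for stage, count in knowers.items():
        print(stage, f"value = {count}V", f"increment = {count - previous}V")
        previous = count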

Monday, February 24, 2025

Epistemically paternalistic lies

Suppose Alice and Bob are students and co-religionists. Alice is struggling with a subject and asks Bob to pray that she might do fine on the exam. She gets 91%. Alice also knows that Bob’s credence in their religion is a bit lower than her own. When Bob asks her how she did, she lies that she got 94%, in order to boost Bob’s credence in their religion a bit more.

Whether a religion is correct is very epistemically important to Bob. But whether Alice got 91% or 94% is not at all epistemically important to Bob except as evidence for whether the religion is correct. The case can be so set up that by Alice’s lights—remember, she is more confident that the religion is correct than Bob is—Bob can be expected to be better off epistemically for boosting his credence in the religion. Moreover, we can suppose that there is no plausible way for Bob to find out that Alice lied. Thus, this is an epistemically paternalistic lie expected to make Bob be better off epistemically.

And this lie is clearly morally wrong. Thus, our communicative behavior is not merely governed by maximization of epistemic utility.

More on averaging to combine epistemic utilities

Suppose that the right way to combine epistemic utilities across people is averaging: the overall epistemic utility of the human race is the average of the individual epistemic utilities. Suppose, further, that each individual epistemic utility is strictly proper, and you’re a “humanitarian” agent who wants to optimize overall epistemic utility.

Suppose you’re now thinking about two hypotheses about how many people exist: the two possible numbers are m and n, which are not equal. All things considered, you have credence 0 < p0 < 1 in the hypothesis Hm that there are m people and 1 − p0 in the hypothesis Hn that there are n people. You now want to optimize overall epistemic utility. On an averaging view, if Hm is true, if your credence is p1, your contribution to overall epistemic utility will be:

  • (1/m)T(p1)

and if Hm is false, your contribution will be:

  • (1/n)F(p1),

where your strictly proper scoring rule is given by T, F. Since your credence is p0, by your lights the expected value after changing your credence to p1 will be:

  • p0(1/m)T(p1) + (1−p0)(1/n)F(p1) + Q

where Q is the contribution of other people’s credences, which I assume you do not affect with your choice of p1. If m ≠ n and T, F is strictly proper, the expected value will be maximized at

  • p1 = (p0/m)/(p0/m+(1−p0)/n) = np0/(np0+m(1−p0)).

If m > n, then p1 < p0 and if m < n, then p1 > p0. In other words, as long as n ≠ m, if you’re an epistemic humanitarian aiming to improve overall epistemic utility, any credence strictly between 0 and 1 will be unstable: you will need to change it. And indeed your credence will converge to 0 if m > n and to 1 if m < n. This is absurd.
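
Here is a quick numerical check of the formula (a sketch in Python; the Brier score and the particular values of m, n and p0 are just illustrative choices):

    # Check that the value-maximizing credence p1 equals np0/(np0 + m(1-p0))
    # when overall epistemic utility is the average of individual utilities.
    import numpy as np

    def T(x): return -(1 - x)**2   # Brier score if the hypothesis is true
    def F(x): return -x**2         # Brier score if the hypothesis is false

    m, n, p0 = 10, 3, 0.6          # illustrative values

    def expected_value(p1):
        # your expected contribution to the average, by your lights (credence p0);
        # other people's contribution is a constant in p1 and can be ignored
        return p0 * (1/m) * T(p1) + (1 - p0) * (1/n) * F(p1)

    grid = np.linspace(0.001, 0.999, 99901)
    best = grid[np.argmax(expected_value(grid))]
    print(best, n * p0 / (n * p0 + m * (1 - p0)))   # both approximately 0.31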

I conclude that we shouldn’t combine epistemic utilities across people by averaging the utilities.

Idea: What about instead computing the epistemic utility of the average credences by applying a strictly proper scoring rule to them, in effect imagining that humanity is one big committee and that a committee’s credence is the average of the individual credences?

This is even worse, because it leads to problems even without considering hypotheses on which the number of people varies. Suppose that you’ve just counted some large number nobody cares about, such as the number of cars crossing some intersection in New York City during a specific day. The number you got is even, but because the number is big, you might well have made a mistake, and so your credence that the number is even is still fairly low, say 0.7. The billions of other people on earth all have credence 0.5, and because nobody cares about your count, you won’t be able to inform them of your “study”, and their credences won’t change.

If combined epistemic utility is given by applying a proper scoring rule to the average credence, then by your lights the expected value of the combined epistemic utility will increase the more you can budge the average credence upward, as long as you don’t push it above your own credence. Since you can really only affect your own credence, as an epistemic humanitarian your best bet is to set your credence to 1, thereby increasing overall human credence from 0.5 to around 0.5000000001, and making a tiny improvement in the expected value of the combined epistemic utility of humankind. In doing so, you sacrifice your own epistemic good for the epistemic good of the whole. This is absurd!
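
A back-of-the-envelope check of this (Python sketch; the Brier score, the population size and the credences are all illustrative assumptions):

    # If combined epistemic utility is the Brier score of the *average* credence,
    # and everyone else sits at 0.5, then by your lights (credence 0.7 that the
    # count is even) the expected combined utility keeps creeping up as you push
    # your announced credence all the way to 1.
    N = 8_000_000_000        # rough world population; illustrative
    my_credence = 0.7

    def T(x): return -(1 - x)**2
    def F(x): return -x**2

    def expected_combined(reported):
        avg = (reported + (N - 1) * 0.5) / N       # average human credence
        return my_credence * T(avg) + (1 - my_credence) * F(avg)

    for reported in (0.5, 0.7, 0.9, 1.0):
        print(reported, expected_combined(reported))
    # The printed values increase (by tiny amounts) as the reported credence rises.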

I think the idea of averaging to produce overall epistemic utilities is just wrong.

Monday, July 31, 2023

Values of disagreement

We live in a deeply epistemically divided society, with lots of different views, including on some of the most important things.

Say that two people disagree significantly on a proposition if one believes it and one disbelieves it. The deep epistemic division in society includes significant disagreement on many important propositions. But whenever two people significantly disagree on a proposition, one of them is wrong. Being wrong about an important proposition is a very bad thing. So the deep division implies some very bad stuff.

Nonetheless, I’ve been thinking that our deep social disagreement leads to some important advantages as well. Here are three that come to mind:

  1. If two people significantly disagree on a proposition, then by bivalence, one of them is right. There is a value in someone getting a matter right, rather than everyone getting it wrong or suspending judgment.

  2. Given our deep-seated psychological desire to convince others that we’re right, if others disagree with us, we will continue seeking evidence in order to convince them. Thus disagreement keeps us investigating, which is beneficial whether or not we are right. If everyone agreed with us, we would be apt to stop investigating, which would either leave us stuck with a falsehood or at least likely provide us with less evidence of the truth than is available. Moreover, continued investigation is apt to refine our theory, even if the theory was already basically right.

  3. To avoid getting stuck in local maxima in our search for the best theory, it is good if people are searching in very different areas of epistemic space. Disagreement helps make that happen.

Wednesday, July 26, 2023

Committee credences

Suppose the members of a committee individually assign credences or probabilities to a bunch of propositions—maybe propositions about climate change or about whether a particular individual is guilty or innocent of some alleged crimes. What should we take to be “the committee’s credences” on the matter?

Here is one way to think about this. There is a scoring rule s that measures the closeness of a probability assignment to the truth that is appropriate to apply in the epistemic matter at hand. The scoring rule is strictly proper (i.e., such that an individual by their own lights is always prohibited from switching probabilities without evidence). The committee can then be imagined to go through all the infinitely many possible probability assignments q, and for each one, member i calculates the expected value Epis(q) of the score of q by the lights of the member’s own probability assignment pi.

We now need a voting procedure between the assignments q. Here is one suggestion: calculate a “committee score estimate” for q in the most straightforward way possible—namely, by adding the individuals’ expected scores, and choose an assignment that maximizes the committee score estimate.

It’s easy to prove that given that the common scoring rule is strictly proper, the probability assignment that wins out in this procedure is precisely the average  = (p1+...+pn)/n of the individuals’ probability assignments. So it is natural to think of “the committee’s credence” as the average of the members’ credences, if the above notional procedure is natural, which it seems to be.

But is the above notional voting procedure the right one? I don’t really know. But here are some thoughts.

First, there is a limitation in the above setup: we assumed that each committee member had the same strictly proper scoring rule. But in practice, people don’t. People differ with regard to how important they regard getting different propositions right. I think there is a way of arguing that this doesn’t matter, however. There is a natural “committee scoring rule”: it is just the sum of the individual scoring rules. And then we ask each member i when acting as a committee member to use the committee scoring rule in their voting. Thus, each member calculates the expected committee score of q, still by their own epistemic lights, and these are added, and we maximize, and once again the average will be optimal. (This uses the fact that a sum of strictly proper scoring rules is strictly proper.)

Second, there is another way to arrive at the credence-averaging procedure. Presumably most of the reason why we care about a committee’s credence assignments is practical rather than purely theoretical. In cases where consequentialism works, we can model this by supposing a joint committee utility assignment (which might be the sum of individual utility assignments, or might be a consensus utility assignment), and we can imagine the committee to be choosing between wagers so as to maximize the agreed-on committee utility function. It seems natural to imagine doing this as follows. The committee expectations or previsions for different wagers are obtained by summing individual expectations—with the individuals using the agreed-on committee utility function, albeit with their own individual credences to calculate the expectations. And then the committee chooses a wager that maximizes its prevision.

But now it’s easy to see that the above procedure yields exactly the same result as the committee maximizing committee utility calculated with respect to the average of the individuals’ credence assignments.

So there is a rather nice coherence between the committee credences generated by our epistemic “accuracy-first” procedure and what one gets in a pragmatic approach.

But still all this depends on the plausible, but unjustified, assumption that addition is the right way to go, whether for epistemic or pragmatic utility expectations. But given this assumption, it really does seem like the committee’s credences are reasonably taken to be the average of the members’ credences.

Tuesday, October 16, 2018

Yet another reason we need social epistemology

Consider forty rational people each individually keeping track of the ethnicities and virtue/vice of the people they interact with and hear about (admittedly, one wonders why a rational person would do that!). Even if there is no statistical connection—positive or negative—between being Polish and being morally vicious, random variation in samples means that we would expect two of the forty people to gain evidence that there is a statistically significant connection—positive or negative—between being Polish and being morally vicious at the p = 0.05 level. We would, further, intuitively expect that one in the forty would come to conclude on the basis of their individual data that there is a statistically significant negative connection between Polishness and vice and one that there is a statistically significant positive connection.

It seems to follow that for any particular ethnic or racial or other group, at the fairly standard p = 0.05 significance level, we would expect about one in forty rational people to have a rational racist-type view about any particular group’s virtue or vice (or any other qualities).

If this line of reasoning is correct, it seems that it is uncharitable to assume that a particular racist’s views are irrational. For there is a not insignificant chance that they are just one of the unlucky rational people who got spurious p = 0.05 level confirmation.

Of course, the prevalence of racism in the US appears to be far above the 1/40 number above. However, there is a multiplicity of groups one can be a racist about, and the 1/40 number is for any one particular group. With five groups, we would expect approximately 5/40 = 1/8 (more precisely, 1 − (39/40)^5) of rational people to get p = 0.05 confirmation of a racist-type hypothesis about one of the groups. That’s still presumably significantly below the actual prevalence of racism.
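
The multiplicity arithmetic, as a sketch (Python):

    # Chance that at least one of five independent group-comparisons comes out
    # spuriously "significant" in the unfavorable direction, when each has a 1/40
    # chance of doing so (half of the two-sided 5%).
    p_one_group = 1 / 40
    n_groups = 5
    print(1 - (1 - p_one_group) ** n_groups)   # about 0.119, vs. the rough 5/40 = 0.125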

But in any case this line of reasoning is not correct. For we are not individual data gatherers. We have access to other people’s data. The widespread agreement about the falsity of racist-type claims is also evidence, evidence that would not be undercut by a mere p = 0.05 level result of one’s individual study.

So, we need social epistemology to combat racism.

Tuesday, September 4, 2018

Conciliationism with and without peerhood

Conciliationists say that when you meet an epistemic peer who disagrees with you, you should alter your credence towards theirs. While there are counterexamples to conciliationism, here is a simple argument that normally something like conciliationism is correct, without the assumption of epistemic peerhood:

  1. That someone’s credence in a proposition p is significantly below 1/2 is normally evidence against p.

  2. Learning evidence against a proposition typically should lower one’s credence.

  3. So, normally, learning that someone’s credence is significantly below 1/2 should lower one’s credence.

In particular, if your credence is above 1/2, then learning that someone else’s is significantly below 1/2 should normally lower your credence. And there are no assumptions of peerhood here.

The crucial premise is (1). Here is a simple thought: Normally, people’s credences are responsive to evidence. So when their credence is low, that’s likely because they had evidence against a proposition. Now the evidence they had either is or is not evidence you also have. If you know it is not evidence you also have, then learning that they have additional evidence against the proposition should normally provide you with evidence against it, too. If it is evidence you also have, that evidence should normally make no difference. You don’t know which of these is the case, but still the overall force of evidence is against the proposition.

One might, however, have a worry. Perhaps while normally learning that someone’s credence is significantly below 1/2 should lower one’s credence, when that someone is an epistemic peer and hence shares the same evidence, it shouldn’t. But actually the argument of the preceding paragraph shows that as long as you assign a non-zero probability to the person having more evidence, their disagreement should lead you to lower your credence. So the worry only comes up when you are sure that the person is a peer. It would, I think, be counterintuitive to think you should normally conciliate but not when you are sure the other person is a peer.

And I think even in the case where you know for sure that the other person has the same evidence you should lower your credence. There are two possibilities about the other person. Either they are a good evaluator of evidence or not. If not, then their evaluation of the evidence is normally no evidence either for or against the proposition. But if they are good evaluators, then their evaluating the evidence as being against the proposition normally is evidence that the evidence is against the proposition, and hence is evidence that you evaluated badly. So unless you are sure that they are a bad evaluator of evidence, you normally should conciliate.

And if you are sure they are a bad evaluator of evidence, well then, since you’re a peer, you are a bad evaluator, too. And the epistemology of what to do when you know you’re bad at evaluating evidence is hairy.

Here's another super-quick argument: Agreement normally confirms one's beliefs; hence, normally, disagreement disconfirms them.

Why do I need the "normally" in all these claims? Well, we can imagine situations where you have evidence that if the other person disbelieves p, then p is true. Moreover, there may be cases where your credence for p is 1.

Friday, August 31, 2018

Peers and twins

I just realized something that I should have known earlier. Suppose I have a doppelganger who is just like me and goes wherever I go—by magic, he can occupy a space that I occupy—and who always sees exactly what I see and who happened always to judge and decide just as I do. What I’ve just realized is that the doppelganger is not my epistemic peer, even though he is just like me.

He is not my peer because he has evidence that I do not and I have evidence that he does not. For I know what experiences I have and he knows what experiences he has. But even though my experiences are just like his, they are not numerically the same experiences. When he sees, it is through his eyes and when I see, it is through my eyes.

Suppose that on the basis of a perception of a distant object that looked like a dog I formed a credence of 0.98 that the object is a dog, and my doppelganger did the same thing. And suppose that suddenly a telepathic opportunity opens up and we each learn about the other’s existence and credences.

Then our credences that the distant object is a dog will go up slightly, because we will each have learned that someone else’s experiences matched up with ours. Given that the other person in this case is just like me, this doesn’t give me much new information. It is very likely that someone just like me looking in the same direction would see things the same way. But it is not certain. After all, my perception could still be due to a random error in my eyes. So could my doppelganger’s be. But the fact that our perceptions match up makes the random error hypothesis implausible, and hence it raises the credence that the object really is a dog. Let’s say our credences will go up to 0.985.

Now suppose that instead this is a case of slight disagreement: His credence that there is a dog there is 0.978 and mine is 0.980, this being the first time we deviate in our whole lives. I think the closeness to me of the other’s judgment is still evidence of correctness. So I think my credence, and his as well, should still go up. Maybe not to 0.985, but maybe 0.983.

Monday, August 21, 2017

Searching for the best theory

Let’s say that I want to find the maximum value of some function over some domain.

Here’s one naive way to do it:

Algorithm 1: I pick a starting point in the domain at random, place an imaginary particle there and then gradually move the particle in the direction where the function increases, until I can’t find a way to improve the value of the function.

This naive way can easily get me stuck in a “local maximum”: a peak from which all movements go down. With a bumpy function that has many peaks, as in the example graph, most starting points will get one stuck at a local maximum.

Let’s say I have a hundred processor cores available, however. Then here’s another simple thing I could do:

Algorithm 2: I choose a hundred starting points in the domain at random, and then have each core track one particle as it tries to move towards higher values of the function, until it can move no more. Once all the particles are stuck, we survey them all and choose the one which found the highest value. This is pretty naive, too, but we have a much better chance of getting to the true maximum of the function.

But now suppose I have this optimization idea:

Algorithm 3: I follow Algorithm 2, except at each time step, I check which of the 100 particles is at the highest value point, and then move the other 99 particles to that location.

The highest value point found is intuitively the most promising place, after all. Why not concentrate one’s efforts there?

But Algorithm 3 is, of course, a bad idea. For now all 100 particles will be moving in lock-step, and will all arrive at the same point. We lose much of the independent exploration benefit of Algorithm 2. We might as well have one core.
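
Here is a toy version of the comparison (a Python sketch; the bumpy function, step size and particle count are my own illustrative choices):

    import random
    from math import sin

    def f(x):
        # a bumpy one-dimensional function with many local maxima
        return sin(5 * x) + 0.5 * sin(17 * x) - 0.1 * (x - 3) ** 2

    def climb_step(x, step=0.01):
        # move to a neighboring point if it improves f, else stay put
        return max((x, x - step, x + step), key=f)

    def run(n_particles=100, n_steps=2000, follow_leader=False):
        xs = [random.uniform(0, 6) for _ in range(n_particles)]
        for _ in range(n_steps):
            xs = [climb_step(x) for x in xs]
            if follow_leader:                       # Algorithm 3: everyone jumps
                xs = [max(xs, key=f)] * len(xs)     # to the current best point
        return max(f(x) for x in xs)

    random.seed(0)
    print("Algorithm 2:", run(follow_leader=False))
    print("Algorithm 3:", run(follow_leader=True))
    # Typically Algorithm 2 ends higher: the independent particles explore many
    # local maxima, while Algorithm 3 collapses onto a single climb.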

But now notice how often in our epistemic lives, especially philosophical ones, we seem to be living by something like Algorithm 3. We are trying to find the best theory. And in journals, conferences, blogs and conversations, we try to convince others that the theory we’re currently holding to is the best one. This is as if each core was trying to convince the 99 to explore the location that it was exploring. If the core succeeded, the effect would be like Algorithm 3 (or worse). Forcing convergence—even by intellectually honest means—seems to be harmful to the social epistemic enterprise.

Now, it is true that in Algorithm 2, there is a place for convergence: once all the cores have found their local maxima, then we have the overall answer, namely the best of these local maxima. If we all had indeed found our local maxima, i.e., if we all had fully refined our individual theories to the point that nothing nearby was better, it would make sense to have a conference and choose the best of all of the options. But in fact most of us are still pretty far from even the locally best theory, and it seems unlikely that we will achieve it in this life.

Should we then all work independently, not sharing results lest we produce premature convergence? No. For one, the task of finding the locally optimal theory is one that we probably can’t achieve alone. We are dealing with functions whose values at the search point cannot be evaluated by our own efforts, and where even exploring the local area needs the help of others. And so we need cooperation. What we need is groups exploring different regions of the space of theories. And in fact we have this: we have the Aristotelians looking for the best theory in the vicinity of Aristotle’s, we have the Humeans, etc.

Except that each group is also trying to convince the others. Is it wrong to do so?

Well, one complicating factor is that philosophy is not just an isolated intellectual pursuit. It has here-and-now consequences for how to live our lives beyond philosophy. This is most obvious in ethics (including political philosophy), epistemology and philosophy of religion. In Algorithm 3, 99 of the cores may well be exploring less promising areas of the search space, but it’s no harm to a core to be exploring such an area. But it can be a serious harm to a person to have false ethical, epistemological or religious beliefs. So even if it were better for our social intellectual pursuits that all the factions be doing their searching independently, we may well have reasons of charity to try to convince others—but primarily where this has ethical, epistemological or religious import (and often it does, even if the issue is outside of these formal areas).

Furthermore, we can benefit from criticism by people following other paradigms than ours. Such criticism may move us to switch to their paradigm. But it can benefit us even if it does not do that, by helping us find the optimal theory in our local region.

And, in any case, we philosophers are stubborn, and this stubbornness prevents convergence. This stubbornness may be individually harmful, by keeping us in less promising areas of the search space, but beneficial to the larger social epistemic practice by preventing premature convergence as in Algorithm 3.

Thus, stubbornness can be useful. But it needs to be humble. And that's really, really hard.

Thursday, February 9, 2017

Conciliationism and another toy model

Conciliationism holds that in cases of peer disagreement the two peers should move to a credence somewhere between their individual credences. In a recent post I presented a toy model of error of reasoning on which conciliationism was in general false. In this post, I will present another toy model with the same property.

Bayesian evidence is additive when instead of probability p one works with log-odds λ(p)=log(p/(1 − p)). From that point of view, it is natural to model error in the evaluation of the force of evidence as the addition of a normally-distributed term with mean zero to the log-odds.

Suppose now that Alice and Bob evaluate their first-order evidence, which they know they have in common, and come to the individual conclusions that the probability of some Q is α and β respectively. Moreover, both Alice and Bob have the above additive model of their own error-proneness in the evaluation of first-order evidence, and in fact they assign the same standard deviation σ to the normal distribution. Finally, we assume that Alice and Bob know that their errors are independent.

Alice and Bob are good Bayesians. They will next apply a discount for their errors to their first-order estimates. You might think: “No discount needed. After all, the error could just as well be negative as positive, and the positive and negative possibilities cancel out, leaving a mean error of zero.” That’s mistaken, because while the normal distribution is symmetric, what we are interested in is not the expected error in the log-odds, which is indeed zero, but the mean error in the probabilities. And once one transforms back from log-odds to probabilities, the normal distribution becomes asymmetric. A couple of weeks back, I worked out some formulas which can be numerically integrated with Derive.

First-order probability    σ       Second-order probability
0.80                       1.00    0.76
0.85                       1.00    0.81
0.90                       1.00    0.87
0.95                       1.00    0.93
0.80                       0.71    0.78
0.85                       0.71    0.83
0.90                       0.71    0.88
0.95                       0.71    0.94

So, for instance, if Alice has a first-order estimate of 0.90 and Bob has a first-order estimate of 0.95, and they both have σ = 1 in their error models, they will discount to 0.87 and 0.93.
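
Here is how I would reproduce those numbers (a Python sketch; it assumes the discount is computed as the expected value of the probability you get when a normal error with mean zero and standard deviation σ is added to the first-order log-odds):

    from math import log, exp, sqrt, pi
    from scipy.integrate import quad

    def logistic(l): return 1 / (1 + exp(-l))
    def logit(p):    return log(p / (1 - p))

    def second_order(p, sigma):
        # expected probability after adding a N(0, sigma) error to the log-odds
        density = lambda e: exp(-e**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))
        value, _ = quad(lambda e: logistic(logit(p) + e) * density(e),
                        -10 * sigma, 10 * sigma)
        return value

    for p in (0.80, 0.85, 0.90, 0.95):
        print(p, round(second_order(p, 1.0), 2), round(second_order(p, 1 / sqrt(2)), 2))
    # The printed pairs should roughly match the two columns of the table above.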

Let the discounted credences, after evaluation of the second-order evidence, be α* and β* (the value depends on σ).

Very good. Now, Alice and Bob get together and aggregate their final credences. Let’s suppose they do so completely symmetrically, having all information in common. Here’s what they will do. The correct log-odds for Q, based on the correct evaluation of the evidence, equals Alice’s pre-discount log-odds log(α/(1 − α)) plus an unknown error term with mean zero and standard deviation σ, as well as equalling Bob’s pre-discount log-odds log(β/(1 − β)) plus an unknown error term with mean zero and standard deviation σ.

Now, there is a statistical technique we learn in grade school which takes a number of measurements of an unknown quantity, with the same normally distributed error, and which returns a measurement with a smaller normally distributed error. The technique is known as the arithmetic mean. The standard deviation of the error in the resulting averaged data point is σ/√n, where n is the number of samples. So, Alice and Bob apply this technique. They back-calculate α and β from their final individual credences α* and β*, they then calculate the log-odds, average, and go back to probabilities. And then they model the fact that there is still a normally-distributed error term, albeit one with standard deviation σ/√2, so they adjust for that to get a final credence α** = β**.

So what do we get? Do we get conciliationism, so that their aggregated credence α** = β** is in between their individual credences? Sometimes, of course, we do. But not always.

Observe first what happens if α* = β*. “But then there is no disagreement and nothing to conciliate!” True, but there is still data to aggregate. If α* = β*, then the error discount will be smaller by a factor of the square root of two. In fact, the table above shows what will happen, because (not by coincidence) 0.71 is approximately the reciprocal of the square root of two. Suppose σ = 1. If α* = β* = 0.81, this came from pre-correction values α = β = 0.85. When corrected with the smaller normal error of 0.71, we now get a corrected value α** = β** = 0.83. In other words, aggregating the data from one another, Alice and Bob raise their credence in Q from 0.81 to 0.83.

But all the formulas here are quite continuous. So if α* = 0.8099 and β* = 0.8101, the aggregation will still yield a final credence of approximately 0.83 (I am not bothering with the calculation at this point). So, when conciliating 0.8099 and 0.8101, you get a final credence that is higher than either one. Conciliationism is thus false.

The intuition here is this. When the two credences are reasonably close, the amount by which averaging reduces error overcomes the downward movement in the higher credence.

Of course, there will also be cases where aggregation of data does generate something in between the two data points. I conjecture that on this toy model, as on my previous one, this will be the case whenever the two credences are on opposite sides of 1/2.

Wednesday, February 8, 2017

Peer disagreement, conciliationism and a toy model

Let us suppose that Alice and Bob are interested in the truth of some proposition Q. They both assign a prior probability of 1/2 to Q, and all the first-order evidence regarding Q is shared between them. They evaluate this first-order evidence and come up with respective posteriors α and β for Q in light of the evidence.

Further, Alice and Bob have background information about how their minds work. They each have a random chance of 1/2 of evaluating the evidence exactly correctly and a random chance of 1/2 that a random bias will result in their evaluation being completely unrelated to the evidence. In the case of that random bias, their output evaluation is random, uniformly distributed over the interval between 0 and 1. Moreover, Alice and Bob’s errors are independent of what the other person thinks. Finally, Alice and Bob’s priors as to what the correct evaluation of the evidence will show are uniformly distributed between 0 and 1.

Given that each now has this further background information about their error-proneness, Alice and Bob readjust their posteriors for Q. Alice reasons thus: the probability that my first-order evaluation of α was due to the random bias is 1/2. If I knew that the random bias happened, my credence in Q would be 1/2; if I knew that the random bias did not happen, my credence in Q would be α. Not knowing either way, my credence in Q should be:

  1. α* = (1/2)(1/2)+(1/2)α = (1/2)(1/2 + α).

Similarly, Bob reasons that his credence in Q should be:

  2. β* = (1/2)(1/2 + β).

In other words, upon evaluating the higher-order evidence, both of them shift their credences closer to 1/2, unless they were at 1/2.

Next, Alice and Bob pool their data. Here I will assume an equal weight view of how the data pooling works. There are now two possibilities.

First, suppose Alice and Bob notice that their credences in Q are the same, i.e., α* = β*. They know this happens just in case α = β by (1) and (2). Then they do a little Bayesian calculation: there is a 1/4 prior that neither was biased, in which case the equality of credences is certain; there is a 3/4 prior that at least one was biased, in which case the credences would almost certainly be unequal (the probability that they’d both get the same erroneous result is zero given the uniform distribution of errors); so, the posterior that they are both correct is 1 (or 1 minus an infinitesimal). In that case, they will adjust their credences back to α and β (which are equal). This is the case of peer agreement.

Notice that peer agreement results in an adjustment of credence away from 1/2 (i.e., α* is closer to 1/2 than α is, unless of course α = 1/2).

Second, suppose Alice and Bob notice that their credences in Q are different, i.e., α* ≠ β*. By (1) and (2), it follows that their first-order evaluations α and β were also different from one another. Now they reason as follows. Before they learned that their evaluations were different, there were four possibilities:

  • EE: Alice erred and Bob erred
  • EN: Alice erred but Bob did not err
  • NE: Alice did not err but Bob erred
  • NN: no error by either.

Each of these had equal probability 1/4. Upon learning that their evaluations were different, the last option was ruled out. Moreover, given the various uniform distribution assumptions, the exact values of the errors do not affect the probabilities of which possibility was the case. Thus, the EE, EN and NE options remain equally likely, but now have probability 1/3. If they knew they were in EE, then their credence should be 1/2—they have received no data. If they knew they were in EN, their credence should be β, since Bob’s evaluation of the evidence would be correct. If they knew they were in NE, their credence should be α, since Alice’s evaluation would be correct. But they don’t know which is the case, and the three cases are equally likely, so their new credence is:

  3. α** = β** = (1/3)(1/2 + α + β) = (1/3)(2α* + 2β* − 1/2).

(They can calculate α and β from α* and β*, respectively.)

Now here’s the first interesting thing. In this model, the “split the difference” account of peer disagreement is provably wrong. Splitting the difference between α* and β* would result in (1/2)(α* + β*). It is easy to see that the only case where (3) generates the same answer as splitting the difference is when α* + β* = 1, i.e., when the credences of Alice and Bob prior to aggregation were equidistant from 1/2, in which case (3) says that they should go to 1/2.

And here is a second interesting thing. Suppose that α* < β*. Standard conciliationist accounts of peer disagreement (of which “split the difference” is an example) say that Alice should raise her credence and Bob should lower his. Does that follow from (3)? The answer is: sometimes. Here are some cases:

  • α* = 0.40, β* = 0.55, α** = β** = 0.47
  • α* = 0.55, β* = 0.65, α** = β** = 0.63
  • α* = 0.60, β* = 0.65, α** = β** = 0.67
  • α* = 0.60, β* = 0.70, α** = β** = 0.70.

Thus just by plugging some numbers in, we can find some conciliationist cases where Alice and Bob should meet in between, but we can also find a case (0.60 and 0.70) where Bob should stand pat, and a case (0.60 and 0.65) where both should raise their credence.
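
These cases are easy to reproduce (a Python sketch of formula (3) in its starred form):

    def pooled(a_star, b_star):
        # formula (3): the aggregated credence after learning of the disagreement
        return (2 * a_star + 2 * b_star - 0.5) / 3

    for a_star, b_star in [(0.40, 0.55), (0.55, 0.65), (0.60, 0.65), (0.60, 0.70)]:
        result = pooled(a_star, b_star)
        print(a_star, b_star, round(result, 2),
              "Bob lowers" if result < b_star else "Bob stands pat or raises")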

When playing with numbers, remember that by (1) and (2), the possible range for α* and β* is between 1/4 and 3/4 (since the possible range for α and β is from 0 to 1).

What can we prove? Well, let's first consider the case where α* < 1/2 < β*. Then it's easy to check that Bob needs to lower his credence and Alice needs to raise hers. That's a conciliationist result.

But what if both credences are on the same side of 1/2? Let’s say 1/2 < α* < β*. Then it turns out that:

  1. Alice will always raise her credence

  2. Bob will lower his credence if and only if β* > 2α* − 1/2

  3. Bob will raise his credence if and only if β* < 2α* − 1/2.

In other words, Bob will lower his credence if his credence is far enough away from Alice’s. But if it’s moderately close to Alice’s, both Alice and Bob will raise their credences.

While the model I am working with is very artificial, this last result is pretty intuitive: if both of them have credences that are fairly close to each other, this supports the idea that at least one of them is right, which in turn undoes some of the effect of the α → α* and β → β* transformations in light of their data on their own unreliability.

So what do we learn about peer disagreement from this model? What we learn is that things are pretty complicated, too complicated to encompass in a simple non-mathematical formulation. Splitting the difference is definitely not the way to go in general. Neither is any conciliationism that makes the two credences move towards their mutual mean.

Of course, all this is under some implausible uniform distribution and independence assumptions, and a pretty nasty unreliability assumption that half the time we evaluate evidence biasedly. I have pretty strong intuitions that a lot of what I said depends on these assumptions. For instance, suppose that the random bias results in a uniform distribution of posterior on the interval 0 to 1, but one’s prior probability distribution for one’s evaluation of the evidence is not uniform but drops off near 0 and 1 (one doesn’t think it likely that the evidence will establish or abolish Q with certainty). Then if α (say) is close to 0 or 1, that’s evidence for bias, and a more complicated adjustment will be needed than that given by (1).

So things are even more complicated.

Tuesday, January 26, 2016

Conciliation and caution

I assign a credence 0.75 to p and I find out that you assign credence 0.72 to it, despite us both having the same evidence and epistemic prowess. According to conciliationism, I should lower my credence and you should raise yours.

Here's an interesting case. When I assigned 0.75 to p, I reasoned as follows: my evidence prima facie supported p to a high degree, say 0.90, but I know that I could have made a mistake in my evaluation of the evidence, so to be safe I lowered my credence to 0.75. You, being my peer and hence equally intellectually humble, proceeded similarly. You evaluated the evidence at 0.87 and then lowered the credence to 0.72 to be safe. Now when I learn that your credence is 0.72, I assume you were likewise being humbly cautious. So I assume you had some initial higher evaluation, but then lowered your evaluation to be on the safe side. But now that I know that both you and I evaluated the evidence significantly in favor of p, there is no justification for as much caution. As a result, I raise my credence. And maybe you proceed similarly. And if we're both advocates of the equal weight view, thinking that we should treat each other's credences on a par, we will both raise our credence to the same value, say 0.80. As a result, you revise in the direction conciliationism tells you to (but further than most conciliationists would allow) and I revise in the opposite direction to what conciliationism says.

The case appears to be a counterexample to conciliationism. Now, one might argue that I was unfair to conciliationists. It's not uncommon in the literature to define conciliationism as simply the view that both need to change credence rather than the view that they must each change in the direction of the other's credence. And in my example, both change their credence. I think this reading of conciliationism isn't fair to the motivating intuitions or the etymology. Someone who, upon finding out about a disagreement, always changes her credence in the opposite direction of the other's credence is surely far from being a conciliatory person! Be that as it may, I suspect that counterexamples like the above can be tweaked. For instance, I might reasonably reason as follows:

You assign a smaller credence than I, though it's pretty close to mine. Maybe you started with an initial estimate close to but lower than mine and then lowered it by the same amount as I did out of caution. Since your initial estimate was lower than mine, I will lower mine a little. But since it was close, I don't need to be as cautious.

It seems easy to imagine a case like this where the two effects cancel out, and I'm left with the same credence I started with. The result is a counterexample to a conciliationism that merely says I shouldn't stay pat.

Tuesday, December 29, 2015

Trusting leaders in contexts of war

Two nights ago I had a dream. I was in the military, and we were being deployed, and I suddenly got worried about something like this line of thought (I am filling in some details--it was more inchoate in the dream). I wasn't in a position to figure out on my own whether the particular actions I was going to be commanded to do are morally permissible. And these actions would include killing, and to kill permissibly one needs to be pretty confident that the killing is permissible. Moreover, only the leaders had in their possession sufficient information to make the judgment, so I would have to rely on their judgment. But I didn't actually trust the moral judgment of the leaders, particularly the president. My main reason in the dream for not trusting them was that the president is pro-choice, and someone whose moral judgment is so badly mistaken as to think that killing the unborn is permissible is not to be trusted in moral judgments relating to life and death. As a result, I refused to participate, accepting whatever penalties the military would impose. (I didn't get to find out what these were, as I woke up.)

Upon waking up and thinking this through, I wasn't so impressed by the particular reason for not trusting the leadership. A mistake about the morality of abortion may not be due to a mistake about the ethics of killing, but due to a mistake about the metaphysics of early human development, a mistake that shouldn't affect one's judgments about typical cases of wartime killing.

But the issue generalizes beyond abortion. In a pluralistic society, a random pair of people is likely to differ on many moral issues. The probability of disagreement will be lower when one of the persons is a member of a population that elected the other, but the probability of disagreement is still non-negligible. One worries that a significant percentage of soldiers have moral views that differ from those of the leadership to such a degree that if the soldiers had the same information as the leaders do, the soldiers would come to a different moral evaluation of whether the war and particular lethal acts in it are permissible. So any particular soldier who is legitimately confident of her moral views has reason to worry that she is being commanded things that are impermissible, unless she has good reason to think that her moral views align well with the leaders'. This seems to me to be a quite serious structural problem for military service in a pluralistic society, as well as a serious existential problem.

The particular problem here is not the more familiar one where the individual soldier actually evaluates the situation differently from her leaders. Rather, it arises from a particular way of solving the more familiar problem. Either the soldier has sufficient information by her lights to evaluate the situation or she does not. If she does, and she judges that the war or a lethal action is morally wrong, then of course conscience requires her to refuse, accepting any consequences for herself. Absent sufficient information, she needs to rely on her leaders. But here we have the problem above.

How to solve the problem? I don't know. One possibility is that even though there are wide disparities between moral systems, the particular judgments of these moral systems tend to agree on typical acts. Even though utilitarianism is wrong and Catholic ethics is right, the utilitarian and the Catholic moralist tend to agree about most particular cases that come up. Thus, for a typical action, a Catholic who hears the testimony of a well-informed utilitarian that an action is permissible can infer that the action is probably permissible. But war brings out differences between moral systems in a particularly vivid way. If bombing civilians in Hiroshima and Nagasaki is likely to get the emperor to surrender and save many lives, then the utilitarian is likely to say that the action is permissible while the Catholic will say it's mass murder.

It could, however, be that there are some heuristics that could be used by the soldier. If a war is against a clear aggressor, then perhaps the soldier should just trust the leadership to ensure that the other ius ad bellum conditions (besides the justness of the cause) are met. If a lethal action does not result in disproportionate civilian deaths, then there is a good chance that the judgments of various moral systems will agree.

But what about cases where the heuristics don't apply? For instance, suppose that a Christian is ordered to drop a bomb on an area that appears to be primarily civilian, and no information is given. It could be that the leaders have discovered an important military installation in the area that needs to be destroyed, and that this is intelligence that cannot be disclosed to those who will carry out the bombing. But it could also be that the leaders want to terrorize the population into surrender or engage in retribution for enemy acts aimed at civilians. Given that there is a significant probability, even if it does not exceed 1/2, that the action is a case of mass murder rather than an act of just war, is it permissible to engage in the action? I don't know.

Perhaps knowledge of prevailing military ethical and legal doctrine can help in such cases. The Christian may know, for instance, that aiming at civilians is forbidden by that doctrine. In that case, as long as she has enough reason to think that the leadership actually obeys the doctrine, she might be justified in trusting in their judgment. This is, I suppose, an argument for militaries to make clear their ethical doctrines and the integrity of their officers. For if they don't, then there may be cases where too much disobedience of orders is called for.

I also don't know what probability of permissibility is needed for someone to permissibly engage in a killing.

I don't work in military ethics. So I really know very little about the above. It's just an ethical reflection occasioned by a dream...

Monday, October 19, 2015

Being trusting

This is a followup on the preceding post.

1. Whenever the rational credence of p is 0.5 on some evidence base E, at least 50% of human agents who assign a credence to p on E will assign a credence between 0.25 and 0.75.

2. The log-odds of the credence assigned by human agents given an evidence base can be appropriately modeled by the log-odds of the rational credence on that evidence base plus a normally distributed error whose standard deviation is small enough to guarantee the truth of 1.

3. Therefore, if I have no evidence about a proposition p other than that some agent assigned credence r on her evidence base, I should assign a credence at least as far from 0.5 as F(r), where:

  • F(0.5) = 0.5
  • F(0.6) = 0.57
  • F(0.7) = 0.64
  • F(0.8) = 0.72
  • F(0.9) = 0.82
  • F(0.95) = 0.89
  • F(0.98) = 0.95
  • F(0.99) = 0.97

4. This is a pretty trusting attitude.

5. So, it is rational to be pretty trusting.

The trick behind the argument is to note that (1) and (2) guarantee that the standard deviation of the normally distributed error on the log-odds is less than 1.63, and then we just do some numerical integration (with Derive) to compute the expected value of the rational credence.
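
Here is a sketch of that computation (Python; it assumes the rational log-odds are the agent's log-odds plus a N(0, 1.63) error with a flat prior over the rational log-odds, so that F(r) is the expected rational credence given the report r):

    from math import log, exp, sqrt, pi
    from scipy.integrate import quad

    SIGMA = 1.63

    def logistic(l): return 1 / (1 + exp(-l))
    def logit(p):    return log(p / (1 - p))

    def F(r):
        # expected rational credence, given that the agent reported credence r
        density = lambda e: exp(-e**2 / (2 * SIGMA**2)) / (SIGMA * sqrt(2 * pi))
        value, _ = quad(lambda e: logistic(logit(r) + e) * density(e),
                        -10 * SIGMA, 10 * SIGMA)
        return value

    for r in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.98, 0.99):
        print(r, round(F(r), 2))
    # The output should land close to the table in step 3 (give or take a point
    # or two in the second decimal place).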

Wednesday, April 29, 2015

Convincing

Free will is incompatible with (causal) determinism, and I know it. I know it because I have sound arguments for it, with compelling premises. It is good for people to know the truth about things that matter, and this is one of them. So I should be glad, for your sake and not just out of vanity, if I convinced you by one of these compelling arguments. And I would be glad.

But perhaps I shouldn't be glad if I convinced everyone, and that's for two reasons. First, there actually being compatibilists helps keep incompatibilist investigators honest and leads to a deeper understanding of the ways in which determinism precludes free will. Second, while I know that freedom is incompatible with determinism, I might be wrong. That chance is sufficiently small that it's safe for me and you to risk the cost of being wrong. But the cost of everyone getting this wrong is more than the sum of the costs of the individuals getting it wrong. Once something becomes near universally accepted, it is much harder for humankind to retreat from it.

Thus, while I want to convince you of incompatibilism, I also want there to be dissent in the epistemic community. This is something like a tragedy of the commons in the epistemic sphere.

Fortunately, human nature is such that I run only an insignificant risk of getting everyone to agree with me when I offer an argument for incompatibilism. So I can offer the arguments safely.

I chose the example of incompatibilism carefully. I wouldn't say the same thing about things that I am much more confident of, say that there is a physical world or that 2+2=4. There the risk of being wrong is so small, and the level of unreasonableness in denying the claim is sufficiently high, that it would be good for the epistemic community to have universal agreement. On the other hand, there are philosophical doctrines which I think are likely to be true, but where I am sufficiently unsure that I would cringe if I convinced someone.

Saturday, November 16, 2013

Why faith in the testimony of others is loving: Notes towards a thoroughly ethical social epistemology

Loving someone has three aspects: the benevolent, the unitive and the appreciative. (I develop this early on in One Body.) Believing something and gaining knowledge on the testimony of another involves all three aspects of love.

Appreciation: If I believe you on testimony, then I accept you as a person who speaks honestly and reasons well. It is a way of respecting your epistemic achievement. This does not mean that a failure to accept your testimony is always unappreciative. I may appreciate you, but have good reason to think that the information you have received is less complete than mine.

Union: Humans are social animals, and our sociality is partly constituted by our joint epistemic lives. To accept your testimony is to be united with you epistemically.

Benevolence: Excelling at our common life of learning from and teaching one another is a part of our flourishing. If I gain knowledge from you, you thereby flourish as my teacher. Thus by learning from you, I benefit not only myself as learner but I benefit you by making you a successful teacher.

We learn from John Paul II's philosophical anthropology that we are essentially givers and accepters of gifts. In giving, epistemically and otherwise, we are obviously benevolent. But also, because it is human nature to be givers, in gratefully accepting a gift we benefit, unite with, and affirm the giver, thereby expressing all three aspects of love.

Thursday, October 4, 2012

A dialog on rhetoric, autonomy and original sin

L: Rhetorical persuasion does not track truth in the way that good arguments do. The best way for us to collectively come to truth is well-reasoned arguments presented in a dry and rigorous way, avoiding rhetorical flourishes. Rhetoric makes weaker arguments appear stronger than they are and a practice of giving rhetorically powerful arguments can make stronger arguments appear weaker.

R: Rhetoric appeals to emotions and emotions are truth-tracking, albeit their reliability, except in the really virtuous individual, may not be high. So I don't believe that rhetorical persuasion does not track truth. But I will grant it for our conversation, L. Still, you're forgetting something crucial. People have an irrational bias against carefully listening to arguments that question their own basic assumptions. Rhetoric and other forms of indirect argumentation sneak in under the radar of one's biases and make it possible to convince people of truths that otherwise they would be immune to.

L: Let's have the conversation about the emotions on another day. I suspect that even if emotions are truth-tracking, in practice they are sufficiently unreliable except in the very virtuous, and it is not the very virtuous that you are talking of convincing. I find your argument ethically objectionable. You are placing yourself intellectually over other people, taking them to have stupid biases, sneaking under their guard and riding roughshod over their autonomy.

R: That was rhetoric, not just argument!

L: Mea culpa. But you see the argumentative point, no?

R: I do, and I agree it is a real worry. But given that there is no other way of persuading not very rational humans, what else can we do?

L: But there are other ways of persuading them. We could use threats or brainwashing.

R: But that would be wrong!

L: This is precisely the point at issue. Threats or brainwashing would violate autonomy. You seemed to grant that rhetorical argument does so as well. So it should be wrong to convince by rhetorical argument just as much as by threats or brainwashing.

R: But it's good for someone to be persuaded of the truth when they have biases that keep them from truth.

L: I don't dispute that. But aren't you then just paternalistically saying that it's alright to violate people's autonomy for their own good?

R: I guess so. Maybe autonomy isn't an absolute value, always to be respected.

L: So what objection do you have to convincing people of the truth by threat or brainwashing?

R: Such convincing—granting for the sake of argument that it produces real belief—would violate autonomy too greatly. I am not saying that every encroachment on autonomy is justified, but only that the mild encroachment involved in couching one's good arguments in a rhetorically effective form is.

L: I could pursue the question whether you shouldn't by the same token say that for a great enough good you can encroach on autonomy greatly. But let me try a different line of thought. Wouldn't you agree that it would be an unfortunate thing to use means other than the strength of argument to convince someone of a falsehood?

R: Yes, though only because it is unfortunate to be convinced of a falsehood. In other words, it is no more unfortunate than being convinced of a falsehood by means of strong but ultimately unsound or misleading arguments.

L: I'll grant you that. But being convinced by means of argument tracks truth, though imperfectly. Being convinced rhetorically does not.

R: It does when I am convincing someone of a truth!

L: Do you always try to convince people of truths?

R: I see what you mean. I do always try to convince people of what I at the time take to be the truth—except in cases where I am straightforwardly and perhaps wrongfully deceitful, sinner that I am—but I have in the past been wrong, and there have been some times when what I tried to convince others of has been false.

L: Don't you think that some of the things you are now trying to convince others of will fall in the same boat, though of course you can't point out which they are, on pain of self-contradiction?

R: Yes. So?

L: Well, then, when you strive to convince someone by rhetorical means of a falsehood, you are more of a spreader of error than when you try to do so by means of dry arguments.

R: Because dry arguments are less effective?

L: No, because reasoning with dry arguments is more truth conducive. Thus, when you try to convince someone of a falsehood by means of a dry argument, it is more likely that you will fail for truth-related reasons—that they will see the falsehood of one of your premises or the invalidity of one of your inferences. Thus, unsound arguments will be more likely to fail to convince than sound arguments will be. But rhetoric can as easily convince of falsehood as of truth.

R: I know many people who will dispute the truth conduciveness of dry argument, but I am not one of them—I think our practices cannot be explained except by thinking there is such conduciveness there. But I could also say that rhetorical argument is truth conducive in a similar way. The truth when attractively shown forth is more appealing than a rhetorically dressed up falsehood.

L: Maybe. But we had agreed to take for granted in our discussion that rhetorical persuasion is not truth tracking.

R: Sorry. It's easy to forget yourself when you've granted a falsehood for the sake of discussion. Where were we?

L: I said that reasoning with dry arguments is more truth conducive, and hence runs less of a risk of persuading people of error.

R: Is it always wrong to take risks?

L: No. But the social practice of rhetorical presentation of arguments—or, worse, of rhetorical non-argumentative persuasion—is less likely to lead to society figuring out the truth on controversial questions.

R: Are you saying that we should engage in those intellectual practices which, when practiced by all, are more likely to lead to truth?

L: I am not sure I want to commit myself to this in all cases, but in this one, yes.

R: I actually think one can question your claim about social doxastic utility. Rhetorical persuasion leads to a greater number of changes of mind. A society that engages in practices of rhetorical persuasion is likely to have more in the way of individual belief change, as dry arguments do not in fact convince. But a society with more individual belief change might actually be more effective at coming to the truth, since embodying different points of view in the same person at different times can lead to a better understanding of the positions and ultimately a better rational decision between them. We could probably come up with some interesting computational social epistemology models here.

L: You really think this?

R: No. But it seems no less likely to be correct than your claim that dry argument is a better social practice truth-wise.

L: Still, maybe there is a wager to be run here. Should you engage in persuasive practices here that (a) by your own admission negatively impact the autonomy of your interlocutors and (b) are no more likely than not to lead to a better social epistemic state?

R: So we're back to autonomy?

L: Yes.

R: But as I said I see autonomy not as an absolute value. If I see that a person is seriously harming herself through her false beliefs, do I not have a responsibility to help her out—the Golden Rule and all that!—even if I need to get around her irrational defenses by rhetorical means?

L: But how do you know that you're not the irrational one, about to infect an unwary interlocutor?

R: Are you afraid of being infected by me?

L: I am not unwary. Seriously, aren't you taking a big risk in using rhetorical means of persuasion, in that such means make you potentially responsible for convincing someone of a falsehood in a way that side-steps some of her autonomy? If you persuade someone by argument, then she at least bears more of the responsibility herself. But if you change someone's mind by rhetoric—much as (though to a smaller degree) when you do so by threat or brainwashing—the responsibility for the error rests on you.

R: That is a scary prospect.

L: Indeed.

R: But sometimes one must do what is scary. Sometimes love of neighbor requires one to take on responsibilities, to take risks, to help one's neighbor out of an intellectual pit. Taking the risks can be rational and praiseworthy. And sometimes one can be rationally certain, too.

L: I am not sure about the certainty thing. But it seems that your position is now limited: it is permissible to use rhetorical persuasion only when goods of one's neighbor are at stake which are sufficiently important that the risk of error is small relative to them.

R: That may be right. Thus, it may be right to teach virtue or the Gospel by means that include rhetorical aspects, but it might be problematic to rhetorically propagate those aspects of science or philosophy that are not appropriately connected to virtue or the Gospel. Though even there I am not sure. For those things that aren't connected to virtue or the Gospel don't matter much, and error about them is not a great harm, so the risks may still be doable. But you have inclined me to think that one may need a special reason to engage in rhetoric.

L: Conditionally, of course, on our assumption that rhetoric is not truth-conducive in itself.

R: Ah, yes, I almost forgot that.

Saturday, April 7, 2012

The improbable and the impossible

This discussion from Douglas Adams' The Long Dark Tea-Time of the Soul (pp. 165-166) struck me as quite interesting:

[Kate:] "What was the Sherlock Holmes principle? 'Once you have discounted the impossible, then whatever remains, however improbable, must be the truth.'"
"I reject that entirely," said Dirk sharply. "The impossible often has a kind of integrity to it which the merely impossible lacks. How often have you been presented with an apparently rational explanation of something that works in all respects other than one, which is just that it is hopelessly improbable? Your instinct is to say, 'Yes, but he or she simply wouldn't do that.'"
"Well, it happened to me today, in fact," replies Kate.
"Ah, yes," said Dirk, slapping the table and making the glasses jump, "your girl in the wheelchair [the girl was constantly mumbling exact stock prices, with a 24-hour delay]--a perfect example. The idea that she is somehow receiving yesterday's stock market prices out of thin air is merely impossible, and therefore must be the case, because the idea that she is maintaining an immensely complex and laborious hoax of no benefit to herself is hopelessly improbable. The first idea merely supposes that there is something we don't know about, and God knows there are enough of those. The second, however, runs contrary to something fundamental and human which we do know about. ..."

This reminds me very much of the Professor's speech in The Lion, the Witch and the Wardrobe:

Either your sister is telling lies, or she is mad, or she is telling the truth. You know she doesn't tell lies and it is obvious that she is not mad. For the moment and unless any further evidence turns up, we must assume that she is telling the truth.

Both Dirk Gently and the Professor think that we need to have significantly greater confidence in what we know about other people's character than in our scientific knowledge of how the non-human world works. This seems to me to be just right. Our scientific knowledge of the world almost entirely depends on trusting others.

So, both C. S. Lewis and Douglas Adams are defending faith in Christ, though of course Adams presumably unintentionally. :-)

Thursday, March 22, 2012

Aggregating data from agents with the same evidence

Consider a case where we have two or more rational agents who have in some sense the same evidence, but who evaluate the force of the evidence differently and who have different priors, and who assign different credences to p. Suppose for simplicity that you are a completely undecided agent, with no evidence of your own, rather than one of the people with the evidence (this brackets one of the questions that the disagreement literature is concerned with—whether if you are one of these agents, you should stand pat or not). What credence should we assign after aggregating the agents' different credences?

An obvious suggestion is that we average the credences. That suggestion is incorrect, I believe.

The intuition I have is that averaging is the right move to make when aggregating estimates that are likely to suffer from normally distributed errors. But credences do not suffer from normally distributed errors. Suppose the correct credence, given the evidence, is 0.9. The rational agent's credence is not normally distributed around 0.9, since it cannot exceed 1 or fall below 0.

However, once we replace the credences with logarithms of odds, as we have learned to do from Turing, where the log-odds corresponding to a credence p is log (p/(1−p)), then we are dealing with the sorts of additive quantities where we can expect normally distributed error. When we are dealing with log-odds, Bayes' theorem becomes additive:

  • posterior-log-odds = prior-log-odds + log-likelihood-ratio.
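Spelled out (writing H for the hypothesis and E for the evidence; the notation is mine), this is just the odds form of Bayes' theorem with logarithms taken on both sides:

\[
\frac{P(H\mid E)}{P(\neg H\mid E)} = \frac{P(H)}{P(\neg H)}\cdot\frac{P(E\mid H)}{P(E\mid \neg H)}
\quad\Longrightarrow\quad
\log\frac{P(H\mid E)}{P(\neg H\mid E)} = \log\frac{P(H)}{P(\neg H)} + \log\frac{P(E\mid H)}{P(E\mid \neg H)}.
\]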
We can think of the rational agents as having normally distributed errors for their prior log-odds and for their estimate of the evidence's log-likelihood-ratio. (Maybe more can be said in defense of those assumptions.) We idealize, then, by supposing errors to be independent. And in cases where we are dealing with independent normally distributed errors, the best aggregation of the estimates is arithmetic averaging (cf. this post on voting).

If this line of thought works, what we should do is calculate the log-odds corresponding to the agents' credences, average these (somehow weighting by competence, I suppose, if there is competence data), and then calculate the credence corresponding to that average.

This method handles symmetry cases just as ordinary averaging does. If one agent says 0.9 and another says 0.1, then we get 0.5, as we should.

But this method of aggregation yields significantly different results when some of the credences are close to 0 or 1. Suppose we have two agents with credences 0.1 and 0.99. The arithmetic average would be 0.55. But this method recommends 0.77. Suppose we have three agents with credences 0.1, 0.1 and 0.99. The arithmetic average would be 0.40. But our aggregation method yields 0.52. On the other hand, if we have credences 0.02 and 0.8, we get 0.22 (versus an arithmetic average of 0.41). All this is correct under the assumption of normally distributed errors in log-odds.

If you want to play with this, I made a simple credence aggregation calculator.
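In case the calculator is not to hand, here is a minimal sketch in Python of the recipe just described. The equal default weights and the use of natural logarithms are my own choices (the base of the logarithm cancels out in any case), and this is an illustration rather than the calculator itself:

```python
import math

def log_odds(p):
    """Turing-style log-odds of a credence p, i.e. log(p/(1-p))."""
    return math.log(p / (1 - p))

def from_log_odds(x):
    """Credence corresponding to a log-odds value x (the inverse transform)."""
    return 1 / (1 + math.exp(-x))

def aggregate(credences, weights=None):
    """Aggregate credences by (optionally weighted) arithmetic averaging of log-odds."""
    if weights is None:
        weights = [1.0] * len(credences)
    avg = sum(w * log_odds(p) for w, p in zip(weights, credences)) / sum(weights)
    return from_log_odds(avg)

# The examples from this post:
print(round(aggregate([0.1, 0.99]), 2))             # 0.77
print(round(aggregate([0.1, 0.1, 0.99]), 2))        # 0.52
print(round(aggregate([0.02, 0.8]), 2))             # 0.22
print(round(aggregate([0.99999999, 0.1, 0.1]), 2))  # 0.99 (the example at the end of the post)
```

The optional weights are where a competence weighting of the sort mentioned above would go.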

This method, thus, accords greater weight to those who are more certain, in either direction. Therefore, the method suffers from the same manipulation problem that the corresponding voting method does. The method will produce terrible results when applied to agents who significantly overestimate probabilities close to 1 or underestimate probabilities close to 0—or when they lie about their credences. That's why I am only advertising this method in the case of rational agents. How useful this is in real life is hard to say. It could be that one just needs to adapt the method by throwing out fairly extreme credences, just as one throws out outliers in science, by taking them to be evidence of credences not formed on the basis of evidence (this need not be pejorative—I am not an evidentialist).

There is, I think, an interesting lesson here that parallels a lesson I drew out in the voting case. In aggregating credences, just as in aggregating votes, we have two desiderata: (1) extract as much useful information as we can from the individual agent data, and (2) not allow individual non-rational or non-team-player agents to manipulate the outcome unduly. These two desiderata are at odds with each other. How far we can trust other agents not to be manipulative affects social epistemology just as it does voting.

But here is a happy thought for those of us who (like me) have high credences in various propositions that are dear to us and where those credences are, we think, evidence-based. For then we get to outvote, in the court of our own minds (for our friends may dismiss us as outliers), more sceptically oriented friends. Let's say my credence that it's objectively wrong to torture those known to be innocent is 0.99999999, but I have two colleagues who incline to irrealism, and hence assign 0.1 to this claim. Even if I accord no greater weight to my own opinions, I still end up with an aggregate credence of 0.99.

Wednesday, March 21, 2012

More on interpersonal data consolidation

This expands on the discussion here.

You ask three distinguished astrobiologists, each of whom is currently doing Mars research, how likely they think it is that there was once life on Mars. They give you probabilities of 0.85, 0.90 and 0.95, respectively, for there once having been life on Mars, and explain the data on which they base their judgments. You find that each of them assigns the correct probability given their data set and the same reasonable prior (hence they are each working with at least somewhat different data). Moreover, you have no other data about whether there was life on Mars.

What probability should you assign to L, the hypothesis that there was once life on Mars? The intuitive answer is: 0.90. But it turns out that the setup underdetermines the answer: what I said in the preceding paragraph is compatible with any probability other than zero and one, depending on what sort of dependencies there are between the data sets on the basis of which the scientists have formed their respective judgments.

For instance, it could be that they have in common a single very strong piece of evidence E0 against L, but that each of them also has an even stronger piece of evidence in favor of L. Moreover, their respective pieces of evidence E+1, E+2 and E+3 in favor of L are conditionally independent of each other and of E0 (on L and on not-L). In such a case, when you consolidate their data, you get E0, E+1, E+2 and E+3. Since each of the E+i is by itself strong enough to more than undo the anti-L effect of E0, and since E0 counts only once in the consolidation while all three of the E+i count, consolidating all four pieces of data (starting with the same prior) yields a very high probability of L, indeed much higher than 0.90. In a case where the evidence against L is shared but the evidence for it is not, the consolidated probability is much higher than the individual probabilities.

For another case, it could be that each expert started off with a credence of 1/2 in L, and then each expert had a completely different data set that moved them to their respective probabilities. In this case, when you consolidate, you will once again get a probability significantly higher than any of their individual probabilities, since their data will add up.

On the other hand, if they each have in common a single extremely strong piece of evidence in favor of L but also each have a different strong piece of evidence against L, and we've got the right independence hypotheses, then the result of consolidating their data will be a small probability.

The scenarios I've described are all compatible with the setup. And by assigning numbers appropriately in the first and third scenarios, one can generate any consolidated probability strictly between 0 and 1.
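To put some (entirely made-up) numbers on this, suppose for simplicity that each expert, and the consolidator, starts from a prior of 1/2; the likelihood ratios of 1/20 and 200 for the shared pieces of evidence below are stipulated purely for illustration. Then the three scenarios, fed the very same individual credences of 0.85, 0.90 and 0.95, yield wildly different consolidated probabilities:

```python
import math

def odds(p):        # odds corresponding to a probability
    return p / (1 - p)

def prob(o):        # probability corresponding to odds
    return o / (1 + o)

posteriors = [0.85, 0.90, 0.95]          # each reached from a prior of 1/2
post_odds = [odds(p) for p in posteriors]

# Scenario 1: shared evidence E0 against L (stipulated likelihood ratio 1/20),
# plus one independent piece in favor of L per expert. Each private likelihood
# ratio is backed out from the expert's posterior odds; E0 counts only once.
lr_e0 = 1 / 20
consolidated_1 = lr_e0 * math.prod(o / lr_e0 for o in post_odds)
print(round(prob(consolidated_1), 6))    # ~0.999997 -- far above 0.90

# Scenario 2: completely independent evidence, priors of 1/2: the consolidated
# odds are just the product of the individual odds.
consolidated_2 = math.prod(post_odds)
print(round(prob(consolidated_2), 3))    # ~0.999

# Scenario 3: shared evidence strongly in favor of L (stipulated likelihood
# ratio 200), plus one independent piece against L per expert.
lr_shared = 200
consolidated_3 = lr_shared * math.prod(o / lr_shared for o in post_odds)
print(round(prob(consolidated_3), 3))    # ~0.024 -- a small probability
```

Same individual credences; three very different consolidated verdicts.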

The lesson here is that the result of consolidating expert opinions is not just a function of the experts' credences, even if these credences are all exactly right given the evidence the experts have. Consolidation needs to get under the hood of the experts' credences, to see just how much overlap and dependence there is in the evidence on which the experts are basing their views.

We can, however, give one general rule. If the experts are basing their views on entirely independent (given the hypothesis and given its negation) pieces of evidence, and are starting with a prior credence of 1/2, then the consolidated odds are equal to the product of their individual odds, where the odds corresponding to a probability p are p/(1−p). (It's a lot easier to do Bayesian stuff in terms of odds or their logarithms.)
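The rule is a one-liner. Writing O(L | E) for the odds of L given E (my notation), assuming the E_i are conditionally independent given L and given not-L, and taking the prior odds O(L) to be 1 (i.e. a prior credence of 1/2):

\[
O(L \mid E_1,\dots,E_n)
= O(L)\prod_{i=1}^{n}\frac{P(E_i\mid L)}{P(E_i\mid \neg L)}
= \prod_{i=1}^{n}\Big(O(L)\,\frac{P(E_i\mid L)}{P(E_i\mid \neg L)}\Big)
= \prod_{i=1}^{n} O(L\mid E_i).
\]

Applied to the astrobiologists, the odds 17/3, 9 and 19 multiply to 969, i.e. a consolidated probability of about 0.999, matching the second scenario above.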