Showing posts with label evidence. Show all posts

Friday, January 17, 2025

Knowledge and anti-knowledge

Suppose knowledge has a non-infinitesimal value. Now imagine that you continuously gain evidence for some true proposition p, until your evidence is sufficient for knowledge. If you’re rational, your credence will rise continuously with the evidence. But if knowledge has a non-infinitesimal value, your epistemic utility with respect to p will have a discontinuous jump precisely when you attain knowledge. Further, I will assume that the transition to knowledge happens at a credence strictly bigger than 1/2 (that’s obvious) and strictly less than 1 (Descartes will dispute this).

But this leads to an interesting and slightly implausible consequence. Let T(r) be the epistemic utility of assigning evidence-based credence r to p when p is true, and let F(r) be the epistemic utility of assigning evidence-based credence r to p when p is false. Plausibly, T is a strictly increasing function (being more confident in a truth is good) and F is a strictly decreasing function (being more confident in a falsehood is bad). Furthermore, the pair T and F plausibly yields a proper scoring rule: whatever one’s credence, one doesn’t have an expectation that some other credence would be epistemically better.

It is not difficult to see that these constraints imply that if T has a discontinuity at some point 1/2 < rK < 1, so does F. The discontinuity in F implies that as we become more and more confident in the falsehood p, suddenly we have a discontinuous downward jump in utility. That jump occurs precisely at rK, namely when we gain what we might call “anti-knowledge”: when one’s evidence for a falsehood becomes so strong that it would constitute knowledge if the proposition were true.
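To see the constraint at work, here is a sketch (my own illustration, not from the post) using the Brier score, with an arbitrary jump size and threshold of my choosing: if T gets an upward jump at rK while F stays continuous, propriety fails for credences just below rK.

```python
# Sketch with the Brier score: T(s) = -(1-s)^2, F(s) = -s^2.
# I add a "knowledge bonus" jump to T alone at rK = 0.9 (both numbers
# are my stipulations) and check propriety numerically.
rK, jump = 0.9, 0.05

def T(s):
    """Epistemic utility of credence s in a truth, with a jump at rK."""
    return -(1 - s) ** 2 + (jump if s >= rK else 0.0)

def F(s):
    """Epistemic utility of credence s in a falsehood (no jump)."""
    return -s ** 2

def expected(r, s):
    """Expected epistemic utility of credence s, by the lights of credence r."""
    return r * T(s) + (1 - r) * F(s)

# Propriety would require s = r to maximize expected(r, s). But for a
# credence just below the jump point, deviating up to rK looks better:
r = 0.88
assert expected(r, rK) > expected(r, r)  # propriety fails without a matching jump in F
```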

Now, there are some points where we might plausibly think that the epistemic utility of having a credence in a falsehood takes a discontinuous downward jump. These points are:

  • 1, where we become certain of the falsehood

  • rB, the threshold of belief, where the credence becomes so high that we count as believing the falsehood

  • 1/2, where we start to become more confident in the falsehood p than the truth not-p

  • 1 − rB, where we stop believing not-p, and

  • 0, where the falsehood p becomes an epistemic possibility.

But presumably rK is strictly between rB and 1, and hence rK is none of these points. Is it plausible to think that there is a discontinuous downward jump in epistemic utility when we achieve anti-knowledge by crossing the threshold rK in a falsehood?

I am inclined to say not. But that forces me to say that there is no discontinuous upward jump in epistemic utility once we gain knowledge.

On the other hand, one might think that the worst kind of ignorance is when you’re wrong but you think you have knowledge, and that’s kind of like the anti-knowledge point.

Thursday, July 11, 2024

The dependence of evidence on prior confidence

Whether p is evidence for q will often depend on one’s background beliefs. This is a well-known phenomenon.

But here’s an interesting fact that I hadn’t noticed before: sometimes whether p is evidence for q depends on how confident one is in q.

The example is simple: let p be the proposition that all other reasonable people have confidence level around r in q. If r is significantly bigger than one’s current confidence level, then p tends to be evidence for q. If r is significantly smaller than one’s current confidence level, then p tends to be evidence against q.

Tuesday, May 7, 2024

Mushrooms

Some people have the intuition that there is something fishy about doing standard Bayesian update on evidence E when one couldn’t have observed the absence of E. A standard case here is where the evidence E is being alive, as in firing squad or fine-tuning cases. In such cases, the intuition goes, you should just ignore the evidence.

I had a great conversation with a student who found this line of thought compelling, and came up with this pretty convincing (and probably fairly standard) case that you shouldn’t ignore evidence E like that. You’re stranded on a desert island, and the only food is mushrooms. They come in a variety of easily distinguishable species. You know that half of the species have a 99% chance of instantly killing you, and otherwise having no effect on you other than nourishment, and the other half have a 1% chance of instantly killing you, again otherwise having no effect on you other than nourishment. You don’t know which are which.

To survive until rescue, you need to eat one mushroom a day. Consider two strategies:

  1. Eat a mushroom from a random species the first day. If you survive, conclude that this species is likely good, and keep on eating mushrooms of the same species.

  2. Eat a mushroom from a random species every day.

The second strategy makes just as much sense as the first if your survival does not count as evidence. But we all know what will happen if you follow the second strategy: you’ll very likely be dead after a few days, as your chance of surviving n mushrooms is (1/2)^n. On the other hand, if you follow the first strategy, your chance of surviving n mushrooms is slightly bigger than (1/2)(0.99)^n. And the first strategy is precisely what is favored by updating on your survival: you take your survival to be evidence that the mushroom you ate was one of the safer ones, so you keep on eating mushrooms from the same species. If you want to live until rescue, the first strategy is your best bet.
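For concreteness, the arithmetic behind the two strategies can be sketched in Python (the seven-day horizon is my choice; the kill chances are from the story):

```python
# Survival chances over n days (n = 7 is my choice; the 99%/1% kill
# chances and the 50/50 split of species are from the story).
n = 7

# Strategy 2: a fresh random species each day. Each day you survive
# with chance 0.5*0.01 + 0.5*0.99 = 0.5, so n days has chance (1/2)^n.
p_strategy2 = 0.5 ** n

# Strategy 1: keep eating the first species if you survive it.
# P(survive n) = P(safe species)*0.99^n + P(deadly species)*0.01^n.
p_strategy1 = 0.5 * 0.99 ** n + 0.5 * 0.01 ** n

assert p_strategy1 > 0.5 * 0.99 ** n  # slightly bigger than (1/2)(0.99)^n
assert p_strategy1 > p_strategy2      # updating on survival wins decisively
```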

Suppose you’re not yet convinced. Here’s a variant. You have a phone. You call your mom on the first day, and describe your predicament. She comforts you and tells you that rescue will come in a week. And then she tells you that she was once stuck for a week on this very island, and ate the pink lacy mushrooms. Then your battery dies. You rejoice: you will eat the pink lacy mushrooms and thus survive! But then suddenly you get worried. You don’t know when your mom was stuck on the island. If she was stuck on the island before you were conceived, then had she not survived the mushrooms, you wouldn’t have been around to hear it. And in that case, you think her evidence is worthless, because you wouldn’t have any evidence had she not survived. So now it becomes oddly epistemically relevant to you whether your mom was on the island before or after you were conceived. But it seems largely epistemically irrelevant when your mom’s visit to the island was.

Saturday, January 21, 2023

Knowing you will soon have enough evidence to know

Suppose I am just the slightest bit short of the evidence needed for belief that I have some condition C. I consider taking a test for C that has a zero false negative rate and a middling false positive rate—neither close to zero nor close to one. On reasonable numerical interpretations of the previous two sentences:

  1. I have enough evidence to believe that the test would come out positive.

  2. If the test comes out positive, it will be another piece of evidence for the hypothesis that I have C, and it will push me over the edge to belief that I have C.

To see that (1) is true, note that the test is certain to come out positive if I have C and has a significant probability of coming out positive even if I don’t have C. Hence, the probability of a positive test result will be significantly higher than the probability that I have C. But I am just the slightest bit short of the evidence needed for belief that I have C, so the evidence that the test would be positive (let’s suppose a deterministic setting, so we have no worries about the sense of the subjunctive conditional here) is sufficient for belief.

To see that (2) is true, note that given that the false negative rate is zero, and the false positive rate is not close to one, I will indeed have non-negligible evidence for C if the test is positive.

If I am rational, my beliefs will follow the evidence. So if I am rational, in a situation like the above, I will take myself to have a way of bringing it about that I believe, and do so rationally, that I have C. Moreover, this way of bringing it about that I believe that I have C will itself be perfectly rational if the test is free. For of course it’s rational to accept free information. So I will be in a position where I am rationally able to bring it about that I rationally believe C, while not yet believing it.

In fact, the same thing can be said about knowledge, assuming there is knowledge in lottery situations. For suppose that I am just the slightest bit short of the evidence needed for knowledge that I have C. Then I can set up the story such that:

  3. I have enough evidence to know that the test would come out positive,

and:

  4. If the test comes out positive, I will have enough evidence to know that I have C.

In other words, oddly enough, just prior to getting the test results I can reasonably say:

  5. I don’t yet have enough evidence to know that I have C, but I know that in a moment I will.

This sounds like:

  6. I don’t know that I have C but I know that I will know.

But (6) is absurd: if I know that I will know something, then I am in a position to know that the matter is so, since that I will know p entails that p is true (assuming that p doesn’t concern an open future). However, there is no similar absurdity in (5). I may know that I will have enough evidence to know C, but that’s not the same as knowing that I will know C or even be in a position to know C. For it is possible to have enough evidence to know something without being in a position to know it (namely, when the thing isn’t true or when one is Gettiered).

Still, there is something odd about (5). It’s a bit like the line:

  7. After we have impartially reviewed the evidence, we will execute him.

Appendix: Suppose the threshold for belief or knowledge is r, where r < 1. Suppose that the false-positive rate for the test is 1/2 and the false-negative rate is zero. If E is a positive test result, then P(C|E) = P(C)P(E|C)/P(E) = P(C)/P(E) = 2P(C)/(1+P(C)). It follows by a bit of algebra that if my prior P(C) is more than r/(2−r), then P(C|E) is above the threshold r. Since r < 1, we have r/(2−r) < r, and so the story (either in the belief or knowledge form) works for the non-empty range of priors strictly between r/(2−r) and r.
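The appendix’s algebra is easy to check numerically; a small sketch (the particular threshold and the grid of priors are my picks):

```python
# Check the appendix: with P(E|C) = 1 and P(E|~C) = 1/2,
# P(C|E) = 2*P(C)/(1 + P(C)), which exceeds r exactly when P(C) > r/(2-r).
r = 0.95  # my pick for the belief/knowledge threshold

def posterior(prior):
    # P(E) = prior*1 + (1 - prior)*0.5 = (1 + prior)/2
    return 2 * prior / (1 + prior)

lower = r / (2 - r)  # about 0.905: the bottom of the working range of priors
assert lower < r     # so the range (r/(2-r), r) is non-empty

for prior in [lower + 0.001, (lower + r) / 2, r - 0.001]:
    # priors in this range fall below the threshold, but the posterior is above it
    assert prior < r < posterior(prior)
```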

Thursday, February 10, 2022

It can be rational to act as if one's beliefs were more likely true than the evidence makes them out to be

Consider this toy story about belief. It’s inconvenient to store probabilities in our minds. So instead of storing the probability of a proposition p, once we have evaluated the evidence to come up with a probability r for p, we store that we believe p if r ≥ 0.95, that we disbelieve p if r ≤ 0.05, and otherwise that we are undecided. (Of course, the “0.95” is only for the sake of an example.)

Now, here is a curious thing. Suppose I come across a belief p in my mind, having long forgotten the probability it came with, and I need to make some decision to which p is relevant. What probability should I treat p as having in my decision? A natural first guess is 0.95, which is my probabilistic threshold for belief. But that is a mistake. For the average probability of my beliefs, if I follow the above practice perfectly, is bigger than 0.95. For I don’t just believe things that have probability 0.95. I also believe things that have probability 0.96, 0.97 and even 0.999999. Intuitively, however, I would expect that there are fewer and fewer propositions with higher and higher probability. So, intuitively, I would expect the average probability of a believed proposition to be somewhat above 0.95. How far above, I don’t know. And the average probability of a believed proposition is the probability that if I pick a believed proposition out of my mental hat, it will be true.

So even though my threshold for belief is 0.95 in this toy model, I should treat my beliefs as if they had a slightly higher probability than that.
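A toy simulation of the model (the particular distribution of evidential probabilities is my stipulation; the post only assumes that higher probabilities are rarer):

```python
import random

random.seed(1)

# Evidential probabilities on [0.5, 1) whose density decreases toward 1:
# the minimum of two uniforms has a decreasing density, rescaled below.
probs = [0.5 + 0.5 * min(random.random(), random.random())
         for _ in range(100_000)]

# Store only "believe" for probabilities at or above the 0.95 threshold...
beliefs = [p for p in probs if p >= 0.95]

# ...then the average probability of a believed proposition is above 0.95.
avg = sum(beliefs) / len(beliefs)
assert avg > 0.95
```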

This could provide an explanation for why people can sometimes treat their beliefs as having more evidence than they do, without positing any irrationality on their part (assuming that the process of not storing probabilities but only storing disbelieve/suspend/believe is not irrational).

Objection 1: I make mistakes. So I should take into account the fact that sometimes I evaluated the evidence wrong and believed things whose actual evidential probability was less than 0.95.

Response: We can both overestimate and underestimate probabilities. Without evidence that one kind of error is more common than the other, we can just ignore this.

Objection 2: We have more fine-grained data storage than disbelieve/suspend/believe. We confidently disbelieve some things, confidently believe others, are inclined or disinclined to believe some, etc.

Response: Sure. But the point remains. Let’s say that we add “confidently disbelieve” and “confidently believe”. It’ll still be true that we should treat the things in the “believe but not confidently” bin as having slightly higher probability than the threshold for “believe”, and the things in the “confidently believe” bin as having slightly higher probability than the threshold for “confidently believe”.

Monday, August 30, 2021

Absence of evidence

It seems that the aphorism “Absence of evidence is not evidence of absence” is typically false.

For if H is a hypothesis and E is the claim that there is evidence for H, then E raises the probability of H: P(H|E)>P(H). But then (as long as P(E)>0, as Bayesian regularity will insist), it mathematically follows that P(∼H|∼E)>P(∼H). Thus the absence of evidence is evidence for the falsity (“absence”) of the hypothesis.
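The mathematical step is easy to verify with a concrete joint distribution (the numbers are mine):

```python
# One concrete joint distribution with P(H|E) > P(H) (my numbers):
pH, pE, pHE = 0.3, 0.4, 0.2   # P(H), P(E), P(H & E)

assert pHE / pE > pH          # E raises the probability of H...

# P(~H & ~E) = 1 - P(H) - P(~H & E) = 1 - pH - (pE - pHE)
p_notH_given_notE = (1 - pH - (pE - pHE)) / (1 - pE)
assert p_notH_given_notE > 1 - pH   # ...so ~E raises the probability of ~H
```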

I think there is only one place where one can challenge this argument, namely the claim:

  1. If there is evidence for H, then the fact that there is evidence for H is itself evidence for H.

First, let’s figure out what (1) is saying. I think the best reading is that it presupposes some kind of notion of a body of first-order evidence—maybe all the stuff that human beings have ever observed—and says that if the actual contents of that body of first-order evidence supports H, then the fact that that body supports H itself supports H.

Here is a way to make this precise. We suppose there is some random variable O whose value (not real valued, of course) is all first-order observations humans ever made. Let W be the set of all possible values that O could take on. For simplicity, we can take W to be finite: there is a maximum number of observations a human can make in a lifespan, a finite resolution to each observation, and a maximum number of human beings who could have lived on earth. Let o0 be the actual value that O has. Let WH = {o ∈ W : P(H|O = o)>P(H)}.

Assuming we have Bayesian regularity, we can suppose O = o has non-zero probability for each o ∈ WH. Then the claim that there is evidence for H is itself evidence for H comes to this:

  2. P(H|O ∈ WH)>P(H).

And it is easy to check that this follows by finite conglomerability from the fact that P(H|O = o)>P(H) for each o ∈ WH.

There might be cases where we expect infinite conglomerability to be lacking. In those cases (1) would be dubious. Here is one such case. Suppose Alice and Bob each get a ticket from a fair infinite jar with tickets numbered 1,2,3,…. Alice looks at her ticket. Bob doesn’t look at his yet, but knows that Alice has looked at hers. Bob notes that whatever number Alice has seen, it is nearly certain that his number is bigger (there are infinitely many numbers bigger than Alice’s number and only finitely many smaller ones). Thus, Bob knows that the evidence available to humans supports the thesis that his number is bigger than Alice’s. But Bob’s knowing this is not actually evidence that his number is bigger than Alice’s, for until Bob actually observes one or the other number, he is in the same evidential position as before Alice looked at her ticket—and at that point, it is obvious that it’s not more likely that Bob’s ticket has a bigger number than Alice’s.

But apart from weird cases where conglomerability fails, (1) is true, and so absence of evidence is evidence of absence, assuming we have enough Bayesian regularity.

Perhaps a charitable reading of the aphorism that absence of evidence isn’t evidence of absence is just that absence of evidence isn’t always significant evidence of absence. That seems generally correct.

Thursday, August 29, 2019

The unavoidability of misleading evidence

Three definitional assumptions:

  1. E is only evidence if there is some hypothesis H to which E makes an evidential difference, i.e., P(H|E)≠P(H).

  2. E is incomplete if and only if it is evidence such that there is a hypothesis H such that 0 < P(H|E)<1, i.e., E doesn’t make everything certain.

  3. E is misleading with respect to a hypothesis H if and only if either H is true and E is evidence against H (i.e., P(H|E)<P(H)) or H is false and E is evidence for H (i.e., P(H|E)>P(H)).

Then:

  4. Every piece of incomplete evidence is misleading (with respect to some hypothesis).

[Proof: Suppose E is incomplete evidence. Either E is or is not true. If it is not true, it is misleading, since it lowers its own probability to zero. So, suppose that E is true. Let H1 be a hypothesis such that 0 < P(H1|E)<1. Replacing H1 by its negation if necessary, we can assume H1 is true. Note that the fact that E is evidence implies that 0 < P(E)<1. Let H be the disjunctive hypothesis: ∼E or (H1&E). This is true as the second disjunct is true. Now, note that P(H1&E)<P(E) as P(H1|E)<1. Thus, (1 − P(E))P(H1&E)<(1 − P(E))P(E). Thus, P(H1&E)<P(E)P(H1&E)+(1 − P(E))P(E). Thus: P(H1|E)=P(H1&E)/P(E)<P(H1&E)+(1 − P(E)) = P(H1&E)+P(∼E)=P(H). Thus, E is evidence against H even though H is true.]
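A numeric instance of the construction in the proof (the two probabilities are my choices):

```python
# With 0 < P(E) < 1 and 0 < P(H1|E) < 1, the disjunction
# H = ~E or (H1 & E) is true, and yet E is evidence against it.
pE, pH1_given_E = 0.5, 0.6
pH1E = pE * pH1_given_E       # P(H1 & E)

pH = (1 - pE) + pH1E          # P(H) = P(~E) + P(H1 & E)
pH_given_E = pH1E / pE        # P(H|E) = P(H1 & E)/P(E) = P(H1|E)
assert pH_given_E < pH        # E lowers the probability of the true H
```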

In particular, we should not take misleadingness of evidence to be an evil. Misleadingness of evidence is a normal part of reasoning with incomplete information.

Thursday, August 8, 2019

Erring on the side of moderation leads to erring on the side of extremism, at least epistemically

One might think that having a less extreme (i.e., further from 0 and 1, and closer to 1/2) credence than is justified by the evidence is pretty safe epistemically. So, if one wants to be safe, one should move one’s credences closer to 1/2: moderation is safer than extremism.

But if one is to be consistent, this doesn’t work. For instance, suppose that the evidence points to clearly independent hypotheses A and B each having probability 0.6, but in the name of safety one assigns them 0.5. Then consistency requires one to assign their conjunction 0.5 × 0.5 = 0.25, whereas the evidence pointed to their conjunction having probability 0.6 × 0.6 = 0.36. In other words, by being more moderate about A and B, one is more extreme about their conjunction.
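The arithmetic, spelled out (numbers from the example above):

```python
# Moderating independent A and B from 0.6 to 0.5 makes the conjunction
# *more* extreme: 0.25 is further from 1/2 than the evidenced 0.36.
evidence_A = evidence_B = 0.6
moderated_A = moderated_B = 0.5

evidence_conj = evidence_A * evidence_B      # 0.36
moderated_conj = moderated_A * moderated_B   # 0.25
assert abs(moderated_conj - 0.5) > abs(evidence_conj - 0.5)
```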

In other words, once we have done our best in evaluating all the available evidence, we should go with the credence the evidence points to, rather than adding fudge factors to make our credences more moderate. (Of course, in particular cases, the existence of some kind of a fudge factor may be a part of the available evidence.)

Tuesday, February 26, 2019

The reportable and the assertible

I’ve just had a long conversation with a grad student about (inter alia) reporting and asserting. My first thought was that asserting is a special case of reporting, but one can report without asserting. For instance, I might have a graduate assistant write a report on some aspect of the graduate program, and then I could sign and submit that report without reading it. I would then be reporting various things (whether responsibly so would depend on how strong my reasons to trust the student were), but it doesn’t seem right to say that I would be asserting these things.

But then I came to think that just as one can report without asserting, one can assert without reporting. For instance, there is no problem with asserting facts about the future, such as that the sun will rise tomorrow. But I can’t report such facts, even though I know them.

It’s not really a question of time. For (a) I also cannot report that the sun rose a million years ago, and (b) if I were to time-travel to the future, observe the sunrise, and come back, then I could report that the sun will rise tomorrow.

And it’s not a distinction with respect to the quantity of evidence. After all, I can legitimately report what I had for dinner yesterday, but it’s not likely that I have as good evidence about that as I do that the sun will rise tomorrow.

I suspect it’s a distinction as to the kind of evidence that is involved. I am a legally bound reporter of illegal activity on campus. But I can’t appropriately report that a violation of liquor laws occurred in the dorms over the weekend if I know it only on the basis of the general claim that such violations, surely, occur every weekend. The kind of evidence that memory provides is typically appropriate for reporting, while the kind of evidence that induction provides is at least typically not.

Interestingly, although I can’t appropriately report that tomorrow the sun will rise, I can appropriately report that I know that the sun will rise tomorrow. This means that the reportable is not closed under obvious entailment.

Monday, October 8, 2018

Evidentialism, and self-defeating and self-guaranteeing beliefs

Consider this modified version of William James’ mountaineer case: The mountaineer’s survival depends on his jumping over a crevasse, and the mountaineer knows that he will succeed in jumping over the crevasse if he believes he will succeed, but doesn’t know that he will succeed as he doesn’t know whether he will come to believe that he will succeed.

James used his version of the case to argue that pragmatic reasons can legitimately override lack of epistemic reasons.

But what is interesting to me in my variant is the way it provides a counterexample to evidentialism. Evidentialists say that you epistemically should form your beliefs only on the basis of evidence. But notice that although the belief that he will succeed at the jump needs to be formed in the absence of evidence for its truth, as soon as it is formed, the belief itself becomes its own evidence to the point that it turns into knowledge. The belief is self-guaranteeing. So there seems to be nothing to criticize epistemically about the formation of the belief, even though the formation is independent of evidence. In fact, it seems, there is a good epistemic reason to believe, since by believing the mountaineer increases the stock of his knowledge.

Moreover, we can even make the case be one where the evidence on balance points against the proposition. Perhaps the mountaineer has attempted, in safer circumstances, to get himself to believe that he can make such a jump, and seven times out of ten he has failed at both self-induction of belief, and also at the jump. But in the remaining three times out of ten, he succeeded at both. So, then, the mountaineer has non-conclusive evidence that he won’t manage to believe that he will succeed (and that he won’t succeed). If he comes to believe that he will succeed, he comes to believe this against the evidence—but, still, in so doing, he increases his stock of knowledge, since the belief, once believed, is self-guaranteeing.

(This phenomenon of self-guaranteeing belief reminds me of things that Kierkegaard says about faith, where faith itself is a miracle that hence is evidence for its truth.)

Interestingly, we might also be able to construct cases of well-evidenced but self-defeating beliefs. Consider a jeweler who has noticed that she is successful at cutting a diamond if and only if she believes she will be unsuccessful. Her theory is that belief in her success makes her insufficiently careful. Over time, she has learned to suspend judgment in her success, and hence to be successful. But now she reflects on her history, and she finds herself with evidence that she will be successful in cutting the next diamond. Yet if she believes on this evidence, this will render her overconfident, and hence render the belief false!

This is related to the examples in this paper on lying.

So perhaps what the evidentialist needs to say is that you epistemically may believe p if and only if the evidence says that if you believe p, p is true?

Tuesday, June 12, 2018

Yet another counterexample to Nicod's Principle

Nicod’s Principle says that the claim that all Fs are Gs is confirmed by each instance.

Here’s yet another counterexample. Consider the claim:

  1. All unicorns are male.

We take this claim to be true, albeit vacuously so, since there are no unicorns.

But suppose an instance of (1), namely a male unicorn, were found. We would immediately conclude that (1) is probably false. For if there is a male unicorn, likely there is a female one as well.

The problem here is that when we learn of Sam that it is a male unicorn, we also learn that there are unicorns. And as soon as we learned that there are unicorns, that undercut the reason we had for believing (1), namely that we thought (1) was vacuously true.

Tuesday, April 24, 2018

Balancing between theism and atheism

The problem of evil consists of three main parts:

  • The problem of suffering.

  • The problem of evil choices.

  • The problem of hiddenness (which is an evil at most conditionally on God’s existing).

The theist has trouble explaining why there is so much suffering. The atheist, however, has trouble explaining why there is any suffering, given that suffering presupposes consciousness, and the atheist has trouble explaining why there is any consciousness.

Of course, there are atheist-friendly naturalistic accounts of consciousness. But they all face serious difficulties. This parallels the fact that theists have theodical accounts of why God permits so much suffering, accounts that also face serious difficulties.

So, on the above, considerations of suffering are a net tie between theism and atheism.

The theist does not actually have all that much trouble explaining why there are evil choices. Libertarian free will does the job. Of course, there are some problems with libertarian accounts of free will. These problems are not, I think, nearly as serious as the problems that theists have with explaining why there is so much suffering or atheists have with explaining why there is consciousness. Moreover, there is a parallel problem for the atheist. Evil choices can only exist given free will. Prima facie the most plausible accounts of free will are libertarian agent-causal ones. But those are problematic for the atheist, who will find it difficult to explain where libertarian agents come from. The atheist probably has to embrace a compatibilist theory, which has at least as many problems as libertarian agent-causalism.

So, considerations of evil choices look at best like a net tie for the atheist.

Finally, there is the problem of hiddenness for the theist. But while the theist has trouble explaining how we don’t all know something so important as the existence of God, the atheist has epistemological trouble of her own: she has trouble explaining how she knows that there is no God. After all, knowledge of the highly abstract facts that enter into arguments regarding the existence of God is not the sort of knowledge that seems to be accessible to evolved natural beings.

So, considerations of knowledge of the existence or non-existence of God look like a net tie.

The problem of evil, however, exhausts the powerful arguments for atheism. But the above considerations far from exhaust the powerful arguments for theism.

The above reasoning no doubt has difficulties. But I want to propose it as a strategy for settling disputes in cases where it's hard to assign probabilities. For even if it's hard to assign probabilities, we can have good intuitions that two considerations are a wash, that they provide equal evidence. And if we can line up arguments in such a way, being more careful with issues of statistical dependence than I was above, then we can come to a view as to which way some bunch of evidence points.

Thursday, July 13, 2017

Preponderance of evidence

I do formal epistemology, but I am no legal scholar, so this could be a complete misunderstanding. It is my understanding that in civil cases a preponderance of evidence standard is used on which the evidence needs to support the conclusion with a probability merely greater than 1/2. This seems ridiculous in cases where one is seeking compensation for damages that may or may not have occurred.

Suppose I run a business, and I treat my staff somewhat shabbily but not actionably. One day, hundreds of dollars worth of damage occurs in the server room. Review of blurry security camera footage, building security logs and other data proves beyond reasonable doubt the following facts:

  • A thin stocking was put over the camera, hence the blur.

  • There were five employees in the offices at the time, all of whom had a similar build and appearance: Alfred, Bill, Carl, David and Edgar.

  • Three of the employees went to the bathroom and returned with buckets full of water which they poured over the servers.

  • The other two employees did their best to stop the three, including calling 911 and heroically trying to block the door to the server room. As a result of the scuffle, everybody’s fingerprints are on the buckets and everybody is wet.

  • Each employee claims with equal credibility that he was one of the two trying to stop the attack. Moreover, everybody claims to be unable to identify who the “other” employee trying to stop the attack is. The video footage shows a scene of such confusion that this inability to identify is unsurprising.

So, I fire all five employees and then sue each of the five individually for damages. I argue in the case of each employee that the evidence clearly yields a 3/5 probability that he was responsible for the damage, and remind the court that 3/5 > 1/2.

But surely it would be a serious miscarriage of justice for all five to be held liable for damages that two of the five sought to prevent.

I wonder if cases like this get their force solely from the fact that the probabilities involved—namely, 3/5—are low, or if there is something else going on. Suppose I had a thousand employees, and 999 were damaging company property while one was trying to stop it. Should I be able to sue all 1000, correctly claiming a probability of 999/1000 of responsibility in each case, while knowing for sure that a judgment in my favor in all 1000 cases will place a severe financial burden on exactly one innocent person?

That is an uncomfortable conclusion, but perhaps we should bite the bullet and say that this is no different from a court knowing that over the run of many cases, there will be a small minority where innocents are burdened with grave burdens—and the risk of suffering such burdens is just part of the cost of membership in the society, much as being subject to the draft is.

But it seems much more uncomfortable to say something like this in the 3/5 case—or a 51/100 case—than in a 999/1000 case.

Naive intuition: The evidence needed should scale with the burden to the defendant in the case of a finding against them. Maybe the evidence requirements do thus scale in practice. Like I said, I am no legal scholar.

Friday, February 3, 2017

Evidentialism and higher-order belief

It seems epistemically vicious to induce or maintain a belief for which one has insufficient evidence.

But suppose that my evidence supports a quite low degree of confidence about (a) whether I have or will have any higher-order beliefs, (b) the reliability of my introspection into higher-order beliefs, and (c) whether I am capable of self-inducing a belief. I now try to self-induce a belief that I have a higher-order belief, reasoning: either I’ll succeed or I’ll fail in self-induction. If I succeed, I will gain a true belief—for then I will have a higher-order belief. If I fail, no harm done. So I try, and I succeed.

Nothing epistemically vicious has been done, even though I self-induced a belief for which I had insufficient evidence.

In light of my evidenced low degree of confidence in the reliability of introspection into higher-order beliefs, once I have gained the belief, I still on balance have insufficient evidence for the belief. But it doesn’t seem irrational to try to maintain the belief, on the grounds that one can only successfully maintain it if one has it, and if one has it, it’s true. And so I try to maintain the belief, and I succeed. So I maintain the belief despite continuing insufficient evidence, and yet I am rational.

Here’s a reverse case. Let’s say that I find myself with very strong evidence that I do not have and will never have any higher-order beliefs. It would be irrational to try to get myself to believe this proposition on this evidence.

So perhaps we should tie rationality not to evidence for a belief, but to evidence for the material conditional: if I have the belief, it is true?

Cf. this about assertion.

Wednesday, January 27, 2016

Pro tanto epistemic reasons and Bayesianism

I've been thinking about whether a Bayesian can make sense of the concept of pro tanto epistemic reasons. The idea is that p gives us a reason to believe q, though in the light of our full evidence it may no longer support q. In other words, the idea is that p is pro tanto evidence for q if there is a reduced set of evidence relative to which p supports q.

But on a Bayesian picture it can't be just any reduced set of evidence, or else we have too many pro tanto reasons. Let p be the proposition that the sky looks blue and q be the proposition that the sky is red. Let r be some exceedingly unlikely fact, say the fact that when I asked random.org to generate a sequence of 16 random bytes, it generated 52 e2 57 4d 6d 16 c9 dd 12 9e b4 63 27 7e 86 53. Then I believe that (p&~r)→q, where the conditional is material. I believe it, because I believe the antecedent to be false. But if my background consists only of (p&~r)→q, then given reasonable priors p strongly supports q.
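The worry in the last two sentences can be checked directly. Here is a minimal sketch with made-up priors, enumerating the eight possible truth-value assignments to p, q, and r, which are assumed independent a priori:

```python
from itertools import product

# Illustrative made-up priors: p ("the sky looks blue") is likely,
# q ("the sky is red") is very unlikely, and r (the random.org outcome)
# is exceedingly unlikely.
PRIOR = {"p": 0.9, "q": 1e-6, "r": 2.0 ** -128}

worlds = [dict(zip("pqr", vals)) for vals in product([True, False], repeat=3)]

def prob(w):
    # Prior probability of a world, with p, q, r independent.
    out = 1.0
    for v, truth in w.items():
        out *= PRIOR[v] if truth else 1 - PRIOR[v]
    return out

def cond(event, given):
    # Conditional probability of event given the background.
    num = sum(prob(w) for w in worlds if event(w) and given(w))
    return num / sum(prob(w) for w in worlds if given(w))

# The background B is the material conditional (p & ~r) -> q.
B = lambda w: (not (w["p"] and not w["r"])) or w["q"]

q_given_B = cond(lambda w: w["q"], B)
q_given_B_and_p = cond(lambda w: w["q"], lambda w: B(w) and w["p"])
print(q_given_B)         # q stays very unlikely on the background alone
print(q_given_B_and_p)   # but p pushes q close to 1 against this background
```

Because r is so much less likely than q, conditioning on p against a background containing only the material conditional drives the probability of q nearly to 1—exactly the unwanted result.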

So what we want, I think, is to say that p is pro tanto evidence for q provided that there is a privileged reduced set of evidence relative to which p supports q. But what reduced sets of evidence are privileged? There is only one such set that stares one in the face: the empty set. So, the suggestion is: p is pro tanto evidence for q if and only if p supports q relative to an empty background, i.e., according to the absolute priors.

This, I think, offers a way to make some progress on the problem of priors. If we have independent sufficient conditions for something to be a pro tanto reason, then we have a constraint on our absolute priors, namely that if p is pro tanto evidence for q, then P(p & q) > P(p)P(q).

Could we have some such independent sufficient conditions for pro tanto reasons? I think so. For instance, around here, Trent Dougherty has been pushing phenomenal conservatism, which can be taken to be the view that seemings are pro tanto reasons for their contents. If the above Bayesian account of pro tanto reasons is correct, then this puts a constraint on prior probabilities, and that's a good thing.

Monday, January 25, 2016

"Why are you telling me this?" and protocols

Suppose you want to convince me that I have no hands but are unable to lie (and I know for sure you are unable to lie). However, you know a lot more than I about something, perhaps something completely irrelevant to the question. For instance, suppose you know the results of some very long sequence of die rolls that's completely irrelevant. It seems you can fool me with the truth. For you can find some true proposition p about the die rolls such that I assign an exceedingly low probability to p. You then reveal to me this disjunctive fact: p is true or I have no hands. Then: P(no hands | p or no hands) = P(no hands) / (P(p) + P(~p and no hands)) ≥ P(no hands) / (P(p) + P(no hands)). (Exercise: check the details.) If P(p) is sufficiently small, relative to my prior probability P(no hands) (which of course is non-zero--there is a tiny chance that I was in a terrible accident and superb prostheses have just been developed), this will be close to 1.

But of course, whether I have hands or not, if you know a lot more about something than I do, you will be able to find a truth that I assign a tiny probability to. So I really shouldn't be deceived by you. Rather, I should take myself to have learned p. Your disjunction is equivalent to the material conditional that if I have hands, then p. I know I have hands. So, p. But what about the Bayesian calculation, which is mathematically correct?

This is a protocol problem. If I happened to ask you whether the disjunction "p is true or I have no hands" was true, and you then revealed to me that it was, the Bayesian calculation would have been correct. But the actual protocol was that you picked out a truth that I took to be unlikely, and disjoined it with the claim that I have no hands. If I knew for sure that this was your protocol, I would have learned two things: first that p is true, and second that p is true or I have no hands. The second would have been uninformative in light of the first, and so there would be no deceit. But of course if the above were to really happen, I wouldn't know for sure what your protocol was.

In real life, when someone tells us something odd out of the blue, we often ask: "Why are you telling me this?" The above case shows how epistemically important the answer to this question can be. If you tell me (remember that you are unable to lie) that you're telling me this to get me to think I have no hands, I will suspect that your protocol may be to find an unlikely truth and disjoin it with the claim that I have no hands. As long as I have significant suspicion that this is your protocol, your statement won't shake my near-certainty that I have hands. But if you tell me that you were telling me this because you decided, before finding out whether p was true, that you were going to tell me whether or not the disjunction is true, then my near-certainty that I have hands should be shaken. I wonder how often "Why are you telling me this?" involves a case of trying to find the protocol and thus to figure out how to update. Rarely? Often?
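The protocol-dependence can be illustrated with a small simulation. All the numbers below are made up (a deliberately exaggerated prior for handlessness and a thousand equally likely roll outcomes) so the effect shows up quickly:

```python
import random
random.seed(0)

N = 1000            # equally likely die-roll outcomes
P_NO_HANDS = 0.01   # exaggerated prior, for quick Monte Carlo convergence
TRIALS = 500_000
fixed_p = 0         # protocol B: pre-committed report about outcome 0

a_nh = a_total = 0  # protocol A: informant picks whatever outcome came true
b_nh = b_total = 0  # protocol B: condition on the report "disjunction true"

for _ in range(TRIALS):
    no_hands = random.random() < P_NO_HANDS
    roll = random.randrange(N)

    # Protocol A: assert "the roll was <actual roll> or I have no hands".
    # Such an assertion is available in every world, so conditioning on its
    # having been made tells me nothing about my hands.
    a_total += 1
    a_nh += no_hands

    # Protocol B: the informant reports whether "roll == fixed_p or no hands"
    # is true, for a disjunction fixed before the roll.
    if roll == fixed_p or no_hands:
        b_total += 1
        b_nh += no_hands

print(a_nh / a_total)  # stays near the prior 0.01
print(b_nh / b_total)  # jumps toward P(nh)/(P(p)+P(~p & nh)), about 0.9
```

Same disjunction, same report, yet the rational update differs drastically depending on which protocol generated it.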

Friday, January 22, 2016

Absence of evidence is evidence of absence

It's a well-known theorem that if conditioning on A increases the probability of B, then conditioning on not-A increases the probability of not-B. So if learning that there is evidence of B increases the probability of B, learning that there is no evidence for B increases the probability of not-B. Hence, there is a sense in which absence of evidence is evidence of absence.

Of course, the absence of evidence may be exceedingly weak evidence of absence, and that's probably the right way to take the platitude that absence of evidence isn't evidence of absence.

Thursday, October 29, 2015

A weakly-fallibilist evidentialist can't be an evidential Bayesian

The title is provocative, but the thesis is less provocative (and in essence well-known: Hawthorne's work on the deeply contingent a priori is relevant) once I spell out what I stipulatively mean by the terms. By evidential Bayesianism, I mean the view that evidence should only impact our credences by conditionalization. By evidentialism, I mean the view that high credence in contingent matters should not be had except by evidence (most evidentialists make a stronger claim). By weak fallibilism, I mean that sometimes a correctly functioning epistemic agent would appropriately have high credence on the basis of non-entailing evidence. These three theses cannot all be true.

For suppose that they are all true, and I am a correctly functioning epistemic agent who has appropriate high credence in a contingent matter H, and yet my total evidence E does not entail H. By evidentialism, my credence comes from the evidence. By evidential Bayesianism, if P measures my prior probabilities, then P(H|E) is high. But it is a theorem that P(H|E) is less than or equal to P(E→H), where the arrow is a material conditional. So the prior probability of E→H is high. This conditional is not necessary, as E does not entail H. Hence, I have high prior credence in a contingent matter. Prior probabilities are by definition independent of my total evidence. So evidentialism is violated.
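The theorem invoked here can likewise be spot-checked on random joint distributions over E and H:

```python
import random
random.seed(2)

# Check P(H|E) <= P(E -> H), where E -> H is the material conditional
# ~E or H, on random joint distributions over E and H.
worst_gap = -1.0
for _ in range(10_000):
    x = [random.random() for _ in range(4)]
    s = sum(x)
    pEH, pEnH, pnEH, pnEnH = (v / s for v in x)  # the four cells of the joint
    p_H_given_E = pEH / (pEH + pEnH)
    p_material = 1 - pEnH                        # P(~E or H) = 1 - P(E & ~H)
    worst_gap = max(worst_gap, p_H_given_E - p_material)
print(worst_gap)  # never positive: P(H|E) never exceeds P(E -> H)
```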

Monday, October 19, 2015

Correcting Bayesian calculations

Normally, we take a given measurement to be a sample of a bell-curve distribution centered on the true value. But we have to be careful. Suppose I report to you the volume of a cubical cup. What the error distribution is like depends on how I measured it. Suppose I weighed the cup before and after filling it with water. Then the error might well have the normal distribution we associate with the error of a scale. But suppose instead I measure the (inner) length of one of the sides of the cup, and then take the cube of that length. Then the measurement of the length will be normally distributed, but not the measurement of the volume. Suppose that what I mean by "my best estimate" of a value is the mathematical expectation of that value with respect to my credences. Then it turns out that my best estimate of the volume shouldn't be the cube of the side length, but rather it should be L³+3Lσ², where L is the side-length and σ is the standard deviation in the side-length measurements. Intuitively, here's what happens. Suppose I measure the side length at 5 cm. Now, it's equally likely that the actual side length is 4 cm as that it is 6 cm. But 4³=64 and 6³=216. The average of these two equally-likely values is 140, which is actually more than 5³=125. So if by best-estimate I mean the estimate that is the mathematical expectation of the value with respect to my credences, the best-estimate for the volume should be higher than the cube of the best-estimate for the side-length. (I'm ignoring complications due to the question whether the side-length could be negative; in effect, I'm assuming that σ is quite a bit smaller than L.)
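A quick Monte Carlo check of the claim about the expected cube, with made-up numbers:

```python
import random
random.seed(3)

# If the side length is measured as L plus symmetric normal error with
# standard deviation sigma, the expected value of the cubed measurement
# is L^3 + 3*L*sigma^2, not L^3. The numbers below are made up.
L, sigma, n = 5.0, 0.5, 1_000_000
mean_cube = sum(random.gauss(L, sigma) ** 3 for _ in range(n)) / n
print(mean_cube)  # near 5**3 + 3 * 5 * 0.5**2 = 128.75 rather than 125
```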

There is a very general point here. Suppose that by the best estimate of a quantity I mean the mathematical expectation of that quantity. Suppose that the quantity y I am interested in is given by the formula y=f(x) where x is something I directly measure and where my measurement of x has a symmetric error distribution (error of the same magnitude in either direction are equally likely). Then if f is a strictly convex function, then my best estimate for y should actually be bigger than f(x): simply taking my best estimate for x and applying f will underestimate y. On the other hand, if f is strictly concave, then my best estimate for y should be smaller than f(x).

But now let's consider something different: estimating the weight of evidence. Suppose I make a bunch of observations and update in a Bayesian way on the basis of them to arrive at a final credence. Now, it turns out that when you formulate Bayes' theorem in terms of the log-odds-ratio, it becomes a neat additive theorem:

  • posterior log-odds-ratio = prior log-odds-ratio + log-likelihood-ratio.
[If p is the probability, the log-odds ratio is log (p/(1−p)). If E is the evidence and H is the hypothesis, the log-likelihood-ratio is log (P(E|H)/P(E|~H)).] As we keep adding new evidence into the mix, we keep on adding new log-likelihood-ratios to the log-odds-ratio. Assuming competency in doing addition, there are two or three sources of error--sources of potential divergence between my actual credences and the rational credences given the evidence. First, I could have stupid priors. Second, I could have the wrong likelihoods. Third, perhaps, I could fail to identify the evidence correctly. Given the additivity between these errors, it's not unreasonable to think that error in the log-odds-ratio will be approximately normally distributed. (All I will need for my argument is that it has a distribution symmetric around some value.)
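The additive form is equivalent to the ordinary form of Bayes' theorem, as a quick check with made-up numbers shows:

```python
import math

# logit(P(H|E)) = logit(P(H)) + log(P(E|H)/P(E|~H)); all numbers hypothetical.
logit = lambda p: math.log(p / (1 - p))
sigmoid = lambda x: 1 / (1 + math.exp(-x))

pH = 0.2                  # prior probability of H
pE_H, pE_nH = 0.7, 0.1    # likelihoods of the evidence E

# Ordinary Bayes' theorem:
posterior = pE_H * pH / (pE_H * pH + pE_nH * (1 - pH))

# Additive log-odds form:
log_odds = logit(pH) + math.log(pE_H / pE_nH)

print(posterior, sigmoid(log_odds))  # the two agree
```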

But as the case of the cubical cup shows, it does not follow that the error in the credence will be normally distributed. If x is the log-odds-ratio and p is the probability or credence, then p = e^x/(e^x+1). This is a very pretty function. It is concave for log-odds-ratios bigger than 0, corresponding to probabilities bigger than 1/2, and convex for log-odds-ratios smaller than 0, corresponding to probabilities less than 1/2, though it is actually fairly linear over a range of probabilities from about 0.3 to 0.7.

We can now calculate an estimate of the rational credence by applying the function e^x/(e^x+1) to the log-odds-ratio. This will be equivalent to the standard Bayesian calculation of the rational credence. But as we learn from the cube case, we don't in general get the best estimate of a quantity y that is a mathematical function of another quantity x by measuring x with normally distributed error and computing the corresponding y. When the function in question is convex, my best estimate for y will be higher than what I get in this way. When the function is concave, I should lower it. Thus, as long as we are dealing with small normal error in the log-odds-ratio, when we are dealing with probabilities bigger than around 0.7, I should lower my credence from that yielded by the Bayesian calculation, and when we are dealing with probabilities smaller than around 0.3, I should raise my credence relative to the Bayesian calculation. When my credence is between 0.3 and 0.7, to a decent approximation I can stick to the Bayesian credence, as the transformation function between log-odds-ratios and probabilities is pretty linear there.

How much difference does this correction to Bayesianism make? That depends on what the actual normally distributed error in log-odds-ratios is. Let's make up some numbers and plug into Derive. Suppose my standard deviation in log-odds-ratio is 0.4, which corresponds to an error of about 0.1 in probabilities when around 0.5. Then the correction makes almost no difference: it replaces a Bayesian's calculation of a credence 0.01 with a slightly more cautious 0.0108, say. On the other hand, if my log-odds-ratio standard deviation is 1, which corresponds with a variation of probability of around plus or minus 0.23 when centered on 0.5, then the correction changes a Bayesian's calculation of 0.01 to the definitively more cautious 0.016. But if my log-odds-ratio standard deviation is 2, corresponding to a variation of probability of 0.38 when centered on 0.5, then the correction changes a Bayesian's calculation of 0.01 to 0.04. That's a big difference.
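Numbers like these are easy to re-derive by Monte Carlo rather than Derive; the sketch below averages credences whose log-odds-ratio carries normal error of the stated standard deviation:

```python
import math, random
random.seed(4)

sigmoid = lambda x: 1 / (1 + math.exp(-x))
logit = lambda p: math.log(p / (1 - p))

def corrected(p, sd, n=200_000):
    # Average credence when the log-odds-ratio of p carries N(0, sd) error.
    mu = logit(p)
    return sum(sigmoid(random.gauss(mu, sd)) for _ in range(n)) / n

c_small = corrected(0.01, 0.4)  # about 0.011: almost no difference
c_mid = corrected(0.01, 1.0)    # about 0.016
c_big = corrected(0.01, 2.0)    # roughly 0.04: a big shift from 0.01
print(c_small, c_mid, c_big)
```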

There is an important lesson here. When I am badly unsure of the priors and/or likelihoods, I shouldn't just run with my best guesses and plug them into Bayes' theorem. I need to correct for the fact that my uncertainty about priors and/or likelihoods is apt to be distributed normally (or at least symmetrically about the right value) on the log-odds scale, not on the probability scale.

This could be relevant to the puzzle that some calculations in the fine-tuning argument yield way more confirmation than is intuitively right (I am grateful to Mike Rota for drawing my attention to the last puzzle, in a talk he gave at the ACPA).

Wednesday, September 30, 2015

A virtuous evidential regress

Could this ever be the case: p2 is evidence for p1, p3 is evidence for p2, p4 is evidence for p3, and so on ad infinitum?

I don't think we can rule this out on epistemological grounds alone. For suppose that there are infinitely many unicorns in the universe, none of which you've observed, but there are also infinitely many experts. Expert number n happens to inform you that there are at least n unicorns in the universe. Now, let pn be the proposition that there are at least n unicorns in the universe. Then obviously p2 is evidence for p1, p3 is evidence for p2, and so on. But there is nothing vicious about this regress. For you have independent evidence for each pn. This is a case where although there is an infinite evidential regress, all the ultimate evidence is outside of the regress—for ultimately all the evidence about the unicorns comes from the experts.

But note that despite the fact that the ultimate evidence is all outside the regress, the evidential relations within the regress are important. For while you have some evidence for p1 directly from the first expert, you also have some additional evidence for p1 deriving from p2, and hence from the second expert.