Suppose I see a hypothetical click-bait article that says: “One thing you can do that science says doubles your chance of living past a hundred.” I foolishly click on it, and from the first paragraph find out that supposedly that thing is a serious photography hobby. Now, the idea isn’t crazy: having a serious hobby that can be pursued over a lifetime and that involves artistic and intellectual skills could well increase your lifespan. But I am reasonably sceptical.
Suppose for simplicity that after reading the first paragraph, I assign a credence of 1/2 to the null hypothesis N that there is no correlation between photography and living to a hundred and a credence of 1/2 to the hypothesis H that serious photography doubles the chance of living past 100.
Suppose I now tell my mother about the article, and she says: “Your great-grandma Alice was an avid photographer and she lived past 100!”
This is paradigmatically anecdotal evidence. But let’s do a quick and dirty Bayesian analysis. I just learned the fact E1 that at least one of my eight great-grandparents was both a serious photographer and lived past 100. Let’s say for simplicity that each of my great-grandparents was born more than 100 years ago and had a 1% chance of living past 100 (some actual data is here), and that 15% of people are serious photographers. Then the conditional probability of my evidence E1 on the null hypothesis N is 1 − (1 − 0.01 ⋅ 0.15)⁸, or about 1.2%, but on the doubling-chance hypothesis H it is 1 − (1 − 2 ⋅ 0.01 ⋅ 0.15)⁸, or about 2.4%. Plugging these into Bayes’ theorem, I get a 67% posterior probability of H, and now I have a significant degree of credence in a photography-centenarianism link.
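For anyone who wants to check the arithmetic, here is a minimal Python sketch of the great-grandparents calculation. The function names are my own, and the sketch takes over the simplifying assumptions above: independence between relatives, a 1% base rate of living past 100, 15% serious photographers, and equal priors on H and N.

```python
def prob_at_least_one(n, p_centenarian=0.01, p_photographer=0.15, multiplier=1.0):
    """Probability that at least one of n people is a photographer-centenarian.

    multiplier=1.0 models the null hypothesis N; multiplier=2.0 models the
    doubling hypothesis H, using the same simple formula as the text.
    """
    p_both = multiplier * p_centenarian * p_photographer
    return 1.0 - (1.0 - p_both) ** n


def posterior_h(like_h, like_n, prior_h=0.5):
    """Bayes' theorem for two exhaustive hypotheses H and N."""
    return prior_h * like_h / (prior_h * like_h + (1.0 - prior_h) * like_n)


# Evidence E1: at least one of eight great-grandparents qualifies.
p_e1_n = prob_at_least_one(8)                  # roughly 1.2%
p_e1_h = prob_at_least_one(8, multiplier=2.0)  # roughly 2.4%
post_e1 = posterior_h(p_e1_h, p_e1_n)          # roughly 67%

print(f"P(E1|N) = {p_e1_n:.1%}, P(E1|H) = {p_e1_h:.1%}, P(H|E1) = {post_e1:.0%}")
```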
But suppose that instead of talking to my mother, I read further on in the article, and find that, after reading whatever scientific study spawned the article, the author found and interviewed a centenarian Bob who has been an avid photographer for half his life. What does that piece of anecdotal data do to my credence in the link between photography and a long life? Nothing! To a first approximation, the relevant fact I learned from the interview is the fact E2 that there exists at least one person in the world who was an avid photographer and lived past 100. And the conditional probability of E2 is very close to 1 on both H and N, so by Bayes’ theorem it doesn’t change my credences in H and N. (To a second approximation, I learned that there was one person accessible to the author who was an avid photographer and lived past 100. And that is presumably slightly more likely on H than on N. So I should get a slight boost in H, but only a slight one, since in the modern world we have access to lots of people.)
Now consider an intermediate case. Instead of talking to relatives, I share the article with a hundred people, and one of them writes back: “Wow! My tennis partner’s great-uncle Carl was an avid photographer and lived past 100.” Let’s oversimplify by supposing that each of my hundred correspondents reads the article and on average contributes ten people born more than 100 years ago to the sample. So from the first response, I have learned the fact E3 that in this sample of 1000, there is at least one person who is a centenarian and a serious photographer. The probability of this on H is 95% and on N it is 78%. Plugging these into Bayes’ theorem, my credence in the photography-centenarianism link is 55%, which is a rather modest boost over my initial 50%. (The crucial point was that in the initial great-grandparents sample, the probability on H was double that on N, but now as both probabilities approach 100%, the ratio gets less impressive.)
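The same quick and dirty calculation can be run for the correspondents case. This is only a sketch under the assumptions already made (a hundred correspondents contributing ten people each, a 1% base rate, 15% photographers, equal priors), with variable names of my own choosing. It also prints the likelihood ratio, which is the quantity that shrinks as the sample grows.

```python
P_BOTH_N = 0.01 * 0.15      # photographer-centenarian rate per person on N
P_BOTH_H = 2 * 0.01 * 0.15  # the same rate on the doubling hypothesis H


def at_least_one(n, p):
    """Probability that a sample of n independent people contains at least one hit."""
    return 1.0 - (1.0 - p) ** n


sample_size = 100 * 10  # a hundred correspondents, ten people each

p_e3_n = at_least_one(sample_size, P_BOTH_N)  # roughly 78%
p_e3_h = at_least_one(sample_size, P_BOTH_H)  # roughly 95%

# With equal priors, the posterior of H is just the normalized likelihood.
post_e3 = p_e3_h / (p_e3_h + p_e3_n)          # roughly 55%

# Compare the likelihood ratio with the eight great-grandparents case.
ratio_e3 = p_e3_h / p_e3_n
ratio_e1 = at_least_one(8, P_BOTH_H) / at_least_one(8, P_BOTH_N)

print(f"P(E3|N) = {p_e3_n:.0%}, P(E3|H) = {p_e3_h:.0%}, P(H|E3) = {post_e3:.0%}")
print(f"likelihood ratio: {ratio_e1:.2f} for n = 8, {ratio_e3:.2f} for n = {sample_size}")
```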
There are some lessons here: If we are careful with our reasoning, anecdotal data can actually be quite relevant. Moreover, while it’s presumably been ingrained in us since high school science classes that larger sample sizes are better, for certain kinds of anecdotal data, smaller sample sizes are better. This is because the relevant information given by certain kinds of anecdotal data is positive: it is information that some sample contains at least one instance of some sort that is rare on both the null hypothesis and the alternative hypothesis (say, a photographer centenarian). In those cases, once the sample size gets large enough, the probability of the evidence on either hypothesis gets close to 1, and the evidential force disappears.
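To see the evidential force fade as the sample grows, one can sweep over sample sizes. The sketch below is illustrative only: the particular sample sizes are my own choice, and it keeps the toy numbers used throughout (1% centenarians, 15% photographers, equal priors, independent individuals).

```python
def posterior_for_sample(n, p_centenarian=0.01, p_photographer=0.15, prior_h=0.5):
    """Posterior of H after learning that a sample of n people contains
    at least one photographer-centenarian."""
    p_n = p_centenarian * p_photographer  # per-person rate on N
    p_h = 2 * p_n                         # per-person rate on H (doubling)
    like_n = 1.0 - (1.0 - p_n) ** n
    like_h = 1.0 - (1.0 - p_h) ** n
    return prior_h * like_h / (prior_h * like_h + (1.0 - prior_h) * like_n)


# From an anecdote about oneself up to an anecdote about a large crowd.
sizes = [1, 8, 100, 1000, 10000]
posteriors = [posterior_for_sample(n) for n in sizes]

for n, post in zip(sizes, posteriors):
    print(f"sample of {n:>5}: P(H | at least one hit) = {post:.3f}")
```

On these assumptions the posterior starts near 2/3 for a sample of one and slides back toward the 1/2 prior as the sample gets large.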
What this means is that for certain kinds of anecdotal data it makes perfect sense to be more impressed by an anecdote about oneself (a sample of one) than by an anecdote about a relative, and by an anecdote about a relative than by an anecdote about a friend’s friend, and to be essentially unmoved by an anecdote about a stranger on the Internet. And that is, I suspect, how most people actually proceed, notwithstanding blanket condemnations of anecdotal reasoning.
How can we do even better? Well, we should try to enrich our positive anecdotal data with other kinds of anecdotal data: Did I have centenarian relatives who weren’t photographers or photographer relatives who weren’t centenarians? All that would ideally be taken into account. But, nonetheless, if all I have is one positive anecdote, Bayesianism requires me not to dismiss it.