Monday, August 21, 2017

Searching for the best theory

Let’s say that I want to find the maximum value of some function over some domain.

Here’s one naive way to do it:

Algorithm 1: I pick a starting point in the domain at random, place an imaginary particle there and then gradually move the particle in the direction where the function increases, until I can’t find a way to improve the value of the function.
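Algorithm 1 can be sketched in a few lines of code. A minimal illustration, where the particular function, domain, and step size are my own illustrative choices, not anything from the post:

```python
import math
import random

def f(x):
    # A toy function with several peaks, so greedy climbing can get stuck.
    return math.sin(x) + 0.5 * math.sin(3 * x)

def hill_climb(f, x, step=0.01, lo=0.0, hi=10.0):
    """Greedily move x uphill until no small step improves f(x)."""
    while True:
        nbrs = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = max(nbrs, key=f, default=x)
        if f(best) <= f(x):
            return x  # stuck: a local maximum (at this step resolution)
        x = best

start = random.uniform(0.0, 10.0)
peak = hill_climb(f, start)
```

Depending on where `start` lands, `peak` may be the global maximum or merely a local one; nothing in the loop can tell the difference.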

This naive way can easily get me stuck in a “local maximum”: a peak from which all movements go down. In the example graph, most starting points will leave one stuck at a mere local maximum rather than the true peak.

Let’s say I have a hundred processor cores available, however. Then here’s another simple thing I could do:

Algorithm 2: I choose a hundred starting points in the domain at random, and then have each core track one particle as it tries to move towards higher values of the function, until it can move no more. Once all the particles are stuck, we survey them all and choose the one which found the highest value. This is pretty naive, too, but we have a much better chance of getting to the true maximum of the function.
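Algorithm 2 is just Algorithm 1 run from many independent starting points, keeping the best result. A sketch (run sequentially here rather than on a hundred cores; again the function and domain are illustrative choices):

```python
import math
import random

def f(x):
    # Same toy multi-peak function as in the Algorithm 1 sketch.
    return math.sin(x) + 0.5 * math.sin(3 * x)

def hill_climb(f, x, step=0.01, lo=0.0, hi=10.0):
    # Greedy uphill walk, as in Algorithm 1.
    while True:
        nbrs = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = max(nbrs, key=f, default=x)
        if f(best) <= f(x):
            return x
        x = best

def multi_start(f, n=100, lo=0.0, hi=10.0):
    # Climb independently from n random starts; survey and keep the best peak.
    peaks = [hill_climb(f, random.uniform(lo, hi)) for _ in range(n)]
    return max(peaks, key=f)

best = multi_start(f)
```

With a hundred independent starts, the odds that every one of them lands in the basin of a mere local maximum become vanishingly small, which is exactly the point of Algorithm 2.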

But now suppose I have this optimization idea:

Algorithm 3: I follow Algorithm 2, except at each time step, I check which of the 100 particles is at the highest value point, and then move the other 99 particles to that location.

The highest value point found is intuitively the most promising place, after all. Why not concentrate one’s efforts there?

But Algorithm 3 is, of course, a bad idea. For now all 100 particles will move in lock-step, and will all arrive at the same point. We lose much of the independent exploration benefit of Algorithm 2. We might as well have one core.
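The collapse is easy to see in a simulation. After the very first synchronization, every particle occupies the leader's position and thereafter takes identical deterministic uphill steps, so the whole population traces a single trajectory (function and domain again illustrative, as before):

```python
import math
import random

def f(x):
    # Same toy multi-peak function as in the earlier sketches.
    return math.sin(x) + 0.5 * math.sin(3 * x)

def step_uphill(x, step=0.01, lo=0.0, hi=10.0):
    # One greedy uphill move (or stay put if already at a local step-max).
    nbrs = [c for c in (x - step, x, x + step) if lo <= c <= hi]
    return max(nbrs, key=f)

random.seed(0)
particles = [random.uniform(0.0, 10.0) for _ in range(100)]
for _ in range(500):
    particles = [step_uphill(x) for x in particles]
    leader = max(particles, key=f)          # the most promising particle...
    particles = [leader] * len(particles)   # ...and everyone jumps to it

print(len(set(particles)))  # prints 1: all diversity is gone
```

One hundred particles, one distinct position: the population's answer is just whatever local peak the early leader's basin happens to contain.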

But now notice how often in our epistemic lives, especially philosophical ones, we seem to be living by something like Algorithm 3. We are trying to find the best theory. And in journals, conferences, blogs and conversations, we try to convince others that the theory we’re currently holding to is the best one. This is as if each core were trying to convince the other 99 to explore the location that it was exploring. If the core succeeded, the effect would be like Algorithm 3 (or worse). Forcing convergence—even by intellectually honest means—seems to be harmful to the social epistemic enterprise.

Now, it is true that in Algorithm 2, there is a place for convergence: once all the cores have found their local maxima, then we have the overall answer, namely the best of these local maxima. If we all had indeed found our local maxima, i.e., if we all had fully refined our individual theories to the point that nothing nearby was better, it would make sense to have a conference and choose the best of all of the options. But in fact most of us are still pretty far from even the locally best theory, and it seems unlikely that we will achieve it in this life.

Should we then all work independently, not sharing results lest we produce premature convergence? No. For one, the task of finding the locally optimal theory is one that we probably can’t achieve alone. We are dealing with functions whose values at the search point cannot be evaluated by our own efforts, and where even exploring the local area needs the help of others. And so we need cooperation. What we need is groups exploring different regions of the space of theories. And in fact we have this: we have the Aristotelians looking for the best theory in the vicinity of Aristotle’s, we have the Humeans, etc.

Except that each group is also trying to convince the others. Is it wrong to do so?

Well, one complicating factor is that philosophy is not just an isolated intellectual pursuit. It has here-and-now consequences for how to live our lives beyond philosophy. This is most obvious in ethics (including political philosophy), epistemology and philosophy of religion. In Algorithm 3, 99 of the cores may well be exploring less promising areas of the search space, but it’s no harm to a core to be exploring such an area. But it can be a serious harm to a person to have false ethical, epistemological or religious beliefs. So even if it were better for our social intellectual pursuits that all the factions be doing their searching independently, we may well have reasons of charity to try to convince others—but primarily where this has ethical, epistemological or religious import (and often it does, even if the issue is outside of these formal areas).

Furthermore, we can benefit from criticism by people following other paradigms than ours. Such criticism may move us to switch to their paradigm. But it can benefit us even if it does not do that, by helping us find the optimal theory in our local region.

And, in any case, we philosophers are stubborn, and this stubbornness prevents convergence. This stubbornness may be individually harmful, by keeping us in less promising areas of the search space, but beneficial to the larger social epistemic practice by preventing premature convergence as in Algorithm 3.

Thus, stubbornness can be useful. But it needs to be humble. And that's really, really hard.