Wednesday, February 4, 2026

Algorithmic priors and human nature

One promising way to define priors is with algorithmic probability, such as Solomonoff priors. The idea is that we have a language L (say, one based on Turing machines), and we imagine generating random descriptions in L in a canonical way. E.g., add an end-of-string symbol to L, randomly and independently generate symbols until you hit the end-of-string symbol, conditionalize on the string uniquely describing a situation, and take the probability of a specific situation s to be the probability that a random description so generated describes s.
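Here is a minimal Monte Carlo sketch of that generation procedure, purely as a toy: the three-symbol alphabet and the describes() decoder are made-up stand-ins for L and its semantics, not anything the proposal itself specifies.

import random

# Toy alphabet for a hypothetical language L; "#" is the end-of-string symbol.
SYMBOLS = ["0", "1", "#"]

def random_description():
    """Generate symbols uniformly and independently until '#' appears."""
    out = []
    while True:
        c = random.choice(SYMBOLS)
        if c == "#":
            return "".join(out)
        out.append(c)

def describes(d):
    """Hypothetical decoder: maps a description to the situation it uniquely
    describes, or None if it describes nothing. As a stand-in, descriptions
    of even length describe one situation; odd-length ones describe nothing."""
    return "all ravens black" if d and len(d) % 2 == 0 else None

def prior(situation, trials=100_000):
    """Estimate P(situation): the probability that a random description
    describes it, conditional on describing some situation at all."""
    hits = valid = 0
    for _ in range(trials):
        s = describes(random_description())
        if s is not None:
            valid += 1
            hits += (s == situation)
    return hits / valid if valid else 0.0

print(prior("all ravens black"))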

These kinds of priors are rather appealing for science, since they appear to be induction-friendly, as they assign high probabilities to compressible (more briefly expressible) situations. Thus, if our situations are distributions of color among ravens, monochromatic distributions get much higher probability as they can be much more briefly described, like ∀x(B(x)) or ∀x(W(x)).
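To make the arithmetic vivid with a toy calculation: if symbols are generated uniformly and independently, a description's probability shrinks geometrically in its length, so the short universal formula gets vastly more prior weight than a raven-by-raven listing. A rough sketch, again with a made-up three-symbol alphabet:

def generation_prob(description, alphabet_size=3):
    # Probability of producing exactly this description and then the
    # end-of-string symbol, one uniform independent symbol at a time.
    return (1 / alphabet_size) ** (len(description) + 1)

short = "Ax.B(x)"  # toy shorthand for "all ravens are black"
long_ = "&".join("B(r{})".format(i) for i in range(1, 51))  # listing 50 ravens one by one

print(generation_prob(short))   # modest
print(generation_prob(long_))   # astronomically smaller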

Philosophically, I think the big problem is with the choice of the language. It would be nice if we could let L be a language that cuts nature exactly at the joints. But we don’t know that language. And absent that language, we need something arbitrary.

Here is a particular version of the problem. Take Kuhn’s division of science into ordinary and revolutionary science. One aspect of this division is that in ordinary science, we have a scientific language, and are discovering things within it. In that case, it is reasonable to take L to be that language. However, when we are doing revolutionary science and creating new paradigms, we cannot do that. The new paradigms either cannot be described in the old language or their description is unwieldy in a way that does not do justice to the plausibility of the new paradigm. Indeed, much of the point of revolutionary science is to create a language within which the description of the world is simpler, and then argue that this language is therefore more likely to cut nature at the joints.

Another version of this problem is what language L we choose when we are generating fundamental priors. Practically speaking, we cannot use a scientific language that cuts nature at the joints, because we have not yet discovered it. If this was merely a practical concern, we could try to say that this doesn’t matter: the fundamental priors are ones that we ought to have rather than any that we actually have or could have—perhaps ought does not imply can. But the concern is not merely practical. For one of the main points of our inductive reasoning is to discover what concepts cut nature at the joints, and this is largely an empirical enterprise. If the right fundamental priors were to reflect the joints in nature, then the enterprise wouldn’t make much sense, as we would be obligated to have already completed much of the enterprise before we started it.

So, I think, we have to say that L does not always cut nature at the joints, and yet it generates appropriate priors for us. But we still need a constraint on L. After all, we could imagine a language that thwarts our empirical enterprise, such as one where only fairies can be described briefly and anything else requires very long descriptions, so we have very high priors for fairies and very low priors for everything else. What will be the constraint? Practically, we pretty much have to start with some ordinary human language. I think our ideal should not be far from what is practical. Thus, I propose, if we are going to go with algorithmic priors, we should choose L to be a language that fits well with our human nature as communicators. This is an anthropocentric choice, and I think human epistemology is rightly anthropocentric.

But why think that the anthropocentric choice is apt to lead to truth? There are two stories to be told here. First, it may be that human nature requires a measure of trust in human nature. Second, that trust is vindicated if we are created by a good God who loves the truth.
