Thursday, March 6, 2025

Definitions

In the previous post, I offered a criticism of defining logical consequence by means of proofs. A more precise way to put my criticism would be:

  1. Logical consequence is equally well defined by (i) tree-proofs or by (ii) Fitch-proofs.

  2. If (1), then logical consequence is either correctly defined by (i) and correctly defined by (ii) or it is not correctly defined by either.

  3. If logical consequence is correctly defined by one of (i) and (ii), it is not correctly defined by the other.

  4. Logical consequence is not both correctly defined by (i) and correctly defined by (ii). (By 3)

  5. Logical consequence is neither correctly defined by (i) nor by (ii). (By 1, 2, and 4)

When writing the post I had a disquiet about the argument, which I think amounts to a worry that there are parallel arguments that are bad. Consider the parallel argument against the standard definition of a bachelor:

  6. A bachelor is equally well defined as (iii) an unmarried individual that is a man or as (iv) a man that is unmarried.

  7. If (6), then a bachelor is either correctly defined by (iii) and correctly defined by (iv) or it is not correctly defined by either.

  8. If a bachelor is correctly defined by one of (iii) and (iv), it is not correctly defined by the other.

  9. A bachelor is not both correctly defined by (iii) and correctly defined by (iv). (By 8)

  10. A bachelor is neither correctly defined by (iii) nor by (iv). (By 6, 7, and 9)

Whatever the problems of the standard definition of a bachelor (is a pope or a widower a bachelor?), this argument is not a problem. Premise (8) is false: there is no problem with saying that both (iii) and (iv) are good definitions, given that they are equivalent as definitions.

But now can’t the inferentialist say the same thing about premise (3) of my original argument?

No. Here’s why. That ψ has a tree-proof from ϕ is a different fact from the fact that ψ has a Fitch-proof from ϕ. It’s a different fact because it depends on the existence of a different entity—a tree-proof versus a Fitch-proof. We can put the point here in terms of grounding or truth-making: the grounds of one involve one entity and the grounds of the other involve a different entity. On the other hand, that Bob is an unmarried individual who is a man and that Bob is a man who is unmarried are the same fact, and have the same grounds: Bob’s being unmarried and Bob’s being a man.

Suppose one polytheist believes in two necessarily existing and essentially omniscient gods, A and B, and defines truth as what A believes, while her coreligionist defines truth as what B believes. The two thinkers genuinely disagree as to what truth is, since for the first thinker the grounds of a proposition’s being true are beliefs by A while for the second the grounds are beliefs by B. That necessarily each definition picks out the same truth facts does not save the definition. A good definition has to be hyperintensionally correct.

Logical consequence

There are two main accounts of ψ being a logical consequence of ϕ:

  • Inferentialist: there is a proof from ϕ to ψ.

  • Model theoretic: every model of ϕ is a model of ψ.

Both suffer from a related problem.

On inferentialism, the problem is that there are many different concepts of proof, all of which yield an equivalent relation between ϕ and ψ. First, we have a distinction as to how the structure of a proof is indicated: is it a tree, a sequence of statements set off by subproof indentation, or something else? Second, we have a distinction as to the choice of primitive rules. Do we, for instance, have only pure rules like disjunction-introduction, or do we allow mixed rules like De Morgan? Do we allow conveniences like ternary conjunction-elimination or idempotence? Which truth-functional symbols do we take as undefined primitives and which ones do we take as abbreviations for others (e.g., maybe we just have a Sheffer stroke)?

It is tempting to say that it doesn’t matter: any reasonable answers to these questions make exactly the same ψ be a logical consequence of the same ϕ.

Yes, of course! But that’s the point. All of these proof systems have something in common which makes them “reasonable”; other proof systems, like ones including the rule of arbitrary statement introduction, are not reasonable. What makes them reasonable is that the proofs they yield capture logical consequence: there is a proof in them from ϕ to ψ precisely when ψ logically follows from ϕ. The concept of logical consequence is thus something that goes beyond them.

None of these are the definition of proof. This is just like the point we learn from Benacerraf that none of the set-theoretic “constructions of the natural numbers” like 3 = {0, 1, 2} or 3 = {{{0}}} gives the definition of the natural numbers. The set theoretic constructions give a model of the natural numbers, but our interest is in the structure they all have in common. Likewise with proof.

The problem becomes even worse if we take a nominalist approach to proof like Goodman and Quine do, where proofs are concrete inscriptions. For then what counts as a proof depends on our latitude with regard to the choice of font!

The model theoretic approach has a similar issue. A model, on the modern understanding, is a triple (M,R,I) where M is a set of objects, R is a set of relations and I is an interpretation. We immediately have the Benacerraf problem that there are many set-theoretic ways to define triples, relations and interpretations. And, besides that, why should sets be the only allowed models?

One alternative is to take logical consequence to be primitive.

Another is not to worry, but to take the important and fundamental relation to be metaphysical consequence, and be happy with logical consequence being relative to a particular logical system rather than something absolute. We can still insist that not everything goes for logical consequence: some logical systems are good and some are bad. The good ones are the ones with the property that if ψ follows from ϕ in the system, then it is metaphysically necessary that if ϕ then ψ.

Wednesday, March 5, 2025

A praise-blame asymmetry

There is a certain kind of symmetry between praise and blame. We praise someone who incurs a cost to themselves by going above and beyond obligation and thereby benefitting another. We blame someone who benefits themselves by failing to fulfill an obligation and thereby harming another.

But here is a fun asymmetry to note. We praise the benefactor in proportion to the cost to the benefactor. But we do not blame the malefactor in proportion to the benefit to the malefactor. On the contrary, when the benefit to the malefactor is really small, we think the malefactor is more to be blamed.

Realism about arithmetical truth

It seems very plausible that for any specific Turing machine M there is a fact of the matter about whether M would halt. We can just imagine running the experiment in an idealized world with an infinite future, and surely either it will halt or it won’t halt. No supertasks are needed.

This commits one to realism about Σ1 arithmetical propositions: for every proposition expressible in the form ∃nϕ(n) where ϕ(n) has only bounded quantifiers, there is a fact of the matter whether the proposition is true. For there is a Turing machine that halts if and only if ∃nϕ(n).

But now consider a Π2 proposition, one expressible in the form ∀m∃nϕ(m,n), where again ϕ(m,n) has only bounded quantifiers. For each fixed m, there is a Turing machine Mm whose halting is equivalent to ∃nϕ(m,n). Imagine now a scenario where on day m of an infinite future you build and start Mm. Then there surely will be a fact of the matter whether each of these Turing machines will halt, a fact equivalent to ∀m∃nϕ(m,n).
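Here is a toy Python sketch of the day-by-day scenario just described. The “machines” are mere stand-ins (generators with a made-up halting condition), since no program can decide real halting; the point is only to picture the construction in which Mm is built and started on day m and every started machine is advanced one step each day.

    from math import isqrt

    def machine(m):
        # Toy stand-in for the Turing machine Mm: the generator stops (i.e. "halts")
        # iff m is a perfect square, and otherwise keeps yielding forever. A real Mm
        # would instead search for an n witnessing phi(m, n).
        while isqrt(m) ** 2 != m:
            yield

    def run(days):
        # On day m, build and start Mm; each day, advance every started machine by
        # one step; record which machines have been observed to halt so far.
        running, halted = {}, set()
        for day in range(days):
            running[day] = machine(day)
            for m, gen in list(running.items()):
                try:
                    next(gen)
                except StopIteration:
                    halted.add(m)
                    del running[m]
        return halted

    print(sorted(run(30)))  # [0, 1, 4, 9, 16, 25]: the machines seen to halt by day 30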

What about a Σ3 proposition, one expressible in the form ∃r∀m∃nϕ(r,m,n)? Well, we could imagine for each fixed r running the above experiment starting on day r in the future to determine whether the Π2 proposition ∀m∃nϕ(r,m,n) is true, and then there surely is a fact of the matter whether at least one of these experiments gives a positive answer.

And so on. Thus there is a fact of the matter whether any statement in the arithmetical hierarchy—and hence any statement in the language of arithmetic—is true or false.

This argument presupposes a realism about deterministic idealized machine counterfactuals: if I were to build such and such a sequence of deterministic idealized machines, they would behave in such and such a way.

The argument also presupposes that we have a concept of the finite and of countable infinity: it is essential that our Turing machines be run for a countable sequence of steps in the future and that the tape begin with a finite number of symbols on it. If we have causal finitism, we can get the concept of the finite out of the metaphysics of the world, and a discrete future-directed causal sequence of steps is guaranteed to be countable.

Tuesday, March 4, 2025

Degrees of gratitude

How grateful x should be to y for ϕing depends on:

  1. The expected benefit to x

  2. The actual benefit to x

  3. The expected cost to y

  4. The actual deontic status of y’s ϕing

  5. The believed deontic status of y’s ϕing.

The greater the expected benefit, the greater the appropriate gratitude. Zeroing the expected benefit zeroes the appropriate gratitude: if someone completely accidentally benefited me, no gratitude is appropriate.

I think the actual benefit increases the appropriate gratitude, even when the expected benefit is fixed. If you try to do something nice for me, I owe you thanks, but I owe even more thanks when I am an actual beneficiary. However, zeroing the actual benefit does not zero the appropriate gratitude—I should still be grateful for your trying.

The more costly the gift to the giver, the more gratitude is appropriate. But zeroing the cost does not zero the appropriate gratitude: I owe God gratitude for creating me even though it took no effort. I think that in terms of costs, it is only the expected and not the actual cost that matters for determining the appropriate gratitude. If you bring flowers to your beloved and slip and fall on the way back from the florist and break your leg, it doesn’t seem to me that more gratitude is appropriate.

I think of deontic status here as on a scale that includes four ranges:

  1. Wrong (negative)

  2. Merely permissible (neither obligatory nor supererogatory) (zero)

  3. Obligatory (positive)

  4. Supererogatory (super positive)

In cases where both the actual and believed deontic status fall in category (1), no gratitude is appropriate. Gratitude is only appropriate for praiseworthy actions.

The cases of supererogation call for more gratitude than the cases of obligation, other things being equal. But nonetheless cases of obligatory benefiting also call for gratitude. While y might say “I just did my job”, that fact does not undercut the need for gratitude.

Cases where believed and actual deontic status come apart are complicated. Suppose that a do-not-resuscitate order is written in messy handwriting, and a doctor misreads it as a resuscitate order, and then engages in heroic effort to resuscitate, succeeds, and in fact benefits the patient. (Maybe the patient thought that they would not be benefited by resuscitation, but in fact they are.) I think gratitude is appropriate, even if the action was actually wrong.

There is presumably some very complicated function from factors (1)–(5) (and perhaps others) to the degree of appropriate gratitude.

I am really grateful to Juliana Kazemi for a conversation on relevant topics.

Wednesday, February 26, 2025

Against full panpsychism

I have access to two kinds of information about consciousness: I know the occasions on which I am conscious and the occasions on which I am not. Focusing on the second, we get this argument:

  1. If panpsychism is true, everything is always conscious.

  2. In dreamless sleep, I exist and am not conscious.

  3. So, panpsychism is false.

One response is to retreat to a weaker panpsychism on which everything is either conscious or has a conscious part. On the weaker panpsychism, one can say that in dreamless sleep, I have some conscious parts, say particles in my big toe.

But suppose we want to stick to full panpsychism that holds that everything is always conscious. This leaves two options.

First, one could deny that we exist in dreamless sleep. But if we don’t exist in dreamless sleep, then it is not possible to murder someone in dreamless sleep, and yet it obviously is.

Second, one could hold that we are conscious in dreamless sleep but the consciousness is not recorded to memory. This seems a dubious skeptical hypothesis. But let’s think about it a bit more. Presumably, the same applies under general anaesthesia. Now, while I’m far from expert on this, it seems plausible that the brain functioning under general anaesthesia is a proper subset of my present brain functioning. This makes it plausible that my experiences under general anaesthesia are a proper subset of my present wakeful experiences. But none of my present wakeful experiences—high level cognition, sensory experience, etc.—are a plausible candidate for an experience that I might have under general anaesthesia.

Tuesday, February 25, 2025

Being known

The obvious analysis of “p is known” is:

  1. There is someone who knows p.

But this obvious analysis doesn’t seem correct, or at least there is an interesting use of “is known” that doesn’t fit (1). Imagine a mathematics paper that says: “The necessary and sufficient conditions for q are known (Smith, 1967).” But what if the conditions are long and complicated, so that no one can keep them all in mind? What if no one who read Smith’s 1967 paper remembers all the conditions? Then no one knows the conditions, even though it is still true that the conditions “are known”.

Thus, (1) is not necessary for a proposition to be known. Nor is this a rare case. I expect that more than half of the mathematics articles from half a century ago contain some theorem or at least lemma that is known but which no one knows any more.

I suspect that (1) is not sufficient either. Suppose Alice is dying of thirst on a desert island. Someone, namely Alice, knows that she is dying of thirst, but it doesn’t seem right to say that it is known that she is dying of thirst.

So if it is neither necessary nor sufficient for p to be known that someone knows p, what does it mean to say that p is known? Roughly, I think, it has something to do with accessibility. Very roughly:

  2. Somebody has known p, and the knowledge is accessible to anyone who has appropriate skill and time.

It’s really hard to specify the appropriateness condition, however.

Does all this matter?

I suspect so. There is a value to something being known. When we talk of scientists advancing “human knowledge”, it is something like this “being known” that we are talking about.

Imagine that a scientist discovers p. She presents p at a conference where 20 experts learn p from her. Then she publishes it in a journal, where 100 more people learn it. Then a Youtuber picks it up and now a million people know it.

If we understand the value of knowledge as something like the sum of epistemic utilities across humankind, then the successive increments in value go like this: first, we have a move from zero to some positive value V when the scientist discovers p. Then at the conference, the value jumps from V to 21V. Then after publication it goes from 21V to 121V. Then given Youtube, it goes from 121V to 1000121V. The jump at initial discovery is by far the smallest, and the biggest leap is when the discovery is publicized. This strikes me as wrong. The big leap in value is when p becomes known, which either happens when the scientist discovers it or when it is presented at the conference. The rest is valuable, but not so big in terms of the value of “human knowledge”.

Monday, February 24, 2025

Epistemically paternalistic lies

Suppose Alice and Bob are students and co-religionists. Alice is struggling with a subject and asks Bob to pray that she might do fine on the exam. She gets 91%. Alice also knows that Bob’s credence in their religion is a bit lower than her own. When Bob asks her how she did, she lies that she got 94%, in order to boost Bob’s credence in their religion a bit more.

Whether a religion is correct is very epistemically important to Bob. But whether Alice got 91% or 94% is not at all epistemically important to Bob except as evidence for whether the religion is correct. The case can be so set up that by Alice’s lights—remember, she is more confident that the religion is correct than Bob is—Bob can be expected to be better off epistemically for boosting his credence in the religion. Moreover, we can suppose that there is no plausible way for Bob to find out that Alice lied. Thus, this is an epistemically paternalistic lie expected to make Bob be better off epistemically.

And this lie is clearly morally wrong. Thus, our communicative behavior is not merely governed by maximization of epistemic utility.

More on averaging to combine epistemic utilities

Suppose that the right way to combine epistemic utilities across people is averaging: the overall epistemic utility of the human race is the average of the individual epistemic utilities. Suppose, further, that each individual epistemic utility is strictly proper, and you’re a “humanitarian” agent who wants to optimize overall epistemic utility.

Suppose you’re now thinking about two hypotheses about how many people exist: the two possible numbers are m and n, which are not equal. All things considered, you have credence 0 < p0 < 1 in the hypothesis Hm that there are m people and 1 − p0 in the hypothesis Hn that there are n people. You now want to optimize overall epistemic utility. On an averaging view, if Hm is true and your credence in Hm is p1, your contribution to overall epistemic utility will be:

  • (1/m)T(p1)

and if Hm is false, your contribution will be:

  • (1/n)F(p1),

where your strictly proper scoring rule is given by T, F. Since your credence is p0, by your lights the expected value after changing your credence to p1 will be:

  • p0(1/m)T(p1) + (1−p0)(1/n)F(p1) + Q(p0)

where Q(p0) is the contribution of other people’s credences, which I assume you do not affect with your choice of p1. If m ≠ n and T, F is strictly proper, the expected value will be maximized at

  • p1 = (p0/m)/(p0/m+(1−p0)/n) = np0/(np0+m(1−p0)).

If m > n, then p1 < p0 and if m < n, then p1 > p0. In other words, as long as n ≠ m, if you’re an epistemic humanitarian aiming to improve overall epistemic utility, any credence strictly between 0 and 1 will be unstable: you will need to change it. And indeed your credence will converge to 0 if m > n and to 1 if m < n. This is absurd.
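Here is a small numerical sketch of that instability (the optimal p1 above does not depend on which strictly proper rule is used, so no scoring rule appears in the code); m = 10 and n = 2 are made-up values:

    # Iterate the "humanitarian-optimal" update p -> n*p / (n*p + m*(1-p)).
    # With m = 10 > n = 2 the credence in Hm collapses toward 0; with m < n it
    # would instead climb toward 1. (Purely illustrative numbers.)
    m, n = 10, 2
    p = 0.5
    for step in range(8):
        p = n * p / (n * p + m * (1 - p))
        print(step + 1, round(p, 8))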

I conclude that we shouldn’t combine epistemic utilities across people by averaging the utilities.

Idea: What about combining them by applying a strictly proper scoring rule to the average of the individual credences, in effect imagining that humanity is one big committee and that the committee’s credence is the average of the individual credences?

This is even worse, because it leads to problems even without considering hypotheses on which the number of people varies. Suppose that you’ve just counted some large number nobody cares about, such as the number of cars crossing some intersection in New York City during a specific day. The number you got is even, but because the number is big, you might well have made a mistake, and so your credence that the number is even is still fairly low, say 0.7. The billions of other people on earth all have credence 0.5, and because nobody cares about your count, you won’t be able to inform them of your “study”, and their credences won’t change.

If combined epistemic utility is given by applying a proper scoring rule to the average credence, then by your lights the expected value of the combined epistemic utility will increase the bigger you can budge the average credence, as long as you don’t get it above your credence. Since you can really only affect your own credence, as an epistemic humanitarian your best bet is to set your credence to 1, thereby increasing overall human credence from 0.5 to around 0.5000000001, and making a tiny improvement in the expected value of the combined epistemic utility of humankind. In doing so, you sacrifice your own epistemic good for the epistemic good of the whole. This is absurd!
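To see how this plays out numerically, here is a quick sketch with the Brier score; the population figure and the other numbers are purely illustrative:

    # Combined epistemic utility = Brier score of the *average* credence.
    # By your lights (credence 0.7 that the count is even), pushing your reported
    # credence to 1 nudges the average up a hair and slightly raises the expectation.
    def brier(p, t):
        return 1 - (t - p) ** 2

    N = 8_000_000_000           # rough number of people, for illustration
    others = 0.5                # everyone else's credence
    my_credence = 0.7           # your actual credence that the count is even

    def expected_combined_score(reported):
        avg = ((N - 1) * others + reported) / N
        return my_credence * brier(avg, 1) + (1 - my_credence) * brier(avg, 0)

    print(expected_combined_score(1.0) - expected_combined_score(0.7))  # tiny but positive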

I think the idea of averaging to produce overall epistemic utilities is just wrong.

Friday, February 21, 2025

Adding or averaging epistemic utilities?

Suppose for simplicity that everyone is a good Bayesian and has the same priors for a hypothesis H, and also the same epistemic interests with respect to H. I now observe some evidence E relevant to H. My credence now diverges from everyone else’s, because I have new evidence. Suppose I could share this evidence with everyone. It seems obvious that if epistemic considerations are the only ones, I should share the evidence. (If the priors are not equal, then considerations in my previous post might lead me to withhold information, if I am willing to embrace epistemic paternalism.)

Besides the obvious value of revealing the truth, here are two ways to reason for this highly intuitive conclusion.

First, good Bayesians will always expect to benefit from more evidence. If my place and that of some other agent, say Alice, were switched, I’d want the information regarding E to be released. So by the Golden Rule, I should release the information.

Second, good Bayesians’ epistemic utilities are measured by a strictly proper scoring rule. Suppose Alice’s epistemic utilities for H are measured by a strictly proper (accuracy) scoring rule s that assigns an epistemic utility s(p,t) to a credence p when the actual truth value of H is t, which can be zero or one. By definition of strict propriety, the expectation by my lights of Alice’s epistemic utility for a given credence is strictly maximized when that credence equals my credence. Since Alice shares the priors I had before I observed E, if I can make E evident to her, her new posteriors will match my current ones, and so revealing E to her will maximize my expectation of her epistemic utility.

So far so good. But now suppose that the hypothesis H = HN is that there exist N people other than me, and my priors assign probability 1/2 to the number being N and 1/2 to its being n, where N is much larger than n. Suppose further that my evidence E ends up significantly supporting hypothesis Hn, so that my posterior p in HN is smaller than 1/2.

Now, my expectation of the total epistemic utility of other people if I reveal E is:

  • UR = pNs(p,1) + (1−p)ns(p,0).

And if I conceal E, my expectation is:

  • UC = pNs(1/2,1) + (1−p)ns(1/2,0).

If we had N = n, then it would be guaranteed by strict propriety that UR > UC, and so I should reveal. But we have N > n. Moreover, s(1/2,1) > s(p,1): if some hypothesis is true, a strictly proper accuracy scoring rule increases strictly monotonically with the credence. If N/n is sufficiently large, the first terms of UR and UC will dominate, and hence we will have UC > UR, and thus I should conceal.
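A quick numerical check with the Brier score; the values of N, n and p below are made-up illustrative numbers:

    # UR: expected total epistemic utility of the others if I reveal E (their
    # credence in HN becomes my posterior p); UC: if I conceal (it stays 1/2).
    def brier(p, t):
        return 1 - (t - p) ** 2

    N, n, p = 1000, 10, 0.3     # many-people hypothesis, few-people hypothesis, my posterior in HN

    UR = p * N * brier(p, 1) + (1 - p) * n * brier(p, 0)
    UC = p * N * brier(0.5, 1) + (1 - p) * n * brier(0.5, 0)
    print(UR, UC)               # UC > UR: summing utilities tells me to conceal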

The intuition behind this technical argument is this. If I reveal the evidence, I decrease people’s credence in HN. If it turns out that the number of people other than me actually is N, I have done a lot of harm, because I have decreased the credence of a very large number N of people. Since N is much larger than n, this consideration trumps considerations of what happens if the number of people is n.

I take it that this is the wrong conclusion. On epistemic grounds, if everyone’s priors are equal, we should release evidence. (See my previous post for what happens if priors are not equal.)

So what should we do? Well, one option is to opt for averaging rather than summing of epistemic utilities. But the problem reappears. For suppose that I can only communicate with members of my own local community, and we as a community have equal credence 1/2 for the hypothesis Hn that our local community of n people contains all agents, and credence 1/2 for the hypothesis Hn+N that there is also a number N of agents outside our community much greater than n. Suppose, further, that my priors are such that I am certain that all the agents outside our community know the truth about these hypotheses. I receive a piece of evidence E disfavoring Hn and leading to credence p < 1/2. Since my revelation of E only affects the members of my own community, if p is my credence in Hn after updating on E, the relevant part of the expected contribution to utility of revealing E with regard to hypothesis Hn is:

  • UR = p((n−1)/n)s(p,1) + (1−p)((n−1)/(n+N))s(p,0).

And if I conceal E, my expected contribution is:

  • UC = p((n−1)/n)s(1/2,1) + (1−p)((n−1)/(n+N))s(1/2,0).

If N is sufficiently large, again UC will beat UR.

I take it that there is something wrong with epistemic utilitarianism.

Bayesianism and epistemic paternalism

Suppose that your priors for some hypothesis H are 3/4 while my priors for it are 1/2. I now find some piece of evidence E for H which raises my credence in H to 3/4 and would raise yours above 3/4. If my concern is for your epistemic good, should I reveal this evidence E?

Here is an interesting reason for a negative answer. For any strictly proper (accuracy) scoring rule, my expected value for the score of a credence is uniquely maximized when the credence is 3/4. I assume your epistemic utility is governed by a strictly proper scoring rule. So the expected epistemic utility, by my lights, of your credence is maximized when your credence is 3/4. But if I reveal E to you, your credence will go above 3/4. So I shouldn’t reveal it.
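A minimal Brier-score check of this reasoning (function names and numbers are mine, purely illustrative): with my credence 3/4, the expected score I assign to your credence peaks at 3/4 and declines above it.

    def brier(p, t):
        return 1 - (t - p) ** 2

    def my_expected_score_of(your_credence, my_credence=0.75):
        # Expected epistemic utility of your credence, computed by my lights.
        return my_credence * brier(your_credence, 1) + (1 - my_credence) * brier(your_credence, 0)

    for q in (0.75, 0.85, 0.9, 0.95):
        print(q, round(my_expected_score_of(q), 4))
    # 0.75 (your credence if I conceal) scores best by my lights; anything higher scores lower.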

This is epistemic paternalism. So, it seems, expected epistemic utility maximization (which I take it has to employ a strictly proper scoring rule) forces one to adopt epistemic paternalism. This is not a happy conclusion for expected epistemic utility maximization.

Tuesday, February 18, 2025

An example of a value-driven epistemological approach to metaphysics

  1. Everything that exists is intrinsically valuable.

  2. Shadows and holes are not intrinsically valuable.

  3. So, neither shadows nor holes exist.

Monday, February 17, 2025

Incompleteness

For years in my logic classes I’ve been giving a rough but fairly accessible sketch of the fact that there are unprovable arithmetical truths (a special case of Tarski’s indefinability of truth), using an explicit Goedel sentence using concatenation of strings of symbols rather than Goedel encoding and the diagonal lemma.

I’ve finally revised the sketch to give the full First Incompleteness theorem, using Rosser’s trick. Here is a draft.

Friday, February 14, 2025

What numbers could be

Benacerraf famously argued that no set theoretic reduction can capture the natural numbers. While one might conclude from this that the natural numbers are some kind of sui generis entities, Benacerraf instead opts for a structuralist view on which different things can play the role of different numbers.

The argument that no set theoretic reduction captures the natural numbers is based on thinking about two common reductions. On both, 0 is the empty set ⌀. But then the two accounts differ in how the successor sn of a number n is formed:

  a. sn = n ∪ {n}

  b. sn = {n}.

On the first account, the number 5 is equal to the set {0, 1, 2, 3, 4}. On the second account, the number 5 is equal to the singleton {{{{{⌀}}}}}. Benacerraf thinks that we couldn’t imagine a good argument for preferring one account over another, and hence (I don’t know how this is supposed to follow) there can’t be a fact of the matter that one account—or any other set-theoretic reductive account—is correct.

But I think there is a way to adjudicate different set-theoretic reductions of numbers. Plausibly, there is reference magnetism to simpler referents of our terminology. Consider an arithmetical structure as consisting of a set of natural numbers, a relation <, and two operations + and ⋅, satisfying some axioms. We might then say that our ordinary language arithmetic is attracted to the abstract entities that are most simply defined in terms of the fundamental relations. If the only relevant fundamental relation is set membership ∈, then we can ask which of the two accounts (a) and (b) more simply defines <, + and ⋅.

If simplicity is brevity of expression in first order logic, then this can be made a well-defined mathematical question. For instance, on (a), we can define a < b as a ∈ b. One provably cannot get briefer than that. (Any definition of a < b will need to contain a, b and ∈.) On the other hand, on (b), there is no way to define a < b as simply. Now it could turn out that + or ⋅ can be defined more simply on (b), in a way that offsets (a)’s victory with <, but it seems unlikely to me. So I conjecture that on the above account, (a) beats (b), and so there is a way to decide between the two reductions of numbers—(b) is the wrong one, while (a) at least has a chance of being right, unless there is a third that gives a simpler reduction.
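Here is a small sketch of the contrast, with Python frozensets standing in for sets and the standard von Neumann and Zermelo constructions implementing (a) and (b); it just checks that membership tracks < on (a) but not on (b):

    EMPTY = frozenset()

    def von_neumann(n):          # reduction (a): s(x) = x ∪ {x}
        x = EMPTY
        for _ in range(n):
            x = x | frozenset([x])
        return x

    def zermelo(n):              # reduction (b): s(x) = {x}
        x = EMPTY
        for _ in range(n):
            x = frozenset([x])
        return x

    # On (a), "a < b" holds exactly when the code for a is a member of the code for b.
    assert all((von_neumann(a) in von_neumann(b)) == (a < b)
               for a in range(7) for b in range(7))

    # On (b), membership does not track <: the code for a is in the code for b
    # only when b = a + 1.
    assert zermelo(4) in zermelo(5) and zermelo(2) not in zermelo(5)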

In any case, on this picture, there is a way forward in the debate, which undercuts Benacerraf’s claim that there is no way forward.

I am not endorsing this. I worry about the multiplicity of first-order languages (e.g., infix-notation FOL vs. Polish-notation FOL).

Tuesday, February 11, 2025

Theistic Humeanism?

Here’s an option that is underexplored: theistic Humeanism. There are two paths to it.

The path from orthodoxy: Start with a standard theistic concurrentism: whenever we have a creaturely cause C with effect E, E only eventuates because God concurs, i.e., God cooperates with the creaturely causal relation. Now add to this a story about what creaturely causation is. This will be a Humean story—the best I know is the David Lewis one that reduces causation to laws and laws to arrangements of stuff. Keep all the deep theistic metaphysics of divine causation.

The path from heterodoxy: Start with the metaphysics of occasionalism. Don’t change any of the metaphysics. But now add a Humean analysis of creaturely causation in terms of regularities. Since the metaphysics of occasionalism affirms regularities in the world, we haven’t changed the metaphysics of occasionalism, but have redescribed it as actually involving creaturely causation.

The two paths meet in a single view, a theistic Humeanism with the metaphysics of occasionalism and the language of concurrentism, and with creaturely causation described in a Humean way.

This theistic Humeanism is more complex than standard non-theistic Humeanism, but overcomes the central problem with non-theistic Humeanism: the difficulty of finding explanation in nature. If the fact that heat causes boiling is just a statement of regularity, it does not seem that heat explains boiling. But on theistic Humeanism, we have a genuine explanatory link: God makes the water boil because God is aware of the heat.

There is one special objection to theistic Humeanism. It has two causal relations, a divine one and a creaturely one. But the two are very different—they don’t both seem to be kinds of causation. However, on some orthodox concurrentisms, such as Aquinas’s, there isn’t a single kind of thing that divine and creaturely causation are species of. Instead, the two stand in an analogical relationship. Couldn’t the theistic Humean say the same thing? Maybe, though one might also object that Humean creaturely causation is too different from divine causation for the two to count as analogous.

I suppose the main objection to theistic Humeanism is that it feels like a cheat. The creaturely causation seems fake. The metaphysics is that of occasionalism, and there is no creaturely causation there. But if theistic Humeanism is a cheat, then standard non-theistic Humeanism is as well, since they share the same metaphysics of creaturely causation. If non-theistic Humeanism really does have causation, then our theistic Humeanism really does have creaturely causation. If one has fake causation, so does the other. I think both have fake causation. :-)

Monday, February 10, 2025

Autonomy and relativism

Individual relativism may initially seem to do justice to the idea of our autonomy: our moral rules are set by ourselves. But this attractiveness of relativism disappears as soon as we realize that our beliefs are largely not up to us—that, as the saying goes, we catch them like we catch the flu. This seems especially true of our moral beliefs, most of which are inherited from our surrounding culture. Thus, what individual relativism gives to us in terms of autonomy is largely taken away by reflection on our beliefs.

Tuesday, February 4, 2025

Asymmetry between moral and physical excellence

We can use a Mahatma Gandhi or a Mother Teresa as a moral exemplar to figure out what our virtues should be. But we cannot use an Usain Bolt or a Serena Williams as a physical exemplar to figure out what our physical capabilities should be. Why this disanalogy between moral and physical excellence?

It’s our intuition that Bolt and Williams exceed the physical norms for humans to a significant degree. But although Gandhi and Mother Teresa did many supererogatory things, I do not think they overall exceed the moral norms for human character to a significant degree. We should be like them, and our falling short is largely our fault.

My LaTeX "ide"

I haven’t found a LaTeX IDE that I am happy with (texmaker comes close, but I don’t like the fact that it doesn’t properly underline the trigger letter in menus, even if Windows is set to do that), and so I ended up defaulting to just editing my book and papers with notepad++ and running pdflatex manually. But it’s a bit of a nuisance to get the preview: ctrl-s to save, alt-tab to command-line, up-arrow and enter to re-run pdflatex, alt-tab to pdf viewer. So I wrote a little python script that watches my current directory and if any .tex file changes in it, it re-runs pdflatex. So now it’s just ctrl-s, alt-tab to get the preview. I guess it’s only four keystrokes saved, but it feels more seamless and straightforward. The script also launches notepad++ and my pdf viewer at the start of the session to save me some typing.
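For what it’s worth, a minimal sketch of that kind of watcher (not the exact script) could look like this:

    # Poll the current directory; whenever a .tex file's modification time changes,
    # re-run pdflatex on it. Launching the editor and PDF viewer is left out here.
    import glob, os, subprocess, time

    def mtimes():
        return {f: os.path.getmtime(f) for f in glob.glob("*.tex")}

    last = mtimes()
    while True:
        time.sleep(0.5)
        current = mtimes()
        for f, t in current.items():
            if last.get(f) != t:
                subprocess.run(["pdflatex", "-interaction=nonstopmode", f])
        last = current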

Thursday, January 30, 2025

Teleology and the normal/abnormal distinction

Believers in teleology also tend to believe in a distinction between the normal and the abnormal. I think teleology can be prised apart from a normal/abnormal distinction, however, if we do something that I think we should do for independent reasons: recognize teleological directedness without a telos-to-be-attained, a target to be hit. An example of such teleological directedness is an athlete trying to run as fast as possible. There isn’t a target telos: for any speed the athlete reaches, a higher speed would fit even better with the athlete’s aims. But there is a directional telos, an aim telos: the athlete aims in the direction of higher speed.

One might then say the human body in producing eyes has a directional telos: to see as well as possible. Whether one has 20/20 or 20/15 or 20/10 vision, more acuity would fulfill that directional telos better. On this view, there is no target telos, just a direction towards better acuity. If there were a target telos, say a specific level of acuity, we could identify non-attainment with abnormalcy and attainment with normalcy. But we need not. We could just say that this is all a matter of degree, with continuous variation between 20/0 (not humanly available) and 20/∞ (alas humanly available, i.e., total blindness).

I am not endorsing the view that there is no normal/abnormal in humans. I think there is (e.g., an immoral action is abnormal; a moral action is normal). But perhaps the distinction is less often applicable than friends of teleology think.

Wednesday, January 29, 2025

More on experiments

We all perform experiments very often. When I hear a noise and deliberately turn my head, I perform an experiment to find out what I will see if I turn my head. If I ask a question not knowing what answer I will hear, I am engaging in (human!) experimentation. Roughly, experiments are actions done in order to generate observations as evidence.

There are typically differences in rigor between the experiments we perform in daily life and the experiments scientists perform in the lab, but only typically so. Sometimes we are rigorous in ordinary life and sometimes scientists are sloppy.

In a Bayesian framework, the epistemic value of an experiment to one depends on multiple factors.

  1. The set of questions towards answers to which the experiment’s results are expected to contribute.

  2. Specifications of the value of different levels of credence regarding the answers to the questions in Factor 1.

  3. One’s prior levels of credence for the answers.

  4. The likelihoods of different experimental outcomes given different answers.

It is easiest to think of Factor 2 in practical terms. If I am thinking of going for a recreational swim but I am not sure whether my swim goggles have sprung a leak, it may be that if the probability of the goggles being sound is at least 50%, it’s worth going to the trouble of heading out for the pool, but otherwise it’s not. So an experiment that could only yield a 45% confidence in the goggles is useless to my decision whether to go to the pool, and there is no difference in value between an experiment that yields a 55% confidence and one that yields a 95% confidence. On the other hand, if I am an astronaut and am considering performing a non-essential extravehicular task, but I am worried that the only available spacesuit might have sprung a leak, an experiment that can only yield 95% confidence in the soundness of the spacesuit is pointless—if my credence in the spacesuit’s soundness is only 95%, I won’t use the spacesuit.

Factor 3 is relevant in combination with Factor 4, because these two factors tell us how likely I am to end up with different posterior probabilities for the answers to the Factor 1 questions after the experiment. For instance, if I saw that one of my goggles is missing its gasket, my prior credence in the goggle’s soundness is so low that even a positive experimental result (say, no water in my eye after submerging my head in the sink) would not give me 50% credence that the goggle is fine, and so the experiment is pointless.
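For concreteness, the gasket case as a small Bayes computation; the prior and the likelihoods are made-up numbers:

    # Posterior that the goggles are sound after a positive sink test.
    def posterior(prior, p_pass_if_sound, p_pass_if_leaky):
        num = prior * p_pass_if_sound
        return num / (num + (1 - prior) * p_pass_if_leaky)

    print(posterior(0.05, 0.95, 0.20))   # ≈ 0.20: still well under the 50% threshold,
                                         # so the experiment is pointless for the decision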

In a series of posts over the last couple of days, I explored the idea of a somewhat interest-independent comparison between the values of experiments, where one still fixes a set of questions (Factor 1), but says that one experiment is at least as good as another provided that it has at least as good an expected epistemic utility as the other for every proper scoring rule (Factor 2). This comparison criterion is equivalent to one that goes back to the 1950s. This is somewhat interest-independent, because it is still relativized to a set of questions.

A somewhat interesting question that occurred to me yesterday is what effect Factor 3 has on this somewhat interest-independent comparison of experiments. If experiment E2 is at least as good as experiment E1 for every scoring rule on the question algebra, is this true regardless of which consistent and regular priors one has on the question algebra?

A bit of thought showed me a somewhat interesting fact. If there is only one binary (yes/no) question under Factor 1, then it turns out that the somewhat interest-independent comparison of experiments does not depend on the prior probability for the answer to this question (assuming it’s regular, i.e., neither 0 nor 1). But if the question algebra is any larger, this is no longer true. Now, whether an experiment is at least as good as another in this somewhat interest-independent way depends on the choice of priors in Factor 3.

We might now ask: Under what circumstances is an experiment at least as good as another for every proper scoring rule and every consistent and regular assignment of priors on the answers, assuming the question algebra has more than two non-trivial members? I suspect this is a non-trivial question.

Tuesday, January 28, 2025

And one more post on comparing experiments

In my last couple of posts, starting here, I’ve been thinking about comparing the epistemic quality of experiments for a set of questions. I gave a complete geometric characterization for the case where the experiments are binary—each experiment has only two possible outcomes.

Now I want to finally note that there is a literature for the relevant concepts, and it gives a characterization of the comparison of the epistemic quality of experiments, at least in the case of a finite probability space (and in some infinite cases).

Suppose that Ω is our probability space with a finite number of points, and that FQ is the algebra of subsets of Ω corresponding to the set of questions Q (a question partitions Ω into subsets and asks which partition we live in; the algebra FQ is generated by all these partitions). Let X be the space of all probability measures on FQ. This can be identified with an (n−1)-dimensional subset of Euclidean Rn consisting of the points with non-negative coordinates summing to one, where n is the number of atoms in FQ. An experiment E also corresponds to a partition of Ω—it answers the question where in that partition we live. The experiment has some finite number of possible outcomes A1, ..., Am, and in each outcome Ai our Bayesian agent will have a different posterior PAi = P(⋅∣Ai). The posteriors are members of X. The experiment defines an atomic measure μE on X where μE(ν) is the probability that E will generate an outcome whose posterior matches ν on FQ. Thus:

  • μE(ν) = P(⋃{Ai:PAi|FQ=ν}).

Given the correspondence between convex functions and proper scoring rules, we can see that experiment E2 is at least as good as E1 for Q just in case for every convex function c on X we have:

  • ∫XcdμE2 ≥ ∫XcdμE1.

There is an accepted name for this relation: μE2 convexly dominates μE1. Thus, we have it that experiment E2 is at least as good as experiment E1 for Q provided that there is a convex domination relation between the distributions the experiments induce on the possible posteriors for the questions in Q. And it turns out that there is a known mathematical characterization of when this happens, and it includes some infinite cases as well.

In fact, the work on this epistemic comparison of experiments turns out to go back to a 1953 paper by Blackwell. The only difference is that Blackwell (following 1950 work by Bohnenblust, Karlin and Sherman) uses non-epistemic utility while my focus is on scoring rules and epistemic utility. But the mathematics is the same, given that non-epistemic decision problems correspond to proper scoring rules and vice versa.

Comparing binary experiments for non-binary questions

In my last two posts (here and here), I introduced the notion of an experiment being epistemically at least as good as another for a set of questions. I then announced a characterization of when this happens in the special case where the set of questions consists of a single binary (yes/no) question and the experiments are themselves binary.

The characterization was as follows. A binary experiment will result in one of two posterior probabilities for the hypothesis that our yes/no question concerns, and we can form the “posterior interval” between them. It turns out that one experiment is at least as good as another provided that the first one’s posterior interval contains the second one’s.

I then noted that I didn’t know what to say for non-binary questions (e.g., “How many mountains are there on Mars?”) but still binary experiments. Well, with a bit of thought, I think I now have it, and it’s almost exactly the same. A binary experiment now defines a “posterior line segment” in the space of probabilities, joining the two possible credence outcomes. (In the case of a probability space with a finite number n of points, the space of probabilities can be identified as the set of points in n-dimensional Euclidean space all of whose coordinates are non-negative and add up to 1.) A bit of thought about convex functions makes it pretty obvious that E2 is at least as good as E1 if and only if E2’s posterior line segment contains E1’s posterior line segment. (The necessity of this geometric condition is easy to see: consider a convex function that is zero everywhere on E2’s posterior line segment but non-zero on one of E1’s two possible posteriors, and use that convex function to generate the scoring rule.)

This is a pretty hard condition to satisfy. The two experiments have to be pretty carefully gerrymandered to make their posterior line segments be parallel, much less to make one a subset of the other. I conclude that when one’s interest is in more than just one binary question, one binary experiment will not be overall better than another except in very special cases.

Recall that my notion of “better” quantified over all proper scoring rules. I guess the upshot of this is that interesting comparisons of experiments are not only relative to a set of questions but to a specific proper scoring rule.

Monday, January 27, 2025

Comparing binary experiments for binary questions

In my previous post I introduced the notion of an experiment being better than another experiment for a set of questions, and gave a definition in terms of strictly proper (or strictly open-minded, which yields the same definition) scoring rules. I gave a sufficient condition for E2 to be at least as good as E1: E2’s associated partition is essentially at least as fine as that of E1.

I then ended with an open question as to what the necessary and sufficient conditions are for a binary (yes/no) experiment to be at least as good as another binary one for a binary question.

I think I now have an answer. For a binary experiment E and a hypothesis H, say that E’s posterior interval for H is the closed interval joining P(H∣E) with P(H∣∼E). Then, I think:

  • Given the binary question whether a hypothesis H is true, and binary experiments E1 and E2, experiment E2 is at least as good as E1 if and only if its posterior interval for H contains E1’s posterior interval for H.

Let’s imagine that you want to be confident of H, because H is nice. Then the above condition says that an experiment that’s better than another will have at least as big a potential benefit (i.e., confidence in H) and at least as big a potential risk (i.e., confidence in ∼H). No benefits without risks in the epistemic game!

The proof (which I only have a sketch of) follows from expressing the expected score after an experiment using formula (4) here, and using convexity considerations.

The above answer doesn’t work for non-binary experiments. The natural analogue to the posterior interval is the convex hull of the set of possible posteriors. But now imagine two experiments to determine whether a coin is fair or double-headed. The first experiment just tosses the coin and looks at the answer. The second experiment tosses an auxiliary independent and fair coin, and if that one comes out heads, then the coin that we are interested in is tossed. The second experiment is worse, because there is probability 1/2 that the auxiliary coin is tails in which case we get no information. But the posterior interval is the same for both experiments.
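Here is a quick numerical check of this example with the Brier score, taking the prior that the coin is fair to be 1/2:

    def brier(p, t):
        return 1 - (t - p) ** 2

    def expected_score(outcomes):
        # outcomes: list of (probability of the outcome, posterior for H given it)
        return sum(prob * (post * brier(post, 1) + (1 - post) * brier(post, 0))
                   for prob, post in outcomes)

    # Experiment 1: toss the coin. P(heads) = 3/4 with posterior P(H|heads) = 1/3;
    # P(tails) = 1/4 with posterior P(H|tails) = 1.
    exp1 = [(0.75, 1/3), (0.25, 1.0)]

    # Experiment 2: toss an auxiliary fair coin; only on heads toss the coin of interest.
    # With probability 1/2 we learn nothing and the posterior stays at 1/2.
    exp2 = [(0.5, 0.5), (0.5 * 0.75, 1/3), (0.5 * 0.25, 1.0)]

    print(expected_score(exp1), expected_score(exp2))   # ≈ 0.833 vs ≈ 0.792: the second
                                                        # is worse despite equal posterior intervals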

I don’t know what to say about binary experiments and non-binary questions. A necessary condition is containment of posterior intervals for all possible answers to the question. I don’t know if that’s sufficient.

Comparing experiments

When you’re investigating reality as a scientist (and often as an ordinary person) you perform experiments. Epistemologists and philosophers of science have spent a lot of time thinking about how to evaluate what you should do with the results of the experiments—how they should affect your beliefs or credences—but relatively little on the important question of which experiments you should perform epistemologically speaking. (Of course, ethicists have spent a good deal of time thinking about which experiments you should not perform morally speaking.) Here I understand “experiment” in a broad sense that includes such things as pulling out a telescope and looking in a particular direction.

One might think there is not much to say. After all, it all depends on messy questions of research priorities and costs of time and material. But we can at least abstract from the costs and quantify over epistemically reasonable research priorities, and define:

  1. E2 is epistemically at least as good an experiment as E1 provided that for every epistemically reasonable research priority, E2 would serve the priority at least as well as E1 would.

That’s not quite right, however. For we don’t know how well an experiment would serve a research priority unless we know the result of the experiment. So a better version is:

  2. E2 is epistemically at least as good an experiment as E1 provided that for every epistemically reasonable research priority, the expected degree to which E2 would serve the priority is at least as high as the expected degree to which E1 would.

Now we have a question we can address formally.

Let’s try.

  3. A reasonable epistemic research priority is a strictly proper scoring rule or epistemic utility, and the expected degree to which an experiment would serve that priority is equal to the expected value of the score after Bayesian update on the result of the experiment.

(Since we’re only interested in expected values of scores, we can replace “strictly proper” with “strictly open-minded”.)

And we can identify an experiment with a partition of the probability space: the experiment tells us where we are in that partition. (E.g., if you are measuring some quantity to some number of significant digits, the cells of the partition are equivalence classes under equality of the quantity up to that many significant digits.) The following is then easy to prove:

Proposition 1: On definitions (2) and (3), an experiment E2 is epistemically at least as good as experiment E1 if and only if the partition associated with E2 is essentially at least as fine as the partition associated with E1.

A partition R2 is essentially at least as fine as a partition R1 provided that for every event A in R1 there is an event B in R2 such that with probability one B happens if and only if A happens. The definition is relative to the current credences which are assumed to be probabilistic. If the current credences are regular—all non-empty events have non-zero probability—then “essentially” can be dropped.

However, Proposition 1 suggests that our choice of definitions isn’t that helpful. Consider two experiments. On E1, all the faculty members from your Geology Department have their weight measured to the nearest hundred kilograms. On E2, a thousand randomly chosen individuals around the world have their weight measured to the nearest kilogram. Intuitively, E2 is better. But Proposition 1 shows that in the above sense neither experiment is better than the other, since they generate partitions neither of which is essentially finer than the other (the event of there being a member of the Geology Department with weight at least 150 kilograms is in the algebra generated by E1’s partition, but nothing coinciding with that event up to probability zero is in the algebra generated by E2’s partition). And this is to be expected. For suppose that our research priority is to know whether any members of your Geology Department are at least 150 kilograms in weight, because we need to know whether, for a departmental cave exploring trip, the current selection of harnesses, all of which are rated for users under 150 kilograms, is sufficient. Then E1 is better. On the other hand, if our research priority is to know the average weight of a human being to the nearest ten kilograms, then E2 is better.

The problem with our definitions is that the range of possible research priorities is just too broad. Here is one interesting way to narrow it down. When we are talking about an experiment’s epistemic value, we mean the value of the experiment towards a set of questions. If the set of questions is a scientifically typical set of questions about human population weight distribution, then E2 seems better than E1. But if it is an atypical set of questions about the Geology Department members’ weight distribution, then E1 might be better. We can formalize this, too. We can identify a set Q of questions with a partition of probability space representing the possible answers. This partition then generates an algebra FQ on the probability space, which we can call the “question algebra”. Now we can relativize our definitions to a set of questions.

  4. E2 is epistemically at least as good an experiment as E1 for a set of questions Q provided that for every epistemically reasonable research priority on Q, the expected degree to which E2 would serve the priority is at least as high as the expected degree to which E1 would.

  5. A reasonable epistemic research priority on a set of questions Q is a strictly proper scoring rule or epistemic utility on FQ, and the expected degree to which an experiment would serve Q is equal to the expected value of the score after Bayesian update on the result of the experiment.

We recover the old definitions by being omnicurious, namely letting Q be all possible questions.

What about Proposition 1? Well, one direction remains: if E2’s partition is essentially at least as fine as E1’s, then E2 is at least as good with regard to any set of questions, and in particular with regard to Q. But what about the other direction? Now the answer is negative. Suppose the question is what the average weight of the six members of the Geology Department is up to the nearest 100 kg. Consider two experiments: on the first, the members are ordered alphabetically by first name, and a fair die is rolled to choose one (if you roll 1, you choose the first, etc.), and their weight is measured. On the second, the same is done but with the ordering being by last name. Assuming the two orderings are different, neither experiment’s partition is essentially at least as fine as the other’s, but the expected contributions of both experiments towards our question are equal.

Is there a nice characterization in terms of partitions of when E2 is at least as good as E1 with regard to a set of questions Q? I don’t know. It wouldn’t surprise me if there was something in the literature. A nice start would be to see if we can answer the question in the special case where Q is a single binary question and where E1 and E2 are binary experiments. But I need to go for a dental appointment now.

Thursday, January 23, 2025

Recollection and two types of "Aha!" experiences

On some argumentatively central occasions, Plato refers to an intellectual “aha!” experience of seeing some point (say, something philosophical or mathematical). This is supposed to be evidence for the theory of recollection, because the experience is similar to remembering a nearly forgotten thing.

After insightful comments from students in my philosophy of mathematics seminar today, I think “aha!” experiences come in two varieties. We might express paradigm instances of the two varieties like this:

  i. Aha! I’ve always thought this, but never quite put it into words!

  ii. Aha! Now that I think about this, I see it’s got to be true!

An example of the first variety might be someone who hears about the Golden Rule, and realizes that whenever they were at their best, they were acting in accordance with it. I had a case of the second variety when I was introduced to the distributive law in arithmetic in grade three: I had never thought about whether a ⋅ (b+c) = a ⋅ b + a ⋅ c, but as soon as the question came up, with some sort of an illustrating mental picture, it was clear that it was true.

The two experiences are phenomenologically quite distinct. Type (i) experiences fit better with the Platonic picture of innate knowledge, since type (ii) experiences feel like a new acquisition rather than the recovery of something one already had. Another difference between type (i) and type (ii) experiences is that in type (ii) experiences, we not only take ourselves to have evidence for the thing being true, but the thing becomes quite unmysterious: we see how it has to be true. But type (i) experiences need not have this explanatory feature. When I have the vision of the truth of the distributive law of arithmetic, I see why it’s got to be true though I may not be able to put it into words. Not so with the Golden Rule. I can continue to be mystified by the incumbent obligations, but cannot deny them.

Literal remembering of a forgotten thing seems less like (ii) than like (i). When I remember a forgotten phone number by some prompt, I don’t have an experience of seeing why it’s got to be that.

Plato’s theory of recollection does not account for the phenomenology of type (ii) experiences. And perhaps Plato would admit that. In the Republic, he talks of “the eye of the soul”. The context there is the abilities of this life, rather than recollection. Perhaps type (ii) experiences fit more with the activity of the eye of the soul than with recollection.

At the same time, while (i) is a bit more like remembering, it’s not exactly like it, either. Remembering need not have any “I’ve thought this all along” aspect to it, which type (i) experiences tend to have. So I think neither of our “Aha!” experiences is quite like what the theory of recollection leads us to expect. Is there a third “Aha!” experience that does? I doubt it, but maybe.

Tuesday, January 21, 2025

Competent language use without knowledge

I can competently use a word without knowing what the word means. Just imagine some Gettier case, such as that my English teacher tried to teach me a falsehood about what “lynx” means, but, because they themselves misremembered what the word means, they ended up teaching me the correct meaning. Justified true belief is clearly enough for competent use.

But if I then use “lynx”, even though I don’t know what the word means, I do know what I mean by it. Could one manufacture a case where I competently use a word but don’t even know what I mean by it?

Maybe. Suppose I am a student and a philosophy professor convinces me that I am so confused that I don’t know what I mean when I use the word “supervenience” in a paper. I stop using the word. But then someone comments on an old online post of mine from the same period as the paper, in which post I used “supervenience”. The commenter praises how insightfully I have grasped the essence of the concept. This someone uses a false name, that of an eminent philosopher. I come to believe on the supposed authority of this person that I meant by “supervenience” what I in fact did mean by it, and I resume using it. But the authority is false. It seems that now I am using the word without knowing what I mean by it. And I could be entirely competent.

Kripke's standard meter

Back when there was a standard meter, Kripke claimed that it was contingent a priori that the standard meter is a meter in length.

This seems wrong. For anything narrowly logically entailed by something that’s a priori is also a priori. But that the standard meter is a meter in length entails that there is an extended object. And that there is an extended object is clearly a posteriori.

Kripke’s reasoning is that to know that the standard meter is a meter in length all you need to know is how “meter” is stipulated, namely as the actual length of the standard meterstick, and anything you can know from knowing how the terms are stipulated is known a priori.

There is something fishy here. We don’t know a priori that the stipulation was successful (it might have failed if, for instance, the “standard meter” never existed but there was a conspiracy to pretend that it did). In fact, we don’t know a priori that any stipulations were ever made—that, too, is clearly a posteriori.

Maybe what we need here is some concept of “stipulational content”, and the idea is that something is a priori if you can derive it a priori from the stipulational content of the terms. But the stipulational content of a term needs to be defined in such a way that it’s neutral on whether the stipulation happened or succeeded. If so, then Kripke should have said that it’s a priori that if there is a standard meterstick, it is a meter long.

The unthinkable and the ineffable

Suppose that Alice right now thinks about some fact F and no other fact. Then we can stipulate that “Xyzzies” is a sentence whose content is that very fact which Alice is thinking. Thus:

  1. If a linguistically identifiable person can think about some fact F to the exclusion of other facts at a linguistically identifiable time, then F can be expressed in a language.

It does not, however, follow that every fact can be expressed in a language. For it’s epistemically possible that there is a fact F such that a person can only think about F if the person is simultaneously thinking about G and H as well, and there may be no way for us to distinguish F from G and H in such a way as to stipulate a term for it.

This may seem like a pretty remote possibility, but I think it’s pretty plausible. There could be some fact F that only God can think. Presumably any fact has infinitely many logical consequences. Since God is inerrant and necessarily thinks all facts, necessarily if God thinks F, he thinks all the infinitely many logical consequences of F as well. And it could well be that we have no way of distinguishing F from some of its logical consequences in such a way that we could delineate F.

So it is possible to accept (1) while holding that some thinkable facts are ineffable.

However, plausibly any fact thinkable by a human can be thought by the human in a specifiably delineated way (the primary fact thought about at t1, etc.). Thus our thought cannot exceed the possibilities of our language, since for anything we can think we could stipulate that “Xyzzies” means that. (Though, of course, our thought can (and sometimes does) exceed the actualities of our language.) Thus:

  2. The humanly ineffable is humanly unthinkable.

Nonetheless, we might make a distinction between two ways of extending human language. A weak extension is one that can be introduced solely in terms of current human language. Stipulations in mathematics are like that: we explain what “continuous” is using prior vocabulary like “limit”. A strong extension is one that requires something extralinguistic, such as ostension to a non-linguistic reality.

  3. There are things that are humanly thinkable that are only expressible using a strong extension of human language.

Monday, January 20, 2025

Beyond us

A being that does not represent the world has no conception of what representation might be like, since the being has no conceptions.

A being that lacks consciousness has no conception of what consciousness might be like. The being might have intentionality (our unconscious thoughts, after all, have intentionality), and so might have the contentful thought that there can be beings that have some crucial mental quality that goes beyond the unconscious being’s mentality.

A being that lacks will presumably has no conception of what rational will or responsibility might be like. Again, the being might have the concept of beings with “something more” in the causation of activity by means of thought.

The distinctions between non-representing and representing, unconscious and conscious, and involuntary and voluntary involve immense qualitative and value gaps. In each of the three cases, we humans exemplify the higher of the two options. At the same time, we are not alone in all these on earth. We share representation with all living things, I suspect. We share consciousness with many animals. But responsibility, I suspect, is ours alone.

I find it implausible to think that we are at the qualitative apex of the space of valuable possibilities. It seems quite likely to me that there could be beings that differ from us in further fundamental valuable qualities in such a way that we are on the lower end, and if we were to meet these beings, we would be unable to grasp what they have which we lack, though we might on testimony, or maybe even empirical observation of behavior, conclude that there is such a thing.

In fact, I suspect there are infinitely many such distinctions, and that God is beyond the higher side of all of them.

In heaven, might we be raised to have the further higher levels? Maybe, but maybe not. However, the mere epistemic possibility of us being gradually raised to acquire infinitely many further such irreducible values is enough to undercut any “argument from boredom” against eternal heavenly life.

Assuming there are infinitely many more such non-V and V pairs, I wonder what this infinity is. Does it have a cardinality?

Open-mindedness and epistemic thresholds

Fix a proposition p, and let T(r) and F(r) be the utilities of assigning credence r to p when p is true and false, respectively. The utilities here might be epistemic or of some other sort, like prudential, overall human, etc. We can call the pair T and F the score for p.

Say that the score T and F is open-minded provided that expected utility calculations based on T and F can never require you to ignore evidence, assuming that evidence is updated on in a Bayesian way. Assuming the technical condition that there is another logically independent event (else it doesn’t make sense to talk about updating on evidence), this turns out to be equivalent to saying that the function G(r) = rT(r) + (1−r)F(r) is convex. The function G(r) represents your expected value for your utility when your credence is r.

If G is a convex function, then it is continuous on the open interval (0,1). This implies that if one of the functions T or F has a discontinuity somewhere in (0,1), then the other function has a discontinuity at the same location. In particular, the points I made in yesterday’s post about the value of knowledge and anti-knowledge carry through for open-minded and not just proper scoring rules, assuming our technical condition.

Moreover, we can quantify this discontinuity. Given open-mindedness and our technical condition, if T has a jump of size δ at credence r (e.g., in the sense that the one-sided limits exist and differ by δ), then F has a jump of size rδ/(1−r) at the same point. In particular, if r > 1/2, then if T has a jump of a given size at r, F has a larger jump at r.
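Here, for the record, is a sketch of where the rδ/(1−r) figure comes from, using nothing beyond the definitions above and the assumption (as in the parenthetical) that the one-sided limits exist:

```latex
% Open-mindedness makes G(r) = r T(r) + (1-r) F(r) convex, hence continuous on (0,1).
% Write \Delta(h) = h(r_0^+) - h(r_0^-) for the jump of a function h at r_0 \in (0,1).
\begin{align*}
0 = \Delta(G) &= \Delta\bigl(r\,T(r)\bigr) + \Delta\bigl((1-r)\,F(r)\bigr)\\
              &= r_0\,\Delta(T) + (1-r_0)\,\Delta(F)
                 && \text{(the factors $r$ and $1-r$ are continuous)}\\
\Rightarrow\quad \Delta(F) &= -\frac{r_0}{1-r_0}\,\Delta(T) = -\frac{r_0\,\delta}{1-r_0}.
\end{align*}
% So F jumps by r_0\delta/(1-r_0) in the opposite direction; for r_0 > 1/2 this
% exceeds \delta, which is the asymmetry used in the next paragraph.
```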

I think this gives one some reason to deny that there are epistemically important thresholds strictly between 1/2 and 1, such as the threshold between non-belief and belief, or between non-knowledge and knowledge, even if the location of the thresholds depends on the proposition in question. For if there are such thresholds, imagine cases of propositions p with the property that it is very important to reach the threshold if p is true while one’s credence matters very little if p is false. In such a case, T will have a larger jump at the threshold than F, and so we will have a violation of open-mindedness.

Here are three examples of such propositions:

  • There are objective norms

  • God exists

  • I am not a Boltzmann brain.

There are two directions to move from here. The first is to conclude that because open-mindedness is so plausible, we should deny that there are epistemically important thresholds. The second is to say that in the case of such special propositions, open-mindedness is not a requirement.

I wondered initially whether a similar argument doesn’t apply in the absence of discontinuities. Could one have T and F be open-minded even though T continuously increases a lot faster than F decreases? The answer is positive. For instance, the pair T(r) = e^{10r} and F(r) = −r is open-minded (though not proper), even though T increases a lot faster than F decreases. (Of course, there are other things to be said against this pair. If that pair is your utility, and you find yourself with credence 1/2, you will increase your expected utility by switching your credence to 1 without any evidence.)
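A quick numerical check of that example (my own sketch, not part of the argument): G is convex for this pair, so the pair is open-minded in the sense above, yet from credence 1/2 the expected utility is maximized by reporting credence 1.

```python
import numpy as np

# The pair from the example: T(r) = e^{10r}, F(r) = -r.
T = lambda r: np.exp(10 * r)
F = lambda r: -r
G = lambda r: r * T(r) + (1 - r) * F(r)

r = np.linspace(0.001, 0.999, 2001)

# Open-mindedness: G should be convex on (0, 1); check via second differences.
second_diff = G(r)[:-2] - 2 * G(r)[1:-1] + G(r)[2:]
print("G convex on grid:", bool((second_diff >= -1e-9).all()))

# Impropriety: from credence 1/2, the expected utility of reporting credence s
# is (1/2) T(s) + (1/2) F(s); it is maximized at s = 1, not at s = 1/2.
s = np.linspace(0, 1, 2001)
print("best report from credence 1/2:", s[np.argmax(0.5 * T(s) + 0.5 * F(s))])
```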

Friday, January 17, 2025

Knowledge and anti-knowledge

Suppose knowledge has a non-infinitesimal value. Now imagine that you continuously gain evidence for some true proposition p, until your evidence is sufficient for knowledge. If you’re rational, your credence will rise continuously with the evidence. But if knowledge has a non-infinitesimal value, your epistemic utility with respect to p will have a discontinuous jump precisely when you attain knowledge. Further, I will assume that the transition to knowledge happens at a credence strictly bigger than 1/2 (that’s obvious) and strictly less than 1 (Descartes will dispute this).

But this leads to an interesting and slightly implausible consequence. Let T(r) be the epistemic utility of assigning evidence-based credence r to p when p is true, and let F(r) be the epistemic utility of assigning evidence-based credence r to p when p is false. Plausibly, T is a strictly increasing function (being more confident in a truth is good) and F is a strictly decreasing function (being more confident in a falsehood is bad). Furthermore, the pair T and F plausibly yields a proper scoring rule: whatever one’s credence, one doesn’t have an expectation that some other credence would be epistemically better.

It is not difficult to see that these constraints imply that if T has a discontinuity at some point 1/2 < rK < 1, so does F. The discontinuity in F implies that as we become more and more confident in the falsehood p, suddenly we have a discontinuous downward jump in utility. That jump occurs precisely at rK, namely when we gain what we might call “anti-knowledge”: when one’s evidence for a falsehood becomes so strong that it would constitute knowledge if the proposition were true.
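For readers who want the “not difficult to see” step spelled out, here is one way to get it, a sketch using only the propriety assumption and the monotonicity of T and F stated above:

```latex
% Propriety: for every credence r and every candidate credence s,
%   r T(r) + (1-r) F(r) \ge r T(s) + (1-r) F(s).
% Hence the expected-utility function
\[
  G(r) := r\,T(r) + (1-r)\,F(r) = \sup_s \bigl[\, r\,T(s) + (1-r)\,F(s) \,\bigr]
\]
% is a pointwise supremum of affine functions of r, and so is convex,
% and therefore continuous on the open interval (0,1).
% Monotonicity of T and F gives one-sided limits everywhere, so at any
% r_K in (0,1) the jumps must cancel:
\[
  r_K \,\Delta T(r_K) + (1 - r_K)\,\Delta F(r_K) = 0 .
\]
% Thus T has a discontinuity at r_K if and only if F does, and the jumps have
% opposite signs: an upward jump in T forces a downward jump in F.
```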

Now, there potentially are some points where we might plausibly think that the epistemic utility of having a credence in a falsehood takes a discontinuous downward jump. These points are:

  • 1, where we become certain of the falsehood

  • rB, the threshold of belief, where the credence becomes so high that we count as believing the falsehood

  • 1/2, where we start to become more confident in the falsehood p than the truth not-p

  • 1 − rB, where we stop believing not-p, and

  • 0, where the falsehood p becomes an epistemic possibility.

But presumably rK is strictly between rB and 1, and hence rK is none of these points. Is it plausible to think that there is a discontinuous downward jump in epistemic utility when we achieve anti-knowledge by crossing the threshold rK in a falsehood?

I am inclined to say not. But that forces me to say that there is no discontinuous upward jump in epistemic utility once we gain knowledge.

On the other hand, one might think that the worst kind of ignorance is when you’re wrong but you think you have knowledge, and that’s kind of like the anti-knowledge point.

Thursday, January 16, 2025

Aristotle and Aquinas' Third Way

Aristotle seems to have thought that the earth and the species inhabiting it are eternal. This seems extremely implausible for reasons that should have been available to Aristotle.

It is difficult to wipe out a species, but surely not impossible: all it takes is to kill each of the finitely many individuals. Given a species s that cannot have more than n members, and given a long enough time, we would expect there to be a very high probability that all the members of s would have died out during some hour due to random events. Given any finite number of species, each with a bound on how many members it can have, and given a long enough time, we would expect with very high probability that all the members would die off.
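The probabilistic core of this can be made explicit with a crude bound (a sketch; the hourly independence and the uniform lower bound p are idealizations not in the original argument):

```latex
% Suppose that in each hour there is, independently, probability at least p > 0
% that every member of the species dies from random events (p is minuscule but
% positive, since the species never has more than n members).
% Then the probability that the species survives for H hours is at most
\[
  (1 - p)^{H} \;\longrightarrow\; 0 \qquad \text{as } H \to \infty,
\]
% and for S species with bounds p_1, \dots, p_S > 0, the probability that even
% one of them survives H hours is at most
\[
  \sum_{i=1}^{S} (1 - p_i)^{H},
\]
% which likewise goes to 0, so the probability of surviving forever is 0.
```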

Now there is a finite limit on how many species there are on earth (as Aristotle knew, the earth is finite), and a finite limit on how many members the species can have (again, the earth is finite). So we should have expected all the species that existed some long amount of time ago to have died out.

The above provides an argument that if the world is eternal, new species can arise. For if new species can’t arise and the world is eternal, then by now there should have been no species left.

How could Aristotle have gotten out of this worry without rejecting his thesis about the eternity of the earth?

One way would be to suppose a powerful protector of our ecosystem that would make sure that the species-destroying random events never happen. This protector would either itself have to be sufficiently powerful that it would not be subject to the vicissitudes of chance, or there would have to be an infinite (probably uncountably infinite!) number of such protectors.

Another option would be for Aristotle to reject his thesis that there is only one earth (which was based on his theory of gravitation as attraction to the center of the universe: if there were more than one earth, they would both have collapsed into the center of the universe by now).

If there were infinitely many earths, then it’s perhaps not so crazy to think that some earth would have lucked out and not had its species die out. Of course, this would not only require Aristotle to reject his thesis that there is only one earth, but also the finitist thesis that there cannot be an infinite number of co-actual things. (Interestingly, given the plausibility that any given species has probability one of dying out given infinite time, and given the countable additivity of probabilities, this way out would require not merely infinitely many earths, but an uncountable infinity of earths. Assuming an Archimedean spacetime for our universe, it would require a multiverse.)

In any case, Aristotle’s commitment to new species not coming into existence (or at least new species of interesting critters; he may be OK with worms coming into existence) is in tension with what he says about the earth’s eternity.

Wednesday, January 15, 2025

Change and matter

Aristotle’s positing of matter is driven by the attempt to respond to the Parmenidean idea that things can’t come from nothing, and hence that we must posit something that persists through change, and that is matter.

But there are two senses of “x comes from nothing”:

  1. x is uncaused

  2. x is not made out of pre-existing materials.

If “x comes from nothing” in the argument means (1), the argument for matter fails. All we need is a pre-existing efficient cause, which need not be the matter of x.

Thus, for the argument to work, “x comes from nothing” must mean (2). But now here is a curious thing. From the middle ages to our time, many Aristotelians are theists, and yet still seem to be pulled by Aristotle’s argument for matter. But if “x comes from nothing” means (2), then theism implies that it is quite possible for something to come from nothing: God can create it ex nihilo.

There are at least two possible responses from a theistic Aristotelian who likes the argument for matter. The first response is that only God can make things come from nothing in sense (2), and hence things caused to exist by finite causes (even if with God’s cooperation) cannot come from nothing in sense (2). But there plainly are such things all around us. So there is matter.

Now, at least one theistic Aristotelian, Aquinas, does explicitly argue that only God can create ex nihilo. But the argument is pretty controversial and depends on heavy-duty metaphysics, about finite and infinite causes. It is not just the assertion of a seemingly obvious Parmenidean “nothing comes from nothing” principle. Thus at least on this response, the argument for matter becomes a lot more controversial. (And, to be honest, I am not convinced by it.)

The second and simpler response is to say that it’s just an empirical fact that there are things in the world that don’t come from nothing in sense (2): oak trees, for example. Thus there in fact is matter. This response is pretty plausible, but can be questioned: one might say that we have continuity of causal powers rather than any matter that survives the generation.

Finally, it’s worth noting that I suspect Aristotle misunderstands the Parmenidean argument, which is actually a very simple reductio ad absurdum:

  3. x came into existence.
  4. If x came into existence, then x did not exist.
  5. So, x did not exist.
  6. But non-existence is absurd.

The crucial step here is (6): the Parmenidean thinks the very concept of something not existing is absurd (presumably because of the Parmenidean’s acceptance of a strong truthmaker principle). The argument is very simple: becoming presupposes the truth of some past-tensed non-existence statements, while non-existence statements are always false. Aristotle’s positing matter does nothing to refute this Parmenidean argument. Even if we grant that x’s matter pre-existed, it’s still true that x did not exist, and that’s all Parmenides needs. Likewise, Aristotle’s famous actuality/potentiality distinction doesn’t solve the problem. Even if x was pre-existed by a potentiality for existence, it’s still true that x wasn’t pre-existed by x—that would be a contradiction.

To solve Parmenides’ problem, however, we do not need to posit matter or potentiality or anything like that. We just need to reject the idea that negative existential statements are nonsensical. And Aristotle expressly does reject this idea: he says that a statement is true provided it says of what is that it is or of what is not that it is not. Having done that, Aristotle should take himself as done with Parmenides’ problem of change.

Tuesday, January 14, 2025

More on the centrality of morality

I think we can imagine a species whose members have moral agency, but for whom moral agency is a minor part of their flourishing. I assume wolves don’t have moral agency. But now imagine a species of canids that live much like wolves, but every couple of months get to make a very minor moral choice whether to inconvenience the pack in the slightest way—the rest is instinct. It seems to me that these canids are moral agents, but morality is a relatively minor part of their flourishing. The bulk of the flourishing of these canids would be the same as that of ordinary wolves.

Aristotle argued that the fact that rationality is how we differ from other species tells us that rationality is what is central to our flourishing. The above thought experiment shows that the argument is implausible. Our imaginary canids could, in fact, be the only rational species in the universe, and their moral agency or rationality (with Aristotle and Kant, I am inclined to equate the two) would be the one thing that makes them different from other canids, and yet what is more important to their flourishing is what they have in common with other canids.

At the same time, it would be easy for an Aristotelian theorist to accommodate my canids. One needs to say that the form of a species defines what is central to the flourishing, and in my canids, unlike in humans, morality is not so central. And one can somehow observe this: rationality is just clearly important to the lives of humans in a way in which it’s not so much to these canids.

In this way, I think, the Aristotelian may have a significant advantage over a Kantian. For a Kantian may have to prioritize rationality in all possible species.

In any case, we should not take it as a defining feature of morality that it is central to our flourishing.

One might wonder how this works in a theistic context. For humans, moral wrongdoing is also sin, an offense against a loving infinite Creator. As I’ve described the canids, they may have no concept of God and sin, and so moral wrongdoing isn’t seen as sin by them. Could you have a species which does have a concept of God and sin, but where morality (and hence sin) isn’t central to flourishing? Or does bringing God in automatically elevate morality to a higher plane? Anselm thought so. He might have been right. If so, then the discomfort that one is liable to feel at the idea of a species of moral agents where morality is not very important could be an inchoate grasp of the connection between God and morality.