
Tuesday, September 24, 2024

Chanceability

Say that a function P : F → [0,1] where F is a σ-algebra of subsets of Ω is chanceable provided that it is metaphysically possible to have a concrete (physical or not) stochastic process with a state space of the same cardinality as Ω and such that P coincides with the chances of that process under some isomorphism between Ω and the state space.

Here are some hypotheses one might consider:

  1. If P is chanceable, P is a finitely additive probability.

  2. If P is chanceable, P is a countably additive probability.

  3. If P is a finitely additive probability, P is chanceable.

  4. If P is a countably additive probability, P is chanceable.

  5. A product of chanceable countably additive probabilities is chanceable.

It would be nice if (2) and (4) were both true; or if (1) and (3) were.

I am inclined to think (5) is true, since if the P_i are chanceable, they could be implemented as chances of stochastic processes of causally isolated universes in a multiverse, and the result would have chances isomorphic to the product of the P_i.

I think (3) is true in the special case where Ω is finite.

I am skeptical of (4) (and hence of (3)). My skepticism comes from the following line of thought. Let Ω = ℵ1. Let F be the σ-algebra of countable and co-countable subsets (A is co-countable provided that Ω − A is countable). Define P(A) = 1 for the co-countable subsets and P(A) = 0 for the countable ones. This is a countably additive probability. Now let < be the ordinal ordering on ℵ1. Then if P is chanceable, it can be used to yield paradoxes very similar to those of a countably infinite fair lottery.
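
To see why this P is additive, note that among pairwise disjoint events at most one can be co-countable (two co-countable sets in an uncountable Ω always intersect), and a union of countable sets is countable. Here is a minimal Python sketch of the case analysis; the encoding of an event by its “kind” is purely illustrative, since P depends only on that:

```python
# Toy encoding: since P depends only on whether a set is countable or
# co-countable, we represent an event purely by that "kind".

def P(kind):
    """The chance function on the countable/co-countable algebra."""
    if kind == "countable":
        return 0.0
    if kind == "co-countable":
        return 1.0
    raise ValueError("not in the sigma-algebra")

def disjoint_union_kind(a, b):
    """Kind of the union of *disjoint* A, B in the algebra.

    Two co-countable sets in an uncountable Omega always intersect,
    so the co-countable/co-countable case cannot arise for disjoint sets.
    """
    if a == b == "countable":
        return "countable"        # a union of two countable sets is countable
    if {a, b} == {"countable", "co-countable"}:
        return "co-countable"     # removing a countable set preserves co-countability
    raise ValueError("disjoint co-countable pairs are impossible")

# Finite additivity holds in every disjoint case that can actually arise:
assert P(disjoint_union_kind("countable", "countable")) == 0.0 + 0.0
assert P(disjoint_union_kind("countable", "co-countable")) == 0.0 + 1.0
```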

For instance, consider a two-person game (this will require the product of P with itself to be chanceable, not just P; but I think (5) is true) where each player independently gets an ordinal according to a chancy isomorph of P, and the one who gets the larger ordinal wins a dollar. Then each player will think the probability that the other player has the bigger ordinal is 1, and will pay an arbitrarily high fee to swap ordinals with them!

Tuesday, May 21, 2024

A problem for probabilistic best systems accounts of laws

Suppose that we live in a Humean universe and the universe contains an extremely large collection of coins scattered on a flat surface. Statistical analysis of all the copper coins fits extremely well with the hypothesis that each coin was independently randomly placed with the chance of heads being 1/16 and that of tails being 15/16.

Additionally, there is a gold coin where you haven’t observed which side it’s on.

And there are no other coins.

On a Lewisian best systems account of laws of nature, if the number of coins is sufficiently large, it will be a law of nature that all coins are independently randomly placed with the chance of heads being 1/16 and that of tails being 15/16. This is true regardless of whether the gold coin is heads or tails. If you know the information I just gave, and have done the requisite statistical analysis of the copper coins, you can be fully confident that this is indeed a law of nature.

If you are fully confident that it is a law of nature that the chance of tails is 15/16, then your credence for tails for the unobserved gold coin should also be 15/16 (I guess this is a case of the Principal Principle).

But that’s wrong. The fact that the coin is of a different material from the observed coins should affect your credence in its being tails. Inductive inferences are weakened by differences between the unobserved and the observed cases.

One might object that perhaps the Lewisian will say that instead of a law saying that the chance of tails on a coin is 15/16, there would be a law that the chance of tails on a copper coin is 15/16. But that’s mistaken. The latter law is not significantly more informative than the former (given that all but one coin is copper), but is significantly less brief. And laws are generated by balancing informativeness with brevity.

Wednesday, September 6, 2023

On probabilistic best-systems accounts, laws aren't propositions

According to the probabilistic best-systems account of laws (PBSA), the fundamental laws of nature are the axioms of the system that optimizes a balance of probabilistic fit to reality, informativeness, and brevity in a perfectly natural language.

But here is a tricky little thing. Probabilistic laws include statements about chances, such as that an event of a certain type E has a chance of 1/3. But on PBSA, chances are themselves defined by PBSA. What it means to say “E has a chance of 1/3” seems to be that the best system entails that E has a chance of 1/3. On its face, this is circular: chance is defined in terms of entailment of chance.

I think there may be a way out of this, but it is to make the fundamental laws be sentences that need not express propositions. Here’s the idea. The fundamental laws are sentences in a formal language (with terms having perfectly natural meanings) and an additional uninterpreted chance operator. There are a bunch of choice-points here: is the chance operator unary (unconditional) or binary (conditional)? is it a function? does it apply to formulas, sentences, event tokens, event types or propositions? For simplicity, I will suppose it’s a unary function applying to event types, even though that’s likely not the best solution in the final analysis. We now say that the laws are the sentences provable from the axioms of our best system. These sentences include the uninterpreted chance(x) function. We then say stuff like this:

  1. When a sentence that does not use the chance operator is provable from the axioms, that sentence contributes to informativeness, but when that sentence is in fact false, the fit of the whole system becomes −∞.

  2. When a sentence of the form chance(E) = p is provable from the axioms, then the closeness of the frequency of event type E to p contributes to fit (unless the fit is −∞ because of the previous rule), and the statement as such contributes to informativeness.

I have no idea how fit is to be measured when instead of being able to prove things like chance(E) = p, we can prove less precise statements like chance(E) = chance(F) or chance(E) ≥ p. Perhaps we need clauses to cover cases like that, or maybe we can hope that we don’t need to deal with this.
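
As a toy illustration of how rules (1) and (2) might be scored (all names and the scoring scale here are hypothetical, and the distance-based fit metric is just one simple choice):

```python
import math

# A "system" is a list of provable sentences: ("fact", is_true) for
# chance-free sentences, ("chance", event_type, p) for chance ascriptions.
# freq maps event types to their actual frequencies.

def score(laws, freq):
    fit, informativeness = 0.0, 0
    for law in laws:
        if law[0] == "fact":
            informativeness += 1          # rule 1: contributes to informativeness...
            if not law[1]:
                return -math.inf, informativeness  # ...but a false one ruins fit
        elif law[0] == "chance":
            _, event, p = law
            informativeness += 1          # rule 2: the ascription is informative...
            fit -= abs(freq[event] - p)   # ...and closeness of frequency to p is fit
    return fit, informativeness

# A system ascribing chance 1/3 to E scores better the closer E's frequency is to 1/3:
print(score([("chance", "E", 1/3)], {"E": 0.34}))
print(score([("chance", "E", 1/3)], {"E": 0.70}))
```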

An immediate problem with this approach is that the laws are no longer propositions. We can no longer say that the laws explain, because sentences in a language that is not fully interpreted do not explain. But we can form propositions from the sentences: instead of invoking a law s as itself an explanation, we can invoke as our explanation the second order fact that s is a law, i.e., that s is provable from the axioms of the best system.

This is counterintuitive. The explanation of the evolution of amoebae should not include meta-linguistic facts about a formal language!

Wednesday, June 7, 2023

Chance and intention

If I intend for an event to happen, I had better intend for my action to raise the chance of the event happening. And most of the time I raise the chance of an event happening precisely in order that the event happen.

But I can also intend to raise the chance of an event happening without intending the event to happen. Thus, when testing one’s product, one uses the product in more extreme ways that deliberately raise the chance of failure. But one isn’t intending failure. One raises the probability of failure with the hope that despite the raised chance, the product does not fail.

Monday, March 6, 2023

More steps in the open future and probability dialectics

I’ve often defended a probabilistic objection to open future views on which either future-tensed contingents are all false or are neither true nor false. If T(q) is the proposition that q is true, then:

  1. P(T(q)) = P(q).

But on the open future views, the left-hand-side is zero, since it’s certain that q is not true. So the right-hand-side is zero. But then both q and its negation have zero probability, and we can’t make any predictions about the future.

An open futurist might push the following response. First, deny (1). Then insist that P(q) for a future contingent q is the objective tendency or chance towards q turning true. Thus, P(coin will be heads) is 1/2 for a fair indeterministic coin, since there is an objective tendency of magnitude 1/2 for the coin to end up heads.

In this post I want to discuss my next step in the dialectics. I think there may be a problem with combining the objective tendency response with epistemic probabilities. Suppose that yesterday a fair coin was flipped. If the coin was heads, then tomorrow two fair indeterministic coins will be flipped, and if the coin was tails, then tomorrow one fair indeterministic coin will be flipped. Let H be the proposition that tomorrow at least one coin will be heads. If yesterday we had heads, then the objective tendency of H is 3/4. If yesterday we had tails, then the objective tendency of H is 1/2. But we need to be able to say:

  2. P(H) = (1/2)(3/4) + (1/2)(1/2) = 5/8.
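
A quick Monte Carlo check of this weighted average (assuming nothing beyond the setup described):

```python
import random

# Yesterday: fair coin; heads -> two coins flipped tomorrow, tails -> one.
# H: at least one of tomorrow's coins is heads.
def H_occurs():
    n = 2 if random.random() < 0.5 else 1
    return any(random.random() < 0.5 for _ in range(n))

trials = 100_000
print(sum(H_occurs() for _ in range(trials)) / trials)  # ~0.625 = 5/8
```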

Now note that we are quite certain that 5/8 is not the objective tendency of H. The objective tendency of H is either 1/2 or 3/4.

So the open futurist needs a more sophisticated story. Here is what seems the right one. We say that P(q) is the average of the objective tendencies towards q weighted by the subjective probabilities of these tendencies. This is basically causal probability. The story requires that there be a present fact about all the objective tendencies.

On the technical side, this works. But here is a philosophical worry. If P(H) = 5/8 neither represents the objective tendency of H (which is either 1/2 or 3/4) nor one’s credence that H is true (which is zero on open-futurism), why is it that we should be making our decisions about the future in the light of P(H)?

Tuesday, August 23, 2022

Intending to lower the probability of one's success

It seems a paradigm of irrationality to intend an event E in an action A and yet take the action to lower the probability of E.

But it’s not irrational if my principle is right: intending a specification of something implies intending that of which it is a specification.

Suppose that Alice is in a bicycle race and is almost at the finish. If she just lets inertia do its job, she will inevitably win. But she carefully starts braking just short of the finish, aiming to cross the finish just a hair in front of Barbara, the cyclist behind her. She does this because she wants to make the race more exciting for the spectators, and she carefully calibrates her braking to make her win but not inevitably so.

Alice is aiming to win with a probability modestly short of one. This is a specification of winning, so by my principle, she is intending to win. But she is also, and in the very same action, aiming to decrease the probability of winning.

Monday, August 8, 2022

Might well

It’s occurred to me that the “might well happen that” operator makes for an interesting modality. It divides into an epistemic and a metaphysical version. In both cases, if it might well happen that p, then p is possible (in the respective sense). In both cases, there is a tempting paraphrase of the operator into a probability: on the epistemic side, one might say that it might well happen that p if and only if p has a sufficiently high epistemic probability, and on the metaphysical side, one might say that it might well happen that p if and only if p has a sufficiently high chance given the contextually relevant background. In both cases, it is not clear that the probabilistic paraphrase is correct—there may be (might well be!) cases of “might well happen that” where numerical probabilities have no place. And in both cases, “might well happen that” seems context-sensitive and vague. It might well be that thinking about this operator could lead to progress on something interesting.

Friday, November 27, 2020

An improvement on the objective tendency interpretation of probability

I am very much drawn to the objective causal tendency interpretation of chances. What makes a quantum die have chance 1/6 of giving any of its six results is that there is an equal causal tendency towards each result.

However, objective tendency interpretations have a serious problem: not every conditional chance fact is an objective tendency. After all, if P(A|B) represents an objective causal tendency of the system in state B to have state A, to avoid causal circularity, we don’t want to say that P(B|A) represents an objective causal tendency of the system in state A to have state B.

There is a solution to this: a more complex objective tendency interpretation somewhat in the spirit of David Lewis’s best-fit interpretation. Specifically:

  • the conditional chance of A on B is r if and only if Q(A|B)=r for every probability function Q such that (a) Q satisfies the axioms of probability and (b) Q(C|D)=q whenever q is the degree of tendency of the system in state D to have state C.

There are variants of this depending on the choice of formalism and axioms for Q (e.g., one can make Q be a classical countably additive probability, or a Popper function, etc.). One can presumably even extend this to handle lower and upper chances of nonmeasurable events.
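
Here is a minimal finite sketch of the definition at work (the space and the tendency facts are hypothetical). On Ω = {a, b, c}, suppose the only primitive tendency facts are Q({a}|Ω) = 0.2 and Q({b}|Ω) = 0.3. Each constraint Q(C|D) = q is linear in the point masses, and in this example the constraints pin Q down uniquely, so derived conditional chances like Q({a}|{a, b}) come out well-defined even though no tendency fact directly links those states:

```python
import numpy as np

# Hypothetical finite example: Omega = {a, b, c}, and the only primitive
# tendency facts are Q({a}|Omega) = 0.2 and Q({b}|Omega) = 0.3.
# A constraint Q(C|D) = q is linear in the point masses: Q(C∩D) - q*Q(D) = 0.

A = np.array([
    [1.0,  1.0,  1.0],   # normalization: q_a + q_b + q_c = 1
    [0.8, -0.2, -0.2],   # q_a - 0.2*(q_a + q_b + q_c) = 0, i.e. Q({a}|Omega) = 0.2
    [-0.3, 0.7, -0.3],   # likewise for Q({b}|Omega) = 0.3
])
b = np.array([1.0, 0.0, 0.0])

assert np.linalg.matrix_rank(A) == 3   # the tendency facts pin Q down uniquely
q_a, q_b, q_c = np.linalg.solve(A, b)

# Hence the derived conditional chance Q({a}|{a, b}) is well-defined:
print(q_a / (q_a + q_b))   # 0.4, though no tendency fact mentions {a, b} directly
```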

Friday, October 23, 2020

Explanation and understanding

In the 1960s, it dawned on philosophers of science that:

  1. Other things being equal, low-probability explanation confers equally good understanding as high-probability explanation.

If I have a quantum coin that has a probability 0.4 of heads and 0.6 of tails, and it yields heads, I understand why it yielded heads no less well than I would have had it yielded tails—the number is simply different.

On the other hand, there is the following thesis, which for years I’ve conceded to opponents of low-probability explanations:

  2. Other things being equal, low-probability explanations are less good than high-probability ones.

Finally, add this plausible comparative thesis:

  3. What makes an explanation good is how much understanding it confers (or at least would confer were it true)

which plausibly fits with the maxim that I’ve often been happy to concede that the job of an explanation is to provide understanding.

But (1)–(3) cannot all be true. Something must go. If (2) goes, then Inference to Best Explanation goes as well (I learned this from Yunus Prasetya’s very recent work on IBE and scientific explanation). I don’t want that (unlike Prasetya). And (1) seems right to me, and it also seems important to defending the Principle of Sufficient Reason in stochastic contexts.

Reluctantly, I conclude that (3) needs to go. And this means that I’ve overestimated the connection between explanation and understanding.

Monday, August 26, 2019

Functionalism and imperfect reliability

Suppose a naturalistic computational theory of mind is true: to have mental states of a given kind is to engage in a particular kind of computation. Now imagine a conscious computer thinking various thoughts and built around standard logic gates. Modify the computer to have an adjustment knob on each of its logic gates. The adjustment knob can be set to any number between 0 and 1, such that if the knob is set to p, then the chance (say, over a clock cycle) that the gate produces the right output is p. Thus, with the knob at 1, the gate always produces the right output; with the knob at 0, it always produces the opposite output; with the knob at 0.5, it functions like a fair coin. Make all the randomness independent.

Now, let Cp be the resulting computer with all of its adjustment knobs set to p. On our computational theory of mind, C1 is a conscious computer thinking various thoughts. Now, C0.5 is not computing anything: it is simply giving random outputs. This is true even if in fact, by an extremely unlikely chance, these outputs always match the ones that C1 gives. The reason for this is that we cannot really characterize the components of C0.5 as the logic gates that they would need to be for C0.5 to be computing the same functions as C1. Something that has a probability 0.5 of producing a 1 and a probability 0.5 of producing a 0, regardless of inputs, is no more an and-gate than it is a nand-gate, say.
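
A small simulation makes the p = 0.5 point vivid; the gate model below is nothing but the knob description above:

```python
import random

def noisy_and(a, b, p):
    """An and-gate whose output is correct with chance p (the knob setting)."""
    correct = a & b
    return correct if random.random() < p else 1 - correct

# With the knob at 0.5 the output is a fair coin whatever the inputs are,
# so the device is no more an and-gate than a nand-gate:
for inputs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    ones = sum(noisy_and(*inputs, 0.5) for _ in range(20_000))
    print(inputs, round(ones / 20_000, 2))   # ~0.5 on every row
```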

So, on a computational theory of mind, C0.5 is mindless. It’s not computing. Now imagine a sequence of conscious computers Cp as p ranges from 0.5 to 1. Suppose that it so happens that the corresponding “logic gates” of all of them always happen to give the same answer as the logic gates of C1. Now, for p sufficiently close to 1, any plausible computational theory of mind will have to say that Cp is thinking just as C1 is. Granted, Cp’s gates are less reliable than C1’s, but imperfect reliability cannot destroy thought: if it did, nothing physical in a quantum universe would think, and the naturalistic computational theorist of mind surely won’t want to accept that conclusion.

So, for p close to 1, we have thought. For p = 0.5, we do not. It seems very plausible that if p is very close to 0.5, we still have no thought. So, somewhere strictly between p = 0.5 and p = 1, a transition is made from no-thought to thought. It seems implausible to think that there is such a transition, and that is a count against computational theories of mind.

Moreover, because all the gates actually happen to fire in the same way in all the computers in the Cp sequence, and consciousness is, on the computational theory, a function of the content of the computation, it is plausible that for all the values of p < 1 for which Cp has conscious states, Cp has the same conscious states as C1. Either Cp does not count as computing anything interesting enough for consciousness or it counts as imperfectly reliably computing the same thing as C1 is. Thus, the transition from C0.5 to C1 is not like gradually waking up from unconsciousness. For when we gradually wake up from unconsciousness, we have an apparently continuous sequence of more and more intense conscious states. But the intensity of a conscious state is to be accounted for computationally on a computational theory of mind: the intensity is a central aspect of the qualia. Thus, the intensity has to be a function of what is being computed. And if there is only one relevant thing computed by all the Cp that are computing something conscious-making, then what we have as p goes from 0.5 to 1 is a sudden jump from zero intensity to full intensity. This seems implausible.

Physical possibility

Here is an interesting question: How can one tell from a physics theory whether some event is physically possible according to that theory?

A sufficient condition for physical possibility is that the physics assigns a non-zero chance to it. But this is surely not a necessary condition. After all, it is possible that you will get heads on each of infinitely many tosses of an indeterministic coin, while the chance of that is zero.
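
For instance, the chance of all-heads is bounded by the chance over every finite initial segment of tosses:

```latex
P(\text{heads on every toss}) \le P(\text{heads on tosses } 1, \dots, n) = 2^{-n} \longrightarrow 0 \quad (n \to \infty)
```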

Plausibly, a necessary condition is that the event should be describable within the state space of the theory. Thus, the state space of classical mechanics simply cannot describe an electron being in a superposition of two position states, and hence such a superposition is physically impossible. But this necessary condition is not sufficient, as Newtonian mechanics bans various transitions that can be described within the state space of classical mechanics.

So, we have a necessary condition and a sufficient condition for physical possibility relative to a physics theory. It would be nice to have a necessary and sufficient condition.

Monday, July 15, 2019

Probabilistic propensities and the Aristotelian view of time

Consider an item x with a half-life of one hour. Then over the period of an hour it has a 50% chance of decaying, while over the period of a second it only has a 0.02% chance of decaying. Imagine that x has no way of changing except by decaying, and that x is causally isolated from all outside influences. Don’t worry about Schroedinger's cat stuff: just take what I said at face value.

We are almost sure that x will one day decay (the probability of decaying approaches one as the length of time increases).
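
As a sanity check on these numbers, here is the standard exponential decay law that such a half-life presupposes:

```python
# Exponential decay with the one-hour half-life from above:
half_life_s = 3600.0

def p_decay(t_seconds):
    return 1 - 0.5 ** (t_seconds / half_life_s)

print(p_decay(3600.0))             # 0.5: a 50% chance over an hour
print(p_decay(1.0))                # ~0.000192, i.e. about 0.02% over a second
print(p_decay(3600.0 * 24 * 365))  # ~1.0: decay becomes almost sure over long spans
```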

Now imagine that everything other than x is annihilated. Since x was already isolated from all outside influences, this should not in any way affect x’s decay. Hence, we should still be almost sure that x will one day decay. Moreover, since what is outside of x did not affect x’s behavior, the propensities for decay should be unchanged by that annihilation: x has a 50% chance of decay in an hour and a 0.02% chance of decay in a second.

But this seems to mean that time is not the measure of change as Aristotle thought. For if time were the measure of change, then there would be no way to make sense of the question: “How long did it take for x to decay?”

Here is another way to make the point. On an Aristotelian theory of time, the length of time is defined by change. Now imagine that temporal reality consists of x and a bunch of analog clocks all causally isolated from x. The chances of decay of x make reference to lengths of time. Lengths of time are defined by change, and hence by the movements of the hands of the clocks. But if x is causally isolated from the clocks, its decay times should have nothing to do with the movements of the clocks. If God, say, accelerated or slowed down some of the clocks, that shouldn’t affect x’s behavior in any way, since x is isolated. But on an Aristotelian theory of time, it seems, such an isolation is impossible.

I think an Aristotelian can make one of two moves here.

First, perhaps the kinds of propensities that are involved in having an indeterministic half-life cannot be had by an isolated object: such objects must be causally connected to other things. No atom can be a causal island. So, even though physics doesn’t say so, the decay of an atom has a causal connection with the behavior of things outside the atom.

Second, perhaps any item that can have a half-life or another probabilistic propensity in isolation from other things has to have an internal clock—it has to have some kind of internal change—and the Aristotelian dictum that time is the measure of change should be understood in relation to internal time, not global time.

Friday, February 1, 2019

God, probabilities and causal propensities

Suppose a poor and good person is forced to flip a fair and indeterministic coin in circumstances where heads means utter ruin and tails means financial redemption. If either Molinism or Thomism is true, we would expect that, even without taking into account miracles:

  1. P(H)<P(T).

After all, God is good, and so he is more likely to try to get the good outcome for the person. (Of course, there are other considerations involved, so the boost in probability in favor of tails may be small.)

The Molinist can give this story. God knows how the coin would come out in various circumstances. He is more likely to ensure the occurrence of circumstances in which the subjunctive conditionals say that tails would come up. The Thomist, on the other hand, will say that God’s primary causation determines what effect the secondary creaturely causation has, while at the same time ensuring that the secondary causation is genuinely doing its causal job.

But given (1), how can we say that the coin is fair? Here is a possibility. The probabilities in (1) take God’s dispositions into account. But we can also look simply at the causal propensities of the coin. The causal propensities of the coin are equibalanced between heads and tails. In addition to the probabilities in (1), which take everything including God into account, we can talk of coin-grounded causal chances, which are basically determined by the ratios of strength in the causal propensities. And the coin-grounded causal chances are 1/2 for heads and 1/2 for tails. But given Molinism or Thomism, these chances are not wholly determinative of the probabilities and the frequencies in repeat experiments, since the latter need to take into account the skewing due to God’s preference for the good.

So we get two sets of probabilities: The all-things-considered probabilities P that take God into account and that yield (1) and the creatures-only-considered probabilities Pc on which:

  2. Pc(H)=Pc(T)=1/2.

Here, however, is something that I think is a little troubling about both the Molinist and Thomist lines. The creatures-only-considered probabilities are obviously close to the observed frequencies. Why? I think the Molinist and Thomist have to say this: They are close because God chooses to act in such ways that the actual frequencies are approximately proportional to the strengths of causal propensities that Pc is based on. But then the frequencies of coin toss outcomes are not directly due to the causal propensities of the coin, but only because God chooses to make the frequencies match. This doesn’t seem right and is a reason why I want to adopt neither Molinism nor Thomism but a version of mere foreknowledge.

Friday, January 18, 2019

Thomism, chance and cooperative providence

Thomists have two stories about how God can act providentially in the world. First, God can work simply miraculously, directly producing an effect that transcends the relevant created causal powers. Second, God can work cooperatively: whenever any finite causal agency is exercised, God intentionally cooperates with it through his primary causation, in such a way that it is up to God which of the causal agent’s natural effects is produced.

I think there is a difficult problem for cooperative divine agency. Suppose Alice is desperate for food for her children. She finds an indeterministic alien device which has the following property. If she presses the big button on it, the machine has probability 1/2 of producing enough food for a month for her family, and probability 1/2 of giving her a mild shock and turning off for a month.

Alice says a quick but sincere prayer and presses the button. Then, presumably:

  1. The probability that the machine will produce food is 1/2 conditionally on God not working miraculously.

But now notice:

  2. Necessarily, if God does not work miraculously, the machine will produce food if and only if God intentionally connaturally cooperates with the machine to produce food.

From (1) and (2) we can conclude:

  3. The probability that God will intentionally cooperate with the machine to produce food is 1/2 conditionally on God not working miraculously.

But imagine a different machine, where pressing the button has probability 1/2 of producing enough round pizza for a week and probability 1/2 of producing enough square pizza for a week. If Alice pressed the button on that machine, God, in acting cooperatively, would not have any significant reason to make the output of the machine come out one way or another.

In the round-or-square-pizza machine, we would expect the probability that God would cooperate to produce a particular outcome to be 1/2. But in the food-or-nothing machine, God does have a good reason to make the output of the machine be food: namely, God loves Alice and her family. We would expect the statistics for divine intentional cooperation to be different in the case of the two machines. But they are the same. In other words, it seems that God’s cooperative providence cannot depart from the statistics built into the natures of creatures. Yet that providence is fully under God’s voluntary control according to Thomism. This is puzzling.

If the Thomist says that God’s special providence is always exercised miraculously rather than cooperatively, the problem disappears. Absent special providential reasons, God has reason to follow the natural statistics of the machines. But if we allow that God sometimes exercises his special providence cooperatively, that should skew the statistics, and it cannot do that given the argument from (1) and (2) to (3).

Restricting special providence to miracles is a real option, but it destroys one of the advantages that Thomism has over competing theories.

Wednesday, May 16, 2018

Possibly giving a finite description of a nonmeasurable set

It is often assumed that one couldn’t finitely specify a nonmeasurable set. In this post I will argue for two theses:

  1. It is possible that someone finitely specifies a nonmeasurable set.

  2. It is possible that someone finitely specifies a nonmeasurable set and reasonably believes—and maybe even knows—that she is doing so.

Here’s the argument for (1).

Imagine we live in an uncountable multiverse where the universes differ with respect to some parameter V such that every possible value of V corresponds to exactly one universe in the multiverse. (Perhaps there is some branching process which generates a universe for every possible value of V.)

Suppose that there is a non-trivial interval L of possible values of V such that all and only the universes with V in L have intelligent life. Suppose that within each universe with V in L there runs a random evolutionary process, and that the evolutionary processes in different universes are causally isolated from each other.

Finally, suppose that for each universe with V in L, the chance that the first instance of intelligent life will be warm-blooded is 1/2.

Now, I claim that for every subset W of L, the following statement is possible:

  3. The set W is in fact the set of all the values of V corresponding to universes in which the first instance of intelligent life is warm-blooded.

The reason is that if some subset W of L were not a possible option for the set of all V-values corresponding to the first instance of intelligent life being warm-blooded, then that would require some sort of an interaction or dependency between the evolutionary processes in the different universes that rules out W. But the evolutionary processes in the different universes are causally isolated.

Now, let W be any nonmeasurable subset of L (I am assuming that there are nonmeasurable sets, say because of the Axiom of Choice). Then since (3) is possible, it follows that it is possible that the finite description “The set of values of V corresponding to universes in which the first instance of intelligent life is warm-blooded” describes W, and hence describes a nonmeasurable set. It is also plainly compossible with everything above that somebody in this multiverse in fact makes use of this finite description, and hence (1) is true.

The argument for (2) is more contentious. Enrich the above assumptions with the added possibility that the people in one of the universes have figured out that they live in a multiverse such as above: one parametrized by values of V, with an interval L of intelligent-life-permitting values of V, with random and isolated evolutionary processes, and with the chance of intelligent life being warm-blooded being 1/2 conditionally on V being in L. For instance, the above claims might follow from particularly elegant and well-confirmed laws of nature.

Given that they have figured this out, they can then let “Q” be an abbreviation for “The set of all values of V corresponding to universes where the first instance of intelligent life is warm-blooded.” And they can ask themselves: Is Q likely to be measurable or not?

The set Q is a randomly chosen subset of L. On the standard (product measure) understanding of how to probabilistically make sense of this “random choice” of subset, the event of Q being nonmeasurable is itself nonmeasurable (see the Sawin answer here). However, intuitively we would expect Q to be nonmeasurable. Terence Tao shares this intuition (see the paragraph starting “Intuitively”). His reason for the intuition is that if Q were measurable, then by something like the Law of Large Numbers, we would expect the intersection of Q with a subinterval I of L to have a measure equal to half of the measure of I, which would be in tension with the Lebesgue Density Theorem. This reasoning may not be precisifiable mathematically, but it is intuitively compelling. One might also just have a reasonable and direct intuition that the nonmeasurability is the default among subsets, and so a “random subset” is going to be nonmeasurable.
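
Here is a finite stand-in for that intuition (a hypothetical discretization, not the continuum case): flip a fair coin for each of N grid points and look at the density of the resulting “random subset” in various windows. The law of large numbers pushes every window’s density toward 1/2, which is what clashes with the 0-or-1 local densities that the Lebesgue Density Theorem demands of measurable sets:

```python
import random

# Flip a fair coin for each of N grid points; Q is the resulting "random subset".
N = 100_000
Q = [random.random() < 0.5 for _ in range(N)]

# Every window has density about 1/2 by the law of large numbers, whereas a
# measurable set would have local density 0 or 1 almost everywhere:
for a, b in [(0, N), (0, N // 10), (N // 3, N // 2)]:
    window = Q[a:b]
    print((a, b), sum(window) / len(window))   # each ~0.5
```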

So, the denizens of our multiverse can use these intuitions to reasonably conclude that Q is nonmeasurable. Hence, (2) is true. Can they leverage these intuitions into knowledge? That’s less clear to me, but I can’t rule it out.

Wednesday, March 28, 2018

A responsibility remover

Suppose soft determinism is true: the world is deterministic and yet we are responsible for our actions.

Now imagine a device that can be activated at a time when an agent is about to make a decision. The device reads the agent’s mind, figures out which action the agent is determined to choose, and then modifies the agent’s mind so the agent doesn’t make any decision but is instead compelled to perform the very action that they would otherwise have chosen. Call the device the Forcer.

Suppose you are about to make a difficult choice between posting a slanderous anonymous accusation about an enemy of yours that will go viral and ruin his life and not posting it. It is known that once the message is posted, there will be no way to undo the bad effects. Neither you nor I know how you will choose. I now activate the Forcer on you, and it makes you post the slander. Your enemy’s life is ruined. But you are not responsible for ruining it, because you didn’t choose to ruin it. You didn’t choose anything. The Forcer made you do it. Granted, you would have done it anyway. So it seems you have just had a rather marvelous piece of luck: you avoided culpability for a grave wrong and your enemy’s life is irreparably ruined.

What about me? Am I responsible for ruining your enemy’s life? Well, first, I did not know that my activation of the Forcer would cause this ruin. And, second, I knew that my activation of the Forcer would make no difference to your enemy: he would have been ruined given the activation if and only if he would have been ruined without it. So it seems that I, too, have escaped responsibility for ruining your enemy’s life. I am, however, culpable for infringing on your autonomy. However, given how glad you are that your enemy’s life is ruined without your having any culpability, no doubt you will forgive me.

Now imagine instead that you activated the Forcer on yourself, and it made you post the slander. Then for exactly the same reasons as before, you aren’t culpable for ruining your enemy’s life. For you didn’t choose to post the slander. And you didn’t know that activating the Forcer would cause this ruin, while you did know that the activation wouldn’t make any difference to your enemy—the effect of activating the Forcer on yourself would not affect whether the message would be posted. Moreover, the charge of infringing on autonomy has much less force when you activated the Forcer yourself.

It is true that by activating the Forcer you lost something: you lost the possibility of being praiseworthy for choosing not to post the slander. But that’s a loss that you might judge worthwhile.

So, given soft determinism, it is in principle possible to avoid culpability while still getting the exact same results whenever you don’t know prior to deliberation how you will choose. This seems absurd, and the absurdity gives us a reason to reject the compatibility of determinism and responsibility.

But the above story can be changed to worry libertarians, too. Suppose the Forcer reads off its patient’s mind the probabilities (i.e., chances) of the various choices, and then randomly selects an action with the probabilities of the various options exactly the same as the patient would have had. Then in activating the Forcer, it can still be true that you didn’t know how things would turn out. And while there is no longer a guarantee that things would turn out with the Forcer as they would have without it, it is true that activating the Forcer doesn’t affect the probabilities of the various actions. In particular, in the cases above, activating the Forcer does nothing to make it more likely that your enemy would be slandered. So it seems that once again activating the Forcer on yourself is a successful way of avoiding responsibility.

But while that is true, it is also true that if libertarianism is true, regular activation of the Forcer will change the shape of one’s life, because there is no guarantee that the Forcer will decide just like you would have decided. So while on the soft determinist story, regular use of the Forcer lets one get exactly the same outcome as one would otherwise have had, on the libertarian version, that is no longer true. Regular use of the Forcer on libertarianism should be scary—for it is only a matter of chance what outcome will happen. But on compatibilism, we have a guarantee that use of the Forcer won’t change what action one does. (Granted, one may worry that regular use of the Forcer will change one’s desires in ways that are bad for one. If we are worried about that, we can suppose that the Forcer erases one’s memory of using it. That has the disadvantage that one may feel guilty when one isn’t.)

I don’t know that libertarians are wholly off the hook. Just as the Forcer thought experiment makes it implausible to think that responsibility is compatible with determinism, it also makes it implausible to think that responsibility is compatible with there being precise objective chances of what choices one will make. So perhaps the libertarian would do well to adopt the view that there are no precise objective chances of choices (though there might be imprecise ones).

Monday, January 8, 2018

Counting and chance

A countably infinite number of people, including me, are about to roll fair indeterministic dice. What probability should I assign to rolling six?

Obviously, 1/6.

But suppose I describe the situation thus: “There are two equally sized groups of people: those who will roll six and those who won’t. How likely is it that I am in the former rather than the latter?” (After all, I know that infinitely many will roll six and infinitely many won’t, and that it’ll be the same infinity in both cases.) So why 1/6, instead of 1/2, or undefined?

Here’s what I want to say: “The objective chance of my rolling six is 1/6, and objective chances are king, in the absence of information posterior to the outcome.” Something like the Principal Principle should apply. And it should be irrelevant that there are infinitely many other people rolling dice.

If I say this, then I may have to deny both the self-sampling assumption and the self-indication assumption. For if I really consider myself to be a completely randomly chosen person in the set of die rollers, or in some larger set, in the self-indication cases, it seems I shouldn’t think it less likely that I rolled six than that I didn’t, since equal numbers did each.

It looks to me that we have two competing ways of generating probabilities: counting and objective chance. I used to think that counting trumped objective chance. Now I am inclined to think objective chance trumps counting, and counting counts for nothing, in the absence of objective chance.

Tuesday, December 12, 2017

Zero chance events

A standard thing to say in the philosophy of science about stochastic explanation questions is that one can give an answer in terms of the objective chance of the event, even when that chance is less than 1/2.

But consider the question: Why did this atom decay exactly at t1?

Here, the objective chance may well be zero. And surely that an event had zero chance of happening does nothing to explain the event. After all, that the decay at t1 had zero chance does not distinguish the atom’s decaying at t1 from the atom’s turning into a square circle at t1. And to explain something we minimally need to say something that distinguishes it from an impossibility.

Here, I think, the causal powers theorist can say something (even though I may just want to reject the presuppositions; see the Response to Objection 1, below). Stochastic systems have a plurality of causal powers for incompatible outcomes. The electron in a mixed-spin state may have both a causal power to have its spin measured as up and a causal power to have its spin measured as down. Normally, some of the causal powers are apt to prevail more than others, and hence have a greater chance than others. But even the weaker causal powers are there, and we can explain the event by citing them. The electron’s spin was measured as, say, up because it had a causal power to that outcome; had it been measured as, say, down, that would have been because it had a causal power to that outcome. We can give further detail here: we can say that one of these causal powers is stronger than the other. And the stronger causal power has, because it is stronger, a higher chance of prevailing. But even the weaker causal power can prevail, and when it does, we can explain the outcome in terms of it.

This story works just fine even when the chances are zero. The weaker causal power could be so weak that the chance associated with it has to be quantified as zero. But we can still explain the activation of the weaker causal power.

So, going back to the decay, we can say that the atom had a causal power to decay at t1, and that’s why it decayed at t1. That causal power was of minimal strength, and so the chance of the decay has to be quantified as zero. But we still have an explanation.

The causal powers story about the atom encodes information that the chances do not. The chances do not distinguish the atom’s turning into a square circle from the atom’s decaying exactly at t1. The causal powers do, since it has a power to decay but no power to turn into a square circle.

Objection 1: Let’s say that the atom has twice as high a chance of decaying over the interval of times [0, 2] as over the interval of times [0, 1]. How do we explain that in terms of causal powers, given that there are equally many (i.e., continuum many) causal powers to decay at precise times in [0, 2] as there are causal powers to decay at precise times in [0, 1]?

Response: It could be that just as the causal power story carries information the chance story does not, the chance story could carry information the causal power story does not, and both stories reflect aspects of reality.

Another story could be that there are causal powers associated with intervals as well as points of time, and the causal power to decay at a time in [0, 2] is twice as strong as the causal power to decay at a time in [0, 1]. There are difficulties here, however, with thinking about the fundamentality relations between the powers associated with different intervals. I fear that there is no avoiding an infinite sequence of causal powers that violates causal finitism, and I am inclined to reject the possibility of exact decay times—and hence reject the explanatory question I started this post with. I don’t see much hope for a measurement of an exact time after all. But someone with other commitments about finitism could have a story.

Objection 2: This is just like a dormitive power explanation of opium making someone sleepy.

Response: Opium’s dormitive power is fundamental or not. If opium has a fundamental dormitive power, then the dormitive power explanation is perfectly fine. That’s just the kind of explanation we have to have at the fundamental level. If the dormitive power explanation is not fundamental, then the explanation is correct but not as informative as an explanation in terms of more fundamental things would be.

Likewise, the power to decay at t1 either is or is not fundamental. If it is fundamental, then the explanation in terms of the power is perfectly fine. If it is not, then there is a more fundamental explanation. But probably the more fundamental explanation will also involve minimal strength powers with zero activation chances, too.

Monday, November 6, 2017

Statistically contrastive explanations of both heads and tails

Say that an explanation e of p rather than q is statistically contrastive if and only if P(p|e)>P(q|e).

For instance, suppose I rolled an indeterministic die and got a six. Then I can give a statistically contrastive explanation of why I rolled more than one (p) rather than rolling one (q). The explanation (e) is that I rolled a fair six-sided die. In that case: P(p|e)=5/6 > 1/6 = P(q|e). Suppose I had rolled a one. Then e would still have been an explanation of the outcome, but not a statistically contrastive one.

One might try to generalize the above remarks to conclude to this thesis:

  1. In indeterministic stochastic setups, there will always be a possible outcome that does not admit of a statistically contrastive explanation.

The intuitive argument for (1) is this. If one indeterministic stochastic outcome is p, either there is or is not a statistically contrastive explanation e of why p rather than not-p is the case. If there is no such statistically contrastive explanation, then the consequent of (1) is indeed true. Suppose that there is a statistically contrastive explanation e, and let q be the negation of p. Then P(p|e)>P(q|e). Thus, e is a statistically contrastive explanation of why p rather than q, but it is obvious that it cannot be a statistically contrastive explanation of why q rather than p.

The intuitive argument for (1) is logically invalid. For it only shows that e is not a statistically contrastive explanation for why q rather than p, while what needed to be shown is that there is no statistically contrastive explanation at all.

In fact, (1) is false. Consider this indeterministic stochastic situation: Alice flips a coin. There are two outcomes: heads and tails. But prior to the coin getting flipped, Bob uniformly chooses a random number r such that 0 < r < 1 and loads the coin in such a way that the chance of heads is r. Suppose that in the situation at hand r = 0.8. Let H be the heads outcome and T the tails outcome. Then here is a contrastive explanation for H rather than T:

  • e1: an unfair coin with chance 0.8 of heads was flipped.

Clearly P(H|e1)=0.8 > 0.2 = P(T|e1). But suppose that instead tails was obtained. We can give a contrastive explanation of that, too:

  • e2: an unfair coin with chance at least 0.2 of tails was flipped.

Given only e2, the chance of tails is somewhere between 0.2 and 1.0, with the distribution uniform. Thus, on average, given e2 the chance of tails will be 0.6: P(T|e2)=0.6. And P(H|e2)=1 − P(T|e2)=0.4. Thus, e2 is actually a statistically contrastive explanation of T. And note that something like this will work no matter what value r has as long as it’s strictly between 0 and 1.
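
A Monte Carlo check of P(T|e2) = 0.6, assuming only the setup described (r uniform on (0,1), the coin loaded with chance r of heads):

```python
import random

trials = tails = 0
while trials < 200_000:
    r = random.random()                # Bob's load: chance of heads is r
    if 1 - r < 0.2:                    # condition on e2: chance of tails >= 0.2
        continue
    trials += 1
    tails += random.random() < 1 - r   # flip the loaded coin
print(tails / trials)                  # ~0.6
```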

It might still be arguable that given indeterministic stochastic situations, something will lack a statistically contrastive explanation. For instance, we can give a statistically contrastive explanation of heads rather than tails, and a statistically contrastive explanation of tails rather than heads. But it does not seem that we can give a statistically contrastive explanation of why the coin was loaded exactly to degree 0.8, since that has zero probability. Of course, that’s an outcome of a different stochastic process than the coin flip one, so it doesn't support (1). And the argument needs to be more complicated than the invalid argument for (1).

Thursday, August 10, 2017

Uncountable independent trials

Suppose that I am throwing a perfectly sharp dart uniformly randomly at a continuous target. The chance that I will hit the center is zero.

What if I throw an infinite number of independent darts at the target? Do I improve my chances of hitting the center at least once?

Things depend on what size of infinity of darts I throw. Suppose I throw a countable infinity of darts. Then I don’t improve my chances: classical probability says that the union of countably many zero-probability events has zero probability.

What if I throw an uncountable infinity of darts? The answer is that the usual way of modeling independent events does not assign any meaningful probabilities to whether I hit the center at least once. Indeed, the event that I hit the center at least once is “saturated nonmeasurable”, i.e., it is nonmeasurable and every measurable subset of it has probability zero and every measurable superset of it has probability one.

Proposition: Assume the Axiom of Choice. Let P be any probability measure on a set Ω and let N be any non-empty event with P(N)=0. Let I be any uncountable index set. Let H be the subset of the product space Ω^I consisting of those sequences ω that hit N, i.e., ones such that for some i we have ω(i)∈N. Then H is saturated nonmeasurable with respect to the I-fold product measure P^I (and hence with respect to its completion).

One conclusion to draw is that the event H of hitting the center at least once in our uncountable number of throws in fact has a weird “nonmeasurable chance” of happening, one perhaps that can be expressed as the interval [0, 1]. But I think there is a different philosophical conclusion to be drawn: the usual “product measure” model of independent trials does not capture the phenomenon it is meant to capture in the case of an uncountable number of trials. The model needs to be enriched with further information that will then give us a genuine chance for H. Saturated nonmeasurability is a way of capturing the fact that the product measure can be extended to a measure that assigns any numerical probability between 0 and 1 (inclusive) one wishes. And one requires further data about the system in order to assign that numerical probability.

Let me illustrate this as follows. Consider the original single-case dart throwing system. Normally one describes the outcome of the system’s trials by the position z of the tip of the dart, so that the sample space Ω equals the set of possible positions. But we can also take a richer sample space Ω* which includes all the possible tip positions plus one more outcome, α, the event of the whole system ceasing to exist, in violation of the conservation of mass-energy. Of course, to be physically correct, we assign chance zero to outcome α.

Now, let O be the center of the target. Here are two intuitions:

  1. If the number of trials has a cardinality much greater than that of the continuum, it is very likely that O will result on some trial.

  2. No matter how many trials—even a large infinity—have been performed, α will not occur.

But the original single-case system based on the sample space Ω* does not distinguish O and α probabilistically in any way. Let ψ be a bijection of Ω* to itself that swaps O and α but keeps everything else fixed. Then P(ψ[A]) = P(A) for any measurable subset A of Ω* (this follows from the fact that the probability of O is equal to the probability of α, both being zero), and so with respect to the standard probability measure on Ω*, there is no probabilistic difference between O and α.

If I am right about (1) and (2), then what happens in a sufficiently large number of trials is not captured by the classical chances in the single-case situation. That classical probabilities do not capture all the information about chances is something we should already have known from cases involving conditional probabilities. For instance P({O}|{O, α}) = 1 and P({α}|{O, α}) = 0, even though O and α are on par.

One standard solution to the conditional probability case is infinitesimals. Perhaps P({O}) is an infinitesimal ι while P({α}) is exactly zero. In that case, we may indeed be able to make sense of (1) and (2). But infinitesimals are not a good model on other grounds. (See Section 3 here.)

Thinking about the difficulties with infinitesimals, I get this intuition: we want to get probabilistic information about the single-case event that has a higher resolution than is given by classical real-valued probabilities but lower resolution than is given by infinitesimals. Here is a possibility. Those subsets of the outcome space that have probability zero also get attached to them a monotone-increasing function from cardinalities to the set [0, 1]. If N is such a subset, and it gets attached to it the function f_N, then f_N(κ) tells us the probability that κ independent trials will yield at least one outcome in N.

We can then argue that f_N(κ) is always 0 or 1 for infinite κ. Here is why. Suppose f_N(κ)>0. Then κ must be infinite, since if κ is finite then f_N(κ) = 1 − (1 − P(N))^κ = 0 as P(N)=0. Moreover, the probabilities of independently missing N multiply across disjoint blocks of trials, so 1 − f_N(κ + κ) = (1 − f_N(κ))^2, and κ + κ = κ (assuming the Axiom of Choice), so that 1 − f_N(κ) = (1 − f_N(κ))^2, which implies that f_N(κ) is zero or one. We can come up with other constraints on f_N. For instance, if C is the union of A and B, then f_C(κ) is the greater of f_A(κ) and f_B(κ).

Such an approach could help get a solution to a different problem, the problem of characterizing deterministic causation. To a first approximation, the solution would go as follows. Start with the inadequate story that deterministic causation is chancy causation with chance 1. (This is inadequate, because in the original dart-throwing case, the chance of missing the center is 1, but throwing the dart does not deterministically cause one to hit a point other than the center.) Then say that deterministic causation is chancy causation such that the failure event F is such that f_F(κ)=0 for every cardinal κ.

But maybe instead of all this, one could just deny that there are meaningful chances to be assigned to events like the event of uncountably many trials missing or hitting the center of the target.

Sketch of proof of Proposition: The product space Ω^I is the space of all functions ω from I to Ω, with the product measure P^I generated by the measures of cylinder sets. The cylinder sets are product sets of the form A = ∏_{i∈I} A_i such that there is a finite J ⊆ I with A_i = Ω for i ∉ J, and the product measure of A is defined to be ∏_{i∈J} P(A_i).

First I will show that there is an extension Q of P^I such that Q(H)=0 (an extension of a measure is a measure on a larger σ-algebra that agrees with the original measure on the smaller σ-algebra). Any P^I-measurable subset of H will then have Q-measure zero, and hence will have P^I-measure zero since Q extends P^I.

Let Q_1 be the restriction of P to Ω − N (this is still normalized to 1 as N is a null set). Let Q_1^I be the product measure on (Ω − N)^I. Let Q be a measure on Ω^I defined by Q(A) = Q_1^I(A ∩ (Ω − N)^I). Consider a cylinder set A = ∏_{i∈I} A_i where there is a finite J ⊆ I such that A_i = Ω whenever i ∉ J. Then
Q(A) = ∏_{i∈J} Q_1(A_i − N) = ∏_{i∈J} P(A_i − N) = ∏_{i∈J} P(A_i) = P^I(A).
Since P^I and Q agree on cylinder sets, by the definition of the product measure, Q is an extension of P^I.

To show that H is saturated nonmeasurable, we now only need to show that any P^I-measurable set in the complement of H must have probability zero. Let A be any P^I-measurable set in the complement of H. Then A is of the form {ω ∈ Ω^I : F(ω)}, where F(ω) is a condition involving only coordinates of ω numbered by a fixed countable set of indices from I (i.e., there is a countable subset J of I and a subset B of Ω^J such that F(ω) if and only if ω|J is a member of B, where ω|J is the restriction of ω to J). But no such condition can exclude the possibility that a coordinate of ω outside that countable set of indices falls in N: any ω satisfying F can be modified outside J so as to hit N while still satisfying F. Hence no such set A lies in the complement of H, unless the set is empty. And that’s all we need to show.