Wednesday, May 25, 2022

Anti-Bayesian update and scoring rules in infinite spaces

Bayesian update on evidence E is transitioning from a credence function P to the credence function P(⋅∣E). Anti-Bayesian update on E is moving from P to P(⋅∣Ec) (where Ec is the complement of E). Whether or not one thinks that Bayesian update is rationally required, it is clear that Bayesian update is better than anti-Bayesian update.

But here is a fun fact (assuming the Axiom of Choice). For any scoring rule on an infinite space, there is a finitely additive probability function P and an event E such that 0 < P(E) < 1 where P(⋅∣E) and P(⋅∣Ec) get exactly the same score everywhere in the probability space. It follows that when dealing with finitely additive probabilities on infinite spaces, a scoring rule will not always be able to distinguish Bayesian update from anti-Bayesian update. This is a severe limitation of scoring rules as a tool for evaluating the accuracy of a credence function in infinite cases.

Here’s a proof of the fun fact. Let s be a scoring rule. Say that a credence function is maximally opinionated provided that it assigns 0 or 1 to every event. It is known that on an infinite space there are then two different maximally opinionated finitely additive probability functions p and q such that s(p) = s(q) everywhere. Let P = (p+q)/2 be their average. Let E be an event such that p(E) = 1 and q(E) = 0 (such an event exists because p and q are maximally opinionated and yet different). Then P(⋅∣E) = p and P(⋅∣Ec) = q while P(E) = 1/2. Hence the results of conditionalizing on E and on Ec have exactly the same score.
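The step from P = (p+q)/2 to P(⋅∣E) = p can be spelled out for an arbitrary event A. The following is only a sketch of that step, using finite additivity together with p(E) = 1 and q(E) = 0; it assumes the amsmath `align*` environment for typesetting:

```latex
\begin{align*}
P(E) &= \tfrac12\,p(E) + \tfrac12\,q(E) = \tfrac12, \\
P(A \mid E) &= \frac{P(A \cap E)}{P(E)}
  = \frac{\tfrac12\,p(A \cap E) + \tfrac12\,q(A \cap E)}{\tfrac12}
  = p(A \cap E) + q(A \cap E) = p(A).
\end{align*}
```

The last equality holds because q(A ∩ E) ≤ q(E) = 0 and p(A ∩ Ec) ≤ p(Ec) = 0, so p(A ∩ E) = p(A). The same computation with E and Ec swapped gives P(⋅∣Ec) = q.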

One might take this as some evidence that finite additivity is not good enough.

Zsolt Nagy said...

If E and Ē (complement of E) are the complements of each other (yes, Ē is the complement of E, but also E is the complement of Ē), then in what way is the Bayesian Update better than the Anti-Bayesian Update?!?
If that were actually the case, then I guess that the Anti-Bayesian Update is better than the Anti-Anti-Bayesian [simple Bayesian] Update.

Alexander R Pruss said...

I don't follow. The idea behind anti-Bayesian update is this. You observe E. So you replace your credences with credences conditionalized on not-E!

An immediate consequence is that although you observed E, your credence for E after the update is zero.

For instance, suppose you roll a die, and a friend observes that the result was even. What do you do? You conditionalize on odd, and assign probability 1/3 to each of 1, 3 and 5.
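The die example can be sketched in a few lines of Python. This is only an illustration for a finite space; the helper names `conditionalize` and `anti_bayesian_update` are made up for this sketch and are not taken from any library.

```python
from fractions import Fraction


def conditionalize(prior, event):
    """Return prior(. | event) for a finite prior given as {outcome: probability}."""
    total = sum(p for w, p in prior.items() if w in event)
    if total == 0:
        raise ValueError("cannot conditionalize on a probability-zero event")
    return {w: (p / total if w in event else Fraction(0)) for w, p in prior.items()}


def anti_bayesian_update(prior, event):
    """Anti-Bayesian update: on observing `event`, conditionalize on its complement."""
    complement = set(prior) - set(event)
    return conditionalize(prior, complement)


# Fair six-sided die; the friend reports that the result was even.
die = {face: Fraction(1, 6) for face in range(1, 7)}
even = {2, 4, 6}

bayes = conditionalize(die, even)       # credence 1/3 on each of 2, 4, 6
anti = anti_bayesian_update(die, even)  # credence 1/3 on each of 1, 3, 5
```

After the anti-Bayesian update the total credence in the observed event (even) is zero, which is the consequence noted above.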

Michael N said...

Hi Alex, interesting observation. I think this can be brought to bear more directly on the standard accuracy-based argument for conditionalization. That argument is based on showing the *plan* to conditionalize has highest expected accuracy. In general, an update plan is a function from a partition to the space of probability measures.

Take a binary partition {E, not-E} for simplicity. Say an update plan U is anti-Bayesian for P if U(E) = P(. | not-E) and U(not-E) = P(. | E). Let C be the conditionalization plan, e.g. C(E) = P(. | E). In general, the P-expected value of an update plan U is

E_P(U) = \int s(w, U(E(w))) P(dw),

where E(w) is that cell of the partition containing w. So, with P as in your post and U anti-Bayesian, we have

E_P(U) = \int_E s(w, P(. | not-E))P(dw) + \int_{not-E} s(w, P(. | E)) P(dw) = \int_E s(w, q)P(dw) + \int_{not-E} s(w, p) P(dw).

And if C is the conditionalization plan for P, then

E_P(C) = \int_E s(w, P(. | E))P(dw) + \int_{not-E} s(w, P(. | not-E)) P(dw) = \int_E s(w, p)P(dw) + \int_{not-E} s(w, q) P(dw).

So E_P(U) = E_P(C) because s(p) = s(q).

If s is strictly proper that argument doesn't work, which is why the argument for conditionalization plans works in finite spaces.
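The equality E_P(U) = E_P(C) above can be made explicit. Because s(w, p) = s(w, q) for every w, the integrands can be swapped cell by cell (a sketch in the same notation, assuming the amsmath `align*` environment):

```latex
\begin{align*}
E_P(U) &= \int_{E} s(w, q)\,P(dw) + \int_{\neg E} s(w, p)\,P(dw) \\
       &= \int_{E} s(w, p)\,P(dw) + \int_{\neg E} s(w, q)\,P(dw) = E_P(C).
\end{align*}
```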

Michael N said...

The expected-accuracy argument for conditionalization does work for countably additive probabilities in infinite spaces:

https://www.sciencedirect.com/science/article/abs/pii/S016771522200030X

There is also an accuracy-dominance argument for conditionalization in finite spaces:

https://philpapers.org/rec/NIEAAC

Perhaps this argument fails to generalize to finitely additive probabilities in infinite spaces as well.

Michael N said...

In fact, yes, it's easy to see that Theorem 3 of the second linked paper fails to generalize, using your example.

In the terminology of that paper, the credal strategy (P,C) is probabilistic and conditionalizing, but the credal strategy (P,U), where U is anti-Bayesian, is not conditionalizing. By your example, the score of (P,C) is the same as the score of (P,U). So, if part (c) of Theorem 3 generalizes, then there is no credal strategy that strongly dominates (P,C). But then part (b) fails. So either (c) or (b) fails to generalize.

Zsolt Nagy said...

On the other hand, proper answers and responses have a much better chance (wink, wink) of properly addressing my questions.

I mean another proper question of mine and not the previous one. Ahhh, silly and fallible me.
It is just so easy to be fooled and tricked by one's own intellect, I guess.

So here is that proper question of mine for you, administrator:
Why and how exactly is the "regular" conditionalizing or Bayesian Update supposed to be "better" than the "anti" conditionalizing or "Anti"-Bayesian Update?
I would rather consider the "Anti"-Bayesian Update to be just a "complementary" Bayesian Update on par with the original Bayesian Update, since the two updates are complementary to each other, just as E and not-E are complementary to each other.
But hey, if you have good reasons to think, that the one is "better" than the other one, then please, enlighten me about how and why that is exactly supposed to be the case, administrator.
Thank you.

Zsolt Nagy said...

(1) For evidence E, the Bayesian Update on evidence E is "better" than the Anti-Bayesian Update on not-E.
(2) Sometimes absence of evidence is evidence of absence or to say, that sometimes not-E' is the evidence E: E = not-E'
From 1 and 2:
(3) Therefore sometimes for evidence not-E', the Bayesian Update on evidence not-E' is "better" than the Anti-Bayesian Update on not-not-E'(=E').

Zsolt Nagy said...

Prosecutor: The probability of Sally winning the lottery is astronomically low by random chance. But since she won the lottery and the probability of her winning the lottery by fair random chance is astronomically low and the probability of her winning the lottery by cheating is much higher than the probability of her winning the lottery by fair random chance, therefore it is more probable, that Sally has cheated than that she has not cheated given that she has actually won the lottery.
P(Sally winning the lottery|Sally cheating the lottery) » P(Sally winning the lottery|Sally not cheating and the lottery being fair and random)
=> P(Sally cheating|Sally winning the lottery) » P(Sally not cheating and the lottery being fair and random|Sally winning the lottery)

I know, prosecutor, assessing evidence can sometimes be very difficult. But please, before you make a fool of yourself and of anybody believing in such a fallacy, at least try to consult a real expert on statistics, and also consider the prior probabilities of the hypotheses under consideration while assessing the relevant evidence for those hypotheses.
Otherwise you might put an innocent person into jail for no good reasons.

Making A Math Murder
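The role of the prior in the lottery example can be illustrated with Bayes' theorem. All numbers below are hypothetical and chosen only to show the effect; they do not come from the comment or from any real case.

```python
from fractions import Fraction

# Hypothetical inputs (assumptions for illustration only).
p_cheat = Fraction(1, 10**9)           # prior probability that Sally cheats
p_win_given_cheat = Fraction(1, 2)     # chance of winning if she cheats
p_win_given_fair = Fraction(1, 10**7)  # chance of winning a fair lottery

# The likelihood ratio is huge, which is what the prosecutor points at.
likelihood_ratio = p_win_given_cheat / p_win_given_fair

# Bayes' theorem: P(cheat | win) = P(win | cheat) P(cheat) / P(win).
p_win = p_win_given_cheat * p_cheat + p_win_given_fair * (1 - p_cheat)
posterior_cheat = p_win_given_cheat * p_cheat / p_win

print(float(likelihood_ratio), float(posterior_cheat))
```

With these made-up numbers the likelihood ratio is in the millions, yet the posterior probability of cheating stays below 1% because the prior is so small. Comparing likelihoods alone does not settle which hypothesis is more probable given the win.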

Alexander R Pruss said...

Zsolt:

I am sorry. I really just don't get the question. Anti-Bayesian update would mean that you get E as evidence, and you conclude that E didn't happen. Let's say you're not sure it's raining. You go outside, see that it's raining, and therefore you assign probability 1 to its NOT raining. If that's not absurd, I don't know what more to say.

Zsolt Nagy said...

No, Alexander, the Anti-Bayesian update doesn't mean, that you get E as evidence and you conclude, that E didn't happen.
According to your own post here the Anti-Bayesian update is supposed to be this:
Anti-Bayesian update on E is moving from P to P(⋅∣not-E) (where not-E is the complement of E).
What does even "moving" mean in this context?!?
If we just simply identify the Anti-Bayesian update on E with this P(⋅∣not-E), then the Anti-Bayesian update on E is just that - the conditional probability on not-E.
Do you even know, what the "regular" Bayesian update is supposed to be?!?
"Bayesian inference" from Wikipedia
Let's say we consider the following example argument from Wade Tisthammer alias Maverick Christian:
*1. If it is raining, then my car is wet.
2. It is raining.
3. Therefore, my car is wet.

Or to put that into our current context:
1) P(my car is wet|it is raining) is high.
2) P(it is raining) is 1.
3) Therefore, P(my car is wet) is high.

What is then the Anti-Bayesian update on <"it is raining"> according to you, Alexander?
I think, the Anti-Bayesian update on <"it is raining"> is the conditional probability P(my car is wet|it is not raining) here according to you and your blogpost here.
So then what do we assign to such a conditional probability?!? Is that high or low?
Do the prior probabilities P(it is raining) and P(it is raining) matter for the previous questions?
The prior probabilities do matter here, if and only if P(my car is wet AND it is raining) ≠ P(my car is wet)·P(it is raining) ("Independence (probability theory)" from Wikipedia)

Zsolt Nagy said...

Do you, Alexander, understand what I'm talking about here?
I certainly don't understand what you mean by "observing that <"it is actually raining"> AND simultaneously assigning to the probability of <"it is NOT raining"> the value 1".
Is that supposed to be a proper response to my previously given argument?
(1) For evidence E, the Bayesian Update on evidence E is "better" than the Anti-Bayesian Update on not-E.
(2) Sometimes absence of evidence is evidence of absence or to say, that sometimes not-E' is the evidence E: E = not-E'
From 1 and 2:
(3) Therefore sometimes for evidence not-E', the Bayesian Update on evidence not-E' is "better" than the Anti-Bayesian Update on not-not-E'(=E').
If so, then here is my nonsensical response to your nonsensical (and uncharitable) response:
(1) For evidence E, the Bayesian Update on evidence E is "better" than the Anti-Bayesian Update on not-E.
(2) Sometimes absence of evidence e.g. <"it is raining"> is evidence of absence or to say, that sometimes not-<"it is raining"> is the evidence E: E = not-<"it is raining"> = <"it is not raining">
From 1 and 2:
(3) Therefore sometimes for evidence <"it is not raining">, the Bayesian Update on evidence <"it is not raining"> is "better" than the Anti-Bayesian Update on not-<"it is not raining"> (= <"it is raining">).
I feel so bamboozled by your response. Why are you calling yourself an "analytical philosopher"?
Seriously, you shouldn't call yourself that by giving such nonsensical responses.

*Sidenote: What if your car is in the garage, while it's raining?
And I don't wonder about that any more given him learning logic and rationality from such persons, who shouldn't have called themselves "analytical philosophers" in the first place or who are calling themselves wrongly and falsely that currently.

Alexander R Pruss said...

I made up the term "anti-Bayesian update" so I get to say what it means. And what it means is that when you get evidence E, you modify your credences just like the Bayesian would have modified them on getting the evidence of not-E. So, you observe that it's raining, and you update your credences just like the Bayesian would if the Bayesian observed that it's not raining. In other words, the anti-Bayesian on getting evidence E pretends that they are a Bayesian getting evidence not-E. Yup, that's the dumbest thing ever. That's the point.

Zsolt Nagy said...

Really? That's supposed to be the dumbest thing ever?
If so, then I guess your Anti-Bayesian Update would be on par with the prosecutor's fallacy - they are simply irrelevant, as is most of the work from those wrongly and falsely claiming to be "analytical philosophers". And therefore, your whole blog post here is just irrelevant.

Given the transpositions for material conditionals: For every proposition P and proposition Q it follows, that [P → Q] ⇔ [~Q → ~P].
Is this also true or in some way analogous for conditional probabilities?
Is it the case, for evidence E and hypothesis H, that P(H|E) = P(~E|~H) or at least P(H|E) ≈ P(~E|~H)?
And what about the material implication [P → Q] ⇔ [~P ∨ Q]?
Is it then P(H|E) = P(~E ∪ H) or at least P(H|E) ≈ P(~E ∪ H)?
And further with De Morgan’s law [P → Q] ⇔ [~P ∨ Q] ⇔ [~(P ∧ ~Q)]?
Is it then P(H|E) = P(~(E ∩ ~H)) or at least P(H|E) ≈ P(~(E ∩ ~H))?
Relevant questions upon more relevant questions.

Zsolt Nagy said...

P(E→H) = P(~E∪H) = P(~E)+P(H)-P(~E∩H)
=1-P(E)+P(E∩H)=1-P(E)+P(E∩H)/P(E)•P(E)
=1+(P(H|E)-1)•P(E)

So,
P(H|E)=1+(P(E→H)-1)/P(E) and
P(E→H)=1+(P(H|E)-1)•P(E)
Further in the case of P(E) = 1
we get the following:
P(H|E) = P(E→H) = P(~H→~E)
= P(~E∪H) = P(~(E∩~H))
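The identity derived above can be checked numerically. The joint distribution below is a hypothetical example chosen for this sketch; it also shows that P(H|E) and P(~E|~H) need not agree when P(E) < 1.

```python
from fractions import Fraction

# Hypothetical joint distribution; keys are (E is true, H is true).
joint = {
    (True, True): Fraction(1, 10),
    (True, False): Fraction(3, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def prob(pred):
    """Probability of the set of (e, h) cells satisfying pred."""
    return sum(p for (e, h), p in joint.items() if pred(e, h))


p_e = prob(lambda e, h: e)
p_h_given_e = prob(lambda e, h: e and h) / p_e
p_material = prob(lambda e, h: (not e) or h)  # P(E→H) = P(~E ∪ H)
p_note_given_noth = prob(lambda e, h: (not e) and (not h)) / prob(lambda e, h: not h)

# The identity P(E→H) = 1 + (P(H|E) - 1)·P(E) from the derivation above.
identity_holds = p_material == 1 + (p_h_given_e - 1) * p_e
```

Here P(H|E) = 1/4, P(E→H) = 7/10 and P(~E|~H) = 4/7, so the identity holds while neither the contrapositive conditional probability nor the probability of the material conditional equals P(H|E).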