Wednesday, May 25, 2022

Anti-Bayesian update and scoring rules in infinite spaces

Bayesian update on evidence E is the transition from a credence function P to the credence function P(⋅∣E). Anti-Bayesian update on E is the move from P to P(⋅∣Ec) (where Ec is the complement of E). Whether or not one thinks that Bayesian update is rationally required, it is clear that Bayesian update is better than anti-Bayesian update.
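For concreteness, here is a minimal Python sketch of the two updates on a finite space (purely illustrative, not part of the formal result below):

```python
# Illustrative sketch: Bayesian vs. anti-Bayesian update on a fair
# six-sided die, with evidence E = "the roll is even".

def conditionalize(P, A):
    """Return P(. | A) for a credence function P on a finite space."""
    pA = sum(pr for w, pr in P.items() if w in A)
    return {w: (pr / pA if w in A else 0.0) for w, pr in P.items()}

P = {w: 1 / 6 for w in range(1, 7)}   # uniform prior credences
E = {2, 4, 6}                         # evidence received
Ec = {1, 3, 5}                        # complement of E

print(conditionalize(P, E))   # Bayesian update on E: 1/3 each on 2, 4, 6
print(conditionalize(P, Ec))  # anti-Bayesian update on E: 1/3 each on 1, 3, 5
```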

But here is a fun fact (assuming the Axiom of Choice). For any scoring rule on an infinite space, there is a finitely additive probability function P and an event E with 0 < P(E) < 1 such that P(⋅∣E) and P(⋅∣Ec) get exactly the same score at every point of the probability space. It follows that when dealing with finitely additive probabilities on infinite spaces, a scoring rule will not always be able to distinguish Bayesian update from anti-Bayesian update. This is a severe limitation of scoring rules as a tool for evaluating the accuracy of a credence function in infinite cases.

Here’s a proof of the fun fact. Let s be a scoring rule. Say that a credence function is maximally opinionated provided that it assigns 0 or 1 to every event. It is known that there are then two different maximally opinionated finitely additive probability functions p and q such that s(p) = s(q) everywhere. Let P = (p+q)/2 be their average. Let E be an event such that p(E) = 1 and q(E) = 0 (such an event exists because p and q are maximally opinionated and yet different). Then P(E) = 1/2, and for any event A we have P(A∣E) = (p(A∩E) + q(A∩E))/(2P(E)) = p(A∩E) = p(A), since q(A∩E) ≤ q(E) = 0 and p(A∩Ec) ≤ p(Ec) = 0. Thus P(⋅∣E) = p, and by the same reasoning P(⋅∣Ec) = q. Hence conditionalization on E and on Ec gets exactly the same score everywhere.
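A finite toy version of the averaging construction can be checked mechanically. This is a sketch only: the fun fact itself needs an infinite space and merely finitely additive p and q, and in a finite space a strictly proper score would still tell p and q apart. But the identity P(⋅∣E) = p and P(⋅∣Ec) = q already holds in the toy case:

```python
from itertools import chain, combinations

# Toy check (not the infinite-space construction itself): with p and q
# maximally opinionated and P = (p+q)/2, conditionalizing P on E recovers p
# and conditionalizing on the complement recovers q.

omega = [0, 1, 2]
p = {0: 1.0, 1: 0.0, 2: 0.0}   # point mass at 0
q = {0: 0.0, 1: 1.0, 2: 0.0}   # point mass at 1
P = {w: (p[w] + q[w]) / 2 for w in omega}
E = {0}                         # p(E) = 1, q(E) = 0, P(E) = 1/2
Ec = set(omega) - E

def prob(mu, A):
    return sum(mu[w] for w in A)

def cond(mu, A, B):
    return prob(mu, A & B) / prob(mu, B)

# Enumerate all events (subsets of omega).
events = [set(s) for s in chain.from_iterable(
    combinations(omega, r) for r in range(len(omega) + 1))]

assert all(abs(cond(P, A, E) - prob(p, A)) < 1e-12 for A in events)
assert all(abs(cond(P, A, Ec) - prob(q, A)) < 1e-12 for A in events)
print("P(.|E) = p and P(.|Ec) = q, with P(E) =", prob(P, E))
```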

One might take this as some evidence that finite additivity is not good enough.

16 comments:

  1. If E and Ē (the complement of E) are each other's complements (yes, Ē is the complement of E, but also E is the complement of Ē), then in what way is the Bayesian Update better than the Anti-Bayesian Update?!?
    If that were actually the case, then I guess the Anti-Bayesian Update would be better than the Anti-Anti-Bayesian [i.e., plain Bayesian] Update.

  2. I don't follow. The idea behind anti-Bayesian update is this. You observe E. So you replace your credences with credences conditionalized on not-E!

    An immediate consequence is that although you observed E, your credence for E after the update is zero.

    For instance, suppose you roll a die, and a friend observes that the result was even. What do you do? You conditionalize on odd, and assign probability 1/3 to each of 1, 3 and 5.

  3. This comment has been removed by a blog administrator.

  4. Hi Alex, interesting observation. I think this can be brought to bear more directly on the standard accuracy-based argument for conditionalization. That argument is based on showing that the *plan* to conditionalize has the highest expected accuracy. In general, an update plan is a function from a partition to the space of probability measures.

    Take a binary partition {E, not-E} for simplicity. Say an update plan U is anti-Bayesian for P if U(E) = P(. | not-E) and U(not-E) = P(. | E). Let C be the conditionalization plan, i.e. C(E) = P(. | E). In general, the P-expected score of an update plan U is

    E_P(U) = \int s(w, U(E(w))) P(dw),

    where E(w) is the cell of the partition containing w. So, with P as in your post and U anti-Bayesian, we have

    E_P(U) = \int_E s(w, P(. | not-E))P(dw) + \int_{not-E} s(w, P(. | E)) P(dw) = \int_E s(w, q)P(dw) + \int_{not-E} s(w, p) P(dw).

    And if C is the conditionalization plan for P, then

    E_P(C) = \int_E s(w, P(. | E))P(dw) + \int_{not-E} s(w, P(. | not-E)) P(dw) = \int_E s(w, p)P(dw) + \int_{not-E} s(w, q) P(dw).

    So E_P(U) = E_P(C), because s(p) = s(q) everywhere.

    If s is strictly proper, that argument doesn't go through, which is why the accuracy argument for conditionalization plans works in finite spaces.
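    For contrast, here is a quick numeric sketch of my own (using the Brier penalty as a strictly proper score, lower = better) of how the expected comparison comes out in a finite space, where conditionalization strictly beats the anti-Bayesian plan:

    ```python
    # Sketch: conditionalization plan vs. anti-Bayesian plan on a fair die,
    # scored by the strictly proper Brier penalty (lower is better).

    P = {w: 1 / 6 for w in range(1, 7)}
    E = {2, 4, 6}
    Ec = {w for w in P if w not in E}

    def conditionalize(P, A):
        pA = sum(P[w] for w in A)
        return {w: (P[w] / pA if w in A else 0.0) for w in P}

    def brier(w, c):
        """Brier penalty of credence c at world w."""
        return sum((c[x] - (1.0 if x == w else 0.0)) ** 2 for x in c)

    def expected_penalty(P, plan):
        """plan maps each cell of {E, Ec} to a posterior credence."""
        return sum(P[w] * brier(w, plan[frozenset(E if w in E else Ec)])
                   for w in P)

    cond_plan = {frozenset(E): conditionalize(P, E),
                 frozenset(Ec): conditionalize(P, Ec)}
    anti_plan = {frozenset(E): conditionalize(P, Ec),
                 frozenset(Ec): conditionalize(P, E)}

    print(expected_penalty(P, cond_plan))  # 2/3: conditionalization plan
    print(expected_penalty(P, anti_plan))  # 4/3: anti-Bayesian, strictly worse
    ```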

  5. The expected-accuracy argument for conditionalization does work for countably additive probabilities in infinite spaces:

    https://www.sciencedirect.com/science/article/abs/pii/S016771522200030X

    There is also an accuracy-dominance argument for conditionalization in finite spaces:

    https://philpapers.org/rec/NIEAAC

    Perhaps this argument fails to generalize to finitely additive probabilities in infinite spaces as well.

  6. This comment has been removed by the author.

  7. In fact, yes, it's easy to see that Theorem 3 of the second linked paper fails to generalize, using your example.

    In the terminology of that paper, the credal strategy (P,C) is probabilistic and conditionalizing, but the credal strategy (P,U), where U is anti-Bayesian, is not conditionalizing. By your example, the score of (P,C) is the same as the score of (P,U). So, if part (c) of Theorem 3 generalizes, then there is no credal strategy that strongly dominates (P,C). But then part (b) fails. So either (c) or (b) fails to generalize.

  8. Deleting my comments, administrator, won't answer my questions though.
    On the other hand, proper answers and responses have a much better chance (wink, wink) of properly addressing my questions.

    How about addressing just one question of mine properly?
    I mean another proper question of mine, not the previous one. Ahhh, silly and fallible me.
    It is just so easy to be fooled and tricked by one's own intellect, I guess.

    So here is that proper question of mine for you, administrator:
    Why and how exactly is the "regular" conditionalization, or Bayesian Update, supposed to be "better" than the "anti" conditionalization, or "Anti"-Bayesian Update?
    I would rather consider the "Anti"-Bayesian Update to be just a "complementary" Bayesian Update on a par with the original Bayesian Update, since the two updates are complementary to each other just as E and not-E are complementary to each other.
    But hey, if you have good reasons to think that the one is "better" than the other, then please enlighten me about how and why exactly that is supposed to be the case, administrator.
    Thank you.

  9. (1) For evidence E, the Bayesian Update on evidence E is "better" than the Anti-Bayesian Update on not-E.
    (2) Sometimes absence of evidence is evidence of absence; that is, sometimes not-E' is the evidence E: E = not-E'.
    From 1 and 2:
    (3) Therefore, sometimes for evidence not-E', the Bayesian Update on evidence not-E' is "better" than the Anti-Bayesian Update on not-not-E' (= E').

  10. Prosecutor: The probability of Sally winning the lottery by random chance is astronomically low. But she won the lottery, the probability of her winning by fair random chance is astronomically low, and the probability of her winning by cheating is much higher. Therefore it is more probable that Sally cheated than that she did not cheat, given that she actually won the lottery.
    P(Sally winning the lottery | Sally cheating) » P(Sally winning the lottery | Sally not cheating and the lottery being fair and random)
    => P(Sally cheating | Sally winning the lottery) » P(Sally not cheating | Sally winning the lottery)

    I know, prosecutor, assessing evidence can sometimes be very difficult. But please, before you make a fool out of yourself and anybody believing in such a fallacy, at least try to consult a real expert on statistics, and also try to consider the prior probabilities of the hypotheses under consideration when assessing the relevant evidence for them.
    Otherwise you might put an innocent person into jail for no good reason.
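    A small numeric sketch of the base-rate point (all the numbers here are invented purely for illustration):

    ```python
    # Hypothetical numbers, purely to illustrate that the prior matters.
    p_cheat = 1e-6                # prior: very few entrants cheat (assumed)
    p_win_given_cheat = 0.5       # a cheater wins half the time (assumed)
    p_win_given_fair = 1e-8       # fair odds of winning (assumed)

    # Bayes' theorem: P(cheat | win)
    p_win = (p_win_given_cheat * p_cheat
             + p_win_given_fair * (1 - p_cheat))
    p_cheat_given_win = p_win_given_cheat * p_cheat / p_win

    print(p_cheat_given_win)  # ~0.98 here, but only because of the chosen
                              # prior; with p_cheat = 1e-12 the posterior
                              # drops to ~5e-5 despite the same likelihoods.
    ```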

    Making A Math Murder

  11. Zsolt:

    I am sorry. I really just don't get the question. Anti-Bayesian update would mean that you get E as evidence, and you conclude that E didn't happen. Let's say you're not sure it's raining. You go outside, see that it's raining, and therefore you assign probability 1 to its NOT raining. If that's not absurd, I don't know what more to say.

  12. No, Alexander, the Anti-Bayesian update doesn't mean that you get E as evidence and conclude that E didn't happen.
    According to your own post here, the Anti-Bayesian update is supposed to be this:
    Anti-Bayesian update on E is moving from P to P(⋅∣not-E) (where not-E is the complement of E).
    What does "moving" even mean in this context?!?
    If we simply identify the Anti-Bayesian update on E with P(⋅∣not-E), then the Anti-Bayesian update on E is just that: the conditional probability on not-E.
    Do you even know what the "regular" Bayesian update is supposed to be?!?
    "Bayesian inference" from Wikipedia
    Let's say we consider the following example argument from Wade Tisthammer alias Maverick Christian:
    *1. If it is raining, then my car is wet.
    2. It is raining.
    3. Therefore, my car is wet.

    Or to put that into our current context:
    1) P(my car is wet|it is raining) is high.
    2) P(it is raining) is 1.
    3) Therefore, P(my car is wet) is high.

    What is then the Anti-Bayesian update on <"it is raining"> according to you, Alexander?
    I think, the Anti-Bayesian update on <"it is raining"> is the conditional probability P(my car is wet|it is not raining) here according to you and your blogpost here.
    So then what do we assign to such a conditional probability?!? Is that high or low?
    Do the prior probabilities P(it is raining) and P(my car is wet) matter for the previous questions?
    The prior probabilities do matter here if and only if P(my car is wet AND it is raining) ≠ P(my car is wet)·P(it is raining) (see "Independence (probability theory)" on Wikipedia).

  13. Do you, Alexander, understand what I'm talking about here?
    I certainly don't understand what you mean by "observing that it is actually raining AND simultaneously assigning the value 1 to the probability of it NOT raining".
    Is that supposed to be a proper response to my previously given argument?
    (1) For evidence E, the Bayesian Update on evidence E is "better" than the Anti-Bayesian Update on not-E.
    (2) Sometimes absence of evidence is evidence of absence; that is, sometimes not-E' is the evidence E: E = not-E'.
    From 1 and 2:
    (3) Therefore sometimes for evidence not-E', the Bayesian Update on evidence not-E' is "better" than the Anti-Bayesian Update on not-not-E'(=E').
    If so, then here is my nonsensical response to your nonsensical (and uncharitable) response:
    (1) For evidence E, the Bayesian Update on evidence E is "better" than the Anti-Bayesian Update on not-E.
    (2) Sometimes the absence of evidence, e.g. of <"it is raining">, is evidence of absence; that is, sometimes not-<"it is raining"> is the evidence E: E = not-<"it is raining"> = <"it is not raining">.
    From 1 and 2:
    (3) Therefore sometimes for evidence <"it is not raining">, the Bayesian Update on evidence <"it is not raining"> is "better" than the Anti-Bayesian Update on not-<"it is not raining"> (= <"it is raining">).
    I feel so bamboozled by your response. Why do you call yourself an "analytical philosopher"?
    Seriously, you shouldn't call yourself that while giving such nonsensical responses.

    *Sidenote: What if your car is in the garage while it's raining?
    Yeah, weirdly enough, Wade Tisthammer couldn't answer that question properly either.
    And I don't wonder about that any more, given that he learned logic and rationality from persons who shouldn't have called themselves "analytical philosophers" in the first place, or who wrongly and falsely call themselves that currently.

  14. I made up the term "anti-Bayesian update", so I get to say what it means. And what it means is that when you get evidence E, you modify your credences just like the Bayesian would have modified them on getting the evidence not-E. So, you observe that it's raining, and you update your credences just like the Bayesian would if the Bayesian observed that it's not raining. In other words, the anti-Bayesian on getting evidence E pretends that they are a Bayesian getting evidence not-E. Yup, that's the dumbest thing ever. That's the point.

  15. Really? That's supposed to be the dumbest thing ever?
    If so, then I guess your Anti-Bayesian Update would be on a par with the prosecutor's fallacy - they are simply irrelevant, like most of the work of those wrongly and falsely claiming to be "analytical philosophers". And therefore your whole blog post here is just irrelevant.

    How about a more relevant analysis of conditional probabilities then?
    Given transposition for material conditionals: for every proposition P and proposition Q it follows that [P → Q] ⇔ [~Q → ~P].
    Is this also true, or in some way analogous, for conditional probabilities?
    Is it the case, for evidence E and hypothesis H, that P(H|E) = P(~E|~H), or at least P(H|E) ≈ P(~E|~H)?
    And what about the material implication [P → Q] ⇔ [~P ∨ Q]?
    Is it then P(H|E) = P(~E ∪ H) or at least P(H|E) ≈ P(~E ∪ H)?
    And further with De Morgan’s law [P → Q] ⇔ [~P ∨ Q] ⇔ [~(P ∧ ~Q)]?
    Is it then P(H|E) = P(~(E ∩ ~H)) or at least P(H|E) ≈ P(~(E ∩ ~H))?
    Relevant questions upon more relevant questions.

  16. P(E→H) = P(~E∪H) = P(~E) + P(H) - P(~E∩H)
    = 1 - P(E) + P(E∩H)   [since P(H) - P(~E∩H) = P(E∩H)]
    = 1 - P(E) + P(H|E)•P(E)   [since P(E∩H) = P(H|E)•P(E)]
    = 1 + (P(H|E) - 1)•P(E)

    So,
    P(H|E) = 1 + (P(E→H) - 1)/P(E) and
    P(E→H) = 1 + (P(H|E) - 1)•P(E).
    Further, in the case of P(E) = 1 we get the following:
    P(H|E) = P(E→H) = P(~H→~E) = P(~E∪H) = P(~(E∩~H))


    Just in case someone was wondering about this.
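    And a quick numeric check of the identity (an illustration with an arbitrary made-up joint distribution over the four cells):

    ```python
    # Sanity check of P(E→H) = 1 + (P(H|E) - 1)·P(E) on made-up numbers.
    p_EH, p_EnH, p_nEH, p_nEnH = 0.2, 0.1, 0.3, 0.4   # E∩H, E∩~H, ~E∩H, ~E∩~H

    p_E = p_EH + p_EnH                # P(E)
    p_H_given_E = p_EH / p_E          # P(H|E)
    p_impl = p_nEH + p_nEnH + p_EH    # P(~E ∪ H) = P(E→H)

    assert abs(p_impl - (1 + (p_H_given_E - 1) * p_E)) < 1e-12
    print(p_impl, 1 + (p_H_given_E - 1) * p_E)   # both 0.9
    ```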
