Suppose that L is an unknown natural number (1, 2, 3, ...) and, as far as we know, all positive integers are equally likely candidates for L. The number L is input into a machine, and pressing a button makes the machine display a natural number drawn uniformly at random between 1 and L.
Consider the probability distribution for L. Initially, there is no well-defined distribution. The information we have is that L is some number from 1 (inclusive) to infinity (exclusive), but there is no meaningful uniform measure on that set.
You press the button once and get a number x1. You press it again and get x2.
After pressing the button once, you know that L≥x1. But you still don't have a well-defined posterior distribution over values of L. (You can try to generate one by supposing L is chosen uniformly at random between 1 and M and then taking the limit as M goes to infinity. This won't work: the probability of drawing x1 is 1/k when L=k, so the posterior weight on each candidate k≥x1 is proportional to 1/k, and the harmonic series diverges. In the limit you get the unwelcome conclusion that for all y we have P(L>y)=1.)
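Here's a quick numerical sketch of that failed limit (my own illustration, assuming the machine draws uniformly as above): take L uniform on {1,...,M}, condition on the first draw, and watch P(L>y) creep toward 1 as M grows.

```python
from math import fsum

def tail_given_one(x1, y, M):
    """P(L > y | first draw is x1), with L uniform on {1, ..., M} a priori.

    Drawing x1 has probability 1/k when L = k >= x1, so the posterior
    weight on each candidate k in {x1, ..., M} is proportional to 1/k.
    """
    total = fsum(1.0 / k for k in range(x1, M + 1))
    tail = fsum(1.0 / k for k in range(max(y + 1, x1), M + 1))
    return tail / total

# For any fixed y, the tail probability grows toward 1 as M does:
for M in (10**3, 10**4, 10**5, 10**6):
    print(M, tail_given_one(5, 100, M))
```

The convergence is only logarithmic in M, but for every fixed y the limit is 1, so the limiting "posterior" assigns probability 1 to every tail and is not a distribution.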
But once you get the second data point, you do have a well-defined posterior distribution over values of L: the posterior weight on each candidate k≥max(x1,x2) is now proportional to 1/k², and the sum of 1/k² converges. Approximately (for large x1, x2 and x), the posterior probability that L≥x will be max(x1,x2)/x, provided max(x1,x2)≤x (and of course if x≤max(x1,x2), the probability that L≥x is just 1). Thus, with probability approximately 1/2, we can say that L is no more than twice as large as the larger of x1 and x2.
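The max(x1,x2)/x approximation is easy to check numerically. This is a minimal sketch, again assuming uniform draws, with the infinite sums truncated at an arbitrary cutoff:

```python
from math import fsum

def tail_given_two(x1, x2, x, cutoff=10**6):
    """P(L >= x | draws x1 and x2), under the improper uniform prior on L.

    The pair (x1, x2) has probability 1/k^2 when L = k >= max(x1, x2),
    and the sum of 1/k^2 converges, so the posterior is a genuine
    distribution. The infinite sums are truncated at `cutoff`.
    """
    m = max(x1, x2)
    if x <= m:
        return 1.0  # L >= max(x1, x2) is certain
    total = fsum(1.0 / k**2 for k in range(m, cutoff))
    tail = fsum(1.0 / k**2 for k in range(x, cutoff))
    return tail / total

# With draws 40 and 100, P(L >= 200) should be close to 100/200 = 1/2:
print(tail_given_two(40, 100, 200))
```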
This is curious. You don't have a well-defined posterior distribution with one data point, but with two you get one. And then as you gather more and more data points, you get standard Bayesian convergence: with n data points whose maximum is m, the posterior probability that L≥x is approximately (m/x)^(n−1), so with enough data points you can be fairly confident that L is pretty close to the largest of your data points.
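The convergence can be sketched by simulation (my own illustration; the true value L = 1000, the 20 draws, and the truncation cutoff are all arbitrary choices): with n draws, the posterior weight on L = k is proportional to 1/k^n for k at or above the sample maximum, so the mass piles up just above that maximum.

```python
import random
from math import fsum

def tail_given_n(draws, x, cutoff=10**5):
    """P(L >= x | draws), improper uniform prior on L, sums truncated at cutoff."""
    n = len(draws)
    m = max(draws)
    if x <= m:
        return 1.0
    total = fsum(k**-n for k in range(m, cutoff))
    tail = fsum(k**-n for k in range(x, cutoff))
    return tail / total

random.seed(0)
L = 1000  # the "true" value, hidden from the observer
draws = [random.randint(1, L) for _ in range(20)]
m = max(draws)
# After 20 draws, most of the posterior mass sits within ~10% above the maximum:
print(m, tail_given_n(draws, int(1.1 * m)))
```

With n = 20 the approximation (m/x)^(n−1) gives (1/1.1)^19 ≈ 0.16 for the tail beyond 1.1m, i.e. over 80% of the posterior mass lies between the sample maximum and 1.1 times it.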
I suppose this is yet another one of those phenomena where the unconditional probabilities are undefined, but the conditional ones are defined.