Comments on James' Empty Blog: Probability, prediction and verification V: Commotion in the comments

>the new record is still likely to be marginally l...

2006-02-07T17:39:00.000+00:00

>the new record is still likely to be marginally less extreme than the measured temperature!

Did you mean the new record is marginally more likely to be less extreme than the measured temperature?

crandles

Arun,It doesn't matter what the real temperature d...

2006-02-06T02:13:00.000+00:00

Arun,

It doesn't matter what the real temperature distribution is. The measurement error is symmetrical for any distribution of real temperatures. The distribution of measured temperatures will be broader than the real temperatures. Like in my simple example, a real temp of 4.5 with equiprobable +-0.5C errors gives a symmetric observational distribution over 4-5C, and a skewed distribution over inputs 4.5-5.5 with symmetrical errors gives a skewed distribution over 4-6C. However, an observed temperature of 5C does not imply a symmetric distribution over real temperatures 4.5-5.5C, in the same way as an observation of 6C does not imply a symmetric real distribution over 5.5-6.5C because the latter (6.5C) did not happen at all!

The Wikipedia Bayes pages have a few more similar examples, such as making inferences from medical tests. Just because a test has a false positive rate of 20% does not mean that someone with a positive result has an 80% chance of having the disease!
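
To make the medical-test point concrete, here is a minimal sketch of the Bayes' theorem calculation. The prevalence and sensitivity figures are hypothetical, chosen only for illustration; the comment itself specifies nothing beyond the 20% false positive rate.

```python
def posterior_disease(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) from Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: 1% prevalence, a test that never misses a real case,
# and the 20% false positive rate mentioned in the comment.
rare = posterior_disease(prevalence=0.01, sensitivity=1.0, false_positive_rate=0.2)

# Same test, but applied where half the population has the disease.
common = posterior_disease(prevalence=0.5, sensitivity=1.0, false_positive_rate=0.2)

print(f"rare disease:   {rare:.3f}")
print(f"common disease: {common:.3f}")
```

The same test gives very different answers depending on the prior prevalence, which is the point: the error rate of the test alone does not fix the probability of disease.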

And yes, under this simple description of errors, if you had 100 independent thermometers and took the average, then this would give you an observation with only 0.1C error. However, you still need a prior and Bayes' theorem in order to form a probabilistic estimate of the real temperature! Even if you are now very confident that you really have smashed the old record, the new record is still likely to be marginally less extreme than the measured temperature!
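
A rough sketch of both claims, assuming a 1C error on each thermometer (implied by the 0.1C figure for 100 of them) and, purely for illustration, a Gaussian prior on the real temperature. The prior mean, prior spread and the reading below are hypothetical numbers, not taken from the discussion.

```python
import math


def standard_error(sigma, n):
    """Error of the average of n independent readings, each with error sigma."""
    return sigma / math.sqrt(n)


def gaussian_posterior(prior_mean, prior_sd, obs, obs_sd):
    """Posterior mean and sd for a Gaussian prior and a Gaussian observation error."""
    weight = prior_sd ** 2 / (prior_sd ** 2 + obs_sd ** 2)  # weight given to the data
    post_mean = prior_mean + weight * (obs - prior_mean)
    post_sd = math.sqrt(weight) * obs_sd
    return post_mean, post_sd


obs_sd = standard_error(sigma=1.0, n=100)  # error of the 100-thermometer average

# Hypothetical prior for the day (mean 5C, sd 3C) and a hypothetical
# averaged reading of -4C that appears to smash a record low.
post_mean, post_sd = gaussian_posterior(prior_mean=5.0, prior_sd=3.0, obs=-4.0, obs_sd=obs_sd)

print(f"error of the average: {obs_sd:.2f}C")
print(f"posterior mean: {post_mean:.3f}C, posterior sd: {post_sd:.3f}C")
```

Under these assumptions the posterior mean sits just on the prior's side of the reading: the shift is tiny because the averaged observation is so precise, but it is not zero.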

Yet another question - if I had hundreds of your m...

2006-02-05T19:28:00.000+00:00

Yet another question - if I had hundreds of your min-max thermometers, then I would not need any knowledge of the a priori temperature distribution; I'd simply take the average of all the readings, since these are all Gaussian distributed about the true value?

James,Still trying to understand what the meaning ...

2006-02-05T18:23:00.000+00:00

James,

Still trying to understand what the error bounds on your thermometer mean.

Is it that for a uniform distribution of calibration temperatures as determined by a precise thermometer, your thermometer shows a normal distribution of error, with zero mean error?

>For the n-th time.>You need to have a theory to d...

2006-02-05T17:09:00.000+00:00

>For the n-th time.
>You need to have a theory to do this.

Wolfgang,

Is your answer to the question you seem to want to ask - 'Is there a theory?' always a boolean Yes/No or is there a range from knowing nothing about a system to knowing everything about it?

If you know nothing about a system, I don't think you would dare dream about trying to estimate probabilities without doing some investigation of the system first. If you know everything about the system then you can go about trying to estimate probabilities.

So at both extremes, I think we seem to agree. What about the situation when we know a bit about the system but not everything?

crandles

Arun,"that the error on each day's measurement is ...

2006-02-05T07:23:00.000+00:00

Arun,

"that the error on each day's measurement is well-characterised as a standard gaussian deviate"

On what sample of days was measurement made to verify this standard gaussian deviate?

It doesn't matter. Assume it was checked across a huge range of artificially-generated temperatures, with the real temperature measured precisely and compared to my thermometer. What it means is that for a given real temperature (and therefore when integrated across all real temperatures) the measurement discrepancy of my thermometer has zero mean with specified variance. It's simply not the case that

"measuring error is symmetric with zero mean"

implies that

"for a given observation, the real temperature is as likely to be higher as lower."

In order to evaluate the latter statement, we also have to consider the prior distribution of real temperatures. I know it sounds a little counterintuitive at first but that's the way probability works.

A simple discretised version: assume my thermometer rounds up or down randomly with 50% probability to the nearest degree. I have two bowls of water, with exact temperatures 4.5C and 5.5C. I choose the former bowl with probability p and the latter with probability q=1-p. If my thermometer says 5C, what is the probability that I chose the cooler bowl?

There are 4 cases of (true,obs) to consider with their associated probabilities:

(4.5,4) p/2
(4.5,5) p/2
(5.5,5) q/2
(5.5,6) q/2

Of the cases where I see 5C, exactly p/(p+q) = p of them are the cooler bowl, even though for any particular bowl, the observation is as likely to be warmer as cooler. In fact if we see 6C then we _know_ the real temp was only 5.5C, not 6.5C, because the prior probability of the latter is zero by construction.

The continuous case works similarly.
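
The two-bowl example can be checked by direct enumeration. This is only a sketch: the function name is made up for illustration, and exact fractions are used so the result can be compared with p directly.

```python
from fractions import Fraction

COOL = Fraction(9, 2)    # bowl at 4.5C
WARM = Fraction(11, 2)   # bowl at 5.5C


def posterior_cooler_bowl(p, observed):
    """P(cooler bowl | thermometer reading) for the two-bowl example.

    p is the prior probability of choosing the 4.5C bowl; the thermometer
    rounds up or down to the nearest degree with probability 1/2 each.
    """
    q = 1 - p
    half = Fraction(1, 2)
    # Joint probabilities of (true temperature, observation), as listed above.
    joint = {
        (COOL, 4): p * half,
        (COOL, 5): p * half,
        (WARM, 5): q * half,
        (WARM, 6): q * half,
    }
    evidence = sum(pr for (true, obs), pr in joint.items() if obs == observed)
    cooler = sum(pr for (true, obs), pr in joint.items()
                 if obs == observed and true == COOL)
    return cooler / evidence


p = Fraction(1, 4)
print(posterior_cooler_bowl(p, 5))  # equals the prior p
print(posterior_cooler_bowl(p, 6))  # 0: a reading of 6C can only come from the 5.5C bowl
```

Note that the function divides by zero if the reading is impossible under the prior (for example p = 0 with a reading of 4C), which is itself a reminder that the prior matters.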

I'd like an explanation of what this means:"that t...

2006-02-04T15:14:00.000+00:00

I'd like an explanation of what this means:

"that the error on each day's measurement is well-characterised as a standard gaussian deviate"

On what sample of days was measurement made to verify this standard gaussian deviate? "Statistics for every day in the last 30 years? Every February in the last 30 years? Every February 2nd when there had been rain on the previous day (as was the case this time)?"

What you're arguing is that the error on each day's measurement is not characterised as a standard gaussian deviate; especially when the temperature falls below the previous record low.

> Unless and until God tells you what the "real" p...

2006-02-04T13:19:00.000+00:00

> Unless and until God tells you what the "real" prior is from which reality was actually chosen

Yes, and there is always a 50% chance that everything is just a dream and we are all wrong.
Or is the probability that we live in The Matrix only 25% ?

If I watch the evening news I sometimes think it's 100%.

Which is it ?

And after all there is a small chance that 12 is a prime number, because once my daughter was not able to factor it when she was 6 years old.

CIP,This decision implies a roughly uniform prior ...

2006-02-04T05:03:00.000+00:00

CIP,

This decision implies a roughly uniform prior on the mass. Don't worry, I'm not going to hold you to it in detail or anything, but the point is that there has to be a threshold at which you are indifferent, and a uniform prior puts that at X=1.3eV (I'm assuming you mean "bigger part of the interval" not "the section that is greater than X"). Perhaps someone else might say that there are many more small particles than large ones, and their density is uniform in log-space, and to them the point of indifference would be 0.5eV (don't bother telling me this is silly - I know nothing about particle physics, it's just an example). So far, so subjective.
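
The two indifference points quoted above are just the medians of the two priors on the 0.1-2.5eV range, which a short sketch can confirm (the function names are illustrative):

```python
import math


def uniform_median(lo, hi):
    """Point splitting a uniform prior on [lo, hi] into two equal halves."""
    return (lo + hi) / 2


def log_uniform_median(lo, hi):
    """Point splitting a prior that is uniform in log-space on [lo, hi]."""
    return math.sqrt(lo * hi)  # midpoint in log-space is the geometric mean


print(uniform_median(0.1, 2.5))      # about 1.3 eV
print(log_uniform_median(0.1, 2.5))  # about 0.5 eV
```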

If someone came along with some more information (say a measurement of 1eV, with an uncertainty of 0.2eV), you could (and would, assuming you are rational) update your priors. As the evidence increases, your different distributions would in fact converge.
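
As a sketch of that convergence, the update can be done numerically on a grid. The Gaussian form of the measurement error and the grid resolution are assumptions made for illustration; the priors are the uniform and log-uniform ones from the example, and the measurement is the 1eV with 0.2eV uncertainty mentioned above.

```python
import math

LO, HI, STEPS = 0.1, 2.5, 2400  # mass range in eV and an arbitrary grid size


def grid():
    """Midpoints of STEPS equal-width cells covering [LO, HI]."""
    width = (HI - LO) / STEPS
    return [LO + (i + 0.5) * width for i in range(STEPS)]


def median(masses, weights):
    """Median of a discretised (unnormalised) distribution."""
    total = sum(weights)
    running = 0.0
    for m, w in zip(masses, weights):
        running += w
        if running >= total / 2:
            return m
    return masses[-1]


def likelihood(m, measured=1.0, sigma=0.2):
    """Assumed Gaussian measurement error around the true mass m."""
    return math.exp(-0.5 * ((m - measured) / sigma) ** 2)


masses = grid()
uniform_prior = [1.0 for _ in masses]
log_uniform_prior = [1.0 / m for m in masses]

# Bayes' theorem on the grid: posterior is proportional to prior times likelihood.
post_uniform = [p * likelihood(m) for p, m in zip(uniform_prior, masses)]
post_log = [p * likelihood(m) for p, m in zip(log_uniform_prior, masses)]

prior_gap = median(masses, uniform_prior) - median(masses, log_uniform_prior)
posterior_gap = median(masses, post_uniform) - median(masses, post_log)

print(f"gap between medians before the measurement: {prior_gap:.3f} eV")
print(f"gap between medians after the measurement:  {posterior_gap:.3f} eV")
```

The gap shrinks considerably after a single measurement but does not vanish, so the two people still hold slightly different estimates.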

But the point of my example is there is no getting away from the fact that there is always a subjective component to each of your estimates, and your posteriors will never truly coincide (even when the difference has become entirely negligible in practice). Unless and until God tells you what the "real" prior is from which reality was actually chosen, you have no option but to acknowledge that there will always be a subjective component in your estimate. Even Lubos is not God, and his opinion is also subjective!

You might feel happier if you don't try to think of bayesian probability as directly telling you "the truth" about the real world but instead as describing and formalising your imperfect understanding of the real world. It's not that the world is probabilistic, but that you are uncertain.

It is quite deliberate that many of my examples have explicitly concerned decision-making, since when faced with the need to make a decision, the frequentist attitude of "I can't give an answer until someone tells me the prior" is simply not an adequate response. A bayesian approach gives you the tools to make decisions, which of course will not be perfect, but which will improve as you iteratively incorporate knowledge into your prior. It's not about generating a grand theory of everything, but about dealing with the real world.

I'm not sure that your bayesian joke tells me much about whether you understand what I'm trying to say...

James - Another angle on it: you have the choice o...

2006-02-04T02:55:00.000+00:00

James - Another angle on it: you have the choice of two equally expensive and time-consuming experiments, one of which searches the space between 0.1eV to X, and the other covers X to 2.5eV. Does your decision about which experiment to perform depend in any way on the value of X, or do you just flip a coin irrespective of the value of X?

I would choose the big side of X, because I don't know anything else, or, more realistically, ask Lubos and other experts. And your point is?

what would I guess?Heck, I can't even remember the...

2006-02-04T02:47:00.000+00:00

what would I guess?

Heck, I can't even remember the question. If your point is that one should take into account any prior knowledge you have, I don't think either Wolfgang or I would disagree. My argument was that prior knowledge of that sort is a kind of a theory, and that if you want to quantify it, you are in effect assuming some kind of frequency or measure in your theory.

You, on the other hand, have not told me whether or not I correctly understood your explanation of Bayesian reasoning.

Sorry,I really am "typo man". I should have typed:...

2006-02-04T02:46:00.000+00:00

Sorry,
I really am "typo man". I should have typed:

"The electron neutrino is not necessarily an eigenstate and the theory would tell you what distribution to expect for the mass!"

above, instead of the nonsense I produced. These comment windows on blogger are too small for me.

Your description of a bayesian is an extremely nai...

2006-02-04T02:28:00.000+00:00

Your description of a bayesian is an extremely naive caricature, a straw-man entirely of your own creation. Perhaps this helps you to justify your prejudice, but if instead you were prepared to look at the subject with an open mind you might find you learn something useful. If you can manage the doublethink required to outwardly reject bayesianism while simultaneously using it, then good luck to you. You are clearly not alone in this.

> The frequentist would throw their hands up in ho...

2006-02-04T02:19:00.000+00:00

> The frequentist would throw their hands up in horror at how ill-posed this plane problem is, but what would they actually do?

I do not know what a "frequentist" does, but I can tell you what a physicist would do: Try to come up with a reasonable theory, e.g. by talking to people etc., instead of relying on some arbitrary "probabilities".

By the way, the reason I like to discuss simple examples is not because I want to duck and weave, but because simple examples make it easier to explain an idea to your readers.

And it turned out that your neutrino problem was indeed a great example. The mass of the electron neutrino is not necessarily an eigenstate and the theory would tell you what distribution to expect for the mass!

And I am sure the Cargo-Cult people gave up after a while; but they never understood anything.
And I am sure a lot of people bought internet stocks in 1999 and 2000, because their Bayesian priors made it plausible to buy stocks which were going up in previous years and they may have thought they understood something in an uncertain market - until March 2000.

I hope you were not one of them. But I see that you try to understand climate change by studying financial markets.
Good luck with that!

What would you guess? Can you imagine any circumst...

2006-02-04T02:02:00.000+00:00

What would you guess? Can you imagine any circumstances in which such a decision could be made on purely objective grounds, by any expert?

Another angle on it: you have the choice of two equally expensive and time-consuming experiments, one of which searches the space between 0.1eV to X, and the other covers X to 2.5eV. Does your decision about which experiment to perform depend in any way on the value of X, or do you just flip a coin irrespective of the value of X? Or do you just stay in bed all day, saying you have no grounds for making a decision either way?

James, you say I never answered your neutrino ques...

2006-02-04T01:49:00.000+00:00

James, you say I never answered your neutrino question, but I think I answered pretty clearly that I didn't know enough to answer sensibly. If I were forced to answer, I would of course guess - what else could I do?

Meanwhile, Lubos has pointed out that the electron neutrino isn't a mass eigenstate, so that arbitrarily accurate measurements of its mass really should give a distribution.

I have attempted to absorb your lesson in Bayesian probability. You can check whether I actually learned anything from my post "So, a Bayesian and a Frequentist go into a Bar".

Wolfgang,You started out by saying that probabilit...

2006-02-04T01:00:00.000+00:00

Wolfgang,

You started out by saying that probability made no sense in cases such as the mass of the neutrino. You then decided you could estimate such probabilities. You still don't seem to understand that such an estimate necessarily requires subjective judgement. That's simply the way the world works.

Your childish jibes about "cargo cult science" suggest you are more interested in point-scoring than trying to understand. Of course in your problem it would be reasonable to think there was a non-zero likelihood of a plane landing on the next day. After it failed to land, each day the bayesian would revise their estimate downwards until after a few days they would give up waiting. This is of course the precise opposite of "cargo cult science" where the belief is maintained in the face of evidence to the contrary.
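
One very simple way to illustrate this downward revision is Laplace's rule of succession. The model here is an assumption made for the sketch, not something specified in the discussion: each day is treated as an exchangeable yes/no event with a uniform Beta(1,1) prior on the landing rate.

```python
from fractions import Fraction


def predictive_landing(landings, no_shows, a=1, b=1):
    """P(plane lands tomorrow) under a Beta(a, b) prior on the daily landing rate.

    With a = b = 1 (an assumed uniform prior) this is the rule of succession.
    """
    return Fraction(landings + a, landings + no_shows + a + b)


# Three observed landings, followed by k consecutive days with no plane.
estimates = [predictive_landing(3, k) for k in range(6)]

for k, estimate in enumerate(estimates):
    print(f"after {k} empty days: {estimate}")
```

The estimate starts at 4/5 after the three landings and falls with every empty day, so a waiting threshold is eventually crossed. A different prior would change the numbers but not the direction of the revision.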

You are correct that there is no frequentist approach to this problem (or indeed any problem where the prior is not explicitly or implicitly defined). Nevertheless, we have to make multiple decisions every day regarding such problems. Bayesianism provides the mechanism for making such decisions.

The frequentist would throw their hands up in horror at how ill-posed this plane problem is, but what would they actually do? They either go and wait, or they do not. There is no "null decision" - simply two alternatives, one of which has to be chosen. On what basis would they make that decision?

> You comment that "you need to have a theory" see...

2006-02-03T23:21:00.000+00:00

> Your comment that "you need to have a theory" seems to be where the prior is concealed

If you want to call "prior" what I call theory, this is fine with me.
I really do not care if you put me in the Bayesian or frequentist category.

But, I think it is very misleading to call a theory of physics a "prior".
Lubos explained (much better than I could) what a theory of neutrinos really looks like.
This is very different from somebody who tries to fit some model from some data (he or she does not really understand).

Please, think about my example 2 again. You have three very good data points (a plane landed at 2pm +- 10min at point B the last three days).
As long as you have no theory (e.g. this is a commercial airport, commercial airlines have flight plans etc.) you cannot estimate a probability for a plane to land tomorrow.
If you do, you are a practitioner of cargo-cult science, as described by Feynman.

Best regards,
Bayesian Man

Wolfgang,I'm disappointed that you have not attemp...

2006-02-03T23:02:00.000+00:00

Wolfgang,

I'm disappointed that you have not attempted to clarify what you mean.

Your comment that "you need to have a theory" seems to be where the prior is concealed, but you won't admit this openly. Since you started off by saying that such probabilities make no sense, you must be rather confused.

PPS: Thank you for the link!

2006-02-03T15:07:00.000+00:00

PPS: Thank you for the link!

> How do you go from measurements with uncertainti...

2006-02-03T15:05:00.000+00:00

> How do you go from measurements with uncertainties, to a distribution on the parameter in question?

For the n-th time.
You need to have a theory to do this.
In your example you specified the theory: neutrino mass, which is a constant in a physical theory, plus measurement errors (you need to have a theory about your measurement device etc.)

I suggested thinking about a simple example, like counting birthdays, with a simple theory.

I am at my job now and really have to stop this discussion.
Feel free to write one more comment or a whole post about what a moron and wannabee I am.

I see that Lubos showed up on CIP's blog, maybe you want to discuss with him. I think it may be easier with him, because he likes to put people into categories just like you do.

PS: By the way, I really am a wannabee 8-)

Wolfgang,This is all very trivial and point-dodgin...

2006-02-03T13:44:00.000+00:00

Wolfgang,

This is all very trivial and point-dodging.

The fundamental question is: How do you go from measurements with uncertainties, to a distribution on the parameter in question?

Of course the obvious answer is that you can't. Your comments remain incoherent and inconsistent.

There are two obvious ways to resolve this disagreement. You can either agree that any probabilistic estimate of a physical parameter (fixed but unknown, like the mass of a neutrino) is necessarily bayesian and relies on a subjective prior, or you can try to demonstrate a counterexample.

So far, you have both said that such a probabilistic estimate makes no sense, and also claimed that you can generate one without recourse to a subjective prior. Whether or not the approach implied by the first comment is a sensible one, the second statement is simply false.

Sorry, another typo:Narrows from N to n should hav...

2006-02-03T13:26:00.000+00:00

Sorry, another typo:
Narrows from N to n should have been
Narrows from N to N+n

You make N measurements which gives you a mean and...

2006-02-03T13:24:00.000+00:00

You make N measurements which gives you a mean and variance and a confidence interval.
You make n more measurements which gives you a new mean and variance and a new confidence interval. I will not calculate it here because your comments do not take LaTeX, but you can look it up.

IF you are dealing with measurement errors, i.e.
you have a theory, you will find that the confidence interval narrows from N to n, assuming that N and n
are sufficiently large.
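
For reference, the narrowing can be sketched with the usual large-sample interval for a mean. The known per-measurement error, the 95% level and the sample sizes below are hypothetical choices for illustration; the comparison is between N measurements and N+n measurements in total.

```python
import math


def ci_half_width(sigma, count, z=1.96):
    """Half-width of a normal-approximation confidence interval for the mean."""
    return z * sigma / math.sqrt(count)


sigma = 1.0       # assumed per-measurement error
N, n = 100, 300   # hypothetical sample sizes

before = ci_half_width(sigma, N)
after = ci_half_width(sigma, N + n)

print(f"half-width with N measurements:   {before:.3f}")
print(f"half-width with N+n measurements: {after:.3f}")
```

Quadrupling the number of measurements halves the interval, since the width scales as one over the square root of the count.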

If you want to frame this example in Bayesian priors so be it, go ahead. I have no problem with it.

This was really the last time I will write a comment here, because I need to go to work.

Wolfgang, There still seems to be quite a gulf bet...

2006-02-03T13:13:00.000+00:00

Wolfgang,

There still seems to be quite a gulf between your vague claims and any actual demonstration of what you are on about. Telling me to look it up is not adequate. Remember the maxim that if you can't explain it, then you don't understand it.

I'm still hoping for some credible explanation of what this means in practice:

if you have reason to believe that the neutrino mass is in the range m = 0.1-2.5eV you can calculate probabilities.

This is because you have different scenarios ( m = 0.11, m = 0.21, etc.) which are all consistent and compatible with what you know.

followed by

ii) you have measured m experimentally (which requires a lot of theory!) and found N different values in the range 0.1 - 2.5eV.
[...]
in case ii) you use the distribution u(m) determined by the experiment(s). If you have several experimental samples you will use modern re-sampling techniques and respect conditional probabilities.

All I'm asking for is the simplest example you can think of where you can generate a posterior distribution without needing to assume a prior. Is that really too much to ask that you will do this small thing rather than just wander around the subject, occasionally referring to it obliquely?