A small twist on an old tale... it's not really related to the Bayesian/frequentist fuss, but rather this...
It was the end of term, and the three top students (one each of engineering, physics, and maths) had one practical exam left to determine which one got the overall prize.
The tutor took them to a clock tower, and gave each one a barometer, paper and pencil. "Your task is to determine the height of the tower in the next 2 hours. You should assume a priori that it is a typical example of its type, which have heights uniformly distributed in the range 20-40m. We've examined the foundations and found that they are not safe for a building of more than 30m height. Obviously we are very worried. Unless we can rule out the possibility that it exceeds 30m at the 99% level or better, we'll have to spend a million pounds reinforcing it. Can you tell us whether this is necessary?" He then walked off.
The students sat down on the front step to think. The physicist and engineer started scribbling ideas. The mathematician thought for a few seconds, then went off to the pub for lunch.
2 hours later, they all assembled at the clock tower, and the tutor asked for their answers. The physicist says, "I observed the air pressure at the bottom of the tower, climbed to the top and observed the air pressure again. Based on the pressure difference, I estimate the height to be 25±2.5m, which means there is a 2.3% chance of it exceeding 30m. You'd better do the reinforcements." (All uncertainties are assumed Gaussian and quoted as 1 standard deviation.) The engineer says, "I climbed to the top of the tower, and dropped the barometer, timing the fall with the clock. Based on the time it took, I estimate the height to be 27±2m, which means there is a 7% chance of exceeding 30m. You definitely have to reinforce it."
The mathematician says, "I estimate the tower's height to be 26.2±1.56m. The chance of it exceeding 30m is less than 1%. Let's spend the money on renovating the bar instead."
The physicist and engineer are aghast. "But we've both proved there is a substantial chance of disaster! Something Must Be Done!"
How did the mathematician calculate her answer, and was her decision the correct one? If she is right, what did the others do wrong?
Updated 6/02
Well, this wasn't really intended as a serious problem - I'm sure that you all realised that the mathematician combined the previous two estimates using Bayes' Theorem. Note that the question as posed specifically defines the tower as a sample from a uniform distribution - so it's a perfectly well-posed frequentist problem and the Bayesian/frequentist rambling in the comments is completely misplaced.
The vaguely amusing point I was making is that although the engineer and physicist did nothing wrong initially, and both concluded that there was a significant (>1%) danger, as soon as each of them hears the other one agree with their conclusion, their position immediately becomes untenable. Their only fault is to not realise this as quickly as the mathematician did :-)
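For anyone who wants to check the numbers, here is a minimal sketch of one way to reproduce them. It assumes the two measurements are independent with Gaussian errors, and it applies the uniform 20-40m prior by truncating the Gaussian to that range. The function names (`normal_cdf`, `prob_exceeds`, `combine`) are made up for illustration.

```python
from math import erf, sqrt


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def prob_exceeds(limit, mean, sd, lo=20.0, hi=40.0):
    """P(height > limit) for a Gaussian likelihood times a uniform prior on [lo, hi].

    With a uniform prior the posterior is just the Gaussian truncated to [lo, hi].
    """
    total = normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)
    above = normal_cdf(hi, mean, sd) - normal_cdf(limit, mean, sd)
    return above / total


def combine(m1, s1, m2, s2):
    """Combine two independent Gaussian estimates by inverse-variance weighting."""
    w1, w2 = 1.0 / s1**2, 1.0 / s2**2
    mean = (w1 * m1 + w2 * m2) / (w1 + w2)
    sd = sqrt(1.0 / (w1 + w2))
    return mean, sd


# Physicist: 25 +- 2.5 m, engineer: 27 +- 2 m (values from the story)
mean, sd = combine(25.0, 2.5, 27.0, 2.0)
p_physicist = prob_exceeds(30.0, 25.0, 2.5)
p_engineer = prob_exceeds(30.0, 27.0, 2.0)
p_combined = prob_exceeds(30.0, mean, sd)

print(f"combined estimate: {mean:.1f} +- {sd:.2f} m")
print(f"P(>30m): physicist {p_physicist:.3f}, engineer {p_engineer:.3f}, combined {p_combined:.4f}")
```

Under these assumptions each individual estimate leaves more than a 1% chance of exceeding 30m, while the combined estimate comes out at about 26.2±1.56m with a chance below 1%, in line with the mathematician's answer.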
22 comments:
Dear James,
here is my solution to your quiz:
The Bayesian mathematician "knows" that it is better to combine the information of two different independent measurements. She plugged the numbers into her standard statistical procedure.
Of course, the professor is smarter than that.
He knows that in order to estimate the height with the claimed precision by dropping a barometer one would need to measure the release and landing time to within a fraction of a second.
This would require timing devices at the top and bottom of the building and could not be done within 2 hours.
And this assumes that there was no wind and no air at all. A barometer tumbling down a building is just not a good measurement device.
It is obvious that the second result was achieved by cheating or faked somehow.
The first result is somewhat suspicious too. Using only one commercial barometer would certainly introduce a systematic error.
In any case, the 2nd and 3rd student get an F and the 1st one a C.
He then explains to his students once more that without proper understanding of physics you should not estimate errors.
Then the professor decides to use good old trigonometry to determine the height himself.
Did I get it right?
Dear Wolfgang,
your approach mimics how the top quark was discovered at the Tevatron.
The two detector teams, CDF and D0, normally compete. But neither of them could have made a 5 sigma discovery. But when they combined their data, they could make it into 5 sigma. See Lisa's Warped Passages for a description of this story.
Of course, by today, we have seen many top quarks and the discovery is much more solid than 5 sigma.
I find James' example unscientific. What he talks about is politics, not science. If there is a law that dictates whether a building must be reinforced, the law should exactly define the procedure to find out whether the reinforcing is necessary. The definition he mentioned is not well-defined.
Claiming that there is an objective "probability" that the building is taller than 30 m is a stupidity, and we have explained why it is so many times. I assure you that more skillful engineers and physicists could measure the height with the given equipment to within 0.1 meters. And with better equipment, to within micrometers.
By giving them a stupid barometer and two hours only, you force them to act irrationally. This is just not how science should operate. In science, it is essential to have enough time and enough available tools to do things right.
Incidentally, s=gt^2/2 gives you, for g=10 and s=30, 30=5*t^2, which means t=2.5 seconds or so, while the relative error of "s" is twice the relative error of "t".
Best
Lubos
> Incidentally, s=gt^2/2 gives you, for g=10 and s=30, 30=5*t^2, which means t=2.5 seconds or so, while the relative error of "s" is twice the relative error of "t".
Exactly my point. The 2nd student would have to measure the flight time to within a fraction of a second to achieve the claimed accuracy, which is impossible with just a stopwatch.
It would be equally foolish to assemble a panel of "height experts" and ask for their opinions to improve the estimate.
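As a rough check of the timing argument above, here is a small sketch of the first-order error propagation for s=gt^2/2, using the engineer's 27±2m. It ignores air resistance, takes g as 9.81 m/s^2 rather than the rounded 10 used in the comment, and the function names are made up for illustration.

```python
from math import sqrt

G = 9.81  # m/s^2, assumed value of gravity (the comment above rounds it to 10)


def fall_time(height_m, g=G):
    """Free-fall time from rest, ignoring air resistance: s = g t^2 / 2."""
    return sqrt(2.0 * height_m / g)


def timing_error_for(height_m, height_sd_m, g=G):
    """Timing uncertainty implied by a height uncertainty, to first order.

    The relative error of s is twice the relative error of t,
    so dt = t * (ds / s) / 2.
    """
    t = fall_time(height_m, g)
    return t * (height_sd_m / height_m) / 2.0


t = fall_time(27.0)
dt = timing_error_for(27.0, 2.0)
print(f"fall time ~ {t:.2f} s, timing must be good to ~ {dt:.2f} s")
```

On these assumptions the fall takes a little over 2.3 seconds, and the quoted ±2m corresponds to timing the drop to just under 0.1 seconds.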
Come on, Wolfgang, I can measure time exactly this precisely with a stopwatch, especially if it's beeping.
Toss a piece of the barometer so that you hear a tick, press "start", stop it, press "stop". You will hear "tick beep ... tick beep". If you listen carefully, you can estimate the time delays between the beeps and the ticks (tossing/falling on the ground), and if there were some discrepancies, you can estimate them to better than 0.1s accuracy.
Don't forget that it takes 0.1 second for the sound to return back from the opposite end of the tower, if you rely on sound.
Best, LM
> Don't forget that it takes 0.1 second for the sound to return back from the opposite end of the tower, if you rely on sound.
Why did you have to give away this point? I wanted to keep the speed of sound argument for later 8-)
And what about the real experiment, as described here,
http://www.jamstec.go.jp/frcgc/research/d5/jdannan/#publications
where a panel of "height experts" had to place bets in order to
determine the best estimate for the height of the building?
OK, this joke was popular in a different form when I was a grad student. I think the most accurate way to measure the building used the rulings on the barometer to measure the height of each step on the way to the roof.
Another good way is to go to the architect, and offer her a fine barometer in exchange for her telling you the height of the building.
There is also a Syrian way to measure the height of the building: use the barometer as a fuse to burn the tower - surely it will be below 30 meters after 2 hours of work. ;-)
Children,
I'm glad my little story amused you.
Please note that as posed, it has an entirely straightforward frequentist interpretation as the prior is explicitly given. So no need for any anti-Bayesian rants here please.
If I were doing it, I would measure the height of the barometer, put the barometer next to the tower, then find out how many "barometers" high the tower is. Sorry, the statistics are beyond me.
Lumo's responses are just bizarre. The question does not say anything about a law. And if there were a law, I have no idea why it would specify how to measure height. Apparently anything he does not like is "politics".
Not a bad idea for an accountancy exam or as an inter-disciplinary exam where accountancy students are excluded.
My answer would run along the lines of:
Assuming for the moment the hypothetical to be real, I would spend the two hours keeping the public away from the tower. I would also solicit someone to find an expert who could measure the height of the building without undertaking the dangerous activity of climbing the tower. I anticipated that this would be done using trigonometry so I also solicited someone to find out the quickest way of obtaining a sextant, theodolite or other appropriate surveying equipment. I did tell the physicist and engineer not to climb the tower. Fortunately for me they thought I was trying to prevent them winning the prize. Had I not been so busy keeping the public away from the tower, I would have drafted a liability disclaimer to ensure that anyone who did climb the tower did so at their own risk.
My reasoning for the above answer is to consider the controllable costs. No-one is going to order a million pounds of work without checking it is necessary with appropriate tools for the task. Any attempt I made to measure the height of the tower will not make a difference to whether it falls down before strengthening work takes place or not. Therefore none of the costs of strengthening of the building or the loss of the building and rebuilding costs if that is considered the best option are controllable costs. The most major cost I may be able to control is public liability. The university is undoubtedly insured but I suspect there is a clause that the university should take appropriate steps if it knows of risks. This explains my suggested activities of keeping the public away from the tower.
However, I noticed that the tutor walked away after setting the question. I consider that this is significant. Given this information, if the situation was real, I believe it is too much of a risk to assume the prize students would take the appropriate action. Therefore it is clearly just a hypothetical question and the answer is that it is not necessary to measure the height of the tower.
Next I turn my attention to the actions of the competing students to see how I am doing compared to them. While climbing the tower is not dangerous, any attempt to measure the tower means they are working on the assumption that the problem is real. As far as they are concerned they are undertaking a dangerous activity and acting inappropriately. The maths student gets a few marks for spotting that using two independent measurements is better than one. However, she loses marks in several areas: 1. For not using this to conclude three measurements are better than two and ten measurements better than three. 2. For not doing some trigonometry on the problem and assuming the problem is real and giving an answer. 3. For effort to answer the problem.
Therefore I conclude that I think this answer should win. At this point I want to go to the pub but I will probably hang around in case still being on the scene when the tutor returns is considered important.
crandles
The shorter answer to the question "Can you tell us whether this is necessary?" is No.
That is, no to measuring the height of the building, because it is clearly a test, not real. The only reason for going through the long answer above is to avoid losing marks for effort.
crandles
Eventually I'll get to the answer:
In case it isn't clear, the mathematician calculated the answer the same way as I did, by deciding it was a test of knowing when not to do something.
crandles
Chris,
The mathematician gets bonus points for producing the best answer with minimum effort! Being a mathematician, of course she couldn't dirty her hands with a real experiment.
Should I cry foul? Both the mathematician and I decided to put in some effort. The mathematician only did so after hearing the other answers, and she could not be sure she would hear two answers. I put my effort in before that.
crandles
Sorry Chris,
Low cunning always beats effort in my book - and it's my puzzle, so I get to make the rules :-)
James, thank you for the compliment of telling me I don't use low cunning methods. :)
crandles
Stand the barometer up vertically.
Measure the length of its shadow.
Measure the length of the building's shadow.
You can get to the answer from there, right?
Jimmy - I can see one serious blunder in your problem formulation right off. That hypothetical chick was no mathematician - no mathematician would take the word of two miscellaneous louts for anything. Mathematicians have to prove it for themselves.
She was definitely some lesser being - maybe a statistician - they will believe anything if they here it from enough people.
er, hear.