I've not really paid that much attention to the sea ice area myself, but I did take a punt on Lab Lemming's betting pool, mainly because (at the time of my bet) there seemed to be a pretty big hole in the bets close to where the sea ice graph seemed to be heading (based on nothing more than eyeballing the graph). So without any detailed calculations, I chose a value in the middle of the hole, to maximise my chances of winning. I didn't quite win, but was one of the closest few.
More interesting than that is the performance of the "pros". I use the inverted commas because, as far as I know, there is no real history of producing annual forecasts - and neither is there any customer - so these people aren't really experienced forecasters; rather, they are sea ice modellers and experts of one sort or another. Note that weather forecasting certainly makes heavy use of both numerical models and experts in atmospheric science, but on top of that it involves experts in prediction. Anyway, here is the final sea ice prediction of the experts, issued in August and based on July data:
Clicking on the pic should get a bigger version, or you can go here for more details.
SEARCH have now published the final outcome, including a comparison with the July forecasts. That earlier month happens to be the only one whose forecasts contain a prediction exceeding 5, which I'm sure is entirely unrelated to their decision to cherry-pick it. Note also that SEARCH are using a different observational analysis, which gives a slightly higher observed value (5.36 versus the 5.25 of Lab Lemming), and that they are using the monthly mean rather than the daily minimum. None of this, however, obscures the fact that the forecasts are all wrong, and most of them are very very badly wrong.
Looking at the IARC-JAXA graph, it seemed very obvious through July that the sea ice extent was towards the low end of the pack, but well above the 2007 track and not clearly lower than the other years (the ticks mark the start of the month):
So I don't know what basis the researchers had for expecting such low values. SEARCH helpfully adds that many of the forecasts had their own uncertainty estimates of about 0.4, which means that almost all of them failed dismally to even include the observed value in their predicted range, and one might have been as far as 6 sigma out (interpreting the 0.4 value as 2 sigma). Back to the drawing board for them!
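As a rough illustration of that sigma arithmetic, here is a minimal sketch. Only the 5.36 observation and the 0.4 uncertainty come from the SEARCH report; the forecast values in the loop are made up purely for illustration.

```python
# Minimal sketch of the sigma arithmetic described above.
# Assumption: a stated uncertainty of 0.4 is treated as a 2-sigma range,
# so sigma = 0.2. The forecast values below are hypothetical.
observed = 5.36      # SEARCH monthly-mean observation (million km^2)
sigma = 0.4 / 2.0    # 0.4 interpreted as 2 sigma

for forecast in (5.0, 4.7, 4.2):   # made-up forecast values
    n_sigma = abs(forecast - observed) / sigma
    print(f"forecast {forecast:.1f}: {n_sigma:.1f} sigma from the observation")
```

On that interpretation, a forecast a bit more than 1 million km^2 below the observation is already in 6 sigma territory.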
Turning briefly to the "semi-pros", Stoat has a bit of a gloat, although he still managed to lose a bet with Bob Grumbine. However, both of them were a lot further off than me...(but to be fair, they made their bets much earlier too).
3 comments:
It is fun to point out a 6 sigma error.
My version was that the obvious null forecast is a straight line through the satellite data, and this yielded an error of 0.1 million km^2. Doing 13 times worse than the obvious null forecast sounds rather embarrassingly wrong.
However I was wondering if this was being just a little bit unfair. So I worked out the errors using this method for the last few years:
1999 0.051
2000 -0.167
2001 -0.665
2002 0.193
2003 -0.1135
2004 -0.0532
2005 0.387
2006 -0.067
2007 1.5355
2008 0.916
2009 0.104
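For concreteness, here is a minimal sketch of the kind of null-forecast calculation described above: fit a straight line to the preceding years' September values and extrapolate one year ahead. The helper name is mine, and the data are synthetic placeholders rather than the real satellite record.

```python
import numpy as np

def null_forecast_error(years, values, target_year, n_fit=10):
    """Fit a straight line through the n_fit years before target_year,
    extrapolate one step, and return forecast minus observed."""
    mask = (years < target_year) & (years >= target_year - n_fit)
    slope, intercept = np.polyfit(years[mask], values[mask], 1)
    forecast = slope * target_year + intercept
    observed = values[years == target_year][0]
    return forecast - observed

# Placeholder September values (million km^2) -- NOT the real satellite record
rng = np.random.default_rng(0)
years = np.arange(1990, 2010)
values = 7.5 - 0.06 * (years - 1990) + rng.normal(0.0, 0.3, years.size)

for yr in range(1999, 2010):
    print(yr, round(null_forecast_error(years, values, yr), 3))
```

Twice the standard deviation of the resulting errors then gives the sort of 2 sigma figure being compared against the forecasters' quoted 0.4.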
It would seem that 0.4 is far too low an estimate of 2 sigma for this null forecast.
The two highest values are also in 2007 and 2008. Could it be that the uncertainty is growing, and the null method by chance happened to get lucky this year? The 2 sigma error could be over one, and the median forecast has an error of 0.76.
It wouldn't seem that impossible for the pros to have a little skill compared to this null forecast method, with the null method simply getting lucky this year.
Or is that being far too generous to the pros?
Chris,
It does seem from your analysis that the null might have got a bit lucky this year - but not too lucky, as 3 previous years were better, with another 3 not far behind. It also depends how many years of data you use - Stoat did quite a lot worse.
However the pros' performance seems pretty awful whichever way you look at it. Note that these guys didn't just have the previous annual minima to work from; they could see the daily data right through July (in the top case). And presumably some of them might even have had seasonal (weather) forecasts to work from - though this may have been a source of some error, too, if they turned out wrong.
I hope they will undertake some analysis to see where they went wrong. You'd think they might learn something about their models...
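To illustrate the point about how many years of data you use, here is a quick sketch reusing the hypothetical helper and placeholder data from the earlier comment; the window lengths are arbitrary choices.

```python
# How the null-forecast error for a single year changes with the fit window
for n_fit in (5, 10, 15):
    err = null_forecast_error(years, values, 2009, n_fit=n_fit)
    print(f"{n_fit:2d}-year fit: error {err:+.3f} million km^2")
```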
Thanks for posting this. If I ever get any free time again (after GSA, maybe), I'll compare these folks to the random blog readers, but they look to be in the same ballpark.