Monday, 4 July 2011

Testing two degrees

One of the questions I would have liked to ask at the Cambridge conference the other week related to a graph shown by John Mitchell, the former chief scientist at the Met Office. Although Mitchell did not make a great deal of it, I thought it was interesting and perhaps significant.

Mitchell was discussing model verification and showed the graph as evidence that the models were performing well. This is it:

As you see, the data is actually derived from the work of Myles Allen at Oxford and examines how predictions he made in 2000 compare to outturn.

The match between prediction and outturn is striking, and indeed Mitchell was rather apologetic about just how good it is, but this is not what bothered me.  What I found strange was that the prediction (recalibrated - but that's not the issue either) was compared to decadal averages in order to assess the models. As someone who is used to Lucia Liljegren's approach to model verification, I found the Allen et al method rather surprising.

The two assessments are obviously very different - Lucia is saying that the models are doing rather badly while Allen (Mitchell) et al are saying that they are doing fine. It seems to me that they cannot both be correct, but as a non-statistician I am not really in a position to say much about who is right. I have had some email correspondence with Myles Allen, who is quite certain that looking at sub-decadal intervals is meaningless. However, I have also read Matt Briggs' imprecations against smoothing time series, and his fulminations against smoothing them before calculating forecast skill.
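
To make the two approaches concrete, here is a minimal sketch (using invented numbers, not Allen's data or Lucia's actual method) of how comparing a forecast against annual values differs from comparing it against decadal averages:

```python
# Illustrative sketch only: synthetic numbers, not Allen et al. data or Lucia's method.
import numpy as np

years = np.arange(1990, 2010)
# Hypothetical forecast: a steady 0.02 C/yr warming trend (an assumption for illustration)
forecast = 0.02 * (years - years[0])
# Hypothetical "observations": the same trend plus year-to-year weather noise
rng = np.random.default_rng(0)
observed = forecast + rng.normal(0.0, 0.1, size=years.size)

# Annual comparison: look at the residual for every single year
annual_rmse = np.sqrt(np.mean((observed - forecast) ** 2))

# Decadal comparison: average each decade into one number before comparing
obs_decadal = observed.reshape(2, 10).mean(axis=1)
fc_decadal = forecast.reshape(2, 10).mean(axis=1)
decadal_rmse = np.sqrt(np.mean((obs_decadal - fc_decadal) ** 2))

print(annual_rmse, decadal_rmse)  # the decadal figure is much smaller: averaging hides the noise
```

Neither calculation settles who is right; the sketch only shows why the two camps can look at the same series and reach different verdicts.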

We really ought to be able to agree on issues like this. So who is right?

Reader Comments (127)

What is the "pre-industrial temperature"? Just curious. I didn't realize there was 'one'.

Jul 4, 2011 at 8:01 PM | Unregistered Commenterstan

no no no. It's a fiddle. Mitchell has taken the average temp of the past decade (up on the 1990s), plonked it in the middle of the decade (2005) and drawn ascending lines through it.

It is a technique that hides much.

Because the past decade hasn't seen any rise in temperature, the decade's average temperature was reached in 2000, not 2005, so the last part of the graph should be a straight horizontal line.

Mitchell is being very naughty indeed.

There was something about this fiddle on the GWPF website, but I can't find it at the moment.

Jul 4, 2011 at 8:10 PM | Unregistered Commenterwilson

@stan: Think McFly, think!

Jul 4, 2011 at 8:10 PM | Unregistered Commenterpax

What a shameful exercise. Inexcusable, really.

Jul 4, 2011 at 8:30 PM | Unregistered Commenterdearieme

An embarrassing graph. If he made the prediction in 2000, all the prior data is simply hind-casting and irrelevant. Without the hind-casting to fluff up the graph his prediction looks pretty far off the mark. It's a do-over.

Jul 4, 2011 at 8:36 PM | Unregistered Commenterivp0

The graph is from a paper by Hume et al

It is discussed here

http://www.thegwpf.org/the-observatory/3192-warming-what-warming.html

It looks better than it is.

Jul 4, 2011 at 8:45 PM | Unregistered CommenterDavid Whitehouse.

sorry that should be allen et al

Jul 4, 2011 at 8:46 PM | Unregistered CommenterDavid Whitehouse.

Exactly ivp0.

Whenever you see predictions in Climate World you just know without looking that the X-axis will start way earlier to mask how awful the predictions are.

The guys here last week from Skeptical Science said they are about to evaluate how good various projections have been, which is a useful exercise. This will be the first thing I look for to see if they're serious.

Jul 4, 2011 at 8:54 PM | Unregistered CommenterSimonW

A good way to show if the graph is good or not would be to color the graph in, say, light yellow until 2000, to show the difference between hindcasting and forecasting. Eyeballing it, it seems that in 2000, the Y value is already the one reached in the red marker (2005), which seems to be a decadal average. This means two things:

1. The value of 2005 is consistent with the model;

2. The value of 2005 is equal to the 2000 value, which means that it may also be consistent with global cooling or global "stagnation" models.

It doesn't "invalidate" GA, but it sure isn't an "amazing" evidence for GA.

Jul 4, 2011 at 9:07 PM | Unregistered CommenterLuis Dias

"The guys here last week from Skeptical Science said they are about to evaluate how good various projections have been, which is a useful exercise. This will be the first thing I look for to see if they're serious."

I'm not the usual GA skeptic, but I've long let go of John Cook and his utter closed-mindedness, perfected into automatic robotic replies. He must not be a real human being, but a troll bot. That's how I view him anyway.

Jul 4, 2011 at 9:09 PM | Unregistered CommenterLuis Dias

"John Cook and his utterly close-mindedness perfected to automatic robotic replies. He must not be a real human being, but a troll bot."

Ha.. sometimes the human inside the BOT comes out. Ages ago.. I pointed out a minor mistake on a page. That was contradicted by IPCC/Aust Govt etc.. The human inside the bot then accused me of being in possible need to be dishonest in my interpretation.. WTF
No ad hominems are allowed unless done by Cook.. :)
He is a bot..that can replicate humans..but still just a bot..nothing original happens inside its head..

Jul 4, 2011 at 10:35 PM | Unregistered Commentermike Williams

Why are scientists' graphs so bad - fuzzy, small in scale, and with no grid lines - that they are unmeasurable and often unreadable?

Truly poor presentation skills of what is supposed to be meaningful and accurate science.

Jul 4, 2011 at 10:39 PM | Unregistered CommenterGreg Cavanagh

Allen et al 2000 states:

"Dashed line, after scaling the model-simulated spatio-temporal patterns of response to greenhouse gas and sulphate forcing individually to give the best combined fit to the observations over the 1946-96 period. Shaded band, 5-95% confidence interval on scaled response. Diamonds, observed decadal global mean temperatures (anomalies about the 1896-1996 mean...)"

So the dashed line is arrived at after tinkering to bring the models into line with 'real' data. Wow! Why am I not surprised that the next red diamond is close to the dashed line?!

One could dispense with all the complexities of constructing a climate model. Simply take the five diamonds and do a polynomial fit on an Excel spreadsheet. Let Excel choose a default polynomial order. Most likely the next one or two 'predicted' points would be close to the polynomial curve, which shows that no skill whatsoever is required. Then they would most likely diverge after more years. No problem, just keep on fitting new polynomials to the data when the data is in and they will miraculously fit AND appear to 'predict' the data a few years in advance. As I say, the fact that you can play this game with no skill whatsoever, and by dispensing with the need for complex models, shows just how unconvincing this approach is. Anyone can play this game without knowing the slightest thing about climate, or even whether the data points relate to climate or the price of baked beans.
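
For what it is worth, the 'game' described above can be sketched in a few lines (the numbers below are invented for illustration and are not the Allen et al decadal values):

```python
# Hypothetical illustration of curve-fitting-as-"prediction"; the decadal values are
# invented, not taken from Allen et al. or any observational record.
import numpy as np

decades = np.array([1951.0, 1961.0, 1971.0, 1981.0, 1991.0])  # made-up decade midpoints
anomalies = np.array([-0.05, 0.00, 0.02, 0.18, 0.32])          # made-up decadal means, deg C

# Fit a quadratic, much like letting Excel choose a low-order polynomial trendline
t = decades - decades.mean()             # centre the time axis to keep the fit well conditioned
coeffs = np.polyfit(t, anomalies, deg=2)

# "Predict" the next decadal mean by extrapolating the fitted curve
prediction_next_decade = np.polyval(coeffs, 2001.0 - decades.mean())
print(round(float(prediction_next_decade), 2))

# No climate physics is involved anywhere: the same procedure would "work" on any
# smooth series, which is the point about the lack of demonstrated skill.
```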

How can people be so easily seduced and deceived?

Jul 4, 2011 at 10:54 PM | Unregistered CommenterScientistForTruth

Apologies for not being a statistics whizz but

this graph suggests a linear rise in temperature, from the 70's to the present.

Everybody knows this has not occurred

Therefore it is easy to dismiss this graph as bullsh1t, as it has no basis in reality.

Why should I care about the statistical arguments, if such arguments are being used to cover up the fact that nothing statistical is happening?

FAIL, go away, don't come back. I am not a gullible politician.

If it looks too good to be true, it is a con

Jul 4, 2011 at 11:04 PM | Unregistered Commentergolf charley

I think Matt Briggs is very right about the smoothing.

And I keep thinking that focussing only on trends and anomalies misses the target by a mile. Models are calibrated to mimic past prescribed trends, even when they do a very poor job at simulating the actual values. That means that models have the energy balance of the earth quite wrong.

Think of a met office that gives you the trends but not the actual values. That would be useless: water and snow do not freeze or melt just because it is a bit cooler or warmer; there are well defined thresholds, and if the models get them wrong they cannot say anything about water condensation, cloud formation, snow cover, etc, etc. Remember Lucia's chart. They have a wrong energy balance, and that is not a trivial thing.

I have had a quick look at non-smoothed series, choosing a random model (I haven't got Allen's data), and it seems that the discrepancies in the eighties, after forcing a match in 2010, are as large as half of the century's anomaly.

Jul 4, 2011 at 11:49 PM | Unregistered CommenterPatagon

Lucia's chart is at: http://rankexploits.com/musings/wp-content/uploads/2009/03/temperatures_absolute.jpg

It seems that the blog filters out hyperlinks to jpegs.

Jul 4, 2011 at 11:50 PM | Unregistered CommenterPatagon

I expect that Myles's point about sub-decadal intervals is really that we can't expect GCMs to agree with observations year-by-year unless we use initialised forecasting techniques in which we assimilate observed data in order to kick the model off at the right point within natural modes of variability. In the projections that John presented, this wasn't used (we only developed the techniques a few years ago) so we only expect the models to capture natural internal variability in a statistical sense rather than for individual years.

(This would hold true for sub-decadal intervals, ie: we'd hope for ENSO to emerge with the right frequency and magnitude, but could not expect to get individual El Nino years correct by their calendar date unless we'd assimilated data to initialise the model at the right point in the ENSO cycle, and even then it would probably only work for a year or two until the "butterfly effect" took over)

But for externally-forced changes (whether from the sun, volcanoes, GHGs, land use or urbanisation) we'd expect to see the long-term signal emerging - ie: short-lived global cooling following a major eruption such as Mt Pinatubo (which the models do capture) or an ongoing general warming trend (with natural internal variability about this trend) for an ongoing increase in GHG concentrations.

It's worth remembering that general circulation models are actually pretty remarkable. Yes, they are a very long way from being perfect, but to be able to numerically represent some fundamental physics such as fluid dynamics and radiative transfer, plus some (admittedly sometimes crude) approximations representing the smaller-scale stuff like cloud microphysics, tell it how much radiation is coming from the sun and how the Earth is spinning, tilted and orbiting, and then let it go and see just how the general circulation of the atmosphere and its broad-scale patterns of cloud cover and precipitation just emerges by itself (along with modes of variability such as ENSO and NAO) is (I think) pretty amazing!

We often get asked "do you factor ENSO / the thermohaline circulation / monsoons / jet streams etc into your models?" - but the answer is "No, we don't factor them in - these things just arise naturally within the model as a consequence of the basic laws of physics". OK, they are not always right - there are systematic biases (some places too wet or too dry) and also, for example, the Indian monsoon actually got worse in our more recent model compared to HadCM3 - but generally speaking they are pretty good as a representation of the behaviour of the atmosphere and ocean.

But yes, how to test them in "forward mode" is clearly a problem. Either we look back at projections done decades ago with old models (which did, incidentally, predict warming that was pretty close to the observed ), or we use our current models to do hindcasts over recent decades (but then how do we convince people we've not simply tuned the model to give the right answer?!). Or we use initialised forecast techniques which then allow us to produce forecasts that are both a fair test of the model and can be verified within a few years.

Jul 4, 2011 at 11:58 PM | Unregistered CommenterRichard Betts

It's worse than we thought.


Did they build their climate prediction model in 1950, as the graph suggests? If not, then they are liars.

Jul 5, 2011 at 12:02 AM | Unregistered CommenterJack Hughes

So where is his data from 1850-1950, which on my reading of history is firmly post-industrial? How does he explain the high decadal temperature of the 1930's, eh?

The simple reason is that he is avoiding the sinusoidal pattern in the temperature record, which peaks in the last decade. And he is ignoring the solar effects, or he'd have two red data points, not one. Note that all the other data points are at 10 year intervals except the last one. TSI and solar magnetism peaked in 2004 and have been on the slide since.

I smell fish. Modelling fish.

Jul 5, 2011 at 12:05 AM | Unregistered CommenterBruce of Newcastle

@Richard

Answer your questions this way:

What would Richard Feynman do ?

Jul 5, 2011 at 12:07 AM | Unregistered CommenterJack Hughes

@Richard

Read the other Richard's 1974 cargo-cult speech.

Jul 5, 2011 at 12:16 AM | Unregistered CommenterJack Hughes

@Jack

Feynman is a massive inspiration for me!

I think he would by-pass the senior management and PR folks and go straight to the working-level climate scientists, find out what they are doing, make some massively insightful comments, find some things he doesn't agree with but also lots that he loves, and then tell the scientists to ignore the handed-down conventional wisdom that "thou shalt not talk to sceptics" and that everyone is just regular guys and we should all just go for a beer and talk it through like adults (but with the enthusiasm and curiosity of children).

"It is our responsibility to do what we can, learn what we can, improve the solutions, and pass them on" - Richard Feynman, What do you care what other people think" (final page)

:-)

Jul 5, 2011 at 12:28 AM | Unregistered CommenterRichard Betts

How about an opinion from a third Richard? :)

In fact, I speak not so much as a Richard as a programmer. And this is how I feel reading Richard Betts. First, grateful that he came on Bishop Hill to explain the excitement of seeing GCMs produce profound natural effects from algorithms based on basic physics. Second, I would love to work on these things, as a programmer. It must be one of the best modelling jobs one could possibly have.

The key words here though are 'seeing GCMs produce profound natural effects'. How do we recognise something in software as an ENSO or a monsoon? Through being human beings, through a kind of pattern matching we do all the time. Well, that's the sense I pick up - and that is exciting.

It's the relationship of this excitement (which I think is justified, by the way) to boring old scientific ideas like replication and falsifiability, that I believe have served us well. It's not only an important question but a fascinating one.

Jul 5, 2011 at 12:38 AM | Unregistered CommenterRichard Drake

Richard Drake

Thanks - yes, I agree.

PS. Next time we have a vacancy I'll try to let you know! :-)

Jul 5, 2011 at 12:52 AM | Unregistered CommenterRichard Betts

I'm not looking so much for a vacancy as an 'opening'.

What would full openness of GCMs mean, that would allow thousands more programmers and testers to become involved, as has happened in such a powerful way in the open source movement? Much more than source code, of that I'm sure. But thanks again for stopping by - and for the offer of a beer for all of us. (They told you you have to be careful what you write on these blogs!)

Jul 5, 2011 at 1:01 AM | Unregistered CommenterRichard Drake

Full availability of GCM code is supported by some but will probably take a while - I wouldn't rule it out (eventually)

In the meantime you can get the code for the land surface scheme of the Met Office Hadley Centre GCM here:

http://www.jchmr.org/jules/

This community model runs independently of the GCM as well as being run as part of it (incidentally the GCM is the weather forecast model as well as the climate model - it's effectively the same model)

Your input would be welcome! And yes, if you come along to one of the JULES science meetings I might even buy you a beer... :-)

Jul 5, 2011 at 1:16 AM | Unregistered CommenterRichard Betts

@ Richards

Stop dicking around, you blokes. You'll be giving us all a bad name if you keep this civility up.

Jul 5, 2011 at 2:56 AM | Unregistered CommenterMique

bish, apologies for being O/T but this little tale involves esteemed british media, such as The Guardian and the Beeb:

4 July: Guardian: Alison Rourke in Wollongong: Woe in Wollongong as mining city prepares for Australia's new carbon tax
City of coal and steel exemplifies hostility to a pollution levy, which has made prime minister Julia Gillard very unpopular
The growing number of climate change deniers put recent events like this year's devastating floods in Queensland and the most powerful cyclone in Australia's history (cyclone Yasi in February 2011 was as powerful as hurricane Katrina) down to freaks of nature rather than climate change.
From his inner city veggie patch stocked with lettuce, beetroot and cabbage, Sydney resident Greg Bearup despairs at the government's handling of the carbon tax debate. "I just can't see how we went from 60-70% support for action on climate change to a position where Gillard looks like she could lose her job over it," he said. "It's unbelievable it could have been handled so badly."
Bearup's street is just 15 minutes from the centre of the city, but a world away from the glistening blue of the harbour. The area has been gentrified but remains a concrete jungle with first world war era houses.
Two years ago Bearup and his neighbours dug up the concrete in front of their homes and planted gardens. He says the concrete was acting as a heat bank. Removing it has lowered the temperature in the summer, reducing the need for air conditioning.
He said a carbon tax will make sure heavy polluting industries like mining pay their fair share for the damage they are causing to the environment. "Everyone should be making an individual contribution to tackling climate change," he said.
http://www.guardian.co.uk/world/2011/jul/04/wollongong-view-of-australia-carbon-tax

Wikipedia: Greg Bearup
Greg Bearup is an Australian journalist, author and international election expert. He is currently a feature writer at the Good Weekend magazine, which is distributed with both The Sydney Morning Herald and The Age on Saturdays...
During this time he also filed for The Guardian, The Times and the Christian Science Monitor...
In 2008 Bearup travelled around Australia in a caravan with his partner, Lisa Upton, and young son, Joe. The adventure was documented in the book 'Adventures in Caravanastan: Around Australia at 80ks' ...
http://en.wikipedia.org/wiki/Greg_Bearup

Raconteur: The Team (TWO PEOPLE ONLY, THE FOLLOWING)
Lisa Upton
More recently, she has been a regular guest on ABC radio 702 in Sydney and a contributor to the book Adventures In Caravanastan.
Alison Rourke
In print, Alison writes for the London-based Guardian and Observer newspapers. She is also a contributor to the BBC's From Our Own Correspondent program...
In Australia she has worked for the ABC's Foreign Correspondent and Lateline programs covering stories in Sydney, the Middle East and North Korea. Alison has also written journalism training programs for the United Nations.
(FROM THE HOMEPAGE) Raconteur is a boutique media company that prides itself on storytelling...
http://www.raconteur.com.au/the_team/the_team.phtml

bish, this is the calibre of people who get PAID by the MSM.

Jul 5, 2011 at 3:01 AM | Unregistered Commenterpat

should have added, Rourke makes no mention of Bearup being anything other than a Sydney "resident" she happens to stumble across while doing a story on Wollongong, 50 miles south of Sydney...

Jul 5, 2011 at 3:06 AM | Unregistered Commenterpat

Big thanks to Richard Betts for participating.

The problem I have with this graph can be expressed very simply.

Statistical certainty is about large numbers of samples.

This graph uses five data points to predict a sixth point.

Why would you do that when you could have instead, with the same data, used 50 data points to predict the next 10? That would be far more convincing.


And a direct question for Richard: does the ENSO affect the total energy of the earth (for example by changing reflectivity), or does it merely move some of the heat to different places for a while?

Jul 5, 2011 at 3:25 AM | Unregistered CommenterBruce Hoult

Yes, good to see a real live Climate Modeller on here. And not even dissing us all with the D word.

I note the warm words (despite major doubts about the methods used) from some regulars here.

But, leaving that on one side, I too have a direct question for Richard Betts.

How comfortable are you that your work is sufficiently robust and has sufficient predictive skill to justify spending Trillions on attempting to "decarbonise" the economy?

I also leave on one side the fact that the "solutions" to that "problem" largely don't work.

That's not your fault.

But you are no doubt in a nice comfortable job playing about with computerised simulations (which even Richard Drake "would love to work on").

Do you have any concerns about the use that your "results" are being put to?

Jul 5, 2011 at 6:43 AM | Unregistered CommenterMartin Brumby

Richard Drake asks:

"How do we recognise something in software as an ENSO or a monsoon?"

The answer is that you cannot. If you could, then that "piece of software" would have a unique association with the results, numbers, that it generates for the ENSO or monsoon. In that case, the generation of numbers that do not match the observed facts of the ENSO or monsoon would require changes in that unique "piece of software." As we all know, this cannot be done. The unique "piece of software" that generates a set of numbers bears to those numbers something akin to the logical principles that are used to derive a particular theorem in some formalized discipline. As we all know, such derivations are not unique. There can be two models that generate all and only the same results, sets of numbers, yet share no "pieces of software."

If there can be a unique association between a "piece of software" and its results, a set of numbers, then that "piece of software" bears to its results the same relationship that exists between a set of physical hypotheses and the results, observations, that they imply. That would make the "piece of software" logically equivalent to physical hypotheses. Yet physical hypotheses have cognitive content; that is, they are "about" something. But a "piece of software" is not "about" anything, just as a proof in a formal discipline is not "about" something. Having no cognitive content, the description of a "piece of software" is given in terms of its interrelations with all the other "pieces of software" with which it interacts.

So, models are not logically equivalent to physical hypotheses. Nor are they functionally equivalent to physical hypotheses. They cannot be used to make predictions and they cannot be falsified. Because they have no cognitive content, they cannot be used to explain the phenomena that they generate, the sets of numbers that are their products in a given simulation, or what those numbers are taken to represent. The most important point about the relationship between models and physical hypotheses is that if you have the latter then you no longer need the former except as a quick way of investigating the assumptions that are found in your hypotheses. Models are great analytic tools but have no synthetic power whatsoever. The product of science is reasonably well confirmed physical hypotheses. Models are no more than helpful tools used along the way.

Jul 5, 2011 at 7:00 AM | Unregistered CommenterTheo Goodwin

What are Myles Allen's qualifications? So far as I can make out, statistics is not one of them.

It's Mann and Myles Allen vs Lucia, William Briggs and Steve McIntyre.

The latter are saying actual temperatures are way below the forecasts.

I back the latter because

1. They say smoothed data is fictional data
2. They are statisticians, Mann and Allen are not
3. We know temperatures have not risen in the last 10 years at least, while CO2 has been shooting up. Then how the H@#L can the actual temperatures be matching the forecasts?

Mann and Myles Allen sound like lies, damned lies and worse and they smell of scam.

The research income of Myles Allen and his merry men in 2009/10 was £87.4 million. Smells of scam.

http://www.ox.ac.uk/research/mathematical_physical_life_sciences/mpls_statistics.html

Jul 5, 2011 at 7:08 AM | Unregistered CommenterRichard

Richard Betts

I've very much enjoyed your contributions.

For a while now, I've been contemplating the following rather direct comment on one of Judith Curry's posts, namely:

"The only way to validate any model is to have it accurately predict what happens in the future on enough occasions that the results could not have occurred by chance alone. Anything else is not very useful."

It seems to be generally accepted that whilst accurate hindcasting is necessary for model validity, it is not sufficient. The ability to accurately forecast is also required. But the success of models wrt forecasting is much disputed. I found the exchange between Gavin Schmidt and Lucia Liljegren on this thread at Briggs immensely instructive in this regard.

And it seems to me that the nature of the (purported) climate change problem means one can never test whether models can accurately make relevant long-term forecasts. As I understand it, the claim is that - according to the models - if we keep spewing GHGs into the atmosphere (let's call the amount spewed X Gt), then by 2050/2100 temperature will have risen (very) substantially. The further claim is that this increase would, on net, be a (very) bad thing and should be avoided. If we believe the models and that the increase would indeed, on net, be a (very) bad thing, we will take action to avoid this forecast (very bad) increase... by spewing (very) substantially less than X Gt into the atmosphere over the period. And having taken this action, by definition we can never know whether the models would have been correct about the temperature increase from spewing X Gt. So we seem to be left with some irreducible component of faith with regard to long-term forecasts.

Does the prospect of using initialized forecast techniques to make short term forecasts in any way dig us out of this hole?

Jul 5, 2011 at 7:28 AM | Unregistered CommenterRichieRich

Any scientific paper should establish a thesis - it should say something. All the paper seems to say is that the models predicted that temperatures would be higher in the 2000s than in the 1990s - and they were!
However, the models say temperatures would not just keep on rising - they will rise at an accelerating rate. The HADCRUT data says that global warming has stopped.

I try to show the sleight-of-hand on my blog. The trick is achieved by comparing just two data points. The reason for using decadal data may be a good one, but it needs to be used in the context of many decades to establish the thesis.

http://manicbeancounter.wordpress.com/2011/07/04/showing-warming-after-it-has-stopped/

Jul 5, 2011 at 7:34 AM | Unregistered CommenterManicBeancounter

Let me join in the welcome party to Richard Betts here.

I think we should distinguish between two kinds of models.

The first one is The Model as an Oracle,

The second one is a model as a testing and forecasting tool.

Unfortunately the first kind is taking over the rest, due to the political and ideological weight attached. As a scientific tool it is not only useless but pernicious: it has to be right by definition, cannot be challenged, is 'settled' and is therefore reactionary.

The second kind is most useful, but it has to be approached with a very open and humble mind. Models are excellent tools to test our understanding of a very complicated system such as climate. If our understanding is correct, then they can be extremely valuable as a predictive tool.

Think of the ENSO example above: I think Richard Betts is way over-optimistic about their current forecasting abilities, but if in the near future they were able to forecast an El Niño one year ahead, that would mean benefits of hundreds of millions if not billions (of £ $ €); think of fisheries and agriculture from California to Chile. Likewise if they could simulate the monsoon accurately.

We are not there yet, the oscillations shown by current models require a leap of faith to be called ENSOs, but the path looks promising. That is why I do not mind the millions invested in a few supercomputers or grants and salaries (scientists and programmers have the right to a decent job). However you have to keep the Oracle locked.

Jul 5, 2011 at 7:42 AM | Unregistered CommenterPatagon

It is nice to have another discussion about models! I'd agree with several other commenters that the GCM models are impressive pieces of software that are based on a whole lot of good physics, and do a remarkable job of reproducing some weather and climate features. That notwithstanding, it is legitimate to ask: is the physics they are based on correct and complete in terms of the climatically important terms (cloud microphysics is "smaller-scale stuff" but it may be pretty important also); do the models therefore yield accurate results for global average temperature; and, the topic of this post, does the assessment of the latter question always occur in the most honest possible way? It seems to me that the plot shown above is not completely honest. True, short-term fluctuations are going to be very tricky to reproduce, as Richard Betts has said ("we only expect the models to capture natural internal variability in a statistical sense rather than for individual years"). That is fair enough. But the point is, the 'natural internal variability', if it is that, is starting to make global temperature look less and less like the models suggest it should (see Lucia's). This may of course still be 'noise'. But it does seem a bit more than a coincidence that a plot of decadal averages - which 'hides' this discrepancy - should have been used by Mitchell to present the level of agreement of models and observation.

Another interesting development to do with testing model accuracy is discussed in a post at Judith Curry's blog. Prediction of the future is of course more difficult than hindcasting when it comes to temperatures - but it is also quite legitimately difficult in terms of the inputs - how much CO2 there will actually be, and how much aerosol will be emitted. An uncharitable reading of the PNAS paper would be that this latter aspect provides a convenient 'the dog ate my homework' excuse for model inaccuracy (already trailed by Hansen a few weeks ago). I suspect we'll be hearing a lot more of that.

Jul 5, 2011 at 8:44 AM | Unregistered CommenterJeremy Harvey

Fitting 5 data points and using these to predict a 6th shows no real level of skill at all, especially when the 6th point is just the linear extrapolation of points 3,4 and 5.

Let's calibrate the model from 1900 to 1950 and then see it predict 1950-2010. Now that would be interesting and would show us that these models have zero skill.

I am afraid that this computer based modelling of global temperatures is not science. It is simply applied mathematicians solving differential equations for a system that no-one yet understands. Their skill is getting the system to solve many such calculations in the shortest time. The physics is secondary.

No, let's be frank and admit that the real purpose of these models is not to model nature but to scare us all into doing something about the model-implied effect of CO2.

If they were "scientists" they would be saying to us "our model's are bad right now and we are working on them. They should not be used for anything. Come back in 10 years and we may have something to say".

Jul 5, 2011 at 9:07 AM | Unregistered CommenterFrederick Bloggsworth

Not a statistician, but a few comments on the graph nevertheless.

Firstly, the graph is what it is: a few (six) points plotted and an apparent "good" correlation with models. Of those six points, only two show any discernible warming (the 1990's & 2000's).

I believe the graph has been constructed to visually "convince" the viewer of a degree of model "skill" that is in actual fact not present;

It is not obvious, for example, why error bars are missing from the 2000's result. You could make a case for them being a "distraction" from the otherwise near-perfect result of actual vs forecast.

Assuming the error bar is of the same magnitude as for the other plotted points, I believe it would show that there was no "statistically significant" warming over the 14yr sample period between the 1990's and 2000's. (There is statistically significant warming between the 1980's and 2000's.)

There may be some bias in the selection of the verification period;

The existing periods appear to be 19x6->19x5, with plotted values centred around 19x1. It would have been a valid choice to plot the 1996->2005 result, continuing the sequence. I haven't done the calculations, but this would have included the 1998 super El Nino year, and the expectation is that the average temp for the period would be perhaps slightly warmer than that calculated for 2000-2009. This would be a "much worse than we thought" result.

Instead, Allen chose an "out of step" 2000-2009 verification period, which coincidentally suggests a greater "skill" to the model. (Being unkind, Allen may be more interested in selling "look how accurate my model is" than "much worse than we thought".)

As it stands, as plotted, the 1998 Super El Nino year is ignored, falling somewhere between 1986->1995, and 2000->2009 samples.

Also, just "eyeballing" the graph I think Allen was on a fairly safe bet with his forecast in the short term. The range of uncertainty and error bars of the data are such that falsification of his forecast would have required an unprecedented "decadal cooling" of about 0.3K.

Jul 5, 2011 at 9:19 AM | Unregistered CommenterGSW

Frederick, I think you do a major disservice to climate modellers in your last paragraph.

Most (but not all) of the modellers I have worked with in my somewhat long life as a scientist have been firmly in Patagon's second mode. They are interested in furthering scientific understanding, and numerical models, vastly expanded by the use of increased computing power, have been a fantastic means to that end. I think the posts by Richard Betts amply display this.

Of course the climate system is complex and is not understood, we have a very long way to go before models have any realistic predictive skill at any level that might be useful. But that is not a reason to put models in the dustbin. Using models is probably the only way that our understanding of the basic physical processes will advance.

The problem is that far too many advocates have jumped onto the model as Oracle mode, in order to pursue a strongly held vision. That is fundamentally where the problem lies

Jul 5, 2011 at 9:23 AM | Unregistered CommenterArthur Dent

How could the model be invalidated?

Assuming the solid grey area is the range of temperature, relative to pre-industrial, that would validate the model, then the temperature range is 0.8 +/- 0.3C.
The last decade's average is marked on the graph with the red diamond at the 0.8C mark so what would it take for the average of the last decade to be 0.5C and invalidate the model?
One obvious way is for the temperature to decrease by 0.066C every year so that, for example, 2001 decreases by 0.066, 2002 decreases by 0.132 and so on. Since we know that the trend for the last decade is flat (0), this means that the only way the model could have been invalidated would be with a decadal temperature trend of roughly -0.66C over the last decade.
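
A quick back-of-envelope check of that arithmetic (a sketch assuming a ten-year decade and a steady year-on-year decline from the 2000 value):

```python
# Back-of-envelope check: how steep would a steady annual decline need to be
# to pull the decade's mean 0.3C below its starting value?
target_drop_in_mean = 0.3                 # deg C below the 2000 value
year_offsets = list(range(1, 11))         # years 2001..2010

# With a steady decline of d per year, the decadal mean sits d*(1+2+...+10)/10 = 5.5*d
# below the starting value.
mean_factor = sum(year_offsets) / len(year_offsets)        # = 5.5
required_annual_decline = target_drop_in_mean / mean_factor
print(required_annual_decline)            # about 0.055 C per year
print(required_annual_decline * 10)       # a trend of roughly -0.55 C per decade
```

That is the same ballpark as the -0.66C per decade figure above.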

How likely is that?

Jul 5, 2011 at 9:37 AM | Unregistered CommenterTerryS

Is it possible to (entirely) escape from using models as oracles?

Models predict that if we emit a lot of GHG (X Gt) over the century, temp increase will be large.

If we believe this prediction and that large temp increases will be a bad thing, we will make efforts to emit a lot less GHG over the century (<X Gt).

If we emit <X Gt over the century, we can never know whether the models' prediction that large temp increases would have resulted from emitting X Gt would have come true.

That is, we can't both wait to see if the predictions are true and take action to avoid the predicted state of affairs.

Jul 5, 2011 at 9:44 AM | Unregistered CommenterRichieRich

Sorry, but that graph at the top of the posting is an excellent example of a Texas Sharpshooter. Shoot a hole in the barn, then draw a circle around the hole. Great shot.

CO2 forcing is about constant rate of increase, or an increasing rate of warming over unit time.

Over the last 15 years, the rate has been essentially zero.

If all the observed temperatures had been plotted, instead of just one, we would see the models under-performing the observed temps, then the observed passing through the modelled and then passing through the CRUTem projected.

You can say it is accurate, but only because the damn line is FALLING through the models and projected temps. It's like a stopped clock, which is accurate twice a day.

Where is Jason Box when this happens? He dislikes using incomplete data. The CRUTem data is only projected from 1996. Allen et al should have used 2000, or 1999 at the least. Let's use the last 11 years too, and see what the projection looks like now. (I suspect that the reason that 1998 was not included in Allen et al was that then the CRUTem projected temps would be HIGHER than modelled.)

Shoot a hole in the barn, then paint a circle around it. Yep, great shot, pardner.

Jul 5, 2011 at 9:49 AM | Unregistered CommenterLes Johnson

@richard betts

'Or we use initialised forecast techniques which then allow us to produce forecasts that are both a fair test of the model and can be verified within a few years'

Yes please. Avoiding doing (and widely publishing and publicising in advance) this obvious step has been a major reason for me to disbelieve your models.

The world is full of bullshitters who use hindsight to demonstrate their forecasting ability. Politicians are great at it. But eventually they have to stop the BS and start the game.

If your models are so b...y wonderful, put your cojones on the line and show that you are not among them. Otherwise you have zero credibility as a forecaster. And your reliance on models, not observations, does nothing to improve it.

Stop just talking the talk ...Walk the Walk! In public for all to see! Crunch time.

Jul 5, 2011 at 10:04 AM | Unregistered CommenterLatimer Alder

Prediction is very difficult, especially if it's about the future...
(Niels Bohr)

Jul 5, 2011 at 10:20 AM | Unregistered CommenterJames P

The issue here is not the climate models themselves but the misleading presentation of data. Climate scientists are masters of this (my favourite example is the IPCC FAQ 3.1 Fig 1).

If someone claims a successful fit between a model and a single data point, it is right to be suspicious, particularly when, as many others above have already pointed out, simple extrapolation would give an equally skillful 'prediction'.
In fact, if you plot the observed decadal average as a function of time, ie the observational equivalent of Allen's dashed line, you see quite a different picture, with a dip in the curve around 2003-4. I'm not going to show this curve because it is in fact not very meaningful, but it just shows that by playing with averaging you can get more or less whatever result you want.

The article by statistician Briggs that Andrew links to could hardly be clearer:
"Do NOT smooth time series before computing forecast skill".

Jul 5, 2011 at 10:39 AM | Unregistered CommenterPaulM

Theo Goodwin "The most important point about the relationship between models and physical hypotheses is that if you have the latter then you no longer need the former except as a quick way of investigating the assumptions that are found in your hypotheses."

I largely agree with what you have written, though I have a somewhat different view from the sentence above. Science is desperately wrong today because scientists don't know (and hence can't understand) the philosophical underpinnings to their work. I heard Sir Paul Nurse on the radio this morning and he was talking about science as generating knowledge. Science never generated knowledge, it only ever generated ideas. It overturns ideas, and develops new ideas, but it does not generate knowledge. Science itself is an idea.

We now have a situation where scientists believe they are handling reality rather than ideas, that, for example, the output from particle accelerators and X-ray telescopes is reality rather than something to do with an idea. The tools themselves are ideas. Because scientists don't actually know what they are dealing with they think they are describing reality rather than inventing fairy tales and just-so stories (read most of the output of astronomy these days, and that would be a fair description - it has long since abandoned any connexion to reality at all). I know that all of this sounds strange to most people, but that is because they have been lulled into the same way of thinking by the scientific priesthood; and who can blame them, especially when the presidents of the Royal Society (Paul Nurse, Martin Rees etc) don't themselves understand what science is and indulge in fallacious arguments.

I have had strong disagreement with Jerome Ravetz, the founder of Postnormal Science, but not so much about philosophical issues as about how science should be conducted given the philosophical problems. Many climate scientists consider their work to be postnormal science, and so apply the methods of Ravetz, as does the IPCC (which explicitly cites him). He has helpfully described climate models thus:

" …climate change models are a form of “seduction”...it should be observed that the process may well be directed even more to the modelers themselves, to maintain their own sense of worth in the face of disillusioning experience…but if they are not predictors, then what on earth are they? The models can be rescued only by being explained as having a metaphorical function, designed to teach us about ourselves and our perspectives under the guise of describing or predicting the future states of the planet…A general recognition of models as metaphors will not come easily. As metaphors, computer models are too subtle…for easy detection. And those who created them may well have been prevented…from being aware of their essential character."

Note: 'under THE GUISE of describing or predicting'. Description or prediction is not possible with a model, it is just a con. But above all, the climate modellers themselves have to be seduced into believing they are something that they are not, otherwise they wouldn't produce them. They have to believe they are doing something related to reality rather than producing a tool for a social narrative.

Apart from having a metaphorical function, in the service of politics, the only real use for models in the service of science (and this is where I would see things, if not differently, then at least in a complementary way from your sentence) is in hypothesis generation, e.g. what might be considered some of the more useful hypotheses to test. Thus they are related to ideas and not to reality. We have a very serious problem in science where models are assumed to be some sort of oracle, or some sort of analogue of reality. There is another crisis in the making in Artificial Intelligence, where many scientists are dropping the 'artificial' and considering that 'machines' (for want of a better word) can actually have intelligence. Again it's a confusion between an idea and a reality.

Jul 5, 2011 at 10:52 AM | Unregistered CommenterScientistForTruth

"the only real use for models in the service of science (and this is where I would see things, if not differently, then at least in a complementary way from your sentence) is in hypothesis generation, e.g. what might be considered some of the more useful hypotheses to test. Thus they are related to ideas and not to reality. "

Or to put it more simply, to paraphrase one of my PhD supervisors correcting my thesis: 'Don't say that the model proves or demonstrates anything - models can't do that. At most you can discuss what the model output is, and whether this appears reasonably consistent with the observational data.'

Of course, my other favourite quote from the same Prof was 'There is no such thing as bad data, just bad interpretation'.

Jul 5, 2011 at 11:14 AM | Unregistered CommenterIan Blanchard

@ Bruce of Newcastle
"So where is his data from 1850-1950, which on my reading of history is firmly post industrial."

Here you have an example, and it is not that flattering.
http://tinyurl.com/3le9lcy

The HadCM3 is a more modern model than the one in the paper by Allen, but I haven't found the HadCM2 data. From the HadCM3 I got only two ensembles via the climexplorer, so it might not be a fair representation. If anyone can point me to a link with the full dataset, ideally a collection of NetCDFs, I can redraw the plot.

Please note that the baselines are different: it is 1896-1996 for the model and probably 1971-2000 for the CRUTEM. This does not matter, as it represents only a shift; if the model were an accurate simulation of global temperatures the lines should be parallel, which is not the case.

Jul 5, 2011 at 12:22 PM | Unregistered CommenterPatagon

Hi RichardB,

Welcome. I like your tone and style and would also love to discuss this stuff over a beer.

But Latimer is right, it's way past time to put that neck on the line. Why do we not have rolling forecasts (GTA, precip, hurricanes, whatever), year on year, adjusted for actual CO2 and other actual parameters, to see how good they are? We'd have built up 30-40 years of data by now, which would be very illuminating. This would seem such a basic requirement that it is a mystery it is not out there in the open (unless of course the risk of failure is too great).

I play with horse racing data every day and generate predictions for the afternoon races. If they are no good then my kids don't eat, simple as that. I am so used to seeing BS Tipsters with wonderful past records who fall flat on their face when faced with the reality of predicting future results (although they've pocketed the subscriptions by then). The similarities with the Climate guys are overwhelming.

One of my early memories of the Climate debate was reading that Piers Corbyn had been banned by William Hill (an accolade I share!) for beating MET Office forecasts over 4000 bets. To my mind, if true, that is an astonishing fact and unequivocally proves that PiersC's methods (whatever they are) are better than the presumably CO2 driven MET Office models (it doesn't say much else to be fair). Critics will point to some failed forecasts and his lack of peer reviewed papers etc, but to my mind the real life predictive skill "victory" over such a long period in a controlled test trumps all that and shows the CO2 driven models to be defective.

Again I'm hoping the SkS guys will investigate the truth of PiersC's claims dispassionately.

Jul 5, 2011 at 12:25 PM | Unregistered CommenterSimonW
