## Has JG-C found an error in CRUTEM?

John Graham-Cumming, the very clever computer scientist who has been replicating CRUTEM, thinks he and one of his commenters have found an error in CRUTEM, the land temperature index created by Phil Jones at CRU, which forms part of the HADCRUT global temperature index.

I have no idea why the correction given in this blog post by Ilya and I works: perhaps it indicates a genuine bug in the software used to generate CRUTEM3, perhaps it means Ilya and I have failed to understand something, or perhaps it indicates a missing explanation from Brohan et al. I also don't understand why when there are less than 30 years of data the number 30 appears to still be used.

If these are bugs then it indicates that CRUTEM3 will need to be reissued because the error ranges will be all wrong.

John is not a "jump up and down and shout fraud!" kind of guy. He is very careful and very cautious, as you can probably tell from the way he announces his findings.

Those of a mathematical bent might like to take a look and check what he's done.

## Reader Comments (31)

From John Graham-Cumming's blog:

"..Ilya Goz.. correctly pointed out that although a subset had been released, for some years and some locations on the globe that subset was in fact the entire set of data and so the errors could be checked..."

The apparent errors Ilya and JGC have revealed are indeed interesting. But what is also interesting is that "a subset", which only applies to "some locations on the Globe", is in fact "the entire set of data"?

Am I to understand then, that HADCRUT3, which is supposed to represent the temperatures for the entire Globe, only represents those of some locations on the Globe, and that too inaccurately?

The questions that arise: Which locations? What parts of the Globe are left out and not represented? Why is this? If they were represented how would that affect the temperatures?

Is this another scam of epic proportions in the making?

From what I can glean from the page there is no 'scam of epic proportions' about to arise.

The error discovered seems to have caused reported error estimates on some data to be bigger than they actually were.

So yes, it is an error in code. But no, any fix will make the data more certain, not less.

@Tilde Yes, that's right. If Ilya and I are correct then the error estimates for CRUTEM3 station errors are too large in many cases and fixing this would narrow the error range.

That's one of many reasons why I don't tend to scream 'fraud' :-)

@Tilde

Actually it will increase some and decrease others.

Where there is less than 30 years of data error estimates should increase. Where there is more than one station providing the data it should decrease. If there is both less than 30 years and more than one station then if there is more than 15 years worth of data it should decrease.

In other words the error estimates are screwed up.

John, sorry for posting a comment here, but I couldn't get my openid to work in your blog (and you don't accept anon comments).

Anyhow, Brohan et al:

"The grid-box anomaly is the mean of the n station anomalies in that grid box, so the gridbox station uncertainty is the root mean square of the station errors, multiplied by 1/sqrt(n)."

So it is not simply the root mean square of the errors; there is a multiplication term (correct). So it really might be a bug in their code, that is, they multiplied by sqrt(n) instead of 1/sqrt(n). Could you check the code to see whether that is the case (I haven't parsed through the FOIA code, so I have no idea where to start looking)?
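As a minimal sketch of the sentence quoted from Brohan et al., the Python below computes the grid-box station uncertainty as the root mean square of the station errors multiplied by 1/sqrt(n), next to the hypothesised slip of multiplying by sqrt(n) instead. The function names and the sample error values are made up for illustration; this is not CRU or Met Office code.

```python
import math


def gridbox_station_uncertainty(station_errors):
    """RMS of the station errors, multiplied by 1/sqrt(n), as in the quoted sentence."""
    n = len(station_errors)
    if n == 0:
        raise ValueError("need at least one station error")
    rms = math.sqrt(sum(e * e for e in station_errors) / n)
    return rms / math.sqrt(n)


def hypothesised_slip(station_errors):
    """Hypothetical variant: multiplies by sqrt(n) instead of 1/sqrt(n)."""
    n = len(station_errors)
    if n == 0:
        raise ValueError("need at least one station error")
    rms = math.sqrt(sum(e * e for e in station_errors) / n)
    return rms * math.sqrt(n)


# Illustrative station errors for one grid box (not real CRUTEM3 values).
errors = [0.3, 0.4]
as_described = gridbox_station_uncertainty(errors)
with_slip = hypothesised_slip(errors)
ratio = with_slip / as_described  # equals n, the number of stations
print(as_described, with_slip, ratio)
```

If the slip were real, the two results for a grid box would differ by a factor of n, the number of stations in that box.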

@ terryS

Totally agreed. But in practical terms does this affect the usefulness of CRUTEM?

The analysis by John Graham-Cumming simply shows that a particular feature of the reported data is likely wrong. This feature, the error estimates, has erred on the side of caution in most cases. I can't comment on your under-30-years statement, but I can ask: is this significant? That is, does the reliability of the under-30-years data have a significant effect on any modeling or public policy?

"does the reliability of the under 30 years data have a significant effect on any modeling or public policy?" Alternatively, "should the fact that the chumps hadn't debugged their code properly have any effect on politicians' tendency to accept uncritically whatever they say?"

If this is a genuine mistake then the lesson to be learned may well be the need to free the data and the code.

@Jean I can't check their code. It hasn't been released.

@John,

yes, I thought that there was some code in the FOIA files. I just went through them, and I couldn't find anything where the errors might have been calculated. Anyhow, I believe the scaling by sqrt(# of stations) instead of 1/sqrt(# of stations) is a plausible explanation.

@Jean The problem with that is that it would show up as a constant ratio between the values I calculate and the values that the Met Office has given (since it's in the final step). That is not the case.

@tilde

John's blog says that one of the steps in calculating the error estimates is to divide by sqrt(n), where n is the number of years of data. They always divide by sqrt(30) even if there are missing years in the data. So if, for example, there are only 25 years, then they should divide by 5 instead of 5.48, which means that in these cases they are underestimating the errors.
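The arithmetic in that example can be sketched in a few lines of Python. The function name and the standard deviation of 1.0 are assumptions chosen only to make the comparison easy to read; the only thing taken from the discussion is the division by sqrt(n) with n fixed at 30 versus the 25 years actually present.

```python
import math


def station_error(sigma, n_years):
    """Divide a standard deviation by sqrt(n_years), the step described above."""
    if n_years <= 0:
        raise ValueError("n_years must be positive")
    return sigma / math.sqrt(n_years)


sigma = 1.0          # illustrative standard deviation, not a real station value
years_present = 25   # years of data actually available in the example

with_actual = station_error(sigma, years_present)  # divides by 5
with_fixed_30 = station_error(sigma, 30)           # divides by about 5.48

# Fraction by which the fixed-30 figure falls short of the 25-year figure.
shortfall = 1.0 - with_fixed_30 / with_actual
print(f"actual n: {with_actual:.4f}, fixed 30: {with_fixed_30:.4f}, "
      f"shortfall: {shortfall:.1%}")
```

Under these assumptions, dividing by sqrt(30) when only 25 years are present gives a figure roughly 8.7% smaller than dividing by sqrt(25).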

I don't know what the effect will be. But what you should consider is that these 2 bugs have been discovered using just a subset of the data and a description of what the code is supposed to do. What would be found if the complete set of data and code was publicly available?

BTW I agree with Bishop Hill. The real lesson from all of this is probably going to be that open code and data leads to better results.

"open code and data leads to better results": better for whom?

John, rethinking the issue ... did you use the divisor sqrt(n) in your calculations? If so, then it may be that they simply forgot the division, and the ratio of the Met Office values to your calculations should be sqrt(n), assuming the number of years is correct. If this is not the case, could you show a few numbers (ratios) where this does not hold?

@Jean I did use the sqrt(n) in my calculations and the ratio between my numbers and the Met Office numbers is not constant: it changes for each month. The examples on my blog show this.
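The check being discussed here, whether the published values are just a fixed multiple of the recomputed ones, could be sketched as follows. This is only an illustration: the function names are hypothetical and the numbers are synthetic placeholders, not values from CRUTEM3 or from John's blog.

```python
import math


def ratios(published, recomputed):
    """Element-wise ratio of published values to recomputed values."""
    return [p / r for p, r in zip(published, recomputed)]


def is_constant_ratio(published, recomputed, rel_tol=1e-6):
    """True if every published/recomputed ratio matches the first one."""
    rs = ratios(published, recomputed)
    return all(math.isclose(r, rs[0], rel_tol=rel_tol) for r in rs)


# Synthetic monthly values, for illustration only.
recomputed = [0.10, 0.20, 0.40]
scaled_copy = [2 * x for x in recomputed]  # what a single missing factor would look like
varying = [0.25, 0.30, 0.90]               # ratios differ from month to month

print(is_constant_ratio(scaled_copy, recomputed))
print(is_constant_ratio(varying, recomputed))
```

A single wrong factor in the final step would pass this check; ratios that change from month to month, as John reports, would not.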

Tilde Guillemet:

From what I can glean from the page there is no 'scam of epic proportions' about to arise. The error discovered seems to have caused reported error estimates on some data to be bigger than they actually were... any fix will make the data more certain, not less.

TerryS:

Actually it will increase some and decrease others. Where there is less than 30 years of data error estimates should increase. Where there is more than one station providing the data it should decrease. If there is both less than 30 years and more than one station then if there is more than 15 years worth of data (per station?) it should decrease. ..the error estimates are screwed up. (And what happens if there is more than one station but combined less than 30 years of data?)

So can we agree that the error estimates are screwed up? In some cases more and in some less?

What I can't figure out is what this means: "..Ilya Goz.. correctly pointed out that although a subset had been released, for some years and some locations on the globe that subset was in fact the entire set of data and so the errors could be checked..."

What does the "subset was .. the entire set of data" mean? Does "the entire set of data", "for some years and some locations on the globe" mean that is all the data that there is? That other locations on the globe have no data that has been processed, or no data that exists?

@Richard "What does the 'subset was .. the entire set of data' mean?" It means that the data in CRUTEM3 is drawn from a large number of stations around the world; so far the Met Office has released only a subset (around 3,000) of those stations.

So the data Ilya and I have been working with is only a subset of the data used to generate CRUTEM3.

"Does "the entire set of data", "for some years and some locations on the globe" mean that is all the data that there is?"

It means that the data released by the Met Office for certain locations and certain times (e.g. the grid boxes I cite in the blog post in January 1850) are all the data that goes into CRUTEM3.

PS - and all this is supposed to be "quality controlled"? Peer reviewed?

They hide their data and it is left open to the bloggers to torture this out of them and discover their mistakes.

How do we know they haven't screwed up on temperature adjustments, for example, if what they do is hidden and not open to review?

@John Graham-Cumming - "It means that the data released by the Met Office for certain locations and certain times (e.g. the grid boxes I cite in the blog post in January 1850) are all the data that goes into CRUTEM3."

What about the rest of the data then? Are the questions I raised above valid?

"Am I to understand then, that HADCRUT3, which is supposed to represent the temperatures for the entire Globe, only represents those of some locations on the Globe, and that too inaccurately?

The questions that arise: Which locations? What parts of the Globe are left out and not represented? Why is this? If they were represented how would that affect the Global temperatures?"

Though that should be CRUTEM3

@Richard All the answers to your questions can be found in Brohan et al. 2006 (http://hadobs.metoffice.com/crutem3/HadCRUT3_accepted.pdf). If you really want to know how CRUTEM3 and HADCRUT3 are generated the information is in there. It covers how they handle errors (including the station errors that I'm talking about and the coverage error). The coverage error appears to be your concern: i.e. how good a proxy for global temperature are a smaller number of observations around the globe. Section 6.1 of the paper covers how they resampled a different data set to estimate the coverage error.

In terms of which locations and which are 'missing' the paper contains a number of diagrams showing which grid squares contain measurements. Also, the Met Office web site at http://www.metoffice.gov.uk/climatechange/science/monitoring/subsets.html has a location map for the released data.

Your question about the effect of 'missing' data on the global temperature is answered by the coverage error calculation.

@John Graham-Cumming - Thanks for that. Will have a look sometime. I presume people have looked into the coverage error calculation and found that to be ok?

@Richard I did my own analysis of the coverage error calculation and you can follow the trail of that starting here: http://www.jgc.org/blog/2009/12/phew-got-limited-coverage-error-sorted.html

Now, I did not attempt to match it decimal place for decimal place with CRUTEM3 because I can't. Their data on that does not appear to have been released.

Had a look at the locations. The US is the most densely packed followed by Australia along the coast, UK, Central and northern Europe.

But the US is not warming, (or warming much?), so maybe they have found a way to reduce its influence and increase those of others.

Interestingly, Antarctica seems to be mainly represented by the northern tip of the Antarctic Peninsula, which everyone knows is warming.

@John Graham-Cumming - "@Richard I did my own analysis of the coverage error calculation and you can follow the trail of that starting here: http://www.jgc.org/blog/2009/12/phew-got-limited-coverage-error-sorted.html"

So there doesn't seem to be an error there.

A lot seems to depend on Brohan et al.

1. Brohan et al. say these should be the errors, as per these calculations. So they don't seem to have screwed up on those calculations, so far as you can make out with your limited information.

2. What about the logic on which the calculations are based? Does that seem to be ok?

3. Despite what Brohan et al. say on how you should handle coverage errors, on what basis does CRUTEM3 choose the stations it includes and those it leaves out? And what would be the effect if they chose different stations, or all the stations?

And having a brief read of Brohan et al. it again strikes me as crazy the "Global" temperature records we depend on. Over the land it is the air temperature, but over the sea it is the water temperature. That is crazy.

Instead of the ships dipping buckets into the sea, why don't they simply measure the air temperature? Much simpler, more logical. And I would bet you would get far less variation than over the land.

@Richard I can't answer your questions because they are well outside my area of expertise. All I am able to do is follow the science as described in the paper and attempt to reproduce the results they have. I have a lot of skill in computer programming and can use that to check that part of their work (and I am a mathematician, so the mathematical content isn't scary to me).

It is most likely a stupid error in the coding, which should have been caught.

What this says is the QC of the code is non-existent and that makes me wonder WHAT ELSE IS BROKEN in the program?

We really have no way of knowing unless we look at the code carefully. Which apparently we will not be permitted to do. That is where there is a fraud for certain.

I am with TerryS here: two errors in simple math from one example is the real issue, not whether these errors balance out. This is a tip of the iceberg discovery, an indicator of problems to come. How long were these errors in the calculations? Decades? And this is the easy part, where you have a temperature record and you are simply averaging it.

What is the basis for their stated errors in homogeneity? Are these valid? Why a constant measurement error - that is clearly not valid over decades of technological advancements!

And what about extrapolation error? The reality is 99.99+% of the global temperatures are not measured but guessed. If a temperature is only accurate to a degree or so in a 50x50 km grid, 2 degrees in a 100x100 km grid and 4 degrees in a 500x500 km grid, then there are huge errors compounding by the time we take point measurements in 1500 sites and expand them to the global surface.

Does ANYONE expect someone who cannot get this basic mathematical equation right, or who allowed it to go unchecked for years, to handle this complex mathematical analysis to a global level well?

If so I have some beach front property (mathematically speaking) in Montana to sell.

AJStrata

Times picks this one up 15/2/2010:

http://www.timesonline.co.uk/tol/news/environment/article7028362.ece