Friday
Jan 24, 2014
by
Bishop Hill

HadCRUT 2013


The HadCRUT global temperature anomaly for 2013 is 0.486. If so it should be outside the 5-95% bands on Ed Hawkins' famous graph.
That will be more standstill then.
Reader Comments (93)
Re. Folland et al. and the one-year-ahead global temperature forecasts.
The discussion has got a bit confused in places, not helped by the journal being one that you need a subscription for. However there is a copy of the paper here, which should help clarify things.
To be clear, this is *not* to do with the simulations with climate models over the last hundred through to the next hundred years that form the basis of the IPCC-type projections. These are also compared with the observed temperatures for recent years, an interesting exercise, as done by Ed Hawkins amongst others (see Ed's comparison here).
Completely separate to that, the UK Met Office make year-ahead forecasts. i.e. given what they know in December of year X, what can they predict for year X+1? They've been making these one-year-ahead forecasts since 2000 (presumably the forecast for 2000 was made in December 1999) and the latest one is here.
I'm not sure about the utility and value of such information, but a couple of comments early in this thread made reference to these forecasts, so I thought it would be useful to point to Folland et al. (2013). Since the Met Office had made about a dozen of these forecasts (since 2000), it provided an opportunity to evaluate how well they've been doing.
In the link to the paper above, look at Figure 1. Red shows the timeseries of forecasts (remember that each value wasn't forecast until the December of the preceding year -- this is not a 12-year forecast made at the beginning). Black shows the "contemporary" observations (i.e. I guess HadCRUT1 in 2000-2002, HadCRUT2 in 2003-2005, HadCRUT3 in 2006-2011, though I don't recall exactly when each new version came in) against which they compared each forecast value at the time. The blue dashed line shows HadCRUT4 throughout.
There is some agreement in the overall shape of the timeseries -- hence the correlation of around 0.75 against the contemporary observations. However the mean of the red line is above the mean of the black line -- this is the mean bias (given as +0.05 degC for 2000-2011 and +0.06 degC for 2000-2010).
Since HadCRUT4 temperatures are warmer since 2005 than the earlier versions, the bias is less when the forecasts are compared with HadCRUT4 rather than with the contemporary observations. I'm not sure if the correlation strengthens or weakens, but the bias gets smaller.
@JamesEvans: hopefully this explanation and the paper itself (link above) will be useful. Regarding the analogy to being overpaid by 7p each year: the point is that if you were overpaid 7p every year then in year +1000 you would still be overpaid by 7p in that single year. Of course you would have accumulated an extra £70 over the period, but we're not comparing a cumulative sum of temperature, we're instead comparing the actual temperature in each year with the forecast. The bias is +0.06 degC not +0.06 degC/year.
Tim Osborn,
I do thank you. I will peruse the paper over the w/e, with a clear head. It really is good of you to find a free version for us all to peruse. That's going above and beyond.
If I may be so bold in the meanwhilst:
"we're not comparing a cumulative sum of temperature, we're instead comparing the actual temperature in each year with the forecast. The bias is +0.06 degC not +0.06 degC/year."
I totally understand that you are trying to limit the understanding of the figures in that way. The point is - in the context of the size of global warming over the last century, a bias of 0.06 degC for an annual forecast is shockingly large. I hope that's fairly obvious.
James
Simple numbers for example
Year one actual temp anomaly is 0.4C
My forecast for year 2 is 0.47C based on year one actual
Year two actual temp anomaly is 0.4C, my error is 0.07
My forecast for year 3 is 0.47C based on year 2 actual
Year 3 temp anomaly is 0.4C my error is 0.07
My model is good except that there is a 0.07C warming bias, but the model is based on actual temps and the bias is reset each year
Run for 100 years, the bias is still 0.07
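A minimal sketch of LB's arithmetic in Python, using made-up numbers rather than real Met Office figures: a constant warm bias in a one-year-ahead forecast stays at 0.07C in every single year and never accumulates, because each forecast is re-based on the latest observed value.

# Illustrative only: a constant forecast bias does not accumulate over time.
actual = 0.4          # assume the observed anomaly stays at 0.4C every year
bias = 0.07           # assumed constant warm bias in the one-year-ahead forecast

for year in range(1, 101):
    forecast = actual + bias     # forecast re-based on last year's observed value
    error = forecast - actual    # error in this single year

print(error)                     # 0.07 in year 100, not 100 * 0.07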
Apologies Tim, James
Didn't realise there were posts on a second page before posting last comment.
Tim Osborn
Thanks for the link to the paper, will have a ponder.
Just one puzzle to be going on with. If aware of the "warm bias" why after 12 years is the "central estimate" always placed at the mid point of the range?
Why not just offset it in the public forecast, explain why it is offset and then say "look how good we are"?
LB,
OK. :) You have made things simple enough for my brain. I get what you are saying now. I was having a moment, apparently.
My point (probably hopelessly inadequately expressed) is that every year (in your example) the model is predicting a temperature rise of 0.07 degrees (which doesn't happen). To say "my model is good except for there is a 0.07C warming bias" is surely a nonsense. Global Warming over the last century has been approx 0.007 degC/year on average. If the model has a bias 10 times bigger than the signal that the model is trying to replicate then we're in Alice In Wonderland territory.
James
Welcome to the tea party, reality can become estranged!
@JamesEvans: "You're not the CRU chappy, are you?" -- Yes
@JeremyShiers on measurement errors:
A few comments about the error estimates for the HadCRUT4 temperature values. There are multiple components of error -- see Brohan et al. (2006, not paywalled) for HadCRUT3, though there have been some changes in HadCRUT4.
The component mentioned by Jeremy earlier (Jan 24, 2014 at 4:34 PM) is just one part of the error -- the "measurement error" in Brohan et al.'s equation (1) -- and for large area averages such as the global mean, it is one of the smaller components.
As Jeremy stated, Folland et al. (2001) do claim an error of 0.2 degC on each individual measurement, but note that this is 1 standard deviation. If the measurement errors are normally distributed, 95% will lie within +/- 0.4 degC. Is this reasonable? If the thermometer is reading, say, 8.0 degC, will 95% of observers read it as somewhere between 7.6 degC and 8.4 degC? According to Strangeways, I. (1999, Back to basics: The ‘met. enclosure’: Part 4 — Temperature. Weather, 54: 262–269. doi: 10.1002/j.1477-8696.1999.tb07262.x) this is reasonable. Meteorological thermometers are "usually graduated in 0.5 degC steps, from which temperature can, with care, be estimated to 0.1 degC." He also notes that "if the liquid column is not read at eye level, a parallax error is also introduced which can be as high as +/- 0.2 degC." Presumably "with care" would include trying to read it at eye level to avoid the latter. So +/- 0.4 degC seems reasonable.
@Thinking Scientist (Jan 24, 2014 at 8:34 PM) on reduction of measurement errors with averaging. The assumption is that the measurement error for monthly means will be reduced by sqrt(60), since there are at least two obs per day (min/max) or more (fixed times). You note that this assumes they are i.i.d. and imply this is wrong because of (e.g.) autocorrelation. But what matters is whether the errors are independent isn't it? Not the temperature themselves. Of course temperatures will be autocorrelated. But will an observer's error in reading the thermometer depend on what error he made yesterday? It seems a reasonable assumption that it won't. (Note this is separate from an observer always being biased high or low throughout their tenure -- biases like that would need to be dealt with differently. This is about the time-varying component of the measurement error).
As noted above, the HadCRUT3 and HadCRUT4 error estimates also attempt to capture other sources of error, especially due to incomplete sampling of each grid cell, and incomplete coverage of the globe by grid cells with data in them. See Brohan et al. paper above.
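As a quick, purely illustrative check of the independence point (not the actual HadCRUT error model), one can simulate autocorrelated temperatures with independent 0.2 degC reading errors and confirm that the error of the monthly mean shrinks by roughly sqrt(60):

import numpy as np

rng = np.random.default_rng(0)
n_months, n_obs = 5000, 60     # 60 readings per "month" (e.g. min/max over 30 days)
sigma_read = 0.2               # assumed 1-sd reading error, as in Folland et al. (2001)

errors_of_mean = []
for _ in range(n_months):
    t = np.zeros(n_obs)
    for i in range(1, n_obs):          # strongly autocorrelated "true" temperatures
        t[i] = 0.8 * t[i - 1] + rng.normal(0, 1)
    obs = t + rng.normal(0, sigma_read, n_obs)    # independent reading errors
    errors_of_mean.append(obs.mean() - t.mean())  # error of the monthly mean

print(np.std(errors_of_mean))  # about 0.026, i.e. close to 0.2 / sqrt(60) = 0.0258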
LB and TO, thanks for the brain food. I go to dream.
Last one for today.
@JamesEvans:
Ah, I understand your point. I guess that's why they don't use this model for long-term projections, just for one-year-ahead. So the signal they're trying to replicate is not the century trend. They're trying to predict shorter-term influences such as ENSO, Atlantic Multi-Decadal Oscillation, explosive volcanoes (if they've already erupted prior to the prediction being made -- they're not predicting eruptions!) on next year's global-mean -- albeit on top of the effect of the more slowly varying GHG, aerosol, solar forcings.
@GreenSand:
I suppose they could do that. You'd need to be sure that it is a bias and would continue on next year. One complication is that the statistical model is updated each year, as they have an extra year's data to train it. Perhaps they hope that the extra year -- which includes any bias in last year's forecast -- will help re-tune the model (note this is a statistical model not a GCM-based climate model) and so reduce the bias? Though with data being used from 1947 onwards, an extra few years may not have much power to change the model. However the Folland et al. (2013) paper may give the answer -- they have introduced changes in their methods and data for fitting the statistical model in 2010 and the new methods show much less bias during hindcasts, so perhaps their forecasts will also have less bias without the need for the ad-hoc adjustment that you propose?
Note both these answers are about their forecasts with statistical models. Since 2008 they also use a GCM-type approach (their "dynamical forecasts"). My comments about updating (re-tuning) the model each year don't apply to the dynamical model. It also blurs the distinction between year-ahead forecasts and longer-term projections -- since similar models are being used for both (and for in-between timescales of a decade-ahead). This gives a chance to try to improve the long-term projections through understanding how they perform at shorter timescales. But this isn't my area of expertise so I'll stop here.
@Tim Osborn @Patagon and everyone else
My understanding is that Folland et al were simply wrong to claim that, because the measurement error of a thermometer is 0.2C, the error in the monthly mean is reduced to 0.2/sqrt(60).
Brohan and chums were even more wrong to apply this single value to every weather station in the world.
In brief, this formula applies when making repeated measurements of the same quantity, for example making repeated measurements of the resistance of the same resistor. Why would anyone want to do this? To reduce measurement error! (See chapter 4 of Bevington & Robinson, or section 4.4 of John Taylor.)
Would you consider 60 measurements, made with the same thermometer at 60 different locations, to be measurements of the same quantity? Why should 60 different points in time be any different?
There are of course other sorts of error in attempting to assess the error in measuring global temperatures.
Mark Cooper discusses errors in measuring at one site
http://pugshoes.blogspot.co.uk/2010/10/metrology.html
Clive Best talks about errors due to distribution of weather stations around world.
http://clivebest.com/world/Map-data.html
Ross McKitrick has noted that around 1990 the number of weather stations in the world approximately halved and global temperature jumped 1C at the same time. It's hard to explain this other than by some sort of error.
Just thinking about the monthly error at one weather station, I thought, as an ex-experimentalist, why not look at some data. Here I show 1 month's data from 1 weather station (ok, it's a small sample):
http://jeremyshiers.com/blog/uncertainties-in-global-temperature-data-at-least-0-5c-jan-2014/
The distribution of temperatures seems roughly normal so we can use the formula used by Folland. BUT taking the stddev of observed temperatures, 4.0C, gives an error in the mean monthly temp of 0.5C.
This says nothing about systematic errors at the station, or errors between stations. So the real error will be even larger.
In any case this all seems irrelevant. Murray Salby showed changes in CO2 are driven by temperature not vice versa. Murray was criticised for not disclosing his source of data or making his calculations explicit. So last weekend I had a go. If you go here you can get a python script and excel spreadsheet + data so you can have a go too.
http://jeremyshiers.com/blog/murray-salby-showed-co2-follows-temperature-now-you-can-too/
I am constantly amazed people try and fit straight lines to temperature data, when there are clear signs of cycles (so it's not co2 then).
http://jeremyshiers.com/blog/global-temperature-rise-do-cycles-or-straight-lines-fit-best-may-2013/
Tallbloke, Nicola Scafetta and chums are pursuing a connection with the planets, which seem the obvious source of these cycles. That's where I'm focusing my attention too.
@ Jeremy Shiers, Jan 25, 8:13 AM: “I am constantly amazed people try and fit straight lines to temperature data, when there are clear signs of cycles”.
The temperature data are time series. The “clear signs of cycles” might not be signs of cycles at all, but rather of the series being correlated with itself, i.e. of autocorrelation. Indeed, the series statistically exhibit many autocorrelations—and would be expected to on the basis of thermodynamics.
Highly autocorrelated series often appear to have cycles, even though no cycles exist. If you plot some simulated series, you can check that for yourself.
An example of how people were misled by autocorrelation appearing to be a cycle is with the so-called Pacific Decadal Oscillation. The PDO is discussed by Roe [Annu.Rev.EarthPlanet.Sci, 2009]. Roe considers the underlying physical mechanisms, the primary mechanism being “re-entrainment of wintertime heat anomalies into the following year’s mixed layer”. That mechanism implies first-order autocorrelation. Indeed, an AR(1) process well fits the data. Roe concludes that “the PDO should be characterized neither as decadal nor as an oscillation” (!). The apparent oscillation is an illusion.
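A simple way to check this for yourself, as suggested above, is to simulate a pure AR(1) series with no cyclic component at all and look at the plot; the parameters below are arbitrary and chosen only for illustration.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
n, phi = 150, 0.9              # 150 "years", first-order autocorrelation of 0.9
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal(0, 1)   # AR(1) noise, no cycle term

plt.plot(x)
plt.title("AR(1) noise: no true cycle, yet apparently cyclic behaviour")
plt.xlabel("year")
plt.show()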
Douglas
Unfortunately the Roe paper requires a payment for access. Are you saying that they propose that there is no 60 year cycle apparent in the PDO?
@ Douglas J Keenan
Point taken, I accept that some apparent cycles may not be genuine cycles. Unfortunately the paper you provided a link to is paywalled. But the fact that autocorrelation can produce apparent cycles does not necessarily mean all cycles are produced by autocorrelation.
Surely the point is to try and find out what, if anything, is driving the cycles, and whether they are due to autocorrelation or not. The planets and sun seem an obvious candidate, but there is not yet, as far as I know, a clear description of how they would affect weather on earth.
Would autocorrelation produce repeated cycles of the same period? (Remember I haven't been able to read the paper you referenced yet.)
I've just had a similar discussion with EM about proxies and having them characterised, but here is the basic idea, based on work I've done with the National Physical Laboratory in the past.
If you want to measure to 0.2 degrees, use a device that measures to 0.04 degrees (at least 5 times finer). Otherwise you don't have any real data.
You can't use the CLM if you don't know the expected lifetime for convergence to CLM errors - it might take 200 years or 2 days - you don't know, and most sensors have non-linear bias variation.
The use of reduction of errors through multiple measurements is only realistic if the variation is much larger than the systematic accuracy. Why do you think physicists and engineers like to use interferometry a lot?
Patagon wrote:
I am still amazed by the precision of the HadCRUT, better than 1/1000th of a degree Celsius, what a wonder.
When I was at university I soon learnt that when writing reports of laboratory experiments, if your measurements were accurate to only one decimal place then the results of calculations should only be given to one decimal place.
Don't the scientists responsible for HadCRUT understand that, or do different rules apply in climate science?
Tim Osborn
Thanks Tim,
Let's hope so. I am not really proposing an "ad-hoc adjustment", I am proposing adjusting to comply with the findings of Folland et al 2013. Because at present I don't know what the forecast is. The MO say +0.57C but you point to the paper which says the statistical model has had a "warm bias" all through its 12 year existence. So what is the MO forecast? Is it +0.57C including a known "warm bias", therefore we should expect +0.50C (approx)? Or does this year's forecast take into account the findings of Folland 2013 and we should expect +0.57C? Could/have the findings been included in:-
Once again many thanks for your involvement.
@ Roy
Don't the scientists responsible for HadCRUT understand that, or do different rules apply in climate science?
No
Yes
Roy
There is also the classic: what is the width of a human hair when all you have is a ruler graduated in mm to estimate it?
1.0 mm ± 0.5 mm
There was a free pdf copy of the Roe paper on line but that seems to have been taken down. However I seem to be able to read it at http://www.learningace.com/doc/1492067/0c751b351f4d2c128046ac563446d702/roe_feedbacksrev_08
Actually 0.0 ± 0.5 mm - shows how long it is since I've done the exercise
Jan 24, 2014 at 8:34 PM | Unregistered Commenter Thinking Scientist
There are several graphs that show correlation coefficients for temperatures at stations separated by up to several thousand km.
BEST used one, Australia has its own.
http://www.geoffstuff.com/BEST%20correlation.jpg
This looks to me like there is more going on than meets the eye.
I have been unable to finger the source of my unease. Can you help please?
It's slightly OT, but such graphs are used for gridding/contouring/weighting to get national averages.
Geoff.
An easy way to find an online copy of a research paper is to use Google Scholar. There, enter the surname of the first author and the quoted title of the paper. The results page will include a line that says “All n versions”: clicking on that brings up a page that links to online copies.
I just did that with the Roe paper, and it found 14 versions. Several versions were indeed of the paper, e.g. http://clidyn.ethz.ch/ese101/Handouts/roe09a.pdf.
There is no valid evidence for a cycle in the PDO.
About the influence of the planets, I have not looked at any publications on this. Having said that, Jupiter has a mass greater than that of all the other planets combined, and the gravitational force exerted by Jupiter (on Earth) is <0.0001 that of the gravitational force exerted by the sun. Hence claims that other planets could affect Earth’s climate seem extremely unrealistic.
Work that claims to have found cycles in the temperature records always ignores proper consideration of time series.
Jeremy Shiers
Just read your blog on the correlation between CO2 and temperature.
You make a good case for a correlation between annual temperature variation and annual CO2 variation. The CO2 annual variation is mainly due to seasonal variations in photosynthesis and respiration, which in turn are temperature driven. It makes good sense that the annual temperature cycle drives the annual CO2 cycle.
On a larger scale a similar process amplifies the temperature variation due to Milankovitch cycles, turning an insolation change which would produce 1C warming into a 5C difference between glacial and interglacial periods. The temperature change affects carbon sinks such as oceanic dissolved CO2 and biological sinks such as tundra. Warming tends to trigger CO2 release into the atmosphere and cooling triggers uptake over millennial timescales, with consequent effects on temperature and ongoing feedback loops. Again, gradual long term changes in temperature drive long term changes in CO2.
The problem comes when this is stretched one step too far. It has been suggested that the pattern of CO2 and temperature change of the last 130 years has been due to a natural temperature increase driving a natural increase in CO2.
This is false. Our civilization has been releasing CO2 at a rate equivalent to 3% of the planetary CO2 budget, of which half has been taken up by various carbon sinks. The remainder accumulates, at a rate of 1.5ppm/year. This is a rate of change unprecedented in past glacial cycles, and therefore probably not due to natural temperature change. It is only seen when an abrupt CO2 increase due to vulcanism, impact or human activity takes place. In these situations the CO2 increase precedes and drives the temperature increase. Since the rate of increase in CO2 matches industrial CO2 production and there are no other CO2 sources of similar scale, the correct inference is that the CO2 increase is anthropogenic, and that the temperature increase is driven by it.
The only alternative is that human CO2 production is driven by, and proportional to, global temperature. This is unlikely.
Work that claims to have found cycles in the temperature records always ignores proper consideration of time series.
Jan 25, 2014 at 12:08 PM | Douglas J. Keenan
Sceptics arguing for the effect of cycles on modern climate can overlook the consequences of their claim. For example, it is quite possible to fit a hypothetical 60 year PDO cycle to the modern temperature record, with "warm" PDO correlating with the rapid warming between 1910-1942 and 1972-2002, while "cold" PDO correlates with 1880-1910, 1942-1972 and the 21st century pause. The validity of such curve fitting is another matter.
What would be less palatable for our cyclist would be the implication that the pause would then be expected to continue to 2030, to be followed by more rapid warming. "No warming since 1998" then becomes evidence that global warming is proceeding on schedule. ;-}
@Doug Keenan
"You remember rightly—strictly, the exam was to be in time series" £500 per minute? A 3 hour exam?? Surely not???
-- Peter
@Entropic Man
I fitted the change in CO2 levels assuming the change had 2 components:
1) an increase proportional to temperature anomaly for each month
2) an annual increment which surprise surprise turned out to be about 1.6ppm/year
At the end of the post I said I was unhappy with the annual increment as
"It seems almost certain to me there is some other mechanism (maybe it’s not currently active) which will eventually lower CO2 levels at some point in the future. An ice age?"
I should also have added that the fit was only for 1980 to 2013, and it is more than likely the annual step increased over time. So in 1800 or 1880 it would have been closer to zero.
It seems to me you start from saying "I make a good case for a correlation between annual temperature variation and annual CO2 variation". From this you proceed to argue that the correct inference is that the CO2 increase is anthropogenic and that the temperature increase is driven by it. So far as I can see, that chain of reasoning proceeds without evidence.
Yours confused.
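For anyone who wants to experiment before downloading Jeremy's script, here is a rough sketch of the kind of two-component fit described above, in Python with synthetic stand-in data (the numbers below are invented, not Jeremy's results; real input would come from the CO2 and temperature downloads he links to):

import numpy as np

rng = np.random.default_rng(1)
months = 12 * 34                                     # roughly 1980-2013
temp_anom = 0.4 + 0.2 * rng.standard_normal(months)  # synthetic monthly anomalies (degC)
dco2 = 0.1 * temp_anom + 1.6 / 12 + 0.05 * rng.standard_normal(months)  # synthetic monthly CO2 change (ppm)

# Least-squares fit: dCO2 = a * temp_anom + b, where b is a constant monthly increment
A = np.column_stack([temp_anom, np.ones_like(temp_anom)])
(a, b), *_ = np.linalg.lstsq(A, dco2, rcond=None)
print(f"proportional term: {a:.3f} ppm per degC")
print(f"annual increment:  {12 * b:.2f} ppm/year")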
@ Doug Keenan
Thanks for the tip about google scholar
I don't know a great deal about timeseries analysis or pdo for that matter but I have read Roe's paper.
Although he says the PDO is neither decadal nor an oscillation, he also said:
"...rather it is to highlight that vast majority of the variance in the PDO can be explained by simple integrative physics with a perhaps surprisingly short timescale. This should be the expectation (i.e., the null hypothesis) in the interpretation of geophysical time series"
which seems to leave the door open that some, if not all, cycles in geophysical time series may be caused by something other than autocorrelation.
There are other temperature records besides HADCRUT and GISS, newspapers for example, which Google is helpfully digitising. Stephen Goddard is one person regularly posting clips of temperature records from old papers and it seems the 1930's (60 years before the 1990's) were a period of high, probably record, temperatures.
In my comment I did not make clear that I had fitted the sum of two cosine curves. The fit found a period of 64 years for one and 305 for the other. Can autocorrelation produce this?
Tallbloke, Scafetta and chums are looking at the effect of the planets' gravity on the sun causing changes in the sun's behaviour.
Jupiter's gravity on earth is between 10^-4 and 10^-5 of that of the sun's. So what.
The moon's gravity on earth is around 5*10^-3 that of the sun's, yet the moon has a noticeable effect on earth.
Moreover the moon's path is elliptical; you may have seen references to the supermoon in the press.
The difference in the moon's gravity between its closest and furthest points is of the order of 10^-4 times that of the sun.
The real effect of planets and moon would be when they all line up.
Now we have communication of a sort (as David Tennant once said), would you please explain the meaning of
1+0+3 in calcAICc(AIC(gls.ML), length(gistemp), 1+0+3) and
3+0+2 in calcAICc(AIC(arima310z), length(gistemp), 3+0+2)
so I could understand what's going on.
Also, are you aware of an AIC function in Python?
thanks
@ Peter Mott, Jan 25, 4:19 PM
Some further details are given in my critique of AR5 statistics (§11).
@ Jeremy Shiers, Jan 26, 10:16 AM
Certainly there can be cycles in the climate system. My comment just stated “There is no valid evidence for a cycle in the PDO” and “Work that claims to have found cycles in the temperature records always ignores proper consideration of time series”.
When analyzing data, the crucial question is usually this: what model should be chosen? If someone wants to choose a model that incorporates a cycle, it is up to them to provide evidence that that model is appropriate. Simply showing a graph of fit is rarely enough. The evidence should include comparisons with other plausible models. In particular, if we are dealing with time series, then there should be comparisons with standard time-series models. That is particularly so for climatic time series, where we know from physical considerations that autocorrelations are present.
About “1+0+3” and “3+0+2”, each gives the number of parameters in the respective model. The first model is a straight line with ARMA(p,q) residuals, where p=1 and q=0; the other 3 parameters are the slope, intercept, and variance. The second model is a driftless ARIMA(p,1,q) model, where p=3 and q=0; the other 2 parameters are the offset and variance.
Good point about the “difference of moons gravity between closest and furthest point”. Perhaps I should stay away from physics….
AIC is easily calculated from the maximum likelihood (for more on this, see the Wikipedia article). Note that a pure sine model, with iid Gaussian residuals, will have 5 parameters: offset, shift, amplitude, frequency, variance. A model similar to what you propose (with two cycles), but including first-order autocorrelation, will have 9 parameters: offset, shift1, shift2, amp1, amp2, freq1, freq2, variance, correlation.
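On the Python question, here is a minimal sketch of the corresponding calculation. It is not Keenan's calcAICc itself, just the standard formulas computed from a model's maximised log-likelihood, sample size and parameter count.

def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2*ln(L)."""
    return 2 * k - 2 * log_likelihood

def aicc(aic_value: float, n: int, k: int) -> float:
    """Small-sample corrected AIC (AICc)."""
    return aic_value + 2 * k * (k + 1) / (n - k - 1)

# e.g. a trend model with AR(1) residuals has k = 1 + 0 + 3 = 4 parameters,
# so for a hypothetical fitted log-likelihood `loglik` and n data points:
# print(aicc(aic(loglik, 4), n, 4))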
Jeremy Shiers
Your correlation is between the seasonal variation in temperature and the seasonal variation in CO2. You demonstrated that on a seasonal timescale temperature change precedes CO2 change. The problem comes when this seasonal correlation is used to "prove" that since 1880 temperature has driven CO2, when it is more likely that Homo sapiens has driven CO2 and temperature has followed.
The transition to glacial conditions does drive CO2 levels down. More extensive tundra absorbs more CO2 and cooler oceans dissolve more CO2. The driver is reduced Northern Hemisphere insolation. When conditions warm the process reverses.
This does not apply at present. There are no identified forcings driving the long term warming trend except our own CO2 production, though the amplification seen at the beginning of an interglacial is also likely to amplify the long term effect now.
I am reluctant to believe in fairies, little green men and hypothetical "other mechanisms". There is already a probable mechanism operating, increased heat retention due to an increasing greenhouse effect.
Your correlation may not be causation. The northern hemisphere drives the annual CO2 variation by a change in the relative amounts of respiration and photosynthesis. In the NH summer photosynthesis exceeds respiration and CO2 drops. In the NH winter respiration exceeds photosynthesis and CO2 increases.
Since both processes get faster with increasing temperature, the seasonal change is probably driven by something else. The causation behind your correlation may be the change in the amount of light available for photosynthesis at different seasons.
James. Evans
Consider a temperature graph (temperature on the y axis, year on the x axis) with a systematic +0.07C error. The line is 0.07C too high along its whole length. The error does not increase with time. Correcting the error would drop the entire line by 0.07C.
If instead the rate of change were wrong by 0.07C/year, then each year's error would be 0.07C larger than the one before, because the slope of the line would be steeper.
The central value for the Met Office prediction for 2013 was 0.57 K (anomaly relative to 1961-1990).
They seem to be comparing this to the WMO value, which averages GISS, HadCRUT4, and NCDC estimates; this works out to be +0.50 K. [I had to posit a December value for NCDC which hasn't posted yet, but it's unlikely to be off by enough to affect the average to two decimal places.]
Hence an over-estimate by, coincidentally, 0.07 K.
I had a look at the original data, downloaded from http://www.metoffice.gov.uk/hadobs/crutem4/data/download.html
and there is something that either I do not understand or is a bit worrying.
The average number of monthly observations for 2013 per grid cell ranges from 2 to 13, with more than half of the grid cells empty and with a clear bias towards the East coast of USA and towards the industrial belt of Germany. The SST data comes from HadSST3 which is not included, but if you compare with the published map (http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/web_figures/anomalies.png), there are a lot of data in that map over surface land that are not published in the CRUTEM4 datasets.
Where do they come from?
Best estimate temperature anomalies 2013:
http://imagizer.imageshack.us/v2/800x600q90/703/z39e.png
Average monthly number of observations 2013:
http://imagizer.imageshack.us/v2/800x600q90/46/7p28.png
For the actual month of December there are even fewer.
Interestingly enough they publish the standard station error, which ranges from 0.2 to 1, so much for the 3 decimal significant digits...
Here is the actual December 2013 map for a better comparison with the Met Office map:
http://imagizer.imageshack.us/v2/800x600q90/822/2kh7.png
And here is the Dec 2013 map at the Met Office for CRUTEM4 (i.e. without the SST in HadCRUT4) for a better comparison:
http://www.metoffice.gov.uk/hadobs/crutem4/data/web_figures/anomalies.png
@Patagon: which data file did you download to make your plot? I just downloaded the netCDF "best estimate temperature anomalies" CRUTEM.4.2.0.0.anomalies.nc.gz and "number of observations" CRUTEM.4.2.0.0.nobs.nc.gz from http://www.metoffice.gov.uk/hadobs/crutem4/data/download.html and the coverage for Dec 2013 seems to be as shown on the Met Office map. Not sure why your map is so much sparser.
@Tim,
I have used the same file that you suggest, tried again, but got the same plot.
In grads:
Regarding the second point, don't you think that if the standard station error ranges from 0.2 to 1, it is not good practice to give 3 decimal significant digits?
@ Patagon, Jan 27, 1:12 PM “don't you think that if the standard station error ranges from 0.2 to 1, it is not good practice to give 3 decimal significant digits?”
The measurement errors are assumed to be independent, hence they will tend to cancel each other out. With enough measurements, it is possible to get an arbitrary level of precision.
The precision is real. You can test this by simulating random samples of size n from a standard Gaussian distribution. Consider each member of a sample as being a measurement, where we know the true (exact) value of the quantity being measured is 0, and the standard deviation of the measurement error is 1. As n gets larger, the average of each sample will tend to be closer to 0.
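A quick, purely illustrative way to run that check: draw many samples of increasing size n from a standard Gaussian (true value 0, measurement standard deviation 1) and watch the spread of the sample means shrink roughly as 1/sqrt(n).

import numpy as np

rng = np.random.default_rng(7)
for n in (10, 100, 1000, 10000):
    means = rng.normal(0, 1, size=(1000, n)).mean(axis=1)   # 1000 samples of size n
    print(f"n = {n:5d}: spread (sd) of sample means = {means.std():.4f}")
# prints roughly 0.32, 0.10, 0.032, 0.010 -- i.e. about 1/sqrt(n)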
@ Douglas J. Keenan
You said it, "With enough measurements".
But if you check the maps, the average number of monthly measurements ranges from 2 to 13 (and is very skewed towards the 2's), and from a handful of cells a global temperature is extrapolated.
Sorry clicked submit too soon
@Patagon (Jan 27, 2014 at 1:12 PM): I don't know GRADS very well, but is it possible that the contouring command you are using is only drawing contours in regions where there is a block of four (2x2) grid cells with values? If one or more of a 2x2 block is a missing value, it might not plot anything. The Met Office may have more sophisticated package for drawing contours that shows even isolated grid cells. This would explain why your plot doesn't show anything for isolated islands of 1, 2 or even 3 grid cells, nor around coasts and high northern latitudes -- or indeed anywhere without blocks of complete 2x2 grid cells.
Re. precision. Quite a few people have commented on the precision and accuracy of the data and the number of decimal places given. During a chain of calculations it is best to maintain as much precision in the calculations at each stage, and to only round the values to a lower precision at the final stage. Lowering the precision earlier on can give inaccurate results. Since we are providing temperature data that might be used for further calculations by users, the entire calculation doesn't finish with the CRUTEM4/HadCRUT4 numbers -- the final stage is whatever result the users of the data obtain at the end of their analysis. So, I think it is appropriate to provide figures to three decimal places even if the overall uncertainty on, say, the global-mean annual temperature is as much as +/-0.1 degC. Especially as the estimate of the uncertainty range is also given, so that users can round their final results appropriately.
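A toy example of why rounding early matters (numbers invented purely for illustration): computing a small anomaly from values that have already been rounded to one decimal place can lose the signal entirely, whereas rounding only the final result does not.

recent, baseline = 15.04, 14.96                           # invented period means, degC
exact_anomaly = recent - baseline                         # 0.08
early_rounded = round(recent, 1) - round(baseline, 1)     # 15.0 - 15.0 = 0.0
print(round(exact_anomaly, 1), early_rounded)             # 0.1 versus 0.0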
Hello all,
Patagon: If a HadCRUT grid cell contains even a small amount of ocean and there are ship or buoy observations available from HadSST3 then that value is used for HadCRUT. Your maps look a little strange though. Is it possible that the contouring algorithm is reducing the apparent coverage? The CRUTEM4 coverage should look more like this:
CRUTEM4
Jeremy Shiers: I read your blog post on the uncertainties in HadCRUT and I think you've made a common mistake that a number of other people have made. You look at the standard deviation of the temperatures of days in a month and derive from these a 'standard error on the mean'. That's not the same thing in this case as the uncertainty. I'll try and explain what I mean, because in other circumstances what you did is exactly the right thing to do.
If we are calculating a monthly average temperature then we add up all the measured temperatures for that month and divide by the number of days in the month. If we assume for a moment that we have perfect measurements of the actual temperatures then
Monthly average = (T1 + T2 + T3 + ... + Tn) / n
This is the thing we want to know. In reality we don't know T1, T2, etc... All we have are observations with some error. We could write this down like so:
Oi = Ti + Ei
Where O is the measured value. T is the true temperature and E is the error in the measurement. We don't know what T and E are precisely, but we do know O. That's what gets written down in the observer's log.
Now if we take the monthly averages of the observations we have
Observed Monthly average = (O1 + O2 + ... + On) / n
Or expanding things out a little:
Observed Monthly average = (T1 + T2 + ... + Tn) / n + (E1 + E2 + ... + En) / n
In other words the observed monthly average is the actual monthly average (the bit that we want to know) PLUS an average of the errors. The uncertainty we are interested in is the likely range of the dispersion that is attributable to those errors. How large the uncertainty in the monthly average is will depend on the average magnitude of the E's and on the kind of errors they represent.
In your blog post you calculate the standard error. This would be the relevant quantity if you were sampling from a population and making inferences about general properties of the distribution. However, when measuring a monthly average at a station, that's not what is being done. In your example for November 2013, we know that there is one and only one 1st November 2013, and one and only one 2nd November 2013 and so on. We're not sampling from a population.
The Observed Monthly Average (let's just call it OMA) is a simple function of the observed temperatures
OMA = f(O1, O2, O3 ... On)
The fact that this function, (O1 + O2 + ... + On)/n, is identical to the function used to calculate the sample mean of values drawn from a particular population leads to all sorts of confusion.
In order to calculate the uncertainty in the Observed Monthly Average, we need to know the uncertainties on the observed values and can then apply the usual rules for propagating uncertainties through a calculation.
http://en.wikipedia.org/wiki/Propagation_of_uncertainty
In HadCRUT, the 0.2C uncertainty you mention is the random, uncorrelated error on a single measurement. It is just one component of the total uncertainty and, once any appreciable averaging has been done, it becomes an insignificant one. The HadCRUT paper and the associated HadSST3 papers list other contributions to the measurement uncertainty that are far more important from the point of view of estimating global averages and long-term changes.
John
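A numerical illustration of the distinction John describes, using synthetic data rather than a real station record: the "standard error" computed from the spread of the readings is dominated by real weather variability (roughly 0.5 degC here), while the error of the monthly mean arising from independent 0.2 degC reading errors is only about 0.2/sqrt(60) = 0.026 degC.

import numpy as np

rng = np.random.default_rng(3)
true_temps = 8 + 4.0 * rng.standard_normal(60)   # synthetic "true" readings, sd ~4 degC
obs = true_temps + rng.normal(0, 0.2, 60)        # add independent 0.2 degC reading errors

standard_error = obs.std(ddof=1) / np.sqrt(60)   # the quantity computed on Jeremy's blog
mean_error = obs.mean() - true_temps.mean()      # actual error of the monthly mean

print(f"standard error of the readings:     {standard_error:.3f} degC")
print(f"error of mean from reading errors:  {mean_error:+.3f} degC (expected spread ~0.026 degC)")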
@ Tim Osborn
Many Thanks Tim,
You are right, the interpolation algorithm in grads seems to need the 4 surrounding nodes, so it draws nothing where there are single values or a single row or column. "set gxout grid" shows all values now, and they are like in the Met Office map. My fault.
I agree now with your explanation for the need of keeping precision and decimal places. But as Sven pointed out at Jan 24, 2014 at 1:29 PM, there is a serious risk that people forget the +/- 0.1°C bit when giving press releases about the Nth warmest year, which might be basically identical to another 20 or so years within ±0.1C.
Thanks also John Kennedy, I saw your post after submitting mine.
It is good to see you guys from CRU and Met Office around. It makes the discussion more interesting, and friendlier too.
As for the map, I guess I should have tried also the text version before assuming there was something weird with the nc file.
These global maps are more interesting than the single global figure. After all, people are affected by regional climate, not by global climate.
An easy plot in R if anyone is interested: