## Discussion > Evidence for Rhoda and Dung

EM

Confidence limits are published, which would only be possible if the frequency distribution For GISS the confidence limits of an annual average are quoted as +/- 0.09.

Since confidence limits are +/- 2 standard deviations we can deduce:

Go back and write this out again so that it makes sense, please.

*A few figures for GISS.*

*Sample size for an annual average is number of stations × days × frequency of observation. Let's say 3000 × 365 × 2.*

*n=2,190,000.*

*Confidence limits are published, which would only be possible if the frequency distribution For GISS the confidence limits of an annual average are quoted as +/- 0.09. Since confidence limits are +/- 2 standard deviations we can deduce:*

*SD=0.045*

*In summary, for each annual average the raw data has a Gaussian frequency distribution; n=2,000,000+, SD=0.045*

*(...)*

*Dec 10, 2015 at 6:33 PM | Entropic man*

...the confidence limits of an annual average are quoted as +/- 0.09....

"quoted" - but they don't say how they get them? But you still believe they have a statistical model that enables them to do so?

And you think that all the same you can infer standard deviations from confidence limits that someone gives...

Gosh EM - where to start. Bear in mind the spatial and temporal variation (plus you are averaging min and max), which means that each of those two million values has its own distribution, different from all the others... So not identically distributed....

And today's high at place x is correlated with tomorrow's high at x. So not independent....
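The independence point can be made concrete with a minimal sketch. It assumes, purely for illustration, that daily readings at one station follow a first-order autoregressive (AR(1)) process with lag-one correlation `rho`. The function names and the example numbers are hypothetical and are not taken from GISS or any other dataset. Under that assumption the mean behaves as if it came from a smaller sample, roughly n(1 - rho)/(1 + rho) for large n.

```python
import math


def effective_sample_size(n: int, rho: float) -> float:
    """Large-n approximation of the effective sample size for AR(1) data.

    rho is the assumed lag-one autocorrelation, with -1 < rho < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


def standard_error(sd: float, n_eff: float) -> float:
    """Standard error of the mean for a given (effective) sample size."""
    return sd / math.sqrt(n_eff)


if __name__ == "__main__":
    n = 365   # one year of daily highs at a single station (illustrative)
    sd = 5.0  # illustrative day-to-day standard deviation in deg C
    for rho in (0.0, 0.5, 0.8):
        n_eff = effective_sample_size(n, rho)
        se = standard_error(sd, n_eff)
        print(f"rho={rho:.1f}  n_eff={n_eff:7.1f}  SE={se:.3f}")
```

With `rho = 0` the effective sample size equals n; as `rho` rises the effective sample size shrinks and the standard error of the mean grows, even though the number of readings is unchanged.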

Will I give it a go tomorrow...?

Dec 10, 2015 at 7:19 PM | Entropic man

It won't be the first case of data tampering by researchers.

Micky H Corbett

I have had some difficulty finding a link for this. The nearest I could find is this, which discusses how to calculate the size of sample needed to achieve a required precision of the mean. This is the inverse of our question, "How does sample size affect precision?"

It also discusses the question in considerably more detail than I was ever taught.

I was taught in my admittedly basic statistics course that the precision (d) of a sample mean (xbar) increased with sample size (n) according to the equation

d=xbar/√n

Let us assume xbar=10 and n=10.

d=10/√10=3.16

The mean is best described as 10+/-3.16

For n=100 d=1

The mean becomes 10+/-1

For n=1000 d=0.316

The mean becomes 10+/-0.316.

For n=10,000 d=0.1

The mean becomes 10+/-0.1 and can meaningfully be written as 10.0

In this example, each hundredfold increase in sample size gives an extra decimal place in the precision of the mean.

For n=1,000,000 the mean becomes 10.00. Now we are moving into the realm of climate data, usually given to 2 decimal places, based on thermometer readings accurate to +/- 0.5C and sample sizes of 1,000,000+
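The arithmetic in this example is easy to check with a short sketch. It uses the formula exactly as stated in the comment, d = xbar/√n, with xbar = 10; the function name is made up for illustration. One assumption worth flagging: the textbook standard error of the mean is s/√n, with the sample standard deviation s in the numerator, so the example only holds as written if the spread of the readings happens to equal 10.

```python
import math


def precision_of_mean(spread: float, n: int) -> float:
    """Return spread / sqrt(n), the formula quoted in the comment.

    'spread' is xbar in the comment's version; in the usual standard
    error of the mean it would be the sample standard deviation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return spread / math.sqrt(n)


if __name__ == "__main__":
    xbar = 10.0
    for n in (10, 100, 1_000, 10_000, 1_000_000):
        d = precision_of_mean(xbar, n)
        print(f"n={n:>9,}  d={d:.3f}")
```

Run as a script, this prints d falling from 3.162 at n=10 to 0.010 at n=1,000,000, matching the figures above.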

EM

Yes that example you linked to is correct. But only because the individual components are discrete units. You can't have half a beetle.

A temperature measurement is not a discrete measurement but can be considered a distribution. Knowledge of uncertainty and how it varies over time becomes important. If you have a process then it becomes even more important to understand what can cause the largest uncertainty.

Uncertainty that is discontinuous, i.e. sensor jumps, means discrete statistics as above can't be used so easily. That's why the Met Office tries to get measurements to follow the normal distribution model. It makes the calculations easier. Doesn't mean that's what happens. It's an assumption for the sake of the science but one that needs to be watched closely.

Micky H Corbett

Most of the statistical methods we use came out of Sir Ronald Fisher's work in the 1920s. He was a biologist and mathematician seeking to produce rigorous methods for analysing biological data and designing experiments.

For a biologist, high variance data was a fact of life, so a clear understanding of uncertainty was a necessity. Physicists and engineers usually worked with low variance data so statistics was less important.

I tend to take uncertainty values given with data at face value for several reasons.

Of course assumptions should be watched closely. A self-critical approach is normal scientific practice. If you lapse, your peer reviewers or your rivals will eagerly correct you.

The researchers are interested in understanding the processes producing the data. Lying to themselves about the quality of the data would be counterproductive.

The raw data is normally published as supplementary information. Any peer reviewer, competitor or sceptic with a statistics package can check the published uncertainties. Only a fool would publish incorrect figures when they can be easily checked.

There is also the tendency for good research to inspire replication and further research. Shoddy research then shows up very quickly.

Jobs, tenure, professorships and funding in science all come out of a reputation for high quality research, so professional reputation is hard won and zealously protected. Competition is also intense. Poor practice, faking or distorting data gets you rusticated very quickly.

There was great annoyance here when I suggested that engineers were less than perfect. Kindly give scientists the same professional courtesy you demand for yourselves.

EM

Jobs, tenure, professorships and funding in science all come out of a reputation for high quality research, so professional reputation is hard won and zealously protected. Competition is also intense. Poor practice, faking or distorting data gets you rusticated very quickly. There was great annoyance here when I suggested that engineers were less than perfect. Kindly give scientists the same professional courtesy you demand for yourselves.

I think you misunderstand what I wrote. The assumptions the Met Office make are clear in the paper, for example 2 minutes for the water to sit or words to that effect. But that is an assumption, a best case. Also it is normal to simply assume that errors can be random and to restate this at the end. That's scientific practice. I didn't invent it. If the conditions are such that those assumptions are relatively correct then the conclusions stand. If not then the conclusions don't. Again go and read the Met Office papers or other climate papers.

For engineering you simply ask the question "can we afford to assume that?" The real world implications demand time consuming characterisation. Pure science doesn't often do this because once the idea is out there someone else can test it. That's not a criticism, that's just how the process works.

Micky H Corbett

Our different approaches to Met Office data remind me of an old RAF anecdote.

Two pilots were transferred to a Mosquito squadron.

One pilot found the Mosquito rather pedestrian; he had come from Spitfires.

The other pilot thought the Mosquito was marvellous; he had come from Beaufighters!

SandyS

Sorry, the geological temperature record link didn't take last time.

EM

They added the bucket corrections to SST (well according to the Real Climate post) but there's an old thread on here that puts doubt on that correction. It was posted by me and John Kennedy got involved. It's also a good example of science versus engineering. I can't say if the correction was "enough" if that's the word. The temperature may have to be adjusted even more.

The bucket correction assumes a model where relative humidity is the key factor. They performed experiments in 1948 and then again in 1991 on ships but with no continuous improvement or characterisation in subsequent years or any obvious prolonged lab campaigns. There is also the problem of the process of measurement. A wet finger in the air as to how long buckets were left is used and there is no consideration of the complicated interplay between thermometer and bucket.

As I said then, this is okay in a science paper as you have to allow some assumptions. But it's not okay in an official temperature record where it affects policy and can be considered a "national standard". You have to painstakingly demonstrate you understand the process. The tolerance i.e. uncertainty should be larger in the meantime as a precaution (as in this is how the Precautionary Principle should be used). It's currently stated at 0.1 degrees. That has come only from using a statistical model of the process.

This is important. The model is of the whole process, not just how the bucket cools, which in itself looks to have an uncertainty of around 0.15 degrees C when compared to the students' measurements.

Nothing really came after that thread but at least it's here for posterity.

Plus one last thing: I've noticed you have said before, and are saying here in a roundabout way, that measurements can be averaged so that uncertainties reduce. It's how the temperature anomalies are often quoted to 0.1 degrees C uncertainty. This only works when individual measurements, or even samples of measurements, behave as if drawn from a normal distribution (with zero or fixed bias). It's in the Met Office papers and is at times a best case assumption that is made. They made the same assumption with bucket corrections.

Problem is that engineers are all too familiar with sensor drift and the headaches. If you don't know how your measurement varies over the lifetime of the instrument then uncertainties become discontinuous and only in very long stretches of time do they approximate normal distributions (as in Central Limit Theorem).
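A small simulation illustrates the difference between random error and drift. This is only a sketch under made-up assumptions: Gaussian read noise with a standard deviation of 0.5 degrees C, and a hypothetical sensor whose reading drifts linearly by a total of 0.3 degrees C over the record. None of these numbers come from the Met Office papers.

```python
import random
import statistics


def mean_error(n: int, noise_sd: float, total_drift: float, seed: int = 0) -> float:
    """Average error of n readings with Gaussian noise plus linear drift.

    The drift grows linearly from 0 to total_drift over the n readings
    (an illustrative assumption, not a model of any real instrument).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for i in range(n):
        drift = total_drift * i / (n - 1) if n > 1 else 0.0
        errors.append(rng.gauss(0.0, noise_sd) + drift)
    return statistics.fmean(errors)


if __name__ == "__main__":
    n = 100_000
    print("noise only :", round(mean_error(n, noise_sd=0.5, total_drift=0.0), 3))
    print("with drift :", round(mean_error(n, noise_sd=0.5, total_drift=0.3), 3))
```

The noise-only average lands very close to zero, as the 1/√n argument predicts. With drift, the average settles near half the total drift (about 0.15 here) no matter how many readings are taken, so averaging alone does not remove it.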

If you can engineer low uncertainty and have other ways to correct for drift, as is the case with using orbital dynamics to help correct satellite view factors, along with onboard calibration, then it helps. Otherwise it's high tolerance time.