Monday
Nov 23, 2015
by Bishop Hill
A change to the playing field
Nov 23, 2015 Climate: Statistics
Doug Keenan has posted a note at the bottom of the notice about his £100,000 challenge, indicating that he has reissued the 1000 data series. This was apparently because it was pointed out to him that the challenge could be "gamed" by hacking the (pseudo)random number generator he had used.
Brandon Shollenberger emails to say that this is a terrible thing, but I can't get terribly excited about it. Presumably it doesn't make any difference to those who think they can detect the difference between trending and non-trending series.
Reader Comments (89)
Meanwhile, the insanity continues as Ed Miliband calls for mass suicide [ 'Ed Miliband calls for law for UK to eradicate all carbon emissions' - https://www.politicshome.com/energy-and-environment/articles/story/ed-miliband-calls-law-uk-eradicate-all-carbon-emissions ]
There's no such thing as a secure code - just one that takes a long time to crack.
Do you think Ed means all carbon emissions or all carbon dioxide emissions? I'm never sure if these politicians understand the difference or whether a carbon-based environment can actually do that or should want to.
Either way somebody should point out to him gently that every living human being emits CO2 every second or so and ask if he would like to set an example to the rest of us since "setting an example" appears to be what politicians think they are for these days!
Of course not, why would you? On the other hand, if this was Michael Mann, Kevin Trenberth, Naomi Oreskes, Stephan Lewandowsky, or any of the other people you typically abuse on your site, you'd be throwing your toys out of the cot and wailing like a two-year-old.
Who thinks this? Some examples would be nice. Certainly when it comes to climatic data, I don't think anyone is claiming that they can do attribution using statistical models alone. That would be silly. Surely this is obvious? I realise Doug Keenan is a bit clueless, but you've been writing about this topic long enough to realise this, right?
Mike,
Maybe some people think that certain things are so bleeding obvious that they don't have to explicitly state precisely what they mean. However, the more I read this site - and the comments - the more I think that this is not true.
The set-up should be as intended. The real challenge is whether the climate community can do what it claims it can - separate signal from overwhelming noise - not whether someone can break the encryption on an answer key or defeat a poorly constructed example.
ATTP, I would think reasonably intelligent people can see through your BS about attribution not being embedded in the statistical analysis itself. The mathematics of *no* statistical test contains inferential reasoning within it. That is exactly what Keenan asks you to construct. Suppose that his collection of series contains one or a few with a trend caused by a physical process; develop a statistical test that differentiates the physically caused series from the randomly generated ones, and use it to identify them.
Either solve the problem, or let others do it.
Shub,
If you really think that one can do attribution using statistical models ONLY, then you're even stupider than I realised.
ATTP
"Certainly when it comes to climatic data, I don't think anyone is claiming that they can do attribution using statistical models alone. That would be silly. Surely this is obvious?"
You'd think so wouldn't you, but such plonkers exist....
http://link.springer.com/article/10.1007%2Fs10584-015-1495-y#page-1
"A number of scientific hypotheses have been put forward to explain the hiatus, including both physical climate processes and data artifacts. However, despite the intense focus on the hiatus in both the scientific and public arenas, rigorous statistical assessment of the uniqueness of the recent temperature time-series within the context of the
long-term record has been limited. We apply a rigorous, comprehensive statistical analysis of global temperature data that goes beyond simple linear models to account for temporal dependence and selection effects. We use this framework to test whether the recent period has demonstrated i) a hiatus in the trend in global temperatures, ii) a temperature trend that is statistically distinct from trends prior to the hiatus period, iii) a “stalling” of the global mean
temperature, and iv) a change in the distribution of the year-to-year temperature increases."
And that's only one out of many...
ATTP, are you some type of a dummy? I explicitly stated that no statistical test contains its inferential reasoning within it, and your next comment asks whether I believe the opposite?
A time series produced by a caused process will look *numerically* similar to one that is synthetically generated to mimic it. If your statistical test can distinguish the former from noise, it can, and should, distinguish the latter too. Simple enough.
*head desk.
ATTP
And another plonker here....
http://geographical.co.uk/nature/climate/item/273-warming-hiatus-caused-by-natural-climate-change-variation "Shaun Lovejoy of McGill University in Montreal had previously developed a statistical methodology that used pre-industrial temperature proxies to analyse historical climate patterns. Using this technique, he ruled out, with more than 99 per cent certainty, the possibility that global warming in the industrial era is just a natural fluctuation in Earth’s climate. In the present study, he applied this same approach to the 15-year period after 1998..."
Need any more?
Stuff like this is published all the time and added to the pile of 'evidence' despite it being evidence only of innumeracy.
Shub,
Okay, I missed that. You seem to think that Keenan's challenge can be described like this
If you don't like me calling you stupid, you should probably stop saying stupid things.
*head desk
JamesG,
Your first link appears not to do any attribution; your second appears to be using our understanding of natural variability to develop a statistical model. I fail to see how either of these qualifies as doing attribution using statistical models ONLY.
MikeHaseler
Or code you don't publish?
ATTP
'I fail to see' or IFTS should be your acronym instead of ATTP. Perhaps because you do not like to look.
JamesG,
Apologies, I should stop saying that and be more direct. Neither of your links are examples of studies that do attribution using statistical models ONLY. Is that clear enough for you now?
Yes, ATTP, Keenan's synthetically generated series, hidden amongst the thousand or so, resembles numerically a series that would be produced by a causal process, as proposed by the IPCC. That is exactly what I said.
If, as the climate attributionists propose, a time series of CO2-caused temperatures can be distinguished with reasonable certainty from non-forced ones, you should be able to apply the same test to Keenan's series.
You need to read carefully and think before typing out stuff.
"Of course not, why would you? On the other hand, if this was Michael Mann, Kevin Trenberth, Naomi Oreske, Stephan Lewndowsky, or any of the other people you typically abuse on your site, you'd be throwing your toys out of the cot and wailing like a two-year old."
You should try looking in the mirror sometime. You might learn something.
Noone claims that it is possible to do any such thing using statistical models ONLY. Keenan - and now you - are savaging a massive strawman!
Words to live by, Shub, words to live by!
When attributionists propose that they are confident to some stated level - say 95% certain, for example - that CO2 caused global warming and the 20th century rise in temperatures - that is exactly what they do. They say, in effect, that they can discern with 95% certainty a rise over and above what ought to have been seen. The IPCC AR4 had a picture to show this.
No it is not. They are NOT using statistical models ONLY!
Huh, no they don't. The 95% attribution statement is based on null hypothesis testing, in which they reject the null hypothesis that more than 50% of the warming since 1950 was non-anthropogenic with 95% certainty. To be strict, they are not 95% certain that it is mostly anthropogenic; they have simply rejected the null hypothesis that it is mostly non-anthropogenic.
You're getting closer, ATTP. If you gave the IPCC attributionists a bunch of curves they wouldn't wimp out; instead they would pull the CO2-caused ones from the non-caused ones, 'cause that's what they claim to do.
This is fun.
What did I say on the original thread?
lol
Shub,
I don't know why you think I'm getting closer. I haven't changed my position at all.
There is only one planet Earth.
No it isn't.
I agree with the climate scare chattering cohort on this one.
Changing the rules, or even the data, in the midst of a competition/challenge is bad form.
It gives the impression of either having made a serious mistake, or having realized that the chance/risk of losing is much larger than initially assumed.
That said, I can't see that there is much merit in their complaining or whining about the challenge.
And the idea that it would be simpler to extract a detectable signal from noisy data as long as you really believed that there must be one ... for instance because your pet hypothesis is so dear to you, and even has some physics in it for some parts ...
.. that's just laughable.
Some people have asked why the simulated time series all start with −0.23. The answer is that the simulated series are intended to be roughly analogous to global temperatures during 1880–2014 (thus the series length of 135). I thought that all the series should have the same starting temperature. For that starting temperature, I picked the global temperature in 1880, which was −0.23, according to HadCRUT (4.4.0.0). That way, if the simulated temperatures are plotted on the same graph as the observed temperatures, they can be more easily compared.
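For anyone who wants to make that comparison, a minimal sketch in Python might look like the following (the file names and CSV layout are assumptions for illustration, not the actual contest format):

```python
# Plot a few of the simulated series against the observed record for a rough
# visual comparison. "series.csv" (135 rows x 1000 columns) and "hadcrut.csv"
# (135 observed anomalies) are hypothetical file names and layouts.
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(1880, 2015)                       # 135 years, matching the series length
sims = np.loadtxt("series.csv", delimiter=",")      # simulated series, one per column (assumed)
obs = np.loadtxt("hadcrut.csv", delimiter=",")      # observed anomalies (assumed)

plt.plot(years, sims[:, :20], color="grey", alpha=0.4)          # a handful of simulated series
plt.plot(years, obs, color="green", label="Observed (HadCRUT 4.4.0.0)")
plt.axhline(-0.23, linestyle=":", label="Common 1880 starting value")
plt.xlabel("Year")
plt.ylabel("Temperature anomaly (°C)")
plt.legend()
plt.show()
```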
@Jonas N
There was indeed a mistake, as explained on the Contest web page. For some illustration, see
http://scottishsceptic.co.uk/2015/11/19/the-doug-keenan-challenge/
Jonas N:
"And the idea that it would be simpler to extract a detectable signal from noisy data as long as you really believed that there must be one ... for instance because your pet hypothesis is so dear to you, and even has some physics in it for some parts ... "
Hitting the nail on the head. +100
This post is complete and utter BS. It says:
But what he conveniently fails to mention is I explicitly told him things like:
With me linking to the post I wrote on this subject for documentation of the claim. He didn't respond to this e-mail, ask any questions about my claims or dispute anything I said. Instead, he just wrote this post painting me as unreasonable, telling everyone the changes to the data set shouldn't make any difference to people's ability to win the contest.
So our host was directly informed that those changes make the contest significantly more difficult to win. Anyone who spends even two minutes looking at the two data sets will see the changes make the challenge more difficult to win. Anyone who just looks at the two graphs our host was linked to would know what he says in this post is simply untrue. And yet, he still wrote this post saying things that are obviously untrue.
This is complete and utter BS. And sadly, it's not even as bad as what Anthony Watts did.
Brandon Shollenberger,
Seems to me that Douglas Keenan should have nothing to lose now by publishing the key to his original answer file, unless what you say is true about the solution being easy. So if you get in quick and publish your answers, and they prove correct or Keenan refuses to publish the key, I'd say you're vindicated. OK, so it won't net you $100k, but at least it'd give you title to a bit more of that moral high ground.
What you are arguing about, Brandon, is a p*ssing contest between you and Douglas. What the rest of us are interested in is the relevance to the climate debate.
The point has already been conceded by ATTP that "There is no statistical model that can determine what is causing the warming." If you will make the similar concession, then the rest of us are done here, and can move on.
@ Robert Swan at 8:35 PM
Publishing the key to the original Answer file would allow people to determine which original series were generated with a trend. That would give them valuable information that could be used to assist in determining which of the revised series were generated with a trend.
I will publish the keys, and computer programs, to both the original and the revised series when the Contest closes.
Let me illustrate the problem with a graph.
The green line is GISTEMP. The brown line is the linear trend. This is real temperature data.
I have included two sets of 95% confidence limits.
The narrower set shows 95% confidence limits of +/-0.09, the actual value for this data. Since the 95% confidence limits at the beginning and end of the time series do not overlap, one can be 95% confident that there is a real trend.
The broader set shows 95% confidence limits of +/-0.5. Since the 95% confidence limits at the beginning and end of the time series overlap, one cannot be 95% confident that there is a real trend, so it is possible that this is a random series.
If you examine Keenan's sets, some of them have a temperature range exceeding 2.0C. Even with a 1C trend the 95% confidence limits will be greater than 1.0. This makes it impossible to identify the trend.
This is why we are calling the challenge a lottery. Some of the sets may be confidently identified as random, some may be confidently identified as trends. Unfortunately several dozen sets have large confidence limits and cannot be classified.
As a statistical challenge the exercise is meaningless.
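For what it's worth, here is a minimal sketch of the endpoint-overlap test described above. It uses plain OLS (which, as later comments point out, ignores autocorrelation), and the toy series is hypothetical rather than one of the contest files:

```python
# Fit an OLS trend and check whether the 95% intervals on the fitted values
# at the two endpoints overlap; non-overlap is read as "a real trend".
import numpy as np

def endpoint_overlap_test(series):
    t = np.arange(len(series))
    coef, cov = np.polyfit(t, series, 1, cov=True)        # [slope, intercept] and covariance
    fit = np.polyval(coef, t)
    # variance of the fitted value slope*t + intercept at each time step
    var_fit = cov[0, 0] * t**2 + 2 * cov[0, 1] * t + cov[1, 1]
    half = 1.96 * np.sqrt(var_fit)                         # approx. 95% half-width
    start = (fit[0] - half[0], fit[0] + half[0])
    end = (fit[-1] - half[-1], fit[-1] + half[-1])
    return start[1] < end[0] or end[1] < start[0]          # True if intervals do not overlap

rng = np.random.default_rng(0)
toy = 0.01 * np.arange(135) + rng.normal(0, 0.1, 135)      # 0.01/year trend plus noise
print(endpoint_overlap_test(toy))
```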
Bart,
How is that a concession? The reason I said that was because this entire challenge is a massive strawman. Was that too complicated for you?
No, you very obviously are not. You're interested in setting up strawmen that you then majestically knock down while the rest of us look on thinking "good grief, they can't really be that stupid, can they?"
"The narrower set show 95% confidence limits of +/-0.09, the actual value for this data."
No. They don't. They show confidence limits based upon a model. If the model is wrong, so likely are the confidence limits.
@ Entropic man at 9:24 PM
Your graph is based on an OLS (ordinary least squares) calculation. No one uses OLS, because it ignores autocorrelation. Everyone who has studied this issue agrees—including the IPCC, the Met Office, Michael Mann, etc.
The basic idea underlying autocorrelation is simple. Suppose that today is extremely warm: then that increases the chance that tomorrow will be warmer than average. Similarly, suppose that this year is extremely warm: then that increases the chance that next year will be warmer than average. The OLS calculation, however, assumes that what happens in a given year is independent of what happened in the prior years. The assumption is wrong, and that invalidates your calculation.
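To illustrate the point with a toy simulation (this is an AR(1) example chosen for illustration, not Keenan's actual generating process): naive OLS p-values on strongly autocorrelated but trend-free series report a "significant" trend far more often than the nominal 5%.

```python
# Count how often an OLS fit declares a "significant" trend on trend-free
# AR(1) series. OLS assumes independent years, so the false-positive rate
# is inflated well above the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def ar1_series(n=135, phi=0.9):
    """Trend-free AR(1) series: x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, 0.1)
    return x

trials = 2000
false_positives = 0
for _ in range(trials):
    x = ar1_series()
    result = stats.linregress(np.arange(len(x)), x)
    if result.pvalue < 0.05:              # OLS says "significant trend"
        false_positives += 1

print(f"Nominal rate: 5%, actual: {100 * false_positives / trials:.0f}%")
```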
Douglas Keenan:
ATTP:
You two agree. That is all that matters. Quit whining.
Robert Swan, as Douglas Keenan points out, publishing the answers to the first version of this challenge would give information about how the original dataset was generated. Even with the changes Keenan has made to the process for generating these series, that information would likely be useful for people trying to solve the challenge.
Bart:
What in the world are you smoking? I accused Keenan of changing his data set in a way which makes it significantly more difficult for people to win after the challenge was publicized and people had already entered it. Nothing about that suggests "a p*ssing contest."
You may only be interested in the relevance this contest has to the climate debate, but the fact this contest is being run by a person who engages in dishonest and even illegal activities is highly relevant to anyone who might consider participating in the challenge. It's certainly more relevant than this contest is to the climate debate, because this contest tells us absolutely nothing about that except on a sociological level, namely that people will accept any sort of nonsense when it's convenient. There is certainly no value to the climate debate on a technical level in this contest.
I'm not sure I understand what you want me to "concede." Are you saying you want me to concede that a study of numbers can't identify the physical cause of the observed warming? If so... why? No statistical model can determine the cause of anything. Statistical models don't determine physical causes. Statistical models model things to provide us insight and information.
We may then use that insight and information to try to draw conclusions regarding what we observe, but that's a separate step. The application of a model to reality always requires information/knowledge beyond that used to create the model. So... sure, "no statistical model... can determine what is causing the warming." Because no statistical model, on its own, can determine what is causing any physical phenomena.
"You may only be interested in the relevance this contest has to the climate debate, but the fact this blah, blah, blah...
Yawn... Couldn't care less.
"No statistical model can determine the cause of anything."
Thank you. I think we're done here.
No, we do not. Douglas J Keenan is - at best - a clueless buffoon who is savaging a massive strawman and who has wasted a great deal of taxpayers' money getting the Met Office to answer his ridiculous questions. He may not be a clueless buffoon, but - if so - the conclusions are far less complimentary.
I'll even explain this again. This is not true
Noone claims this. Douglas J Keenan can repeat this as many times as he likes. It still does not make it true. Do you get this yet? This entire challenge is savaging a massive strawman. Noone is doing what Douglas J Keenan is claiming that they're doing. Hence, he - and you, it seems - are setting up a massive strawman that you can knock down, pretending that you've just illustrated something significant while anyone who understands this will go "what, huh, are they really that dense?".
@Entropic man
I don't think you've actually looked at Keenan's series.
Douglas J Keenan
My graph was intended to illustrate the problem of identifying trends when the 95% confidence limits are large. OLS was a convenient way of generating the graph, not a statement of how the analysis should be done. A number of your pseudorandom datasets have confidence limits large enough to obscure the trend in the same way that I illustrated. This makes your challenge a lottery since statistical methods cannot identify all the trends.
Your datasets are intended as a test of statistical methods applied to temperature data. May I take it that they are autocorrelated, as genuine data would be?
Douglas J Keenan,
IPCC Chapter 2
Nov 23, 2015 at 10:28 PM | ...and Then There's Physics
"Noone claims this."
I know for a fact that people do, though I am not acquainted with this "Noone" fellow in particular. Hopefully, we can now rely upon you to admonish him when you next encounter him.
Nov 23, 2015 at 10:38 PM | Entropic man
"OLS was a convenient way of generating the graph, not a statement of how the analysis should be done."
Then, how do you know those are 95% confidence intervals for the data in question?
This is my (and Douglas') point. Here you are generating confidence limits for a single delimited member of a statistical ensemble, and you don't even know the true autocorrelation.
"...that allows for first-order autocorrelation in the residuals..."
Wow! A first order autocorrelation! Maybe, someday, they will learn to count to 2 and higher.
Paul
Only a fool would comment on Keenan's data without inspecting it, and only a fool would expect that of me.
If you want to check, go through the data yourself. IIRC the first 300 datasets contain about ten with a final value exceeding 2.0 and the two largest I saw ended with 2.37 and 2.39.
Bart
GISS and others publish confidence limits with their data.
They also publish the raw data, so you can check their figures for yourself.
"GISS and others publish confidence limits with their data."
And, what does that tell you about GISS? It tells me a lot, and explains a lot, too.
Douglas J. Keenan, Brandon Shollenberger,
You both say publishing the original answer file will help people "cheat" on the new problem. I'll accept that (though it's a surprise). Should the new problem not also be withdrawn? The weak encryption of the original answer file is still there and if that still helps the cracker to a six-figure sum...
Anyhow, my original point stands. As long as Brandon can publish his list of answers reasonably quickly (i.e. not through cracking the crypto), when Doug eventually publishes the key to the answer file (as promised above) we can see how it went.
@Entropic man
Your comment about the range of some of Keenan's series and confidence intervals makes no sense. So it seemed likely you hadn't looked. Traces like the extremes of Keenan's set are consistent with about the right amount of natural variability coupled with a 0.01/year trend.
Robert Swan:
I would think so, if that were the real reason for the change in data sets. I don't, however, believe that was the reason he changed the data sets. It is abundantly obvious Keenan made changes to the data set which make the challenge significantly more difficult. I just whipped together a couple of graphs for a user at my site, so I might as well post them here.
This is what you get when you plot the histogram of the linear trends of the two data sets on the same chart. The greenish color shows the trends in the original data set. The pinkish color shows the trends in the new data set. The purple shows where the two overlap. If Keenan only made changes to address concerns about randomness, one would expect the differences in the two data sets to be randomly distributed. As this chart shows, that's not the case.
We can make this more obvious, though, by taking note of the fact that there is no practical difference between positive and negative trends in this challenge. That means we can use the absolute value of the calculated trends instead, giving us this chart. As it shows, the changes between the two data sets are notably non-random. The same is true for any number of other tests one could perform, though I don't think any would produce results as visually obvious.
Given there were clear changes to the method used to generate these data series, and given there was nothing but vague, unspecified concerns regarding supposed RNG vulnerabilities, I'd posit Keenan's true motivation for changing the data sets was to make the challenge more difficult. If so, the RNG issue is just a smokescreen he's using as an excuse.
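For illustration, a minimal sketch of the trend-histogram comparison described above (the file names, delimiter, and column layout are assumptions, not the actual contest files):

```python
# Compare the distribution of absolute OLS trends in the two data sets.
# "series_original.csv" and "series_revised.csv" are hypothetical file names;
# each is assumed to hold the 1000 series as columns of 135 rows.
import numpy as np
import matplotlib.pyplot as plt

def column_trends(path):
    data = np.loadtxt(path, delimiter=",")          # shape (135, 1000), assumed
    t = np.arange(data.shape[0])
    return np.polyfit(t, data, 1)[0]                # OLS slope for every column

old_trends = np.abs(column_trends("series_original.csv"))
new_trends = np.abs(column_trends("series_revised.csv"))

bins = np.linspace(0, max(old_trends.max(), new_trends.max()), 40)
plt.hist(old_trends, bins=bins, alpha=0.5, label="Original data set")
plt.hist(new_trends, bins=bins, alpha=0.5, label="Revised data set")
plt.xlabel("|OLS trend| (°C/year)")
plt.legend()
plt.show()
```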
Nov 23, 2015 at 11:44 PM | Brandon Shollenberger
By the playbook of course. Personalize the problem. Argue to the man. Divert attention from the true question at hand.
You guys are shameless.
Oh, and two other things. First, another notable difference between the two data sets is Keenan rounded everything in the first data set to three decimal places, but he rounded everything in the second data set to two. That is an obvious change nobody could possibly deny, and it reduces the amount of information available to people attempting his challenge.
Second, Robert Swan says:
I don't see any value in me attempting to solve this challenge, either with the original data set or the new one. Quite frankly, I've never believed Douglas Keenan would pay anyone the $100,000. That's not to say I thought he was lying. I just had nothing to make the claim seem believable, and I'm not inclined to believe something just because some guy on the internet says it. That means I've never seen any value in trying to actually win this contest. That view certainly hasn't changed with Keenan changing the data set to make the contest more difficult to win.
I don't foresee that view changing. I did throw together a "guess" that I considered entering into the contest though. I didn't think it would win, but I initially thought this contest closed on November 30th, 2015 rather than November 30th, 2016. As such, I thought we only had two weeks to work on it. I thought it'd be neat to see how close people might get in such a short time. My initial approach which led to my "guess" was to just rank series by the magnitude of their trend and pick the top 500 as having a trend added to them.
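As a rough sketch of that "guess" (assuming, hypothetically, that the series are the columns of a single CSV file):

```python
# Rank the series by the magnitude of their OLS trend and label the top 500
# as trended. "series_revised.csv" is a hypothetical file name and layout.
import numpy as np

data = np.loadtxt("series_revised.csv", delimiter=",")     # shape (135, 1000), assumed
slopes = np.polyfit(np.arange(data.shape[0]), data, 1)[0]  # OLS slope per column

guess = np.zeros(data.shape[1], dtype=int)
guess[np.argsort(np.abs(slopes))[-500:]] = 1               # 1 = "trend was added"
np.savetxt("guess.csv", guess, fmt="%d")
```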
I'm sure much better approaches exist, and I can even think of ways I could potentially refine that one (I even toyed with a couple), but I don't expect I'll put any effort into trying them. It clearly doesn't matter if someone could solve the original challenge. It probably doesn't even matter if they could solve the new one.