Climate science is not sound science
Mar 29, 2007
It's pretty much fundamental that scientific results have to be reproducible in order to be accepted as valid. You have to describe exactly what you did, in sufficient detail for somebody else to be able to reproduce what you say you did. If they can't, and you can't explain where they went wrong, then the result will be written off as erroneous or even fraudulent.

For many specialisms, statistical manipulation is a normal and necessary part of the scientific process. In order for the results to be replicated, a number of things are necessary, but chiefly the underlying data and a full description of the methods applied to it.

Now obviously, for most studies, the amount of data is too large to reproduce in the printed journal. Because of this, many journals try to enforce data availability as a condition of accepting a submission. There seem to be two main approaches. The "strong" approach is that the data must be available in an online archive at the time of publication. The "weak" approach is a requirement that data is made freely available on request.

It's perhaps surprising that Nature, the premier science journal in the UK if not the world, adopts the weak approach. Their data availability policy reads:

An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols available to readers promptly on request. Any restrictions on the availability of materials or information must be disclosed at the time of submission of the manuscript, and the methods section of the manuscript itself should include details of how materials and information may be obtained, including any restrictions that may apply.

Compare this to the Journal of Applied Econometrics:

Authors of accepted papers are expected to deposit in electronic form a complete set of data used onto the Journal's Data Archive, unless they are confidential. In cases where there are restrictions on the dissemination of the data, the responsibility of obtaining the required permission to use the data rests with the interested investigator and not with the author.

Well, so what? 

It matters because the rules are being flouted by scientists - particularly climate scientists - and the journals are struggling to enforce them. Requests for data are being ignored or met by delay and obfuscation. This is unacceptable, particularly for publicly funded scientists. Steve McIntyre details just a few of the problems he has encountered in this comment:

[I]f the data is not archived at the time of publication, the authors will typically move on to other things and there is no guarantee that the data will ever be archived. Lonnie Thompson had never archived any data from his Himalayan sites, some taken in 1987, until I started raising the issue in 2004 and then archived the least conceivable information. The time when the data is most useful is when you read the article. I like to see what actual data looks like before it's massaged and the best time to do this is when you read the article. So the data should be online contemporary with publication rather than a year later when you may or may not still be interested in the file.

As it happens, many of [dendrochronologist Rob Wilson’s] associates aren’t very prompt about archiving data. None of Luckman’s data is archived; Rob’s Icefields and B.C. data done with Luckman are not archived, other than the reconstruction. None of Esper’s data from Tian Shan is archived. Esper refused to provide data except through repeated requests through Science and even after over 3 years of effort, the data provision is still not quite complete.

This situation stinks, and it may well eventually develop into a full-fledged scandal. No science which is not capable of reproduction should be permitted in the IPCC process, and that means the IPCC should insist that data and methods are fully disclosed before a paper is considered.

To my mind it's the journals who must take the primary responsibility for putting it right though. If the Journal of Applied Econometrics is able to insist on concurrent data archiving, then there is absolutely no reason why other disciplines cannot insist too. There is certainly no excuse for Nature, whose scientific cachet is so great that they reject 90% of submitted manuscripts, nor indeed for Science.

Journals that fail to insist on full concurrent disclosure are risking their reputations. If one of these articles is later found to be wrong, or even fraudulent, the journal will certainly get egg on its face. By insisting on concurrent disclosure they will at least concentrate the minds of the authors on ensuring that their data and methodology are flawless.

Let's hope they recognise this and do something about it.

