Monday, Nov 23, 2009

## The code

This is a new thread for updates on the analyses of the data and code freed from CRU.

Everybody, I'm sinking under the weight of things to do here. I need you to post one- or two-line analyses of what you are finding in which bits of code. I'll transfer these to the main post as they come in. It needs to be in layman's language and to have a link to your work.

CRU code

• Francis at L'Ombre De L'Olivier says the coding language is inappropriate. Also inappropriate use of hard coding, incoherent file naming conventions, subroutines that fail without telling the user, etc etc.
• AJStrata discovered a file with two runs of CRU land temperature data which show no global warming in the data laid out by country, and another CRU file showing their sampling error to be +/- 1°C or worse for most of the globe. Both CRU files show there has been no significant warming post-1960.
• A commenter notes the following comment in some of the code: "***** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********"
• Good layman's summary of some of the coding issues with a file called "Harry". This appears to be the records of some poor soul trying to make sense of how the code for producing the CRU temperature records works. (rude words though, if you're a sensitive type)
• Some of the annotations of the Harry code are priceless - "OH **** THIS. It's Sunday evening, I've worked all weekend, and just when I thought it was done I'm hitting yet another problem that's based on the hopeless state of our databases. There is no uniform data integrity, it's just a catalogue of issues that continues to grow as they're found."
• CRU's data collation methods also seem, ahem, amusing: "It's the same story for many other Russian stations, unfortunately - meaning that (probably) there was a full Russian update that did no data integrity checking at all. I just hope it's restricted to Russia!!"
• Borepatch discovers that CRU has lost its metadata. That's the bit that tells you where to put your temperature record on the map and so on.
• Mark in the comments notices a file called resid-fudge.dat, which he says contains, believe it or not, fudged residuals figures!
• Mark in the comments notes a program comment - "Apply a VERY ARTIFICAL correction for decline!!" - followed by the words "fudge factor". See briffa_sep98_d.pro.
• From the programming file combined_wavelet.pro, another comment, presumably referring to the famous Briffa truncation: "Remove missing data from start & end (end in 1960 due to decline)".
• From the file pl_decline.pro: "Now apply a completely artificial adjustment for the decline (only where coefficient is positive!)"
• From the file data4alps.pro: "IMPORTANT NOTE: The data after 1960 should not be used. The tree-ring density records tend to show a decline after 1960 relative to the summer temperature in many high-latitude locations. In this data set this "decline" has been artificially removed in an ad-hoc way, and this means that data after 1960 no longer represent tree-ring density variations, but have been modified to look more like the observed temperatures."
• From the Harry readme: "What the hell is supposed to happen here? Oh yeah - there is no 'supposed', I can make it up. So I have :-)... So with a somewhat cynical shrug, I added the nuclear option - to match every WMO possible, and turn the rest into new stations (er, CLIMAT excepted). In other words, what CRU usually do. It will allow bad databases to pass unnoticed, and good databases to become bad, but I really don't think people care enough to fix 'em, and it's the main reason the project is nearly a year late." (See Harry readme, para 35.)
• James in the comments says that in the file pl_decline.pro the code seems to be reducing temperatures in the 1930s and then adding a parabola to the 1990s. I don't think you need me to tell you what this means.
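For readers without IDL to hand, the "very artificial correction" described in these bullets can be re-expressed in Python. This is only a sketch of the quoted logic (the full yrloc/valadj arrays appear verbatim in PR Guy's comment further down the thread); NumPy's interp stands in for IDL's interpol, and it is not CRU's actual code:

```python
import numpy as np

# Year breakpoints from the IDL: 1400, then 1904, 1909, ..., 1994.
yrloc = np.concatenate(([1400], np.arange(19) * 5 + 1904))

# The hand-picked offsets labelled "fudge factor" in the source,
# scaled by 0.75 as in the original.
valadj = np.array([0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
                   0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]) * 0.75

# The IDL raises 'Oooops!' if these differ in length.
assert len(yrloc) == len(valadj)

# The correction for each year is linearly interpolated from the
# breakpoints: zero before the 1920s, a small negative dip around the
# 1930s, then a steep rise to +1.95 by 1994.
years = np.arange(1880, 1995)
yearlyadj = np.interp(years, yrloc, valadj)
```

Plotting yearlyadj makes the shape obvious: flat at zero until the mid-1920s, a slight dip, then a steep artificial climb through the 1980s and 1990s - exactly the "decline" being corrected away.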



Clearly, this is ugly code, and there are strong indications that a particular result is desired. But is there any evidence that these programs are the basis for published papers? Also, this code doesn't seem to be part of a GCM - as others have mentioned, there are no physical laws being simulated. Instead, we are likely looking at code to do statistical tests, curve-fitting, output manipulation, etc. Circumstantially suspicious, but not a smoking gun, IMO.

But, the point is - why do we need to be guessing about this? The data and code which are the basis of published papers should all be freely publicly available, and clearly documented in a way that allows any knowledgeable individual to exactly reproduce the published results. There should be no detective work involved. This is how science is done. If you think it is justified to manipulate data, then do so, but tell everyone why and how you are doing it, and let them have the data so that they can judge for themselves the validity of these manipulations.

In other words, the larger context is what's important here: the CRU have not been making the data and code available, to the point where an FOI request has been made to try to force it out of them! This proves that the CRU is not doing science - they are doing politics. All of their published work should be withdrawn (and NASA's, too, to the extent they have not released their code and data), and there should be a serious reconsideration of any other published work which substantially depends on CRU-derived results, as well.

Nov 25, 2009 at 1:45 PM | Roger Zimmerman

Ah, you found "pl_decline.pro" I see. But for a real treat, see "pl_decline_nerc.pro"

I've seen better code from high school students.

Nov 25, 2009 at 4:24 PM | mojo

Another area worth looking at:

In Harry's readme file, he talks about a conversion program to convert from one format to another. He also wrote a program to correlate two data files.

The part I find worrying is this:

====================================================================
Files compared:
1. cru_ts_2_10.1961-1970.tmp
2. glo2grim1.out

Total Cells Compared 4037
Total 100% Matches 0
Cells with Corr. == 1.00 0 ( 0.0%)
Cells with 0.90<=Corr<=0.99 3858 (95.6%)
Cells with 0.80<=Corr<=0.89 119 ( 2.9%)
Cells with 0.70<=Corr<=0.79 25 ( 0.6%)

..which is good news! Not brilliant because the data should be
identical.. but good because the correlations are so high! This
could be a result of my mis-setting of the parameters on Tim's
programs (although I have followed his recommendations wherever
possible), or it could be a result of Tim using the Beowulf 1
cluster for the f90 work. Beowulf 1 is now integrated in to the
latest Beowulf cluster so it may not be practical to test that
theory.

15. All change! My 'glo2grim1' program was presciently named as
it's now up to v3! My attempt to speed up early iterations by
only reading as much of each glo file as was needed was really
stupidly coded and hence the poor results. Actually they're
worryingly good as the data was effectively random :-0
====================================================================

So he can feed (in his words) random data into this and get correlations exceeding 0.9 for over 95% of the data!

Is this how they compare their results against those of others, and then claim correlation between results?

If so, the whole of their research is bogus.
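One hedged guess at how effectively random data can still correlate above 0.9: monthly temperature values share a large annual cycle, and any correlation computed on raw (non-anomaly) values is dominated by that cycle rather than by real agreement. A toy Python illustration of the effect - my own construction, nothing to do with CRU's actual programs:

```python
import numpy as np

rng = np.random.default_rng(0)
months = np.arange(240)  # 20 years of monthly values

# A shared annual cycle, amplitude 10 degrees.
seasonal = 10.0 * np.sin(2.0 * np.pi * months / 12.0)

# Two "station" series: the same cycle plus completely independent noise.
a = seasonal + rng.normal(0.0, 2.0, months.size)
b = seasonal + rng.normal(0.0, 2.0, months.size)

# The correlation is dominated by the shared cycle, not by any real
# agreement between the noise components (theoretically ~50/54 ~ 0.93).
r = np.corrcoef(a, b)[0, 1]
```

If the comparison Harry describes was done on raw monthly values rather than anomalies, figures like "95% of cells with corr >= 0.90" could arise even from effectively random data - which would fit his own "worryingly good" observation.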

Nov 25, 2009 at 5:03 PM | PJP

The Windows equivalent of
> grep -iR artificial FOIA/documents/
would be
> findstr /i /s artificial FOIA/documents/*

Enjoy.

Nov 25, 2009 at 6:43 PM | Shazam Orange

Bishop, I've posted up an account of my foia experience with CRU at Watts Up With That:

http://wattsupwiththat.com/2009/11/24/the-people-vs-the-cru-freedom-of-information-my-okole.../

If you could note this in your head post it'd be great.

Thanks, keep up the good work,

w.

Nov 25, 2009 at 6:49 PM | Willis Eschenbach

RE: searching the emails & files

The fastest, most convenient way to search the whole archive (emails & programs & everything else) is to open the zip file in Winzip and click the "find" button, which will give you a nice clickable list of each string or file desired, case insensitive or not selectable, etc.

Nov 25, 2009 at 7:08 PM | mark

I found the remarks of AJStrata most in keeping with my own questions regarding AGW. How do you claim an accuracy for temperature measurement that is not possible? The temperature trend is noise - the AGW issue should have ended a very long time ago as a result.

Nov 25, 2009 at 8:30 PM | PatM

"I dunno what you windows users use for recursively searching for case-insensitive strings, but I truly loves me my /usr/bin/grep"

Well, cross-platform developers who develop on Windows probably do the same as me and use ports of GNU grep on Windows. :)

As a side note... along with Win32 there is also an OS/2 and a POSIX (Interix now, I believe) subsystem.

Nov 25, 2009 at 9:01 PM | Mick

I remember a guy at Imperial 20 years ago saying he didn't know what language he would be using in twenty years' time, but he knew it would be called FORTRAN.

For really big number crunching it's still a good pick (for the penny-ante, small-scale stuff these guys seem to have been working on, MATLAB would have been better). The work that has gone into compilers and libraries is simply astronomical, and if you're writing, for example, something that uses a lot of linear algebra and needs to run on a big vectorised machine, FORTRAN will still be your go-to language. There's nothing that says it has to result in spaghetti code. Sound software engineering practices apply here just as anywhere else. A big problem is that academic scientists and engineers are not software engineers, and are often quite disdainful of sound coding methods when a 'quick and dirty' approach will do the job adequately at the time. But this inevitably leads to the whole panoply of horrors we're seeing in the CRU code: unmaintainable codebases, ad hoc and primitive hacks to bludgeon data into line, and missing/incomplete/corrupted datasets. The difference is that in most other areas the shoddy code isn't being used to bolster arguments for the fundamental rearrangement of Western civilisation.

Nov 25, 2009 at 9:42 PM | David Gillies

Look too at the connection between CRU and the NZ-based NIWA (National Institute for Water and Atmospheric Research). Dr Jim Salinger, whose name comes up frequently, was fired by NIWA earlier this year. THIS could be why...

Nov 25, 2009 at 11:24 PM | Ayrdale

Haha, none of you really know what you're talking about.

Nov 25, 2009 at 11:43 PM | Tristan

I used the search term 'synthetic' on the HARRY_READ_ME file today, and it threw up some interesting stuff (which I may have completely wrong, of course).

'Harry', it seems, got the temperature side to work relatively quickly, but ran into problems with 'cloud' and other bits, because this is not just a temperature model but a climate model, to which cloud/sun and precipitation have been added. A surprise to 'Harry' (and to me) was that the programme produced anomalies first (+ or - deviations from a mean) rather than actual values, and it seemed to do this using 'synthetic cloud data' to which the actual data was added afterwards. This is not unusual for a model. But this set me thinking - what if each part of the model was derived like this? (see link if interested - I didn't post all here for sake of brevity)

Nov 26, 2009 at 1:35 AM | vjones

> Haha, none of you really know what you're talking about. <

Was that supposed to convince us that YOU do?

Back up your claim or go away.

Nov 26, 2009 at 2:00 AM | Ryan

I think a bit too much attention is being given to the source code comments. I am a software engineer and I have been going through the cru-code directory. The first thing any programmer will notice is the general lack of comments. That makes code like this very difficult to hand over to someone else in the future, for future maintenance and modification.

Let's take a look at a couple of examples:
documents\cru-code\f77\mnew\sh2sp_m.for

This program, sunh2sunp, converts the "sun hours monthly time series to sun percent (n/N)". I do not have access to the cited reference used for the calculation, so I have not yet determined whether the code is implemented correctly. This program appears to have been used primarily from 1997 to 2003.

However, in the odd situation where the calculation exceeds 100%, the code, surprisingly, checks for this but then leaves the incorrect value in place:

c decide what to do when % > 100
if(sunp(im).gt.100)sunp(im)=sunp(im)

For non-programmers this says, in simplified form:
if x > 100, then let x = x
which means: leave the wrong value the same as it was. Hello?

Normally, an incorrect value is either flagged as an error or, if the excess is just round-off error (as it may be here), clamped - in which case we might expect something like:
if x > 100, then let x = 100
which would force the value of x to never exceed 100.

The purpose of this program and how it fits into any analysis is not yet understood. The program appears to date back to 1997 (and probably went out of usage by 2003) and it may no longer be in use. It is entirely possible that the above error condition never occurred - and consequently, this defect in the software would have no impact on the results.

In a separate code file (sp2cld_m.for), the above test is implemented correctly:
IF(CLD(im).GT.80.0) CLD(im)=80.0
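For the record, the clamp that the original check appears to intend is a one-liner. A hypothetical corrected version sketched in Python (the function name is mine, not CRU's):

```python
def clamp_sun_percent(sunp, limit=100.0):
    # What the Fortran check in sh2sp_m.for appears to intend:
    # cap values that exceed the limit (e.g. from round-off error)
    # instead of leaving them unchanged.
    return min(sunp, limit)
```

So clamp_sun_percent(103.2) returns 100.0, whereas the original Fortran leaves 103.2 in place.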

The file exhibits poor Fortran coding standards such as:
ratio=(REAL(sunp(im))/1000)
IF(RATIO.GE.0.95) CLD(im)=0
Note the lower-case 'ratio' and upper-case 'RATIO' variable names. Fortran is case-insensitive, so these refer to the same variable, but it is best to be consistent - in many other programming languages, consistency is enforced by the compiler.

The variables XLAT and RATIO are not declared; similarly iy, iy1 and iy2. Fortran permits this practice (implicit typing) and automatically assigns a type based on the first letter of the variable name: A through H and O through Z become type 'real', I through N 'integer'. Use of this feature is discouraged because the compiler is then unable to flag typographical errors - instead of warning about an undeclared variable, it silently creates a new one, which can result in erroneous program operation. Note: this is a software engineering issue; it does not appear to have caused an implementation or execution error in this program.

Note - the issues I cite do not mean the programs executed incorrectly; they are indications of poor programming practice. And I believe we, the people, deserve the utmost care and professionalism in a matter as important as this. Climate change is too important to be conducted in secret, with such disregard for quality and reliability in the data analysis and models.

Second Example:
documents\cru-code\linux\cruts
This code is used to convert new data into the new CRU 2.0 data format (.cts files). There is another version of this code in cru-code\alpha which is, per a comment in the readme file, intended for running on the "Alphas".

Data can come in from text files or Excel spreadsheet files (or, actually, Excel spreadsheets written out to text files from Excel). These programs are designed to read multiple climate data file formats, including:
GHCNv2
CLIMAT (Phil Jones format)
MCDW
CLIMAT (original)
CLIMAT (AOPC-Offenbach)
Jian's Chinese data from Excel (appears to be text output from Excel)
CRU time-series file format - with the comment "(but not quite right)"

Data files for running these code files are not available in this archive.

Software engineering comment: this collection of programs - very large source code files - implements a crude database management system. Most of the source code is uncommented and undocumented. From a software engineering perspective, it would most likely have been quicker and more reliable to use an existing DBMS that had been extensively tested and verified, rather than writing this extremely large amount of custom code. There is no evidence of software quality assurance (SQA) procedures being applied - no test plan, test scenarios, unit testing, test-driven development and so forth.

The goal of the software is to eventually calculate the anomalies of the temperature series from the 1961-1990 mean.

Because station reporting data is often missing, the code works to find nearby stations and then substitute those values directly or through a weighting procedure. In effect, the code is estimating a value for missing data. Station data will be used as long as at least 75% of the reporting periods are present (or stated the other way, up to 25% of the data can be missing and missing data will be estimated).
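A simplified Python sketch of that rule - my reconstruction of the described behaviour (anomalies from a base-period mean, rejecting stations with more than 25% of the base period missing), not CRU's actual code:

```python
import numpy as np

def station_anomalies(values, years, base=(1961, 1990), min_coverage=0.75):
    """Return values as anomalies from the base-period mean.

    Missing observations are NaN. The station is rejected if fewer than
    75% of the base-period values are present, mirroring the threshold
    described in the readme.
    """
    in_base = (years >= base[0]) & (years <= base[1])
    base_vals = values[in_base]
    present = np.isfinite(base_vals)
    if present.mean() < min_coverage:
        raise ValueError("insufficient base-period coverage")
    return values - base_vals[present].mean()
```

The real code goes further, estimating the missing values themselves from weighted nearby stations before computing anomalies; this sketch only shows the anomaly calculation and the coverage threshold.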

The linux\_READ_ME.txt file contains an extensive description. Of interest, stations within 8km of each other are considered "duplicates" and the data between the stations is "merged". I have a question about this which may not really matter - but there is no attempt to determine if the nearby stations are correlated with one another. It is possible, for example, that one station is near a body of water (and less volatile) and another is on the roof of a fire station (see surfacestations.org). Or the stations could be at different elevations. In my town, the official weather reporting station moved 4 times over the past century - from downtown in a river valley, to eventually up on a plateau next to a windy airport. These locations would today fall within the 8km bounding area. My concern is that this could skew results in an unpredictable way. For example, the valley is prone to cold inversion layers while the airport, 200 meters higher in elevation, is then windy, dry and sunny. Then again, it could be that the situations like I describe are rare and would have negligible impact on the calculations.

I doubt there is a test plan for any of this code - I doubt there are test scenarios and test scripts for automated testing. This means that when someone "tweaks" the code in the future, they will have to rely on the "karma feels right" school of software quality rather than having a reliable way of ensuring that future changes do not break existing code.

I am considering filing a Freedom of Information Act request for a copy of their test plan documents regarding these programs. I doubt the plans exist.

Nov 26, 2009 at 3:35 AM | Eric

I believe that for climate change to be taken seriously, we must ask that the climate scientists give their models the level of concern and care that professional software is given.

Would you wish to fly on an airplane whose aerodynamic modeling simulation code was written in this haphazard, poorly engineered code?

We deserve much better from the climate science community.

Nov 26, 2009 at 3:38 AM | Eric

PatM,

Because the IPCC doesn't care if error bars are correct "and having them is all that matters. It seems irrelevant whether they are right or how they are used."

From: Phil Jones <p.jones@xxxxxxxxx.xxx>
To: mann@xxxxxxxxx.xxx
Subject: Re: Out in latest J. Climate
Date: Thu Aug 4 09:49:54 2005
Mike,
Gabi was supposed to be there but wasn't either. I think Gabi isn't
being objective as she might because of Tom C. I recall Keith
telling me that her recent paper has been rejected, not sure if outright
or not.
Gabi sees the issue from a D&A perspective, not whether any curve
is nearer the truth, but just what the envelope of the range might be.
There is an issue coming up in IPCC. Every curve needs error
bars, and having them is all that matters. It seems irrelevant whether
they are right or how they are used. Changing timescales make this
simple use impractical.
We have a new version of HadCRUT just submitted, so soon
the'll be HadCRUT3v and CRUTEM3v. The land doesn't change much.
This has errors associated with each point, but the paper doesn't yet
discuss how to use them. I'll attach this paper. Only just been submitted to JGR - not
in this format though. This format lays it out better.
Thanks for reminding Scott.
Cheers
Phil

Nov 26, 2009 at 3:52 AM | mark

"RC is simply trying to play the victim card, in a broad CYA attempt. Its simply a diversionary tactic."

For those still unaware, realclimate is owned by Arlie Schardt - Al Gore's press officer. You'll find it registered to 'Environmental Media Services' on a whois.net search.

As such, I feel at complete liberty to entirely disregard anything they say.

Nov 26, 2009 at 4:31 AM | frank verismo

In HARRY_READ_ME.txt, at line 1496, Harry remembers that the precipitation grid around the southern tip of Greenland has been tweaked to fit expectations!! Gets you wondering whether there is any real science modelled in this simulation.

Firstly, wrote mmeangrid.for and cmpmgrids.m to get a visual comparison of old and
new precip grids (old being CRU TS 2.10). This showed variations in 'expected' areas

Nov 26, 2009 at 4:43 AM | reutefleut

There is a PDF called “idl_cruts3_2005_vs_2008b.pdf” in the documents folder that has plots of seasonal temps for 154 stations from around the world. Each plot has two traces overlaid:

19012005ann_seasons_regcountrymeans____.climgen (black)
19012008ann_seasons_regcountrymeans____.climgen (magenta)

Not surprisingly, if you zoom in, you can see the black trace ends in 2005 and the magenta trace ends in 2008. The really interesting thing is that in the overwhelming majority of the cases, the magenta trace has a cooler trend than the black trace. The magenta trace often shows a warmer early part of the 20th century and a cooler late part of the century than the black trace. Some of the differences are quite large, over half a degree C.

It seems clear that these two versions were calculated using different algorithms or possibly different input data sets. The author of the Harry read me talks of his inability to replicate earlier data and his revising sections of the code that didn’t seem to make sense. Just a thought, but perhaps this could be a comparison of the earlier version and his results? Regardless, it does make one wonder why the later version shows less warming.

Nov 26, 2009 at 7:00 AM | Jeff C.

From the file pl_decline.pro: check what the code is doing! It's reducing the temperatures in the 1930s, and introducing a parabolic trend into the data to make the temperatures in the 1990s look more dramatic. If this sort of adjustment is widespread in the code...

Nov 26, 2009 at 11:58 AM | James Smith

from http://online.wsj.com/article/SB10001424052748703499404574557583017194444.html

Q: How many climate scientists does it take to change a light bulb?

A: None. There's a consensus that it's going to change, so they've decided to keep us in the dark.

Nov 26, 2009 at 3:55 PM | mark

Any company involved in large-scale IT projects would have a project plan, documented QA procedures, and regular QA audits. Do we know if any of these were implemented at UEA?

Nov 26, 2009 at 4:35 PM | Jeremy Poynton

Might want to look at this:

http://strata-sphere.com/blog/index.php/archives/11518

Nov 26, 2009 at 5:29 PM | Dave

-------
I've been trying to puzzle out for myself what this code is actually trying to do. My IDL-fu is effectively non-existent, however…

If you expand ESR's original quote a little bit you get…

plot,timey,comptemp(*,3),/nodata,$
  /xstyle,xrange=[1881,1994],xtitle='Year',$
  /ystyle,yrange=[-3,3],ytitle='Normalised anomalies',$
;  title='Northern Hemisphere temperatures, MXD and corrected MXD'
  title='Northern Hemisphere temperatures and MXD reconstruction'
;
yyy=reform(comptemp(*,2))
;mknormal,yyy,timey,refperiod=[1881,1940]
filter_cru,5.,/nan,tsin=yyy,tslow=tslow
oplot,timey,tslow,thick=5,color=22
yyy=reform(compmxd(*,2,1))
;mknormal,yyy,timey,refperiod=[1881,1940]
;
; Apply a VERY ARTIFICAL correction for decline!!
;
yrloc=[1400,findgen(19)*5.+1904]
valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,$
  2.6,2.6,2.6]*0.75 ; fudge factor
if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'
;
;
;oplot,timey,tslow,thick=5,color=20
;
filter_cru,5.,/nan,tsin=yyy,tslow=tslow
oplot,timey,tslow,thick=5,color=21

I read this as being responsible for plotting the 'Northern Hemisphere temperatures and MXD reconstruction'.

Note, however, the commented-out code. The way I'm reading this, any graph titled 'Northern Hemisphere temperatures, MXD and corrected MXD' with a thick red line, released before the brown matter hit the whirly thing, probably has cooked data. Likewise, if you see a thick blue line you might be OK.

Where do I get the thick red line from?

According to the IDL Reference Guide for IDL v5.4, thick=5 means 5 times normal thickness, and color=21 links to (I believe)

def_1color,20,color='red'
def_1color,21,color='blue'
def_1color,22,color='black'

from just above the code segment itself.

However, there's one thing that I'm not sure about without either being able to play with IDL or seeing the end graph. It does actually plot the uncooked data in black (oplot,timey,tslow,thick=5,color=22), so what's the point of showing a cooked data line along with the uncooked data line? Not to mention that the uncommented version apparently plots the same data series twice.

Nov 26, 2009 at 5:42 PM | PR Guy

@Eric: Re: FOI: Just make sure you have your 10 pounds sterling ready and waiting. Those guys seem obsessed with this. Times must be hard in East Anglia.

Nov 27, 2009 at 1:36 AM | MikeE

Q: How many climate scientists does it take to change a light bulb?

A. You'll have to fill out a FOI request to get this information.

What's that, you have already? Well, it doesn't matter; we've squared the FOI guy and he's keeping shtumm.

Nov 27, 2009 at 1:39 AM | MikeE

Making changes to Wikipedia can be very challenging due to AGW gatekeepers. Wikipedia is a microcosm of the battle to control the flow of information. Most of the articles regarding "climategate" are thinly disguised excuses for, and defences of, CRU and its fraud. I made a comment on Stephen McIntyre's bio that it needed to be reviewed in light of what appears to be some vindication of his point of view. http://en.wikipedia.org/wiki/Talk:Stephen_McIntyre

Nov 28, 2009 at 3:29 AM | Strix

At first, as I read through the various coding comments, I thought "well, this is really interesting, but what does it really prove?" Having just looked again at the most recent article in the Telegraph, I fully appreciate the work and results that reviewers of the files are producing. What I find most striking is that in just a couple of days the coding comments have completely compromised the explanations offered by CRU personnel, literally on a point-by-point basis. What we need is to compile this contradictory information into point-by-point refutations of the answers being given to the public by CRU, as it is devastating. I will help, and have left my e-mail address for further contact. Anyway, I am relatively new to all this and truly appreciate the work being done on this blog. Priceless comes to mind, but I prefer Visa myself!

Nov 28, 2009 at 8:25 AM | Daniel

Jeff C, regarding those two lines (red and black), did you check Hawaii and the Cook Islands on page 24? The change from 2005 to 2008 is pretty big!

Not to mention that these islands are the only pieces of land in the middle of millions of square kilometres... the impact spreads.

Nov 28, 2009 at 7:11 PM | Jarmo; Finland

How many millions did we pay for this kludge? And, its output is used to tune the GCMs?

It's apparent they would be hard pressed to regenerate the temperature history anew using this code, the few raw data they apparently still possess, and the missing metadata. This stuff is poor even by research coding standards. It does not exactly instill confidence in their "theory" of AGW.

Nov 28, 2009 at 8:12 PM | Joe Crawford

re: hacked vs. leaked:
The BBC meteorologist claimed he got the same set of emails in October. The most recent email in foi2009.zip is dated November 12th, I believe (if not the 14th). Conclusion: the BBC weatherman is lying, or got a subset of the emails concerning him personally. Conclusion to conclusion: the BBC weatherman got the emails concerning him by email, and he has the source address (probably a gmail.com account). Conclusion to conclusion to conclusion: leaked, not hacked, probably by someone working with a semi-secure directory listing inside the uni.

Nov 29, 2009 at 2:02 PM | millstone

PS for Windoze do-it-yourselfers/microcomputer hobbyists, here's a small collection of Fortran compilers/environments for you to tinker around with. Sorry to use Rapidshare - it sucks lately - so try for the small files first before it tells you to come back later, or use the workaround:

fortran.exe 444 KB http://rapidshare.com/files/313835984/fortran.exe.html

F80.zip 100 KB or so? http://rapidshare.com/files/313836256/F80.ZIP.html

watcom fortran 11.0c-b1.exe 35.9 MB http://rapidshare.com/files/313840193/watcom-fortran-11.0c-b1.exe.html

Rapidshare says the last can be downloaded 10 times max, so go for it. Check for viruses as always.

Nov 29, 2009 at 2:28 PM | millstone

Jarmo; Finland: the station for CO2 measurements in Hawaii is on the active volcano Mauna Loa on the island of Hawaii, read: lots of carbon dioxide output, totally unrepresentative.

Nov 29, 2009 at 2:48 PM | millstone

Check this out: in mbh98.tar\TREE\ITRDB\NOAMER, which is in the documents\mbh98-osborn.zip file, there is the infamous BACKTO_1400-CENSORED file AND a bunch of others, e.g. BACKTO_1400-FIXED, BACKTO_1400, BACKTO_1300-CENSORED, etc. Haven't had a chance to analyze them yet, but I hope others will take a look too!

Nov 29, 2009 at 4:51 PM | mark

Re my post above, I have plotted the BACKTO_1400-CENSORED, BACKTO_1400-FIXED, and BACKTO_1400 tree ring data files on my new blog at http://hockeyschtick.blogspot.com/

Nov 29, 2009 at 8:06 PM | mark

Big huge thank you to all reporting and otherwise working on this.

Truth shall set us free!

Nov 30, 2009 at 3:54 AM | Meridian

I've inserted a link to your fine work within an essay I wrote on this topic. Thank you for your dedication.

Nov 30, 2009 at 8:51 PM | Patvann

Dear hfj: ad hominem attacks are soooo compelling. Please, sir, may we have another?

Dec 2, 2009 at 6:59 PM | Redducati

"James in the comments says that in the file pl_decline.pro the code seems to be reducing temperatures in the 1930s and then adding a parabola to the 1990s. I don't think you need me to tell you what this means."

This is strange. Willis Eschenbach has discovered that the "raw temperature data" of the Arctic and a few other areas (Africa, Australia) show them to be warmer in the 30's than today!

What do you think of that? It needs further investigation/sleuthing.

Dec 2, 2009 at 9:45 PM | Richard

For a satirical look at the programming fraud:

Dec 4, 2009 at 12:17 AM | Andrew

I have it on good sources that the Vatican has employed several scientists to do some peer-reviewed articles for Science Magazine, Discovery and Scientific American that vindicate the Catholic Church on the Galileo issue....

Dec 4, 2009 at 4:06 AM | Timray

This is only the third time I have ever posted to a blog, but after stumbling across you guys on my daily read of El Reg, I couldn't help but share this. I am an old retired computer programmer who cut my teeth on Fortran before many of you were born. It was so refreshing "hearing" my own kind talk about this stuff. Reading your banter and feeling your sense of humour reminded me of how much more fun we are than those guys. We make our judgments on the quality of the code, not the number of PhDs we may have. We may call each other names, but we are rarely mean-spirited. In the open source world, our product is truly open for everybody to slice and dice. In the closed source world, there is still rigorous code review, testing and a standard of documentation. That is my definition of professional.

My first two posts were to the RealClimate website; here is a link to the first, post #146, from which you can follow to the second, #182: http://www.realclimate.org/index.php/archives/2009/12/cru-hack-more-context/comment-page-3/#comments

As you can see, you boys and girls may be much smarter than I am, but I'm not sure which is higher, "sub-intellectual with pretensiousness" (sic) or "intellectual midget". Now, to be fair, the moderator did allow my comments and did not comment on them, even though, reading all the other posts (which I did), he can be summary and acidic. So there appears to be at least one adult in the group. It is also possible that these responses are from some outside conspiracy trying to make them look bad; we have to consider that. I haven't wasted my time tracking them down. I have to thank them, though: it is good to start the day with a laugh.

I had considered downloading their code and trying it myself, but you folks seem to be having plenty of fun with that. I have been having too much fun with the new Google language, Go. Applications like logic simulation, neural networks, or my personal favourite, massively parallel process monitoring and control, go from extremely difficult to trivial and fun. Give it a try!

Have a pint for the old guy.

Dec 4, 2009 at 3:51 PM | j gordon

The code has hard-coded constants everywhere. This is absolutely terrible practice. The coder should always define a constant as a preprocessor directive or as a well-named variable.

For example (I will give C code here).

Instead of using 3.1415 for pi and then writing

while (x < 3.1415)

I can write

#define PI 3.1415

and then write:

while (x < PI)

Alternatively, you could simply declare it as a global constant (const double PI = 3.1415;).

By hard-coding constants everywhere, it becomes almost impossible to know what these numbers signify. The programmer would be fired in the professional software world for doing this.

Dec 6, 2009 at 7:33 AM | Glob

1. Lack of factoring of the code into manageable functions. Large blocks of code are written in-line.
2. It doesn't use MySQL, SQL Server or Oracle for its "database". Its databases appear to be flat text files. Thus they lack the enforced referential integrity and strong typing one gets with a real database, as well as the speed of retrieval, the ability to join, etc. This is incredibly significant.

Dec 6, 2009 at 7:43 AM | Glob

The hypothesis that human activity may affect the climate seems to me to be a valid area of research.

Man uses fossil fuels, which required vast geological timescales to create, and is now releasing that energy over a relatively short period of time.
Areas of the earth's surface are being deforested.

So we need to test this hypothesis.

The only problem is we don't have the necessary firsthand experimental data.

Historical data preceding the recorded era has to be reconstructed using scientific hypotheses that are probably themselves subject to debate.

Recorded data is error prone: subject to changes in station location, missing data, etc.

I have university degrees in Analytical Chemistry and Electronic Engineering. I am not a Climate Engineer, but I think I have a reasonable understanding of the experimental method and modelling to realize we are dealing with an enormously complex system when trying to understand how the earth's climate works.

It appears to me that the hypothesis was chosen first, then a quick look around was made for what man does that could affect the environment, and the only thing that stood out was burning fossil fuels.

As a prerequisite, was enough research done on any other likely causes of climate change?

BIG HINT.

Go outside, look up into the day sky and what do you see?

I could get cynical here (and juvenile) and say that, given the research is done in Britain (where the only way you know it's summer is that the rain gets hotter), maybe these guys just didn't get outside.

I have had to rely on these "climate scientists" to do the research and am bitterly disappointed if even 1% of the climategate revelations are true.

I don't know how these guys are going to unwind their research and start again, but clearly there must be sufficient doubt about the results that the work has to be started afresh.

My thanks to the contributors here who have the time and expertise to pick up the errors, mistakes and falsehoods.

Cheers

Norm

Dec 6, 2009 at 7:59 AM | Norman Webb

From: Tom Wigley <wigley@xxxxxxxxx.xxx>
To: Mike Hulme <m.hulme@xxxxxxxxx.xxx>

Subject: Re: New MAGICC/SCENGEN
Date: Mon, 9 Feb 1998 15:48:15 -0700 (MST)

It just happens that, in your version, I 'faked up' column 5 as the difference between column 6 and the sum of columns 2, 3 and 4. I did this simply to get the code working; but (as you now know) I never got around to fixing it up until now. In the latest version, column 6 is again equal to the sum of columns 2, 3, 4 and 5 because I scale columns 3, 4 and 5 to ensure that this is so. . . .
(3) Re HadCM2, again it is impossible to be consistent. What I said before is that the reason for adding these results is simply to make them readily available. I do *not* advocate using them in combination with any other model results.. . .
etc.

Dec 10, 2009 at 10:32 PM | David L. Hagen

Fortran is fine. Fortran is still used extensively in mathematical modelling because it is good for this sort of task.

Even the application of a fudge-factor table is OK if the model actually warrants it, for example if it is a set of coefficients for a filter. But if it is really a tweak to fake a desired result, then of course that is not right.

What stuns me about these models is that they don't seem to be energy based but temperature based. Warming means adding energy. It does not mean increasing temperature. These models seem to ignore latent heat. Any climate model that ignores latent heat is surely pointless.

Feb 9, 2010 at 6:21 PM | charles

I have an issue with scientists who are unqualified in software development writing bad code.

Tim Mitchell appears in the Climategate emails; he did his PhD at CRU and worked at Tyndall. He collated the responses from "the eleven" to build the consensus that Tom Wigley thought was reprehensible.

A former geography student and evangelical eco-Christian, he wrote the "labyrinthine software suites" that "Harry" works on.

"At Oxford University I read geography (1994-1997, School of Geography). My college was Christ Church. At Oxford I developed a special interest in the study of climate change."

Tim Mitchell again (HADCRU 1.2 and 2.0; Harry is working on HADCRU 3.0, one of only three global temperature datasets):

“…Although I have yet to see any evidence that CLIMATE CHANGE is a sign of Christ’s imminent return, human POLLUTION is clearly another of the birth pangs of creation, as it eagerly awaits being delivered from the bondage of CORRUPTION (Romans. 19-22).

Tim Mitchell works at the Climatic Research Unit, UEA, Norwich, and is a member of SOUTH PARK Evangelical Church.”

I found that he is now a minister via the articles he posts on:
http://www.e-n.org.uk/4928-Creation-at-worship.htm
and put this into google: Tim Mitchell, Eastgate, Lewes

All his articles are here, showing him working at CRU and Tyndall, then leaving to study at LTS (London Theological Seminary):
http://www.e-n.org.uk/searchpage.php?term=tim+mitchell

From: Tom Wigley <wigley@xxxxxxxxx.xxx>
To: jan.goudriaan@xxxxxxxxx.xxx, grassl_h@xxxxxxxxx.xxx, Klaus Hasselmann <klaus.hasselmann@xxxxxxxxx.xxx>, Jill Jaeger <jaeger@xxxxxxxxx.xxx>, rector@xxxxxxxxx.xxx, oriordan@xxxxxxxxx.xxx, uctpa84@xxxxxxxxx.xxx, john@xxxxxxxxx.xxx, mparry@xxxxxxxxx.xxx, pier.vellinga@xxxxxxxxx.xxx
Subject: Re: ATTENTION. Invitation to influence Kyoto.
Date: Tue, 25 Nov 1997 11:52:09 -0700 (MST)
Cc: Mike Hulme <m.hulme@xxxxxxxxx.xxx>, t.mitchell@xxxxxxxxx.xxx

Dear Eleven,

I was very disturbed by your recent letter, and your attempt to get
others to endorse it. Not only do I disagree with the content of
this letter, but I also believe that you have severely distorted the
IPCC "view" when you say that "the latest IPCC assessment makes a
convincing economic case for immediate control of emissions." In contrast
to the one-sided opinion expressed in your letter, IPCC WGIII SAR and TP3
review the literature and the issues in a balanced way presenting
arguments in support of both "immediate control" and the spectrum of more
cost-effective options. It is not IPCC's role to make "convincing cases"
for any particular policy option; nor does it. However, most IPCC readers
would draw the conclusion that the balance of economic evidence favors the
emissions trajectories given in the WRE paper. This is contrary to your
statement.

This is a complex issue, and your misrepresentation of it does you a
dis-service. To someone like me, who knows the science, it is
apparent that you are presenting a personal view, not an informed,
balanced scientific assessment. What is unfortunate is that this will not
be apparent to the vast majority of scientists you have contacted. In
issues like this, scientists have an added responsibility to keep their
personal views separate from the science, and to make it clear to others
when they diverge from the objectivity they (hopefully) adhere to in their
scientific research. I think you have failed to do this.

Your approach of trying to gain scientific credibility for your personal views
by asking people to endorse your letter is reprehensible. No scientist who
wishes to maintain respect in the community should ever
endorse any statement unless they have examined the issue fully
themselves. You are asking people to prostitute themselves by doing just
this! I fear that some will endorse your letter, in the mistaken belief
that you are making a balanced and knowledgeable assessment of the science
-- when, in fact, you are presenting a flawed view that neither accords
with IPCC nor with the bulk of the scientific and economic literature on
the subject.

Let me remind you of the science. The issue you address is one of the
timing of emissions reductions below BAU. Note that this is not the same
as the timing of action -- and note that your letter categorically
addresses the former rather than the latter issue. Emissions reduction
timing is epitomized by the differences between the Sxxx and WRExxx
pathways towards CO2 concentration stabilization. It has been clearly
demonstrated in the literature that the mitigation costs of following an
Sxxx pathway are up to five times the cost of following an equivalent
WRExxx pathway. It has also been shown that there is likely to be an
equal or greater cost differential for non-Annex I countries, and that the
economic burden in Annex I countries would fall disproportionately on
poorer people.

Furthermore, since there has been no credible analysis of the benefits
(averted impacts) side of the equation, it is impossible to assess fully
the benefits differential between the Sxxx and WRExxx stabilization
profiles. Indeed, uncertainties in predicting the regional details of
future climate change that would arise from following these pathways, and
the even greater uncertainties that attend any assessment of the impacts
of such climate changes, preclude any credible assessment of the relative
benefits. As shown in the WRE paper (Nature v. 379, pp. 240-243), the
differentials at the global-mean level are so small, at most a few tenths
of a degree Celsius and a few cm in sea level rise and declining to
minuscule amounts as the pathways approach the SAME target, that it is
unlikely that an analysis of future climate data could even distinguish
between the pathways. Certainly, given the much larger noise at the
regional level, and noting that even the absolute changes in many
variables at the regional level remain within the noise out to 2030 or
later, the two pathways would certainly be indistinguishable at the
regional level until well into the 21st century.

The crux of this issue is developing policies for controlling greenhouse
gas emissions where the reductions relative to BAU are neither too much,
too soon (which could cause serious economic hardship to those who are
most vulnerable, poor people and poor countries) nor too little, too late
(which could lead to future impacts that would be bad for future
generations of the same groups). Our ability to quantify the economic
consequences of "too much, too soon" is far better than our ability to
quantify the impacts that might arise from "too little, too late" -- to
the extent that we cannot even define what this means! You appear to be
putting too much weight on the highly uncertain impacts side of the
equation. Worse than this, you have not even explained what the issues
are. In my judgment, you are behaving in an irresponsible way that does
you little credit. Furthermore, you have compounded your sin by actually
putting a lie into the mouths of innocents ("after carefully examining the
question of timing of emissions reductions, we find the arguments against
postponement to be more compelling"). People who endorse your letter will
NOT have "carefully examined" the issue.

When scientists color the science with their own PERSONAL views or make
categorical statements without presenting the evidence for such
statements, they have a clear responsibility to state that that is what
they are doing. You have failed to do so. Indeed, what you are doing is,
in my view, a form of dishonesty more subtle but no less egregious than
the statements made by the greenhouse skeptics, Michaels, Singer et al. I
find this extremely disturbing.

Tom Wigley

On Tue, 11 Nov 1997, Tim Mitchell wrote:

> Reference: Statement of European Climate Scientists on Actions to Protect
> Global Climate
>
> Dear Colleague,
>
> Attached at the end of this email is a Statement, the purpose of which is
> to bolster or increase governmental and public support for controls of
> emissions of greenhouse gases in European and other industrialised
> countries in the negotiations during the Kyoto Climate Conference in
> December 1997. The Statement was drafted by a number of prominent European
> scientists concerned with the climate issue, 11 of whom are listed after
> the Statement and who are acting as formal sponsors of the Statement.
>
> ***** The 11 formal sponsors are: *****
>
> Jan Goudriaan Hartmut Grassl Klaus Hasselmann Jill J

Mar 11, 2010 at 9:56 PM | barry woods

Inappropriate use of language and naming conventions? Please. Who cares? If the naming is off, do you suspect the wrong routines were run? Would the statistics be different in another language? This is whining.

AJStrata: not one single point this post makes is correct. Everything is wrong. Please read it again and you will understand. If not, take a simple course in statistics and signal processing.

Re the Harry-whatever-read-me.txt: this is a diary of someone solving a difficult problem. What evidence have you that the problem wasn't solved in the end, and that once it was solved the need to update the read-me was gone? There are a LOT of assumptions here.

References to the comments in the code: how can you make any assumptions from code comments? They were probably written by graduates and maybe not by experienced SW engineers, but what evidence have you that they are not accurate? None of the comments here are about the code itself (except maybe the first in the list).

In your comments, please get rid of all the "probably" and "I assume". Replace them with silence until you have better evidence. It would not hold up in court.

Nov 28, 2011 at 1:01 PM | Fredrik Hoffman