The other snippet
Jan 25, 2010
Climate: CRU, FOI
I mentioned two snippets of information in the last post and no doubt some of you are wondering what the other one is.
The ICO officer volunteered that my complaint might not eventually be upheld because it was possible that UEA was in fact unaware of the existence of the archive of data and emails that eventually formed the Climategate hack/leak. He said that the current understanding in the ICO's office was that the archive was not an official data repository, but was set up by an individual within CRU for their own use.
This is important because, if true, it strengthens the suggestion that the data was not hacked but leaked. If the archive was on a hard drive on someone's PC then it is highly unlikely that a hacker could have found it, and it seems to me still unlikely that it would have been found on a shared drive either.
It's not definitive, but it does fit in well with earlier evidence of an inside job, such as the cleansing of file creation dates.




Reader Comments (42)
Very strange. Why would someone set up their own archive over a period of 13 years (the first email was 7 March 1996)? Intriguing information.
It also confirms that UEA/CRU has no archive system. What a sorry (nay scandalous) state of affairs.
Of course they have an archive somewhere. Someone is lying to get away from FOI requests.
Many people archive mails and data. This gave MS headaches with Exchange because they'd not anticipated database bloat due to users archiving everything, especially attachments. Given the systems used at CRU, the suggestion that this was an unofficial archive process may be what Mosher alluded to. Or it may make it easier to narrow down the leaker.
Assuming someone was archiving their stuff, which is sensible, then why the FOI file? This may simply have been an own-initiative process to filter mail out of the archive to create it, given that CRU staff would have been aware of the FOI requests and concerns. Or it may have been part of a 'CYA' file given the politics.
If it's looking likely that the ICO won't or can't prefer charges, the downside may be that the leaker/whistleblower vanishes behind a compromise agreement and strict NDA. I'm hoping the leaker did it for ethical reasons though and will come forward to explain.
... all the while ignoring the incontrovertible fact that the National Extremism Tactical Coordination Unit has been brought in to investigate our (one would assume) extremist hackers and their climate-change denial agenda. Why would you clutch at a whim, when there is real circumstantial evidence?
It also ignores the fact that no whistle blower has come forward... you would understand why no break-and-enter hacker has come forward, but whistle blowers act out of conscience, and are usually likely to be motivated to come forward by the same streak of integrity that drives them to leak in the first place.
John Silver. If there is an archive system, the Climategate emails reveal that none of the CRU scientists were using it or were even aware of its existence.
Wadard: What a ridiculous assertion and offensive accusation. There are many reasons why whistle-blowers do not come forward. History has shown that they do not get the reward and recognition that they deserve; on the contrary, they are usually given no support and are thrown onto the scrap-heap.
The good news is from Amazon:
We are pleased to report that the following item will dispatch sooner than expected: A W Montford "The Hockey Stick Illusion: Climategate and the Corruption of Science (Independent Minds)". Previous estimated arrival date: February 09 2010 - February 15 2010. New estimated arrival date: January 27 2010 - January 28 2010
NDET could have been brought in simply because an accusation of hacking by "extremists" was made. Doesn't prove a thing.
I agree that it now looks to be leaked rather than hacked. This then poses the question whether the leaker redacted a huge amount of information that he still has? If this turns out to be the case maybe there is much more information to come out.
I still think NDET is the wrong agency to be investigating hacking. When this broke though, there were reports that idiots had threatened Jones et al. If so, then that's NDET territory given they were created to help protect scientists from animal rights extremists. Now the door's open though, they may also take a look at threats to damage other scientists' reputations, or even threats of physical assault against Michaels, or auditors in dark alleys.
Expect more of this type of smokescreen the closer to truth we get. Those in charge at UEA will use every nook and cranny the legal system allows to make sure there's no truly independent view of what they've been up to.
From their point of view...worst case it's better to get your wrist slapped by the Information Commissioner for a problem over data retention and FOI responses than get banged up for obtaining research funds fraudulently.
Philp Bratby: I was thinking more of the system administrators' backups and such.
The emails are numbered in UNIX fashion, which implies professionalism with tapes in fireproof lockers and such.
But maybe they have a slack IT department.
A couple of sources have previously indicated a backup server rather than someone's PC. This does seem plausible.
For example, one such scenario would be a head of department requesting a backup archive of departmental emails which are automatically copied from the main email system based on key researchers or some other trigger. The people involved leave and nobody else is aware, and the archive gradually builds but is not large enough to get attention. That is, until Mr F comes along and spots the archive.
Don't forget the leaked information contained more than just emails. Server-side archival would be the simplest way to keep copies of all departmental correspondence and may explain the naming conventions. What that doesn't explain is who then selected emails and files to assemble the FOIA file, or why that happened. Somebody spent time putting that together, as well as attempting to obfuscate the source.
Long-term archiving of emails is standard practice in many institutions. The mail server program (sendmail or whatever) is simply configured to copy all outgoing messages to an archive folder in addition to actually sending them. Given modern disk capacities, this would not occupy an inordinate amount of space for an institution the size of CRU.
Collecting emails relevant to an FOIA would then be simply a matter of a script grep-ing (searching) the archive directory for emails containing keywords and copying the messages found to a special directory. Then another script goes through the message files x-ing out email addresses.
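To make the two-script idea concrete, here is a minimal sketch of what such a collection pass might look like. It is purely illustrative: the keyword list, the `*.txt` file layout, the function names and the deliberately simple address pattern are all assumptions, and nothing here is known to resemble whatever was actually run at CRU, if anything was.

```python
import re
from pathlib import Path

# Hypothetical keyword list; nobody outside CRU knows what terms, if any, were used.
KEYWORDS = ["yamal", "foi"]

# Deliberately simple address pattern (an assumption); real address syntax is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in keywords)


def redact_addresses(text):
    """X out each email address, replacing every character with 'x'."""
    return EMAIL_RE.sub(lambda m: "x" * len(m.group(0)), text)


def collect(archive_dir, out_dir, keywords=KEYWORDS):
    """Copy matching messages from archive_dir to out_dir with addresses x-ed out.

    Assumes one message per *.txt file. Returns the names of the files copied.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(archive_dir).glob("*.txt")):
        text = path.read_text(errors="replace")
        if matches_keywords(text, keywords):
            (out / path.name).write_text(redact_addresses(text))
            copied.append(path.name)
    return copied
```

A plain substring match like this is crude (a keyword such as "foi" would also hit "foil"), which is one reason a purely keyword-driven collection would be expected to pull in a fair amount of noise.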
Everything about the leak points to a whistle-blower releasing a collection of information that had been prepared by clerks or programmers for release in response to a real FOIA request, not to an external hacker. Note that the last date on the messages was just before McIntyre's request was officially rejected (IIRC).
AGW has turned into a perfect storm of government/academic/industry corruption.
In the US, NASA is busy dumping things into the memory hole.
I wonder how many terabytes of tax payer funded data, info, and communications have been destroyed in the last couple of months?
Craig- The thing about the grep theory is what keywords would you use to assemble the FOIA file? This is an aspect that makes me think it's not a simple hacker given the signal to noise ratio in the leaked archive. Somebody knowledgeable appears to have spent time filtering the content for effect, so it's not just emails relating to FOI but there is also a lot about the abuse of the peer review process.
In my opinion the theory that the files were collated for FOI is wrong and unnecessary. There is no evidence from any source other than blog speculation that such collation activity was going on. That speculation is entirely based on 1. The name used for the files. 2. The selection of content to exclude personal and trivial emails. 3. A desire by one side of the debate that "it is a whistle blower" rather than "a hacker".
We know that the details released are a "random selection" of a larger archive, i.e. it is not a complete archive. Any dossier prepared for FOI would be much more complete.
The released archive does contain a small amount of trivia. The selection was rushed and not completely thorough. An archive produced for FOI would require significant effort over many days if not weeks to produce.
We know the contents far exceed the scope of any FOI requests. The collection of the complete archive is clearly a progressive activity over years rather than the result of once off activity to prepare material for FOI.
We know that the emails and files were selected by Mr/Ms FOIA, whoever they are, because they took time to provide a long list of the more juicy content. That selection was probably performed mainly between 13 and 16 November and what was provided was all that they had time to read and select during those dates.
The file names were chosen by F when they selected from the archive. They are not necessarily the names of the original archive.
F is clearly motivated to some extent by the FOI activity. They called themselves FOIA and chose FOIA for the file names and that certainly gives a clue to their motivation. It is possible they were directly motivated by the refusal of the FOI request on 13th November though I think it just as likely that the timing was down to opportunity plus proximity to a weekend.
Personally I think it more likely they are someone based at the CRU/UEA rather than an outsider, but at this point it is indeterminate whether they are a whistleblower or hacker. The blog wars over whether it was a whistleblower or hacker, whilst entertaining, tend to lead people to make invalid assertions to protect their position. In reality it does not matter much which is true because it is the content of the files which really matters.
So many questions left open... What a nice guy to clean up addresses for everyone? A person who knew that the jig was soon to be up; a good time to get out stage left?... Someone who was aware of the FOIA file at CRU, perhaps preparing to show administrators or police? Probably a well respected fellow, good with computer stuff or close to someone who is. Clivere suggests that somewhere between Nov 9th-13th, the files were reviewed. Strangely, PJ lets the reader know he is going home early on Friday the 13th. I can't think of another email that contains that kind of detail. Anyone?
A couple things. First our book is now on Lulu
http://www.lulu.com/product/e-book/climategate-the-crutape-letters/6282107?productTrackingContext=center_search_results
OK, now for the important stuff. clivere gets a lot of things correct.
On first pass through all the mails (Nov 17-18) it "looked like" the mails had been hand picked, picked by a human intelligence (no speculations on intelligent design please). In talking with McIntyre, he came to the same conclusion. But that task seemed daunting. On the 18th it occurred to me that it looked like a file prepared for discovery, for example for an FOIA. That didn't make any sense until I talked to Steve and he remarked that he had just received a denial of his appeal, dated Nov 13th.
So it seemed to me that the end of collection (Nov 12th) and the end of McIntyre's appeal were tied together. Whoever was collecting the mails knew the FOIA appeal was denied (note there are no mails about this appeal in the file) and they decided to act on their own.
In the end the notion that the files were collected in anticipation of an FOIA doesn't hold together well enough for me. But the date coincidence is a strong one and bears examination. As an investigator I would short list people who knew about the denial.
After rereading all the mails a second and third time, it became apparent that there were handfuls of meaningless mails: out-of-office replies. That argued against a file built by human hands. It also argued against a FOIA file. The whole coverage of Yamal in the files works against FOIA interpretations since Yamal was already released to Steve and since it was never the subject of FOIA.
Still the file has a very high S/N and some patterns do emerge. Those patterns suggest a procedure that would select mails based on certain keywords (yamal, sres, soap) that are found on ClimateAudit and/or certain to/from criteria.
The other thing to note is that there is evidence consistent with two different bleachings of the files. You can see this best by just looking in the document folders at various documents, and you'll see the attempt to bleach the files to two different dates. For those of us around computer security, the bleaching pointed directly inside CRU. Circumstantial of course, since any hacker could create a false trail by doing the same.
In the end statistics tell us that 80% of the time it's an inside job. If this were a "fact" of global warming at 80% probability, ANYBODY who suggested a hacker would be called a denier.
Regardless of leak or hack - these words sound to me like typical weasel words, used by civil servants to brush off demands by 'civilians', i.e. taxpayers.
I bet that in 10% of the cases, people just accept such information and go away.
It is absolutely ludicrous to say that because the UEA 'was in fact unaware of the existence of the archive of data and emails that eventually formed the Climategate hack/leak', there is now no possibility of providing e-mails asked for through FOI.
I'm sure a friendly lawyer would have a field day with that.
Slightly OT, but on BBC 6pm news just now, Roger Harrabin claimed that the BBC first broke the glacier story. Is that true? I bet it isn't.
Frank - See this post
http://www.chron.com/commons/readerblogs/atmosphere.html?plckController=Blog&plckBlogPage=BlogViewPost&newspaperUserId=54e0b21f-aaba-475d-87ab-1df5075ce621&plckPostId=Blog%3a54e0b21f-aaba-475d-87ab-1df5075ce621Post%3aa2b394cc-5b5f-47ad-8bb5-c1aec91409ad&plckScript=blogScript&plckElementId=blogDest
On 5th December 2009
http://news.bbc.co.uk/1/hi/world/south_asia/8387737.stm
Can someone run a script against the emails to determine the words that are common in every email? This would surely lead to some interesting insights. If it turns out that the only common words are things like 'the' and 'a', then the next logical analysis would be to look for the frequency of words and phrases.
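For anyone wanting to try this, a rough sketch of the "words common to every email" check follows. It assumes one message per `*.txt` file in a folder and uses a crude letters-only tokenizer; the function names and the file layout are illustrative assumptions rather than a description of how the leaked archive is actually organised.

```python
import re
from pathlib import Path

# Crude tokenizer (an assumption): runs of letters and apostrophes, lower-cased.
WORD_RE = re.compile(r"[a-z']+")


def words_in(text):
    """Return the set of distinct words in one message."""
    return set(WORD_RE.findall(text.lower()))


def common_words(texts):
    """Return the words that appear in every one of the given texts."""
    common = None
    for text in texts:
        words = words_in(text)
        # Start from the first message's words, then intersect with each later one.
        common = words if common is None else common & words
    return common or set()


def load_texts(mail_dir):
    """Read every *.txt file in mail_dir (hypothetical layout: one message per file)."""
    return [p.read_text(errors="replace") for p in sorted(Path(mail_dir).glob("*.txt"))]
```

Something like `common_words(load_texts("mail"))` would then give the intersection. With a thousand or so messages, a single out-of-office reply is enough to shrink the result to almost nothing, so a frequency count is likely to be the more informative view.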
mpaul:
Someone I recall did and I believe Jones is a common word or at least that is what they said.
Apart from the hacker/whistleblower issue, where I'm with Steve Mosher and the 80% case, I was struck by this:
If that's where the investigation has got to, great. We don't have to speculate on the motives. But I found the phrase "for their own use" cute. Not official, for their own use. It's the best soft soap anyone's got left. But it hardly gives the impression that any of the investigators plan to come down hard on the person concerned. All of which is good.
Just ran a frequency analysis on the email folder and results are as follows, excluding (most) common words, and stopping at <1000 hits-
mann 3400
climate 3367
jones 2740
phil 2255
briffa 2180
keith 2012
science 1520
model 1411
ipcc 1291
ucar 1281
mike 1265
tom 1228
tim 1161
psu 1145
arizona 1115
osborn 1098
virginia 1002
So I doubt the email folder was based on a simple keyword search. I also looked through some of the metadata in the documents. Some of the autorecovery info has pointers to architecture, but not convinced that's anything significant. So I still think this is an insider rather than a 'hacker', or if it's a hacker, they had time to understand the system and the information people were after.
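A frequency count along these lines can be reproduced with a few lines of Python. This is only a sketch of the general technique, not the script used for the figures above: the stop-word list is a tiny illustrative one, the tokenizer is crude, and the 1000-hit cut-off is simply taken from the description above.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real run would want a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for", "on", "that"}


def word_frequencies(texts, stopwords=STOPWORDS, min_count=1000):
    """Count word occurrences across all texts, skipping stop words.

    Returns (word, count) pairs, most frequent first, down to min_count.
    """
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return [(word, n) for word, n in counts.most_common() if n >= min_count]
```

Note that this counts every occurrence, so names that appear in quoted replies and signature blocks are counted each time, which probably inflates the totals for the most prolific correspondents.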
On which note, question for the Mosher. What surprises were there in the FOIA? So anything useful but unexpected, and wasn't previously known or subject to an existing CA et al FOI request? That may help nail down whether it was internal or external, but then most of the time, it's internal that is the problem.
Jones needed to delete these e-mails and files to prevent any embarrassment in case a FOI request was successful. But he also wanted to retain a copy for himself.
The FOIA.zip file's properties show it was compiled on a unix system.
Jones shoots himself in the foot by leaving the zipped file on their network where it was accessible to others. When McIntyre's FOI request was refused, someone with morals puts it into the public domain.
Just my guess, but it would be extremely difficult for anyone but Jones to compile this lot.
Except, Bishop, that they are still on the job as we speak. Hacking is very easy to rule out. Which is why Nature Geoscience can say with authority that it was a hack.
A hack also explains the rather strange spike in router-traffic from, "A miracle just happened!", as discovered by Frank Bi in the International Journal of Inactivism.
The desperation to characterise this as a leak, while ignoring real evidence belies an agenda, IMHO.
Here's a conspiracy theory for you to consider.
The story about glacier melting was first publicised by an Indian Journalist prior to Copenhagen. India is one country that would have its economic growth prospects curbed if Copenhagen was a success. It seems likely that the glacier story timing was intended to disrupt Copenhagen.
The release of the Climategate emails was also - prima facie - intended to disrupt Copenhagen.
So my grand conspiracy theory is that Indian intelligence services acting on behalf of the Indian National Interest have engineered not only the glacier scandal but the leaking of the emails.
Perhaps the British authorities will be looking very closely at South Asian employees at CRU?
Wadard- Maybe stick to climate science and leave networking to the professionals? The IP address you quoted was an open HTTP proxy. A quick google would find other uses of it-
http://www.stopforumspam.com/ipcheck/82.208.87.170
http://www.projecthoneypot.org/ip_82.208.87.170?vid=8kdeb3gi1r83jbivi52u66jqs4
which is what tends to happen with open proxies. Still, leet climate hacker diversifying into xmas tree sales would make a more colorful story I suppose. The white line/gap in the graph is also what typically happens when the router is switched off or not logging. In climate science, this would mean infilling the data and making some traffic stats up.
Thanks for the responses to my question!
Wadard --
Please be polite and try to keep to real facts, and not make wild speculations common to the fanbois. You are dealing with a number of educated professionals on this site. Atomic Hairdryer has called you out on your post, which was little more than rantings. I am also a seasoned networking expert with 20 years in Silly Con Valley and I am positive that there are several others who contribute here who know even more than we do.
You are welcome to be a counterbalance, if you wish to contribute, but please respect our intelligence. We are not a bunch of sheep easily spooked by bogeyman rantings. We know how to check the facts, something few in government appear to know how to do.
Addresses were not stripped from the emails.
One of the online databases of the emails did strip them but that was because the creator was being a nice guy.
Content was kept pretty much on point, unlike the real hackers who really hacked mediadefender and really released everything, including "honey do" and salary type emails, which makes me think the CRU leak was put together from the inside. Hackers love the attention and the "We pwned" aspect of releasing every little thing possible. Also using a proxy is rather non-elite. A hacker capable of breaking into the CRU's many and varied servers to assemble the data could have released the foi2009.zip by hacking any web server and leaving it there, or hacking several servers and leaving it on all of them. Why not deface the CRU site and put it there and continue to deface it? They could have used bittorrent or gnutella or the highly anonymous tor to disseminate the data. But no, they used a Russian proxy - big deal.
Hackers are also not fettered by trivial things like bandwidth. They would have no problems dumping gigs of data and laughing about it later. Hackers it seems to me, take a "release it all and let the world sort it out" stance.
I'd put my money on internal gathering and then internal bumbling - leaving it on a public ftp (for too long) as a means of transfer between depts or organizations for example.
I'd say there is not much more to come in terms of another batch of CRU emails. I'd say this is serendipity rather than a concentrated break in.
The NETC is astroturfing to provide a distraction. They are but one of the wagons circled to protect the CRU hockey team. Their goal is to make the CRU look like victims.
As for visualizing the emails to spot trends or patterns, look to IBM's many eyes (free) or http://AnalyzeThe.US (also free and far more comprehensive) You can upload all the emails and then do all sorts of cool stuff.
Here is a pic of what I am talking about. It is an actual visualization of the actual cru emails, with addresses hidden. http://img194.imageshack.us/img194/6077/cruu.jpg
here is a nice example of analyzethe.us in action:
http://www.palantirtech.com/government/analysis-blog/public_sphere
I know I'm leading horses to water here but maybe one will drink.
Atomic Hairdryer, thanks for the analysis.
It seems nearly certain that the file had to be the result of a lot of human effort. Search terms alone would not result in this grouping of emails without a lot of additional hand sorting.
It's possible that someone had been collecting incriminating emails for 13 years on their PC (hard to imagine), or there was a backup of emails being made somewhere that was discovered after the Bishop's FOI denial was issued.
Wadard
The Nature Geoscience piece was published at the time the Climategate story broke. Nobody knew a thing at that point. They are just repeating a line rather than "speaking with authority". As Pablo says, we do welcome voices from the other side of the debate, but please don't treat us as fools.
The only Man on the BBC to question the consensus has posted up an excellent piece regarding the IPCC.
His earlier stuff is worth reading too.
http://www.bbc.co.uk/blogs/dailypolitics/andrewneil/2010/01/the_dam_is_cracking.html
Here's a better link to the Andrew Neil piece.
Bishop, your assumption does not withstand interrogation. It was published in Nature at least three days after the crack - well after the likelihood of the hack had been established, as you will see below. Regardless, your supposed 'leaker' still turns out to be a hacker, according to Gavin Schmidt of RC:
So, either Gavin is lying, or your leaker can hack (oh, and knows how to use open relays - thanks for your nothing observation, atomic), both of which are bloody strange skill sets for the average whistleblower to have - or.... drumroll... this is the work of a black-hat hacker or hackers.
Why don't you use your claimed intelligence to work out the relative likelihoods of those three possibilities? I can't wait. Maybe Pablo can help you if he is finished being condescending.
I'm sorry, I don't want to get into a slanging match. I would seek to continue this discussion with civility. As evidence of this goodwill, you might note I have ignored the person who called me a troll (despite me being the only one here who has referenced climate-science research).
Sorry Bishop, I confused Pablo's claim to possessing intelligence as yours. On that note, I don't think you are a fool. It's late in Aus - I'm off to bed.