Why code should be published
Nick Barnes has written an interesting article on why scientific code should be published, with particular reference to John Graham-Cumming's work on the Russell review code.
This report included a good algorithmic description, and has been accompanied by source code. We greatly welcome both of these departures from the norm, as setting a good example and following the report’s own recommendation. These facts also allow us to illustrate particular reasons why code release is important, and why science software skills should be improved.
The four separate bugs – in the description, in the code, in the configuration, and in the expectation of the reader – are, in this case, trivial and unimportant – they do not affect the broad results of the report in any way. However, each is characteristic of problems with science software which can be more serious, and which are impossible to discover unless code is released.
Reader Comments (16)
There seems to be a growing consensus that code etc. should be published. I am glad that the issue is now becoming settled.
Why do some scientists continue to deny the obvious? Are they just trying to hide the decline in their beliefs?
According to Steve McConnell in Code Complete: A Practical Handbook of Software Construction, the average number of errors/defects per KLOC (1000 lines of code) is 15-50. And that is for delivered or released code. Funny thing is, in the InfoSec industry, one error could be detrimental. In the climate science industry, one error is no big deal.
This is one damn good reason all publicly funded science related code should be released.
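To put the quoted defect density in concrete terms, here is a minimal sketch of the arithmetic. It assumes the 15-50 defects per KLOC figure quoted above applies uniformly to a codebase; the function name and the 10,000-line example are hypothetical and chosen only for illustration.

```python
def expected_defects(lines_of_code, low_per_kloc=15, high_per_kloc=50):
    """Return a (low, high) estimate of defects for a codebase of the given size.

    The default rates are the 15-50 per KLOC range quoted in the comment above;
    treating them as uniform across a codebase is an assumption.
    """
    kloc = lines_of_code / 1000  # KLOC = thousands of lines of code
    return (kloc * low_per_kloc, kloc * high_per_kloc)


# Hypothetical example: a 10,000-line analysis program.
low, high = expected_defects(10_000)
print(f"10,000 lines: roughly {low:.0f} to {high:.0f} defects")
```

Even a modest program would, on these figures, be expected to carry defects in the hundreds, which is the commenter's argument for letting outside readers look for them.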
I feel really sad. All the billions of dollars wasted on fake climate research and the global warming scam could have been better used to put the human race into space already. Why does NASA study climate change? I thought NASA's goal was to put humans in space? Eventually we have to move out and colonise other planets, so why delay?
Buzz Lightyear
Maybe we should assemble all climate scientists and send them into space, permanently.
Now Buzz, the believers are going to accuse you of making wicked serious threats against them.
@ Buzz
Good practical thinking.
If the sun enters a long Grand Minimum and temperatures plummet, climate scientists should be sent there to take direct measurements.
Oh, I forgot - they don't do empirical science ... and even if they did, their computer models would tell them to go at night!
There are, or at least have been, a number of professional software engineers who have posted to this blog, particularly with regard to Climategate. They did discuss the engineering procedures that they must follow every day in their work: design reviews, design documentation, implementation plans, coding standards, coding, code review, code review, code review, testing, integration, testing, regression testing, and on and on. And as for peer review, just make one little mistake and see what happens at the code review. It can be brutal, and so everyone triple-checks their work beforehand.
The typical "scientific programmer", and believe me I have seen a lot of their work at university, has a book in one hand something like "Fortran for Dummies" and a five line outline of his algorithm, which will change continually.
While some code done at the university can be quite elegant and beautiful, most is sloppy and rushed. Having the code published in its full glory would instantly bring up the quality 1000-fold. Just ask any real software engineer about their peer reviews.
I consider any scientific work based on computer code without full disclosure of that code, either by publishing the full particulars of the commercial code used or the actual purpose-built code for the project, to be just "grey" science and suspect.
At least if these climate scientists didn't want to release their precious code his precious code (ring)>, they would come up with a better reason like national security. I thought the NSA or the Pentagon was making this argument last year even if it is horsesh!t.
I hate it when I screw up html tags...
should have read:
At least if these climate scientists didn't want to release their precious code [insert Josh comic here with a climate scientist clutching his precious code (ring)], they would come up with a better reason like national security. I thought the NSA or the Pentagon was making this argument last year even if it is horsesh!t.
"The typical "scientific programmer", and believe me I have seen a lot of their work at university, has a book in one hand something like "Fortran for Dummies" and a five line outline of his algorithm, which will change continually. "
Even the grossest code can appear elegant to the superficial gaze when pretty-printed!
I hate it when I screw up html tags...
So do I, Kevin. I have fumbly fingers as well so I use the Preview Post option. Just a suggestion. :)
AJC
Even the grossest code can appear elegant to the superficial gaze when pretty-printed!
Not if you read it, which software engineers would do.
Don Pablo,
The preview option seems to lead to commenting errors. But, otherwise, yes I agree. In reality, I meant to do it on purpose to make the case for how easy it is to make coding errors...lol. :-) Kidding of course.
Perhaps we should have SW engineers do a code review of your postings, Kevin. :)
I don't work in software and know nothing about versioning, configurations, etc., yet even my experience demonstrates beyond doubt the desirability of disclosing the code.
In decades of work as a patent lawyer, of which a significant part involved presenting in patent applications the inventions that inventors had described to me in disclosure documents, I found it depressingly common that the arrangements described in the inventors' disclosures could not possibly yield the results claimed--and/or that the actual code, when I could get it, was inconsistent with what the other disclosure documents had said.
In most cases these shortcomings did not result from the inventors' lack of honesty or intelligence. The inventors presumably had an incentive to avoid the patent invalidity in which inaccurate descriptions could result, and I can attest to the fact that many of those inventors were highly intelligent indeed. I think that in most cases the inconsistencies arose from laudable attempts to present the central concepts without imposing upon the reader the need to slog through distracting details. But inconsistencies there were--and those inconsistencies became much easier to detect when the code was provided.
Even when the other description was neither wrong nor inconsistent with the software, moreover, I found that access to the actual code often enabled me to eliminate in a few minutes ambiguities that even days of effort might otherwise have been inadequate to dispel.
And there's another aspect of my experience that confirms Mr. Barnes' analysis: when the disclosure did take the form of computer code, that code itself was often wrong. In one case, for instance, the heart of the invention--made by highly regarded and widely published experts--was embodied in a code snippet consisting of a mere fifteen or twenty lines of C code, yet I had to go back to the inventors repeatedly to show them that the latest version they'd given me still had fatal errors. (Those who find it hard to believe that competent workers could produce such errors repeatedly in so few lines should try to write thread-safe transaction software.)
I don't know whether the problem in that case was that the inventors had never yet actually implemented the invention or that they had but the code they gave me resulted from "cleaning up" the working code to make it more intelligible to an outsider. But I do know that what they were working on was software that is literally used by millions. The point is that software is just going to have errors--it's the nature of the beast--and revealing the software to others makes it much more likely that the errors will be identified and corrected--even when those to whom the software is disclosed are, as was true in my case, non-experts.
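The remark above about thread-safe transaction software can be illustrated with a small sketch. This is not the inventors' code, which is not shown in the comment; the `Account` class, the function names and the amounts are all hypothetical. The first function looks correct but has a check-then-act race; the second takes locks in a fixed order so the check and the update happen together.

```python
import threading


class Account:
    """Hypothetical account holding a balance and its own lock."""

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()


def transfer_unsafe(src, dst, amount):
    # Looks fine, but another thread can change src.balance between the
    # check and the subtraction, so the balance can be overdrawn or an
    # update can be lost.
    if src.balance >= amount:
        src.balance -= amount
        dst.balance += amount
        return True
    return False


def transfer_safe(src, dst, amount):
    if src is dst:
        raise ValueError("source and destination must differ")
    # Always acquire the two locks in the same order (here, by id) so that
    # two opposite transfers cannot deadlock waiting on each other.
    first, second = sorted((src, dst), key=id)
    with first.lock, second.lock:
        if src.balance >= amount:
            src.balance -= amount
            dst.balance += amount
            return True
        return False


def run_demo(threads=4, transfers=1000):
    """Move money back and forth between two accounts from several threads."""
    a, b = Account(100), Account(100)

    def worker(src, dst):
        for _ in range(transfers):
            transfer_safe(src, dst, 1)

    workers = [
        threading.Thread(target=worker, args=(a, b) if i % 2 == 0 else (b, a))
        for i in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return a.balance, b.balance


if __name__ == "__main__":
    a_balance, b_balance = run_demo()
    # With the locked version the total is conserved and neither goes negative.
    print(a_balance + b_balance, a_balance >= 0, b_balance >= 0)
```

The two transfer functions differ by only a few lines, and the unsafe one will usually pass a casual test, which is why this kind of error tends to be found by a second reader rather than by the author.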
I wonder if the RealClimate/Tamino fans who parroted their anti-open-code talking points are having second thoughts?