
Royal Society policy lab



The Royal Society is continuing its project on open science, with a "policy lab" at the start of next month:
In the wake of ‘Climategate’, a Lancet editorial warned that the call for UEA climate scientists to make their research more transparent was a wake-up call for all researchers. “If scientists do not adapt to the forces shaping and sustaining this revolution in the public culture of science, the trust that the public and politicians put in science will be jeopardised.”
A year on, the Royal Society have launched a study looking at how science can open up: ‘Science as a public enterprise’. This requires understanding what forms of access are required, and to what ends. A blanket policy on access to scientific information does not take into account the diverse demands being made on scientists. Nor does it take into account the massive datasets, complex models and specialist equipment involved in much of modern science. Opening up science is not a simple task, but a challenge that requires discussion and debate.
Geoffrey Boulton will be speaking on the purposes of opening up science. More details can be seen here.
Reader Comments (20)
The previous event, at the Royal Festival Hall, was better than I for one expected. I'll be at this one.
Trust me, I am a scientist
Move along, there is nothing to see
We can't tell you
You're too stupid to understand
Under the current RS leadership, all of those would make much better slogans than the current 'take nobody's word for it' that they seem to be so uncomfortable with.
If they're worried about a house-of-cards collapse of trust in science caused by Climategate, they could always try dealing with the people whose behaviour created the problem in the first place. Except they have already done that, by rubber-stamping the 'evidence' CRU 'offered' to the inquiry, which allowed CRU to hide behind the claim that this was the result of RS selection when it was no such thing.
Sorry, but Boulton and friends have played their own dishonourable part in creating this problem, in both passive and active senses; they're hardly the leadership needed to get out of it, and Nurse has shown just how they intend to play the 'public image game' - and it's not with honesty.
Should the term 'Scientist' not be one conferred on a worthy of the title rather than a career choice? I ask this not to denigrate but to support those that are inquisitive and achieve deserved acclaim by adding substantially to the sum of human knowledge. Before that Rubicon is crossed, are they all not just researchers? I can more easily live with "current research suggests" than I can with "scientists say".
We could move from there to separate statements into relevance, such as "... in the opinion of Sir Paul Nurse ..." when he steps outside his sphere to "...the position of Sir Paul Nurse ..." when he does not.
On its current course, the RS is in extreme danger of becoming no more than a 'talking shop'.
I'm not getting this at all. The product of research comes from data and methods. To produce a scientific paper you need both. It follows that as you've used them to get to the results of the paper both should be documented and available for study by people trying to find something wrong with your methods and data. What's difficult about that?
"Geoffrey Boulton FRS will then offer his perspective on the purposes of opening up science, linked to the Royal Society’s study of ‘Science as a Public Enterprise’."
Never let it be said that the Royal Society lacks a sense of humor.
A little like Mortazavi (widely accused of practicing systematic torture in Tehran) representing Iran in its delegation to the United Nations Human Rights Council.
One day - perhaps it is already here, but I think not - they will realise what an awful mistake they made in taking the IPCC at face value and joining in the cacophony of alarmism, rather than keeping calm, rational and neutral. They, the RS leadership, failed in all three respects. They can scarcely have been more irresponsible. Almost anything they can do towards encouraging 'open science' will be an improvement, but they do it with their feet well and truly in the mire of their own making.
"the trust that the public and politicians put in science will be jeopardised"
Surely this should read: any trust that the public had in science and politicians has been destroyed, along with their (the public's) trust in meaningful enquiries into the so-called climate scientists' behaviour.
As for a policy on access, the solution is simple. If the research is funded by the public, then access to all data etc. should be a given.
"Opening up science is not a simple task, but a challenge that requires discussion and debate."
Well maybe it's difficult, but I'm sure there are courses on how to make PDF files and place downloadable data files on the Internet.
Bebben,
You missed the get-out-of-jail-free card;
"Nor does it take into account the massive datasets, complex models and specialist equipment involved in much of modern science."
:-)
This editorial in the Lancet would have been written by Richard Horton, would it? You know, the man who started the whole MMR=autism scare and the '150 million dead civilians in Iraq' claim.
Opening up science is not a simple task, but a challenge that requires discussion and debate.
Yep: "when cornered, appoint a committee to study the problem to death". Right out of the Bureaucratic Tricks of the Trade handbook.
Don Pablo, I agree. The majority of the problem would be addressed by the simple expedient of having the scientific journals follow the best practices of archiving data & code for those papers which are computational in nature. This would include both those papers which analyze observations, and those which are model-based. That is the way to ensure replication, and a fair assessment of the conclusions.
I probably wouldn't mind if the remainder of the problem is consigned to a committee, if that first giant step be made.
Yes, opening up science needs the free availability of data, methods and program code. However, there are some additional requirements. The most important is open access to the research literature. Second, good, understandable books on scientific topics: I have tried to understand biomolecular and genetic science recently and have been severely disappointed by the obtuseness of the work. Third, universities need to reach out to the citizen-science community and carry out much more outreach activity, rather than just concentrating on the Research Assessment Exercise, the impact assessment of journals, and metrics that measure a researcher's worth. Look at
http://www.universetoday.com/41006/high-school-student-discovers-strange-pulsar-like-object/
for an example of what can happen when outreach projects become successful. The peer-review system also needs a severe looking-at; see
http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=6577844
for a shocking experiment that was carried out twenty years ago.
AMP
I think it is important to realize that this is not always a trivial issue: providing all the calculated data and the code for modern scientific projects can be a challenge. In my field of research, depending on what you mean by "all", it would be impossible to provide all the data.

The research I do is about understanding chemical reactions using computer simulations. In a recent paper I was involved in, we carried out simulations of the time-course of a chemical reaction, involving running multiple 'trajectories' describing the motion of the atoms in the model of the system. We carried out 250 separate simulations, each with ca. 2 million timesteps, for a system comprising about 700 atoms.

The output that we were interested in can be described as a small set of numbers: the average of a particular property of the system over the 250 simulations, as a function of time. Though we needed to model 2 million timesteps to get things right, the property we were interested in varies fairly slowly, so the meaningful output can be specified as the value of that property for about 1000 timesteps. In other words, the key output is about 1000 numbers - maybe a kB or so of data - which we plotted on a graph in the paper. For other projects, we might provide detailed listings of some of the output from calculations as "Supporting Information", which is published alongside the paper in electronic form, at the level of a few tens of kB. For that volume of data, providing all of it is easy to do.
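To make the shape of that data reduction concrete, here is a minimal sketch of averaging a per-trajectory property over an ensemble and keeping only the retained timesteps. The numbers match the comment above, but the property values themselves are hypothetical stand-ins (random data), and the use of NumPy is an assumption; in a real project each row would be read from one simulation's output files.

```python
import numpy as np

n_traj = 250   # separate trajectory simulations
n_out = 1000   # the property varies slowly, so ~1000 retained timesteps suffice

# Hypothetical stand-in for the property of interest: one row per trajectory,
# already thinned from 2 million timesteps down to the retained ones.
rng = np.random.default_rng(42)
prop = rng.normal(loc=1.0, scale=0.1, size=(n_traj, n_out))

# Ensemble average over the 250 trajectories, as a function of time.
# The result is about 1000 numbers - kB-scale, the curve plotted in the paper.
mean_prop = prop.mean(axis=0)
print(mean_prop.shape)  # (1000,)
```

The point of the sketch is the ratio: hundreds of gigabyte-scale trajectories collapse to a single kB-scale curve, which is the only part that needs to appear in the paper itself.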
The code we used to run these simulations is a commercial code - we are not allowed to publish it, but we can provide a reference to it, so someone else can repeat the simulations if they get hold of the code also. That is standard practice in my field.
If you take "all" the data to mean all of the raw data generated by the simulations - which would be needed if you wanted to check that our analysis was right without running the calculations again yourself - then that would require archiving quite a bit more than a few kB. The key data is the coordinates and velocities of the atoms in the system at each timestep in each simulation. A rough calculation suggests that is 16 terabytes. In fact, we don't even store all of that data ourselves, and we certainly could not archive it.
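That rough calculation is easy to reproduce: coordinates plus velocities are six numbers per atom per timestep, and assuming 8-byte double-precision storage (an assumption, though a conventional one) the arithmetic works out like this:

```python
n_traj = 250            # separate simulations
n_steps = 2_000_000     # timesteps per simulation
n_atoms = 700           # atoms in the model system
values_per_atom = 6     # 3 coordinates + 3 velocity components
bytes_per_value = 8     # double precision (assumed)

total_bytes = n_traj * n_steps * n_atoms * values_per_atom * bytes_per_value
print(total_bytes / 1e12)  # 16.8, i.e. roughly 16 terabytes
```

Any compression or reduced-precision storage would shave the constant factor, but not the conclusion: the full raw trajectory data is terabyte-scale, far beyond what a journal's supporting-information system could hold.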
So we cannot simply archive all the 'data' and the 'code'. We try to do our best to provide enough information to make it easy for other researchers in the same field to replicate or extend our work if they wanted to. If asked, we would be able to provide a bit more information than was actually in the paper or the associated supporting information. But choices do have to be made and there are limitations on how much data we can provide outright.
With modern computer power, similar issues arise in pretty much any computational science project. In fact, in many projects, 16 terabytes of raw data would be viewed as being fairly small beer.
Now, I'm certainly not arguing against openness. If it is possible to provide the key data for a scientific project, it should be done. But it is worth realizing that this is not always a simple question of sticking 'all' the data on a webserver somewhere. I don't think you could come up with a simple 'mechanical' rule defining openness. What seems more important to me is that there be a culture whereby the intentions of people in a given research field are in favour of the maximum possible openness. That people have a bias towards responding positively to requests for data. That people have a culture of looking at other people's conclusions critically, and expecting their own conclusions to be looked at critically. That the system be set up such that as much information as possible is made available for people to look at.
I am both cautiously optimistic and deeply skeptical; not quite yet at the cynical threshold.
Let the RS show its stuff on more openness. I am watching and my expectations are almost hopeful.
John
Jeremy's right that it's not always trivial. Stephen Emmott, the Microsoft open science guy who spoke at the previous RS do at the Festival Hall, with Nurse and Boulton presiding, sure agreed that it wasn't trivial - in that he couldn't get the general circulation model (GCM) he'd chosen to study to work at all, due to bugs, despite the code being 'open'. We're at the early stages of making such complex models truly open. It's going to be a lot of fun.
“the massive datasets”
I know - data storage is so expensive, isn’t it? (He says, pocketing a memory stick that holds 1000 times as much as the hard disc fitted in his PC twenty years ago).
James P, would your memory stick store the 16 terabytes of data that one recent project in my group (see above) generated? If so, please tell me where you bought it, because I'd like to get one! OK, I know, hard disc/flash disc capacity has gone up a lot in the last 20 years, but most scientific computing projects also generate a lot more data than they used to. It's very likely that there has been too much use of the "massive datasets" excuse as a pretext for not releasing bite-sized datasets. But it's only fair to recognize that complete data openness is not trivial.
Engineers have developed "Validation and Verification" methodologies for large simulations, some of which can be for safety critical applications subject to regulatory control. IMO the size of the output dataset should not be a barrier to producing open and transparent work.
http://www.nafems.org/publications/browse_buy/qa/
Richard - how much do you value the output of a GCM which doesn't run due to bugs?!?