Conveying truth
I had an interesting exchange with Doug McNeall on Twitter yesterday. Doug is a statistician at the Met Office and an occasional commenter here at BH. We were discussing how scientists convey uncertainty and in particular I asked about a statement made by Julia Slingo in a briefing (warning 10Mb!) to central government in the wake of Climategate:
Globally, 17 of the warmest years on record have occurred in the last 20 years.
This statement was made without any caveats or qualifications.
If I recall correctly, I've posted on the briefing paper before, so for today I just want to concentrate on this one statement. I think Slingo's words represent very poor communication of science since they do not convey any uncertainties and imply to the reader that the statement actually means something. There is, of course, a possibility that it signifies nothing at all.
By this I mean that the occurrence of the 17 warmest years on record could have happened by chance. Doug and I agree that this is a possibility, although we differ on just how much of a possibility. Doug assesses the chances as being very slim based on comparison of the temperature record to climate models. I don't see a problem in this per se, but I think that by introducing models into the assessment, certain things have to be conveyed to the reader: the models' poor performance in out-of-sample verification, our lack of knowledge of clouds, aerosols and galactic cosmic rays, and the possibility of unknown unknowns being obvious ones. Doug reckons our knowledge of clouds and aerosols is adequate to determine that the temperature history of recent decades is out of the ordinary. This is not obvious to me, however.
But more than that, the very fact that we are having to introduce models into the equation needs to be conveyed to the reader. Were our knowledge of temperature history better, we would be able to show, based on purely empirical measurements, that the temperature was doing something different in recent decades. That we cannot do so needs to be conveyed to the reader, I would say.
My challenge to you, dear readers, is to convey in, say, four sentences, the state of the science in this area. (We will take it as given that it is reasonable for Slingo to convey the basic statement about recent temperatures that she has chosen to do. If you feel otherwise, feel free to make your case in the comments.)
Reader Comments (125)
It was reported that a paper to be published in Science suggests a lower sensitivity than the IPCC's 3 degrees for a doubling of CO2 - in the range 1.7°C to 2.6°C, with a median value of 2.3°C:
http://chimalaya.org/2011/11/11/a-new-lower-estimate-of-climate-sensitivity/
Is this plausible? How certain is the 3 degree claim that James Annan refers to?
Theo,
'You have failed to grasp the difference between physical hypotheses and models.'
Yes, I think you have correctly identified an area where I have not yet understood you, although your latest posts are helpful. Obviously we need to clarify a simpler example before we can understand what's going on in climate models! If we can reach agreement about a pendulum then we can make the next step.
Let me focus on this part of your response:
'Physical hypotheses bear an important logical relationship to the reality that they describe. When combined with statements of initial conditions specifying observable fact, they logically imply observation sentences about future events. These observation sentences are what logicians call "instances" of the natural regularities described by the physical hypotheses. A record of predictions found true make physical hypotheses well confirmed and make for them a place in science.'
I think that there is some truth to this description of science. However, when we consider real science as done by human beings we need more than to know that a physical hypothesis, in combination with initial conditions, logically implies observation sentences. We need to know what those sentences are! It turns out that in practice most of science is in fact about trying to extract the logical implications, and testing by experiment whether the extraction has been done correctly. The logical implications of laws are not generally transparent.
It is here that I think models play a role that is complementary to physical hypotheses rather than opposed to them. To be clear, let me distinguish this use of models from two other uses:
1. Statistical pattern matching, with no understanding of the underlying principles. An example would be trying to predict the stock market based on past performance. I will call this a 'statistical model'. This is also where your example of the DoD fits. Sometimes these models will work for prediction, but there is no principle which makes it so.
If the DoD data show an order for 3000 boots every Tuesday for the last 5 years then we might have a good chance of being correct if we predict an order going in next Tuesday. Unless there's a declaration of war, or a budget cut, or the guy placing the order retires and his replacement puts in the order on Wednesdays. There's no natural regularity that guarantees an order every Tuesday.
Sometimes statistical models can uncover natural regularities, and in this way they can be used to discover natural laws. For example, John Snow used the statistical pattern of cholera cases around the Broad Street pump to uncover the natural regularity that people get cholera by drinking infected water. But with this step we have something that is no longer just a statistical model, and there is no necessary reason that statistical models will always do this.
2. A model which aims just to reproduce appearances, as might be used in computer graphics for a movie or game. Here physical principles, statistical principles or anything else might be used opportunistically if the programmer thinks it 'looks good'. But there is no constraint other than the whim of the programmer. I will call this a 'simulation'.
These are distinguished from what I will call a 'scientific model'. I will define a scientific model as a calculation that attempts to extract logical implications from a physical hypothesis in combination with initial conditions.
Take the example of a swinging pendulum and a physical hypothesis of Newton's laws and Newtonian gravity. What observational statements are logically implied by this hypothesis? The most common model that we learn in school to answer this question is simple harmonic motion. This allows us to numerically calculate the motion of the pendulum over time. In principle we can always calculate more decimal places using this model, but we may be limited by calculational resources. In this simple example, we might be limited if we were carrying out the calculation by hand.
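To make this concrete, here is a rough sketch of that school calculation (in Python; g, the length and the starting angle are just invented illustrative values):

import math

g = 9.81       # gravitational acceleration (m/s^2), assumed value
L = 1.0        # pendulum length (m), assumed value
theta0 = 0.1   # initial angle (rad), small enough for the linear approximation

# Simple harmonic motion: theta(t) = theta0 * cos(omega * t), with omega = sqrt(g/L)
omega = math.sqrt(g / L)
period = 2 * math.pi / omega
print(f"predicted period: {period:.4f} s")

# The predicted angle at a few sample times - observation sentences we can test
for t in (0.0, 0.5, 1.0):
    print(f"t = {t:.1f} s, theta = {theta0 * math.cos(omega * t):+.4f} rad")

We can always print more decimal places, but they are only as good as the hypothesis and the approximations behind them.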
The model of simple harmonic motion is often good, but it is far from perfect. For example, it neglects air resistance. A real pendulum will come to rest. You state that:
'If the model can reproduce 90% of the data but months of work show that it cannot reproduce the remainder then it should be abandoned. The purpose of a model is to reproduce the phenomena. Anything less than exact reproduction is not reproduction, no?'
Here I disagree. All models are wrong to some degree, but some models are useful, to paraphrase a famous quote I have neglected to google for. There are purposes for which this model of a pendulum will be useful. Note that this example generalises. We can never (unless you can correct me?) exactly extract the full logical consequences of an hypothesis.
(Here I find some of your statements about models a little confusing. You state elsewhere 'models produce simulations that reproduce some salient features of reality', which implies that they do not need to reproduce all of reality, and that they can fail 'to some degree', which implies failure is not all or nothing. I agree with that - no model is perfect, but we can rank them from best (those that fail least) to worst (those that fail most).)
We can improve the model of the pendulum by introducing air resistance, as is most commonly done with a damping term proportional to velocity. This actually works reasonably well for many purposes, but this parameterisation of air resistance is as much for calculational ease as physical truth. With linear damping the equations of motion can be solved in closed form in terms of exponentials and sinusoids, which makes numerical calculation much easier and more accurate.
The model of the pendulum can be further improved. Sinusoidal motion only arises from linearising the equations of motion, which is valid for small swings. For large amplitude motions (think of the bob swinging right over the pivot!) we need the full nonlinear equations. These can be solved using elliptic functions (although I'm not sure if that's true when we introduce air resistance). This more sophisticated model allows us to logically extract more accurate statements from the physical hypotheses over a wider range of initial conditions. However, the calculational demands are greater and we will run into limits sooner.
Note that the statements we extract are not all or nothing. They can be more or less accurate because they are numerical. If the measured period of a pendulum is 3.1 seconds then a prediction of 3.09 seconds is more accurate than a prediction of 4 seconds, which in turn is more accurate than a prediction of 17 years.
If our pendulum is one of those executive toys which contains a magnet and magnets on the base then extracting logical implications from initial conditions becomes even harder. I will leave aside discussion of chaos here, as that is not my key point. What I want to focus on is that just as in the simpler cases, Newton's laws logically imply the motion of the bob. But now we cannot simplify our model calculations using well known functions. Our only alternative is to try to numerically integrate the equations of motion. This is very often quite practical.
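To show what I mean by numerically integrating the equations of motion, here is a minimal sketch for the damped nonlinear pendulum (the magnetic toy would just add another force term to the same scheme; all the parameter values are invented):

import math

g, L = 9.81, 1.0   # assumed gravity (m/s^2) and pendulum length (m)
b = 0.2            # assumed linear damping coefficient (1/s)

def deriv(theta, omega):
    # Full nonlinear equation of motion: theta'' = -(g/L)*sin(theta) - b*theta'
    return omega, -(g / L) * math.sin(theta) - b * omega

def rk4_step(theta, omega, dt):
    # One fourth-order Runge-Kutta step
    k1 = deriv(theta, omega)
    k2 = deriv(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = deriv(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = deriv(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return theta, omega

theta, omega, dt = 2.5, 0.0, 0.001   # a large initial swing, released from rest
for _ in range(10000):               # integrate ten seconds of motion
    theta, omega = rk4_step(theta, omega, dt)
print(f"angle after 10 s: {theta:.4f} rad")

Nothing here goes beyond Newton's laws; the 'model' is just a recipe for extracting their implications when no closed form is available.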
But now we are into the realm of something that looks more like what is commonly described as a model. We have to make decisions about simplifications and parameterisations. But I hope my example has been able to show that there is really a continuity here with what is often thought of as the 'exact' case. There is not really a new principle involved in this model that was not present in the simple pendulum.
If we can clarify the status of these pendulum calculations - whether or not any or all of them count as models, and why or why not - then I think we have the possibility of moving on to discuss more complex examples, and eventually get to climate models.
Philip,
'In tens minutes I could construct three GCMs, one which would show the earth is going to cool, one which would show the earth is going to warm and one which would show the earth is going to remain unchanged. All would be equally worthless.'
If it's just ten minutes work perhaps you could post your three models? They might be useful examples to discuss?
"...but are far from being the only strand of evidence that climate change is anthropogenic."
As Philip has pointed out they aren't evidence of anything other than the opinions of the people who make the models. However the phrase above appears everywhere in the warmist literature, in slightly different forms, like "mountain of evidence", "multiple streams of evidence" etc. I don't think there is any dispute that humans, or for that matter elephants, have an effect on the climate, and some of it bad, but what we're addressing on this blog, amongst other things for sure, is whether the effect will result in catastrophic climate change. Now, as far as I'm aware, there is no way of foretelling the future, whether Bayesian or any other techniques are used. You wouldn't be having these discussions with us now if there were; you'd be basking in the Caribbean sun on the island you'd bought out of your lottery winnings. I, ignorant though I may be, happen to believe that in terms of complexity, foretelling the future behaviour of the earth's weather systems is orders of magnitude harder than foretelling the lottery numbers.
I can clearly understand that scientists like you and Richard Betts are busily going about their science and making judgements on what has caused what etc., but you should understand the gravity of the situation you and the scientific community have already caused. We are committed to getting our emissions down from 4 weeks of China's output to 2 weeks. Already the entire population of the country, rich and poor, are paying an extra 10% in their energy bills to subsidise the development of renewables. The plan is to raise it to 30%. Real fuel poverty is already with us, and will get worse. Our industry will be made uncompetitive and jobs will be lost. Note, I don't need a model to show this, I have reality and genuine experiments to show that for every green job in Spain 2.8 real jobs were lost before they abandoned the folly. Oh, and there's plenty more woe in line with artificially increasing the cost of energy, but we'll just stick with these.
I appreciate you feel slightly detached from the horrors that are being visited on the British, and European, peoples, in that you do the science, tell the politicians what the science is, and let them decide on evidence from others what the policies are. But if what we're seeing, and the sacrifices the British people have had to make at your behest, turn out to be down to decadal variations in climate caused by unknown unknowns, you will have to pay a price, along with all of science, for your advocacy.
JK:
I've better things to do with my life than waste 10mins on useless models.
I understand Phillip's reluctance and his point stands. Any of us could also easily spend 10 million minutes on three models that do as asked, by which time each of them may well look plausible to the outsider, not least because they are so complex that it is impossible to figure out anything about them at all. The great Tony Hoare famously said of software in general: 'There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.'
Everywhere models and real world climate or weather phenomena part company, as one would expect with a system exhibiting spatiotemporal chaos. Models aren't evidence.
JK:
I've better things to do with my life than waste 10mins on useless models.
Jan 7, 2012 at 7:07 AM | Phillip Bratby
Yes and I've better things to do with mine than wade through the endless futile discussions they would engender.
Wow, so JK's response to von Neumann's infamous quote "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk" would be, go make those models and we can discuss them? I think JK has missed the point here by quite a wide margin.
I'm sorry JK but your posts show you understand little about modelling. Your analogies are trite and irrelevant. Take the wing design one as an example - the wings we model are constrained by design to that which we are capable of modelling. Outside very strict design parameters, we cannot reliably model an arbitrary wing shape through an arbitrary fluid field. Even within our design parameters, constrained by assumption, nobody would build an aircraft based on model output alone, which is why every major aircraft builder owns or has access to a wind tunnel, to produce out-of-sample data with which to validate the model output.
Furthermore, your claims about statistical and physical models are a false dichotomy. Every model is imperfect to some degree - a point you make yourself - which means testing any model requires, by definition, statistics. And nobody in the real world (well perhaps except Mike Mann... no, not even him) just throws statistical methods randomly at data sets. For example, in your case of ordering boots for the army, the decision to buy boots will be based on an understanding of the driving factors - the wear rate and reliability of the boots (which may, in turn, be linked to physical parameters of the boots - the strength of the materials etc), the rate of new recruits, likely future operations which would require different types of boots etc.
The reality is all models require some underlying understanding of the system being modelled, and all models require statistics to assess them.
On the topic of weather forecasting, one thing you will find is the limited time horizon of weather forecasting, which has been stuck at around ten days pretty much forever, and never really improved on. This is because of sensitivity to initial conditions. Despite the massive advances in models, computing power, measurements of initial conditions from satellites, etc, etc, this time horizon is still pretty much stuck and unlikely to move without a fundamental breakthrough. While detail within this time horizon has been improved, the time horizon really has not.
The problem you run into is that if this limit is governed by exponential error growth from initial conditions - which seems likely - the scale averaging cannot overcome the error. Scale averaging reduces the error by the square root of the averaging length and the exponential error will always overcome that. So the belief that models get better at longer time scales is contrary to basic understanding of error propagation.
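A back-of-envelope illustration of the point (the numbers are pure inventions; only the two functional forms matter):

import math

eps0 = 1e-3   # assumed initial-condition error
tau = 2.0     # assumed e-folding time of the error growth (days)

for T in (5, 10, 20, 40):                  # length of the averaging window (days)
    grown = eps0 * math.exp(T / tau)       # error after exponential growth
    averaged = grown / math.sqrt(T)        # reduction from averaging over the window
    print(f"T = {T:2d} d: error {grown:.3g}, after averaging {averaged:.3g}")

exp(T/tau) always beats 1/sqrt(T) eventually, so averaging over longer windows cannot rescue a forecast once the exponential growth has taken hold.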
It is clear that climate modellers have not understood this, hence the hubris with which modellers believed they could make seasonal predictions, or predict 6-month behaviours such as ENSO. We all saw the abject failures of these attempts. These come as no surprise to those of us that understand error propagation in models.
In comparing this to mechanics, a better example would be the equivalent of solving the 3-body problem out to an arbitrary point in time. The 3-body problem also exhibits exponential error growth, which imposes a strict time horizon (problem dependent) on the ability to predict the future. Beyond this, deterministic predictions can be incorrect by an arbitrary amount.
For climate modelling to be credible, we need to
1. Fully understand the time horizon of predictive skill
2. Validate results against out-of-sample data with a reasonable number of degrees of freedom
These are fundamental questions which any modeller worth his salt would be questioning and addressing. Of note, climate scientists pointedly avoid asking these important questions and instead indulge in pointless hand waving about irrelevant analogies, like wing design, none of which answer the real questions about the limits of climate modelling.
Spence,
"For climate modelling to be credible, we need to
1. Fully understand the time horizon of predictive skill"
You miss the fundamental point that these models aren't predicting, they are "projecting". This is an entirely different thing which I believe amounts to "illustrating possible scenarios".
To misquote Spock, "Modelling Jim, but not as we know it".
Cosmic, yes I agree with you, they like using these weasel words to get them out of trouble when their models are shown to be wrong, as we are seeing now with Hansen's laughable 1988 model output.
The word "projection" is a good one, because the models are indeed primarily projections of the bias of the model authors into numbers.
Incidentally, in responding to JK's comment about weather models, I had not checked the link that JK had provided about the increasing skill. I thought I should check this link, and just as I had expected, they report skill at a fixed five days in the future, and do not address the hard limit imposed by exponential error growth.
But I think this is true; we are very good at predicting the weather five days in advance. There is no evidence that we have any skill in predicting - or projecting or whatever - climate 30 years or 100 years in advance.
There's a lot here and I hope to get to Spence_UK's remarks soon, but let me revisit something Philip Bratby wrote earlier, as it has helped me think some of these things through:
'GCMs are not evidence. They are just mathematical constructs and unless they are fully verified and validated, they are worthless.'
To take the second sentence first, it seems to me to be too high a standard and too absolutist. I don't know of any theory or model which is 'fully verified and validated'. We are still testing the speed of light...
I also think that to ignore as worthless any model that is not perfect misses a lot. A model can fail to be perfect and still be worth something. (It seems at least Spence_UK might agree with me on this point.) Models can be worth more and less.
How does this relate to the bigger argument? Philip's argument seems to me to go like this:
1. I can construct models which warm or cool or stay the same temperature.
2. These models have two properties: a) they reproduce the past and b) they have not been fully verified and validated.
3. Therefore we can conclude that if a model has not been fully verified and validated then the capacity to reproduce the past tells us nothing about future trends. Therefore they are worthless.
4. GCMs that run on supercomputers are in some ways unlike my 10 minute models. But they share the property that they have not been fully verified and validated. Therefore they are worthless.
If I were to see Philip's models I suspect that I could argue that, while I completely agree the GCMs that run on supercomputers are not completely verified and validated, they are more verified and validated than 10 minute models. While not proof of anything, they are therefore worth more.
This also gets at my difficulty with the first sentence, that models 'are not evidence'. I agree that models are not proof (except maybe a 'fully verified and validated' model). But in science we have neither proofs (except in a colloquial sense) nor full verification and validation.
Instead on every scientific question there is evidence pointing in several directions. What is generally meant by saying that a question is settled is that the overwhelming majority of evidence points in one direction and not others.
Models seem to me to play an important role in understanding and interpreting theories and evidence. Whether that makes them part of the evidence is perhaps a matter of words.
I don't like introducing too many new examples in one thread, but let me give one more. How do we know the structure of the earth's mantle and core? We have seismological data. But the interpretation and understanding of this data - its transformation into knowledge - depends crucially on models. These models are not 'fully verified and validated'. They are not even as well verified and validated as the best scientific models. If models are not part of the evidence for the structure of the earth, how would Philip (or anyone else) best describe their role in our knowledge of it?
Instead what happens is that models improve. Better models allow better data collection which allow better models. In the process models become better verified and better validated, and they can become worth more.
Spence_UK wrote:
'I'm sorry JK but your posts show you understand little about modelling.'
No need to apologize. It is why I try to ask questions and keep an open mind.
I think it's useful to break down the problem a bit. At least this is what I'm finding in trying to understand some of the objections raised about modeling, which is what I'm trying to do here.
I'm sure that in a discussion about climate my pendulum example seems trite, but the reason that I raised it was my exchange with Theo. I came to the conclusion that his objection to climate modeling rested on a distinction between physically based hypotheses and simulations which has little to do with climate as such.
At first I thought I could better understand this distinction through discussing an example like the weather, but then decided to try to understand his distinction through analyzing a very simple system, like a pendulum. I would still be interested if he has a response, but anyway that was why I raised those things.
This is also why I'm particularly interested in two of your points:
'Furthermore, your claims about statistical and physical models are a false dichotomy.'
and
'The reality is all models require some underlying understanding of the system being modelled, and all models require statistics to assess them.'
Theo Goodwin gave an example of why he is skeptical of model predictions:
'all of that data manipulation is nothing more than a sophisticated method of extrapolating the future from existing graphs. That is not science. That is a system of hunches.'
and
'Even a perfect model is worthless for prediction. To make this point clearly and definitively, let's take a very real example. Suppose that I am modeling the US Defense Department's logistics system. In my model I have all items regularly shipped, all origination points, all destination points, all means of transportation, and so on. Let's suppose this model can reproduce the Defense Department's shipping patterns for the last ten years and is accurate up to yesterday. Can I use it to predict the changes in shipping that will occur today. Surely, I do not need to answer that question. All that model can predict is "the same old, same old," which is exactly what any model predicts.'
It was in response to this that I made a distinction between statistical pattern matching and what I called scientific modeling. While I agree with your point that, at an abstract level, the distinction is a false dichotomy and there is no hard and fast line between statistical and physical modeling, it is still true that some models are much more statistical than others.
In fact it would seem to me that your approach:
'For example, in your case of ordering boots for the army, the decision to buy boots will be based on an understanding of the driving factors - the wear rate and reliability of the boots (which may, in turn, be linked to physical parameters of the boots - the strength of the materials etc), the rate of new recruits, likely future operations which would require different types of boots etc.'
exactly illustrates that there are different possibilities here, and it is these I was trying to understand. But maybe I am wrong and you think your approach is the same as Theo's?
(I also think that in some fields the strategy is almost, even if not quite, just to throw statistics at data. That's more true of paleoclimate reconstructions than of climate modeling and even more so, e.g. in looking for correlations between genes and disease.)
Fitting a general functional form with many free parameters is modeling, and numerically solving differential equations which are well known to describe a system is modeling. But that just shows that we need a finer grained terminology. Otherwise people can get away with confusing things both ways. They can argue they are doing something valuable (solving a physics problem) when all they are doing is curve fitting. They can also dismiss valuable work (solving a physics problem) as worthless (curve fitting). It's all modeling!
After all, von Neumann, who made his famous remark about fitting an elephant was also involved in the first computer models of weather. Presumably he didn't think that the two activities were the same?
At the moment my view is that climate modeling with its parameterisations is somewhere between these two extremes. But to me that just heightens the need for precision in the discussion. General slogans about modeling get us nowhere.
I will try to get to your interesting points about weather, chaos and climate later.
Jan 7, 2012 at 2:16 AM | JK
You are making very good effort to understand what I have said and you are doing so in good faith. Wonderful! However, I have no time for the next few days. As soon as I have a chance I will reply to you in detail. I do thank you for your clear and extensive response to me.
For now, let me say that your pendulum analogy does not quite address my criticism of Warmist models because in your analogy you know the principle of the pendulum - you know the math. My criticism of Warmists is that they are trying to create models without first creating the physical hypotheses - the principles - that will specify those models. Without the physical hypotheses, the models are worthless. However, as you show in your pendulum example, once you have the physical hypotheses, the principle of the pendulum, then a model is enormously valuable.
JK 9.45pm
I worked with thermal-hydraulic computer models for many years. These models were developed embodying physical models (for example a heat transfer correlation or a natural convection equation). Before these models could be used in a real-world situation they had to be verified and validated and that V&V had to be documented. Verification eliminated human errors in producing the model. Validation consisted of testing the model against a wide range of experimental data. The validation would not show that the model was perfect, i.e. it could not reproduce exactly the experimental results. (There was no perfection expected, but there was a high standard required.) However it would show the uncertainty in the calculation results of the model. It would also show the range of conditions for which the model was valid. This would all be documented and the model would then only be used for a real-world application provided it was within the valid range of conditions and provided the uncertainty in the results was given.
I strongly suspect that if the current climate models were validated in this way (not that they could be, because there are too many unknowns, too many unjustified assumptions and no experiments to use for validation) and with all the uncertainties in the component sub-models adequately addressed, then use of the models would show that, given a continuing rise in CO2 concentrations, the global temperature in 50 years time would be +/- 10degC of today's value (to 95% confidence). We certainly would not have been allowed to use such models in a real-world application and nobody sane would consider restructuring society at a cost of trillions using the predictions or projections (take your pick) of such models. I would have been out of a job if I had suggested using the results of such a model.
But then there is a big difference between the real world and that inhabited by climate scientists and politicians.
JK, I'm not sure Theo is being as absolute as you are suggesting, although I don't speak for Theo and I haven't discussed this topic with him at any length, so I make no assumptions on whether I agree or disagree with him on it.
However, if someone were to be making absolute black/white statements where they seemed inappropriate, I'm not sure responding with similarly black/white statements really moves the debate forward.
I appreciate this is a "hostile" environment for your viewpoints, and it takes a fair amount of restraint on your part to engage politely here; I wanted to recognise that and also add my thanks. The other thing I would add is that this blog will have a wide range of abilities with respect to modelling within the audience. Some readers will have little knowledge or experience of modelling, and simple analogies can be useful in conveying high level concepts; I realise this is something you are trying to do.
It is worth bearing in mind that some of the readers here have considerable modelling experience, and their scepticism of climate models is built on real concerns with considerable nuance and thought behind them.
On that note, getting back to the von Neumann quote. The quote refers to the dangers of deluding yourself with complex models based only on hindcast data. If the data are known a priori, then it is trivial to make a complex model match these data. Indeed, there are an infinite number of models that can be made to match the 20th century global temperature over three or four degrees of freedom. Virtually all of these models have no predictive skill whatsoever. Only a tiny subset do, and to reliably determine which set our models fall into, you ideally need out-of-sample data to test on.
As you can see, von Neumann's quote is not intended to dismiss all of weather modelling. But it is certainly a sign that he would have witnessed exactly this during his time modelling - people getting over excited about hindcasting data with a complex model, only to find the model had no predictive value.
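A toy version of the trap, for anyone who wants to see it in numbers (pure curve fitting, nothing physical; the 'trend', the noise and the polynomial degree are all invented for the example):

import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)               # 30 'years' of hindcast, time rescaled to [0, 1]
truth = 0.6 * t                             # an assumed slow underlying trend
data = truth + rng.normal(0, 0.05, t.size)  # plus observational noise

# A model with many free parameters fits the hindcast almost perfectly...
coeffs = np.polyfit(t, data, deg=9)
hindcast_err = np.abs(np.polyval(coeffs, t) - data).mean()

# ...but falls apart on out-of-sample data.
t_future = np.linspace(1.0, 1.3, 10)        # the next few 'years'
forecast_err = np.abs(np.polyval(coeffs, t_future) - 0.6 * t_future).mean()

print(f"mean hindcast error: {hindcast_err:.3f}")
print(f"mean forecast error: {forecast_err:.3f}")   # typically far larger

The hindcast looks wonderful; the forecast is worthless. Only out-of-sample data can tell the two situations apart.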
JK writes:
"But now we are into the realm of something that looks more like what is commonly described as a model. We have to make decisions about simplifications and parameterisations. But I hope my example has been able to show that there is really a continuity here with what is often thought of as the 'exact' case. There is not really a new principle involved in this model that was not present in the simple pendulum."
JK, I have almost no time for a reply. But having read your comments carefully, it seems to me that you miss my major point. Modelers are trying to model the pendulum without first understanding the principle of the pendulum. The best they can hope for is a model that can reproduce the past behavior of the pendulum but is useless for prediction because the principle of the pendulum is necessary for prediction.
JK writes:
"Fitting a general functional form with many free parameters is modeling and numerically solving differential equations which are well known to describe a system is modeling. But that just shows that we need a finer grained terminology. Otherwise people can get away with confusing things both ways. They can argue they are doing something valuable (solving a physics problem) when all they are doing it curve fitting. They can also dismiss valuable work (solving a physics problem) as worthless (curve fitting). It's all modeling!"
We do need a finer terminology. But solving differential equations is not science, though it can contribute greatly to science. Time series analysis is often called modeling but it is useful only as an analytic tool. It is employed most often in budget analysis by large corporations but no one would consider making predictions from the results of that analysis.
As regards my Defense Department example, what you call a "prediction" is only a hunch. Where we really need finer terminology is in our talk about scientific method. A brilliant set of hunches does not a science make.
The goal of science is understanding. Understanding is embodied in principles such as Newton's Laws or, using our example, the principle of the pendulum. Something short of that kind of understanding does not produce statements that are true or false and does not produce something worthy of the name science. I think you will find yourself changing to the tools used by game theorists who work on decision under uncertainty. That is a fine discipline but it does not address science. Science does not make decisions. Science provides well confirmed principles for predictions that can be used in decisions but it does not make decisions. My goal is to stop the Warmists from using the good name of science in their claims for their models.
I think (Jan 8, 2012 at 7:37 AM | ) Phillip Bratby makes an important point from the practical side.
JK, I greatly appreciate your contributions and I regret that I do not have the time to respond to them fully.
Thanks to everybody who's responded.
To sum up so far, I think we have three key points about the limits of modeling:
1. Philip's point about verification. As I understand verification, this is essentially showing that a program has been properly debugged.
2. The problem of understanding the system. This encompasses Philip's point about validation, Theo's point that modeling a system without understanding is not science and Spence_UK's point that we need to 'validate results against out-of-sample data with a reasonable number of degrees of freedom'. More generally, it is the problem of parameterisation, and the worry that models are falling into von Neumann's trap.
In practical terms there is also the worry that neither 1 nor 2 has been adequately documented.
3. Spence_UK's point about chaos and sensitive dependence on initial conditions, summed up in his point that we need to 'fully understand the time horizon of predictive skill'.
These all seem to me to be reasonable concerns, although I am not yet convinced that they make all climate models as worthless as some others here believe. (Of course bad modeling should be rejected, but I also don't think that these reasons justify the wholesale dismissal of modeling per se - and not just climate modeling - that I sometimes see in blog comments. Limits to Growth probably has much to answer for!)
I think that point 1, verification, while significant is probably the least significant of the three problems. I have two reasons for this, one practical and one in principle.
The practical reason is that the size of the error associated with bugs seems likely to be small relative to the error associated with parameterisations and bad assumptions.
The principled reason is that we know how to fix it: spend more money on software engineers. In principle this should resolve the issue. Throwing more resources at improving physics parameterisations, on the other hand, while likely to result in long term progress, holds no guarantee of successful results. If chaos is the limiting factor for climate models then progress will be even more limited, as spending more on better data collection and faster computers will only buy us a logarithmic advance in the time horizon.
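For what it's worth, here is the arithmetic behind 'logarithmic' (the growth rate and the tolerance are invented numbers; only the ln() relationship matters):

import math

tau = 2.0        # assumed e-folding time of error growth (days)
tolerance = 1.0  # assumed error level at which a forecast stops being useful

# If errors grow as eps0 * exp(t / tau), the useful horizon is tau * ln(tolerance / eps0)
for eps0 in (1e-2, 1e-3, 1e-4, 1e-5):
    horizon = tau * math.log(tolerance / eps0)
    print(f"initial error {eps0:g} -> horizon {horizon:.1f} days")

Each tenfold improvement in the initial data buys the same fixed increment (tau * ln 10, about 4.6 days with these numbers), which is what I mean by a logarithmic advance.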
If anyone thinks I've summarized things wrong or that verification deserves equal billing with 2 and 3 please say so and I will think more about it. I will try to produce more detailed responses on validation and chaos.
JK, I think your summary is generally reasonable. I would add one caveat with respect to 3. While chaos is a cause of exponential divergence from initial conditions, it is not the only possible explanation for this phenomenon, and any such discussion should not be narrowly restricted to chaos (although chaos - and different types of chaos - are clearly relevant and have a place in this discussion).
As a primer for my perspective, it is worth reading this presentation on the presence of chaos and implications of what we know about climatic time series, which may help to clarify the above paragraph, and also this peer reviewed paper on a toy chaotic model generating scaling behaviour consistent with climatic time series.
I’ve seen progression in every post. Your newer posts are simply wonderful compared to your posts in the past. Keep up the good work!
This is a great thread, and I hope Theo, Spence and JK will continue to explore this issue. It is my belief, going back to the original post, that too much confidence is placed in the quality of climate models, and it is this overconfidence that lies behind statements such as that made by Prof. Slingo. Hence I think it is really interesting to explore these models.
Many here seem to be assuming that JK believes the current generation of GCMs (maybe it is helpful to say 'GCM' or some more precise technical term, to avoid confusion with models based only on statistics of time-series of global average temperature anomaly) are broadly correct in their predicted climate sensitivity to doubling of CO2 - from my reading, he or she does not say this, but is instead looking at the more detailed question of whether GCMs have any scientific predictive value. Many on the sceptic side of the debate disagree with the high predicted sensitivities and conclude from that that the GCMs are worthless - this is not a good conclusion to draw (they may not be accurate, but you need more care than that to prove it!).
Verification and validation of GCM models is something that gets debated a lot in the scientific literature, and also in the climate blogosphere. Towards the end of the "Dangerous Climate Change?" thread initiated by a comment of Richard Betts, several people, including Richard, debated these models. I thought that Jonathan Jones made a good point about validation:
There is also a long thread on validation at Judith Curry's site.
Also, I think that when using Theo's language of 'hypothesis' vs. blind 'model', the people who develop GCMs would argue strongly that they develop hypothesis-based models. See for example this comment by Richard Betts on the thread I referred to above. Of course, the 'hypothesis' is not as well-known and nailed down as e.g. Newton's laws of motion for the pendulum mentioned by JK, but the properties of e.g. mass/viscosity/heat capacity of air parcels, and the radiative absorption/emission properties of the various gases are well-known physical properties. The hypothesis is that when using such properties, and discretizing the atmosphere, ocean and land, as well as time, into small discrete elements, it is possible to write a set of differential equations, whose integration leads to some meaningful properties of the actual climate system. There are loads of caveats about initial conditions, ensemble prediction of mean properties rather than actual weather prediction, sensitivity to model parameterization, etc., but I do not believe that the GCM developers would accept that they are merely scrabbling around in the dark, throwing random terms in their models hoping for a successful hindcast.
Extremely lucid, by all, and it provokes me to repeat my simple predictive model: The Concatenation of Oceanic Oscillations and the Cheshire Cat Sunspots to produce cooling for the next 20 to 200 years or more. Deviation from that may be the warming effect of Anthropogenic CO2 and I hope it's enough to keep us warm.
==============
Theo (or anyone else),
Perhaps when you reply on the difference between climate models and a pendulum you could make an intermediate step.
If we agree, at least in part, on the account of the pendulum could you look at the analogy between the simplification of the pendulum equation - for example linearisation or parameterisation of air resistance - and the derivation of the primitive equations from the Navier-Stokes equations and thermodynamics?
http://en.wikipedia.org/wiki/Primitive_equations
A climate model is a lot more than the primitive equations, but they in turn are more complex than a pendulum. What do they introduce that's new? If we take things one step at a time, we have more chance of clarity and of understanding exactly how the principles apply.
If we get clear on that perhaps we could discuss tidal prediction and solution of the Laplace equations (interesting review here http://www.siam.org/pdf/news/621.pdf) before getting into the more difficult case of climate.
Spence_UK (or anyone else),
Thanks, I will have a look at those references. In the mean time perhaps you could make some comment on this recent paper?
Matei et al., Multiyear Prediction of Monthly Mean Atlantic Meridional Overturning Circulation at 26.5°N
Science 6 January 2012:
Vol. 335 no. 6064 pp. 76-79
DOI: 10.1126/science.1210299
Press release: http://www.sciencedaily.com/releases/2012/01/120106110212.htm
Paper: http://www.sciencemag.org/content/335/6064/76
From your remarks above I take it that you think such a result is implausible, presumably the result of a fluke or a cherry pick, and that it will go away soon?
At present my own view is that beyond a few weeks predictability does indeed become difficult, although it should be recovered, in a sense, for certain statistical indices on a multi-decadal (climate) scale. This may account for the failure of ENSO and seasonal predictions. I have not yet formed a strong view on this, however. It seems to me that for phenomena in the mid range (from a month to a decade) predictability is a very interesting question which should depend on the details of the phenomena involved.
Do you think that your simple scaling argument which you claimed explained the failure of seasonal and ENSO forecasts:
'The problem you run into is that if this limit is governed by exponential error growth from initial conditions - which seems likely - the scale averaging cannot overcome the error. Scale averaging reduces the error by the square root of the averaging length and the exponential error will always overcome that.'
is powerful enough to dismiss the plausibility of the Matei paper without further consideration? Maybe it is not so obvious whether the error growth for AMOC is exponential with a time constant less than several years? But it is not obvious to me at this point that this is the case for ENSO, either.
Philip (or anyone else),
Thinking some more about your comment that
'unless [models] are fully verified and validated, they are worthless'
it seems to me that the example of Numerical Weather Prediction is a useful one to think about. I pointed to some data from the Met Office that Spence_UK seemed to think was evidence of some real improvement,
http://www.metoffice.gov.uk/research/weather/numerical-modelling/verification/global-nwp-index
and I want to say more about this topic later. There is more data from the European Center for Medium Range Forecasting in figure 2, here:
http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1252&context=usdeptcommercepub
I think this also shows some progress (of course as the article points out some of the improvement is due to better data collection, but much is also due to better models).
My question is, if this progress is real then what does that say about the necessity of full verification and validation before a model can be worth something to forecasters? (The desirability of certain formal V&V standards, for transparency, is a different although related point to necessity. Wider transparency may be necessary in order for models to play certain social or political roles, but I am not concerned with that here.)
Numerical Weather Prediction models are not, so far as I know, fully verified. They are 'partially verified' in the sense that they are continually being monitored and tested.
The question of validation is more tricky. So far as I know, in most settings the standards to be met are specified and set out beforehand. This is not the case for NWP. But the data in the two links I have given here would seem to show a sort of validation to higher and higher standards without ever getting to 'full validation'.
So my question is whether this might be a viable form of progress? If NWP can make progress without formal full verification (regardless of whether this might not improve it further, or whether it would be desirable for other reasons) and NWP can make measurable progress without meeting formal validation criteria set out beforehand then why is this not viable for climate modeling? Might some kind of 'dynamic validation' be a possibility?
Sorry that this is just more questions, I will try to put down some of my own thoughts soon.
1. Regarding the pendulum, try extending it to a double pendulum! ;)
2. Regarding Matei et al, I don't subscribe to Science (they rarely publish anything in my field, and if they do I just pay the fee for the paper), so cannot yet access the paper. Perhaps a preprint will appear at some point soon. So I am reduced to commenting on the press release and abstract, which I am uncomfortable doing since in my experience press releases and abstracts are usually sales pitches and do not contain all of the information needed to assess a paper. With that caveat my first impressions, subject to change on reading the actual paper, would be:
2a. They published on 6th Jan 2012, and at first glance their claim of predictive skill to date appears to be based on hindcasting.
2b. However! They have published time series with predictions out to 2014 (and perhaps beyond? difficult to judge from the PR graph). This is good, as through publication there is a peg in the ground that cannot be changed.
2c. This means in 2014 we can meet up again and determine whether there really was any skill in the prediction. But we can't tell at present.
2d. One thing we should probably do in advance (prior to assessing the prediction, as selectivity at the time of testing the prediction can lead to bias) would be to determine the naive baseline and number of degrees of freedom (from spatial / temporal resolution of the prediction). If there are only one or two true degrees of freedom wrt the naive baseline then I'm unlikely to be impressed.
2e. Note the observations seem to have a distinct seasonal cycle. I would not be impressed if their only skill were to be able to predict the change of the seasons. I can do that without a model. Therefore, any naive baseline must include a projection of the seasonal cycle from the recent past, and any predictive skill must improve over this naive baseline. (A minimal sketch of what I mean by such a baseline comparison follows at the end of this comment.)
So in summary, the claims of skill from that paper are currently only really in hindcast and remain to be proven. However, kudos to the team for publishing a prediction that (1) will come to bear before the people making the predictions have retired and (2) appears at first sight to be testable. But as to whether skill is present... I think we have to wait a couple of years. (As noted, my view may change on reading the paper).
Another side issue here is that the 10 day limit is a characteristic of atmospheric prediction. It is possible that aspects of ocean circulation could be longer, where there is significant inertia or filtering present in the system - but this again feeds into the "degrees of freedom" aspect of the argument. This applies to ENSO, as an example, which fluctuates more slowly than atmospheric circulation, and it takes 6 months or so to get a true degree of freedom in out-of-sample data.
3. I may have muddied the waters with the simple scaling model. It was an attempt to pre-empt stock "consensus" arguments about the Lorenz system :). Simple scaling is not exactly the cause of the failure of predictive skill in models. Error propagation from initial conditions is probably the most important "cause", scaling behaviour is an additional effect which renders averaging ineffective as a mechanism to improve predictive performance, even for stationary time series. In retrospect, I may be causing confusion by raising it now. Scaling behaviour is important to understand though, since it is pervasive and a common component of many complex systems that are subject to certain constraints.
4. Sure, I recognise that weather forecasting can be (and is) validated against out-of-sample data, and within the time horizon of predictability it does a good job. With better input data thanks to advances in remote sensing, and to a lesser degree improvements in modelling capability and number crunching, weather forecasts have (I believe) improved. The time horizon is the big problem, I think.
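As promised under 2e, a minimal sketch of what I mean by testing against a seasonal baseline (all the series are invented; it is the form of the comparison, not the numbers, that matters):

import numpy as np

rng = np.random.default_rng(1)
months = np.arange(48)                            # four years of monthly data
seasonal = 2.0 * np.sin(2 * np.pi * months / 12)  # an assumed climatological seasonal cycle
obs = seasonal + rng.normal(0, 0.5, months.size)  # 'observations': seasons plus anomaly

def mse(a, b):
    return float(np.mean((a - b) ** 2))

baseline_mse = mse(seasonal, obs)   # naive baseline: just repeat the seasonal cycle

# A forecast that only reproduces the seasons scores zero skill by construction;
# one that captures some of the year-to-year anomaly scores above zero.
anomaly = obs - seasonal
forecast = seasonal + 0.5 * anomaly   # hypothetical forecast explaining half the anomaly
skill = 1.0 - mse(forecast, obs) / baseline_mse
print(f"skill over the seasonal baseline: {skill:.2f}")   # 0.75 with these made-up numbers

A claim of predictive skill only impresses me if the score beats that baseline on genuinely out-of-sample data.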
@Theo Goodwin, all math we do in any science is the manipulation of one or more models. Math is an idealization. Newton's equations are models of reality that assume bodies will behave like point "masses" (no consideration is given to atomic structure, for example) and "move" so as to satisfy certain exact mathematics in the context of a Cartesian system. Newton's laws are a good model for some uses. They are a bad model for other uses (eg, where quantum mechanics or relativity are better tools, or where the system is complex, such as the earth's climate).
We'd like to create models of complex systems like the Earth's climate by including constituent parts based on quantum mechanics (eg, QED) and other very precise theories/models; however, we compromise, using some results from QED, for example, as well as equations that are much less precise but do allow us to make useful statements about otherwise intractable problems.
@mydogsgotnonose
>> 1. 'Back radiation' heating is plain wrong; as it breaches the 2nd law of thermodynamics
You didn't learn that in a science course, did you? I don't think universities teach that (maybe a few do).
In general, if you go against mainstream, you have the burden to show your work and explain the almost surely arising inconsistencies as they are pointed out. If you don't or can't, you will probably not change mainstream.
"Back radiation" is just ordinary photon radiation to which some have given that name. A more accurate name is downward longwave radiation (DLR).
It's not too hard to conduct experiments to show that radiation adds energy to a material that otherwise would go elsewhere (eg, into outer space for the case of the planet). Eg, see the bottom of this exercise about a thermocouple (which is just two wires, each composed of a different metal from the other, fused together into a loop): http://mit.edu/16.unified/www/FALL/thermodynamics/notes/node136.html where it's clear that blocking the line of sight leads to lower temperatures of the radiated metal. This principle forms the backbone of pyrgeometers, instruments which measure this DLR coming from above at values comparable to what is shown in studies by Trenberth and others.
Are you claiming energy is not conserved? Are you claiming instantaneous energy transfer at a distance rather than via the traveling photons themselves? You would be going against very well accepted and utilized mainstream science. The climate scientists have indeed embraced mainstream radiation physics (eg, physics evidenced through many spectroscopy measurements and used to build lasers and many modern devices). What model of radiation energy transfer are you using?
Temperature is a reflection of average kinetic energy of particles. If more energy stays around the planet at any given instant, the average temperature would be higher. This is as true for a closed-door oven as it is for a ghg atmosphere. If significantly more energy exists in an enclosure with most other variables relatively stable, regardless of the reason, the temperature will remain higher... at the new equilibrium point.
GHGs absorb energy that would otherwise be leaving into space more readily. Most of this energy is immediately added to the surrounding gases' temperature. A fraction does radiate isotropically to reflect the gases' temperature (a well established quantum electrodynamics result). A fraction of this radiation is absorbed into the Earth+atmosphere system rather than going into space immediately. This, as the mathematics modelling this fairly well accepted physics shows, leads to energy accumulation until a new higher equilibrium value is established, where the fractions that leave into space again match what the sun continues to rain onto the earth second after second after second. While those photons are being absorbed instead of going out into space to do useful work for some alien lifeforms, we "enjoy" rising energy levels until the new balance is struck.
If the sun were to be turned off, the earth would lose its energy into space, but the "blanket" effect provided by the ghg means that with the sun turned on a higher equilibrium temperature is reached.
BTW, convection keeps the temperature gradient in the lower atmosphere (where there is much higher water vapor and significant pressure) at near linear in distance.
Can you document what thermodynamics calculations you used on paper to convince yourself that the second law would be violated? Are you aware that the second law is about entropy and not temperature per se? Are you aware that you have to define your system so that there is no unaccounted energy/work exchange with the outside environment? Are you aware that different sets of lasers, essentially the same devices but tuned and set up a little differently, can be used either to cool things to near 0 K in one scenario or to heat them up hotter than the surface temperature of the sun in another? Would you deny reality and say the 2nd law is being violated in either of these two cases? Would you say that cold mirrors used today to reflect sunlight to heat up water (or a hotdog) are violating the 2nd law as well?
Really, the math and physics do not appear to support your claims. This is why I was wondering what science course you took to have come up with those ideas. If I could read a book or paper or website that does some calculations, I would have a better idea of what you mean.
>> 3. The predicted future temperature rise is exaggerated by ~3.7 times because present GHG warming is claimed to be 33 K when it's really ~9 K.
Did you make that up? Wikipedia does a calculation for the 33K which you can study if you want. I had seen some people claim it's about 18K because they were ignoring the Earth's .3 albedo it gets thanks to the atmosphere and using instead about .1 for albedo, but the fact is we have an albedo of about .3. That means that less energy gets down to the ground, so the ghg have to work from that lower level and raise it by about 33K. If the albedo were around .1 today, it would be much hotter. And we risk moving significantly in that direction from ice melts. Unlike in prehistoric times, the sun's irradiance today is appreciably greater.
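The calculation itself is short (standard Stefan-Boltzmann bookkeeping; the solar constant and albedos are the usually quoted round numbers):

sigma = 5.67e-8     # Stefan-Boltzmann constant, W/m^2/K^4
S = 1361.0          # solar constant, W/m^2
T_surface = 288.0   # approximate observed global mean surface temperature, K

def effective_temperature(albedo):
    # Radiative-balance temperature of a body absorbing S*(1-albedo)/4 per unit area
    return (S * (1 - albedo) / (4 * sigma)) ** 0.25

for albedo in (0.3, 0.1):
    T_eff = effective_temperature(albedo)
    print(f"albedo {albedo}: T_eff ~ {T_eff:.0f} K, greenhouse difference ~ {T_surface - T_eff:.0f} K")

With the actual albedo of about .3 the difference comes out near 33K; pretending the albedo is .1 is what shrinks it to roughly 17-18K.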
>> 2. To account for the failure of temperatures to rise as predicted
Can you be more specific about which projections you are talking about?
Are you including error bars? The models are created and function within a specified uncertainty level. You have to respect that. The predictions never claim to specify the temperature in any given year with 100% confidence. They don't claim that as part of their projection, so you should not pretend they do. There is a rather wide error range given, a range which nevertheless projects significant temperature rises by 2100 under many scenarios.
Are you putting in the correct values for the variables that the IPCC and models had to guess at the time the projection was made? Guesses include volcano activity, human CO2 releases, aerosols in the air, etc.
The models disagree with each other, indicating there are numerous recognized uncertainties and flaws. I agree, and this will likely always be the case to some extent. Obviously, you never prove any model correct. Newton wasn't correct, and we can't prove Einstein correct. We can only prove models wrong once they fail. The failure might be slight or not, but we aren't there yet for the climate models.
In contrast, can you point to an alternative model that has done better and which isn't just a lucky model (since some fraction of many wrong models will be correct on any given instance) and follows accepted physics?
Waiting for the perfect model will lead many patients to die that otherwise could have been saved or had their lives extended or pain decreased.
>> 4. Then these people have the gall to claim that anyone probably better qualified than most in the discipline criticize its elementary scientific failures as 'deniers'.
You made a very general statement that almost surely is wrong (*all* climate scientists call *anyone* "better" qualified a "denier"??), but I am curious which papers you think have it right yet are inconsistent with the IPCC projections and are being ignored. I have seen numerous papers reaching unsupportable conclusions. I wonder if you would include some of those.