Lewis on radiocarbon
Nic Lewis has a very long and rather technical post up at Climate Audit about the Bronk Ramsey radiocarbon dating algorithm and its statistical problems. It's well worth persevering with, though, because when he gets on to testing different statistical approaches to the problem - the Bronk Ramsey subjective Bayesian one, an objective Bayesian one, and a frequentist approach too - the failings of the subjective Bayesian approach become startlingly clear.
The different approaches are also considered by means of a rather fascinating analogy, namely the recovery of a satellite that has fallen to Earth.
This is the conclusion of Nic's paper:
...there seems to me no doubt that, whatever the most accurate method available is, Doug is right about a subjective Bayesian method using a uniform prior being problematical. By problematical, I mean that calibration ranges from OxCal, Calib and similar calibration software will be inaccurate, to an extent varying from case to case. Does that mean Bronk Ramsey is guilty of research misconduct? As I said initially, certainly not in my view. Subjective Bayesian methods are widely used and are regarded by many intelligent people, including statistically trained ones, as being theoretically justified. I think views on that will eventually change, and the shortcomings and limits of validity of subjective Bayesian methods will become recognised. We shall see. There are deep philosophical differences involved as to how to interpret probability. Subjective Bayesian posterior probability represents a personal degree of belief. Objective Bayesian posterior probability could be seen as, ideally, reflecting what the evidence obtained implies. It could be a long time before agreement is reached – there aren’t many areas of mathematics where the foundations and philosophical interpretation of the subject matter are still being argued over after a quarter of a millennium!
I gather that the story still has a way to run, so watch this space.
Reader Comments (27)
Might I respectfully point out that, to retain the interest of the majority of those who have no idea about Bayesian theory, statistics or general number-crunching (such as me – though I appreciate I might be in the minority on this site), we really, really should be getting over the idea that the whole Global Warming/Climate Change/Climate “Weirding” scam is not, not, NOT, NOT about science – it is about control. “Science” is merely part of the smoke and mirrors. The desired control is not of climate but of the people – it is about Government controlling every aspect of our individual lives; what we eat, what we wear, who we meet, where we go. It started with DDT, has moved onto CFC propellants and smoking, and is now circling alcohol, and the salt, sugar and other components of our diets. With AGW, we can be scared into accepting yet more taxation and yet more restrictions on our liberties; we see ever-increasing costs with little increasing benefits. Presently, our only defenders are – oh, such irony! – the corporations who fund government, as they seek to thwart the regulations that they inspire (fuel prices rise – make more efficient engines; sell more cars!).
+1
Radical Rodent
Agreed. But I knew that already, and understand it. Subjective Bayesian statistics is something I don't understand, though I've tried (a little). It's pure Humpty Dumpty through the Looking Glass to me. If anyone here would like to explain more, I'm sure many of us would appreciate it.
I'm prone to compare the climate models to the recent spate of "discoveries" regarding extra-solar planets. If certain assumptions are true (priors), and if the measurements are accumulated over enough time (sample size), and if the noise elimination methods (statistics) are applied correctly, it would seem probable that the blips in the trends of the data can be associated with "planets around that distant star" or "radiant energy accumulating in local weather systems".
If the assumptions, measurements or statistics are NOT valid, or even not QUITE valid, then the results may need to be revised. Maybe that so-called planet will in future be "downgraded" to some other category, as Pluto, relatively nearby, has been. Maybe the "heat is hiding in the deep oceans" instead of stirring up hurricanes.
Nobody is proposing to spend significant amounts of global GDP on sending colonists to other planets. Many are proposing to spend such sums on retro-terraforming Earth.
Under what James Hansen is pleased to call "Business as Usual" there will be some people choosing to spend some money on their choice of either, both, neither, or some other consequences of recent scientific work. But what he wants instead is a global oligarchy to collect and spend ALL the world's citizens' spare change on HANSEN's branch of science.
Does it make me a denialist to "consider the source" when I contemplate whether or not I "believe" in the science -- and the policy recommendations of scientists?
To me, statistics are like art. Some of it looks right, whereas some looks decidedly dodgy, and while I can see the beauty of it, it's totally beyond my skill level.
Radical Rodent
I think the phrase is "For your comfort and safety".
Statistics is for graphophiles, whereas most of us are mortal graphophobes.
An excursion through the posts at Climate Audit gives a flavour of this.
SandyS: Ah, yes... To paraphrase Ronald Reagan's 11 words that should strike terror into a person's heart: "We are from The Government; we are here to help you."
Or, perhaps even better words from the Great Communicator: “You can’t be for big government, big taxes, and big bureaucracy and still be for the little guy.” Let’s face it, Big Government, Big Taxes, Big Bureaucracy and Big Control seems to be the ultimate aim of most of the politicos, nowadays. Long gone are the days of Calvin Coolidge.
"Subjective Bayesian statistics is something I don't understand, though I've tried (a little)."
The issue Nic is talking about is that what a uniform prior means depends on what coordinate system you use to express the parameters. What's uniform in one coordinate system is distinctly non-uniform in another.
For example, say we express the coordinates of a point inside a circle using range and bearing, and want to define a 'uniform' distribution of them. It would be easy to pick both of the coordinates range and bearing uniformly and independently on some interval. But this doesn't work, because when we switch to Cartesian XY coordinates we can immediately see that the distribution is denser close to the origin. By assigning uniformly on range, we have assigned equal probability to every circle centred on the origin, even though some are far longer than others.
And it sits uncomfortably for our supposedly objective prior to depend on what coordinate system we use, when the coordinates are an entirely arbitrary feature of the way we choose to describe the problem. Why is XY objectively any better than range-bearing?
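A minimal numerical sketch of the coordinate-dependence point (the sample size and radii below are arbitrary): drawing range and bearing uniformly and independently piles points up near the origin, whereas genuinely uniform sampling on the disc requires the range to be drawn as the square root of a uniform variate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# "Uniform" in polar coordinates: range and bearing each drawn uniformly.
r_polar = rng.uniform(0, 1, n)
theta = rng.uniform(0, 2 * np.pi, n)  # bearing; doesn't affect the radial density

# Genuinely uniform on the unit disc: the density of the range grows with r.
r_disc = np.sqrt(rng.uniform(0, 1, n))

def frac_inside(r, radius=0.5):
    """Fraction of points falling inside the inner circle of given radius."""
    return np.mean(r < radius)

# The inner circle of radius 0.5 covers 25% of the disc's area...
print(frac_inside(r_disc))   # ~0.25, as the area demands
print(frac_inside(r_polar))  # ~0.50, twice as dense near the origin
```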
The Jeffreys prior seeks to deal with that by generating a distribution that works out the same whatever coordinate system we use. It does this by looking at the sort of measurement we can make, and picking a distribution that is uniform on the information that measurement can provide. It effectively assigns a coordinate system to the parameter space such that the measurement is uniformly informative. Areas that the measurement doesn't tell you much about get scrunched up, and areas where it gives a higher resolution are stretched out.
It's a good background to use for that particular measurement, but it has the peculiar property that your prior belief about the true value apparently depends on what sort of measurement you plan to make. It has an element of arbitrariness to it, like the coordinate system dependence mentioned earlier. The difference is that the measurement you actually do is a real, objective thing. But it's still a "controversial" approach, and the philosophy behind it is unclear.
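For a standard concrete case of this construction (not specific to radiocarbon): the Jeffreys prior for a Bernoulli success probability p is proportional to 1/sqrt(p(1-p)), and the sketch below checks numerically that, unlike the uniform prior, it transforms into exactly the Jeffreys prior you would compute directly in the log-odds parameterisation (densities are shown up to normalising constants).

```python
import numpy as np

# Bernoulli model reparameterised by log-odds phi: p = 1 / (1 + exp(-phi)).
phi = np.linspace(-6, 6, 1001)
p = 1.0 / (1.0 + np.exp(-phi))
dp_dphi = p * (1 - p)                     # Jacobian of the reparameterisation

# Jeffreys prior computed directly in each parameterisation
# (square root of the Fisher information for a single Bernoulli trial).
jeffreys_p = 1.0 / np.sqrt(p * (1 - p))   # as a density in p
jeffreys_phi = np.sqrt(p * (1 - p))       # as a density in phi

# Transform the p-space priors into phi-space via the usual density rule.
jeffreys_transformed = jeffreys_p * dp_dphi   # equals jeffreys_phi exactly
uniform_transformed = 1.0 * dp_dphi           # far from uniform in phi

print(np.allclose(jeffreys_transformed, jeffreys_phi))        # True: invariant
print(uniform_transformed.min(), uniform_transformed.max())   # ~0.0025 to 0.25
```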
geoffchambers: my understanding is that Bayesian statistics is used to refine one's confidence in a theory in the light of new evidence. Roughly speaking, my confidence in theory T will improve when I observe evidence in support of T and diminish when I find evidence running counter to T (this evidence, of course, being statistical in nature).
Now, say I have a collection of competing theories, T1, T2, ..., I can use Bayesian statistics to rank them in order of likelihood as evidence accumulates. If I assume a flat prior then my initial position is that T1, T2, etc. are all equally likely. On the other hand, I may have a priori reason to believe that T1 is more likely than T2; in this case a flat prior may be inappropriate, since it doesn't reflect my differing levels of confidence with regard to T1 and T2.
With enough evidence, Bayesian statistics will promote the best theory to the top of the pile regardless of your starting position [give or take...]. If I start with a flat prior, that process may take much longer (more evidence will be required) and, in the earlier stages, make lesser theories seem unreasonably plausible. Right now we're in the "earlier stages" part of the process. On the other hand, if I start with a bad non-uniform prior, I will have to accumulate more evidence than with a flat prior to dislodge my preferred theory from the top of the list.
I think the argument here is whether a flat prior is sensible. If all sides of the argument agree that some subset of theories are less likely then a flat prior is arguably a bad starting point.
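A toy sketch of that updating process (the two "theories" and the data below are entirely made up): with enough tosses the posterior favours the true coin whether the prior is flat or heavily skewed towards the wrong theory; the skewed prior just takes more evidence to overturn.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two competing "theories" about a coin: T1 says fair, T2 says heads-prob 0.7.
p_heads = {"T1": 0.5, "T2": 0.7}
true_theory = "T2"

def update(prior, tosses):
    """Apply Bayes' rule over a discrete set of theories, toss by toss."""
    posterior = dict(prior)
    for heads in tosses:
        for theory, p in p_heads.items():
            posterior[theory] *= p if heads else (1 - p)
        total = sum(posterior.values())
        posterior = {theory: v / total for theory, v in posterior.items()}
    return posterior

tosses = rng.random(200) < p_heads[true_theory]   # simulated evidence

flat_prior = {"T1": 0.5, "T2": 0.5}
skewed_prior = {"T1": 0.95, "T2": 0.05}           # strong initial preference for T1

print(update(flat_prior, tosses))     # T2 dominates
print(update(skewed_prior, tosses))   # T2 still wins; it just needs more evidence
```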
The C14 measurement and the calibration are two INdependent random variables, I think.
One random variable results from C14 measurements, the other from a past activity with tree rings.
No Bayes involved here.
So the pdf of a calibrated/renormalised measurement is obtained with a bit of number crunching.
Feller II, Ch. 1.2, (2.11): the pdf of a sum of two independent random variables is the convolution of their pdfs.
The easiest approach is probably just to calculate the "mapped" binomial and assume the other one is a "step 1" function.
This is then just a change of variables.
I would involve Bayes if I were making two or more C14 measurements and wanted to work out a model for the error (i.e. the form of the pdf). But that is not at issue here.
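As a minimal numerical illustration of that Feller convolution result (the error scales below are hypothetical, not taken from any real calibration data):

```python
import numpy as np

# Two independent errors on a common grid of one-year steps.
x = np.arange(-200, 201)                       # years (hypothetical grid)
pdf_meas = np.exp(-0.5 * (x / 30.0) ** 2)      # measurement error, sd ~30 yr
pdf_cal = np.exp(-0.5 * (x / 20.0) ** 2)       # calibration error, sd ~20 yr
pdf_meas /= pdf_meas.sum()
pdf_cal /= pdf_cal.sum()

# Density of the sum of two independent variables = convolution of the pdfs.
pdf_sum = np.convolve(pdf_meas, pdf_cal)
grid_sum = np.arange(-400, 401)

# For Gaussians the variances add: sd of the sum ~ sqrt(30^2 + 20^2) ~ 36.
mean = np.sum(pdf_sum * grid_sum)
sd = np.sqrt(np.sum(pdf_sum * grid_sum ** 2) - mean ** 2)
print(sd)
```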
ps
It might be confusing to see why it is a sum.
Calibration curves are usually normalised to show differences only?
Then the calendar age = end result = Xmeas(C14 age) + Ycal.
Anyway, this is my non-Bayesian, graph-economical approach;
please do carry on.
ptw
Not sure I fully follow. The calibration error is in the C14 age - my understanding is that the calendar age of calibration points should be pretty much known exactly (from tree ring counts or whatever). So you have the sum of two independent Gaussian (normal) random variables. That has a Gaussian distribution with variance equal to the sum of the variances of the constituent random variables.
The objective Bayesian approach using Jeffreys' prior in effect then carries out a change of variables from C14 to calendar year. The uniform prior approach used by Prof Bronk Ramsey in OxCal also carries out a change of variables, but without multiplying at each calendar date by the standard "Jacobian determinant" conversion factor required to convert probability densities on a change of variable - here the derivative (rate of change) at each date of C14 age with calendar age.
In practice it is trickier because the standard change of variable formula doesn't apply where, as here, the relationship between the two variables is non-monotonic (not all in one direction).
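To make the Jacobian point concrete, here is a toy sketch with an entirely made-up monotonic calibration curve and a hypothetical measurement (it is not Nic's code or OxCal's): with a uniform prior on calendar age the posterior is just the likelihood, whereas the change-of-variables version multiplies in the |d(C14 age)/d(calendar age)| factor at each date, which shifts weight towards dates where the curve is steep.

```python
import numpy as np

# Made-up monotonic "calibration curve": C14 age as a function of calendar
# age, steep at early dates and shallower later on (purely illustrative).
t = np.linspace(0, 1000, 2001)                        # calendar age grid (yr)
c14_of_t = 1000 - 0.5 * t - 300 * (1 - np.exp(-t / 150.0))
slope = np.abs(np.gradient(c14_of_t, t))              # |d(C14)/d(calendar)|

y_meas, sigma = 680.0, 30.0                           # hypothetical measurement
likelihood = np.exp(-0.5 * ((y_meas - c14_of_t) / sigma) ** 2)

def normalise(weights):
    """Scale non-negative weights on the grid so they sum to one."""
    return weights / weights.sum()

# Uniform prior on calendar age: the posterior is the likelihood alone.
post_uniform = normalise(likelihood)

# Change-of-variables version: multiply by the Jacobian factor at each date,
# i.e. push the density of the C14 measurement through the curve.
post_jacobian = normalise(likelihood * slope)

# The Jacobian factor gives extra weight to dates where the curve is steep,
# so the two posteriors place their probability slightly differently.
print((t * post_uniform).sum(), (t * post_jacobian).sum())  # posterior means
```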
One of the greatest scientists of the past century, Richard Feynman, wrote a short collection of reminiscences, Surely You're Joking. In that book, one of the things Feynman tells us is how he liked to approach analysing proposed theorems: by considering examples, especially simple examples.
Following that approach, to analyze the theory proposed by Nic Lewis, consider this simple example.
For this example, Lewis' method gives prior probabilities, for the three calendar years, of ½, ¼, ¼, respectively.
Thus, the probability distribution of the prior is equivalent to a uniform distribution on the radiocarbon ages. Such a prior distribution would be rejected by radiocarbon scientists, and I would strongly support those scientists doing so.
I think perhaps the mathematically challenged readers aren't going to get much from a post that starts by discussing "uniform priors"; the writer doesn't seem to realise that you've already gone into jargon mode, so any further explanation is pointless.
So it goes ;(
Nullius in Verba
Re Jeffreys prior: "It's a good background to use for that particular measurement, but it has the peculiar property that your prior belief about the true value apparently depends on what sort of measurement you plan to make."
I think this is the usual misunderstanding, compared to the objective Bayesian approach, about the nature of prior distributions in the absence of genuine existing information about the true value of the parameter you are estimating (the issue can also arise even when there is such information - see my arXiv paper linked to from my post at CA).
Where there is no actual prior information then from an objective Bayesian viewpoint the prior distribution has no valid probabilistic interpretation. It is just a mathematical weighting function (as Don Fraser puts it) or a tool (as Bernardo and Smith put it) to derive an objective Bayesian posterior PDF from a data-derived likelihood function.
When you perform a different measurement, the prior changes to reflect what weighting is then necessary to best represent what the results of the measurement imply about the parameter value. It doesn't represent a different prior belief about the true value of the parameter since, in this case, there being no real prior information, there can be no real prior belief as to its value.
" Such a prior distribution would be rejected by radiocarbon scientists, and I would strongly support those scientists doing so."
Doug, I quite agree. The purpose of the objective prior is not to get the best a priori estimate of the value of the parameter in question, the purpose is to reduce as much as possible the influence the prior has on your conclusion, and maximise the influence of the evidence. That's a quite different purpose.
For parameter values (year 9) where the measurement is strongly informative, a high prior weight doesn't matter, because the accumulating evidence will soon overwhelm it. For those where the measurement is less informative (years 10 and 11) a big prior takes longer to be overridden, so it is initially given less weight. This allows the influence of the evidence to show through more quickly.
Objective priors are really designed for the case where you have absolutely no idea what the appropriate parameters or distributions should be. If you *do* have prior information, or reasons to think a particular parameter is the natural or appropriate one to use, then you shouldn't be using an objective prior, you should be using the one you actually do have. But do think carefully about your reasons for believing it.
In the case of radiocarbon dating, a uniform prior on calendar age is clearly inappropriate. We know that older objects are rarer, due to their continual decay, destruction and recycling over the ages. Artefacts from 50 years ago are far more common than those from 50,000 years ago, which in turn are more common than those from 5 million years ago. We usually know from context - what the material is made of, where it was found, how it got there, how it has been modified, and so on - a lot about how old it is likely to be. The prior age of a found object is *not* uniformly distributed on zero to infinity, or even on zero to fourteen billion years. When you think about it, the idea that this is the archaeologist's actual prior belief is obviously nonsensical.
However, different archaeologists obviously have different priors, depending on their opinions, theories, and competing hypotheses about past events, so it isn't obvious what prior a radiocarbon lab - who likely don't even know what their customer's views are, let alone whether they're justified - should use. So I believe they use the uniform prior as a neutral background on which the client archaeologist can superimpose their own personal prejudices. The only reason for using calendar year as the parameter for this is simply that most archaeologists express their own priors about the past in terms of calendar year. There is no mathematical or scientific justification for the choice - it is simply a matter of convenience for combining data. And nobody should ever accept a radiocarbon result as a given, it must always be interpreted in the specific context of the object in question.
A uniform prior is always wrong - but a uniform prior makes it easier to apply the 'correct' prior.
Nic's article is interesting as a demonstration of the technique, and it might even be useful if we ever did come across a found object where we had no context and really did have no clue as to when it was from. If you ever find yourself looking at the radiocarbon result and not correcting from the uniform prior because you have no idea how to do so, then Nic's approach might be a better one. It seems highly unlikely to me, though.
"Where there is no actual prior information then from an objective Bayesian viewpoint the prior distribution has no valid probabilistic interpretation."
Quite so. Which results in the common view that the posterior therefore likewise has no valid probabilistic interpretation.
There is a school of quantum physics that similarly rejects intuitive explanations and justifications, and simply treats the mathematics as an algorithmic procedure, that happens to give the right answers to questions about the outcomes of experiments, but otherwise shouldn't be interpreted as saying anything about how reality actually is. They sometimes call it the "Shut up and Calculate!" school. It has given up any hope of understanding it.
I don't think it's enough to just regard it as a weighting function or a mathematical tool. For one thing, you have to justify that choice rather than any other. For another, by sitting in the place in the equations where the prior belief belongs, the effect is that of actually holding it as your prior belief and updating that belief with Bayes rule. If it sits in the box labelled "Duck", and you act on it in the same way you would a duck, people will naturally regard it as such.
I think a probabilistic interpretation of objective priors is perfectly possible - you just have to recognise that its purpose isn't to estimate the true state of the parameter you're observing, it's to say as little as possible in advance about what you think the experiment is going to tell you. It is in a sense arbitrary and unjustified, but you have to recognise they're all arbitrary and unjustified, that's unavoidable, and this one is at least arbitrary in a useful way.
I have negligible background in the Bayesian paradigm, but even so, I do not see how that can be true. There is a formal mathematical definition of “probability distribution”. The prior fulfills the requirements of that definition. Ergo, the prior is a probability distribution.
Regarding Mermin’s exhortation for researchers in quantum mechanics to “Shut up and calculate!”, I do not interpret that as advocating giving up any hope of understanding, but rather as advocating giving up any hope of understanding today. As an analogy, people used to use Newton’s Laws to prove that everything was deterministic, etc. We now know that Newton’s Laws are an approximation, and attempts to draw major philosophical conclusions from those Laws can be unsound. Similarly, quantum mechanics potentially has fundamental weaknesses, given that it seems to conflict with general relativity; so attempts to interpret quantum mechanics today might be unsound.
@ Nullius in Verba, 12:56 PM
I really appreciate your explanation. All your points seem good to me.
Sadly I don't think that's entirely true. I don't know about Mermin, but Feynman for one was originally hostile and dismissive towards the experimental efforts to test Bell's inequalities, and he was reflecting a common attitude among the instrumentally-minded physicists of the time.
It's probably worth mentioning that Radford Neal, who commented on the original radiocarbon post in this diocese, has a long comment on the CA post, defending subjective Bayesianism. (I don't claim to understand it...)
Having established that we really have two INdependent random variables and two SEPARATE exercises (estimating the UNcalibrated C14 age pdf for the sample - a measurement - and doing the calibration, which is a trivial exercise in probability/measure theory using the estimated pdfs), we can mop up the remaining issues.
We should not use Bayesian estimation for the UNcalibrated C14 age because we know the "real" pdf is normal.
That is an established given in the community.
So the best pdf estimate uses the sample's mean and standard deviation as the two (unbiased) point estimators that a normal pdf requires. Nothing Bayesian is needed.
It would be ridiculous to have a sample C14 age (the mean of the estimated pdf) of, say, 2000 years and then see that "pushed" by a Bayesian posterior to, say, 2100 years because we had some subjective prior information. What subjective prior information would that be for a normal pdf? It can only be two things. Would the prior information be a suggestion for the age before we measure it - inside information? Or would it be a suggestion for the precision of the measurement before our wickedest or most apt colleague starts the experiment? Again: NO information coming from an Oxford software package, tree rings, human history, or the evolution of fossils on Earth is relevant for the measurement. It is normal. It is a Poisson process converging to normality, because it is radiation from THAT sample, with THAT measurement equipment, done THERE and nowhere else. Full stop.
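As a small sketch of what I mean about the measurement itself (with made-up count rates, nothing from any real lab): the decay counts are Poisson with a large mean, so they are already very close to normal, and the sample mean and unbiased sample standard deviation of the replicates are the natural non-Bayesian point estimates.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical replicate decay counts for one sample: Poisson with a large
# mean, which is already very nearly normal.
true_rate = 40_000.0
counts = rng.poisson(true_rate, size=10)

# Non-Bayesian summary: sample mean and unbiased sample standard deviation.
mean_count = counts.mean()
sd_count = counts.std(ddof=1)

# For a Poisson process the standard deviation should be near sqrt(mean).
print(mean_count, sd_count, np.sqrt(mean_count))
```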
"The Feynmann Gedanken Experiment do we have 1/2 or 1/3 for calendar y=9?"
As there is no prior estimation information provided to do any Bayesian estimation on the C14 age, we have Pr=1/2. We can discuss long and hard about how the ideal GE should have looked like but for the one given the Pr=1/2.
"The calibration curve is not uniform slope it goes up and down."
The pdf of the sum (convolution) is uniform or we do not have a measure and a 2 dimensional convolution integral (in plane) is easily done by putting the 2 functions in matlab.
@Douglas Keenan
I have gone through your paper, and I have the following comments:
- I agree 100% with your intuitive description in 4.2, except for formula (5), which should equal formula (4).
- You could add a note that in case Fig. 5 (the idealized calibration curve) is not monotonic, there is a possibility that the (number of years that have age a) have to be searched for over disparate segments along the calibration curve. This does not affect the further reasoning. It affects the algorithm and makes the sigmas in 3.1 complicated.
- Section 3.1 (a formal derivation) is incomprehensible to me. The fault of 4.2 is in there as well, somehow. Also, I think in formula (1) you attempt to do the required convolution integration in the continuous domain, to take into account the calibration curve uncertainty (what I called the second independent random variable before). However, this should then be a double convolution integral. You should either have 3.1 rewritten by a mathematician or just refer to Feller II :) Not that you really need it, because you already know that the "measure" from the sample's pdf, like paint, needs to be spread out evenly over the "calendar years" (this follows from the fact that every sample point in a sample space has to be attributed the same probability, which is basic to probability theory).
- I would call e.g. Fig. 6/7 the "induced probability density function" on the calendar years, for the sample.
It is obvious that your method is better than the others! As you point out, just by sight of the graphs, the others are plainly wrong. Also, your chapters 1 and 2 were very new and interesting to me.
@Nic Lewis, apr 18 9:23
Thanks for your comment.
It is indeed best for now to consider the calibration curve exact, without a probability measure associated with it.
The methods used have much bigger faults than that.
I note you use the (more modern) notation of CDF (cumulative distribution function), which I think I called PDF before.
This was my mistake, sorry.
I started reading your CA post, which seems a good overview of the methods.
I consider them all wrong, however, because they use parameter estimation methods (statistics) where they should use a probability/measure-theory method, as Keenan proposed. All mention of priors, Jeffreys, etc. should be wiped out of the methodology.
As an example, early on in your post you mention that the "red curve" would be a "likelihood function". Well no, it isn't. It's the measurement result for the sample. To be fair, I think it was Gandalf from Canada who first became adamant about this.
I have already sent out an avatar to CA; it should create some havoc if it is allowed to live for a while and if I have the time this week.
James Annan has also blogged about this topic:
http://julesandjames.blogspot.co.uk/2014/04/objective-probability-or-automatic.html
I agree with James that calling Jeffreys prior “objective” tends to be misleading for non-statisticians.
Following on from my prior comment, I intended to post the following at James’ blog, but posting apparently requires an account, which I do not have.
________________________________________________________________________________
@ Richard S J Tol, Apr 22, 1:48 pm
The work of Kullback & Leibler builds on the work of Shannon and forms a substantial part of the foundation for the information-theoretic paradigm in statistics. That paradigm was set out by Burnham & Anderson in their book Model Selection (2002). The book currently has over 20000 citations on Google Scholar, which seems to make it the most-cited statistical research work published during the past quarter century.
Here is an extract from the book (§2.12; emphasis in original).
That conflicts with using Jeffreys prior. Indeed, the information-theoretic paradigm requires leveraging scientific knowledge (to select the candidate statistical models).
We are supposed to be doing science. Statistics, in this context, should be regarded as a tool, somewhat like a microscope or a telescope, to assist in seeing things. If we truly know absolutely nothing about the data, then we should not be analyzing statistically; rather, we should go back and do some science, to learn something. Then, we return and analyze statistically—leveraging our scientific knowledge.
Whether or not you agree with adopting the information-theoretic paradigm, the work of Kullback & Leibler is nowadays being used to underpin that paradigm—and so requires using prior scientific knowledge. The work can be utilized in different ways, but it definitely does not imply that using Jeffreys prior is desirable.
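(For readers who have not met it, the Kullback-Leibler quantity that all of this builds on is simple to compute; a minimal sketch with two made-up discrete distributions, nothing to do with any real data set:)

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

# The "true" frequencies of a six-sided die versus two candidate models.
truth = np.array([0.10, 0.15, 0.20, 0.20, 0.20, 0.15])
model_a = np.full(6, 1 / 6)                              # uniform die
model_b = np.array([0.05, 0.15, 0.25, 0.25, 0.20, 0.10])

# The information-theoretic paradigm favours the candidate that loses less
# information about the truth, i.e. the one with the smaller KL divergence.
print(kl_divergence(truth, model_a), kl_divergence(truth, model_b))
```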
Doug:
No disagreement there. I included Kullback and Leibler because they made the link between what were then two strands of literature on information.
I object to the notion that there is no such thing as an informative prior; to the use of unwittingly informative priors; and to the use of priors to fix problems with the likelihood.
I now think Keenan's method is not using any "priors", and certainly no likelihood function.
It is, however, true that the format of the "calibration curve" implies that all calendar ages are equally possible. But that is a given of the calibration curve, and any method to derive a pdf for the calendar age corresponding to a measurement will have to accept it as a given. To do otherwise is to put yourself in the seat of the calibration curve people - which might be nice fodder for discussion, but is not at issue; separation of concerns. Keenan's method uses the mathematical Bayes formula (and in a somewhat confusing way, imho).
I am not saying any more that the calculations are wrong, though; I have no reason to believe that. The imploded grey cake is rather better than Bronk's.