Comments on James' Empty Blog: "Objective probability or automatic nonsense?" (James Annan)

Tom Fiddaman (2014-05-21 19:21):

Regarding "one thing that Nic's method will say with certainty is "this is not a Kamakura-era artefact"! ... Nic's posterior (green solid curve) is flatlining along the axis over the range 650-900y, meaning zero probability for this whole range. The obvious reason for this is that his prior (dashed line) is also flatlining here, making it essentially impossible for any evidence, no matter how strong, to overturn the prior presumption that the age is not in this range."

I think this is actually an incorrect statement of what the figure shows.

Statements like "x is in this range" are about the CDF, not the pdf. Nic's posterior pdf here says that there's a low probability that an artifact lies in any _individual_ year between 650 and 900. That's a natural consequence of the fact that the chunk of C14 density that maps to that era is spread over a long calendar range, due to the degenerate transfer function. But it's not the same as saying the era is excluded; if you look at the CDF, its 90% or 95% range would certainly include the era.

Nic's posterior matches what you'd get if you generate a random sample from the Normal C14 age distribution and pass it through the calibration curve, which seems like a pretty intuitive procedure to me.
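[Ed.: Tom's sampling procedure can be sketched in a few lines. The calibration curve below is a hypothetical piecewise-linear toy with a plateau over 650-900, not real IntCal data, and the measurement mean and sigma are invented for illustration.]

```python
import numpy as np

# Toy calibration curve (hypothetical, not IntCal): calendar age t ->
# expected C14 age f(t), with a near-flat plateau over 650-900.
cal_years = np.arange(0, 1501)
c14_curve = np.interp(cal_years, [0, 650, 900, 1500],
                      [0.0, 650.0, 655.0, 1255.0])

# Sample from the Normal C14 age distribution (made-up mean and sigma)...
rng = np.random.default_rng(42)
c14_samples = rng.normal(652.5, 2.0, size=100_000)

# ...and pass each sample through the (monotonic) calibration curve.
cal_samples = np.interp(c14_samples, c14_curve, cal_years)

# pdf vs CDF: the density at any individual plateau year is tiny, because
# a narrow band of C14 values is smeared across 250 calendar years, but
# the total mass of the whole 650-900 range stays large.
in_plateau = ((cal_samples >= 650) & (cal_samples <= 900)).mean()
print(f"P(650 <= age <= 900) ~ {in_plateau:.2f}")
```

With these made-up numbers most of the sampled mass lands somewhere in the 650-900 plateau even though no single year gets appreciable density, which is exactly the pdf-versus-CDF distinction Tom draws.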
Of course, this raises the question of why you'd bother constructing an arcane prior in calendar space when it's much easier in C14 space, which he actually discusses in the text.

Even though Nic's argument strikes me as technically correct as far as it goes, the trimodal prior on calendar age is intuitively repugnant. So I went back to the Bronk Ramsey methods paper.

I think Nic has actually missed an absolutely crucial point. In his zeal for an uninformative p(calendar age), he skipped over BR's explanation: "This prior reflects the fact that some 14C measurements (notably those on the plateaus) are more likely than others." In other words, BR is making the auxiliary assumption that there is a constant-rate artifact-generating process.

The distribution of C14 age measurements arising from a constant-rate artifact-generating process is such that the likelihood of observing a given measurement is highest exactly where the Fisher information from the transform (i.e. the slope) is lowest. The two effects cancel, rendering the uniform prior for calendar age sensible. Score one for intuition.

This means that, if you present Nic's method with a random selection of artifacts from the real world, it will perform poorly.

You could still argue about the true or least informative distribution for artifacts - I would guess that there's actually a survival bias that renders it nonuniform, but rather gently.

EliRabett (2014-04-25 16:35):

Horrors, all agree.
Time for another blogger ethics panel.

Pekka Pirilä (2014-04-25 16:31):

Crandles,
Your conclusion is correct. The consequences of that problem are probably the reason that the confidence limits for the calendar dates of the samples listed on page 10 of this presentation (http://www3.nd.edu/~nsl/Lectures/phys178/pdf/chap3_3.pdf) very often have upper or lower limits in the ranges 350-420 BC and 750-830 BC, but in no case in the range 510-750 BC, and in only two cases in the range 420-510 BC.

The method is very weak at resolving dates in the range 400-750 BC; therefore those dates rarely occur as end points of confidence intervals, yet the confidence interval often includes this whole range.

crandles (2014-04-25 15:32):

Perhaps I am reading it wrong, but to me it seems that the calibration curve is saying that carbon-14 dating is a hopeless method if you want to determine whether an item is 700-800 years old or just a bit younger or older, as these give identical results.

So, if you are already confident your item is 600-900 years old and you use such a method, then the prior and posterior should be very similar.

It seems to me there are more outcomes than just "this item is now more (or less) likely to be 700-800 years old".

This example brings out the conclusion that the method is not very good at telling these cases apart.

It seems that using more than one prior remains sensible, to help get at other possibilities, rather than aiming for one invariant prior.

Pekka Pirilä (2014-04-25 07:27):

Eli,
Thanks for the link.
In my latest comment I tried to explain very briefly how the Bayesian approach can be extended more explicitly to the handling of all the types of uncertainty those slides tell about.

You are surely right that not all users of the method are equally competent. Errors are made in the preparatory steps and in interpreting the results.

There's hardly any field of science where statistical methods are always used correctly. Well-prepared standard software is of great help in that. The proposals of Keenan and Lewis would represent a really serious step in the wrong direction, but there might also be much potential for improvement. That's the part about which I cannot say more than that I consider it likely, because errors are so common all around.

EliRabett (2014-04-25 00:26):

Pekka, take a look at this presentation (http://www3.nd.edu/~nsl/Lectures/phys178/pdf/chap3_1_2.pdf), especially down to the bottom. The preparation of samples looks as if it requires a great deal of care and expertise, and in Eli's experience (http://rabett.blogspot.com/2007/04/found-in-margins-recently-eli-has-been.html) that varies all over the place.

Captcha CERVECERIA. Eli wins the week.

Pekka Pirilä (2014-04-24 08:58):

Coming back to the question of Eli, I had some more ideas about an answer.

In the Bayesian approach, what is determined from the experiment are likelihoods.
That's what the experiment tells; pdf's also require a prior.

What the experiment really tells is one single number, an integer: the count recorded by the detector and the counter. As such this is a precise number with no uncertainty. All uncertainties can be handled elsewhere.

Now we wish to calculate the likelihood that exactly this number is observed for each value of the variables we are interested in. In this case we have only a single variable, the calendar age.

Calculating the likelihoods consists of the following steps:

1) Determine the amount of carbon in the sample, both of the age being studied and of possible contamination from other times, and the efficiency of the detector in observing C14 decays, taking into account geometry and all other factors. Present all this information as pdf's (assuming Gaussian distributions is probably justified).

2) Determine from the calibration data (the band with data on its shape) the probabilities of each C14 age given the precise calendar age. Do that also for the ages of possible contamination.

3) For each C14 age, calculate the pdf of counts, taking into account the uncertainties of step one. If the uncertainties are small, the distribution is a Poisson distribution; with corrections it could be a convolution of Gaussian and Poisson. Again do that also for the C14 ages of potential contamination. Take into account also effects like the time between the growth of a tree and the manufacture of the sample being studied, and other comparable factors.

4) Combine the results of steps (2) and (3) to get the probability of the actually observed count. These probabilities form the relative likelihoods of each calendar date.

(The probabilities of all counts add up to one for each calendar age.
Picking from the same set of numbers the probabilities of a single value of the count for every calendar age results in relative likelihoods that do not add up to one, and that should not be summed at all without the addition of a prior.)

Up to this point there should not be much disagreement. We have converted all relevant data to a set of likelihoods. Doing that, we have extracted all the information the measurement can tell about the calendar date.

As a result we have an unnormalized likelihood function that tells, in relative terms, how strongly the measurement favors some calendar ages over others. To give confidence intervals or full pdf's we must add a prior. It makes absolutely no sense to determine the prior based on the empirical setup. How we perform the measurement has no bearing on the probability of various ages of the sample. The prior must be fixed by some other arguments. It could be uniform in calendar time, or it could be inversely proportional to the age, or we might use some additional information pertinent to that specific sample. That's up to the person who interprets the results. The measurement can tell only the relative likelihoods.

In steps (1), (2), and (3), pdf's of contributing factors are used. They are real probabilities that describe some effectively random contributions to the expected count for a given calendar age.

Anonymous (2014-04-23 23:23):

Rabbit, the source of the width is explained in Keenan (2012), referring to Stuiver and Polach (1977).

(A Poisson radiation process leading to log-Gaussian, approximated to Gaussian. The measurement protocol can probably be further fine-tuned for the age of the samples, etc.
Also the width can be further reduced to practically zero if you have the money to pay for that.)

Pekka Pirilä (2014-04-23 19:58):

Eli,
What you write sounds reasonable, but for me to say something more specific would require digging deep into the procedures.

EliRabett (2014-04-23 18:45):

Pekka,
The width of the pink Gaussian on the side of the figures seems, at least to Eli, to be a by-gosh-and-golly, seat-of-the-pants estimate, handed down by a guru, of all the different sources of variation in the counting experiment, and not just for a particular sample.

Finding a real error budget has proven difficult, and it appears that most people just use what pops out of a program that all the C14 folk use. There is no doubt that that curve is totally frequentist, and as such (you were the first to mention this?) the place where Bayesians could help is in finding a better way to estimate that width.

Pekka Pirilä (2014-04-23 16:14):

Eli,
(You are often too concise and cryptic for me. That may be, in part, due to cultural differences.)

That's a part of the problem that I have just accepted as a correct enough presentation of all the uncertainties related to the sample and the actual data collection.

By "correct enough" I mean that the correct distribution is roughly that wide in comparison to the width of the bluish band and its wiggles.

Little of what I have written depends even on that, but there might be something that does.

EliRabett (2014-04-23 14:57):

Pekka, Eli's question is: what is the source of the width of the pink "Gaussian"?

Pekka Pirilä (2014-04-23 14:28):

Eli,
What I mean can be explained by this figure (http://climateaudit.files.wordpress.com/2014/04/nicl_radioc1_keenan2012_fig1oxcal.png) from Nic's post, and further from Keenan (2012) according to Nic.

In that case the red distribution tells about the uncertainty in the determination of the C14 value, and the width of the narrow bluish band about the uncertainty in going from a given C14 value to the real date. (The wiggles of the band add another aspect of uncertainty.)

EliRabett (2014-04-23 12:11):

Pekka, what do you mean by uncertainty in the C14 value? There are several steps from specimen to C14 content, and then more in going from C14 content to calendar age.

Pekka Pirilä (2014-04-23 09:52):

As far as I have understood, the uncertainty in the calibration curve has been analyzed and taken into account in the software used for interpreting C14 values.
That uncertainty seems, however, to be very small in comparison with the uncertainty in the determination of the C14 value, and therefore a very minor factor in these considerations.

The main original source of the uncertainty is in the determination of the C14 value, but that uncertainty may be greatly amplified by the unfortunate form of the calibration curve (the real one, not only the artificial one shown in this post).

Anonymous (2014-04-23 08:16):

This example clarifies very well just what is wrong with unthinking application of the Jeffreys prior. In what way does the true age of the objects being analyzed depend on the calibration curve? It doesn't. There is no good reason to suppose that objects made 2000-1700, about 1000, and 500-0 years ago are more likely to be subjected to analysis than objects produced during the intervening periods, so why should the prior favour those intervals? It shouldn't. The uniform prior seems much more in accord with what we actually don't know.

While a Jeffreys prior may have mathematical rigour, that doesn't mean we should ignore common sense, or skip checking the consequences of the mathematically rigorous prior.

I suspect part of the problem would go away if the uncertainty in estimating the calibration curve were taken into account.

Anonymous (2014-04-23 08:05):

Richard S.
Tol wrote: "There is nothing intuitive about statistics and probability."

I disagree. The Bayesian conception of probability is straightforward and intuitive (though the analysis is often mathematically taxing); the problem with "statistics" is that it is generally performed in a frequentist setting, where the analysis is easy but the definition of a probability is deeply unintuitive. Ideally both sets of tools should be in our toolbox, though.

"Indeed, the hard part of teaching statistics is to erase sloppy intuition and replace it with rigorous mathematics."

No; if you do that, you only mask the sloppy intuition with a veneer of mathematics, and the sloppy thinking is still there. To apply probability and statistics correctly you need your intuition to be correct as well. Much better to fix the sloppy intuition and then reinforce it with mathematical rigour.

"So, indeed "uninformative" does not refer to the common English usage of the word "information". Instead, "uninformative" means "proportional to the square root of the determinant of the Fisher information", where Fisher info is the accepted, rigorous definition of information."

This is the sort of thing that results from being overly concerned with mathematical rigour rather than common sense (i.e. intuition). Why describe Jeffreys' prior as "uninformative" (which it isn't) rather than "invariant" (which it is, in a fairly general sense)?
What is gained by this?

If you want to make probability and statistics intuitive, then needlessly using terminology in a way that is not in accordance with everyday usage seems like a bit of an own goal.

Don't misunderstand me: I like mathematical rigour, but I also like engineering common sense, and the two are not mutually exclusive.

Anonymous (2014-04-22 22:54):

Those contrived statistical machinations are of minor concern compared to the (unreported) measurement uncertainty in the calibration curve.

That's the largest source of error, and it's not even reported. Sloppy.

Pekka Pirilä (2014-04-22 22:06):

Eli,
A lot of information is used to create the calibration curve, i.e., to determine the relationship between measured C14 and the true age. When that relationship has been determined, the likelihoods of the different true ages that correspond to a particular measured C14 value can be determined. Up to that point of the analysis no priors are used.

Priors enter if and when the likelihood values are used to determine posterior probabilities (a pdf or confidence intervals). Thus it's possible to perform the whole empirical work and present its results in full without any reference to any prior. The limitation is that the results are not probabilities but (relative) likelihoods. The likelihood curve need not integrate to one; actually it's not appropriate to calculate any integrals of it at all, because that would lead to confusion with probabilities.

EliRabett (2014-04-22 21:47):

Pekka, the stylized curve that Lewis uses hides a lot. It forces a strong peak in the prior where the wiggle is in the center. It would do so wherever there is any wiggle, and indeed, because there are lots of wiggles in the real calibration data, you have to use metadata to get sensible answers and build sensible priors.

http://www.geo.arizona.edu/palynology/geos462/10radiometric.html

Pekka Pirilä (2014-04-22 20:50):

That part of my comment was purely semantic, i.e.
about understanding the meaning of the expression.

I don't think we have any disagreement on the content (on this point at least); I was just proposing a different way of using words. It may well be that my proposal was not a good one.

richardtol (2014-04-22 20:40):

Pekka: The likelihood on the carbon date is fine. The likelihood on the calendar date is degenerate. The latter matters.

A degenerate likelihood implies a degenerate prior -- as it should -- and together they make a degenerate posterior -- as it should.

Fix the transfer function and all is fine.

Leave the prior out of it.

Pekka Pirilä (2014-04-22 18:55):

Isn't the problem rather a degenerate transfer function than a degenerate likelihood?

The likelihood derived from the empirical results is well defined. It's constant over wide ranges, i.e. it cannot differentiate at all between values in those ranges, but that doesn't lead to any technical difficulties or to any problems in interpreting the likelihoods directly.

All problems are related to the prior, and they are serious specifically when an attempt is made to use the Jeffreys prior. That leads to an obviously nonsensical pdf.

Due to the flat likelihood function, any choice of prior has a strong influence.
That may be considered a problem, but it's a problem of priors.

Another way of describing the situation is that radiocarbon dating is useless as a tool for telling what the true value is within the range 500-950 years or within the range 1050-1700 years, and of limited value also in distinguishing between these two ranges. This idealized radiocarbon dating is useful only for dates more recent than 400 years or in the range 1750-2000 years.

This is a weakness of the method. It's not uncommon that a method has good resolving power only in a limited range of the quantity to be determined.

richardtol (2014-04-22 18:27):

While the discussion here and elsewhere focuses on the choice of prior, the problem at hand is a degenerate likelihood.

Using a prior to fix a likelihood is a bad idea.

Pekka Pirilä (2014-04-22 17:19):

I don't dare to give an opinion on whether any prior can be described as the square root of the determinant of the Fisher information, but many and very different ones certainly can.

The theory of Haar measures might give the answer to that question, but I know very little about Haar measures.

To me it's more significant that Jeffreys' prior may lead to nonsensical results.
Being maximally noninformative with respect to some measure is not a useful property if the measure is pathological, and I would call a measure that corresponds to the C14 values pathological when the issue being studied is the determination of the true age of a sample.
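[Ed.: the pathology discussed throughout the thread can be illustrated numerically. For a Gaussian measurement model y ~ N(f(t), sigma^2) with calibration curve f, the Fisher information in the calendar age t is f'(t)^2 / sigma^2, so the Jeffreys prior is proportional to |f'(t)| and collapses on any plateau. A minimal sketch with a hypothetical piecewise-linear curve, not real calibration data:]

```python
import numpy as np

# Hypothetical calibration curve: calendar age t -> expected C14 age f(t),
# near-flat over 650-900 (an invented plateau, not IntCal data).
t = np.arange(0, 1501)
f = np.interp(t, [0, 650, 900, 1500], [0.0, 650.0, 655.0, 1255.0])

# For y ~ N(f(t), sigma^2), the Fisher information is I(t) = f'(t)^2 / sigma^2,
# so the Jeffreys prior is proportional to |f'(t)| (sigma cancels when
# normalizing).
slope = np.abs(np.gradient(f, t))
jeffreys = slope / slope.sum()   # normalized over the grid

# Almost no prior mass falls on the plateau, so no evidence, however strong,
# can put appreciable posterior mass there.
plateau_mass = jeffreys[(t > 655) & (t < 895)].sum()
print(f"Jeffreys prior mass on the plateau: {plateau_mass:.4f}")
```

This is the trimodal-prior effect in miniature: the prior spikes wherever the curve is steep and flatlines on the plateau, regardless of how plausible plateau-era dates actually are.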