Highly multivariate statistical problems may lead to slow inference procedures. One such problem is palaeoclimate reconstruction from fossil pollen data, which is an inverse problem. Existing explorations of this challenging area of study often involve a trade-off between model complexity and speed of inference. Fast approximate Bayesian inference methods offer a solution. In addition, an extension of the methodology allows model validation to be performed quickly for the inverse problem. Conversely, the large scale of the palaeoclimate project offers a real challenge to the emerging approximate inference engine.
The Royal Statistical Society read paper “Bayesian Palaeoclimate Reconstruction” (Haslett et al., 2006) presented work on high-resolution, pollen-based reconstruction of the palaeoclimate at a single location since the last ice age. This paper outlined the basic concepts involved in performing fully Bayesian inference on unknown climates given modern and fossil pollen data and modern climatic data. The work was a detailed “proof of concept”; extensions and improvements to the statistical methodology were considered, both in the paper and in the subsequent printed discussion.
The crux of the methodology in that work was acknowledged to be computational; indeed, the computational burden forced compromises in the modelling. The work presented herein extends the statistical methodology and advances the computations involved, as developed by the author. These contributions are outlined in Section 1.4.
The Bayesian palaeoclimate reconstruction project is an ongoing initiative to build upon existing classical approaches to the reconstruction of prehistoric climates, using fossil pollen data. Specifically, the project seeks to handle all uncertainties quantitatively and coherently in a fully Bayesian framework and to combine different types of information to reduce these uncertainties.
The primary dataset for the palaeoclimate reconstruction project is the RS10 dataset of Allen et al. (2000). A collection of modern pollen surface sample counts Y^m