Highly multivariate data are often challenging to model due to the “curse of dimensionality” (Bellman, 1957). Much of the work in this thesis is concerned with reducing the computational burden of modelling the response of vegetation to climate using the modern training data (see Section 1.1.1).
This is achieved through separate analysis of the marginal responses of individual plant taxa to climate. The approach necessarily ignores between-taxa dependencies but reduces the overall computation. The separate, marginal analyses can then be brought together post hoc. The result is referred to as the inference-via-marginals posterior and is an approximation to the joint posterior of the full model. Situations where the approximation is poor and where it is excellent, or even exact, are identified.
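The trade-off described above can be illustrated with a toy Gaussian analogue. The sketch below is not the thesis model: it assumes a small joint Gaussian over two hypothetical "taxa" with made-up covariance values, and measures how far the joint distribution is from the product of its per-taxon marginals using the Kullback-Leibler divergence. When there is no between-taxa dependence the product of marginals is exact; when there is, the divergence is positive.

```python
import numpy as np


def kl_joint_to_marginal_product(cov, block_sizes):
    """KL divergence from a joint Gaussian to the product of its block marginals.

    The means cancel, so the result depends only on the covariance:
    0.5 * (sum of block log-determinants - joint log-determinant).
    """
    _, logdet_joint = np.linalg.slogdet(cov)
    logdet_blocks = 0.0
    start = 0
    for size in block_sizes:
        block = cov[start:start + size, start:start + size]
        logdet_blocks += np.linalg.slogdet(block)[1]
        start += size
    return 0.5 * (logdet_blocks - logdet_joint)


# Two hypothetical taxa, each with two latent values (illustrative numbers only).
within = np.array([[1.0, 0.5], [0.5, 1.0]])
zeros = np.zeros((2, 2))
cross = 0.3 * np.eye(2)

independent = np.block([[within, zeros], [zeros, within]])  # no between-taxa dependence
dependent = np.block([[within, cross], [cross, within]])    # some between-taxa dependence

kl_independent = kl_joint_to_marginal_product(independent, [2, 2])
kl_dependent = kl_joint_to_marginal_product(dependent, [2, 2])
```

Here `kl_independent` is zero up to rounding, while `kl_dependent` is strictly positive, which mirrors the distinction between cases where a marginal approximation is exact and cases where it loses information.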
Details related to working on a discrete grid across the location space are also investigated. This chapter serves as a review and assessment of the methodology of the preceding chapters. The methods are brought together in preparation for the application to the pollen dataset in Chapter 6.
In particular, the concern here is with inference on multiple latent spatial Gaussian processes defined on a lattice. The INLA methodology introduced in Section 4.1 is unsuitable for this task as it can only deal with one such spatial process at a time (see Section 5.1.4). If the model does not disjoint-decompose exactly (Section 3.2), then approximations to the model that do decompose must be sought. The accuracy of these approximations must then be tested, as per Section 5.3.
The goal is inference on latent random variables X, given count data Y defined at discrete locations C. X is composed of N_T processes which are Gaussian (or approximately so) given the data. Thus at each value of C, there are N_T counts or proportions arising from N_T potentially dependent X values. As the posterior is a GMRF due to the methods introduced in Section 4.1, it is expressed via the posterior mean vector μ and precision matrix Q. That is, the posterior distribution of X is given approximately by:
$$
\pi(X \mid Y) \approx \mathcal{N}\left(\mu,\; Q^{-1}\right) \tag{5.1}
$$
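As a minimal sketch of what working with a Gaussian specified by a mean vector μ and a precision matrix Q involves, the following evaluates the log density and draws samples directly from Q via its Cholesky factor, without forming the covariance. The function names, the five-node chain and the tridiagonal precision values are assumptions made for illustration; they are not taken from the thesis.

```python
import numpy as np


def gmrf_log_density(x, mu, Q):
    """Log density of N(mu, Q^{-1}) evaluated at x, using the precision matrix."""
    d = x - mu
    _, logdet_Q = np.linalg.slogdet(Q)
    return 0.5 * logdet_Q - 0.5 * mu.size * np.log(2.0 * np.pi) - 0.5 * d @ Q @ d


def sample_gmrf(mu, Q, n_samples, rng):
    """Draw samples from N(mu, Q^{-1}); one sample per column of the result."""
    L = np.linalg.cholesky(Q)                  # Q = L @ L.T
    z = rng.standard_normal((mu.size, n_samples))
    x = np.linalg.solve(L.T, z)                # covariance of x is Q^{-1}
    return mu[:, None] + x


# Hypothetical five-node chain: a tridiagonal (sparse, Markov) precision matrix.
n = 5
Q = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
mu = np.linspace(0.0, 1.0, n)

rng = np.random.default_rng(0)
samples = sample_gmrf(mu, Q, 200_000, rng)
empirical_cov = np.cov(samples)                # should be close to inv(Q)
```

The zeros in Q encode conditional independence between non-neighbouring nodes, which is why the precision matrix, rather than the (dense) covariance, is the natural parameterisation for a GMRF.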
The goal is therefore inference on the μ and Q terms. By disjoint-decomposition into marginals defined as the product of independent multivariate Xi,i