Climate of the Past Bayesian analysis of rapid climate change during the last glacial using Greenland δ 18 O data

We present statistical methods to determine climate regimes for the last glacial period using three temperature proxy records from Greenland: measurements of δ18O from the Greenland Ice Sheet Project 2 (GISP2), the Greenland Ice Core Project (GRIP) and the North Greenland Ice Core Project (NGRIP) using different timescales. A Markov Chain Monte Carlo method is presented to infer the number of states in a latent variable model along with their associated parameters. By using Bayesian model comparison methods we find that a model with 3 states is sufficient. These states correspond to a gradual cooling during the Greenland Interstadials, more rapid temperature decrease into Greenland Stadial and to the sudden rebound temperature increase at the onset of Greenland Interstadials. We investigate the recurrence properties of the onset of Greenland Interstadials and find no evidence to reject the null hypothesis of randomly timed events.


Introduction
Measurements of δ 18 O from Greenland ice cores for the last glacial are a good proxy for regional temperature.These time series indicate large temperature changes occurring on time scales of centuries or less.They are characterised by a rapid warming of up to 16 • C, followed by a slow cooling period.They are known as Dansgaard-Oeschger (DO) events and were numbered and dated by Dansgaard et al. (1993) using a time scale derived from a simple ice flow model and constrained by well established events at 11.5 kyr and 110 kyr BP (before present).The DO events correspond to the onset of a Correspondence to: D. Peavoy (d.peavoy@warwick.ac.uk)Greenland Interstadial (GIS), the warm phase of the glacial period.An alternation between GIS and the colder Greenland Stadial (GS) will be referred to as a DO cycle.
There is evidence that these dramatic climate shifts at high latitudes in the Northern Hemisphere had global scale effects.For example, Blunier et al. (1998) observed an out of phase signal in Antarctic ice cores, whereas Wang et al. (2001) note general agreement of temperature changes in GISP2 and changes in the hydrological cycle from stalagmite records from the Hulu cave in China.A new common time scale of Greenland/Antarctica temperature back to Marine Isotope Stage 5 will allow more accurate tests of modelling ability to simulate the global impacts of these events (Capron et al., 2010).Bond et al. (1993) used the abundance of planktic foraminifera in sediment records to show that sea surface temperatures in the North Atlantic (NA) are correlated with DO cycles.They also note that there are longer periods of cooling, which each include several DO cycles.At the termination of these 10-15 kyr cycles there are signatures of large scale ice rafting events that correlate with a GS.Known as Heinrich events, they are associated with collapse of the Laurentide ice sheet, thereby inundating the NA with freshwater, preventing the formation of North Atlantic Deep Water (NADW) and reducing the thermohaline circulation.The subsequent depletion of icebergs in the NA is thought to provoke a rapid reorganisation of the climate system and a temperature increase into a prominent GIS followed by gradual cooling punctuated by a series of DO cycles.This does not explain the mechanism driving the shorter DO cycles, although they are considered to be linked with similar ice sheet calving affecting the salinity of the NA and the formation of NADW.
Published by Copernicus Publications on behalf of the European Geosciences Union.
The apparent regular occurrence of DO events has lead some authors to suggest an external forcing or internal oscillation of the climate system.In particular the Fourier spectrum of the temperature record using the GISP2 time line has a significant peak at 1470 years (Grootes and Stuiver, 1997).However, the δ 18 O record from the Greenland Ice Core Project GRIP, completed the same year as GISP2, whilst also exhibiting DO cycles does not have this spectral peak.Neither does that from the North Greenland Ice Core Project (NGRIP) using the GICC05 time scale.Alley et al. (2001) suggest a simple model for the dynamics that includes weak periodic forcing and a stochastic resonance mechanism.Stocker and Johnsen (2003) link this to a bipolar thermal "see-saw", explaining the observed response in Antarctica and Braun et al. (2005) propose a resonance between two faster solar modes of variability to give a resulting weak signal at 1470 years.
It has been argued that Fourier methods are not suitable to detect recurrence patterns for such non-linear discrete events that are likely generated by a thresholding process.Rahmstorf (2003) measures the deviation of the events from a perfect 1470 years signal and finds that their timing is well within the assumed dating error, although there are several 1470 year periods where no event occurs at all.Ditlevsen et al. (2007) assess the significance of the observed periodicity by defining null models and computing the Rayleigh measure and standard deviation of residuals for ensembles of surrogate data.They find no evidence to reject the null model of randomly timed events.The exception is when using the GISP2 ice core record and omitting DO event 9 as numbered by Dansgaard et al. (1993).However, this does not change the result when using the more recent NGRIP ice core record.
The event detection algorithm of Rahmstorf (2003), which uses a criteria of 2‰ δ 18 O increase over 200 years, does not include DO 9, potentially influencing the conclusion.Ditlevsen et al. (2005) defines a DO event as first up crossings of an upper level following a lower level, identifying several more events and concluding that they are randomly paced.Given the sensitivity of results regarding the existence of periodicity to the definition of the events it is desirable to have a method that objectively detects their occurrence.Kwasniok and Lohmann (2009) show that a stochastic double well model could be fitted using the unscented Kalman filter to capture stadial-interstadial transitions, while Ditlevsen (1999) argues that non-Gaussian "heavy tailed" noise is suitable to capture extreme events within Greenland paleoclimate data.Livina et al. (2010) used GRIP and NGRIP δ 18 O data to study the number of states in the climate for the last 60 kyrs using a polynomial fitting algorithm to windows of the data.They detect the two states corresponding to the stadial and interstadials of the last glacial and find that these merge to a single state around 25 kyrs BP.Although these methods allow for asymmetry between climate states they do not include the characteristic "saw tooth" features of DO temperature cycles.The data sets we study are shown in Fig. 1.The NGRIP core is dated using GICC05 (Andersen et al., 2006), GRIP chronology from the ss09sea time scale (Johnsen et al., 2001) and GISP2 according to the Meese/Sowers time scale (Meese et al., 1997).We consider the period 82-11.0kyr BP for GRIP and GISP2, and 60-11.0 because of limited resolution at older dates.
The aim of this paper is to study models that are able to capture the features of the cycles (sudden temperature changes and small fluctuations) while also including the associated uncertainty in the identification of climate states.This uncertainty can then be included in subsequent analysis.We consider there to be hidden states corresponding to different climate regimes and identify these in a probabilistic sense.We generate an ensemble of data sets derived from NGRIP, GRIP or GISP2 using Gaussian Process regression.This incorporates variability into the data and reflects our uncertainty about the dating and measuring process while retaining the characteristic features of the DO cycles.It will also provide a means of assessing how robust our inference method is.
We apply a Bayesian statistical analysis to each ensemble member, which makes no assumptions about the parameters (e.g.mean and variance of the temperature change in a given state).Instead these are computed from the posterior distribution using Bayes' rule.We discuss Gaussian Process Regression in Sect. 2 and the modelling strategy in Sect.3. In Sect. 4 we analyse the results from the posterior simulation and find that models with three climate regimes fit the data well.In Sect. 4 we analyse the timing of the model state corresponding to the DO events.

Gaussian process regression
We consider discrete time models and so require data at regular intervals.An interpolation method such as cubic splines will smooth the data removing any jumps or discontinuities.Since it is the jumps we are interested in we choose to use Gaussian process regression (GPR), which can be tuned to capture the "roughness" of the data.GPR is a linear least squares estimation algorithm for interpolation where the sample path is considered to be a random process.It also provides uncertainty estimates of the interpolated sample path.
The method relies upon the specification of a covariance function between data points.We use the Matern class with smoothness parameter set equal to 1 (see Rasmussen and Williams, 2006).Then points r apart have covariance K(r) = exp(−r/ l), where l is the length scale.This is then equivalent to a first order autoregressive (AR(1)) covariance, generating continuous but non-differentiable time series.The length scale, process noise and observation noise are set using the maximum likelihood software of Rasmussen and Williams (2006).
The method generates a distribution over processes as shown in Fig. 2 for GISP2.It is clear that the uncertainty increases as we go back in time but decreases around the observed values.The "saw tooth" feature of the DO cycles is preserved.
We also apply the method to all three data sets and obtain an ensemble of 100 members from the Gaussian process distribution for each ice core, each with a time step of 50 years.

Latent state models
We assume that there are M unobserved (latent) states governing the temperature.Rather than consider states corresponding to absolute temperature, for example, a high temperature interstadial and a cooler stadial, we model the temperature increments of the series.For example, there may be one state for the sudden temperature increase at the onset of a Greenland Interstadial (GIS) and another for the slow cooling during the GIS.In this way we aim to capture the asymmetry between warming and cooling phases of the DO cycles.Considering only the increments implies that the method will not be affected by longer time scale variations in the data such as the Milankovitch cycles.
The parameters of the model include the mean temperature change per unit time and the variance within each state.These are not fixed but are inferred directly from the data using Bayes' rule.We analyse models with up to five hidden states.The occurrence of each state and all of the associated parameters are estimated in a Bayesian framework from the posterior distribution.We also determine the most suitable model for the data by estimating their marginal likelihoods.This involves approximating the integral over the whole parameter space to obtain an estimate of the probability of the model given the data.In this way, models with too many parameters, that over fit the data, are penalised.
We assume that the time series can be modelled as a random walk, dependent upon the state of the climate.This is intended to capture the differing distributions of the increments that occur during different regimes.Note that one could use a more general first order process, however we found that this did not significantly affect the identification of climate states.
Under the random walk model the data X i at time i evolves according to where i ∼ N (0,1) is the standard normal distribution, µ S i is the forcing and σ S i is the standard deviation in state S i .For N data points, the likelihood function of the data given the unobserved state sequence is To capture our uncertainty about the time spent in each state an extra layer is introduced into the hierarchy that conditions S on an unknown parameter λ j , j ∈ {0...M}.This is the probability that S i is in state j , implying a multinomial distribution for S i .
In Bayesian modelling one considers the unknown parameters as random variables and aims to compute their probability distribution conditional upon the observed data.For generic parameter θ and data X the posterior distribution is proportional to the product of likelihood function and prior distribution P (θ|X) ∝ P (X|θ )P (θ ).The prior distribution can incorporate any previous knowledge about θ or it can be designed to be as "uninformative" as possible.
In our problem λ j , µ j and σ j for j ∈ {0...M} are random variables which need a prior distribution.We use conjugate priors, which means that the posterior distributions for a given parameter will be of the same family as the priors.The full posterior will be intractable but using conjugate priors implies that one can sample each parameter from their conditional posterior separately.One then cycles through the conditional posterior distributions to obtain a sample from the full joint distribution.This is a Gibbs sampling algorithm www.clim-past.net/6/787/2010/Clim.Past, 6, 787-794, 2010 and will generate dependent samples from the full posterior distribution after discarding some initial convergence period (for an introduction to Bayesian inference using Markov Chain Monte Carlo (MCMC) methods see Gilks et al., 1995).
The conjugate prior for λ j is given by a Dirichlet distribution with hyperparameter α.This respects the constraint j λ j = 1.For σ j one defines τ j = 1/σ 2 j and assigns a Gamma prior τ j ∼ (a,b), whereas µ j is a priori normally distributed µ j ∼ N (0,σ 2 µ ).The hyperparameters a, b and σ µ are fixed.The posterior distributions will then be Dirichlet, Normal and Gamma respectively.Their particular form is given in the appendix.
To estimate the number of states we compute the probability of the model given the data, i.e the marginal likelihood.For model M, the marginal likelihood is To estimate this integral/sum one could simply sample from the prior and compute an average of the likelihood function for these values.However, this is likely to be a poor estimator due to the discrepancy between prior and posterior distributions (Kass and Raftery, 1995).Another option would be to use the Harmonic Mean estimator.This is calculated using samples from the posterior so is more accurate although it does not obey a Gaussian central limit theorem and so exhibits instabilities due to the occasional sample with very low likelihood value (Kass and Raftery, 1995).
We choose instead to use the Laplace-Metropolis estimator of Raftery (1996).This is an adaptation of the classic Laplace method, which is known to provide more efficient estimates than those based on posterior simulation.However, it is not always applicable since it requires the availability of the posterior mode and covariance matrix.Instead one can estimate these quantities from the posterior simulation output.
As the hidden states are conditionally independent given θ the above sum can be performed analytically and each integral approximated separately | | 1/2 P (X| μ, τ ,S)P (S| λ)P ( μ, τ , λ|M), (4) where μ, τ and λ is the posterior mode and is the covariance matrix computed from the MCMC output.

Estimation of climate states
In all cases we use uninformative priors with large variance.For example µ ∼ N (0,100).Repeated simulations reveal that the results are not sensitive to the prior.To determine the model most supported by the data we estimate marginal likelihoods using the method described above.The logarithm of the marginal likelihoods is computed from each of the 100 ensemble members of the Gaussian Process regression using There is a wider spread in results for GISP2 than GRIP or NGRIP due to the broad distribution of sampled series shown in Fig. 2. The values for NGRIP are much higher due to the smaller data set.If we consider the Bayes Factors BF ij for model i versus j for the medians alone we find that for GRIP and GISP2 the three state model is supported.For GRIP we have BF 32 = 4.1 × 10 4 regarded as strong by the scale of Kass and Raftery (1995), whereas for GISP2 we have BF 32 = 5.2, regarded as substantial evidence.For NGRIP we have BF 43 = 22, favouring four states.In the further analysis we work with the three state model as no significant extra modelling ability is given by including more than three states for two out of the three data sets.
Using the three state model we find that one state j = 1, corresponds very closely to the DO events.The state j = 0 occurs during stadial and interstadial periods and corresponds to small fluctuations.There is also some agreement between the temperature decrease state j = 2, and the termination of GIS.However, this does not occur for all DO cycles so we focus our attention upon the state corresponding to DO events.Figure 4 shows the posterior probability of a DO event for the NGRIP data, i.e for each time i the probability that S i = 1.The posterior distribution of hidden states is similar for the other data sets, although more events are detected in the GISP2 data (not shown).The events numbered by Dansgaard et al. (1993), all have high posterior probability, implying that the model is able to distinguish these events.There are, however several other regions of high probability, for example at 26.5 kyr BP.Although not typically regarded as DO events in the literature, these jumps in temperature share similar statistics and may be important in the analysis of recurrence patterns.Many of the high probability regions cluster together and would previously have been classed as a single event.Here, we only regard two events as one if they occur in immediate succession.In this way we make little restriction on the minimum time period between events.
For each data set we retained the hidden state sequence with the largest posterior probability and calculated its distribution of waiting times.These are shown in Fig. 7 along with their best fit exponential distribution.The 95% confidence interval is computed according to the bounds derived in Massart (1990).The theoretical distribution is within this bound for all three data sets implying that we can not reject the model of a Poisson process with no periodicity.
The posterior distributions for the parameter of interest µ are shown in Fig. 6.Note that this parameter has units of δ 18 O‰ and can be considered to be proportional to the local temperature change within a 50 year period for each climate state.There is a clear distinction in this forcing between different climate states.µ 0 , corresponding to the relaxation state, has a mean of −0.05 − 0.0 for the three data sets.The termination state has mean −2.2 to −1.7.There is good agreement for the DO state between the three data sets with a mean of approximately 2.0.This corresponds to a rate of change of 0.04‰ per year.This is larger than the 0.01‰ per year of Rahmstorf (2003), although our model allows noise in the state which would allow a DO event to have a slower temperature change.For each data set we retained the hidden st and calculated its distribution of waiting tim best fit exponential distribution.The 95% bounds derived in Massart (1990).The theor data sets implying that we can not reject the The posterior distributions for the paramete parameter has units of δ 18 O‰ and can be con change within a 50 year period for each clim between different climate states.µ 0 , correspo -0.0 for the three data sets.The terminat agreement for the DO state between the three corresponds to a rate of change of 0.04‰ p Rahmstorf (2003), although our model allow to have a slower temperature change.(c) G Fig. 5: Empirical cumulative waiting times (red) with the best fit exponential di and the 95% confidence interval.Also given are the mean waiting times in yea on the minimum time period between events.
For each data set we retained the hidden state sequence with the largest pos and calculated its distribution of waiting times.These are shown in Figure 7 best fit exponential distribution.The 95% confidence interval is computed bounds derived in Massart (1990).The theoretical distribution is within this b data sets implying that we can not reject the model of a Poisson process with n The posterior distributions for the parameter of interest µ are shown in Figure parameter has units of δ 18 O‰ and can be considered to be proportional to the change within a 50 year period for each climate state.There is a clear distincti between different climate states.µ 0 , corresponding to the relaxation state, has -0.0 for the three data sets.The termination state has mean −2.2 -−1.7 agreement for the DO state between the three data sets with a mean of approx corresponds to a rate of change of 0.04‰ per year.This is larger than the 0. Rahmstorf ( 2003), although our model allows noise in the state which would a to have a slower temperature change.(c) GISP2, 2900 mpirical cumulative waiting times (red) with the best fit exponential distribution (black) 95% confidence interval.Also given are the mean waiting times in years.inimum time period between events.ach data set we retained the hidden state sequence with the largest posterior probability ulated its distribution of waiting times.These are shown in Figure 7 along with their exponential distribution.The 95% confidence interval is computed according to the derived in Massart (1990).The theoretical distribution is within this bound for all three s implying that we can not reject the model of a Poisson process with no periodicity.osterior distributions for the parameter of interest µ are shown in Figure 6.Note that this ter has units of δ 18 O‰ and can be considered to be proportional to the local temperature within a 50 year period for each climate state.There is a clear distinction in this forcing Fig. 5: Empirical cumulative waiting times (red) with the best fit exponential distribution (black) and the 95% confidence interval.Also given are the mean waiting times in years.

Analysis of periodicity
As in Ditlevsen et al. (2007) we use the Rayleigh measure to quantify the amount of periodicity for discrete events.This is a measure for the average phase coherence for N events.For period T it is given by R(T ) = There is significant agreement between NGRIP and GRIP for all three states.
null ensemble of the original data.The null ensemble is generated by sampling the data without replacement so that it is randomly ordered.Then hidden state sequences are generated by sampling from the posterior distribution conditioned upon the randomly ordered data and the other parameters (inferred for the original data).This means that the climate states in the ensemble will be equal in distribution to those in the actual data but their occurrence will be randomised.
Figure 7 shows the Rayleigh measure computed for the state corresponding to the DO events over a range of periods T with the mean and 2σ values for the null ensemble.Using this method no significant peridiocities occur for any of the data sets agreeing with the result of Fig. 7 for randomly timed events.

Conclusions
We propose a statistical modelling strategy able to identify different climate regimes corresponding to differing rates of temperature change.This is a more objective method than setting threshold criteria based on the appearance of the data.The method allows for uncertainty in the detection of events, avoiding the sensitivity of results due to absolute inclusion or exclusion of a single event according to an authors preferred interpretation of the data.
To obtain a regular 50 year sampling, and to check the robustness of the procedure, we apply the method to an ensemble of time series derived from a Gaussian Process regression, separately for three Greenland δ 18 O series: NGRIP, GRIP and GISP2 on different time scales.Using a Bayesian model comparison method we find that models with three states offer a good fit to all three data sets.
The inferred model states have distinct posterior distributions for their rates of temperature change with good agreement for all data sets.We found that one of the states closely corresponds to the onset of Greenland Interstadials and accounts for all of the canonical Dansgaard-Oeschger events of Dansgaard et al. (1993).This state had a mean rate of temperature change of 0.04‰ per year, which is greater than the 0.01‰ per year threshold criteria of Rahmstorf (2003).Contrary to Rahmstorf (2003), DO event 9 at 40 kyr BP is included in this state while the Allerød event, occurring at the start of the Younger Dryas, is not.Although the threshold criteria is larger the method allows for greater flexibility in the identification of states due to the incorporation of noise.
Having identified states we analyse their recurrence properties.From the waiting time distribution we find no evidence to reject the null hypothesis of randomly timed events.We also apply the Rayleigh periodicity measure used by Ditlevsen et al. (2007) to an ensemble derived from the Clim.Past, 6, 787-794, 2010 www.clim-past.net/6/787/2010/posterior distribution of shuffled data and find no evidence of significant periodicity within the δ 18 O series.Thus, we cannot reject our null hypothesis that DO events are random events.This suggests that DO events are likely triggered by internal dynamics of the climate system.The periodicity seen in GISP2 by previous authors is apparently due to a small number of events that occur at the regular interval of 1450-1500 years.The method we present is robust to these events.We conclude that there is no lasting periodic signal for the rapid warming events of the last glacial period.
We applied the analysis to the GRIP ice core record on the GICC05 time scale giving parameter estimates similar to those in Fig. 6.Contrasting with the results for NGRIP using this time scale there is more support for a single state model, similar to the results for the other time scales shown in Fig. 3a.This implies that some of the differences in the results could be due to regional effects (different locations of drilling sites for GRIP and NGRIP) rather than the different time scale.Indeed the NGRIP data is less smooth, with larger jumps than the other data sets.
Here, we have been primarily concerned with statistical methods to detect jump events within noisy, irregularly sampled data.To make conclusions about variability of high latitude glacial climate one should relate the signal to temperature while controlling for regional affects.Past changes in atmospheric transport from the source of evaporation to the site of measurement can affect the stable isotope ratios.The seasonality of precipitation can also affect the temperature reconstruction.Simulation studies are one way to gain a better understanding of the influences on δ 18 O signals (Krinner et al., 1997).Multi-proxy studies may also be useful to control for factors that could affect δ 18 O besides the temperature of condensation.Masson-Delmotte et al. (2006) reconstruct temperatures allowing for changes in seasonality of precipitation and seawater isotope composition using a different slope between temperature and isotope ratio for cold and warm periods.The temperatures are shown to be consistent with the borehole and gas fractionation temperature reconstruction.
Future work could include applying our analysis directly to temperature resonstructions.Extensions could include an extra component to the likelihood function incorporating the dating uncertainty.This would be a more challenging statistical problem as these errors are not independent.This analysis would benefit from including the estimated maximum annual layer counting errors available for the NGRIP GICC05 time scale.
Fig. 1: (δ 18 O) from NGRIP, GRIP and GISP2 expressed in ‰ with respect to Vienna Standard Mean Ocean Water using GICC05, ss09sea and Meese/Sowers time scales respectively.

ClimFig. 2 :
Fig.2: Gaussian Process (GP) fit to the GISP2 data set.The raw data is shown in red with the mean from the GP in blue and the upper and lower 3σ levels in grey.

Fig. 3 :
Fig. 3: Marginal likelihoods for the ensemble derived from each data set for models with 1-5 states.The red line indithe median of the ensemble, the box is the 25 and 75 percentiles, the whiskers are the maximum and minimum of the distribution while the red markers indicate outliers.

ClimFig. 4 :
Fig. 4: Posterior probability of DO event as a function of time for NGRIP on the GICC05 timescale.

Fig. 6 :
Fig.6: Posterior distributions of forcing µ for three hidden states.There is significant agreement between NGRIP and GRIP for all three states.

Fig. 7 :
Fig. 7: Rayleigh measures for ice core data (solid lines) shown together with the ensemble means (dashed lines) and upper 2σ values (dotted lines) in each case.Shown are NGRIP (black), GRIP (red) and GISP2 (blue).