The strongest mode of centennial to millennial climate variability in the paleoclimatic record is represented by
Dansgaard–Oeschger (DO) cycles. Despite decades of research, their dynamics and physical mechanisms
remain poorly understood. Valuable insights can be obtained by studying high-resolution Greenland ice
core proxies, such as the NGRIP

Different physical mechanisms underlying Dansgaard–Oeschger (DO) events have been proposed in the
literature. Most of these are characterized by changes between different modes of operation of
the Atlantic Meridional Overturning Circulation (AMOC) that accompany the warm and cold phases
of a DO cycle. This is supported by evidence from marine sediment data linking DO cycles and
changes in the AMOC

The modeling of DO events is guided by proxy records, among which stable water isotope records from Greenland ice cores are very prominent. DO-type transitions in models range in their dynamics from stochastic to excitable and oscillatory, and they are sensitive to different forcings. A statistical analysis of the DO cycles extracted from Greenland ice core records can thus be useful to evaluate the proposed models. The records are noisy, and since there are no established theories about how they should evolve, there is no obvious filter to extract the large-scale climate signal. A common characteristic of the DO cycles seems to be an abrupt temperature increase from cold stadial conditions to a maximum temperature in the warm interstadial state followed by a gradual cooling until there is another abrupt jump back into the stadial state. This is referred to as the sawtooth shape of the events.

Due to the high noise level in the record it is, however, difficult to discern this specific
structure in all of the events. Some events do not seem to follow the generic shape.
Furthermore, there are very short events during which it is difficult to speak of a
gradual cooling episode. Other events are interrupted by shorter cooling episodes,
referred to as sub-events

First, our method provides an objective basis for assessing the validity of the generic sawtooth description of the DO cycles and identifies which individual cycles fall outside this description. Second, with a piecewise linear fit, we obtain estimates for the stadial and interstadial levels, the abruptness of the transitions, and the gradual cooling rate in the interstadial periods. By bootstrapping, we estimate the uncertainty in extracting these parameters from the noisy background. Finally, we perform a comprehensive statistical analysis of the fit parameters across the DO events and their relation to external forcings in order to obtain an empirical understanding of what controls the evolution of the amplitudes and durations of the DO cycles. This can potentially be used for identifying or excluding proposed mechanisms and for benchmarking model results.

Previous efforts to extract robust DO event features from the record were conducted on only part
of the record and were focused on single or very few features. In

In this study, we show that a characteristic sawtooth waveform can be fit to
all DO cycles. However, almost half of the cycles do not actually display a
significant rapid cooling episode after the more gradual interstadial cooling.
A subsequent statistical analysis of DO event features hints at different mechanisms underlying warming and cooling transitions.
First, this follows from the distributions of the durations of the stadials as well as the rapid DO
warmings, on the one hand, and of the interstadials on the other hand.
Secondly, the influence of external forcing is contrasting, with stronger evidence for insolation
influence on stadials and

The paper is structured in the following way: in Sect.

This study is based on the

Our method uses a previously classified set of events from Greenland,
which has been reported by

Our analysis uses other datasets that are not derived from Greenland ice cores.
These are referred to as external forcings, although not all are truly external to
the climate system but rather obtained from independent data sources.
As a proxy for global ice volume, we use the LR04 ocean sediment stack

We aim to fit a continuous piecewise linear waveform to the record. This is not possible by simply cutting the time series into DO cycles and fitting each cycle individually because the points at which the time series is cut need to be defined from the fit and in turn influence the fit. Fitting the whole time series at once to a piecewise linear model with 186 parameters, corresponding to 6 parameters for each of the 31 DO events, will be difficult to achieve without invoking very complicated constraints because of high noise and an abundance of sub-event features. Instead, we propose the following iterative fitting routine that converges to a consistent fit. We start with a guess for the stadial onset and end times, which determine the constant stadial levels. Then we fit a sawtooth shape individually to each event. Thereafter, we update the stadial onset and end times according to the fit and repeat the procedure. When after some iterations the onset and end times do not change significantly anymore, the fit has converged and is consistent.
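The iteration described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The parametrization of one cycle by four break points and four levels, the forward-running time axis, and the names `sawtooth`, `iterate_until_consistent`, `update`, and `tol` are assumptions made for illustration. The hypothetical callback `update` stands for one full pass: compute the stadial levels from the current onset and end times, fit each event, and read off the new times.

```python
def sawtooth(t, breaks, levels):
    """Evaluate a continuous piecewise linear waveform for one DO cycle at time t.

    breaks: (b1, b2, b3, b4) = warming onset, warming end, rapid-cooling
            onset, rapid-cooling end, with b1 < b2 < b3 < b4 (assumed layout).
    levels: values at the four break points (previous stadial level,
            interstadial peak, level before the rapid cooling, next stadial level).
    """
    knots = list(zip(breaks, levels))
    if t <= knots[0][0]:
        return levels[0]   # constant stadial level before the warming
    if t >= knots[-1][0]:
        return levels[-1]  # constant stadial level after the rapid cooling
    for (t0, y0), (t1, y1) in zip(knots, knots[1:]):
        if t0 <= t <= t1:
            return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def iterate_until_consistent(bounds, update, max_iter=40, tol=1.0):
    """Repeat bounds -> update(bounds) until no time shifts by more than tol.

    bounds: list of stadial onset and end times (initial guess).
    update: hypothetical callback performing one fitting pass.
    Returns the final bounds and the number of iterations used.
    """
    for iteration in range(1, max_iter + 1):
        new_bounds = update(bounds)
        shift = max(abs(a - b) for a, b in zip(new_bounds, bounds))
        bounds = new_bounds
        if shift <= tol:
            break
    return bounds, iteration


# Toy usage: an update that contracts towards fixed times stands in for the fit.
target = [100.0, 400.0]
toy_update = lambda b: [0.5 * (x + t) for x, t in zip(b, target)]
final_bounds, n_iter = iterate_until_consistent([0.0, 0.0], toy_update)
```

The stopping rule mirrors the criterion stated in the text: the fit is treated as consistent once the onset and end times stop changing appreciably between passes.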

The initial guess of the stadial onsets and ends is based on the timings reported by

Piecewise linear model fit to DO event 20, for which the time series consists of GS-21.1, GI-20, and GS-20.
The parameters of the piecewise linear model are the four break points

The fitting procedure outlined above yields a single best fit that we hope is close to
the global minimum of the optimization problem and furthermore
as consistent as possible, meaning that the stadial sections that were used for the fit
in the last iteration are identical to the stadial sections defined by the resulting fit.
In addition to this best fit, we assess the
uncertainty in each of the parameters that arises due to noise in the record.
We cannot estimate this from the output of our fitting procedure in a straightforward way.
Instead, we use bootstrapping to repeatedly generate synthetic data for each transition
and optimize the parameters. In this way, we obtain a distribution for each parameter.
Due to computational demands, we do not combine this with our iterative procedure but rather
resample and fit every transition independently.
Thus, we neglect the covariance structure of the errors in the parameters of neighboring transitions.
However, we still consider it to be a very good estimate of the uncertainty due to the noise in the record.
The detailed procedure is given in Appendix

From the best-fit parameters of each DO cycle a variety of features follow.
For each rapid warming period, gradual interstadial cooling period, and
rapid cooling period at the end of an interstadial, we consider the duration, rate of change, and
amplitude. Furthermore, several absolute levels are of interest, including the constant stadial
levels, the interstadial levels after the abrupt warming, and the interstadial level before the rapid
cooling. As a level relative to each event, we consider the level before the rapid cooling above the
previous stadial level, which is given by the rapid warming amplitude minus the gradual cooling
amplitude. Finally, the gradual cooling amplitude divided by the rapid warming amplitude measures
the position of the point of rapid cooling within the event amplitude. In total, we consider 15
interdependent features, which are listed in Table

List of DO event features obtained from the fit that are analyzed in this study.

Our aim is to develop an empirical understanding of the evolution of the DO cycles.
To this end, we employ several tools to search for relations between different features,
as well as between features and external climate factors.
Additionally, the distributions of the individual features themselves hold important information,
especially when there is no strong external modulation in time.
We test the distributions using Anderson–Darling (AD), Cramér–von Mises, and Kolmogorov–Smirnov tests.
Since the AD test is typically the most powerful and the other tests
yield qualitatively unchanged results in all of our analyses, we only report

Because of the large number of possible combinations of features, we first preselect significant and potentially relevant relationships and thereafter investigate in detail whether the results are robust to outliers, among other things. In some cases we also consider relationships of features and forcings that are not significant for the whole dataset but for a large subset. This might highlight the fact that there were qualitatively different periods within the last glacial or that some DO events are of a different nature than most.

We first consider Pearson and Spearman correlation coefficients

Next, in order to find relations between more than two variables, we search for multiple linear regression models to explain selected features of the data. Here, we often use logarithmic quantities because linear relationships in the untransformed quantities are often dominated by outliers. Given a feature as a response variable, we consider linear regression models of combinations of two other features or forcings, preselect the models with the largest coefficients of determination, and then analyze them further.
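The preselection step can be sketched as follows, under stated assumptions. The function names `r_squared_two_predictors` and `rank_models` and the synthetic numbers are illustrative only and are not taken from the study. The sketch fits an ordinary least-squares model with an intercept and two predictors and ranks all predictor pairs by their coefficient of determination. Logarithms, where desired, would be applied by the caller beforehand.

```python
from itertools import combinations
from statistics import fmean


def r_squared_two_predictors(y, x1, x2):
    """Coefficient of determination of y ~ a + b1*x1 + b2*x2 (least squares)."""
    my, m1, m2 = fmean(y), fmean(x1), fmean(x2)
    yc = [v - my for v in y]
    u = [v - m1 for v in x1]
    w = [v - m2 for v in x2]
    s11 = sum(a * a for a in u)
    s22 = sum(b * b for b in w)
    s12 = sum(a * b for a, b in zip(u, w))
    s1y = sum(a * c for a, c in zip(u, yc))
    s2y = sum(b * c for b, c in zip(w, yc))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear")
    # Solve the 2x2 normal equations for the centered variables.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    ss_res = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(u, w, yc))
    ss_tot = sum(c * c for c in yc)
    return 1.0 - ss_res / ss_tot


def rank_models(response, candidates):
    """Rank all pairs of candidate predictors by R^2, best first.

    candidates: dict mapping a (hypothetical) predictor name to its values.
    """
    scored = [
        ((n1, n2), r_squared_two_predictors(response, candidates[n1], candidates[n2]))
        for n1, n2 in combinations(candidates, 2)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Synthetic illustration: the response is an exact linear function of x1 and x2.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 1, 4, 3, 6, 5],
    "x3": [5, 3, 6, 1, 2, 4],
}
response = [2 * a - 3 * b + 1 for a, b in zip(predictors["x1"], predictors["x2"])]
ranking = rank_models(response, predictors)
```

A high coefficient of determination in such a ranking is only a starting point. As noted in the text, the preselected models still need to be checked for robustness, for example against outliers.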

Furthermore, in order to find subsets of events that have distinct properties or relationships
that are only valid in part of the data, we perform a clustering analysis on the data using
two different algorithms (

The significance of such an analysis must be viewed in light of the multiple comparisons problem.
Tests for significant correlations of many pairs of
features using, e.g., the Spearman correlation yield a
non-negligible number of false positives when using confidence levels that are reasonable for our
purposes. We consider features of both the same and neighboring events, yielding

There are sophisticated methods to control the multiple comparisons problem. These could be helpful to better detect false positives from our analysis, but they depend on being able to properly estimate the significance of individual correlations between features with autocorrelation and assess the statistical dependence of the hypothesis tests due to the dependence of some of the features. For simplicity, we do not consider such an analysis, but we consider individually significant correlations as suggestions to be investigated further.
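To make the scale of the problem concrete, the following sketch computes a Spearman correlation with a simple permutation p value and the number of false positives expected by chance for a given number of independent tests. It is a generic illustration rather than the code used here. The function names, the number of permutations, and the test count in the example are assumptions. A plain permutation test ignores autocorrelation, which is exactly the limitation discussed above.

```python
import math
import random
from statistics import fmean


def ranks(values):
    """Ranks starting at 1, with ties receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided p value: share of shuffles with |rho| >= the observed |rho|."""
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = ry[:]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


def expected_false_positives(n_tests, alpha):
    """Expected number of chance detections among independent null tests."""
    return n_tests * alpha


# Illustration with a monotonic but nonlinear synthetic relation.
x = list(range(20))
y = [v ** 3 for v in x]
rho = spearman(x, y)
p_value = permutation_pvalue(x, y)
```

For example, 100 independent tests at a 5 % significance level are expected to produce about five false positives, which is why individually significant correlations are treated here as suggestions for further investigation.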

The fitting routine is performed for 40 iterations so that initial
fluctuations in the parameters have died out and the fit has converged to a consistent state,
as detailed in Appendix

High-resolution NGRIP

Parameters resulting from the fitting routine on the NGRIP data.

In our fit, all transitions follow the characteristic sawtooth shape. For a few events, this is because of the constraints we use in the fitting algorithm. Typically, the constraints do not strictly bound the best-fit parameters, but they force the fit into another local minimum that is consistent with the sawtooth shape, which often yields parameters that are still clearly within the constraints. There are, however, four events with parameters close to the bounds. This happens for GI-5.1 and GI-3, which both have ratios of rapid to gradual cooling rates very close to the constraint value of 2.0. Similarly, for GI-15.2 and GI-6 the ratio of gradual to rapid cooling duration is close to 2.0. Detailed pictures of each transition and the corresponding fit are shown in Fig. S2.

The fact that constraints are needed to ensure that each event follows a
sawtooth shape can be used to classify which events fall outside this description.
To this end, we perform another run of the iterative fitting routine without using constraints
3, 4, 6, and 7 listed in Appendix

the abrupt cooling rate is at least twice as large as the gradual cooling rate;

the gradual cooling lasts at least twice as long as the abrupt cooling;

there is gradual cooling after the rapid warming, i.e., the gradual cooling rate is negative; and

the abrupt cooling amplitude is larger than 0.5 ‰.

From the best fit, we estimate the uncertainty of each parameter via bootstrapping, as explained in
Appendix

Gaussian kernel density of the model parameters and some derived quantities for the DO event 20 after 5000 iterations of
the bootstrap resampling procedure. The parameter values for the best fit, as reported in Sect.

The uncertainty varies from event to event. In the case of the warming durations, the average bootstrap standard deviation is 20.0 years, with a minimum of 3.4 years for GI-16.2 and a maximum of 57.4 years for GI-18. Shorter warmings typically also have smaller uncertainties. As a comparison, the durations of the rapid coolings at the end of an interstadial have a larger uncertainty of 53.6 years. This is expected because the rapid cooling is typically less well pronounced in the record compared to the rapid warming. The coolings also have a larger spread in the bootstrap standard deviations, with a minimum of 4.6 years for GI-16.2 and a maximum of 209.9 years for GI-23.1. Similarly, the onset times of the rapid warmings have an average bootstrap standard deviation of 11.4 years, whereas the stadial onsets have a corresponding average uncertainty of 31.7 years.

Durations and amplitudes of the rapid warmings inferred from the fit, together with a confidence interval obtained by bootstrapping.

As a complementary approach to assess the uncertainties of the features, we compare
them to those derived in the same way from another Greenland ice core. We chose the

For the gradual cooling rates we find

The warming amplitudes are very well correlated with

The rapid cooling durations, i.e.,

The stadial and interstadial durations are very well correlated with

In summary, the uncertainties obtained by bootstrapping and by comparison with the GRIP ice core are compatible, giving confidence in the estimates of the former method. The average bootstrap standard deviation of rapid warming and cooling durations is 20 and 54 years, respectively. This compares well to the average absolute deviation between GRIP and NGRIP of warming and cooling durations of 31 and 59 years, respectively. The discrepancy of 31 years for warming durations also includes a systematic bias of warmings that are 8 years longer on average in GRIP. Thus, the unbiased uncertainty is likely even closer to the one obtained by bootstrapping. Shorter-timescale features like rapid warming durations are not fully representative for every single event in one core. However, the overall trends are consistent, as seen by significant correlation. Features on a longer timescale, such as most of the cooling slopes and stadial levels, as well as the stadial and interstadial durations, are clearly representative.

Histograms of our sample of 31 events for all features considered in
this study, as defined in Table

In Fig.

Spearman correlation heat map of

We focus on the factors influencing the durations of the interstadial periods

We test which of the two scenarios is better supported by the data. This depends on whether
the cooling amplitudes or the cooling rates have a larger spread than the other. The coefficient
of variation for the amplitudes is CV

In Fig.

The relationship between interstadial durations and cooling rates also manifests itself in the respective
distributions. As seen in Fig.

As opposed to other skewed distributions like exponential, Gumbel, and power law, both durations and
cooling rates are also consistent with an inverse Gaussian distribution.
The observation that the durations and rates are both well fitted by the inverse Gaussian
despite their inverse relation is explained by the similar shape of the reciprocal inverse Gaussian distribution.
If a variable is inverse Gaussian

The strong relationship of interstadial durations and cooling rates might have some implications
for DO event dynamics. If the durations are correlated more strongly with the cooling rates than with
the amplitudes, they can already be approximately predicted as soon as the rate is established, which
might happen early in the interstadial.
To test this, we take small slices of the beginnings of each interstadial, fit a linear slope

Our interpretation is that the cooling rate is an indicator of a timescale of large-scale climate reorganization, which can already be measured relatively early in the interstadial and which remains approximately constant. Although we can see that there are exceptions, we conclude that for most events the interstadial duration can be predicted a few hundred years after the rapid warming. Some of the unexplained variance of this prediction is due to other factors influencing the interstadial duration that are not diagnosed by the linear cooling rate but, e.g., by the cooling amplitude.

Given the previous result, we investigate whether the variability in the timescale associated with the cooling rate can be explained by other features of the DO cycles or by external forcing. Among correlations of the cooling rates with other features deemed significant by a permutation test, none are relevant, either because they are caused by a few outliers or else directly due to their definition and parameter constraints.

Considering external climate factors, we find

As shown in Fig.

A better predictor of the interstadial cooling rates of the more recent DO cycles is given
by the

Additionally, in a subset of the events, there is a linear relationship between the logarithm of the
cooling rates and EDML at the interstadial onsets. While
the entire dataset is not significantly correlated at 90 % confidence (

A corresponding linear relationship between the logarithms of interstadial durations and Antarctic temperature
in different ice cores has been noted before by

The stadial periods are defined to start after the rapid cooling and end at the onset of the
rapid warming, and their duration is thus

In the following we discuss whether the stadial duration variability is
influenced by other features in the data or external factors.
Among external factors, the durations are best correlated with 65Nss (

While the stadial levels correlate well with LR04 and EDML due to a common linear trend,
there is better correlation with insolation, as seen by

We investigate whether multiple linear regression models with two predictors
explain the stadial levels and durations better.
A model comprised of 65Nint and eccentricity determines the levels very well
(

The exponential tail in the variability of the stadial durations is not a
result of the modulation by the external forcings we consider. To demonstrate this, we remove the
forcing influence by fitting a linear model of one or more forcings to the log durations.
Detrended data are obtained by adding the mean of the
logarithmic data to the residuals of the fit and then exponentiating. When using 65Nss as forcing,
we find

Besides DO events, Heinrich events are the other major mode of glacial millennial-scale climate
variability. They correspond to massive discharges of ice-rafted debris found in ocean sediment
cores

We test whether these “Heinrich stadials” have significantly different properties than the remaining
stadials, such as longer durations, by randomly sampling nine stadials (five for the reduced set) from the
entire set without replacement and calculating the mean duration of this subset. This is repeated
until we can estimate the probability of trials yielding a higher mean duration than the actual set
of Heinrich stadials. If this is less than 5 % (corresponding to
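The resampling test described in this paragraph can be sketched as below. The function name `subset_mean_pvalue`, the number of trials, and the synthetic durations are assumptions for illustration and do not reproduce the values of this study.

```python
import random
from statistics import fmean


def subset_mean_pvalue(durations, subset_indices, n_trials=10000, seed=0):
    """Share of random subsets whose mean duration exceeds that of the given subset.

    durations: durations of all stadials.
    subset_indices: positions of the stadials under test (e.g., Heinrich stadials).
    Random subsets of the same size are drawn without replacement.
    """
    rng = random.Random(seed)
    size = len(subset_indices)
    observed = fmean(durations[i] for i in subset_indices)
    higher = 0
    for _ in range(n_trials):
        if fmean(rng.sample(durations, size)) > observed:
            higher += 1
    return higher / n_trials


# Synthetic illustration with 31 made-up durations (years).
toy_durations = [100.0 * k for k in range(1, 32)]
p_longest = subset_mean_pvalue(toy_durations, list(range(22, 31)))  # nine longest
p_shortest = subset_mean_pvalue(toy_durations, list(range(0, 9)))   # nine shortest
```

A small returned value indicates that the tested subset has an unusually high mean duration compared with randomly chosen subsets of the same size.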

For the full (reduced) set of Heinrich events we find

The rapid warming transitions in NGRIP as determined by our method have an average
duration of 63.2 years. There is a large spread, with a minimum duration of 15.3 years for
GI-17.1 and a maximum of 179.5 years for GI-11, but there is no clear trend, as we find both short and
long warmings in early and later parts of the record. The distribution is skewed as seen in
Fig.

In our analysis we cannot identify any DO cycle features,
external forcings, or combinations thereof that explain a significant part of the variability
in the warming durations. Thus, we aim to infer something about the mechanism of the warming
transitions from the distribution of their durations.
The lognormal (

In the following we compare the warming durations to what is expected in the framework of
noise-induced transitions in multi-stable systems.
The DO warmings are much shorter than the time spent in the stadial state.
If we consider the stadial–interstadial transition as a noise-induced transition from
one metastable state to another, starting at the stadial onset, most of the time is spent in the vicinity
of the stadial state. The part of the trajectory that leaves this vicinity for the last time and
then moves towards the other state (interstadial) is referred to as the reactive trajectory.
Because of the high noise level in the record, an unknown part of which is non-climatic or regional and changes
over time, we do not estimate reactive trajectories by defining neighborhoods of two metastable states.
Instead, we believe the warming periods obtained by our piecewise linear fit are reasonable estimates.
Figure

With a small numerical experiment we address the case of finite noise levels and small sample sizes.
We use stochastic motion in a double-well potential as a generic model for a noise-induced transition
from one metastable state to another.
It is given by the stochastic differential equation

To show this, we collect

This implies that a small sample of 31 reactive trajectories cannot reliably identify the true distribution and thus a potential mechanism. Still, the data are at least consistent with the expected behavior of noise-induced escape from a metastable state. Other simple mechanisms can be consistent with the data, too. For example, as mentioned above, the inverse Gaussian is the distribution of time elapsed for a Brownian motion with drift to reach a fixed level.
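A numerical experiment of this kind can be sketched with an Euler–Maruyama integration. Since the stochastic differential equation is not reproduced here, the specific potential V(x) = x^4/4 - x^2/2, the noise level, the time step, and the neighborhood boundary `edge` are assumptions chosen for illustration, as are all function and parameter names.

```python
import math
import random


def simulate_transition(sigma=0.5, dt=0.01, edge=0.5, rng=None, max_steps=2_000_000):
    """Simulate dx = (x - x^3) dt + sigma dW from the left well until the right
    neighborhood is reached.

    Returns (waiting_time, reactive_duration): the total time until x >= edge and
    the time elapsed since the trajectory last was inside x <= -edge.
    """
    rng = rng or random.Random()
    x = -1.0
    last_left = 0  # last step at which the left neighborhood was visited
    noise_scale = sigma * math.sqrt(dt)
    for step in range(1, max_steps + 1):
        x += (x - x ** 3) * dt + noise_scale * rng.gauss(0.0, 1.0)
        if x <= -edge:
            last_left = step
        elif x >= edge:
            return step * dt, (step - last_left) * dt
    raise RuntimeError("no transition within max_steps")


# Collect a small sample of transitions with a fixed seed.
generator = random.Random(1)
sample = [simulate_transition(rng=generator) for _ in range(10)]
reactive_durations = [reactive for _, reactive in sample]
```

Repeating such an experiment with samples of 31 reactive durations gives an impression of how strongly the empirical distribution can vary from sample to sample at finite noise levels.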

The average amplitude of the warmings is 4.2 ‰, with most events clustering around
this value. The most extreme values are 7. ‰ for GI-19.2 and 1.7 ‰ for GI-5.1,
which is almost not visually discernible as an event in the

This work presents a statistical analysis of DO event features based on best-fit parameters of a piecewise linear
waveform to the NGRIP

Furthermore, the work relies on the classification of Greenland ice core centennial to millennial variability
into a set of DO events by

Our analysis suggests that the mechanisms underlying warming and cooling transitions are likely
different due to contrasting statistical properties.
The stadial duration distribution closely resembles an exponential, and its large dispersion cannot be
explained by external forcing alone (Sect.

The situation is different for the interstadial–stadial transition.
Although the interstadial durations are also highly variable, they are characterized by a roughly
linear cooling with rates that correlate strongly with the durations.
Because this correlation is much stronger than that of the durations and the amplitudes before the
rapid DO coolings, the interstadial–stadial transition can be predicted to a good approximation as
soon as the rates have stabilized, which happens within the first 150 to 350 years of the
interstadial for most DO cycles (Sect.

External forcing might explain the large variability of this timescale,
as proposed by

Thus, the influence of external forcing is different for stadial and
interstadial periods, with more evidence for insolation forcing on stadials and ice volume or

We developed a method to fit a continuous piecewise linear waveform to the entire
last glacial NGRIP

This work is based on the high-resolution NGRIP oxygen isotope record of the entire last glacial
period. The data up until 60 kyr BP are available at

In the following, we detail the optimization procedure to find the best sawtooth-shaped fit
for each event, i.e., line 18 of the algorithm above. To determine the six parameters at each transition, we minimize the root mean square deviation of the fit from the
time series segment. Due to the high noise level, there are many local minima in this
optimization problem. Thus, either a brute-force parameter search on a grid or
an advanced algorithm is needed to find a global minimum.
We chose an algorithm called basin-hopping, which is described in

Within basin-hopping, one has the freedom of choosing any local minimizer
as well as a perturbation kernel. These have to be adapted to our optimization problem.
We have several constraints on the parameters that need to be satisfied by the optimization.
For instance, we demand that all segments of the fit are present and do not overlap (

Two hyperparameters have to be specified in the basin-hopping algorithm:
the variance of the perturbation kernel and the parameter

The following list contains all constraints used in the optimization problem in order to ensure
convergence of the algorithm to a fit within the qualitative limits of the desired
characteristic waveform. Specifically, constraints 3 and 4 are meant to guarantee that there is
a distinction between gradual cooling and rapid cooling at the end of an interstadial.
With these constraints we can prevent our algorithm from splitting an interstadial in half with two
very similar slopes, which can easily happen because there are interstadials that arguably
have a rather gradual cooling all the way down to the next stadial with no easily discernible steep
cooling at the end. The lower limit of constraint 6 helps to fit only the steep part
of warming transitions, which might be preceded by a slight warming.
The upper limit of constraint 7 is needed in order to force a small negative slope on
very short transitions that otherwise could also be viewed as plateaus.

No overlap of segments:

Gradual slope cannot go below the following stadial level

Gradual slope must be twice as long as a steep drop:

The drop at the end of the interstadial must be at least twice as steep as a gradual slope:

The stadial period must not be shorter than 20 years:

Limit the steepness of the up-slope (‰ yr

Limit the steepness of the down-slope (‰ yr
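As an illustration of how the first five constraints might be checked for a candidate fit, consider the following sketch. The parametrization by four break points and four levels on a forward-running time axis is an assumption, as are the function and argument names. The numerical limits of constraints 6 and 7 are not reproduced here and are therefore left out.

```python
def violated_constraints(breaks, levels, next_warming_onset):
    """Return the numbers of the constraints (1 to 5) that a candidate fit violates.

    breaks: (b1, b2, b3, b4) = warming onset, warming end, rapid-cooling
            onset, rapid-cooling end (assumed layout, time increasing).
    levels: (previous stadial level, interstadial peak,
             level before the rapid cooling, following stadial level).
    next_warming_onset: onset time of the next rapid warming.
    """
    b1, b2, b3, b4 = breaks
    _, peak, pre_drop, next_stadial = levels
    if not (b1 < b2 < b3 < b4):
        return [1]  # segments overlap; durations and rates are undefined
    violated = []
    if pre_drop < next_stadial:
        violated.append(2)  # gradual slope falls below the following stadial level
    gradual_duration = b3 - b2
    drop_duration = b4 - b3
    if gradual_duration < 2 * drop_duration:
        violated.append(3)  # gradual slope is not twice as long as the steep drop
    gradual_rate = (pre_drop - peak) / gradual_duration
    drop_rate = (next_stadial - pre_drop) / drop_duration
    if abs(drop_rate) < 2 * abs(gradual_rate):
        violated.append(4)  # final drop is not twice as steep as the gradual slope
    if next_warming_onset - b4 < 20:
        violated.append(5)  # stadial period is shorter than 20 years
    return violated


ok = violated_constraints((0, 10, 110, 120), (-44, -40, -42, -44), 500)
bad = violated_constraints((0, 10, 30, 50), (-44, -40, -42, -44), 60)
```

In an optimizer, such a check could be used to reject perturbed parameter sets before the local minimization step, although other ways of enforcing the constraints are equally possible.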

For the basin-hopping algorithm we use a multivariate Gaussian kernel of fixed variance with

We repeatedly run our iterative fitting routine and monitor whether the individual parameters
converge so that a consistent fit is obtained in the end. Critical for obtaining a consistent
fit is that the stadial levels do not change substantially, as explained in the Methods section.
In Fig.

Because of the nature of the data, care has to be taken when generating synthetic data. The properties of the data change throughout the record and are also quite different between adjacent stadials and interstadials. Stadials have both a larger variance and a larger effective sample spacing in time than the interstadials. For this reason, synthetic data are created for each stadial and interstadial period individually. The original data are unevenly spaced, which would pose difficulties on its own, while our data are nearest-neighbor interpolated and oversampled to a 1-year resolution. This means that there are typically multiple neighboring points with the same value, making it challenging to find a valid autoregressive or autoregressive moving-average (ARMA) model for the residuals to generate synthetic data. Instead, we use a block bootstrap resampling technique to keep all relevant structure in the data. We chose a simple block bootstrap whereby non-overlapping blocks of fixed length of the time series are randomly reordered because it preserves the correct mean of the individual stadial and interstadial residuals. More involved methods, such as the stationary bootstrap, could be applied, but they are unlikely to change any of our conclusions.

In the following, we present the procedure for uncertainty estimation. We denote the original data
time series of a given transition as

Divide the residuals into four segments

For each segment, divide into

For each segment, randomly sample blocks without replacement and concatenate until all blocks have been used.
This yields resampled segments

Concatenate the four resampled segments and add the fit to get synthetic data

Fit

Repeat from step 2.
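The resampling part of the steps above (dividing the residuals into segments, shuffling fixed-length blocks within each segment, and adding the fit back) can be sketched as follows. The function names, the representation of the segments by index boundaries, and the block length used in the example are assumptions for illustration. The refitting of each synthetic series is not shown.

```python
import random


def shuffle_blocks(segment, block_length, rng):
    """Cut a segment into non-overlapping blocks and concatenate them in random order."""
    blocks = [segment[i:i + block_length] for i in range(0, len(segment), block_length)]
    rng.shuffle(blocks)  # every block is used exactly once
    return [value for block in blocks for value in block]


def synthetic_series(data, fit, segment_bounds, block_length, rng):
    """Generate one synthetic time series for a transition.

    data, fit: observed values and best-fit values on the same time grid.
    segment_bounds: indices delimiting the segments, e.g. five indices
                    [0, i1, i2, i3, len(data)] for four segments.
    """
    residuals = [d - f for d, f in zip(data, fit)]
    resampled = []
    for start, stop in zip(segment_bounds, segment_bounds[1:]):
        resampled.extend(shuffle_blocks(residuals[start:stop], block_length, rng))
    return [f + r for f, r in zip(fit, resampled)]


# Synthetic illustration: 40 points, four segments, blocks of 5 points.
toy_data = [float(k) for k in range(40)]
toy_fit = [1.0] * 40
bounds = [0, 10, 20, 30, 40]
synthetic = synthetic_series(toy_data, toy_fit, bounds, 5, random.Random(0))
```

Because blocks are only reordered within their own segment, the mean of the residuals in each segment is unchanged, which is the property that motivated this simple scheme.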

In the following, we give an overview of the pairwise correlations between different features
and forcings. We show the Spearman correlation coefficients of all tests and their significance
in Fig.

Among features within the same DO cycle,
the three different levels are strongly correlated with each other. However, the significance is
overestimated due to their autocorrelation, and after linear detrending, the correlations are not significant
anymore. Thus, the correlation comes mostly from a common trend associated with the evolution of the
background climate state during the glacial.
Furthermore, we find significant correlations of fast cooling, gradual cooling, and warming amplitudes,
and a correlation of interstadial levels and gradual cooling amplitudes.
This implies a certain consistency of DO cycles, wherein a large-amplitude warming is typically also
followed by a large-amplitude cooling (gradual and/or fast). This is equivalent to the fact that
the stadial levels are autocorrelated.
In Sect.

For features in adjacent DO cycles, we do not expect any true positives a priori because no features are related by construction. Significant correlations at 99 % confidence are only found for the levels. Due to their autocorrelation, however, the significance determined by permutation tests is not reliable. Detrending shows that the correlations are dominated by a common linear trend due to the slowly changing background climate state. The remaining eight correlations significant at 95 % confidence could either be false positives or a result of common external forcing. This is because seven of the eight correlations involve the levels, which are clearly influenced by forcing, as detailed below.

We furthermore correlate the features with all forcings at the onset times of the respective periods
within the DO cycles.
The tests clearly indicate more significant correlations than expected by chance. However, due to
autocorrelation, the significance is overestimated by permutation tests. In particular, the levels
yield significant correlation with most forcings; however, both are autocorrelated.
By linearly detrending and discarding outliers where necessary, we find that the interstadial levels
are best correlated with LR04, EDML, and

The supplement related to this article is available online at:

JL and PD designed the study, interpreted the results, and wrote the paper. JL performed the statistical analysis.

The authors declare that they have no conflict of interest.

We gratefully acknowledge discussions of this work with Sune O. Rasmussen.

This research has been supported by the Horizon 2020 Framework Programme, H2020 Marie Skłodowska-Curie Actions (grant no. CRITICS (643073)).

This paper was edited by Barbara Stenni and reviewed by two anonymous referees.