Estimating breakpoints in the Cenozoic Era: an econometric approach

Bennedsen, Mikkel; Hillebrand, Eric; Koopman, Siem Jan; Larsen, Kathrine By; Lupien, Rachel

doi:https://doi.org/10.5194/cp-21-1981-2025

Articles | Volume 21, issue 11

https://doi.org/10.5194/cp-21-1981-2025

Articles | Volume 21, issue 11

Research article

05 Nov 2025

Research article |

| 05 Nov 2025

Estimating breakpoints in the Cenozoic Era: an econometric approach

Mikkel Bennedsen, Eric Hillebrand, Siem Jan Koopman, Kathrine By Larsen, and Rachel Lupien

Abstract

This study presents a statistical time-domain approach for identifying transitions between climate states, referred to as breakpoints, using well-established econometric tools. Our approach offers the advantage of constructing time-domain confidence intervals for the breakpoints, and it includes procedures to determine how many breakpoints are present in the time series. We apply these tools to a 67.1 million-year-long compilation of benthic foraminiferal oxygen isotopes (δ¹⁸O), which signify global temperature and ice volume throughout the Cenozoic. This foundational dataset is presented in Westerhold et al. (2020), where the authors use recurrence analysis to identify five breakpoints that define six climate states. Fixing the number of breakpoints to five, our procedure results in breakpoint estimates that closely align with those identified by Westerhold et al. (2020). By allowing the number of breakpoints to vary, we provide statistical justification for more than five breakpoints in the time series. Our method adds to our understanding of Cenozoic climate history in terms of the timing and rate of transitions between climate states and provides a tool for robustly assessing breakpoints in many other paleoclimate time series.

Download & links

How to cite.

Received: 07 Nov 2024 – Discussion started: 19 Nov 2024 – Revised: 04 Sep 2025 – Accepted: 04 Sep 2025 – Published: 05 Nov 2025

1 Introduction

Understanding the transitions between climate states in Earth's past is crucial for constraining nonlinear and feedback dynamics of our climate system, and anticipating potential climate system responses to anthropogenic warming. The Cenozoic Era, spanning from 66 million years ago (Ma) to today, is particularly informative in this regard, as it is well studied and includes major shifts from hothouse climates with temperatures 10 °C warmer than today to the onset of permanent glaciations at both poles (Zachos et al., 2001; Hansen et al., 2013). These transitions, or breakpoints, reflect large-scale changes in the climate system, involving shifts in the carbon cycle, ocean circulation, ice volume, and more (Zachos et al., 2008; Mudelsee et al., 2014). As emphasized by Tierney et al. (2020), paleoclimate records are essential for assessing climate sensitivity and evaluating climate models under warmer-than-present conditions. Evidence suggests that the sensitivity of the climate system to external forcings may depend on the climate state (Caballero and Huber, 2013), and that projected future climates may increasingly resemble early Cenozoic conditions under continued emissions (Burke et al., 2018; Steffen et al., 2018). These insights underscore the importance of identifying when past climate state transitions occurred, how many there were, and how certain we are about their timing. Addressing these questions is crucial for understanding the dynamics of long-term climate variability, and recent work has increasingly emphasized transition detection as a key task in climate data analysis (e.g., Marwan et al., 2021; Trauth, 2025).

A widely used approach to identify breakpoints in paleoclimate records is recurrence analysis, which identifies when a system returns to similar states over time, helping to detect changes in the underlying dynamics of time series (Marwan et al., 2007; Marwan, 2023; see also Fischer et al., 2024; Liang et al., 2025, for recent applications in paleoclimate research). Westerhold et al. (2020) apply this technique to a stacked record of δ¹⁸O from benthic foraminifera spanning from 67.1 Ma to the present, covering the Cenozoic Era. Based on the recurrence structure of the record, the authors identify four major climate states – Hothouse, Warmhouse, Coolhouse, and Icehouse – which are further divided into six states through time. To conduct this analysis, they resampled the data at an interval of 5 thousand years (kyr) and used both raw and detrended versions. Recurrence analysis provides valuable insights into the recurrence structure and shifts in a time series, and recurrence quantification analysis offers complementary summary measures, such as recurrence rate and determinism. However, the identification of transitions remains largely based on visual interpretation of recurrence plots, and the method lacks formal procedures for determining the number and statistical certainty of the transitions.

Several methodological extensions have sought to address these limitations. For instance, Goswami et al. (2018) propose a breakpoint detection method using a probability density function sequence representation of the time series, which accounts for timestamping uncertainty. Bagniewski et al. (2021) combine recurrence analysis with Kolmogorov–Smirnov tests to statistically assess abrupt shifts in recurrence distributions. Rousseau et al. (2023) apply this method to the Westerhold et al. (2020) data, identifying a similar set of transitions along with several additional ones. As discussed by Marwan et al. (2021), there are several other approaches to identify transitions in paleoclimate time series. Among these, Livina et al. (2010) develop a statistical method of potential analysis and apply it to detect the number of states in an ice core record. In a Bayesian framework, Schütz and Holschneider (2011) develop a method for detecting changes in trend, and Ruggieri (2013) introduce a Bayesian algorithm for identifying multiple breakpoints. Reviews of breakpoint detection techniques in more general climate time series are provided by Reeves et al. (2007) and Lund and Shi (2023).

Recently, Trauth et al. (2024) explore a suite of methods, including recurrence analysis, changepoint detection, and nonlinear curve fitting (e.g., sigmoid and ramp functions), to identify climate transitions in a paleoclimate record. They apply a changepoint detection algorithm by Killick et al. (2012), which efficiently detects multiple changepoints in the mean, variance, and trend by minimizing a cost function that balances goodness-of-fit with a penalty for additional changes. The sigmoid functions are characterized by their S-shaped curves and allow for modeling gradual transitions (Crowley and Hyde, 2008; Trauth et al., 2021). In contrast, the ramp functions consist of two horizontal segments connected by a linear trend and represent gradual transitions bounded by abrupt changes in slope, which can be fitted using regression techniques. This method was proposed by Mudelsee (2000) and has been applied to various paleoclimate records (e.g., Fleitmann et al., 2003; Mudelsee and Raymo, 2005). Furthermore, Mudelsee et al. (2014) apply this ramp-function method, among others, to detect major climate transitions in the Cenozoic. For further details, we refer to the textbook treatments in Mudelsee (2014) and Trauth (2025). While these approaches are widely used, they do not typically include tools for selecting the number of transitions or for assessing the uncertainty in their timing. In this study, we apply an alternative method that addresses both aspects.

Specifically, we employ a statistical approach based on least-squares to estimate breakpoints, and showcase its use with the benthic δ¹⁸O record from Westerhold et al. (2020). The approach is an econometric time-domain method by Bai and Perron (1998, 2003), which was originally applied to detect shifts in real interest rates data in economics (Garcia and Perron, 1996). We henceforth refer to this as the Bai-Perron framework. While well-established in the econometrics literature, this framework has not been applied to paleoclimate data, where it has a great potential for providing a rigorous statistical foundation for the estimation of breakpoints. In particular, it offers the advantages of constructing confidence intervals for the timestamps of the breakpoints, providing a measure of estimation uncertainty, as well as procedures for selecting the number of breakpoints in the time series. These additional measures are crucial for understanding the certainty, significance, and timing of climate transition periods in the past. The Bai-Perron framework offers flexibility in modeling both abrupt and gradual transitions. We demonstrate its application and benefits by using the data from Westerhold et al. (2020), though the framework is broadly applicable to a wide range of paleoclimate time series.

2 Methodology

2.1 Data

We use the dataset provided by Westerhold et al. (2020), which compiles measurements of oxygen isotope ratios from benthic foraminifera across 34 different studies and 14 ocean drilling locations into a single stack covering the Cenozoic. Our study focuses on the benthic δ¹⁸O record, specifically the correlation-corrected values of benthic δ¹⁸O¹.

Benthic δ¹⁸O measures the deviations in the ratio of the stable oxygen isotopes ¹⁸O to ¹⁶O in the shells of benthic foraminifera relative to the Vienna Pee Dee Belemnite (VPDB) standard. The ratio of heavy to light stable oxygen isotopes is a function of deep ocean temperatures (Epstein et al., 1951; Shackleton, 1967; Lisiecki and Raymo, 2005) and of the δ¹⁸O of the seawater in which the foraminifera grow their shells, which in turn is a function of ice volume and salinity (e.g., Waelbroeck et al., 2002; Oerlemans, 2004). Thus, the benthic stack is an important reference record for global climate history across the Cenozoic. Hereafter, we refer to benthic δ¹⁸O simply as δ¹⁸O.

The δ¹⁸O compilation by Westerhold et al. (2020) spans 67.10113 Ma to 564 years before present (Fig. 1). Using recurrence analysis, Westerhold et al. (2020) identify six climate states, and we refer to these as Warmhouse I (66–56 Ma), Hothouse (56–47 Ma), Warmhouse II (47–34 Ma), Coolhouse I (34–13.9 Ma), Coolhouse II (13.9–3.3 Ma), and Icehouse (3.3 Ma–present). Summary statistics for the full record and for each climate state identified by Westerhold et al. (2020) are reported in Appendix B1.

The dataset contains 24 333 entries, of which 74 are missing in the published version. After excluding these, we retain 24 259 data points, ordered from oldest to most recent. The δ¹⁸O record is irregularly spaced in time, as is typical for paleoclimate proxy data. Its average resolution is 2.77 kyr, ranging from 7.28 kyr during Warmhouse II, which has the lowest resolution, to 0.88 kyr during the Icehouse, which has the highest. The longest gap in the data spans about 115.4 kyr and 533 gaps exceed 10 kyr. Additionally, 591 time stamps contain multiple δ¹⁸O values, with up to four observations recorded at the same time. The δ¹⁸O stack includes an estimated age uncertainty of ± 100 kyr in the early Cenozoic and ±10 kyr in more recent periods, primarily due to uncertainties in orbital tuning and sedimentation rates (Westerhold et al., 2020). We do not explicitly account for age uncertainty in this study, as it is small relative to the duration of the states we estimate. We therefore expect our main findings to be robust and we will return to this issue in the results section.

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f01

Figure 1δ¹⁸O data from Westerhold et al. (2020). The vertical axis is reversed, following standard practice. The vertical dashed lines show transitions between the climate states by Westerhold et al. (2020). The horizontal axis represents time, measured in millions of years before present. Epoch abbreviations: Cret.: Cretaceous; Plio.: Pliocene; Pleist.: Pleistocene.

Download

2.2 The Bai-Perron framework

The Bai-Perron framework is based on minimizing the sum of squared residuals while treating the breakpoints as unknown parameters to be estimated (Bai and Perron, 1998, 2003). Consider a linear regression framework for the dependent variable y_t, for $t = 1, \dots, T$ , and with m breakpoints, corresponding to m+1 distinct states in the sample. The general model equation is

\begin{matrix} (1) & y_{t} = x_{t}^{'} β + z_{t}^{'} δ_{j} + u_{t}, t = T_{j - 1} + 1, \dots, T_{j}, \end{matrix}

with $j = 1, \dots, m + 1$ . The m break dates are denoted by $(T_{1}, \dots, T_{m})$ , with the convention that T₀=0 and $T_{m + 1} = T$ , and u_t is a disturbance term with mean zero and variance $σ_{j}^{2}$ . The (p×1)-vector x_t and the (q×1)-vector z_t comprise two sets of covariate vectors, for which β is the state-independent vector of coefficients and δ_j is the state-dependent vector of coefficients. Since only specific coefficients are subject to structural breaks, this model is referred to as a partial structural change model. Moreover, we consider breaks in the variance of u_t at the break dates $T_{1}, \dots, T_{m}$ , such that $σ_{i}^{2} \neq σ_{j}^{2}$ for i≠j. The parameters β and δ_j are estimated alongside the breakpoints but are not of primary interest here.

We initially treat the number of breakpoints, m, as known and estimate the coefficients and the breakpoints using a sample of T observations of $\{y_{t}, x_{t}, z_{t}\}$ . The estimation method is based on least squares for both the coefficients and the breakpoints. For each possible set of m breakpoints $(T_{1}, \dots, T_{m})$ denoted as ${\{T_{i}\}}_{i = 1}^{m}$ , we obtain estimates of β and δ_j by minimizing the sum of squared residuals (SSR), that is,

\begin{matrix} (2) & SSR = \sum_{j = 1}^{m + 1} \sum_{t = T_{j - 1} + 1}^{T_{j}} {(y_{t} - x_{t}^{'} β - z_{t}^{'} δ_{j})}^{2}, \end{matrix}

where β is common to all states, while δ_j is specific for the state j, which is the period between $T_{j - 1} + 1$ and T_j. The resulting estimated coefficients are denoted as $\hat{β} ({\{T_{i}\}}_{i = 1}^{m})$ and ${\hat{δ}}_{j} ({\{T_{i}\}}_{i = 1}^{m})$ . These coefficients are then used to determine the SSR associated with each set of breakpoints,

\begin{matrix} (3) & \begin{aligned} {SSR}_{T} ({\{T_{i}\}}_{i = 1}^{m}) & \equiv \sum_{j = 1}^{m + 1} \sum_{t = T_{j - 1} + 1}^{T_{j}} (y_{t} - x_{t}^{'} \hat{β} ({\{T_{i}\}}_{i = 1}^{m}) \\ {- z_{t}^{'} {\hat{δ}}_{j} ({\{T_{i}\}}_{i = 1}^{m}))}^{2} . \end{aligned} \end{matrix}

The estimated breakpoints are then given by

\begin{matrix} (4) & ({\hat{T}}_{1}, \dots, {\hat{T}}_{m}) = \underset{T_{1}, \dots, T_{m}}{argmin} {SSR}_{T} ({\{T_{i}\}}_{i = 1}^{m}) . \end{matrix}

The minimization is conducted over all partitions $(T_{1}, \dots, T_{m})$ such that $T_{j} - T_{j - 1} \geq dim (z_{t})$ to ensure that there are enough data points to estimate the parameters δ_j in each partition. This procedure leads to estimated parameters for the m breakpoints, i.e., ${{\hat{T}}_{i}}_{i = 1}^{m}$ , $\hat{β} = \hat{β} ({{\hat{T}}_{i}}_{i = 1}^{m})$ , and ${\hat{δ}}_{j} = {\hat{δ}}_{j} ({{\hat{T}}_{i}}_{i = 1}^{m})$ . Since the possible combinations of the placement of the breakpoints is finite, this optimization can be conducted using a grid search, which can be computationally heavy, especially for many breakpoints. Bai and Perron (2003) introduce an efficient method for determining the global minimizers.

An essential advantage of the Bai-Perron framework is that it allows for constructing confidence intervals for the timing of the breakpoints, something that is not available for the recurrence analysis approach implemented in Westerhold et al. (2020). The construction of confidence intervals is based on the asymptotic distribution of the estimated break dates. The convergence results for the construction of confidence intervals rely on a number of assumptions (see Bai and Perron, 2003).

2.3 Model specifications

Three distinct specifications are considered within the Bai-Perron framework, referred to as the “Mean”, “Fixed AR”, and “AR” models, where AR refers to the autoregressive model of order one with intercept. These are all special cases of the framework outlined in Eq. (1). The simplest among them, the Mean model, is specified as follows,

\begin{matrix} (5) & y_{t} = c_{j} + u_{t}, t = T_{j - 1} + 1, \dots, T_{j}, \end{matrix}

for $j = 1, \dots, m + 1$ , where c_j is the state-dependent intercept and u_t is an error term. This model is equivalent to setting x_t=0, z_t=1, and δ_j=c_j in Eq. (1). A breakpoint in this model specification leads to an abrupt change in the mean of the dependent variable y_t.

The Fixed AR model extends the Mean model by incorporating an autoregressive term. We obtain the model

\begin{matrix} (6) & y_{t} = c_{j} + φ y_{t - 1} + u_{t}, t = T_{j - 1} + 1, \dots, T_{j}, \end{matrix}

for $j = 1, \dots, m + 1$ , where y_t−1 is the dependent variable lagged by one period, and φ is the autoregressive coefficient that is constant over the whole sample. In this model, the effect of a change in the coefficient c_j is more gradual, since it depends on the autoregressive dynamics. The Fixed AR model is obtained from Eq. (1) by specifying $x_{t} = y_{t - 1}$ , β=φ, z_t=1, and δ_j=c_j.

The general AR specification also allows the autoregressive term to be state-dependent, resulting in the AR model,

\begin{matrix} (7) & y_{t} = c_{j} + φ_{j} y_{t - 1} + u_{t}, t = T_{j - 1} + 1, \dots, T_{j}, \end{matrix}

for $j = 1, \dots, m + 1$ , where the autoregressive coefficient φ in Eq. (6) is now state-dependent and is denoted by φ_j. This model is obtained from Eq. (1) by setting x_t=0, $z_{t} = (1, y_{t - 1})$ , and $δ_{j} = (c_{j}, φ_{j})$ . Here, both the intercept and the autoregressive coefficient are state-dependent. Thus, the three specifications are nested: The AR model is the most general, the Fixed AR model is nested in the AR model by setting $φ_{1} = φ_{2} = \dots = φ_{m + 1} = φ$ , and the Mean model is nested in the Fixed AR model by setting φ=0.

Figure 2 illustrates how the models capture breakpoints. The Mean model is designed to detect abrupt breaks in the mean of a time series, while the Fixed AR model is for smoother breaks. The AR model is more flexible, allowing for both relatively gradual (e.g., T₁) and abrupt (e.g., T₂) breakpoints compared to the Fixed AR model.

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f02

Figure 2Simulated time series using the three model specifications, each with breakpoints T₁=25 and T₂=75, and total sample size T=100. For the Mean model, we set c₁=1.0, c₂=1.2, and c₃=0.8. In the Fixed AR model, the parameters are φ=0.7, c₁=0.30, c₂=0.36, and c₃=0.24, chosen to yield comparable state-wise means. Likewise in the AR model, we set φ₁=0.7, φ₂=0.9, φ₃=0.4, c₁=0.30, c₂=0.12, and c₃=0.48. In all specifications, we set u_t=0 for all t.

Download

2.4 Implementation

The Bai-Perron framework is implemented using mbreaks, an R package specifically designed for this purpose (Nguyen et al., 2023). For all model specifications, we set the minimum length of a state, h, to 2.5 million years (Myr), facilitating the estimation of shorter climate states. Also, we let the variance of the error term, denoted as $σ_{j}^{2}$ , be state-dependent.

As outlined by Bai and Perron (2003), no serial correlation is permitted in the regression residuals. However, the time series of δ¹⁸O is likely subject to both autocorrelation and heteroscedasticity, as documented in ice core records (Davidson et al., 2015; Keyes et al., 2023). Autocorrelation occurs when current values correlate with past values, which in paleoclimate data arises from both long-term persistence in climate dynamics (Mudelsee et al., 2014) and taphonomic processes such as bioturbation (Kunz et al., 2020). Since only up to one lag is included in the covariates in the model specifications in this paper, residual serial correlation is likely to remain. Heteroscedasticity, or time-varying error variance, is already partially addressed in the model specifications through state-dependent variance. However, additional heteroscedasticity may arise within the estimated states due to factors such as orbital forcing and changes in ice sheet extent. Addressing both autocorrelation and heteroscedasticity is essential to ensure unbiased parameter estimates and valid confidence intervals for the estimated breakpoints.

To account for these issues, we use the autocorrelation and heteroscedasticity consistent (HAC) covariance matrix estimator with prewhitening in the Bai-Perron framework. The prewhitening procedure, proposed by Andrews and Monahan (1992), entails applying an autoregressive model with one lag to $z_{t} {\hat{u}}_{t}$ , where ${\hat{u}}_{t}$ denotes the residuals. The HAC covariance matrix estimator by Andrews (1991) is then constructed based on the filtered series using the quadratic spectral kernel with bandwidth selected by an AR of order one approximation. This approach is used for all model specifications and is straightforward to implement using the R package (Nguyen et al., 2023).

2.5 Constant data frequency

To conduct breakpoint estimation using the Bai-Perron framework, we need a regularly sampled time series. We use a binning approach to construct a dataset with evenly spaced observations, which is common practice in the analysis of paleoclimate data; see for instance Boettner et al. (2021). We divide the dataset into bins of fixed time intervals and compute the mean of the observations within each bin. In the case of gaps in the binned data, we use the values immediately preceding and succeeding the section with missing data to perform linear interpolation. We consider six different bin sizes, namely 5, 10, 25, 50, 75, and 100 kyr (Fig. 3). Summary statistics for the full sample length and for each climate state identified by Westerhold et al. (2020) for all binning frequencies are provided in Appendix B1.

Data binned at higher frequencies follow the variations in the dataset more closely, whereas data binned at lower frequencies tend to be smoother (Fig. 3). In case of large gaps, a high binning frequency results in linear interpolation between observations (Fig. 3 bottom left). This effect does not occur for periods with many observations, where low binning frequencies capture only a small part of the variation in the original data (Fig. 3 bottom right). Binning offers a simple approach to handle the uneven frequency of the dataset. However, it leads to data loss at lower binning frequencies and to the introduction of artificial data points resulting from linear interpolation at higher binning frequencies. The selection of binning frequencies can therefore alter the properties of the time series, potentially misrepresenting the dynamics of the original data.

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f03

Figure 3Top panel: Benthic foraminiferal δ¹⁸O data from Westerhold et al. (2020), along with 5 and 100 kyr-binned versions. The record spans 67.10–0.0006 Ma and is based on cores from 14 ocean drilling sites. The vertical dashed lines show transitions between the climate states by Westerhold et al. (2020). Bottom left: 36–35 Ma with an average resolution of approximately 17.2 kyr. Bottom right: 3–2 Ma with an average resolution of approximately 0.9 kyr. Epoch abbreviations: Cret.: Cretaceous; Plio.: Pliocene; Pleist.: Pleistocene.

Download

The Bai-Perron framework is developed for estimating and testing for multiple breakpoints in linear regression models where the regressors are non-trending or state-wise stationary (Bai and Perron, 2003). A time series is considered stationary if its statistical properties, such as mean and variance, do not change over time. The δ¹⁸O data appears non-stationary over most of the record, even within climate states found by Westerhold et al. (2020). As pointed out by Kejriwal et al. (2013), if the time series maintains its stationarity properties over the respective states, the methods developed for stationary data are still applicable for these cases. However, if the process alternates between stationary and non-stationary states, the theoretical properties of the methodology are unknown.

To investigate whether the time series is non-stationary, we apply the Augmented Dickey-Fuller (ADF) test (Dickey and Fuller, 1979), with the null hypothesis of non-stationarity. For the entire 25 kyr-binned data sample, the ADF test does not reject the null hypothesis at the 1 % significance level, indicating non-stationarity. However, when examining the binned data for each climate state identified by Westerhold et al. (2020) separately, the ADF test rejects the null hypothesis at the 1 % significance level for the Warmhouse II, Coolhouse I, and Icehouse states. These tests indicate the presence of state-wise non-stationarity, and we therefore need to examine whether the Bai-Perron framework is applicable to data-generating processes that are state-wise non-stationary or alternating between stationary and non-stationary states. For this purpose, we conduct a large simulation study to verify that the Bai-Perron framework works as intended when applied to these types of data-generating processes using the three model specifications. The study is conducted for both independent and identically distributed (i.i.d.) error terms and serially correlated error terms (Appendix C1 and C2, respectively). The results show that the procedure works well with non-stationarity and is robust to processes with one stationary and one non-stationary state for Fixed AR and AR models. However, the Mean model performs poorly when the data-generating process exhibits high persistence. In the case of serial correlation, the results are less conclusive, but if the states are sufficiently different, the methodology appears effective. The study reveals that the coverage rates for confidence intervals are generally adequate for the Fixed AR model, while the confidence intervals of the AR model are too narrow in many cases. Overall, the Fixed AR model performs best across the data-generating processes considered.

3 Results

3.1 Fixed number of breakpoints

As an initial step, we fix the number of breakpoints to 5, which is the number used in the recurrence analysis presented in Westerhold et al. (2020). We estimate the breakpoints and corresponding 95 % confidence intervals for each of the binning frequencies, 5, 10, 25, 50, 75, and 100 kyr, using Mean, Fixed AR, and AR models for each (Appendix B2 and Fig. 4). The estimated confidence intervals around the breakpoints are often asymmetrical. Bai and Perron (2003) advocate the use of asymmetric confidence intervals, as these provide better coverage rates when the data are non-stationary.

For the Mean model, the estimated breakpoints generally remain at the same dates throughout as the binned data frequency decreases step-by-step from 5 to 100 kyr (Fig. 4a). The width of the 95 % confidence intervals increases as the frequency decreases, which can be attributed to the resultant decrease in the number of binned observations available for estimation at the lower frequencies. All the breakpoints align with those identified by recurrence analysis in Westerhold et al. (2020). A similar pattern of alignment is observed in the Fixed AR model, albeit with tighter confidence intervals (Fig. 4b). The AR model exhibits more sensitivity to the frequency of the binned data (Fig. 4c). At higher frequencies, the breakpoints tend to appear in the more recent parts of the sample. However, as the frequency decreases further, the breakpoints are estimated to be in the older parts of the sample period.

For the results using 25 kyr, we find that the estimated breakpoints from the three model specifications align closely with each other and nearly perfectly with those identified by Westerhold et al. (2020). The three model specifications estimated using the 25 kyr-binned data yield parameter estimates that differ across states, reflecting differences in mean and autoregressive dynamics (Appendix B3).

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f04

Figure 4A comparison of estimated breakpoints using binned data with frequencies of 5, 10, 25, 50, 75, and 100 kyr from top to bottom, fixing the number of breakpoints to 5 for each model specification. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020) (blue dots) and their transitions (vertical dashed lines). Panels: (a) Mean model, (b) Fixed AR model, (c) AR model.

Download

As a robustness check, we re-estimate the model specifications for 5 breakpoints using the 25 kyr-binned data reversed with respect to the time dimension, so that the time series is ordered from present to past rather than past to present (Appendix A1). We find that the results of the Mean and Fixed AR models are robust to the ordering of the time axis, with almost unchanged estimated breakpoints. Conversely, the AR model leads to estimated breakpoints in the more recent part of the sample, resulting in breakpoints at 16.9 and 9.7 Ma, which differ from those estimated using the same model and binning frequency with time running forward (Fig. 4).

In summary, changing the binning frequency mainly affects the width of the confidence intervals, while the estimated breakpoint timing remains largely unchanged for both the Mean and Fixed AR models. In contrast, the AR model is more sensitive to resolution and the direction of the time frame. As detailed in the simulation study (Appendix C), the Mean model fails to accurately detect breakpoints in highly persistent data-generating processes. Consequently, in what follows, we focus on the Fixed AR model for the estimation of breakpoints in the δ¹⁸O time series. Among the binning frequencies, we proceed with 10 and 25 kyr, as these yield the most consistent results across model specifications and strike a good balance between temporal resolution and signal quality. For the 25 kyr bin width, the mean number of observations per bin is approximately 9, and 3.6 for 10 kyr. However, these numbers vary across the sample, being only 3.5 and 1.4, respectively, in the Warmhouse II and increasing to 28.3 and 11.3, respectively, in the Icehouse period. This highlights the importance of accounting for varying sampling resolution when selecting bin widths. For applications of this framework to other paleoclimate records, we recommend seeking a similar balance.

3.2 Flexible number of breakpoints

We now relax the assumption of a pre-specified number of breakpoints and use information criteria to guide the choice of the number of breakpoints. These criteria are model selection tools that balance goodness of fit with model complexity, helping to avoid overfitting. We initially consider the following three criteria: the Bayesian Information Criterion (BIC) by Yao (1988), the modified Schwarz Information Criterion (LWZ) by Liu et al. (1997), and the modified BIC (KT) by Kurozumi and Tuvaandorj (2011). For all criteria, the preferred number of breakpoints is determined as the number of breakpoints that minimizes the information criterion in question.

Bai and Perron (2006) note that the BIC and LWZ criteria perform well in the absence of serial correlation, but both lead to overestimation of the number of breakpoints in case of serial correlation in the error term. In simulation studies, we find that the KT information criterion performs poorly, and hence, we exclude it from the subsequent analysis (Appendix C1 and C2). We also find that the number of breakpoints determined using the Mean model specification is generally too large when employing the information criteria. For the Fixed AR and AR models, the BIC and LWZ criteria typically perform well, especially in data-generating processes with a large break. With serial correlation in the error term, the BIC criterion tends to slightly overestimate the number of breakpoints, whereas the LWZ criterion generally performs well in the Fixed AR and AR model specifications.

We use the BIC and LWZ information criteria for each model specification and binning frequency to determine the number of breakpoints, and set the minimum state length to h=2.5 Myr (Appendix B4). For our preferred specification, the Fixed AR model with 25 kyr binning frequency, the LWZ and BIC criteria suggests 6 and 12 breakpoints, respectively. For a 10 kyr binning frequency, the estimated number of breakpoints are 7 and 14, respectively. Thus, the information criteria indicate that the number of distinct climate states in the δ¹⁸O record is larger than the 5 suggested in Westerhold et al. (2020).

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f05

Figure 5A comparison of estimated breakpoints using the Fixed AR model for 1 to 15 breakpoints on 25 kyr-binned data. The minimum state length is set to h=1 Myr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020) and their transitions.

Download

To further investigate the potential for a higher number of breakpoints, we consider the estimation of up to 15 breakpoints with the minimum length of a state set of h=1 Myr. This analysis is conducted with the Fixed AR model and 25 kyr-binned data (Fig. 5). These findings show that the breakpoints identified by Westerhold et al. (2020) are preserved in estimations which include 5 or more breakpoints. Furthermore, the additional breakpoints are, in most cases, also very stable and consistently reappear across specifications with a higher number of breakpoints. The same analysis using 10 kyr-binned data led to nearly identical breakpoint estimates, while the Mean and AR models with 25 kyr-binned data yielded estimates that align in certain cases (Appendix A2, A3, and A4).

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f06

Figure 6A comparison of estimated breakpoints using the Fixed AR model for 1 and 2 breakpoints on 5 kyr-binned data for the Icehouse period. The minimum state length is set to h=250 kyr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020).

Download

The final estimated breakpoint is placed at 1.425 Ma for the Fixed AR model, just below the upper boundary of the detection window at 1 Ma, imposed by the minimum state length of 1 Myr. Additionally, the estimated breakpoint is located near the midpoint of a linear trend in the time series from approximately 3.3 Ma to the present, suggesting it may be driven by the trend rather than representing a break in the time series (cf. Fig. 5). To investigate this further, we re-estimate the breakpoints for the Fixed AR model, focusing solely on the Icehouse period, with the minimum length of a state set to 250 kyr and using 5 kyr binned data, leveraging the denser sampling in this part of the record. For the Fixed AR model, the LWZ criterion suggests one breakpoint, while the BIC indicates two. With one breakpoint, the estimate is 1.355 Ma, and with two, the estimated breakpoints are 2.54 and 0.95 Ma (Fig. 6). Estimating more than two breakpoints leads to overlap between the estimated confidence intervals, reducing the interpretability, and these models are therefore excluded. The results are comparable for the Mean and AR models (Appendix A5 and A6).

3.3 Limitations of the Bai–Perron framework

Although the Bai–Perron framework provides a flexible and well-established method for detecting breaks, it has some limitations. First, the approach assumes piecewise linearity and white noise residuals (Bai and Perron, 2003). However, in the estimations conducted in this study, the residuals are not white noise, indicating that some dynamics are left unexplained. The simulation results show that the Bai-Perron framework nevertheless performs well even when residuals exhibit complex dynamics (Appendix C). Confidence intervals should still be interpreted with caution. Second, the method is also computationally demanding for high-resolution data, although it remains possible to run on personal computers.

Thirdly, the method does not account for age model uncertainty, which is important for interpreting the timing and significance of time series analytical output (Marwan et al., 2021). In the Westerhold et al. (2020) data, dating uncertainty ranges from about ±10 kyr in the younger parts to ±100 kyr in the older parts. This can affect the timing of breakpoints and lead to differences when comparing across records (Franke and Donner, 2019). Previous work has shown that age-depth models often underestimate the true uncertainty in the chronology, which would amplify these effects (Telford et al., 2004). While some progress has been made in including age uncertainty into recurrence analyses (Goswami et al., 2018), incorporating it into the Bai-Perron framework remains a challenge. One could however consider the use of age ensembles which are multiple plausible realizations of the time axis to assess robustness of the estimated breakpoints. Fully integrating age uncertainty into the estimation process, for example by modeling timestamps as random variables, would require further methodological development. However, since the age model uncertainties reported by Westerhold et al. (2020) are small compared to the duration of the estimated climate states, we expect our main findings to be robust.

In addition to age uncertainty, another direction for methodological advancement is developing a breakpoint detection framework for irregularly spaced time series. This would obviate the need for aggregating the data to fixed time intervals, preserving more of the original record. Steps in this direction have already been made in concurrent research (Bennedsen et al., 2024), where the full δ¹⁸O and δ¹³C stacks (Westerhold et al., 2020) are analyzed while taking the climate state transitions as given and addressing measurement errors.

4 Discussion

Our results demonstrate that the Bai-Perron time-domain framework is a flexible and effective tool for detecting breakpoints in paleoclimate time series. When fixing the number of breakpoints to 5, all model specifications lead to breakpoint estimates that closely match those identified by Westerhold et al. (2020), providing strong statistical support for their climate-state classification. The results of this work are also consistent with the findings by Rousseau et al. (2023).

Information criteria point to a higher number of transitions than previously reported (Westerhold et al., 2020), suggesting the potential for a more detailed classification of Cenozoic climate variability. To explore this, we estimate between 1 and 15 breakpoints using the Fixed AR model (Fig. 5). Using the BIC, we find statistical justification for 12 breakpoints in the time series (Table 1). Some of the 12 estimated breakpoints align with major transitions in benthic δ¹³C (Westerhold et al., 2020), atmospheric CO₂ concentration estimates (Hönisch et al., 2023), and global sea level estimates (Miller et al., 2020) (Fig. 7). This alignment may reflect episodes of large-scale reorganization in the Earth system, potentially involving coupled changes in the carbon cycle, temperature, and ice volume.

Table 1Estimated breakpoints and 95 % confidence intervals (CI) in Ma for the Fixed AR model with 12 breakpoints determined using the BIC.

Download Print Version | Download XLSX

Five of these breakpoints (BP₃, BP₅, BP₇, BP₉, and BP₁₁) closely match the five major transitions identified by Westerhold et al. (2020), each corresponding to a well-known climatic event. Specifically, BP₃ aligns with the Paleocene–Eocene Thermal Maximum (PETM, 56 Ma), a short-lived but intense global warming event (McInerney and Wing, 2011). BP₅ marks the end of the Early Eocene Climate Optimum (EECO, 47 Ma), a peak in long-term warmth during the early Cenozoic (Westerhold et al., 2018). BP₇ captures the Eocene–Oligocene Transition (EOT, 34 Ma), when Antarctic glaciation began and global temperatures declined sharply (Coxall et al., 2005; Spray et al., 2019). BP₉ corresponds to the Middle Miocene Climate Transition (MMCT, 13.9 Ma), which is associated with expansion of the Antarctic ice sheets (Flower and Kennett, 1994). Finally, BP₁₁ is close to the M2 glaciation event (3.3 Ma), which preceded the onset of sustained Northern Hemisphere glaciation in the Pleistocene (Lisiecki and Raymo, 2005).

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f07

Figure 7Overview of key paleoclimate proxies across the Cenozoic Era. From top to bottom: Benthic foraminiferal δ¹⁸O (Westerhold et al., 2020), δ¹³C (Westerhold et al., 2020), atmospheric CO₂ concentration estimates from multiple proxy records (Hönisch et al., 2023), and global sea-level estimates relative to present (Miller et al., 2020). Breakpoints (black dots) and confidence intervals (light green bars) are estimated using the preferred 12-breakpoint model on the δ¹⁸O record. The vertical dashed lines show the transitions found by Westerhold et al. (2020). Notable alignments of features in the records with estimated breakpoints include the PETM (56 Ma), EOT (34 Ma), and MMCO (17 Ma), supporting the interpretation of the breakpoints as indicators of major climate transitions.

Download

Several of the remaining seven estimated breakpoints also coincide with known climate events. For instance, BP₆ aligns with the cooling following the Middle Eocene Climatic Optimum (MECO), originally described by Bohaty and Zachos (2003), and occurs after a peak in atmospheric CO₂ concentrations inferred from boron isotope records (Henehan et al., 2020) (Fig. 8). Another breakpoint, BP₁₀, is estimated at 9.975 Ma and broadly coincides with the expansion of C₄ grasslands (Fig. 9), which altered the global carbon cycle and land surface with potential downstream effects on climate (Polissar et al., 2019; Strömberg, 2011). Notably, both of these breakpoints are also identified by Rousseau et al. (2023), who applied recurrence analysis and a Kolmogorov–Smirnov test to the same δ¹⁸O dataset. BP₈ matches the onset of the Mid-Miocene Climatic Optimum (MMCO), estimated at 16.95 Ma (Flower and Kennett, 1994; Zachos et al., 2001). BP₂ aligns with the maximum in both the δ¹³C and sea level records at 58.03 and 58.21 Ma, respectively (Fig. 7). This period has also been described by Harper et al. (2024) as the peak of the Paleocene Carbon Isotope Maximum (PCIM).

Particularly noteworthy is the lack of breakpoints, even with 15 detections, between the EOT at 34 Ma and the onset of the MMCO around 17 Ma. This is consistent with the idea that this interval is especially stable in the Cenozoic Era, following the establishment of the Antarctic ice sheet (Zachos et al., 2001; Mudelsee et al., 2014).

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f08

Figure 8Boron isotope measurements (δ¹¹B) from multiple ocean drilling sites compiled by Henehan et al. (2020), shown alongside δ¹⁸O values. The estimated breakpoint, BP₆ (black dot), and its confidence interval (light green bar), align with the post-MECO cooling.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f09

Figure 9C₄ grassland expansion, inferred from plant wax carbon isotopes in marine sediments (Polissar et al., 2019), shown alongside δ¹⁸O values. The estimated breakpoint, BP₁₀ (black dot), and its confidence interval (light green bar), align with this ecological transition.

Download

We now focus on the breakpoints estimated within the Icehouse period, which has an average resolution of 0.88 kyr compared to 2.77 kyr for the full record (Fig. 6). The estimation yields a single breakpoint at 1.355 Ma, which may reflect a midpoint in the record rather than a distinct climatic shift. When allowing for two breakpoints, as suggested by the BIC, they are estimated at 2.54 and 0.95 Ma, coinciding well with the intensification of Northern Hemisphere Glaciation (iNHG) (Lisiecki and Raymo, 2005) and the Mid-Pleistocene Transition (MPT) (Pisias and Moore, 1981), respectively. The iNHG marks the initiation of sustained, large-scale glaciation in the Northern Hemisphere, beginning around 2.6 Ma, as evidenced by increasing ice-rafted debris and declining sea level (McClymont et al., 2023). The MPT marks a change in the periodicity and amplitude of glacial–interglacial cycles, which Clark et al. (2006) describe as a gradual transition occurring between 1.25 and 0.7 Ma. James et al. (2024) provide a dynamical argument supporting this view, while others identify a more abrupt increase in ice volume and deep-ocean cooling centered around 0.9 Ma (Elderfield et al., 2012). Both the iNHG and the MPT are thought to be relatively gradual and complex events, which is supported by the long, asymmetrical confidence intervals, ranging from 2.92 to 2.41 Ma for the first breakpoint and from 1.545 to 0.66 Ma for the second.

These results underscore the capability of the Bai-Perron framework to detect key transitions in Earth's climate history but also emphasize the importance of prior understanding of the climate system when interpreting breakpoint estimates.

5 Conclusion

This study presents a statistical time-domain approach to estimate breakpoints in the Cenozoic Era using the econometric tools developed by Bai and Perron (1998, 2003). We analyze the time series of benthic δ¹⁸O compiled by Westerhold et al. (2020), which is a widely cited foundational record particularly for the field of paleoclimatology. Westerhold et al. (2020) identified 5 breakpoints using recurrence analysis, and our analysis strongly corroborates the placement of these breakpoints across various model specifications and binning frequencies. Our approach offers the advantage of constructing confidence intervals for the ages of the breakpoints, providing a measure of estimation uncertainty. Based on the results of our simulation study, we advocate using the model specification with a state-independent autoregressive term and state-dependent intercept.

By selecting the number of breakpoints using information criteria, we provide statistical justification for more than 5 breakpoints in the time series. For instance, the BIC suggests 12 breakpoints. For these, the 5 transitions identified by Westerhold et al. (2020) are preserved, while the additional breakpoints suggest further divisions of the climate states they found. This points to the potential for a more detailed classification of Cenozoic climate states, adding to our understanding of Earth system dynamics.

Although we focus on the benthic δ¹⁸O stack (Westerhold et al., 2020) in this study, the Bai–Perron framework is broadly applicable across paleoclimate research and related disciplines. To guide its use in other contexts, we offer several general recommendations based on our findings: (1) Careful consideration should be given to the choice of binning frequency. While finer binning enhances temporal resolution, it may also preserve measurement errors and introduce artifacts by linear interpolations, particularly in unevenly sampled records. Conversely, coarser binning can lead to loss of information. In our application, we find that the bin width 10 and 25 kyr provide a good balance between detail and signal quality. For other records, we recommend seeking a similar balance. (2) The model specification should reflect the statistical features of the data, such as trends and autocorrelation. Although the Fixed AR model has performed well in our study, the flexibility of the Bai–Perron framework allows users to adapt the model specification to suit different datasets. (3) The number of breakpoints should be selected based on information criteria, such as the BIC or LWZ, which may yield different outcomes depending on model complexity. In our analysis, the BIC tends to favor more breakpoints than the LWZ. We recommend complementing statistical selection with a careful assessment of the climatic relevance of the estimated breakpoints.

These recommendations support the broader application of the framework to other paleoclimate records, like the Cenozoic-spanning reconstructions of δ¹³C or paleo-CO₂. The method is suitable for detecting both gradual and abrupt transitions, including climatic events such as Dansgaard-Oeschger events (Dansgaard et al., 1993; Livina et al., 2010). In addition to its versatility in application, the framework allows for the inclusion of covariates, opening up many possibilities for future applications. As such, incorporating orbital parameters (e.g., eccentricity, obliquity, and precession; Laskar et al., 2004) could create the potential for detecting transitions while controlling for these external drivers. Additionally, one could investigate breaks in the relationship between orbital forcings and paleoclimate variables, reflecting changes in how strongly these external factors influence climate dynamics. A key example is the MPT, marked by a shift in the dominant glacial cycle from 41 to 100 kyr (Berends et al., 2021; Barker et al., 2025), the timing of which could be estimated using the Bai–Perron framework.

These examples highlight the broader potential of the framework as a flexible tool for paleoclimate data analysis. Understanding when and how breakpoints in the climate system occurred is essential for interpreting past climate variability, events, and shifts, and ultimately for informing projections of future climate change. The Bai–Perron framework provides a statistically rigorous way of estimating these breakpoints, offering new opportunities to deepen our understanding of long-term climate dynamics.

Appendix A: Graphs

A1 Reversed time

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f10

Figure A1A comparison of estimated breakpoints using the Mean, Fixed AR, and AR model specifications for five breakpoints on 25 kyr-binned data where the time frame is reversed. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020) and their transitions.

Download

A2 One to 15 breakpoints: Fixed AR model 10 kyr

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f11

Figure A2A comparison of estimated breakpoints using the Fixed AR model for 1 to 15 breakpoints on 10 kyr-binned data. The minimum state length is set to h=1 Myr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020) and their transitions.

Download

A3 One to 15 breakpoints: Mean model

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f12

Figure A3A comparison of estimated breakpoints using the Mean model for 1 to 15 breakpoints on 25 kyr-binned data. The minimum state length is set to h=1 Myr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020) and their transitions.

Download

A4 One to 15 breakpoints: AR model

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f13

Figure A4A comparison of estimated breakpoints using the AR model for 1 to 15 breakpoints on 25 kyr-binned data. The minimum state length is set to h=1 Myr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020) and their transitions.

Download

A5 One and two breakpoints in the Icehouse: Mean model 5 kyr

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f14

Figure A5A comparison of estimated breakpoints using the Mean model for one and two breakpoints on 5 kyr-binned data for the Icehouse period. The minimum state length is set to h=250 kyr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020).

Download

A6 One and two breakpoints in the Icehouse: AR model 5 kyr

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f15

Figure A6A comparison of estimated breakpoints using the AR model for one and two breakpoints on 5 kyr-binned data for the Icehouse period. The minimum state length is set to h=250 kyr. The black dots represent estimated breakpoints, while colored shaded rectangles indicate 95 % confidence intervals. The results overlay the δ¹⁸O data from Westerhold et al. (2020).

Download

Appendix B: Tables

B1 Summary statistics: State-wise and full sample

Table B1Summary statistics of the binned data with bin sizes (5, 10, 25, 50, 75, and 100 kyr) and the δ¹⁸O data without binning for each of the states identified by Westerhold et al. (2020) and the full sample period.

Download Print Version | Download XLSX

B2 Estimated breakpoints: 5 breakpoints

Table B2Estimated breakpoints and their 95 % confidence intervals (in Ma) where the number of breakpoints is fixed to 5, and all values are rounded to three decimals. The table shows estimates for each method across bin sizes 5, 10, 25, 50, 75, and 100 kyr.

Download Print Version | Download XLSX

B3 Estimated parameters: 5 breakpoints and 25 kyr binned data

Table B3Estimated parameters and their corresponding standard errors (SE) for each model specification. Parameters absent in a given model specification are denoted by ×. The number of breakpoints is set to 5, and the parameters are estimated with a binning frequency of 25 kyr and h=2.5 Myr. All values are rounded to three decimals.

Download Print Version | Download XLSX

B4 The number of breakpoints selected by information criteria

Table B4The number of breakpoints selected using BIC and LWZ criterion for all models and binning frequencies considered. The minimum state length is set to h=2.5 Myr and the maximum number of breakpoints is 26.

Download Print Version | Download XLSX

Appendix C: Simulation study

C1 Serially uncorrelated error term

In this appendix, we assess whether the methodology by Bai and Perron (1998, 2003) can be used to accurately estimate the number and timing of breakpoints in a state-wise non-stationary time series. We conduct 1000 simulations for each data-generating process (DGP) with a sample size of 500. All the DGPs considered have the following form,

\begin{matrix} (C1) & \begin{aligned} y_{t} & = c_{1} + φ_{1} y_{t - 1} + ε_{t}, ε_{t} \overset{i.i.d.}{\sim} N (0, σ^{2}) for t \leq T / 2 \\ y_{t} & = c_{2} + φ_{2} y_{t - 1} + ε_{t}, ε_{t} \overset{i.i.d.}{\sim} N (0, σ^{2}) for t > T / 2 . \end{aligned} \end{matrix}

Hence, we consider a single breakpoint in the middle of the sample interval, namely at t=250. We examine eight DGPs, each specified and described in Table C1.

Table C1Data-generating processes for the simulation study and short descriptions. RW: random walk.

Download Print Version | Download XLSX

The DGPs range from random walk models with a break in the drift term to models with breaks in both the intercept and the AR coefficient. For comparison, we include a random walk without breakpoints as the sixth model. For each of the DGPs, we are interested in the performance of the methodology by Bai and Perron (1998, 2003) in estimating the breakpoint and confidence intervals. The model specifications from Sect. 2.3 are estimated on the data generated by the DGPs, and we use the implementation outlined in Sect. 2.4. We use the R-package mbreaks by Nguyen et al. (2023), and we impose a single breakpoint in the estimation. The left and right panels of Figs. C1 through C8 display realizations of the DGP and density plots of the estimated breakpoints for each of the models, respectively. The results are summarized in Table C2, which provides the mean of the estimated breakpoints, and medians of the lower and upper boundaries of the estimated 95 % CIs are tabulated along with their coverage rates for each model and DGP.

Table C2Mean of the estimated breakpoints and medians of the lower and upper boundary of the estimated confidence intervals, along with the coverage rates for each model specification and DGP. DGP 6 is simulated without a breakpoint, so the coverage rate is irrelevant and indicated by ×.

Download Print Version | Download XLSX

In the first DGP, a random walk with a small drift term break, we observe that the mean of the estimated breakpoints is later than the true breakpoint in all model specifications. Additionally, the density plots exhibit asymmetry around the true breakpoint. This is expected due to the low magnitude of the break in the drift term, which creates a subtle change in the overall stochastic trend, making accurate breakpoint detection difficult. In the second DGP with a larger drift term break, the estimated breakpoints exhibit a narrower and more bell-shaped density. The mean estimated breakpoints for the Fixed AR and AR models slightly precede the true breakpoint. However, the Mean model performs poorly, with the mean of the estimated breakpoints far from the true breakpoint.

In the third DGP, both the Fixed AR and AR models produce mean estimated breakpoints slightly later than the true breakpoint. The Mean model exhibits better performance in this DGP than in the second DGP. The fourth DGP has a break in the intercept and the AR-coefficient from 0.95 to 1, resulting in a state-wise non-stationary model. This change leads to breakpoint estimates very close to the true breakpoint, except in the Mean model. A similar outcome is observed in the fifth DGP, which features a larger increase in the AR-coefficient. In the sixth DGP, which is defined without any breakpoints, the Mean model estimates breakpoints near the midpoint of the sample period, while the other two specifications yield inconclusive results. In the seventh DGP, the AR and Fixed AR models produce estimates close to the true breakpoint. However, the Mean model continues to produce breakpoint estimates far from the true value. Examining the eighth DGP, the three models perform almost equally well.

Overall, the Fixed AR and AR models tend to perform well in non-stationary scenarios, estimating breakpoints close to the true breakpoints. The methodology, however, appears to struggle with accurately estimating the true breakpoint in cases of minor changes between states and large error term variance. In contrast, the Mean model does not perform well in DGPs featuring gradual changes, aligning with theoretical expectations as detailed in Bai and Perron (2003).

The coverage rate of a CI is the proportion of times the CI covers the true breakpoint, here at t=250. We find that the CIs of the Mean model are generally very wide and have varying coverage. In the Fixed AR and AR models, the CIs are typically narrower. The coverage rates are best in the DGPs with large differences between the states as seen in DGPs 4, 5, 7 and 8 using the Fixed AR model specification, which is in line with the findings of Bai and Perron (2003). For the AR model, the coverage rates are only close to the desired 95 % in the seventh and eighth DGP, indicating that the CIs are inadequate in most of the DGPs considered.

Table C3Means of the estimated number of breakpoints for each model specification across different DGPs, rounded to one decimal. Percentages indicate the proportion of estimates equal to the true number of breakpoints.

Download Print Version | Download XLSX

Table C3 shows the mean number of breakpoints estimated for each DGP and method, along with the proportion of correctly estimated breakpoints. The difficulty in accurately estimating gradual changes using the Mean model is also evident when estimating the number of breakpoints. This model specification leads to overestimating the number of breakpoints in all DGPs considered except DGP 8, where it performs well. The BIC criterion in the Fixed AR specification performs very well, with an estimated number of breakpoints equal to the true number in most simulations in DGP 2–8. The LWZ criterion performs almost equally well except in the third DGP, while the KT criterion vastly overestimates the number of breakpoints in DGP 1–7. In the AR model, the information criteria all perform well in DGPs 2–8 except for the third DGP where the LWZ criterion underestimates the number of breakpoints.

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f16

Figure C1DGP 1: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f17

Figure C2DGP 2: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f18

Figure C3DGP 3: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f19

Figure C4DGP 4: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f20

Figure C5DGP 5: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f21

Figure C6DGP 6: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f22

Figure C7DGP 7: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f23

Figure C8DGP 8: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

C2 Serially correlated error term

A possible extension of the simulation study outlined in Eq. (C1) is allowing the error term to exhibit serial correlation. We use the same DGPs as before, but generate ${ε_{t}}_{t = 1}^{T}$ as follows,

\begin{matrix} (C2) & ε_{t} = ψ ε_{t - 1} + θ η_{t - 1} + η_{t}, η_{t} \overset{i.i.d.}{\sim} N (0, σ_{η}^{2}) \forall t . \end{matrix}

We conduct 1000 simulations for each, with a sample size of 500. Here, we consider DGPs 2, 3, 4, 5, 7, and 8 as outlined in Table C1 and refer to these DGPs in the serially correlated cases as models 2_s, 3_s, 4_s, 5_s, 7_s, and 8_s. We set $ψ = θ = 0.5$ and the standard deviation σ_η, such that the standard deviation of ε_t corresponds to the σ in Table C1. This is accomplished as follows,

\begin{aligned} Var (ε_{t}) & = Var (ψ ε_{t - 1} + θ η_{t - 1} + η_{t}) \\ = ψ^{2} Var (ε_{t - 1}) + θ^{2} Var (η_{t - 1}) \\ + 2 ψ θ Cov (ε_{t - 1}, η_{t - 1}) + Var (η_{t}) . \\ = ψ^{2} Var (ε_{t - 1}) + θ^{2} σ_{η}^{2} + 2 ψ θ σ_{η}^{2} + σ_{η}^{2}, \end{aligned}

since ε_t−1 and η_t−1 have zero means and $E [ε_{t} η_{t}] = ϕ E [ε_{t - 1} η_{t}] + θ E [η_{t} η_{t - 1}] + E [η_{t}^{2}] = σ_{η}^{2}$ . Given stationarity of the process, which implies σ²=Var(ε_t) for all t, we derive,

σ_{η}^{2} = σ^{2} \frac{1 - ψ^{2}}{1 + θ^{2} + 2 ψ θ} .

This adjustment ensures the comparability of the results between the two error term types.

In Figs. C9 through C14, we plot examples of realizations and frequency plots of the estimated breakpoints using each of the models while imposing a single breakpoint in the estimation. The results are summarized in Table C4, which provides means of the estimated breakpoints and medians of the lower and upper boundary of the estimated confidence intervals, along with the coverage rates for each model specification and DGP. Generally speaking, the mean of the estimated breakpoints are further from the true breakpoint and the CIs become wider compared to the results from the corresponding DGPs without serial correlation. It is evident that serial correlation in the error term makes it more difficult to estimate the dating of breaks. We find that the Fixed AR and AR models perform well for DGP 7_s, which has a large difference between the states and low variance. This is in line with the theoretical framework by Bai and Perron (2003), who note that the estimated break dates are consistent even in the presence of serial correlation. The Fixed AR model performs well in DGPs 2_s, 4_s and 5_s where the mean of the estimated breakpoints is close to the true breakpoint, and confidence intervals are reasonably wide with acceptable coverage rates. The results of the AR model are less conclusive.

For the Mean and Fixed AR models, the coverage rates are generally close to the desired 95 % and even higher in some DGPs. However, the CIs are also extremely wide, reaching outside the sample window in many DGPs. The CIs seem reasonable in the Fixed AR model for DGPs 2_s, 4_s, 5_s, and 7_s, where the coverage rates are close to 95 % and the medians of the lower and upper bounds of the CIs are not too extreme. The CIs for the AR model are generally wider than in the version without serial correlation in the error term. In the AR model, the coverage rates are lower than the desired 95 %, but it seems that DGPs with large breaks have higher coverage rates. The relatively poor performance is in line with the theoretical framework by Bai and Perron (2003). The authors note that the construction of the CIs rely on having no serial correlation in the error term if a lagged dependent variable is included as a regressor that has coefficients that are subject to breakpoints.

Table C4Mean of the estimated breakpoints and medians of the lower and upper boundary of the estimated confidence intervals, along with the coverage rates for each model specification and DGP.

Download Print Version | Download XLSX

Table C5Means of the estimated number of breakpoints for each model specification across different DGPs, rounded to one decimal. Percentages indicate the proportion of estimates equal to the true number of breakpoints.

Download Print Version | Download XLSX

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f24

Figure C9DGP 2_s: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f25

Figure C10DGP 3_s: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f26

Figure C11DGP 4_s: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f27

Figure C12DGP 5_s: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f28

Figure C13DGP 7_s: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

https://cp.copernicus.org/articles/21/1981/2025/cp-21-1981-2025-f29

Figure C14DGP 8_s: Left: Five process realizations. Right: The densities of the estimated breakpoints for each specification.

Download

Table C5 shows the mean number of breakpoints estimated for each DGP and method, along with the proportion of correctly estimated number. In the Mean model, all information criteria overestimate the number of breakpoints. An important exception is the eighth DGP, where the performance is better, as in the case without serial correlation. In the Fixed AR and AR model specifications, the LWZ criterion generally performs well, while both the BIC and the KT criteria generally overestimate. However, the LWZ criterion leads to underestimating the number of breakpoints in DGPs 3_s and 8_s. These two DGPs are characterized by fixed AR-coefficients that are lower than one. This implies that these two processes do not exhibit an autoregressive unit root. Hence, it seems that the LWZ criterion performs well in cases of state-wise non-stationarity or switching between stationary and non-stationary states.

Compared to the findings in the DGPs without serial correlation, it is clear that the proportion of correct estimates are lower for most DGPs and model specifications. Overall, the best performing criterion seems to be the LWZ criterion in the Fixed AR and AR models, while the Mean model typically leads to overestimating the number of breakpoints.

Code availability

The code used to conduct the analysis is based on the R-package mbreaks by Nguyen et al. (2023) and the implementation is available upon request.

Data availability

The data used in this study are available as the supplementary material of Westerhold et al. (2020).

Author contributions

Conceptualization: MB, EH, SJK, KBL. Formal analysis: MB, EH, SJK, KBL, RL. Methodology: MB, EH, SJK, KBL. Software: KBL. Visualization: KBL. Writing (original draft): KBL. Writing (review and editing): MB, EH, SJK, KBL, RL. Authors listed alphabetically.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Also, please note that this paper has not received English language copy-editing. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

For helpful comments and suggestions, we thank participants at the conferences on Econometric Models of Climate Change (EMCC-VII and VIII) in Amsterdam in 2023 and in Cambridge, UK, in 2024, at the General Assembly of the EGU in Vienna in 2024, and seminar participants at Aarhus University.

Financial support

This research has been supported by the Danmarks Frie Forskningsfond (grant no. 7015-00018B).

Review statement

This paper was edited by Julien Emile-Geay and reviewed by two anonymous referees.

References

Andrews, D. W. K.: Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation, Econometrica, 59, 817–858, https://doi.org/10.2307/2938229, 1991. a

Andrews, D. W. K. and Monahan, J. C.: An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator, Econometrica, 60, 953–966, https://doi.org/10.2307/2951574, 1992. a

Bagniewski, W., Ghil, M., and Rousseau, D. D.: Automatic detection of abrupt transitions in paleoclimate records, Chaos, 31, 113129, https://doi.org/10.1063/5.0062543, 2021. a

Bai, J. and Perron, P.: Estimating and Testing Linear Models with Multiple Structural Changes, Econometrica, 66, 47–78, https://doi.org/10.2307/2998540, 1998. a, b, c, d, e

Bai, J. and Perron, P.: Computation and analysis of multiple structural change models, Journal of Applied Econometrics, 18, 1–22, https://doi.org/10.1002/jae.659, 2003. a, b, c, d, e, f, g, h, i, j, k, l, m, n, o

Bai, J. and Perron, P.: Multiple Structural Change Models: A Simulation Analysis, 212–238, Cambridge University Press, https://doi.org/10.1017/CBO9781139164863.010, 2006. a

Barker, S., Lisiecki, L. E., Knorr, G., Nuber, S., and Tzedakis, P. C.: Distinct roles for precession, obliquity, and eccentricity in Pleistocene 100-kyr glacial cycles, Science, 387, eadp3491, https://doi.org/10.1126/science.adp3491, 2025. a

Bennedsen, M., Hillebrand, E., Koopman, S. J., and Larsen, K. B.: Continuous-time state-space methods for delta-O-18 and delta-C-13, arXiv [preprint], https://doi.org/10.48550/arXiv.2404.05401, 2024. a

Berends, C. J., Köhler, P., Lourens, L. J., and van de Wal, R. S. W.: On the Cause of the Mid-Pleistocene Transition, Reviews of Geophysics, 59, https://doi.org/10.1029/2020RG000727, 2021. a

Boettner, C., Klinghammer, G., Boers, N., Westerhold, T., and Marwan, N.: Early-warning signals for Cenozoic climate transitions, Quaternary Science Reviews, 270, 107177, https://doi.org/10.1016/j.quascirev.2021.107177, 2021. a

Bohaty, S. M. and Zachos, J. C.: Significant Southern Ocean warming event in the late middle Eocene, Geology, 31, 1017–1020, https://doi.org/10.1130/G19800.1, 2003. a

Burke, K. D., Williams, J. W., Chandler, M. A., Haywood, A. M., Lunt, D. J., and Otto-Bliesner, B. L.: Pliocene and Eocene provide best analogs for near-future climates, P. Natl. Acad. Sci. USA, 115, 13288–13293, https://doi.org/10.1073/pnas.1809600115, 2018. a

Caballero, R. and Huber, M.: State-dependent climate sensitivity in past warm climates and its implications for future climate projections, P. Natl. Acad. Sci. USA, 110, 14162–14167, https://doi.org/10.1073/pnas.1303365110, 2013. a

Clark, P. U., Archer, D., Pollard, D., Blum, J. D., Rial, J. A., Brovkin, V., Mix, A. C., Pisias, N. G., and Roy, M.: The middle Pleistocene transition: characteristics, mechanisms, and implications for long-term changes in atmospheric pCO₂, Quat. Sci. Rev., 25, 3150–3184, https://doi.org/10.1016/j.quascirev.2006.07.008, 2006. a

Coxall, H. K., Wilson, P. A., Pälike, H., Lear, C. H., and Backman, J.: Rapid stepwise onset of Antarctic glaciation and deeper calcite compensation in the Pacific Ocean, Nature, 433, 53–57, https://doi.org/10.1038/nature03135, 2005. a

Crowley, T. J. and Hyde, W. T.: Transient nature of late Pleistocene climate variability, Nature, 456, 226–230, https://doi.org/10.1038/nature07365, 2008. a

Dansgaard, W., Johnsen, S., Clausen, H., Dahl-Jensen, D., Gundestrup, N., Hammer, C., Hvidberg, C., Steffensen, J., Sveinbjörnsdottir, A., Jouzel, J., and GC, B.: Evidence of general instability of past climate from a 250-kyr ice-core record, Nature, 364, 218–220, https://doi.org/10.1038/364218a0, 1993. a

Davidson, J., Stephenson, D., and Turasie, A.: Time series modeling of paleoclimate data, Environmetrics, 27, https://doi.org/10.1002/env.2373, 2015. a

Dickey, D. and Fuller, W.: Distribution of the Estimators for Autoregressive Time Series With a Unit Root, Journal of the American Statistical Association, 74, https://doi.org/10.2307/2286348, 1979. a

Elderfield, H., Ferretti, P., Greaves, M., Crowhurst, S., McCave, I. N., Hodell, D., and Piotrowski, A. M.: Evolution of ocean temperature and ice volume through the mid-Pleistocene climate transition, Science, 337, 704–709, https://doi.org/10.1126/science.1221294, 2012. a

Epstein, S., Buchsbaum, R., Lowenstam, H., and Urey, H. C.: Carbonate-Water Isotopic Temperature Scale, GSA Bulletin, 62, 417–426, https://doi.org/10.1130/0016-7606(1951)62[417:CITS]2.0.CO;2, 1951. a

Fischer, M. L., Munz, P. M., Asrat, A., Foerster, V., Kaboth-Bahr, S., Marwan, N., Schaebitz, F., Schwanghart, W., and Trauth, M. H.: Spatio-temporal variations of climate along possible African–Arabian routes of H. sapiens expansion, Quaternary Science Advances, 14, 100174, https://doi.org/10.1016/j.qsa.2024.100174, 2024. a

Fleitmann, D., Burns, S. J., Mudelsee, M., Neff, U., Kramers, J., Mangini, A., and Matter, A.: Holocene Forcing of the Indian Monsoon Recorded in a Stalagmite from Southern Oman, Science, 300, 1737–1739, https://doi.org/10.1126/science.1083130, 2003. a

Flower, B. P. and Kennett, J. P.: The middle Miocene climatic transition: East Antarctic ice sheet development, deep ocean circulation and global carbon cycling, Palaeogeography, Palaeoclimatology, Palaeoecology, 108, 537–555, https://doi.org/10.1016/0031-0182(94)90251-8, 1994. a, b

Franke, J. G. and Donner, R. V.: Correlating paleoclimate time series: Sources of uncertainty and potential pitfalls, Quaternary Science Reviews, 212, 69–79, https://doi.org/10.1016/j.quascirev.2019.03.017, 2019. a

Garcia, R. and Perron, P.: An Analysis of the Real Interest Rate under Regime Shifts, Rev. Econ. Stat., 78, 111–125, https://doi.org/10.2307/2109851, 1996. a

Goswami, B., Boers, N., Rheinwalt, A., Marwan, N., Heitzig, J., Breitenbach, S., and Kurths, J.: Abrupt transitions in time series with uncertainties, Nature Communications, 9, https://doi.org/10.1038/s41467-017-02456-6, 2018. a, b

Hansen, J., Sato, M., Russell, G., and Kharecha, P.: Climate Sensitivity, Sea Level and Atmospheric Carbon Dioxide, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 371, https://doi.org/10.1098/rsta.2012.0294, 2013. a

Harper, D. T., Hönisch, B., Bowen, G. J., Zeebe, R. E., Haynes, L. L., Penman, D. E., and Zachos, J. C.: Long- and short-term coupling of sea surface temperature and atmospheric CO₂ during the late Paleocene and early Eocene, P. Natl. Acad. Sci. USA, 121, e2318779121, https://doi.org/10.1073/pnas.2318779121, 2024. a

Henehan, M. J., Edgar, K. M., Foster, G. L., Penman, D. E., Hull, P. M., Greenop, R., Anagnostou, E., and Pearson, P. N.: Revisiting the Middle Eocene Climatic Optimum “Carbon Cycle Conundrum” With New Estimates of Atmospheric pCO₂ From Boron Isotopes, Paleoceanography and Paleoclimatology, 35, e2019PA003713, https://doi.org/10.1029/2019PA003713, 2020. a, b

Hönisch, B., Royer, D. L., Breecker, D. O., et al.: Toward a Cenozoic history of atmospheric CO₂, Science, 382, eadi5177, https://doi.org/10.1126/science.adi5177, 2023. a, b

James, A., Emile-Geay, J., Malik, N., and Khider, D.: Detecting Paleoclimate Transitions With Laplacian Eigenmaps of Recurrence Matrices (LERM), Paleoceanography and Paleoclimatology, 39, e2023PA004700, https://doi.org/10.1029/2023PA004700, 2024. a

Kejriwal, M., Perron, P., and Zhou, J.: Wald tests for detecting multiple structural changes in persistence, Econometric Theory, 29, 289–323, https://doi.org/10.1017/S0266466612000357, 2013. a

Keyes, N. D. B., Giorgini, L. T., and Wettlaufer, J. S.: Stochastic paleoclimatology: Modeling the EPICA ice core climate records, Chaos, 33, 093132, https://doi.org/10.1063/5.0128814, 2023. a

Killick, R., Fearnhead, P., and Eckley, I. A.: Optimal Detection of Changepoints With a Linear Computational Cost, Journal of the American Statistical Association, 107, 1590–1598, https://doi.org/10.1080/01621459.2012.737745, 2012. a

Kunz, T., Dolman, A. M., and Laepple, T.: A spectral approach to estimating the timescale-dependent uncertainty of paleoclimate records – Part 1: Theoretical concept, Clim. Past, 16, 1469–1492, https://doi.org/10.5194/cp-16-1469-2020, 2020. a

Kurozumi, E. and Tuvaandorj, P.: Model selection criteria in multivariate models with multiple structural changes, Journal of Econometrics, 164, 218–238, https://doi.org/10.1016/j.jeconom.2011.04.003, 2011. a

Laskar, J., Robutel, P., Joutel, F., Gastineau, M., Correia, A. C. M., and Levrard, B.: A long-term numerical solution for the insolation quantities of the Earth, Astronomy & Astrophysics, 428, 261–285, https://doi.org/10.1051/0004-6361:20041335, 2004. a

Liang, J., Wang, Y., Zhang, S., Huang, C., Xu, E., and Zhang, Z.: Astronomical Forcing of late oligocene to early Miocene Paleoclimate: A case study from the Northern South China Sea, Palaeogeography, Palaeoclimatology, Palaeoecology, 673, 113007, https://doi.org/10.1016/j.palaeo.2025.113007, 2025. a

Lisiecki, L. E. and Raymo, M. E.: A Pliocene-Pleistocene stack of 57 globally distributed benthic δ¹⁸O records, Paleoceanography, 20, PA1003, https://doi.org/10.1029/2004PA001071, 2005. a, b, c

Liu, J., Wu, S., and Zidek, J. V.: On Segmented Multivariate Regressions, Statistica Sinica, 7, 497–525, 1997. a

Livina, V. N., Kwasniok, F., and Lenton, T. M.: Potential analysis reveals changing number of climate states during the last 60 kyr, Clim. Past, 6, 77–82, https://doi.org/10.5194/cp-6-77-2010, 2010. a, b

Lund, R. and Shi, X.: Changepoint Methods in Climatology, CHANCE, 36, 4–8, https://doi.org/10.1080/09332480.2023.2203643, 2023. a

Marwan, N.: Challenges and perspectives in recurrence analyses of event time series, Frontiers in Applied Mathematics and Statistics, 9, https://doi.org/10.3389/fams.2023.1129105, 2023. a

Marwan, N., Carmen Romano, M., Thiel, M., and Kurths, J.: Recurrence plots for the analysis of complex systems, Physics Reports, 438, 237–329, https://doi.org/10.1016/j.physrep.2006.11.001, 2007. a

Marwan, N., Donges, J. F., Donner, R. V., and Eroglu, D.: Nonlinear time series analysis of palaeoclimate proxy records, Quaternary Science Reviews, 274, 107245, https://doi.org/10.1016/j.quascirev.2021.107245, 2021. a, b, c

McClymont, E. L., Ho, S. L., Ford, H. L., Bailey, I., Berke, M. A., Bolton, C. T., De Schepper, S., Grant, G. R., Groeneveld, J., Inglis, G. N., Karas, C., Patterson, M. O., Swann, G. E. A., Thirumalai, K., White, S. M., Alonso-Garcia, M., Anand, P., Hoogakker, B. A. A., Littler, K., Petrick, B. F., Risebrobakken, B., Abell, J. T., Crocker, A. J., de Graaf, F., Feakins, S. J., Hargreaves, J. C., Jones, C. L., Markowska, M., Ratnayake, A. S., Stepanek, C., and Tangunan, D.: Climate Evolution Through the Onset and Intensification of Northern Hemisphere Glaciation, Reviews of Geophysics, 61, e2022RG000793, https://doi.org/10.1029/2022RG000793, 2023. a

McInerney, F. A. and Wing, S. L.: The Paleocene-Eocene Thermal Maximum: A Perturbation of Carbon Cycle, Climate, and Biosphere with Implications for the Future, Annual Review of Earth and Planetary Sciences, 39, 489–516, https://doi.org/10.1146/annurev-earth-040610-133431, 2011. a

Miller, K. G., Browning, J. V., Schmelz, W. J., Kopp, R. E., Mountain, G. S., and Wright, J. D.: Cenozoic sea-level and cryospheric evolution from deep-sea geochemical and continental margin records, Science Advances, 6, eaaz1346, https://doi.org/10.1126/sciadv.aaz1346, 2020. a, b

Mudelsee, M.: Ramp function regression: a tool for quantifying climate transitions, Computers & Geosciences, 26, 293–307, https://doi.org/10.1016/S0098-3004(99)00141-7, 2000. a

Mudelsee, M.: Climate Time Series Analysis: Classical Statistical and Bootstrap Methods, vol. 51 of Atmospheric and Oceanographic Sciences Library, Springer, Cham, Heidelberg, New York, Dordrecht, London, 2nd edn., ISBN 978-3-319-04449-1, https://doi.org/10.1007/978-3-319-04450-7, 2014. a

Mudelsee, M. and Raymo, M. E.: Slow dynamics of the Northern Hemisphere Glaciation, Paleoceanography, 20, PA4022, https://doi.org/10.1029/2005PA001153, 2005. a

Mudelsee, M., Bickert, T., Lear, C. H., and Lohmann, G.: Cenozoic climate changes: A review based on time series analysis of marine benthic δ¹⁸O records, Reviews of Geophysics, 52, 333–374, https://doi.org/10.1002/2013RG000440, 2014. a, b, c, d

Nguyen, L., Yamamoto, Y., and Perron, P.: mbreaks: Estimation and Inference for Structural Breaks in Linear Regression Models, r package version 1.0.0, Econometrica, 66, 47–78, https://doi.org/10.2307/2998540, 2023. a, b, c, d

Oerlemans, J.: Correcting the Cenozoic δ¹⁸O deep-sea temperature record for Antarctic ice volume, Palaeogeography, Palaeoclimatology, Palaeoecology, 208, 195–205, https://doi.org/10.1016/j.palaeo.2004.03.004, 2004. a

Pisias, N. G. and Moore, T. C.: The Evolution of the Pleistocene Climate: A Time Series Approach, Earth and Planetary Science Letters, 52, 450–458, https://doi.org/10.1016/0012-821X(81)90197-7, 1981. a

Polissar, P. J., Rose, C., Uno, K. T., Phelps, S. R., and deMenocal, P.: Synchronous rise of African C₄ ecosystems 10 million years ago in the absence of aridification, Nature Geoscience, 12, 657–660, https://doi.org/10.1038/s41561-019-0399-2, 2019. a, b

Reeves, J., Chen, J., Wang, X. L., Lund, R., and Lu, Q.: A Review and Comparison of Changepoint Detection Techniques for Climate Data, Journal of Applied Meteorology and Climatology, 46, 900–915, https://doi.org/10.1175/JAM2493.1, 2007. a

Rousseau, D.-D., Bagniewski, W., and Lucarini, V.: A Punctuated Equilibrium Analysis of the Climate Evolution of Cenozoic Exhibits a Hierarchy of Abrupt Transitions, Scientific Reports, 13, 11290, https://doi.org/10.1038/s41598-023-38454-6, 2023. a, b, c

Ruggieri, E.: A Bayesian approach to detecting change points in climatic records, International Journal of Climatology, 33, 520–528, https://doi.org/10.1002/joc.3447, 2013. a

Schütz, N. and Holschneider, M.: Detection of trend changes in time series using Bayesian inference, Physical Review E, 83, 041131, https://doi.org/10.1103/PhysRevE.84.021120, 2011. a

Shackleton, N.: Oxygen Isotope Analyses and Pleistocene Temperatures Re-assessed, Nature, 215, 15–17, https://doi.org/10.1038/215015a0, 1967. a

Spray, J. F., Bohaty, S. M., Davies, A., Bailey, I., Romans, B. W., Cooper, M. J., Milton, J. A., and Wilson, P. A.: North Atlantic Evidence for a Unipolar Icehouse Climate State at the Eocene-Oligocene Transition, Paleoceanography and Paleoclimatology, 34, 1124–1138, https://doi.org/10.1029/2019PA003563, 2019. a

Steffen, W., Rockström, J., Richardson, K., Lenton, T., Folke, C., Liverman, D., Summerhayes, C., Barnosky, A., Cornell, S., Crucifix, M., Donges, J., Fetzer, I., Lade, S., Scheffer, M., and Schellnhuber, H.: Trajectories of the Earth System in the Anthropocene, P. Natl. Acad. Sci. USA, 115, 201810141, https://doi.org/10.1073/pnas.1810141115, 2018. a

Strömberg, C. A. E.: Evolution of Grasses and Grassland Ecosystems, Annual Review of Earth and Planetary Sciences, 39, 517–544, https://doi.org/10.1146/annurev-earth-040809-152402, 2011. a

Telford, R., Heegaard, E., and Birks, H.: All age–depth models are wrong: but how badly?, Quaternary Science Reviews, 23, 1–5, https://doi.org/10.1016/j.quascirev.2003.11.003, 2004. a

Tierney, J. E., Poulsen, C. J., Montañez, I. P., Bhattacharya, T., Feng, R., Ford, H. L., Hönisch, B., Inglis, G. N., Petersen, S. V., Sagoo, N., Tabor, C. R., Thirumalai, K., Zhu, J., Burls, N. J., Foster, G. L., Goddéris, Y., Huber, B. T., Ivany, L. C., Turner, S. K., Lunt, D. J., McElwain, J. C., Mills, B. J. W., Otto-Bliesner, B. L., Ridgwell, A., and Zhang, Y. G.: Past climates inform our future, Science, 370, eaay3701, https://doi.org/10.1126/science.aay3701, 2020. a

Trauth, M. H.: MATLAB^® Recipes for Earth Sciences, Springer Textbooks in Earth Sciences, Geography and Environment, Springer, access provided by Vrije Universiteit Amsterdam, https://doi.org/10.1007/978-3-031-57949-3, 2025. a, b

Trauth, M. H., Asrat, A., Berner, N., Bibie, F., Foerster, V., Grove, M., Kaboth-Bahr, S., Maslin, M. A., Mudelsee, M., and Schäbitz, F.: Northern Hemisphere Glaciation, African climate and human evolution, Quaternary Science Reviews, 268, 107095, https://doi.org/10.1016/j.quascirev.2021.107095, 2021. a

Trauth, M. H., Asrat, A., Fischer, M. L., Hopcroft, P. O., Foerster, V., Kaboth-Bahr, S., Kindermann, K., Lamb, H. F., Marwan, N., Maslin, M. A., Schaebitz, F., and Valdes, P. J.: Early warning signals of the termination of the African Humid Period(s), Nature Communications, 15, 3697, https://doi.org/10.1038/s41467-024-47921-1, 2024. a

Waelbroeck, C., Labeyrie, L., Michel, E., Duplessy, J. C., McManus, J. F., Lambeck, K., Balbon, E., and Labracherie, M.: Sea-level and deep water temperature changes derived from benthic foraminifera isotopic records, Quaternary Science Reviews, 21, 295–305, https://doi.org/10.1016/S0277-3791(01)00101-9, 2002. a

Westerhold, T., Röhl, U., Donner, B., and Zachos, J. C.: Global Extent of Early Eocene Hyperthermal Events: A New Pacific Benthic Foraminiferal Isotope Record From Shatsky Rise (ODP Site 1209), Paleoceanography and Paleoclimatology, 33, 626–642, https://doi.org/10.1029/2017PA003306, 2018. a

Westerhold, T., Marwan, N., Drury, A. J., Liebrand, D., Agnini, C., Anagnostou, E., Barnet, J. S. K., Bohaty, S. M., Vleeschouwer, D. D., Florindo, F., Frederichs, T., Hodell, D. A., Holbourn, A. E., Kroon, D., Lauretano, V., Littler, K., Lourens, L. J., Lyle, M., Pälike, H., Röhl, U., Tian, J., Wilkens, R. H., Wilson, P. A., and Zachos, J. C.: An astronomically dated record of Earth's climate and its predictability over the last 66 million years, Science, 369, 1383–1387, https://doi.org/10.1126/science.aba6853, 2020. a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, aa, ab, ac, ad, ae, af, ag, ah, ai, aj, ak, al, am, an, ao, ap, aq, ar, as, at, au, av, aw, ax

Yao, Y.-C.: Estimating the number of change-points via Schwarz' criterion, Statistics & Probability Letters, 6, 181–189, https://doi.org/10.1016/0167-7152(88)90118-6, 1988. a

Zachos, J., MO, P., Sloan, L., Thomas, E., and Billups, K.: Trends, Rhythms, and Aberrations in Global Climate 65 Ma to Present, Science (New York, N.Y.), 292, 686–93, https://doi.org/10.1126/science.1059412, 2001. a, b, c

Zachos, J. C., Dickens, G. R., and Zeebe, R. E.: An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics, Nature, 451, 279–283, https://doi.org/10.1038/nature06588, 2008. a

These are the values in column “benthic d18O VPDB Corr”, found in Sheet 33 of the file aba6853_tables_s8_s34.xlsx provided in the Supplement of Westerhold et al. (2020).

Articles

Short summary

This study presents a statistical approach for identifying transitions between climate states in the Cenozoic Era using a 67.1 million-year proxy record. Our results support transitions previously identified in the literature and provide statistical justification for additional ones. The approach enables construction of confidence intervals, offering measures of uncertainty. This adds to our understanding of the Cenozoic climate and provides a tool for analyzing other paleoclimate records.