Status: this preprint was under review for the journal CP but the revision was not accepted.
Autoregressive Statistical Modeling of a Peru Margin Multi-Proxy Holocene Record Shows Correlation Not Causation, Flickering Regimes and Persistence
Seonmin Ahn, Baylor Fox-Kemper, Timothy Herbert, and Charles Lawrence
Abstract. Correlation does not necessarily imply causation, but in climatology and paleoclimatology, correlation is used to identify potential cause-and-effect relationships because linking mechanisms are difficult to observe. Confounding by an often unknown outside variable that drives the sets of observables is one of the major factors that lead to correlations that are not the result of causation. Here we show how autoregressive (AR) models can be used to examine lead-lag relationships – helpful in assessing cause and effect – of paleoclimate variables while addressing two other challenges that are often encountered in paleoclimate data: unevenly spaced data and switching between regimes at unknown times. Specifically, we analyze multidimensional paleoclimate proxies, sea surface temperature (SST), C37, δ15N, and %N from the central Peru margin to find their correlations and changes in their variability over the Holocene epoch. The four proxies are sampled at high resolution but are not synchronously sampled at all possible locations. The multidimensional records are treated as evenly spaced data with missing parts, and the missing values are filled by the Kalman filter expected values. We employ hidden Markov models (HMM) and autoregressive HMM (AR-HMM) to address the possibility that the degree of variability in these proxies, and the correlations between them, change over time. The HMM, which is not autoregressive, shows instantaneous correlations between observables in two regimes. However, our investigation of lead-lag relationships using the AR-HMM shows that the cross-correlations do not indicate a causal link.
Each of the four proxies has predictability on decadal timescales, but none of the proxies is a good predictor of any other, so we hypothesize that a common unobserved variable – or a set of variables – is driving the instantaneous relationships among these four proxies, revealing probable confounding without prior knowledge of potential confounding variable(s). These findings suggest that the variability at this site is remotely driven by processes such as those causing the Pacific Decadal Oscillation, rather than locally driven by processes such as increased or decreased vertical mixing of nutrients.
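The distinction drawn in the abstract between instantaneous correlation and lead-lag predictability can be illustrated with a small synthetic experiment. The sketch below is not the authors' AR-HMM: it has a single regime, no missing data, and no Kalman filtering, and all names and parameter values (simulate_var1, fit_var1, the AR coefficients 0.8 and 0.5, the innovation correlation 0.7) are assumptions chosen for illustration. Two series each follow their own AR(1) process, but their innovations are correlated, standing in for an unobserved common driver. A least-squares fit of a first-order vector autoregression should then recover each series' own persistence, near-zero cross-lag coefficients, and a clearly nonzero instantaneous correlation in the residuals.

```python
import numpy as np


def simulate_var1(n, a_diag, innov_cov, seed=0):
    """Simulate x[t] = A x[t-1] + e[t] with diagonal A and correlated e[t].

    The correlated innovations play the role of a hypothetical common
    driver; neither series feeds into the other.
    """
    rng = np.random.default_rng(seed)
    k = len(a_diag)
    A = np.diag(a_diag)
    chol = np.linalg.cholesky(innov_cov)  # innov_cov = chol @ chol.T
    x = np.zeros((n, k))
    for t in range(1, n):
        x[t] = A @ x[t - 1] + chol @ rng.standard_normal(k)
    return x


def fit_var1(x):
    """Least-squares fit of a zero-mean VAR(1); returns (A_hat, resid_cov)."""
    past, present = x[:-1], x[1:]
    # Solve present ~= past @ B, so the coefficient matrix is A_hat = B.T
    B, *_ = np.linalg.lstsq(past, present, rcond=None)
    resid = present - past @ B
    resid_cov = resid.T @ resid / (len(present) - x.shape[1])
    return B.T, resid_cov


# Assumed illustration values, not estimates from the Peru margin record.
true_cov = np.array([[1.0, 0.7],
                     [0.7, 1.0]])
series = simulate_var1(20000, [0.8, 0.5], true_cov)

A_hat, cov_hat = fit_var1(series)
instant_corr = cov_hat[0, 1] / np.sqrt(cov_hat[0, 0] * cov_hat[1, 1])

print("Estimated AR matrix (rows: target, columns: lagged predictor):")
print(np.round(A_hat, 3))
print("Instantaneous residual correlation:", round(float(instant_corr), 3))
```

In this toy setting the off-diagonal entries of the estimated AR matrix measure how well one series' past predicts the other's present once that series' own past is accounted for. Values near zero alongside a strong residual correlation match the pattern the abstract interprets as probable confounding rather than a causal link between the proxies.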
Received: 02 Jan 2018 – Discussion started: 19 Jan 2018