Similarity estimators for irregular and age-uncertain time series
- 1 Potsdam Institute for Climate Impact Research, P.O. Box 601203, 14412 Potsdam, Germany
- 2 Department of Physics, Humboldt-Universität zu Berlin, Newtonstr. 15, 12489 Berlin, Germany
- 3 Institute for Complex Systems and Mathematical Biology, University of Aberdeen, Aberdeen AB24 3UE, UK
Abstract. Paleoclimate time series are often irregularly sampled and age uncertain, which poses an important technical challenge for the reconstruction of past climate variability and dynamics. Visual comparison and interpolation-based linear correlation approaches have been used to infer dependencies from such proxy time series. While the former is subjective, not quantifiable and not suitable for comparing many data sets at a time, the latter introduces interpolation bias, and both face difficulties if the underlying dependencies are nonlinear.
In this paper we investigate similarity estimators that could be suitable for the quantitative investigation of dependencies in irregular and age-uncertain time series. We compare the Gaussian-kernel-based cross-correlation (gXCF, Rehfeld et al., 2011) and mutual information (gMI, Rehfeld et al., 2013) against their interpolation-based counterparts and the new event synchronization function (ESF). We test the efficiency of the methods in estimating coupling strength and coupling lag numerically, using ensembles of synthetic stalagmites with short, autocorrelated, linearly and nonlinearly coupled proxy time series, and in an application to real stalagmite time series.
In the linear test case, increases in coupling strength are identified consistently by all estimators, while in the nonlinear test case the correlation-based approaches fail. The lag at which the time series are coupled is identified correctly as the maximum of the similarity functions in around 60–55% (in the linear case) to 53–42% (for the nonlinear processes) of the cases when the dating of the synthetic stalagmite is perfectly precise. If the age uncertainty increases beyond 5% of the time series length, however, the true coupling lag is identified no more often than any of the other lags for which the similarity function was estimated. Age uncertainty contributes up to half of the uncertainty in the similarity estimation process. Time series irregularity contributes less, particularly for the adapted Gaussian-kernel-based estimators and the event synchronization function. The introduced link strength concept summarizes the hypothesis test results and balances the individual strengths of the estimators: while gXCF is particularly suitable for short and irregular time series, gMI and the ESF can identify nonlinear dependencies. The ESF could, in particular, be suitable for studying extreme event dynamics in paleoclimate records. Programs to analyze paleoclimatic time series for significant dependencies are included in a freely available software toolbox.
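To make the idea of a kernel-based similarity function concrete, the following is a minimal sketch of a Gaussian-kernel-weighted cross-correlation for two irregularly sampled series, with the coupling lag read off as the maximum over a set of trial lags. It is an illustration under stated assumptions, not the toolbox implementation: the function name `gaussian_kernel_xcf`, the test signal, the kernel width (a quarter of the mean sampling interval) and the sign convention (positive lag means the second series lags the first) are all choices made here for the example.

```python
import numpy as np

def gaussian_kernel_xcf(tx, x, ty, y, lags, h):
    """Kernel-weighted cross-correlation for irregular sampling (illustrative sketch).

    Every pair of observations (x_i, y_j) contributes to the estimate at a trial
    lag, weighted by a Gaussian of width h centred on t_j^y - t_i^x = lag.
    No interpolation onto a regular grid is performed.
    """
    x = (np.asarray(x, float) - np.mean(x)) / np.std(x)  # standardize both series
    y = (np.asarray(y, float) - np.mean(y)) / np.std(y)
    # Pairwise time differences t_j^y - t_i^x, shape (len(x), len(y)).
    dt = np.asarray(ty, float)[None, :] - np.asarray(tx, float)[:, None]
    prod = x[:, None] * y[None, :]
    rho = np.empty(len(lags))
    for k, lag in enumerate(lags):
        w = np.exp(-0.5 * ((dt - lag) / h) ** 2)  # Gaussian kernel weights
        rho[k] = np.sum(w * prod) / np.sum(w)
    return rho

# Hypothetical test signal: two sinusoids with incommensurate periods.
def signal(t):
    return np.sin(2 * np.pi * t / 17.0) + np.sin(2 * np.pi * t / 7.0)

rng = np.random.default_rng(0)
tx = np.sort(rng.uniform(0.0, 400.0, 400))  # irregular sampling times of x
ty = np.sort(rng.uniform(0.0, 400.0, 400))  # independent irregular times of y
true_lag = 3.0
x = signal(tx)
y = signal(ty - true_lag)                   # y lags x by true_lag time units

lags = np.arange(-10.0, 11.0)               # trial lags
h = 0.25 * np.mean(np.diff(tx))             # assumed kernel width
rho = gaussian_kernel_xcf(tx, x, ty, y, lags, h)
best_lag = lags[np.argmax(rho)]             # lag estimate: maximum of the similarity function
print(best_lag, rho.max())
```

This example uses exact observation times. The age uncertainty discussed above would perturb `tx` and `ty` themselves, which is why lag identification degrades as dating errors grow.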