Interactive comment on “Exploring errors in paleoclimate proxy reconstructions using Monte Carlo simulations: paleotemperature from mollusk and coral geochemistry” by M. Carré et al

General comments. The authors present an explicit means to explore uncertainties which are known but not well-accounted for in the paleoproxy literature, in particular for proxy measurements made in corals and molluscs. To improve the impact of the paper I suggest (1) inclusion of chronological error in the algorithm; (2) additional sensitivity experiments involving more of the proxy calibration parameters; (3) discussion of the results in the context of existing proxy reconstruction uncertainty estimates and interpretations; (4) overall revision for clarity and focus; (5) correction and revision of the


Introduction
Reconstructions of the past climate from proxy records involve a wide range of uncertainties at every step of the process.These uncertainties and the subsequent error bar in the reconstruction of a paleoclimatic variable need to be understood and quantified in order to properly interpret the reconstructed variability and to perform meaningful comparisons with climate model outputs.In a recent overview of methods used in high resolution paleoclimatology, Hughes and Ammann (2009) concluded that "the study of the processes by which climate proxy records are formed [...] should be accorded high priority".
Corals and mollusks are privileged archives for high resolution paleoceanographic studies and especially for El Niño Southern Oscillation (ENSO) reconstructions (Cole and Published by Copernicus Publications on behalf of the European Geosciences Union.Fairbanks, 1990;Cole et al., 1993;Welsh et al., 2011).Shortterm windows of monthly sea surface temperature (SST) series can be reconstructed from these accretionary archives using paleo-temperature proxies such as Sr/Ca (Beck et al., 1992;Marshall and McCulloch, 2002;Corrège et al., 2004;Rosenheim et al., 2004) or δ 18 O (Epstein et al., 1953;Grossman and Ku, 1986;Böhm et al., 2000;Carré et al., 2005) serially measured along the growth axis.For these archives, the proxy development work has been mainly concentrated on the calculation of empirical regression models (or transfer functions) linking the geochemical proxies to the environmental variables.The error bar is then estimated for each calculated data point from the scattering of the calibration dataset.As for the reconstructed climate statistics, like the mean or the variance, the confidence in the reconstructed value is generally subjectively linked to the quality of the transfer function, but is not quantified with an error bar, except for a few studies in which the propagation of known errors was calculated for a reconstructed mean value (Abram et al., 2009) or for a reconstructed trend (Nurhati et al., 2011).Tudhope et al. (2001) estimated the ENSOrelated variance in successive stages of the Late Quaternary from coral δ 18 O records and could only address qualitatively the question of the statistical representativeness of their signal.A small number of studies focused on the uncertainties of coral-derived reconstructions.Evans et al. (1998) studied the effect of observational errors and the network of sparse coral records for the reconstruction of global SST. Brown et al. (2008) proposed that the variance of anomalies was the most robust measure of ENSO activity, and also pointed out that short records may reflect sampling of natural ENSO variability rather than a response to external forcing.Thompson et al. (2011) compared coral δ 18 O records with pseudo-corals linearly calculated from climate model output and suggested that differences may be due to uncertainties in the proxy model and in the way coral record environmental variables.Sources of uncertainties are numerous (de Villers et al., 1995) but the way they affect reconstructions is unclear.There is growing agreement in the paleoclimate science community on the need for better methods to evaluate the uncertainties in climate proxy records (Jones et al., 2009).However, a full assessment of paleoclimate reconstruction uncertainties from the statistical analysis of modern dataset is strongly limited for corals and mollusks by the small size of datasets due to the practical constraints that characterize these archives (limited number of sites, scarcity, protection laws, analytical costs).In this case, we propose an approach based on Monte Carlo simulations with numerical surrogate climate proxy records.
Specifically, we provide a ready-to-use, parameterizeyourself, open access algorithm called MoCo available for Matlab ® and R (Supplement) for estimating systematic and standard errors of the mean and variance of the annual SST, and the mean and variance of the SST seasonality reconstructed from mollusk and coral geochemistry.This algorithm is used in a case study using two instrumental SST time series from the Eastern Tropical Pacific to characterize and quantitatively evaluate the sensitivity of systematic and standard errors to categories of (1) sampling, (2) stochastic uncertainty sources, (3) archive-related biological limitations, and (4) climate non-stationarity when reconstructing the time series statistics from samples of short mollusk records.
Monte Carlo simulations have been used in previous studies (Briskin and Harrell, 1980;Ballentine and Hall, 1999;Touchan et al., 1999;Meibom et al., 2003;Kaufman, 2003;Evans et al., 2007) to estimate the error of a paleoclimate reconstruction.The method is thus not novel but its use has been limited to the estimation of a raw uncertainty value.
Here we develop a methodological framework to quantitatively explore how systematic and standard errors of reconstructed statistics build from multiple sources, and to improve the understanding of the proxy signal, and eventually the quality of the coral and mollusk-based paleoclimate reconstructions.This technique is conceptually very simple compared to the full probabilistic modelling studies using Bayesian inferences that have been developed by statisticians for climate field reconstructions (Jones et al., 2009).It is intended for use as an intermediate method, realistic enough to provide reliable assessments of paleoclimate errors, while being technically and conceptually accessible to a broad community in paleoclimate science.We only consider the case where the statistics for the mean and variance of the climate state are estimated for a discrete time interval (similarly to Tudhope et al, 2011).We do not address the case of time series data being used to assess temporal changes in variability (similarly to Cole et al., 1993or Cobb et al., 2003) where chronological uncertainties would also need to be addressed.

The MoCo program
As in all surrogate proxy (also referred to as "pseudo-proxy") studies, the basic principle is to use a realistic climate time series, sample and perturb it in a way that mimics the real sources of uncertainties (Mann and Rutherford, 2002).The inputs to use the MoCo program are (1) a monthly "target" time series from the studied locality, (2) a linear non-biased relationship linking the proxy to the reconstructed variable (thereafter referred to as the proxy model), and (3) the parameters defining the perturbations (more details in Sect.3).MoCo randomly draws a sample of short windows out of the target time series and perturb them to simulate the sources of uncertainties characteristic of mollusks and corals.The perturbed chunks are considered as surrogate proxy-derived climate records and used together to estimate the statistics of the target time series.MoCo calculates the error of this simulated reconstruction by comparing the pseudo-reconstructed value with the "true" value, which is known from the original Potential systematic Error Calibration dataset (T j , P j ) Proxy model, T=f(P) non perturbed time series.By iterating this process thousands of times (Monte Carlo analysis), MoCo yields an estimate of the systematic error (the average error) and of the standard error (the standard deviation of the error) for the type of reconstruction that was simulated (Fig. 1).Throughout the article, the variable of the original time series is the water temperature T , but, with slight modifications of MoCo, the method is suitable for any variable such as carbonate δ 18 O, salinity or pH.We explore the reconstruction of the statistics of the target climate, T 0 , V T 0 , 0 , and V 0 , which are respectively the average and the variance of the annual temperature, and the average and variance of the temperature annual amplitude, defined as follows for a N 0 year long monthly time series: i , where i is the annual amplitude (T max − T min ) of year i.
T 0 and 0 define the average climate, whereas V T 0 and V 0 are considered as measures of the interannual climatic variability.These statistics are estimated by T m , , V T , and V respectively, calculated from a random sample of N surrogate climate proxy series of N Y years, cumulating N S years (N S = N • N Y ), with the following equations: T m , , V T , and V are unbiased estimators of T 0 , 0 , V T 0 , and V 0 .

Different types of error
We explore both systematic and standard errors, S E and σ E of the estimators T m , , V T , and V , defined as follows: The esperance and standard deviation of the estimators are calculated from a population of 5000 values obtained from 5000 random samples of surrogate climate proxy records (Fig. 1).Identifying and estimating systematic errors may allow us to correct the reconstruction and improve its accuracy.
A quantitative estimate of the standard error is also essential to determine a threshold of significance in the amplitude of the climate proxy variations.Defining the error in a paleoclimate reconstruction from a local archive like a coral is not trivial because it depends on the climate information sought.An ideal proxy would provide the exact temperature in a precise location and thus be considered as error-free, but if the aim is to have regional scale information, the proxy signal would still be noisy owing to micro-environment effects.Weather also contributes to the noise inherent in climate statistics.Thus, part of the uncertainty in the reconstruction is related to random sampling in time and space and is thus independent of the quality of the empirical regression model linking the proxy to the climate.
The formation of the proxy record involves a complex chain of physical and biological processes (for instance, mechanisms of Strontium incorporation into coral aragonite) that introduce non-climate-related stochasticity and limitations in the climate-proxy relationship (Allison et al., 2001(Allison et al., , 2005;;Meibom et al., 2003).The scatter inherent in in situ calibration datasets partly captures this stochastic variability but does not allow the exploration of its full range or the characterization of the uncertainty produced by different sources.Stochastic parameters may contribute to the standard error in the reconstruction of climate statistics as well as to systematic errors as we will show.
Paleoclimate reconstructions also involve systematic errors that cannot be estimated and corrected for, and could be referred to as potential systematic errors.They include errors related to the proxy calibration model.Considering that the mechanisms behind such errors are identical for all specimens in the archive (which is generally assumed), then the model would be identically wrong for all the climate calculations.Therefore, the paleoclimate error due to the imperfection of the proxy calibration model does not contribute to the standard error, but instead is more comparable to a systematic error (although its value might be linearly dependent on the proxy variable).It belongs thus to the category of potential systematic errors which are systematic errors for which only a potential range of values can be statistically estimated.
Here we only consider the potential systematic error S P related to the uncertainty of the linear regression model between the reconstructed variable T and the proxy P : T = α • P + β. S P is not estimated through a Monte Carlo simulation.±S P defines a 95 % confidence interval calculated as follows: where T C is the mean temperature of the linear model calibration set.E α and E β are the uncertainty associated to the coefficients α and β so that α ± E α and β ± E β define the 95 % confidence interval of α and β.They are calculated by the following equations: where N C is the number of data points in the calibration dataset, σ T and σ P the standard deviations of T and P respectively in the calibration dataset, R is the Pearson's correlation coefficient, and t is the value of the Student variable at the 0.05 confidence level and N C − 2 degrees of freedom.MoCo yields estimates of the standard error, systematic error, and calculates the potential systematic error associated to the proxy linear model.

The Monte Carlo simulation
Monte Carlo techniques are useful for the analysis of complex stochastic systems (Metropolis and Ullam, 1949;Hastings, 1970).In our study, we simulate the reconstruction of climate statistics from samples of corals or shells from the same area.The shells of the sample are not strictly contemporaneous but belong to the same studied time period.The shell samples are simulated by samples of surrogate climate proxy records that are obtained by sampling a known sea surface temperature (SST) time series and perturbing each individual of the sample with a suite of random noises and constraints.The estimator of the climate statistics calculated from a sample surrogate climate proxy record is only the product of one random sample and of one realization of the infinite number of values and combinations that noise can take.In the Monte Carlo simulation, this process (Fig. 1) is repeated many times (5000 iterations in the experiments presented here) in order to have a representative sample of the range of responses of the estimators T m , , V T , and V and robust estimates of the systematic and standard errors as defined in the previous section.

Using the MoCo application
The MoCo program is available as online supplementary material and can also be downloaded at http://www.isem.univ-montp2.fr/carrematthieu, along with a parameter file and the MoCo readme.txtfile which contains step by step instructions for users.Two versions are available: in R and in Matlab ® language.The algorithm is parameterized by the user according to the specifics of the study.The input parameters are listed in Table 1.The inputs required for the calculation of the potential systematic error related to the proxy model uncertainty are described in Sect.2.1.In the next section, we only describe the inputs required for the Monte Carlo simulation.
3 Inputs to the algorithm

A linear proxy calibration model
The linear empirical relationship linking the geochemical proxy (e.g.Sr/Ca or δ 18 O) to the climate variable (SST) is thereafter referred to as the proxy model: T = α • P + β (Sect.2.1).As in all paleoclimate reconstructions, the proxy model is assumed to be unbiased and to be valid in the past.The statistics of the calibration dataset (detailed in Table 1) are used to calculate the potential systematic error related to the calibration scattering (Sect.2.1).The parameters α and β of the proxy model allow a direct isometry between the proxy space and the temperature space.

A target time series
When using MoCo, a climate time series is first chosen that will be used as the "target climate".As will be discussed later, the characteristics of the target time series have a large influence on the estimation of the reconstruction uncertainty.
The number of years N 0 of the time series should be larger than the total number of years N S of the proxy records sample to allow adequate random sampling.A N S /N 0 ratio lower than 0.2 would keep the average overlap rate in the random samples under 10 %.The length of the target time series should also be chosen according to the period the sample is expected to be representative of.records are being used to study millennial scale variability of ENSO, a time series of about 1000 years should be used as a target in the Monte Carlo simulation.Such a long time series could be either provided by climate models or by a very old modern coral, or could be statistically synthesized.Instrumental time series may be used when century-scale periods are studied from short-lived proxy archives such as one-year long mollusk shells.Here, proxy records and target time series have a monthly resolution.

Random sampling of target time series
Once the target time series is identified, N time windows of N Y years are randomly drawn from it.N represents the sample size or the number of specimens that were analyzed to study the paleoclimate.N Y is the typical duration (number of years) of the proxy records.Two approximations are made in this process: (1) the proxy record time resolution is assumed to be monthly and constant; and (2) all specimens of the sample are assumed to have the same duration.

Signal perturbation
In this step, the random sample of N windows is perturbed by stochastic noise and by archive-related limitations to simulate the effects of the real-world uncertainty sources.Overlap of the windows within a sample is possible but is generally limited since the size of the windows (N Y years) is much smaller than the target time series (Sect.3.1).Every window in the sample is independently perturbed.The product is a group of N monthly windows of surrogate proxy-derived SST.Specific details about the perturbations is given below.

Spatial heterogeneity
Corals and mollusks, as sessile organisms, record only a micro-scale environmental variability that may differ from the regional environmental variability.This effect is especially significant in coastal environments where large spatial heterogeneity can occur.In MoCo, this stochastic noise is represented by a random parameter with a normal distribution N(0, σ S ).A random value is drawn for every specimen and added uniformly to the climate variable T over the whole time window.σ S should be estimated by field measurements according to the studied geographic scale.

Monthly noise
MoCo provides for three additional sources of month-tomonth noise: (1) the geochemical analysis standard error, (2) the weather scale variability during the life time of the specimen, and (3) biogenic carbonate heterogeneity at the µm scale of microsampling owing to vital effects and diagenesis.These three types of noise follow here the normal distributions N(0, σ A ), N (0, σ W ) and N(0, σ C ), respectively, where the standard deviations are expressed in the proxy unit (e.g.mmol mol −1 for coral Sr/Ca).These three sources of uncertainty add in quadration because they are independent and yield a month-to-month noise with normal distribution For every monthly value of every time window, a random value is drawn, converted to the temperature scale, and added to the monthly SST value.All the stochastic noise sources are here considered to be normally distributed because they are expected to have symmetric distributions, and each is comprised of a large number of stochastic processes.

Limitations of the biological archives
The archives considered in this study, corals and mollusks, are living organisms.Their biology constrains their growth and thus the way in which they record the environment.Every species is defined by a range of physico-chemical tolerances beyond which they stop precipitating new carbonate skeletal material.If it is possible that the reconstructed variable may represent a growth limitation (like temperature or salinity) the effect of these limits should be explored and quantified.In MoCo, T S and T I represent temperature limits above and under which precipitation of skeletal material stops.The values beyond these threshold in a sample are discarded.
Some species systematically stop growing at a precise period of the year because their resources are exclusively dedicated to reproduction.This implies a systematic growth break in the record that may affect the final calculated averages or variance.The typical monthly growth pattern of the species is defined in MoCo by parameters gb1 to gb12 (Table 1).
Finally, growth breaks may occur randomly because of storms, predation, or sickness.In the MoCo program, the parameter gap allows the choice of occurrence of zero to 12 random growth breaks per year, of a monthly duration each (Table 1).

Sensitivity experiments
Six sensitivity experiments were performed using the MoCo algorithm to explore the influence of specific parameters on the systematic and standard errors calculated by MoCo.In every experiment, MoCo was used repeatedly with different settings to produce a series of error values that were plotted against the studied parameter.Experiments were thus designed to explore under different angles the influence of (1) the sample size (exp.1, 2, and 6), (2) stochastic perturbations (exp. 1, 2, and 3), (3) the biological limitations of the archive (exp.4 and 5), and (4) the target time series (exp.3, 4, 5, and 6) on the systematic and standard errors when reconstructing the statistical characteristics of the target time series using the estimators T m , , V T , and V .To perform these experiments we chose the illustrative case study m and V T ), and the mean and variance of the seasonal amplitude of SST (Δ and V Δ ) were calculated for the complete series as described in section 2.   m and V T ), and the mean and variance of the seasonal amplitude of SST (Δ and V Δ ) were calculated for the complete series as described in section 2.  m and V T ), and the mean and variance of the seasonal amplitude of SST (Δ and V Δ ) were calculated for the complete series as described in section 2. of temperature reconstructions in the Eastern Tropical Pacific from mollusk shell oxygen isotopes.We used the empirical proxy model established by Grossmann and Ku (1986) for biogenic aragonite (Eq.2), which is often considered as the definition of isotopic equilibrium for biogenic aragonite and has been widely used for paleoclimate studies from aragonitic mollusk shells:

Monthly SST target time series
δ 18 O arag. and δ 18 O W are expressed in ‰ versus the V-PDB and V-SMOW international standards respectively (Coplen, 1996).This proxy model is used here for the case study.Any other proxy model could be used with MoCo.The biological characteristics of the species (growth breaks, temperature tolerance range, length of record...) are varied in the experiments to evaluate their influence on the reconstruction skills.
Three sea surface temperature (SST) time series were used as "target" climatology: (1) the 1925-2002 monthly in situ instrumental record from Puerto Chicama, Peru (Quispe Arce, 1993, dataset available at http://jisao.washington.edu/data sets/#time series), (2) the 1950-2009 monthly SST time series of the Niño1+2 region (NOAA Climate Prediction Center, http://www.cpc.ncep.noaa.gov/data/indices/),and (3) the 1000-year long monthly SST time series of the Niño1+2 area from the preindustrial control simulation of the IPSL CM4v2 coupled ocean atmosphere general circulation model (GCM) (for details about the simulation, see Servonnat et al., 2010).The target time series are presented in Table 2.The first 2 time series were used in all experiments except for the experiment 2 in which the GCM time series was used (Table 3).The parameterization of the sensitivity experiments are summarized in Table 3.

Experiment 1: influence of random sampling and stochastic noise
The first experiment was designed to test the effect of sampling on the standard and systematic errors and compare it to the effect of three stochastic noises that were turned off or on (Fig. 2).Realistic values were assigned to σ S (spatial heterogeneity) and σ m (month-to-month noise) based on field measurements on the Peruvian coast.The effect of random 1-month growth gaps was also tested (Table 3).In this experiment, shell records span one year and no biological limitations were included.

Experiment 2: should long or short records be preferred?
Experiment 2 was designed to explore the advantages and drawbacks of using short or long records for a reconstruction considering a constant sample size (i.e. a fixed total number of years N S ), and realistic stochastic perturbations of the proxy signal (Fig. 3 as would be expected for coral-based records, and the total number of years N S = N • N Y recorded by the sample was held constant at 200 years.No instrumental record was long enough for this experiment so a 1000 year long pre-industrial OA-GCM simulation of the Niño1+2 SSTs was used as a monthly target time series.Although this time series is not realistic, it is useful for this sensitivity test.

Experiment 3: influence of month-to-month noise
Experiment 3 explored further the effect of the individual data point quality degraded by the monthly noise, characterized by σ m , which includes the analytical error, weather scale noise, and skeletal carbonate heterogeneity (Sect.3.3.2).
Here, all perturbations were turned off and all parameters were fixed except for σ m .In MoCo, this noise is added to the SST time series as a random monthly series characterized Standard Error Systematic Error

Ideal proxy
With stochastic noises (σs, σm, gap) by a standard deviation of α • σ m .In this experiment, α • σ m increased gradually from 0 to σ , the standard deviation of the instrumental SST time series.The error values obtained from MoCo were plotted against the noise/signal ratio R = α • σ m /σ .Simulations were performed with samples of 20 one-year long shells, and two different target time series (Fig. 4).

Experiment 4: influence of a yearly growth break
Experiment 4 was designed to test the effect on reconstruction errors of yearly growth breaks always occurring at the same period (such as spawning growth breaks) (Fig. 5).All other perturbations were turned off and all parameters except gbi were fixed.We only considered the case of a single one-month growth break per year, and compared the effect of varying its month of occurrence.Simulations were performed with samples of 20 one-year long shells, and with two different target time series.

Experiment 5: influence of temperature tolerance range
In experiment 5, we explored the effect of skeletal growth temperature threshold on the errors of reconstruction of two target time series.In this experiment, an upper (lower) temperature threshold was imposed so that temperatures beyond that threshold were not recorded.The upper (lower) threshold value decreased from the maximum (minimum) temperature T max (T min ) of the target time series to T max − 10 σ a m 0/0.14 0.17 0 to 0.5 0 0 0.17 (T min + 10 • C) by 0.1 • C increments.The standard and systematic error values calculated by MoCo under these conditions were plotted against the threshold values (Fig. 6).All other perturbations were turned off and all parameters except T S andT I were fixed.Simulations were performed with samples of 20 one-year long shells.

Experiment 6: influence of the time series variability
Experiment 6 was designed to explore the effect on the standard errors of changes in the time series climate variability.We used MoCo with the Puerto Chicama time series and 11 additional time series with decreasing variability.These additional time series were produced by removing El Niño and La Niña years from the Puerto Chicama time series.Since most of the interannual variability in this area is produced by ENSO, the values of V T and V decreased when more El Niño or La Niña years were removed from the original time series, while the other general characteristics of the climate are conserved.The experiment was performed using 10, 20, and 30 shells of 1 year (Fig. 7).Realistic stochastic perturbations were included but no biological temperature threshold (Table 3).
Fig. 5. Results of experiment 4. Standard and systematic errors for proxy reconstruction of T m , V T , , and V of Niño1+2 SSTs (blue) and Puerto Chicama SSTs (red), versus the month of systematic growth hiatuses.

Potential systematic error
The MoCo program calculates the potential systematic error bar (at 95 % confidence level) due to uncertainties in the linear proxy model calibration.This error is not estimated from surrogate proxy records but is simply calculated from the equations of error propagation presented in Sect.2.1.This error bar not only depends on the proxy model calibration dataset but also on the target climate.When calculating the mean temperature, for instance, the error bar increases with the difference between the reconstructed conditions and the calibration dataset mean value.The mean value T 0 of Grossmann and Ku's (1986) dataset was 10.

Random sampling (Exp. 1 and 2)
In experiment 1, the effect of the sample size is quantitatively estimated and compared to the effect of stochastic noises σ m , σ S and gap (Fig. 2).The effect of sampling only is represented in Fig. 2 by the "ideal proxy" curves obtained with all the perturbations turned off.It appears that random sampling is one of the main sources of the standard error.This standard error decreases rapidly with the sample size and becomes relatively insignificant for T m and when N reaches 20.On the other hand, the standard error for V T and V due to sampling remains relatively significant up to N = 30.The strong effect of the sample size on the standard errors is also evidenced by the results of experiment 6 (Fig. 7).
In experiment 2, we test whether reconstructions from long proxy records are more reliable than reconstructions obtained from short proxy records considering a fixed total number of years recorded.N S is kept constant and equal to 200 in order to test only the influence of the record length in the sampling and not of the total sample size.For mean temperature (T m ) reconstructions, a large sample set of short records provides a lower standard error than a few long records because local spatial temperature heterogeneity is better averaged out with numerous specimens.In a similar way, the effect of spatial variability (σ S ) tends to overwhelm the climatic variability V T when the number of records is small, unless it is calculated from a single record (Fig. 3, N = 1, N Y = 200).Short or long records do not yield significantly different results for systematic errors except for V T .
Fig. 7. Results of experiment 6. Left panels: Linear relationships between T m and V T standard errors obtained from MoCo versus V T in 12 time series modified from the Puerto Chicama time series.Right panels: Linear relationships between and V standard errors obtained from MoCo versus V .For every variable, a least-square linear correlation was obtained for reconstructions using 10 shells (squares), 20 shells (diamonds), and 30 shells (triangles).
V T is overestimated when using short records because the spatial variability adds up to the climatic variability, but this effect decreases when using fewer longer records.Finally it appears that intermediate values of N and N Y (here 20 10years old shells) would yield the best compromise for accuracy and precision in the reconstruction of T m and V T .The reconstruction skills for , and V are not significantly affected by the record length in our experiment.

Effects of stochastic noise (Exp. 1, 2, and 3)
Three kinds of stochastic perturbations (see Sect. 4) are applied to the climate signal in experiments 1, 2, and 3 in order to produce surrogate climate proxy records.In the first experiment, their influence is observed separately and compared to ideal proxy reconstruction errors.As expected, spatial variability greatly affects the standard error of the T m reconstruction (Fig. 2) and this effect decreases when N increases (Fig. 3).It also induces a systematic positive bias for V T reconstruction which also decreases when N increases (Fig. 3).
In our case study, the monthly stochastic perturbation characterized by σ m involves monthly water δ 18 O variability, carbonate δ 18 O analytic error, and shell carbonate heterogeneity.The monthly stochastic perturbation in experiment 1 does not significantly affect the T m and V T reconstructions but it produces an unexpected overestimation of the annual amplitude and of its variance V (Fig. 2).Its effect on and V reconstructions is further explored in experiment 3 (Fig. 4) using two different target SST time series.The maximum value of σ m corresponded to a noise/signal ratio R = 1, which represents an extremely noisy proxy record.The systematic positive bias on estimate is ∼0.1 • C when R = 0.1 and increases to 3 • C when R = 0.8.For other parameters, the response depends on the target time series.For R = 0.4, the systematic error for V is about 0.8 (or 100 %) for the Niño1+2 time series, while it is almost null for the Puerto Chicama time series.Standard errors for Niño1+2 SSTs are more sensitive to the monthly noise than those for Puerto Chicama SSTs.This can be explained by the higher interannual climate variability of Puerto Chicama (Table 2), which makes the noise-related variability relatively smaller.
Random growth breaks have no significant impact on the standard error for any of the variable (Fig. 2).They induce a slight positive bias for V T and a slight negative bias for .These biases are due to the time series properties since they are not observed when using the Puerto Chicama time series (not shown).

Effects of biological limitations (Exp. 4 and 5)
Growth hiatuses may occur every year at approximately the same date for breeding or other reasons (Sato et al., 1999).In experiment 4 (Fig. 5), we showed that the date of the growth break has little impact on standard errors but may produce systematic errors.As expected, the mean annual temperature would be underestimated (overestimated) if the growth breaks occur in the warmest (coldest) period.Here, the maximum systematic error for T m was −0.3 • C. The annual amplitude may be largely underestimated if the growth hiatuses occur during seasonal extrema.In our experiment the systematic bias for reached −0.4 • C, with a systematic growth break in March for the Niño1+2 SST time series.Proxy reconstructions of variances were affected by growth hiatuses in a much less predictable way.For Puerto Chicama SSTs, maximum systematic errors reached about 8 % for V T when growth breaks occurred in September.The largest underestimation (8 %) for V as obtained for growth breaks occurring in December, January, or February.This is because V is strongly related to El Niño events whose maximum development occurs during these months.Again, the error was strongly dependent on the target time series.
Temperature tolerance for skeletal growth is an especially important biological limitation that may induce significant systematic biases in paleoclimate reconstructions.These biases were explored with two different time series in experiment 5 (Fig. 6).While it is obvious that upper (lower) temperature limits cause underestimation (overestimation) of the mean temperature T m , and that is in both cases underestimated, our experiment permits calculation of a quantitative estimate of this systematic biases.In experiment 5, the systematic error responses to lower and upper limits were not symmetrical.For Niño1+2, the effect of the lower temperature limit began to be significant for T m and when it reached ∼1.5 • C above the time series minimum temperature, whereas the effect of the upper limit began to be significant when it reached ∼3 • C under the time series maximum temperature.The biases produced for V T and V reconstructions remained relatively low for Niño1+2 while, for Puerto Chicama, they rapidly became significant and had unpredictable profiles, switching from positive to negative values (Fig. 6), especially for the lower temperature limit.This is due to the particular variability of coastal SSTs in Peru characterized by a weak annual cycle of ∼4 • C periodically perturbed by very strong El Niño anomalies that can reach up to 9 • C in austral summer.Therefore, when the lower temperature threshold reaches 4 ˚C above the minimum SST, most of the annual cycle is not recorded, so that the variability is restricted to the highly irregular El Niño-related summer anomalies.On the other side, the upper threshold cuts the largest El Niño anomalies and systematically induces an underestimation of the internannual variability V T or V .

Influence of the target time series (Exp. 3, 4, 5, and 6)
The results of experiments 3, 4, and 5 showed that the behavior and values of standard and systematic errors strongly depend on the climate time series used as a target in the experiment because they depend on the probability density function of the temperature.The Puerto Chicama time series has a much wider distribution than the Niño1+2 time series (Table 2) so that standard error due to random sampling is much larger for Puerto Chicama (Fig. 4, σ m = 0; Fig. 5).In the experiments 3, 4, and 5, the error responses to perturbations were strongly modulated by the characteristics of the target time series.The same proxy perturbation may induce strong errors with one time series and be insignificant with another.For instance, the upper growth temperature limit had no significant effect on the reconstruction of T m and until it reached ∼6 • C under the maximum value of the Puerto Chicama SST (Fig. 6) because of the highly asymmetric distribution of temperature due to El Niño phenomenon at this location.These results show that the choice of the target time series when using MoCo to estimate paleoclimate reconstruction errors is a critical step.The standard and systematic error estimated with the IPSL CM4v2 GCM Niño1+2 time series are therefore not reliable because the seasonal cycle and the ENSO variability are not correctly represented (Table 2).
The experiment 6 showed that the standard error of all the proxy climate statistics increases almost linearly with the interannual variability in Puerto Chicama (R 2 values ranges from 0.81 to 0.98).The dependence, indicated by the slope of the linear regressions, is about 10 times lower for the standard error of the reconstructed T m and , than for the reconstructed variance V T and V .The relationships are also strongly affected by the sample size N.This result means that the reconstruction of past ENSO activity indicated by the variance has an error bar that varies with the ENSO activity itself.

Implications for paleoclimate error representation
Three types of errors have been distinguished in this study that should all be treated distinctively.Systematic errors are meant to be corrected for when detected and quantified, and thus are not supposed to be represented.Large modern datasets are required to detect and calculate a robust estimate of a bias in a proxy climate reconstruction.The number of modern coral records in a location is always small because of limited specimen availability and/or analytical costs.MoCo provides an alternative way to detect and quantify biases and thus improve the accuracy of coral and mollusk reconstructions provided the parameterization and the target time series are realistic in the specific context of the study.
The standard error is due to sampling, stochasticity at different time and space scale, and the biological limitations that have been explored in our experiments.This error bar exists even if the proxy model is ideal.The standard errors estimated for different shell or coral samples are independent.
The errors related to the proxy model calculated in different samples of the same type of archive are not independent since they originate from the same proxy model.For this reason, this type of error is fundamentally different from the standard error.
Depending on the context, these two errors may be combined or represented separately.Generally, if estimating the error bar for a single period or event, both errors may be added in quadrature.Nurhati et al. (2009Nurhati et al. ( , 2011) ) combined the analytical standard error with the uncertainty related to the proxy calibration to estimate the error bar of a reconstructed temperature trend through the 20th century.However, if several successive samples are analyzed to study long-term climate variability, the standard error and the potential systematic error may be represented separately in paleoclimate results as in Carré et al. (2012) for a more complete representation of the reconstruction uncertainty.

Understanding the proxy
A sensor-specific precision of reconstruction is often attributed to the paleoclimate proxy estimates (for example Beck et al., 1992;Anand et al., 2003).Our numerical experiments simulating the process of paleoclimate reconstruction from coral or mollusk shell geochemistry show that the error cannot be considered a constant value characteristic of the proxy.Stochastic uncertainties and biological limitations significantly affect the resulting climate reconstruction, in different manners depending on the studied statistics (T m , V T , , V ) and on the studied location.General qualitative relationships between the perturbations, the target time series characteristics and the reconstruction skills are described in the result section.However, standard and systematic errors are so sensitive to the noise parameterization and the target time series that we recommend reproducing the sensitivity of experiments described here with a parameterization specific to the study in order to get a quantitative understanding of the proxy records.
The results of the experiments yield an illustrative example of the range of variations that climate reconstruction errors may undergo, and bring to light their complexity.Although classic calibration-validation techniques are useful for a first approximation test of corals and mollusks reliability as archives, they are not well-suited for identifying the causes of reconstruction errors, estimating their relative contribution, or understanding how errors accumulate from a multitude of sources.
While the influences of several sources of error were qualitatively predictable, some perturbations produced significant unpredictable systematic bias .Beyond the findings of error sources for coral and mollusk-based reconstructions, our study demonstrates that numerical simulations based on Monte Carlo analyses are a simple and powerful approach to improve the proxy calibration process.A thorough understanding of the proxy record errors is essential for the interpretation of paleoclimate records from proxies derived from accretionary skeleton geochemistry.

Quantifying errors: contribution and limitations
Quantifying errors in paleoclimate reconstructions is essential for accurate and meaningful proxy-proxy and proxymodel comparisons.In the case of mollusks and corals, modern datasets are necessarily limited by costs and material availability, which limits statistical techniques to accurately estimate reconstruction uncertainties.The MoCo algorithm was designed to provide quantitative estimates of the three kinds of errors identified in Sect.2.1 for coral and mollusk based climate reconstructions.It implies that the linear proxy model has been previously validated.The accuracy of the systematic and standard error quantification using MoCo may be affected by several limitations, as we now describe.
The model for the paleoclimate reconstruction process implies simplifications including: (1) proxy records in a sample are considered of equal duration and of constant monthly resolution, and (2) stochastic noise was represented by stationary normal distributions.Furthermore, the parameterization of MoCo by the user requires field measurements and knowledge of the organism ecology and growth, characterized by fixed values that have their own uncertainty and variability.Uncertainties and non-stationarity in σ S , σ m , T I or T S estimates may affect the quantification of errors.
The standard and systematic errors calculated by MoCo are not true errors but only estimates from simulations based on the assumption that the temperature distribution of the paleoclimate is similar to the distribution of the time series used in the simulation.The main limitation is thus related to the non-stationarity of climate, although this issue is not peculiar to our method but is general in paleoclimatology since uncertainties are necessarily assessed in modern conditions.By definition, the true target climate is unknown, and necessarily different from the target time series used in the MoCo simulation.Therefore, the estimation of the standard and systematic errors may be biased by the difference of variability between the true climate and the target time series.In the selection process for the simulation target time series, it is recommended to seek the most realistic time series.
However, this bias can be minimized for the standard errors.Experiment 6 showed that in a defined location, the standard error is linearly correlated to the interannual variability through a relationship that is strongly modulated by the conditions of the reconstructions and especially the sample size.Therefore, MoCo should be used repeatedly as in experiment 6 to estimate the linear relationships specific to the paleoclimate study.The standard error of a proxy reconstruction can then be calculated using this relationship and the reconstructed values of V T and V .This method is valid if we consider that in a defined location, the interannual or decadal variability may change significantly through time but the larger climatic features responsible for these modulations remain similar, at least on multi-millennial timescales.For example, in the tropical Pacific, the characteristics of the climate variability can be considered conservative in a defined location and its amplitude is modulated by the activity of ENSO and the Pacific Decadal Oscillation (PDO).Therefore, based on experiment 6, if V = 3 • C 2 is reconstructed in the Peruvian coast from 20 1-year long mollusk shells, the linear relationship estimated in experiment 6 (Fig. 7) yields a value of standard error of 0.4 • C 2 for and 1.3 • C 2 for V .This method uses the information of the reconstruction itself to narrow the potential biases due to climate non-stationarity but is not perfect because the reconstruction is necessarily not exact and because it assumes the stationarity of the variability vs. standard error relationship.Very few coral and mollusk paleoclimate studies include a proper error calculation.One of the most complete attempt was made by Abram et al. (2009) who estimated the standard error of mean temperatures estimated from Sr/Ca coral records combining the variability between several corals from the same area (which corresponds to the parameter σ S in our study) and a measured standard error related to analytical precision and variability within corals (which is similar to our parameter σ m ).Although a large part of the standard error is due to these parameters, MoCo represents a significant improvement since it involves additional error sources for a more realistic estimate.It also offers the possibility to quantitatively estimate the influence of individual error sources.The improvements of this method are: (1) the identification of 3 independent categories of errors, (2) the quantification of the standard error due to uncertainty sources that were so far not considered, and (3) the identification and estimation of systematic biases that would not otherwise be detected because of the limited modern dataset.It thus offers the possibility of correcting the proxy-based climate from these biases for a more accurate reconstruction.

Extending applications
Sensitivity experiments were based on an illustrative case study of SST reconstruction from mollusk or coral δ 18 O in an environment where water δ 18 O is reasonably constrained.These experiment could be similarly performed to improve the understanding of other proxies including temperature reconstructions from Sr/Ca ratios in corals (Beck et al., 1992;De Villiers et al., 1994;Marshall and McCulloch, 2002) and sclerosponges (Swart et al., 2002;Rosenheim et al., 2004), Mg/Ca ratios in coralline algae (Kamenos et al., 2008), and pH from coral δ 11 B (Rollion- Bard et al., 2011).
In many environments, calcium carbonate δ 18 O variations yield a mixed signal between water temperature and water δ 18 O variations related to freshwater input.Changes in the climate variability are thus estimated by the δ 18 O variability in the fossil coral compared to modern corals (Tudhope et al., 2001).Under such conditions, the uncertainties of the proxy reconstructions should be evaluated in the space of the proxy variable (δ 18 O carb ), which would require very little modifications of the MoCo since the climate and the proxy space are linearly linked by the proxy regression model (Sect.3.1).A target time series of δ 18 O carb would be required, that could be either provided by a very long modern coral of the same area or calculated from temperature and water δ 18 O time series.This approach, referred to as forward modelling, has been applied to corals by Thompson et al. (2011) and a variety of proxies, such as planktonic foraminifera δ 18 O (Schmidt, 1999), or stalagmite δ 18 O (Baker and Bradley, 2010).The strength of forward modelling that incorporates Monte Carlo analyses for paleoclimate proxy calibration was showed by Evans (2007) through a case study with wood cellulose δ 18 O.To our knowledge it has never been applied to corals or mollusks.MoCo-type algorithms would also be useful for exploring the uncertainties of salinity or water δ 18 O reconstructions based on the combination of coral δ 18 O and Sr/Ca ratios, since the perturbations of both proxies add with different types of perturbations and propagate in an unpredictable way.
This study does not apply to time series reconstructions used to assess temporal changes in variability.Further work in a similar methodological framework is needed to study the effect of age uncertainty, stochasticity and proxy biological limitations on the correlation between the reconstructed time series and target time series.

Conclusions
Reconstructions of paleoclimate statistics (annual mean, mean seasonal amplitude and interannual variability) from coral and mollusk geochemistry were so far performed either without quantified error bars or with error estimates that do not involve all the sources of uncertainty.We defined here the error bar due to the proxy calibration as a potential systematic error, and defined two additional and independent categories of errors: the standard error and the systematic error that build from multiple sources and are potentially larger than those estimated from empirical calibration scatter.We provided an open access program, MoCo, available for R and Matlab ® (Supplement and also at http: //www.isem.univ-montp2.fr/carrematthieu), that calculates the error related to the proxy linear model, and estimate with a Monte Carlo simulation the systematic biases, and the standard errors for the reconstruction of climate statistics from mollusks and corals.MoCo involves realistic surrogate proxy-derived climate records, and was designed to improve the accuracy and the quantification of coral and molluskbased SST reconstructions.We showed in an illustrative case study that surrogate proxy techniques associated with Monte Carlo analyses are powerful tools to improve the understanding and calibration of mollusk and coral proxy records.Sensitivity experiments showed the significant and often unpredictable influence of random sampling, stochastic variability in the proxy formation, archive biological limitations, and climate characteristics.These numerical experiments are a fast and efficient technique for a quantitative assessment of the relative contribution of these influences in the systematic and standard errors of the paleoclimate reconstruction.
The accuracy of error quantification by MoCo is mainly limited by: (1) the simplification of the simulated climate reconstruction process, (2) the variability of the parameters used in MoCo, (3) the potential non-stationarity of the proxy model, and (4) the non-stationarity of climate between the modern reference used in the simulation and the studied period.We propose a technique to constrain this latter source of bias using the value of the reconstructed variability and the relationship between the standard error and the variability that can be estimated with MoCo.While the errors estimated by MoCo are only the product of simulations and may be imperfect, this method is still a significant improvement for quantitative reconstructions of climate statistics from mollusks and corals.Its conceptual simplicity should allow it to be adapted for a wide range of applications in paleoclimate research involving, among other archives, corals, mollusks, sclerosponges and coralline algae.

Fig. 1 .
Fig. 1.Conceptual representation of the calculation of reconstruction errors in the MoCo algorithm.

Table 2 .
Characteristics of monthly SST time series used as target in sensitivity experiments.Only the first 100 years of the GCM Niño1+2 SST time series are shown.The mean and variance of the annual SST (T m and V T ), and the mean and variance of the seasonal amplitude of SST ( and V ) were calculated for the complete series as described in Sect. 2. Monthly SST target time series T m V T V Table 2. Characteristics of monthly SST time series used as target in sensitivity experiments.Only the first 100 years of the GCM Niño1+2 SST time series are shown.The mean and variance of the annual SST ( Only the first 100 years of the GCM Niño1+2 SST time series are shown.The mean and variance of the annual SST ( Only the first 100 years of the GCM Niño1+2 SST time series are shown.The mean and variance of the annual SST (

Fig. 2 .
Fig. 2.Results of experiment 1 with the 1950-2009 Niño1+2 SST time series.Mean values of T m , V T , , and V (black bold lines) calculated from 5000 iterations of surrogate proxy simulations (MoCo algorithm), and compared to the true values (green) of the target time series, versus the sample size (e.g.number of shells).Systematic errors are indicated by the difference between the mean calculated value and the true value.Dotted lines show the standard error interval (±1 σ ) for an ideal proxy (no signal perturbation) versus the sample size.Thin black lines show the standard error interval (±1 σ ) for surrogate proxies with stochastic noise.In the first three columns, the effects of spatial variability (σ S ), monthly variability (σ m ), and the occurrence of random growth breaks (blue: 1 per year, black: 2 per year) are investigated separately.Their effects are combined in the fourth column.

Fig. 3 .
Fig. 3. Results of experiment 2 with the 1950-2009 Niño1+2 SST time series.Standard error (left panel) and systematic error (right panel) obtained for the reconstruction of T m , V T , , and V using the MoCo algorithm, versus the length (number of years) of the proxy record, considering a constant number of 200 recorded years (from 200 one-year old shells to one 200-years old shell).Results using an ideal proxy (dotted line) were compared to results involving stochastic noise (black line).
N • N Y = 200; c T max and T min are the maximum and minimum temperature values of the monthly SST time series used as a "target".

Fig. 4 .
Fig. 4. Results of experiment 3. Standard error and systematic error for T and var( T ) versus the standard deviation of the monthly stochastic perturbation.The effect was investigated using the 1950-2009 Niño1+2 (thin line) and the 1925-2002 Puerto Chicama (bold line) time series.

Fig. 6 .
Fig.6.Results of experiment 5. Systematic errors for T m , V T , , and V due to growth temperature limits.The effects of inferior (superior) temperature limits are shown on the left (right).In the upper panels are indicated the temperature ranges of Niño1+2 SSTs (blue) and Puerto Chicama SSTs (red).The darker intervals show the range of temperature recorded by the archive limited by T I (left panels) and T S (right panel).

Table 1 .
MoCo input parameters and output variables.

Table 2 .
Characteristics of monthly SST time series used as target in sensitivity experiments.

Table 3 .
Parameter setting in sensitivity experiments 1 to 5. Gray cells indicate varying parameters.