Novel approach for ice core based temperature reconstructions - a synthetic data study for Holocene δ15N data

We present a novel approach for reconstructing Holocene surface temperature histories by forcing the output of a firn densification and heat diffusion model to fit nitrogen isotope data (δ15N) extracted from ancient air in Greenland ice cores. The performance of this approach is demonstrated using synthetic data. The presented approach runs fully automated and leads to a match of the δ15N target data at the low permeg level, with related temperature deviations of a few tenths of a kelvin for different data scenarios, showing the robustness of the reconstruction method. The obtained final mismatches follow a symmetric, normal-like distribution. 95 % of the mismatches relative to the synthetic target data lie within an envelope of 3.0-6.3 permeg for δ15N and 0.23-0.51 K for temperature (2σ, respectively). We solve the inverse problem T(δ15N) using a combination of Monte Carlo sampling and quantitative data analysis, based on cubic spline filtering of random numbers and the measured temperature sensitivity of nitrogen isotopes. Additionally, the presented reconstruction approach was tested by fitting δ40Ar and δ15Nexcess data (Döring et al., in prep.), which leads to similarly robust agreement between modelled and measured data. In addition to Holocene temperature reconstructions, the fitting approach can also be used for glacial temperature reconstructions (Döring et al., in prep.), and it appears reasonable to adapt the approach for model inversions of other non-linear physical processes.


Introduction
Holocene climate variability is of key interest to our society, since it represents a rather stable time window with moderate natural variations, which serves as a baseline for today's increasing anthropogenic greenhouse effect. Yet, high resolution studies are still very sparse, which limits the investigation of decadal and partly even centennial climate variations over the course of the Holocene. One of the first studies on changes in the Holocene climate was conducted in the early 1970s by Denton and Karlén (1973). The authors investigated rapid changes in glacier extents around the globe, potentially resulting from variations of Holocene climatic conditions. Mayewski et al. (2004) used these data as the basis of a multiproxy study identifying rapid climate changes (so-called RCCs) globally distributed over the whole Holocene time period. Although not all proxy data show equal behaviour in timing and extent during the quasi-periodic RCC patterns, the authors found evidence for a highly variable Holocene climate controlled by multiple mechanisms, which significantly affects ecosystems (Beaulieu et al., 2017; Crausbay et al., 2017; Pál et al., 2016) and human societies (Holmgren et al., 2016; Lespez et al., 2016). Precise high resolution temperature estimates can contribute significantly to the understanding of these mechanisms. Ice core proxy data offer multiple paths for reconstructing past climate and temperature variability. The study of Dahl-Jensen et al. 
(1998) exemplarily demonstrates the usefulness of inverting the measured borehole temperature profile to surface temperature estimates for the investigated drilling site using a coupled heat- and ice-flow model. Because of smoothing effects due to the nature of heat diffusion within an ice sheet, this method is unable to resolve fast temperature oscillations and suffers from a rapid reduction of the time resolution towards the past. Another approach for reconstructing past temperature is based on the calibration of stable water isotopes of oxygen and hydrogen (δ18O_H2O, δD_H2O) from ice core water samples, assuming a constant (and mostly linear) relationship between temperature and isotope values, due to fractionation during evaporation and rainout processes, for a certain time period (Stuiver et al., 1995; Johnsen et al., 2001). This method provides a rather robust tool for reconstructing past temperature for times with large temperature excursions (Dansgaard-Oeschger events, glacial-interglacial transitions (Johnsen et al., 1992; Dansgaard et al., 1982)). However, in the Holocene, where Greenland temperature variations are comparatively small, seasonal changes of precipitation as well as of evaporation conditions at the source region possibly contribute more to the water isotope data variations (Kindler et al., 2014; Huber et al., 2006; Werner et al., 2001). A relatively new method for ice core based temperature reconstructions uses the thermal fractionation of stable isotopes of air components (nitrogen and argon) within the firn layer of an ice sheet (Severinghaus et al., 2001; Severinghaus et al., 1998; Kobashi et al., 2011; Kindler et al., 2014; Huber et al., 2006). The measured nitrogen and argon isotope records of air enclosed in bubbles in an ice core can be used as a paleothermometer due to (i) the stability of the isotopic compositions of nitrogen and argon in the atmosphere on orbital timescales and (ii) the fact that changes are only driven by firn processes (Severinghaus et 
al., 1998; Mariotti, 1983; Leuenberger et al., 1999). To robustly reconstruct the surface temperature for a given drilling site, the use of firn models describing gas and heat diffusion throughout the ice sheet is necessary to decompose the gravitational from the thermal diffusion influence on the isotope signals. This work addresses two issues relevant for nitrogen and argon isotope based temperature reconstructions. First, we introduce a novel, entirely automated approach for inverting nitrogen isotope data to surface temperature estimates by forcing the output of a firn densification and heat diffusion model to fit the nitrogen isotope data. Second, we investigate how accurately this can be done within the framework of the used model by examining the approach on different synthetic nitrogen isotope and temperature scenarios.

Firn densification and heat diffusion model
The surface temperature reconstruction relies on firn densification combined with gas and heat diffusion (Severinghaus et al., 1998). In this study, the firn densification and heat diffusion model (from now on referred to as the firn model) developed by Schwander et al. (1997) is used to reconstruct firn parameters for calculating synthetic δ15N values depending on the input time series. It is a semi-empirical model based on the work of Herron and Langway (1980) and Barnola et al. (1991), implemented using the Crank-Nicolson algorithm (Crank, 1975), and was also used for the temperature reconstructions by Huber et al. (2006) and Kindler et al. (2014). Besides a surface temperature time series, accurate accumulation rate data are needed to run the model (Fig. 01). The model then calculates the densification and heat diffusion history of the firn layer and provides parameters for calculating the fractionation of the nitrogen isotopes for each time step, according to the following equations:

δ15N(t) = δ15N_grav(t) + δ15N_therm(t)    (1)

δ15N_grav(t) is the component of the isotopic fractionation due to gravitational settling (Craig et al., 1988; Schwander, 1989),

δ15N_grav(t) = [exp(Δm·g·z_LID(t) / (R·T̄(t))) − 1]·1000 ‰,

and depends on the lock-in depth z_LID(t) (the depth where the density of the firn reaches a certain value, based on the empirical formula from Martinerie et al. (1994); this is the density at which the porosity of the firn column is low enough that gas diffusion becomes negligible) and the mean firn temperature T̄(t) (Leuenberger et al., 1999). g is the gravitational acceleration, Δm the molar mass difference between the heavy and light isotopes (one gram per mole for nitrogen) and R the ideal gas constant. The thermal fractionation component of the δ15N signal (Severinghaus et al., 1998) is calculated using Eq. 
(2), where T_surf(t) and T_bottom(t) stand for the temperatures at the top and the bottom of the diffusive firn layer:

δ15N_therm(t) = [(T_surf(t) / T_bottom(t))^α_T − 1]·1000 ‰.    (2)

In contrast to T_surf(t), which is an input parameter for the model, T_bottom(t) is calculated by the model for each time step. The thermal diffusion constant α_T was measured by Grachev and Severinghaus (2003) for nitrogen, expressed there as Ω, and closely matches the value used by Leuenberger et al. (1999) based on measurements of Boersma-Klein and De Vries (1966). The used firn model behaves as a pure forward model, which means that for given input time series the output parameters (here finally δ15N_mod(t)) can be calculated, but it is not easily possible to construct from measured isotope data the related surface temperature or accumulation rate histories (Fig. 01). The goal of the presented study is an automatization of this inverse modelling procedure for the reconstruction of the rather small Holocene temperature variations.
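The two fractionation components can be sketched numerically. The following minimal Python sketch assumes the standard barometric formulation for the gravitational part and the α_T parametrisation reported by Grachev and Severinghaus (2003); the function names and the example values for lock-in depth and temperature are illustrative, not taken from the firn model.

```python
import numpy as np

R = 8.314        # ideal gas constant [J mol^-1 K^-1]
G = 9.81         # gravitational acceleration [m s^-2]
DELTA_M = 1e-3   # molar mass difference 15N-14N [kg mol^-1]

def alpha_t(t_mean):
    """Thermal diffusion constant for 15N/14N (dimensionless);
    parametrisation of Grachev and Severinghaus (2003), t_mean in K."""
    return (8.656 - 1232.0 / t_mean) * 1e-3

def d15n_grav(z_lid, t_mean):
    """Gravitational component [permil] for lock-in depth z_lid [m]
    and mean firn temperature t_mean [K] (barometric formula)."""
    return (np.exp(DELTA_M * G * z_lid / (R * t_mean)) - 1.0) * 1000.0

def d15n_therm(t_surf, t_bottom):
    """Thermal diffusion component [permil], cf. Eq. (2);
    alpha_t is evaluated at the mean of the two temperatures."""
    a = alpha_t(0.5 * (t_surf + t_bottom))
    return ((t_surf / t_bottom) ** a - 1.0) * 1000.0

# total delta-15N, cf. Eq. (1): gravitational plus thermal part
total = d15n_grav(70.0, 242.0) + d15n_therm(243.0, 242.0)
```

For Greenland-like firn conditions (z_LID around 70 m, T̄ around −31 °C) the gravitational term is a few tenths of a permil, while a 1 K top-to-bottom gradient contributes on the order of 10 permeg.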

Measurement, input data and time scale
Accumulation rate data: Besides surface temperatures, accumulation rate data are needed to drive the firn model. In this study, accumulation rate data from Cuffey and Clow (1997), adapted to the GICC05 chronology (Rasmussen et al., 2008; Seierstad et al., 2014), are used. From the three accumulation rate scenarios, the intermediate one is chosen (red curves in Fig. S01).
Since the differences between the scenarios (Fig. S01) are not important for the evaluation of the reconstruction approach, they are not taken into account for this study.

Time scale:
For the entire study the GICC05 chronology is used (Rasmussen et al., 2014; Seierstad et al., 2014). During the whole reconstruction procedure the two input time series (surface temperature and accumulation rate) are split into two parts.
The first part ranges from 20 to 10520 yr b2k (called the "Holocene section") and the second one from 10520 to 35000 yr b2k (the "spin-up section"). The whole accumulation rate input (see Sect. 2.4.1) as well as the spin-up section of the surface temperature input remain unchanged during the reconstruction procedure.

Static case
To investigate the model behaviour in its static condition (constant temperature and accumulation rate), a set of 408 input scenarios was run through the firn model, calculating the δ15N output for each permutation. The surface temperature inputs were varied in the range of −60 °C to −10 °C with 1 K step width, and the accumulation rates were changed between 0.10 and 0.45 m/yr in 0.05 m/yr steps, leading to 51 times 8 scenarios (Fig. 02a, b). Each input scenario was calculated over a 30 kyr time window to make sure that no spin-up effects influence the output δ15N data. With the obtained temperature, accumulation and δ15N combinations, different polynomial surface fits were conducted in order to find a transfer function (Table S01 and Fig. S02) which approximates the static model behaviour as well as possible. Figure 02c shows the comparison between the modelled δ15N and the surface polynomial fit using degree 3 in temperature (T in kelvin) and degree 2 in accumulation rate (Acc), leading to a robust model approximation with a correlation coefficient r² = 0.9997 and a root mean square error (RMSE) of about 4.5 permeg over the investigated model space, and reveals a non-linear behaviour in δ15N(T) and δ15N(Acc).
δ15N(T, Acc) = p00 + p10·T + p01·Acc + p20·T² + p11·T·Acc + p02·Acc² + p30·T³ + p21·T²·Acc + p12·T·Acc²

The coefficients p_xy as well as the 95 % confidence intervals are listed in Table S01. Figure S02 shows the correlation coefficients r² and RMSE for all two-dimensional surface fits conducted during this analysis as a function of the polynomial degrees in temperature and accumulation rate (T, Acc). It is trivial that a higher polynomial degree leads to a better approximation. More interesting is the fact that an increase of the polynomial degree in temperature seems to be more important than in accumulation rate; e.g. the polynomial of degree 3 in temperature and degree 1 in accumulation rate leads to an about seven times smaller RMSE compared to a polynomial with reversed degrees.
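As an illustration of the surface-fit idea, the following Python sketch fits a bivariate polynomial with degree 3 in T and degree 2 in Acc (the nine terms of the equation above, total degree ≤ 3) over the 51 × 8 static grid by linear least squares. The "model output" here is a synthetic polynomial stand-in, not the firn model, so only the fitting mechanics are demonstrated.

```python
import numpy as np

# the 51 x 8 static grid from the text: -60..-10 degC (in K), 0.10..0.45 m/yr
T = np.arange(213.0, 264.0)            # 51 temperature values [K]
A = np.arange(0.10, 0.46, 0.05)        # 8 accumulation values [m/yr]
TT, AA = np.meshgrid(T, A, indexing="ij")

# placeholder "model output" lying exactly in the fitted family
d15n = 0.4 + 1e-3 * (TT - 240.0) - 0.3 * AA + 1e-5 * (TT - 240.0) ** 2

# design matrix with terms T^i * Acc^j for i <= 3, j <= 2, i + j <= 3
terms = [(i, j) for i in range(4) for j in range(3) if i + j <= 3]
X = np.column_stack([TT.ravel() ** i * AA.ravel() ** j for i, j in terms])
coef, *_ = np.linalg.lstsq(X, d15n.ravel(), rcond=None)
rmse = np.sqrt(np.mean((X @ coef - d15n.ravel()) ** 2))
```

Because the stand-in surface lies inside the polynomial family, the residual is at numerical precision; for real firn-model output the RMSE quantifies how well the static transfer function approximates the model.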

Dynamic case
A similar study was conducted to investigate the dynamic model behaviour. Here the accumulation rate data (see Sect. 2.2/2.4 and Fig. A01a) together with the reconstructed Holocene temperature for the GISP2 reconstruction (Döring et al., in prep.) were used as input. Figure 03a shows the two-dimensional dependence of δ15N(T, Acc) and (b) the comparison between modelled and polynomially fitted δ15N, for which the same surface fit procedure as for the static model behaviour was applied. It leads to a much poorer approximation, with a correlation coefficient (r²) of about 0.7 and an RMSE of about 34.2 permeg. The coefficients are again listed in Table S01. The visibly poorer correlation and larger non-linearity of δ15N(T, Acc) highlight the problem of system memory effects documented by the firn model output. This is expected, since the model never reaches an equilibrium state for fast temperature and accumulation rate excursions, which complicates the construction of a robust transfer function in the dynamic case. Because our main goal is to fit the δ15N data at the low permeg level, i.e. 0.003 permil to 0.005 permil (the level of the measurement uncertainty, Kobashi et al. (2008)), a polynomial transfer function is not suitable for the dynamic case due to its moderate capability to fit the modelled data.
Therefore, we use a combination of Monte Carlo sampling and quantitative data analysis for solving the inverse problem T(δ15N), as sketched in Fig. 01 and explained in detail in the following section.

Reconstruction approach
The Holocene temperature reconstruction is implemented in the following four steps: (i) A prior temperature input (first guess) is constructed, which serves as the starting point for the optimization.
(ii) A smooth solution passing through the δ15N data (here synthetic target data) is generated following a Monte Carlo approach. It is assumed that the smooth solution contains all long-term temperature trends (centennial to millennial) as well as the firn column height changes (temperature and accumulation rate dependent) that drive the gravitational background signal in δ15N.
(iii) The smooth temperature solution is complemented by high frequency information directly extracted from the δ15N data (here synthetic target data). This step adds short-term temperature changes (decadal) in the same time resolution as the data. (iv) The gained temperature solution is corrected using information extracted from the mismatch between the synthetic target and modelled δ15N time series.

Accumulation rate input:
The raw accumulation rate data for the main part of the spin-up section (12000-35000 yr b2k) is linearly interpolated to a 20 yr grid and low pass filtered with a 200 yr cut-off period (cop) using cubic spline filtering (Enting, 1987). For the Holocene section (20-10520 yr b2k) and the transition part between the Holocene and the spin-up section (10520-12000 yr b2k), the raw accumulation rate data is linearly interpolated to a 1 yr grid to obtain the equidistant integer point-to-point distances necessary for the reconstruction, and to preserve as much information as possible for this time period (Fig. A01a). Except for these technical adjustments, the accumulation rate input data remains unmodified, assuming high reliability of these data during the Holocene. This is due to the fact that the data was obtained by annual layer counting combined with a thinning model, which should be rather robust for the first 1500 m of the 3000 m ice core (Cuffey and Clow, 1997).
In order to investigate the influence of smoothing of the accumulation rate data on the model outputs, the high resolution accumulation rate dataset in the time window of 20-12000 yr b2k (Fig. A01a) was low pass filtered with cops between 20 yr and 500 yr and used to drive the firn model. The surface temperature input was set constant at −31 °C for this time window. Then, the deviations of the filtered from the unfiltered accumulation rates and model outputs were calculated.
Figure A02 shows the absolute (I) as well as the relative deviations (II) (relative to the unfiltered scenario) as a function of the cut-off period for the accumulation rate input data and the δ15N and LID model outputs. Taking the standard deviation (1σ) of the relative errors as a measure for the mean deviation of the filtered minus the unfiltered values shows that the filtering leads to a mean deviation of about 20 % between the filtered and unfiltered accumulation rate data, depending on the used cop value (Fig. A02IIa). Figure A02IIb shows the mean 99 % quantile for the same analysis, which can serve as a measure for the maximum deviation between the filtered and unfiltered values. It is clearly visible that the filtering leads to a maximum accumulation rate deviation of about 50 %. The comparison with the related deviations in the δ15N and LID outputs reveals that the changes in the accumulation rates do not lead to changes of the same order in the model outputs. It can be concluded that the filtering of the accumulation rate data leads to deviations of less than 0.6 % and less than 1.5 % for the mean and the maximum deviations, respectively (Fig. A02IIc, d). Therefore, it can be argued that a low pass filtering of the accumulation rates for cops between 20-500 yr has only a small impact on the model outputs as long as the major trends are conserved, because the filtering does not modify the mean accumulation. This result was expected, since the LID and finally the δ15N changes result from the integration of the accumulation over the whole firn column; the integration time corresponds to the age of the ice at the lock-in depth.
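The deviation statistics of this filtering experiment can be mimicked with a short sketch. Here an FFT low-pass filter stands in for the cubic spline filter of Enting (1987), and the accumulation series is synthetic; the 1σ of the relative error and its 99 % quantile correspond to the "mean" and "maximum" deviation measures used above.

```python
import numpy as np

def lowpass(x, dt, cop):
    """Crude low-pass stand-in: zero all Fourier components with
    period shorter than the cut-off period cop (x on a dt-yr grid)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(x.size, d=dt)
    X[f > 1.0 / cop] = 0.0
    return np.fft.irfft(X, n=x.size)

rng = np.random.default_rng(0)
t = np.arange(0, 12000)                       # 1-yr grid, 20-12000 yr analogue
acc = (0.24 + 0.02 * np.sin(2 * np.pi * t / 3000)
       + 0.01 * rng.standard_normal(t.size))  # synthetic accumulation [m/yr]
acc_f = lowpass(acc, 1.0, 200.0)              # cop = 200 yr

rel = (acc_f - acc) / acc
mean_dev = np.std(rel)                        # "mean deviation" measure (1 sigma)
max_dev = np.quantile(np.abs(rel), 0.99)      # "maximum deviation" measure
```

Note that the filter leaves the zero-frequency component untouched, so the mean accumulation is conserved exactly, which is the property the argument above relies on.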
Figure A03 shows a sensitivity experiment where the temperature input was set to a constant value of −31 °C and used together with the high resolution accumulation rate data (Fig. A01a) to model the LID (Fig. A03a) and δ15N (Fig. A03b) values. Due to the absence of temperature changes, only the accumulation rate changes drive the time evolution of the modelled δ15N (std[δ15N_10kyr,mod − mean(δ15N_10kyr,mod)] ≈ 4.5 permeg). This analysis supports our assumption that the accumulation rate changes alone cannot fully explain the observed variability in δ15N during the Holocene, but it gives limits for the contribution of the accumulation rate to the δ15N signal. Therefore, the remaining part has to be related to changes in the surface temperature in order to complement the δ15N signal.

Surface temperature spin-up:
The surface temperature history of the spin-up section (Fig. A01b) is obtained by calibrating the filtered and interpolated δ18O_ice data (Eq. 6) using the values for the temperature sensitivity α_18O and the offset β found by Kindler et al. (2014) for the NGRIP ice core, assuming a linear relationship between δ18O_ice and temperature.
The values −35.2 ‰ and −31.4 °C are the modern-time parameters for the GISP2 site (Schwander et al., 1997; Grootes and Stuiver, 1997). The raw δ18O_ice data is filtered and interpolated in the same way as the accumulation rate data for the spin-up part.
The spin-up is needed to bring the firn model to a well-defined starting condition that takes possible memory effects (influence of earlier conditions) of firn states into account.

Generating synthetic target data:
In order to develop and evaluate the presented approach, eight temperature scenarios were constructed and used to model synthetic δ15N data, which later serve as the targets for the reconstruction approach. Of these eight synthetic surface temperature and related δ15N scenarios (S1-S5 and H1-H3), three data sets (the Holocene-like scenarios H1-H3) were constructed in such a way that the resulting δ15N time series are very close to the δ15N values measured by Kobashi et al. (2008) in terms of variability (amplitudes) and frequency (data resolution) of the GISP2 nitrogen isotope data (Fig. A04, Fig. A05).
The synthetic surface temperature scenarios S1-S5 are created by generating a smooth temperature time series (T_syn,smooth) analogous to the Monte Carlo part of the reconstruction procedure for only one iteration step (see Sect. 2.4.2). The values for the cut-off period (cop) used for the filtering of the random values, and the s values, for the first five scenarios can be found in Table S02. The smooth temperatures (Fig. A04I) are calculated on a 20 yr grid, which is nearly identical to the time resolution of the GISP2 δ15N measurement values of about 17 yr (Kobashi et al., 2008). For the Holocene-like scenarios, the smooth temperatures are constructed analogously. Following this, high frequency information is added to the smoothed temperatures. A set of normally distributed random numbers with zero mean and a standard deviation (1σ) of 1 K for the scenarios S1-S5 and of 0.3 K for the Holocene-like scenarios H1-H3 is generated on the same 20 yr grid and added to the smooth temperature time series. Finally, the resulting synthetic target temperature scenarios (Fig. A04II, Fig. A05I) are linearly interpolated to a 1 yr grid.
The synthetic temperatures are combined with the spin-up temperature and used together with the accumulation rate input to feed the firn model. From the model output, the synthetic δ15N targets are calculated according to Sect. 2.1. The firn model output provides ice age as well as gas age information. The final synthetic δ15N target time series (Fig. A04III, Fig. A05II) are intentionally set on the ice age scale to mirror measurement data, because no prior information is available on the gas age-ice age difference for ice core data.
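The construction of a synthetic target temperature can be sketched as follows: a smooth series from low-pass filtered random numbers on a 20 yr grid, plus normally distributed high frequency noise (0.3 K for the H scenarios), then linear interpolation to a 1 yr grid. The FFT filter below is a stand-in for the cubic spline filter used in the text, and all numerical choices except grid spacing and noise level are illustrative.

```python
import numpy as np

def lowpass(x, dt, cop):
    """FFT low-pass stand-in for the cubic spline filter (cop in yr)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(x.size, d=dt)
    X[f > 1.0 / cop] = 0.0
    return np.fft.irfft(X, n=x.size)

rng = np.random.default_rng(1)
t20 = np.arange(20, 10521, 20)                 # 20-yr grid, yr b2k

# smooth part: filtered uniform random numbers around the first-guess level
smooth = -29.6 + lowpass(rng.uniform(-3.0, 3.0, t20.size), 20.0, 1000.0)

# high frequency part: 0.3 K (1 sigma) noise, as for scenarios H1-H3
noisy = smooth + rng.normal(0.0, 0.3, t20.size)

# final target on a 1-yr grid
t1 = np.arange(20, 10501)
target = np.interp(t1, t20, noisy)
```

The resulting series mimics the two-scale structure of the synthetic scenarios: centennial-to-millennial trends from the filtered random numbers and decadal variability from the added noise.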

Prior input (step 1)
The starting point of the optimization procedure is the first guess. To construct the first-guess temperature input, a constant temperature of about −29.6 °C is used for the complete Holocene section, corresponding to the last value of the temperature spin-up (Fig. A01b). In the next step of the optimization, the prior temperature input is iteratively changed following a Monte Carlo approach.

Monte Carlo type input generator -Generating smooth solutions (step 2)
The basic idea of the Monte Carlo approach is to generate smooth temperature inputs by low pass filtering uniformly distributed random values and superimposing this signal on the prior input. The new input is then fed to the firn model, and the mismatch D between the modelled signal X_mod, calculated from the model output, and the synthetic target values X_syn is computed. (Note that X equals δ15N during the Monte Carlo part, whereas for the later analysis of the mismatches of δ15N or temperature, X equals δ15N or temperature, marked by an additional index on D.) D serves as the criterion which is minimised during the optimization. If the mismatch decreases compared to the prior input, the new input is saved and used as the new guess. This procedure is repeated until convergence is achieved.
Table 01 lists the number of improvements and iterations performed for the different synthetic datasets. The perturbation of the current guess T_p(t) is conducted in the following way: let T⃗_p be the vector containing the prior temperature input. A second vector r⃗ with the same number of elements n as T⃗_p is generated, containing n uniformly distributed random numbers within the limits of an also randomly (uniformly distributed) chosen amplitude s. s is chosen from the range 0.05-0.50 (Fig. A06II), which means that the maximum allowed perturbation of a single temperature value T(t_0) is in the range of ±5 % to ±50 %. To create the synthetic frequencies, r⃗ is low pass filtered using cubic spline filtering with a uniformly distributed random cut-off period (cop) (Fig. A06I) in the range of 500-2000 yr, generating the vector f⃗. The new surface temperature input T⃗_new is calculated from f⃗ as the elementwise product

T_new(t_i) = T_p(t_i)·(1 + f_i),  i.e.  T⃗_new = T⃗_p ∘ (1̄ + f⃗),

where 1̄ is the n by 1 vector of ones.
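One such perturbation step can be sketched in a few lines of Python. The spline filter is again replaced by an FFT low-pass stand-in, and the first-guess level of −29.6 °C is taken from the text; everything else (seed, grid length) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def lowpass(x, dt, cop):
    """FFT low-pass stand-in for the cubic spline filter (cop in yr)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(x.size, d=dt)
    X[f > 1.0 / cop] = 0.0
    return np.fft.irfft(X, n=x.size)

def perturb(t_prior, dt=1.0):
    """One Monte Carlo perturbation of the prior temperature input:
    uniform random numbers with random amplitude s (5%..50%), low pass
    filtered with a random cut-off period (500..2000 yr), applied
    multiplicatively to the prior."""
    s = rng.uniform(0.05, 0.50)            # maximum relative perturbation
    cop = rng.uniform(500.0, 2000.0)       # cut-off period [yr]
    r = rng.uniform(-s, s, t_prior.size)
    f = lowpass(r, dt, cop)
    return t_prior * (1.0 + f)             # elementwise, cf. T_new above

t_prior = np.full(10500, -29.6)            # first guess [degC], 1-yr grid
t_new = perturb(t_prior)
```

In the actual procedure, t_new would be fed to the firn model, the mismatch D recomputed, and t_new kept only if D decreased; here only the input-generation step is shown.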
This approach offers a high potential for parallel computing. In this study, an eight core computer was used, generating and running eight different inputs T⃗_new simultaneously, minimizing the time needed to find an improved solution. For example, during the 706 iterations for scenario S2, about 5600 different inputs were created and tried, leading to 351 improvements (Table 01). Since it is possible to find more than one improvement per iteration step due to the parallelization on eight CPUs, the solution giving the minimal misfit is chosen as the new first guess for the next iteration step. This reduces the number of improvements used for the optimization (e.g. for S2, 172 of the 351 improvements were used). Additionally, a first gas age scale is extracted from the model using the last improvement conditions, which is then used in the next step.

Adding high frequency information (step 3)
Finally, the missing temperature information providing a suitable fit to the synthetic δ15N target data is directly extracted from the pointwise mismatch D_smooth,i between the modelled δ15N signal of the smooth temperature solution given by the Monte Carlo approach and the synthetic target data. D_smooth,i can be interpreted to first order as the detrended signal of the synthetic δ15N values (Fig. 04c). This signal is transferred to the gas age scale provided by the firn model for the smooth temperature input to reach synchronicity between the temperature variations extracted from the mismatch of δ15N on the ice age scale and the smooth temperature solution. Additionally, the signal is shifted by about 10 yr towards modern values to account for the gas diffusion from the surface to the lock-in depth, which is not yet implemented in the firn model. This is necessary for adding the calculated temperature changes (ΔT) to the smooth signal. The ΔT values are calculated according to Eq. (9),

ΔT_i = D_smooth,i / Ω(T̄),    (9)

with the thermal diffusion sensitivity Ω(T̄) for the nitrogen isotope fractionation calculated from (Grachev and Severinghaus, 2003; Boersma-Klein and De Vries, 1966):

Ω(T̄) = α_T / T̄ · 1000 ‰/K.    (10)

For a further improvement of the remaining δ15N and resulting surface temperature misfits, it is important to find a correction method using information which is also available for measurement data. The benefit of the synthetic data study is that several later unknown quantities can be calculated and used for improving the reconstruction approach (see Sect. 
3 and 4). For instance, it is possible to split the synthetic δ15N data into its gravitational and thermal diffusion parts, or to use the temperature misfit, which is not known later on. The idea underlying the correction algorithm explained in the following is that the remaining misfits of δ15N and temperature are connected to the Monte Carlo and high frequency parts of the reconstruction algorithm. It is not possible to find a smooth solution which passes exactly through the middle of the variance of the δ15N target data in all parts of the time series. This leads to a slight over- or underestimation of the δ15N values and their corresponding temperatures. For example, a slightly too low (or too high) smooth temperature estimate leads to a small increase (or decrease) of the firn column height, which creates a wrong gravitational background signal in δ15N at a later point in time (because the firn column needs some time to react). An additional error in the thermal diffusion signal is created by the high frequency part of the reconstruction, because the high frequency information is directly extracted from the deviation of the (synthetic) δ15N target data from the modelled δ15N data of the smooth Monte Carlo solution. Therefore, this error is transferred into the next step of the reconstruction and partly creates the remaining deviations.
To investigate this problem, the deviations D_smooth,i of the synthetic δ15N target data from the smooth δ15N data of the Monte Carlo part are numerically integrated over a time window of 200 yr (see Sect. 4), and the window is then shifted from past to future in 1 yr steps, resulting in a time series called IF(t); IF(t) equals a 200 yr running mean of D_smooth,i. For t, the mid position of the window is used. The time evolution of IF is a measure for the deviation of the smooth solution in δ15N (or temperature) from the perfect middle passage through the target data, and thus for the slight over- and underestimation of the resulting temperature. Next, the sample cross correlation function (xcf) (Box et al., 1994) between IF(t) and the high frequency mismatches is calculated, providing the correction values Δδ15N_cv,i. From Eq. (9), with Δδ15N_cv,i instead of D_smooth,i, the corresponding temperature correction values are calculated and added to the high frequency temperature solution, giving the corrected temperature T_corr. Finally, T_corr is used to run the firn model to calculate the corrected δ15N time series (Fig. 06). The cause and effect relationship found in the cross correlations between IF(t) and D_δ15N,hf, and between IF(t) and D_T,hf, is exemplarily shown in Fig. 05 for scenario S1 and was found for all eight synthetic scenarios. The derived correction algorithm leads to a further reduction of the mismatches of about 40 % in δ15N and temperature (see Sect. 3.2).
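The two diagnostics used here, a 200 yr running mean IF(t) and the sample cross correlation between IF(t) and a mismatch series, can be sketched as follows. The mismatch data are synthetic stand-ins constructed with a known 50 yr lag, purely to illustrate how the lag of maximum correlation is located.

```python
import numpy as np

def running_mean(x, window):
    """Running mean over `window` samples (1-yr grid assumed), i.e. IF(t)."""
    k = np.ones(window) / window
    return np.convolve(x, k, mode="same")

def xcorr(a, b, max_lag):
    """Normalised sample cross correlation of a(t) with b(t+L), L = 0..max_lag."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    n = a.size
    return np.array([np.mean(a[:n - L] * b[L:]) for L in range(max_lag + 1)])

rng = np.random.default_rng(3)
d_smooth = rng.standard_normal(5000)          # stand-in pointwise mismatch
IF = running_mean(d_smooth, 200)              # 200-yr running mean

# stand-in high-frequency mismatch that lags IF by 50 yr (constructed)
d_hf = np.roll(IF, 50) + 0.05 * rng.standard_normal(5000)

lag = int(np.argmax(xcorr(IF, d_hf, 200)))    # lag of maximum correlation
```

In the actual correction step, the lag and amplitude information from such cross correlations is what connects IF(t) to the correction values Δδ15N_cv,i.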

Monte Carlo type input generator
Figure A07 shows the evolution of the mean misfit D_mean of δ15N between the synthetic target and the modelled data as a function of the applied iterations for all synthetic scenarios. For all scenarios, there is a steep decline of the mismatch during the first 50-200 iterations, followed by a rather moderate decrease, and finally a constant value.
During the Monte Carlo part, it was possible to reduce the misfit of δ15N compared to the first guess solution by about 15 % to 75 %, depending on the scenario and the mismatch of the first guess solution (Table 01). This leads to a reduction of the temperature mismatches compared to the first guess temperature mismatch of about 51 % to 87 %. Figure 04 contains the time series of the pointwise mismatches D_i for δ15N and temperature. The D_i(δ15N) data is used later on to calculate the high frequency signal that is superimposed on the smooth temperature solution according to Eq. (9) and Eq. (11) (see Sect. 2.4.3). From Fig. 04 it can be concluded that the Monte Carlo part of the reconstruction algorithm leads to two major improvements of the first guess solution. First, the Monte Carlo approach corrects the offsets of the first guess input, which shifts the midpoint of the distribution of D_i to zero (see Fig. 04b, e). Second, the distribution becomes more symmetric.

High frequency step and final correction
Figure 06 provides the comparison between the Monte Carlo, the high frequency and the correction parts of the reconstruction procedure for scenario S5. Additional data and corresponding plots for all other scenarios can be found in Table 02 and Figs. S10-S16. Compared to the Monte Carlo solution, the high frequency part leads to a large refinement of the reconstructions. For the mean δ15N misfits D, the improvement between the Monte Carlo and the high frequency parts is in the range of 64 % to 76 % (see Table 02). This leads to a reduction of the temperature mismatches of 43 % to 67 %. The standard deviations (1σ) of the pointwise mismatches (Fig. 06c, d, g, h) in δ15N and temperature after the high frequency part are in the range of about 2.7 permeg to 5.4 permeg for δ15N and 0.22 K to 0.40 K for the reconstructed temperatures, depending on the scenario, which is clearly visible in the decreasing width of the histograms (subplots (c) and (g) of Fig. 06, blue against grey).
The mismatches after the correction part of the reconstruction approach show a clear further decrease of the misfits. The width of the distributions of the pointwise mismatches of δ15N as well as of temperature is further reduced, and the distributions become more symmetric (long tails disappear, see histograms (c) and (g) of Fig. 06). The time series of the mismatches (subplots (d) and (h) of Fig. 06) clearly illustrate that the correction approach mainly tackles the extreme deviations (sharp reduction of the occurrence of extreme values in the red distribution compared to the blue distribution), leading to a further improvement of about 40 % in δ15N and temperature. Finally, the 95 % quantiles of the remaining pointwise mismatches of δ15N and temperature (D_i or ΔT_i) were calculated for the final solutions of all scenarios and are used as an estimate for the 2σ uncertainty of the reconstruction algorithm (see Fig. 06, Fig. S10-S16, and Table 2). The final uncertainties (2σ) are in the order of 3.0 permeg to 6.3 permeg for δ15N and 0.23 K to 0.51 K for the surface temperature misfits. It is noteworthy that the measurement uncertainties (per point) of state of the art δ15N measurements are in the order of 3-5 permeg (Kobashi et al., 2008), highlighting the effectiveness of the presented fitting approach. The distributions of the cop values used for the accepted improvements (Fig. A06) show that nearly the whole allowed frequency range (allowed cops were 500 yr to 2000 yr) was used during the iterations. In contrast, the distributions of the s values clearly show that mostly small s values created the improvements, which implies that during the iterations small perturbations are more likely to lead to an improvement than larger ones.
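The 2σ uncertainty estimate used above amounts to taking the 95 % quantile of the absolute pointwise mismatches; a minimal sketch with stand-in (normally distributed) mismatch values:

```python
import numpy as np

# stand-in pointwise mismatches [permeg]; sigma = 2 is arbitrary
rng = np.random.default_rng(4)
d15n_mismatch = rng.normal(0.0, 2.0, 10000)

# 2-sigma uncertainty estimate: 95% quantile of the absolute mismatches
two_sigma = np.quantile(np.abs(d15n_mismatch), 0.95)
```

For truly normal mismatches this quantile recovers approximately 1.96σ, which is why it is a sensible 2σ estimate once the distributions have become symmetric after the correction step.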
Yet, Fig. A07 reveals a weak spot of the Monte Carlo part, namely the absence of a suitable termination criterion for the optimization. Until now, the implementation is such that the maximum number of iterations is set by the user or the iterations are terminated after a certain time (e.g. 15 h). Figure A07 shows that for nearly all scenarios it would be possible to stop the optimization after about 400 iterations, due to the rather small additional improvements later on. This would decrease the time needed for the Monte Carlo part to about 10 h (a single iteration needs about 90 s). Since the goal of the Monte Carlo part is to find a temperature realisation that leads to an optimal middle passage through the δ15N target data, it would be possible to use the mean difference between the δ15N target and the spline-filtered δ15N data for a certain cut off period as a termination criterion. This issue is currently under investigation. Another possibility to decrease the time needed for the Monte Carlo part would be to increase the number of CPUs used for the parallelization of the model runs. For this study an eight-core parallelization was used; a further increase in the number of workers would improve the speed of the optimization.
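The proposed stopping rule could, for instance, be implemented as a stagnation test on the misfit history (an illustrative sketch only; `window` and `tol` are hypothetical tuning parameters, and the misfit decay below merely mimics the behaviour seen in Fig. A07):

```python
import numpy as np

def should_terminate(misfit_history, window=50, tol=0.05):
    """Proposed stopping rule for the Monte Carlo part: terminate once the
    mean misfit has improved by less than `tol` over the last `window`
    iterations, instead of a fixed iteration count or wall-clock limit."""
    if len(misfit_history) < 2 * window:
        return False
    recent = np.mean(misfit_history[-window:])
    earlier = np.mean(misfit_history[-2 * window:-window])
    return (earlier - recent) < tol

# example: misfit decays quickly, then plateaus (as in Fig. A07)
iters = np.arange(1000)
misfit = 50.0 * np.exp(-iters / 100.0) + 5.0
print(next(int(i) for i in iters if should_terminate(misfit[:i + 1])))
```

The threshold `tol` would be chosen relative to the target precision (a few permeg), so that iterations stop once further Monte Carlo sampling no longer buys measurable improvement.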

High frequency step and final correction
In order to investigate the remaining mismatches in δ15N and temperature after the high frequency and the correction parts of the reconstruction, several analyses were conducted. First, the total misfit of δ15N (D_δ15Ntot) was separated into two fractions: the gravitational (D_δ15Ngrav) and the thermal diffusion (D_δ15Ntherm) mismatches of δ15N (Fig. 07). Figure 07 indicates that the main fraction of the total mismatch of δ15N is due to the misfit of the thermal diffusion component of the δ15N signal, whereas the gravitational misfit of δ15N makes only a minor contribution. The ratio of the standard deviations σ(D_δ15Ntherm)/σ(D_δ15Ngrav) is about 2.4 for the high frequency solution and about 2.3 for the corrected signal, showing that the misfit in the thermal diffusion part is more than twice as high as in the gravitational component.
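The decomposition of the total misfit and the reported standard-deviation ratio can be illustrated with synthetic stand-in series (the amplitudes below are chosen only to reproduce the σ(D_δ15Ntherm)/σ(D_δ15Ngrav) ≈ 2.4 reported above; they are not actual model output):

```python
import numpy as np

# Illustrative decomposition of the total delta15N misfit into its
# gravitational and thermal-diffusion parts, D_tot = D_grav + D_therm
# (values in permeg; both series are synthetic stand-ins).
rng = np.random.default_rng(1)
d_grav = rng.normal(0.0, 1.0, 5000)   # gravitational misfit
d_therm = rng.normal(0.0, 2.4, 5000)  # thermal-diffusion misfit
d_tot = d_grav + d_therm              # total delta15N misfit

ratio = np.std(d_therm) / np.std(d_grav)
print(ratio)  # ~2.4, the ratio reported for the high frequency solution
```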
To investigate the timing and contributions of the mismatches in δ15N and temperature for scenario S1, different xcfs were calculated (Fig. A08a-d). The same analyses were conducted for all synthetic scenarios, leading to similar results. The direct correlation at lag l_1a in subplot (a) can be attributed mainly to the mismatch of the thermal diffusion component of δ15N, whereas the negative correlation at l_2a is due to the mismatch of the gravitational component of δ15N. Comparing the xcfs of (a)-(c) at a certain lag l, e.g. l = 0 yr, shows that here (and at most of the other lags) the correlations between D_δ15Ngrav,hf and D_T,hf and between D_δ15Ntherm,hf and D_T,hf work in opposite directions, which makes it difficult to find a way to correct the remaining temperature mismatch using only information from D_δ15Ntot,hf, as is the case for measurement data (where only D_δ15Ntot,hf is available). The correlation at l_1a in (a) is weakened, whereas the lag l_2a is shifted to higher values, because of the superposition of the gravitational and thermal diffusion mismatches. Figure A08d also shows that the gravitational and thermal diffusion mismatches of δ15N are not independent, but the correlations at the extrema are relatively weak (r_1d = 0.38, r_2d = -0.56). The negative correlation r_2d is a sign of the compensation effect between the gravitational and thermal diffusion signals in δ15N due to the high frequency part of the reconstruction, whereas no explanation could be found for the positive correlation r_1d. The symmetric behaviour of the lags for r_1d and r_2d (l_1d = -88 yr ≈ -l_2d = -93 yr) suggests that r_1d could be an artefact of a periodic behaviour of D_δ15Ngrav,hf and D_δ15Ntherm,hf. Figures A09a-d show the same analysis after the correction part of the reconstruction. It is evident that in all cases the extrema in the different xcfs break down due to the correction of the temperature signal, which is the consequence of the decreasing mismatches of temperature as well as of δ15N. The comparison of subplots (a), (b) and (c) also shows that the remaining temperature misfits after the correction are mainly driven by the mismatches of the thermal diffusion signal of δ15N, with a minor contribution of the gravitational misfit.
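A normalized cross correlation function of the kind analysed here can be sketched as follows (an illustrative implementation; the paper does not spell out its normalization and sign convention in this section, so the convention below, where a positive lag means the second series is delayed relative to the first, is an assumption):

```python
import numpy as np

def xcf(x, y, max_lag):
    """Normalized cross correlation of x and y for lags -max_lag..+max_lag.
    Convention (assumed): a positive lag means y lags behind x, so a copy of
    x delayed by d samples produces a maximum at lag +d."""
    x = (x - np.mean(x)) / np.std(x)
    y = (y - np.mean(y)) / np.std(y)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(lags.size)
    for k, lag in enumerate(lags):
        if lag >= 0:
            a, b = x[:x.size - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:y.size + lag]
        r[k] = np.mean(a * b)
    return lags, r

# example: y is x delayed by 30 samples -> the xcf peaks near lag +30
t = np.linspace(0.0, 100.0, 2000)
x = np.sin(2 * np.pi * t / 10.0)
y = np.roll(x, 30)
lags, r = xcf(x, y, 60)
print(lags[np.argmax(r)])
```

Extrema of such an xcf at non-zero lags, as in Fig. A08, indicate a systematic lead/lag relationship between the two misfit series.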
Figures A08e-h show the cross correlations between IF(t), used for the correction of the high frequency temperature solution, and the temperature misfit (e), the mismatch of total δ15N (f), and the mismatches of the gravitational (g) and thermal diffusion (h) components of the δ15N signal calculated from the high frequency temperature solution. For the correction, the cross correlations (e) and (f) were used (see Sect. 2.4.4 and Fig. 05). Since for measurement data neither information about the temperature mismatch (the true temperature is not known) nor about the mismatches of the components of δ15N (gravitational, thermal diffusion) is available, it is imperative that the symmetric behaviour between xcf(IF(t), D_T,hf(t)) and the inverted xcf(IF(t), D_δ15Ntot,hf) holds true. This criterion is fulfilled for all eight synthetic data scenarios and especially for H1-H3. The comparison of subplots (f), (g) and (h) of Fig. A08 shows the same findings as before, namely that the xcf of IF(t) versus D_δ15Ntot,hf is the combination of the xcfs of IF(t) versus D_δ15Ngrav,hf and IF(t) versus D_δ15Ntherm,hf, and that the major fraction of D_δ15Ntot,hf is contributed by D_δ15Ntherm,hf. The advantage of using IF(t) for the correction is the symmetry between the two cross correlations, which is created by two factors. The first is the allocation of the window mid position to the entries of IF, which leads to the symmetric behaviour of the gravitational and thermal diffusion misfits. The second is the shifting of the window in 1 yr steps to create IF(t) over the whole data set, which leads to averaged information but, even more importantly, to a constant dependence between the temperature and δ15N mismatches. This can be used later on to fit measurement data.
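The construction of IF(t), i.e. integration over a window shifted in 1 yr steps with the result allocated to the window mid position, can be sketched as follows (illustrative only; the study's exact integrand and its handling of the series endpoints may differ):

```python
import numpy as np

def integral_function(signal, window):
    """Sketch of the IF(t) construction: integrate (here: sum) `signal`
    over a sliding window moved in 1 yr steps, and allocate each value
    to the window mid position.  Points without a full window are NaN."""
    n = len(signal)
    half = window // 2
    out = np.full(n, np.nan)
    for mid in range(half, n - half):
        out[mid] = np.sum(signal[mid - half:mid + half + 1])
    return out

sig = np.ones(500)                        # 1 yr resolution stand-in series
IF = integral_function(sig, window=200)   # 200 yr window, shifted in 1 yr steps
print(IF[250], np.isnan(IF[0]))
```

The mid-position allocation is what produces the symmetric lag behaviour exploited by the correction, since the integrated information is centred rather than trailing.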
Additionally, the influence of the window length used for the construction of IF(t) on the correction was analysed. The construction was conducted for different window lengths ranging from 50 yr to 750 yr (Fig. A10). Also, the correction was calculated by using only xcf_max or only xcf_min of IF(t) versus D_δ15N,hf for correcting the temperature input. Figures A10a,b show the remaining mismatches of δ15N (D_δ15N,corr) (a) and temperature (D_T,corr) (b) after the correction as a function of the window length used for IF(t). The analysis shows that for all investigated window lengths the correction reduces the mismatches of δ15N and temperature to small deviations (low permeg level), whatever correction mode was used (calculated with xcf_max, xcf_min, or both quantities, see the comparison with the blue line in (a) and (b)). Furthermore, the correction works best for window lengths in the range of 100 yr to 300 yr, with an optimum at 200 yr for all cases. This indicates that the maximum mean duration over which a δ15N mismatch creates a temperature mismatch (and vice versa) is in the same range for the investigated scenarios. It is also visible that the correction using both extrema (xcf_max and xcf_min) leads to a better correction than the approach using only one quantity. This is somewhat surprising, because the two extrema are the result of the periodicity of IF(t), D_δ15N,hf and D_T,hf. One explanation for this result could be that a larger section of the temperature time series is corrected when both extrema are used, due to the shifting in both directions. The correction using only xcf_max leads to a better fit than the one with xcf_min, which can be attributed to the higher correlation between IF(t) and D_δ15N,hf. Figures A10e,f show the evolution of the lags corresponding to the two extrema of the cross correlations between IF(t) and the δ15N and temperature mismatches, respectively. The linear dependency between the lags and the window length (the lags are nearly half of the window length) is the result of the construction of IF(t), i.e. the averaging due to the integration over the window of the given length and the symmetric behaviour due to the allocation of the window mid position to the entries of IF(t).

Conclusion
A novel approach for inverting a firn densification and heat diffusion model to fit the small gas isotope variations observed throughout the Holocene is introduced and described. With this new fitting method, it is possible to extract the surface temperature history that drives the firn state, which in turn leads to the gas isotope time series. The approach is a combination of Monte Carlo sampling and quantitative data analysis. The procedure works fully automated and provides a high potential for parallel computing to optimize time consumption. Additional sensitivity experiments have shown that accumulation rate changes have only a minor influence on short term variations of δ15N, which are mainly driven by high frequency temperature variations. To evaluate the performance of the presented approach, eight different synthetic δ15N time series were created from eight known temperature histories. The fitting approach leads to an excellent agreement in timing and amplitude between the modelled and synthetic δ15N and temperature data, with mismatches in the low permeg level for δ15N and related temperature deviations of a few tenths of Kelvin (2σ, respectively). The obtained final mismatches follow a symmetric, standard distribution function. 95 % of the mismatches compared to the synthetic data lie in an envelope between 3.0-6.3 permeg for δ15N and 0.23-0.51 K for temperature. These values can therefore be used as a 2σ estimate for the reconstruction uncertainty. For δ15N, the obtained final uncertainties are in the same order of magnitude as state-of-the-art experimental measurement uncertainties. The presented reconstruction approach was also successfully applied to δ40Ar and δ15N_excess data (Döring et al., in prep.). After the successful tests of the presented method using synthetic data, it can now be applied to measurement data.

diffusive column height (LID), which modulates the δ15N values. Next, the modelled δ15N variations are compared to the δ15N measurement data (Fig. A05III) (Kobashi et al., 2008) to examine the influence of the accumulation rate changes on changes in δ15N for two cases. First, for the 8.2k event, which is clearly visible in the LID and the modelled δ15N as well as in the δ15N measurement data: the signal amplitude in δ15N is about three times higher for the measured data compared to the modelled ones (measurement data: Δδ15N_8.2k,meas ≈ 60 permeg, modelled data: Δδ15N_8.2k,mod ≈ 20 permeg). Second, the comparison of the standard deviations of the measurement data with the modelled δ15N data for the last 10 kyr (both quantities normalized with their respective means) shows an even higher deviation of the measured versus the modelled variability, by a factor of about eight (measurement data: std[δ15N_10kyr,meas − mean(δ15N_10kyr,meas)] ≈ 37 permeg).

The synthetic temperature time series were generated from the temperature reconstruction for the GISP2 δ15N data (Döring et al., in prep.). The final Holocene surface temperature from Döring et al. (in prep.) was filtered with a 100 yr cut off period to obtain the smooth temperature scenario.
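The variability comparison can be reproduced schematically with stand-in data (the modelled standard deviation below is merely chosen so that the ratio matches the reported factor of about eight; it is not the actual model value, which is not given in this section):

```python
import numpy as np

# Compare measured and modelled delta15N variability over the last 10 kyr:
# both series are reduced by their means before taking the standard
# deviation (synthetic stand-in data, values in permeg).
rng = np.random.default_rng(2)
d15n_meas = 300.0 + rng.normal(0.0, 37.0, 500)  # measured, std ~37 permeg
d15n_mod = 300.0 + rng.normal(0.0, 4.6, 500)    # modelled stand-in

std_meas = np.std(d15n_meas - np.mean(d15n_meas))
std_mod = np.std(d15n_mod - np.mean(d15n_mod))
print(std_meas / std_mod)  # factor of ~8 reported in the text
```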
T̄_i in Eq. (10) is the mean firn temperature in Kelvin, calculated by the firn model for each time point i. To reconstruct the final (high frequency) temperature input T_hf, the extracted short term temperature signal ΔT is simply added to the smooth temperature input T_sm:

T_hf,i = T_sm,i + ΔT_i    (11)

2.4.4 Final correction of the surface temperature solution (step 4)

A cross correlation function (xcf) is applied to IF(t) and the remaining misfits D_δ15N,hf of δ15N after the high frequency part. The xcf shows two extrema (Fig. 05a), a maximum (xcf_max) and a minimum (xcf_min), at two certain lags (lag_max,Dδ15N at xcf_max and lag_min,Dδ15N at xcf_min). The same analysis is then conducted for IF(t) versus the temperature mismatch D_T,hf (Fig. 05b), which shows an equal behaviour (two extrema, lag_max,T at xcf_max and lag_min,T at xcf_min). Comparing the two cross correlations shows that lag_max,Dδ15N = −lag_min,T and lag_min,Dδ15N = −lag_max,T (Fig. 05d,e). The idea of the correction is that the extremum in D_δ15N,hf at the positive lag (positive meaning that D_δ15N,hf has to be shifted towards past values relative to IF) creates the misfit of temperature D_T,hf at the negative lag (modern direction) and vice versa. So IF(t) yields information about the cause and gives a handle for correcting this effect between the remaining mismatches of δ15N and temperature over the whole time series. The lags are not sharp signals, which results from the fact that the cross correlations are conducted over the whole analysed record, leading to an averaging of this cause-and-effect relationship, and from IF(t) being a smoothed quantity itself. The correction of the reconstructed
temperature after the high frequency part is conducted in the following way: from the two linear relationships between IF(t) and D_δ15N,hf at the two lags (lag_max,Dδ15N at xcf_max, lag_min,Dδ15N at xcf_min), two sets of δ15N correction values (Δδ15N_max from xcf_max and Δδ15N_min from xcf_min) are calculated. Then the lags are inverted (Fig. 05c,e), shifting the two sets of δ15N correction values to the attributed lags of the cross correlation between IF(t) and D_T,hf (e.g. Δδ15N_min to the lag of xcf_max of the cross correlation between IF(t) and D_T,hf), changing the time assignments of Δδ15N_min(t) and Δδ15N_max(t) to Δδ15N_min(t+lag_max,T) and Δδ15N_max(t+lag_min,T). Finally, Δδ15N_max and Δδ15N_min are summed component-wise, leading to the time series Δδ15N_cv(t).
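The correction recipe above can be sketched in code (a hypothetical implementation: the use of a plain linear regression at the extrema lags, the shift handling via `np.roll`, and all parameter values are illustrative assumptions, not the authors' exact procedure):

```python
import numpy as np

def correction_values(IF, d_d15n, lag_max_d, lag_min_d, lag_max_T, lag_min_T):
    """Sketch of the construction of Delta-delta15N_cv(t): regress
    D_delta15N,hf on IF(t) at the two xcf extrema lags, shift the two
    resulting correction series to the inverted lags of xcf(IF, D_T,hf),
    and sum them component-wise."""
    def shifted_fit(lag_src, lag_dst):
        y = np.roll(d_d15n, -lag_src)        # D_delta15N,hf seen at lag_src
        slope, icept = np.polyfit(IF, y, 1)  # linear relationship with IF(t)
        dd = slope * IF + icept              # delta15N correction values
        return np.roll(dd, lag_dst)          # reassign to the inverted lag
    # Delta-delta15N_max is shifted by lag_min_T, Delta-delta15N_min by lag_max_T
    return shifted_fit(lag_max_d, lag_min_T) + shifted_fit(lag_min_d, lag_max_T)

# synthetic demonstration with an artificial IF(t) / misfit pair
rng = np.random.default_rng(3)
IF = rng.normal(size=1000)
d_d15n = 0.5 * IF + rng.normal(scale=0.1, size=1000)
d15n_cv = correction_values(IF, d_d15n, 20, -15, 15, -20)
print(d15n_cv.shape)
```

The resulting Δδ15N_cv(t) series would then be converted into a temperature correction via the δ15N temperature sensitivity, as described in Sect. 2.4.4.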

Figures
Figure 04 provides the comparison between the first guess and the Monte Carlo solution versus the synthetic target data for the modelled δ15N (a-c) and surface temperature values (d-f) for scenario S5. Subplots (a) and (d) show the time series of the synthetic target data (black dotted line), the first guess solution (blue line) and the Monte Carlo solution (red line) for δ15N and temperature. In subplots (b) and (e), the distributions of the pointwise mismatch D_i of the first guess (blue) and the Monte Carlo solution (red) versus the synthetic target data for δ15N and temperature can be found. Subplots (c) and (f)

After the Monte Carlo part, the mismatch distributions become more symmetric and the misfit is reduced (the distributions become narrower) compared to the first guess, due to the middle passage through the δ15N targets. These improvements can be observed for all eight synthetic scenarios, showing the robustness of the Monte Carlo part (Table 01, Fig. 04, and Fig. S03-S09).
Figure 06: The upper four plots (a-d) illustrate each reconstruction step and its effect on the modelled δ15N; the bottom four plots (e-h) show the corresponding results for the temperature. Plots (a) and (e) contain the time series of the synthetic δ15N or temperature target (black dotted line), the high frequency solution (blue line), and the final solution after the correction part (red line). For visibility reasons, subplots (b) and (f) display a zoom-in on a randomly chosen time window of about 500 yr for the same quantities, which shows the excellent agreement in timing and amplitude of the modelled δ15N and temperature compared to the synthetic target data. Histograms (c) and (g) and subplots (d) and (h) show the distributions and the time series of the pointwise mismatches D_i between the modelled and the synthetic target data in δ15N and temperature.

Figure A06 shows the distributions of the cut off periods (cop) (I) and of the s values (II) used to create the improvements (see Sect. 2.4.2) for all scenarios. The cop values are more or less evenly distributed.

Figure A08: In subplot (a) the xcf between the mismatch of total δ15N (D_δ15Ntot,hf) and the misfit of temperature (D_T,hf) is shown. The cross correlation leads to two extrema (r_1a = 0.70, r_2a = −0.55) at two certain lags (l_1a = −2 yr, l_2a = +126 yr). In subplots (b) and (c) the same analysis is conducted for the mismatch of the gravitational (D_δ15Ngrav,hf) component (b) and of the thermal diffusion (D_δ15Ntherm,hf) component (c) of δ15N.

Figure 02: Investigation of the static model behaviour. All surface temperature inputs were combined with all accumulation rates, leading to 408 input scenarios: (a) surface temperature input scenarios, -60 °C to -10 °C in 1 K steps; (b) accumulation rate input scenarios, 0.1 m/yr to 0.45 m/yr in 0.05 m/yr steps; (c) polynomial surface fit with a polynomial of degree 3 in temperature and degree 2 in accumulation rate, leading to a transfer function δ15N(T, Acc) with correlation coefficient r² = 0.9997 and root mean square error RMSE = 4.5 permeg.
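The polynomial surface fit behind the transfer function δ15N(T, Acc) can be sketched as an ordinary least-squares fit over all monomials T^i · Acc^j with i ≤ 3, j ≤ 2 (the δ15N values below are a synthetic polynomial stand-in, not firn model output, so the fit here is essentially exact rather than RMSE ≈ 4.5 permeg):

```python
import numpy as np

# 51 temperatures (-60..-10 degC in 1 K steps) x 8 accumulation rates
# (0.10..0.45 m/yr in 0.05 m/yr steps) = 408 steady-state input scenarios
T, A = np.meshgrid(np.arange(-60.0, -9.0, 1.0),
                   np.arange(0.10, 0.451, 0.05))
t, a = T.ravel(), A.ravel()

# synthetic "model output" standing in for the firn model's delta15N (permeg)
d15n = 300.0 - 2.0 * t + 0.01 * t**2 + 50.0 * a - 0.5 * t * a

# design matrix with all terms T^i * Acc^j, i <= 3, j <= 2 (12 coefficients)
G = np.column_stack([t**i * a**j for i in range(4) for j in range(3)])
coef, *_ = np.linalg.lstsq(G, d15n, rcond=None)
rmse = np.sqrt(np.mean((G @ coef - d15n) ** 2))
print(t.size, rmse)  # 408 scenarios; rmse ~0 for this polynomial stand-in
```

Such a fitted surface gives a cheap forward map δ15N(T, Acc) that can be evaluated without rerunning the firn model.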

Figure A01: (a) Accumulation rate input time series, divided into a Holocene and a spin-up section, with a time resolution of 1 yr in the Holocene section (20 yr to 10520 yr b2k). The time resolution for the transition between the Holocene and the spin-up section (10520 yr to 12000 yr b2k) is 1 yr as well, in contrast to the rest of the spin-up section, which has a time resolution of 20 yr. (b) Surface temperature spin-up calculated from the δ18O_ice calibration; the time resolution equals that of the accumulation rate spin-up section. The first guess surface temperature input is simply a constant value.

Figure A03: Lock-in depth (LID) and resulting δ15N time series for the last 10 kyr, using a constant surface temperature input of -31 °C and the high resolution accumulation rate input (Fig. S03a).

Table 01: Summary of the Monte Carlo approach. D_guess is the mismatch between the modelled δ15N (or temperature) values using the first guess input and the synthetic δ15N (or temperature) target for each scenario. D_mc is the mismatch between the modelled δ15N using the final Monte Carlo temperature solution and the synthetic δ15N (or temperature) target for each scenario. Three runs were conducted over a weekend, which leads to a higher number of iterations.
Clim. Past Discuss., https://doi.org/10.5194/cp-2017-92. Manuscript under review for journal Clim. Past. Discussion started: 17 July 2017. © Author(s) 2017. CC BY 4.0 License.