Comparison of Holocene temperature reconstructions based on GISP2 multiple-gas-isotope measurements

Nitrogen and argon stable isotope data obtained from ancient air in ice cores provide the opportunity to reconstruct past temperatures in Greenland. In this study, we use a recently developed ﬁ tting-algorithm based on a Monte Carlo inversion technique coupled with two ﬁ rn densi ﬁ cation and heat diffusion models to ﬁ t several Holocene gas-isotope data measured at the GISP2 ice core and infer temperature variations. We present for the ﬁ rst time the resulting temperature estimates when ﬁ tting d 15 N, d 40 Ar, and d 15 N excess as individual targets. While the comparison between the reconstructions using d 15 N and d 40 Ar shows high agreement, the use of d 15 N excess for temperature reconstruction is problematic because the statistical and systematic data uncertainty is higher and has a particular impact on multi-decadal to multi-centennial signals. Our analyses demonstrate that T( d 15 N) provides the most robust estimate. The T( d 15 N) estimate is in better agreement with Buizert et al. (2018) than with the temperature reconstruction of Kobashi et al. (2017). However, all three reconstruction strategies lead to different temperature realizations. © Authors. Published by Elsevier Ltd. This an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Temperature is a key parameter of the climate system and therefore important in the study of climate change. To put today's temperature variations into a broader context, temperature reconstructions from proxy archives are particularly valuable as they provide information about past climate changes. Here we report on a study that focuses on temperature reconstructions for the Holocene using stable isotope compositions of the ancient atmosphere trapped in ice cores. The use of nitrogen (d 15 N) and argon (d 40 Ar) stable-isotope variations in air extracted from ice cores is a wellestablished tool for reconstructing past temperature (e.g. Huber et al., 2006b;Kindler et al., 2014;Kobashi et al., 2011;Landais et al., 2006;Orsi et al., 2014;Severinghaus et al., 1998Severinghaus et al., , 2001. This method uses the stability of isotopic compositions of nitrogen and argon in the atmosphere at orbital timescales, as well as the fact that observed changes are only driven by processes in polar firn (Leuenberger et al., 1999;Mariotti, 1983;Severinghaus et al., 1998). It provides an alternative to the classical temperature reconstruction based on calibrated stable oxygen (d 18 O ice ) and hydrogen isotopes measurements on melted ice-core-water samples (Gierz et al., 2017;Johnsen et al., 2001;Steen-Larsen et al., 2011;Stuiver et al., 1995). The latter provides a rather robust proxy for reconstructing paleo-temperatures for times where large temperature variations occur (Gierz et al., 2017), e.g. Dansgaard-Oeschger events. However, during the Holocene, where temperature variations are comparatively small, changes in seasonal distribution of precipitation as well as of evaporation conditions at the source region may dominate water-isotope-data variations Kindler et al., 2014;Werner et al., 2001). Recent studies (Buizert et al., 2018;Kobashi et al., 2017) used the nitrogen and argon isotopes of the GISP2 ice core (Greenland Ice Sheet Project Two, Meese et al., 1994;Rasmussen et al., 2008;Seierstad et al., 2014) to reconstruct Holocene temperature variations for Greenland summit following different reconstruction strategies. Kobashi et al. (2017) use the second-order parameter d 15 N excess (≡ d 15 N -d 40 Ar/4) together with the firn-densification and heatdiffusion model from Goujon et al. (2003) to obtain a Holocene temperature estimate. Buizert et al. (2018) reconstructed summit temperatures by calibrating the GISP2 d 18 O ice data by forcing the temperature to reproduce the general trend in d 15 N using a dynamical firn-model. Both methods lead to different temperature estimates. In Buizert et al. (2018) an overall uncertainty of 1.5 K was stated for the reconstructed temperature, slightly higher than Kobashi et al. (2017) estimate of 1.21 K. Both approaches have in our view shortcomings: the d 15 N excess method loses important information about the lock-in-depth (LID) and d 18 O ice scaling method does not consider the influence of seasonal distribution of precipitation in particular for faster signals (multi-decadal to centennial). In D€ oring and Leuenberger (2018), we introduced a new automated method to reconstruct temperature variations using a fittingalgorithm of gas-isotope data based on a Monte Carlo inversion technique coupled to a firn-densification and heat-diffusion model. It was shown on synthetic data experiments that the remaining mismatches of gas-isotope data would lead to a temperature uncertainty (2s) of below 0.3 K for a single measurement under Holocene-like conditions and perfectly known accumulation rates and firn physics. This deviates from real conditions and therefore, this study focuses on the challenges of temperature reconstructions using the gas-isotope fitting-algorithm for real Holocene data. We will examine different aspects which are -in our view -integral for the evaluation of the correctness of the reconstructed temperature estimates. First, we will discuss the gas-isotope data in the context of their uncertainty and investigate the suitability of the different isotopic quantities for reconstructing robust temperature estimates. Next, we explore the reproducibility of our fitting approach and its contribution to the uncertainty budget as well as the influence of different accumulation-rate estimates on temperature reconstructions. For our reconstruction we use and compare results of two different firn-models, the models of Schwander et al. (1997) and of Goujon et al. (2003). Finally, we compare the temperature estimates obtained by fitting d 15 N, d 40 Ar and d 15 N excess including an overall uncertainty of our method using all available information to each other and place them in a context to the estimates of Kobashi et al. (2017) and Buizert et al. (2018). Finally, we will conclude which temperature reconstruction strategy is preferable for inert gas-isotope-based temperature reconstructions from ice cores data for Holocene conditions.

Firn-models and inversion algorithm
The observed gas-isotope data depend mainly on firn compaction processes in combination with gas and heat diffusion (Severinghaus et al., 1998). Therefore, the use of firn densification and heat diffusion models (hereafter referred to as firn-models), which describe the physics of compaction and heat and gas transport, is required to convert measured gas-isotope data into surface temperatures. In this study, we use two classic firn-models. The first model was developed by Schwander et al. (1997) and used for the temperature reconstructions by Huber et al. (2006b) and Kindler et al. (2014). The second one, the model from Goujon et al. (2003) (adapted to the GISP2 site for this study), was used in the studies by Guillevic et al. (2013), Kobashi et al. (2015), and Kobashi et al. (2017), among others. Since this study is the first to compare temperature reconstructions for Holocene conditions using both models, we will demonstrate the comparability of the solutions obtained by using both models.
Converting gas-isotope data to surface temperature estimates using a firn-model is an inverse problem. The firn-model acts as a nonlinear transfer function that combines temperatures and accumulation rates with the gas-isotope data. To solve this problem, we use the automatic fitting-algorithm developed by D€ oring and Leuenberger (2018). The algorithm allows fitting of the gas-isotope data with misfits in the low permeg level (permeg ¼ 10 À6 ), mainly below the analytical uncertainties. For modelling d 40 Ar and d 15 N excess data, we use the thermal diffusion constant a T,Ar and the thermal diffusion sensitivity U Ar empirically derived by Grachev and Severinghaus (2003) in addition to the details presented in D€ oring and Leuenberger (2018) TðtÞ is the mean firn temperature (Leuenberger et al., 1999).
2.2. Timescale, and necessary data

Timescale
The GICC05 chronology is used throughout the study Seierstad et al., 2014). Following D€ oring and Leuenberger (2018), the temperature model-input is temporally divided into two parts. The first part ranges from 10.5 kyr to 35 kyr b2k ("spin-up section"), while the second part ranges from 0.02 kyr to 11.5 kyr b2k ("reconstruction window"). In the latter, we allow the fitting-algorithm to change the temperature. Both the accumulation rate and the surface temperature of the spin-up section remain unchanged during the reconstruction procedure.

Accumulation-rate input data
In addition to temperature, accurate accumulation rate data are needed to drive the firn-models. As in D€ oring and Leuenberger (2018), we use as accumulation rate model-input the original accumulation rates (acc) for the GISP2 site as reconstructed in Cuffey and Clow (1997) but adapted to the GICC05 chronology. We use all three given accumulation rate datasets for the reconstruction to analyze the effects of differences between the three accumulation estimates on the reconstructed temperatures. To obtain the accumulation rate estimates, three different ice sheet margin retreat scenarios (50 km, 100 km or 200 km scenario) were used as boundary conditions for an ice flow model (see Cuffey and Clow (1997) for details; availability of the accumulation rates using the 100 km scenario on GICC05 timescale://doi.pangaea.de/10.1594/ PANGAEA.888997).

Target data
The GISP2 gas-isotope data measured by Kobashi et al. (2008b) are used to reconstruct Holocene temperature as targets for our fitting-algorithm (D€ oring and Leuenberger, 2018). Fig. 1 shows the different gas-isotope fitting-targets (d 15 N, d 40 Ar, d 15 N excess ) set on the GICC05 chronology. Table S1 lists the measurement uncertainties for the target data given in Kobashi et al. (2008b). The d 40 Ar date are corrected for gas-loss induced fraction (Kobashi et al., 2008a(Kobashi et al., , 2010(Kobashi et al., , 2017 suffer from kinetic fractionation due to gas-loss that occur either during bubble close-off or core retrieval and storage, which can lead to an enrichment of d 40 Ar and thus a smaller d 15 N excess Kobashi et al., 2011Kobashi et al., , 2010Kobashi et al., , 2008bSeveringhaus et al., 2003;Severinghaus and Battle, 2006). This necessitated a correction of the GISP2 d 40 Ar data based on firn modelling (Kobashi et al., 2008a) and dAr/N2 (a tracer for potential gas-loss; Kobashi et al., 2010Kobashi et al., , 2017, resulting in smaller d 40 Ar values compared to the uncorrected d 40 Ar values and thus a higher d 15 N excess . As we used the corrected d 40 Ar and d 15 N excess data (Kobashi et al., 2008a(Kobashi et al., , 2010(Kobashi et al., , 2017), we will show in sections 4.2, 4.5.1 and 4.5.2, the correction performed is not sufficient, especially for the late-and early-Holocene data, implying that the fractionation in d 40 Ar caused by gas-loss is not constant over the Holocene part of the GISP2 ice core. A better correction approach would be to use additional isotope data, like d 86 Kr, d 136 Xe, measured together with d 40 Ar on the same samples, as suggested by Baggenstos (2015). Furthermore, the elimination of the gravitational signal when using d 15 N excess as the only target leads to a loss of information (firn column height) and less stable temperature solutions with reduced reproducibility, which complicates the inversion of the firn-model (sect. 4.3).

Prior input and model spin-up
To avoid the influence of possible memory effects (influence of earlier firn states on later firn states), a temperature and accumulation rate spin-up is required to bring the firn-model to a welldefined initial state. The surface temperature spin-up was obtained by extending the temperature reconstruction for the GISP2 site of Buizert et al. (2014) (interval 10.05 kyre20 kyr b2k) to 35 kyr b2k by calibrating the GISP2 d 18 O ice data to temperature (Grootes et al., 1993;Grootes and Stuiver, 1997;Meese et al., 1994;Steig et al., 1994;Stuiver et al., 1995;data availability: Grootes and Stuiver, 1999), using the slopes and intercepts given in Cuffey and Clow (1997). As a prior input for the Holocene section, we simply start with a constant temperature using the last value of the spinup section. The previous temperature input and spin-up temperature have been slightly adjusted to account for the decreasing slope in the oldest part (9.5 kyre12.168 kyr b2k) of the gas-isotope data. The adjustment procedure is described in detail in supplement sect. S2.

Quality of gas-isotope fitting
3.1.1. Reproducibility and quality of the fitting procedure To analyze the reproducibility of the gas-isotope fitting-algorithm coupled to the firn-model of Schwander et al. (1997), ten fitting runs were performed for each gas-isotope species (d 15 N, d 40 Ar and d 15 N excess ). From the ten solutions for each target, a mean solution was calculated as the average of the ten temperature solutions. This mean solution was processed by the firn-model and resulted in the final modelled isotope solution. The obtained data were analyzed in terms of reproducibility between the ten runs (upper six plots in Fig. 2) to estimate the possible scatter of the absolute temperature and isotope solutions. In addition, we compare the final (mean) solution with the best-fitting solution (out of ten) in terms of remaining misfits (Fig. 3). Table S2 contains additional information on the reproducibility ("rep") as well as the quality of the fits ("fit"). The variability of isotope solutions between the ten runs is in the low permeg range for all targets, with mean variability over the entire reconstruction time window of 1.68 ± 1.14 permeg for d 15 N, 2.58 ± 1.77 permeg for d 40 Ar/4, and 1.28 ± 2.14 permeg for d 15 N excess . The small deviations between the modelled isotope solutions demonstrate the robust performance of our gas-isotope fitting-algorithm. Similarly, the variability between the obtained temperatures was analyzed (lower 6 plots in Fig. 2). Using d 15 N or d 40 Ar as single target leads to temperature solutions that vary in a narrow band of 0.17 ± 0.12 K for T(d 15 N) and 0.26 ± 0.18 K for T(d 40 Ar). In contrast to the excellent reproducibility of T(d 15 N) and T(d 40 Ar), the fitting of d 15 N excess produces a wide range of possible temperature realizations. While the variance between the isotopic solutions is comparable for all three species, the dispersion of T(d 15 N excess ) is about 10 times (2.04 ± 1.90 K) higher than for T(d 15 N) or T(d 40 Ar). Using the average temperatures, we obtain final mismatches (2s) of 3.65 permeg for d 15 N, 2.79 permeg for d 40 Ar/4 and 5.43 permeg for d 15 N excess (Fig. 3).

Consistency of single target fitting results
The histograms on the right side of Fig. 4 show the mean misfits for all gas isotopes when fitting a single isotope target and calculating from the respective temperature solution the misfits for the two other isotope quantities. For example, the accurate fit of d 15 N (upper plot, where 96% of the deviations (2s) are smaller than 3.7 permeg when using the firn-model of Schwander et al. (1997)) leads to insufficient fits for the two other gas-isotope targets with deviations (2s values) of 11.3 permeg for d 40 Ar/4 and 11.1 permeg for d 15 N excess , considering that the data uncertainty (1s) of the measured gas-isotope data is 3.0 permeg to 4.0 permeg for d 15 N, 4.0 permeg to 9.0 permeg for d 40 Ar/4 and 5.0 permeg to 9.8 permeg for d 15 N excess . In other words, a precise adjustment of d 15 N does not automatically lead to precise adjustments for d 40 Ar and d 15 N excess .
The same applies to all other individual fits and to the use of both firn-models. It is not possible to find a temperature estimate that leads to modelled isotope regimes that provide sufficient agreement for all isotope targets combined. This result suggests that the GISP2 isotope data suffer from fractionations that are not captured by the firn-models (e.g. gas-loss).

Reconstructed temperatures
In the next sections we show the temperature estimates resulting from the independent fitting of the different gas-isotope targets (d 15 N, d 40 Ar, d 15 N excess ) using the firn-models of Schwander et al. (1997) and Goujon et al. (2003). All temperature estimates in Fig. 4 are presented as anomalies (relative to 11.5 kyr b2k). The modelled firn parameters (Dage, LID) for each target are shown in Fig. S3. In addition, a so-called hybrid solution (Fig. 4c) is created using d 15 N together with d 15 N excess as follows: The hybrid solution is created from the mean temperature solution of the d 15 N fits, low-pass filtered with a cut-off period of 500 years, to obtain a long-term temperature trend. This long-term temperature trend is overlaid by adding high-frequency information calculated from d 15 N excess . The high-frequency temperatures are calculated by converting the deviations between the modelled -using the longterm temperature trend (from d 15 N) -and the measured d 15 N excess data (Dd 15 N excess,hf ) into temperature using the temperature sensitivities U N of d 15 N and U Ar of d 40 Ar as follows: The hybrid solution is used to mimic the temperature reconstruction of Kobashi et al. (2017), but with a different strategy. Kobashi et al. (2017) fitted the temperature gradient across the diffusive firn column DT firn (Eq. (4)) calculated from d 15 N excess (Kobashi et al., 2010(Kobashi et al., , 2011(Kobashi et al., , 2017. Due to the high relative uncertainty of d 15 N excess combined with the annual calculation of surface temperature from modelled bottom temperature values using the DT firn integration method (Kobashi et al., 2008a(Kobashi et al., , 2010, strong drifts in the reconstructed temperature can occur (sect. 4.5.2). To overcome this problem, In addition, the correction may alter trends at the millennial scale, as a change in mean DT firn in a 1500-year window directly alters the temperature trend in that section. In addition, sharp shifts between windows can produce short-term temperature signals (jumps lasting 50e200 years) that could be misinterpreted as true temperature changes. Since our method offers the possibility to know the long-term T(d 15 N) trend, it is interesting to compare the hybrid temperature solution with the solution of Kobashi et al. (2017) and to investigate the differences that should occur mainly due to the "window correction" method (sect. 4.6.2).

Comparison of model results
The comparison of the fitting results using the firn-model of Schwander et al. (1997) or Goujon et al. (2003) (Fig. 4aed) shows that the solutions obtained with the two models are very comparable with the exception of T(d 15 N excess ). To further compare the solutions obtained with both models, the temperature estimates were band-pass filtered (Enting, 1987) in three periodic-time bands (multi-millennial (band: 1000 years e 4000 years), multicentennial (band: 200 years e 1000 years), and multi-decadal (band: 50 years e 200 years)) and the Pearson's correlation coefficients between the filtered temperatures of the two models were calculated. The same analysis was conducted between all temperature estimates reconstructed in this study (supplement sect. S6, Tabs. S3eS6). The temperatures obtained by the two models show a high correlation (r > 0.9) in all considered periodictime bands.
The absolute temperatures show an offset of about 2 K between the two models. This deviation can be explained by the implemented convective zone in the model of Goujon et al. (2003). Since the convective zone reduces the height of the diffusive firn column, a colder temperature (compared to the model of Schwander et al. (1997)) is required, which slows down the densification and leads to the LID needed to fit the gravitational part of the gas-isotope  Table S2.
M. D€ oring and M.C. Leuenberger Quaternary Science Reviews 280 (2022) 107274 data. The temperature differences between the estimates with both models are not completely constant. Besides the offset, we find mean differences (2s) of 0.62 K for T(d 15 N) and 0.73 K for T(d 40 Ar), which are partly caused by residual inconsistencies in the isotopic fits. In the early-Holocene (9.5 kyre11.5 kyr), the T(d 15 N) and T(d 40 Ar) reconstructions using the firn-model of Goujon et al. (2003) show a faster warming, resulting in slightly higher temperature anomalies compared to the model estimates of the Schwander et al. (1997) model. The rest of the time series shows the opposite behavior. Here the model estimates of Schwander et al. (1997) model are slightly warmer than the model estimates of Goujon et al. (2003) model. The fact that the trend in the time series of the differences between the temperature solutions of Schwander et al. (1997) and Goujon et al. (2003) firn-models follows the trend of the reconstructed temperature anomalies suggests a temperature sensitive part of these differences. One explanation for this could be that the densification itself is slightly temperature dependent. The 2 K colder absolute temperature in the model inputs of Goujon et al. (2003) model leads to a different densification path than in the model runs of Schwander et al. (1997) model. Nevertheless, both models provide very comparable temperature estimates when temperature anomalies are considered. 3). Since we use a Monte Carlo method, each fitting run follows a unique path. As a result, the final temperature estimates differ slightly between different fitting runs for the same target. (ii) The quality of the fits (sects. 3.1.1 and 4.3). As discussed in D€ oring and Leuenberger (2018), it is not possible to fit the gas-isotope data with a total misfit of zero. The remaining final mismatches between modelled and measured data contribute to the uncertainty of the reconstruction. (iii) The uncertainty in the accumulation rate data was included in the temperature reconstruction (sect. 4.4), as we used all three different accumulation rate data sets for the GISP2 site provided by Cuffey and Clow (1997). (iv) Differences in the reconstructed temperatures resulting from the use of different firn-models (sect. 3.2.1). This is outlined by analyzing the differences between the reconstructed temperatures using two different firn-models. It would be beneficial to include more available models to better quantify the uncertainty due to differences between firn-model in later studies. (v) The fraction of the uncertainty in the reconstructed temperatures that is due to the analytical uncertainty of the isotopic data. (sects. 2.2, 4.1 and 4.2).

Uncertainty estimation
We can calculate an overall limit for the mean uncertainty of the reconstructed temperatures using the following equation: s T is the uncertainty of the reconstructed temperature, s miss is the remaining mean misfit after fitting the isotope values (sect. 3.1.1), s rep is the fraction of uncertainty due to the reproducibility of the fitting method (sect. 3.1.1), s meas is the analytical uncertainty of the measured data (Table S1) and s model is the standard deviation of the differences of the temperature anomalies between the used models (sect. 3.2.1). U x is the thermal diffusion sensitivity of the respective isotope species. For T(d 15 N), the calculated uncertainty is 2s T ¼ 0.80 … 0.88 K, the range is derived from the minimum and maximum analytical uncertainty of the measured data, respectively. The uncertainty of the T(d 40 Ar) reconstruction is 2s T ¼ 1.14 … 1.95 K. It should be emphasized that the final uncertainties correspond to a lower bound estimate, as they only reflect the quantifiable sources of uncertainty and not all sources of uncertainty (which may arise, for example, from changes in convective zones, ice density stratification, standardization problems in measurements, etc.). This final uncertainty is attributed to each individual temperature point in time (the mean data resolution was 17.8 years), so smoothing or calculation of a running mean reduces the uncertainty due to averaging over a given number of points. For example, smoothing with a cut-off of 100 years reduces uncertainty by a factor of 1/n 1/2 ¼ 1/(100 years/17.8 years) 1/2 ¼ 0.42, as this corresponds to averaging over 5.6 points. This is important to remember when discussing filtered versions of these reconstructed temperatures.

Signal-to-noise analysis of isotope data
Since different uncertainties were given in Kobashi et al. (2008b) for different measurement campaigns, we use the minimum and maximum uncertainties given in Kobashi et al. (2008b) for a signal-to-noise analysis of these data for two cases, signals with periodicities T < 500 years ("high frequency") and T > 500 years ("low frequency"), respectively. Here we focus mainly on the faster signal (T < 500 years) to analyze the suitability of the different isotopic targets (d 15 N, d 40 Ar, d 15 N excess ) for the reconstruction of multi-decadal to multi-centennial temperature variability in terms of analytical uncertainty.
We performed the analysis in the following way: First, we detrended the measured gas-isotope target-data by subtracting the respective low-pass filtered signals with a cut-off period of 500 years (Figs. S1aec). In a next step, we identified the local maxima and minima of the high-frequency isotope data (using the Matlab® "findpeaks" algorithm). We defined the high-frequency signals as the difference between successive local maxima and minima (Figs. S1def) and compared the high-frequency signals with the signal uncertainties calculated from the published measurement uncertainties (Kobashi et al., 2008b). Since a signal is defined by at least two points, the signal uncertainties are calculated using Gaussian error propagation. We use the minimum (red dotted line, Fig. S1) and maximum (blue dotted line, Fig. S1) uncertainties for our calculations (Table S1). The analysis of the high-frequency signals shows that for the minimum uncertainties, 78% of the d 15 N high-frequency signals have amplitudes that are above the uncertainty level (70% for the maximum uncertainty), 74% (or 36%) for d 40 Ar, and only 52% (or 17%) for d 15 N excess . Assuming that the "true" uncertainty lies between the stated maximum and minimum uncertainties, and considering that the listed uncertainties are 1suncertainties, we argue that only d 15 N is suitable as a robust fitting target in the high frequency case. The histograms in Fig. 1 show a detailed listing of the signal-to-noise ratios (SNRs) for all isotope species and for the minimum and maximum uncertainties, respectively. Here the signals are grouped between integer SNR values. For d 15 N most signals have a SNR between one and two or even higher values for the minimum and maximum uncertainties.
In contrast, for d 40 Ar and especially for d 15 N excess the majority of signals have SNR values lower than the uncertainty values (SNR <1), which makes it difficult to extract a robust temperature estimate for multi-decadal to multi-centennial signals from these targets. For the longer periodicities (T > 500 years), it is more difficult to obtain a comparable result (Figs. S1gei). Here we divided the analytical uncertainty by a factor of about 5.3 to account for smoothing (mean data resolution ¼ 17.8 years, (500 years/17.8 years) 1/2 ¼ 5.3). The comparison of the minimum and maximum measurement uncertainties (red and blue error bars, respectively) shows that all three gas-isotope quantities are suitable for reconstructing long-term temperature trends. This is especially true when only the measurement uncertainty is considered, as these uncertainties are in most cases smaller than the amplitudes of the investigated features. However, d 15 N is also the most suitable target for reconstructing long-term temperature trends due to its relatively low uncertainty compared to d 40 Ar and d 15 N excess .

Gravitational versus thermal diffusion signals
The linear dependence (slope) between d 15 N and d 40 Ar/4 (Table 1) is calculated using geometric mean regression (Leng et al., 2007) to investigate the contribution of different processes to the change in isotope data. Furthermore, the slope is calculated in three periodic-time bands (multi-decadal, multi-centennial and multimillennial, see sect. 3.2.1) and for the raw data. Since the variability of isotopic data in firn arises from gravitational settling (mass-dependent fractionation process (Schwander, 1989),) and thermal diffusion (dependent on firn temperature gradient (Severinghaus et al., 1998),), the slope between d 15 N(y) and d 40 Ar/ 4(x) should range from one (gravity settling only) to 1.46 (thermal diffusion only, U N /[U Ar/4 ]). Furthermore, both processes affecting d 15 N and d 40 Ar in the same "direction", so a generally high correlation is to be expected. Comparing the results for the different periodic-time bands, it becomes clear that the slope calculated for  the multi-decadal oscillations cannot be explained by gravitational settling and/or thermal diffusion. For these fast oscillations, the gravitational background is not expected to change significantly, and the slope should mainly correspond to the value of thermal diffusion of 1.46. In contrast, the calculated slope is 0.89 ± 0.05 and the correlation is relatively weak (r 2 ¼ 0.46). A slope of less than one indicates a process that further enriches d 40 Ar/4 compared to d 15 N, which is the case for a possible gas-loss contribution to d 40 Ar that seems to remain after correcting d 40 Ar for gas-loss. The weak correlation shows a decoupling between d 15 N and d 40 Ar for the multi-decadal variability, which is partly due to analytical uncertainties. For the longer periodicities, we find a higher correlation (r 2 > 0.8) and slopes in the range of expectations (Table 1). The decrease of the slope from the multi-centennial to multi-millennial variability shows that with increasing periodic-time the influence of gravitational fractionation due to changes in firn height becomes more important compared to the thermal diffusion signal. This is to be expected because (i) the firn column reacts slowly and (ii) the temperature gradient over the firn column disappears.

Reproducibility and quality of the fitting procedure
As shown in 3.1.1, the dispersion of T(d 15 N excess ) is about 10 times higher than for T(d 15 N) or T(d 40 Ar). This is due to the fact that when calculating d 15 N excess , all information about the gravitational component in the isotopic signals is removed and thus the information about the height of the firn column is lost, resulting in a considerably wider space of firn states or absolute temperatures on which the fitting-algorithm can operate to obtain very similar modelled isotopic solutions. In contrast, the relative temperature variations (deviation from trend) obtained by fitting d 15 N excess are very comparable between several fitting runs. However, the different densification backgrounds lead to differences in the temporal evolution of Dage of about 20 years to 30 years and thus to asynchronous temperature estimates (Fig. S3). For this reason, fitting d 15 N excess as a single target is challenging for determining the correct timing of temperature changes on a multi-decadal scale.
Interestingly, fitting d 15 N excess with the model of Goujon et al. (2003) results in a smaller spread of possible temperature estimates than with the model of Schwander et al. (1997). The reproducibility between 10 runs (Fig. S4) is 2.7 times better when using the Goujon model. The reason for this difference has not yet been found. The implementation of the geothermal heat flux in the Goujon model provides a negative constant fraction adding to the firn temperature gradient DT firn and could lead to this stabilization effect. Fig. S3 shows the modelled Dage and LID data for all fitting targets (d 15 N, d 40 Ar, d 15 N excess ) and all 10 runs. While the variance between the modelled Dage and LID is very small for d 15 N and d 40 Ar, the d 15 N excess fit produces a wide range of LID and Dage states.
When we run the gas-isotope fitting-algorithm several times on the same target, we notice an edge effect for the last 500 years to 1 kyr b2k to the present. Here, different temperature solutions occur, while the rest of the time series is very reproducible. To overcome this edge effect, it was necessary to stabilize the solutions for the last 500 years b2k to present, as described in supplementary sect. S4. To solve this problem, additional information is needed, which was added by using the measured borehole temperature profile for the GISP2 site as an additional boundary condition.
The evaluation of the deviations between the measured and modelled time series provides a constraint on the uncertainty budget of the final temperature. It is evident that for all targets the deviations between the measured and modelled isotope data are at least comparable or below the analytical uncertainty (which is 1s) of the measured data (Table S1). Interestingly, averaging the d 15 N and d 40 Ar temperature solutions leads to a further reduction of the misfits compared to the best-fit from the ten runs (4.5% for d 15 N, 18.9% for d 40 Ar). It appears that averaging the ten temperature solutions corrects some of the remaining (potentially randomly distributed) misfits. Obviously, a larger number of runs (>10) would slightly improve the average solution. Averaging the temperature solutions obtained from d 15 N excess fitting is problematic due to the larger scatter between the temperature solutions and the associated poorer Dage. As a result, averaging leads to more than a doubling (factor 2.55) of the mean misfit compared to the best-fit solution.

Influence of different accumulation rate estimates
To investigate the contribution of uncertainty in the accumulation rate data, we use all three available accumulation rate estimates for the GISP2 site (sect. 2.2) to reconstruct temperature based on the d 15 N fits. It can be seen that the deviation between the three different accumulation scenarios can be up to more than ±10% in the early-Holocene and decreases over time (Fig. 5b). The deviations between the three scenarios have a minor impact on the modelled Dage during the Holocene ( Fig. 5c and d). Starting from a maximum difference of about 30 years in the early-Holocene, the Dage difference decreases until 5 kyr b2k, corresponding to the decrease in the difference between the accumulation rate data.
From the mid-Holocene until today, the modelled Dage difference becomes smaller than 5 years. The same applies to the deviations in the modelled LID (Fig. 5e). Here, too, the effect of the prior adjustment is visible, forcing the same firn condition at the beginning of the reconstruction window for all accumulation rate scenarios (supplement sect. S2). While the reconstructed temperatures using the 50 km and 100 km accumulation rate scenarios lead to essentially the same temperature trend, the reconstruction using the 200 km scenario shows a slightly stronger cooling over the Holocene from about 7.5 kyr b2k (Fig. 5f). This is exactly the point at which the decline in accumulation rate begins to accelerate for the 200 km scenario compared to the averaged scenario (Fig. 5b). Since we do not find the same (but opposite) behavior for the 50 km scenario, we may have found a non-linear response between temperature and accumulation rate change. From about 6 kyr to the present, this faster cooling evens out, leading to a constant divergence of about 0.3 K between the 200 km solution and the other two, accounting for almost 15% of the total cooling trend of about 2 K, relative to the warmest part of the reconstructed temperatures at about 7.8 kyr b2k. Moreover, the shapes and amplitudes of the faster signals of the long-term portions (T > 500 years) are very comparable. The relatively large deviation of the Dage between the scenarios in the early-Holocene leads to a slightly asynchronous behavior of the short-term temperature fluctuations (T < 500 years, Fig. 5g and h). However, the shapes and amplitudes of the signals are independent of the deviations between the accumulation rate scenarios. The decreasing deviation between the accumulation rate scenarios and thus also between the modelled Dage over the Holocene leaves no differences between the short-term proportions of the reconstructed temperatures in the late-Holocene section (Fig. 5h). In summary, the divergence between the three different accumulation rate scenarios does not have a major impact on the reconstructed temperature anomalies. The differences between the accumulation rates lead to a slightly different modelled Dage in the early-Holocene and to a 0.3 K larger cooling for the scenario with the higher accumulation rate compared to the other two scenarios. The comparison between the d 15 N-and d 40 Ar-based temperatures (Fig. 4a,b and Fig. S6) shows that the general trends are very similar between the two reconstructions. The shapes of many shorter-term features also agree well, but T(d 40 Ar) indicates higher amplitudes of these anomalies. For the low-pass filtered data (cutoff: 50 years, r ¼ 0.96) and the multi-millennial band (r ¼ 0.94), we find high correlations (supplementary sect. S6, Tabs. S3 e S6), which was expected given the high agreement in long-term trends between the two temperatures. The multi-centennial band shows a lower but still high correlation (r ¼ 0.87). In the multi-decadal band, the correlation between T(d 15 N) and T(d 40 Ar) is weak (r ¼ 0.69, lag ¼ À4 years or r ¼ 0.67 for lag ¼ 0 years) and is in line with the correlation between the measured isotopes (d 15 N and d 40 Ar) in the same band with r ¼ 0.68. This result is not surprising because: (i) the high-frequency component of the reconstructed temperatures is directly calculated from the high-frequency component of the respective isotopic targets (D€ oring and Leuenberger, 2018) and (ii) the accumulation rate input has only a minor impact on the reconstructed temperatures in the multidecadal band. The mean offset between T(d 15 N) and T(d 40 Ar) over the whole time series is 0.28 K and the standard deviation (2s) of the differences is 1.00 K for the maximum resolution case (mean resolution of the isotope data is 17.8 years). In the early-to mid-(6.4 kyre11.5 kyr) and late-Holocene (0.07 kyre1.3 kyr b2k), the d 15 N reconstruction leads to a slightly higher absolute temperature compared to T(d 40 Ar) with mean differences of about 0.46 K for the early-to mid-Holocene and 0.61 K for the late-Holocene, while the rest of the temperature time series shows a similar trend (1.3 kyre6.4 kyr b2k, mean deviation: 0.06 K). The lower temperatures in T(d 40 Ar) in the early-to mid-and late-Holocene can be explained by an enrichment of d 40 Ar due to gas-loss that are still present after the corrections made. The LID estimates of both reconstructions are in good agreement (Fig. S6b). The differences between them vary in a narrow band from À2 m to 1 m, which is due to the temperature differences between the reconstructions. Comparison of the modelled firn temperature gradients DT firn (Fig. S6c) of the d 15 N and d 40 Ar fits shows high agreement in the general trends and in the shapes of the short-term features. The differences between them are less than ±1 K in most cases. In addition, DT firn,meas calculated directly from the measured isotope data (dotted line, meas) is compared to the modelled estimates together with its maximum (1s max ) and minimum (1s min ) uncertainty. DT firn,meas was recalculated from d 15 N excess in analogy to Kobashi et al. (2010Kobashi et al. ( , 2011Kobashi et al. ( , 2017 using Eq. (4).
The uncertainties (1s max , 1s min ) of DT firn,meas were calculated using Gaussian error propagation on Eq. (4) together with the uncertainties of d 15 N and d 40 Ar as in Table S1. Comparing the modelled   (Fig. 4d and Fig. S7) leads to a distinct temperature regime compared to T(d 15 N) and T(d 40 Ar). In the early-Holocene (9 kyre11.5 kyr), the d 15 N excess fit leads to a flat temperature with almost no trend. This is not only inconsistent with T(d 15 N) or T(d 40 Ar), but also with the reconstructions of Kobashi et al. (2017) and Buizert et al. (2018) for the GISP2 site (Fig. 6). implying that the quality of the d 15 N excess fit needs to be significantly reduced to obtain a potentially meaningful temperature estimate. Since we used the gas-loss corrected d 40 Ar to calculate d 15 N excess and DT firn , we have to argue that the currently available correction (Kobashi et al., 2017;Kobashi et al., 2015) is insufficient, especially for the late-and early-Holocene data. This result is somewhat surprising, as it was argued in Buizert et al. (2018) that the influence of possible gas-loss on d 40 Ar is strongest within the bubble-clathrate transition zone (about 800 me1500 m depth of the GISP2 core, corresponding to 3.8e9.3 kyr BP ice age, see Supplement p. X-2 in Buizert et al. (2018)). In the following, we compare two of the most important mid-Holocene cooling trends in T (d 15 N (Table S4) with r ¼ 0.61 for the best-fit T(d 15 N excess ) solution and r ¼ 0.68 for the averaged T(d 15 N excess ) solution. The correlation with T(d 40 Ar) is even weaker (r ¼ 0.48, best-fit; r ¼ 0.54, average solution). In the multi-decadal periodictime band (Table S6), we find a weak negative correlation between T(d 15 N excess ) and T(d 40 Ar) with r ¼ À0.41 and r ¼ À0.36 for the bestfit and the mean T(d 15 N excess ) solution, respectively. The result for the multi-decadal time band can be explained as the multi-decadal oscillations in T(d 15 N excess ) are mainly driven by d 40 Ar and the influence of d 15 N is smaller due to the higher variability of d 40 Ar/4 compared to d 15 N.

Comparison of uncertainties of T(d 15 N) and T(d 15 N excess )
We cannot state a final uncertainty for T(d 15 N excess ) because of the flat temperature in the early-and the strong drift in the late-Holocene. In addition, we do not understand the reason for the different behavior when fitting the data with the two models (stable Goujon et al. (2003) model vs. unstable Schwander et al. (1997 model solutions). However, since fitting d 15 N excess leads to very different temperature estimates compared to T(d 15 N) or T(d 40 Ar) and also compared to other reconstructions (comparing Fig. 4d with Fig. 6a), we do not yet recommend using T(d 15 N excess ) for a climatic interpretation. Since d 15 N is easier to measure due to the higher abundance of nitrogen in the air and less susceptible to gas-loss-induced fractionation , we argue that T(d 15 N) provides the most robust temperature estimate compared to T(d 40 Ar) and especially T(d 15 N excess ) for reconstructing Holocene temperature.  Fig. 6b) with the solution of Kobashi et al. (2017). As mentioned above, the temperature reconstructions of Kobashi et al. (2017) were performed using d 15 N excess to obtain the temperature, which we believe is problematic due to the high relative uncertainty of d 15 N excess and the systematic offsets to too high d 40 Ar. Buizert et al. (2018) use the stable water isotope d 18 O ice together with the long-term trend of d 15 N to derive temperature for the early-to mid-Holocene (up to 4 kyr b2k). From 4 kyr to present, they use the temperature from Kobashi et al. (2017) overlaid with a larger cooling trend (Fig. S8).

Comparison of data variance
First, we compare the variance of temperatures (Table 2) for two time-windows (0.5e4.0 kyr b2k and 4.0e11.2 kyr b2k) and two periodic-time bands (bands: 100 yearse500 years and 500 yearse4 kyr). Since the temporal resolution of the Buizert et al. (2018) estimate is 20 years, we resampled our data and the Kobashi et al. (2017) data to the same grid before band-pass filtering. We also exclude the 8.2k event as it dominates the variance of the temperature data. In the early-to mid-Holocene (4.0e11.2 kyr b2k, excluding the 8.2k event), the standard deviation (2s) of T(d 15 N) in the periodic-time band of 100e500 years is 0.47 K, which is almost the same as the reconstruction of Buizert et al. (2018) with a value of 0.49 K. The reconstruction of Kobashi et al. (2017) has more than twice the variance in this section with 2s ¼ 1.17 K. For the mid-to late-Holocene (0.5e4.0 kyr b2k), the standard deviation of T(d 15 N) is about 20% smaller (2s ¼ 0.37 K) than for the early-to mid-Holocene (4.0e11.2 kyr). The variance of the Kobashi et al. (2017) reconstruction is slightly smaller (17%, 2s ¼ 0.97 K) during this time. The reconstruction of Buizert et al. (2018) here almost matches the estimate of Kobashi et al. (2017), with 2s ¼ 1.00 K. This is not unexpected, as Buizert et al. (2018) use the data of Kobashi et al. (2017) with slight modifications. However, it is not entirely plausible that a doubling of the variance between the two parts of the Holocene is realistic.
For longer periodicities (band: 0.5 kyre4.0 kyr) we also see a reduction in variance for the late-Holocene compared to the early-Holocene in all three reconstructions. During the early-to mid-Holocene, the variance of T(d 15 N) and the reconstruction of Buizert et al. (2018) agree well (2s ¼ 0.58 K for T(d 15 N), 2s ¼ 0.65 K for Buizert et al. (2018)), while the estimate of Kobashi et al. (2017) suggests a higher variability (2s ¼ 1.26 K), a same behavior as found for the faster periodicities. For the mid-to late-Holocene, T(d 15 N) shows the lowest variability (2s ¼ 0.35 K) compared to the reconstructions of Buizert et al. (2018) (2s ¼ 0.58 K) and Kobashi et al. (2017) We performed the same analysis for the accumulation rate data and for the gas-isotope data (Table 3). In particular, for the periodictime band of 500 years to 4.0 kyr, we find a uniform decrease in data variance between the early-to mid-and mid-to late-Holocene.
For d 15 N, the standard deviation (2s) is about 52% higher in the early-to mid-Holocene than in the mid-to late-Holocene, 34% for d 40 Ar and 41% for d 15 N excess , respectively. The accumulation rate data show this divergence of variance between the two time periods in this periodic-time band with a reduction of 46%. For the faster periodicities (band-pass 100 yearse500 years), the gasisotope data also show this behavior, but with a smaller deviation between the two time periods. In contrast, the mid-to late-Holocene accumulation rate data show a slightly higher variance than in the early-to mid-Holocene. From these results, we conclude that the difference in variability in our temperature estimates between the early-to mid-and mid-to late-Holocene is a direct result of the behavior of the gas-isotope data and accumulation rates.
Interestingly, the difference between the Buizert et al. (2018) and Kobashi et al. (2017) estimates (Fig. S8) for the 0.5e4.0 kyr b2k time interval suggests that the Kobashi et al. (2017) estimate needs to be modified when used in the Buizert et al. (2018) study. One reason for this could be the use of a different firn-model or possible memory effects due to differences in temperature estimates between the Buizert et al. (2018) and Kobashi et al. (2017) estimates in the early-to mid-Holocene. Kobashi et al. (2017) temperature with hybrid temperature Fig. 6b shows the comparison between the estimate of Kobashi et al. (2017) and our hybrid temperature T hybrid . Both estimates agree well, especially for the faster features. We find larger discrepancies in four time-sections. These sections are: 9 kyre10.5 kyr, shortly after the 8.2k event (6.6 kyre8.1 kyr), 5.3 kyre6.1 kyr and 0.07 kyre1.8 kyr. All these sections in the Kobashi et al. (2017) estimate start with a rapid warming or cooling trend of about 100e200 years. It is very likely that these shifts are caused by the "window correction method" used by Kobashi et al. (2017). In our view, this correction method is very critical, as the choice of window length, window positions and offsets found are arbitrary, but on the other hand crucial for the reconstruction.

Correlation of band-pass filtered temperatures
In addition, we correlated the temperature estimates of Buizert et al. (2018) and Kobashi et al. (2017) with each other and with our data after low-pass filtering (cop ¼ 50 years) and in all three periodic-time bands investigated (multi-decadal, multi-centennial, multi-millennial, Table S7). For the reconstruction of Buizert et al. (2018), the correlations were calculated only for the early-to mid-Holocene values (4.0e11.5 kyr b2k), while the correlations between our estimates and those of Kobashi et al. (2017) were calculated for 0.5 kyre11.5 kyr b2k. In the low-pass filtered case (general trend), the reconstruction of Buizert et al. (2018) shows the highest correlations with T(d 15 N) and T(d 40 Ar) with r > 0.9 and a correlation of r ¼ 0.82 with the estimate of Kobashi et al. (2017), due to the high agreement in the general trend between the three studies. The comparison between the temperature of Kobashi et al. (2017) and our estimates has the highest correlation with the hybrid temperatures (r ¼ 0.87, r ¼ 0.83) and T(d 15 N) (r ¼ 0.81) in this case. Besides the general trend, the correlations decrease with faster oscillations. In all periodic-time bands, the reconstruction of Buizert et al. (2018) has the highest correlation with T(d 15 N) with r m ¼ 0.67, r c ¼ 0.61 and r d ¼ 0.30 for multi-millennial (r m ), multi-centennial (r c ) and multi-decadal (r d ) signals, respectively. The correlation coefficients between the reconstruction of Buizert et al. (2018) and Kobashi et al. (2017) Table 2 Standard deviations (2s) of temperature estimates of T(d 15 N) (this study), Buizert et al. (2018) and Kobashi et al. (2017) when band-pass filtered for two periodic-time bands and calculated without the 8.2k-event.  Table 3 Standard deviations (2s) of gas-isotope measured data and accumulation rates when band-pass filtered for two periodic-time bands and calculated without the 8.2k-event.  Kobashi et al., (2017) d 15 N excess temperature from the late-to mid-Holocene and d 18 O ice -and d 15 N-based temperature estimates from the mid-to early-Holocene, the reconstruction is not consistent throughout the Holocene. For example, temperature variability doubles for the late-to mid-Holocene compared to the mid-to early-Holocene.
Future studies should aim to decipher the driving mechanisms for the Holocene temperature variability at the Greenland summit. Comparisons of ice core-based temperature reconstructions during the Holocene with other high latitude proxy records (e.g. sea surface temperature records, North Atlantic circulation proxies, terrestrial temperature records (pollen, speleothems, tree rings)) require a robust temperature record with low uncertainties. Based on the above arguments, we prefer the d 15 N-based temperature reconstruction. Our T(d 15 N) provides robust centennial to millennial scale variations while decadal variations are more uncertain.

Conclusion
In this study, we reconstruct temperature variations during the Holocene by applying the gas-isotope fitting-algorithm of D€ oring and Leuenberger (2018) to d 15 N, d 40 Ar, and d 15 N excess data measured from the GISP2 ice core (Kobashi et al., 2008b). The temperatures reconstructed from both models (Schwander et al. (1997); Goujon et al. (2003)) agree well (r > 0.9) for all periodictime bands considered (multi-decadal, multi-centennial, multimillennial). The gas isotope fitting-algorithm works very well for d 15 N and d 40 Ar with excellent reproducibility between ten individual runs which is not the case for d 15 N excess . Analysis of the ice core data revealed that the signal-to-noise ratio (SNR) is best for d 15 N, favoring this parameter for Holocene temperature reconstruction, especially for multi-decadal to multi-centennial signals.
Comparison of the temperature estimates -when fitting d 15 N and d 40 Ar, respectively -showed a significant and high correlation between T(d 15 N) and T(d 40 Ar) for multi-centennial and multimillennial signals, with T(d 40 Ar) having higher amplitudes for some of these temperature anomalies. For multi-decadal signals, the correlation between T(d 15 N) and T(d 40 Ar) is weak, and consistent with the correlation between the isotopic data (d 15 N and d 40 Ar). Comparison of the firn temperature gradient calculated from T(d 15 N) and T(d 40 Ar), DT firn,model , with the DT firn,meas , calculated from the measured d 15 N excess , documents a good agreement of the general trend in the late-to mid-Holocene (1.3 kyre6.4 kyr b2k). However, in the early-to mid-Holocene (6.4 kyre11.5 kyr) and late-Holocene (0.07 kyre1.3 kyr b2k), DT firn,model significantly exceeds DT firn,meas , which may be indicative of systematically excessive d 40 Ar values in this section, possibly due to fractionation of d 40 Ar caused by gas-loss still remaining after the gas-loss correction on d 40 Ar (Kobashi et al., 2015(Kobashi et al., , 2017. The temperature calculated by fitting d 15 N excess is significantly different from the coherent T(d 15 N) and T(d 40 Ar), especially for the early-and late-Holocene. Analysis of these differences also suggests that the d 40 Ar values are too high in these sections. The correlations between T(d 15 N excess ) and T(d 15 N) or T(d 40 Ar) are weak for all three periodic-time bands studied (multi-decadal, multi-centennial, multi-millennial). For multi-decadal signals, we find a weak negative correlation between T(d 15 N excess ) and T(d 40 Ar) and no correlation between T(d 15 N excess ) and T(d 15 N), implying that the multidecadal oscillations in T(d 15 N excess ) are mainly driven by d 40 Ar, with a smaller influence of d 15 N due to a higher noise contribution on d 40 Ar/4 compared to d 15 N. In addition, we calculated the slope between d 15 N and d 40 Ar using geometric mean regression for all three periodic-time bands. In particular, for multi-decadal signals, the slope significantly underestimates the theoretical value (calculated using the empirically derived thermal sensitivities of d 15 N and d 40 Ar) by 53% and is therefore an additional indication of excessively high d 40 Ar values possibly due to remaining gas-loss signal. For the multi-centennial and multi-millennial bands, the slope is consistent with the theoretical expectation. The influence of three different accumulation rate estimates on the temperature reconstructions was investigated. Accumulation rate uncertainty results in asynchronous multi-decadal temperature signals due to variations in the modelled Dage regimes, with maximum differences of 30 years in the early-Holocene, while signal amplitudes remain unchanged. From the mid-Holocene to present, the modelled Dage difference is less than 5 years. For the longer-term temperature trends, the scenario with the higher accumulation rate results in 0.3 K more cooling compared to the scenarios with the lower and intermediate accumulation rates.
The total uncertainty of the reconstructed temperatures is composed of four terms (all values are 2s): (i) the discrepancy between the measured and modelled isotope data (3.65 permeg for d 15 N and 2.79 permeg for d 40 Ar/4, equals for 0.25 K for T(d 15 N) and 0.27 K for T(d 40 Ar)); (ii) the reproducibility of the temperature estimates of the isotope fits (0.17 K for T(d 15 N) and 0.26 K for T(d 40 Ar)); (iii) the measurement uncertainty of the isotope data (6 … 8 permeg for d 15 N and 8 … 18 permeg for d 40 Ar/4, equals 0.41 … 0.54 K for T(d 15 N) and 0.78 … 1.76 K for T(d 40 Ar)); and (iv) the difference between temperature estimates using different firnmodels (here between two firn-models) (0.62 K for T(d 15 N) and 0.73 K for T(d 40 Ar)). The addition of these terms leads to a final uncertainty of 2s T ¼ 0.80 … 0.88 K for T(d 15 N), and 2s T ¼ 1.14 … 1.95 K for T(d 40 Ar).
Finally, we compared the correlations between three reconstructions for three periodic-time bands (multi-decadal, multicentennial, multi-millennial) and signal variance. In general, we find higher agreement between our d 15 N-based reconstruction and the d 18 O-and d 15 N-based reconstruction of Buizert et al. (2018) than between our T(d 15 N) estimate and the reconstruction of Kobashi et al. (2017) in all bands. The results presented in this paper suggest that T(d 15 N) is the most reliable temperature estimate.
Our Holocene temperature reconstructions allow comparisons with existing or upcoming high latitude proxy data sets (e.g. ice cores or sea sediments) to assign relevant processes for temperature variations on Greenland summit studied here. In addition, another testing ground for model temperature results is created to shed light on the Holocene temperature conundrum.

Author contributions
M. D€ oring performed all calculations and wrote a draft version of the manuscript. M. C. Leuenberger initiated and supervised the work. Both worked on finalizing the manuscript for submission.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.