Even the most sophisticated global climate models are known to have significant biases in the way they simulate the climate system. Correcting model biases is therefore an essential step towards realistic palaeoclimatologies, which are important for many applications such as modelling long-term ecological dynamics. Here, we evaluate three widely used bias correction methods – the delta method, generalised additive models (GAMs), and quantile mapping – against a large global dataset of empirical temperature and precipitation records from the present, the mid-Holocene (

Realistic reconstructions of global palaeoclimate are a key input for modelling many important long-term and large-scale ecological processes

Bias correction has received a great deal of attention for present-day and near-future simulations

Here we combine a set of high-resolution simulations of the climatological means of several temperature and precipitation variables for the present, the mid-Holocene (

Section

We used

Empirical reconstructions (see Sect.

All bias correction methods considered here are calibrated based on present-day observational data. For this, we used monthly terrestrial temperature and precipitation data at a 0.167

We used global datasets of local empirical palaeoclimate reconstructions of terrestrial mean annual temperature, temperatures of the coldest and warmest months, and annual precipitation for the mid-Holocene and the LGM from

Empirically derived climate reconstructions can themselves be subject to biases and uncertainties which arise at the different stages of the reconstruction process, from collecting the data to computationally converting empirical records to climate variables. Nonetheless, these data represent the best empirically based estimates of past climatic conditions available and the most suitable data for our analysis.

The delta method consists of adding the difference between past and present-day simulated climate to present-day observed climate. As such, the delta method assumes that local (i.e. grid-cell-specific) model biases are constant over time

Precipitation is bounded below by zero and covers different orders of magnitude across different regions. A multiplicative rather than additive bias correction is therefore more common when applying the delta method for precipitation, which corresponds to applying the simulated relative change to the observations
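The two variants of the delta method described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in this study: the function names, the per-grid-cell list layout, the handling of zero simulated precipitation, and the example values are all hypothetical.

```python
def delta_additive(obs_present, sim_present, sim_past):
    """Add the simulated change (past minus present) to the observed present."""
    return [o + (sp - s0) for o, s0, sp in zip(obs_present, sim_present, sim_past)]


def delta_multiplicative(obs_present, sim_present, sim_past):
    """Scale the observed present by the simulated relative change."""
    corrected = []
    for o, s0, sp in zip(obs_present, sim_present, sim_past):
        if s0 == 0:
            # Illustrative choice, not from the text: the ratio is undefined
            # where the simulated present-day value is zero, so flag the cell.
            corrected.append(float("nan"))
        else:
            corrected.append(o * sp / s0)
    return corrected


# Hypothetical values for two grid cells (temperature in degC, precipitation in mm/yr).
temperature = delta_additive([10.0, 25.0], [8.0, 27.0], [3.0, 24.5])
precipitation = delta_multiplicative([800.0, 50.0], [1000.0, 100.0], [750.0, 150.0])
```

Passing the present-day simulation as `sim_past` returns the observations unchanged in both variants, which is the sense in which the delta method reproduces present-day observed climate exactly.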

Statistical bias correction methods assume the existence of a functional relationship between true climatic conditions (dependent variables) and climate model outputs as well as additional known forcings such as topography (independent variables)

For a set of geographical locations

Quantile mapping aims to correct distributional biases in the simulated climate data. The method consists of first computing a transformation that maps the quantiles of the cumulative distribution function of all present-day observed values (i.e. from all land or ocean grid cells) of a climate variable onto the quantiles of the cumulative distribution function of all present-day simulated values. The derived mapping is then applied to the cumulative distribution function of all simulated values at a given point in the past. For example, let the cumulative distribution function of the values of present-day observed terrestrial mean annual temperature (i.e. from all land grid cells) map the value
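One common empirical form of quantile mapping pairs the quantiles of the present-day simulated and observed distributions and interpolates between them. The sketch below follows that form and is only an illustration: the function names, the number of quantile levels, and the treatment of past values outside the present-day simulated range (shifted by the correction at the nearest end quantile) are assumptions, not details taken from the text.

```python
from bisect import bisect_right


def quantiles(values, n_levels):
    """Linearly interpolated quantiles at n_levels evenly spaced probabilities."""
    s = sorted(values)
    out = []
    for k in range(n_levels):
        pos = k / (n_levels - 1) * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        w = pos - lo
        out.append(s[lo] * (1 - w) + s[hi] * w)
    return out


def fit_quantile_map(obs_present, sim_present, n_levels=101):
    """Pair present-day simulated quantiles with observed quantiles."""
    return quantiles(sim_present, n_levels), quantiles(obs_present, n_levels)


def apply_quantile_map(mapping, values):
    """Map simulated values onto the observed distribution, quantile by quantile."""
    sim_q, obs_q = mapping
    corrected = []
    for x in values:
        if x <= sim_q[0]:
            # Assumption: below the calibration range, reuse the lowest correction.
            corrected.append(x + (obs_q[0] - sim_q[0]))
        elif x >= sim_q[-1]:
            # Assumption: above the calibration range, reuse the highest correction.
            corrected.append(x + (obs_q[-1] - sim_q[-1]))
        else:
            i = bisect_right(sim_q, x) - 1
            w = (x - sim_q[i]) / (sim_q[i + 1] - sim_q[i])
            corrected.append(obs_q[i] + w * (obs_q[i + 1] - obs_q[i]))
    return corrected
```

The mapping is fitted once on all present-day grid-cell values and then applied unchanged to the simulated values of a past period, so the correction depends on where a value sits in the distribution rather than on its location.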

Formally, we denote by

All three bias correction methods considered here aim at minimising biases in past simulated data, but they are based on different assumptions as to how this aim can best be achieved. The delta method assumes that the known present-day model bias is also a good estimate for past model bias. GAM methods and quantile mapping operate on the premise that this assumption of the delta method – local biases remaining constant over time – is too strong. Instead, GAM methods assume that a better estimate of past model biases can be obtained by deriving a statistical relationship between present-day bias and present-day simulations and then applying this relationship to past simulations in order to estimate past bias. Because regressions generally do not fit the data perfectly, present-day biases modelled by the GAM will not exactly match the observed biases across all grid cells. Unlike in the case of the delta method, GAM-corrected present-day simulations are therefore not identical to the present-day observed climate. This drawback is accepted under the assumption that the derived statistical model captures the mechanisms that underlie local model biases better than the time-invariant local correction term used in the delta method and indeed to an extent that results in more accurate estimates of past model biases. Similarly, quantile mapping assumes that the distributional correction of climate quantiles – whilst, again, not perfectly eliminating biases in present-day simulations – ultimately represents a better strategy for minimising past bias than the rigid local correction of the delta method.
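The difference between a regression-based correction and the delta method at the present day can be made concrete with a deliberately simplified stand-in. The sketch below replaces the GAM with an ordinary least-squares line relating present-day bias to the present-day simulated value; it is not a GAM and not the model used here, and the function names, the sign convention for bias (observed minus simulated), and the example values are hypothetical.

```python
def fit_bias_model(sim_present, obs_present):
    """Least-squares line for the bias (observed minus simulated) against the simulation."""
    n = len(sim_present)
    bias = [o - s for o, s in zip(obs_present, sim_present)]
    mean_s = sum(sim_present) / n
    mean_b = sum(bias) / n
    sxx = sum((s - mean_s) ** 2 for s in sim_present)
    sxy = sum((s - mean_s) * (b - mean_b) for s, b in zip(sim_present, bias))
    slope = sxy / sxx
    intercept = mean_b - slope * mean_s
    return intercept, slope


def apply_bias_model(model, sim):
    """Add the modelled bias to simulated values (present or past)."""
    intercept, slope = model
    return [s + intercept + slope * s for s in sim]


# Hypothetical present-day values for three grid cells.
sim_present = [0.0, 10.0, 20.0]
obs_present = [1.0, 12.0, 20.0]
model = fit_bias_model(sim_present, obs_present)
corrected_present = apply_bias_model(model, sim_present)
```

Because the fitted line does not pass through every point, `corrected_present` differs from `obs_present` in each cell, whereas a delta correction applied to the present-day simulation would return the observations exactly.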

Another important commonality between the methods is that they are calibrated only using present-day simulated and observed data. All three are based on the concept of establishing a relationship between present-day simulated and observed data and then extrapolating that relationship in order to estimate past biases. The specific aspect that is assumed to be invariant over time is the present-day local bias in the case of the delta method, the regression model linking present-day simulated and observed data in the case of GAMs, and the present-day distributional correction in the case of quantile mapping.

Empirical palaeoclimate reconstructions of climatological normals allow us to assess the performance of different bias correction methods in removing biases in past simulated data. In the following, we define the local differences between empirical reconstructions and bias-corrected simulations for the different climate variables and bias correction methods considered and develop a spatially aggregated measure to assess the global performance of each method.
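As a rough illustration of such an aggregated measure, the sketch below computes local differences at reconstruction sites and summarises them by their median and by the median of their absolute values. The function names, the sign convention (bias-corrected simulation minus reconstruction), and the example values are assumptions made for illustration only.

```python
from statistics import median


def local_bias(corrected, reconstructed):
    """Local difference at each site: bias-corrected simulation minus reconstruction."""
    return [c - r for c, r in zip(corrected, reconstructed)]


def median_absolute_bias(corrected, reconstructed):
    """Median over all sites of the absolute local difference."""
    return median([abs(b) for b in local_bias(corrected, reconstructed)])


def median_bias(corrected, reconstructed):
    """Median over all sites of the signed local difference."""
    return median(local_bias(corrected, reconstructed))


# Hypothetical values at three reconstruction sites (degC).
corrected = [10.0, 5.0, -2.0]
reconstructed = [9.0, 7.0, -2.5]
mab = median_absolute_bias(corrected, reconstructed)
mb = median_bias(corrected, reconstructed)
```

Using the median rather than the mean keeps the summary robust to a small number of sites with very large local differences.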

We denote by

We tested whether the median absolute biases associated with any two bias correction methods, as well as a certain climate variable and point in time, were statistically significantly different under the given uncertainty in the empirical reconstructions using the following approach. For each climate variable and point in time, we generated

Debiased simulated data should ideally not contain any systematic bias in that the median bias MB given by

In some applications, the climate change signal, i.e. the difference between past and present climatic states, may be more relevant than the climate at a fixed point in time. The difference between the empirical and the simulated climate change signal CCB of a climate variable

Figure

Comparison of bias-corrected simulated and empirically reconstructed climate variables.

Median absolute biases (MAB in Eq.

All bias correction methods reduce the median absolute bias (MAB in Eq.

The above trends in the performances of the different bias correction methods in terms of the median absolute bias are not always statistically significant. The median absolute bias associated with the delta method was significantly smaller (

Across time periods, raw simulations tended to underestimate terrestrial and marine mean annual temperatures and terrestrial temperature of the warmest month and overestimated annual precipitation (Fig.

The differences between bias correction methods in terms of improving the climate change signal (CCMAB in Eq.

Reduction of the original model bias by the delta method for terrestrial and marine mean annual temperatures and terrestrial annual precipitation. The lower end of the colour scale was capped at

The performance of the different methods is not uniform across space or time. Figure

Relative performances of the delta method and the GAM approach in terms of debiasing simulated mean annual temperature (left column) and annual precipitation (right column). The colour spectrum represents the interval [0,1], and marker colours are calculated as the ratio of the absolute value of the local bias (Eq.

The performances of the methods relative to each other also vary substantially across both space and time. For example, whilst globally the delta method has a slight overall edge over the GAM approach (Fig.

Whilst, overall, the delta method performs slightly better at debiasing temperature and precipitation than the GAM-based method and quantile mapping for the empirical data considered here, we note that this method is only appropriate for a given land conformation. Thus, it is only suitable for the late Quaternary, and, even for this period, changes in sea levels are problematic as they expose areas for which there is no bias information. GAMs should, in theory, obviate this problem by quantifying bias-related processes as statistical relationships; however, whilst this approach might be the only option for the deeper past, our results suggest that estimating such processes statistically is challenging, as demonstrated by the overall inferior performance of GAMs relative to the delta method. A possible limitation of GAMs as currently applied is that they assume additivity between predictor variables. Fitting interactions would allow for more complex processes, but the computational cost of fitting interactions on such large datasets is non-trivial.

Differences between local past and present model bias (at locations for which empirical reconstructions are available) against the local simulated climate change signal (i.e. the difference between past and present simulated value) of the variable of interest. Red, blue, and green markers represent data from the mid-Holocene, the LGM, and the last interglacial period, respectively. Error bars represent standard errors of the empirical reconstructions. Lines and shades show robust linear regressions and 95 % confidence intervals, respectively. Whilst weak, the relationships suggest that it may be possible to model some of the variability of local model biases over time using only the available simulation data. Such an approach could substantially enhance the delta method, which currently operates on the simplifying assumption that this variability is negligible.

A major limitation of current approaches for correcting biases in climate model data is that they all assume bias patterns in present-day climate to be fully representative of the past (see Sect.

Such an approach would tie in with data assimilation methods, which also use empirical climate proxy records to improve climate simulations. These methods have been used to estimate global climate variables at times at which the quantity and spatial coverage of available empirical records is high enough to allow a robust calibration of the relevant computational methods. As a result, they have focussed either on single points in the past, such as the mid-Holocene

Our comparison of global debiased palaeosimulation data and empirical reconstructions suggests that, despite its conceptual simplicity, the delta method is a good starting point for the bias correction of simulated late Quaternary climate data at a global scale, providing slightly stronger bias reductions than GAMs and quantile mapping.
However, given the lack of statistical significance of the superior performance in some cases and the considerable variability in the effectiveness of the three methods across different locations and points in time, we echo earlier propositions that studies focussing on specific regions require case-by-case assessments of which bias correction method is most suitable for improving palaeoclimate simulations

A key limitation of all three methods considered here is their assumption that present-day patterns between simulated and observed climate can be extrapolated to estimate model biases in the past. High uncertainties and the spatial and temporal sparseness associated with currently available empirical palaeoclimate datasets will likely impede a robust assimilation of these data into bias correction methods at this stage; however, our data indicate that the increasing quantity and quality of global proxy records could soon make it possible to use empirical reconstructions in the development of improved methods that effectively account for the variation of local model biases through time.

Code and datasets used in this analysis are available on the Open Science Framework:

All authors conceived the study. RB conducted the analysis and wrote the paper. All authors interpreted the results and revised the paper.

The authors declare that they have no conflict of interest.

The authors are grateful to Paul J. Valdes and Joy S. Singarayer for providing the climate simulation data used in this study and to three anonymous reviewers for their helpful comments.

This research has been supported by the ERC (Consolidator Grant “LocalAdaptation”, grant no. 647787).

This paper was edited by Steven Phipps and reviewed by three anonymous referees.