The combined use of proxy records and climate modelling is
invaluable for obtaining a better understanding of past
climates. However, many methods of model-proxy comparison in the
literature are fundamentally problematic because larger errors in
the proxy tend to yield a “better” agreement with the model. Here
we quantify model-proxy agreement as a function of proxy uncertainty
using the overlapping coefficient OVL, which measures the
similarity between two probability distributions. We found that the
model-proxy agreement is poor (

In paleoclimatology, the combined use of proxy records
(reconstructions of past climate conditions) and climate modelling is
invaluable for understanding past climate dynamics. Comparing climate model
simulations with proxy records is also useful for evaluating model
performance. If the models perform well for past climates, this can
increase confidence in future climate projections

To take the uncertainties of proxy data into account, several methods
have been proposed to evaluate the (dis)agreement between the model
and proxy data. The simplest one is to use the median and
interquartile/total range of the proxy and model data to check whether
they overlap with each other

In this study we elucidate some of the (potentially serious) issues
associated with conventional model-proxy comparison methods. Since many
of these methods yield improved model-proxy agreement for large
uncertainties in the proxy data, we pay particular attention to this
feature. To quantify agreement we calculate the overlapping coefficient
(OVL), which measures the degree of similarity between two probability
distributions
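As a minimal numerical sketch of this idea, the snippet below assumes the standard definition of the overlapping coefficient, namely the area under the pointwise minimum of two probability density functions. It is not the authors' implementation; the function names, the grid, and the means and SDs are hypothetical values chosen for illustration only.

```python
import numpy as np


def gaussian_pdf(x, mean, sd):
    """Gaussian probability density evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def ovl(pdf_a, pdf_b, x):
    """Area under min(pdf_a, pdf_b), integrated with the trapezoidal rule."""
    y = np.minimum(pdf_a, pdf_b)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Hypothetical example: a "model" distribution with mean 10 and SD 1.
x = np.linspace(-50.0, 50.0, 200001)
model = gaussian_pdf(x, 10.0, 1.0)

ovl_identical = ovl(model, gaussian_pdf(x, 10.0, 1.0), x)  # same mean, same SD
ovl_shifted = ovl(model, gaussian_pdf(x, 12.0, 1.0), x)    # mean shifted by 2 SDs
ovl_wide = ovl(model, gaussian_pdf(x, 10.0, 5.0), x)       # same mean, 5x larger SD

print(ovl_identical, ovl_shifted, ovl_wide)
```

Under these assumptions, identical distributions give an OVL of 1, and a shift of two SDs between equal-width Gaussians gives roughly 0.32. The same-mean case with a five times larger SD also yields a low value (about 0.35), which shows that an inflated uncertainty does not by itself improve this measure of agreement.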

The method for model-proxy comparison introduced here is based on
a probabilistic approach. If

A major advantage of using OVL over other probabilistic measures is its
simplicity; Eq. (

The main advantage of the overlapping coefficient (OVL) method is that both
the proxy uncertainty and the model variability are accounted for, because
they alter the standard deviations (SDs) of the corresponding probability
distributions. A necessary requirement for
calculating OVL (Eq.

In the following we discuss probability distributions derived from
climate models and proxy data. We restrict our discussion to examples
from our previous work, which includes vegetation-based proxies of mean
annual temperature (MAT) and mean annual precipitation (MAP) derived with
the co-existence approach

In climate models, temperature and precipitation variability usually follow
a Gaussian (Normal) and a gamma distribution, respectively. For both
distributions, only

The use of a gamma instead of a Gaussian distribution for
precipitation is motivated by the fact that the probability density
is zero for negative values, which is one of the main features of the
gamma distribution. Furthermore, the gamma distribution can
change its shape depending on the characteristics of the
data. If
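The properties of the gamma distribution mentioned above can be illustrated with a short sketch. It assumes that the distribution is parametrised from a mean and an SD by the method of moments (shape = (mean/SD)^2, scale = SD^2/mean), which is a common choice but is not confirmed by the text reproduced here. The function names and the precipitation values are hypothetical.

```python
import math

import numpy as np


def gamma_pdf_from_moments(x, mean, sd):
    """Gamma density with the given mean and SD (method-of-moments assumption)."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    x = np.asarray(x, dtype=float)
    pdf = np.zeros_like(x)          # density is zero for x <= 0
    pos = x > 0
    log_pdf = ((shape - 1.0) * np.log(x[pos]) - x[pos] / scale
               - math.lgamma(shape) - shape * math.log(scale))
    pdf[pos] = np.exp(log_pdf)
    return pdf


def trapezoid(y, x):
    """Trapezoidal-rule integral of y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Hypothetical precipitation example: mean 800 mm/yr, SD 200 mm/yr.
x = np.linspace(-500.0, 5000.0, 550001)
precip = gamma_pdf_from_moments(x, mean=800.0, sd=200.0)

area = trapezoid(precip, x)             # should be close to 1
mean_check = trapezoid(x * precip, x)   # should recover the prescribed mean
print(area, mean_check)
```

With a small SD relative to the mean the resulting curve is nearly symmetric, whereas a large SD relative to the mean produces a strongly skewed curve. In both cases the density remains zero for negative precipitation.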

Vegetation proxies based on the co-existence approach yield intervals
with homogeneous probability rather than a most likely estimate and
a standard error
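To illustrate how such an interval can enter an overlap calculation, the sketch below represents a proxy interval as a uniform density and compares it with a Gaussian model distribution. OVL is taken here as the area under the pointwise minimum of the two densities. All names and numerical values (a hypothetical modelled MAT of 15 °C with an SD of 1 °C, and two hypothetical proxy intervals) are assumptions made for illustration, not data from the study.

```python
import numpy as np


def gaussian_pdf(x, mean, sd):
    """Gaussian probability density evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def uniform_pdf(x, low, high):
    """Uniform density on [low, high], zero elsewhere."""
    return np.where((x >= low) & (x <= high), 1.0 / (high - low), 0.0)


def ovl(pdf_a, pdf_b, x):
    """Area under min(pdf_a, pdf_b), integrated with the trapezoidal rule."""
    y = np.minimum(pdf_a, pdf_b)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


x = np.linspace(-20.0, 40.0, 600001)
model = gaussian_pdf(x, 15.0, 1.0)                        # hypothetical model MAT

ovl_narrow = ovl(model, uniform_pdf(x, 14.0, 16.0), x)    # 2 degC wide interval
ovl_wide = ovl(model, uniform_pdf(x, 5.0, 25.0), x)       # 20 degC wide interval
print(ovl_narrow, ovl_wide)
```

Both hypothetical intervals are centred on the model mean, yet the wide interval gives a markedly lower OVL than the narrow one, because its probability is spread over values that the model rarely produces.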

To better understand the behavior of OVL, Fig.

In the next two panels (Fig.

The poor agreement in Fig.

Somewhat surprising, however, is that proxies with large uncertainty in some
cases show greater agreement with the model if their means are further
apart. This is illustrated in Fig.

In this section, we apply the OVL approach to real climate model and proxy
data. For this purpose we use data from a previously published paleoclimate
modelling study

To evaluate the model-proxy agreement

Using the minimum distance method, the modelled MAT agrees fairly well with
the MAT from the vegetation proxy (Fig.

Figure

To provide some insight into the low OVL, Fig.

This work was motivated by the fact that many conventional model-proxy comparisons favour good agreement when the errors in the proxy are large. These methods are fundamentally problematic because the model “performance” is largely determined by the data used for comparison. Here we illustrate how proxy uncertainty influences the model-proxy agreement. We use a simple metric called the overlapping coefficient (OVL), which measures the agreement between two probability distributions. Although OVL has some shortcomings, it is able to quantify agreement as a function of uncertainty.

Our main result is that the model-proxy agreement can be poor even if the
mean values are similar. More specifically, for similar means OVL is always
less than

Here we examine how uncertainty in (proxy) data, i.e. the magnitude of the
error bars, influences OVL for a given (model) distribution. Our analysis is
based on two hypothetical triangular probability distributions (Fig. S2).
The advantage of using such a simple geometry is that we can derive
analytical relationships. For simplicity, we define the width of each
triangle to be equal to two SDs (
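The triangular case can also be explored numerically. The sketch below is not the analytical derivation referred to above. It uses the half-width of each triangle as a free parameter (the exact convention linking width and SD is not reproduced here), takes OVL as the area under the pointwise minimum of the two densities, and considers only the special case of equal means. All names and values are hypothetical.

```python
import numpy as np


def triangular_pdf(x, centre, half_width):
    """Symmetric triangular density with the given centre and half-width."""
    return np.maximum(0.0, 1.0 - np.abs(x - centre) / half_width) / half_width


def ovl(pdf_a, pdf_b, x):
    """Area under min(pdf_a, pdf_b), integrated with the trapezoidal rule."""
    y = np.minimum(pdf_a, pdf_b)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


x = np.linspace(-10.0, 10.0, 400001)
model_half_width = 1.0
model = triangular_pdf(x, 0.0, model_half_width)

# Proxy triangles share the model's centre but have increasing half-widths.
proxy_half_widths = [1.0, 2.0, 3.0, 5.0]
ovl_values = [ovl(model, triangular_pdf(x, 0.0, w), x) for w in proxy_half_widths]

for w, value in zip(proxy_half_widths, ovl_values):
    print(f"proxy half-width {w:.1f}: OVL = {value:.3f}")
```

For this equal-mean configuration the overlap area works out to 2a/(a + b), where a and b are the half-widths of the narrower and wider triangle, so OVL decreases steadily as the proxy triangle widens. This closed form follows from the geometry assumed in the sketch and is not necessarily the relationship derived in the original analysis.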

This work was supported by the LOEWE research funding programme of
the state of Hesse, the German Research Foundation (grant MI
926/8-1), and the Marie Curie Programme of the European
Commission. We are indebted to Arne Micheels and Torsten Utescher
for providing the climate modelling and paleovegetation data. Please
contact the corresponding or second author to obtain the data that
were used to produce Fig.