The sensitivity of the Indian monsoon to the full spectrum of climatic
conditions experienced during the Pleistocene is estimated using the
climate model HadCM3. The methodology follows a global sensitivity
analysis based on the emulator approach of

By focusing on surface temperature, precipitation, mixed-layer depth and
sea-surface temperature over the monsoon region during the summer season
(June-July-August-September), we show that precession controls the response
of four variables: continental temperature, in phase with June to July
insolation (high glaciation favouring a late-phase response);
sea-surface temperature, in phase with May insolation; continental
precipitation, in phase with July insolation; and
mixed-layer depth, in antiphase with the latter.

As regards the general methodology, it is shown that the emulator provides a powerful approach, not only to express model sensitivity but also to estimate internal variability and detect anomalous simulations.

Since the pioneering studies of

One general approach to this end has been to perform snapshot experiments for
specific time slices in the past. The general circulation model is run with a
particular set of initial conditions for a perpetual year for a long
computational time until equilibrium is reached. The epoch used for defining
the astronomical forcing and boundary conditions is one for which specific
efforts are being undertaken to collect observations. This is the general
spirit of projects such as COHMAP

Based on these experiments, it is now well understood that glacial boundary
conditions, typical, for example, of the Last Glacial Maximum, induce a
weakening of moisture transport over the Indian subcontinent and a reduction of
precipitation in East Asia

These past climate simulations are often complemented with additional
sensitivity experiments. One classical experimental setup consists in
considering two end-member states, often the pre-industrial and one
well-defined past period, and intermediate configurations for which one
or several forcing components are “activated” while the others are left as the
pre-industrial configuration

Palaeoclimate modelers are also concerned with the phase relationship between
forcing and climate. In particular, climatic precession may be seen as a
quasi-periodic rotation of the point of smallest Earth–Sun distance (it will be
referred to here as the perigee because we work in geocentric coordinates) and the
vernal equinox. By considering specific periods in the past for GCM experiments
one can develop only a partial understanding of the phase
relationships. Specifically,

Here, we will experiment with an alternative approach that will enable us
to simultaneously document the

The starting point of this approach consists in performing an ensemble of
snapshot simulations. The ensemble is designed such that experiments span the
space of possible forcing configurations that the Earth encountered during
the late Pleistocene (ca. the last 800 000 years). For this reason the approach
will be qualified as “global”; more specifically, this is a

it is derived from a small number of model runs filling the entire multidimensional input space;

once the emulator is built, it is not necessary to perform any additional runs with the model.

This technique of emulation is increasingly used to estimate uncertainties
on climate model outputs, given probability distributions on uncertain
quantities such as model parameters

Compared to this series of works, the present objective is somewhat
different. As stated, we are interested in input quantities which we know
varied in the past, though we will assume that they varied sufficiently slowly
to justify a hypothesis of quasi-stationarity of the ocean–atmosphere system
with the forcing.
Our purpose is to estimate the contribution of input factors to the temporal
climate variance that can be observed in palaeoclimate records. To this end we
refer to the statistical theory of global sensitivity analysis with emulation
formalised by

The paper is structured as follows. Section

The first task is to define the space of input configurations to be explored
with an ensemble of experiments. We consider five input factors: the three
elements of astronomical forcing (eccentricity

The three elements of astronomical forcing are combined under the form of

Left panel: ice area, in normalized units, and maximum height (in
meters) in the region 45–75

Experiment plan design, optimised to maximise the minimum distance
between points and to achieve orthogonality (maximise the determinant of the
covariance of input factors). Right:

The glaciation level is determined as follows. Our purpose is to
select 11 realistic boundary conditions representative of
glacial–interglacial dynamics. Pragmatically, we sampled these
boundary conditions among the series prepared by

The next step is to define an ensemble of experiments to run with the climate
model in order to efficiently span the input space. The choice of the number
of experiments and, for each experiment, the choice of input parameters is
called the design. A

For this application, two additional constraints need to be accounted for in
order to avoid sampling unrealistic inputs that would be uninformative for the
sensitivity analysis of climate over the Pleistocene: exclude forcings with

Experiment setup: simulation name and number,
astronomical parameters (eccentricity, longitude of the perigee and
obliquity),

Note that this design is in principle suitable for continuous factor
ranges only. The glaciation level used for experiments is an integer
obtained by rounding the value obtained by this process to the closest
integer. Designs specifically adapted for input spaces mixing
categorical and continuous variables could best be implemented in the
future
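The design step can be illustrated with a minimal sketch. It assumes a plain Latin hypercube on the unit cube and keeps, among random candidates, the one with the largest minimum inter-point distance. It does not implement the orthogonality criterion or the constraints on unrealistic forcings mentioned above. The function names, the number of candidates and the numbering of glaciation levels from 1 to 11 are illustrative assumptions, not the procedure actually used.

```python
import numpy as np

def latin_hypercube(n_points, n_dims, rng):
    """Return an (n_points, n_dims) Latin hypercube sample in [0, 1)."""
    # Independent random permutation of the strata 0..n_points-1 in each column.
    strata = np.argsort(rng.random((n_points, n_dims)), axis=0)
    # Jitter each point uniformly inside its stratum.
    return (strata + rng.random((n_points, n_dims))) / n_points

def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(points), k=1)].min()

def maximin_design(n_points, n_dims, n_candidates=200, seed=0):
    """Keep the candidate hypercube that maximises the minimum distance."""
    rng = np.random.default_rng(seed)
    candidates = [latin_hypercube(n_points, n_dims, rng)
                  for _ in range(n_candidates)]
    return max(candidates, key=min_pairwise_distance)

# 61 experiments and 5 input factors, as in the text.
design = maximin_design(n_points=61, n_dims=5)

# Hypothetical convention: the last column is the glaciation factor,
# rescaled to the range 1..11 and rounded to the closest integer.
glaciation = np.rint(1 + design[:, -1] * 10).astype(int)
```

Rounding a continuous coordinate in this way slightly breaks the space-filling property along the glaciation axis, which is the limitation noted in the paragraph above.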

Table

The climate model – referred to in this context as the simulator –
is the general circulation model HadCM3

Initial conditions are the final state of the PMIP2 0K experiment
featured in

The last 100 years of all simulations with orographic forcing
were retained for analysis. Over this interval, the
top-of-the-atmosphere imbalance ranges between

At this stage we suppose that the simulator HadCM3 has been run for all design
points. We now show that it is possible to

To this end, we need to develop a statistical model that can interpolate the outputs obtained with the simulator at the design points. The procedure is akin to geospatial interpolation, except that the input field is here five-dimensional, instead of two- or three-dimensional as in most geospatial applications (cf. video in the supplementary material).

In particular, we follow

The calibration of the emulator is mathematically described as follows. Let

Let

We also need to make a choice regarding the values of

In these conditions, the

Remember that

For this application, linear regression is an adequate choice because
the seasonal and annual forcings are almost linear with the input
factors, except possibly for glaciation level. Hence,

The correlation function

There is a further correction to be accounted for before using this function.
The quantity we are interested in emulating is the hypothetical mean of an
infinitely long experiment that has perfectly reached the stationary state.
In practice, we have to be content with the mean of a finite-length
experiment, obtained for a specific set of initial conditions and which may
not have perfectly reached the stationary state. The difference between the
output of an experiment and the ideal experiment average is expected to be
small yet impossible to predict exactly because it may chaotically depend on
initial conditions. It may effectively be accounted for in the emulator as
follows. Observe that the function

The nugget has another benefit: it regularises the problem for large length
scales, and it may in particular be shown that posterior means converge to the
solution of a linear regression problem for

The remaining problem is to estimate the hyperparameters

It is worth noting that, in our case, using the normal likelihood or the penalised one has practically no effect on the results.
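The following sketch gives a concrete, simplified picture of such an emulator. It assumes a linear regression mean, a squared-exponential correlation function and a nugget added to the diagonal of the correlation matrix. The hyperparameters (length scales and nugget) are fixed by hand rather than estimated by (penalised) likelihood. The correlation function, the variable names and the toy simulator are assumptions made for illustration and should not be read as the exact formulation used here.

```python
import numpy as np

def sq_exp_corr(xa, xb, length_scales):
    """Squared-exponential correlation between two sets of input points."""
    d = (xa[:, None, :] - xb[None, :, :]) / length_scales
    return np.exp(-(d ** 2).sum(axis=-1))

def fit_emulator(x, y, length_scales, nugget):
    """Calibrate a Gaussian process emulator with a linear mean and a nugget."""
    n = len(x)
    h = np.column_stack([np.ones(n), x])            # regressors: 1, x_1, ..., x_p
    a = sq_exp_corr(x, x, length_scales) + nugget * np.eye(n)
    a_inv_h = np.linalg.solve(a, h)
    k = h.T @ a_inv_h
    beta = np.linalg.solve(k, h.T @ np.linalg.solve(a, y))  # generalised least squares
    resid = y - h @ beta
    a_inv_resid = np.linalg.solve(a, resid)
    sigma2 = resid @ a_inv_resid / (n - h.shape[1] - 2)     # variance scale estimate
    return {"x": x, "a": a, "a_inv_h": a_inv_h, "k": k, "beta": beta,
            "a_inv_resid": a_inv_resid, "sigma2": sigma2,
            "length_scales": length_scales}

def predict(em, x_new):
    """Posterior mean and variance of the emulated mean response at x_new."""
    h_new = np.column_stack([np.ones(len(x_new)), x_new])
    t = sq_exp_corr(x_new, em["x"], em["length_scales"])
    mean = h_new @ em["beta"] + t @ em["a_inv_resid"]
    a_inv_t = np.linalg.solve(em["a"], t.T)
    p = h_new - t @ em["a_inv_h"]                   # correction for estimating beta
    var = em["sigma2"] * (1.0 - np.sum(t * a_inv_t.T, axis=1)
                          + np.sum(p * np.linalg.solve(em["k"], p.T).T, axis=1))
    return mean, np.maximum(var, 0.0)

# Toy stand-in for the simulator: a smooth function plus a small noise term
# playing the role of the sampling error of a finite-length experiment.
def toy_simulator(x):
    return 1.0 + 2.0 * x[:, 0] + np.sin(3.0 * x[:, 1])

rng = np.random.default_rng(1)
x_train = rng.random((30, 2))
y_train = toy_simulator(x_train) + 0.01 * rng.standard_normal(30)
emulator = fit_emulator(x_train, y_train,
                        length_scales=np.array([1.0, 1.0]), nugget=1e-4)
x_new = 0.1 + 0.8 * rng.random((5, 2))
mean, var = predict(emulator, x_new)
```

In this sketch the nugget is expressed as a fraction of the process variance, so that the prediction targets the underlying mean response rather than the noisy outputs themselves.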

We are now in a position to estimate the simulator output at potentially any input point spanned by the design. It is now possible to develop indices whose purpose is to summarise the sensitivity of the simulator to individual or combined factors throughout the whole input space. This is the general idea of global sensitivity analysis.
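As an illustration of one such index, the sketch below estimates the variance of the main effect of a single factor, that is, the variance of the output averaged over all other factors. It assumes independent inputs uniformly distributed on the unit interval, whereas a realistic application would weight inputs by how often each configuration occurred. A cheap analytical function stands in for the emulator mean, and all names are hypothetical.

```python
import numpy as np

def main_effect_variance(f, n_dims, index, n_grid=50, n_inner=2000, seed=0):
    """Variance over x_index of the mean of f taken over the other inputs."""
    rng = np.random.default_rng(seed)
    grid = (np.arange(n_grid) + 0.5) / n_grid       # midpoints of [0, 1]
    others = rng.random((n_inner, n_dims))          # common sample for all grid values
    cond_means = np.empty(n_grid)
    for j, value in enumerate(grid):
        sample = others.copy()
        sample[:, index] = value                    # freeze the factor of interest
        cond_means[j] = f(sample).mean()            # average over the other factors
    return cond_means.var()

# Stand-in for the emulator posterior mean: strongly sensitive to the first
# input, weakly to the second, and not at all to the third.
def toy_emulator_mean(x):
    return 3.0 * x[:, 0] + 0.5 * x[:, 1]

variances = [main_effect_variance(toy_emulator_mean, n_dims=3, index=i)
             for i in range(3)]
```

For this toy function the three values are close to 9/12, 0.25/12 and 0, which ranks the factors by their contribution to the output variance.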

In particular, one of the early applications of Bayesian emulators (as we use
here) was to estimate sensitivity measures to quantify the uncertainty on a
simulator output arising from the fact that the inputs are themselves uncertain

Lines: 66, 90 and 95 % percentiles of the empirical
distribution used to describe the probability distribution in the CO

In particular, the occupation density along the components of the astronomical
forcing can be estimated with histograms of long time series generated with
known astronomical solutions, such as those presented by

Given that we cannot run the model at every point of the space

Strictly speaking,
the word

As above.

In order to compute

JJAS sea-level pressure and surface temperature of the two regions
depicted: NI and IO. Units are in

In order to study the Indian monsoon, we define two regions: northern
India (NI), with coordinates 70–100

Diagnostic of emulator performance considering experiments 11 and 40. Shown are the mean and standard deviations of sea-surface temperature (left panel) and mixed-layer depth (right panel). The two poor predictions are clearly visible, especially in the case of sea-surface temperature.

Sensitivity to glaciation level and

We focus specifically on four physical variables representative of the
summer Indian monsoon process: June-July-August-September (JJAS)
temperature and precipitation on the continental box, and JJAS sea-surface temperature (SST) and mixed-layer depth on the Indian Ocean
box. Over the experiment design, continental temperature varies
between 15 and 21

An emulator using all 61 experiments is calibrated using the procedure given
in Sect.

This leads us to the following observations:

For

There are some instances where length scales are
much greater than the scale of the variables: this is observed on all
output variables for the response to

The leave-one-out cross-validation plot shows that two experiments are
not well captured by the Gaussian process model for
SST (experiments 11 and 40), and one for mixed-layer depth
(experiment 40). The emulator fails to predict the outputs within an error of
less than 3 standard deviations when they are left out of the calibration
procedure. The effects of these experiments on the emulator output are clearly
visible in Fig.

At this stage one could consider an alternative emulator, calibrated on a 59-member experiment design in which the two problematic simulations are omitted.

Emulator scales for the different fields under
study. In general, scales are commensurate with the range covered by the
input factors. However, for

This new emulator with new scales

All ancillary emulators constructed for the leave-one-out diagnostic capture between 38 (mixed-layer depth) and 43 (continental temperature) of the leave-one-out experiments within 1 standard deviation, and between 56 and 58 within 2 standard deviations, which roughly correspond to the 68 and 95 % ratios expected for a normal distribution.

The normalised errors are compatible with a normal distribution based on the Shapiro–Wilk normality test, except for continental temperature (normality rejected with 97 % confidence).

There is no error exceeding 3 standard deviations.
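The counts reported in these diagnostics can be obtained from the normalised leave-one-out errors, as in the sketch below. It assumes that, for each experiment, the emulator calibrated without that experiment provides a predictive mean and standard deviation. The function name and the four-point example are purely illustrative and do not use values from the experiments.

```python
import numpy as np

def loo_summary(simulated, loo_mean, loo_sd):
    """Summarise leave-one-out errors normalised by the predictive std dev."""
    z = (np.asarray(simulated) - np.asarray(loo_mean)) / np.asarray(loo_sd)
    abs_z = np.abs(z)
    return {
        "normalised_errors": z,
        "within_1sd": int((abs_z <= 1).sum()),   # about 68 % expected if normal
        "within_2sd": int((abs_z <= 2).sum()),   # about 95 % expected if normal
        "beyond_3sd": int((abs_z > 3).sum()),    # candidates for closer inspection
    }

# Made-up example: normalised errors are 0.5, -1.5, 2.5 and 3.5.
summary = loo_summary(simulated=[20.5, 18.5, 22.5, 23.5],
                      loo_mean=[20.0, 20.0, 20.0, 20.0],
                      loo_sd=[1.0, 1.0, 1.0, 1.0])
# summary["within_1sd"] == 1, summary["within_2sd"] == 2, summary["beyond_3sd"] == 1
```

The array of normalised errors returned by the function is also the natural input for a normality test.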

Finally, the suspicious anomalies generated on the
glaciation/precession plots are cleared (Fig.

Based on our experience with HadCM3 we are inclined to give more credit to this
new emulator as a predictor of HadCM3 outputs, rather than the one obtained
with simulations 11 and 40. Of course, this choice leaves us with the task of
explaining what went wrong with these two simulations. It seems that we have to
leave it as an open case. Further inspection of these particular experiments
reveals a clear warm–cold–warm pattern in the North Atlantic, and cooling over
the rest of the ocean, exemplified here by comparing experiments 11 and 15
(Fig.

Although we appreciate the difficulty, from a statistical inference perspective, of rejecting problematic experiments for the calibration of the emulator, we find it in fact positive that the emulator is effective in identifying experiments that behave unexpectedly compared to the bulk of the design.

Let us now consider the nugget.

As explained, this quantity quantifies the uncertainty of the simulation, i.e. how representative the 100-year simulations are of the mean model state.

The residual error in the emulator is of the order of

Thus, remarkably, the emulator calibration has successfully estimated model internal variability using only 100-year means, which we take as one more argument to use the recalibrated emulator.

Figure

The figure shows that continental summer temperature is primarily determined
by precession,

Similar to continental temperature, SST is primarily driven by precession and

Figure

Sea surface temperature difference between simulations 11 and 15 (see
Table

Diagnostic of emulator performance. Shown are the mean and standard deviation of the simulated and the emulated data points for all the simulations with the exception of simulations 11 and 40. Top left panel: continental temperature; top right panel: continental precipitation; bottom left panel: sea-surface temperature; bottom right panel: mixed-layer depth.

Sensitivity analysis: shown is the standard deviation of model
outputs (

Sensitivity to

Sea surface temperature difference for two

Sensitivity to

Orography–no-orography difference. From top to bottom, left to right: effect on continental temperature, precipitation, sea-surface temperature, and mixed-layer depth, with orography forcing (black) and without (red). The dotted lines show one standard deviation of the emulator prediction. One may see a departure point from glaciation level 3 in all four fields, as this is the point at which orography forcing becomes the most significant.

We see that the temperature response is in phase with June insolation at low glaciation levels, and in phase with July insolation at mid- and high-glaciation stages.

This feature may physically be understood by considering the summer precipitation response. Precipitation enhances latent heat cooling when perigee is around July. This effect gradually weakens as glaciation takes place and the total amount of precipitation declines, hence the drift towards a more linear response. At higher glaciation levels the JJAS temperature response phase also aligns with July insolation.
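One simple way to quantify such a phase is to fit the first harmonic of the response as a function of the longitude of the perigee, with the other factors held fixed. The sketch below assumes that the response is available (for instance from the emulator) on a set of longitudes. It returns the longitude at which the fitted harmonic peaks. It does not attempt to convert that longitude into a calendar month, since the conversion depends on the convention adopted for the origin of longitudes. The function name and the synthetic response are assumptions for illustration.

```python
import numpy as np

def first_harmonic(longitude_deg, response):
    """Least-squares fit of offset + a*cos(lon) + b*sin(lon).

    Returns the offset, the amplitude and the longitude (in degrees, 0..360)
    at which the fitted harmonic reaches its maximum.
    """
    lon = np.radians(longitude_deg)
    regressors = np.column_stack([np.ones_like(lon), np.cos(lon), np.sin(lon)])
    (offset, a, b), *_ = np.linalg.lstsq(regressors, response, rcond=None)
    amplitude = np.hypot(a, b)
    phase_deg = np.degrees(np.arctan2(b, a)) % 360.0
    return offset, amplitude, phase_deg

# Synthetic response peaking when the longitude of the perigee is 100 degrees.
longitudes = np.arange(0.0, 360.0, 10.0)
synthetic = 2.0 + 1.5 * np.cos(np.radians(longitudes - 100.0))
offset, amplitude, phase_deg = first_harmonic(longitudes, synthetic)
```

Repeating the fit at several glaciation levels would show how the phase drifts, provided the response remains close to sinusoidal; a strongly non-linear response would call for inspecting the full curve instead.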

The maximum precipitation is obtained when perigee is reached in
early July. Among the series of experiments shown by

Furthermore, continental precipitation and mixed-layer depth show opposite
response phases to precession. This result is consistent with the
earlier findings of

The response to obliquity is mostly linear, as we can infer from the high values
of the length scales (see Table

The range of obliquity covered during the Pleistocene induces a negligible continental temperature response over the west Indian box. It also induces a slight increase in precipitation. Regarding the Indian Ocean box, the effect on SST is somewhat larger than that on continental temperature, though it is not significant. As for the mixed-layer depth, the response to obliquity is negligible.

In order to better understand the effect of obliquity, we considered the four

The response of all variables to

Figure

Finally, we consider the differences between the simulations with and without orography forcing of the ice sheets. The latter is potentially important given that mountains and elevated land masses affect the atmospheric circulation and precipitation patterns, and hence the whole climate system. To this end, an emulator was calibrated on the available present-day orography experiments.

The net effect of orography can then be seen in
Fig.

A clear deviation is seen around glaciation level 3. This effect is
due to the fact that, as explained in Sect.

We present a first application of a global sensitivity analysis theory to study
the climate response of the Indian monsoon to the climate factors which
evolved during the Pleistocene, namely the astronomical forcing
(

We focus, in particular, on four variables: continental temperature, continental precipitation, sea-surface temperature and mixed-layer depth. These variables were averaged for the JJAS season over northern India and northwestern Indian Ocean.

Similar to a number of recent studies based on statistical modelling for global
sensitivity analysis of computationally expensive simulators, the technical
implementation follows a three-step methodology:

This analysis yielded the following conclusions:

Precession controls the response of four variables: continental temperature, in phase with June–July insolation (high glaciation favouring a late-phase response); sea-surface temperature, in phase with May insolation; continental precipitation, in phase with July insolation; and mixed-layer depth, in antiphase with the latter.

The effect of

Obliquity is a secondary effect, negligible on most variables except sea-surface temperature.

The effect of glaciation is dominated by the albedo forcing, and its effect on precipitation competes with that of precession.

The orographic forcing reduces the glacial cooling induced by the albedo forcing, and even has a positive effect on precipitation.

The present study confirms the high potential of emulation for exploring and understanding the response of climate models. One originality of the present work was to consider, as inputs, several elements of the climate forcing that have varied in the past, and the emulator was used as a method to help us quantify the link between forcing variability and climate variability. The methodology may naturally be applied to other regions of focus and other climate models.

Paul Valdes and Gethin Williams (University of Bristol) are thanked for their assistance in setting up HadCM3 on our systems, and for providing us with ice topography boundary conditions. Richard Wilkinson (University of Nottingham) provided important clarifications to statistical notations. Constructive comments on the discussion version of this paper provided by Tamsin Edwards, Guangshan Chen, Yongqiang Yu and Alexander Kislov are warmly acknowledged. This research is a contribution to the ITOP project, ERC-StG grant 239604. M. Crucifix is funded by the Belgian National Fund of Scientific Research (FRS-FNRS) and P. A. Araya-Melo and N. Bounceur are funded through the ERC ITOP project. Computational resources have been provided by the supercomputing facilities of the Université catholique de Louvain (CISM/UCL) and the Consortium des Equipements de Calcul Intensif en Fédération Wallonie Bruxelles (CECI) funded by the FRS-FNRS. The data used in this work, together with the emulator code, are available in the supplementary material. Edited by: P. Braconnot