CMIP6/PMIP4 simulations of the mid-Holocene and Last Interglacial using HadGEM3: comparison to the pre-industrial era, previous model versions and proxy data

. Palaeoclimate model simulations are an important tool to improve our understanding of the mechanisms of climate change. These simulations also provide tests of the ability of models to simulate climates very different to today. Here we present the results from two brand-new simulations using the latest version of the UK’s physical climate model, HadGEM3-GC3.1; they are the mid-Holocene ( ∼ 6 ka) and Last Interglacial ( ∼ 127 ka) simulations, both conducted under the auspices of CMIP6/PMIP4. This is the ﬁrst time this version of the UK model has been used to conduct palaeo-climate simulations. These periods are of particular interest to PMIP4 because they represent the two most recent warm


Introduction
Simulating past climates has been instrumental in improving our understanding of the mechanisms of climate change (e.g. Gates, 1976;Haywood et al., 2016;Jungclaus et al., 2017;Kageyama et al., 2017Kageyama et al., , 2018Kohfeld et al., 2013;Lunt et al., 2008;Otto-Bliesner et al., 2017;Ramstein et al., 1997), as well as in identifying and assessing discrepancies in palaeoclimate reconstructions (e.g. Rind and Peteet, 1985). Palaeoclimate scenarios can also provide tests of the ability of models to simulate climates that are very different to today, often termed "out-of-sample" tests. This notion underpins the idea that robust simulations of past climates improve our confidence in future climate change projections Harrison et al., 2014;Taylor et al., 2011). Palaeoclimate scenarios have also been used to provide additional tuning targets for models (e.g. Gregoire et al., 2011), in combination with historical or pre-industrial conditions.
The international Climate Model Intercomparison Project (CMIP) and the Palaeoclimate Model Intercomparison Project (PMIP) have spearheaded the coordination of the international palaeoclimate modelling community to run key scenarios with multiple models, perform data syntheses and undertake model-data comparisons since their initiation 25 years ago (Joussaume and Taylor, 1995). Now in its fourth incarnation, PMIP4 (part of the sixth phase of CMIP, CMIP6), it includes a larger set of models than previously and more palaeoclimate scenarios and experiments covering the Quaternary (documented in Jungclaus et al., 2017;Kageyama et al., 2017Kageyama et al., , 2018Otto-Bliesner et al., 2017) and Pliocene (documented in Haywood et al., 2016).
PMIP4 specifies experiment set-ups for two interglacial simulations: the mid-Holocene (MH) at ∼ 6 ka and the Last Interglacial (LIG) at ∼ 127 ka (although spanning ∼ 129-116 ka in its entirety). These are the two most recent warm periods (particularly in the Northern Hemisphere) in Earth history, and they are of particular interest to PMIP4; indeed, the MH experiment is one of the two entry cards into PMIP . This is because whilst the atmospheric concentration of greenhouse gases, the extent of land ice and the continental configuration are similar in these PMIP4 set-ups compared to the pre-industrial (PI) period, significant changes to the seasonal cycle of radiative forcing, relative to today, do occur during these periods due to long-term variations in the Earth's orbital configuration. The MH and LIG both have higher boreal summer insolation and lower boreal winter insolation compared to the PI, as shown by Fig. 1, leading to an enhanced seasonal cycle in insolation as well as a change in its latitudinal distribution. The change is more significant in the LIG than the MH, due to the larger eccentricity of the Earth's orbit at that time. Note that, in this figure and, indeed, all subsequent figures using monthly or seasonal data, the data have been calendar adjusted (Joussaume and Braconnot, 1997) according to the method of Pollard and Reusch (2002) and Marzocchi et al. (2015); see the Supplement ( Fig. S1) for the same figure but using the modern calendar.
Palaeodata syntheses indicate globally warmer surface conditions of potentially ∼ 0.7 • C than PI in the MH (Marcott et al. 2013) and up to ∼ 1.3 • C in the LIG . During both warm periods there is abundant palaeodata evidence indicating enhancement of Northern Hemisphere summer monsoons (e.g. Wang et al., 2008) and in the case of the Sahara, replacement of desert by shrubs and steppe vegetation (e.g. Drake et al., 2011;Hoelzmann et al., 1998), grassland and xerophytic woodland/scrubland (e.g. Jolly et al., 1998a, b;Joussaume et al., 1999), and inland water bodies (e.g. Drake et al., 2011;Lézine et al., 2011). Recent palaeodata compilations involving either air temperatures or sea surface temperature (SST) (Capron et al., 2014;Hoffman et al., 2017) reveal that the maximum temperatures were reached asynchronously in the LIG between the Northern Hemisphere and Southern Hemisphere. Concerning precipitation, historically this has been lacking relative to temperature or SST reconstructions. One often-cited study for the MH is that of Bartlein et al. (2011), comprising a combination of existing quantitative reconstructions based on pollen and plant macrofossils; this provides evidence of the interaction between the orbital variations and greenhouse gas forcing and the atmospheric circulation response. More recently, one newly published dataset of LIG precipitation proxy data (which the current study benefits from as part of the model-data comparison; see below) is that of Scussolini et al. (2019). Here, a number of climate models are assessed against this brand-new dataset, finding an agreement with proxy data over Northern Hemisphere landmasses but less so in the Southern Hemisphere (Scussolini et al., 2019).
Many modelling studies have been undertaken in an attempt to reproduce the changes suggested by proxy data throughout the Quaternary, especially during the interglacial periods discussed here, and there is not scope in this current study to give a full review here. An overview of multimodel assessments during the LIG can be found in Lunt et al. (2013). However, one example is the aforementioned monsoon enhancement (and expansion/contraction) during the Quaternary, and previous studies have focused on various aspects of this, such as whether any expansion was hemispherically consistent or asynchronous between hemispheres (e.g. Kutzbach et al., 2008;McGee et al., 2014;Singarayer and Burrough, 2015;Singarayer et al., 2017;Wang et al., 2006Wang et al., , 2014. During the LIG, the aforementioned asynchronous temperature distribution between the hemispheres has been investigated by a number of model simulations, suggesting that this may have been caused by meltwater-induced shutdown of the Atlantic Meridional Overturning Circulation (AMOC) in the early part of the LIG, due to the melting of the Northern Hemisphere ice sheets during the preceding deglaciation (e.g. Carlson, 2008;Smith and Gregory, 2009;Stone et al., 2016).
The driving mechanism producing the climate and environmental changes indicated by the palaeodata for the MH and LIG is different to current and future anthropogenic warming, as the former results from orbital forcing changes, whilst the latter results from increases in greenhouse gases. Moreover, the orbital forcing primarily acts on short-wave radiation, whereas greenhouse gas changes primarily act upon the long-wave radiation flux, and the orbital forcing can lead to uneven horizontal and seasonal changes, whereas greenhouse gas forcing can cause more uniform anomalies (it should be noted that whilst a precise calculation of the radiative forcing due to changes in MH and LIG greenhouse gases is beyond the scope of this study, such a calculation could fol-low the methodology of Gunnar et al., 1998). Nevertheless, despite these differences in driving mechanism, these past high-latitude (and mainly Northern Hemisphere) warm intervals are a unique opportunity to understand the magnitudes of forcings and feedbacks in the climate system that produce warm interglacial conditions, which can help us understand and constrain future climate projections (e.g. Holloway et al., 2016;Rachmayani et al., 2017;Schmidt et al., 2014). Running the same model scenarios with ever newer models enables the testing of whether model developments are producing improvements in palaeo model-data comparisons, assuming appropriate boundary conditions are used. Previous iterations of PMIP, with older versions of the PMIP4 models, have uncovered persistent shortcomings (Harrison et al., 2015) that have not been eliminated despite developments in resolution, model physics and addition of further Earth sys-tem components. One key example of this is the continued underestimation of the increase in rainfall over the Sahara in the MH PMIP simulations (e.g. Braconnot et al., 2012).
In this study we run and assess the latest version of the UK's physical climate model, HadGEM3-GC3.1. Whilst older versions of the UK model have been included in previous iterations of CMIP -and whilst present-day and future simulations from this model are included in CMIP6the novelty of this study is that this is the first time this version has been used to conduct any palaeoclimate simulations. In Global Coupled (GC) version 3 (and therefore in the following GC3.1), there have been many updates and improvements, relative to its predecessors, which are discussed extensively in Williams et al. (2017) and a number of companion scientific model development papers (see Sect. 2.1). As a brief introduction, however, GC3 includes a new aerosol scheme, multilayer snow scheme, multilayer sea ice and several other parametrization changes, including a set relating to cloud and radiation, as well as a revision to the numerics of atmospheric convection . In addition, the ocean component of GC3 has other changes including an updated ocean and sea ice model, a new cloud scheme, and further revisions to all parametrization schemes (Williams et al., 2017). See Sect. 2.1 for further details.
Following the CMIP6/PMIP4 protocol, here the PMIP4 MH and LIG simulations have been conducted and assessed, with the assessment adopting a two-pronged approach. Firstly a simulation comparison is made between these simulations and the same model's PI simulation (to describe and understand the differences between them). Secondly a model-data comparison is made between the current and previous versions of the same model relative to newly available proxy data, thereby assessing any improvements due to model advances. In addition to a global assessment, a secondary focus of this paper is on the fidelity of the temperature anomalies and the degree of precipitation enhancement in the Sahara, the latter of which has proved problematic for several generations of models. Following this introduction, Sect. 2 describes the model, the experimental design and the proxy data used for the model-data comparisons and gives a brief discussion of the simulation spin-up phases. Section 3 then presents the results, beginning with the simulation comparison and following with the model-data comparison, and finally Sect. 4 summarizes and concludes. In this paper, consistent with CMIP nomenclature, the "spinup phase" of the simulations refers to when they are spinning up to atmospheric and oceanic equilibrium, whereas the "production run" refers to the end parts (usually the last 50 or 100 years) of the simulation used to calculate the climatologies, presented as the results. When discussed as geological intervals, the pre-industrial, mid-Holocene and Last Interglacial are referred to as the PI, MH and LIG respectively. In contrast, when discussed as the three most recent simulations using HadGEM3 (see below), consistent with CMIP they are referred to as the piControl, midHolocene and lig127k simulations respectively. When the midHolocene and lig127k are discussed collectively, they are referred to as the "warm climate simulations"; whilst it is acknowledged that other factors differentiate these simulations such as orbital configuration or CO 2 , warm climate simulations was deemed an appropriate collective noun.

Model details
The warm climate simulations conducted here and the pi-Control simulation (conducted elsewhere as part of the UK's CMIP6 runs and used here for comparative purposes) were all run using the same fully coupled GCM: the Global Coupled 3 configuration of the UK's physical climate model, HadGEM3-GC3.1. Full details on HadGEM3-GC3.1 and a comparison to previous configurations are given in Williams et al. (2017) and Kuhlbrodt et al. (2018). Here, the model was run using the Unified Model (UM), version 10.7, and including the following components: (i) Global Atmosphere (GA) version 7.1, with an N96 atmospheric spatial resolution (approximately 1.875 • longitude by 1.25 • latitude) and 85 vertical levels; (ii) the NEMO ocean component, version 3.6, including Global Ocean (GO) version 6.0 (ORCA1), with an isotropic Mercator grid, which, despite varying in both meridional and zonal directions, has an approximate spatial resolution of 1 • by 1 • and 75 vertical levels; (iii) the Global Sea Ice (GIS) component, version 8.0 (GSI8.0); (iv) the Global Land (GL) configuration, version 7.0, of the Joint UK Land Environment Simulator (JULES); and (v) the OA-SIS3 MCT coupler. The official title for this configuration of HadGEM3-GC3.1 is HadGEM3-GC31-LL N96ORCA1 UM10.7 NEMO3.6 (for brevity, hereafter HadGEM3). All of the above individual components are summarized by Williams et al. (2017) and detailed individually by a suite of companion papers (see Walters et al., 2019, for GA7 and GL7;Storkey et al., 2018, for GO6;and Ridley et al., 2018, for GIS8). However, a brief description of the major changes relative to its predecessor are given in the Supplement. When all of these components are coupled together to give GC3, there have been several improvements relative to its predecessor (GC2), most noticeably to the large warm bias in the Southern Ocean (which was reduced by 75 %), as well as an improved simulation of clouds, sea ice, the frequency of tropical cyclones in the Northern Hemisphere as well as the AMOC, and the Madden-Julian Oscillation (MJO) . Relative to the previous fully coupled version of the model (HadGEM2), which was sub-mitted to the last CMIP5/PMIP3 exercise, many systematic errors have been improved including a reduction in the temperature bias in many regions, a better simulation of midlatitude synoptic variability, and an improved simulation of tropical cyclones and the El Niño-Southern Oscillation (ENSO) .
Here, the midHolocene and lig127k simulations were both run on the UK National Supercomputing Service, ARCHER, whereas the piControl was run on a different platform based within the UK Met Office's Hadley Centre. While this may mean that anomalies computed against the piControl are potentially influenced by different computing environments and not purely the result of different climate forcings, the reproducibility of GC3.1 simulations across different platforms has been tested (Guarino et al., 2020a). It was found that, although a simulation length of 200 years is recommended whenever possible to adequately capture climate variability across different platforms, the main climate variables considered here (e.g. surface temperature) are not expected to be significantly different on a 100-or 50-year timescale (see, for example, Fig. 6 in Guarino et al., 2020a) as they are not directly affected by medium-frequency climate processes.

Experiment design
Full details of the experimental design and results from the CMIP6 piControl simulation are documented in Menary et al. (2018). Both the warm climate simulations followed the experimental design given by Otto-Bliesner et al. (2017) and specified at https://pmip4.lsce.ipsl.fr/doku.php/exp_design: index (last access: 1 July 2020). The primary differences from the piControl were to the astronomical parameters and the atmospheric trace greenhouse gas concentrations, summarized in Table 1. For the astronomical parameters, these were prescribed in Otto-Bliesner et al. (2017) according to orbital constants from Berger and Loutre (1991). However, in HadGEM3, the individual parameters (e.g. eccentricity, obliquity, etc.) use orbital constants based on Berger (1978), according to the specified start date of the simulation. For the atmospheric trace greenhouse gas concentrations, these were based on recent reconstructions from a number of sources (see Table 1 for values and Sect. 2.2 in Otto-Bliesner et al., 2017, for a full list of references/sources).
All other boundary conditions, including solar activity, ice sheets, topography and coastlines, volcanic activity, and aerosol emissions, are identical to the CMIP6 piControl simulation. Likewise, vegetation was prescribed to present-day values, to again match the CMIP6 piControl simulation. As such, the piControl and both the warm climate simulations actually include a prescribed fraction of urban land surface. As a result of this, our orbitally forced and greenhouse-gasforced simulations should be considered as anomalies to the piControl, rather than absolute representations of the MH or LIG climate.
Both the warm climate simulations were started from the end of the piControl spin-up phase (which ran for approximately 600 years), after which time the piControl was considered to be in atmospheric and oceanic equilibrium (Menary et al., 2018). To assess this, four metrics were used, namely net radiative balance at the top of the atmosphere (TOA), surface air temperature (SAT), full-depth ocean temperature (OceTemp) and full-depth ocean salinity (OceSal) (Menary et al., 2018). See Sect. 2.4 (and in particular Table 2) for an analysis of the equilibrium state of both the piControl and the warm climate simulations. Starting at the end of the piControl, these were then run for their own spin-up phases, 400 and 350 years for the midHolocene and lig127k respectively. Once the simulations were considered in an acceptable level of equilibrium (see Sect. 2.4), a production phase was run for 100 and 200 years for the midHolocene and lig127k respectively, during which the full CMIP6/PMIP4 diagnostic profile was implemented to output both high-s and lowtemporal-frequency variables.

Data
Recent data syntheses compiling quantitative surface temperature and rainfall reconstructions were used in order to evaluate the warm climate simulations.
For the MH, the global-scale continental surface mean annual temperature (MAT) and rainfall (or mean annual precipitation, MAP) reconstructions from Bartlein et al. (2011), with quantitative uncertainties accounting for climate parameter reconstruction methods, were used (see Data availability for access details). They rely on a combination of existing quantitative reconstructions based on pollen and plant macrofossils and are inferred using a variety of methods (see Bartlein et al., 2011, for further details). At each site, the 6 ka anomaly (corresponding to the 5.5-6.5 ka average value) is given relative to the present day, and in the case where modern values could not be directly inferred from the record, modern climatology values  were extracted from the Climate Research Unit historical climatology dataset (New et al., 2002). Further proxy data for the MH, such as SST reconstructions, are not included here, as an extensive model-data comparison is presented in a companion paper (Brierley et al., 2020).
For the LIG, two recent different sets of surface temperature data are available. Firstly, the Capron et al. (2017) 127 ka time slice of surface air temperature (SAT) and sea surface temperature (SST) anomalies (relative to pre-industrial, 1870-1899) is based on polar ice cores and marine sediment data that are (i) located poleward of 40 • latitude and (ii) have been placed on a common temporal framework (see Data availability for access details). Polar ice core water isotope data are interpreted as annual mean surface air temperatures, while most marine sediment-based reconstructions are interpreted as summer (defined here as July-September, JAS) SST signals. For each site, the 127 ka value was calculated as the average value between 126 and 128 ka using the surface temperature curve resampled every 0.1 kyr. Here, we use the SST anomalies only. Secondly, a global-scale time slice of SST anomalies, relative to pre-industrial (1870-1889), at 127 ka was built, based on the recent compilation from Hoffman et al. (2017), which includes both annual and summer SST reconstructions (see Data availability for access details). This adds further novelty to this study, by using a new combined dataset based on this existing data. The 127 ka values at each site were extracted, following the methodology they proposed for inferring their 129, 125 and 120 ka time slices, i.e. the SST value at 127 ka was taken on the provided mean 0.1 kyr interpolated SST curve for each core location. Data syntheses from both Capron et al. (2014Capron et al. ( , 2017 and Hoffman et al. (2017) are associated with quantitative uncertainties accounting for relative dating and surface temperature reconstruction methods. Here, the two datasets are treated as independent data benchmarks, as they use different reference chronologies and methodologies to infer temporal surface temperature changes, and therefore they should not be combined. See Capron et al. (2017) for a detailed comparison of the two syntheses. A model-data comparison exercise using existing LIG data compilations focusing on continental surface temperature (e.g. Turney and Jones, 2010) was not attempted, as they do no benefit yet from a coherent chronological framework, preventing the definition of a robust time slice representing the 127 ka terrestrial climate conditions . A brand-new, recently published dataset of proxy precipitation anomalies (again, relative to the pre-industrial) is also used for model-data comparison purposes here, adding further novelty to this study. The proxy data are compiled from existing literature by Scusscolini et al. (2019), and the dataset includes 138 proxy locations from a number of palaeoclimatic archives including pollen, fossils other than pollen, lacustrine or marine sediment composition, loess deposits, and other multiproxy sources. Note that, as Scusscolini et al. (2019) observe, unlike temperature anomalies the majority of precipitation anomalies in the existing literature are not quantitative. To allow a quantitative comparison, Scusscolini et al. (2019) use a semiquantitative scale, based on their expert judgement, to show a LIG that is "much wetter", "wetter", experiencing"no discernible change", "drier" or "much drier", relative to the PI. The same scale is therefore used here. See Scusscolini et al. (2019) for further information, and see Data availability for access details.

Spin-up simulations
As briefly mentioned above, both the warm climate simulations had a spin-up phase before the main production run was started, briefly discussed here. As an example of atmospheric equilibrium, annual global mean 1.5 m air temperature and TOA radiation from both warm climate simulations, compared to the piControl, are summarized in Table 2; see Supplement ( Fig. S2) for the time series of these fields. For the warm climate simulations, despite considerable interannual variability and arguably more so than in the piControl (see Fig. S2), both are showing long-term trends of −0.06 and −0.16 • C per century for the last 100 years of the midHolocene and lig127k respectively ( Table 2). The spatial patterns of these trends, also shown in the Supplement (Fig. S3), are similar in both warm climate simulations, with much of the statistically significant cooling occurring over high-latitude regions in both hemispheres, particularly so over Antarctica in the lig127k simulation (Fig. S3). The TOA radiation balance is also showing long-term (and again slightly negative) trends by the end of the simulations, with −0.05 and −0.06 W m 2 for the midHolocene and lig127k respectively.
As an example of oceanic equilibrium, annual global mean full-depth OceTemp and OceSal are shown in Table 2 (and again visualized in the Supplement, Fig. S4). OceTemp is steadily increasing throughout the piControl, and this continues in both warm climate simulations, whereas there is a dramatic fall in ocean salinity in these simulations (Fig. S4). Concerning the long-term trends, Menary et al. (2018) considered values acceptable for equilibrium to be < ±0.035 • C per century and < ±0.0001 psu per century (for OceTemp and OceSal respectively); as shown in Table 2, although both warm climate simulations meet the temperature criterion, the midHolocene it is not meeting the salinity criterion (−0.0004 psu). However, running for several thousands of years (and > 5 years of computer time), which would be needed to reach true oceanic equilibrium, was simply unfeasible here given time and resource constraints.

Production runs results
The warm climate production runs were undertaken following the spin-up phase, with the climatology of each simulation being compared to that from the piControl, as well as available proxy data, using either annual means or summer/winter seasonal means. For the latter, depending on the availability of the proxy data, Northern Hemisphere summer is defined as either June-August (JJA) or JAS, and Northern Hemisphere winter is defined as either December-February (DJF) or January-March (JFM) -and vice versa for Southern Hemisphere summer/winter. As briefly introduced in Sect. 1, the focus is on two separate measures: (i) to describe and understand the differences between the two most recent warm climate simulations and the piControl in terms of temperature, rainfall and atmospheric/oceanic circulation changes and (ii) to compare both current simulations, as well as simulations from previous versions of the UK model (where available), with the aforementioned newly available proxy data, to assess any improvements due to model advances. A final aim, discussed only briefly here but shown in the Supplement, is to include previous CMIP5 models to address the question of whether any of the simulations produce enough rainfall to allow vegetation growth across the Sahara: the mid-Holocene Saharan greening.
3.1.1 Do the CMIP6 HadGEM3 warm climate simulations show temperature, rainfall and atmospheric/oceanic circulation differences when compared to the pre-industrial era?
Here we focus on mean differences between the HadGEM3 warm climate simulations and the corresponding piControl.
Calendar-adjusted annual and seasonal mean summer/winter 1.5 m air temperature anomalies (relative to the piControl) from both warm climate simulations are shown in Fig. 2. As an example and for comparative purposes, the same figure but with the data based on the modern calendar is shown in the Supplement (Fig. S5); this suggests that the impact of the calendar adjustments on this field and at this spatial and temporal scale is negligible, with the only obvious impact occurring over the Northern Hemisphere polar regions during JJA in both simulations but more so in the lig127k simulation (due to the larger changes in insolation resulting in a larger change to the calendar, relative to the MH). Consistent with the seasonality of the changes, the differences between either simulation are less at the annual timescale ( Fig. 2a and d) than during individual seasons, but they are still nevertheless statistically significant at the 99 % level. During JJA, the midHolocene is showing a widespread statistically significant increase in temperatures of up to 2 • C across the entire Northern Hemisphere north of 30 • N, more in some places e.g. Greenland (Fig. 2b), consistent with the increased latitudinal and seasonal distribution of insolation caused by known differences in the Earth's axial tilt (Berger and Loutre, 1991;Otto-Bliesner et al., 2017). The only places showing a reduction in temperature are West and central Africa (around 10 • N) and northern India; this, as discussed below, is likely related to increased rainfall in response to a stronger summer monsoon but could also be due to the resulting increase in cloud cover (reflecting more insolation) or a combination of the two. During DJF, only the Northern Hemisphere high latitudes (north of 60 • N) continue this warming trend, with the rest of continental Africa and Asia showing a reduction in temperature (Fig. 2c). These patterns are virtually the same in the lig127k simulation ( Fig. 2e and f), just much more pronounced (with statistically significant temperature increases during JJA of 5 • C or more); again, this is consistent with the differences in the Earth's axial tilt, which were more extreme (and therefore Northern Hemisphere summer experienced larger insolation changes) in the LIG relative to the MH (Berger and Loutre, 1991;Otto-Bliesner et al., 2017). Another clear feature of these figures, at either annual or seasonal timescales, is polar amplification, which is likely associated with changes in sea ice; as shown in the Supplement (Fig. S6), statistically significant decreases in sea ice are shown throughout the polar regions of both hemispheres in the midHolocene, relative to the piControl. The same is true for the lig127k simulation, just more pronounced (not shown).
https://doi.org/10.5194/cp-16-1429-2020 Clim. Past, 16, 1429-1450, 2020 Calendar-adjusted seasonal mean summer and winter surface daily rainfall anomalies (again relative to the piControl) from both warm climate simulations are shown in Fig. 3. In line with the aforementioned increased latitudinal and seasonal distribution of insolation, the largest differences in either simulation occur during Northern Hemisphere summer ( Fig. 3b and e). Both warm climate simulations are showing statistically significant increases in rainfall around the monsoon regions, especially over northern India and equatorial Africa, more so in the lig127k (Fig. 3e). Both simulations are also showing oceanic drying relative to the piControl, especially in the equatorial Atlantic and Pacific, again more pronounced in the lig127k (Fig. 3e). In contrast, during DJF, less of an impact is seen in either simulation relative to the piControl, with a small but statistically significant increase in rainfall in oceanic equatorial regions but drying over tropical land regions, e.g. southern Africa, central Brazil and northern Australia (Fig. 3c and f). Again, consistent with the increased insulation changes during the LIG compared to the MH, these differences are stronger in the lig127k simulation (Fig. 3f). Consistent with the temperature differences, these signals are again weaker at the annual timescale but are nevertheless statistically significant ( Fig. 3a and b).
A measure of oceanic circulation is also considered here, shown by the three HadGEM3 simulations of meridional overturning circulation (MOC) in the Atlantic basin and globally (Fig. 4a-c and d-f respectively). Although not identical, the differences are nevertheless negligible, with both warm climate simulations almost exactly reproducing the structures of weakly and strongly overturning MOC seen in the piControl; for example, the strongly overturning MOC in the upper levels of the Atlantic is marginally stronger in the midHolocene at ∼ 30-40 • N relative to the other two simulations, but the structures are very similar. This suggests that the changes to atmospheric fields such as P -E, energy fluxes and wind stress (in response to the insolation changes) are having a minimal impact on the overturning circulation, and this is consistent with other work (e.g. Guarino et al., 2020b).
A key region of interest, concerning mean precipitation changes and changes to the extent and latitudinal distribution of monsoon regions, is northern Africa, primarily because of the aforementioned inability of previous models to reproduce the increases shown by the proxy data here (e.g. Braconnot et al., 2007Braconnot et al., , 2012. Therefore, Fig. 5 reproduces the above precipitation changes but zooms into Africa and additionally includes calendar-adjusted mean JJA (the primary monsoon region) 850 mbar wind anomalies (relative to the piControl). In response to the increased Northern Hemisphere summer insolation, the West African monsoon is enhanced in both simulations, with positive (negative) rainfall anomalies across sub-Saharan Africa (eastern equatorial Atlantic) suggesting a northward displacement of the rainfall maxima. This is consistent with previous work, with a northward movement of the rain belt being associated with increased advection of moisture into the continent (Haug et al., 2001;Singarayer et al., 2017;Wang et al., 2014). This increased advection of moisture is shown by the enhanced low-level westerlies at all latitudes but especially over the regions of rainfall maxima in Fig. 5a and b, drawing in more moisture from the tropical Atlantic, which are consistent with previous work documenting the intensified monsoon circulation (Haug et al., 2001;Singarayer et al., 2017;Wang et al., 2006). This pattern is enhanced in the lig127k relative to the midHolocene, again in response to the stronger insolation changes relative to the MH, and the northward displacement of the central rain belt is more pronounced in the lig127k simulation (Fig. 5c).
The change to the intensity and the spatial pattern (e.g. latitudinal positioning and extent) of the West African monsoon is further shown in Fig. 6, which shows calendar-adjusted daily JJA rainfall by latitude over West Africa (averaged over 20 • W-15 • E, land points only) from both warm climate sim-ulations. This figure also includes MH and LIG simulations from previous generations of the same model. It should be noted that although LIG experiments have been conducted previously with both model-model and model-data comparisons being made (Lunt et al., 2013), all of these experiments were carried out using early versions of the models and were thus not included in CMIP5. Moreover, as part of their assessment Lunt et al. (2013) considered a set of four simulations, at 130, 128, 125 and 115 ka, none of which are directly comparable to the current HadGEM3 lig127k simulation. Instead, a LIG simulation has recently been undertaken using one of the original versions of the UK's physical climate model, HadCM3, and so this is used here to compare with the lig127k simulation.  Beginning with the recent palaeoclimate HadGEM3 simulations, in line with the changes in insolation both warm climate simulations are showing higher absolute values at their peak (between ∼ 7.5-10 • N) than the piControl (Fig. 6a). Concerning anomalies, both simulations are showing a large increase in rainfall relative to the piControl (of ∼ 2 and 6 mm d −1 for the midHolocene and lig127k respectively) over the monsoon region between ∼ 10-12 • N (Fig. 6b). Relative to previous versions of the same model, the previous generation (HadGEM2-ES) is slightly drier then HadGEM3 over this region for its PI simulation and slightly wetter for its MH simulation; conversely, the version before that (HadCM3) is consistently wetter than HadGEM3 for all of its simulations (Fig. 6a). There also appears to be a northward displacement in the oldest version, with the largest difference between the simulations and their corresponding PI simulations occurring at ∼ 11 • N in the two most recent versions of the model, whereas in HadCM3 this appears to be shifted northwards to ∼ 12.5 • N (Fig. 6b). This northward displacement in certain models is consistent with previous work (e.g. Haug et al., 2001;Otto-Bliesner et al., 2017;Singarayer et al., 2017;Wang et al., 2014). In terms of the latitudinal ex-tent, the results suggest that all warm climate simulations (regardless of generation) are producing a wider Northern Hemisphere monsoon region (i.e. a greater northerly extent) relative to each version's PI, with rainfall falling to near zero at ∼ 18 • N in the PI simulations but extending to 20 • N (and above, in terms of the LIG simulations) in both warm climate simulations (Fig. 6a). This is again consistent with previous work, where various theories are compared as to the reasons behind the latitudinal changes in the rain belt's position, one which is a symmetric expansion during boreal summer (Singarayer and Burrough, 2015;Singarayer et al., 2017).

Simulation comparison and model-data
comparison: do the CMIP6 HadGEM3 simulations reproduce the "reconstructed" climate based on available proxy data, and has there been any noticeable improvement relative to previous versions of the same model?
Although the above analysis is useful and confirms that the most recent warm climate simulations are responding consistently to the increased latitudinal and seasonal distribution of insolation, it does not give any information on which (if any) of the simulations is most accurate or which version of the model is better at reproducing proxy-observed conditions. Therefore, here we bring in a comparison with newly available proxy data, comparing these to all versions of the model, focusing on surface air temperature, SST and rainfall (drawing direct comparisons, as well as using the root mean square error (RMSE; without a cut-off threshold), between both proxy vs. simulated data and HadGEM3 vs. previous versions. The aim of this is to firstly see how well the current warm climate simulations are reproducing the "observed" approximate magnitudes and patterns of change and secondly to assess any possible improvement from previous versions of the same model. It is worth noting that both simulated and proxy anomalies contain a high level of uncertainty (as measured by the standard deviation), and in many locations the uncertainty is larger than the anomalies themselves (not shown). The following results should therefore be considered with this caveat in mind. Before the spatial patterns are compared, it is useful to assess global means from the three HadGEM3 simulations (focusing on 1.5 m air temperature, calculated both annually and during Northern Hemisphere and Southern Hemisphere summer, JJA and DJF respectively). Table 3 shows these global means, where it is clear that when annual means are considered, the midHolocene simulation is actually cooler than the piControl. This discrepancy with the palaeodata, which at many locations suggests a warmer MH relative to PI, is consistent with previous work using other models (e.g. Lui et al., 2014). The lig127k simulation is, however, warmer than the piControl simulation. Given the seasonal distribution of insolation in these two simulations, it is expected that the largest difference to the piControl occurs during boreal sum-mer, and indeed it does; during JJA, there is a warmer lig127k and a slightly warmer midHolocene (1.69 and 0.07 • C respectively). The opposite is true during DJF.
Concerning the spatial patterns during the MH, Fig. 7 shows simulated surface MAT anomalies from the current midHolocene simulation and those from two previous versions of the same model, versus MH proxy anomalies from Bartlein et al. (2011). Note that, here, statistical significance of the simulated anomalies has not been shown, firstly because the aim here is to assess all differences regardless of significance and secondly because a measure of statistical significance (for HadGEM3) has already been presented in Fig. 2; statistical significance from the other versions of the same model is virtually identical (not shown). Globally, all three models are showing a reasonable level of agreement with the proxy data, with RMSE = 2.45, 2.42 and 2.37 • C for HadGEM3, HadGEM2-ES and HadCM3 respectively (Table 4a). Using this metric, the oldest version of the model (HadCM3) is doing marginally better than the other models, relative to the proxy data. Spatially, however, there are differences to the proxy data and between model generations. Although all three generations appear to be able to reproduce the sign of temperature change for many locations, with both simulated and proxy anomalies suggesting increases in temperature north of 30 • N and especially over northern Europe, the Arctic Circle increases are not as homogenous in HadCM3 (Fig. 7d) and indeed this model shows cooling over the Greenland Sea. Although this cannot be corroborated by the proxy data, due to a lack of coverage, neither of the latergeneration models show this to the same extent ( Fig. 7b and  c). Discrepancies with the proxy data also occur in all three simulations across the Mediterranean region, where all three simulations suggest a small warming but the proxy data indicate cooling (Fig. 7). Moreover, regarding the magnitude of change, all three simulations are underestimating the temperature increase across most of the Northern Hemisphere, with for example increases of up to 1 • C across Europe from the simulations compared to 3-4 • C increases from the proxy data. In the simulations, temperature anomalies only reach these magnitudes in the Northern Hemisphere polar region (i.e. north of 70 • N), not elsewhere. Further equatorward, all three simulations are identifying a slight cooling over the West African monsoon region (as discussed above), but the accuracy of this relative to the proxy data is difficult to ascertain given the lack of coverage across Africa and, where there are data locations, a highly variable sign of change (Fig. 7a).
A similar conclusion can be drawn from MAP, shown in Fig. 8, where all three simulations are correctly reproducing the sign of change across most of the Northern Hemisphere, although more so in the two most recent generations of the model (HadGEM3 and HadGEM2-ES) but in some places not the magnitude. Over the eastern US, for example, rainfall decreases of up to 200 mm yr −1 are being shown by the simulations (Fig. 8b-d), whereas the proxy data suggest a much stronger drying of up to 400 mm yr −1 (Fig. 8a).
Clim. Past, 16, 1429-1450, 2020 https://doi.org/10.5194/cp-16-1429-2020 Table 3. Global 1.5 m air temperature means and anomalies from HadGEM3 piControl, midHolocene and lig127k production runs.  Elsewhere, such as over Europe and Northern Hemisphere Africa, the simulations more accurately reproduce the magnitude of rainfall increases; both simulated and proxy anomalies show increases of 200-400 mm yr −1 . Globally, Table 4a suggests that the most recent generation model, HadGEM3, is doing better than the others, relative to the proxy data (RMSE = 285.9, 293.5 and 304.7 mm yr −1 for HadGEM3, HadGEM2-ES and HadCM3 respectively). In terms of how the spatial patterns change according to model version, during the MH the two most recent simulations generally agree (RMSE = 90.8 mm yr −1 , Table 4a) and show similar spatial patterns; focusing again on the African monsoon region (for the aforementioned reasons), both simulations show a drier equatorial Atlantic during the MH and then increased rainfall around 10 • N ( Fig. 8b and c for HadGEM3 and HadGEM2-ES respectively). Both simulations also suggest that the increases in rainfall extend longitudinally across the entire African continent, with the largest changes occurring not only across western and central regions but also further east. In contrast, globally HadCM3 agrees less with HadGEM3 (RMSE = 121.8 mm yr −1 ; Table 4a) and only suggests a wetter MH over West Africa, not further east. HadCM3 and, indeed, HadGEM2-ES also differ from the most recent simulation over the equatorial Atlantic, showing a region of drying that is not only stronger in magnitude but also larger in terms of spatial extent; whilst still present in HadGEM3, this feature that is much weaker (Fig. 8b-d). Table 4. RMSE values (for various metrics) between simulations from different generations of the same model versus proxy data and versus each other: (a) MAT and MAP from the MH simulations versus proxy data from Bartlein et al. (2011); (b) SST from the LIG simulations versus proxy data from Capron et al. (2017) and Hoffman et al. (2017). Regarding the proxy data comparisons in (b), for JAS the simulated SST anomalies are compared to Northern Hemisphere summer reconstructions, and for JFM the simulated SST anomalies are compared to Southern Hemisphere summer reconstructions.
(a)  Concerning the spatial patterns during the LIG, Fig. 9 shows simulated mean SST anomalies (calculated both annually and during JAS/JFM) from the current lig127k simulation and that from the oldest version of the same model, versus LIG proxy anomalies from two sources,  and Hoffman et al. (2017). No LIG simulation using HadGEM2-ES is currently available. When annual anomalies are considered, there is relatively good agreement globally between HadGEM3 and the proxy data where RMSE = 3.03 and 2.42 • C for the Capron et al. (2017) and Hoffman et al. (2017) data respectively (Table 4b) (Table 4b). Similarly varying results also occur when JAS and JFM anomalies are considered, with HadGEM3 comparing slightly better or worse than HadCM3 according to season and proxy dataset used; all of the values, however, show relatively good agreement, with no simulation exceeding RMSE = 4.5 • C in any season or with any dataset (Table 4b). Spatially, HadGEM3 is showing a general agreement between simulated and proxy annual and JAS anomalies in the Northern Hemisphere (and in particular in the North Atlantic), with both suggesting increased temperatures during the LIG of up to 5 • C ( Fig. 9a and b). HadCM3 is not capturing these magnitudes at the annual timescale ( Fig. 9d) and, despite showing greater warming during JAS, is still lower than HadGEM3; this is more in agreement with the proxy data at higher latitudes (e.g. the western Norwegian Sea at ∼ 70 • N) but less so further south (Fig. 9e). This might suggest that, in this region, HadGEM3 is actually overestimating the degree of warming. Nevertheless, in both versions of the model there are discrepancies not just in the magnitude but also in the sign of change, such as in the eastern Norwegian Sea or the Labrador Sea, where reconstructions suggest a cooler LIG but both versions show a consistent warming ( Fig. 9b and e). This is, however, consistent with previous work, and earlier climate models have also failed to capture this cooling (Capron et al., 2014;Stone et al., 2016). In Southern Hemisphere summer, JFM, both versions agree on a general (but weak) cooling in the South Atlantic relative to pre-industrial and a weak warming in the Southern Ocean ( Fig. 9c and f). In contrast certain proxy locations (such as off the coast of southern Africa) suggest a much warmer LIG than pre-industrial, which is opposite to the simulated cooling in the same region ( Fig. 9c and f). Further south, the majority of simulated anomalies reproduce the observed sign of change but not the magnitude; here, the simulations suggest temperature increases of up to 1 • C, whereas both proxy datasets suggest SST increases of 2-3 • C depending on location ( Fig. 9c and f).

Metric
For rainfall changes during the LIG, Fig. 10 shows simulated annual mean surface rainfall anomalies from the current lig127k simulation and that from the oldest version of the same model versus LIG proxy anomalies from Scusscolini et al. (2019). Note that the simulated anomalies shown here are annual anomalies, as opposed to daily anomalies in Fig. 3, to be consistent with the proxy data. Note also that, for these proxy reconstructions, a semiquantitative scale is used by Scusscolini et al. (2019) rather than actual anomalies and is therefore reproduced here; this ranges from a unitless −2 to 2, corresponding to the following: much wetter LIG anomaly, wetter, no noticeable anomaly, drier and much drier LIG anomaly. It is for this reason that RMSE values have not been calculated here. As was suggested from the MH simulations (Fig. 8), both versions of the model are showing similar patterns of rainfall changes, along the same lines as those seen during the MH but again enhanced (Fig. 10). Both versions are showing enhanced rainfall across the Northern Hemisphere equatorial zone and in particular the monsoon regions during the LIG, often exceeding 500 mm yr −1 in some places. In the Northern Hemisphere, both versions of the model are generally in agreement with the proxy data, with most proxy locations showing wetter or much wetter conditions. There are, however, some discrepancies elsewhere, such as the regions of tropical drying over e.g. Brazil and southern Africa in the simulations being in stark contrast to the wetter conditions suggested by the proxy data (Fig. 10). Concerning the differences in the spatial patterns between the model versions, although both generations qualitatively show similar patterns, there are subtle differences. Again focusing on the African monsoon region, HadGEM3 shows greatly increased rainfall across all of sub-Saharan Africa, centred on 10 • N but extending from ∼ 5 to almost 20 • N and longitudinally across the entire African continent (Fig. 10a). In contrast -similar to the MH results -HadCM3 the largest rainfall increases are less apparent over East Africa (Fig. 10b).
It would therefore be reasonable to say that, for both MH and LIG simulations, whilst the most recent version of the model is capturing the sign and magnitude of change relative to proxy reconstructions (for either temperature or rainfall) in some locations, this is highly geographically dependent, and there are locations where the current simulation fails to capture even the sign of change. Compared to previous versions of the same model, any improvement also appears to be highly variable according to metric, proxy reconstruction used for comparison and geographical location, with for example HadGEM3 showing some improvement relative to previous versions for rainfall during the MH but not surface air temperature. The accuracy of the most recent model and, indeed, previous generations also appears to be seasonally dependent, with the most recent lig127k simulation correctly reproducing both the sign and magnitude of change during Northern Hemisphere summer in some locations but not during Southern Hemisphere summer or annually. It would also appear that, for both the MH and LIG simulations, whilst there is less difference between the most recent two configurations of the model, they are nevertheless quite different to the oldest version. For global mean annual rainfall during the MH, Table 4a shows a linear progression of improvement across the three versions of the model, as well as more agreement between the two most recent model generations. This is also true when just the region of rainfall maxima in northern Africa is considered, with both of the two most recent generations, especially HadGEM2-ES, being marginally closer to the proxy data than HadCM3 (RMSE = 463.7, 424.5 and 468.4 mm yr −1 for HadGEM3, HadGEM2-ES and HadCM3 respectively). In all simulations, although spatial patterns of rainfall are similar, there are discrepancies especially over the African monsoon region; the oldest version of the model, for example, only shows rainfall increases over West Africa, whereas the two most recent versions imply Africa-wide rainfall increases at this latitude. If a comparison is made with satellite-derived rainfall data for the modern West African monsoon (not shown), results suggest that rainfall maxima are not just limited to West Africa but also occur over the central region and East Africa, more consistent with the two most recent versions of the model. One reason for HadCM3 not identifying this longitudinal extent might be connected to the very coarse spatial resolution of this model, relative to the others, impacting any topographically induced rainfall, especially over the East African Highlands.

Saharan greening
Finally, a brief discussion is given on the Saharan greening question. Given that the warm climate simulations and, indeed, the piControl did not use interactive vegetation, it is not possible to directly test if the model is reproducing the Saharan greening that proxy data suggest. For example, Jolly et al. (1998a, b) analysed MH pollen assemblages across northern Africa and suggested that some areas south of 23 • N (characterized by desert today) were grassland and xerophytic woodland/scrubland during the MH (Joussaume et al., 1999). To circumvent this caveat, Joussaume et al. (1999) developed a method for indirectly assessing Saharan greening, based on the annual mean rainfall anomaly relative to a given model's modern simulation. Using the water-balance mod-ule from the BIOME3 equilibrium vegetation model (Haxeltine and Prentice, 1996), Joussaume et al. (1999) calculated the increase in mean annual rainfall, zonally averaged over 20 • W-30 • E, required to support grassland at each latitude from 0 to 30 • N, compared to the modern rainfall at that latitude. This was then used to create maximum and minimum estimates, within which bounds the model's annual mean rainfall anomaly must lie to suggest enough of an increase to support grassland (Joussaume et al., 1999).
Therefore, an adapted version of Fig. 3a in Joussaume et al. (1999) is shown in the Supplement (Fig. S7), which shows mean annual rainfall anomalies by latitude (to be consistent with the proxy data-based threshold) from not only the current midHolocene simulation but also all previous MH sim-ulations from CMIP5. Concerning the threshold required to support grassland, it is clear that although the current mid-Holocene simulation is just within the required bounds at lower latitudes (e.g. up to 17 • N), north of this the current midHolocene simulation is not meeting the required threshold, and neither are any of the other CMIP5 models after ∼ 18 • N (Fig. S7). It would therefore appear that the Saharan greening problem has yet to be resolved and may well only be reproduced once interactive vegetation and, indeed, interactive dust is included in the simulation; given the current lack of an interactive vegetation/dust model, vegetation-related climate feedbacks (e.g. albedo) on the system are therefore currently missing.

Summary and conclusions
This study has conducted and assessed the mid-Holocene and Last Interglacial simulations using the latest version of the UK's physical climate model, HadGEM3-GC3.1, comparing the results firstly with the model's pre-industrial simulation and secondly with previous versions the same model, against available proxy data. Therefore this study is novel, being the first time this version of the UK model has been used to conduct any palaeoclimate simulations and therefore being the first time we are in a position to include them as part of the UK's contribution to CMIP6/PMIP4. Both the midHolocene and lig127k simulations followed the experimental design defined in Otto-Bliesner et al. (2017) and the CMIP6/PMIP4 protocol. Both simulations were run for a 350-400 year spinup phase, during which atmospheric and oceanic equilibrium were assessed, and once an acceptable level of equilibrium had been reached, the production runs were started.
Globally, whilst both the recent simulations are mostly capturing the sign and, in some places, magnitude of change relative to the PI, similar to previous model simulations this is geographically and seasonally dependent. It should be noted that the proxy data (against which the simulations are evaluated) also contain a high level of uncertainty in both space and time (in terms of both seasons and geological era), and so it is encouraging that the simulations are generally reproducing the large-scale sign of change, if not at an individual location. Compared to previous versions of the same model, this appears to vary according to metric, proxy reconstruction used for comparison and geographical location. In some instances, such as annual mean rainfall in the MH, there is a clear and linear improvement (relative to proxy data) through the model generations when rainfall is considered globally; likewise there is more accuracy in the two recent versions (again relative to proxy data) than in the oldest version when only the West African monsoon region is considered (see Table 4a and the RMSE values discussed in the concluding paragraph of Sect. 3.1.2).
Likewise, when zooming into Africa, the behaviour of the West African monsoon in both HadGEM3 warm climate sim-ulations is consistent with current understanding (e.g. Haug et al., 2001;Singarayer et al., 2017;Wang et al., 2014), which suggests a wetter (and possibly latitudinally wider and/or northwardly displaced) monsoon during the MH and LIG, relative to the PI. Regarding model development in simulating the West African monsoon, there are differences between model generations; the oldest version of the model, for example, limits the rainfall increases to over sub-Saharan West Africa only, whereas the two most recent versions imply Africa-wide (i.e. across all longitudes) rainfall increases at this same latitude. Lastly, regarding the well-documented Saharan greening during the MH, results here suggest that the most recent version of the UK's physical climate model is consistent with all other previous models to date.
In conclusion, the results suggest that the most recent version of the UK's physical climate model is reproducing climate conditions consistent with the known changes to insolation during these two warm periods. Even though the lig127k simulation did not contain any influx of Northern Hemisphere meltwater, shown by previous work to be a critical forcing in LIG simulations (causing regions of both warming and cooling, according to location), it is still nevertheless showing increased temperatures in certain regions. Another limitation of using this particular version of the model is that certain processes, such as vegetation and atmospheric chemistry, were prescribed, rather than allowed to be dynamically evolving. Moreover, for practical reasons some of the boundary conditions were left as PI, such as vegetation, anthropogenic deforestation and aerosols; a better simulation might be achieved if these were prescribed for the MH and LIG. Processes and boundary conditions such as these may be of critical importance regarding climate sensitivity during the MH and the LIG, and therefore ongoing work is underway to repeat both of these experiments using the most recent version of the UK's Earth Systems model, UKESM1. Here, although the atmospheric core is HadGEM3, UKESM1 contains many other earth system components (e.g. dynamic vegetation) and therefore in theory should be able to better reproduce these palaeoclimate states.
Data availability. The HadGEM3 piControl simulation and the HadGEM2-ES simulations used here are currently available from the Earth System Grid Federation (ESGF) WCRP Coupled Model Intercomparison Project (Phase 6), located at https://esgf-node.llnl. gov/projects/cmip6/ (last access: 1 July 2020). In contrast, the HadGEM3 midHolocene and lig127k simulations used here are not yet publicly available but will be uploaded to the ESGF as soon as practicable. In the meantime, however, these simulations are available to the public by directly contacting the lead author. In addition, the climatologies used here (from both HadGEM3 and HadCM3) are available in the Supplement. For the MH reconstructions, the data can be found within the Supplementary Online Material of Bartlein et al. (2011), at https://doi.org/10.1007/s00382-010-0904-1. For the LIG temperature reconstructions, the data can be found within the Supplementary Online Material of Capron et al. (2017)
Author contributions. CJRW conducted the midHolocene simulation, carried out the analysis, produced the figures, wrote the majority of the manuscript, and led the paper. MVG conducted and provided the lig127k simulation and contributed to some of the analysis and writing. EC provided the proxy data and contributed to some of the writing. IMV provided the HadCM3 LIG simulation. PJV provided the HadCM3 MH simulation. JSS contributed to some of the writing. All authors proofread the paper and provided comments.
Competing interests. The authors declare that they have no conflict of interest.

Special issue statement.
This article is part of the special issue "Paleoclimate Modelling Intercomparison Project phase 4 (PMIP4) (CP/GMD inter-journal SI)". It is not associated with a conference.