Modelling ice sheet evolution and atmospheric CO 2 during the Late Pliocene

In order to investigate the relation between ice sheets and climate in a warmer-than-present world, recent research has focussed on the Late Pliocene, 3.6 to 2.58 million years ago. It is the most recent period in Earth history when such a climate state existed for a significant duration of time. Marine Isotope Stage (MIS) M2 (~3.3 Myr ago) is a strong positive 10 excursion in benthic oxygen records in the middle of the otherwise warm and relatively stable Late Pliocene. However, the relative contributions to the benthic d18O signal from deep-ocean cooling and growing ice sheets are still uncertain. Here, we present results from simulations of the late Pliocene with a hybrid ice-sheet—climate model, showing a reconstruction of ice sheet geometry, sea-level and atmospheric CO2. Initial experiments simulating the last four glacial cycles indicate that this model yields results which are in good agreement with proxy records in terms of global mean sea level, benthic oxygen isotope 15 abundance, ice core-derived surface temperature and atmospheric CO2 concentration. For the Late Pliocene, our results show an atmospheric CO2 concentration during MIS M2 of 233 – 249 ppmv, and a drop in global mean sea level of 10 to 25 m. Uncertainties are larger during the warmer periods leading up to and following MIS M2. CO2 concentrations during the warm intervals in the Pliocene, with sea-level high stands of 8 – 14 m above present-day, varied between 320 and 400 ppmv, lower than indicated by some proxy records but in line with earlier model reconstructions. 20


5
1) Both reviewers pointed out that the introduction is insufficient. Please do extend the introduction to cover recent publications such as Prescott et al., (2018), Tan et al., (2017, EPSL), etc.
We agree with the editor and the reviewers that the introduction did not give an adequate description of the current state of research into the Pliocene in general, but focused too much on just MIS M2. We have expanded the introduction with a broader 10 overview of the Pliocene, both from a model-and a data-based perspective. We have also added a few lines describing how our own work fits into this broader picture.

2) Both reviewers mentioned the lack of description of model configuration, and they are both confused by equation 1 and 2 regarding how it is implemented and how the scaling factor is determined in 2. Those are valid and important points 15 to address. From what I understand, it looks like, in both equation 1 and 2, (t+0.1ky) should be a subscript instead of an actual
term. If so, please correct.
We agree with the editor and the reviewers that the manuscript lacked a proper conceptual explanation of the inverse method, relying too much on the reader's supposed familiarity with our earlier work. We revised the section explaining this method, 20 adding a brief conceptual explanation of what exactly this method does, and how it is quantified by the two equations. The (t+0.1ky) in both equations is a function argument (e.g. d18O_obs(t+0.1ky) is "d18O 100 years after the current model time").
We believe that the revised text resolves the confusion about this term.

3) Both reviewers pointed out that a major caveat of this study is the lack of warm Pliocene-like climates in the 25
climate matrix. Given that vegetation, ocean circulation, and ice sheet all respond nonlinearly to insolation and CO2 forcing, a good discussion on future work should be included beyond a few words. Note that recent studies have reported the nonlinear Earth System sensitivity to forcings (e.g., Friedrich et al., 2016, Nature).
We have added a few lines to Sect. 2.3 (Methodology, Matrix Method) justifying the use of the current climate matrix 30 (containing the 9 HadCM3 snapshots by Dolan et al., 2015), even with its relatively sparse information for warmer-thanpresent climates, based on the fact that earlier work -Berends et al., 2018 -with a matrix of only 2 states still produced good results. We have also added a few lines to the Discussion section describing the possible benefits of expanding the climate matrix with additional GCM snapshots, with higher CO2, different ice sheet geometries or different orbital configurations, in future work, which will likely reduce uncertainties. 4) Finally, reviewer 2 has pretty good comments on discussing model limitation (especially the resolution and parameter sensitivity, whether bias correction is used etc), and many detailed comments on improving the clarity of the 5 manuscript. Please do follow them when revising your manuscript.
We agree with the editor and with anonymous referee #2 that the manuscript was overly brief in describing the ice-sheet model, again relying too much on references to earlier work. We have added a few lines to Sect. 2.2 (Methodology, ice-sheet model) explaining the way sub-shelf melt, ice velocities and the resulting grounding line migration are simulated by the model. 10 In addition, we took all minor comments into account and we thereby hope that the manuscript improved considerably in clarity.

Introduction
One of the major long-term challenges posed by anthropogenic climate change is sea-level rise due to the large-scale retreat of the Greenland and Antarctic ice-sheets (e.g. Church et al., 2013). However, projecting the magnitude and especially the rate of such a retreat is limited by our understanding of the interactions between global climate and the cryosphere on centennial to multi-millennial time-scales, especially in a warmer-than-present climate. In order to gain more insight into the behaviour 25 of the Earth system in such a warmer world, numerous recent studies (Bachem et al., 2017;Bragg et al., 2012;Burke et al., 2018;de Boer et al., 2015de Boer et al., , 2017Dolan et al., 2011Dolan et al., , 2015de Schepper et al., 2014;Dowsett et al., 2016;Dwyer and Chandler, 2009;Haywood et al., 2010Haywood et al., , 2011Haywood et al., , 2013aHaywood et al., , 2013bHill, 2015;Lunt et al., 2009Lunt et al., , 2010Lunt et al., , 2012McKay et al., 2012;Miller et al., 2012;Naish et al., 2009;Naish and Wilson, 2009;Prescott et al., 2014;Sohl et al., 2009;Swann et al., 2018;Tan et al., 2017) have focussed on the Late Pliocene, 3.6 to 2.58 million years ago, since it is the most recent period in Earth history with 30 average global temperatures staying warmer than present-day for a significant length of time. Many of these studies, particularly those carried out as part of the Pliocene Modelling Intercomparison Project (PlioMIP; Haywood et al., 2010Haywood et al., , 2011, focus on the Mid-Pliocene Warm Period (MPWP), 3.29 -2.97 Myr ago. This time slab represents a relatively stable period in Earth's climate history with warmer-than-present global temperatures, lasting longer than any of the Quaternary interglacials.
Since it occurred much more recently than other warm periods, such as the late Eocene, the difference in continental configuration with the present is relatively small. Modelling studies generally show a global mean annual surface temperature 5 that was more than 3 °C warmer than the present day (Bragg et al., 2012;Burke et al., 2018;Haywood et al., 2013aHaywood et al., , 2013bLunt et al., 2010Lunt et al., , 2012. Sea surface temperatures were warmer as well, with a strongly reduced meridional gradient leading to a slight warming in the tropics and a strong warming in the polar regions (Bachem et al., 2017;Dowsett et al., 2009;Hill, 2015). Sea-level estimates range between 10 and 30 meters above present-day values (de Boer et al., 2017;Dolan et al., 2011;Dowsett et al., 2016;Dwyer and Chandler, 2009;Miller et al., 2011Miller et al., , 2012, caused by the almost complete 10 deglaciation of Greenland and West Antarctica. Estimates of atmospheric CO2 concentrations during this period vary between 250 and 450 ppmv (Badger et al., 2013;Bartoli et al, 2011;Martínez-Botí et al., 2015;Seki et al., 2010;Stap et al., 2016;Zhang et al., 2013).
However, global climate during both the Pliocene in general and the MPWP in particular showed significant variability. In 15 order to describe a more general "warm Earth" state, the MPWP is therefore treated in PlioMIP as an average of several different warm peaks that may or may not have occurred synchronously around the globe (Dowsett et al., 2016;. Both directly before the beginning of, and relatively shortly after the end of the MPWP, proxy records indicate that the Earth experienced periods that were apparently colder than present-day, though neither as cold nor as long in duration as typical Late Pleistocene glaciations. 20 Deleted: However, even before the onset of the Pleistocene glacial cycles, the Late Pliocene exhibited climatic variability with periods that were apparently colder than present-day, though not as cold or as long in duration as typical Late Pleistocene glaciations. Of particular interest is the cold excursion that occurred 3.3 Myr ago, during Marine Isotope Stage (MIS) M2, shown in Fig.   1. During the 40,000 years following the warm peak of MIS MG1 (3.315 Myr ago), benthic d 18 O (LR04; Lisiecki and Raymo, 2005) increased by about 0.5 ‰ and subsequently recovered, suggesting either a global cooling, an increase in ice volume on the Northern and/or Southern Hemispheres, or both. Sea-level records (Dwyer and Chandler, 2009;65 ± 25 m;Naish and 10 Wilson, 2009;38 m;Miller et al., 2011;34 ± 10 m;Miller et al, 2012;10 ± 10 m), as well as evidence of glacial till (Gao et al., 2012;de Schepper et al., 2014) and ice-rafted debris (de Schepper et al., 2014;Bachem et al., 2017;Smith et al., 2018) support the hypothesis of at least a partial Northern Hemisphere glaciation. De Schepper et al. (2014) and Dolan et al. (2015) provide detailed overviews of available evidence for glaciation during the Pliocene in general and MIS M2 in particular.
However, because most geological fingerprints that would have been left by Pliocene ice-sheets and glaciers would have been 15 overridden or eroded by waxing and waning of the much larger Pleistocene ice-sheets, evidence is limited to mostly the presence or absence of ice, providing only sparse information on geographical location and little to none on the volumes of these ice sheets. Dolan et al. (2015) studied MIS M2 from a climatological rather than a glaciological point of view. Using the HadCM3 general circulation model (GCM; Gordon et al., 2000;Valdes et al., 2017; see Sect. 2.1), they performed an ensemble of simulations 5 of global climate during MIS M2 for different postulated and fixed ice-sheet configurations and atmospheric CO2 concentrations. By comparing the results from these different equilibrium simulations to a wide range of available climatological proxies, they attempted to constrain MIS M2 ice volume estimates through the impact such ice-sheets would have on the climate. However, the available proxy records from this era have relatively large uncertainties, and where information is available, it remains difficult to use this to draw sound conclusions about Northern Hemisphere ice sheet extent. 10 They therefore concluded that available evidence from climatological proxies was unable to constrain ice volume any further.
In this study, we adopt a different approach, combining both the glaciological and climatological viewpoints. In a recent study, Berends et al. (2018) presented and evaluated a hybrid GCM -ice-sheet model, where they proposed a matrix method of model coupling (see Sect. 2.3) to force the ANICE ice-sheet model (Bintanja and Van de Wal, 2008;de Boer et al., 2013de Boer et al., , 2014de Boer et al., , 201715 see Sect. 2.2) with output from the HadCM3 GCM. By using output from a simulation with HadCM3 of the last glacial maximum (Singarayer and Valdes, 2010) they were able to simultaneously simulate the evolution of the ice sheets on North America, Eurasia, Greenland and Antarctica throughout the last glacial cycle and their contributions to global mean sea level and benthic d 18 O. They showed that their results matched proxy-based reconstructions for ice-sheet volume, ice surface temperature, sea-water d 18 O, deep-water temperature and benthic d 18 O. This matrix method is here applied to the late Pliocene 20 by using HadCM3 results from Dolan et al. (2015). The hybrid GCM -ice-sheet model presented by Berends et al. (2018) is computationally efficient enough to make large ensemble simulations feasible, opening up the opportunity to study the effects of changes in paleotopography, pCO2 and other climatological conditions, as well as the sensitivity to ice-sheet model parameters.

25
However, a high-resolution, time-continuous pCO2 record needed to force the model is not available for this period. We resolve this by using the inverse modelling approach that was also used by van de Wal (2008), de Boer et al. (2013; and Stap et al. (2016). In this approach we compare modelled benthic d 18 O to the LR04 stack (Lisiecki and Raymo, 2005) and calculate pCO2 based on the difference between the two (see Sect. 2.4). This makes our model set-up conceptually very similar to the approach by Stap et al. (2016), who also used the LR04 stack of d 18 O to force a coupled ice-sheetclimate model and 30 thus produce a pCO2 reconstruction. However, they used a relatively simple zonally averaged energy-balance climate model coupled to a 1-D ice model, whereas we use GCM output to drive 3-D ice-sheet models, making our approach more detailed in terms of the behaviour of global climate, the ice-sheets and the interactions between the two, at the expense of computational  Stap et al. (2016) and the one presented here can be viewed as proxy-based reconstructions, based on the concept that benthic d 18 O is a proxy for changes in ocean temperature and land ice volume. All three studies use a climate model describing the known relations between pCO2 and temperature and ice volume, in order to determine how pCO2 must have evolved in the past in order to produce the observed benthic d 18 O signal. 5

Climate model
HadCM3 is a coupled atmosphere-ocean general circulation model (Gordon et al., 2000;Valdes et al., 2017). It accurately reproduces the heat budget of the present-day climate (Gordon et al., 2000) and has been used for future climate projections in the IPCC AR4 (e.g. Solomon et al., 2007), and paleoclimate reconstructions such as PMIP2 (Braconnot et al., 2007) and 10 PlioMIP (Haywood and Valdes, 2003;Dolan et al., 2011Dolan et al., , 2015Haywood et al., 2013). The atmosphere module of HadCM3 has a resolution of 2.5 ° latitude by 3.75 ° longitude. The ocean is modelled at a horizontal resolution of 1.25 ° by 1.25 °, with 20 vertical layers. In the model set-up by Berends et al. (2018), the climate matrix consists of two GCM snapshots, of respectively the pre-industrial period (PI) and the last glacial maximum (LGM), produced by Singarayer and Valdes (2010) with HadCM3. Here, we include several additional snapshots focussing specifically on the Pliocene. 15

Ice-sheet model
To simulate the evolution of the ice sheets we use ANICE, a coupled 3-D ice-sheet-shelf model (Bintanja and Van de Wal, 2008;de Boer et al., 2013de Boer et al., , 2014de Boer et al., , 2017. It combines the shallow shelf approximation (SSA; Morland, 1987) for floating ice shelves with the shallow ice approximation (SIA; Morland and Johnson, 1980) for grounded ice to solve the ice flow. A Mohr-Coulomb plastic law for basal sliding is included, with basal stresses included in the SSA equations. The basal stress is 20 calculated as a function of a till stress, which in turn depends on the local bedrock elevation (Winkelmann et al., 2011;de Boer et al., 2013). For grounded ice, the velocities resulting from both approximations are summed, resulting in a smooth transition zone between slow-flowing land ice and fast-flowing floating ice. This approach allows the grounding line to respond to changes in shelf buttressing, resulting in proper glacial-interglacial differences in Antarctic ice volume (de Boer et al., 2013;Berends et al., 2018) by the advance of grounding lines in the Filcher-Ronne and Ross basins toward the continental shelf. The 25 surface mass balance is parameterised using an insolation-temperature scheme using monthly temperatures and precipitation, refreezing of water and a correction for orographic forcing of precipitation; a more detailed model description is provided by de Boer et al. (2013) and references therein. The horizontal resolution of ANICE for this application is 20 km for Greenland and 40 km for the other three regions (North America, Eurasia and Antarctica). The highly parameterized climate forcing and resulting computational efficiency of ANICE allow for transient simulations of multiple glacial cycles to be carried out within 10 -100 hours on single-core systems, making ensemble simulations feasible. Melt underneath the ice shelves is calculated using a linear relation to ocean temperature change (Pollard and DeConto, 2009;Martin et al., 2011), a parameterisation of sub-shelf cavity circulation based on the shortest linear distance to the open ocean (Pollard and DeConto, 2009), and the glacial-interglacial variance parameterization by Pollard and DeConto (2009). A more detailed explanation is provided by de Boer et al. (2013), who tuned this approach to produce realistic present-day Antarctic shelves and grounding lines. A simple 5 threshold thickness of 200 m is used to describe ice calving, whereby any shelf ice below this thickness is removed.

Matrix method
Using the definition by Pollard (2010), a climate matrix is a collection of pre-calculated output data from several steady-state GCM simulations, called "snapshots", that differ from each other in one or more key parameters, such as prescribed atmospheric pCO2, orbital configuration or ice-sheet configuration, each creating a separate dimension of the matrix. When 10 performing a simulation with an ice-sheet model, at every point in time during the simulation the prescribed climate forcing is determined by combining the climate states constituting the matrix according to the position of the model state within the matrix. This constitutes a middle ground between methods of offline forcing, such as a glacial index method, and fully coupled ice-sheet -climate models. When the ice-sheet model is in a state corresponding to one of the GCM snapshots, the climate from this snapshot will be prescribed, containing the effects of the altitude-temperature and albedo-temperature feedbacks of 15 the ice-sheets, the effect of ice sheet geometry on large-scale atmospheric circulation and precipitation, and possibly also the effects of changed freshwater fluxed on ocean circulation, depending on the GCM. When the ice-sheet model is in a state lying in between different GCM snapshots, the prescribed climate is an interpolation of these snapshots. This means that nonlinearities in different feedback processes, such as the effects of ice-sheet geometry on atmospheric circulation and precipitation, are difficult to properly account for. 20 In this study, we use the model set-up developed by Berends et al. (2018), who created a matrix with the HadCM3 climate states of the pre-industrial and the last glacial maximum from Singarayer and Valdes (2010) and used it to force the ANICE ice-sheet model. In this set-up, temperature fields from the two climate states are combined based on a prescribed value for pCO2 and on the internally modelled ice-sheets, with the feedback of the ice sheets on the climate based via the effect on 25 absorbed insolation through changes in surface albedo. This interpolation is carried out separately for all four ice sheets. The altitude-temperature feedback is parameterised by a constant lapse-rate derived from the GCM snapshots. Precipitation fields are combined based on changes in surface elevation, reflecting the orographic forcing of precipitation and resulting plateau desert caused by the presence of a large ice-sheet. Berends et al. (2018) demonstrated the viability of this method by simulating the evolution of the North American, Eurasian, Greenland and Antarctic ice-sheets throughout the entire last glacial cycle, 30 showing that model results agree well with available data in terms of ice-sheet extent, sea-level contribution, ice-sheet surface temperature and contribution to benthic d 18 O.
In this study, we extended the PI-LGM climate matrix by adding several climate states from the study by Dolan et al. (2015).
The four different ice sheet configurations they used are shown in Fig. 2. The "PRISM" ice sheets are based on the PRISM3 reconstruction , which is a time-slab representation of average peak warm conditions during the MPWP.
The "Small" ice sheets are present-day conditions. The "Medium" and "Large" ice sheets were based on the ICE-5G reconstruction of the last deglaciation (Peltier, 2004) at 8 ky ago and 11 ky ago, respectively. Each of these configurations was 5 used as boundary conditions for two simulations with HadCM3: one with 280 ppmv and one with 220 ppmv pCO2, both with 3.3 My orbital parameters. This adds up to eight different snapshots, plus one additional "Plio_Control" simulation with the PRISM3 ice sheet configuration and orbital parameters, and 405 ppmv pCO2. These simulations allow the climate matrix to separate effects on climate by pCO2 and ice-sheet extent and provide valuable information on climates that are both warmer and colder than present-day. Although the new climate matrix is relatively sparse for warmer-than-present worlds, containing 10 only one snapshot (Plio_Control) for pCO2 > 280 ppmv, and only three snapshots for smaller-than-present ice sheets (Plio_Control, PRISM_280 and PRISM_220), we believe it is still suitable for simulating the warm Pliocene. The matrix used by Berends et al. (2018) to simulate the last glacial cycle only contained two GCM snapshots in total, and still produced satisfactory results. This extended matrix therefore allows for a more accurate simulation of both the warm Late Pliocene and the cold MIS M2 glaciation. For the North America and Eurasia modules of ANICE, we added the simulations of the Medium and Large ice-sheets (both the 280 and 220 ppmv pCO2 versions) by Dolan et al. (2015), since those provide extra information on the effect on climate of intermediate-sized ice-sheets, as well as the Plio_Control simulation for its information on pCO2 levels above 280 ppmv. In the case of North America and Eurasia, we did not use the Small and PRISM states as there is no ice on either continent, meaning these simulations contain no additional constraints for these ice-sheet models. For the Greenland and Antarctica models, we chose to only add the simulations of the two PRISM states and the Plio_Control simulation, because 5 they provide new information on the effect on climate of smaller-than-present-day ice-sheets. The Medium and Large simulations were left out of the matrix because the ICE-5G ice-sheets (Peltier, 2004) that were used to force those HadCM3 simulations have the exact same horizontal extent as the ICE-5G LGM ice-sheets. Not only does this make it difficult to distinguish between these states in the interpolation routines, it also means the effect on local climate, other than through the altitude-temperature feedback, is likely to have been small. 10

Inverse method
The inverse forward modelling approach used to determine pCO2 based on the difference between modelled and observed benthic d 18 O is very similar to that described by de Boer et al. (2013). Their method calculates how the climate at high latitudes, (1) Here, D "# &&&&&&& is the mean surface temperature anomaly between 40 and 80 degrees latitude at sea level over the preceding 2 kyr.
The modelled benthic d 18 O is calculated using ice volume, ice-sheet d 18 O and deep-water temperatures relative to PD for every 100 years. The optimum values of 2 kyr for the length of the averaging window and 20 for the scaling parameter were determined by de Boer et al. (2013), producing a value of D "# = -15 K at LGM.
Since the climate matrix used in our model determines the regional climate based on the modelled ice sheet and the scalar atmospheric CO2 concentration, we adapted this approach for our model set-up by using the difference between modelled and observed d 18 O to calculate a value for pCO2, which is subsequently forwarded to the climate matrix. That way, our model reconstructs how pCO2 should have evolved in order to change global climate in such a way that the resulting changes in deep ocean temperature and land ice volume reproduce the observed benthic d 18 O signal. The algorithm then becomes: (2) As the constrained quantity is the change in pCO2, the scaling factor changes to 120 ppmv / per mill d 18 O change in order to produce a glacial interglacial contrast of 90 ppmv pCO2. Based on the results of preliminary experiments, the length of the CO2 averaging time window was increased to 8.5 kyr, in line with the higher values given by Stap et al. (2016). For forcing the inverse routine, the LR04 stack was used. Although a few different globally distributed stacks are available (e.g. Imbrie et 5 al., 1984;Lisiecki and Raymo, 2005;Zachos et al., 2001Zachos et al., , 2008Cramer et al., 2009), the differences among them are small for the late Pliocene and the Pleistocene. In order to maintain consistency with earlier reconstructions based on inverse modelling methods (de Boer et al., 2013;Stap et al., 2016;van de Wal et al., 2011) we decided to use the LR04 stack.
A conceptual visualisation of the inverse-method forced matrix model is shown in Fig. 3. 10

Paleotopography reconstruction
Several recent studies have investigated the evolution of bedrock topography in the geological past. Although ANICE does include a regional solid Earth model to calculate vertical bedrock movement in response to changes in ice distribution, other processes such as erosion and plate tectonics are currently not accounted for within the model, and require external forcing.
For the last glacial cycle benchmark simulation, such effects are assumed to be negligibly small, but this assumption might no 5 longer be valid when going millions of years back in time. In our study, we use the paleotopography reconstruction of the Barents Sea area by Butt et al. (2002), shown in Fig. 4, based on a reversal of the erosion of sediments by the Pleistocene ice-sheets. In order to investigate the effect on atmospheric and 10 oceanic circulation, Hill (2015) used HadCM3 to perform simulations both with this topography reconstruction and with present-day topography, both at 405 and 220 ppmv pCO2. By subtracting the calculated climate fields (temperature and precipitation) for the paleotopography simulation from the present-day topography simulation, and adding the resulting "fingerprint" to the climate fields generated by our climate matrix, we take into account the effect of this change in topography (and the accompanying change in the land/ocean mask) on the global climate. 15 Preliminary experiments showed that forcing the ice-sheet model with the paleotopography reconstruction without applying this climate "fingerprint" resulted in the persistent presence of a small (~5 m sea-level equivalent) ice-sheet over the newly exposed Barents Land. The climate fingerprint obtained from the simulations by Hill (2015) changes the local climate from an oceanic to a continental climate, with colder, dryer winters and warmer, wetter summers, resulting in more summer melt and an overall more negative mass balance, implying less ice. 20 Although other areas of the world might have been eroded by ice sheets (i.e. the Canadian Archipelago, Dowsett et al., 2016; Antarctica, Wilson and Luyendyk, 2009), no GCM simulations investigating the effect on global climate of reversing those changes are currently available. We have therefore chosen not to apply any of these other topography reconstructions to our model.

Last glacial cycle benchmark
In order to assess the performance of the model when calculating pCO2 with the inverse routine instead of prescribing it directly from an ice core record, we first performed a simulation of the last four glacial cycles, similar to the work by Berends et al. (2018). The model was calibrated by tuning the ablation parameter for the four individual ice-sheets such that their volumes at LGM match the ICE-5G reconstruction. We then performed a sensitivity analysis similar to the experiment described by The simulated pCO2 record is compared to the EPICA Dome C ice core record (Lüthi et al., 2008) in Fig. 5. The ranges of modelled values for pCO2 and sea-level drop at LGM for all investigated model parameters are listed in Table 2. Based on these uncertainties, the model shows that LGM pCO2 is 188 -197 ppmv and that the sea-level equivalent volume of the four 25 continental ice-sheets at LGM was 83 -100 m, agreeing well with the values of 185 ppmv pCO2 from the EPICA ice core and 100 m sea-level equivalent ice volume from the ICE-5G reconstruction (Peltier, 2004). The modelled pCO2 values match the EPICA record better than the values simulated by Stap et al. (2016), as demonstrated by the linear correlation and root mean very similarly to ours (R 2 = 0.72, RMSE = 14.7 ppmv), but since it was partly derived from the EPICA record, the comparison is not independent and therefore can not be compared to our results. Benthic oxygen isotope abundance and its contributions from ice volume and deep-sea water temperature are shown in Fig. 6 and compared to reconstructions by Lisiecki and Raymo (2005) and by Shakun et al. (2015), who made a proxy-based decomposition of the respective contributions to the benthic d 18 O from land ice and deep-sea temperature. The simulated 10 benthic d 18 O shows a near perfect match with the LR04 stack (Lisiecki and Raymo, 2005) that was used to force the model, as is to be expected when using the inverse forward modelling approach. The observed rapid drop in d 18 Osw at the inception, between 120 and 110 ky BP, is reproduced well, as is the drop in Tdw. Deep water temperature between 60 ky and LGM appears to be too high, more so than for the CO2-forced model version by Berends et al. (2018).

Deleted: and to
Surface temperature anomalies over Greenland and Antarctica compared to ice-core records (EPICA Dome C; Jouzel et al., 2007;GISP2;Alley, 2000;NGRIP;Kindler et al., 2014) throughout the last glacial cycle are shown in Fig. 7. The performance of the new model version in terms of ice surface temperature is comparable to that of the model by Berends et al. (2018), as illustrated by the linear correlation coefficients and root mean square error between the modelled temperatures and the ice core records: R 2 = 0.87 and RMSE = 0.86 K for Antarctica in this model, versus R 2 = 0.84 and RMSE = 0.91 K in the old version. 5 For Greenland, the new model produces a value of R 2 = 0.74 and RMSE = 2.2 K, versus R 2 = 0.65 and RMSE = 2.6 K for the old model. The mismatch during the inception of the glacial cycle between isotope-derived Antarctic surface temperature and ice core 15 CO2 on the one hand and benthic d 18 O and sea level on the other hand, reported by Bintanja and van de Wal (2008), van de Wal et al. (2011), de Boer et al. (2014, Niu et al. (2017) and Berends et al. (2018), is much better in the simulations here. The linear correlation coefficient R 2 between modelled and reconstructed Antarctic surface temperatures between 120 and 80 ky ago increased from a value of 0.49 for Berends et al. (2018) to a value of 0.74 for this study. For Greenland, this value increased from 0 to 0.36. The CO2-forced model (Berends et al., 2018) produced Antarctic surface temperatures that were in good 20 agreement with the isotope-based proxy record but failed to reproduce the strong sea-level drop. The d 18 O-forced model from this study reproduces benthic d 18 O and its different contributions and shows a too strong decrease in pCO2, but is in overall agreement with proxy records of CO2, sea level and temperature, indicating that there is an added value of using the climate matrix method as applied here.
Deleted: at reproducing these temperature records

Transient simulation of the Pliocene
The comparisons between model results and (proxy) data for the simulations of the last four glacial cycles indicate that the model accurately reproduces pCO2, ice volume and general geometry (not shown), and surface temperatures. We therefore proceeded to apply the new model set-up to the late Pliocene. We chose to start our transient simulations at 3.65 My ago, capturing the warm period between 3.6 and 3.4 My. The simulations were run until 2.75 My ago, since the density of available 5 pCO2 proxy data is much higher after MIS M2, allowing for a more detailed comparison of modelled pCO2 to proxy-based pCO2 reconstructions. The model was initialised with the same PRISM3 ice-sheets  that were also used to force the PRISM and Plio_Control HadCM3 experiments by Dolan et al. (2015). Due to the nature of the inverse coupling method, initialising the model with present-day ice-sheets quickly converges to the same result. Topography was set to presentday plus the Barents Sea erosion reversal from Butt et al. (2002) and a glacial isostatic adjustment (GIA) correction accounting 10 for the difference in ice loading over Greenland and Antarctica according to the PRISM3 reconstruction .
Insolation and benthic d 18 O were prescribed according to Laskar et al. (2004) and Lisiecki and Raymo (2005), respectively. In order to estimate the uncertainty in the modelled ice volume, we performed the same sensitivity analysis as for the last glacial cycle, with the same parameter values shown in Table 1. The resulting simulated pCO2 record is shown in Fig. 8 and compared to other model reconstructions (van de Wal et al., 2011;Stap et al., 2016) and to proxy-based data derived from alkenones 15 (Seki et al., 2010;Badger et al., 2013;Zhang et al., 2013) and d 11 B ratios (Seki et al., 2010;Bartoli et al., 2011;Martínez-Botí et al., 2015;Stap et al., 2016).  Table 2. KM5c is used because it has been identified as a time slice representing the mid-Pliocene 25 Warm Period (Haywood et al., 2013). These simulations show that during MIS M2 pCO2 is 233 -249 ppmv and that the sealevel equivalent volume of the four continental ice-sheets was 10 -25 m bigger than present-day, with the uncertainty based on the spread in the results from the ensemble of simulations. The uncertainty in modelled pCO2 becomes much larger for warmer-than-present climates, as shown by the modelled ranges for KM5c. The sensitivity to the benthic d 18 O forcing is especially high, resulting in modelled pCO2 values of 303 -384 ppmv. The reason for this is that the climatological forcing resulting from the climate matrix is less constrained for warmer than present-day climates. Whereas the matrix contains six snapshots describing climates with more ice and/or lower pCO2, there is only one ice sheet configuration smaller than present-5 day (PRISM), and only one snapshot with a pCO2 higher than 280 ppmv (the Plio_Control simulation, with 405 ppmv). The resulting modelled sea-level contributions over time are shown in Fig. 9. The modelled ice-sheets over the Northern and Southern Hemispheres at MIS M2 and KM5c are shown in Fig. 10 and Fig. 11, respectively. In North America, MIS M2 is clearly visible as a strong peak in ice volume, which immediately disappears when pCO2 rises again. Most of the ice forms 10 over north-eastern Canada, with a smaller ice sheet developing over the northern Cordillera. In Eurasia, only small ice-caps form on Svalbard and Nova Zembla (no longer islands, but now small mountain areas bordering the newly exposed Barents Land), with no sizeable ice sheets forming even at the peak of MIS M2. Greenland is mostly ice-free until MIS M2, when it rapidly develops an ice sheet slightly larger than present day. After MIS M2, the ice sheet disappears, advancing and retreating several times during the following period. Similar behaviour is observed on West Antarctica, while East Antarctica remains 15 stable throughout the simulation. While both Greenland and Antarctica continue to show substantial variability throughout the remainder of the simulations, North America does not glaciate again until the onset of the Pleistocene glacial cycles, 2.8 My ago.   Global mean sea level is compared to two different reconstructions in Fig. 12. Our model results generally lie between the d 18 O-based reconstruction by Miller et al. (2011) and the reconstruction based on geological backstripping from New Zealand by Miller et al. (2012). During warm periods, our model generally shows lower sea levels and less variability than Miller et al. 5 (2011) andMiller et al. (2012). During cold periods, our model generally shows less sea-level drop than the d 18 O-based reconstruction by Miller et al. (2011), but more than the reconstruction based on geological backstripping. The d 18 O-based reconstruction by Miller et al. (2011) is based on a linear relation between benthic d 18 O and sea level, which is an oversimplification of separating the contributions to the benthic d 18 O signal (e.g. Bintanja et al., 2005;de Boer et al., 2013). Miller et al. (2012) noted that reconstructing absolute values for local relative sea-level based on geological backstripping is 10 difficult due to the required corrections for GIA and dynamic topography. However, the relatively short duration of MIS M2 means that the reconstructed drop in sea-level, of about 20 m relative to the background level, is likely to be accurate. Our model produces a value of about 24 m, in good agreement with this value. The evolution of the West Antarctic ice sheet agrees partially with information derived from the AND-1B sediment core, recovered from beneath the northwest part of the Ross ice shelf by the ANDRILL programme (Naish et al., 2009;McKay et al., 2012). Information derived from this core by de Schepper et al. (2014) is compared to model results in Fig. 13. AND-1B 5 shows ice-free conditions in the Ross Sea up to 3.4 Myr ago, followed by glacial deposits up to 3.24 Myr ago. Our model results show ice-free conditions up to 3.32 Myr ago, just prior to MIS M2. The ice-free conditions shown in our model results around KM5c cannot be validated by AND-1B due to a lack of data. Between 3.14 and 3.04 Myr ago, AND-1B again contains glacial deposits when our model results indicate ice-free conditions. The glacial conditions between 3.04 and 2.95 Myr ago and the subsequent ice-free conditions between 2.95 and 2.90 Myr ago indicated by AND-1B match with our model results. 10 However, since the AND-1B sediment core contains several sizeable data gaps due to geological unconformities, the possibility that observed ~40 ky cycles in ice-rafted debris concentration have been incorrectly matched with 40 ky cycles in the d 18 O age model can not be precluded. We therefore conclude that the AND-1B sediment core record can not be used to confirm or refute our model results.

Discussion and conclusions 20
We have presented a new time-continuous, self-consistent reconstruction of pCO2, ice sheet configuration and climate for the late Pliocene, 3.65 My -2.75 My ago. Our approach is based on the matrix method by Berends et al. (2018), where an icesheet model is forced with a combination of several pre-calculated GCM snapshots. We have extended their two-state climate matrix with several GCM snapshots created by Dolan et al. (2015), who simulated global climate during MIS M2 for different ice-sheet configurations and pCO2 levels. Since our initial experiment, where this model was forced with the pCO2 reconstruction by Stap et al. (2016), proved unable to constrain sea-level during MIS M2 any further, we adopted the inverse forward modelling approach by de Boer et al. (2013), forcing the model with the LR04 benthic d 18 O stack (Lisiecki and Raymo, 2005). By first using this d 18 O-forced model set-up to simulate the last glacial cycle, we showed that it performed at least 5 equally well to the CO2-forced set-up by Berends et al. (2018) in terms of benthic d 18 O (Fig. 6) and surface temperature (Fig.   7), and better than the 1-D model set-up by Stap et al. (2016) in terms of simulated pCO2 (Fig. 5).
Our results for the late Pliocene show a global mean sea-level drop of 10 -25 m during MIS M2, with the uncertainty resulting from a sensitivity analysis investigating several key model parameters and the uncertainty in the applied d 18 O forcing. This 10 value is in good agreement with the reconstruction based on geological backstripping from New Zealand by Miller et al. (2012; 10 ± 10 m) and the d 18 O-based reconstruction by Miller et al. (2011;34 ± 10 m). The extra ice with respect to present day is located mostly on eastern Canada and the northern Cordillera (9 -20 m SLE) and the grounded ice over the Filcher-Ronne Sea (1.5 -3.5 m SLE). The atmospheric CO2 concentration necessary to produce the cooling required to grow these ice-sheets is shown to be 233 -249 ppmv. During MIS KM5c, most of the ice on Greenland and West Antarctica disappears, raising 15 global mean sea level to 3 -10 m above present day, caused by a pCO2 of 303 -384 ppmv. The sea-level high stand of the mid-Pliocene Warm Period is achieved during MIS KM3 (3.155 My ago) at 8 -14 m above present-day. The larger uncertainty in the modelled pCO2 during warmer periods is attributed to the fact that the climate matrix used to force our ice sheet model contains only one GCM snapshot with a pCO2 above present day levels, and only one ice sheet configuration with smaller than present ice sheets. Hence, the relationship between ice sheets and climate for warmer than present worlds is poorly constrained, 20 which is reflected by an increased uncertainty in the simulated pCO2 and ice volume. Expanding the climate matrix with additional GCM snapshots for intermediate pCO2 levels, orbital configurations and ice-sheet geometries could help reduce this uncertainty, by more accurately capturing the non-linear response of many climatological parameters to these forcings and boundary conditions.

25
Despite the large uncertainty, our results suggest that CO2 concentration during this warm time interval have not been significantly higher than present-day (~400 ppmv) values, in contrast to some of the proxy results. Comparing our Pliocene However, based on the ability of the different models to reproduce the EPICA pCO2 record, assigning more confidence to the reconstruction presented here is justified. Berends et al. (2018) provide a detailed discussion of the various advantages and disadvantages of the matrix method with respect to other methods of model forcing and coupling. Non-linear feedbacks of a growing ice sheet on the local and global 5 climate, such as changes in atmospheric stationary waves, are not properly captured by this model set-up, although the inclusion of more GCM snapshots for intermediate-sized ice sheets should make the behaviour of the model more realistic in this respect.
As a result the inception of the last glacial cycle (100 -80 ky ago; Fig. 6, Fig. 7), is now also satisfyingly resolved in term of temperature and sea level drop though the decrease in pCO2 seems stronger than suggested by the ice core record.

10
A drawback of the matrix method used here is that ocean temperature, required for calculating sub-shelf melt, is not included as a data field in the GCM snapshots. Instead, sub-shelf melt is calculated based on a combination of the temperature-based formulation by Martin et al. (2011) and the glacial-interglacial parameterization by Pollard and DeConto (2009), tuned by de Boer et al. (2013) to produce realistic present-day Antarctic shelves and grounding lines. Although Berends et al. (2018) show that this set-up performs well when simulating colder-than-present climates, this is not necessarily a priori true for warmer 15 climates, where the ice shelves are expected to retreat or even disintegrate. A more elaborate parameterisation based on GCMcalculated ocean temperatures can be expected to produce more reliable results.
Similarly, the effect of changes in insolation upon surface temperature is not well constrained. The climate matrix proposed by Berends et al. (2018) uses a parameterisation based on the locally absorbed insolation. While this allows changes in 20 prescribed insolation to affect climate by changing the relative weights assigned to the different GCM snapshots in the climate matrix, the different GCM snapshots used in the current version of the climate matrix were all forced with the same 3.3 Ma reconstruction by Lasker et al. (2004). Expanding the climate matrix with additional GCM snapshots for different orbital parameters, along the lines of Prescott et al. (2014;, would make the relation between insolation and surface temperature more explicit. We believe this could possibly lead to a further retreat of the East Antarctic ice sheet during warm periods. 25 Another possible hindrance to significant retreat of the Antarctic ice sheet in our simulations is the lack of explicit grounding line physics and relatively low model resolution, both of which have been shown to be required for accurate simulations of grounding line retreat (Schoof, 2007;Gladstone et al., 2012;Leguy et al., 2014). Instead, ANICE calculates sheet and shelf ice velocities using the SIA and SSA, respectively, and add these together, without additional grounding-line parameterisations. 30 An additional source of uncertainty in our reconstruction is the paleotopography of the period. Although we did include the Barents Sea erosion reversal by Butt et al. (2002) and its climate "fingerprint" as provided by Hill (2015)  Archipelago has been suggested to have been still one unbroken landmass which only formed later through erosion by ice during the Pleistocene glaciations (Dowsett et al., 2016), the Hudson Bay was likely not yet submerged (present today mostly due to remaining isostatic depression from the Laurentide ice sheet (Dowsett et al., 2016;Raymo et al., 2011). Similarly, based on the Eocene-Oligocene Transition (34 My ago) paleotopography reconstruction by Wilson and Luyendyk (2009), it is possible that, even during the Pliocene, West Antarctica was still mostly dry land (mostly submerged today due to erosion by 5 ice and isostatic depression) and the Filchner-Ronne and Ross seas were significantly deeper (shallowed by ice-eroded sediment from West Antarctica). Although such changes in topography would likely have changed the evolution of the ice sheets, preliminary experiments for the Barents Sea showed that including the topography change without its GCM-calculated effect on climate resulted in a strong overestimation of ice volume, mostly because applying the present-day sea climate to the newly exposed high-latitude landmass resulted in a strongly positive mass balance even with pCO2 above 400 ppmv. Since no 10 studies investigating the effects of these other topography changes on local and global climate are available yet, we did not include these changes in our study. Future work might be focused on reinvestigating these effects once results from new GCM simulations with these topography changes become available.
Considering the results from the comparison of our model output to the available proxy data and the different uncertainties 15 and caveats in our results, we believe our results could be of added value to future iterations of the Pliocene Model Intercomparison Project (PlioMIP), to be used for example as boundary conditions for new GCM snapshots or even transient simulations.
Code and data availability. The reconstructed records of pCO2, global mean sea level and benthic d 18 O, as well as NetCDF 20 files containing ice thickness, bedrock topography and annual mean surface temperature and precipitation for all four ice-sheet model regions during MIS M2, MIS KM5c and MIS KM3 are available online in the Supplement at https://doi.org/10.5281/zenodo.2598292 (Berends et al., 2019a).
Author contributions. CJB, BdB, and RSWvdW designed the study. AMD and DJH provided data from their own studies. CJB