Global and regional sea surface temperature trends during Marine Isotope Stage 11

. The Marine Isotope Stage (MIS) 11 (424–374 ka) was characterized by a protracted deglaciation and an unusually long climatic optimum. It remains unclear to what degree the climate development during this interglacial re-ﬂects the unusually weak orbital forcing or greenhouse gas trends. Previously, arguments about the duration and timing of the MIS11 climatic optimum and about the pace of the deglacial warming were based on a small number of key records, which appear to show regional differences. In order to obtain a global signal of climate evolution during MIS11, we compiled a database of 78 sea surface temperature (SST) records from 57 sites spanning MIS11, aligned these individually on the basis of benthic ( N = 28) or planktonic ( N = 31) stable oxygen isotope curves to a common time frame and subjected 48 of them to an empirical orthogonal function (EOF) analysis. The analysis revealed a high commonality among all records, with the principal SST trend explaining almost 49 % of the variability. This trend indicates that on the global scale, the surface ocean underwent rapid deglacial warming during Termination V, in pace with carbon dioxide rise, followed by a broad SST optimum cen-tered at ∼ 410 kyr. The second EOF, which explained ∼ 18 % of the variability, revealed the existence of a different SST trend, characterized by a delayed onset of the temperature optimum during MIS11 at ∼ 398 kyr, followed by a prolonged warm period lasting beyond 380 kyr. This trend is most con-sistently manifested in the mid-latitude North Atlantic and Mediterranean Sea and is here attributed to the strength of the Atlantic meridional overturning circulation. A sensitivity analysis indicates that these results are robust to record selection and to age-model uncertainties of up to 3–6 kyr, but more sensitive to SST seasonal attribution and SST uncertainties > 1 ◦ C. In order to validate the CCSM3 (Commu-nity Climate System Model, version 3) predictive potential, the annual and seasonal SST anomalies recorded in a total of 74 proxy records were compared with runs for three time slices representing orbital conﬁguration extremes during the peak interglacial of MIS11. The modeled SST anomalies are characterized by a signiﬁcantly lower variance compared to the reconstructions. Nevertheless, signiﬁcant correlations between proxy and model data are found in comparisons on the seasonal basis, indicating that the model captures part of the long-term variability induced by astronomical forcing, which appears to have left a detectable signature in SST trends.


Introduction
Marine Isotope Stage (MIS) 11 (424-374 ka) (Lisiecki and Raymo, 2005) stands out among the middle Pleistocene interglacials by its unusually long climatic optimum and a subdued orbital forcing due to low orbital eccentricity (Tzedakis et al., 2009). The amplitude of orbital parameters during MIS11 is similar to the Holocene, and MIS11 has been often considered an analogue to the present interglaciation (Berger and Loutre, 1991;Loutre and Berger, 2003;EPICA community members, 2004). However, whereas the present interglaciation has so far lasted through one single summer insolation maximum at 65 • N, the MIS11 interglacial optimum spans two such insolation maxima. Further, the deglaciation culminating in MIS11 climatic optimum (Termination V) was associated with an unusually weak orbital forcing and a different phasing of precession and obliquity, making orbital alignment with the Holocene difficult, and driving a protracted deglacial sea-level rise during Termination V, twice as long as during Termination I (Lisiecki and Raymo, 2005;Rohling et al., 2010;Tzedakis, 2010). In contrast to the differences in orbital parameters, the greenhouse gas concentrations in the atmosphere during MIS11 and the rate of their increase during Termination V were similar when compared to the preindustrial Holocene (Petit et al., 1999;Siegenthaler et al., 2005).
During MIS11, warm interglacial conditions lasted longer than in any other mid-to late Pleistocene interglacials as for instance reflected by the longer period of higher temperatures over Antarctica  and the peak sea level appearing to have been slightly higher than at present (Raymo and Mitrovica, 2012). The presence of an extended "climatic optimum", lasting around 30 kyr, has been documented in sea surface temperature records across the world ocean (McManus et al., 1999;Hodell et al., 2000;Kandiano and Bauch, 2003;De Abreu et al., 2005;Dickson et al., 2009;Stein et al., 2009;Voelker et al., 2010), in temperature proxies from Antarctic ice cores (Petit et al., 1999;Pol et al., 2011) and in terrestrial pollen records (Tzedakis, 2010). Like the Holocene, the MIS11 climatic optimum appears to have been a stable interglacial period McManus et al., 1999), characterized by low-amplitude millennial-scale climate variability Healey and Thunell, 2004;Pol et al., 2011;Vázquez Riveiros et al., 2013).
On the other hand, temperatures in the northern high latitudes during MIS11 appear lower than in the Holocene (Bauch et al., 2000;Helmke et al., 2003), and their temporal development seems to deviate from the global trend (Kandiano et al., 2012). These differences have been linked to changes in the strength of the Atlantic meridional overturning circulation, underlining the importance of the response of oceanic circulation to global forcing during Termination V and MIS11 (Dickson et al., 2009;Vázquez Riveiros et al., 2013). Until now, the congruence of sea surface temperature (SST) trends during MIS11 has never been assessed objectively, on a global basis and with an explicit consideration of dating uncertainty. Such an analysis is essential to determine the robustness and timing of MIS11 climatic optimum and the relationship between MIS11 SST trends with global forcing.
Here, we present a global compilation of sea surface temperature (SST) records for MIS 11, aligned by oxygen isotope stratigraphy, that cover a large proportion of the global oceans in both hemispheres. The aim of this study is to an- alyze temporal trends in the SST records and to investigate their linkage to global and regional climate variability during this period. Specifically, the following questions will be addressed: (i) what are the roles of orbital and greenhouse gas forcing in MIS11 climate variability, (ii) to what extent is regional climate variability reflected in SST trends, and (iii) how does tempo-spatial climate variability simulated by a state-of-the-art climate model for orbital configuration extremes of MIS11 correspond to that found in proxy records? 2 Material and methods

Material
We compiled a total of 78 marine SST records from 57 sites, covering a large geographical range (175 • E-172 • W and 57 • N to 54 • S), and water depths from 826 to 4620 m (Table 1). Most records stem from cores drilled in the Atlantic and Pacific oceans, but the database also includes records from the Indian Ocean, the Southern Ocean, and the Mediterranean Sea (Fig. 1). We have only chosen SST records for which stable benthic or planktonic foraminiferal oxygen isotope data are available with a sufficient temporal resolution to establish a robust stratigraphic framework for each record (Sect. 2.2). Most data sets were derived from the Pangaea (http://www.pangaea.de) and National Oceanic and Atmospheric Administration (NOAA) (ftp://ftp.ncdc.noaa.gov) websites. Data not available online were provided by the principal investigators or extracted from published figures through digital image processing.
The SST records are based on different proxies. Aware of the significant differences in the part of the seasonal SST cycle that is represented by each proxy, we have attributed the individual SST records to seasons. Thus a total of seven Y. Milker et al.: Global and regional sea surface temperature trends 2233 Table 1. Summary of the MIS11 SST records used in this study. Longitude, latitude, and water depth (m) of each record are provided. The type of δ 18 O records used for the age model tuning, the quality of the age models (expressed as the correlation coefficients of the stable isotope records with the LR04 stack (Lisiecki and Raymo, 2005)), and the temporal resolution of the age models and SST data is indicated. Records used for the empirical orthogonal function (EOF) analysis are marked with an asterisk, those used for the proxy-data model comparison with an x; also given are the total number of available data points and types of SST records used in the study.  Liu and Herbert (2004); Lawrence et al. (2006); Herbert et al. (2010) www.clim-past.net/9/2231/2013/  Imbrie et al. (1997) 1 The authors made a correction of +0.64‰ for C. wuellerstorfi. 2 Isotope data of C. pachyderma and C. wuellerstorfi were grouped together in this study. 3 De Abreu et al. (2005) measured multiple benthic species (C. robertsonianus, U. peregrina, G. affinis, C. wuellerstorfi, C. kullenbergi, H. elegans, O. umbonatus and C. carinata) and calibrated them to Uvigerina peregrina. 4 Data were taken from the original publications. 5 The authors applied two different calibrations of their alkenone data. In this study the calibration based on Conte et al. (2006) was used. 6 The mean of SSTs estimated with MAT and RAM data was used in this study. 7 The originally published age models, based on a graphical correlation of the isotope records to the LR04 stack, were used. 8 For the summer and winter SSTs, the mean of two MAT estimates was used in this study. 9 The mean of SSTs calculated with MAT, RAM and TFT was used in this study. 10 The mean of two SSTs calibrated with different equations was used in this study. 11 The mean of SSTs based on Mg / Ca ratios of G. ruber and G. sacculifer was used in this study. 12 The originally published age models were used and re-tuned to the LR04 stack. records based on planktonic foraminiferal Mg / Ca in low to mid-latitudes were attributed to annual SST (see Barker et al., 2005), and all 25 alkenone (U k 37 ) records were attributed to annual SST (see Müller et al., 1998) (Fig. 3d). A total of 27 seasonal SST records are based on foraminiferal, radiolarian and diatom assemblages using transfer functions including the modern analog technique (MAT) (Prell, 1985), the Imbrie-Kipp technique (IKT) (Imbrie and Kipp, 1971), the revised analog method (RAM) (Waelbroeck et al., 1998), and SIMMAX (Pflaumann et al., 1996), while one record based on the artificial neural network approach (Malmgren and Nordlund, 1997;Malmgren et al., 2001) was attributed to annual SST (Fig. 3d). Among the individual studies, the calibration data sets and the exact definitions of the seasons vary, but all transfer functions have been calibrated to a representation of "surface" SST, and differences due to different calibration data are unlikely to affect the shape of the SST trends. Finally, one SST record was derived by subtracting the benthic δ 18 O from the δ 18 O signal of the planktonic foraminifer Neogloboquadrina pachyderma (McManus et al., 1999) and another one using the relative abundance of N. pachyderma (sinistral) (Vázquez Riveiros et al., 2010). Both of these records are from high-latitude settings and were considered to represent the summer growth season, following the authors of these studies. Furthermore, we included one stack that is based on the mean SSTs calculated from U k 37 , Mg / Ca, and Tex H 86 measurements (Caley et al., 2011) following the author's statement that the stack gives more accurate SSTs compared to the individual records, and considered that this record represents annual SST. Thus, all SST records can be taken at first approximation to represent a "surface" signature, which has been attributed seasonally as far as possible.

Chronostratigraphy
To allow a direct comparison of SST trends, all records were tuned to the LR04 stack (Lisiecki and Raymo, 2005) on the basis of benthic or planktonic δ 18 O. The tuning was carried out for the period between 200 and 550 ka using the AnalySeries software (Paillard et al., 1996) (Fig. 2a-c). The longer tuning time interval enabled a better correlation between the LR04 stack and the δ 18 O data, because it includes more than one glacial-interglacial cycle. For the majority of the records, the δ 18 O data with their corresponding core depths were tuned to the LR04 stack. Where both benthic and planktonic δ 18 O data were available, the benthic records were used for tuning with priority. Depending on the temporal resolution of the records, between 6 and 18 tie points were defined for the target time interval 300-500 ka (Table 1, Fig. S3). This interval was selected because it covers the entire MIS11 and the major portion of the preceding and following glacials, allowing multiple robust tie points to be defined. For record MD03-2699, the tuning to the LR04 stack was problematic, and therefore we used the original age model that is mostly based on the graphical correlation of the  Lisiecki and Raymo (2005).
isotope record to ODP Site 980 (given in LR04 ages) (see Voelker et al., 2010, for further explanations). For three sites (MD96-2048, MD 96-2077 and ODP Site 1239), we used the age models in the original publications as they were already based on a graphical correlation of the isotope records to the LR04 stack. For 14 of the compiled records that are included in the LR04 stack, we also have identified a limited number of age control points depending on their resolution such that the number and type of control points was comparable to the other records and then used the age and depth assignment of these points based on the original LR04 age model. The age models of each core are being made available together with all data of the compilation via Pangaea. In addition, age-depth plots for all records used in this study are shown in Fig. S3, highlighting the studied MIS11 interval.
The temporal resolution of all proxy records was calculated for the 300-500 ka time interval (Fig. 3a, Table 1), and age model quality was evaluated on the basis of the correlation between the LR04 stack and the stable oxygen isotope curve of each record, as implemented in the AnalySeries software (Table 1)

Analysis of SST trends
To extract principal SST trends in the global compilation, the stratigraphically tuned SST records were subjected to an empirical orthogonal function analysis (principal component analysis applied on equally spaced time series). This method projects a multivariate data set onto a subset of a few principal components, whilst retaining as much information as possible (Hannachi et al., 2007;Jolliffe, 2002;Hammer and Harper, 2006). As the first step of the empirical orthogonal function (EOF) analysis, the tuned SST time series were linearly interpolated to a temporal resolution of 1000 yr allowing a comparison the SST trends reflected by the EOFs with global climate trends across MIS11. All records were normalized to unit variance, preventing the first EOFs from being dominated by variables with large variance. Of the total of 57 sites, SST records from 48 sites (34 representing annual and 14 representing caloric summer SST) covered the interval of 370-430 ka and were included in the analysis (Table 1, Fig. 3e). This interval was selected to allow the inclusion of the most records, whilst covering the entire MIS11. The 11 records not included in the analysis either ended or started within MIS11. Since we were only interested in the shape of SST trends, we have merged records attributed to annual and summer SST in one joint analysis. We believe this is justified since the biggest difference is likely to be that of a systematic offset, rather than a difference in the shape of the trends. Nevertheless, to assess to what degree this decision may have affected the results, we have also carried out the analyses separately for annual and caloric summer SST. In order to test the EOF sensitivity to age model uncertainty, the analysis was repeated on data in which the age control points of all records were randomly resampled with an age uncertainty of 3, 5, and 6 kyr. The lowest value corresponds to the mean temporal resolution of the records (Table 1), and values higher than 6 kyr were not tested as they would equate to age uncertainties approaching the shortest orbital cycle, implying a total failure of the orbital tuning procedure. All age models in this study are based on the age assignment to a small number of distinct features in stable isotope records, between which an interpolation takes place. As a first approximation, we consider that each of the age model control points is liable to the same uncertainty, and that these uncertainties are symmetrical and independent of each other. The randomized age model for each record was then created by stepwise linear correlation between the resampled control points. We have not carried out an autocorrelated age uncertainty propagation between the control points, but the algorithm we used detected and rejected all trials where the resampling created chronological reversals (such a situation may arise when age model control points are closer than twice the age uncertainty of the resampling).
Next, in order to assess the robustness of the EOF pattern to uncertainties in the SST values, the SST reconstructions were resampled under a simulated proxy uncertainty of 1, 2, and 4 • C. The lowest value corresponds to the typical value of an SST proxy uncertainty due to calibration. Higher values of SST uncertainty acknowledge the possibility that the SST reconstructions may be misattributed seasonally between glacial and interglacial ocean conditions, reflecting, for example, shifts in the production season of the proxy carrier. Finally, we used a jackknifing approach to examine the sensitivity of the EOFs to the number of records used in the analysis. EOFs were re-calculated, with 1000 replications, after randomly excluding 5, 15, 25, and 35 records during each replication. The EOF analysis was then carried out with 1000 replicates considering various combinations of age model uncertainty, SST proxy uncertainty and record exclusion. All EOF calculations were carried out with an algorithm written in MATLAB R2012a, which is available in the Supplement (S5).

Model-data comparison of climate variability
A comparison of SST anomalies in the proxy data with climate model output was carried out for three time slices (394, 405, and 416 ka before present) during the peak MIS11 interglacial when the influence of ice sheets on climate was supposed to be small. The three time slices reflect different extremes of the orbital configurations during peak MIS11. The 394 and 416 ka time slices are characterized by minimum and maximum obliquity (cf. Fig. 8a), respectively, while precession is almost identical. The 405 ka time slice coincides with the LR04 oxygen isotopic minimum (Lisiecki and Raymo, 2005) and high northern summer insolation (cf. Fig. 8a). For model-data comparison, arithmetic averages of all proxy values in each tuned record that fall into a 10 kyr interval centered on the respective model time slice were extracted from the proxy records. In this way, we have made sure that age model uncertainties in the proxy data were taken into account, whilst we acknowledge that the proxy time-slice data are likely to have been smoothed. The time slice simulations were performed with the National Centers for Atmospheric  (Siegenthaler et al., 2005;Schilt et al., 2010;Lourergue et al., 2008 Research (NCAR) Community Climate System Model version 3 (CCSM3), which is described in the following section. CCSM3 is a state-of-the-art coupled climate model that performs without flux corrections. The global model is composed of four separate components representing atmosphere, ocean, land, and sea ice (Collins et al., 2006). Here, we use the low-resolution version of CCSM3, which is described in detail by Yeager et al. (2006). In this version, the resolution of the atmosphere is given by T31 (3.75 • transform grid) spectral truncation with 26 layers, while the ocean model has a nominal horizontal resolution of 3 • (like the sea-ice component) with 25 vertical levels. The latitudinal resolution of the ocean grid is variable, with finer resolution around the Equator (0.9 • ). The land model is defined on the same horizontal grid as the atmosphere and includes components for biogeophysics, biogeochemistry, the hydrologic cycle, as well as a dynamic global vegetation model. In order to improve the simulation of vegetation cover, new parameterizations for canopy interception and soil evaporation have been implemented into the land component (Oleson et al., 2008).
A pre-industrial control run was performed following the protocol established by the Paleoclimate Modelling Intercomparison Project, Phase 2 (Braconnot et al., 2007;Otto-Bliesner et al., 2006;Merkel et al., 2010). The control run was integrated for 600 yr starting from present-day initial conditions. For the selected time slices, appropriate orbital parameters (Berger, 1978) and greenhouse gas concentrations were prescribed, as given in Table 2, while all other forcings (ice sheet configuration, ozone distribution, sulfate aerosols, carbonaceous aerosols, solar constant) were kept at pre-industrial levels. Starting from the last year of the (quasi-)equilibrated pre-industrial control run, all simulations were integrated for 400 yr so that the surface climatologies could reach a statistical equilibrium. For each experiment, the mean of the last 100 simulation years was used for analysis.
For a direct proxy-model comparison, we used only such proxy SST records for which at least one value was available in each of the three time-slice intervals and calculated the differences between the SST average for the three time slices and the SST average of each time slice. In all cases, the seasonal attribution has been preserved, allowing data-model comparison on a seasonal basis. The comparison is synoptic in that northern summer SST and southern winter SST are analyzed together and vice versa. The eventually selected 74 records from a total of 52 sites contain a total of 35 annual, 16 Northern Hemisphere summer, and 23 Northern Hemisphere winter SST records (Fig. 3f, Table 1). Model data have been extracted from the surface layer (0-20 m) field of the nearest grid cell to each proxy record. For the data-model comparison, we calculated the proxy-based and modeled SST anomalies of the Northern Hemisphere summer (July-September in the model) and winter (January-March in the model) periods and the annual SST anomalies relative to the mean SST of the 390-420 ka time interval (the average of the 394, 405, and 416 ka time slices in the model).
When testing for the correlation between SST anomalies retrieved from proxy data and model calculations on a perseason basis, part of the uncertainty is whether or not the proxy data have been assigned to the correct season. In order to test the validity of our assignments (compare also with Sect. 2.1), we applied two randomization tests on the data set showing the best correlation between proxy and model (in this case 416 ka summer). Using all 52 records in that time slice, we ran 10 000 permutations during which we correlated summer anomaly data of the model with random season data of the proxies. In a second randomization we introduced one more degree of freedom by randomly choosing only 30 of the 52 records available and correlating a randomly assigned seasonal data set in the proxy data with the summer values of the model data. To test the agreement between proxy and modeled SST anomalies in a quantitative way, the correlation between the proxy and modeled data for each season and each time slice was calculated using the PAST software package (Hammer et al., 2001). The type of correlation analyses was adapted to the results of normality tests applied on the data sets. Pearson's product-moment correlation (r) was used where the data were normally distributed (Shapiro-Wilk test), while Spearman's rank-order correlation (ρ) was used when the data did not meet these requirements. In order to assess not only the strength of the direct relationship between the modeled and proxy-based anomalies, but also to compare the direction of change in the data, Cohen's κ (Cohen, 1960) was used. Here, the temperature anomaly values were categorized into three nominal values, i.e., negative, positive, and no change relative to the mean SST of the 390-420 ka time interval (the average of the 394, 405, and 416 ka time slices in the model data). On the basis of those categories, it was counted how often proxy data and model predictions were in agreement with each other, and how often they contradicted and, if so, in which way they contradicted (e.g., model is positive and proxy is negative, or model has no change and proxy is positive). This resulted in contingency tables with counts of all nine possible relations, from which Cohen's κ and corresponding p values (for H0 = no agreement between raters) were calculated in the SPSS software package (version 20). Verbal assignments of quality of agreement solely on the basis of the κ value are in accordance with standard values given in the literature (e.g., Altman, 1991). In order to quantify the differences in variability in the proxy records and the climate model, the variance of the SST anomalies in the proxy and model data sets was calculated for each time slice and season, and compared using an F test. Finally, we determined whether or not the proxy pattern for each time slice can be considered as significant deviation from no change. To this end, we calculated the confidence interval for the mean proxy-based SST anomaly values using means of error propagation, where the confidence interval is a function of both the number of records averaged and the variance within those records.

Age model quality
In order to quantify the quality of our age models, we examined both the correlation of the δ 18 O records with the LR04 stack (Lisiecki and Raymo, 2005) as well as their temporal resolution. The majority of the benthic isotope records (44 records, i.e., 77 %) show a high correlation of r ≥ 0.80 for the entire tuning time interval between ∼ 200 and ∼ 550 ka (Table 1). Ten records (18 %) show a moderate correlation with the LR04 stack (r = 0.60 between r = 0.80). Three records show a low correlation with the benthic stack with r = 0.50 for ODP Site 882, r = 0.46 for MD03-2699, and r = 0.36 for ODP Site 1168. The records from MD03-2699 and ODP Site 1168 are characterized by low δ 18 O values in the later part of MIS11. According to Voelker et al. (2007), the benthic δ 18 O values of core MD03-2699 are strongly affected by the Mediterranean outflow water (MOW) during glacial and interglacial inceptions, making it difficult to establish their benthic-isotope-based age model we have used in our study. The benthic δ 18 O record of ODP Site 1168 was tuned with the help of the original age model given in Nürnberg et al. (2004) as their isotope curve does not show the typical MIS11 pattern. The ODP Site 882 benthic δ 18 O record has the lowest resolution of all records used, and the tuning to the LR04 stack was guided by the age model given in Haug (1995). When only the MIS11 interval (370-430 ka) is considered, 52 records (91 %) show a correlation with the LR04 stack higher than r = 0.8. Four records (8 %) exhibit correlation coefficients between r = 0.6 and r = 0.8, and one record (ODP Site 1168) has a correlation of less than 0.6 (i.e., r = 0.19) ( Table 1). To determine the δ 18 O records' quality, the records were divided into classes depending on their temporal resolution. For records MD97-2142 and MD01-2443, the mean temporal resolution based on two different δ 18 O records was calculated (Table 1). Eleven records (19.3 %) have a temporal resolution higher than 1000 yr (Fig. 3a). A total of 21 records (36.9 %) have a temporal resolution between 1000 and 3000 yr, and 16 records (28.0 %) have a resolution between 3000 and 5000 yr. Nine of the records (15.8 %) have a low temporal resolution of less than 5000 yr (Fig. 3a). The mean temporal resolution of all records used is approximately 3000 yr. This is the value that was used as the minimum estimate of age uncertainty in the age-model sensitivity simulation.

Temporal resolution of SST proxy records
To evaluate the temporal resolution of SST records, the records have been divided into the same classes as described above for the δ 18 O records. For records having more than one SST record, the mean temporal SST resolution was calculated for this evaluation. A total of eight SST records (14.0 %) have a temporal resolution of more than 1000 yr (Fig. 3b), 24 records (43.0 %) a moderate temporal resolution between 1000 and 3000 yr, and 14 records (24.6 %) a resolution between 3000 and 5000 yr. Eleven SST proxy records (19.3 %) have a low resolution of less than 5000 yr (Fig. 3b). The mean temporal resolution is 3127 yr, indicating that the records should collectively be able to resolve orbital-scale variability, but not millennial variability.
Depending on the specific temporal resolution of each record, varying amounts of SST data points were available for the comparison of the proxy with modeled SST anomalies for the 390-420 ka time interval. From a total of 74 records used for this comparison, 26 records (35.1 %) provided only less than ten data points each for the total time interval (Fig. 3c). For 36 records (48.6 %) 10-40 data points per record were available, whereas nine records (12.1 %) contained more than 40 but less than 130 data points per record, and three records (4.1 %) provided more than 130 data points each.

Empirical orthogonal function analysis
EOF analysis (Fig. 4) of the 48 SST records spanning the entire target time interval revealed the existence of a strong commonality in the shape of the SST trends. The first three EOFs together explained around three quarters of the variability in the data, irrespective of the combination of agemodel and SST-proxy uncertainty and record selection (Table 3). Almost one half of the variability is explained by the first EOF, which describes a temporal trend of a rapid deglaciation, followed by a broad temperature optimum centered on 410 ka and a slow decrease of SST towards the end of MIS11. The second EOF explains nearly 19 % of the total variability and shows a delayed onset of the temperature optimum during MIS11 after 410 ka, followed by a prolonged warm period lasting beyond 380 ka. The third EOF explains around 8 % of the total variability and shows a cyclic pattern with a period of about 30 kyr.
For the first EOF, 42 % of the records have significant positive loadings > 0.75, 31 % of the records positive loadings between 0.5 and 0.75, and only four records (ODP Site 999, ODP Site 1168, RC11-210 and V22-174) show negative loadings (Fig. 5). The latter records reflect SST changes in the tropical Atlantic and Pacific, as well as in southeast-ern Australian coastal regions. In contrast, the loadings of the second EOF are more diverse and show a geographical pattern. Only three of the records show strong positive loadings > 0.75 to EOF2 and are primarily associated with SST changes in the mid-latitude North Atlantic region (IODP Site U1313) and the Mediterranean Sea (ODP Site 976) (Fig. 5). Records with positive loadings to EOF2 between 0.5 and 0.75 are further observed in the mid-latitude North Atlantic region (ODP Site 958), in the Caribbean Sea (V12-122), on the western coast of South Africa (ODP Site 1082), northwest of Australia (MD00-2361), and in the North Pacific (ODP Site 882, RC11-210). Finally, high loadings of the third EOF are limited to a few records, indicating that this EOF (and all subsequent EOFs) tends to express patterns specific for individual sites, rather than highlighting commonalities among the records.
Both the shape of the first two EOFs (Fig. 4) and the amount of variance explained by them are remarkably robust to age-model and temperature uncertainty ( Table 3). The temperature uncertainty has a stronger influence on the EOF robustness than the uncertainty of the age model. Compared with a temperature uncertainty of 1 • C, a temperature uncertainty of 4 • C reduces the variance explained by EOF1 from 49.0 to 35.9 % (Table 3). In contrast, an increase of age uncertainty from 3 to 6 kyr reduces the amount of variance explained by the first EOF by less than 1 % (Table 3). Similarly, a reduction of the number of records included in the analysis has a relatively small influence on the variance that is explained by the EOFs, as long as the subsampling is limited to more than 50 % of the total number of records (Table 3).
In contrast to the robustness of the first two EOFs, the third EOF scores are sensitive both with respect to temperature uncertainty and record selection (jackknifing). The cyclic signal of EOF3 loses significance already at temperature uncertainty of 2 • C and with jackknifing at the level of withholding around 25 records at a time (Fig. 4). This behavior is consistent with the EOF3 signal being associated with a small number of records characterized by low amplitude of the SST signal. We conclude that the third EOF, and by inference all subsequent ones, does not express any general climatic signals and is not interpreted further in this study.
The EOF analyses carried out separately for the 34 annual and 14 summer SST records show similar general trends for the first two components, but small temporal lags with respect to the joint analysis (Fig. S1). The lags amount to around 3-4 kyr, which coincides with the average temporal resolution of the records. The loadings of individual records in the separate analyses are largely similar to those in the joint analysis (Fig. S2).

Comparison of proxy with modeled SST
The most striking pattern when comparing the proxy-based SST anomalies with model results is the large difference in their variance. Whilst proxy-based SST anomalies range   Fig. 4. Results of an empirical orthogonal function (EOF) analysis of MIS11 SST records (Fig. 1, Table 1), including results of a sensitivity analysis with respect to age and temperature uncertainties. Re-calculations were made for age uncertainties of 3, 5 and 6 kyr, and for temperature uncertainties of 1, 2 and 4 • C. To test for sensitivity of the EOF to record selection, jackknifing was applied and 5, 15, 25 and 35 samples were randomly excluded from the data set (age and temperature uncertainties were set to 5 kyr and 1 • C). All calculations were made with 1000 iterations, and the confidence intervals are given for each analysis. For variances explained by the EOFs, see Table 3. Table 3. Variances explained by the first three empirical orthogonal functions (EOFs) of the MIS11 SST records, including 5 and 95 % confidence limits (see also Fig. 4). Re-calculations were made with varying age and temperature uncertainties and a random exclusion of records (jackknifing, jack5 means that five records were excluded per replication). Control run refers to analysis of raw data without consideration of uncertainties. EOF results presented in Fig. 8  from 4 • C (up to 6 • C), the modeled SST anomalies range irrespective of season or time slice rarely from more than 1 • C. This pattern is clearly seen in the comparison of the variances of the anomalies in proxy data and model results (Table 4). However, we determined whether or not the proxy pattern for each time slice can be considered as a significant deviation from "no change", by calculating the error ranges of the anomalies. In 55.6 % of all cases, zero is part of the confidence interval of the anomaly, meaning that in these cases the direction of the anomaly cannot be estimated accurately. Table 4. Correlation coefficients and Cohen's κ values between the proxy and model SST anomalies for the three MIS11 time slices (Fig. 7), as well as a comparison of variances in the proxy and model data (statistically significant values are given in italics). But notwithstanding the striking differences in the amount of SST change at the studied sites between the proxy data and the model, the direction of change and the regional patterns of anomalies show a number of similarities.
We have tested the validity of our seasonal attribution of the SST records with two randomization tests showing that none of the resulting correlation coefficients was equal to or larger than the one we originally observed when using the complete data set (Fig. S4). When using a reduced data subset of 30 records, only 1.4 % of the resulting correlation coefficients were equal to or higher than the one originally observed. We can thus state that our observed correlation cannot be significantly enhanced by disregarding the season assignment of the proxy data. It is therefore reasonable to assume that the season assignment is correct and does not bias our results. On this basis, the comparison between the proxy-based and modeled SST anomalies in MIS11 (Fig. 6) shows a generally better agreement for the boreal summer than for the boreal winter. For the boreal summer, a robust trend to colder (390-400 ka) and warmer temperatures (400-410 and 410-420 ka) can be observed for the northernmost Atlantic region (Fig. 6). Modeled positive SST anomalies in the (sub)tropics during the 390-400 ka time interval only partly agree with the SST anomalies recorded in the sediments, while negative SST anomalies modeled for the Southern Ocean region are in better accordance with the proxy data for this time slice. The general temperature increase simulated for the 400-410 ka time slice during the boreal summer season is also reflected by the positive SST anomalies in the proxy records except for one record in the tropical Pacific and the northernmost Atlantic. A modeled temperature increase in the Northern Hemisphere accompanied by a temperature decrease in the (sub)tropics can also be observed in the proxy data for the 410-420 ka interval, especially for the Atlantic Ocean and the South China Sea. For the tropical Pacific, the Caribbean Sea, and for two South Atlantic sites, the proxy data show opposite trends to the model data (Fig. 6).
The modeled negative SST anomalies in the Southern Hemisphere for the boreal winter season for the 390-400 ka time slice generally agree well with the proxy data, while model and proxy data for the (sub)tropical regions as well as the Northern Hemisphere partly show opposite trends. The simulated SST increase in the Northern Hemisphere and the (sub)tropics accompanied by negative anomalies in the Southern Hemisphere during the 400 to 410 ka time interval can also be observed from the proxy data for the Atlantic and Southern Ocean regions, while the observations disagree with the model data for the tropical Pacific (Fig. 6). Modeled positive SST anomalies in the Southern Hemisphere and negative temperature anomalies in the (sub)tropics are in agreement with most of the proxy records of the 410-420 ka interval, too, while, particularly in the North Atlantic, model and proxy data show opposite trends.
Despite the apparently good qualitative agreement between model and proxy data (Fig. 6), a quantitative comparison shows a different picture (Fig. 7, Table 4). A statistically significant correlation between proxy and model anomalies can only be observed for the boreal winter of the 400-410 ka time interval and for the summer of the 410-420 ka interval, with ρ = 0.47 (p = 0.021) and r = 0.59 (p = 0.020), respectively. For all other time slices, the seasonal and annual SST anomalies show low correlations between proxy and modeled data (Table 4, Fig. 7). Cohen's κ, which was used to test whether or not the proxy and modeled data are qualitatively in agreement, shows a moderate and significant (p = 0.025) agreement for boreal summer during the 410-420 ka interval. Fair agreement is given for the boreal summer during the 390-400 ka and for the boreal winter during the 400-410 ka time intervals, respectively, whereby these agreements are not statistically significant, with p values ranging between 0.185 and 0.251 (Table 4)

Global and regional climate trends in the proxy SST records
Under the limitation of the uncertainty in the alignment of the benthic oxygen isotope stack with insolation as carried out by Lisiecki and Raymo (2005), it is possible to explore the relationship between insolation forcing and the global SST pattern during MIS11 as revealed by the EOF1 scores (Fig. 8a). Such a comparison is justified, because the SST records at each site are co-registered with the stable oxygen isotope variations used to align the records temporarily but are in all cases entirely independent of these. This comparison reveals a rapid global temperature rise during Termination V occurring between 430 and 425 kyr. The highest global temperatures during MIS11 were reached at around 411 kyr when the high-latitude summer insolation was at its minimum. The SST cooling trend into the glacial inception started at around 405 kyr, corresponding to the onset of icesheet growth as indicated by benthic isotopes and the Red Sea sea-level curve (Fig. 8d). Across MIS11, the first EOF1 follows a consistent glacialinterglacial pattern with cold SSTs during MIS12 (> 430 ka) and MIS10 (< 370 ka), and a relatively long duration of warmer SSTs from 416 to 405 kyr. A similar trend is reflected in the Antarctic temperature change based on the deuterium record of the EPICA Dome C ice core   (Figs. 8b and 9b). However, the position of the interglacial temperature peak is offset with a temperature peak in the Antarctic record lagging the EOF1 signal as well as the mean global temperature peak calculated from all records (not shown here) by ∼ 4 kyr (Figs. 8b and 10). The EOF1 signal is similar to mean relative SST changes in the South Atlantic and the Southern Ocean (Fig. 8b), while North Atlantic records with high loadings to EOF1 are characterized by a slower temperature decrease during the late MIS11 period -a pattern more similar to the CO 2 record measured in the Dome C Antarctic ice core (Siegenthaler et al., 2005) (Figs. 8c and 9a). Compared to the δ 18 O sea-water record of ODP Site 1123 as a proxy for ice volume (Elderfield et al., 2012), it seems that the deglacial SST rise indicated by EOF1 preceded the reduction of the global ice volume by ∼ 5 kyr (Fig. 8d, e). This pattern remains even when the sea level development during MIS11 is approximated by the Lisiecki and Raymo (2005) stack or by the Red Sea stable isotope record by Rohling et al. (2009) (Figs. 8d, 9c and d), indicating a faster reaction of the surface ocean to insolation and greenhouse gas forcing than that of the slowly melting ice sheets.
The temporal resolution of our study and the level of temporal uncertainty make it difficult to address the phasing of different proxies during MIS11 at millennial timescales, but we note that the global SST trend indicated by EOF1 scores does not precede CO 2 (Fig. 8c). Further, the observed lag between the global SST peak and the temperature peak over Antarctica of around 4 kyr is robust to dating uncertainties (Figs. 8b, 10). This would indicate that during the interglacial, temperature over Antarctica was not as closely coupled to the global mean as during the deglaciation, perhaps reflecting more strongly the antiphased Southern Hemisphere insolation pattern .
The second EOF scores in our analysis follow a trend that differs from the global pattern and indicate a later establishment of a relative temperature maximum and a longerlasting period of warmer temperatures during late MIS11 and into MIS10 (Fig. 8e and f). This regional trend is primarily reflected in the SST records of the mid-latitude North Atlantic, the Mediterranean Sea as well as in one record south of Australia (Fig. 5). A similar trend, with delayed onset of interglacial conditions after Termination V and a longer-lasting interglacial optimum, has been recently reported also from the eastern Mediterranean (Maiorano et al., 2013). The apparent later onset of MIS11 optimum and the longer duration of interglacial warmth have been also noted by the authors of the individual records included in our compilation, particularly for records from the North Atlantic region. These authors hypothesized that the persistence of the northern ice sheets throughout MIS11 may have led to a dominant negative mode of the North Atlantic Oscillation (NAO) (Kandiano et al., 2012) whilst the associated sustained meltwater input in the (sub-)polar regions may have resulted in a less stable Atlantic meridional overturning circulation (AMOC; Voelker et al., 2010). In either case, these phenomena would lead to a reduced ocean heat transfer into the North Atlantic, causing a delayed optimum in the SST trends. Indeed, Dickson et al. (2009) conclude that a stronger AMOC during MIS11 was first established at 415 ka. Similarly, mean δ 13 C of benthic Foraminifera from water depths between 1100 and 2300 m in the North Atlantic that can be used as a proxy for NADW production (Lisiecki et al., 2008) shows increasing values from 425 to 405 ka, where the heaviest values were reached before only slightly decreasing until the end of MIS11 (Fig. 8f). This trend, which is indicative for enhanced NADW production between 410 and 400 ka, is quite similar to our EOF2 scores as well as to the mean relative temperature anomaly trends found in the records with high EOF2 loadings ( Fig. 8e and f). The persistence of longer-lasting warmer temperatures in the terrestrial high northern latitudes in the late MIS11 and into MIS10 has also been explained by a weaker Siberian High pressure system during times of insolation minima due to lower ice and snow accumulation rates, leading to weakened East Asian winter monsoon (EAWM) as reflected by the GT32 grain size distribution in Chinese loess sequences (Hao et al., 2012) showing a similar pattern to our EOF2 signal (Fig. 8f). However, other terrestrial records from lakes Baikal and El'gygytgyn (Propopenko et al., 2010;D'Anjou et al., 2013) do not show a longer-lasting warm phase during MIS11 in the northern  (Table 4). Higher positive correlations between the proxy and model data can be observed for the boreal summer seasons of the 390-400 and 410-420 ka time slices and for the boreal winter season of the 400-410 ka time slice. Note different x and y axes scaling. higher latitudes as observed by Hao et al. (2012) and our study.

Comparison of the climate variability between proxy and model SST
The observation of a much lower variance in modeled temperature trends when compared to paleo-data (Table 4) has been found in other studies where marine and terrestrial proxy have been compared with simulated temperature trends for the Holocene period (Brewer et al., 2007;Zhang et al., 2010;Sundqvist et al., 2010;O'ishi and Abe-Ouchi, 2011;Lohmann et al., 2013). In principle, such disagreement might be caused by an underestimation of temperature changes in climate models, as also suggested by, for example, Brewer et al. (2007), O'ishi and Abe-Ouchi (2011) and Lohmann et al. (2013), by an overestimated proxy SST variability, or a combination of both. Underestimation of climate variability in model simulations may be caused by shortcomings in the model physics (e.g., subgrid-scale parameterizations associated with clouds in the atmosphere or mixing in the ocean) and/or missing climate components (e.g., continental ice sheets) resulting in a lack of potentially important feedback mechanisms. At single sites, undersimulated SST variance may also be caused by too coarse grid resolution, such that, for example, shifts in oceanic fronts or upwelling zones are not resolved. Higher variance in the proxy data may result from noise and calibration uncertainties, or from uncertainties in seasonal and/or vertical attribution in the proxy records, as also suggested by Brewer et al. (2007) and Sundqvist et al. (2010) Fig. 8. Scores of the first EOF with their confidence intervals versus (A) orbital parameter (June insolation at 60 • N and obliquity (Laskar et al., 2004)), (B) relative temperature changes in the South Atlantic and Southern Ocean and Antarctic temperature changes recorded in the Dome C ice core during MIS11 [a]) (thin pink line: original data; thick pink line: smoothed curve), (C) mean relative temperature change in the North Atlantic records with high loadings to the first EOF and CO 2 concentration recorded in the Dome C ice core from the Antarctic (Siegenthaler et al., 2005;[b]), and (D) relative sea-level changes in the Red Sea (Rohling et al., 2009;[c]) and the LR04 benthic stack (Lisiecki and Raymo, 2005;[l]). Scores of the second EOF mean relative temperature changes in records with high positive loadings (> 0.75) on the second EOF versus (E) δ 18 O of sea water as proxy for ice volume (Elderfield et al., 2012;[d]), and (F) mean δ 13 C values in the North Atlantic as a proxy for North Atlantic deep water (NADW) strength and the East Asian winter monsoon signal as reflected by the GT32 grain size (content of > 32 µm particles) in loess sequences (Hao et al. 2012;; note inverse scale). For the mean of δ 13 C North Atlantic, records between 1100 and 2300 m water depth were selected according to Lisiecki et al. (2008). The δ 13 C records used here are from Shackleton and Hall (1984) [e],  [f], Venz et al. (1999) [g], Raymo et al. (1998) [h], McIntyre et al. (1999) [i], Kleiven et al. (2003) [j], and Raymo et al. (2004) [k]. The EDC ages of the records from the Dome C ice core in (B) and (C) were converted into LR04 ages (Lisiecki and Raymo, 2005;[l]) according to Parrenin et al. (2007) lead to a lowering of proxy variance with record resolution (assuming that the records are all subsampling a signal with similar spectral properties). In order to explore the potential causes of variance in the SST reconstructions by proxies, we plot SST variance against sampling resolution for all records using the proxymodel comparison (Fig. 11). The majority (eight) of the SST records with higher variability (Fig. 11) were reconstructed with the modern analog technique (MAT) and other transfer functions applied on microfossils such as radiolarians, diatoms and Foraminifera. The main assumptions when using microfossils for past SST estimates via transfer functions are that (1) the microfossil composition that is used to create a transfer function is systematically related to SST and (2) the ecology of the microfossil assemblages has not significantly changed since the time of interest. The presence of a variable SST during the MIS11 indicated by these methods thus could reflect the effects of nuisance parameters on the reconstructed SST or large shifts in the ecology of the microfossils. The latter seems unlikely on the timescale of one glacial cycle (although changes in seasonality and vertical habitats could have occurred and affected geochemical proxies), but the former could indeed be significant, especially where the fossil assemblages differed from the calibration data set, resulting in the detection of very different modern analogs with small changes in the assemblages. On the other hand, regression-based transfer function methods, such as the MAT and the Imbrie-Kipp method, are unlikely to yield SST www.clim-past.net/9/2231/2013/  (Siegenthaler et al., 2005), (B) Antarctic temperature change , (C) Red Sea relative sea level change (Rohling et al., 2009) and (D) the benthic LR04 stack (Lisiecki and Raymo, 2005). EDC ages of the Antarctic temperature record were converted into LR04 ages according to Parrenin et al. (2007). reconstructions with variance inflating the level of variance in the assemblage data. In our case, SST records based on MAT yield, for records with similar resolution, similar variance (Fig. 11), indicating that the variance of the reconstructions is unlikely to have been inflated due to the presence of no-analog faunas. In individual cases, the high variance in SST reconstructions by proxies can be attributed to nuisance variables. For example, Becquey and Gersonde (2002) concluded that carbonate dissolution may result in an over-or underestimation of SSTs when using Foraminifera with varying dissolution resistance for the application of transfer functions. These authors further conclude that their summer SSTs estimated with MAT for core PS2489 (used in this study) are overestimated by 6-7 • C for a short interval within MIS11. Whether or not the same can be said for all records in this study remains unclear.
Despite the large differences in variance and considering all the potential sources of uncertainty in the proxy-based SST values, it is remarkable that in several cases not only a visual agreement between the direction of SST change implied by data and models is similar, but also a positive relationship between the values of SST anomalies from both approaches can be observed (Figs. 6 and 7). Whereas it is likely that many of the proxies used could produce SST reconstructions systematically shifted from their a priori sea-   Fig. 10. Sea surface temperature optima during MIS11 calculated with various EOF analyses based on age and temperature uncertainties of 3-6 ka and 1-2 • C, respectively, and jackknifing versus temperature optimum observed in Antarctica  showing a lag of ∼ 4 kyr between the SST optimum calculated with EOF1 and the Antarctic temperature optimum during MIS11. Most EOF runs (74-89 %) given here as numbers (n) show an earlier SST optimum than that recorded in Antarctica. EDC ages of the Antarctic temperature record were converted into LR04 ages according to Parrenin et al. (2007). sonal or vertical attribution, the calculation of SST anomalies between the investigated time slices should largely reduce this problem, as long as the shifts in species ecology causing such misattribution remain temporarily stable. Apparently, especially for the boreal summer season reconstructions, the signal in the proxy-based SST anomalies resonated with processes captured by the CCSM3 model runs. Since these model runs differ mainly by orbital parameters (with greenhouse gas concentrations being largely similar, Table 2), it appears that orbital forcing has left a detectable signature in the global SST pattern during MIS11, despite its unusually low magnitude.

Conclusions
Using a compilation of SST records from 57 sites, aligned to a common timescale through oxygen isotope stratigraphy, and a series of CCSM3 model runs forced by greenhouse gas concentrations and orbital parameters, we investigated global patterns of MIS11 SST and their correlation with forcing mechanisms.
-An empirical orthogonal function analysis of 48 SST records revealed the presence of two main SST trends, which collectively explained nearly three quarters of the variation in the data set. We have shown that the results of this analysis are robust against sample selection and errors in the age model, but are more sensitive to SST uncertainty.
-The main SST trend describes the global glacialinterglacial pattern, showing rapid deglaciation followed by a broad climatic optimum culminating around 410 ka. The SST development during MIS11 optimum is not in phase with Antarctic temperature and CO 2 . This lag is the dominant signal in the second EOF. We speculate that this phase difference, which is most strongly manifested in the North Atlantic, may be explained by a later establishment of a stable AMOC in MIS11.
-The second EOF further differs from the global SST trend by a protracted warmer period during MIS11, lasting into MIS10. This regional trend may reflect not only a later onset but also a longer duration of a stable North Atlantic AMOC during MIS11.
-The comparison of the CCSM3 model with the proxy data shows that similar temperature trends can be found especially for the summer seasons. It further shows that the SST variability found in the proxy data is significantly higher than in the model. This is more likely a consequence of an underestimation of SST changes in climate models. Alternatively, SST reconstructions for MIS11 would have to be subject to pervasive influence of nuisance parameters, which vary at millennial timescales. The general agreement between proxy-derived and modeled SST anomalies indicates the MIS11 climate was responding to insolation forcing, despite the low orbital eccentricity.