Evaluation of modern and mid-Holocene seasonal precipitation of the Mediterranean and northern Africa in the CMIP5 simulations

We analyse the spatial expression of seasonal climates of the Mediterranean and northern Africa in pre-industrial (piControl) and mid-Holocene (midHolocene, 6000 yr BP) simulations from the fifth phase of the Coupled Model Intercomparison Project (CMIP5). Modern observations show four distinct precipitation regimes characterized by differences in the seasonal distribution and total amount of precipitation: an equatorial band characterized by a double peak in rainfall, the monsoon zone characterized by summer rainfall, the desert characterized by low seasonality and total precipitation, and the Mediterranean zone characterized by summer drought. Most models correctly simulate the position of the Mediterranean and the equatorial climates in the piControl simulations, but overestimate the extent of monsoon influence and underestimate the extent of desert. However, most models fail to reproduce the amount of precipitation in each zone. Model biases in the simulated magnitude of precipitation are unrelated to whether the models reproduce the correct spatial patterns of each regime. In the midHolocene, the models simulate a reduction in winter rainfall in the equatorial zone, and a northward expansion of the monsoon with a significant increase in summer and autumn rainfall. Precipitation is slightly increased in the desert, mainly in summer and autumn, with northward expansion of the monsoon. Changes in the Mediterranean are small, although there is an increase in spring precipitation consistent with palaeo-observations of increased growing-season rainfall. Comparison with reconstructions shows most models underestimate the mid-Holocene changes in annual precipitation, except in the equatorial zone. Biases in the piControl have only a limited influence on midHolocene anomalies in ocean–atmosphere models; carbon-cycle models show no relationship between piControl bias and midHolocene anomalies.
Biases in the prediction of the midHolocene monsoon expansion are unrelated to how well the models simulate changes in Mediterranean climate.


Introduction
The Mediterranean area, including southern Europe and northern Africa, is characterized today by a highly seasonal climate with summer drought and a wet season between October and March (Mehta and Yang, 2008). The generally low precipitation and marked seasonality give rise to drought-adapted, sclerophyllous vegetation that is highly susceptible to wildfire during the dry season (Moreira et al., 2011). The Mediterranean region has experienced warming and increased drought in recent years (Camuffo et al., 2010; Hoerling et al., 2012; European Environment Agency, 2012) and has been identified as highly vulnerable to future climate changes (Giorgi, 2006). Model projections indicate large increases in temperatures and a reduction in mean annual precipitation (e.g. Meehl et al., 2007; Giorgi and Lionello, 2008; Nikulin et al., 2011), both of which would lead to large changes in vegetation cover and exacerbate wildfires (Amatulli et al., 2013). Given the high socio-economic costs of such changes, it is important to assess the reliability of model projections. Measures of how well the models simulate modern climate do not provide a measure of whether the simulation of climate changes is realistic. However, the evaluation of model performance in the geologic past does provide a way of making such an assessment (Braconnot et al., 2012; Schmidt et al., 2014a).
A. Perez-Sanz et al.: Evaluation of modern and mid-Holocene seasonal precipitation of the Mediterranean
Published by Copernicus Publications on behalf of the European Geosciences Union.
The mid-Holocene (MH, 6000 yr BP) provides an opportunity to examine climate-model performance in the Mediterranean region. Palaeoenvironmental evidence suggests that the Mediterranean was wetter than today during the mid-Holocene. Lake levels across the region were higher than present (Kohfeld and Harrison, 2000; Magny et al., 2002; Roberts et al., 2008), indicating a more positive balance between precipitation and evaporation. Speleothem records also indicate increased precipitation compared to present. The observed expansion of deciduous trees (Prentice et al., 1996; Roberts et al., 2004; Carrión et al., 2010) across the region indicates that there was a change in rainfall seasonality with increased summer rainfall (Prentice et al., 1996). The observed decrease of fires in lowland areas, coupled with an increase in fires at higher elevations, is consistent with more humid conditions, which would suppress fires in already forested lowland regions but allow them to increase as forests expanded into higher elevation areas (Vannière et al., 2011). The changes in climate were spatially complex, but pollen-based climate reconstructions (e.g. Cheddadi et al., 1997; Davis et al., 2003; Bartlein et al., 2011) show that most of the Mediterranean region was characterized by a year-round decrease in temperature and an increase in plant-available moisture.
Systematic comparisons with observations have shown that global climate models are unable to reproduce the observed MH patterns of rainfall changes in the Mediterranean. In particular, they do not show a sufficiently large increase in summer rainfall to explain the shift towards deciduous vegetation. This was identified as a problem in atmosphere-only simulations of the mid-Holocene made during the first phase of the Palaeoclimate Modelling Intercomparison Project (PMIP1; see e.g. Masson et al., 1999; Guiot et al., 1999; Bonfils et al., 2004). Coupled ocean-atmosphere simulations made during PMIP2 were able to simulate the types of climate changes seen in the Mediterranean, but the geographic placement of these climate types, the spatial extent and the magnitude of the changes were not well captured. In particular, the simulated changes in precipitation are small and insufficient to explain the observed expansion of deciduous forests in the region.
The Mediterranean climate involves a complex interaction between different processes acting at several different spatio-temporal scales (Xoplaki et al., 2003; Luterbacher et al., 2006; CLIVAR, 2010; Lionello, 2012). However, interannual variability in Mediterranean summer precipitation is linked to variability in the strength of the Afro-Asian monsoon system (Rodwell and Hoskins, 2001; Raicich et al., 2003; Gaetani et al., 2011). Analyses of climate model simulations of the present day suggest that Mediterranean summer precipitation is suppressed during years when the Afro-Asian monsoon system is strong. This results from intensification of the Hadley cell and enhanced subsidence in the subtropics (i.e. strengthening of the Azores High), leading to high pressure over the eastern Mediterranean which results in decreased rainfall (Gaetani et al., 2011). However, when monsoon intensification is accompanied by northward movement of the intertropical convergence zone, as model simulations indicate occurred in the mid-Holocene (Braconnot et al., 2007a; Marzin and Braconnot, 2009), the Azores High is also displaced northeastward and weakened (e.g. Harrison et al., 1992). This has been shown to have a significant impact on precipitation in eastern North America (Forman et al., 1995; van Soelen et al., 2012) and could potentially lead to increased summer rainfall in the Mediterranean region.
The PMIP2 simulations show a significant enhancement and northward expansion of the African monsoon during the mid-Holocene in response to changes in insolation forcing (Braconnot et al., 2007a). However, comparisons with pollen-based estimates of the change in mean annual precipitation (Bartlein et al., 2011) show that the models underestimate the increase in precipitation by between 20 and 50 % (Braconnot et al., 2007a, 2012). Most models fail to produce a sufficient northward expansion of the monsoon. This underestimation of monsoon expansion is also present in the CMIP5 (Coupled Model Intercomparison Project) MH simulations. It is possible that this bias in the simulation of the African monsoon is linked to the failure to simulate the MH Mediterranean climate accurately, since larger shifts in the position of the monsoon are produced by models incorporating land-surface feedbacks and/or with higher spatial resolution (Levis et al., 2004; Wohlfahrt et al., 2004; Bosmans et al., 2012).
MH model simulations, made with the same models that are used for future projections, have been made as part of the fifth phase of the Coupled Model Intercomparison Project (CMIP5; Taylor et al., 2012) and are being analysed as part of the third phase of the Palaeoclimate Modelling Intercomparison Project (PMIP3; Braconnot et al., 2012). Kelley et al. (2012) have shown that the simulation of the seasonal cycle of precipitation in the Mediterranean region under modern conditions is reasonable, although as in earlier versions of the models the amplitude of the cycle is more muted than observed, with too little rain in winter and too much rain in summer (Brands et al., 2013). However, evaluation of CMIP5 model performance against modern observations suggests that some aspects of the simulation of the Afro-Asian monsoons (see e.g. Monerie et al., 2012; Roehrig et al., 2013; Sperber et al., 2013) are improved compared to earlier versions of the models, although preliminary assessments of the CMIP5 models indicate that improvements in the modern simulations do not translate into improvements in the simulation of the MH monsoon climate, and thus, given the dynamic links between the monsoon and Mediterranean precipitation, in MH Mediterranean climate changes. In this study, we examine the performance of the CMIP5 models for modern and MH climates, and compare the simulated climates with modern and palaeo-observations. This allows us to assess whether biases in the control simulations influence the MH simulations and to investigate whether regional biases in the simulation of MH monsoon changes influence model performance in the Mediterranean.
Clim. Past, 10, 551-568, 2014, www.clim-past.net/10/551/2014/

Methods
We present analyses of the pre-industrial (piControl) and MH (midHolocene) simulations made by 12 coupled ocean-atmosphere models from CMIP5. In order to investigate whether biases in the control simulation influence the realism of the midHolocene climates, we first evaluate the piControl simulation. We use modern observations from the CRU TS3.1 data set, in the absence of climate reconstructions from northern Africa for the piControl interval. The piControl simulation is driven by boundary conditions appropriate for 1850 AD, but comparisons with a subset of transient historical simulations show that the spatial patterns and magnitudes of seasonal climates are very similar. In order to evaluate whether models capture the spatial expression of specific seasonal patterns, we define a number of climate types using the modern observations and apply these definitions to delimit these climate types in the piControl and midHolocene simulations. We evaluate the midHolocene simulations using quantitative climate reconstructions derived from pollen records. Although there are many kinds of palaeorecords that indicate that northern Africa and the circum-Mediterranean region were wetter during the mid-Holocene, including e.g. lake-level and archaeological records, these other sources of information do not provide quantitative estimates of the change in precipitation. Comparisons of simulated and observed climates are based on the simulated precipitation both within climate zones and within geographic zones.

Data sources: CMIP5 simulations
We examine precipitation changes between a mid-Holocene (midHolocene, 6000 yr BP) equilibrium simulation and a control simulation representing pre-industrial conditions (piControl) using 12 models from CMIP5. Both the midHolocene and piControl are equilibrium simulations. We use the midHolocene and piControl simulations in the CMIP5 (http://cmip-pcmdi.llnl.gov/cmip5/dataportal.html) archive as of 15 August 2012 (Table 1). Seven of these simulations are made with ocean-atmosphere (OA) models, and the other five models include an interactive carbon cycle (OAC). The piControl simulation has boundary conditions (insolation, greenhouse gas concentrations) appropriate for 1850 CE (common era). The midHolocene experiment shows the response to changes in the seasonal and latitudinal distribution in insolation 6000 yr ago; greenhouse gas concentrations are set at piControl levels (for details of the experimental protocol see Taylor et al., 2012, and Braconnot et al., 2012). To assess whether the piControl state differs from recent observed climates, we used outputs from a historical simulation (historical: 1850-2005 CE) available for six of the models. The historical simulation is forced by time-varying changes in solar, volcanic, and greenhouse gases (Taylor et al., 2012; Braconnot et al., 2012).
The output from each model was interpolated to a common grid (0.5°) using bilinear interpolation to facilitate comparisons and the calculation of zonal averages. Long-term mean monthly, seasonal, and annual precipitation values were obtained by averaging the last 100 yr of the piControl and midHolocene simulations, except in the case of HadGEM2-CC where only 35 yr of midHolocene simulated outputs are available. Long-term means of the six historical simulations were obtained by averaging the last 30 yr of each simulation. All averages were areally weighted (by the area of the model grid cells).
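The areal weighting described above can be illustrated with a minimal sketch. It assumes a regular latitude-longitude grid, on which cell area is approximately proportional to the cosine of the cell-centre latitude; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math

def area_weighted_mean(values, lats_deg):
    """Mean of grid-cell values weighted by cos(latitude).

    Assumption: cells lie on a regular lat-lon grid, so cos(latitude)
    of the cell centre is used as a proxy for relative cell area.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    return weighted_sum / sum(weights)

# Hypothetical precipitation totals (mm) at three 0.5-degree cell centres.
cell_lats = [0.25, 30.25, 60.25]
cell_precip = [100.0, 100.0, 400.0]
print(area_weighted_mean(cell_precip, cell_lats))
```

Because the high-latitude cell covers less area, the weighted mean here is lower than the unweighted mean of the same three values.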

Data sources: modern and mid-Holocene climate data
Observations of the modern climate are taken from the CRU TS3.1 data set (Harris et al., 2014), which provides monthly precipitation values on a 0.5° grid for the interval 1850-2006. We have created a monthly precipitation climatology using data from January 1961 through to December 1990. Zonal averages are constructed by areally weighting the gridded values. Bartlein et al. (2011) provide quantitative reconstructions of mean annual precipitation (MAP), expressed as anomalies from the present, from pollen and plant macrofossil records. The original site-based reconstructions were averaged to provide gridded values on a 2° × 2° grid, and differences between the site reconstructions within each grid were used to provide an estimate of reconstruction uncertainty (as a pooled estimate of the standard error). The data set provides mid-Holocene estimates of MAP anomalies for 62 cells (out of a possible 397 cells) within the area of interest (latitude: 0-45° N, longitude: 20° W-30° E).
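As a rough illustration of how site-based reconstructions can be averaged onto a 2° × 2° grid, the sketch below bins sites by the south-west corner of their cell and takes the cell mean. It is not the procedure of Bartlein et al. (2011): the function name, the tuple layout and the sample sites are hypothetical, and the pooled standard-error estimate is not reproduced.

```python
import math
from collections import defaultdict

def grid_site_anomalies(sites, cell_deg=2.0):
    """Average site anomalies within each cell of a regular grid.

    sites: iterable of (lat, lon, map_anomaly_mm) tuples (hypothetical layout).
    Returns a dict keyed by the (lat, lon) of each cell's south-west corner.
    """
    cells = defaultdict(list)
    for lat, lon, anomaly in sites:
        key = (math.floor(lat / cell_deg) * cell_deg,
               math.floor(lon / cell_deg) * cell_deg)
        cells[key].append(anomaly)
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}

# Hypothetical sites: two fall in the same 2-degree cell, one in another.
example_sites = [(10.5, -3.2, 100.0), (11.9, -2.1, 200.0), (30.1, 5.0, 50.0)]
gridded = grid_site_anomalies(example_sites)
print(gridded)
```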

Definition of climate regions
Precipitation regimes can be characterized by a combination of the form of the seasonal cycle, seasonal concentration, and magnitude. We determined these characteristics of modern precipitation (using the CRU TS3.1 data set) for zonally averaged 5° latitude bands between 0 and 45° N. The seasonal cycle of precipitation in each 5° latitude band was characterized according to the number of distinct rainfall peaks present in the 12 month precipitation climatology, using the R package "pastecs" to determine whether there was a significant "pit" or "peak" in any month. A pit or peak is considered significant if the probability of turning points occurring in a random series is < 0.05, given by where n is the number of observations at time t (Ibanez, 1982). We calculated the total precipitation in each season (spring: March, April, May; summer: June, July, August; autumn: September, October, November; winter: December, January, February) and for the whole year. A measure of seasonal concentration was calculated following Kelley et al. (2013), where the magnitude of precipitation in each month is represented by the length of a vector in the complex plane and the direction of the vector represents the timing (with January set to 0°). The length of the mean vector divided by the annual precipitation provides an index of seasonal concentration (C), where C is 1 when the precipitation is concentrated in a single month and 0 when it is evenly distributed throughout the year.
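The seasonal concentration index can be sketched as follows. This is one reading of the description above, not the authors' code: each month is assigned a direction of 30° per month starting from January at 0°, the monthly vectors are summed, and the length of the resultant is divided by the annual total so that C runs from 0 to 1. The function name and the returned phase angle are illustrative additions.

```python
import cmath
import math

def seasonal_concentration(monthly_precip):
    """Return (C, phase_deg) for 12 monthly totals, January first.

    C is the length of the summed monthly vectors divided by the annual
    total; phase_deg is the direction of the resultant (January = 0 degrees).
    """
    annual = sum(monthly_precip)
    if annual == 0:
        return 0.0, None  # no rainfall: concentration is undefined, report 0
    resultant = sum(p * cmath.exp(2j * math.pi * m / 12)
                    for m, p in enumerate(monthly_precip))
    c = abs(resultant) / annual
    phase_deg = math.degrees(cmath.phase(resultant)) % 360
    return c, phase_deg

# All rain in April gives C close to 1; an even distribution gives C close to 0.
single_month = [0, 0, 0, 120, 0, 0, 0, 0, 0, 0, 0, 0]
even_year = [50] * 12
print(seasonal_concentration(single_month))
print(seasonal_concentration(even_year)[0])
```

When C is close to 0 the phase angle carries no useful information, since the resultant vector is then dominated by rounding error.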
We applied these definitions to determine the position of different precipitation regimes in the piControl and mid-Holocene simulations. Comparison of the observed limits and those identified in the piControl allows us to examine (a) whether the models produce these distinctive precipitation regimes and (b) how well they simulate their placement independently of whether they simulate the correct magnitude of precipitation. Comparison of the piControl and mid-Holocene limits allows us to characterize shifts in precipitation regimes, again, independent of changes in precipitation magnitude.

Analyses of the model simulations
We evaluate model performance for the piControl simulation in two steps. First we examine whether the models reproduce the spatial extent of different precipitation regimes, and then we examine whether they reproduce the magnitude of total annual and of seasonal precipitation. Long-term means for the period 1961-1990 from the CRU TS3.1 data set (Harris et al., 2014) are compared with long-term averages for the last 100 yr of the piControl. The standard deviation (SD) of the observations provides a measure of the significance of the difference between observations and simulations. We examine the differences between simulated and observed climate for the latitude band corresponding to a given precipitation regime in the observations. We also compare the differences in the amount of precipitation for the geographic region identified as falling within a specific precipitation regime in each model, which may be less/more extensive than the region identified in the observations.
We also examine the change in precipitation in the mid-Holocene in two steps. First we identify the spatial extent of each precipitation regime in the midHolocene simulations and compare this with the spatial extent shown in the piControl simulation of the same model. This allows us to identify whether there have been shifts in the precipitation regimes. We then examine the magnitude of the precipitation change in the latitude band characterized by a specific regime in both the piControl and the midHolocene simulations for each model. This allows us to identify whether there has been a change in precipitation in situ. We use the standard deviation of the piControl simulation for each model to determine whether the change between midHolocene and piControl is significant.
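The significance check described above can be sketched as a comparison of the midHolocene minus piControl anomaly with the interannual standard deviation of the piControl run. The one-standard-deviation threshold, the function name and the sample series are assumptions made for illustration; the text does not state the exact criterion.

```python
import statistics

def midholocene_anomaly(pi_years, mh_years):
    """Return (anomaly, pi_sd, exceeds_sd) for two series of annual totals.

    Assumption: a change counts as significant when its absolute value
    exceeds one standard deviation of the piControl annual values.
    """
    anomaly = statistics.fmean(mh_years) - statistics.fmean(pi_years)
    pi_sd = statistics.stdev(pi_years)
    return anomaly, pi_sd, abs(anomaly) > pi_sd

# Hypothetical annual precipitation (mm) for one regime in one model.
pi_series = [500.0, 520.0, 480.0, 510.0, 490.0]
mh_series = [600.0, 610.0, 590.0, 620.0, 580.0]
print(midholocene_anomaly(pi_series, mh_series))
```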
We examine whether the biases in simulated precipitation (both the bias in spatial extent of a given precipitation regime and the bias in the magnitude of the simulated precipitation) influence the simulated change in precipitation between piControl and midHolocene. The bias and anomaly values have been obtained firstly for discrete geographical zones (the zones characterized by different rainfall regimes today, as defined from the CRU data set) and secondly for the model-defined regions of these different rainfall regimes (e.g. the region where the simulated rainfall is of the monsoon type). We use linear regression to examine the relationship between precipitation biases and anomalies for all models, and for the OA and OAC classes of models. The realism of the simulated change in precipitation (midHolocene − piControl) is assessed by comparing with reconstructions of mean annual precipitation (MAP) from the Bartlein et al. (2011) data set. Comparisons are made by averaging the simulated precipitation for the grid cells where there are observations within each 5° latitude band. There are sufficient data in most of the 5° latitude bands to make robust comparisons.
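A bias-versus-anomaly regression of this kind can be sketched with ordinary least squares. The function name and the sample values below are hypothetical and are not results from the paper; the sketch returns only the slope, intercept and R², and a p value would need an additional t-test (library routines such as `scipy.stats.linregress` report one).

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2

# Hypothetical piControl biases and midHolocene anomalies (mm) for four models.
bias = [0.0, 1.0, 2.0, 3.0]
anomaly = [1.0, 3.0, 5.0, 7.0]
print(fit_line(bias, anomaly))
```

Fitting the OA and OAC subsets separately, as in the text, amounts to calling the same function on each subset of models.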

Modern observed climate
The modern climate of the region can be divided into four distinct latitudinal zones, differentiated by marked differences in the seasonal distribution and amount of rainfall (Fig. 1). In the south, the equatorial band is characterized by high rainfall (∼ 1800 mm) throughout the year (Fig. 2) but with peaks in precipitation in spring (∼ 460 mm) and autumn (∼ 600 mm) and less rainfall in summer. This pattern reflects the seasonal migration north and south of the intertropical convergence zone. The "double-peak" rainfall pattern (hereafter DP) occurs between 0 and 5° N. The region further north (5-20° N) is characterized by summer monsoonal rainfall and dry winters. The amount of rainfall declines progressively from ca. 650 mm in summer (June, July, August) in the south to less than 100 mm in the north. The desert area (20-30° N) is characterized by low rainfall (< 100 mm yr⁻¹). There is no pronounced seasonal differentiation of rainfall in the desert, although southern regions tend to have slightly more rain in summer than winter and northern regions slightly more rainfall in winter than summer. The Mediterranean zone (30-45° N) is characterized by higher rainfall, increasing from 200 mm yr⁻¹ in the southern band to 780 mm yr⁻¹ in the north. The rainfall is concentrated in the winter half-year, with a pronounced summer drought.

piControl simulations
These four rainfall regimes can generally be identified in the piControl simulations, although two of the models (CNRM-CM5, MRI-CGCM3) fail to reproduce the DP pattern in the equatorial zone. However, several models represent the spatial extent of the regimes poorly. Thus 5 out of the 12 models show monsoon penetration further north than observed (Fig. 3a). Most models place the northern limit of the desert correctly, but two models (CSIRO-Mk3L-1-2, IPSL-CM5A-LR) show the area of low rainfall and low rainfall seasonality extending further north than observed.
Since there are no reconstructions of pre-industrial climate, we evaluate how well the models reproduce the magnitude of seasonal precipitation within each precipitation regime by comparing to observations for the period 1961-1990. Comparison of the piControl and historical simulations (Fig. S1, Supplement), for the six models where both runs are available, shows that differences in the simulated patterns and amount of precipitation between the two simulations are small. Differences between the two simulations are generally much smaller than the difference between the simulated and observed climate.
Figure caption: Observed seasonal cycle of precipitation in each of the defined climate zones, using the CRU TS3.1 data set (Harris et al., 2014). The mean precipitation each month (mm) is shown by the black line, with the standard deviation shown by the bars. The grey shading shows the maximum and minimum rainfall experienced within the observation period. Note that the scale for the desert region differs from that used for the other regions. Months are numbered consecutively from January (1) through to December (12).
Figure 3. The location of the four precipitation zones in the CMIP5 piControl simulations compared to the limits defined using the CRU TS3.1 data set (Harris et al., 2014). The precipitation regime was characterized using zonally averaged long-term means for 5° latitude bands. The location of the four precipitation zones in the CMIP5 midHolocene simulations compared to the limits defined using the CRU TS3.1 data set (Harris et al., 2014). The precipitation regime was characterized using zonally averaged long-term means for 5° latitude bands.
Figure caption: Comparison of simulated and observed mean annual and mean seasonal precipitation (mm) for each of the defined precipitation regimes (Mediterranean, desert, monsoon, double peak). The simulated precipitation (mean and standard deviation) is shown for both the climate zone as defined by the observations (solid line) and as defined in the piControl simulation itself (dotted line). The difference between these two lines for each model provides a measure of the degree to which incorrect placement of a given climate affects the zonal means. The grey bars represent one standard deviation of the mean annual and mean seasonal precipitation from observations. The seasons are defined as spring, summer, autumn and winter (as in Sect. 3.1).
Comparison of the piControl with modern observations shows that most models fail to reproduce the magnitude of the precipitation (Fig. 4). Only two models (CSIRO-Mk3L-1-2, MPI-ESM-P) correctly reproduce the amount of rainfall in the DP band, while six models overestimate the rainfall by between 350 and 790 mm yr⁻¹. Although some models overestimate the amount of precipitation in every season, the positive biases are largest in spring (75-290 mm), autumn (90-325 mm) and winter (50-290 mm). Only two models (GISS-E2-R, CNRM-CM5) simulate the correct magnitude of mean annual precipitation in the monsoon zone. Seven models underestimate, and three models overestimate, the mean annual rainfall in the monsoon zone. The bias ranges from 280 mm less than observed to 270 mm more than observed. Models that underestimate the total amount of rainfall in the monsoon zone (e.g. BCC-CSM1.1, CSIRO-Mk3L-1-2, HadGEM2-CC, HadGEM2-ES, IPSL-CM5A-LR, MPI-ESM-P and MRI-CGCM3) do so because they simulate too little precipitation in summer and autumn, i.e. because the simulated monsoon is too weak. However, models that overestimate the total precipitation in this zone overestimate the rainfall in all seasons of the year. Seven models simulate too much precipitation in the desert zone (10-55 mm yr⁻¹), with too much rainfall in spring, summer and autumn. Given that the desert zone is by definition confined to regions with < 100 mm precipitation, the overestimation of rainfall in this zone is large. Four models underestimate the Mediterranean precipitation (by between 35 and 90 mm yr⁻¹), because of underestimation of the autumn and winter rainfall, although they overestimate the summer rainfall. However, the IPSL-CM5A-LR and GISS-E2-R models overestimate total precipitation in this region: GISS-E2-R produces too much rainfall in spring (45 mm), summer (100 mm) and autumn (60 mm), while IPSL-CM5A-LR simulates too much rainfall (130 mm) in summer only.
Comparison of results from models that correctly simulate the location of each regime (compared to the observations) and those in which the area characterized by a given regime is too extensive or too small shows that the biases in simulated precipitation are not related to whether models reproduce the spatial location of each regime correctly.

Mid-Holocene simulation
The location of the DP regime does not change between the piControl and midHolocene simulations of most (9) of the models (Fig. 3b). The two models (CNRM-CM5, MRI-CGCM3) that failed to simulate a DP pattern in the equatorial zone in the piControl nevertheless simulate this pattern in the midHolocene experiment. However, in the IPSL-CM5A-LR midHolocene simulation, the precipitation in the equatorial zone is more monsoon-like than in the model's piControl simulation. Most of the models (6) show no change in the northern limit of the monsoon; four models (CCSM4, IPSL-CM5A-LR, MRI-CGCM3, CNRM-CM5) show a northward displacement of the northern limit of the monsoon, while two models (BCC-CSM1.1, MRI-CGCM3) show a southward displacement of the northern limit of the monsoon as a result of southward expansion of the desert regime. Only two models (BCC-CSM1.1, MRI-CGCM3) show a northward displacement of the northern limit of the desert zone. Thus, in most of the midHolocene simulations, the desert regime occupies either a similar (5 models) or a slightly contracted area (4 models) compared to the piControl. Only one model (GISS-E2-R) shows a southward expansion of the Mediterranean precipitation regime; otherwise, this zone occupies the same position as in the piControl simulations.
We necessarily confine our comparisons of the magnitude of changes within each precipitation regime to those models that simulate a given regime in both the piControl and midHolocene simulations. The changes in the DP regime are not consistent and in general do not exceed the variability shown by the piControl. Only two models (CSIRO-Mk3-6-0, MIROC-ESM) show a significant reduction in precipitation (of 200 and 250 mm, respectively) in the midHolocene compared to the piControl (Fig. 5; Table 2). In the case of the CSIRO-Mk3-6-0 model, this is the result of a large decrease in autumn precipitation, but in the case of MIROC-ESM the decrease is concentrated in the spring.
The monsoon zone is characterized by a significant increase in precipitation, except in the case of the CSIRO-Mk3-6-0 model. The anomalies range from +50 to +200 mm yr⁻¹, reflecting large increases in summer (15-140 mm) and autumn (20-250 mm) rainfall. Changes in winter and spring precipitation are not significant. Most models show an increase in mean annual precipitation in the desert regime (5-35 mm) as a result of increased summer and autumn rainfall, but the change only exceeds piControl variability in three cases (CCSM4, GISS-E2-R and MIROC-ESM). Most of the models (11) show an increase in mean annual precipitation (10-75 mm) in the Mediterranean regime, although this increase only exceeds the piControl variability in the case of the GISS-E2-R model. The simulated increase in mean annual precipitation in the GISS-E2-R model results from an increase in spring, summer and autumn and a negligible change in winter. All of the models show an increase in spring precipitation, and two models (IPSL-CM5A-LR, HadGEM2-CC) show an increase in summer rainfall accompanied by either a small increase or no change in winter.

Comparison of midHolocene simulations and mid-Holocene observations
Reconstructions of the change in mean annual precipitation in the mid-Holocene (Fig. 6)
Figure caption: Simulated changes in total and seasonal precipitation in the midHolocene compared to the piControl for each of the four precipitation regimes (Mediterranean, desert, monsoon, double peak) for the region that is common between the two sets of simulations. The standard deviation of precipitation in the piControl simulation of each model is shown (grey bars) to provide a visual measure of the significance of the simulated change in precipitation. The seasons are defined as spring, summer, autumn and winter (as in Sect. 3.1).

Comparison between bias and anomaly
Comparison of the piControl bias and midHolocene anomaly suggests that model performance in the control simulations directly affects model performance in the midHolocene simulations in the DP, desert and Mediterranean regions (Fig. 7). In the DP region, there is a significant negative correlation (Fig. 7, all models, black line: slope = −0.23, R² = 0.74, p = 0.0) between the bias and the anomaly: models that overestimate precipitation in the piControl show the largest reductions in precipitation in the midHolocene simulations (e.g. BCC-CSM1.1, CSIRO-Mk3-6-0 and MIROC-ESM). The overall relationship is driven by the OAC simulations (red line: R² = 0.88, p = 0.02); the slope for the OA models is not significant (blue line; R² = 0.37, p = 0.2). Indeed, as examination of these relationships in model-defined DP regions shows, the negative relationship shown by the OA models in the 0-5° N band is driven by the two models that simulate monsoon-like regimes in this zone in the piControl. There is no relationship between the piControl bias and the midHolocene anomaly in the monsoon zone (Fig. 7), whether this is defined geographically (slope = 0.00, R² = 0.0, p = 0.98) or using the model-based regimes (slope = 0.08, R² = 0.05, p = 0.49). Thus, the ability to simulate the correct magnitude of modern precipitation appears to have no influence on the magnitude of the response of the monsoon to changed forcing. However, the OA and OAC models appear to show opposite tendencies: the OA models show a weakly positive relationship between the bias and the anomaly (models that simulate less rainfall than observed in the piControl produce smaller MH anomalies) whereas the OAC models show a (very) weakly negative relationship.
www.clim-past.net/10/551/2014/ Clim. Past, 10, 551-568, 2014

There is a significant positive correlation between the piControl bias and midHolocene anomaly in the desert region (Fig. 7). This is true whether the region is defined geographically (slope = 0.32, R² = 0.58, p = 0.01) or using the model-defined desert regimes (slope = 0.32, R² = 0.48, p = 0.02). Models that produce a reasonable simulation of modern rainfall in this region fail to produce a significant enhancement in the midHolocene simulation (CSIRO-Mk3L-1-2, HadGEM2-CC, IPSL-CM5A-LR) whereas models that are too wet in the piControl produce large changes in the mid-Holocene (CCSM4, GISS-E2-R and MIROC-ESM). However, these relationships are driven by the OA simulations; the OAC simulations do not show any significant relationship between the piControl bias and the midHolocene anomaly.
There is also a significant positive correlation between bias and anomaly in the Mediterranean region (Fig. 7), whether the region is defined geographically (slope = 0.14, R² = 0.58, p = 0.01) or using the model-defined regimes (slope = 0.15, R² = 0.48, p = 0.02). Models that underestimate precipitation in this zone in the piControl show only small increases in the midHolocene (BCC-CSM1.1, CCSM4 and MPI-ESM) while models with positive bias (GISS-E2-R and IPSL-CM5A-LR) produce larger changes in precipitation. However, the relationship for the OAC simulations is again non-significant.

Even in those regions where there are significant relationships between piControl bias and the midHolocene anomaly, the R² value ranges from 0.48 to 0.75. Thus, the bias in the piControl is not the only factor that determines whether the simulated magnitude of the MH climate change is correct. Furthermore, biases in the piControl appear to have less (or no) influence on the simulated midHolocene anomaly in the OAC simulations, except in the DP zone.
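The slope and R² values quoted in this section come from a linear fit of the midHolocene anomaly against the piControl bias across models. A minimal sketch of such a fit is given below, assuming ordinary least squares; the function name `bias_anomaly_fit` and all numerical values are hypothetical and are not taken from the simulations. The p value would additionally require a t distribution, which is left to a statistics library.

```python
def bias_anomaly_fit(bias, anomaly):
    """Ordinary least-squares fit of anomaly on bias.

    Returns (slope, intercept, r_squared). Inputs are equal-length
    sequences with one value per model.
    """
    n = len(bias)
    if n != len(anomaly) or n < 3:
        raise ValueError("need at least three paired values")
    mean_b = sum(bias) / n
    mean_a = sum(anomaly) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((b - mean_b) ** 2 for b in bias)
    syy = sum((a - mean_a) ** 2 for a in anomaly)
    sxy = sum((b - mean_b) * (a - mean_a) for b, a in zip(bias, anomaly))
    if sxx == 0 or syy == 0:
        raise ValueError("bias and anomaly must both vary across models")
    slope = sxy / sxx
    intercept = mean_a - slope * mean_b
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical piControl bias and midHolocene anomaly (mm/yr), one per model.
# These numbers are illustrative only and do not come from the paper.
bias = [-200.0, -100.0, 0.0, 100.0, 200.0]
anomaly = [60.0, 20.0, 10.0, -30.0, -60.0]

slope, intercept, r_squared = bias_anomaly_fit(bias, anomaly)
print(f"slope = {slope:.2f}, R2 = {r_squared:.2f}")
```

A negative slope here would correspond to the DP-type behaviour described above, where models that are too wet in the control show the largest reductions; fitting the OA and OAC subsets separately only requires calling the function on each subset.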

Discussion
Our analyses suggest that the CMIP5 models fail to reproduce key aspects of both the modern and MH climate of the northern Africa and Mediterranean region. Although the models generally reproduce the four characteristic seasonal patterns of precipitation, they do not always simulate these patterns in the correct place. They also tend to underestimate the magnitude of seasonal changes in precipitation. For example, they underestimate the amount of winter rainfall and overestimate the summer rainfall in the Mediterranean region. This is consistent with previous analyses of Mediterranean climates in both the CMIP3 (Giorgi and Lionello, 2008) and CMIP5 (Kelley et al., 2012) simulations. The models overestimate the precipitation in the DP zone, again a feature identified from previous analyses (Roehrig et al., 2013). Previous analyses of the CMIP5 models (e.g. Roehrig et al., 2013; Brands et al., 2013) have suggested that there is a tendency for models to underestimate precipitation in the Sahel zone. While our analyses confirm this, with 8 out of 12 models showing less summer precipitation than observed, some of the models (e.g. CSIRO-Mk3L-1-2, BCC-CSM1.1) show a distinct improvement when the comparison is made between regions defined by precipitation regimes rather than geographically (Fig. 4). Furthermore, the temporal interval used for comparison also plays a role: MIROC-ESM, for example, simulates summer precipitation correctly but annual rainfall is too large because the simulated monsoon season is too long.

Our evaluations are based on the assumption that the difference in climate between the piControl (1850 AD) and the 1961-1990 modern climatology is small. Comparisons of the piControl and historical simulations (Fig. S1, Supplement) for a sub-set of the models appear to support this assumption: the differences between the simulations are smaller than the difference between the simulated and observed climates. There is no synthesis of data for the pre-industrial era from northern Africa, but data from the Mediterranean region do not suggest substantial differences (e.g. Davis et al., 2003).
(Harris et al., 2014). The simulated mean and standard deviation of precipitation from the CMIP5 models (blue) are based on the last 100 yr of the piControl. These simulations can be compared with results from coupled ocean-atmosphere simulations made during the second phase of PMIP (PMIP2; Braconnot et al., 2012; shown in red). The PMIP2 results are the mean and standard deviation based on the last 100 yr of a piControl, except in three cases where only 50 yr of data were available. Model results are calculated for each precipitation regime based on the observed geographic extent characterized by these regimes, as defined using the CRU TS3.1 data set. Summer and winter as defined in Sect. 3.1.

The models produce a northward shift and amplification of monsoon precipitation in the MH in response to insolation changes. Although this is consistent with the observations, the magnitude of these changes is significantly underestimated (Fig. 6). The failure to simulate a sufficiently large expansion of the African monsoon has been a major criticism of previous generations of climate models (Coe and Harrison, 2002; Braconnot et al., 2007a, 2012; Brayshaw et al., 2011; Zhao and Harrison, 2011). Comparisons between CMIP5 and PMIP2 models (Fig. 8) show that the two ensembles are indistinguishable in terms of simulated changes over this study region. Global comparisons of these two sets of simulations (e.g. Harrison et al., 2013) appear to confirm that the CMIP5 models are no better at simulating climate changes than previous generations of models. It was originally suggested that the underestimation of monsoon expansion reflected the failure to include feedbacks associated with climate-induced changes in land-surface characteristics, including wetter and more organic soils, the replacement of desert by grassland and shrubland, and the expansion of lakes and wetlands.
Indeed, simulations in which the impacts of changes in land-surface characteristics were prescribed through changing albedo produced much larger monsoons (Street-Perrott et al., 1990; Kutzbach et al., 1996; Coe and Bonan, 1997; Broström et al., 1998). However, this effect is not as pronounced in asynchronously coupled climate-vegetation simulations (Claussen and Gayler, 1997; Texier et al., 1997; Braconnot et al., 1999), models with dynamic vegetation from PMIP2 (Braconnot et al., 2012), or indeed coupled carbon-climate models in CMIP5. In general, these models produce a strengthening of the monsoon in situ and only a minor northward expansion of the zone of monsoon rainfall. If we assume that the coupled models are behaving reasonably, this shows that the changes to the energy budget produced by the prescribed changes in albedo are compensated by changes in the partitioning between latent and sensible heating through increased evapotranspiration. This implies that some other mechanism, for example associated with changes in circulation, is required to produce the observed expansion of rainfall in the Sahara.

Our MH model evaluation is based on pollen-based reconstructions of mean annual precipitation. Although the increase in monsoon precipitation is large (300-400 mm between 5 and 30° N) and spatially coherent, there are some zonal bands where the number of reconstructions is limited (see Fig. 6). However, other sources of palaeoenvironmental data, including vegetation (Hoelzmann et al., 1998; Prentice and Jolly, 2000; Watrin et al., 2009; Niedermeyer et al., 2010), lake-level reconstructions (Kohfeld and Harrison, 2000; Tierney et al., 2011), and archaeological evidence (Kuper and Kröpelin, 2006; Dunne et al., 2012), show that the magnitude of the reconstructed precipitation changes in these zones is plausible. Furthermore, the reconstructions of climate conditions in the Mediterranean region are based on a much larger number of individual data points. Thus, the discrepancies between the model simulations and the observations are not simply a result of lack of information.
It would be possible to use the qualitative information about changes in water balance provided by lake-level records to constrain pollen-based climate reconstructions (see e.g. Cheddadi et al., 1997). While this could provide more robust reconstructions of the observed change in precipitation for northern Africa, the number of observations would still necessarily be limited to sites where both pollen and lake-level records are available. Model inversion provides an alternative approach to the use of lake-level data for climate reconstruction (see e.g. Vassiljev et al., 1998), and one that has already been successfully used with pollen data (Wu et al., 2007). However, changes in lake-water balance can be brought about by changes in multiple climate parameters (temperature, precipitation, seasonality of precipitation, cloudiness, vapour pressure, wind speed), and the magnitude of the lake-level changes that occur in response to changes in catchment-water balance is influenced by morphometric factors (lake depth and shape, lake size relative to catchment size). The methodology for taking account of all these factors has not yet been developed.
The simulated increase in mean annual precipitation in the Mediterranean region is small and, in comparison with the variability already present in the piControl, is not significant. However, although just half of the models show an increase in summer, all of them show an increase in precipitation in spring and some of them also show an increase in autumn. Thus, some of the models produce an increase in growing-season moisture that, although too small, is consistent with the expansion of deciduous forest in this region during the mid-Holocene. Temperate deciduous forests occur in midlatitude regions with > 700 mm of annual precipitation spread throughout the year (see Harrison et al., 2010). Temperate deciduous forest occurs, for example, around Lake Banyoles in eastern Spain, where mean annual precipitation is ca. 800 mm and nearly half of this falls in spring and summer (Soler et al., 2007). According to the mid-Holocene simulations for the Mediterranean area, the largest increase in growing-season precipitation is ca. 30 mm in spring and 40 mm in summer (GISS-E2-R and HadGEM2-CC respectively), and the overall change in mean annual precipitation is < 75 mm (GISS-E2-R). This is less than the increase required for deciduous trees to grow. Nevertheless, these simulations point to mechanisms that could help to explain the observed vegetation changes in the Mediterranean. Furthermore, if the absence of a significant increase in summer rainfall in the Mediterranean is linked to underestimation of the northward migration of the African monsoon, then improvements in the simulation of monsoonal changes should also lead to a more realistic simulation of Mediterranean climate.
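The comparison of a simulated change with the variability of the control run can be illustrated with a short sketch. It assumes, purely for illustration, that a change counts as distinguishable when its absolute value exceeds one standard deviation of the piControl annual totals; the function name `change_vs_control` and the annual values are hypothetical and do not come from any of the models discussed here.

```python
from statistics import mean, stdev


def change_vs_control(control_years, mh_years):
    """Compare the mean change with interannual variability in the control.

    Returns (anomaly, control_sd, exceeds), where `exceeds` is True when
    the absolute anomaly is larger than one control standard deviation.
    The one-standard-deviation threshold is an assumption of this sketch.
    """
    anomaly = mean(mh_years) - mean(control_years)
    control_sd = stdev(control_years)  # sample standard deviation
    return anomaly, control_sd, abs(anomaly) > control_sd


# Hypothetical annual precipitation totals (mm/yr) for one model and zone.
control_years = [480.0, 520.0, 500.0, 460.0, 540.0]
mh_years = [510.0, 530.0, 520.0, 500.0, 540.0]

anomaly, control_sd, exceeds = change_vs_control(control_years, mh_years)
print(f"anomaly = {anomaly:.1f} mm, control sd = {control_sd:.1f} mm, exceeds = {exceeds}")
```

In this made-up case the 20 mm increase is smaller than the control variability, which mirrors the situation described for the Mediterranean annual mean.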
We have shown that there is a significant relationship between the bias in the control simulation and the magnitude of the simulated MH changes in precipitation for the DP, desert and Mediterranean zones, although no such relationship is present in the monsoon zone. However, the relationship in the desert and Mediterranean zones is only apparent in the OA models; the piControl bias does not seem to affect the midHolocene anomaly in the OAC models. The OA models also show a weakly positive (though non-significant) relationship between piControl bias and midHolocene anomaly in the monsoon region. Thus, the apparently significant relationships between bias and anomaly found when considering all the models are not a consistent feature of these simulations. Even in the DP, desert and Mediterranean zones, the bias in the OA piControl simulations only explains part of the variability in simulated climate changes. Previous studies have also had difficulties in finding consistent relationships between control biases and MH changes in precipitation. Comparison of control and MH atmosphere-only simulations made in the first phase of PMIP (PMIP1) showed that intermodel differences in the position of the intertropical convergence zone in the control simulation were reflected in the intermodel differences of its position in the MH simulation. However, there was no clear relationship between the amount of precipitation in the control and the increase in precipitation in the MH (Braconnot et al., 2002). Braconnot et al. (2007b), analysing OA simulations from PMIP2, showed that the relationship between the simulated precipitation in the control and the ratio of the change in precipitation between MH and control was negative: models that simulated very little rainfall tended to produce larger changes at the MH. However, this relationship was clearly driven by only three models, and the remaining models show no trend between the precipitation in the control simulation and the ratio of change in the MH. Thus, this seems to be consistent with our analyses. It is hard to escape the conclusion that improvements to the simulation of modern climate (see e.g. Haerter et al., 2011) will not guarantee that climate changes will be correctly simulated.
In this study, we have analysed the realism of simulated climates both in terms of climate regimes and by comparing specific geographic bands. The use of climate regimes places less stringent requirements on model performance, allowing an assessment, for example, of whether a model can simulate changes in seasonality independent of location. One reason for adopting this approach is the concern that model resolution, particularly in regions of complex topography, could affect geographic patterning (see e.g. Brewer et al., 2007). However, it can be difficult to find objective criteria for the definition of these climate regimes. Although we have been able to distinguish DP from monsoon, and monsoon from desert, climates solely on the basis of precipitation seasonality, it is not possible to use this type of criterion to distinguish desert and Mediterranean climates. Brewer et al. (2007) used k-means clustering to define climate regimes in Europe. Although this is an approach that needs to be further explored, it involves some arbitrary decisions about the climate variables used for clustering as well as the number of clusters considered.
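The dependence of a clustering-based regime definition on the chosen variables and number of clusters can be seen in a small sketch. The code below is a plain implementation of Lloyd's k-means algorithm applied to hypothetical seasonal precipitation fractions; it is not the procedure of Brewer et al. (2007), and the choice of variables, the value of `k`, the starting centroids and the data are all assumptions made for illustration.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm with squared Euclidean distance.

    Centroids start at the first k points so the result is deterministic.
    Returns (labels, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical fractions of annual precipitation falling in
# (spring, summer, autumn, winter) at four grid cells.
seasonal_fractions = [
    (0.10, 0.70, 0.18, 0.02),  # summer-wet, monsoon-like
    (0.25, 0.05, 0.30, 0.40),  # summer-dry, Mediterranean-like
    (0.12, 0.65, 0.20, 0.03),  # summer-wet, monsoon-like
    (0.28, 0.04, 0.28, 0.40),  # summer-dry, Mediterranean-like
]

labels, centroids = kmeans(seasonal_fractions, k=2)
print(labels)
```

Because the inputs here are seasonal fractions only, two zones with similar seasonality but very different totals would fall in the same cluster; adding total precipitation as a variable, or changing `k`, changes the resulting regimes, which is the kind of arbitrary choice noted above.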
Many of the large-scale features characteristic of projected climate changes are a feature of past climate changes, and comparison with palaeo-observations shows that current models reproduce these features in a realistic way (e.g. Braconnot et al., 2012; Izumi et al., 2013; Schmidt et al., 2014a; Li et al., 2013). Models, as we confirm here for northern Africa and the Mediterranean region, are also able to simulate precipitation regimes and shifts in these regimes in a realistic way (Braconnot et al., 2007a; Brewer et al., 2007). However, there are still important discrepancies between the simulated and observed magnitude of changes in precipitation, despite the increasing complexity and resolution of the CMIP5 models compared to earlier generations of models. Given that the ability to simulate the magnitude of MH changes in seasonal climates does not appear to be systematically related to biases in the control simulations, focusing on improving the simulation of modern climate will not ensure that future projections or retrodictions of the climate of the Mediterranean and northern Africa will be more reliable. This is of concern given the environmental problems associated with recent climate changes in the Mediterranean and the importance of monsoonal rainfall for agriculture in northern Africa.

Conclusions
The CMIP5 models fail to reproduce key aspects of both the modern and MH climate of the northern Africa and Mediterranean region, including the correct geographical location of zonal precipitation regimes in the pre-industrial simulation and the magnitude of MH changes in these regimes.
Although biases in the OA simulations explain part of the variability in simulated climate changes, a similar relationship is not found for the OAC simulations. Thus, overall, biases in the control simulations cannot explain the failure to reproduce MH changes in precipitation.
As in previous generations of model simulations, the CMIP5 simulations underestimate the northward shift and the magnitude of observed changes in the north African monsoon.
In the Mediterranean region, the simulations show a tendency for increased growing-season precipitation. Such a shift is required to explain observed vegetation changes in this region in the MH, but the simulated shift is much too small. We speculate that this is linked to the underestimation of changes in the north African monsoon, suggesting that improved simulation of Mediterranean climates is linked to improvements in simulating the climate of northern Africa.
The failure to simulate observed mid-Holocene changes in the north African monsoon and the potentially linked failure to simulate the observed shift in rainfall seasonality in the Mediterranean raise concerns about the reliability of model projections of future climates in these regions.