An underestimated record breaking event – why summer 1540 was likely warmer than 2003

Abstract. The heat of summer 2003 in Western and Central Europe was claimed to be unprecedented since the Middle Ages on the basis of grape harvest data (GHD) and late wood maximum density (MXD) data from trees in the Alps. This paper shows that the authors of these studies overlooked the fact that the heat and drought in Switzerland in 1540 likely exceeded the amplitude of the previous hottest summer of 2003, because the persistent temperature and precipitation anomaly in that year, described in an abundant and coherent body of documentary evidence, severely affected the reliability of GHD and tree-rings as proxy-indicators for temperature estimates. Spring–summer (AMJJ) temperature anomalies of 4.7 °C to 6.8 °C, significantly higher than in 2003, were assessed for 1540 from a new long Swiss GHD series (1444 to 2011). During the climax of the heat wave in early August the grapes desiccated on the vine, which caused many vine-growers to interrupt or postpone the harvest despite full grape maturity until after the next spell of rain. Likewise, the leaves of many trees withered and fell to the ground under extreme drought stress as would usually be expected in


Introduction
Future climate change will likely enhance the frequency and intensity of extreme anomalies (IPCC the Physical Science Basis, 2007). However, nobody is able to imagine the magnitude and severity of low-probability, high-impact events that are expected more frequently in the future as the result of continuing global warming (Field et al., 2012). Information on such "climatic surprises" is important, as the impacts of these events on human, ecological and physical systems might be very severe (Fuhrer et al., 2006). Studying the past is the only way to get an idea about the magnitude and the context of such "ultimate" extremes and their likely impacts. Extreme weather or climate events are to be understood as "the occurrence of a value of a weather or climate variable above (or below) a threshold value near the upper (or lower) ends of the range of observed values of the variable" (Field et al., 2012).
Regarding heat waves in Western and Central Europe, the summer of 2003 is usually taken as a benchmark for future extreme events. It has so far been discussed in a hundred scientific papers (García-Herrera et al., 2010, and references therein; Fischer and Schär, 2010; Barriopedro et al., 2011; Stott et al., 2011; Weisheimer et al., 2011; Quesada et al., 2012; Stefanon et al., 2012; Field et al., 2012). In fact, during the first two weeks of August at least four countries, the UK, Germany, Switzerland and Portugal, experienced new all-time records of measured daily maximum temperatures (Diaz et al., 2006). The August heat wave caused approximately 40 000 excess deaths, mostly among elderly people (García-Herrera et al., 2010). The financial loss due to crop failure over Europe alone is estimated at $12.3 billion (Heck et al., 2004), not considering other sectors of the economy.
The summer 2003 is claimed to be unprecedented, which is beyond doubt for the instrumental period reaching back 250 to 300 yr. According to the analysis of a series of grape harvest dates (GHD) in Burgundy (France) by Chuine et al. (2004) the heat of spring-summer (AMJJA) 2003 was probably "even higher than in any other year since 1370". Luterbacher et al. (2004) concluded from that result that the summer half-year (AMJJAS) 2003 was the warmest of the last 500 yr in Europe. According to a series of tree-ring maximum late wood density (MXD) measurements in Lötschental (Canton Valais, Switzerland), the summer 2003 was even claimed to be the warmest since AD 755 (Büntgen et al., 2006). This issue, however, still needs additional investigations. Schär et al. (2004) do not exclude the possibility "that such warm summers might have occurred in the more distant historical past, for instance in the Medieval Warm Period or in 1540". Indeed, the thousand year-long series of summer temperature indices for the Low Countries reaches the maximum value of +9 in 1540 (Shabalova and van Engelen, 2003). Beniston and Diaz (2004) based on documentary evidence by Pfister (1984), Glaser et al. (1999) and the climatological analysis by Jacobeit et al. (1999) argued that their results "suggest that 2003 is likely to have been the warmest summer since 1540" (emphasis added by the authors), whereas this event scores second after 2003 in the index and measurement based reconstruction for Germany, the Czech Republic and Switzerland since 1500 by Dobrovolný et al. (2010).
In this paper, which includes additional information and uses more abundant daily to seasonal time-scale analysis compared to previous studies for the year 1540, we show that the point made by Chuine et al. (2004) and Büntgen et al. (2006) with regard to 2003 cannot be upheld. It will be demonstrated from coherent first-hand observations reported by chroniclers in Western and Central Europe that drought conditions in 1540 were so extreme that the timing of the grapevine harvest hinged on sufficient rainfall rather than on grape maturity, while trees suffered from the same extreme conditions. To reassess spring-summer (AMJJ) temperatures for this outstanding year, a long GHD series for Switzerland encompassing the period 1444 to 2011 was compiled, which in the case of 1540 was complemented with additional data to assess full grape maturity under extreme drought conditions.
The study is organized as follows: The first section reviews different documentary data types that are used in the analysis. The steps to merge Swiss partial GHD series into a homogenised main series are presented in the second section including a focus on reported drought effects on grapes and trees in 1540. Section three outlines the reconstruction of spring-summer (AMJJ) temperatures from this series using the calibration-verification approach and presents the results. Estimated temperatures are compared with the results of other reconstructions in the fourth section taking into account studies highlighting the significance of early soil desiccation for the generation of heat waves. The final section summarizes the results and the main lessons that can be drawn from comparing the extreme events of 1540 and 2003.

Data
Documentary sources, understood as physical units of man-made information on weather and climate, provide the backbone of the analysis. They may contain two different kinds of data: a. direct weather descriptions relating to warm and cold spells, sunshine, rain, snow, wind force, etc.; and b. indirect (bio-) physical data about vegetation advances or delays in the summer half-year (AMJJAS) and the presence or absence of frost, ice and snow-cover in the winter half-year (ONDJFM) (Pfister, 1984, 1992; Brázdil et al., 2005).
With regard to source generation, a distinction is made between documents produced by individual amateur observers and those produced by members of institutions (Pfister et al., 2009).
a. Individually generated sources such as chronicles and diaries often contain both direct and indirect data. They are laid down on daily to seasonal time-scales, putting a special focus on extreme anomalies and nature-induced disasters that affected human societies. In order to allow comparison of outstanding anomalies over time, most chroniclers referred to indirect (bio-) physical proxy data in the natural environment. They presented such observations within their meteorological context, which allows cross-checking narrative meteorological and (bio-) physical proxy data. Individually generated sources are relatively short, ending with the death of the observer or before. Scholars need to ensure that they were written by contemporaries, because copies are known to be error-prone (Alexandre, 1987).
b. Institutional sources were produced by officials of organizations such as churches or municipalities. These officials were in charge of managing resources that often fluctuated according to climate. The resulting documents were laid down regularly in a standardized form and thus provide contemporary, continuous, quantitative and quantifiable proxies for climate elements. At the same time, they are available for long time periods up to several centuries. This allows calibration and verification with instrumental measurements. To avoid biases in climate reconstruction, scholars need to ensure that metadata relating to the climate proxy do not change over time.
Evidence from individual and institutional sources is complementary, in particular with regard to the reconstruction and interpretation of extreme events. On the one hand, the statistical analysis of institutional sources allows assessing pre-instrumental mean temperatures for the temporal resolution of seasons of several consecutive months. On the other hand, data from individual sources related to the same time span allow verifying the results, often at a time resolution of individual months, sometimes even days (Pfister, 1992). This procedure not only refers to documentary evidence (e.g. Wetter and Pfister, 2011), but also to the comparison of tree-ring data with documentary data (Büntgen et al., 2011).
The term GHD, as it is used in this paper, includes evidence from both institutional and individual sources used as proxies for the phenological stage of full grape maturity. Such data may be grouped into (a) grape harvest ban related data (GHBD), (b) other kinds of evidence from institutional sources such as the first wage payments for grape harvest labourers (WPD), (c) observations about the beginning of grape harvest laid down by chroniclers, so-called historic phenological data (HPD), and (d) standard phenological observations made by observers in the framework of phenological networks (PNO). Data from institutional and individual sources may, of course, overlap, as chroniclers' reports often referred to vineyards subject to the grape harvest ban. With regard to faithful transmission, manuscript sources and critically edited sources are both considered to be more reliable than uncritical publications.
Probably the first long series of GHD was set up for the vineyards surrounding the town of Dijon (France) by the physician and scientist Jules Lavalle (1855) in cooperation with the local archivist. The Swiss M. Louis Dufour (1870) was the first scholar to use GHD for investigating climatic change. This proxy then became universally known through the pioneering work of Le Roy Ladurie (1972) and Le Roy [...], who drew on a GHD compilation by Angot (1885). To this day, 378 GHD series, mainly from France, have been compiled, critically reviewed, statistically analysed and made available on the internet (Daux et al., 2012). Phenologists classify crop harvests under the "apocryphal" (i.e. questionable) phases, because they also depend on human decision making, in contrast to phases of wild plants (DWD, 1991). Working with GHD thus entails investigating the social context in which the evidence was generated. The procedure of the grape harvest ban, already practiced in Roman antiquity (Ruffing, 1999), is subsequently discussed using the example of France, which is comparable to the situation in Switzerland.
Prior to the French revolution, in most areas vine-growers were not free to harvest at their will. They had to wait for a public order by the seigneur or the municipality. As soon as the earliest or the most important grape varieties were found to be ripe (Daux et al., 2012), the vineyards were guarded day and night to prevent the common people as well as the vine-growers from entering. The main reasons for the vintage ban were the prevention of theft or clandestine harvesting before the owners of the vineyards and the beneficiaries of the tithe (i.e. taxes) could monitor the correct delivery of their dues. Moreover, the time of the ban was needed to mobilize the many hands for picking the grapes. In France, the seigneur, not being subject to the ban, had the right to begin the harvest in his vineyards one day ahead to benefit from lower wages (Lachiver, 1988).
The practice of setting the grape harvest ban was described in some detail on the example of Besançon by Garnier et al. (2011). In this town situated in eastern France, scheduling the grape harvest was one of the prerogatives of the people's representatives who met at the town hall almost daily before the revolution (1789). The procedure was conventionally based on a meteorological assessment of the previous months taking into account incidences such as military threats and plague outbreaks. The result was dutifully noted and dated in the registers in which the debates were recorded. In addition to GHD, registers include meteorological information. After the revolution vine-growers were theoretically free to begin the harvest at their will, but in practice most municipalities maintained a compulsory vintage ban to preserve law and order (Le Roy Ladurie and Daux, 2008). From 1889 the municipalities were entitled to keep or give up the practice of the harvest ban which led to its disappearance almost all over France. In 1979 the vintage ban was reintroduced in the entire country for reasons of quality control (Daux et al., 2012).
The practice in Dijon, documented since the late 14th century, was somewhat different. Two historians, Thomas Labbé and Damien Gaveau (2011), attempted an in-depth critique and reinterpretation of the Dijon GHD series based on the extraordinarily rich documentation available in the municipal archives of this town. At first, the two historians noticed that the series set up by Lavalle (1855) consists of both GHBD and reports about the day/date when the first grapes were brought into town for pressing in the municipal vine presses. More importantly, the two authors discovered that prior to 1535 the vineyards around the town of Dijon were divided into a variable number of small local bans in which grape harvests began at different dates extending over a period of 13 days. Subsequently, the number of local bans was reduced (Labbé and Gaveau, 2011). Not until 1607 did the municipal council consider grape maturity to be the most important parameter to begin the harvest. These circumstances led the two historians to conclude that GHD prior to 1600 seem to be "artificially early" with regard to grape maturity.
The authors of the Swiss series set up by Meier et al. (2007) (including C. Pfister) did not correct their dates from the Julian to the Gregorian calendar by adding 9 to 10 days, which led to reconstructed temperatures that are too warm before 1700 (Fig. 1). In Switzerland the new style was introduced by the Catholic cantons except Valais in 1584, whereas most Protestant cantons, not considering individual latecomers, adopted the Gregorian reform after 1700 (Richards, 1998). Minor differences between Meier et al. (2007) and our reconstruction after 1701 occur because of the effects of the 11-yr moving average, some potential Protestant latecomers to the new calendar style and the fact that we used a somewhat different set of data, especially in the second half of the 20th century.
In order to get a more reliable basis for assessing pre-instrumental warm season temperatures, a new GHD series (1444-2011), longer and more complete than that of Meier et al. (2007), was set up for Switzerland. The procedure is discussed below. The new Swiss GHD series is composed from four different kinds of GHD, namely (a) (institutional) wage payment data (WPD), (b) (institutional) grape harvest ban related data (GHBD), (c) (individual) historic phenological data (HPD) and (d) (institutional) phenological network observation (PNO). The assessment of full grape maturity in 1540 is drawn from HPD in combination with narrative information.

The new Swiss GHD series
In order to assess data uncertainties, the quality of the sources needs to be assessed depending on whether the observations were made by contemporaries and whether we have access to the original manuscript or to a high quality publication. Non-contemporary sources or uncritical data publications, often without metadata, may contain printing or copying errors. The 17 local GHD series (see Fig. 2) and the main Swiss series are published online with the supplementary material, including the appropriate correlation matrix and the scheme of quality criteria. First-quality GHD are available in the form of contemporary manuscripts (c/m). Contemporary published data (c/p) are of secondary quality. Data from non-contemporary manuscripts (nc/m) or from uncritical data publications (nc/p) have to be considered with caution.
a. Wage payment data (WPD) of the Basel Hospital (1444-1705) (Fig. 2, series S1): Brázdil and Kotyza (2000) first discovered the potential of account books for climate reconstruction in their analysis of the Czech town of Louny. WPD for the Swiss series were drawn from the books of expenditure of the hospital of Basel, which according to the above-mentioned scheme are a first class source. The hospital of Basel was a profit-orientated enterprise providing the upper classes of the municipal community, in return for donations or inheritances, with pensions for the elderly and a disability insurance.

[Fig. 2: Grape harvest data (GHD) series S1 to S17 plotted by year; figure not reproduced.]
[...] remained the same as can be concluded from the fact that the series is stationary and without a trend. It only contains a few minor gaps.
c. Historic Phenological Data (HPD) (Fig. 2, series S3, S4, S8, S10, S11, S12, S13): Prior to the establishment of national phenological observation networks working according to standardised guidelines, historical plant and animal phenological data (HPD) were laid down by amateur observers at their discretion. In many cases the observations are presented within their meteorological context. HPD are not necessarily of lower quality than phenological network observations (PNO), though the lack of metadata (e.g. altitude, plant varieties etc.) is a source of uncertainty. As HPD were produced by individuals, the resulting series are rather short, spatially scattered and hardly overlapping, which makes their integration into a long composite main series difficult. S4, S8, S10, S11 are completely based on HPD, whereas S3 and S13 are based on GHBD (S3) and PNO (S13) as well. Furthermore, reliable HPD were used to assess grape maturity in 1540, as will be subsequently shown.
d. Phenological network observation (PNO) (Fig. 2, series S7, S13, S14, S15, S16, S17): Compared to Germany, which has a long tradition of phenological network observations (DWD, 1991), continuous PNO in Switzerland started as late as 1951 (Defila and Clot, 2001). The BBCH standard manual, which is a system for a uniform coding of phenologically similar growth stages of all mono- and dicotyledonous plant species, defines 39 observation categories for vines. "Begin of harvest" is undeniably easy to observe, whereas physiological stages such as grape maturity are more difficult to identify clearly (Meier et al., 2009). All in all, 6 PNO series (S7, S13, S14, S15, S16, S17), observed and recorded in the framework of the Swiss Weather Service MeteoSwiss, were included in the new Swiss GHD series. Existing historic GHD series were, whenever possible, completed with corresponding PNO series for which the places of observation are known. Series S7 and S13 from Twann (GHBD + PNO) and Hallau (HPD + PNO) were combined with other GHD data types. Series S14-S17 are PNO series only.
Vineyards of the PNO series, for which all necessary metadata are available, are known to be south facing. The same is true for most vineyards still existing today near the locations of the historic GHD series, making it quite plausible that these data relate to the same places and exposures. Altitude was derived from the locality where the vineyards were cultivated, taking into account mean altitudes of the still existing vineyards based on information in Google Earth. This procedure possibly involves some uncertainties. Furthermore, up to the late nineteenth or even the early twentieth century it is not known what varieties of red grapes were grown. This is also a major source of uncertainty. Varieties of Pinot Noir were known both in the German speaking and the French speaking part of Switzerland since the Middle Ages under different local terminology (Schlegel, 1973), which, however, can no longer be resolved today.
Pearson correlations between all series with a sufficient overlap of > 15 values are significant, with values between r = 0.63 and r = 0.92 (p = 0.01; p = 0.03 for only one correlation). Weaker correlations are related to distance, climate and soils as well as uncertainties related to grape varieties (e.g. north eastern vs. south western Switzerland).
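The pairwise comparison described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the dictionary layout (year mapped to harvest day of year) and the demo values are hypothetical and are not taken from the supplementary material; only the overlap threshold of more than 15 common values comes from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def overlap_correlation(series_a, series_b, min_overlap=16):
    """Correlate two GHD series over their common years.

    Returns None when fewer than min_overlap years are shared
    (the text requires an overlap of more than 15 values).
    """
    years = sorted(set(series_a) & set(series_b))
    if len(years) < min_overlap:
        return None
    return pearson_r([series_a[y] for y in years],
                     [series_b[y] for y in years])


# Hypothetical demo data: harvest day of year per year.
site_a = {1700 + i: 280 + (i * 7) % 23 for i in range(30)}
site_b = {year: doy + 5 for year, doy in site_a.items() if year >= 1710}
site_c = {year: site_a[year] for year in range(1720, 1730)}

print(overlap_correlation(site_a, site_b))  # 20 common years
print(overlap_correlation(site_a, site_c))  # only 10 common years
```

Testing the significance of each coefficient is a further step that is not shown here.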

e. Assessment of grape maturity in 1540 from HPD and weather narratives: None of the Swiss GHD series provides a date for 1540. Grape maturity was assessed from 3 HPD laid down in the context of detailed weather narratives about the effects of this extreme year, motivated by drought scare. Chroniclers agree that rain only fell three or four times between early April and early August (Table S1, Sc2, in the Supplement). According to reconstructions of dominant meteorological situations by Jacobeit et al. (1999), blocking anti-cyclonic situations were quasi-persistent in summer (JJA) over large parts of Europe, not considering the dusty-dry hot spring and the warm autumn. Vine-growers in Schaffhausen (Switzerland) were "long waiting for rain to begin the harvest", as chronicler Oswald Huber relates (Table S1, Sc3, in the Supplement). However, he writes, "vine-growers finally tackled the work nevertheless, because the plants withered" (Table S1, Sc3, in the Supplement).
Vine-growers on the shores of Lake Constance and in the Upper Alsace interrupted the vintage after picking the juicy grapes (Burmeister, 2008; Stolz, 1979), because the remaining ones were almost dried out. The vintage was then resumed after a two-day spell of rain around St. Michael's Day (8 October). The GHBD for Dijon given in the supplementary material available on the internet is 4 October (DOY 278) (Chuine et al., 2004), whereas the corrected date contained in the Dijon series is 3 September (DOY 247) (Labbé and Gaveau, 2011), which agrees with the date given for Besançon (Daux et al., 2012). The wrong value given by Chuine et al. (2004) might be due to a copying error in the compilation by Angot (1885). At harvest time grapes in many vineyards had withered (grapes became raisins). They yielded a sweet sherry-like wine (Glaser et al., 1999) which made people rapidly drunk (Table S1, Sc4, in the Supplement). In Würzburg (Germany) the premium wine of 1540 was stored in a nicely decorated barrel and only offered to guests of the court. The wine became so famous that Swedish soldiers, after their conquest of the town in 1631, were seeking the precious barrel. However, because it was hidden behind a wall, they were unable to find it. The last bottle of the 1540 vintage, containing the world's oldest still-drinkable wine, is today exhibited in the Würzburg citizen's hospital (Glaser et al., 1999).
These descriptions suggest that due to the record breaking heat and drought human decision making to begin the harvest in 1540 was related to rainfall rather than to grape maturity. Detailed observations by contemporary alert observers allow assessing the likely time of grape maturity using two complementary approaches.
i. The first draws on 89 systematic observations in the open vineyard in Zollikon (473 m a.s.l.) from 1732 to 1832 and concerns the phenological stages of veraison, which refers to the colour change and softening of berries (Mullins et al., 1992), and the beginning of grape harvest (Kohler, 1879). The mean difference between the two stages is 37 days, which is consistent with the values of 35 to 40 days given by Daux et al. (2012).
In 1540 veraison was reported on 5 July (Table S1, Sc3, in the Supplement) in Schaffhausen (403 m a.s.l.) and around 10 July (Table S1, Sc1, in the Supplement) in Zürich (408 m a.s.l.). As the delay between veraison and harvest is known to be quite constant (Daux et al., 2012), this suggests a maturity related harvest date between 12 and 17 August. This conclusion is consistent with an observation about sweet grapes found in Schaffhausen on 4 August (Table S1, Sc3, in the Supplement).
ii. The starting point for the second approach is Heinrich Bullinger's narrative that he tasted grape must ("Sauser") in Zürich on 10 August (Table S1, Sc1, in the Supplement). According to Werner Siegfried (personal communication, March 2012) grape must is normally obtained one to two weeks before the main grape harvest starts, which points to grape maturity between 17 and 24 August. This estimate coincides with a note in the chronicle of Ulm (south Germany) (438 m a.s.l.), situated in the comparatively cool climate of the Swabian Alb, that new wine was served already on 20 August (Table S1, Sc5, in the Supplement).
Considering the result of the two approaches, we assess the likely time of full grape maturity in 1540 to have been somewhere between 12 and 24 August. Both values were inserted in the temperature-GHD regression as a maximum and minimum value, which was derived from the calibration verification approach with HISTALP temperature anomalies (Auer et al., 2007), to assess AMJJ temperature anomalies in the extreme drought year (see Sect. 3).
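The date arithmetic behind the two approaches can be retraced with a short sketch. It is an illustration only: the function names are hypothetical, and the veraison-to-harvest lag is left as a parameter because the text gives a mean of 37 days and a range of 35 to 40 days. With the dates above, a lag of 37 days yields 11 and 16 August, while a lag of 38 days yields the 12 to 17 August window quoted for the first approach.

```python
from datetime import date, timedelta

YEAR = 1540

# Approach i: veraison dates reported for Schaffhausen and Zürich.
VERAISON_DATES = [date(YEAR, 7, 5), date(YEAR, 7, 10)]

# Approach ii: grape must tasted in Zürich, one to two weeks before harvest.
MUST_DATE = date(YEAR, 8, 10)


def maturity_from_veraison(lag_days):
    """Add an assumed veraison-to-harvest lag to each veraison date."""
    return [d + timedelta(days=lag_days) for d in VERAISON_DATES]


def maturity_from_must(min_days=7, max_days=14):
    """Earliest and latest maturity date implied by the must observation."""
    return [MUST_DATE + timedelta(days=min_days),
            MUST_DATE + timedelta(days=max_days)]


def day_of_year(d):
    """Day of year (DOY) of a calendar date."""
    return d.timetuple().tm_yday


window = maturity_from_veraison(38) + maturity_from_must()
earliest, latest = min(window), max(window)
print(earliest, day_of_year(earliest))
print(latest, day_of_year(latest))
```

Taking the minimum and maximum over both approaches gives the two margins that enter the regression as alternative values for 1540.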

Methodology
We applied the standard palaeo-climatology calibration-verification approach using linear regression models between measured temperatures (dependent variable) and the proxy, which in our case is GHD (independent variable). The overlapping period of the two series is divided into two subperiods. This configuration allows one of them to be defined as a calibration period, for which the linear regression model is calculated, followed by independent verification of the results using data from the second period, and then vice versa. [...] al., 2008) and the Durbin Watson autocorrelation test. In order to investigate the record breaking extreme event of 1540 we relied on coherent records made by contemporary chroniclers reporting on the advanced development of trees and vines and on the vine-growers' reasons for taking particular decisions in an exceptional situation. A longer-standing discussion concerns the period within the year to be assessed from GHD: April to August (Chuine et al., 2004; Meier et al., 2007), April to September (Daux et al., 2012) or just April to July, as is done in this investigation. In this context, Gladstones (2011) refers to the "widely observed phenomenon that temperatures of the first two or three growing season months, or alternatively the date of flowering, can usually predict quite closely the dates of veraison and maturity to follow [...]. The later phenological intervals show little response to temperature, and tend to be constant from year to year". His assessment, confirmed by Daux et al. (2012), is in agreement with the results of stepwise regression analysis by Legrand (1979), Pfister (1984) and Guerreau (1995) showing that temperatures in August are not significant for the harvest date. Furthermore, we could replicate this result from a stepwise regression of our own data and of the GHD data published by Chuine et al. (2005).
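The split-period idea can be illustrated with a minimal sketch. It assumes ordinary least squares with temperature as the dependent variable and GHD as the independent variable, as stated above; the function names, the choice of Pearson r and root-mean-square error as verification measures, and the synthetic demo data are assumptions made for illustration and do not reproduce the statistics used in the study.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / sqrt(saa * sbb)


def calibrate_verify(ghd, temp, split):
    """Calibrate on one sub-period, verify on the other, then swap."""
    halves = {
        "calibrate_early": (slice(None, split), slice(split, None)),
        "calibrate_late": (slice(split, None), slice(None, split)),
    }
    results = {}
    for name, (cal, ver) in halves.items():
        slope, intercept = fit_line(ghd[cal], temp[cal])
        predicted = [slope * d + intercept for d in ghd[ver]]
        observed = temp[ver]
        rmse = sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                    / len(observed))
        results[name] = {"slope": slope, "intercept": intercept,
                         "r": pearson_r(predicted, observed), "rmse": rmse}
    return results


# Synthetic demo: later harvests (higher DOY) go with cooler seasons.
ghd = [280 + (i * 11) % 25 for i in range(40)]
temp = [-0.1 * (d - 290) + 0.05 * ((i % 3) - 1) for i, d in enumerate(ghd)]
results = calibrate_verify(ghd, temp, split=20)
for name, stats in results.items():
    print(name, round(stats["slope"], 3), round(stats["r"], 3))
```

A negative slope is expected in such a model, since an earlier harvest date corresponds to a warmer growing season.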

Reconstruction of spring-summer (AMJJ) temperatures in Switzerland, 1444-2011
The 17 local series presented in the previous section were adjusted (homogenised) for (a) dating style, (b) altitude and (c) variety (only series S1) before being merged into a composite main series.
a. Homogenisation for dating style: As previously mentioned, GHD series need to be adjusted for dating style. The series from Catholic cantons were adjusted by adding 9 days in the 15th century and 10 days in the 16th century prior to the Gregorian reform applied in 1584. Those from Protestant cantons were adjusted by adding 9 and 10 days, respectively, prior to 1701.
b. Homogenisation for altitude: The local GHD series were then adjusted for altitude as follows: the mean altitude of all 17 series is 436 m a.s.l. Two series -S6 (Orbe, Canton Vaud, 438 m a.s.l.) and S17 (Neuhausen, Canton Schaffhausen, 437 m a.s.l.) -are situated at almost the same altitude as the overall mean altitude of all series.
Each of them has a mean of DOY 289, which, according to the homogenisation methodology by Chuine et al. (2004), was taken as the reference to which the remaining 15 series were adjusted. This homogenisation was done by adding to or subtracting from each record the difference between the long-term mean of the particular local series and the long-term mean of the reference series (i.e. DOY 289).
c. Homogenisation for grape variety (series S1): It turned out that the adjusted Basel WPD series appeared to be considerably too early with respect to altitude. This fact suggests the cultivation of earlier grape varieties in the Basel region. According to Dominik Wunderlin (personal communication, September 2011; Wunderlin, 1986) it is assumed that an early variety of Pinot Noir named "Äugstler" (early Red Burgundy) was grown in the Greater Basel region including southern Alsace and south western Germany. In Canton Schaffhausen (northern Switzerland) early Red Burgundy vines were also grown until the first decades of the twentieth century, for which appropriate phenological data are also available. Based on a statistical analysis of this evidence, Pfister (1984) demonstrated that Äugstler were on average ripe on 10 September, i.e. at the end of August (Julian style). This might also indicate the origin of the word "Äugstler", the name being probably linked to the month of August. He established a mean difference for veraison of 17 days between early Red Burgundy and the ordinary Pinot Noir grapes as well as a significant correlation of r = 0.87 (N = 37) between the veraison dates of both varieties. The Basel WPD series was homogenized accordingly.
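Steps (a) and (b) above can be sketched as two small transformations applied in sequence. This is a simplified illustration, not the procedure as implemented by the authors: the function names and demo values are hypothetical, the switch years are applied as "year before 1584" and "year before 1701", and the 10-day offset is assumed to hold for every year from 1500 up to the switch. Exceptions such as Valais or individual latecomers are not handled, and the variety correction of step (c) is left out.

```python
REFERENCE_MEAN_DOY = 289  # long-term mean of the reference series (from the text)

# Year from which dates are taken to be in the new (Gregorian) style.
SWITCH_YEAR = {"catholic": 1584, "protestant": 1701}


def julian_offset_days(year):
    """Days to add to an old-style date: 9 before 1500, 10 afterwards.

    Using 10 days for all years from 1500 to the switch is an assumption.
    """
    return 9 if year < 1500 else 10


def to_new_style(doy, year, confession):
    """Step (a): shift an old-style day of year to the new style."""
    if year < SWITCH_YEAR[confession]:
        return doy + julian_offset_days(year)
    return doy


def shift_to_reference(series, reference_mean=REFERENCE_MEAN_DOY):
    """Step (b): move a local series so its long-term mean equals the reference."""
    local_mean = sum(series.values()) / len(series)
    offset = reference_mean - local_mean
    return {year: doy + offset for year, doy in series.items()}


def homogenise(series, confession):
    """Apply the dating-style correction, then the mean adjustment."""
    restyled = {year: to_new_style(doy, year, confession)
                for year, doy in series.items()}
    return shift_to_reference(restyled)


# Hypothetical local series: year -> harvest day of year as recorded.
demo = {1498: 270, 1540: 262, 1650: 274}
adjusted = homogenise(demo, "protestant")
print(adjusted)
```

Because step (b) works on long-term means, it only removes a constant offset from each local series and leaves the year-to-year variability untouched.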
In a final step all GHD available for a particular year were annually averaged. The standard deviation of the averaged Swiss GHD series (average DOY of all available GHD series per year) amounts to 9.54 days. The new Swiss GHD series resulting from this homogenization procedure was then calibrated with the monthly anomalies from the 1901 to 2000 mean of the long HISTALP temperature series going back to 1774. The HISTALP database consists of monthly quality controlled and homogenised instrumental records of temperature, pressure, precipitation, sunshine and cloudiness for the "greater Alpine region" comprising 724 000 km², covering the whole territory of Switzerland, Liechtenstein, Austria, Slovenia and Croatia, together with parts of adjacent countries. Switzerland north of the Alps and the nearby regions are situated in the north-western sub-region of the "greater Alpine region" for which a particular temperature series is available (Auer et al., 2007; Böhm et al., 2010). It is used as a predictand for the present investigation.
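The annual averaging of the homogenised local series into one main series can be sketched as follows. The function name and the demo values are hypothetical, and the sample standard deviation is used only as an assumption, since the text does not say which estimator underlies the value of 9.54 days.

```python
from statistics import stdev


def composite_series(local_series):
    """Average all GHD values available for each year into one main series."""
    by_year = {}
    for series in local_series:
        for year, doy in series.items():
            by_year.setdefault(year, []).append(doy)
    return {year: sum(values) / len(values)
            for year, values in sorted(by_year.items())}


# Hypothetical homogenised local series: year -> harvest day of year.
series_a = {1800: 290.0, 1801: 280.0, 1802: 300.0}
series_b = {1801: 284.0, 1802: 296.0, 1803: 285.0}

main_series = composite_series([series_a, series_b])
spread = stdev(list(main_series.values()))
print(main_series)
print(round(spread, 2))
```

Years covered by a single local series simply inherit that series' value, so the number of contributing series varies from year to year.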
The dates of 12 and 24 August were used in the regression for 1540 as a proxy of full grape maturity. As previously mentioned, they were obtained from two complementary approaches using detailed observations by meticulous contemporary observers. They represent the likely maximum and minimum GHD derived from the above-mentioned approaches for this extreme year (see Sect. 2, e). In summary, the two dates of 12 and 24 August mark the margins of fluctuation within which full grape maturity likely occurred.

O. Wetter and C. Pfister: An underestimated record breaking event
Multiple stepwise linear regression revealed May temperatures to be the most important factor for GHD, followed by June, July and April (not shown). August was not significant, in agreement with grape physiology (Mullins et al., 1992), which confirms earlier results by Legrand (1979) and Pfister (1984). Several independent 50-yr calibration and verification sub-periods of the 1774-2005 HISTALP temperature anomaly series have been tested. Overall, calibration and verification results were good.
The best verification match was found in the 1774-1824 sub-period where HISTALP spring-summer (AMJJ) mean temperatures significantly correlated with Pearson r = 0.86 (p = 0.01) (Fig. 3). The standard error of estimate (SEE) amounts to 0.5 °C. Figure 4 displays an 11-yr high-pass filter of reconstructed temperature anomalies. It is noticeable that the curve after 1990 does not fully represent the lengthening of the average growing season usually observed in the context of global warming. This reflects the previously mentioned fact established by Menzel et al. (2006) that responses of wild plants to global warming are larger than those of phases of crops, which are also subject to management practice alterations. Moreover it has to be kept in mind that the reconstructed temperature variability is likely to be suppressed as a result of the regression method. The underestimation of the low-frequency variability typically amounts to 20 %-50 %. Nevertheless, low-frequency shapes are generally well reconstructed (Christiansen et al., 2009). The GHD based temperature reconstruction indicates that 1540 April-July mean temperature was between 4.7 °C and 6.8 °C (±0.5 °C SEE) higher than the mean 1901-2000 HISTALP temperature (Auer et al., 2007) depending on the assumed date of full grape maturity (12 vs. 24 August). According to this approach 1540 showed by far the warmest April to July temperature anomaly in the last 566 yr. The estimated record breaking value for 1540 is followed in descending order by that for 1822 (+3.  (1816) surprisingly only are −1.36 °C. All significant positive and negative temperature anomalies exceeding ±2 °C are consistent with narrative documentary evidence about warm, respectively cold seasonal conditions (Pfister, 1999). It is interesting to note that the latest GHD (1542) almost immediately follows the record breaking value of 1540.
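The calibration and verification logic described above can be sketched in simplified form. The sketch below replaces the multiple stepwise regression with a single-predictor ordinary least squares fit of the AMJJ temperature anomaly on the GHD (day of year), computes the standard error of estimate in a calibration period, and checks the Pearson correlation in an independent verification period. The function names and all numbers are illustrative assumptions and are not taken from the Swiss GHD or HISTALP series.

```python
import numpy as np


def calibrate(ghd_doy, temp_anomaly):
    """Fit temp_anomaly = intercept + slope * ghd_doy by ordinary least squares.

    Returns (intercept, slope, see), where see is the standard error of
    estimate with n - 2 degrees of freedom.
    """
    x = np.asarray(ghd_doy, dtype=float)
    y = np.asarray(temp_anomaly, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    see = float(np.sqrt(np.sum(residuals ** 2) / (len(x) - 2)))
    return float(intercept), float(slope), see


def verify(intercept, slope, ghd_doy, temp_anomaly):
    """Pearson r between reconstructed and observed anomalies in an independent period."""
    predicted = intercept + slope * np.asarray(ghd_doy, dtype=float)
    observed = np.asarray(temp_anomaly, dtype=float)
    return float(np.corrcoef(predicted, observed)[0, 1])


# Made-up illustrative numbers, NOT values from the paper's data.
cal_doy = [275, 280, 285, 290, 295, 300]
cal_temp = [1.6, 0.9, 0.4, -0.1, -0.4, -1.1]
intercept, slope, see = calibrate(cal_doy, cal_temp)
r_verification = verify(intercept, slope, [278, 288, 298], [1.2, 0.1, -0.9])
```

A negative slope expresses the expected relationship that earlier harvests go with warmer spring-summer seasons. Regression of this kind damps the reconstructed variance, which is the suppression of variability noted in the text.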

Discussion
The discussion first involves a comparison of the new Swiss GHD series with those from neighbouring regions as well as with the index-based monthly and seasonal T reconstruction by Dobrovolný et al. (2010) and 4 MXD tree ring series. Best correlations are shown in Table 1. Subsequently, the estimates for 1540 are compared with those for 2003. Overall correlations between the homogenised Swiss GHD and other GHD series show good results (Table 1). The Besançon series (Garnier et al., 2011) correlates best with 0.82, followed by the corrected Dijon series (Labbé and Gaveau, 2011). Correlations with the series from Western Hungary and Vienna are somewhat lower, which is related to spatial distance. The Pearson correlation of the uncorrected Burgundy GHD series (Chuine et al., 2004) warrants a closer inspection.
31-yr moving correlations with the uncorrected Dijon GHD compilation series (green curve) reveal that the low correlation is due to the period 1516 to 1555, where the values drop to r = 0.19. This is a consequence of wrong values in 1522 and 1523 and the questionable value for 1540 (Fig. 5). The overall correlation of 0.76 obtained with averaged AMJJ Swiss documentary temperature indices  is relatively low which might be due to an accumulation of errors involved with estimating temperatures for individual months.
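A 31-yr moving correlation of the kind referred to above can be sketched as follows. The sketch assumes a centred window and two annual series aligned on the same years, with NaN for missing values; the function name, the centring and the minimum of three valid pairs per window are assumptions made for illustration.

```python
import numpy as np


def moving_correlation(a, b, window=31, min_pairs=3):
    """Centred moving Pearson correlation between two aligned annual series.

    Returns an array of the same length; positions whose window does not fit
    entirely inside the series, or has too few valid pairs, are NaN.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("series must be aligned and of equal length")
    half = window // 2
    out = np.full(a.shape[0], np.nan)
    for i in range(half, a.shape[0] - half):
        wa = a[i - half:i + half + 1]
        wb = b[i - half:i + half + 1]
        valid = ~(np.isnan(wa) | np.isnan(wb))   # use only years present in both
        if valid.sum() >= min_pairs:
            out[i] = np.corrcoef(wa[valid], wb[valid])[0, 1]
    return out
```

A few erroneous values inside one window, such as the wrong values in 1522 and 1523 mentioned above, lower the coefficient for every window position that contains them, which is why the drop extends over several decades.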
Correlations with tree ring (MXD) series are all significant, albeit on very different levels. The series presented by Battipaglia et al. (2010) averages MXD data for Larch from Lötschental (Canton Valais) as well as for Pine from Lauenen (Bernese Oberland) and Tyrol (Austria). Considering the inclusion of Tyrol situated some distance from the Western Swiss Alps, the correlation of 0.55 is surprisingly high. On the other hand, correlations with the other two series from the Swiss Alps are surprisingly low, which would warrant further inspection (see Table 1). Correlations with tree ring based temperature series derived from other regions in Europe far from Switzerland are very close to zero (e.g. Finland, Sweden; Esper et al., 2012; or Albania; Seim et al., 2012, etc.). This might be caused by distance, the use of another type of proxy, different time windows of reconstructed temperatures (AMJJ vs. JJA or MAMJJAS etc.) as well as by differing climatic conditions which taken together may aggregate possible errors. An in-depth analysis of the reason for the low correlations would be interesting but is beyond the scope of this paper.
The key point of this analysis involves comparing spring-summer temperatures in 2003 and 1540. It should be borne in mind that the comparison involves AMJJ temperatures only. As the 35 to 40 days preceding the harvest are known not to be subject to temperature induced modifications (Daux et al., 2012), it can be excluded that August temperatures mattered at all for the Dijon GHD of 15 August 2003. Then we need to consider that the GHD of 3 September, 1540 contained in the corrected Dijon series (Labbé and Gaveau, 2010) does not represent full grape maturity, as many grape harvests were postponed beyond this stage until the next rain spell due to severe drought impacts. The likely time of full grape maturity in Switzerland was assessed to be between 12 and 24 August. This estimate refers to vineyards situated somewhat above towns in the Swiss Plateau at altitudes of 400 to 410 m a.s.l. Thus, they are compatible with the mean altitude (436 m a.s.l.) of the main Swiss series. Estimating the 1540 date of full grape maturity for Dijon, situated at 245 m a.s.l., in the context of this information is more speculative. It would involve adjusting the date for Switzerland to Dijon by subtracting 18 days according to the average difference of GHD for Dijon (DOY 271) and the Swiss series (DOY 289). This leads to dates between 25 July and 6 August, 1540 for the likely full grape maturity in Dijon, which at least does not contradict the main argument that AMJJ temperatures in 1540 were likely higher than in 2003.
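The date adjustment from the Swiss series to Dijon is simple arithmetic and can be checked with a short sketch. The constant names and the function `shift_to_dijon` are illustrative; Python's `date` type uses the proleptic Gregorian calendar, which does not matter here because only day differences within one summer are computed.

```python
from datetime import date, timedelta

SWISS_MEAN_DOY = 289   # mean GHD of the Swiss series (from the text)
DIJON_MEAN_DOY = 271   # mean GHD of the Dijon series (from the text)


def shift_to_dijon(swiss_maturity):
    """Shift a Swiss date of full grape maturity by the mean GHD difference."""
    offset = SWISS_MEAN_DOY - DIJON_MEAN_DOY   # 18 days
    return swiss_maturity - timedelta(days=offset)


earliest = shift_to_dijon(date(1540, 8, 12))   # 25 July 1540
latest = shift_to_dijon(date(1540, 8, 24))     # 6 August 1540
```

The sketch assumes that the long-term mean offset between the two series also held in the extreme year 1540, which is the speculative element acknowledged in the text.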
Estimates of the AMJJ temperature anomaly in 1540 obtained from the Swiss series are 4.7 °C and 6.8 °C (±0.5 °C SEE), respectively, according to the assumed date of full grape maturity of 24 and 12 August. These estimates are undoubtedly higher than the 2003 AMJJ temperature anomalies of 2.5 °C from the 1960-1989 average measured in Paris (Rousseau, 2009). Further uncertainties of this estimate involve the fact that the dates of full grape maturity in Switzerland were obtained from observations of grape-vine development made by several vine-growers, for which, of course, uncertainties cannot be quantified. Moreover, it remains to be determined by further research whether and how far this result obtained from local analyses can be spatially extrapolated. A further uncertainty relates to effects of drought stress. It cannot be excluded that the outstanding conditions in 1540 slowed down the process of grape maturity. Pierre de Teysseulh, a capitular of the church of Limoges (central France), notes that "this year there was such a great drought that grapes were harvested in August. The grapes were like roasted and the leaves of the vines had fallen to the ground like after a severe frost" (Table S1, Sc6, in the Supplement). The rate of net photosynthesis of grapevines decreases significantly above 35 °C. The absolute limit of CO2 absorption is reached at 40 °C. In such cases the plant stops its vegetative activity. Low water drainage soils in combination with dry periods may furthermore have the same effect on the plant's growth phase if temperatures do not fall below 30 °C for a longer time period (Currle et al., 1983).
Trees like vines suffered from drought stress. According to chronicler Sebastian Fischer from Ulm (south Germany), leaves on the trees withered (at the peak of the worst heat wave) in early August and fell to the ground "as if it had been in late autumn" (Table S1, Sc5, in the Supplement).

[Figure: 31-year moving correlation coefficient plotted against year (1444 to 2004).]

in the 95- (Lauenen) and the 98- (Tyrol) percentile, albeit not quite indicating record-breaking temperatures. According to the botanist and politician Renward Cysat in Lucerne, dew was abundant enough north of the Alps to substantially dampen drought effects (Table S1, Sc7, in the Supplement). Detailed qualitative descriptions of weather patterns and their impacts on human, ecological and physical systems, widespread on a European scale, provide the most convincing arguments supporting the temperature estimates obtained from the new Swiss series. The drought and heat in 1540 began earlier than in 2003, were more intense and lasted much longer, namely more than 10 months.
High temperatures already prevailed in the long rainless period in spring 1540 according to a report about full flowering of cherry trees on around 10 April in Ancy-sur-Moselle (265 m a.s.l.) and cherries being already ripe a month later at the same place (de Bouteiller, 1881). All vines had finished blossom on 10 June in Winterthur (Table S1, Sc3, in the Supplement) and Biel-Bienne (Table S1, Sc8, in the Supplement). Heat became unbearable from early June, considering the fact that quarrymen in Besançon (France) got time off from hard physical work (Table S1, Sc9, in the Supplement). Four independent contemporary chroniclers describing the situation in vine-growing areas of Switzerland and Alsace agree that it did not rain a drop between 23 June and 6 August (Table S1, Sc2, Sc10, Sc11, Sc18, in the Supplement). The heat wave probably peaked in late July and early August. In Besançon people used to take refuge in cellars after 09:00 a.m. LT because they could not stand the heat in the streets during the day (Table S1, Sc12, in the Supplement). On 2 August, the town council of Ulm ordered the parsons to preach "about the hot and dry weather, begging God for rain" (Table S1, Sc5, in the Supplement). At that time, widespread (self-)ignition of forests and grassland is indicated by Swiss, Alsatian and German chroniclers (Table S1, Sc11, Sc13, Sc14, Sc15, in the Supplement), which is not known for Western and Central Europe in 2003. In central Portugal, where forest fires were rampant at this time in 2003, the temperature anomalies reached values higher than 9 °C (García-Herrera et al., 2010).
These observations suggest that the temperature excess estimated for 1540 in comparison to 2003 mainly needs to be credited to heat waves in April-May and July, which, considering the positive temperature anomalies of 0 °C in April, 1.1 °C in May and 2.9 °C in July from the 1901average measured in Basel (in 2003), were probably more extreme in 1540. A valid comparison with June (6.8 °C) in 2003 is not possible. The fact that not a drop of rain fell during the entire month in 1540 suggests that temperatures may have been at about the same level as in 2003 (all values according to Begert et al., 2005). As previously mentioned, temperatures in August cannot be assessed from GHD. Chronicler Malachias Tschamser notices that the longest and most severe heat wave in Alsace occurred in the 32 days from 10 July to 10 August (Table S1, Sc15, in the Supplement). Chronicler Hans Salat confirms this observation mentioning that a rain spell began on 11 August in Lucerne (Table S1, Sc16, in the Supplement). These observations suggest that the extreme heat spell in 1540 culminated a couple of days earlier than in 2003. Temperatures from 11 to 31 August 1540 might have been at about the same level as in 2003 considering the observation of a second bloom of fruit trees in early September 1540 in Guebwiller (Alsace) (Stolz, 1979), which reflects similar observations being made in Munich at the same time in 2003 (http://de.wikipedia.org/wiki/Hitzewelle_2003; last access: 3 September 2012). Based on observations by vine-growers of a second flowering of vines on 9 October and cherries reaching maturity for a second time in Lindau on the shore of Lake Constance (Burmeister, 2008), it can be concluded that September and October were probably warmer than in 2003. The reports by several chroniclers (Table S1, e.g. Sc2, Sc11, etc., in the Supplement) agree that weather was sunny and warm "like in April" until Christmas (Julian Style), i.e. 4 January 1541, without any frost and snow covering the ground (Table S1, Sc1, in the Supplement). At that time, several people demonstratively swam across the Rhine at Schaffhausen (Canton Schaffhausen) (Table S1, Sc3, in the Supplement). Chroniclers were eager to include such physical evidence in their narratives to demonstrate how extraordinarily warm it still was at the beginning of winter 1540/1541. Taking into account the preceding extreme spring-summer temperature anomaly and the outstandingly warm conditions in autumn (SON) until December 1540, we assume that water temperatures might have been at about 15 °C, which is considerably below comfortable water temperatures for swimming. Maximum water temperatures of the Rhine measured at this time of the year within the period 1978 to 2011 were about 11 °C in December 2006 and about 9 °C in January 2007 (Data from Swiss Federal Office for the Environment FOEN). Updated European averaged autumn and winter air temperature time-series indicate that temperatures for autumn 2006 and winter 2007 were likely the highest for more than 500 yr. According to the data underlying the reconstruction of monthly temperatures in central Europe on the basis of Pfister indices, the year 1540 was the warmest since 1500.

Fig. 6. Processes contributing to soil moisture-temperature coupling and feedback loop (Seneviratne et al., 2010).
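The calendar conversions and day counts quoted in this part of the discussion can be verified with a short sketch. It uses the standard rule for the number of days by which the Gregorian calendar runs ahead of the Julian one; the function names are illustrative, and the helper is only meant for dates from March to December of years in which the Julian date also exists in the proleptic Gregorian calendar (true for 1540).

```python
from datetime import date, timedelta


def julian_gregorian_offset(year):
    """Days by which the Gregorian calendar is ahead of the Julian calendar.

    Valid from 1 March of `year` onwards (10 days for the 16th century).
    """
    return year // 100 - year // 400 - 2


def julian_to_gregorian(year, month, day):
    """Convert a Julian-style date (March to December) to the Gregorian calendar."""
    offset = julian_gregorian_offset(year)
    # date() is used only as a day counter; see the assumptions stated above.
    return date(year, month, day) + timedelta(days=offset)


def inclusive_days(first, last):
    """Number of days in a spell, counting both the first and the last day."""
    return (last - first).days + 1


christmas_1540 = julian_to_gregorian(1540, 12, 25)                  # 4 January 1541
heat_spell = inclusive_days(date(1540, 7, 10), date(1540, 8, 10))   # 32 days
```

Both results agree with the figures given in the text: Christmas in Julian style falls on 4 January 1541, and the spell from 10 July to 10 August spans 32 days when both end dates are counted.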
The conditions under which temperatures rise to record breaking levels were intensively investigated after the 2003 event (see the review by Seneviratne et al., 2010, and references therein; Fischer et al., 2007). There is consensus that soil moisture-temperature interactions were a key driver in the sequence of events that led to the exceptional heat wave in early August, 2003. In temperate climates, a considerable part of incoming shortwave radiation is generally used for evapotranspiration, i.e. for driving the so-called latent heat flux. The remaining sensible heat flux ultimately impacts air temperature. In case of an initial strong soil moisture anomaly which may occur after a dry spring, the share of sensible heat increases with the higher position of the sun in early summer, leading to higher air temperatures (see relationship A in Fig. 6).
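The partitioning described above is commonly written as the surface energy balance. The following is a hedged illustration only: the symbols $R_n$ (net radiation), $\lambda E$ (latent heat flux), $H$ (sensible heat flux), $G$ (ground heat flux) and $\beta$ (Bowen ratio) are introduced here and are not used in the original text.

```latex
R_n = \lambda E + H + G, \qquad \beta = \frac{H}{\lambda E}
```

For a given available energy $R_n - G$, a soil moisture deficit that limits $\lambda E$ must be balanced by a larger $H$, i.e. a higher Bowen ratio $\beta$, which is the mechanism by which dry soils favour higher air temperatures.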
Relationship B relates to the link between evapotranspiration and sensible heat flux. Decreased evapotranspiration leads to an increase in sensible heat flux and thus to an increase in air temperature. Relationship C relates to a potential positive feedback leading to a further temperature increase: increased temperature leads to a higher evaporative demand, and thus to a potential increase in evapotranspiration despite the dry conditions, possibly leading to a further decrease in soil moisture. The feedback loop inducing land-atmosphere coupling can continue until the total drying of the soil, when temperature increases cannot be dampened by any further increases in evapotranspiration (Seneviratne et al., 2010). In 2011, the pre-conditions for record breaking temperature in summer were met after an intense spring drought, but several waves of heavy rainfall in June may have inhibited the trigger of feedbacks (Quesada et al., 2012). In 2003, spring drought was again intense with precipitation between February and May being below 50 %. But this time, the spring drought gave way to consecutive episodes of intensive anticyclonic anomalies in the summer months associated with stationary blocking, clear skies, high temperatures and high evaporation amplifying temperatures through the feedback processes described above (García-Herrera et al., 2010). Sensitivity analyses suggest that given climatologic mean soil moisture and similar continental-scale circulation, the 2003 JJA surface temperature anomalies would have been reduced by around 40 %. Thus in absence of soil moisture feedbacks, summer 2003 would still have been warm, but it would not have been such a devastating event as it turned out to be (Fischer et al., 2007).
Rainfall observations from Swiss and Alsatian chroniclers living in the core region of the record breaking anomaly in 1540 provide some clues for assessing soil moisture deficits. Some observers specified not only when, but often also how long and how intensively it rained. For example, chronicler Oswald Huber from Schaffhausen reports just one abundant rain spell from 12 February to early June (about 10 June) (Table S1, Sc3, in the Supplement). Hans Stolz from Guebwiller (Alsace) confirms Huber's observations, specifying that between February and 10 June it only rained for three days in mid-March, whereas April and May were sunny and very warm throughout (Stolz, 1979). It is concluded from these reports that spring drought in 1540 was far more severe than in 2003. Observations about extreme soil desiccation (Stolz, 1979) and soil cracking (Table S1, Sc7, in the Supplement) confirm the hypothesis of a record-breaking soil moisture deficit. Some cracks were so wide that people could put their feet into them (Table S1, Sc17, in the Supplement). Consecutive episodes of intensive anticyclonic anomalies following the 1540 spring drought may have activated the previously described positive feedback loop of rising temperatures and evaporation, leading to record-breaking temperatures within the last 500 yr.

Summary and conclusion
Firstly, the main results of the study are briefly reviewed. Subsequently, fundamental issues regarding the approach to be applied for reconstructing record breaking extreme events from documentary data are addressed.
A new long Swiss GHD series (1444 to 2011) was set up to assess spring-summer (AMJJ) temperatures; August temperatures were excluded as these are not significant for the date of grape maturity. The calibration-verification approach using the HISTALP temperature series (Auer et al., 2007) for the north-western part of the Greater Alpine Area yielded the result that spring-summer (AMJJ) temperature anomalies in 1540 from the 20th century mean were between +4.7 °C and +6.8 °C (±0.5 °C SEE), i.e. higher than those measured in 2003. From observations of a second bloom of fruit trees in early September, a second flowering of vines in October and the absence of cold spells in conjunction with extreme drought until the end of the year it is concluded that autumn (SON) was likewise warmer than in 2003. Considering the significance of soil moisture deficits for the generation of record breaking heat waves, these results still need to be validated with estimated seasonal precipitation.
The summer (JJA) 2003 was claimed to be unprecedented. Evidence from GHD and tree-rings led to the conclusion that it was likely warmer than any other year since the Middle Ages. Although it is not possible to simply extrapolate our results for Switzerland from GHD to a wider domain, it is concluded from a large body of coherent qualitative documentary evidence about the outstanding drought in 1540 that temperatures were likely more extreme in large parts of Western and Central Europe than in 2003. The persistent temperature and precipitation anomaly in that year, described in an abundant and coherent body of qualitative documentary evidence, may have severely affected the reliability of GHD and MXD measurements on tree-rings as proxy-indicators for temperatures. Due to the crossing of a poorly understood drought stress threshold it was widely observed that grapes were desiccated at the climax of the heat wave in early August, which led many vine-growers to interrupt or postpone the harvest despite full grape maturity until the next rain spell. Likewise, many trees were under extreme drought stress, judging from observations that leaves withered and fell to the ground as would typically be observed in late autumn. It remains to be determined by further research whether and how far this result obtained from local analyses can be spatially extrapolated.
Fundamental considerations regarding the estimate of record-breaking extreme events in the pre-instrumental past deal with four issues, namely (a) the critical evaluation of sources, (b) the approach to deal with past extreme events, (c) the role to be played by documentary sources laid down by individuals in assessing extremely rare events in the past and (d) the interpretation of the 1540 extreme event in the context of global warming.
a. All proxies have their strengths and limitations and only if we can find a similar signal in different types of proxies a robust assessment of past record breaking temperatures can be made. GHD are a valuable documentary proxy for past summer (AMJJ) temperatures. However, their interpretation "should be seen as a delicate task requiring a lot of endurance and accurateness (sic)" (Labbé and Gaveau, 2011), similar to data analysis in the sciences. Drawing on uncritical compilations in using documentary evidence involves a risk of obtaining flawed results. Moreover, model building using documentary evidence should be complemented by an in depth interpretation of historical decision making over time.
b. Extreme events are rare, which means there are few data available to make assessments regarding changes in their frequency or intensity (Field et al., 2012). They involve situations in which both human and ecological systems behave non-linearly outside the normal range of biological and probability laws.
c. Detailed observations provided by contemporary chroniclers describing both (bio-)physical proxy data and the underlying meteorological conditions and the related human decision making should be used to assess the severity of record breaking extreme events and their impacts on human, ecological and physical systems. Besides the example of 1540, this conclusion also refers to tree-ring based studies by Battipaglia et al. (2010) and Büntgen et al. (2011). As the latter authors put it, documentary evidence independently confirmed many of the dendro-signals over the past millennium, and further provided insight on causes and consequences of ambient weather conditions related to the reconstructed extremes. We must not play the statistical and the narrative approach against each other. Rather, the two approaches are complementary, accounting both for the ordinary and for the extraordinary. Subsequent analyses should focus on assessing precipitation and drought severity to make the worst case event of 1540 and its devastating impacts more plausible and comprehensive (Wetter et al., 2013). Further record breaking extreme events are to be expected for the period prior to 1500, which has so far not been systematically investigated.
d. The result presented in this paper that spring-summer (AMJJ) temperatures in 1540 in Western Europe likely exceeded the amplitude of the previous hottest summer of 2003 does not challenge the notion that summer 2003 can be partly related to global warming (Stott et al., 2004). It shows that even more extreme events than 2003 are documented from the pre-instrumental period under cooler climate conditions than today. Strikingly, the record-breaking warm spring-summer anomaly of 1540 was almost immediately followed by the coldest spring-summer (AMJJ) within the last 500 yr. Future analyses are needed to assess how frequent low probability-high impact events such as 1540 might become in the second half of the 21st century under conditions of continuous global warming.