Multiscale regression model to infer historical temperatures in a central Mediterranean sub-regional area

N. Diodato, G. Bellocchi, C. Bertolin, and D. Camuffo
MetEROBS – Met European Research Observatory, GEWEX-CEOP Network, World Climate Research Programme, via Monte Pino snc, 82100 Benevento, Italy
Grassland Ecosystem Research Unit, French National Institute of Agricultural Research, avenue du Brézet 234, 63000 Clermont-Ferrand, France
Atmosphere and Ocean Science Institute, National Research Council of Italy, corso Stati Uniti 4, 35127 Padua, Italy


Introduction
"Modelling can be described as an art because it involves experience and intuition as well as the development of a set of mathematical skills."
The Mediterranean is one of the few regions in the world holding a large volume of weather documentary proxies for the past 500-1000 years (Camuffo and Enzi, 1992; Jones et al., 2009). However, such large amounts of documents and archives have not yet been fully explored to reproduce with high spatio-temporal resolution the different climates of the Mediterranean (García-Herrera et al., 2007). Determining the climatic history in these unrepresented places of the world is a challenging and complex issue at both theoretical and applied levels.
Modelling is an ideal way to test environmental processes over extensive space and time domains. In recent decades, considerable progress has been made in pre-instrumental temperature modelling at both hemispheric and regional scales (e.g. Mitchell et al., 2005; Rutherford et al., 2005). Luterbacher et al. (2004) and Xoplaki et al. (2005) mapped seasonally resolved temperature reconstructions across European land areas back to 1500. In particular, Luterbacher et al. (2004) developed separate multiple regression equations between each leading principal component (PC) time series of the proxy records and all the leading PC time series of the instrumental data. In this way, they assimilated proxy records into reconstructions of the underlying spatial patterns of past climate changes. The reconstructed climate field allows for a special assessment of the spatial coherence of past annual-to-decadal temperature changes at sub-continental scale, thus providing insight into the mechanisms or forcing underlying observed variability. In hemispheric, continental and regional reconstructions, however, multi-proxy coverage is often irregular and heterogeneous (Esper et al., 2002). Temperature and precipitation reconstructions, although well developed over large geographical areas, may become inaccurate at sub-regional and local scales, or over particular periods (Mann et al., 2000; Ogilvie and Jónsson, 2001; Diodato et al., 2008). It is therefore not surprising that Mann (2007), comparing estimated regional temperatures at different locations over the past 1000 years, found that the cold and warm periods differed considerably from region to region. The issue of sub-regional reconstructions should thus attract the attention of scientists, as it may yield unexpected results, especially regarding some temperature extremes (Bhatnagar et al., 2002).
In order to draw large-scale inferences about temperature in Europe, the investigation of documentary proxies remains a reliable approach to trace temperature extremes back before the advent of instrumental recording of meteorological data (Brázdil et al., 2005; Jones et al., 2009). However, as pointed out by Riedwyl et al. (2009), the issue of downscaling to small spatial and temporal scales has become a priority in order to achieve a better understanding of sub-regional climates. Brewer et al. (2007) investigated tree-ring sites to support the reconstruction of historical droughts in Mediterranean areas during the last 500 years. However, temperature series have not been modelled for this region so far. Moreover, continuous and homogeneous instrumental series cannot be extended before the 19th century (Camuffo et al., 2010). On the other hand, high-resolution climate information is increasingly needed for the study of past, present and future climate changes (Vrac et al., 2007).
Several authors, such as Luterbacher and Xoplaki (2003), Pauling et al. (2003), and Ge et al. (2005), suggested that pre-modern instrumental weather indices may be promising to enrich climate reconstructions at regional or local scales. Different sets of proxy variables have indeed been used to find simplified relationships between predictors and predictands in high-resolution climate reconstructions (e.g. Wang et al., 1991; Briffa et al., 2002; Larocque and Smith, 2005; Moberg et al., 2005; Diodato, 2007; Davi et al., 2008). Many of these reconstructions depend on empirical relationships between proxy records and temperature data. Comparing linear algorithms and neural networks, Helama et al. (2009) obtained reliable reconstructions using both approaches. Although regression-based techniques have been successful, they can introduce bias in the estimates if not used with care (Robertson et al., 1999; Moberg et al., 2005; von Storch et al., 2005). These relationships are seldom based on a training process capable of capturing all the possible data combinations that occur when extrapolation is performed (i.e. over the reconstruction period). Regarding dendroclimatological data, the correlation between tree-ring proxies and temperature data only explains about 50% of the variance (Liang et al., 2008; Helama et al., 2009; Tan et al., 2009). Documentary data series were observed to correlate better with temperature, with an explained variance of about 70% (Leijonhufvud et al., 2008; Dobrovolný et al., 2010). However, there are few estimates of uncertainty in documentary-based climate reconstructions (Moberg et al., 2009).
In this study, we considered an alternative approach to address the training-and-extrapolation issue. In particular, a documentary-based technique was developed, based on a multiscale temperature regression (MTR)-model at sub-regional level. An area covering Southern and Central Italy, named in this paper the Mediterranean Sub-regional Area (MSA), is the focus of the investigation. The goal was to produce a relatively simple multiscale model, acceptable to and verifiable by scientists as well as knowledgeable people. The (MTR)-model combines documentary proxy-based local weather anomalies with large-scale temperature data to adapt regional temperature data to specific sites and seasons. The selected sub-region, centrally located in the Mediterranean region, is an interesting test area rich in documentary proxy data and modern weather records useful to improve the spatial resolution of past climate. The next section describes the geographical environment, the datasets and the developed methods. Section 3 illustrates the novel mixed-model approach in detail. Its results on temperature series estimation were evaluated over the MSA. Conclusions (Sect. 4) point out the main results and look at horizons for future research.

Study area, datasets and method of analysis
The study is based on a set of both monthly-modelled regional temperatures and documentary proxy data for a typical Mediterranean area of Central and Southern Italy (MSA in Fig. 1). This sub-region is frequently crossed by depressions generated over the Mediterranean Sea (Wigley, 1992) that, reinforced by continental north-easterly airflows, produce important fluctuations in temperature and precipitation and large-scale atmospheric oscillations (Barriendos Vallve and Martin-Vide, 1998).
Regional temperature data (hereafter called T_R) were derived from Luterbacher et al. (2004) for Europe over 1500-2002. The data, upscaled at about 0.25-degree grid resolution (∼35-50 km) from historical instrumental series and multi-proxy data (http://www.ncdc.noaa.gov/cgi-bin/paleo/eurotemp.pl), cover an area extending from 25° W to 40° E and from 35° to 70° N (Fig. 1a). From this map and from that depicted in Fig. 1b, it is also possible to observe the lack of temperature data over Southern Europe (including the MSA), as suggested by both the data density and the correlation pattern.
In order to fill this gap in the available data, a new documentary dataset was derived from chronicles found in two main sources, the Moio and Susanna Manuscript (Ferrari, 1977) and Corradi's Annals (Corradi, 1972). A data bank (Catalogue EVA, Environmental Events, of ENEA, the Italian National Agency for New Technologies, Energy and the Environment; Clemente and Margottini, 1991) was also referred to and used when necessary. The Italian scientist Alfonso Corradi (1833-1892) carried out pioneering work in documentary research on the environmental and climatological extreme conditions that occurred in Italian regions through time. He collected the historical documents from 5 to 1850 AD related to meteorology and epidemics into a five-volume book (Corradi, 1972). More recently, the historian Umberto Ferrari published the chronicles of Giovanni Battista Moio and Gregorio Susanna, quoting climate extremes and famines from 1710 to 1769 and weather information over the 16th and 17th centuries for the Calabrian region (Ferrari, 1977).
For the purposes of modelling, the split-sample approach was used to separate the available data into a calibration set and a validation set. Particular attention was paid to the calibration procedure in order to ensure that the resulting model could produce reliable outcomes (i.e. time-series reconstruction). Two distinct climate periods (1867-1903 and 1972-2002) were included in the calibration dataset (68 records in total) for two main reasons. The first was to ensure model calibration accuracy by accounting for both cold and warm intervals, and the second was to ensure that the model was able to simulate air temperature over periods with either accurate (as in recent times) or inaccurate data (as in historical times). The validation dataset contained the instrumental temperature reconstruction for the MSA (as performed by Camuffo et al., 2010), including the periods 1742-1754 and 1792-1818. These two intervals are considered the only reliable records in historical times for this area. The entire workflow was executed interactively using an MS Excel 2003 spreadsheet for data collection, model development and graphical assembling, with the support of the STATGRAPHICS online statistical package (http://www.statgraphics.com) and Statistics Library-R modules (Wessa, 2009) for statistics performance and graphical outputs, respectively. The agreement between estimates and observations was evaluated using a set of statistics, including the modelling efficiency by Nash and Sutcliffe (1970), which ranges from negative infinity to positive unity (the latter being the optimum value). To allow a visual inspection of the quality of the results, a set of comparative scatterplots and histograms is also presented.
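As an illustration only (the original workflow was carried out in MS Excel and STATGRAPHICS, not in code), the split-sample periods and the Nash-Sutcliffe efficiency can be sketched in Python. The function names and the sample temperatures are hypothetical; only the period boundaries come from the text. Assuming one record per year, the calibration periods add up to the 68 records quoted above.

```python
import numpy as np

# Calibration and validation periods as stated in the text (inclusive years).
CALIBRATION_PERIODS = [(1867, 1903), (1972, 2002)]
VALIDATION_PERIODS = [(1742, 1754), (1792, 1818)]


def years_in(periods):
    """List every year covered by a sequence of inclusive (start, end) periods."""
    return [year for start, end in periods for year in range(start, end + 1)]


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)


calibration_years = years_in(CALIBRATION_PERIODS)
validation_years = years_in(VALIDATION_PERIODS)
print(len(calibration_years), len(validation_years))

# Hypothetical seasonal temperatures (deg C), for illustration only.
obs = [6.1, 7.4, 5.2, 8.0, 6.6]
perfect = nash_sutcliffe(obs, obs)                        # model reproduces the data
mean_only = nash_sutcliffe(obs, [np.mean(obs)] * len(obs))  # model predicts the mean
print(perfect, mean_only)
```

An efficiency of one corresponds to a perfect match, while a model that only predicts the observed mean scores zero; negative values indicate a model that performs worse than the mean.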

Monthly temperature anomaly scaled index
Information held in the written documentary sources was used to derive temperature-related indices. Different types of indices have been proposed in historical climatology studies (Pfister, 1999, 2001; Brázdil et al., 2005). As a general reference, a seven-point scale was employed, ranging from −3 for "extreme coldness" to +3 for "extreme hotness", with 0 indicating "normal" conditions. However, this ordinal scale offers limited discrimination across the full range of extremes, since it tends to assign all events above a certain level to the same extreme class (Glaser and Riemann, 2009). To obtain a more realistic degree of variability in the temperature modelling, we used a simplified scaled index for a more accurate estimate of extreme anomalies. Examples of such events are recorded only during the Little Ice Age (e.g. rivers freezing), when no instrumental data overlap the calibration period. Based on the above criteria, monthly indices were calculated with more than seven possible classes, so as to preserve the variability described by the written sources, similar to the natural variability, over a longer period than the calibration interval. These classes were allocated by an asymmetric matrix in order to take into account temporal shifts between proxy and actual anomalies in different seasons of the year. For example, a river freezing in March or April is a more negative anomaly than a frozen river in January. Based on these new classification principles, temperature anomalies were coded for winter and summer by means of a monthly-based Temperature Anomaly Scaled Index (TASI), according to an asymmetric matrix (Table 1a). Abrupt jumps from "very cold" to "freezing" in winter (December, January, and February in Table 1a) are due to the lack of appreciable intermediate states during the calibration period. A similar scheme was reproduced for the summer season (June, July and August in Table 1a). Once the magnitude of the indices array was defined, the proxies were transformed into a time series with a clearly defined temporal resolution. An example is given in Table 1b, which reports monthly and seasonal values of the TASI, and the relative source, for the period 1752-1757.
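The step from monthly to seasonal index values can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet procedure: the function name and the index values are hypothetical (they are not taken from Table 1b), and treating months without documentary evidence as 0 ("normal") is an assumption made here for the example.

```python
# Season definitions used in the text (winter: DJF, summer: JJA).
WINTER_MONTHS = ("Dec", "Jan", "Feb")
SUMMER_MONTHS = ("Jun", "Jul", "Aug")


def seasonal_tasi_sum(monthly_tasi, months):
    """Sum monthly TASI values over a season.

    Assumption: months with no documentary evidence count as 0 ("normal").
    """
    return sum(monthly_tasi.get(month, 0.0) for month in months)


# Hypothetical monthly index values; they are NOT taken from Table 1b.
example_months = {"Dec": -1.0, "Jan": -3.0, "Jul": 2.0}

winter_sum = seasonal_tasi_sum(example_months, WINTER_MONTHS)
summer_sum = seasonal_tasi_sum(example_months, SUMMER_MONTHS)
print(winter_sum, summer_sum)
```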

Modelling of sub-regional winter temperatures
In some experimental situations, it is possible to measure more than one response for each case. This is also the case for temperature, which needs multi-scale predictors to be modelled over different space and time domains. In the analysis of these experiments, information from all the collected responses can be combined to provide more accurate parameters and, in turn, more realistic temperature data (after Bates and Watts, 2007). In this way, the information collected was downscaled to reasonably approximate the behaviour of the disturbance terms in the temperature measurements. These approximations rest on the general assumption that air temperature depends on regional-synoptic forcing and local weather conditions.
The regional scale can drive the general temperature trend, while area-specific temperatures are determined by local conditions. Weather variables and climate indices were both used as predictors in the multi-scale regression model.

Inferences for multi-scale temperature estimation
A model of sub-regional temperature estimation was created with the aims of prediction and explanation. For prediction, the model structure was generated based on Box and Draper (1972). In particular, a determinant parameter estimation criterion for multiresponse data was derived upon the primary assumption that the disturbance terms of different cases are uncorrelated. A corollary assumption was that, in a single case, the disturbance terms have a fixed, unknown variance-covariance matrix for different responses. A model was written along this path, assuming M responses (measured on each of N experimental runs) and dependence on P parameters, θ, as referred to by Bates and Watts (2007):

$$ y(T)_{n,m} = f_m(X_n, \theta) + \varepsilon_{n,m}, \qquad n = 1, \dots, N, \quad m = 1, \dots, M \qquad (1) $$

In Eq. (1), y(T)_{n,m} is the temperature random variable associated with the data value of the m-th response for the n-th case; f_m is the model function for the m-th response, depending on some or all of the experimental settings X_n and on some or all of the parameters θ (Fig. 2); ε_{n,m} is the normal random disturbance term, independent from the regressors (sum of errors assumed equal to zero).
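The determinant criterion of Box and Draper can be illustrated with a small synthetic example. The sketch below is not the authors' model: the two response functions, the noise levels and the grid search are assumptions chosen only to show how a single shared parameter is estimated by minimizing the determinant of Z^T Z, where Z is the N x M matrix of residuals.

```python
import numpy as np


def determinant_criterion(residuals):
    """Box-Draper criterion: determinant of Z^T Z for an N x M residual matrix Z."""
    z = np.asarray(residuals, dtype=float)
    return np.linalg.det(z.T @ z)


# Synthetic two-response data (M = 2) sharing one parameter theta (hypothetical).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 30)  # N = 30 experimental settings
theta_true = 0.5
y = np.column_stack([
    theta_true * x + rng.normal(0.0, 0.1, x.size),             # response 1, linear in theta
    np.exp(-theta_true * x) + rng.normal(0.0, 0.01, x.size),   # response 2, nonlinear in theta
])


def residual_matrix(theta):
    """Observed minus modelled responses for a trial value of theta."""
    model = np.column_stack([theta * x, np.exp(-theta * x)])
    return y - model


# Simple grid search over theta; a real application would use an optimizer.
grid = np.linspace(0.1, 1.0, 91)
criterion = [determinant_criterion(residual_matrix(t)) for t in grid]
theta_hat = grid[int(np.argmin(criterion))]
print(round(float(theta_hat), 2))
```

Using the determinant rather than a plain sum of squares lets the criterion account for the unknown covariance between the disturbance terms of different responses, which is the corollary assumption stated above.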
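To contribute to the aim of explanation, we tried to identify influential predictors and gain insight into the relationship between the predictors and the outcome, based on our background in climate history and modelling. Along this path, the model function (f_m), which comprises the (stimulus) variables at regional, (.)_R, and sub-regional, (.)_SR, scales, may depend on the parameters nonlinearly (Eq. 2). The vector term y(T) contains both the B_1 and B_2 parameter matrices. A recursive procedure for the least-squares estimation was performed, imposing restrictions on the entries of matrices B in order to obtain the best fit of a regression equation Y = a + b·X, where Y = model estimates and X = actual data, according to the following criteria:

$$ a = 0, \qquad |b - 1| = \min, \qquad R^2 = \max \qquad (3) $$

where the first condition is to set a null intercept (a), the second is to approximate the unit slope (b) of the straight line that would minimize the bias, and the third is to maximize the goodness-of-fit (R^2) of the linear function.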
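The three conditions on the line relating model estimates (Y) to actual data (X), namely null intercept, unit slope and maximum R^2, can be checked with an ordinary least-squares fit. The sketch below uses NumPy; the function name and the sample values are hypothetical and serve only to show what an unbiased and a biased set of estimates look like under these diagnostics.

```python
import numpy as np


def fit_diagnostics(actual, estimated):
    """Fit estimated = a + b * actual by least squares and return (a, b, R^2)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(estimated, dtype=float)
    b, a = np.polyfit(x, y, 1)            # polyfit returns the slope first
    r2 = np.corrcoef(x, y)[0, 1] ** 2     # for a straight line, R^2 = r^2
    return a, b, r2


# Hypothetical observed temperatures (deg C).
actual = np.array([4.0, 5.5, 6.0, 7.5, 9.0])

# Estimates identical to the data satisfy all three conditions.
a_ok, b_ok, r2_ok = fit_diagnostics(actual, actual)

# Estimates with damped variability: high R^2 but biased intercept and slope.
a_bad, b_bad, r2_bad = fit_diagnostics(actual, 2.0 + 0.5 * actual)
print(a_ok, b_ok, r2_ok)
print(a_bad, b_bad, r2_bad)
```

The second case shows why R^2 alone is not sufficient: a model can correlate perfectly with the observations while still compressing their range, which the intercept and slope conditions are meant to detect.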
Once y(T) is identified and known as a combination of regional and sub-regional components, one can estimate the relationship between the expected temperature T and the predictors. Since the different assumptions cannot be guaranteed a priori, the parameters were estimated using an iterative, knowledge-driven approach to bias correction steps (after Box et al., 1978). For instance, after a first run, it was found that regional temperatures (T_R) were increasingly biased over historical times. Likewise, Mann et al. (2000) found a decreased number of spatial degrees of freedom in the earliest regional inferences (associated with significantly decreased variance). To account for this non-invariance over the historical time-scale, a power law was assigned to T_R with the exponent forced to be lower than one (and finally set equal to 0.5) to rebalance internally the quality of calibration. To extend the procedure for extrapolations outside the range represented by the calibration sample, the model was iteratively rearranged towards a robust solution whereby two additive components are used, a non-linear regional component and a linear, local component (Eq. 4). In Eq. (4), the first term, y(T_MTR), is the seasonal mean temperature output (°C) of the (MTR)-model; T_R is the regional component of temperature (°C) supplied as a boundary condition; the part in brackets is the sub-regional component of temperature (°C) supplied as a local constraint. The square root (power of 0.5) of T_R and the parameter β mainly define the order of magnitude of the process used to downscale the (MTR)-model to the sub-regional scale. The other two terms in the brackets are seasonally-varying (index S) shift parameters (°C) of T_R, which force the model with meteorological (ΣTASI_S, the sum of monthly values of the Temperature Anomaly Scaled Index defined above) and climatological (Ω_S, hereafter indicated as Ω_w and Ω_s for winter and summer, respectively) boundary conditions.

Model parameterization and evaluation
For the (MTR)-model (Eq. 4), the values of the parameters obtained from a particular set of observations with a recursive procedure are: β = 0.268, Ω_w = 11.0, Ω_s = 43.5. Using the estimated parameter values, the non-linear response to T_R is depicted in Fig. 2, as originated by Eq. (1) and translated into Eq. (4) for different values of ΣTASI_S.
The estimates obtained with these parameter values roughly matched the observations.
In Fig. 3, negligible departures of the data points from the 1:1 line (observed versus predicted values) indicate limited bias in the residuals for both the winter (graph a) and summer (graph b) calibration datasets. The Nash-Sutcliffe efficiency index and the correlation coefficient, equal to 0.88 and 0.94 for winter and 0.87 and 0.88 for summer (Table 2), are also satisfactory. Figure 4 shows the results of model validation against independent time-series data. In general, fluctuations of observed and (MTR)-model predicted temperatures compare well in both seasons. In particular, absolute minimum and maximum observed values are both reflected in the predictions (black lines in Fig. 4a and b). The Nash-Sutcliffe efficiency values, equal to 0.66 (winter) and 0.63 (summer), are also satisfactory (Table 2). In contrast, the regional model by Luterbacher et al. (2004) poorly reflects the variability of actual temperature in both seasons (circles in Fig. 4a), as also confirmed by the correlation coefficient and the Nash-Sutcliffe efficiency values (equal to 0.32 and −0.59 for winter, and 0.07 and −0.64 for summer, Table 2). For the (MTR)-model, the distribution of the residuals is quasi-Gaussian (Fig. 5a and b), with the QQ-plots reflecting theoretical values (Fig. 5a1, b1) in both seasons. The independence of errors was also tested, to detect the possible presence of significant autocorrelations among the residuals. Strong temporal dependence may in fact induce spurious relations according to standard inference in an ordinary regression model (see Granger et al., 2001), and the same problem is further increased in the context of nonlinear models (Stenseth et al., 2003). The Durbin-Watson (Durbin and Watson, 1950, 1951) d statistic in the following form was calculated to verify the presence of autocorrelation in the residuals e (the index t indicating the t-th year, N being the number of observations):

$$ d = \frac{\sum_{t=2}^{N} (e_t - e_{t-1})^2}{\sum_{t=1}^{N} e_t^2} $$

Two critical values, d_{L,α} and d_{U,α}, vary depending on the level of significance (α), the number of observations, and the number of predictors in the regression equation. In the calibration dataset, an indication of possible correlation is produced at the 0.01 < α < 0.05 significance level for winter only (Table 2). This may be due to some internal constraint in the calibration stage, probably related to the fact that winter temperatures in the regional dataset and model outputs are more similar in recent times (the period of years used for calibration) than they were in historical times. However, both the calibration results in summer and the validation results in both seasons support statistical independence of the residuals, with type-I error probability of 0.09 and 0.36 of the Durbin-Watson test statistic (Table 2).
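The Durbin-Watson d statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal Python sketch is given below for illustration; the function name and the residual series are hypothetical and are not taken from Table 2.

```python
import numpy as np


def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences over sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)


# Hypothetical residual series (deg C), ordered by year.
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation
persistent = [1.0, 1.0, 1.0, 1.0]      # strong positive autocorrelation

print(durbin_watson(alternating))  # 12 / 4 = 3.0
print(durbin_watson(persistent))   # 0 / 4 = 0.0
```

Values of d near 2 are consistent with uncorrelated residuals, values towards 0 point to positive autocorrelation and values towards 4 to negative autocorrelation; the formal decision still requires the critical values d_{L,α} and d_{U,α}.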
The mean absolute error (0.24-0.33), which is similar between calibration and validation and between seasons, and the other statistics of Table 2 indicate a satisfactory performance for the validation set. This suggests that the proposed approach is a promising tool for future applications in temperature estimation.
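For completeness, the mean absolute error and the correlation coefficient reported alongside the Nash-Sutcliffe efficiency can be computed as in the sketch below. The function names and the sample values are hypothetical; they only illustrate the definitions and do not reproduce the figures of Table 2.

```python
import numpy as np


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observations and predictions."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(obs - pred))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observations and predictions."""
    return np.corrcoef(observed, predicted)[0, 1]


# Hypothetical seasonal temperatures (deg C).
observed = [5.0, 6.0, 7.0]
predicted = [5.5, 5.5, 7.5]

mae = mean_absolute_error(observed, predicted)
r = pearson_r(observed, predicted)
print(mae, r)
```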


Conclusions
The main novelty of this paper is the introduction of a relatively simple model to reconstruct past seasonal (winter and summer) temperature variability at sub-regional scale based on proxy and simulated datasets. In general, using data from different scattered sources to reconstruct climate in Southern Europe is not straightforward.
Data used in previous seasonal temperature reconstructions over Europe, especially over the Mediterranean areas, come from a few early instrumental series (data before 1850) that, by their nature, are difficult to find, evaluate, correct and convert to, or present on, a Celsius scale in terms of temperature anomalies.
The multi-scale regression approach proposed here overcomes the inherent loss of variance in both early instrumental records and univariate least-squares calibration equations. In general, multi-scale, process-based climate models can be accurate. However, the authors argue that improvements in model sophistication may not be as profitable as the ability to reconstruct confidently the overall picture of temperature-related events (and therefore temperature data) over historical times and in different geographical places.
Validation, from this point of view, is a major statistical instrument to develop a reliable model and add robustness to past temperature reconstructions. Furthermore, in this paper, we took advantage of the versatility of the (MTR)-model to evaluate, through documentary proxy data, how the sub-regional temperature signal is driven by local and boundary conditions. The accuracy of these signals depends not only on the intrinsic properties of the model itself, but also on the possibility of recovering homogeneous documentary records able to preserve the climate information and to replicate, through the model application, the actual temperature series. Once such conditions are satisfied, the modelling approach may potentially be suitable for applications elsewhere in the Mediterranean basin, provided that model parameters are documented for sub-regions other than the one investigated here. Further research extending the modelling approach developed here towards other sub-regions of the Mediterranean area would provide additional insight into the implications for the production of valuable knowledge from proxy documentary data and can be considered the natural evolution of this study.

References
in the Mediterranean Basin by means of documentary data and instrumental observations, Climatic Change, 101, 169-199, 2010.
Clemente, G. F. and Margottini, C.: Sistema EVA: una biblioteca di dischi ottici per le catastrofi naturali del passato, Prometeo, 9, 22-29, 1991.
Corradi, A.: Corradi Alfonso: Annali delle epidemie occorse in Italia dalle prime memorie fino al

Table 2. Performance and autocorrelation statistics for the (MTR)-model (Eq. 4) at the calibration and validation stages.