HadISDH land surface multivariable humidity and temperature record for climate monitoring

Introduction Conclusions References


Introduction
Atmospheric humidity is a term used to describe the amount of water vapour in the atmosphere.Water vapour is deemed an Essential Climate Variable both at the surface and aloft, recognising its importance for observing and characterising climatic system changes (Bojinski et al., 2014).There are a number of different measures used to describe it: -Relative humidity (RH), expressed as a percentage (%rh).The amount of water vapour in the air compared to how much water could potentially be held as a vapour at that temperature.
-Dew point temperature (T d ), wet bulb temperature (T w ) and dew point depression (DPD), expressed in units of temperature ( • C) T w : the amount of evaporative cooling of a thermometer in a moistened wick.Air that is not saturated will evaporate water from the wick, cooling the "wet bulb" thermometer.
T d : the temperature at which the air becomes saturated at that current level of water vapour, measured by artificially cooling a surface until water condenses onto it.
Published by Copernicus Publications on behalf of the European Geosciences Union.
DPD: the amount the air has to be cooled by to reach its dew point temperature.
If T d or T w is equal to the dry bulb temperature or DPD is zero, then the air is saturated.
-Vapour pressure (e) expressed in hPa.The partial pressure exerted by water vapour alone.
-Specific humidity (q) and mixing ratio (r) expressed in g kg −1 : q: the ratio of the mass of water vapour to the mass of moist air.
r: the mass of water vapour compared to the mass of dry air.
Atmospheric humidity plays a key role in both the energy and hydrological cycles.At the surface, it plays an important role in the energy budget through evaporation, removing and storing heat that can be transported elsewhere in the atmosphere and later released through condensation.It is key to the hydrological cycle as a source of water for precipitation.Although changes in humidity may not directly cause changes in precipitation, the specific humidity is related to the amount of rainfall in heavy precipitation events (Trenberth, 1999).The relative humidity at the surface can greatly affect the thermal comfort of humans and livestock.At high temperatures, higher relative humidity reduces the evaporative cooling ability of the body resulting in greater thermal stress.At low temperatures, higher relative humidity leads to more efficient conduction of cool air temperature to the body via contact with clothes and skin.Finally, water vapour evaporated at the surface is often transported aloft where it behaves as a very effective greenhouse gas.Indeed, water vapour is the dominant greenhouse gas in the Earth's climate system.However, water vapour has a short residence time of only a couple of weeks, and fluctuations in water vapour cannot directly lead to long-term changes in the Earth radiation balance, although they can produce feedbacks in response to other mechanisms.
Increasing specific humidity has been observed over most of the land surface and the well-observed parts of the ocean over recent decades (Dai, 2006;Berry and Kent, 2011;Willett et al., 2013b).Given the increase in surface temperatures over the last century across most of the Earth's surface (Hartmann et al., 2013), this is largely expected because of the Clausius-Clapeyron relationship.In regions where availability of water is not a limiting factor, the air will increase its water vapour storage at a rate of ∼ 7 % for every 1K rise in temperature (Held and Soden, 2006).Since early in the 21st century, a decrease in the relative humidity and a plateauing in the specific humidity has been observed (Simmons et al., 2010;Willett et al., 2013b).This is simultaneous with an apparent relative flattening in the rate of recent global (land, air and sea) surface temperature rise (Cohen et al., 2012;Hartmann et al., 2013;Kosaka and Xie, 2013), although this is less clear for global land surface air temperature and also warm extremes of temperature (Seneviratne et al., 2014).Potential mechanisms for these features in humidity have been discussed in the literature (e.g.Brutsaert and Parlange, 1998;Joshi et al., 2008;Rowell and Jones, 2006;Simmons et al., 2010).Much of the moisture over land is transported from the oceans where it was originally evaporated.Although the oceans cover just over 70 % of the Earth surface, ∼ 85 % of atmospheric water vapour is evaporated from oceans with ∼ 15 % coming from evaporation and transpiration over land (Ahrens, 2000).If the air over the ocean surface warms more slowly than that over the land, the water-holding capacity of air will also increase more slowly over the ocean.Therefore, it is plausible that water vapour over the oceans has not increased at a rate high enough to sustain constant relative humidity (and hence proportionally increasing specific humidity) over the warmer land mass (Simmons et al., 2010).Large-scale changes in the atmospheric circulation and reduced moisture availability over land may also play a part (Joshi et al., 2008).
Given the varied ways in which humidity is important to life on Earth, and the recently noted but incompletely understood changes, a comprehensive monitoring product for water vapour and temperature over the land surface is of high value.Few products are currently available and none provides a complete suite of humidity variables.Although all humidity variables can be readily derived from temperature and dew point temperature, the equations are nonlinear.So, to avoid errors, any multi-variable humidity climate data product has to be built from the original measurements (e.g.hourly station data) upwards for each individual variable, rather than from time-averaged variables at any averaged scale (such as daily or monthly).
Reanalysis products are available for land surface temperature and dew point temperature at subdaily resolution.Other humidity variables are easily derived from these.ERA-Interim land surface specific humidity is in very good agreement with observations and considered to be a reasonable estimate (Simmons et al., 2010;Willett et al., 2014a).However, there are problems with homogeneity due to changing data streams and imbalances in the water budget, which are common to many reanalyses (Lorenz and Kunstmann, 2012;Bosilovich et al., 2013).While neither observationonly products nor reanalyses are perfect, the availability of multiple methodologically independent estimates is crucial to understanding the structural uncertainties (Thorne et al., 2005).
The monthly mean gridded HadISDH (currently version 1.0.0.) is the only live (updated annually), land surface observation-only product and at present it is only available for specific humidity.This is a multi-national project led by the Met Office Hadley Centre.It provides quality-controlled and homogenised gridded monthly mean anomalies, actual values, climatological averages and uncertainty estimates, gridded at 5 • by 5 • resolution from 1973 onwards (Willett et al., 2013b).
Herein, HadISDH is expanded to include a full suite of humidity variables, as listed above, alongside the simultaneously observed land-surface air temperature.This new version is now referred to as HadISDH.2.0.0 with individual variables known as HadISDH.landq.2.0.0 for example.Monthly mean 5 • by 5 • grids for the land surface are provided from 1973 onwards (to the end of 2013 at time of publication), and will be updated annually.Like HadISDH.1.0.0,HadISDH.2.0.0 is quality controlled at the hourly level, averaged to monthly means and then homogenised to remove non-climatic data artefacts from the series.Care has been taken to ensure physical consistency across the variables where possible.Monthly mean anomalies are created relative to the 1976 to 2005 period, chosen to maximise station coverage, and then gridded.Recognising that errors remain in the data to some extent, the uncertainty model designed for HadISDH.landq.1.0.0 is expanded to estimate uncertainty for all variables.To the authors' knowledge, HadISDH.2.0.0 will be the first operational purely in situ multi-variable landsurface humidity product, and also the first to provide an estimate of uncertainties in the data for all variables.This product is designed for assessing changes over large scales.While the data are intended primarily for scientific research, they are freely available to all at www.metoffice.gov.uk/hadobs/hadisdh.
The remainder of the paper is structured as follows.Section 2 describes the build process for HadISDH including source data and quality control for minimising random error.Section 3 explains the multi-variate homogenisation process for removing systematic error in detail.The uncertainty estimation model is described in Sect. 4. Additional detail to Sects.2, 3 and 4 can be found in the Supplement.Section 5 discusses the leading features apparent in the HadISDH product including long-term trends and comparison with other data-products.Product availability and user information is covered in Sect.6.Finally, Sect.7 concludes.

Methodological steps in HadISDH processing
HadISDH uses the quality-controlled HadISD (Dunn et al., 2012;Supplement) hourly station T d and T data as its basis, which in turn uses NOAA's synoptic (hourly) Integrated Surface Database (ISD) (Smith et al., 2011; http://www.ncdc.noaa.gov/oa/climate/isd/index.php).Hence, the name HadISDH signifies a Met Office Hadley Centre led project utilising ISD Humidity data.The T d data most likely originate from paired wet bulb and dry bulb thermometer measurements (often referred to as psychrometers) in the majority of cases.In some cases, particularly in the more recent record, an electronic sensor may have been used to output either RH or T d , or a dew cell may have measured T d directly.The instrument type information is not available in an accessible format for all stations.
During HadISD processing a climatology period of 1976-2005 was chosen as this maximised the station coverage.After quality control, 3679 of the 6103 HadISD.1.0.2.2013p stations have sufficient data with which to calculate a reasonable estimate of the monthly mean climatology (Supplement).Additionally, 673 of the 3679 HadISDH stations have been merged from multiple stations where they appeared to be the same station.This was done as part of HadISD processing.The full HadISD station list is available at www.metoffice.gov.uk/hadobs/hadisd/v102_2013f/files/hadisd_station_info_v102.txt with variable specific lists alongside the HadISDH data product at www. metoffice.gov.uk/hadobs/hadisdh.

Deriving the humidity variables from temperature and dew point temperature
The hourly station data pass through several stages of processing before being gridded, as depicted in the flow chart in Fig. 1.The quality-controlled hourly T and T d are converted to T w , q, e and RH at each time point.All relevant equations can be found in Table 1 and are identical to those in Willett et al. (2013b).Specific humidity (Eq.2) is calculated by first converting T d to e (Eq.3).Where the calculated T w (Eq.5) values are below 0 • C, values of e are recalculated with respect to ice (Eq.4).This assumes that the wet bulb was in fact an ice bulb at that time and that the measurement Specific humidity (q) in g kg −1 q = 1000 0.622e P mst −((1−0.622)e)Peixoto and Oort (1996) (2) Vapour pressure with respect to water (e) in hPa (when + 3.46 × 10 −6 P mst Buck (1981) (3) substitute T for T d to give saturated vapour pressure (e s ) Vapour pressure with respect to ice (e ice ) in hPa (when (5) Station pressure (P mst ) in hPa 5.625 List (1963) (6) P msl is pressure at mean sea level, Z is height in metres, T in Kelvin Relative humidity (RH) in %rh RH = 100 e e s -( 7), q q s can be substituted for e e s .
was taken with a wet bulb thermometer.This potentially introduces a dry bias in q and e when T is near 0 • C where resistance or capacitance sensors are used -such automated sensors are more common in the most recent years.It is assumed that any large biases introduced by this are detected and accounted for in the homogenisation process (Sect.3).Furthermore, values of q and e are low in cold conditions, so absolute errors will be small even if they are large in percentage terms.This issue should bear little influence on the large-scale assessments of q and e.For RH, dry biases could be up to 4 %rh, increasing as T w rises towards 0 • C. It is not possible to identify which instruments were used where and when on the global scale and so this problem is impossible to correct for and remains an uncertainty in the data that is very difficult to quantify accurately.In converting to e, q and T d a climatological monthly mean station pressure (P mst ) is used.Station pressure from HadISD is not always available, or of suitable quality and so cannot be used.The effect of a 1 hPa change in P mst is ∼ 0.1 % in terms of q (Willett et al., 2008), hence assuming a standard pressure of 1013.25 hPa could introduce considerable error.To maximise coverage and minimise errors climatological monthly mean sea level pressure (P msl ) is obtained from the 20th Century Reanalysis V2 (20CR, Compo et al., 2011; data provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, http://www.esrl.noaa.gov/psd/).This is available for 2 • by 2 • grid boxes and has been averaged over the 1976 to 2005 climatological period.Gridded P msl is converted to P mst , using station elevation (Z in metres) and station climatological monthly mean T (in Kelvin), by an equation based on the Smithsonian Meteorological Tables (Eq.6; List, 1963).Using a climatological P msl introduces small errors (∼ 0.1 % in q for 1 hPa) at the hourly level.The conversion to P mst from climatological T introduces small errors that are progressively larger for higher elevation stations.For example, for stations at 2000 m and temperature differences (from climatology) of ±20 • C, an error in q of up to 2.3 % could be introduced to the individual hourly values.However, the majority of stations (89 %) are below 1000 m, where potential error for ±20 • C reduces to ∼ 1 % and then 0.5 % for 500 m.During any month the actual T and P msl will vary above and below the estimated climatological values due to synoptic and mesoscale as well as diurnal variations and so essentially resulting errors in P mst should cancel out or at least reduce by averaging.Ultimately, using a non-varying station pressure (year-to-year) ensures that any trends in q and e originate entirely from the humidity component as opposed to changes in T introduced into station pressure indirectly through conversion from P msl .Hence, for studying long-term trends inqanomalies, this method is sufficient.However, users of actual monthly mean q, e, RH and T w should be aware of the small potential errors noted above.

Processing of hourly values to monthly climate anomalies
All hourly variables are then converted to monthly mean values (Fig. 1c, and see Supplement).The absolute monthly mean values are passed and returned from the homogenisation software, which internally creates climate anomalies in order to perform the homogenisation.For consistency, homogenisation is applied directly to the core variables of T and DPD first (Fig. 1d, PHA: Pairwise Homogenisation Algorithm).The change-point locations from these are then used indirectly to apply adjustments to all other variables, in addition to applying DPD change points to T (Fig. 1e, ID [indirect] PHA).This is explained fully in Sect.3. Climate anomalies are then derived from the homogenised station data by simply recalculating monthly mean climatologies.In some cases these stations now have insufficient data with which to calculate a climatology.This occurs for two reasons.Firstly, during direct homogenisation (PHA), individual months of data can be removed.Secondly, the initial climatologies were built from monthly-hour climatologies.In a few cases monthly-hour means that contributed to a monthly-hour climatology were insufficient to create a monthly mean, resulting in insufficient monthly means to create a climatology.To ensure physical consistency, monthly homogenised T d is calculated by subtracting DPD from T .
Resulting spatial and temporal coverage is shown in Figs. 2 and 3, respectively (see also Figs.SM1 and SM2 for other variables in the Supplement).Note that coverage differs slightly between variables due to the homogenisation process which removes further stations, and some data periods (direct PHA only).The removal of supersaturated and subzero (q and e only) stations is discussed in Sect.WMO IDs/colours are bounded as follows: Europe: 0-199 999 (dark red), Russia/Eastern Europe: 200 000-379 999 (crimson), Central Asia/Middle East/India/Pakistan: 380 000-439 999 (dark orange), South East Asia/East Asia/China: 440 000-599 850 (orange), Africa: 600 000-689 999 (gold), USA/Canada: 690 000-749 999 (dark green), Central America: 760 000-799 999 (olive green), South America: 800 000-879 999 (royal blue), Antarctica: 880 000-899 999 (sky blue), Pacific Islands (Inc.Hawaii): 911 000-919 999 (navy), Papua New Guinea/Australia/New Zealand: 930 000-949 999 (lilac), Indonesia/Philippines/Borneo: 960 000-999 999 (purple).It should be noted that intermittency in station records is a common in situ observing system problem.While in some cases the observations do not exist, more likely, they exist but have not made it electronically into various archives due to technical problems at that time.Retrospective improvements do occur sporadically thanks to the archiving work of host institutions (e.g.NCDC in the case of ISD) and individual countries but it would take significant international effort to recover all missing records.The intermittent missing data are further compounded by removals during QC processing but this is necessary to ensure high-quality data.

Homogenising all variables in a consistent manner
For any analysis of long-term trends and variability, the data product must be free from features that are not of climatic origin.Systematic changes in the data that originate from changes to the observing system or its environment arise for multiple reasons.These change points need to be identified and the data adjusted to remove the inhomogeneities.
Locating change points and adjusting for inhomogeneities is non-trivial, especially with large station networks where the homogenisation process must by necessity be automated to be both tractable and reproducible.Records of changes to instruments, shelter, location, observing practice etc. may exist on paper or in digital form but there is presently no globally complete digital archive.For global-scale projects the majority of homogenisation processing is therefore necessarily undertaken using statistical frameworks.Large inhomogeneities that occur distinctly in time and space from any other inhomogeneity within the network are easy to detect.However, it is most likely that change has been ubiquitous across the global station record, ranging from large to very small changes in the mean with additional changes in the variance that likely differ depending on the season or even other climate variables -e.g.temperature biases could depend upon whether the sky is clear or cloudy.Comparisons of long-term trends in both the raw and homogenised data are made in Sects.5.2 and 5.3.
There are very few automated homogenisation methods able to be applied to networks of the order of several thousand stations (Venema et al., 2012).NOAA NCDC's Pairwise Homogenisation Algorithm (PHA; Menne and Williams, 2009;Supplement) has been used on the Global Historical Climate Network -Monthly (GHC-NMv3, Lawrimore et al., 2011) and was also used for HadISDH.landq.1.0.0.2012p (see Sect. 3 of Willett et al., 2013b).It has been tested against a set of benchmarks for the USA (Williams et al., 2012) and through the COST HOME benchmarks (Venema et al., 2012).Overall, it was found to bring temperature data closer to their "truth" but be overly conservative in places; i.e. its adjustments tended to be too few while it avoided making adjustments where none were needed.For the multi-variable HadISDH it is a good choice both because of its previous validation and also because it is computationally efficient.This allows multiple runs during method testing and efficient updating.The code is freely available from http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#phas.
As the equations for conversion between humidity variables (Table 1) are nonlinear, it would introduce substantive errors to simply homogenise monthly T and T d and then use these to calculate monthly values of all of the other variables.However, straightforward application of PHA to the monthly means of each variable separately can also result in physical inconsistencies.For example, a positive adjustment in T and RH should not coincide with a negative adjustment in q because that is not physically possible.Furthermore, some variables have better signal-to-noise ratios than others such that their change points are more easily detected.Within the HadISDH network, station 557 730 (Pagri, Tibetan plateau, China) has conspicuous changes in DPD between 2000 and 2007 (Fig. 4a) but this is not detected in q when PHA is applied directly (Fig. 4b).If the PHA-based change-point locations from DPD are used to apply adjustments to q, then a small change is applied (Fig. 4c).This suggests that by using the change points located within a variable with higher signal-to-noise ratio the smaller, less detectable inhomogeneities in the candidate variable can be adjusted for.Given the above, our strategy involves both direct and indirect application of the PHA algorithm (henceforth called PHA and ID PHA) as shown diagrammatically in Fig. 1.
In HadISDH2.0.0 all direct PHA detected change points in monthly T and DPD are used in applying appropriate adjustments to monthly series of all other variables (ID PHA, Fig. 1f).Additionally, ID PHA is used to apply extra adjustments to T using change-point locations found in DPD.Finally, a homogenised T d product is obtained by subtracting homogenised DPD from homogenised T (Fig. 1g).In this case conversion at monthly resolution is appropriate because the equation is linear.

Results of applying direct PHA to T and DPD
The distributions of adjustment size for both T and DPD are close to Gaussian and fairly evenly distributed over time (Fig. 5a and e).The "missing middle" is where change points most likely exist but are undetectable by homogenisation algorithms due to the low signal-to-noise ratio for small  Example of an inhomogeneity having a greater signalto-noise ratio in DPD compared to q for station 557 730, Pagri, Tibet.(a) DPD annual mean series for raw data (red) and direct PHA homogenised data (blue) for 557 730 and its raw neighbour series (black).(b) q annual mean series for raw data (red) and direct PHA homogenised data (blue) for 557 730 and its raw neighbour series (black).(c) q annual mean series for raw data (red) and indirect PHA homogenised data (blue) for 557 730 and its raw neighbour series (black).
magnitude inhomogeneities (Brohan et al., 2006).This is clear in Fig. 5a and c.A far greater frequency of change points is found for DPD (2.53 change points per 41 year station record) versus T (1.34 per station).When additional DPD change points are used to apply adjustments to T (see Sect. 3.2,Fig. 1e) this "missing middle" is substantially filled resulting in 3.6 change points per station (Fig. 5c) which supports the decision to use the additional DPD change points.
For direct PHA, T adjustments had a slight negative tendency (mean adjustment size = −0.10).The addition of the DPD change points reduces the negative tendency substantially (mean adjustment size = −0.02).Inevitably, some change points from DPD may originate from the wet bulb thermometer alone such that an adjustment is applied to T erroneously.
In such cases, adjustments applied will be extremely small because if no inhomogeneity exists then there should be no significant difference between the station minus neighbour difference series before and after the change point.
The change-point frequency over time commonly reduces away from the middle of the time period for direct PHA (Fig. 5b and f) to zero change points in the first and last two years.PHA does not allow change points to be detected in the first or last two years of any time series.Furthermore, its power to identify and adjust for change points is limited near the start and end of records.This is due to the reduced window width (homogeneous subperiod) available because shorter periods are more sensitive to inter-and intraannual variability.Where an adjustment cannot be robustly estimated (defined with reasonable confidence) then no adjustment is applied and the related sub-period of data is removed.This is the reason for the severe drop-off in station counts at the beginning and end of the record for T , T d and DPD shown in Figs.3a, b and SM2c.It also explains the higher density of missing data in the middle of the record in DPD, T d and T compared to the other variables that have undergone ID PHA.In ID PHA these change points are noted and adjustments applied regardless (no data are removed)hence the additional frequency of change points at time period ends in the first and last few years of the record shown in Fig. 5d.
In some cases the size of the adjustments is very large -six stations have adjustments in T greater than 5 • C with one of 13.99 • C (WMO ID 714963; Table SM1).Given their large adjustment sizes, these six stations are removed.In such cases the likely cause is either a bad merging decision (Sect.2) or a station move of considerable horizontal and/or vertical distance.Five out of the six stations are merged during HadISD processing and so can be considered as bad merges.Work is ongoing to investigate these merges further as they are necessarily based on automated statistics to some degree.Willett et al. (2013b) found that the merges improved spatio-temporal coverage considerably but not all of the 673 stations were analysed in detail.Although PHA can remove these inhomogeneities it is better to remove these stations altogether given that there are known problems.These problems would affect all variables and so it is sufficient to base this station removal on T only but then remove the stations from processing for all other variables too.

Results of applying ID PHA to T , q, e, RH and T w indirectly from change points in T and DPD
To our knowledge this is the first time that ID PHA has been used.ID PHA does not remove any data from station records.This results in better spatial and temporal coverage but leaves the ID PHA less conservative than direct PHA as an adjustment will always be applied.However, ID PHA and direct PHA yield very similar products, especially for largescale averages.For q, regional trend differences between ID PHA and PHA are within 0.01 g kg −1 decade −1 for the globe, Northern Hemisphere and tropics.The regions with drying trends are far more pronounced when using ID PHA.There are greater differences for RH with ID PHA, resulting in smaller regional trends.These are no longer significant for the globe and Northern Hemisphere in the ID PHA case.
With ID PHA there is a much greater chance of having consistency across variables and there is a greater frequency of change points than would be the case for direct PHA on these variables which we believe to be more realistic.This is both because change points from both T and DPD are applied and also because DPD often has a better signal-tonoise ratio than q and e.The distributions of change points by magnitude and frequency over time for ID PHA on T and RH are shown in Fig. 5c, d, g, h.Here it is clear that the "missing middle" is infilled to some extent and that there is a greater frequency of change points towards the endpoints of the record.The multi-variable approach of ID PHA may be an improvement over direct PHA in its ability to detect small inhomogeneities.However, without multi-variable benchmarks (Venema et al., 2012;Williams et al., 2012;Willett et al., 2014b) this is impossible to test fully.

Processing the homogenised stations into gridded products
All homogenised stations with enough monthly data to calculate a climatology are gridded.A further requirement is that station data must not show supersaturation (all humidity variables) or subzero values (for e, q and RH only).The application of ID PHA is intended to maintain the physical consistency across variables compared to direct PHA but there are cases where it does not.We have removed all such stations with physically unrealistic data from gridding and further analysis.This is discussed further in Supplement Sect.7. Table SM1 lists the number of stations removed because of this for each variable and station locations are shown in Fig. 2 and Fig. SM1.This is a minor problem for e, q and RH where only 28 stations are removed due to supersaturation and 52 stations removed due to subzero values (e and q only).This is significant for T w (808 stations removed) because it is much closer in value to the dry bulb temperature.This means that even small adjustments can result in supersaturation.Nearly all stations north of 60 • N are removed from the gridded T w product resulting in very few stations common to all variables above this latitude (Fig. 2a).Grid box averages are simple means of all stations present within the grid box for that month.Grid-box averages are made of the absolute values, climate anomalies, climatology and standard deviation in the climatology in addition to the uncertainty estimates.While there are limitations to the method presented here, none appears large enough to affect the large-scale features of the data (see Sect. 5).It is likely that these methods will be less suitable for types of variables that have shorter spatial correlation distances (e.g.precipitation).We attempt to quantify the uncertainty remaining due to both applied and missed inhomogeneities to some extent in Sect. 4 and further comparison between the raw and homogenised data is given in Sect. 5.

An uncertainty model for all variables
There are various sources of uncertainty whose effects propagate from the raw hourly observations, through the conversion to the chosen variable, averaging to monthly means, conversion to climate anomalies, homogenisation, gridding and spatial averaging.Here we estimate these components given the current state of knowledge.The framework is based around that described initially in Brohan et al. (2006) and then adapted for humidity and PHA homogenisation in HadISDH.1.0.0 (Willett et al., 2013b).Here it is expanded for application for all variables.All uncertainties are described as standard uncertainties (1σ ) and provided as 2σ in the final data-product.This coverage factor (k = 2) means that there is 95 % confidence (2σ ) that the true value of each quantity lies in the range ±2σ about our estimate.Each uncertainty component is available individually as a monthly gridded product.The individual uncertainties are assumed to be independent and so are combined in quadrature for each grid box.Regional uncertainties are also calculated, by additionally including a component for spatial coverage uncertainty due to missing data.

Station uncertainty
The uncertainty estimate for the station provides an individual value for each monthly mean anomaly based on uncertainty in the climatology calculation given missing data (clim), in any homogenisation adjustments made and missed (adj) and also from the initial hourly measurement (ob).All components are described fully in Willett et al. (2013b).An overview and the differences necessitated by the multivariable approach are described in Supplement.To briefly recap, for any monthly mean anomaly our model is as follows (note Eqs.(1) to ( 7) are in Table 1): Specifically, the measurement uncertainty expands the HadISDH.1.0.0 model out to all variables.As we do not have instrument type information for any of the data, we make the assumption that all measurements were taken using aspirated psychrometers.This is an overgeneralisation, especially over the last two decades when electronic sensors became more common.RH instruments are most prone to errors at high humidity whereas psychrometers are most inaccurate at low and HadISDH, it may be possible to identify for some stations whether they are automated or manual, and changes over time from the original ISD code.An assumption could be then made that all automated stations use sensors and all manual or non-identifiable stations used psychrometers.This would allow a more specific measurement uncertainty model to be applied.

Gridding of uncertainty components, grid box sampling uncertainty and spatial coverage uncertainty in large-scale averages
HadISDH is a 5 • by 5 • gridded product in which all values within a grid box for each month are simply averaged with weighting for stations' latitude/longitude/elevation.No further interpolation or smoothing is performed.The overall station uncertainty for a grid box is estimated as the square root of the sum of the squared station errors u 2 anom divided by the number of stations in the grid box.
Month to month, the number of stations within a grid box will vary for very many grid boxes.Furthermore, given the relatively sparse sampling of stations within each grid box, the grid box value is unlikely to be the true grid box value even in the limit that all contributing stations' data were perfect.This sampling uncertainty term is described fully in Willett et al. (2013b), and follows Brohan et al. (2006) (see also Jones et al., 1997).To briefly recap, the grid box sampling uncertainty SE 2 is estimated by where s2 i is the variance of the grid box anomalies calculated over the 1976-2005 climatology period, r is the average inter-site correlation and N s is the actual number of stations contributing to the grid box in each month of the record.
When obtaining global and regional time series for the variables in HadISDH, the time-varying coverage has to be accounted for.The same methodology of the earlier HadISDH.1.0.0 (Willett et al., 2013b), following Brohan et al. ( 2006) is used, and recapped here.While redoing the calculations an error was found in the code used for the coverage uncertainties of HadISDH.1.0.0, which meant they were overestimated in Willett et al. (2013b).This has been rectified here resulting in uncertainties of approximately half the magnitude.
ERA-Interim data (Dee et al., 2011) are used as they have good agreement with the in situ surface humidity variables (Simmons et al., 2010).Monthly surface fields are available for T and T d , and fields for q and RH have been created and provided by A. Simmons.Using equations from Table 1 we have derived DPD fields directly from T and T d , e from T d and surface pressure and T w from T , T d and surface pressure.These monthly resolution conversions for e and T w are not ideal given the nonlinear nature of the equations.This is sufficient though, for estimating the coverage uncertainty.For each of the seven variables, for each month in the HadISDH anomalies, the ERA-Interim anomalies from all matching calendar months are selected (i.e. for a January in HadISDH, all Januaries in ERA-Interim are selected).The ERA-Interim fields are then masked by the spatial coverage in HadISDH for that particular month and a cosine-weighted regional average is calculated.The residuals between these masked averages and the regional average calculated from the observed grid boxes are only then calculated.From the distribution of these residuals the standard deviation is extracted and used as the spatial coverage uncertainty for that month in the regional HadISDH time series.
On a month-by-month basis, the sampling and station uncertainties from each grid box are treated as independent and combined in quadrature.The resulting regional uncertainty is the square-root of the sum of the normalised cosine-latitudeweighted squares of the individual grid box uncertainties.This regional uncertainty is then also treated as independent of the overall coverage uncertainty for the region and combined in quadrature with it to obtain the final 2σ uncertainty on the area average time series.

Variability and recent trends in land surface
humidity and temperature

Seasonal and interannual variability
The annual climatological averages for 1976-2005 are shown in Fig. 6 and Fig. SM5.For T , T d , T w , e and q the patterns are very similar.All variables peak across the tropics and decrease polewards.A dry/cool arm extends southwards down to the Himalaya across eastern Asia.The patterns in T are most similar to those in T w whereas T d appears more similar to q and e. RH and DPD are very different from the other variables but similar to each other.They show little zonal stratification but have regionally consistent dry zones over high elevation and desert regions (e.g.southwestern USA/northwestern Mexico, southern Africa, Australia, northern and western Africa, central Asia including the Gobi desert and the Tibetan Plateau).Climatological limits for each variable are given in Supplement Table SM3.Note that these are monthly averages for the grid box and so extreme hourly values at a station may be considerably higher and lower.

Spatial patterns in long-term changes in land surface humidity and temperature
Long-term trend assessment requires homogeneous data free from non-climate influences, so it is important to compare the raw and homogenised data.Figures 7a-c to 10a-c show comparisons for grid box decadal trends for T , T d , q and RH.Trends are all calculated using the median of pairwise slopes (Sen, 1968).The effects of homogenisation on the spatial trend patterns are relatively small.For all variables, homogenisation has led to a reduction in the apparent extreme tails of the distribution of grid box trends (panel c).We may expect inhomogeneities to lead to unrealistically large grid box trends in some cases so this suggests that homogenisation is performing reasonably.
Panel a shows where decadal trends have changed sign after homogenisation, or where they have been enhanced or reduced.Very few clear patterns emerge.The smallest effect of homogenisation is in T and T w , whilst the greatest effect is in RH and DPD (panel b).For T , T w , T d , q, e, RH and DPD, 96.5, 91.7, 86.3, 90.2, 90.0, 74.2 and 69.7 % of grid boxes, respectively retain the same trend sign after homogenisation.Changes to grid box trends for q and e are almost identical, and changes in the southern tip of South America and around the east of the Arabian Peninsula are common to T d and T w as well.For RH and DPD there is little regional pattern (either in terms of each variable or when comparing the two) to the grid boxes which have switched sign, but homogenised grid boxes with changed trend direction are much more widespread than for the other variables.This suggests that RH and DPD are much more sensitive to inhomogeneity than the other variables.This may be because of their greater variability and their joint dependence on measurements of temperature and humidity.It may also be the case that many of the grid box trends lie close to a zero trend, hence a change in trend direction only requires a small change in the data.
Without the aid of variable-specific benchmark tests it is impossible to measure exactly how well the homogenisation is performing other than comparing the resulting dataproduct with expectation from theory.The good spatial consistency shown in the climatologies (Fig. 6, Fig. SM5), despite no spatial smoothing beyond the grid box, in addition to the agreement between variables and with expectation provides support to the efficacy of the homogenisation approach.Figures 11 and SM6 show mapped homogenised decadal trends.Grid boxes with black dots identify trends where the 5th to 95th percent confidence intervals of the trend are of the same sign.These are considered to be significantly different from a zero trend.For all variables the vast majority of grid boxes exhibit moistening (T d , T w , q, e) and warming (T ) trends with widespread consistency.For T , 88.6 % (90.2 %) of grid boxes after (before) homogenisation experience significant warming.As T w is governed by both changes in water vapour and T it is expected that this variable would be the most similar to T in spatial patterns of trends.This is indeed evident here.For T w , 82.1 % (80.7 %) of grid boxes after (before) homogenisation experience significant increases.Of the grid boxes common to T and T w , 95.5 % have trends of the same sign (94.5 % increasing, 0.9 % decreasing).
The variables that are solely affected by the amount of water vapour in the atmosphere (T d , q and e) mostly show significant moistening of the homogenised (raw) trends, making up 73.6 (67.6 %), 74.1 (69.3 %) and 73.1 % (69.1 %) of grid boxes, respectively.There are some drying trends over the typically desert regions (southern South America, southwest North America, southern Australia, southern Africa) and also in south-eastern China which in many grid boxes are signifi-cant.The Southern Hemisphere drying regions are also common to T w , but in this case they are not significantly different from a zero trend.The close agreement between these variables is a strong indicator of the internal consistency of the HadISDH data product.90.6 % of grid boxes common to q, e and T d are of the same sign (83.1 % increasing, 7.6 % decreasing).From Eq. ( 2), it is clear that e should be approximately 1.6 times larger than q.For the decadal trends the mean ratio of e to q is indeed 1.6 : 1.Although the standard deviation is reasonably large at 0.7, only 1.7 % of grid boxes have a ratio of less than 1.
Trends are strongest over northern high and mid-latitudes for T , T d and T w , whereas for q and e they are strongest over the tropics and around the Mediterranean.This is consistent with basic physics.Following Clausius-Clapeyron we expect increases in q and e to be larger in warmer, more humid regions.Changes in T d are greater in drier regions where there is less water vapour and smaller in more humid regions where there is more water vapour.For example, a 1 hPa increase in e from 1 to 2 hPa results in an increase of around ∼ 8 DPD and RH present more diverse spatial patterns in longterm trends.Both variables are distinct measures of closeness to saturation and as such have very similar patterns to each other.For RH almost an equal number of homogenised (raw) grid boxes show significant increases and decreases -29.8 % (40.6 %) becoming less saturated and 26.5 % (26.3 %) becoming more saturated.Similarly, DPD grid boxes with significant trends are of mixed sign with 39.6 % (46.2 %) becoming less saturated and 22.1 % (19.3 %) becoming more saturated.Overall, the high latitudes and tropics have become more humid (closer to saturation, decreasing DPD, increasing RH) while the mid-latitudes, including the Mediterranean, have become drier (less saturated, increasing DPD, decreasing RH).For RH and DPD 75.2 % of grid boxes concur on trend direction (31.8 % increasing saturation, 43.4 % decreasing saturation).
The regions of strongly increasing saturation (e.g.India and West Africa) are also regions of strong increases in water vapour (increasing e and q).However, regions of strongly reduced saturation (e.g.subtropics to mid-latitudes in both hemispheres) are not generally associated with decreasing water vapour (decreasing q and e).Although the amount of water vapour (e and q) is increasing for the most part over northern mid-latitudes, these increases appear insufficient to keep the same level of saturation (constant RH) as these regions are also rapidly warming.

Long-term trends in large-scale regional averages
For all variables, regional averages are created for the globe (70 • S to 70 • N), the Northern Hemisphere (20 to 70 • N), the Southern Hemisphere (70 to 20 • S) and the tropics (20 • S to 20 • N).Each grid box is weighted by the cosine of its latitude to account for the change in grid box area with latitude.Note that assuming complete spatial coverage over land this results in a global average comprising of 62, 24 and 14 % of grid boxes from the Northern Hemisphere, tropics and Southern Hemisphere, respectively, For q this breakdown is very similar at 64, 24 and 12 %, hence the global average is dominated by Northern Hemisphere signals, Trends have been fitted and significance of the trend direction tested as described in Sect.5.2.
For long-term trends in large-scale averages, homogenisation has had little impact in the well-sampled Northern Hemisphere and the global average.Figures 7 to 10 (panels d-g) show raw and homogenised regional average time series and trends for T , T d , q and RH.Overall, the homogenised trends Table 2. Regional trend direction summary before and after homogenisation.+ significant increasing trend, − significant decreasing trend, / no significant trend.Boxes are labelled for warming (warmer), increasing water vapour/closeness to saturation (wetter), decreasing water vapour/closeness to saturation (drier) and no change (blank).are very similar to the trends from the raw data.The homogenised trends are slightly smaller in magnitude in all cases except global and Northern Hemisphere T d , where they are very marginally larger (0.02 • C decade −1 and 0.01 • C decade −1 larger, respectively).Regional trend directions, where they are significantly different from a zero trend, are summarised in Table 2 for PHA and ID PHA compared to the raw data.For T , q and e there is no change in trend direction for any region.For T d and T w the trend over the Southern Hemisphere changes from decreasing and increasing, respectively to no trend in both cases.
The largest change occurs in RH where the significant decreasing (becoming less saturated) trends for the globe and tropics are no longer significantly different from a zero trend after homogenisation.For DPD, all trends in the raw data are significantly increasing (becoming less saturated).After homogenisation, there is no longer any significant trend direction in the tropics.
The coherent conclusion here is significant warming and increasing water vapour over the majority of sampled global land areas over the last 40 years.This is true for the global average and for the Northern Hemisphere.It is not true for the Southern Hemisphere, which, although warming, does not exhibit significantly increasing water vapour in any humidity variable.The surface air over the Southern Hemisphere land is becoming less saturated.The limited land area and poorer data coverage may be factors here.Sparser observational coverage leads to greater sensitivity to uneven spatiotemporal sampling of stations and errors remaining in the data.For the tropics and Southern Hemisphere decadal trends in q are present at only 49 and 59 % of grid boxes, respectively.For the globe this rises to 72 %, largely driven by the relatively well sampled Northern Hemisphere at 84 %.As such, there is much larger uncertainty over the tropics and Southern Hemisphere.This is reflected in the spatial coverage uncertainty estimates discussed below and shown in Figs. 12 and 13.
Figures 12f, 13f and 13n show continuation of the decline in RH since 2000 noted in Simmons et al. (2010) and recent State of the Climate reports (Willett et al., 2011(Willett et al., , 2012(Willett et al., , 2013a(Willett et al., , 2014a)).This is apparent in the global, Northern Hemisphere and Southern Hemisphere time series although too short to be considered a long-term trend at present.As shown in Fig. 11d, it is dominated by the mid-latitudes while the high latitudes and Northern Hemisphere tropics, including the Indian sub-continent show increasing saturation.For the most part this replicates findings in Simmons et al. (2010) for ERA-Interim.However, aside from West Africa and India, ERA-Interim shows drying across the Northern Hemisphere tropics.ERA-Interim drying over central Africa is thought to be unreliable (Dee et al., 2011) and cannot be compared with HadISDH as there is no data coverage.Given the poor data coverage and large uncertainties for HadISDH over the tropics, the validity of the increased saturation shown in the Caribbean for HadISDH is unclear.These features of reduced saturation are also shown in the DPD data (Figs. 12g,13g,13o and SM6c and 12g).
Regardless of whether the Caribbean is becoming more saturated or not, the zonal nature of these features suggests that there must be a large-scale contributing factor, possibly linked to atmospheric circulation The flattening since 1998, commonly referred to as the "warming hiatus", is less apparent in T than in T d , T w , q and e.As discussed earlier, these humidity variables depend substantially on evaporation over the ocean.The faster warming over the land relative to the oceans (Joshi et al., 2008;Simmons et al., 2010;Sánchez-Lugo et al., 2014), especially in summertime extremes of temperature (Seneviratne et al., 2014), is likely to be important.Insufficient water is being evaporated at cooler temperatures over oceans to maintain a constant RH for the higher temperatures over land.Furthermore, a reduction in moisture availability over land can in turn allow temperatures to escalate further (greater sensible heating) than in the presence of moisture because no energy is lost to evaporative processes (latent heating) (Brabson et al., 2005).Changing land use patterns may also play a part as urban, agricultural, grazed and forest surfaces all have very different properties in terms of water storage, run off, evaporation, evapotranspiration and energy balances.
From previous modelling studies it was thought that RH would not change much on multi-decadal timescales, at least in the large-scale average (Manabe and Wetherald, 1975;Allan and Ingram, 2002).More recently, Richter and Xie (2008) show modelled increases in RH of the order of 1 %rh over the surface oceans under an A1B emissions scenario out to 2090.They also show that this layer of increasing RH is very thin and that above the boundary layer (> 925 hPa) widespread decreases in RH are found over the subtropics and midlatitudes.As a land-only product, HadISDH RH cannot be directly compared with the results of Richter and Xie; but the decreasing land surface RH shown here suggests that largescale RH can indeed change on multidecadal timescales.
There are a few other products available that also provide global temperature and humidity variables over similar periods.Comparison with these is a useful test of validity for HadISDH.The CRUTEM4 (Jones et al., 2012) and GHCN-Mv3 (Lawrimore et al., 2011) are land-based temperature data sets routinely used for climate monitoring.The ERA-Interim reanalysis (Dee et al., 2011) is included here for T , q and RH comparison, following comparison with earlier HadISDH humidity and CRUTEM temperature products (Simmons et al., 2010;Willett et al., 2013b).ERA-Interim fields for T d are available and DPD has been derived from the monthly T and T d fields.We have also calculated monthly mean e and T w fields for ERA-Interim using 6-hourly T and T d fields.Finally, we show CRUTS3.21(Harris et al., 2014) for e.While CRUTS3.21 is not intended as a monitor-ing product due to infilling with climatology in data-sparse regions we expect there to be good agreement over wellsampled regions as this infilling occurs mostly pre-1960s.Global average time series are compared with all data sets regridded to HadISDH resolution and resampled to only those grid boxes where HadISDH has data.Difference series and correlations for the standardised climate anomalies (not detrended) are shown in Fig. 12.
In general, agreement between HadISDH and the other products is very good both intra-and inter-annually.Correlations are remarkably high (> 0.99) for T .Agreement is slightly poorer for RH although still strong at R = 0.817.This adds further support to the validity of HadISDH (and ERA-Interim).ERA-Interim changed the source of the ingested SSTs in June 2001, which has previously been thought to be the primary cause of a small shift between ERA-Interim and other products around that time (Simmons et al., 2010).This is not obvious in the difference series although some digression around that time is apparent for all variables other than T .Interestingly, for q, RH, e and T w HadISDH exhibits a slightly more humid period between 1997 and 2004.
The uncertainty ranges are sufficiently small at the global level (Fig. 12), that we can be quite certain that the most recentdecade was both warmer and moister than 1973-1982.For all regions, the combined uncertainty is dominated by the spatial coverage and station uncertainty.The latter is dominated by uncertainty in adjustments applied and missed during homogenisation, which reduces to near zero towards the present day.Temporal sampling uncertainty is negligible in all cases at these large regional scales.Interestingly, station uncertainty (essentially homogeneity uncertainty) makes a much larger contribution to combined uncertainties for q, e and RH compared to T d , T w , DPD and T .This suggests that both station quality and station density are the primary issues for producing robust humidity analysis whereas station quality is less of an issue for temperature.Uncertainties are similar in magnitude for the Northern Hemisphere and the globe in all cases (Fig. 13).For the direct humidity variables (q, e, RH, T d and DPD), coverage uncertainties are much larger over the tropics than the Northern Hemisphere.This is both due to the poorer spatial coverage and greater variability in terms of humidity.For the thermally driven variables of T and T w , there is far less variability in the tropics, thus mitigating against the effect of poorer spatial coverage.Hence, uncertainty in the tropics for T and T w is smaller than for the Northern Hemisphere and the globe.In all cases uncertainty in the Southern Hemisphere is considerable, leaving little confidence in the robustness of any signal over this region.changes in spatio-temporal coverage and thus the temporal behaviour of large-scale averages.These are insufficient to change the decadal trend magnitude in the global average out to three significant figures.

Comparison between
In terms of the homogenisation methods alone, the gridded homogenised product 2.0.0 is essentially the same as 1.0.0 for large-scale averages.HadISDH.2.0.0.landq produces significant decadal trends for the globe (tropics) that are 0.01 g kg −1 decade −1 smaller (larger).There is no difference for the Northern Hemisphere and changes are not significant for the Southern Hemisphere in either product.Differences for the global average are shown in Fig. 12 with a correlation of > 0.99.At the individual grid box level, 11 % of grid boxes now contain trends of a different direc-tion.These are almost all over the drying regions shown in Fig. 11c, these are more spatially widespread and consistent in HadISDH.2.0.0.It is more likely that these widespread drying signals are real as we would not expect errors in the data to have such a spatially smooth signal, providing further confidence for the methods of HadISDH.2.0.0.

Data availability and logistics
The gridded product of HadISDH used here is HadISDH.2.0.0.2013p.Table SM3 documents the fields available.It is freely available for research purposes from www.metoffice.gov.uk/hadobs/hadisdh,along with supporting material, diagnostics and also some of the source code used in development.Individual stations are also available on request.Version control will follow the HadISD format (Dunn et al., 2012) with HadISD updates being fed through to HadISDH.HadISDH version control and format are fully described on the download web page.The version of the pairwise algorithm used is that associated with the GHCN v3.2 release and can be downloaded from http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#phas.While great effort has been made to ensure high quality and long-term homogeneity of the data, all users are advised to use the uncertainty estimates and station numbers contributing to each grid box mean where possible.
Furthermore, there is some instability resulting from periodic ISD updates and improvements to the historical data.For each update an assessment will be made of any resulting differences in HadISDH.This will be documented on the website.
Feedback is very much appreciated and future versions/annual updates will endeavour to address any issues found.All updates, related work in progress and notes of interest are written on the HadISDH blogsite hadisdh.blogspot.co.uk.Users can also keep up to date with developments from the Met Office Hadley Centre HadOBS twitter account @metofficeHadOBS.

Concluding discussions
HadISDH.2.0.0 is the first, to the authors' knowledge, multivariable humidity and temperature in situ observations-only land climate data product that is homogenised and annually updated.Gridded monthly mean climate anomalies (and other statistics including uncertainty estimates) are provided for T , T w , T d , q, e, RH and DPD from 1973 to the end of 2013.Efforts have been made to ensure consistency across variables such that they can be used together.It is designed as a tool for climate monitoring, where long-term stability is essential, providing a variety of physically consistent humidity variables which have different uses for understanding of the Earth-atmosphere system and societal applications.It is also useful for evaluation and validation of climate models and reanalysis products.The new ID PHA homogenisation method applied here, which has been developed to expand the application of existing PHA software in a manner that is consistent across several simultaneous variables, is shown to perform well for all considered variables.This new method results in a greater frequency of change points identified for each variable than there would be otherwise, filling in the "missing middle" of inhomogeneities typically too small to detect with a single variable approach.
Given that the dates and character of inhomogeneities are either unknown or not documented in a useable digitised format, the only method of validation available is comparison with other available products and expectation based on basic physics.On both counts, the homogenised HadISDH product performs well, agreeing with the temporal evolution of estimates available from ERA-Interim, CRUTEM4, GHCN-Mv3 and CRUTS3.21 and expectations based on the Clausius-Clapeyron relationship.
Efforts have been made to maximise spatio-temporal coverage of the data.However, missing data are prevalent, particularly at the start and end of the period of record and from 2005 onwards for 128 USA stations.Data are missing for three main reasons: they are not present in the ISD archive; station records are too short or have insufficient data over the climatology period  to be included in our selection; and data have been removed by the QC or homogenisation process.In many cases, short station records often originate from the same station but have different reporting IDs.These can be merged into one long record, which has been done for some HadISDH stations.However, the selection and merge process employed here has been static since 2008 due to the considerable effort required to undertake such a task.A revisit is planned which should improve on spatio-temporal coverage.Although the change in spatio-temporal coverage over time is substantial, it appears to have little impact on the large-scale features.There are no obvious signals in the data that would be consistent with the pattern of missing data over time and there is good agreement between HadISDH and other data sets that do not use the same station selection.
Statistically significant long-term warming trends are found for the globe, both hemispheres and the tropics.Statistically significant long-term increases in water vapour (moistening trends) are found for the globe, tropics and Northern Hemisphere.The long-term trends in RH are significantly negative in the Southern Hemisphere but insignificant in the globe, tropics and Northern Hemisphere.However, the decline in global land RH since 2000, first noted in ERA-Interim (Simmons et al., 2010), is apparent in our globe, Northern Hemisphere and Southern Hemisphere averages.This decline approximately coincides with the socalled "warming hiatus".Interestingly, the flattening is an obvious feature in the regional average time series for q, T d and T w , and less apparent for T .
As expected following the Clausius-Clapeyron relationship, the largest increases in water vapour are found over the warm and water-rich regions of the tropics and the Mediterranean.There are significant increases in RH over much of the tropics and high northern latitudes, and the largest increases in T d and T w are found in the latter.Thus the land surface air is becoming more saturated in these regions.Despite increasing water vapour, over the mid-latitudes, including the Mediterranean, there are decreasing RH trends: the air near the surface over land is becoming less saturated.It is clear that the decline in regionally averaged RH since 2000 comes from these mid-latitude regions -it is not a truly global signal.It is possible that the increasing automation of observing instruments, which began in earnest from the late 20th century, is a contributing factor.However, the widespread signal is temporally consistent across both hemispheres and physically plausible.Furthermore, the data have been homogenised which should have removed any large artefacts from changes in instruments over time.
The regionally distinct features in RH, the flattening of the water vapour (q, e and T d ) trends and the decline of RH, are likely to be related.Drivers are not investigated here, but we suggest that atmospheric circulation changes, land-sea warming disparities and reduced water availability/changed land surface properties are potential contributing factors.
Uncertainty estimates, combining uncertainties for spatial coverage, temporal sampling and station issues of measurement, homogeneity and climatology uncertainty are provided for each grid box and for the large-scale averages.At large scales these are dominated by spatial coverage and additionally station uncertainty for q, e and RH.Station uncertainty is mostly driven by homogeneity uncertainty for both applied and missed adjustments.Thus, the best way to reduce uncertainty is through improving spatio-temporal coverage by greater data-sharing and increased data-rescue efforts (Thorne et al., 2011;Allan et al., 2011) and improving homogenisation by increasing the digital availability of metadata (i.e.instrument type, station type -manual or automated).
Both q and RH are relevant to the hydrological cycle.In particular, q governs the amount of precipitation during heavy rainfall events (Allen and Ingram, 2002).RH is important for both the occurrence of a precipitation event and the capacity of the air for evaporation (Richter and Xie, 2008).As water vapour is a greenhouse gas, any implications of it increasing are obviously important for the radiation budget.The energy balance at the surface is also relevant given both the capacity for evaporation as governed by RH and the amount of latent heat stored in the atmosphere as inferred by q.
Although lower RH in the mid-latitudes reduces the thermal stress on humans and mammals, should this trend continue it may have adverse effects on agriculture through the prevalence of drought.Conversely, increasing RH in already physiologically stressful regions of the tropics may challenge human health (Taylor, 2006;Kjellstrom et al., 2008;Willett and Sherwood, 2012).Overall we show robust signals of increasing water vapour over the last 40 years alongside changes in the level of saturation over the last decade that could have implications for society.Continued monitoring of both temperature and humidity variables will aid further understanding of both the causes and implications of these changes.

Copyright statement
The works published in this journal are distributed under the Creative Commons Attribution 3.0 License.This license does not affect the Crown copyright work, which is re-usable under the Open Government Licence (OGL).The Creative Commons Attribution 3.0 License and the OGL are interoperable and do not conflict with, reduce or limit each other.

© Crown copyright 2014
The Supplement related to this article is available online at doi:10.5194/cp-10-1983-2014-supplement.

Figure 2 .
Figure 2. Final station coverage maps with supersaturated and subzero error stations identified that are not included in the final HadISDH data sets.(a) Good stations common to all variables, (b) additional temperature stations, (c) additional dew point temperature stations, (d) additional specific humidity stations, and (e) additional relative humidity stations.See Fig. SM1 for other variables.
3.3.The tail-off at either end of the record and especially post-2005 is partly due to the initial HadISD data selection which maximises coverage over the 1976-2005 climatology period and has remained static since version 1.0.0. of HadISD.This will www.clim-past.net/10

Figure 4 .
Figure 4.Example of an inhomogeneity having a greater signalto-noise ratio in DPD compared to q for station 557 730, Pagri, Tibet.(a) DPD annual mean series for raw data (red) and direct PHA homogenised data (blue) for 557 730 and its raw neighbour series (black).(b) q annual mean series for raw data (red) and direct PHA homogenised data (blue) for 557 730 and its raw neighbour series (black).(c) q annual mean series for raw data (red) and indirect PHA homogenised data (blue) for 557 730 and its raw neighbour series (black).

Figure 5 .
Figure 5. Adjustment sizes and locations applied by PHA (a, b, e, f) and ID PHA (c, d, g, h) to all monthly mean station time series combined.(a,c, e, g) Distribution of adjustment size (black line) with a Gaussian fit (grey line), a best-fit merged line taking the larger wings and Gaussian fit over the missing middle (red dashed line) and the difference between the best-fit line and actual distribution (blue dotted line) for temperature, dew point depression and relative humidity, respectively.The mean and standard deviation of the differences are shown.b,d,f,h) Distribution of adjustments over time for temperature (a, b, c, d), dew point depression (e, f) and relative humidity (g, h), respectively.

Figure 7 .
Figure 7.Comparison of decadal trends from 1973-2013 between raw and homogenised monthly mean temperature.(a) Ratio of raw to homogenised trends for each grid box (a value of > 1 means that the raw trend is greater than the homogenised trend).(b) Scatter plot of raw verses homogenised decadal trends for each grid box with percentage presence in each quadrant.(c) Trend size distribution for raw and homogenised grid boxes.(d) Large-scale area-average time series (weighted by the cosine of the latitude) and decadal trends calculated using the median of pairwise slopes with 5th to 95th percentile slopes as confidence intervals.

Figure 8 .
Figure 8.Comparison of decadal trends from 1973-2013 between raw and homogenised monthly mean dew point temperature.(a) Ratio of raw to homogenised trends for each grid box (a value of > 1 means that the raw trend is greater than the homogenised trend).(b) Scatter plot of raw verses homogenised decadal trends for each grid box with percentage presence in each quadrant.(c) Trend size distribution for raw and homogenised grid boxes.(d) Large-scale area-average time series (weighted by the cosine of the latitude) and decadal trends calculated using the median of pairwise slopes with 5th to 95th percentile slopes as confidence intervals.

Figure 9 .
Figure 9.Comparison of decadal trends from 1973-2013 between raw and homogenised monthly mean specific humidity.(a) Ratio of raw to homogenised trends for each grid box (a value of > 1 means that the raw trend is greater than the homogenised trend).(b) Scatter plot of raw verses homogenised decadal trends for each grid box with percentage presence in each quadrant.(c) Trend size distribution for raw and homogenised grid boxes.(d) Large-scale area-average time series (weighted by the cosine of the latitude) and decadal trends calculated using the median of pairwise slopes with 5th to 95th percentile slopes as confidence intervals.

Figure 10 .
Figure 10.Comparison of decadal trends from 1973-2013 between raw and homogenised monthly mean relative humidity.(a) Ratio of raw to homogenised trends for each grid box (a value of > 1 means that the raw trend is greater than the homogenised trend).(b) Scatter plot of raw verses homogenised decadal trends for each grid box with percentage presence in each quadrant.(c) Trend size distribution for raw and homogenised grid boxes.(d) Large-scale area-average time series (weighted by the cosine of the latitude) and decadal trends calculated using the median of pairwise slopes with 5th to 95th percentile slopes as confidence intervals.

Figure 11 .
Figure 11.Decadal trends from 1973-2013 for monthly mean climate anomalies relative to 1976-2005.Trends are fitted using the median of pairwise slopes.Black dots show trends that are considered to be significantly different from a zero trend -where the 5th and 95th percentiles of the pairwise slopes are in the same direction.(a) Temperature, (b) dew point temperature, (c) specific humidity and (d) relative humidity.See Fig. SM6 for other variables.

Figure 12 .
Figure 12.Comparison global (70 • N to 70 • S) average time series of annual mean climate anomalies from HadISDH.2.0.0 with difference series for other monitoring products spatially matched to HadISDH grid box size and coverage.HadISDH uncertainty ranges for station (blue), sampling (red) and coverage (gold) uncertainty contributions combined are shown.All series have been given a zero mean over the common 1981-2010 period.Decadal trends with 5th to 95th percentile confidence ranges are shown for HadISDH and correlation coefficients of the standardised climate anomalies (not detrended) are shown.(a, h) Temperature, (b, i) wet bulb temperature, (c, j) dew point temperature, (d, k) specific humidity, (e, l) vapour pressure, (f, m) relative humidity, (g, n) dew point depression.

Table 1 .
Equations (1) to (7) used to derive humidity variables from dry bulb temperature and dew point temperature.

Clim. Past, 10, 1983-2006, 2014 www.clim-past.net/10/1983/2014/
Electronic sensors are also more prone to drift over time, especially after exposure to high levels of specific humidity, though this effect is mitigated by more frequent replacement of sensors.Overall, psychrometers are thought to be prone to greater errors, particularly at low humidity, so our assumption can be seen as providing a conservative uncertainty estimate.In future iterations of HadISDH we plan to incorporate an uncertainty model for RH sensors and address the challenge of homogenisation of data in which sensors are replaced/recalibrated on a time scale of at least once a year, as practiced in the UK (M.Molyneux and N. Mander, personal communication, 2014).For future versions of HadISD, Clim.Past, 10, 1983-2006, 2014 www.clim-past.net/10/1983/2014/humidity.