How could phenological records from the Chinese poems of the Tang and Song Dynasties (618-1260 AD) be reliable evidence of past climate changes?

Phenological records in historical documents have proved to be of unique value for reconstructing past climate changes. As a literary genre, poetry reached its peak in China during the Tang and Song Dynasties (618-1260 AD), and it can provide abundant phenological records for a period that lacks phenological observations. However, the reliability of phenological records from poems, as well as the methods for processing them, remains to be comprehensively summarized and discussed. In this paper, after introducing the certainties and uncertainties of phenological information in poems, we discuss the key processing steps and methods for deriving phenological records from poems and using them in studies of past climate change: (1) two principles, namely the principle of conservative and the principle of personal experience, should be followed to reduce the uncertainties; (2) the phenological records in poems need to be filtered according to the types of poems, the background information, the rhetorical devices and the spatial representations; (3) the animals and plants are identified to species level according to their modern distributions and the sequences of different phenophases; (4) the phenophases in poems are identified on the basis of modern observation criteria; (5) the dates and sites of the phenophases in poems are confirmed from background information and related studies. Finally, the temperature anomalies reconstructed from phenological records in poems were compared with those reconstructed from other historical documents in published studies to demonstrate the validity and reliability of phenological records from poems in studies of past climate changes. This paper proved that the phenological records from poems could be useful evidence of past climate changes after being [...]
https://doi.org/10.5194/cp-2020-122 Preprint. Discussion started: 28 September 2020 © Author(s) 2020. CC BY 4.0 License.


The numbers and accessibility of phenological records from poems
By their very nature, poems differ in many ways from documents produced by institutions and from personal diaries as a store of phenological information (Table 2). Poems have evident advantages in the quantity and variety of phenological evidence. According to the Quan-Tang-Shi (the Poetry of the Tang Dynasty) and the Quan-Song-Shi (the Poetry of the Song Dynasty), nearly 50 thousand poems from the Tang Dynasty and more than 270 thousand poems from the Song Dynasty are preserved. The numerous phenological records in these poems include not only non-organic events but also a variety of organic phenomena, most of which concern the phenology of ornamental plants and animals.
However, unlike documents produced by institutions, in which phenological evidence was recorded by dedicated persons, the phenological evidence in poems was recorded more inadvertently. The information on phenophases in poems may be incomplete or ambiguous. A poet usually recorded a specific phenophase only a few times during his lifetime, so the frequency and continuity of that phenophase in his poems are relatively low. Frequency and continuity can be improved only by integrating records of the same phenophase from different poets. In general, the accessibility of phenological records in poems is lower than that in other documents. Take the word "willow" as an example: it is mentioned in 9041 poems in the Quan-Tang-Shi and the Quan-Song-Shi, but clear species names, phenophases, dates and sites can be obtained from only 80 (0.88%) of them. The accessibility of phenological records in poems may also vary with the characteristics of the poets. For example, Li Bai and Du Fu are, respectively, the most representative romantic poet and realistic poet of the Tang Dynasty. According to the Quan-Tang-Shi, Li Bai wrote 896 poems and Du Fu wrote 1158 poems. Among them, 23 (2.56%) poems by Li Bai and 76 (6.56%) poems by Du Fu are related to phenology. Thus, the accessibility of phenological information in poems by Du Fu is more than twice that in poems by Li Bai.
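The accessibility figures quoted above are simple proportions, and the following minimal sketch recomputes them from the counts given in the text. The helper name `share` is hypothetical and is not part of the original study; small differences in the last digit can arise from rounding as opposed to truncation.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage."""
    return 100.0 * part / total

# Counts taken from the paragraph above.
willow = share(80, 9041)   # poems with complete willow records
li_bai = share(23, 896)    # phenology-related poems by Li Bai
du_fu = share(76, 1158)    # phenology-related poems by Du Fu
ratio = du_fu / li_bai     # relative accessibility, Du Fu vs. Li Bai

print(f"willow: {willow:.2f}%  Li Bai: {li_bai:.2f}%  "
      f"Du Fu: {du_fu:.2f}%  ratio: {ratio:.2f}")
```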

Inherent uncertainties of phenological evidence in poems
In addition to the uncertainties arising from data interpretation, calibration, validation and verification, the extraction of phenological evidence from poems has inherent uncertainties in the identification of species, the identification of phenophases, and the ascertainment of dates and sites. These uncertainties should be eliminated before the phenological records are used to reconstruct past climate changes.

Uncertainties in the identification of species
Since the Chinese language has not changed fundamentally over its long history, present-day readers can read ancient poems without much difficulty. Nevertheless, the meanings and expressions of particular words and phrases have changed. Some words or phrases have several additional meanings in ancient Chinese compared with modern usage. For example, the phrase "jin hua" (which mainly refers to a golden flower in modern Chinese) has at least four meanings in the Quan-Tang-Shi, but only one of them is a substantive description of phenology (Table 3).
The different names used for some species in ancient China have also been simplified and unified at present. For example, the Si sheng du juan (Cuculus micropterus) had at least three different names during the Tang and Song Dynasties (Table 4). The names of plants and animals in poems were also mostly recorded at the genus level because modern taxonomic knowledge was lacking. However, different species within the same genus may respond differently to climate change according to modern phenological studies (Dai et al., 2013). Thus, large uncertainties exist in the identification of species in poems.

Uncertainties in the judgment of phenophases
Phenophases in poems are not recorded in strict accordance with modern systematic criteria, but are described through multiple rhetorical devices such as metaphor, personification, hyperbole, quotation, pun and rhyme, so it is difficult to extract clear phenophases from poems. For example, there is a line in a poem by the poet Quan Deyu: "Peonies occupy the spring breeze with their fragrance alone" (footnote 12), which describes the flowering of peonies. However, the phenophase in this line is equivocal because of the personification. To compare the phenological records from poems with the corresponding modern observational phenophases, the exact phenological stage needs to be identified as the first flowering date, the full-flowering date or the end of flowering date. Therefore, uncertainties may arise during the identification of specific phenophases.

Uncertainties in ascertainment of dates
The exact date is the crucial factor for quantitatively evaluating phenological and climatic changes from past to present. By converting the Chinese lunar calendar into the modern Gregorian calendar, the phenophases in the poems can be compared with modern observational phenophases. Unfortunately, the writing time was not consciously recorded for most poems. Any missing information on the year, month or day may cause phenological and climatic reconstructions to fail. For instance, the poet Bai Juyi recorded in his poem: "People are busy in the fifth lunar month because the wheat is yellow in the field." (footnote 13) Here, only the month is directly given in the poem, which is likely to cause uncertainties when deducing the year and the day. To make matters worse, some poems were not improvised on the spot but were written from the memories or imaginations of the poets. The information from poems of this kind needs to be excluded.

Uncertainties in ascertainment of sites
By matching the ancient name of a site with the modern one, the phenophases in the poems can be compared with the corresponding observational phenophases at the same site. However, as with dates, the sites of phenophases in poems are sometimes missing. Even worse, some site names mentioned in the poems are imagined in order to express emotions rather than to record real locations. For example, Lu You wrote in one of his poems: "There are so many willow branches in Ba Qiao, but who would have thought sending one to me?" (footnote 14) Ba Qiao is a location in Xi'an (a city in central China), which is more than 700 km away from the place where Lu You wrote this poem (Chengdu, China). By describing the willow branches in his hometown in this poem, the poet expressed his homesickness. When ascertaining the sites, uncertainties of this kind should be dealt with carefully.

The Methods of Processing Phenological Records in Poems from the Tang and Song Dynasties for past climate studies
To minimize the uncertainty in extracting clear species, phenophase, date and site information from poems, and to make the records comparable with modern observations, several basic principles and processing steps are put forward.

The principle of conservative
The principle of conservative refers to deducing ambiguous information conservatively, so as to keep the characteristics of the phenological information without causing too much deviation. Take the aforementioned poem of Bai Juyi (footnote 13) as an example: according to background information, the poem was written in 807 AD in Xi'an, but the exact date is not recorded. From the poem, we know that the wheat harvest in that year fell in the fifth lunar month (from June 10 to July 8 in the Gregorian calendar). The date of June 10, which is the closest to the modern observations (from May 26 to June 8, with an average of June 2), can therefore be taken as the date of wheat harvest in 807 AD in Xi'an. It should be noted that if the period recorded in the poem overlaps with the time of the modern phenophase, the principle of conservative is inapplicable and the record in the poem is invalid.
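The selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `conservative_date` is hypothetical, both intervals are expressed as month and day in a common placeholder year, and the modern phenophase is represented by its observed range as quoted in the text.

```python
from datetime import date
from typing import Optional

def conservative_date(rec_start: date, rec_end: date,
                      mod_start: date, mod_end: date) -> Optional[date]:
    """Pick the day of the recorded interval closest to the modern range.

    Returns None when the two intervals overlap, in which case the
    principle is inapplicable and the record is treated as invalid.
    """
    if rec_start <= mod_end and mod_start <= rec_end:
        return None
    # Recorded interval entirely after the modern range: take its first day.
    # Otherwise it lies entirely before the modern range: take its last day.
    return rec_start if rec_start > mod_end else rec_end

# Wheat harvest in Xi'an, 807 AD: fifth lunar month = June 10 to July 8;
# modern observations span May 26 to June 8.
Y = 2001  # hypothetical placeholder year; only month and day matter here
picked = conservative_date(date(Y, 6, 10), date(Y, 7, 8),
                           date(Y, 5, 26), date(Y, 6, 8))
print(picked)
```

Returning `None` for overlapping intervals mirrors the statement that such records are invalid, so a caller can drop them with a simple check.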

The principle of personal experience
The principle of personal experience demands that the phenological information described in the [...] can not be used. It takes effort to diagnose the information in some poems. For example, Lu You wrote a poem in 1208 AD: "The Begonias in Biji Fang (place name) are the best in the world. Each branch looks dyed with scarlet blood." (footnote 16) By looking into the life experience of Lu You, we find that this poem records his memory of 1172 AD. Therefore, this record also cannot be used as phenological evidence according to the principle of personal experience.

The key steps of data processing
On the basis of the principles, four steps are required for the processing of phenological records in poems (Figure 1).

Step 1: filtering the records
(1) Filtering the records according to the features of poets and poems
Poems commonly reflect the thoughts and daily lives of the poets. Thus the poems written by people in certain professions who have little contact with phenological events, such as the alchemists mentioned in Table 3, may contain little phenological information. The poems written by alchemists can therefore be excluded to improve the accessibility of phenological evidence from the poems.
Furthermore, the records can be filtered according to the styles of poems and the interests or life experiences of the poets. For example, phenological records are more likely to be extracted from pastoral poems than from history-intoned poems.
(2) Filtering the records according to the background information
The background information of a poem allows us to judge whether the phenophases in the poem actually happened, thus ensuring the effectiveness of the phenological evidence. For example, there is a line by Su Shi: "A few branches of peach blossom outside the bamboo grove, and the ducks will notice the warming of the river firstly." (footnote 17) This line seems to describe natural phenophases in spring. However, the background information shows that the poem was inscribed on a painting. It therefore describes the scenery within the painting rather than real nature, and the record needs to be excluded.
(3) Filtering the records according to the rhetorical devices
It is necessary to determine whether the rhetorical devices used in a poem affect the authenticity of the phenophases. For instance, despite the personification used in the aforementioned poem by Quan Deyu (footnote 12), the poem does reflect the blossoming of peonies. Thus, this poem can be used in the study of past climate changes. By contrast, the line by Lu Zhaoling, "The water in Laizhou (place name) has become shallower several times and how ripe is the peach fruit?" (footnote 18), seems to ask about the timing of a peach phenophase, but it actually quotes the myth that peaches mature once every three thousand years in wonderland. The quotation in this line affects the authenticity of the phenophase. Thus, this record should be eliminated.
(4) Filtering the records according to the spatial representations
For a specific species, phenophases vary with latitude, longitude and elevation. It is necessary to clarify the spatial representation of phenological records in poems and to select records that are not affected by the local microclimate. For example, Bai Juyi recorded in his poem: "All the flowers on the plain have withered in the fourth lunar month, but the peaches in the temple on the mountain just begin to bloom." (footnote 19) This record cannot be directly compared with modern observational data because the difference in altitude between the mountain in the poem and the modern observation site on the plain is almost 1000 meters. Other factors that contribute to spatial differences, such as valleys, depressions and the heat island effect, are also used to filter the records.

[...] Table 5. [...] Table 2).

The mean annual temperatures reconstructed from poems in this study and from documents in Liu et al. (2016) were respectively 0.43 ℃ and 0.29 ℃ higher during the study period (600-902 AD) than at present (1961-1990). During the whole overlapping period (600s-870s), the difference between the temperature anomalies reconstructed from the two data sources did not exceed 0.10 ℃. The two reconstructions showed approximately simultaneous temperature fluctuations, and both indicated a clear shift from warm to cold around the 800s. In both reconstructions, the relatively higher temperatures occurred around the 670s and the 780s, while the colder years mainly appeared in the last decades of the period. Furthermore, the amplitude of the temperature reconstructed from documents was 3.30 ℃, which was very similar to the amplitude of the temperature reconstructed from poems (2.94 ℃) in this study. Generally speaking, the temperature anomalies reconstructed by the two studies are almost consistent.
One of the reasons lies in the lack of sufficient evidence supporting the climatic reconstructions.
Although some studies have reconstructed the temperatures during this period using natural evidence such as tree rings, pollen and sediments (Xu et al., 2004; Zhang et al., 2014; Zhu et al., 2019), their results either cannot cover the whole period or have relatively low temporal resolutions. In addition, these natural proxies are mostly collected from uninhabited areas, so they can hardly be used to further evaluate the interactions between climate change and human activities. In comparison, documentary evidence, which occurs more frequently and is closer to human life, has become an important data source for reconstructing the climate change in this period. As one of the most popular literary forms in the Tang and Song Dynasties, poetry has huge potential to provide abundant and varied phenological information, which will undoubtedly contribute to the study of historical climate change.
Despite this, very few studies so far have used phenological records from poems to reconstruct historical climate change quantitatively, owing to the lack of an effective methodology for data extraction. Unlike climate reconstructions using other proxies, which have standard processing methods and clear reference objects, the processing of phenological records from poems is much more complex.
For example, dating tree-ring samples requires only counting the number of annual rings from the outside to the inside or comparing them with a standard chronology. However, the temporal information in the poems cannot be obtained directly from a reference chronology. As already mentioned, the temporal information in the poems may be hidden in the poet's biography, the official history book, or some related studies. It is necessary to search through these materials one by one and make careful comparisons before ascertaining the exact temporal information, and some information may still be found to be unrecorded after large amounts of material have been searched. The same problem exists when extracting the information on species, phenophases and sites from poems.
We attempt to introduce a standard procedure for extracting phenological records from poems, which could, on the one hand, minimize the uncertainties of the records and, on the other hand, filter out the useless records efficiently. By following the principles and steps, researchers are able to know where to find the information needed and how to deal with the phenological data from poems. The extracted phenological records are comparable with modern observation data and can be used as a proxy for reconstructing climate changes quantitatively.
In this study, we used only 85 phenological records extracted from poems to reconstruct the temperature anomalies for a small area in the Tang Dynasty. This is a case study to demonstrate the reliability of the records in indicating past climate changes. In fact, there are still plenty of phenological records that have not been extracted. By rough estimation, the temporal resolution of the phenological records from poems of the Tang and Song Dynasties can reach at least 20 years. In addition, phenological records from poems of the Tang and Song Dynasties are widely distributed, covering almost all the regions of modern China. The rich records around the capitals and developed cities are of great value for comparison with modern phenological observations. Future work will focus on extracting more records from poems and on developing integration methods for different phenophases at different sites to explore the overall phenological change and climate change over a large region.

Conclusions
In this study, we put forward a processing method to extract phenological information from poems of the Tang and Song Dynasties, which includes two principles (the principle of conservative and the principle of personal experience) and four steps: (1) filtering the records based on the features of poets and poems, the background information, the rhetorical devices and the spatial representations; (2) [...]

The date when Salix spp. and Populus spp. begin to have fluffy catkins

Appendix B: The modern data sources and reconstructing method for the two reconstructions
Modern phenological observation data in Xi'an, which is located in the center of the Guanzhong Area, were derived from the China Phenological Observation Network (CPON). Observations have been kept in Xi'an every year since 1963 except for the period 1997-2002. The corresponding annual mean temperature data in Xi'an were obtained from the Chinese Meteorological Administration. Owing to a lack of data, some modern phenophases were defined based on the meteorological data. For instance, the modern date of spring cultivation was defined as the first day when the daily mean temperature is consecutively higher than 5°C for five days (Ge et al., 2010). The modern date of millet harvest in autumn was defined as the first day when the daily mean temperature is continuously lower than 10°C for five days (Hao et al., 2009).
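The temperature-based definitions above amount to finding the first run of five consecutive days beyond a threshold. The sketch below is a hedged illustration rather than the procedure of Ge et al. (2010) or Hao et al. (2009): the function name `first_run_start` is hypothetical, the temperature series are made-up values, and the "first day" is assumed to mean the first day of the five-day run.

```python
from typing import Optional, Sequence

def first_run_start(temps: Sequence[float], threshold: float,
                    run: int = 5, above: bool = True) -> Optional[int]:
    """Index of the first day that starts `run` consecutive days with daily
    mean temperature above (or below) `threshold`; None if there is no run."""
    count = 0
    for i, t in enumerate(temps):
        ok = t > threshold if above else t < threshold
        count = count + 1 if ok else 0
        if count == run:
            return i - run + 1
    return None

# Made-up daily mean temperatures (deg C), for illustration only.
spring = [3.0, 5.5, 6.1, 4.9, 5.2, 5.8, 6.3, 7.0, 6.6, 8.1]
autumn = [12.0, 9.5, 10.4, 9.8, 9.1, 8.7, 8.0, 7.5]

cultivation_idx = first_run_start(spring, 5.0, above=True)   # spring cultivation
harvest_idx = first_run_start(autumn, 10.0, above=False)     # millet harvest
print(cultivation_idx, harvest_idx)
```

The returned index is a position in the daily series, so it still has to be mapped to a calendar date by whatever indexing the temperature data use.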
After the time series of temperature and phenophases were converted to anomalies with respect to the reference period (1961-1990), the transfer functions between the phenological and temperature anomalies were developed by linear regression, which can be expressed as:
$$y = a x_i + b$$
where $y$ is the annual temperature anomaly and $x_i$ is the phenological anomaly for phenophase $i$. The constants $a$ and $b$, which represent the regression slope and intercept, respectively, are estimated using the least squares method.
Subsequently, the phenophase-specific transfer functions were applied to each historical phenological anomaly to obtain the annual temperature anomalies. If there was more than one record in a single year, the temperature anomaly for that year was calculated as the arithmetic mean of all the reconstructed values in that year.
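The two stages of this reconstruction, fitting a linear transfer function per phenophase and then averaging the reconstructed values within each year, can be sketched as follows. This is a minimal illustration under stated assumptions: all numbers are made up, the phenophase key `hypothetical_first_flowering` and the helper `fit_line` are hypothetical names, and the fit is a plain ordinary least squares fit of a straight line with slope a and intercept b.

```python
from collections import defaultdict
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = a*x + b; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

# Made-up modern anomalies for one phenophase (days) and temperature (deg C).
pheno_anom = [-4.0, -2.0, 0.0, 2.0, 4.0]
temp_anom = [0.9, 0.5, 0.1, -0.3, -0.7]
transfer = {"hypothetical_first_flowering": fit_line(pheno_anom, temp_anom)}

# Historical records: (year, phenophase, phenological anomaly in days).
records = [(807, "hypothetical_first_flowering", -2.0),
           (807, "hypothetical_first_flowering", 2.0),
           (815, "hypothetical_first_flowering", 4.0)]

# Apply the phenophase-specific function to each record, grouped by year.
by_year = defaultdict(list)
for year, phase, anom in records:
    a, b = transfer[phase]
    by_year[year].append(a * anom + b)

# Years with several records take the arithmetic mean of their estimates.
reconstructed = {year: mean(vals) for year, vals in by_year.items()}
print(reconstructed)
```

Keeping one (a, b) pair per phenophase in a dictionary matches the idea of phenophase-specific transfer functions and makes it straightforward to add further phenophases.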
