The 4.2 ka BP Event in northeastern China: a geospatial perspective

The Hunshandake Sandy Lands of northeastern China, currently a semiarid, lightly vegetated region, were characterized by perennial lakes and forest stands in the early and middle Holocene. Well-developed dark grassland-type paleosols (mollisols) at the southern edge of the Hunshandake, dated by optically stimulated luminescence (OSL) to between 6.93 ± 0.61 and 4.27 ± 0.38 ka, together with lacustrine sands at higher elevations dated to between 5.7 ± 0.3 and 5.2 ± 0.2 ka and thick gray lacustrine sediments, suggest a wetter climate. Between 4.2 and 3.8 ka, the region experienced extreme drying that was exacerbated by lake overflow drainage and sapping that depleted the groundwater table. The region supported a robust population, the Hongshan Culture, but was depopulated after 4.2 ka, with migration likely to the Yellow River Valley, where the Hongshan introduced their characteristic cultural elements to early Chinese civilization. Evidence for extreme and sudden environmental change in northeastern China at and following the 4.2 ka BP Event, like that we document in the Hunshandake, is widespread. However, no comprehensive overview of this climatic episode exists. Here, we discuss the relevant events in northeastern China and capture them in a spatially explicit Geographic Information Systems database that can be used to analyze the timing and spatial pattern of climate and environmental change associated with the 4.2 ka BP Event. This approach could serve as a prototype for a global 4.2 ka BP Event database.


Introduction
The Hunshandake Sandy Lands of northeastern China (Fig. 1) are currently characterized by grasslands that overlie semi-stabilized aeolian deposits in their southern and eastern regions and by aeolian sand sheets and dunes in their western region. The current desert-like landscape was covered by lakes and forests during the early and middle Holocene (Jiang et al., 2006; Yang and Scuderi, 2010; Yang et al., 2015; Xiao et al., 2018), with bioclimatic regimes reflecting significantly wetter conditions associated with intensified monsoon precipitation up to 50 % higher than at present (Yang et al., 2011, 2013). The region supported a significant population that lived in small communities and relied on fishing and hunting (Liu and Feng, 2012; Wagner et al., 2013).
A sudden shift from wet to dry conditions in the Hunshandake and across most of northeastern China occurred at ∼ 4.2 ka (Yang et al., 2015). While the primary driver appears to be linked to global-scale change occurring at that time, in the Hunshandake it was exacerbated by rapid groundwater drawdown resulting from drainage capture. This combined climatic and hydrologic reorganization led to a rapid loss of surface water and a shift from green to sandy conditions over a few hundred years. This environmental shift produced regional depopulation, with significant abandonment of sites across the region lasting until ∼ 3.6 ka (Liu and Feng, 2012; Wagner et al., 2013; Yang et al., 2015).
The change in environmental conditions in the Hunshandake is not unique within China. Evidence for a putative 4.2 ka BP Event appears at sites across China's deserts (Feng et al., 2006) and in many records from northeast China (Hong et al., 2001; Liu et al., 2002, 2010; Xiao et al., 2018). Globally, the underlying cause of this event is controversial (Weiss and Bradley, 2010; Weiss, 2016, 2017). Part of the difficulty in deciphering this event lies in a limited understanding of the drivers of the global climate system at ca. 4.2 ka, of the spatial and temporal coherence of the event, and of the sensitivity of different ecologic, hydrologic, and geomorphic systems to forcings of this magnitude.
In this paper we review the environmental change that took place in the Hunshandake bracketing the 4.2 ka BP Event and place it in context relative to other records from northeastern China. To better understand the event's temporal and spatial characteristics, we analyzed the existing literature, developed a geospatially explicit Geographic Information System database from this literature, and used it to map evidence for the 4.2 ka BP Event. We conclude with a discussion of how such a data structure and analysis approach might be used to better understand the 4.2 ka BP Event globally, and we provide the dataset for evaluation and analysis as an online supplement.

Study area
China is characterized by a broad swath of deserts that extend between 38° and 46° N. This desert belt is roughly divided at the Helan Mountains into a western portion of "true" hyperarid and arid deserts and a slightly wetter eastern portion consisting of lightly vegetated and stabilized semiarid to dry subhumid "Sandy Lands". The Hunshandake Sandy Lands (Fig. 1; elevation 1100-1400 m), along with the Horqin and Hulun Buir, are found on the eastern edge of this desert belt in northeastern China. Ecologically the Hunshandake is a semiarid grassland ecosystem underlain primarily by aeolian sandy soils. Monthly temperatures range from −18.3 °C in January to +18.5 °C in July, with annual precipitation across the region between 150 and 450 mm, falling primarily during the summer months.

Holocene climatic history
The Hunshandake Sandy Lands can be subdivided into southern, western, and eastern units. The southern part of the region is characterized by low hills vegetated by a thin cover of grasses and shrubs and lacks standing water. The western part of the region is primarily low hills with a grassland cover and standing lakes, while the eastern portion is grassland and low rolling vegetated dunes with dry lake beds. While all three are currently semiarid and lightly vegetated, they differed significantly in their response to climate change at 4.2 ka.
The Holocene environment of this area is discussed in detail in Yang et al. (2015). We briefly summarize the findings below. Figure 2 illustrates two well-developed dark grassland-type paleosols (mollisols) at the southern edge of the Hunshandake that are identifiable and can be traced across the entire region. Lacustrine sands underlying these paleosols indicate an earlier lake/wetland environment. The lower paleosol is OSL-dated to between 6.93 ± 0.61 and 4.27 ± 0.38 ka BP and suggests a period of wetter climate that rapidly transitioned to dry conditions at ca. 4.2 ka. The southern part of the Hunshandake, as indicated by a second paleosol dated to between 2.82 ± 0.26 and 1.54 ± 0.14 ka BP, returned to green conditions again at ca. 2.8 ka BP and maintained this state for ∼ 1500 years. Dune sediments reflecting an active aeolian environment have dominated since ∼ 1.3 ka.
The eastern Hunshandake (Figs. 3 and 4), while exhibiting evidence of the 4.2 ka BP Event in terms of paleosols and abandoned shoreline features, does not exhibit a return to grassland conditions between ∼ 2.8 and 1.5 ka BP, with sandy conditions persisting since the onset of the 4.2 ka BP Event. As Yang et al. (2015) found, this difference was associated with the redirection of drainage from its northerly flow towards Dali Lake at ca. 4.5 ka by the capture of surface water and the groundwater table by the Xilamulun River. This may be reflected in the longest period of low stands of both Dali (Xiao et al., 2008, 2018) and Dai Hai (Xiao et al., 2004, 2006) lakes between ca. 4.5 and 3.8 ka. The eastward drainage shift resulting from this capture, coupled with channel entrenchment via groundwater sapping, resulted in long-term drying of the eastern Hunshandake, likely moisture enhancement of the downstream Horqin Sandy Land, and the abandonment of the region by the Hongshan culture (Peterson et al., 2010) ca. 4.5-4.2 ka. Recent mapping has revealed a lack of artifacts in the eastern Hunshandake between 4.3 and 3.5 ka (Liu and Feng, 2012; Wagner et al., 2013). In the eastern Hunshandake Sandy Lands, Hongshan artifacts are found primarily within and below the 4.2 ka paleosols and shorelines, while Bronze Age artifacts (Fig. 3b), which appear in the region ca. 3.6 ka (Wagner et al., 2013), lie on or above the 4.2 ka paleosol.

Deriving and understanding the regional signal in northeast China
The 4.2 ka BP Event continues to puzzle scientists with respect to its spatial extent, triggering mechanisms, and regional to global characteristics. As shown above, the difference in response between sites within the Hunshandake separated by less than 100 km illustrates some of the complications that can arise from point source reconstructions. In northeastern China, records are dominated by evidence from sediments and pollen from lacustrine and aeolian environments. At times these records present conflicting views of the 4.2 ka BP Event. In the following we discuss some of the issues with interpreting regional mappings of this diverse set of records and provide an approach to reconstructing the regional signal of this event. Parmesan and Yohe (2003) note that detection of a climate change signal is a search for spatially and temporally coherent sign-switching patterns. This search is complicated by the time-transgressive nature and differing response rates of complex interacting systems, as well as by possible spatially diffusive responses across a landscape. While records from individual sites provide snapshots of what occurred at a given locality, interpretation is often complicated by significant uncertainty in dating and possible regionalization of the observed signal.

Database creation
Placing the Hunshandake record in context relative to northeast China, and within the larger global context, requires the assembly of records from many diverse sources into a sin-    data define the parameters and structure of the information space we were exploring. To do so we utilized conceptual spaces for defining the informational architecture of our reconstruction of the 4.2 ka BP Event in northeast China. We note that such information context extraction may be useful in the identification of similar and differing responses between regions and understanding how these expressions of the 4.2 ka BP Event may have varied spatially and temporally.
We collected and analyzed 60 peer-reviewed articles (listed 1-60 in the online digital archive) pertaining to middle to late Holocene climate and environmental conditions in northeast China. Duplicate records reflecting multiple papers dealing with the same site were eliminated unless we believed that the given paper introduced a new analysis/interpretation for a given site. This analysis included text, figures, and tables. Using a web-enabled open-access semantic parser (Word Count Tool: High Star App, 2017: https://wordcounttools.com/, last access: 28 December 2018) we extracted ∼ 430 000 words (excluding dates and simple words such as "a", "it", and "the"). While the word counter provides extensive information about word use and structure, in our analysis we used only raw word count statistics. From this output we identified the 30 most used words from each article (224 critical words distributed across the 60 articles). These critical words appear 28 289 times (∼ 6.5 %) in the articles analyzed.
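The word-extraction step can be illustrated with a minimal Python sketch. It is not the Word Count Tool used in the study: the stop-word list, the function names `top_words` and `critical_words`, and the two sample sentences are all hypothetical and serve only to show how per-article top-30 lists might be pooled into a set of critical words.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source names only "a", "it", "the", etc.
STOP_WORDS = {"a", "an", "and", "at", "in", "is", "it", "of", "the", "to"}

def top_words(text, n=30):
    """Return the n most frequent words, skipping stop words and numerals."""
    # Alphabetic tokens only, so dates and other numbers are dropped.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

def critical_words(articles, n=30):
    """Pool the top-n words of every article into one overall tally."""
    pooled = Counter()
    for text in articles:
        pooled.update(dict(top_words(text, n)))
    return pooled

# Made-up sample text standing in for two articles.
articles = [
    "The lake sediment record shows a dry lake phase at 4.2 ka.",
    "Pollen and sediment indicate a dry climate; pollen declines.",
]
pooled = critical_words(articles)
```

Words that rank highly in several articles accumulate counts in `pooled`, which is the kind of raw count statistic the text describes.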
We organized the words by similarity and number of occurrences (Fig. 5) and derived database tables (Fig. 6) with explicit geocoding that were input into ArcInfo (ESRI, 2018) for geospatial analysis. We also included paleoclimate reconstructions of mid-Holocene (6.0 ka BP) climatic parameters derived from the Coupled Model Intercomparison Project 5 (CMIP5) data. These downscaled data (30 arcsec resolution) were calibrated (bias corrected) using WorldClim 1.4 (Hijmans et al., 2005) as a baseline for "current" climate and were used as a reference for conditions prior to 4.2 ka. Using ArcGIS we clipped the output for each of these models to our study area (Fig. 1) and included them in our database. This includes layers for monthly average minimum temperature, monthly average maximum temperature, monthly total precipitation, and 19 bioclimatic variables derived from WorldClim data (source: http://worldclim.org/bioclim, last access: 24 January 2019). An example showing modeled 6.0 ka BP annual precipitation from the Beijing Climate Center Climate System Model (BCC-CSM1.1) (Fig. 7) is useful for illustrating the wide range of a single climatic variable across our northeast China study area (∼ 1500 km²) and the potential difficulties arising from using individual site (point) data in assessing the impact of the 4.2 ka BP Event.
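To indicate how bioclimatic layers relate to the monthly layers, the following Python sketch derives five of the 19 standard variables (BIO1, BIO5, BIO6, BIO7, BIO12) for a single grid cell. The monthly values are invented placeholders, not model output, and the function name `bioclim_subset` is hypothetical; the study itself used precomputed layers in a GIS.

```python
def bioclim_subset(tmin, tmax, prec):
    """Derive a few standard bioclimatic variables from 12 monthly values.

    tmin, tmax: monthly mean minimum/maximum temperature (degrees C)
    prec: monthly total precipitation (mm)
    """
    if not (len(tmin) == len(tmax) == len(prec) == 12):
        raise ValueError("expected 12 monthly values per variable")
    tavg = [(lo + hi) / 2 for lo, hi in zip(tmin, tmax)]
    bio5 = max(tmax)   # maximum temperature of the warmest month
    bio6 = min(tmin)   # minimum temperature of the coldest month
    return {
        "bio1": sum(tavg) / 12,   # annual mean temperature
        "bio5": bio5,
        "bio6": bio6,
        "bio7": bio5 - bio6,      # temperature annual range
        "bio12": sum(prec),       # annual precipitation
    }

# Placeholder monthly values for one hypothetical grid cell.
tmin = [-24, -20, -11, -2, 5, 11, 14, 12, 5, -3, -13, -21]
tmax = [-12, -7, 1, 11, 19, 24, 26, 24, 18, 10, -2, -10]
prec = [2, 3, 6, 10, 25, 55, 95, 80, 35, 12, 5, 2]
cell = bioclim_subset(tmin, tmax, prec)
```

Applying the same calculation to every cell of the clipped rasters would give gridded layers comparable to the annual precipitation map in Fig. 7.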
Finally, following the lead of Parmesan and Yohe (2003), we identified the 4.2 ka state change (yes, no) and rate and direction of change (rates: +2 to −2, with 0 being no change and positive or negative indicating an increase or decrease in the measured variable) from the text and figures in each article and linked them in the database to the specific topic group they were derived from. This allows analysis and mapping of the distribution, intensity, and directionality of 4.2 ka marker events, or lack thereof, for the different types of evidence available.
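The coding scheme described above (a yes/no state change plus a rate between +2 and −2, linked to a topic group) might be represented as follows. This is a sketch only: the class name `Evidence`, its field names, and the three sample records are hypothetical and do not reproduce the schema of the published database.

```python
from collections import defaultdict
from dataclasses import dataclass

VALID_RATES = range(-2, 3)  # -2, -1, 0, +1, +2

@dataclass(frozen=True)
class Evidence:
    site: str
    topic_group: str    # e.g. "vegetation", "climate", "geomorphic", "cultural"
    state_change: bool  # was a change at 4.2 ka reported?
    rate: int           # direction and intensity of change, -2..+2

    def __post_init__(self):
        if self.rate not in VALID_RATES:
            raise ValueError(f"rate must be between -2 and +2, got {self.rate}")

def summarize(records):
    """Per topic group: record count, share with a state change, mean rate."""
    totals = defaultdict(lambda: {"n": 0, "changed": 0, "rate_sum": 0})
    for r in records:
        t = totals[r.topic_group]
        t["n"] += 1
        t["changed"] += int(r.state_change)
        t["rate_sum"] += r.rate
    return {
        group: {
            "n": t["n"],
            "share_changed": t["changed"] / t["n"],
            "mean_rate": t["rate_sum"] / t["n"],
        }
        for group, t in totals.items()
    }

# Invented placeholder records, not values from the database.
records = [
    Evidence("site_A", "vegetation", True, -2),
    Evidence("site_B", "vegetation", False, 0),
    Evidence("site_C", "geomorphic", True, -1),
]
summary = summarize(records)
```

Keeping the rate as a small signed integer makes it straightforward to map both the direction and the intensity of change per topic group.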

Analysis
An example of regional results for change at 4.2 ka (presence or absence of 4.2 ka BP Event) is illustrated in Fig. 8. It is clear from the distribution of points that the signature of a 4.2 ka BP Event is strong but not universal, with ∼ 23 % (18/77) of reported sites showing inconclusive or no evidence of the 4.2 ka BP Event. Spatial analysis of the distribution of these records reveals an interesting pattern, with both coastal sites and sites over 750 km from the coast showing consistent evidence for change at 4.2 ka. However, an intermediate band of higher-elevation sites between 300 and 500 km from the coast, primarily derived from measures of vegetation type or abundance, shows inconsistent evidence for the 4.2 ka BP Event. Determining whether this is the result of interpretation errors, lack of sensitivity and/or temporal resolution, or the actual absence of environmental markers for the event will require additional research.
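The proportion quoted above, and a grouping of sites by distance from the coast, can be sketched in Python. Only the 18/77 count and the 300, 500, and 750 km thresholds come from the text; the band labels, the treatment of the 500-750 km interval as its own band, and the sample sites are assumptions made for illustration.

```python
from collections import defaultdict

def distance_band(km):
    """Assign a site to a distance-from-coast band (band edges are assumed)."""
    if km < 300:
        return "<300 km"
    if km <= 500:
        return "300-500 km"
    if km <= 750:
        return "500-750 km"
    return ">750 km"

def share_with_evidence(sites):
    """Fraction of sites per band that show a 4.2 ka change."""
    counts = defaultdict(lambda: [0, 0])  # band -> [sites with change, all sites]
    for km, has_event in sites:
        band = distance_band(km)
        counts[band][0] += int(has_event)
        counts[band][1] += 1
    return {band: hits / total for band, (hits, total) in counts.items()}

# Share of sites with inconclusive or no evidence, from the counts in the text.
share_no_evidence = 18 / 77

# Invented (distance in km, change observed) pairs, for illustration only.
sites = [(50, True), (120, True), (350, False), (420, True), (800, True)]
shares = share_with_evidence(sites)
```

With the real site table, a low share in the 300-500 km band relative to the others would reproduce the pattern described in the text.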
While there appears to be a significant level of coherence that suggests a relatively strong 4.2 ka signature across northeast China, there are examples of local results that are inconsistent with this generalization. As we report in this paper, and in Yang et al. (2015), within the Hunshandake, sites separated by less than 100 km (Haoluku 42.95° N, 116.76° E; Xiaoniuchang 42.62° N, 116.82° E) exhibit signals that differ significantly. Liu et al. (2002) suggested that these differences might be due to a combination of elevation (Xiaoniuchang 1460 m; Haoluku 1295 m), local conditions (Xiaoniuchang at the edge of a lake and Haoluku at its center), and transport of sediment due to changing environmental conditions (Xiaoniuchang dried up earlier than Haoluku, which was still experiencing inflow of coarse sediments). Dating precision and uncertainty, as well as variability in local groundwater conditions and local and regional differences in topographic relief, may also result in apparent differences between local sequences both within the Hunshandake and in other localities that we document. While we currently do not have the ability to recognize different parts of the 4.2 ka BP Event in northeast China, as has been done for the central and western Mediterranean region (Magny et al., 2009), it is clear from the records we analyzed that the event is likely to have had multiple phases in northeast China.
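For reference, the separation between the two sites can be estimated from the coordinates quoted above with a great-circle (haversine) calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the function name `haversine_km` is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Site coordinates as quoted in the text.
haoluku = (42.95, 116.76)
xiaoniuchang = (42.62, 116.82)
separation_km = haversine_km(*haoluku, *xiaoniuchang)
```

With these coordinates the estimate is roughly 37 km, so the two cores lie well within the 100 km scale discussed here.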

Conclusions
The lack of integration of data into a scientifically credible, globally assembled information platform with consistent terminology and definitions to guide scientific inquiry hinders the understanding of the 4.2 ka BP Event. The creation of such an information platform can allow researchers to ask questions about the spatial distribution and environmental indicators that characterize this event. Such a platform can expand the range of research questions that can be tackled, encourage innovative and collaborative research, allow data sharing and comparison of results, and facilitate the development of innovative analytic tools. As Yin (2005) noted, the end goal of this type of research "is a data structure that sits firmly upon the deep-seated, some might say, hard-wired, natural structures of the information architecture".
Using this database approach, we showed the presence of a strong and coherent signal for the 4.2 ka BP Event in northeastern China, albeit with local and regional variation that complicates interpretation. Much of this complication may be the result of the use of different standards, differing interpretations of the data, data gaps, and differential spatial and temporal responses of indicators analyzed and reported. We note three important issues that broaden any palaeoclimatic (2) localization versus regionalization: some measures are "local" while others integrate "regional" conditions (for instance, sediment in a small isolated lake basin versus a lake that is the terminal sink of a large area, respectively); (3) lagged response: the possibility of differential and lagged responses for different measures of the same event.
While much work remains, our prototype database approach, guided by semantic analysis of the literature and georeferencing of existing data sources, can serve as a guide to the assembly of a larger-scale global 4.2 ka database that should allow a better understanding of this climatic event.
The reader is encouraged to use the datasets found in the 4.2 ka data repository to both explore the 4.2 ka relationships in northeast China and to possibly guide the development of similar databases for other regions.

Figure 1. Sandy Lands of northeastern China. (a) Distribution of deserts across China by climate type. (b) Sandy land distribution. Boxed area in Hunshandake Sandy Lands is the study area reported in this study. Regions: south (yellow), west (blue), east (red).

Figure 2. Southern Hunshandake.(a) Section P from Yang et al. (2015) showing paleosols and sandy units which overlie an earlier lacustrine unit.(b) Sampled exposure showing the two paleosols.

Figure 3. Eastern Hunshandake lacustrine deposits capped by paleosols. (a) Eastern Hunshandake paleosol/lakeshore (Section I from Yang et al., 2015). (b) Bronze Age burial (lower left) on the surface of the lake shore exposed by deflation (Section E from Yang et al., 2015).

Figure 4. (a) Core records from Xiaoniuchang (after Liu et al., 2002). (b) General Holocene reconstruction of eastern Hunshandake Sandy Lands (Yang et al., 2015). The coarse sand fraction increases significantly following the 4.2 ka BP Event, with sediments dominated by desert sand sheets and dunes.

Figure 5. Topic map. Word extraction organized by number of words and topics (ovals: red, highest; blue, lowest; increasing number of occurrences from bottom to top) and by four major topic groups (left: vegetation; middle left: climate; middle right: geomorphic/sedimentologic type; right: cultural). Lower center: analysis type. TOC: table of contents.

Figure 6. Overview of the processing schema for database production. Each article was geocoded and combined with tables derived from keywords to produce the final GIS-compatible spatially explicit database for analysis.

Figure 7. Modeled annual precipitation (mm) for 6.0 ka BP (Hijmans et al., 2005) from the Beijing Climate Center Climate System Model (Version 1.1). Seven sites derived from An et al. (2000) are shown (green dots) to illustrate the incomplete spatial distribution of records within individual research records across the region.

Figure 8. 4.2 ka BP Event analysis sites by type of evidence measured (numbered site information can be found in the online data archive). Note the lack of sites in the northern portion of the region (Heilongjiang Province and NE Inner Mongolia). Light blue dots within individual measures indicate sites with no change across the 4.2 ka BP Event. Small purple dots are pollen sites from Ren and Zhang (1998) derived from published work and unpublished dissertations. While not all sites cover the 4.2 ka BP Event, we include the Ren and Zhang data to show the distribution of late Holocene paleoclimatic reconstruction work in NE China.