Articles | Volume 22, issue 6
https://doi.org/10.5194/cp-22-1125-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
Unravelling the tree cover dynamics over the last 20 000 years on the Northern Hemisphere
- Final revised paper (published on 05 Jun 2026)
- Preprint (discussion started on 30 Dec 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-6393', Qiong Zhang, 25 Jan 2026
  - AC2: 'Reply on RC1', Anne Dallmeyer, 25 Mar 2026
- RC2: 'Comment on egusphere-2025-6393', Anonymous Referee #2, 09 Feb 2026
  - AC1: 'Reply on RC2', Anne Dallmeyer, 25 Mar 2026
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Publish subject to minor revisions (review by editor) (17 Apr 2026) by Bette L. Otto-Bliesner
AR by Anne Dallmeyer on behalf of the Authors (27 Apr 2026)
ED: Publish as is (23 May 2026) by Bette L. Otto-Bliesner
AR by Anne Dallmeyer on behalf of the Authors (27 May 2026)
Overall assessment
This manuscript presents a comprehensive and ambitious evaluation of Northern Hemisphere tree-cover dynamics over the last ~20 kyr, combining a transient MPI-ESM1.2 simulation with the recently developed hemispheric REVEALS-based tree-cover reconstruction by Schild et al. (2025). The study goes well beyond a descriptive model–data comparison by systematically diagnosing spatial patterns of agreement and disagreement, disentangling climatic drivers using emulation and GAM approaches, and explicitly discussing structural limitations of current dynamic vegetation models (DVMs).
The manuscript is clearly written, methodologically sophisticated, and addresses questions that are highly relevant to the paleoclimate community, particularly those working at the interface of palaeodata synthesis and Earth system modelling. The use of a transient simulation rather than time-slice experiments is a major strength, as is the explicit treatment of non-linearity in climate–vegetation relationships.
Overall, I find this to be a strong and publishable contribution, but in its current form it would benefit from clarifications, tighter framing of some conclusions, and a more critical separation between (1) climate biases, (2) vegetation model structure, and (3) reconstruction limitations. My comments below are intended to strengthen the robustness and interpretability of the results rather than to challenge the major findings.
1. Interpretation of REVEALS tree cover versus modelled absolute cover
A central issue in this manuscript is the comparison between REVEALS-derived tree cover (which sums to 100% vegetation) and MPI-ESM absolute tree cover including bare ground. The authors correctly acknowledge this mismatch and justify their choice to compare absolute PFT area (fi) rather than relative fractions (ci).
However, this choice has far-reaching implications for the interpretation of MAE, variance differences, and the systematic bias pattern (overestimation at low tree cover, underestimation at high tree cover). At present, these implications are discussed mainly qualitatively.
I suggest that the authors (1) add a concise conceptual clarification (possibly a schematic or boxed explanation) of how REVEALS tree cover should be interpreted in open landscapes and how this affects the MAE and variance metrics; (2) state more explicitly that part of the diagnosed non-linear bias pattern (Fig. 6) is methodological rather than purely ecological or model-structural, especially in sparsely vegetated regions; and (3) consider whether a sensitivity comparison using relative cover (ci) for selected well-vegetated regions (e.g. mid-Holocene Europe) could help bound the uncertainty. This clarification is important because many readers may otherwise interpret MAE patterns too directly as "model error".
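To illustrate the methodological component, the following minimal sketch uses hypothetical numbers that are not taken from the manuscript. It assumes that absolute tree cover equals relative tree cover scaled by the vegetated fraction of the grid cell; the function names and all values are illustrative only. The model is constructed to match the reconstruction exactly in relative terms, yet the MAE against absolute cover is non-zero wherever bare ground is present.

```python
def absolute_cover(relative_cover, vegetated_fraction):
    """Absolute tree cover as a fraction of the grid cell.

    Assumption for this sketch: absolute = relative * vegetated fraction.
    """
    return relative_cover * vegetated_fraction


def mean_absolute_error(a, b):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical grid cells, ordered from sparsely to fully vegetated.
reveals_tree = [0.10, 0.20, 0.60, 0.80]    # tree share of vegetation (REVEALS-like)
model_rel_tree = [0.10, 0.20, 0.60, 0.80]  # model agrees perfectly in relative terms
veg_fraction = [0.30, 0.50, 0.90, 1.00]    # vegetated share of each grid cell

model_abs_tree = [
    absolute_cover(c, v) for c, v in zip(model_rel_tree, veg_fraction)
]

mae_relative = mean_absolute_error(model_rel_tree, reveals_tree)
mae_absolute = mean_absolute_error(model_abs_tree, reveals_tree)

print(f"MAE using relative cover: {mae_relative:.4f}")
print(f"MAE using absolute cover: {mae_absolute:.4f}")
```

In this toy case the entire error comes from the definition mismatch, which is why a bounded sensitivity test with relative cover would help readers judge how much of the reported MAE is methodological.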
2. Mid-Holocene forest maximum: model limitation or forcing limitation?
A key result is the failure of MPI-ESM to reproduce the reconstructed mid-Holocene tree-cover maximum across large parts of the Northern Hemisphere, with the model instead peaking in the early Holocene. This is an important and robust finding. However, the manuscript currently blends several possible explanations such as overly strong warm-season temperature control linked to insolation, missing processes (permafrost, soils, disturbances), climate biases (e.g. absence of a mid-Holocene thermal maximum), and possible reconstruction artefacts.
I suggest sharpening this discussion by more explicitly separating (a) deficiencies in simulated climate trajectories (e.g. lack of MH warmth or hydroclimate persistence) from (b) deficiencies in vegetation sensitivity to that climate.
I also suggest clarifying whether the emulator experiments indicate that, even with a corrected climate, JSBACH would still fail to produce a mid-Holocene maximum in boreal regions (which would strongly support a structural vegetation limitation). Finally, this result should be explicitly positioned in relation to other transient Holocene simulations (e.g. PMIP-style experiments), even if only qualitatively.
3. Interpretation of CO₂ dominance and linearity
The manuscript convincingly shows that strong linear alignment of tree cover with CO₂ is associated with high temporal correlation but inflated variance and MAE, leading to the conclusion that CO₂ sensitivity is likely too strong in this model version. This is an important point, but it requires careful wording to avoid over-interpretation. The strong correlation between REVEALS tree cover and CO₂ is plausibly a proxy effect reflecting hemispheric deglacial trends rather than a physiological signal. This distinction should be emphasised more clearly in the Results and Discussion. It would be helpful to explicitly state that the emulator diagnoses model-internal sensitivities, not real-world sensitivities, and that the mismatch with REVEALS may arise from both sides. The conclusion that "CO₂ sensitivity is too strong" should be framed as relative to reconstructed variability patterns, not as an absolute statement about palaeo-CO₂ fertilisation.
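The proxy-effect concern can be illustrated with a small sketch. The two series below are hypothetical (they are not data from the manuscript or from REVEALS) and simply share a deglacial rise. Comparing the correlation of the raw series with that of their first differences is one simple, assumed way to check how much of the correlation is carried by the common trend; it is not the method used by the authors.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def first_differences(series):
    """Step-to-step changes, which remove most of a slow common trend."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical binned series sharing a deglacial increase.
co2_ppm = [190, 195, 205, 220, 240, 255, 262, 265, 268, 272, 276, 280]
tree_cover_pct = [10, 14, 15, 22, 30, 33, 40, 38, 41, 39, 43, 42]

r_raw = pearson(co2_ppm, tree_cover_pct)
r_diff = pearson(first_differences(co2_ppm), first_differences(tree_cover_pct))

print(f"correlation of raw series:        {r_raw:.2f}")
print(f"correlation of first differences: {r_diff:.2f}")
```

For these illustrative series the raw correlation is noticeably higher than the correlation of the differenced series, which is the pattern one would expect if the shared trend, rather than a step-by-step response, dominates the agreement.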
4. Regional interpretation of disagreement: local dynamics versus model failure
The analysis of high- vs low-agreement grid cells is one of the strongest parts of the manuscript. The conclusion that poor agreement often coincides with non-linear, summer-temperature-dominated responses is compelling. However, I encourage the authors to make clearer that poor model–data agreement does not automatically imply "model failure", but may indicate regions where local ecological processes, migration lags, disturbance regimes, or microclimates dominate. This distinction is particularly important for Siberia, forest–tundra ecotones, and forest–steppe transition zones.
Minor comments and technical suggestions
P5, Line 149: "We consider only grid cells that indicate a significant (p<0.1) correlation between the simulated and reconstructed tree cover." The choice of p < 0.1 for correlation significance should be briefly justified, especially given the large number of grid cells.
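To make the multiple-testing concern concrete, the sketch below applies the Benjamini-Hochberg false-discovery-rate procedure to a handful of hypothetical per-grid-cell p-values. This is one possible way to account for the number of tests, offered as an assumption for illustration; it is not something the manuscript is known to use, and the p-values are invented.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return a list of booleans marking tests kept under FDR control.

    Keeps the k smallest p-values, where k is the largest rank with
    p_(k) <= (k / m) * alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            keep[idx] = True
    return keep


# Hypothetical p-values for five grid cells.
p_values = [0.001, 0.04, 0.03, 0.09, 0.5]

naive_keep = [p < 0.1 for p in p_values]
fdr_keep = benjamini_hochberg(p_values, alpha=0.1)

print("kept with p < 0.1:      ", sum(naive_keep))
print("kept with BH (FDR 0.1): ", sum(fdr_keep))
```

With these invented values the uncorrected threshold retains more grid cells than the FDR-controlled selection, which is the kind of difference a brief justification in the text could address.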
P4, Line 116: the text states that both model output and reconstructions are binned into 500-year intervals, but this temporal aggregation is not discussed further. Many key diagnostics (variance differences, non-linearity from GAMs, emulator performance, and correlation strength) are sensitive to temporal smoothing. The authors should therefore briefly discuss how the 500-year binning may influence the detected strength of non-linear responses and the interpretation of model–data agreement, particularly in regions with rapid postglacial changes.
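The smoothing effect of binning can be shown with a short sketch. The series below is hypothetical, with an assumed 100-year resolution and two short-lived excursions. Averaging it into 500-year bins (five samples per bin) removes those excursions and reduces the variance; the numbers are purely illustrative and do not come from the manuscript.

```python
from statistics import pvariance


def bin_means(series, samples_per_bin):
    """Average consecutive, non-overlapping groups of samples."""
    return [
        sum(series[i:i + samples_per_bin]) / samples_per_bin
        for i in range(0, len(series), samples_per_bin)
    ]


# Hypothetical tree cover (%) at 100-year steps, with brief excursions
# (the value 30 in the first and in the last group of five).
tree_cover_100yr = [
    10, 10, 12, 30, 12,
    10, 11, 35, 50, 48,
    52, 50, 49, 51, 50,
    30, 52, 50, 49, 51,
]

tree_cover_500yr = bin_means(tree_cover_100yr, samples_per_bin=5)

var_raw = pvariance(tree_cover_100yr)
var_binned = pvariance(tree_cover_500yr)

print("500-year bin means:", tree_cover_500yr)
print(f"variance at 100-year steps: {var_raw:.1f}")
print(f"variance of 500-year bins:  {var_binned:.1f}")
```

Because the short excursions disappear in the bin means, diagnostics based on variance or on the shape of the response may look different at 500-year resolution than at the native resolution of either data set.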
P12, Lines 372-373, and P18, Line 553: the terms "energy-limited" and "water-limited" are used without definition. It would be helpful to define them briefly (e.g. as the dominant driver in the emulator/GAM sense) to avoid confusion.
Fig. 5 and Fig. 6 are central but hard to understand; slightly stronger guidance in the captions on how to read them would help non-specialist readers.
Figure 11 is central for interpreting the effect of bias correction, but it is not immediately intuitive that the colours represent percentile-based changes relative to the original simulation rather than absolute agreement. I recommend clarifying this more explicitly in the caption (e.g. blue indicates relative improvement compared to MPI-ESM, not necessarily good agreement) and possibly adding a short explanatory sentence in the main text to guide readers.
Regional time-series examples (Fig. 12) are very effective and could be referenced more explicitly earlier in the text.