Climate of the Past, Volume 20, Issue 4
https://doi.org/10.5194/cp-20-865-2024
© Author(s) 2024. This work is distributed under the Creative Commons Attribution 4.0 License.
Towards spatio-temporal comparison of simulated and reconstructed sea surface temperatures for the last deglaciation
Download
- Final revised paper (published on 08 Apr 2024) and supplement
- Preprint (discussion started on 24 May 2023) and supplement
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2023-986', Anonymous Referee #1, 15 Jun 2023
  - AC1: 'Reply on RC1', Nils Weitzel, 15 Dec 2023
- RC2: 'Comment on egusphere-2023-986', Anonymous Referee #2, 18 Oct 2023
  - AC2: 'Reply on RC2', Nils Weitzel, 15 Dec 2023
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Publish subject to minor revisions (review by editor) (03 Jan 2024) by Marisa Montoya
AR by Nils Weitzel on behalf of the Authors (12 Jan 2024): author's response, tracked changes, and manuscript
ED: Publish subject to minor revisions (review by editor) (04 Feb 2024) by Marisa Montoya
AR by Nils Weitzel on behalf of the Authors (13 Feb 2024): author's response, tracked changes, and manuscript
ED: Publish as is (22 Feb 2024) by Marisa Montoya
AR by Nils Weitzel on behalf of the Authors (25 Feb 2024): manuscript
The manuscript “Towards spatio-temporal comparison of transient simulations and temperature reconstructions for the last deglaciation” introduces a new algorithm that uses proxy system models (PSMs) to allow for a comprehensive data-model comparison of transient deglacial model simulations. The manuscript is predominantly a methods paper in which the authors introduce their algorithm, test the sensitivity of its parameters, and apply it in a perfect-model pseudo-proxy experiment context. The review and vetting of the algorithm comprise the bulk of the text, while the application of the algorithm to benchmark paleo simulations against data is comparatively brief. While I am not an expert in the statistical models used within the methodology, the body of work appears sound, without any immediate issues. I feel the wording justifying the need for the extra complexity of using a PSM could be improved. One of the core justifications for using the PSM is that it avoids the need to rely on “sparse and uncertain proxy data”, yet the PSM largely degrades or transforms the model output to be more compatible with the proxy data and then uses the same sparse and uncertain proxy data to benchmark the PSM-created forward-modeled proxy time series. The authors also do not address why more traditional signal-processing methods, such as principal component analysis, could not instead be used to extract the signal from the messy proxy data without the added complexity (and caveats) of using a PSM. I don’t think these criticisms undercut the work in any way, only that the justification for the need for the algorithm and PSM could be framed better.
While less text is dedicated to evaluating the model simulations against data, I disagree with one of the key findings: “Comparing the MPI-ESM and CCSM3 simulations that employ orbital, GHG, and ice sheet forcing, we find no systematic differences between the two climate models. In particular, TraCE-ALL is mostly within the IQD spread of the six MPI-ESM simulations.” This is an important statement, so the language should be more precise. What specifically are the authors referring to? There is only one MPI-ESM simulation that employs orbital, GHG, and ice sheet forcing (MPI_Ice6G_P2_glob), and there is no comparable TraCE simulation, since TraCE-GHG and TraCE-ORB fix the ice sheet forcing to the LGM (see Table 1). When I look at Figure 7, I don’t see TraCE-ALL effectively being the same as MPI with freshwater flux, especially in the North Atlantic and North Pacific. What does it mean that MPI_Ice6G_P2_noMW outperforms TraCE-ALL in the North Atlantic for the orbital pattern, when TraCE-ALL is specifically designed to reproduce the reconstructed AMOC variability? AMOC variability is sub-orbital scale, of course, but what does it mean that a model without hosing captures that scale of variability better than TraCE-ALL (or, conversely, that the addition of hosing degrades the orbital-scale performance)? Likewise, what are the implications of TraCE-ALL and TraCE-GHG having nearly identical millennial pattern deviations in the North Atlantic, when TraCE-GHG doesn’t include freshwater hosing? The TraCE-ALL and hosed MPI IQDs in the Figure 9 legend are largely not similar, let alone TraCE-ALL being bracketed by the MPI simulations. It is often difficult and nuanced to say when one model is performing better than another, but I don’t think the analysis and figures here support the claim that TraCE-ALL and hosed MPI are effectively the same when compared to data.
The text notes, “More generally, all simulations with meltwater input show a better agreement with reconstructions for millennial magnitudes than those without meltwater input.” I don’t think this is strictly true. There are cases where MPI_Ice6G_P2_noMW performs similarly to, if not better than, the routed MPI-ESM simulations. In either case, this only means the millennial-scale variability is more like the data when hosing is added, not that the pattern is realistic (as noted around line 550). This is more apparent with the TraCE simulations, where in some locations the addition of hosing degrades model performance. Getting the magnitude of variability correct but the patterns (i.e., trends) of the deglaciation wrong isn’t particularly satisfying, which could be emphasized here.
I feel the manuscript would be improved by relatively minor revisions for clarity.
----------------
Minor comments and notes:
Line ~125: Define or give examples of “sensor” for the novice reader.
Line 137: Osman et al. (2021) uses four proxy types, so I am not sure why it is cited here.
Line 177: “Computing averages in this last step instead of averaging temperature time series in the beginning avoids interpolating proxy records with irregular time axes to a common resolution.” Explain this. It seems like in some portions of the analysis the data are binned to a fixed 100-yr time step (i.e., Section 3.2). The time series displayed in Figure 9 are regional averages; are these first binned to a 100-yr interval, or are they somehow calculated on irregular time spacing across multiple records and ensemble members?
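To make the binning question concrete, here is a minimal sketch of averaging an irregularly sampled record onto fixed 100-yr bins without interpolation; `bin_to_100yr` is a hypothetical helper for illustration, not code from the manuscript.

```python
import numpy as np

def bin_to_100yr(ages, values, t0, t1):
    """Average an irregularly sampled record onto fixed 100-yr bins.
    Bins containing no samples are left as NaN (no interpolation)."""
    edges = np.arange(t0, t1 + 100.0, 100.0)
    idx = np.digitize(ages, edges) - 1          # bin index for each sample
    out = np.full(edges.size - 1, np.nan)
    for i in range(edges.size - 1):
        sel = idx == i
        if sel.any():
            out[i] = values[sel].mean()         # average samples in the bin
    return out

# Irregular ages (yr BP) with two samples in one bin and empty bins elsewhere
ages = np.array([9050.0, 9260.0, 9280.0, 9510.0])
vals = np.array([12.0, 11.5, 11.9, 11.0])
print(bin_to_100yr(ages, vals, 9000.0, 9600.0))
# -> [12.0, nan, 11.7, nan, nan, 11.0]
```

Whether the authors bin each record first and then average across records, or average on the native age models, changes how data gaps propagate into the regional stacks.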
Lines ~220: Magnitude is defined as the standard deviation of each ensemble member (for the decomposed time series). Since the standard deviation is an absolute measure, doesn’t this fail to discriminate between trends in opposite directions (i.e., the data are cooling while the model is warming)?
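To illustrate the sign-blindness concern with a minimal numpy sketch (a hypothetical example, not the authors’ code): a linear warming trend and an equal-but-opposite cooling trend have identical standard deviations, and hence identical “magnitudes”.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 101)   # dimensionless time axis
warming = 2.0 * t                # +2 K linear warming
cooling = -2.0 * t               # -2 K linear cooling

# Standard deviation is sign-blind: both trends yield the same "magnitude"
print(np.std(warming) == np.std(cooling))  # True
```

Only the “pattern” component of the metric, as defined in the manuscript, would distinguish these two cases.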
Throughout the text, “magnitude” is used to denote the degree of variability in the decomposed time series, which is just the strength of the variability. It may not be obvious to the reader what the utility of this metric is. We tend to think in terms of time series, so “pattern” (as defined here) is far more intuitive.
Line 235: Since N is either 100 or 1000, I assume an empirical probability distribution is used rather than a fitted distribution.
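For concreteness, a minimal sketch of what an empirical (non-fitted) distribution built from a finite ensemble might look like; `ecdf` is a hypothetical helper for illustration, not from the manuscript.

```python
import numpy as np

def ecdf(samples):
    """Empirical CDF from a finite ensemble: no parametric fit,
    just the fraction of members at or below a query value."""
    x = np.sort(np.asarray(samples))
    def F(q):
        return np.searchsorted(x, q, side="right") / x.size
    return F

rng = np.random.default_rng(0)
ensemble = rng.normal(0.0, 1.0, size=100)  # e.g. N = 100 ensemble members
F = ecdf(ensemble)
print(F(0.0))  # fraction of members <= 0 (roughly 0.5 here)
```

With N = 100 or 1000, the step size of such an ECDF (1/N) seems fine for the IQD, but it would help if the text stated explicitly that no distribution is fitted.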
Line 244: How does the IQD integrate differences in time series? Each forward-modeled proxy time series is on the same irregular age-model spacing as the proxy data, but how is a time series translated into the distributions used in the IQD equation?
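For reference, the IQD between two distributions F and G is commonly defined as the integral of (F(x) − G(x))² over x. A minimal numpy sketch on empirical CDFs follows; this is an illustrative assumption about the metric, not the authors’ implementation.

```python
import numpy as np

def iqd(a, b, n_grid=1001):
    """Integrated quadratic distance between the empirical CDFs of two
    samples: IQD(F, G) = integral of (F(x) - G(x))^2 dx, approximated
    by the trapezoid rule on a common grid."""
    a, b = np.sort(np.asarray(a)), np.sort(np.asarray(b))
    grid = np.linspace(min(a[0], b[0]), max(a[-1], b[-1]), n_grid)
    Fa = np.searchsorted(a, grid, side="right") / a.size
    Fb = np.searchsorted(b, grid, side="right") / b.size
    d2 = (Fa - Fb) ** 2
    return np.sum(0.5 * (d2[:-1] + d2[1:]) * np.diff(grid))

x = np.array([1.0, 2.0, 3.0])
print(iqd(x, x))        # identical samples -> 0.0
print(iqd(x, x + 1.0))  # shifted samples -> positive distance
```

The open question remains how the irregular time axis enters: whether the samples fed into the distributions are ensemble members at a fixed time, pooled values over a window, or something else.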
Lines ~254: The zonal IQD seems to be used only in part of Figure 2, which is a flowchart of the analysis. If it is not used in the results section, could it be removed?
Section 4.2, Comparison of simulations against SST reconstructions: I think it would be really useful to the reader to plot the orbital + millennial time series for the regions summarized in Figure 7 (i.e., this new plot should come before Figure 7). I envision something like Figure 9, which would give the reader a feel for what the models are simulating (relative to the data) prior to the decomposition. Those raw trends are somewhat abstracted away by plotting the orbital and millennial time series separately. For example, in Figure 9 MPI_Ice6G_P3 has a cooling trend around 14–13 ka at both the orbital and millennial scales. If it is caused by the injection of freshwater forcing, I would expect it to appear only at the millennial scale (which also perhaps implies that freshwater forcing is leaking into the orbital-scale decomposition).
Figure 7: It would be very verbose, but would it be worth plotting a magnitude-versus-pattern IQD scatter plot? There are too many combinations for the main text, so perhaps a single example (e.g., millennial magnitude versus pattern for the North Atlantic)? The best models should converge in the lower left of the plot, near the origin (0, 0).
Line 597: “To avoid the need to reconstruct gridded or regional mean temperatures from sparse and uncertain proxy data, the algorithm applies proxy system models to simulation output and quantifies the deviation between the resulting forward-modeled proxy time series and temperature reconstructions”. Doesn’t Figure 9 create regional stacks? I understand mean IQD is used to summarize regions (as explained in section 3.1.4), but how are time series of regional averages constructed?
Figure S2: Many of the plot titles in Figure S2 are identical. I assume this depicts multiple records from the same core site; perhaps this could be denoted better in the plot titles or the figure caption. Also, how are the regional stack time series in Figure 9 made when not all records in Figure S2 span 19–9 ka?