The authors appear to have adequately addressed comments from previous reviewers, and their conclusions are supported by the analyses performed. The result is primarily a null one -- that different approaches are only weakly distinguishable for bias correction, and that those conclusions may be model- and location-specific -- and these caveats are adequately presented. I suggest that the paper be published after the authors address the following suggestions to improve readability and add context.
First, a clear discussion of the kinds of biases the authors are targeting should appear at the beginning of the paper. The examples of distributions of extreme weather events or climate variability in the introduction seem to be a bit of a red herring, as those “biases” have more to do with higher moments of the probability distribution than with the time-mean biases that are ultimately the focus of the paper.
Second, the absence of a discussion of paleoclimate state estimation and data assimilation is conspicuous given that the goals of those procedures are also to reduce misfits between models and paleo data. A starting point is the literature on offline Kalman filtering (e.g., Tardif et al. 2019, Clim. Past, Last Millennium Reanalysis with an expanded proxy database and seasonal proxy modeling), particle filtering (e.g., Goosse 2016, Clim. Dyn., Reconstructed and simulated temperature asymmetry between continents in both hemispheres over the last centuries), and “online” state estimation that changes model forcing to generate new runs (e.g., Kurahashi-Nakamura et al. 2017, Paleoceanography, Dynamical reconstruction of the global ocean state during the Last Glacial Maximum; Amrhein et al. 2018, J. Clim., A Global Glacial Ocean State Estimate Constrained by Upper-Ocean Temperature Proxies). A comparison and discussion of complementarity would strengthen the paper and make it more relevant to CoP readers.
Finally, the sections describing the methods are quite difficult to follow, partly because of notation used and partly because technical terms are not defined. I have noted some of these below. Improving readability will increase the impact of the work.
p1l8-11 “slightly better…methods” It sounds like a more apt description is that the methods are indistinguishable. Please clarify what is meant by “slightly better.”
p1l11 There should be a semicolon before “however.”
p1l11 Please clarify what is meant by “reconstructions” (data?). The term is potentially confusing because reconstructions often use model output. Please also comment on the utility of using interpolated products (e.g., the MARGO gridded product) to evaluate bias reduction, as those products have their own (likely biased) assumptions of spatiotemporal covariance built in.
p1l12 Please define what is meant by “active calibration” and “bias correction functions.”
p2l12 Please clarify what is meant by medium-scale. More accurate than “millennial-scale averages” might be “quasi-equilibrated climate states” when models are run for millennia. But I would dispute that these issues are not present in paleoclimate studies (e.g., the paleo drought literature).
p2l27 “Finally…” This sentence needs clarification.
p2l28 “However…” What about the common practice of comparing paleoclimate anomalies in models and data? Isn’t that a validation of the “Delta method”? See, e.g., Brady et al. 2013, J. Clim., Sensitivity to Glacial Forcing in the CCSM4.
p3l16 Please provide more detail on the model simulations. Were they run to equilibrium? Biases can emerge when models are run for long periods of time (Amrhein et al. 2018, cited above), but long runs are also necessary to equilibrate climate states to forcings (particularly in the deep ocean; e.g., Jansen et al. 2018, J. Clim., Transient versus Equilibrium Response of the Ocean’s Overturning Circulation to Warming).
p3l19 It appears that the reference for the Last Interglacial has not been published. I’m not sure what Clim. Past’s policy is here, but it’s difficult to evaluate that output.
p6l23 Please define “distributional bias” and “quantile.” This introductory paragraph (like the rest of the section) is difficult to understand. What CDFs are being discussed? Perhaps the following paragraph (p7l3) should come first.
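To make the request above concrete, here is a minimal sketch of one common quantile-based correction, empirical quantile mapping. Whether this matches the procedure in the manuscript is an assumption on my part; the function name, argument names, and the choice of 101 quantile levels are hypothetical and only illustrate which two CDFs (model and reference) I would expect to see defined.

```python
import numpy as np

def empirical_quantile_map(model_ref, obs_ref, model_target, n_quantiles=101):
    """Map model values onto the reference distribution, quantile by quantile.

    Hypothetical illustration: model_ref and obs_ref are samples from the
    period used to fit the correction; model_target holds the model values
    to be corrected.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    # Empirical quantiles (inverse CDFs) of the model and of the reference data.
    model_q = np.quantile(model_ref, probs)
    obs_q = np.quantile(obs_ref, probs)
    # Locate each target value within the model quantiles and read off the
    # reference value at the same cumulative probability. Values outside the
    # fitted range are clamped to the end points by np.interp.
    return np.interp(model_target, model_q, obs_q)
```

In this reading, the “distributional bias” would be the difference between the two sets of quantiles, `model_q - obs_q`, rather than a single time-mean offset.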
p7l21 “By the nature of regression models” -- it is unclear what is meant here; please clarify.
p8l9 measures -> measure
p8l12 I think that writing this out rather than using set notation would be more accessible to the readership of this journal.
p10l4 Please define the difference between median bias and median absolute bias.
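As an illustration of the distinction I have in mind, the sketch below computes both quantities under my assumed definitions (the median of signed model-minus-reference errors versus the median of their absolute values). The function names and the toy numbers are hypothetical and are not taken from the manuscript.

```python
from statistics import median

def median_bias(model, reference):
    """Median of the signed errors; positive and negative errors can cancel."""
    return median(m - r for m, r in zip(model, reference))

def median_absolute_bias(model, reference):
    """Median of the error magnitudes; insensitive to the sign of the error."""
    return median(abs(m - r) for m, r in zip(model, reference))

# Toy values: the signed errors are -2, -1, 0, 1, 2.
model = [1.0, 3.0, 5.0, 7.0, 9.0]
reference = [3.0, 4.0, 5.0, 6.0, 7.0]

signed = median_bias(model, reference)
magnitude = median_absolute_bias(model, reference)
```

With these toy values the signed errors are symmetric about zero, so the median bias vanishes while the median absolute bias does not. If the manuscript uses different definitions, stating them explicitly would resolve the ambiguity.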
Figure 1 Why are error bars only on a subset of the data?