Review of 'Evolution of mean ocean temperature in Marine Isotope Stages 5-4' by Shackleton et al. 2021

Shackleton et al. provide a reconstruction of mean ocean temperature (MOT) covering Marine Isotope Stage 4 (MIS 4, 74 to 59.5 ka BP) based on 56 new noble gas measurements performed on a shallow ice core from the Taylor Glacier blue ice area in Antarctica. Based on their new MOT reconstruction and previously published data covering the last and penultimate deglaciations, the authors argue that most of the ocean cooling between the Last Interglacial (MIS 5e) and the Last Glacial Maximum (MIS 2) had already occurred across MIS 5, with little to no net change during MIS 3. The temporal resolution during MIS 4 is just high enough to allow the authors to speculate on millennial-scale MOT turning points and trends linked to the Atlantic Meridional Overturning Circulation (AMOC).


MOT in
The authors compare their data to previously published records throughout the manuscript but do not show these records in their figures. While the data by Shackleton et al. 2020 appear in Fig. 3 in the context of a model-data comparison, the most recent data covering MIS 2 and the last deglaciation (e.g. Bereiter et al. 2018a) do not appear in any figure. I highly recommend adding previously published MOT records to Fig. 1. In this context, I feel the introduction would benefit from a brief summary of what we have learned from noble gas thermometry so far, with reference to Fig. 1. See comments below. Another argument for adding MOT to Fig. 1 is that the manuscript further compares benthic d18O to both old and new MOT reconstructions, yet there is no figure showing both data sets.

Point on full MIS 5-4 record
I am sure it is not intentional, but, starting with the title, the authors imply multiple times that they present an extensive dataset covering the entire MIS 5-4 interval, lasting for about 60 ka. In fact, the authors provide data only for MIS 4 (including a few thousand years during MIS 5a), covering 15 ka and leaving a gap of about 50 ka with no MOT data during MIS 5e to 5a. The authors do mention that data over MIS 5 are sparse, but I think it is important to note that we don't know the evolution (as in trajectory) of the MOT during MIS 5e to 5a; all we can tell is the net change between two endpoints.
My suggestions for the title are: "Evolution of mean ocean temperature in Marine Isotope Stage 4" or just "Mean ocean temperature during/across Marine Isotope Stage 4".

Millennial-scale variability
Given the trends in the EDC dD record (Fig. 2b) I agree that the cooling may have happened in two stages at two different rates at the MIS 5a-4 boundary. However, looking at the data I am less convinced. How robust is this finding? I agree with the authors that the AMOC (via the bipolar seesaw mechanism) is linked to MOT, which is a logical consequence of the strong correlation with the EDC dD record. However, I think the authors are jumping to conclusions. A direct link with the AMOC needs to be established with more than just one DO event (DO 19 at ~72 ka BP, Fig. 2) and also needs to involve direct AMOC reconstructions, which are available for this time period: Böhm et al. (2015), Strong and deep Atlantic meridional overturning circulation during the last glacial cycle, Nature, 517, 73-76, doi:10.1038/nature14059. However, given that MIS 4 is muted in its DO activity and the MOT data only allow a direct comparison with DO 19, I think adding this record would probably not add much at this stage. Delving any deeper into this topic would distract from the authors' main conclusions.
I think it is OK to speculate that MOT and AMOC are linked even outside of glacial terminations; however, please clearly label the speculations as such.

Error analyses (section 2.4)
I am confused about the way the uncertainty (Fig. 2a) has been calculated. The authors mention that they have used a bootstrapping method (a resampling method, involving the exclusion of a certain number of data points for each run), but go on to say that this involved wiggling the data within their uncertainty boundaries (a Monte Carlo technique). Please clarify whether you indeed used a bootstrapping method or whether this only involves the exclusion of the low data point at 62 ka BP described in the caption of Fig. 1. If you did use actual bootstrapping, please provide a little more detail on the method used.
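To make the distinction I am asking about concrete, here is a minimal sketch of the two approaches. It is not the authors' procedure: the MOT values, the uniform 0.18 °C uncertainty, the number of runs, and the function names are all hypothetical and chosen for illustration only.

```python
import random
import statistics

def bootstrap_means(values, n_runs, rng):
    """Bootstrap: resample the data points themselves, with replacement."""
    n = len(values)
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_runs)]

def monte_carlo_means(values, sigmas, n_runs, rng):
    """Monte Carlo: keep every point, perturb each within its 1-sigma uncertainty."""
    return [
        statistics.fmean([rng.gauss(v, s) for v, s in zip(values, sigmas)])
        for _ in range(n_runs)
    ]

# Hypothetical MOT anomalies (degC) and 1-sigma uncertainties, illustration only.
mot = [-2.6, -2.4, -2.7, -2.5, -2.3, -2.8]
sigma = [0.18] * len(mot)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
boot = bootstrap_means(mot, 2000, rng)
mc = monte_carlo_means(mot, sigma, 2000, rng)

print("bootstrap spread of the mean:", round(statistics.stdev(boot), 3))
print("Monte Carlo spread of the mean:", round(statistics.stdev(mc), 3))
```

The bootstrap spread reflects the scatter among the data points, whereas the Monte Carlo spread reflects the assigned measurement uncertainties. The two can differ, which is why it matters which one (or which combination) underlies the envelope shown in the figure.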
Looking at Fig. 2 I can see why the vertical error bars were omitted for clarity, but I had to look at the data file to get an idea of the uncertainty associated with those measurements. I see that the propagated uncertainty of a single data point ranges from 0.17 to 0.19 °C; I think this should be mentioned in the text. Out of 11 replicate measurements, 5 do not agree within their uncertainty boundaries. Does that mean that your uncertainty estimates are too low? Maybe I am missing something here? I am confused about the spatial arrangement of replicate samples in the case of Taylor Glacier ice. How comparable are two replicates in your case? In an analogy to a classical ice core drilled on a dome or divide: Are we talking about true replicates from identical depth levels (but different cuts of the core) or quasi-replicates from vertically directly neighboring ice from the same cut?
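A simple way to report replicate agreement is sketched below. It assumes that agreement is judged against the two uncertainties combined in quadrature, which may not be the criterion the authors would choose. The function names and the example values are hypothetical.

```python
import math

def replicates_agree(a, b, sigma_a, sigma_b, k=1.0):
    """True if two replicates differ by at most k times their combined uncertainty."""
    combined = math.hypot(sigma_a, sigma_b)  # uncertainties added in quadrature
    return abs(a - b) <= k * combined

def pooled_replicate_std(pairs):
    """Single-measurement standard deviation estimated from duplicate pairs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs)))

# Hypothetical replicate pairs of MOT anomalies (degC), illustration only.
pairs = [(-2.5, -2.4), (-2.5, -2.1), (-2.7, -2.6)]
n_agree = sum(replicates_agree(a, b, 0.18, 0.18) for a, b in pairs)
print(n_agree, "of", len(pairs), "pairs agree within 1 combined sigma")
print("pooled replicate std:", round(pooled_replicate_std(pairs), 3))
```

Under Gaussian errors, roughly one pair in three would be expected to differ by more than one combined sigma even if the uncertainties are correct, so a comparison of the pooled replicate standard deviation with the quoted 0.17 to 0.19 °C would be more informative than a count of disagreeing pairs alone.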
What are the greatest contributors to the final uncertainty? Is it the fractionation correction?

Box model(s) and d13C-CO2
The carbon cycle model simulations performed within the scope of this study come as a surprise. In fact, the first time a box model is mentioned is only in section 2.3 where the box model parameterizations are described. It is slightly confusing that the authors use two different box models, one for the calculation of the MOT itself (Bereiter et al. 2018a) and a carbon cycle box model published by Bauska et al. (2016). The authors refer to both as the "box model". Do these models have names? Please make sure that the reader understands the difference. I only realized that there are two different models after having read the manuscript multiple times. The carbon cycle model is first introduced in the Discussion. I think this should be mentioned in the abstract including the most important results thereof.
In this context, I am quite surprised that the authors chose not to show the d13C-CO2 model output of the carbon cycle model, given that the model used here (Bauska et al.
Moving on to the conclusions, where, to my surprise, the carbon cycle model results are mentioned first, although this is not the main conclusion of this study. I recommend reformulating the first paragraph of the conclusions and reorganizing the conclusion section. Start by summarizing your measurement achievements and state your main conclusion, namely that the MOT in MIS 4 and MIS 2 are the same (you don't mention this at all, despite its importance in the abstract), followed by the evolution of MOT within MIS 4. Then add a summary of the consequences of your main findings and what you learned from the carbon cycle model. Avoid "clearly". Finally, move on to conclude on details and finish with your closing statement, which is great.

Data availability
I commend the authors for providing an easily accessible and clearly structured data file online via https://www.usap-dc.org/view/dataset/601415 as linked to by a DOI link in the manuscript.
Please update the data file with the correct reference to this paper once accepted. The title given in the data file reads "Mean ocean temperatures achieve full glacial levels by Marine Isotope Stage 4 in the last glacial cycle".

In-text citation style
In cases where an author's name is part of the narrative, the citation format remains the same as for indirect citations instead of changing accordingly, which disrupts the reading flow. I recommend replacing e.

Line-by-line comments:
Line 17-18: You also use data from Termination II, which is not part of the "last glacial cycle" in my understanding. Here a suggestion: … MOT reconstructions from the last and penultimate deglaciation… ?
Line 18: …we find that the majority of the interglacial-glacial ocean cooling must have occurred across MIS 5
Line 18-19: What are "full glacial levels"? Please define "full glacial levels" and consider reporting previously published MOT of MIS 2, or provide a measure of how those two MOTs (MIS 4 vs MIS 2) differ.
Line 19-20: "Comparing MOT to…". This sentence is quite vague. Please reformulate and quantify the magnitude of "CO2 drawdown" and "d18O increase". Please also mention that, based on box model experiments performed within the scope of this study, you estimate that up to XX% (instead of "most of") may be attributable to ocean cooling.
Line 49-56: "Third,…" I am sure it is not intentional, but this section sounds as if you can say much about MOT changes associated with several DO events occurring during MIS 5d to MIS 5a. However, your data cover only one DO event with high-resolution MOT data (which is absolutely great, no need to overstate the extent of your study). I also think the introduction of DO events at this point in the manuscript disrupts the flow of reading at the end of the introduction where you motivate this study. I suggest adding a short paragraph just before "Here we reconstruct MOT from…" introducing important records relevant to this study and summarizing what we have learned from MOT reconstructions so far, including the suggested link between AMOC and MOT. This new paragraph can then be followed by your existing paragraph (starting on a new line) outlining the four purposes, i.e. what your study adds to this body of literature and why this is important.
Line 55: Please rephrase. Our record provides important, yet limited, insight that helps to understand … by providing high-resolution MOT data over DO event 19, which marks the MIS 5a/4 transition. Or similar.
Line 56: mention your carbon cycle model here
Line 57: atmospheric CO2
Line 60: change "20 meters" to "20 m" for consistency
Line 64: I believe you measured 11 replicates (see data file)
Line 64: What was the average sample weight? Mean temporal resolution?
Line 77: The title of this section is confusing. See the comment above on the box model conundrum. I recommend replacing "box model parameterizations" with "MOT calculation" or similar.
Line 79-83: Please add a sentence explaining why the use of isotope ratios of inert gases is superior to the firn model-based correction for thermal fractionation (Bereiter et al. 2018a). Which gases have been measured for the purposes of this correction in this study? Please add this information to section 2.1.
Line 83: Where does the Argon suddenly come from? See comment above.
Line 78-83: How large is that correction? Please quantify in both absolute and relative terms.
Line 89-96: Maybe I am mistaken here, but I think this section belongs in the Discussion. Maybe part of it can go into the new introductory section requested above?
Line 123: Any idea why WAIS shows such large scatter? Is the offset relative to Taylor Glacier significant? This ice is very far away from the bubble/clathrate transition zone. I realize that these samples are only about 60 m above the bedrock at WAIS. Any idea how this could have an impact on the noble gas ratios?
Line 126: …WAIS Divide is 0.2°C warmer…
Line 130: …substantial MOT warming towards the end …
Line 145: replace "considerable portion" with an actual estimate, 23%?
Line 146: remove "clearly" or replace with "may" or similar
Line 147: "It is notable that MOT was already low during MIS 5a (Fig. 3)." Unclear. Where is MIS 5a in Fig. 3?
Line 154: Benthic d18O records changes in deep seawater d18O and is used as a proxy for deep-water temperature.
Line 179: Swap AA cooling and MOT decrease
Line 179-182: Long sentence that is hard to read with four "ands" in it. Please split up and reformulate.
Line 187-191: OK, but speculative. I recommend acknowledging that by starting the sentence with: We speculate…
Line 203 onwards: Please provide R² values to underline your statement of strong correlation.
Line 225: add "to", not to be
Line 228-229: add "in model simulations" to the end of the sentence. Or similar.
Line 245: The benthic δ18O record is only shown in Fig. 1, but
Line 271 onwards: "Our record provides the first observational evidence that MOT responds to AMOC changes outside of deglaciations…" I agree with what you say but would recommend toning this down. For example: This is the first indication that AMOC changes may also be associated with MOT during the last glacial and not only during deglaciations. We speculate that a similar pattern can be expected for MIS 3 where DO events are more frequent… Do you?
Line 341: replace "grasp" with "understanding"
Line 362-363: repetition of statements in the main text.
Line 364-366: Better move this sentence to the main text?
Add new and previously published MOT data in a new panel
Allow for a little more space in the vertical direction to help with the y-axes (see below)
Make sure the respective y-axes cover the full range of data shown; for example, the lowest value on the CO2 axis is 200 ppm but the data go down to 180 ppm
Add major ticks to x-axis (time axis)
Add reference for insolation data in panel g)
Add labels for panels a), b), c)
Add major ticks to both axes
Line 554: refer to Appendix B for the dD seawater correction
Figure A1 and A2:
Add labels for panels a), b), c)
Add major ticks to x-axis (time axis)
Adjust length of y-axes to cover the full range of data but keep them as short as possible.
It is hard to differentiate between the different colours of x's. Maybe different symbols for the different types of corrections would be helpful here? You could use open symbols for WAIS and full symbols for TG?