Orbital CO2 reconstruction using boron isotopes during the late Pleistocene, an assessment of accuracy
Abstract. Boron isotopes in planktonic foraminifera are a widely used proxy to determine ancient surface seawater pH, and by extension atmospheric CO2 concentration and climate forcing on geological time scales. Yet, to reconstruct absolute values for pH and CO2, we require a δ11Bforam-borate to pH calibration and independent determinations of ocean temperature, salinity, a second carbonate parameter, and the boron isotope composition of seawater. Although δ11B-derived records of atmospheric CO2 have been shown to perform well against ice core-based CO2 reconstructions, these tests have been performed at only a few locations and with limited temporal resolution. Here we present two highly resolved CO2 records for the late Pleistocene from ODP Sites 999 and 871. Our δ11B-derived CO2 record shows a very good agreement with the ice core CO2 record with an average offset of 4.6 ± 49 (2σ) ppm, and a RMSE of 25 ppm, with minor short-lived overestimations of CO2 (of up to ~50 ppm) occurring during some glacial onsets. We explore potential drivers of this disagreement and conclude that partial dissolution of foraminifera has a minimal effect on the CO2 offset. We also observe that the general agreement between δ11B-derived and ice core CO2 is improved by optimising the δ11Bforam-borate calibration. Despite these minor issues, a strong linear relationship between relative change in climate forcing from CO2 (from ice core data) and pH change (from δ11B) exists over the late Pleistocene, confirming that pH change is a robust proxy of climate forcing over relatively short (<1 million year) intervals. Overall, these findings demonstrate that the boron isotope proxy is a reliable indicator of CO2 beyond the reach of the ice cores and can help improve determinations of climate sensitivity for ancient time intervals.
Elwyn de la Vega et al.
Status: final response (author comments only)
- RC1: 'Comment on cp-2022-93', Anonymous Referee #1, 13 Feb 2023
- RC2: 'Comment on cp-2022-93', William Gray, 22 Mar 2023
de la Vega and co-authors present two beautiful new records on paleo-CO2 reconstructions that support the standing of the boron isotope proxy as a reliable indicator for paleo-CO2. The authors assess several aspects of the proxy, including dissolution and proxy calibration. The findings from these assessments are not entirely new, but that does not diminish the value of the manuscript. I have several recommendations for the authors to further improve the manuscript:
The Δlog10CO2/ΔpH approach to estimate paleo-CO2 is an interesting one but the authors already indicate that it is only useful for short time scales, similar to the original study of Hain et al. 2018. This is somewhat disappointing for deep-time studies, where we really do not have a good sense of a second parameter of the carbon system. It would be helpful if the authors could discuss how this approach adds to data that we can already assess from ice cores over the past 800 kyr.
Site selection: The authors keep using the same sediment records that they have been using for many years, but Figure 1 demonstrates that neither site is ideal. This raises questions about the utility of using either site to "calibrate" the proxy, and the reconstructed CO2 for site 871 shown in Fig. 3 deviates more strongly from the glacial and interglacial CO2 extremes in the ice cores than that for site 999. An ideal site would be located in the vast ocean areas that are shaded green in Fig. 1. The final reconstructions are still impressive, but they require corrections for a CO2 disequilibrium that we cannot be certain remained constant through time. This caveat should be considered throughout the manuscript, as a source of uncertainty that affects all aspects of the study, including the calibration.
Stable isotope record: The authors used only 10 planktic foraminifer shells for each sample in this record, which is a small number given the geochemical variability from shell to shell and bioturbation. Even laser ablation studies in laboratory culture, where specimens experience well controlled, constant environmental conditions, use at least 12-25 shells to overcome interspecimen variability (see e.g., Holland et al. 2020). It would have been better if the authors had picked larger samples for boron isotope analyses, crushed and homogenized them, and then taken a small split for stable isotope analyses. While it would be asking too much to replicate the record with a larger shell number per sample at this time, the authors should mention that this sample size is not ideal, so that other researchers do not use it as a guideline. This also reflects on the genotype comparison (Fig. S5), which might have shown more significant results with a larger, more suitable sample size.
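To illustrate why shell number matters, the following minimal sketch computes the standard error of a sample mean, which shrinks as one over the square root of the number of shells. It assumes the shells are independent draws from one population (bioturbation would weaken that assumption), and the inter-shell standard deviation `SHELL_SD` is a purely hypothetical value, not one taken from the manuscript.

```python
import math


def standard_error(shell_sd: float, n_shells: int) -> float:
    """Standard error of the mean of n independent shells."""
    if n_shells < 1:
        raise ValueError("n_shells must be at least 1")
    return shell_sd / math.sqrt(n_shells)


# Hypothetical inter-shell standard deviation (per mil); illustrative only.
SHELL_SD = 0.3

# Compare the sample size used in the record (10) with the 12-25 range.
for n in (10, 12, 25):
    se = standard_error(SHELL_SD, n)
    print(f"n = {n:2d}: standard error = {se:.3f} per mil")
```

Under these assumptions, moving from 10 to 25 shells reduces the standard error by a factor of sqrt(25/10), roughly 1.6, whatever the true inter-shell scatter is.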
Age models: The authors generate new data and display them in Fig. 2E,F but do not really discuss them. The figure caption describes a species correction but it is not clear how that correction has been determined and if it has already been applied to the displayed data or was applied afterwards. LR04 provides no guideline on this, as far as I can tell. Figure 2 shows site 999 data are too low compared to LR04 but site 871 data fall on LR04. This means that at least one of these records deviates from LR04, and the cause for this deviation (and the choice of species offset) should be discussed.
Dissolution experiment: This is the weakest part of the entire manuscript. The authors do not describe which sediment samples they used for the experiment. What makes those samples ideal for such an experiment? The lack of shell weight data is detrimental and essentially prevents any confidence in the data that have been collected. How large was the volume of acidified fluid? Is it possible that it became saturated right away, and was there any dissolution at all? Here again the authors should say more clearly that their experiment falls short on several fronts. They do, but still interpret the data, which does not seem justified. Dissolution in acid and deionized water is likely very different from dissolution in corrosive seawater, so there really is little value in the experiment and associated data. The discussion draws mostly from earlier, much better dissolution studies, which serve the authors' purpose well, so the dissolution experiment could just as well be removed from this study without any impact on the discussion or results. The concern about including such an experiment is that it may lead others to follow the example, which would be unfortunate.
Temperature estimates: The authors use both pH-corrected and uncorrected calibrations to translate Mg/Ca to SST. In the end they use the uncorrected estimates for calculating CO2 but it is not discussed why this should be the better choice. This is particularly striking after multiple studies have highlighted the pH dependence of Mg/Ca in G. ruber. The authors should discuss why they think this is the better approach following line 461. This choice affects the downcore calibration for d11B and deserves more attention.
Figure S9 also discusses "anomalous" temperature estimates but no discussion is provided as to what constitutes such an anomalous deviation. What is the point of reference? How does the SST record compare to sites of similar latitude but outside of potential upwelling areas? (e.g., gyre sites). How do we know that SST was not cooler than expected?
CO2 forcing: Figure 4 shows a nice correlation between ΔFCO2 and pH but the data deviate at least 5-6 times from the regression lines and their uncertainty, if that is displayed by the grey shading, does not capture the true data variability. Based on the scatter around the lines, how many d11B data do the authors suggest are needed to provide a single reasonable estimate of ΔFCO2 for a given point in time? Fig. S7 suggests an uncertainty of +/- 0.3-0.8 W/m2, which is clearly an underestimate given the data scatter and requires an assessment of the number of data needed to provide such a minimal uncertainty.
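The question of how many data are needed can be framed with a simple sketch. It assumes the widely used simplified expression for CO2 forcing, 5.35 ln(C/C0) W/m2 (Myhre et al. 1998), which may not be the exact form used in the manuscript, and it assumes that residuals around the regression line are independent. The reference level of 278 ppm and the residual scatter of 1.0 W/m2 are hypothetical values chosen for illustration.

```python
import math

# Coefficient of the simplified CO2 forcing expression (W/m2), Myhre et al. 1998.
ALPHA = 5.35


def co2_forcing(co2_ppm: float, co2_ref_ppm: float = 278.0) -> float:
    """Forcing change (W/m2) relative to an assumed reference CO2 level."""
    return ALPHA * math.log(co2_ppm / co2_ref_ppm)


def samples_needed(residual_sd: float, target_uncertainty: float) -> int:
    """Independent estimates needed for their mean to reach the target 1-sigma."""
    return math.ceil((residual_sd / target_uncertainty) ** 2)


# Example: a glacial value of 180 ppm relative to the assumed 278 ppm reference.
print(f"forcing at 180 ppm: {co2_forcing(180.0):.2f} W/m2")

# Hypothetical residual scatter of 1.0 W/m2 around the regression line,
# evaluated against the two ends of the quoted 0.3-0.8 W/m2 range.
for target in (0.8, 0.3):
    print(f"target {target} W/m2 -> {samples_needed(1.0, target)} data points")
```

The required number grows with the square of the ratio between scatter and target uncertainty, so the lower end of the quoted range is far more demanding than the upper end.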
Downcore calibration: This is an interesting approach that could be applied to species that have not yet been calibrated in culture, and it is an approach that could rival coretop calibrations because the modern pH range in surface seawaters is generally too small to allow for a high-quality calibration. However, here again I wonder how many cores and data should be included in a calibration exercise, and whether the downcore calibration is really stronger than the existing culture calibration for G. ruber. To test this, I would recommend that the authors generate a new calibration for each core site and then apply that calibration to the respective other core site. How different are the calibrations from each other and from the culture calibration, and do both calibrations improve the match of the CO2 estimates to the ice cores?
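The recommended cross-site test could be sketched as follows. This is only an outline under stated assumptions: the calibration is taken to be a straight line between foraminiferal and borate boron isotope values, the data are synthetic, and the slope (0.6), intercept (9.5) and noise level (0.1 per mil) used to generate them are hypothetical rather than values from the manuscript. All function and variable names are illustrative.

```python
import numpy as np


def fit_calibration(borate, foram):
    """Least-squares line: d11B_foram = slope * d11B_borate + intercept."""
    slope, intercept = np.polyfit(borate, foram, 1)
    return float(slope), float(intercept)


def borate_from_foram(foram, slope, intercept):
    """Invert the calibration to recover d11B_borate from d11B_foram."""
    return (np.asarray(foram) - intercept) / slope


def rmse(predicted, observed):
    """Root-mean-square error between two equally sized arrays."""
    diff = np.asarray(predicted) - np.asarray(observed)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic stand-ins for the two core sites (hypothetical values).
rng = np.random.default_rng(0)
borate_a = np.linspace(15.5, 17.5, 30)
borate_b = np.linspace(15.8, 17.2, 30)
foram_a = 0.6 * borate_a + 9.5 + rng.normal(0.0, 0.1, borate_a.size)
foram_b = 0.6 * borate_b + 9.5 + rng.normal(0.0, 0.1, borate_b.size)

# Calibrate each site separately.
cal_a = fit_calibration(borate_a, foram_a)
cal_b = fit_calibration(borate_b, foram_b)

# Apply each calibration to the other site and score the mismatch.
rmse_a_on_b = rmse(borate_from_foram(foram_b, *cal_a), borate_b)
rmse_b_on_a = rmse(borate_from_foram(foram_a, *cal_b), borate_a)

print(f"site A calibration: slope={cal_a[0]:.2f}, intercept={cal_a[1]:.2f}")
print(f"site B calibration: slope={cal_b[0]:.2f}, intercept={cal_b[1]:.2f}")
print(f"RMSE of A applied to B: {rmse_a_on_b:.3f} per mil")
print(f"RMSE of B applied to A: {rmse_b_on_a:.3f} per mil")
```

With real data, the borate values would be derived from the ice core CO2 record and the other carbonate system inputs, and the same swap could be scored on reconstructed CO2 rather than on borate values.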
Finally, the authors should check spelling and grammar throughout, including names of authors whose work they cite. There are several typos throughout the manuscript, and in some cases incomplete sentences. Please check spelling in lines 42, 45, 74/75, 149, 150, 186, 215, 413, 433, 526, 570, 691, 693. The sentences in lines 607-610 should be rephrased entirely.
In summary, this study adds valuable confirmation to an already strong proxy. There is still room for improvement in this manuscript, mostly by clarifying certain choices, but also by assessing the paleo-calibration from different angles.