27 Sep 2023
Status: a revised version of this preprint is currently under review for the journal CP.

Can machine learning algorithms improve upon classical palaeoenvironmental reconstruction models?

Peng Sun, Philip B. Holden, and H. John B. Birks

Abstract. Classical palaeoenvironmental reconstruction models often incorporate biological ideas and commonly assume that the taxa comprising a fossil assemblage exhibit unimodal response functions of the environmental variable of interest. In contrast, machine learning approaches do not rely upon any biological assumptions, but instead need training with large data-sets to extract some understanding of the relationships between biological assemblages and their environment. We have developed a two-layered machine learning reconstruction model, MEMLM (Multi Ensemble Machine Learning Model). The first layer applies three ensemble machine learning models (random forests, extra random trees and LightGBM), each trained on the modern taxon assemblages and associated environmental data, to produce three separate reconstructions. The second layer uses multiple linear regression to integrate these three reconstructions into a consensus reconstruction. We consider three versions of the model: 1) a standard version of MEMLM, which uses only taxon abundance data; 2) MEMLMe, which uses embedded assemblage information, obtained with a natural language processing model (GloVe) that detects associations between taxa across the training data-set; and 3) MEMLMc, which incorporates both taxon abundance and embedded assemblage data. We train these MEMLM variants with three high-quality diatom and pollen training sets and compare their reconstruction performance with three weighted averaging (WA) approaches: WA-Cla (classical deshrinking), WA-Inv (inverse deshrinking) and WA-PLS (partial least squares). In general, the MEMLM approaches, even when trained on only embedded assemblage data, perform substantially better than the WA approaches under cross-validation in the larger data-sets. However, when applied to fossil data, MEMLM and WA approaches sometimes generate qualitatively different palaeoenvironmental reconstructions.
We applied a statistical significance test to all the reconstructions. This successfully identified each instance in which the reconstruction is not robust with respect to the model choice. We find that machine learning approaches can outperform classical approaches but can sometimes fail catastrophically, despite showing high performance under cross-validation, which likely indicates problems when extrapolation occurs. We find that the classical approaches are generally more robust, although they can also generate reconstructions that have only modest statistical significance and may therefore be unreliable. We conclude that cross-validation is not a sufficient measure of transfer-function performance, and we recommend that the results of statistical significance tests be provided alongside the down-core reconstructions based on fossil assemblages.
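The second layer described in the abstract, a multiple linear regression that merges three base-model reconstructions into a consensus, can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the function names, the toy numbers and the use of the normal equations are all assumptions, and the three columns merely stand in for predictions from random forests, extra random trees and LightGBM.

```python
# Sketch of a second-layer consensus: ordinary least squares on three
# base-model reconstructions. All names and numbers are hypothetical.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: base predictions may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_consensus(base_predictions, observed):
    """Return [intercept, w1, w2, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(p) for p in base_predictions]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, observed)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_consensus(coefs, base_predictions):
    """Apply the fitted intercept and weights to new base-model predictions."""
    return [coefs[0] + sum(w * p for w, p in zip(coefs[1:], row))
            for row in base_predictions]


# Hypothetical first-layer predictions (one column per base model)
# for six training samples, with the observed environmental values.
base = [
    [10.2, 9.8, 10.5],
    [12.1, 12.6, 11.9],
    [8.4, 8.9, 8.1],
    [15.0, 14.2, 15.3],
    [11.3, 11.0, 11.8],
    [9.1, 9.9, 9.4],
]
observed = [10.1, 12.3, 8.5, 14.9, 11.4, 9.5]

coefs = fit_consensus(base, observed)
consensus = predict_consensus(coefs, base)
```

In a stacked model the second layer is usually fitted on out-of-fold predictions so that the base models' training fit does not leak into the weights; whether MEMLM does this is not stated in the abstract.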

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on cp-2023-69', Cajo ter Braak, 20 Oct 2023
    • AC1: 'Reply on RC1', Phil Holden, 15 Apr 2024
  • RC2: 'Comment on cp-2023-69', Andrew Parnell, 22 Dec 2023
    • AC2: 'Reply on RC2', Phil Holden, 15 Apr 2024

Total article views: 748 (including HTML, PDF, and XML)
  • HTML: 533
  • PDF: 173
  • XML: 42
  • Total: 748
  • BibTeX: 32
  • EndNote: 32

Viewed (geographical distribution)

Total article views: 709 (including HTML, PDF, and XML) Thereof 709 with geography defined and 0 with unknown origin.
Latest update: 17 Jul 2024
Short summary
We develop the Multi Ensemble Machine Learning Model (MEMLM) for reconstructing palaeoenvironments from microfossil assemblages. The machine learning approaches, which include random tree and natural language processing techniques, substantially outperform classical approaches under cross-validation, but they can fail catastrophically when applied to reconstruct past environments. Statistical significance testing is found to be sufficient to identify these unreliable reconstructions.
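For orientation, the classical baseline that the machine learning models are compared against, weighted averaging with inverse or classical deshrinking, can be sketched as follows. This is a simplified illustration of the standard two-step WA calculation, not the code used in the study: the function names, the toy abundances and the pH-like values are hypothetical, and the sketch assumes every taxon occurs in at least one training sample.

```python
# Minimal weighted-averaging (WA) transfer function with deshrinking.
# Data and names are hypothetical; rows are samples, columns are taxa.

def wa_optima(abundances, env):
    """Taxon optimum = abundance-weighted mean of the environmental variable."""
    n_taxa = len(abundances[0])
    optima = []
    for k in range(n_taxa):
        total = sum(row[k] for row in abundances)
        optima.append(sum(row[k] * x for row, x in zip(abundances, env)) / total)
    return optima


def wa_initial(abundances, optima):
    """Initial estimate = abundance-weighted mean of the taxon optima."""
    return [sum(y * u for y, u in zip(row, optima)) / sum(row)
            for row in abundances]


def simple_regression(x, y):
    """Least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_wa(abundances, env, deshrink="inverse"):
    """Return (optima, b0, b1) so that estimate = b0 + b1 * initial."""
    optima = wa_optima(abundances, env)
    initial = wa_initial(abundances, optima)
    if deshrink == "inverse":
        # Regress observed values on the initial estimates.
        b0, b1 = simple_regression(initial, env)
    elif deshrink == "classical":
        # Regress initial estimates on observed values, then invert the line.
        a0, a1 = simple_regression(env, initial)
        b0, b1 = -a0 / a1, 1.0 / a1
    else:
        raise ValueError("deshrink must be 'inverse' or 'classical'")
    return optima, b0, b1


def predict_wa(model, abundances):
    """Reconstruct the environmental variable for new assemblages."""
    optima, b0, b1 = model
    return [b0 + b1 * v for v in wa_initial(abundances, optima)]


# Hypothetical training set: five samples, three taxa, a pH-like variable.
train_abund = [
    [60, 30, 10],
    [40, 40, 20],
    [20, 50, 30],
    [10, 30, 60],
    [5, 15, 80],
]
train_env = [5.0, 5.6, 6.1, 6.9, 7.4]

model_inv = fit_wa(train_abund, train_env, deshrink="inverse")
model_cla = fit_wa(train_abund, train_env, deshrink="classical")

# Hypothetical fossil assemblage to reconstruct.
recon = predict_wa(model_inv, [[30, 45, 25]])
```

Because averaging is applied twice, the initial estimates are compressed toward the training-set mean; the deshrinking regression stretches them back out, and the classical variant always stretches at least as much as the inverse one.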