Can we use sea surface temperature and productivity proxy records to reconstruct Ekman upwelling?

Marine sediments have greatly improved our understanding of the climate system, but their interpretation often assumes that certain climate mechanisms operate consistently over all timescales of interest and that variability at one or a few sample sites is representative of an oceanographic province. In this study, we test these assumptions using modern observations in an idealized manner mimicking paleo-reconstruction to investigate whether sea surface temperature and productivity proxy records in the Southern California Current System can be used to reconstruct Ekman upwelling. The method uses extended empirical orthogonal function (EEOF) analysis of the covariation of alongshore wind stress, chlorophyll, and sea surface temperature as measured by satellites from 2002 to 2009. We find that EEOF1 does not reflect an Ekman upwelling pattern but instead much broader California Current processes. EEOF2 and 3 reflect upwelling patterns, but these patterns are timescale dependent and regional. Thus, the skill of using one site to reconstruct the large-scale dominant patterns is spatially dependent. Lastly, we show that using multiple sites and/or multiple variables generally improves field reconstruction. These results together suggest that caution is needed when attempting to extrapolate mechanisms that may be important on seasonal timescales (e.g., Ekman upwelling) to deeper time but also the advantage of having multiple proxy records.


Introduction
The climate system varies across multiple timescales and is driven by both stochastic processes and deterministic forcings (Huybers and Curry, 2006).Paleoclimate records help us understand mechanisms of climate variability and change over long timescales by extending instrumental records beyond the historical period.Numerous studies have used paleoclimate records to understand climate system responses to different external forcings (e.g., Shakun et al., 2012), have put recent climate change into a long-term context (e.g., Abram et al., 2016;PAGES2k Consortium, 2013), and have helped benchmark climate models (e.g., Harrison et al., 2015).
Marine sediment is one of the most widely used archives for paleoclimate studies.Using marine sediments for paleoclimate inference usually involves multiple steps, whereby one first measures multiple sensors, frequently proxies for sea surface temperature (SST) and productivity, from a single site.Then, one compares them with other nearby local records, hemispheric reconstructions, and forcing reconstructions.Lastly, one applies modern large-scale climatology to explain changes observed in paleoclimate records (e.g., Abram et al., 2016;Goni et al., 2006;Leduc et al., 2010a;MARGO, 2009;McGregor et al., 2007;Vargas et al., 2007).While these comparisons have improved our understanding about paleoclimate significantly, uncertainties and oversimplifications may often result in overly broad interpretations and assertions.Notably, this approach typically assumes that (1) certain climate mechanisms always operate over the past at all timescales of interest, and (2) large-scale phenomena can be linked to variability at one or a few sample sites (i.e., a paleoclimate record location).In actuality, some have found a substantial difference in SST reconstruction at nearby sites (e.g., Leduc et al., 2010b, and references therein).
This paper illustrates an approach to test commonly asserted interpretations of SST and productivity proxy records by using observational data to analyze a region where A. Cheung et al.: Covariation of SST and productivity proxy records in the Southern California Current a known mechanism drives a large fraction of the variability and with well-preserved high-resolution sedimentary records: the southern California region, an example of an eastern boundary upwelling system (EBUS).There are strong scientific and societal interests in understanding EBUSs because physical and biogeochemical changes in these regions are known to have significant impacts on regional climate (Snyder et al., 2003;Ravelo et al., 2004;Jacox et al., 2014) and the global fishery industry (Ryther, 1969;Pauly and Christensen, 1995;Ware and Thomson, 2005).Unfortunately, it remains uncertain how EBUSs will change on decadal to centennial timescales in the future (Bakun et al., 2015;Di Lorenzo, 2015;Garcia-Reyes et al., 2015, and references therein).Nevertheless, underlying sediments in these regions often accumulate rapidly and contain a wealth of paleoclimate information, in particular organic biomarkers and associated proxies.Thus, this has allowed for high-resolution (subdecadal timescale) and high-quality paleoclimate reconstructions along many EBUSs, which provide additional constraints on past and future changes in EBUSs (e.g., Leduc et al., 2010a;McGregor et al., 2007;Vargas et al., 2007).
Variability in SST and productivity reconstructions along EBUSs are often regarded as a response to Ekman pumping (e.g., Leduc et al., 2010a;McGregor et al., 2007;Vargas et al., 2007;MARGO, 2009).However, many other processes are also at play in EBUSs and can drive SST and productivity changes (e.g., eddies, zonal advection, surface heat flux variations, changes in nutrient sources and concentration forced by subsurface processes, and large-scale climate variability that affects the stratification) (Di Lorenzo et al., 2005;Chhak and Di Lorenzo, 2007;Gruber et al., 2011;Jacox et al., 2016;Rykaczewski and Dunne, 2010;Xiu et al., 2018).Depending on spatial and temporal timescales, these processes can overwhelm the Ekman signal in SST and productivity changes recorded by proxy records.
Here we use high-resolution modern observations available during the satellite era to probe the spatial and temporal influence of Ekman pumping on environmental parameters of interest (e.g., SST and productivity).We apply the extended empirical orthogonal function (EEOF) approach (Chen and Harr, 1993) to analyze covariation between sea surface temperature, productivity, and alongshore wind stress in the Southern California Current System using high-resolution satellite data.We test the hypotheses that (1) the dominant covarying EEOF pattern resembles regionwide Ekman upwelling, (2) Ekman upwelling patterns, and thus the wind stress magnitude, can be recovered using timeaveraged proxies, and (3) large-scale changes are not the dominant drivers of variability at a single paleoclimate site.We also assess the benefits of using multiple proxy records from multiple sites to better understand the climate variability of the past in EBUS regions.

California Current System
The availability of numerous high-resolution spatiotemporal data (e.g., repeated hydrography, gliders, satellites) and advances in modeling have allowed us to better understand the variability of the California Current System (CCS) on multiple timescales.The CCS is made up of the California Current, California Undercurrent, and upwelling zones, which interact with a variety of local topographic features and estuaries.On 1st order, the CCS as a whole is driven by large-scale climate forcing.Changes in atmospheric pressure systems (subtropical high, Aleutian low) alter wind strength and direction, which in turn affect current direction, strength, and upwelling variability.The stratification in the region is set by large-scale features and forcing of the North Pacific.Variations in topographic features, wind forcing, freshwater inputs, and submesoscale-mesoscale features across spatial scales also play important roles in determining the spatiotemporal characteristics of the CCS.Lynn and Simpson (1987), Checkley andBarth (2009), andCapet et al. (2008) provide overviews on the dynamics of the CCS and drivers of SST, chlorophyll, and wind forcing variability.
The optimal marine sediments to reconstruct subdecadal climate variability require a high sedimentation rate with minimal bioturbation and hence anoxic depositional environments.Along the CCS, these conditions mostly occur south of 24 • N with the exception of silled basins (e.g., Santa Barbara Basin) (van Geen et al., 2003).As a result, previous high-resolution (subdecadal) paleoclimate studies were mostly done in the southern part of the CCS (SCCS; Fig. 1) (e.g., Goni et al., 2006;Abella-Gutiérrez and Herguera, 2016;Zhao et al., 2000).

Data and methods
This study made use of high-spatiotemporal-resolution estimates of sea surface temperature (SST), chlorophyll a (CHL), and alongshore wind stress (TAU) from satellite measurements to assess the role of Ekman pumping in driving SST and productivity changes along the SCCS.We used an extended empirical orthogonal function (EEOF) to assess the covariation between these variables because they are expected to be correlated spatially and temporally if Ekman theory is indeed the primary mechanism driving changes in the region.EEOF analysis decomposes the dataset into different covarying patterns that are orthogonal to each other.Each covarying pattern is accompanied by a time series that represents the time evolution of the covarying pattern.These patterns do not necessarily correspond to dynamical modes, but they are suggestive of physical processes that are present in the system (Monahan et al., 2009).Thus, analysis of EEOF patterns allows us to make inferences about the potential underlying dynamics.In addition, we assessed the effects of time averaging and spatial subsampling on the ability to recover dominant patterns within the spatial window analyzed.Such an assessment allows us to determine the fidelity of using proxies, which are time averaged and undersampled spatially, to understand Ekman pumping in the SCCS.Details of the data and method used can be found in Sect.3.1 and 3.2.

Data
We used sea surface temperature (SST) from the Geostationary Environmental Satellite (GOES) system, chlorophyll a (CHL) from MODIS, and alongshore wind stress (TAU) observations from QuikSCAT that span from July 2002 to November 2009.Although CHL does not equate precisely to primary productivity and also differs from productivity inferred from proxy records, CHL provides a 1st-order estimate of productivity (Henson et al., 2010).All data were derived and are available from the National Aeronautics and Space Administration Jet Propulsion Laboratory PO.DAAC and ocean color data server.We did not use the California Cooperative Oceanic Fisheries Investigations dataset because sampling resolution is low and the spatial extent is small when compared to satellite images.Reanalysis products (e.g., SODA) were also not chosen because even though they may span a longer period of time, there are many uncertainties associated with these products, for instance initial conditions, boundary forcings, model physics, and resolution (approximately 25 km horizontal) (Carton et al., 2018).Furthermore, Capet et al. (2008) show that submesoscale-permitting resolution (750 m horizontal) is needed in order to accurately simulate this upwelling system.
For TAU calculation, we used the descending pass of level 3 gridded Jet Propulsion Laboratory v2 QuikSCAT surface wind observations (SeaPAC, 2006).The QuikSCAT satellite is equipped with the SeaWinds scatterometer, a microwave radar that measures ocean radar backscatter over a cross section, which varies with satellite parameters and surface geometry (Freilich et al., 1994;Chelton and Freilich, 2005).Surface wind vectors can be estimated using model functions to estimate the relationship between wind and radar backscatter over the cross section.Level 3 data were derived using the direction interval retrieval with threshold nudging wind vector solutions based on level 2B data, which used the QSCAT-1B geophysical model function (Perry, 2001).Level 3 QuikSCAT data provide 0.25 • ×0.25 • spatial resolution on a daily timescale.The QuikSCAT accuracy is about 0.75 ms −1 in the along-wind component and about 1.5 ms −1 in the crosswind component (Chelton and Freilich, 2005).
We utilized SST observations from the Geostationary Environmental Satellite (GOES) system.GOES provides neartime SST measurements along the west coast of North America.We used level 3 gridded GOES 6 km near-real-time SST daily data after 12 May 2003 (NOAA/NESDIS, 2003b) and averaged hourly SST data to daily mean resolution prior to 12 May 2003 (NOAA/NESDIS, 2003a).Level 3 GOES SST data provide 0.05 • × 0.05 • spatial resolution with better than 1K SST accuracy (Wick et al., 2002).
For CHL concentrations, we used ocean color from the Moderate Resolution Imaging Spectroradiometer on the Aqua satellite (MODIS Aqua) (Hu et al., 2012).MODIS Aqua is sun synchronous and measures 36 spectral bands.We used level 3 standard mapped image CHL measurements from MODIS Aqua v2018.0(NASA Goddard Space Flight Center, 2018).Level 3 CHL data provide 0.041 • × 0.041 • spatial resolution on a near-daily timescale with an accuracy of approximately ±35 % (Dall'Olmo et al., 2005).

Observation preprocessing
To allow for comparison between SST, CHL, and TAU, CHL and SST were regridded to 0.25 • × 0.25 • spatial resolution.This was done by bounding the datasets to 15-45 • N, 130-100 • W and then calculating the area-weighted CHL and SST value over each new grid cell.We further restricted our latitudinal extent to 15-35 • N to make the analysis more computationally efficient.Repeated analysis using different spatial domains (case 1: only east of 125 • W; case 2: only north of 20 • N) suggests that our conclusions are insensitive to the spatial extent selected for analysis (not shown).
Since the primary interest is Ekman-driven upwelling along the coast, we computed the TAU by using Eq. ( 1): where τ is alongshore wind stress, ρ is air density, C D is the drag coefficient, U is wind speed, and |U | is the alongshore wind vector.|U | was calculated by summing the alongshore component of zonal and meridional wind vectors such that −|U | and |U | represent equatorward and poleward wind stress, respectively.We used constant values for the coefficients, where ρ = 1.2 kg m −3 and C D = 1.2 × 10 −3 (Large and Pond, 1981).
Linear interpolation of all of the near-daily datasets temporally ensured a uniform daily sampling rate data at each grid cell.The logarithm of CHL data was taken after regridding but before EEOF analysis because CHL exhibits a nearly lognormal distribution (Campbell, 1995).Before EEOF analysis, each variable was normalized by dividing the dataset by its domain-wide and all-time standard deviation, which makes the anomaly variations in each variable comparable to each other in terms of occurrence likelihood (assuming approximately Gaussian distributions).
To follow the logic of analyzing fields that would resemble proxy records, no removal of mean or climatological states or seasonality from the satellite records was performed.Thus, the EEOF analysis is performed on the total fields rather than the anomaly fields.

Extended empirical orthogonal function
Extended empirical orthogonal function (EEOF) decomposition analysis was used to extract dominant patterns with covariation in SST, CHL, and TAU.EEOF is a variant of empirical orthogonal function (EOF) analysis, a method that extracts coherent, orthogonal patterns by optimizing variance into multiple orthogonal functions in time and space.Multiple variants of the EOF exist, which all involve taking into account temporal correlations of a variable or correlations between variables (e.g., Bretherton et al., 1992;Hannachi et al., 2007, and references therein).Examining multiple time snapshots as a single field allows EOF-based analysis to extract propagating patterns (e.g., Chen and Harr, 1993) and covarying patterns (Kutzbach, 1967).Figures 2 and 3 show examples of three different EOF-based methods that are fundamental to the analysis herein (EOF, EEOF with temporal correlation, EEOF with multiple variables).
The EEOF method used in this study involved extracting dominant covarying patterns by taking into account both temporal correlation within the same variable (symmetric lead-lag relationships) and correlation between variables.We employed the singular value decomposition (SVD) method (Bretherton et al., 1992) to decompose the covarying pattern of SST, CHL, and TAU into the relevant EEOF objects.
To consider the time correlation of a variable X for EEOF analysis, we form the following data matrix: where x t,i is a data point at a certain time snapshot t and space grid point i, t = 1, 2, . .., m, i = 1, 2, . .., j , k is the time unit of lead and lag included, m is the temporal length of the dataset, and j is the total number of total spatial grid points covered.Thus, X is the concatenation of multiple reproductions of x t,i , with each column featuring x evaluated at sequential times and each row representing every spatial value of x, concatenated with spatial maps that are displaced in time to provide lead and lag information.Similarly, the data matrix, M, with three variables can be written as follows: where SST, CHL, and TAU are submatrices with a structure similar to matrix X.Note that each row of M is a complete spatiotemporal set of each variable, including every spatial location and lead and lag times for each variable, so that M is effectively the concatenation of three X matrices, one for each variable.Then, using SVD, we can decompose Eq. ( 3) into  where U is a matrix of left-orthogonal singular vectors as columns with temporal information of the M matrix (principal components -PCs), S represents singular values, and V is a matrix of right-orthogonal singular vectors as columns with spatial information (extended empirical orthogonal functions -EEOFs) of the M matrix.Note that the SVD method arrives at a basis of eigenvectors of the covariance matrices M T M, i.e., M T MV = S 2 V , and MM T , i.e., MM T U = S 2 U , so this approach is equivalent (although slightly different algorithmically) to generating EEOFs by eigenvalue decomposition.
Since proxy records reflect time-averaged environmental information (usually monthly or longer), daily satellite information for analysis does not accurately depict the temporal smoothing characteristics in proxies.Hence, we performed EEOF analysis independently on daily data after averaging it into 30 d (∼ monthly) and 365 d (∼ annual) with nonoverlapping means.The relatively short span of satellite observations does not allow us to extend our analysis to longer time periods that might also be of interest.

Determining the significance of modes and lead-lag
Based on singular values, EEOF1 explains ∼ 85 % of the total variance, and EEOF2 and 3 each explain ∼ 5 % of the total variance.Instead of using singular values to determine the significance of each mode, we selected the number of modes and lead-lags to retain by evaluating the skill to reconstruct TAU.This approach was motivated by the interest of this study to detect Ekman upwelling, which involves covariation of SST, CHL, and TAU and our inability to reconstruct TAU directly using proxies.Reconstruction of TAU (TAU rec ) was carried out as follows: where TAU(0) represents the columns for TAU in the original data matrix that were replaced with zeros.Then, where r is a rescaling factor calculated by std(SST|CHL) std(SST rec |CHL rec ) , and V n is spatial information obtained from decomposing M (Eq.4) with n numbers of modes retained, where n = 1. ..5.Note that as n is much smaller than the rank of M rec , V n V T n is not the identity matrix but is better thought of as the projection of M 0 onto the leading modes of M. As zero wind stress is inconsistent with any of the modes V n , multiplying M 0 by this factor adds TAU variability back into the zeroed values that is more consistent with the observed SST and CHL, which is M rec .
We used root mean square error (RMSE) as a metric to measure agreement between reconstructed TAU and actual TAU: where [•] represents the mean of the data.
Our analysis shows that reconstruction using three modes with no lead-lag information included provides the most stable result in predicting TAU from SST and CHL regardless of the averaging timescale (Fig. 4).This result, and similar results of convergence accuracy by adding more modes, suggests that the first three modes (n = 3) are reliable in this and other analyses, which will be used for the remainder of this paper.

Reconstruction of principal components
We determined how well proxy records could represent large-scale circulation patterns by means of signal reconstruction.We focused specifically on three sites with previously published high-resolution paleoclimate records -Santa Barbara Basin, San Lazaro Basin, and Guaymas Basin -and two environmental variables, SST and productivity (Goni et al., 2006;Abella-Gutiérrez and Herguera, 2016;Zhao et al., 2000).We carried out three different kinds of reconstructions to address the following questions: (1) how well does a single site or proxy record represent large-scale circulation?(2) Does increasing the number of proxy records and/or sites improve the skill to represent modes extracted from EEOF analysis?(3) Does increasing the number of proxy records and/or sites improve the skill to reconstruct the original dataset?This was achieved by first only retaining the target time series (i.e., those proxy records that are to be included) from the location in M tar : We reconstructed the temporal evolution of each mode by Eq. ( 10) using only the targeted proxy records and n EEOF modes: We reconstructed the dataset with Eq. ( 11): where U rec represents reconstructed PCs, r s is the ratio between the standard deviation of time series from target site(s) and the standard deviation of the reconstructed time series from target site(s), S n and V n were derived from Eq. ( 4), and n is the number of modes retained for analyses.In this scenario, only the parts of V n associated with the target location were retained for reconstruction.The pseudo-inversion of the matrix S n V T n was done using a Moore-Penrose pseudoinverse, which amounts to inverting only the non-singular degrees of freedom, while zeroing out the remaining modes.Similarly, the multiplication of M tar by V n V T n considers only the projection of M tar onto the n retained modes (V V T is the identity matrix, but if only some modes are retained, then only V T n V n is an identity but over the smaller dimensional space spanned by the retained modes).By retaining one mode (n = 1) and limiting the proxy record used in M tar to one, Eqs. ( 10) and ( 11) can be used to addressed the ability of using a proxy record at a single location to represent largescale circulation, which is represented by EEOF1.Similarly, by retaining three modes (n = 3), Eqs. ( 10) and ( 11) can be used to evaluate the effects of increasing the number of proxy records to reconstruct modes extracted from EEOF analysis and the original dataset.• W) and comparing them, we find the TAU and CHL display a weak cross-shore gradient compared to their own respective meridional gradient.On the other hand, SST exhibits a meridional gradient that is stronger than its cross-shore gradient (Fig. 5).These patterns remain dominant when 30 and 365 d averaged data were used.

Results and discussion
The fact that EEOF1 does not resemble an Ekman upwelling pattern has two major implications.First, this implies that wind stress is not the only forcing that drives CHL and SST changes along EBUSs.Previous studies have reported different mechanisms that could control changes in CHL or SST along EBUSs on various timescales.For instance, changes in subsurface nutrient concentration and sources have been shown to alter primary productivity (Chhak and Di Lorenzo, 2007;Rykaczewski and Dunne, 2010), whereas surface heat flux has been shown to exert a dominant control on sea surface temperature in the California Current System (Di Lorenzo et al., 2005).Our study confirms these results and further iterates the importance of considering different factors that could affect CHL and SST along EBUSs, which are often used as indicators for changes in Ekman-driven upwelling.Second, paleoclimate reconstructions in the SCCS will be unlikely to reflect Ekman upwelling, in contrast to the common paradigm in the field, and couplings observed between proxy reconstructions of SST and productivity likely capture other processes.

Can time-averaged proxies be used to reconstruct
Ekman upwelling?
Even though the dominant covarying pattern does not reflect Ekman upwelling, the EEOF method allows us to decompose multiple covarying patterns for analysis.Our results suggest that EEOF2 and EEOF3 resemble an Ekman upwelling pattern on daily timescales, but they reflect upwelling at different locations (Figs.6-7).Specifically, EEOF2 depicts upwelling conditions off Baja California, whereas EEOF3 reflects upwelling or other rapid changes in conditions at the Sea of Cortez.This presents an opportunity to understand whether time-averaged proxies can be used to reconstruct Ekman upwelling given optimal site selection.
Visual comparison of EEOF2 and EEOF3 patterns across different averaging windows suggests that these patterns change with respect to the averaging window.For both EE-OFs, their patterns resemble Ekman upwelling when data with daily resolution are used.These Ekman-upwelling-like patterns disappear when 365 d averaged data are used instead and only spatially incoherent structures are retained (bottom rows in Figs.6i-l and 7i-l).The disappearance of an Ekman upwelling pattern suggests that either Ekman upwelling is a subannual process and/or that this process is not a dominant feature on an annual timescale.We further analyze the changes in temporal scale by comparing 30 and 365 d averages of the principal component derived using daily data with principal components derived from 30 and 365 d average data.The averages of the principal component derived using daily data represent the assumption that the same dynamical process happen at all timescales, whereas the principal components derived from averaged data represent the actual covarying pattern on the timescale of interest.Our results   show that the 30 and 365 d means of PC2 and PC3 derived from daily data do not always track the principal components derived from time-averaged data (Figs.6h, l and 7h, l).While it is not possible to diagnose the underlying cause using our method, these results imply that marine sedimentary records, which generally integrate over the annual cycle, cannot capture Ekman upwelling variations in this region.Furthermore, these results highlight the importance of considering what timescales are reflected in the proxy record.On the assumption that some proxies are seasonally biased (e.g., "integrated production temperature" applied to the interpretation of alkenone paleotemperature estimates by Conte et al., 1992) we add a sine-weighting function (maximum in March and minimum in September) to the 30 d averaged dataset and reanalyze the resulting EEOF pattern.We find that the pattern is similar to the one without weighting (not shown).This suggests that the seasonal cycle does not dominate the resulting EEOF patterns over this spatial and temporal domain.
4.3 Are there benefits to analyzing records from multiple sites?
Since an upwelling pattern is only observed in the analysis using daily and 30 d averaged data, we focus on assessing the potential benefits of analyzing records from multiple sites on 30 d (∼ monthly) data.We acknowledge that most sedimentary records integrate over an annual cycle.However, since we cannot recover the upwelling pattern in the first three modes when using 365 d averaged data, we consider an idealized situation instead in which proxy records integrate climate information on an approximately monthly timescale.
With only a single proxy-type measurement from one site, one can only assume it reflects the dominant large-scale circulation pattern of that area (represented by EEOF1-PC1 in this case).However, comparisons between PC1 and reconstructed PC1s based on a variable from one site show that the ability to recover the dominant pattern depends on the location and variable (Fig. 8).This varying relationship suggests that small-scale processes can drive variability at a proxy site, which can lead to behavior that is different from large-scale circulation.Therefore, caution is needed when trying to extrapolate variability in a single proxy record from one paleoclimate site to infer large-scale circulation changes.Nevertheless, in the absence of additional sites available to recover sediment cores, we find that measuring multiple variables often leads to better constraint of large-scale climate variability (Fig. 9a).
Multiple drilling expeditions in the SCCS have recovered cores from different locations, which allows us to determine whether there are benefits to analyzing records from multiple sites.With multiple sites available, we can potentially reconstruct different patterns of large-scale variability (Figs.5-7).In the case of 30 d averaged data, a multiple-site-based reconstruction allows us to reconstruct spatiotemporal patterns that are associated with Ekman-driven upwelling (Fig. 9).There is also a tendency of increasing reconstruction skill when more sites and proxies are used.Therefore, there is potential to recover multiple covarying patterns that are driven by different dynamics.
Adding reconstruction sites and variables analyzed can also potentially improve the ability to reconstruct spatiotemporal variability in the spatial domain analyzed.This has been shown in other pseudo-proxy experiments that concern hemispheric reconstruction (e.g., Wang et al., 2014).Although our reconstruction technique is rather simple compared to commonly used climate field reconstruction techniques in pseudo-proxy experiments and other reconstructions (e.g., Wang et al., 2014), we show that similar results emerge, wherein an increasing number of sites and/or variables can help better reconstruct full field data that contain multiple variables (Fig. 10; Eq. 3).Therefore, these results together argue for the notion of using multiple sites and proxies for paleoclimate reconstruction.

Implications
While this study only focuses on the case of Ekman upwelling in the SCCS, it has general implications for paleoclimate studies.First, our analysis provides empirical evidence that it is important to consider the spatial representativeness of a proxy record.This calls for careful interpretation in each proxy record developed in order to avoid over simplification and over-interpretation of the climate system.Second, we demonstrate that depending on time averaging and the timescale of interest, mechanisms such as Ekman upwelling might or might not be an important process that drives variability in proxy records.Therefore, it is also important to understand whether the proxies applied and the record are able to resolve the timescales at which the mechanism of interest dominates (e.g., El Niño-Southern Oscillation on interannual timescale).Third, we show that analyzing different proxy records from multiple sites can help us reconstruct multiple covarying patterns and improve climate field reconstruction.Last, we propose and demonstrate a multivariate method that allows us to test the assumptions regarding spatial and temporal sampling.We expect that this method can also be easily applied to other regions to provide a 1st-order constraint on how the proxy records can be interpreted.

Limitations
There are multiple limitations that have to be taken into account when applying the results from this analysis to a paleoclimate context.Firstly, our analysis is only based on 7 years of instrumental data.It is possible that the patterns established in this study are only applicable to the years analyzed  due to potential nonstationary covarying relationships between the variables analyzed.Furthermore, the short length of the instrumental records does not allow us to assess the impacts of basin-scale low-frequency climate variability.
Secondly, our analysis assumes that signals from proxy records can capture surface ocean conditions perfectly and are free from other noise.This assumption is certainly violated, with multiple studies pointing to different sources of uncertainties in sedimentary records (e.g., Dolman and Laepple, 2018).Nevertheless, our analysis provides an idealized scenario to understand assumptions associated with spatial and temporal sampling and marks an important step toward better interpreting paleoclimate records.
Thirdly, the utilization of a chlorophyll satellite product assumes that chlorophyll is related to primary productivity, which in turn is related to export productivity, a variable that is believed to be captured by proxies.While the first assumption that chlorophyll and primary productivity are related is probably accurate on 1st order (Henson et al., 2010), the relationship between primary productivity and export productivity is less trivial.Previous studies have identified a general relationship between export productivity, marine productivity, and sea surface temperature (Dunne et al., 2005;Laws et al., 2011).Sediment trap studies done in the Santa Barbara and Guaymas basins generally show a similar pattern (Thunell et al., 1994;Thunell, 1998), with export production correlated positively with primary productivity (organic carbon and opal in the Santa Barbara Basin; opal in the Guaymas Basin).However, a discontinuous sediment trap study done in the San Lazaro Basin suggested productivity driven by remineralization during El Niño, which resulted low export productivity despite high productivity (Silverberg et al., 2004).This highlights the potential complexity in plankton communities along a continental margin, which can experience both eutrophic and oligotrophic conditions.In fact, Dunne et al. (2005) examined the proposed parameterization by synthesizing different sediment trap sites and showed that the positive relationship between primary productivity and export productivity works in a global sense but not small scales.Furthermore, many studies have highlighted other factors to consider when considering export production, for instance particle size, ballasting effects, rem-ineralization, eddy subduction, and mixed layer pumping (Lam and Marchal, 2015;Boyd et al., 2019, and references therein).Hence, more dedicated experiments are needed in order to establish a quantitative relationship between the chlorophyll data used here and paleo-productivity records.
Fourthly, we assume that each statistical mode retrieved in this study is tied to a dynamical mechanism.However, previous studies have cautioned against such interpretations (e.g., Hannachi et al., 2007).Nevertheless, our study does not aim to diagnose Ekman upwelling processes but simply aims to determine whether it is possible to recover Ekmanupwelling-related patterns in proxy records.Hence, we argue that the distinction between a dynamical mode and statistical mode does not undermine our results.
Lastly, our multiple-record analysis assumes that proxy records contain perfect age models.In most cases, this assumption is also invalid.It is inevitable that each sedimentary record contains absolute age uncertainties.Therefore, using marine sedimentary records for a multi-site proxy reconstruction with a high temporal resolution is more challenging and might yield a different conclusion than ours.

Conclusions
This study aimed to evaluate assumptions commonly made in paleoclimate studies: (1) a certain mechanism operates in the past on all timescales of interest, and (2) large-scale phenomena can explain the most variance in a small location (i.e., a paleoclimate site).We tested these assumptions by focusing on the Southern California Current System and used observational records to understand whether it is possible to reconstruct Ekman upwelling using multiple sedimentary records.We introduced an extended empirical orthogonal function framework and applied it to satellite records to make inferences about paleoclimate records.Our results indicate that the dominant TAU, CHL, and SST covarying pattern does not resemble Ekman upwelling.In addition, the relationship between these variables appears to depend on timescales and spatial scales.A positive result is that our analysis suggests that a few sediment sites can monitor large-scale fields associated with the Southern California Current.Lastly, we highlight the potential benefits of using multiple proxy records to understand different largescale covarying patterns.Our study suggests that instrumental records are helpful for testing assumptions in paleoclimatology and the associated spatial-and temporal-scale extrapolations made based on paleoclimate reconstructions.Testing these assumptions might help us better interpret proxy records and understand past climate changes.SST

Figure 1 .
Figure 1.(a) Winter (December, January, February) sea surface temperature average and wind pattern; (b) winter chlorophyll monthly average and wind pattern; (c) summer (June, July, August) sea surface temperature average and wind pattern; (d) summer chlorophyll monthly average and wind pattern.The basins highlighted are where high-resolution (subdecadal) sediment cores were previously retrieved and analyzed.

Figure 2 .
Figure 2. Example of decomposing sea surface temperature into different modes by using an (a, d) empirical orthogonal function, (b, e) an extended empirical orthogonal function with 1 d lead and lag, and (c, f) an extended empirical orthogonal function with chlorophyll included.

Figure 3 .
Figure 3. Example of decomposing chlorophyll into different modes by using an (a, d) empirical orthogonal function, (b, e) an extended empirical orthogonal function with 1 d lead and lag, and (c, f) an extended empirical orthogonal function with sea surface temperature included.

4. 1 Figure 4 .
Figure 4. Root mean square error of reconstructed wind stress with respect to actual wind stress using (a) daily data, (b) 7 d averaged data, (c) 30 d averaged data, (d) 90 d averaged data, and (e) 365 d averaged data.

Figure 5 .
Figure 5. EEOF1 spatial and temporal patterns of TAU, CHL, and SST using (a-d) daily, (e-h) 30 d averaged, and (i-l) 365 d averaged data.Stars in the spatial pattern plots indicate locations where the differences were taken to compute cross-shore and meridional gradients.30 d mean (green) and 365 d mean (purple) time series were derived from averaging 1 d average PC1.The correlation coefficient indicates how well the time mean of 1 d average PC tracks the PC of the time-averaged data.

Figure 6 .
Figure 6.EEOF2 spatial and temporal patterns of TAU, CHL, and SST using (a-d) daily, (e-h) 30 d averaged, and (i-l) 365 d averaged data.Stars in the spatial pattern plots indicate locations where the differences were taken to compute cross-shore and meridional gradients.30 d mean (green) and 365 d mean (purple) time series were derived from averaging 1 d average PC2.The correlation coefficient indicates how well the time mean of 1 d average PC tracks the PC of the time-averaged data.

Figure 7 .
Figure 7. EEOF3 spatial and temporal patterns of TAU, CHL, and SST using (a-d) daily, (e-h) 30 d averaged, and (i-l) 365 d averaged data.Stars in the spatial pattern plots indicate locations where the differences were taken to compute cross-shore and meridional gradients.30 d mean (green) and 365 d mean (purple) time series were derived from averaging 1 d average PC3.The correlation coefficient indicates how well the time mean of 1 d average PC tracks the PC of the time-averaged data.

Figure 8 .
Figure 8. (a-d) Best, (e-h) median, and (i-l) worst PC1 reconstruction spatial RMSE and time series using only one variable from one site.The white marker indicates the site used in that reconstruction, with the circle indicating SST and the square indicating CHL.The means of both time series were removed for visualization purposes.

Figure 9 .
Figure 9. Correlation coefficient between reconstructed and actual (a) PC1, (b) PC2, and (c) PC3 temporal pattern using 30 d averaged data with varying numbers of time series from the target sites.Also shown are the best, median, worst, and original temporal pattern reconstructions (ranked by correlation with the original PC) of (d) PC1, (e) PC2, and (f) PC3.

Figure 10 .
Figure 10.Full data reconstruction RMSE using different numbers of time series as input for reconstruction.
and QuikSCAT surface wind data were obtained from the NASA EOSDIS Physical Oceanography Distributed Active Archive Center (PO.DAAC) at the Jet Propulsion Laboratory, Pasadena, CA.MODIS chlorophyll a data were obtained from the NASA Goddard Space Flight Center, Ocean Ecology Laboratory, Ocean Biology Processing Group.We thank Richard Vachula, Nora Richter, and Sloane Garelick for helpful comments, suggestions, and discussions.Anson Cheung was supported by the Brown University Presidential Fellowship.Baylor Fox-Kemper was supported by ONR N00014-17-1-2393.This research has been supported by the Office of Naval Research, Office of Naval Research Global (grant no.N00014-17-1-2393), and a Brown University Presidential Fellowship.This paper was edited by Ran Feng and reviewed by Yige Zhang and two anonymous referees.