Articles | Volume 25, issue 6
Research article
17 Jun 2021
Research article |  | 17 Jun 2021

Machine learning deciphers CO2 sequestration and subsurface flowpaths from stream chemistry

Andrew R. Shaughnessy, Xin Gu, Tao Wen, and Susan L. Brantley

Endmember mixing analysis (EMMA) is often used by hydrogeochemists to interpret the sources of stream solutes, but variations in stream concentrations and discharges remain difficult to explain. We discovered that machine learning can be used to highlight patterns in stream chemistry that reveal information about sources of solutes and subsurface groundwater flowpaths. The investigation has implications, in turn, for the balance of CO2 in the atmosphere. For example, CO2-driven weathering of silicate minerals removes carbon from the atmosphere over 106-year timescales. Weathering of another common mineral, pyrite, releases sulfuric acid that in turn causes dissolution of carbonates. In that process, however, CO2 is released instead of sequestered from the atmosphere. Thus, understanding long-term global CO2 sequestration by weathering requires quantification of CO2- versus H2SO4-driven reactions. Most researchers estimate such weathering fluxes from stream chemistry, but interpreting the reactant minerals and acids dissolved in streams has been fraught with difficulty. We apply a machine-learning technique to EMMA in three watersheds to determine the extent of mineral dissolution by each acid, without pre-defining the endmembers. The results show that the watersheds continuously or intermittently sequester CO2, but the extent of CO2 drawdown is diminished in areas heavily affected by acid rain. Prior to applying the new algorithm, CO2 drawdown was overestimated. The new technique, which elucidates the importance of different subsurface flowpaths and long-timescale changes in the watersheds, should have utility as a new EMMA for investigating water resources worldwide.

1 Introduction

We need to understand the long-term controls on atmospheric CO2 because of the impact of this greenhouse gas on global climate. This is important because humans are increasingly burning fossil fuels and releasing long-sequestered carbon into the atmosphere (Kasting and Walker, 1992). This new C flux upsets the natural long-term balance in the atmosphere between volcanic degassing and weathering-induced drawdown of CO2 over millennial timescales. Chemical weathering of the most common rock-forming minerals, silicates and carbonates, removes CO2 from the atmosphere by forming dissolved inorganic carbon that is carried in rivers to the ocean (DIC; Fig. 1). Over 105–106-year timescales, this DIC is precipitated as marine calcite, releasing half or all of the atmospherically derived CO2 back into the atmosphere for silicates and carbonates, respectively (Fig. 1). Thus, over this timescale, CO2-driven weathering (CO2 weathering) of silicates sequesters CO2 out of the atmosphere, while CO2 weathering of carbonates neither removes nor releases CO2 into the atmosphere (Fig. 1). Some researchers also emphasize that this simple picture neglects weathering of another ubiquitous mineral, pyrite (Lerman et al., 2007). When pyrite weathers, it produces sulfuric acid that also dissolves silicates and carbonates, i.e., H2SO4 weathering. When DIC generated through H2SO4 weathering of carbonates is carried to the ocean, marine calcite precipitates and releases CO2, increasing atmosphere concentrations (Spence and Telmer, 2005; Calmels et al., 2011; Torres et al., 2014; Kölling et al., 2019). Thus, determination of the weathering contributions of silicates, carbonates, and pyrite is essential toward understanding long-term dynamics of CO2. In this paper we describe a powerful machine-learning technique to interpret the sources of stream solutes to understand problems such as weathering. While we show the importance of applying this machine-learning technique to the weathering question, we also emphasize how machine learning can teach hydrogeochemists about subsurface flowpaths and other characteristics of stream systems.

The most common way hydrogeochemists interpret the fluxes of weathering is to investigate stream and river chemistry. Determining the endmembers for streams is important because streams integrate the byproducts of weathering reactions over drainage basins, allowing assessment of regional to global understanding of fluxes – but only if minerals weathered by different acid sources can be deconvoluted (Li et al., 2008; Calmels et al., 2011; Torres et al., 2016; Winnick et al., 2017; Burke et al., 2018; Killingsworth et al., 2018). In small-scale studies in the laboratory or soil profiles, mineral reactions can be documented, but this information cannot be scaled up easily (Navarre-Sitchler and Brantley, 2007). Here we show that machine learning can decipher the balance of fluxes of CO2 versus H2SO4 weathering as recorded in stream chemistry. We discovered that catchments partition water into subsurface flowpaths that can be (i) deciphered with respect to the extent of pyrite, silicate, and carbonate weathering in different lithologies and (ii) interpreted with respect to whether weathering is driven by CO2 or H2SO4. We emphasize the long-term effects (over 105–106 years) on the CO2 balance in the atmosphere.

Although geochemists commonly use stream chemistry to determine mineral sources of solutes via weathering reactions over large aerial extents (Gaillardet et al., 1999) and hydrologists commonly use endmember mixing analysis (EMMA) to determine the sources of solutes in a stream (Christophersen et al., 1990), stream datasets remain difficult to interpret because of spatial and temporal variations in endmember composition. For example, sulfur isotopes in stream solutes can distinguish pyrite-derived from rain-derived sulfate because pyrite typically is depleted in 34S (Burke et al., 2018; Killingsworth et al., 2018). However, this attribution is difficult, more expensive, and often ambiguous because pyrite δ34S varies between formations (Gautier, 1986) or within a single catchment (Bailey et al., 2004). Likewise, inputs of sulfate to watersheds, such as acid rain, can swamp out the signal from mineral reactions and can change significantly over time (e.g., because of changing acid rain deposition) (Lynch et al., 2000; Lehmann et al., 2007). These factors make it difficult to determine sources releasing sulfate to varying stream chemistries over time.

Several so-called “inverse models” have been used successfully to partition sulfate into endmember sources for streams and rivers. These include the two prominent modeling approaches by Torres et al. (2016) and Burke et al. (2018). However, because the chemistry of acid rain has varied over the past decades, utilizing the full range of rain chemistry in those models results in unrealistic contributions of acid rain (i.e., > 100 %) or models that fail to converge. This is at least partly because the chemistry of acid rain has been so variable that it spans the entire measured range of stream samples. Additionally, utilizing the approach of Burke et al. (2018), based on the approach of Gaillardet et al. (1999), requires a priori assignment of accurate endmember chemistries. Often, the researcher must rely on a few samples to characterize endmembers, resulting in large uncertainties in endmember chemistry and in source apportioning. Since the inception of EMMA, many researchers have aimed to improve analysis through a more accurate determination of unknown or under-constrained endmember chemistries (Hooper, 2003; Carrera et al., 2004; Valder et al., 2012). However, these efforts all use some a priori determination of endmembers. Our machine-learning model adds to the growing effort to improve EMMA by applying blind source separation. The machine-learning approach we describe here deconvolves sources of stream chemistry without pre-defining the endmembers. We demonstrate this first with a synthetic dataset and then with data from three well-studied watersheds with different characteristics. The new method discovers the endmember chemistries and, as a result, documents new findings of importance previously undiscovered with the other methods.

For the target watersheds, we focus first on Shale Hills, an acid-rain-impacted shale watershed in central Pennsylvania, USA, with extensive data for water/rock chemistry (Jin et al., 2010; Brantley et al., 2013a; Sullivan et al., 2016). This watershed allows the most complete understanding of solute sources. Although we do not show this here, if we use either of the two previously used models for source attribution, stream chemistry data for Shale Hills either do not separate acid rain and pyrite as a sulfate source (if we use the model of Torres et al., 2016) or yield a proportion for acid rain which is larger than 100 % (if we use the model of Burke et al., 2018). As shown below, the non-negative matrix factorization (NMF) model easily defines endmembers and proportions.

We then show the utility of the machine-learning method for watersheds where less water/rock chemistry has been published: we investigate East River and Hubbard Brook catchments. Like Shale Hills, East River is shale-hosted, but it receives little acid rain (Winnick et al., 2017). In contrast, Hubbard Brook has been extensively impacted by acid rain but is underlain by glacial till over schist (Likens et al., 2002). In both cases, NMF successfully determines endmembers and source proportions.

Figure 1Schematic summarizing the reactions, timescales, and net CO2 release to or uptake from the atmosphere accompanying weathering of silicate and carbonate minerals. Uptake or release depends upon timescale, as shown, and as discussed in the text. CaSiO3 is used as a generic silicate mineral.


2 Methods

2.1 Study sites

Where previous deconvolutions of stream chemistry into endmembers were generally based on assumptions about the chemistry of dissolving minerals alone, data for watersheds show that the flowpath of the water also affects this chemistry (e.g., Brantley et al., 2017). We demonstrate this with data from three well-studied watersheds with different characteristics. We focus first on Shale Hills, a small (0.08 km2), acid-rain-impacted forested watershed underlain by Rose Hill shale located in central Pennsylvania, USA (Brantley et al., 2018). The Rose Hill Formation shale contains  0.14 wt % S as pyrite (FeS2) (Gu et al., 2020a).

We then show the utility of the method in East River (shale-hosted, but it receives little acid rain) and Hubbard Brook (extensively impacted by acid rain but underlain by schist and glacial till) catchments. Specifically, East River is a large (85 km2), mountainous watershed underlain by Mancos Shale that is located near Gothic, Colorado, USA, within the Gunnison River basin (Winnick et al., 2017). Mancos is a black shale that contains  1.6 wt % S as pyrite (Wan et al., 2019). Both of these shale-hosted watersheds contain carbonate minerals that vary in composition and abundance in the subsurface. Lastly, Hubbard Brook (Nezat et al., 2004), located in the White Mountains of New Hampshire, USA, consists of a series of nine small (0.14–0.77 km2), forested watersheds underlain by Rangeley Formation metamorphosed shale and sandstone (schist) generally covered by glacial till derived mostly from the Kinsman granodiorite. The schist bedrock contains  0.2–0.9 wt % S, and till contains  0.1–0.2 wt % S. Again, almost all S is present as iron sulfide (pyrite or pyrrhotite). Both bedrock and till are largely carbonate-free.

2.2 Data acquisition

For Shale Hills, daily stream chemistry has been reported from 2008 to 2010 (Brantley et al., 2013b; Brantley et al., 2013c; Brantley et al., 2013d). Additional samples were measured in other time intervals for sulfur isotopes and alkalinity (Jin et al., 2014). All samples were filtered through a 0.45 µm Nylon filter and aliquots for cation analysis were acidified with nitric acid. Cations were measured on a Leeman Labs PS3000UV (Teledyne Leeman Labs, Hudson, NH) inductively coupled plasma–optical emission spectrometer (ICP-OES), and anions were measured on a Dionex Ion Chromatograph (Sunnyvale, CA). Alkalinity was measured by titration with 0.16 M H2SO4. Discharge data are available online (, last access: 1 November 2019).

All published data from East River were used in analysis (Winnick et al., 2017), except for two samples with extremely high values of chloride (246 and 854 µM) because they differed significantly from the remaining sample chemistry (average Cl concentration = 21 µM). Hubbard Brook weekly chemistry from 2000 to 2017 was downloaded for the sub-catchments (3, 6, 7, 8, 9) that were not experimentally manipulated (Bernhardt et al., 2019). Stream discharge data for each sub-catchment are from the USDA Forest Service (USDA, 2019).

2.3 Machine-learning model

To assign the proportion of sulfate in streams to sources, we first bootstrapped measurements to increase data volume and then used a method of blind source separation (Alexandrov and Vesselinov, 2014; Vesselinov et al., 2018) called NMF. NMF is unique from previously used methods in that it allows calculation of endmember compositions and mixing proportions simultaneously and does not rely on measurements or assumptions about endmembers a priori (Fig. 2a; see Sect. S1). Specifically, NMF decomposes the n×m matrix, V, into two matrices W and H:

(1) V = WH .

Here, cell entries of V are molar solute concentration ratios, [X] / [Y], for stream samples. Indicator n refers to the sampling date, m refers to different solutes X (= Ca2+, Mg2+, Na+, K+, and Cl), and brackets refer to concentrations. W is the n×p matrix whose cell entries are proportions, α, for each endmember in each stream sample. Again, n refers to sampling dates, but p is the number of sources of solutes (referred to as endmembers). The proportions refer to the fractions of sulfate in each sample that derive from an individual endmember, where the sum of proportions must equal 1±0.05 for each sample. To derive the mixing proportions of sulfate specifically, we set up the NMF approach by normalizing each analyte concentration by sulfate concentration (Y= SO42-), the target solute. After running the algorithm for each of the three watersheds, we then inferred by inspection (see discussion below) that the endmembers represent different flowpaths in the subsurface. Therefore, these proportions of sulfate are referred to here as shallow, moderately shallow, and deep flowpaths, i.e., αshallow, αmoderate, and αdeep, respectively (only one of our target watersheds revealed the moderate-depth flowpath). H is the p×m matrix whose cell entries are the concentration ratios that define the chemical signature of each of the p endmembers. The key to NMF is that these concentration ratios are not determined prior to apportionment but rather are determined from the data themselves. In addition, the chemical signatures of each endmember can vary temporally around central tendencies. Because the solution to Eq. (1) is non-unique, we run the model 20 000 times, apply a filter to the models, and then calculate the mean and standard deviation of the remaining models for trend and error analysis (see Sect. S1; Eq. S1).

The only hyperparameter that must be defined to run NMF a priori is the number of endmembers, p. We used principal component analysis (PCA) to determine the minimum number of components needed to explain > 90 % of the variance in stream solute ratios and trained NMF to the bootstrapped data while assuming that number of endmembers. Machine learning determined the compositions defining the endmembers and the mixing proportions of each endmember in each sample. After running NMF, we interpreted each endmember composition based on geological and watershed knowledge.

Based on the outputs of the NMF model, we calculated the weathering rates of sulfide, carbonate, and silicate minerals in the watersheds. Additionally, we calculated the relative contributions of sulfuric and carbonic acid driving those weathering reactions. For details on the weathering calculations, see Sect. S2.

2.4 Synthetic dataset

NMF is an algorithm that has been used for many applications (e.g., spectral analysis, email surveillance, cluster analysis; Berry et al., 2007) but has only recently been applied to stream chemistry (e.g., Xu and Harman, 2020). To exemplify the validity of our modeling approach, we generated a dataset of synthetic stream chemistry versus time and ran it through our NMF model. First, we defined two known endmember compositions, which are shown in Table S1. Next, we randomly generated 300 synthetic stream samples that were each calculated as a mixture of the two endmembers. Lastly, we ran NMF on the synthetic stream chemistry to determine the mixing proportions (α) and endmember compositions ([X] / [SO42-]) for all X.

Figure 2Schematic diagram showing the differences between a traditional mixing model and our machine-learning mixing model (a). Notably, in the machine-learning mixing model, endmember chemistry is not assigned a priori but rather derived from patterns in the data. Results from using our machine-learning mixing model (i.e., NMF) on a synthetic dataset of known endmember chemistry and mixing proportions (i.e., α) are shown in (b) and (c). Using only the synthesized stream sample chemistries, the model adequately recovered the correct mixing proportions (b) and endmember chemistries (c). The axes in (c) are the true concentration ratios of the endmembers and the NMF-derived concentration ratios of the endmembers.


3 Results and discussion

3.1 Synthetic data model

After generating the synthetic dataset of stream samples, we utilized NMF to determine the mixing proportions and endmember compositions. We then filtered out the poor fitting models (see Eq. S1). As described more fully in the Supplement, this left an average number of valid models of 62 (range: 42–77). The average variance between valid models was < 10 %. Without any prior information about the system, NMF accurately determined the correct mixing proportions (RMSE = 0.04; R2= 0.98; p< 0.001; Fig. 2b) and endmember compositions (RMSE = 0.21; R2= 0.99; p< 0.001; Fig. 2c). In effect, the model was able to use patterns in the data to deconvolve sample chemistry into endmembers and proportions.

3.2 Application to Shale Hills

While clay minerals in shale-underlain watersheds in rainy climates are found at all depths because of their low chemical reactivity, pyrite and carbonate minerals are often chemically removed from upper layers and only found in unweathered shale at depth (Fig. 3; Brantley et al., 2013a; Wan et al., 2019; Gu et al., 2020a). For example, at Shale Hills, pyrite and carbonate minerals are only observed deeper than at least 15 m below land surface (mbls) under the ridges and 2 mbls under the valley. In these deeper zones, calcite (CaCO3), ankerite (Ca(Fe0.34Mg0.62Mn0.04)(CO3)2), and pyrite (FeS2) dissolve in regional groundwaters that flow to the stream (Brantley et al., 2013a; Gu et al., 2020a). These groundwaters thus contribute DIC, Ca2+, Mg2+, and SO42- into the stream.

Like many catchments, water also flows to the stream in Shale Hills along a much shallower near-surface flowpath, which we call interflow (Fig. 3). Interflow is thought to occur along a transiently perched water table that lies within the upper 5–8 mbls. The most abundant mineral, illite (K0.69(Si3.24Al0.76)(Al1.69Fe0.103+Fe0.162+Mg0.19)O10(OH)2), dissolves in interflow, where it flows through the soil, with minimal illite dissolution in underlying weathered rock. Illite dissolution releases DIC and Mg2+ and K+ to interflow waters and causes precipitation of clays and iron oxides. Interflow derives ultimately from local precipitation that also contains Na+, Cl, and SO42-. Interflow and deep groundwater flow lines converge under the catchment outlet where the stream, on average, is 90 % interflow and 10 % deep groundwater (Sullivan et al., 2016; Li et al., 2017).

Only one mineral, chlorite ((Fe0.402+Mg0.15Al0.35)6(Si0.76Al0.24)4O10(OH)8), is observed to begin to weather in the deep groundwater and continue weathering all the way to the surface (Fig. 3; Gu et al., 2020a). Chlorite thus dissolves to release Mg2+ to both interflow and deep groundwater. While most water entering the catchment leaves as interflow without entering deep groundwater, the wide reaction zone observed for chlorite is consistent with a small fraction of water infiltrating vertically to the deeper zone (Brantley et al., 2017).

Figure 3Schematic cross section of Shale Hills showing the depths (labeled lines) where oxidation of pyrite and dissolution of carbonate, chlorite, and illite initiate (modified after Brantley et al., 2013a). Illite and chlorite dissolve at all depths above the labeled lines, but reactions of carbonate minerals and pyrite only occur in a narrow 1 m wide depth zone under the ridge that widens to several meters toward the valley. Specifically, pyrite oxidation is complete under both ridge and valley at the depths where chlorite dissolution initiates. Carbonate dissolution is complete at the depth where pyrite oxidation is complete under the ridge but at  4 m above the pyrite front under the valley. These reaction fronts are estimated and extrapolated from bulk chemistry measured in samples from boreholes located at the ridge and valley (Jin et al., 2010; Brantley et al., 2013a; Gu et al., 2020b).


PCA for stream chemistry (2008–2010) at Shale Hills revealed two sources of sulfate, and this was used to set up NMF, i.e., p= 2 (Table S2). By comparing the compositions from matrix H (Table S2) determined by NMF to our knowledge of the subsurface (Fig. 3), we interpreted the two endmembers as deep and shallow weathering along the two flowpaths, i.e., groundwater and interflow (Fig. 3), respectively (Jin et al., 2014; Sullivan et al., 2016). The endmember with high [Ca2+] / [SO42-] and [Mg2+] / [SO42-] was attributed to deep weathering because Ca- and Mg-containing minerals (i.e., calcite and ankerite) only dissolve at depth (Fig. 3; Jin et al., 2014; Gu et al., 2020a). The high [Cl] / [SO42-] endmember was attributed to shallow interflow because it is dominated by Cl-containing acid rain. This attribution revealed, consistent with other studies of the acid-rain-impacted northeastern United States, that precipitation accounts for the majority of sulfate flux (i.e., 77 %) at Shale Hills between 2008 and 2010.

Many lines of evidence back up these endmember attributions. The sulfate in the shallow endmember derives from interflow well above the pyrite oxidation front through pyrite-depleted rock and is thus attributed to acid rain, while the sulfate in the deep endmember is attributed mostly to pyrite oxidation. Some sulfate from acid rain may infiltrate to the regional groundwater, but the fraction is small. At Shale Hills, acid rain always contains Cl, and pyrite oxidation always preferentially dissolves carbonate minerals, giving each flowpath endmember a unique signature.

To test the NMF deconvolution, we compared these attributions to isotopic data. The value of δ34S in dissolved sulfate is observed to correlate with increasing concentrations of pyrite-derived sulfate determined by NMF (Fig. 4a), consistent with depleted δ34S signatures in pyrite (e.g., −20 ‰; Killingsworth et al., 2018). In contrast, acid rain shows δ34S values around +3 ‰–5 ‰ (Bailey et al., 2004), and low sulfate concentrations in stream samples are characterized by δ34S values within this range. Also, as pyrite oxidizes, the concentration of sulfate increases and the δ34S values decrease to reflect the inferred composition of pyrite, 9.5 ‰ to 7.2 ‰ (Fig. 4a). Finally, Gu et al. (2020b) showed that pyrite oxidation drives the carbonate dissolution at Shale Hills. NMF results show that stream water was near calcite equilibrium (i.e., Ωcalcite=1; log Ωcalcite= 0) and had the highest pyrite-derived sulfate concentrations when the stream was fed by groundwater (Fig. 4b).

However, the annual flux of acid-rain-derived sulfate from 2008 to 2010 in the shallow endmember determined from NMF at Shale Hills (Table 1) far exceeds the wet deposition of sulfate during the sampling period (Fig. 4c). Such inconsistencies have been noted elsewhere and attributed to travel-time delays over decades between acid rain input and stream output (Cosby et al., 1985; Prechtel et al., 2001; Mörth et al., 2005; Rice et al., 2014). Fig. 4c thus allows us to estimate a  19–31-year lag time between input and export of sulfate from the temporally changing acid rain (see Sect. S2.4).

Weathering profiles at Shale Hills, the chemistry of the composition (H) matrix, sulfur isotopes, calcite saturation, and lag in acid rain export all support our interpretation that the two components in the NMF model are shallow and deep flowpaths and that sulfate largely derives from acid rain and pyrite, respectively. The dissolution of different minerals along these flowpaths leads to patterns in stream chemistry that our NMF model discerns and separates. If mineral reaction fronts are not separated in the subsurface, different flowpaths might not be separated by NMF; however, Brantley et al. (2017) and Gu et al. (2020a) have shown that separation of reaction fronts is common.

Figure 4(a) Sulfur isotope composition plotted versus concentration for sulfate in the subset of stream water or groundwater samples at Shale Hills where S isotopes were measured (symbols; Jin et al., 2014). Dot-dashed lines represent the average sulfur isotope range for acid rain in the USA (3 %–5 ‰; Bailey et al., 2004) and dashed lines represent the average sulfur isotope range of pyrite calculated from NMF results (9.5 ‰ to 7.2 ‰). Sulfur isotopes in pyrite at Shale Hills were previously constrained to lie in the range of 1 ‰ to 15 ‰ (Jin et al., 2014). (B) Plot showing the calcite saturation index (log Ωcalcite) versus concentration of pyrite-derived sulfate (calculated through NMF) in surface and groundwater samples at Shale Hills where alkalinity was measured. Here Ωcalcite (= ion activity product/equilibrium constant for calcite dissolution) is < 1, the water is undersaturated with respect to calcite, and when Ωcalcite is > 1, the water is oversaturated. Black line represents water–calcite equilibrium. Some samples in (b) differ from those in (a) because more samples were collected for alkalinity than sulfur isotopes. In both A and B, color shading represents the fraction of total sulfate derived from pyrite calculated by NMF (i.e., αdeep). (c) Time series plot showing the flux of sulfate in Pennsylvania NADP site PA42 (2.8 km from Shale Hills) from wet and dry deposition (see Sect. S2.4). Black bar shows the NMF results for the export flux of sulfate derived from acid rain for Shale Hills during our sampling period and the rationale for the inferred 19-year lag between input and output.


3.3 Rates of weathering and CO2 sequestration at Shale Hills

With these calculations we can use NMF results to elucidate the effect of sequestration or release of CO2 at Shale Hills. We emphasize fluxes of importance over 105–106-year timescales. CO2-driven weathering of the silicate minerals chlorite and illite removes carbon from the atmosphere and carries it as DIC in rivers to the ocean where it is buried as carbonate minerals (akin to Reaction R2 in Fig. 1, Table S3). In contrast, calcite and ankerite weathering coupled to pyrite oxidation instead releases CO2 into the atmosphere over those timescales (Reaction R7 in Fig. 1), and carbonate mineral weathering is neutral over those timescales (Reaction 4 in Fig. 1). Additionally, acid rain can interact with silicate minerals but not carbonate minerals at Shale Hills (because these are not present in the shallow subsurface – Fig. 3). Thus H2SO4 dissolution caused by acid rain competes with CO2 dissolution for silicates. This competition lowers the CO2 consumption from silicate weathering, which has been observed in other watersheds (e.g., Suchet et al., 1995).

To summarize the effect of weathering on CO2 considered at the timescale of 105–106 years as shown in Fig. 1, we propose a new parameter, the stream CO2 sequestration coefficient, κstream (see Sect. S2.2 for the full derivation). This coefficient is defined as mol CO2/Σ+total, where Σ+total is the sum of the equivalents of base cations in a sample. Here, equivalents refer to molar concentration multiplied by charge for an ion. Positive κstream implies the stream acts as a source and negative implies it acts as a net sink of CO2, and the values are calculated for an individual sample or integrated over some time period of stream sampling. The product of κstream times Σ+total in a sample equals the moles of CO2 sequestered or released during weathering as represented in that sample (but the accounting is calculated for the reactions considered for the 105–106-year timescale in Fig. 1). Quantitatively this parameter reveals the moles of CO2 sequestered or released during weathering per cation equivalent in a given stream sample:

(2) κ stream = 1 2 - 1 + γ stream + ζ stream .

Here, γstream is the proportion of cation equivalents in the stream derived from carbonate weathering per Σ+total, and ζstream is the ratio of sulfate equivalents from sulfuric acid per total base cation equivalents. We calculate γstream for a sample by multiplying the pyrite-derived sulfate concentration (i.e., αdeep multiplied by total sulfate concentration) by the [Ca2+] / [SO42-] and [Mg2+] / [SO42-] ratios in the sample calculated by NMF to have derived from the deep weathering endmember and then dividing by Σ+total. Likewise, ζstream is calculated by multiplying the fraction of sulfate from pyrite + acid rain (e.g., αdeep + αshallow) by the total sulfate concentration and dividing that by Σ+total. This calculation shows that, seasonally, Shale Hills switches between a net source and net sink of CO2 (Fig. 5d). Using the weathering reactions described in Sect. S2.2, we also calculated the actual associated CO2 fluxes; annual CO2 dynamics are net neutral at Shale Hills when considered over timescales of 105–106 years (Table 1; Fig. S4).

The switch in systems from operating as a source or a sink is attributed to seasonality in the dominant flowpath: CO2 weathering of silicates occurs year-round, but H2SO4 weathering is more important in the wet season and is dominated by acid from rain. Specifically, in the dry season when water tables are low, the stream water is often dominated by deeper groundwater flow that interacts with the deep pyrite reaction front and has little contribution of acid rain. However, even though this dry season is characterized by higher proportions of pyrite-derived sulfate, the watershed acts predominantly as a sink of CO2 during this time of the year because the drawdown of CO2 from CO2 weathering of silicates is larger than the efflux of CO2 from pyrite-driven H2SO4 weathering of carbonate (Fig. 5d). In the wet season when water tables are high, however, the stream is dominated by shallow interflow that does not interact with pyrite but has a large contribution of H2SO4 from rain. Kanzaki et al. (2020) also previously showed that the separation of reaction fronts (Fig. 3) can cause such important effects on CO2 fluxes, although that previous treatment focused strictly on simple model systems unaffected by acid rain.

To test the accuracy of these inferences based on NMF, we compare to previous results for Shale Hills. Based on soil porewater chemistry and rain fluxes at Shale Hills, Jin et al. (2014) estimated the CO2 drawdown from silicate weathering to be 44 mmol m−2 yr−1. We find that if we assume all silicate weathering is CO2-driven, then the silicate weathering drawdown is 38 mmol m−2 yr−1, which is consistent with the estimate of Jin et al. (2014). However, 44 mmol m−2 yr−1 is an overestimate because it does not consider H2SO4 weathering of silicates or carbonates.

Table 1Fluxes of SO42-, cations, and CO2. n/a stands for “not applicable”.

a Weathering fluxes calculated following the procedure in Sect. S2.2. b Negative CO2 flux indicates sequestration and positive indicates release into the atmosphere as considered over 105–106-year timescales (see Fig. 1). c No carbonate cation fluxes reported because the bedrock contains no carbonate. d Stream CO2 sequestration coefficient integrated over the period of record for each site. e Rock and stream CO2 sequestration coefficients show that Shale Hills and East River are within an error of net neutral with respect to CO2 and that Hubbard Brook sequesters CO2.

Download Print Version | Download XLSX

3.4 East River

Shale Hills is unique in that it is a monolithologic catchment and that the data volume to constrain endmember apportionment is large. However, NMF also works well for watersheds in which the subsurface flow and reactions are less constrained, partly due to the more complex subsurface geology. The weathering profile at East River (underlain by black shale) shows that pyrite and carbonate are depleted in upper layers but start dissolving at  2–4 mbls (Wan et al., 2019). PCA shows that the number of components is two. The compositions of the endmembers for East River are similar to Shale Hills (Table S2); however, the endmember composition indicates a higher proportion of H2SO4 weathering of carbonates (see Sect. S2).

Based on NMF for East River, pyrite contributes 62 % of the annual sulfate flux (Table 1). Sulfuric acid drives 29 % to 69 % of carbonate dissolution depending on the season, and this compares well with previous estimates of 35 %–75 % (Winnick et al., 2017). Unlike Shale Hills, pyrite oxidation at East River is the dominant source of sulfate because acid rain is less important, and the black shale is pyrite-rich (Fig. 5b).

Although East River is like Shale Hills in that it intermittently switches between acting as a source or sink of CO2 (Fig. 5), the seasonality of the switch between Shale Hills and East River is reversed. During baseflow (i.e., between periods of precipitation), Shale Hills is predominantly a sink of CO2, and it sometimes switches to a source of CO2 in the wet season because acid rain competes with CO2 and reduces CO2 consumption from silicate weathering. Without the large acid rain influx, East River instead acts as a sink of CO2 during the wet season of snowmelt and then switches to a source during baseflow. Our results are consistent with previous interpretations (Winnick et al., 2017) suggesting CO2 efflux rates are highest in baseflow-dominated and lowest in snowmelt-dominated flow regimes.

3.5 Hubbard Brook

Monolithologic shale watersheds are not the only target chemistries that can be deconvoluted with NMF: we now consider Hubbard Brook, a catchment on crystalline rock. Large variations in the δ34S composition of the bedrock at Hubbard Brook (Bailey et al., 2004) mean that sulfur isotopes in stream water cannot be used to unambiguously apportion sulfate sources. Weathering fluxes from sulfide minerals are therefore difficult to constrain (Mitchell et al., 2001).

At Hubbard Brook, PCA shows three endmember sources of sulfate. As described below, we attribute these to three inferred flow lines, two in till and one at depth: waters flowing through (i) shallow soil developed from till, (ii) moderately deep, less-weathered till, and (iii) weathering bedrock. A three-layered weathering profile has been observed in other till-covered areas of New Hampshire as well (Goldthwait and Kruger, 1938). We used these ideas to identify endmembers as described below.

Concentrations of sulfate in acid rain have declined over time in the northeastern USA (Lynch et al., 2000; Lehmann et al., 2007). Of the three NMF-determined endmembers at Hubbard Brook, two of them show declining sulfate concentrations with time. We therefore attributed the first and second endmembers to acid rain (Fig. S1).

Only one endmember showed little to no decline in sulfate concentration over time, and we therefore attributed that endmember to deep weathering in water interacting with the underlying bedrock. The composition of the deep weathering endmember shows a strong correlation between [Mg2+] / [SO42-] and [K+] / [SO42-]. This chemical signature is similar to previous observations of weathering of metasedimentary rock piles where silicates (biotite and chlorite) are the first minerals to dissolve when sulfides oxidize (Moncur et al., 2009). Specifically, biotite (K(Si3Al)Mg2FeO10(OH)2) is known to release Mg2+ and K+, while chlorite releases Mg2+ upon weathering. Moreover, the metamorphic conditions that produce pyrrhotite also produce biotite and chlorite, and those three minerals tend to be located together in schist foliations (Carpenter, 1974). We thus infer that pyrrhotite oxidation at Hubbard Brook apparently causes dissolution of biotite ± chlorite because these are the most susceptible minerals in close proximity to the sulfide. Thus, several lines of evidence underlie our interpretation that component 3 is the deep weathering source of sulfate.

From the NMF results summarized in Table 1, pyrrhotite can account for 30 % of the total sulfate flux at Hubbard Brook. The schist and till contain essentially no carbonate; therefore, weathering is always a net sink for CO2. In this watershed, however, the story is complicated by the dissolution of silicate minerals by sulfuric acid from pyrrhotite oxidation and acid rain. If we had assumed all of the base cations detected in Hubbard Brook were caused by CO2 weathering, we would have overestimated the net drawdown of CO2 out of the atmosphere (Fig. 1).

Figure 5Concentration of total sulfate (black line), rain-derived sulfate (NMF-calculated; gray), and sulfide-derived sulfate (NMF-calculated; yellow) in stream water plotted versus time at Shale Hills (a), East River (b), and Hubbard Brook (W-3 sub-catchment) (c). Shale Hills and East River temporally switch between being a source and sink of CO2, while Hubbard Brook is always a sink over the timescales studied, as shown by the CO2 sequestration coefficient (κstream) for Shale Hills (d), East River (e), and Hubbard Brook (f). Gray error bars in (d), (e), and (f) represent 1 SD from the calculated κstream for that sample. The range (mean + 1 SD) indicated in red to the right of (d), (e), and (f) represents κrock, the time-integrated CO2 sequestration coefficient calculated from the rock chemistry (see text). Here, κstream> 0 or < 0 indicates stream is a source or sink of CO2, respectively, when considering weathering reactions over 105- to 106-year timescales (see Fig. 1). The long record at Hubbard Brook shows that κstream is approaching κrock as the watershed recovers from acid rain. Gaps in the time series for Shale Hills occur when the autosampler tubing or stream froze.


3.6 Predicting CO2 release or drawdown from rock chemistry

From the stream chemistry, we found that Shale Hills and East River are net neutral with respect to CO2, and Hubbard Brook is a net sink (Table 1; Fig. 5). In Table 1, the weathering fluxes are summarized as CO2 fluxes (see Sect. S2.2; Fig. S4), but the NMF results can also be used to calculate weathering losses for each mineral as described in Sect. S2.5 (Table S5). Although we do not explicitly discuss each of these mineral-related fluxes learned from NMF, they have resulted in differences in composition of soil versus protolith, and we can use soil chemistry therefore as an additional test of κstream: specifically, we compare κstream to the CO2 flux recorded in the weathered profile as solid-phase chemistry. To do this, we calculate a CO2 sequestration coefficient analogous to κstream but instead based on rock chemistry, κrock, by assessing soil and taking into account the fraction of base cations weathered, the fraction of base cations from carbonates, and the capacity of the bedrock to produce H2SO4:

(3) κ rock = 1 2 τ + γ rock + ζ rock .

In effect, κrock is the time-integrated CO2 sequestration coefficient recorded as the solid-phase weathering products in units of mol CO2/eq base cation. In Eq. (3), τ is the mass transfer coefficient for base cations at the land surface (where 1-τ equals the fraction of total base cations originally present in parent rock that remain in topsoil at the land surface), γrock is the proportion of base cations in the bedrock associated with carbonate minerals, and ζrock is the acid generation capacity of the rock. The derivation of Eq. (3) and the description of each variable are more fully summarized in Sect. S2.3. Briefly, γrock expresses the proportion of base cations in the parent rock that are associated with carbonate minerals (varies from 0 to 1 for 100 % silicate protolith to 100 % carbonate protolith). ζrock expresses the relative amount of (acid-generating) pyrite to base cations in the protolith (varies from 0 to 1.5 for catchments where 100 % of weathering is CO2-driven to catchments where 100 % of weathering is H2SO4-driven, respectively). τ expresses the fraction of cations that have not dissolved away upon exposure at the land surface (varies from 1 to 0 for 0 % cations remaining at the land surface to 100% cations remaining, respectively). Negative κrock describes a lithology that has been net sequestering CO2 over the duration of weathering, whereas positive κrock has been net releasing CO2. Based on the chemistry of the bedrock and topsoil at each watershed, κrock is 0.08 ± 0.11, 0.08 ± 0.17, and 0.19 ± 0.11 for Shale Hills, East River, and Hubbard Brook, respectively (Tables 1, S4). Based on these values from observations of the solid weathering phases, Shale Hills and East River on net are CO2 neutral (i.e., within an error of 0), but Hubbard Brook has acted as a long-term CO2 sink.

If the streams at each site today are acting just like the weathering recorded over the last tens of thousands of years in the solid-phase material and our assumptions about CO2 versus H2SO4 weathering are correct, κrock should equal κstream. Here, we find that κstream (discharge-weighted average) for Shale Hills, East River, and Hubbard Brook are 0.01 ± 0.03, 0.02 ± 0.02, and 0.14 ± 0.01, respectively (Table 1, Fig. 5). For all sites, the stream chemistry shows similar values of the CO2 sequestration coefficient for the modern (stream timescale) compared to the time-integrated (soil timescale), i.e., κstreamκrock, consistent with Shale Hills and East River acting as CO2 net neutral but Hubbard Brook as a CO2 sink. In addition, at Hubbard Brook, it can be seen that acid rain has competed with CO2 in weathering minerals, lowering the capacity of the rock to sequester atmospheric CO2. Because our calculation of κrock does not include acid rain, we would expect acid rain would increase κstream relative to κrock, which is what we observe at Hubbard Brook. Hubbard Brook has only moved back to equivalency between the rock and stream record in recent years (2013–2016; Fig. 5f) as the system has recovered from acid rain. These comparisons also suggest that rock chemistry, which is much easier to analyze, can sometimes predict stream fluxes adequately.

4 Conclusions

By not requiring a priori assignments of endmembers, our machine-learning model not only successfully reproduced source apportionments made in more traditional endmember analysis for streams, but also revealed new information about how watersheds work. At the same time, the method also solved some issues related to source apportionment for streams with time variations of large acid rain inputs. The approach documented that two carbonate-containing shale watersheds (Shale Hills, East River) are intermittent sources or sinks of CO2 into the atmosphere but on net are neutral with respect to CO2. In contrast, because it has no carbonate minerals, Hubbard Brook is a constant sink for CO2 (Figs. 5 and S5). These observations were compared and confirmed by comparing stream chemistry to rock chemistry.

NMF also emphasized the importance of different water flowpaths in determining endmembers: the endmembers were not strictly defined by mineralogy but by patterns of subsurface flow that can be related to subsurface reaction zones. These flowpaths lead to patterns in stream water chemistry that were easily deciphered by our newly developed machine-learning-based mixing model. In particular, for three streams, signals in the chemical variations were observed to reveal dissolution of the most reactive mineral in proximity to sulfide oxidation. Many watersheds have flowpaths distinguished by geochemical signatures from mineral reactions (Brantley et al., 2017), but we do not know these paths a priori when we investigate stream chemistry. Machine learning will be useful for modeling mineral reactions on broader spatial scales and will help constrain global weathering-related CO2 dynamics because it can delineate endmembers without a priori assumptions.

Beyond these attributes, the machine-learning approach also revealed other new attributes of weathering. In Shale Hills, we discovered that sulfate inputs from acid rain may not be exported completely for 2 decades, which impacts mass balance and weathering-related CO2 dynamics. Although not discussed explicitly here, this decadal time lag was also observed at Hubbard Brook. NMF also showed that Hubbard Brook, recovering from the impacts of acid rain, has only recently been returning to its full potential as a CO2-sequestering rock system. In other words, prior to acid rain, Hubbard Brook sequestered more CO2 per mole of weathered bedrock than it does today. However, acid rain dissolved some of the silicates with H2SO4, lowering the CO2 sequestration capability of the watershed. NMF led us to discover this new attribute of acid rain, namely that it diminishes the capacity of a rock to sequester CO2 at millennial timescales (Fig. 1) by replacement of CO2 by H2SO4 as a weathering agent. Regardless of the net CO2 dynamic, we discovered that without considering sulfide oxidation or acid rain, the CO2 weathering sink considered over 105 to 106-year timescales is always overestimated.

Data availability

Data used in analysis of this work can be found online for Shale Hills (,,, Brantley et al., 2013b, c, d), the Hubbard Brook data catalog (, Bernhardt et al., 2019), and in Wan et al. (2019) and in Winnick et al. (2017). Codes used in the modeling are available upon request.


The supplement related to this article is available online at:

Author contributions

SLB, TW, and ARS conceived the project. TW and ARS developed the codes for the model. SLB, XG, and ARS interpreted the data and model outputs. All the authors contributed in the preparation of the manuscript.

Competing interests

The authors declare that they have no conflict of interest.


We thank the editor and two anonymous reviewers for comments that strengthened the manuscript. Part of this research was conducted in Shale Hills, located in Penn State's Stone Valley Forest, which is supported by the Penn State College of Agriculture Sciences, Department of Ecosystem Science and Management, and managed by staff in the Forestlands Management Office.

Financial support

This research was financially supported by the National Science Foundation (grant nos. EAR 13-31726 and IIS 16-39150 to SLB and a Graduate Research Fellowship to ARS), Office of Basic Sciences Department of Energy (grant no. DE-FG02-05ER15675 to SLB), and Penn State (College of Earth and Mineral Sciences Distinguished Postdoctoral Scholar Fellowship to TW and University Graduate Fellowship to ARS).

Review statement

This paper was edited by Monica Riva and reviewed by two anonymous referees.


Alexandrov, B. S. and Vesselinov, V. V.: Blind source separation for groundwater pressure analysis based on nonnegative matrix factorization, Water Resour. Res., 50, 7332–7347, 2014. 

Bailey, S. W., Mayer, B., and Mitchell, M. J.: Evidence for influence of mineral weathering on stream water sulphate in Vermont and New Hampshire (USA), Hydrol. Process., 18, 1639–1653, 2004. 

Bernhardt, E., Likens, G. E., and Rosi, E.: Continuous precipitation and stream chemistry data, Hubbard Brook Ecosystem Study, 1963–present, Environmental Data Initiative, Hubbard Brook Data Catalog,, 2019. 

Berry, M. W., Browne, M., Langville, A. N., Pauca, V. P., and Plemmons, R. J.: Algorithms and applications for approximate nonnegative matrix factorization, Comput. Stat. Data An., 52, 155–173, 2007. 

Brantley, S. L., Holleran, M. E., Jin, L., and Bazilevskaya, E.: Probing deep weathering in the Shale Hills Critical Zone Observatory, Pennsylvania (USA): the hypothesis of nested chemical reaction fronts in the subsurface, Earth Surf. Proc. Land., 38, 1280–1298, 2013a. 

Brantley, S. L., Jin, L., Andrews, D., Holmes, G., Holleran, M., Williams, J. Z., Herndon, E., Duffy, C. J., and Sullivan, P. L.: Susquehanna Shale Hills Critical Zone Observatory Stream Water Chemistry (2008) Interdisciplinary Earth Data Alliance (IEDA), EarthChem,, 2013b. 

Brantley, S. L., Jin, L., Andrews, D., Holmes, G., Bhatt, M., Holleran, M., Kaiser, N., Williams, J. Z., Herndon, E., Duffy, C. J., and Sullivan, P. L.: Susquehanna Shale Hills Critical Zone Observatory Stream Water Chemistry (2009) Interdisciplinary Earth Data Alliance (IEDA), EarthChem,, 2013c. 

Brantley, S. L., Bazilevskaya, E., Andrews, D., Williams, J. Z., Herndon, E., Holmes, G., Bhatt, M., Holleran, M., Yesavage, T., Thomas, E., Duffy, C. J., and Sullivan, P. L.: Susquehanna Shale Hills Critical Zone Observatory Stream Water Chemistry (2010) Interdisciplinary Earth Data Alliance (IEDA), EarthChem,, 2013d. 

Brantley, S. L., Lebedeva, M. I., Balashov, V. N., Singha, K., Sullivan, P. L., and Stinchcomb, G.: Toward a conceptual model relating chemical reaction fronts to water flow paths in hills, Geomorphology, 277, 100–117, 2017. 

Brantley, S. L., White, T., West, N., Williams, J. Z., Forsythe, B., Shapich, D., Kaye, J., Lin, H., Shi, Y., Kaye, M., Herndon, E., Davis, K. J., He, Y., Eissenstat, D., Weitzman, J., DiBiase, R., Li, L., Reed, W., Brubaker, K., and Gu, X.: Susquehanna Shale Hills Critical Zone Observatory: Shale Hills in the context of Shaver's Creek watershed, Vadose Zone J., 17, 1–19, 2018. 

Burke, A., Present, T. M., Paris, G., Rae, E. C., Sandilands, B. H., Gaillardet, J., Peucker-Ehrenbrink, B., Fischer, W. W., McClelland, J. W., Spencer, R. G., and Voss, B. M.: Sulfur isotopes in rivers: Insights into global weathering budgets, pyrite oxidation, and the modern sulfur cycle, Earth Planet. Sc. Lett., 496, 168–177, 2018. 

Calmels, D., Galy, A., Hovius, N., Bickle, M., West, A. J., Chen, M. C., and Chapman, H.: Contribution of deep groundwater to the weathering budget in a rapidly eroding mountain belt, Taiwan, Earth Planet. Sc. Lett., 303, 48–58, 2011. 

Carpenter, R. H.: Pyrrhotite isograd in southeastern Tennessee and southwestern North Carolina, Geol. Soc. Am. Bull., 85, 451–456, 1974. 

Carrera, J., Vázquez-Suñé, E., Castillo, O., and Sánchez-Vila, X.: A methodology to compute mixing ratios with uncertain endmembers, Water Resour. Res., 40, W12101,, 2004. 

Christophersen, N., Neal, C., Hooper, R. P., Vogt, R. D., and Andersen, S.: Modelling streamwater chemistry as a mixture of soilwater end-members. A step towards second-generation acidification models, J. Hydrol., 116, 307–320, 1990. 

Cosby, B. J., Hornberger, G. M., Galloway, J. N., and Wright, R. E.: Time scales of catchment acidification. A quantitative model for estimating freshwater acidification, Environ. Sci. Technol., 19, 1144–1149, 1985. 

Gaillardet, J., Dupré, B., Louvat, P., and Allegre, C. J.: Global silicate weathering and CO2 consumption rates deduced from the chemistry of large rivers, Chem. Geol., 159, 3–30, 1999. 

Gautier, D. L.: Cretaceous shales from the western interior of North America: Sulfur/carbon ratios and sulfur-isotope composition, Geology, 14, 225–228, 1986. 

Goldthwait, J. W. and Kruger, F. C.: Weathered rock in and under the drift in New Hampshire, Geol. Soc. Am. Bull., 49, 1183–1198, 1938. 

Gu, X., Rempe, D. M., Dietrich, W. E., West, A. J., Lin, T. C., Jin, L., and Brantley, S. L.: Chemical reactions, porosity, and microfracturing in shale during weathering: The effect of erosion rate, Geochim. Cosmochim. Ac., 269, 63–100, 2020a. 

Gu, X., Mavko, G., Ma, L., Oakley, D., Accardo, N., Carr, B. J., Nyblade, A. A., and Brantley, S. L.: Seismic refraction tracks porosity generation and possible CO2 production at depth under a headwater catchment, P. Natl. Acad. Sci. USA, 117, 18991–18997, 2020b. 

Hooper, R. P.: Diagnostic tools for mixing models of stream water chemistry, Water Resour. Res., 39, 1055, 235, 2003. 

Jin, L., Ravella, R., Ketchum, B., Bierman, P. R., Heaney, P., White, T., and Brantley, S. L.: Mineral weathering and elemental transport during hillslope evolution at the Susquehanna/Shale Hills Critical Zone Observatory, Geochim. Cosmochim. Ac., 74, 3669–3691, 2010. 

Jin, L., Ogrinc, N., Yesavage, T., Hasenmueller, E. A., Ma, L., Sullivan, P. L., Kaye, J., Duffy, C. and Brantley, S. L.: The CO2 consumption potential during gray shale weathering: insights from the evolution of carbon isotopes in the Susquehanna Shale Hills critical zone observatory, Geochim. Cosmochim. Ac., 142, 260–280, 2014. 

Kanzaki, Y., Brantley, S. L., and Kump, L. R.: A numerical examination of the effects of sulfide oxidation on silicate weathering, Earth Planet. Sc. Lett., 539, 116239,, 2020. 

Kasting, J. F. and Walker, J. C.: The geochemical carbon cycle and the uptake of fossil fuel CO2, AIP Conf. Proc., 247, 175–200, 1992. 

Killingsworth, B. A., Bao, H., and Kohl, I. E.: Assessing pyrite-derived sulfate in the Mississippi River with four years of sulfur and triple-oxygen isotope data, Environ. Sci. Technol., 52, 6126–6136, 2018. 

Kölling, M., Bouimetarhan, I., Bowles, M. W., Felis, T., Goldhammer, T., Hinrichs, K. U., Schulz, M. and Zabel, M., Consistent CO2 release by pyrite oxidation on continental shelves prior to glacial terminations, Nat. Geosci., 12, 929–934, 2019. 

Lehmann, C. M., Bowersox, V. C., Larson, R. S., and Larson, S. M.: Monitoring long-term trends in sulfate and ammonium in US precipitation: Results from the National Atmospheric Deposition Program/National Trends Network, in: Acid Rain-Deposition to Recovery, Springer, Dordrecht, 59–66, 2007. 

Lerman, A., Wu, L., and Mackenzie, F. T.: CO2 and H2SO4 consumption in weathering and material transport to the ocean, and their role in the global carbon balance, Mar. Chem., 106, 326–350, 2007. 

Li, S. L., Calmels, D., Han, G., Gaillardet, J., and Liu, C. Q.: Sulfuric acid as an agent of carbonate weathering constrained by δ13CDIC: Examples from Southwest China, Earth Planet. Sc. Lett., 270, 189–199, 2008. 

Li, L., Bao, C., Sullivan, P. L., Brantley, S., Shi, Y., and Duffy, C.: Understanding watershed hydrogeochemistry: 2. Synchronized hydrological and geochemical processes drive stream chemostatic behavior, Water Resour. Res., 53, 2346–2367, 2017. 

Likens, G. E., Driscoll, C. T., Buso, D. C., Mitchell, M. J., Lovett, G. M., Bailey, S. W., Siccama, T. G., Reiners, W. A., and Alewell, C.: The biogeochemistry of sulfur at Hubbard Brook, Biogeochemistry, 60, 235–316, 2002. 

Lynch, J. A., Bowersox, V. C., and Grimm, J. W.: Acid rain reduced in eastern United States, Environ. Sci. Technol., 34, 940–949, 2000. 

Mitchell, M. J., Mayer, B., Bailey, S. W., Hornbeck, J. W., Alewell, C., Driscoll, C. T., and Likens, G. E.: Use of stable isotope ratios for evaluating sulfur sources and losses at the Hubbard Brook Experimental Forest, Water Air Soil Poll., 130, 75–86, 2001. 

Moncur, M. C., Jambor, J. L., Ptacek, C. J., and Blowes, D. W.: Mine drainage from the weathering of sulfide minerals and magnetite, Appl. Geochem., 24, 2362–2373, 2009. 

Mörth, C. M., Torssander, P., Kjønaas, O. J., Stuanes, A. O., Moldan, F., and Giesler, R.: Mineralization of organic sulfur delays recovery from anthropogenic acidification, Environ. Sci. Technol., 39, 5234–5240, 2005. 

Navarre-Sitchler, A. and Brantley, S. L.: Basalt weathering across scales, Earth Planet. Sc. Lett., 261, 321–334, 2007. 

Nezat, C. A., Blum, J. D., Klaue, A., Johnson, C. E., and Siccama, T. G.: Influence of landscape position and vegetation on long-term weathering rates at the Hubbard Brook Experimental Forest, New Hampshire, USA, Geochim. Cosmochim. Ac., 68, 3065–3078, 2004. 

Prechtel, A., Alewell, C., Armbruster, M., Bittersohl, J., Cullen, J. M., Evans, C. D., Helliwell, R., Kopácek, J., Marchetto, A., Matzner, E., Meesenburg, H., Moldan, F., Moritz, K., Veselý, J., and Wright, R. F.: Response of sulphur dynamics in European catchments to decreasing sulphate deposition, Hydrol. Earth Syst. Sci., 5, 311–326,, 2001.  

Rice K. C., Scanlon T. M., Lynch J. A., and Cosby B. J.: Decreased atmospheric sulfur deposition across the Southeastern US: When will watersheds release stored sulfate?, Environ. Sci. Technol., 48, 10071–10078, 2014. 

Spence, J. and Telmer, K.: The role of sulfur in chemical weathering and atmospheric CO2 fluxes: Evidence from major ions, δ13CDIC, and δ34SSO4 in rivers of the Canadian Cordillera, Geochim. Cosmochim. Ac., 69, 5441–5458, 2005. 

Suchet, P. A., Probst, A., and Probst, J. L.: Influence of acid rain on CO2 consumption by rock weathering: Local and global scales, Water Air Soil Poll., 85, 1563–1568, 1995. 

Sullivan, P. L., Hynek, S. A., Gu, X., Singha, K., White, T., West, N., Kim, H., Clarke, B., Kirby, E., Duffy, C., and Brantley, S. L.: Oxidative dissolution under the channel leads geomorphological evolution at the Shale Hills catchment, Am. J. Sci., 316, 981–1026, 2016. 

Torres, M. A., West, A. J., and Li, G.: Sulphide oxidation and carbonate dissolution as a source of CO2 over geological timescales, Nature, 507, 346, 2014. 

Torres, M. A., West, A. J., Clark, K. E., Paris, G., Bouchez, J., Ponton, C., Feakins, S. J., Galy, V. and Adkins, J. F.: The acid and alkalinity budgets of weathering in the Andes–Amazon system: Insights into the erosional control of global biogeochemical cycles, Earth Planet. Sci. Lett., 450, 381–391, 2016. 

Valder, J. F., Long, A. J., Davis, A. D., and Kenner, S. J.: Multivariate statistical approach to estimate mixing proportions for unknown end members, J. Hydrol., 460, 65–76, 2012. 

Vesselinov, V. V., Alexandrov, B. S., and O'Malley, D.: Contaminant source identification using semi-supervised machine learning, J. Contam. Hydrol., 212, 134–142, 2018. 

USDA Forest Service, Northern Research Station: Hubbard Brook Experimental Forest: Instantaneous Streamflow by Watershed, 1956–present, Environmental Data Initiative, EDI data portal,, 2019. 

Wan, J., Tokunaga, T. K., Williams, K. H., Dong, W., Brown, W., Henderson, A. N., Newman, A. W., and Hubbard, S. S.: Predicting sedimentary bedrock subsurface weathering fronts and weathering rates, Sci. Rep.-UK, 9, 1–10, 2019. 

Winnick, M. J., Carroll, R. W., Williams, K. H., Maxwell, R. M., Dong, W., and Maher, K.: Snowmelt controls on concentration-discharge relationships and the balance of oxidative and acid-base weathering fluxes in an alpine catchment, East River, Colorado, Water Resour. Res., 53, 2507–2523, 2017. 

Xu Fei, E. and Harman, C. J.: Technical Note: A data-driven method for estimating the composition of end-members from streamwater chemistry observations, Hydrol. Earth Syst. Sci. Discuss. [preprint],, in review, 2020. 

Short summary
It is often difficult to determine the sources of solutes in streams and how much each source contributes. We developed a new method of unmixing stream chemistry via machine learning. We found that sulfate in three watersheds is related to groundwater flowpaths. Our results emphasize that acid rain reduces a watershed's capacity to remove CO2 from the atmosphere, a key geological control on climate. Our method will help scientists unmix stream chemistry in watersheds where sources are unknown.