Knowledge on the response of sediment export to recent climate change in glacierized areas in the European Alps is limited, primarily because long-term records of suspended sediment concentrations (SSCs) are scarce. Here we tested the estimation of sediment export of the past five decades using quantile regression forest (QRF), a nonparametric, multivariate regression based on random forest. The regression builds on short-term records of SSCs and long records of the most important hydroclimatic drivers (discharge, precipitation and air temperature – QPT). We trained independent models for two nested and partially glacier-covered catchments, Vent (98 km
Our validation at gauge Vernagt demonstrated that QRF performs well in estimating past daily sediment export (Nash–Sutcliffe efficiency (NSE) of 0.73) and satisfactorily for SSCs (NSE of 0.51), despite the small training dataset. The temporal extrapolation ability of QRF was superior to SRCs, especially in periods with high-SSC events, which demonstrated the ability of QRF to model threshold effects. Days with high SSCs tended to be underestimated, but the effect on annual yields was small. Days with predictor exceedances were rare, indicating a good representativity of the training dataset. Finally, the QRF reconstruction models outperformed SRCs by about 20 percent points of the explained variance.
Significant positive trends in the reconstructed annual suspended sediment yields were found at both gauges, with distinct step-like increases around 1981. This was linked to increased glacier melt, which became apparent through step-like increases in discharge at both gauges as well as change points in mass balances of the two largest glaciers in the Vent catchment. We identified exceptionally high July temperatures in 1982 and 1983 as a likely cause. In contrast, we did not find coinciding change points in precipitation. Opposing trends at the two gauges after 1981 suggest different timings of “peak sediment”. We conclude that, given large-enough training datasets, the presented QRF approach is a promising tool with the ability to deepen our understanding of the response of high-alpine areas to decadal climate change.
Sediment production rates per unit area are highest in the world's mountains (Hallet et al., 1996), headed by modern glacierized basins (Hinderer et al., 2013). As a consequence, sediments transported from these rapidly deglaciating high-alpine areas have substantial effects on downstream hydropower production and reservoir sedimentation (Schöber and Hofer, 2018; Guillén-Ludeña et al., 2018; Li et al., 2022), flood hazard (Nones, 2019; Brooke et al., 2022) and water quality, nutrient and contaminant transport and aquatic habitats (Gabbud and Lane, 2016; Bilotta and Brazier, 2008; Vercruysse et al., 2017) and impact global sediment and biochemical balances (Herman et al., 2021). Thus, there is considerable interest in water resource research and management to gain better understanding of sediment dynamics in high-alpine areas and to mitigate and adapt to future changes.
However, there is still limited quantitative understanding of sediment transport in high-alpine rivers and their relation to changes in climatic forcing over temporal scales relevant to investigating changes associated with anthropogenic climate change, i.e., at decadal and centennial scales as opposed to longer ones (Huss et al., 2017; Antoniazza and Lane, 2021; Herman et al., 2021). This is partly due to the complexity of sediment dynamics in high-alpine areas, which are the result of an intricate system of climatic forcing and hydrogeomorphological processes (Costa et al., 2018; Vercruysse et al., 2017; Zhang et al., 2022).
A significant body of knowledge exists on how some of the components of these complex systems have changed in recent decades due to rising temperatures. Cryospheric changes include widespread and accelerating glacier retreat (Abermann et al., 2009; Sommer et al., 2020) and reduced extent and duration of snow cover (Beniston et al., 2018; Chiarle et al., 2021). As a result, hydrological regimes are changing from glacial to nival regimes and from nival to pluvial regimes (Beniston et al., 2018), which results in changes in water quantities (Vormoor et al., 2015; Wijngaard et al., 2016), streamflow variability (van Tiel et al., 2019) and hydrograph timing (Kormann et al., 2016; Kuhn et al., 2016; Hanus et al., 2021; Rottler et al., 2020, 2021). These hydrological changes can translate to changes in erosivity, sediment transport capacities and fluvial erosion. At the same time, sediment supply changes, as glacier retreat uncovers large amounts of sediment previously inaccessible to pluvial and fluvial erosion (Carrivick and Heckmann, 2017; Leggat et al., 2015), subglacial sediment discharge transiently increases (Costa et al., 2018; Delaney and Adhikari, 2020) and continuing permafrost thaw destabilizes slopes and facilitates mass movements (Huggel et al., 2010, 2012; Beniston et al., 2018; Savi et al., 2020). Adding to this, erosive precipitation has a higher chance of affecting unfrozen material during prolonged snow-free periods (Kormann et al., 2016; Rottler et al., 2021; Wijngaard et al., 2016).
However, the magnitude of these impacts is catchment-specific, as it depends, e.g., on the area occupied by glaciers and permafrost or basin hypsometry (Huss et al., 2017) and is thus not easily transferable from one site to another. Aggravatingly, high-alpine sediment dynamics are highly variable, so that long time series are required to assess systematic changes. However, most records are too short for such analyses and long-term data are extremely rare, especially in glacierized headwaters, which are often especially challenging to monitor.
To our knowledge, only very few examples of decadal sediment records from
the European Alps exist in the current literature, as opposed to their
availability for, e.g., High Mountain Asia (Singh et al., 2020; Li et al., 2020, 2021; Zhang et al., 2021), the Andes (Vergara et al., 2022) or the Arctic (Bendixen et al., 2017a) (for an extensive review, see Zhang et al., 2022). Costa et al. (2018) report on an exceptional record of suspended sediment concentrations from the upper Rhône basin, Switzerland, of almost five decades, though these recordings are severely affected by anthropogenic impacts (hydropower generation and gravel mining) and integrate over an area of 5340 km
Quantile regression forest (QRF) (Meinshausen, 2006) represents an approach that has been successfully applied to modeling suspended sediment concentrations in past geomorphological studies, in badland-dominated catchments in Spain (Francke et al., 2008a, b) and in a tropical forest in Panama (Zimmermann et al., 2012). QRF is a multivariate, nonparametric regression technique based on random forest, from the category of machine-learning (ML) approaches, which generally seek to identify patterns from complex data (Tahmasebi et al., 2020). Comparative studies have reported that QRF performs favorably compared to sediment rating curves and generalized linear models (Francke et al., 2008a) and that the performance of random forest (which QRF is based on) was superior to support-vector machines and artificial neural networks (Al-Mukhtar, 2019) in modeling suspended sediment concentrations. Importantly, as an advantage over other ML approaches, QRF allows us to quantify the model uncertainty by estimating prediction accuracy (Francke et al., 2008a; Al-Mukhtar, 2019).
Thus, the first objective of this study was to extensively test QRF as an approach for estimating past suspended sediment dynamics at decadal scales in high-alpine areas. Previous studies have included proxies for drivers of sediment transport and erosive processes, e.g., precipitation, discharge, seasonality and antecedent conditions (Francke et al., 2008a, b; Zimmermann et al., 2012). We built on this setup and adapted it by including air temperature as a predictor, since many processes relevant to sediment dynamics in high-alpine areas are temperature-sensitive (e.g., snowmelt and glacier melt, thawing of topsoil). To assess model performance, we evaluated several validations and compared the results to sediment rating curves based on data from the two gauges “Vent Rofenache” and “Vernagt” that are located in a nested catchment setup in the Rofental within the upper Ötztal in Tyrol, Austria. This nested setup provides a favorable opportunity to test the QRF model and gives a good overview of sediment dynamics in this catchment.
The second objective of this study is to examine the resulting estimates of annual suspended sediment yields with respect to changes at decadal scales. Thus, we analyzed the time series for trends, some of which could be expected, e.g., due to ongoing temperature increase. However, the possibility of sudden, tipping-point-like shifts in response to climatic changes has been suggested for cryospheric geomorphic systems as well (Huggel et al., 2012), and sediment dynamics especially (Vercruysse et al., 2017) and indeed step-like increases in suspended sediment concentrations have been observed in other catchments (Costa et al., 2018; Li et al., 2020, 2021; Zhang et al., 2021). Hence, we used change point detection methods to assess whether the detected trends are gradual or follow a step-like pattern. We extended these analyses to the predictors (temperature, discharge and precipitation) and annual mass balances to assess possible reasons for changes in suspended sediment yields.
The study area is located in the “Rofental”, a valley in the upper Ötztal in the Tyrolean Alps, Austria. For more than 100 years, the area
has been subject to intense glaciological and hydrometeorological research,
yielding outstanding datasets (Strasser et al., 2018). The entire catchment is 98.1 km
Map of the catchment area upstream gauge Vent and nested within the Vernagt catchment, with glaciers Vernagtferner (VF) and Hintereisferner (HEF) as well as the measurement station Martin-Busch-Hütte (MB). Smaller glaciers Hochjochferner, Platteiferner and Mitterkarferner are denoted by HJF, PF and MKF, respectively. Sources: 10 m DTM of Tyrol (Land Tirol, 2016), glacier inventory 4 (2015) (Buckel and Otto, 2018), and rivers from tiris open government data (Land Tirol, 2021).
The study area's shielded location in the inner Alpine region leads to the
relatively warm and dry climate (Kuhn et al., 1982). Mean annual temperature and precipitation at the gauge in Vent are 2.5
Both catchments are heavily glacierized, with approximately 28 % and 64 % glacier cover in 2015 (Buckel and Otto, 2018). The two largest glaciers within the Vent catchment, Vernagtferner and Hintereisferner, have been systematically observed since the 1950s and 1960s and have shown accelerating retreat since the beginning of the 1980s (Abermann et al., 2009), which is expected to continue in the future (Hanzer et al., 2018).
The catchment is drained by the Rofenache River, which flows into the Ötztaler Ache, one of the largest tributaries to the Inn River. The hydrological regime (glacial to nival) shows a pronounced seasonality as most of the discharge is fed by snowmelt and glacier melt and about 89 % of the discharge in Vent occurs from April to September (Schmidt et al., 2022b; Hanzer et al., 2018).
The Ötztal Alps are part of the Ötztal–Stubai massif within the crystalline central–eastern Alps, and the catchment geology is dominated by biotite–plagioclase, biotite and muscovite gneisses, variable mica schists and gneissic schists (Strasser et al., 2018). The land cover of higher elevations is dominated by glaciers, bare rock or sparsely vegetated terrain, whereas mountain pastures and coniferous forests are present at lower altitudes.
Suspended sediment concentrations at the gauge in Vent proved to be the highest in an Austria-wide comparison (Lalk et al., 2014), and annual suspended sediment yields are about 1500 t km
In essence, we trained a nonparametric model on suspended sediment concentrations and predictors (discharge, temperature, precipitation) and then used this model and long-term records of the predictors to reconstruct past suspended sediment concentrations (SSCs) and derive annual suspended sediment yields. We then analyzed the resulting time series with respect to trends and change points.
In the following section, we briefly describe the general modeling approach and the validations, the QRF approach and its predictors, the input data and their preparation for the model as well as the statistical tools used for analyzing the results.
In our analyses, we focus on (annual) suspended sediment yields instead of concentrations, due to the strong seasonality in discharge (Schmidt et al., 2022b). This gives more weight to days with higher discharge, which are more influential for overall sediment export to downstream areas and the potential resulting problems, as opposed to, e.g., mean annual SSCs, which weigh days at the beginning and end of the season, with low discharge and low concentrations, and days during the glacier melt period equally. We use the term “reconstructing” to describe the estimates of SSCs (or the resulting yields) from our model predictions.
We combined the analysis of the two gauges Vent and Vernagt to gain a more reliable and comprehensive understanding of past sediment dynamics in the Rofental from their distinct data context.
At gauge Vent, continuous turbidity-derived SSC time series have been recorded at high temporal resolution (15 min) since 2006, providing abundant training data for our model. Additionally, the long-term predictor data are available back until 1967, facilitating insights into long-term changes in catchment dynamics – yet only at daily resolution, which predetermines the temporal resolution of the reconstruction model. This is challenging as sediment concentrations vary considerably during 1 d, leaving us with the need to assess whether a daily model adequately represents sediment dynamics.
At gauge Vernagt, the availability of hourly discharge (
Overview of the modeling approach. SSC: suspended sediment concentration;
To address these issues and benefit from these opportunities, we preformed
three preparatory steps before the final reconstruction models (Fig. 2).
Gap-filling model: at gauge Vernagt, we trained a model on SSCs determined from 131 water samples and the predictors ( Validation A – temporal resolution: we ran models at both hourly and daily resolution on all available training data at gauge Vernagt to assess the error magnitude due to the coarser temporal resolution in the final reconstruction models. Validation B – extrapolation: we trained a model on the corrected Vernagt SSC data of 2019 and 2020 at daily resolution and estimated SSC back until 2000 for validation against the 2000–2001 measurement data (Fig. 2, center).
Finally, we ran the “reconstruction models” at daily resolution (for consistency between the two gauges), taking into account all available
training data to estimate SSC back until 1967 and 1975, respectively.
Beyond Validations A and B, we assessed whether and how often the ranges of the predictors in the training data were exceeded during the reconstruction
period, since it is a known limitation of QRF that it is not possible to
extrapolate beyond the range of values represented in the training data (Francke et al., 2008a).
For example, if the discharge measured on a day in the reconstruction period
exceeds the maximum discharge within the training period (
Additionally, we compared QRF performance to the performance of sediment rating curves, one of the most commonly used approaches to estimate suspended
sediment concentrations (Vercruysse et al.,
2017), by fitting a power function between SSC and
To reproduce suspended sediment dynamics, the desired models have to account for a multitude of processes controlling SSCs (discharge dynamics, temperature-dependent activation of sediment sources and transport processes, precipitation, antecedent wetness conditions, etc.), need to deal with non-normal distributions of both predictor and response variables and have to handle correlations between the predictors (Zimmermann et al., 2012).
QRF (Meinshausen, 2006) represents a multivariate approach that can deal with nonlinearity, interactions and nonadditive behavior without making assumptions about underlying distributions. QRF performed favorably in reproducing sediment dynamics as compared to generalized linear models or sediment rating curves (Francke et al., 2008a). QRF is a generalization of random forest (RF) regression tree ensembles (Breiman, 2001). Regression trees (a.k.a. CARTs – classification and regression trees) (Breiman et al., 1984) apply recursive rule-based data partitioning in order to group data with similar values of the response variable (Francke et al., 2008a). To produce a “forest”, RF and QRF employ an ensemble of these trees, each one grown on a random subset (bootstrap sample) of the training data. Predictive performance is evaluated on those parts of the data that are not considered in the bootstrap sample, i.e., the out-of-bag (OOB) data (ibid.). Model predictions are then obtained from the mean of predictions of each single tree (RF) or are based on the distribution of these single-tree predictions (QRF). As QRF keeps the value of all observations within a node, it enables the quantification of the model uncertainty with the help of these inherent ensemble characteristics. This represents an advantage over other nonparametric approaches applied to SSC modeling, such as traditional fuzzy logic or artificial neural networks (ibid.).
Building on the developed model setup (Francke et al., 2008b; Francke, 2017; Zimmermann et al., 2012), we chose the predictors to represent proxies for processes important to sediment dynamics in high-alpine areas. For example, discharge is crucial for sediment transfer and, potentially, channel erosion, while precipitation drives hillslope erosion and mass wasting events. We extended the set of primary predictors by adding air temperature, as many processes determining sediment dynamics in high-alpine areas are temperature-sensitive (e.g., the activation of sediment sources, such as the occurrence of rain vs. snow, the availability of subglacial and proglacial sediments and their transport through glacier melt waters, but also potential permafrost thaw). From these three primary predictors (discharge, precipitation and temperature, QPT), we derived information on antecedent conditions for each time step by computing ancillary predictors derived thereof. These ancillary predictors can capture that, e.g., longer-term discharge behavior is linked to potential exhaustion of sediment sources or storage, prolonged warm periods may lead to increased glacier ablation associated with intensified transport of glacial sediment, and high antecedent moisture conditions prior to a precipitation event may favor mass movements. To keep correlation between these derived predictors as low as possible, we used non-overlapping windows of increasing sizes (e.g., 24 h, 24–72 h, 72 h to 7 d and 7 to 20 d ahead of each time step) to compute sums of the primary predictors (Zimmermann et al., 2012). To complete the set of predictors, we used the day of year to capture the pronounced seasonality in high-alpine sediment dynamics (Schmidt et al., 2022b) and the rate of change in discharge (see also Francke et al., 2008b; Zimmermann et al., 2012).
Further, we tuned the model with respect to the length of the antecedent periods considered: we optimized the length of the time windows for the
highest model performance with respect to daily or hourly SSCs using the Nash–Sutcliffe efficiency index (see Sect. 3.4.1). Besides the length of the antecedent periods, the two most important hyperparameters for the QRF
are the number of trees in a “forest” and the number of selected
predictors at each node, implemented as parameter “mtry” in R. To increase robustness, we set the number of trees to 1000, which is twice the default value. The parameter “mtry” is optimized for maximum performance in the modeling process. Additionally, we optimized model performance in a cross-validation with five folds, as is commonly done (Murphy, 2012). However, unlike cross-validation approaches in the classical sense, which randomly pick a set of individual data points, we divided the training data into five equal temporally contiguous chunks to avoid unrealistically good performance simply due to autocorrelation of temporally close points in time. Finally, in the “reconstruction model”, we derived annual suspended sediment yields by performing 250 Monte Carlo realizations of the annual SSY for each year, which allows for assessing the prediction uncertainty, from which the mean and quartiles of annual suspended sediment yields were computed. We chose 250 Monte Carlo realizations as this yields sufficiently robust estimates of the mean annual SSY (the confidence intervals of the mean are ca.
Here we describe the input data and their preparation for modeling. An overview of details such as coordinates of stations and gauges, the temporal resolution of the different time series, their sources and data availability can be found in Table A1 in the Appendix.
Gauge Vent has been operated continuously by the Hydrographic Service of Tyrol since 1967 (Strasser et al., 2018), and discharge has been measured at 15 min resolution through water stage recordings, complemented by a radar probe since 2000. At gauge Vernagt, discharge has been recorded since 1974 (Bergmann and Reinwarth, 1977) and is determined through water stage recordings and tracer-calibrated stage–discharge relations. Additionally, this is complemented by ultrasonic stage measurements, providing nearly uninterrupted discharge series since 1974 (Braun et al., 2007). The temporal resolutions of the discharge time series at both gauges vary over time (from 5 and 10 to 60 min; for details, see Table A1 in the Appendix).
Precipitation and temperature have been recorded close to the gauge in Vent
(see Fig. 1) since 1935 and are available at daily resolution. Both time
series showed gaps, which creates a problem when using the chosen QRF
approach. Thus, to fill the gaps in the precipitation time series, we used
data recorded at gauge Vernagt: we derived a linear model between all days when daily precipitation sums were available from both stations and used the
resulting linear model for conversion (i.e.,
At gauge Vernagt, precipitation and temperature have been recorded at high temporal resolution (5 to 60 min; for details, see Table A1 in the Appendix) next to the gauge since 1974. To fill the present gaps, we used
data recorded by the Hydrographic Service since 2010 in close proximity to
the Vernagt station. As some gaps still remained, we subsequently used data
recorded at Martin-Busch-Hütte (conversion factors:
At gauge Vent, turbidity has been measured since 2006 using two optical infrared turbidity sensors (Solitax ts-line and Solitax hs-line by Hach). To calibrate the turbidity measurements to SSCs, water samples are taken manually from the stream close to the turbidity sensors frequently (Lalk et al., 2014). Turbidity measurements are paused every winter (between October and April) to avoid damage to the equipment. However, the equipment is reinstalled early enough to capture the spring rise in concentrations, and winter sediment transport can be considered negligible (Schmidt et al., 2022b).
At gauge Vernagt, water is diverted into a measuring chamber (Bergmann and Reinwarth, 1977), where turbidity can be recorded
while avoiding damage to the equipment by large rocks in the main channel.
Turbidity was recorded in the summers of 2000 and 2001 (Staiger-Mohilo
STAMOSENS 7745 UNIT) (Naeser, 2002) and 2019 and 2020 (Campbell OBS501). Water samples for calibration of turbidity to SSC were
taken directly next to the turbidity sensor by hand in 2000 and 2001 (57 samples) and by means of an automatic sampler (ISCO 6712) in 2019 and 2020
(131 samples). The latter initiated sampling if one of two criteria was met:
(i) regular sampling to avoid long gaps between two samples or (ii) threshold-based sampling to obtain samples across the whole range of
possible turbidity values. Gravimetric sediment concentrations SSC
As temporal resolutions varied between the different time series, we aggregated or disaggregated the data to achieve homogenous temporal resolutions for the respective QRF models at daily, hourly and 10 min resolution. Data at 10 min resolution were only needed for the gap-filling model at gauge Vernagt. For this, we had to disaggregate precipitation and temperature data from 60 min resolution in 2000 and 2001 by dividing hourly precipitation sums by 6 and replicating mean hourly temperature values for the six corresponding 10 min time steps.
For the analysis of annual
From SSCs (modeled and from turbidity) and discharge (
To quantify model performance, and for our validation, we used the Nash–Sutcliffe efficiency (NSE) index:
In order to quantify time series behavior, we generally followed the approach of first analyzing for the existence of a trend. If a trend was detected, we assessed whether the trend was homogenous by analyzing for change points. If a change point was identified, we then examined the resulting segments of the time series for trends.
Most of the investigated time series are not normally distributed, and some show autocorrelation. Thus, to calculate trend significance, we used the
nonparametric Mann–Kendall test for linear trend detection in a version that was modified to detect trends in serially correlated time series (Yue and Wang, 2004) as recommended by Madsen et al. (2014). To estimate trend
magnitude, we used Sen's slope estimator (Sen, 1968). Both methods are implemented in the mkTrend function of the R package “FUME”
(Santander Meteorology Group, 2012). In our results, we only plot and refer to trends that were significant at least at a significance level
For change point (CP) detection, we used the nonparametric Pettitt test (Pettitt, 1979), which is commonly used as it is a powerful rank-based test for a change in the median of a time series and is robust to changes in distributional form (Yue et al., 2012), as implemented in R package “trend” (Pohlert, 2020). However, it only gives one change point location without uncertainty quantification around its location and was shown to be sensitive to the position of the change point within the time series; i.e., detection at the beginning and end of the series is unlikely (Mallakpour and Villarini, 2016).
Thus, as a complementary advanced approach that counterbalances the weaknesses of Pettitt's test, we used Bayesian regression with change points as implemented in R package “mcp” (which stands for “multiple change points”; hereafter we refer to this method as “MCP”) (Lindeløv, 2020). This represents a much more flexible approach which allows us to assess uncertainty through the resulting posterior distributions of the change point location and is applied in an increasing number of studies in different fields of research (e.g., Veh et al., 2022; Yadav et al., 2021; Pilla and Williamson, 2022). Although MCP allows us to detect multiple change points, we only considered one change point as we aimed to detect the largest shift in the time series and to ensure comparability with Pettitt's test. Unless specified otherwise, we used the uninformative default prior, allowed free slope estimation before and after the change point, assumed a disjoined slope (i.e., step-like change) at the change point and allowed for changes in variance at the change point.
As mentioned earlier, we computed 250 Monte Carlo realizations of the annual SSY as a result of the QRF model. We propagated this uncertainty by applying the trend and change point detection methods not only to the mean estimates, but also to the 250 resulting time series realizations.
All calculations were done in R version 4.2.1 (R Core Team, 2018).
The temporal resolution for long-term reconstruction is limited to daily, as
the respective long-term predictor data are available only at daily
resolution at gauge Vent. As a result, we can expect some loss of
information, e.g., on short-term precipitation intensity, which can be crucial for sediment dynamics. To assess the error magnitude, we ran two
variants of the models at gauge Vernagt at daily and hourly resolution based on all available training data (i.e., 2000, 2001, 2019 and 2020). We then compared daily sediment discharge (
Results of the five-fold cross-validation at gauge Vent for QRF and sediment rating curves (SRCs) with respect to mean daily SSC compared to turbidity measurements, expressed as the Nash–Sutcliffe efficiency (NSE) of the model estimates for each one-fifth of the time series in the cross-validation (with corresponding years in parentheses) and the full models (OOB: out of bag in the case of QRF).
To evaluate the capability of the QRF models to reconstruct past sediment
dynamics, we trained a daily model on the 2019–2020 (
In the “validation model” results, annual yields were affected by overestimation in late August 2000 and underestimation in July 2001 (Fig. 4b). As a result, the mean annual sSSY estimates were 31 % higher and 16 % lower than observed annual sSSY in 2000 and 2001, respectively. Considering the spread of the model results (i.e., the 2.5 and 97.5 percentiles of the 250 model predictions as depicted by the whiskers in Fig. 4b, which were chosen as they are more robust than the extremes while covering 95 % of the estimates), 19 % overestimation compared to the maximum model estimate in 2000 and 4 % underestimation compared to the minimum model estimate in 2001 remained. Unfortunately, annual yields of the validation model trained on the 2000–2001 data cannot be compared in the same way, since observations show gaps in both 2019 and 2020 (see also the caption of Fig. 4a).
At gauge Vent, more years of data were available, so that the results of the five-fold cross-validation provided a more detailed picture of the temporal extrapolation ability (Table 1). Some cross-validation periods showed rather low NSE values, which indicates that they are harder to predict with only the remaining training data and therefore contain more valuable data for training the full QRF model. For example, the period 2012 to 2014 with the lowest NSE contains the most extreme event in August 2014, which was likely linked to mass movements (Schmidt et al., 2022b). Similarly, the period 2018 to 2020 also shows a low NSE, and a mass-movement event was observed in 2020 (ibid.). Including such periods in the training data apparently is of importance to the fidelity of the model. Conversely, the same cross-validation with a sediment rating curve shows that QRF performance is much better, especially in those periods that are harder to predict. Consequently, QRF clearly excels over the traditional approach (see also Sect. 4.4).
At gauge Vent, the maximum temperature of the training period (
At gauge Vernagt,
There were also days at both gauges when discharge was lower than the
minimum discharge measured during the training period (
To test the performance of the reconstruction models, we compared out-of-bag SSC estimates of the reconstruction models and SSC estimates of sediment rating curves (trained on all available data) to turbidity measurements at both gauges (Fig. 5).
Measured versus modeled sediment concentration and yield using QRF (OOB estimates, red) and SRC (blue).
QRF performance is superior to sediment rating curves (SRCs) at both gauges with respect to mean daily SSCs: as can be seen in Fig. 5 and the NSE and BE values, SRCs are generally less capable of capturing the variability in the training data. The NSE and BE values of the QRF model represent a satisfactory performance at gauge Vent and very good performance at gauge Vernagt (Moriasi et al., 2007). Considering mean annual SSCs at gauge Vent, QRF performs much better than SRCs. At gauge Vernagt, QRF performs very well and slightly better than SRCs, but there are only four years available for comparison, which confines the representativity of this analysis.
Performance with respect to
Rare days with extremely high SSC (and
In the resulting time series, the average sSSY of all years (
At both gauges, mean modeled annual sSSYs show significant positive trends (Fig. 6). In Vent, 97.6 % of the 250 time series realizations show significant positive trends (Sen's slope (SS): 9–15 t km
Mean specific annual suspended sediment yields (sSSYs) as reconstructed by the QRF model (black points). Whiskers depict 2.5 % and 97.5 % quantiles of the 250 QRF realizations. The overall trend of mean annual sSSY is given by Sen's slope (black line) and trends of all 250 QRF realizations in grey. Change points determined by Pettitt's test (red dashed line) and MCP (posterior distributions of 250 realizations as solid light red and mean as dark red lines). Trends in the time series segments before and after 1981 are given as dark blue (mean) and light-blue (realizations) lines (only plotted if significant).
At gauge Vernagt, MCP shows very similar probability density distributions
to those of Vent, with maximum probabilities in 1981 for all realizations. However, Pettitt's test detects a change point in 1989 (
In the first segment, no significant trends were detected at gauge Vernagt,
and at gauge Vent, only 2 of the 250 realizations show significant positive trends (SS of 11.1 and 3.9 t km
We found positive trends in annual sSSYs at both gauges with high probabilities of a step-like increase around 1981. To assess whether this coincides with changes in temperature (i.e., changes in snowmelt and/or glacier melt and discharge) or changes in precipitation, we analyzed time series of the predictors, temperature, discharge and precipitation as well as glacier mass balance data with respect to trends and change points.
Annual specific discharge sums at both gauges show significant positive
overall trends (Fig. 7a and b). In Vent, both change point detection
methods indicate high probabilities of a change point in 1981. At gauge
Vernagt, annual discharge volumes roughly doubled within the examined
period, from ca. 1250 mm to ca. 2500 mm. The two change point (CP) detection
methods disagree as in the case of annual sSSYs, as Pettitt's test detects a change point in 1990, while MCP suggests a change point with high probability in 1981. Dividing both time series in the year 1981, both
segments of the Vent time series show insignificant trends, while both
segments at gauge Vernagt show significant positive trends (first segment: Sen's slope
We analyzed mean monthly discharge volumes and found significant positive trends in May and June in Vent (a detailed figure can be found in Appendix Fig. A2). As with temperature, change points are indicated by both methods in 1981 for July discharge and around 1995 for June discharge. At gauge Vernagt, discharge shows significant positive trends from May through August (see Fig. A3 in the Appendix). In June, a CP is indicated around 1995 by both methods, and in July, MCP indicates a CP around 1981, but Pettitt's test does not yield a significant change point. For May and August discharge, Pettitt's test yields change points in the late 1990s and mid-1980s, respectively, but MCP shows very widespread probability distributions.
Summer precipitation sums (May–September) at gauge Vent (Fig. 7c) show
no significant trend, while annual precipitation sums show a significant
positive trend. A change point is identified by Pettitt's test in 1991 (
We derived mean annual temperatures at gauge Vent; however, at gauge Vernagt, we computed mean summer temperatures (between May and September) instead, as temperatures are missing in winter in many years (see also Fig. A1 in the Appendix). Both time series show significant positive trends but no clear change points, as Pettitt and MCP disagree and MCP yields a very widespread probability distribution (Fig. 7e and f). Additionally, we analyzed mean monthly temperatures over time and found that, while most months show positive trends, at both gauges July is the only month with a high change point probability around 1981 (Fig. 7e and f). At both stations, July temperatures in 1982 and 1983 were exceptionally warm.
In addition to the hydroclimatic predictors considered in the QRF model, we analyzed independent data of annual mass balances of the Vernagtferner (VF)
and Hintereisferner (HEF), the two largest glaciers in the Vent catchment
with long glaciological mass balance records. Since 1965 (VF) and 1952
(HEF), the two glaciers have been regularly and extensively surveyed for
volume changes (World Glacier
Monitoring Service, 2021; Strasser et al., 2018). Annual mass balances of
both glaciers show significant negative trends (Fig. 7g and h).
Significant change points are indicated in 1985 for both glaciers by the
Pettitt test, while MCP attributed (much) higher probabilities for change
points to the year 1981. Notably, annual mass balances of both glaciers are
almost exclusively negative after 1980 (with the exception of 1984, where
mass balances are barely positive). Dividing the time series in 1981, no
significant trends are detected in the first segments of both time series
and significant negative trends in the second segments (HEF:
Summer mass balances were only (continuously) available for the Vernagtferner and show a strong negative trend, and both Pettitt and MCP detect change points in 1981 (Fig. 7h). No significant trends are detected in the resulting time series segments. Winter mass balances do not show a significant trend or change points.
The presented study aimed to examine extensively whether quantile regression forest is a suitable method for estimating past sediment export. Overall, the final QRF reconstruction models outperformed SRCs by about 20 percent points in explained variance at both gauges – even though we assessed OOB estimates of QRF versus rating curves based on all available data. OOB performance was better at gauge Vernagt (NSE of 0.82 with respect to daily SSCs) than at gauge Vent (NSE of 0.6), although even the latter corresponds to a satisfactory performance (Moriasi et al., 2007). We suggest that there are three reasons for this difference in performance. First, the Vent catchment is much larger (almost 100 km
Nevertheless, a few limitations need to be considered. As a major limitation, QRF cannot extrapolate if predictors in the reconstruction period exceed the range of values represented in the training dataset (Francke et al., 2008a). In these cases, the modeled values will potentially suffer from overestimation or underestimation. At the same time, assessing the number of such exceedances in the period of model application provides an indication of the representativity of the training dataset, which must be sufficiently large and varied for such data-driven approaches (Vercruysse et al., 2017). In our case, exceedances were rare (7 d at gauge Vent and 10 d at gauge Vernagt), so we consider the error negligible for annual sediment export. We strongly encourage future studies to assess the number and severity of such exceedances for this purpose, especially in view of changing conditions, e.g., due to (prospective) climate change.
More generally, the QRF model can only reproduce the quantitative and qualitative conditions as represented within the training data. This has several implications. First, the spread of the QRF model results needs to be interpreted as a minimum estimate of uncertainty, as it can only reproduce the “known unknown”, i.e., the variability represented in the training data and their respective relationships or the lack thereof. Second, major qualitative changes in the functioning of the modeled system, such as changes in connectivity, e.g., through the formation of proglacial lakes or large-scale storage of sediments along the flow paths, cannot be captured. However, for the catchments considered in this study, such major changes are at least not indicated in historic aerial images (Laser- und Luftbildatlas Tirol, 2022), and the longitudinal profiles of the major water courses are very steep, which precludes significant sediment storage.
We found that our models at both gauges underestimated rare, high daily SSCs and
Further, we suspected that the aggregation of precipitation and discharge to daily values might involve some loss of information, e.g., on subdaily precipitation intensity and maximum discharge, which would very likely affect sediment export estimates. Indeed, the underestimation is a little more pronounced in the daily model as compared to hourly resolution, yet the difference was relatively small (Validation A, Fig. 3a). Adding to this, the underestimation at the daily scale does not seem to propagate to annual estimates, as high annual sSSYs are not systematically underestimated to the same extent (Fig. 5b) and the underestimation of rare events at the daily scale has a limited effect on the annual estimates (Sect. 4.4). This is in accordance with the finding that only about one-fourth of the annual yield in Vent is transported during (precipitation) events (Schmidt et al., 2022b), as opposed to other fluvial systems, where the majority of the annual sediment yield is transported by several extreme events (Delaney et al., 2018). Consequently, the proposed QRF model is also applicable at daily resolution, enabling its application to the longer time series available.
We assessed temporal extrapolation ability at gauge Vernagt by training a
model on the data of 2019/2020 and comparing the estimates to measurements in
2000 and 2001 (Validation B; Fig. 4), which showed overestimation and underestimation of annual sSSYs by 31 % and 16 %, respectively. In interpreting these
results, it has to be considered that the amount of data used for training
is very small and less than half of the final reconstruction model (212 of
579 d), while the amount of training data is known to be crucial in
data-driven models (e.g., Vercruysse et al., 2017). Specifically, turbidity recordings only started in mid-July 2019, so
that only one spring season was available for training. Thus, given the
rigorousness of this test and the temporal distance of 20 years, we find that the validation yields satisfactory results with good agreement of
dynamics on short timescales and annual estimates. The validation model trained on the data of 2000/2001 yielded lower but still satisfactory
NSE values. We attribute this to higher discharge and temperature values in
2019 and 2020, which, again, points to the importance of exceedances. With better data availability at gauge Vent, we assessed temporal extrapolation
ability through a five-fold cross-validation. This showed that some periods
were harder to model based solely on the remaining data. This means that
these periods contain rather distinct data and are thus especially valuable
as training data for the full QRF model, as they contained the highest SSC
and
The overall magnitudes of annual yields at the two analyzed gauges fall at the high end compared to an extensive collection from the European Alps (Hinderer et al., 2013) and are in good accordance with yields from other catchments in the Stubai and Ötztal Alps (Schöber and Hofer, 2018; Tschada and Hofer, 1990) (see also Schmidt et al., 2022b). Their reconstruction for the past five decades constitutes an important contribution to the understanding of long-term sediment budgets, as other existing records commonly only cover some decades.
The reconstructed annual sSSYs at both gauges show overall positive trends and change points around 1981. The two change point detection methods did not agree at gauge Vernagt, where MCP yielded high change point probabilities around 1981, whereas Pettitt's test detected a significant change point in 1989 (or 2002 in some realizations). Similarly, the results did not agree for July temperatures and discharge at gauge Vernagt. We attribute this to a limitation of Pettitt's test, which is known to be much less sensitive to break points located near the beginning or end of the time series (Mallakpour and Villarini, 2016). Notwithstanding, we conclude that there is a high probability of change points around 1981 in the reconstructed sSSY as well as discharge and July temperature time series at gauge Vernagt, as the MCP probability distributions are very narrow and similar to the corresponding probability distributions in Vent.
Our results suggest that the step-like increase in modeled sSSY is linked to the onset of increased ice melt. Mean annual temperature in Vent and mean summer temperature at gauge Vernagt at both locations show gradual positive trends without a clear change point. However, mean July temperatures show high change point probabilities around 1981 at both locations, which is probably heavily influenced by extraordinarily high temperatures in July 1982 and 1983. July temperatures are especially relevant for firn and glacier melt, since July is the month with the highest firn and ice melt contribution to discharge after snowmelt contributions have peaked in June and snow cover has decreased substantially (Kormann et al., 2016; Weber and Prasch, 2016; Schmieder et al., 2018; Schmidt et al., 2022b). This shift is reflected in discharge, which shows a step-like increase around 1981 at both gauges and continues to be elevated after this change point. The analyses of monthly discharge showed that July was the only month with a change point around 1981 at both gauges, which again emphasizes the dominance of increased firn and ice melt. In contrast, we did not find evidence of a change in precipitation sums around 1981, which would indicate that enhanced hillslope erosion on snow- or ice-free surfaces played a crucial role in the sSSY increase. With regard to suspended sediment dynamics, the increase in ice melt may translate to an increase in sediment-rich glacial meltwater (Delaney and Adhikari, 2020) and intensified fluvial erosion of sediment-rich proglacial areas.
These changes in the predictors of our QRF model are also reflected in the mass balance time series of the Hintereisferner and Vernagtferner, which were almost exclusively negative after 1980, with very negative summer mass balances in 1982 and 1983. Escher-Vetter (2007) and Abermann et al. (2009) also showed that mass loss started in 1981 at both glaciers, resulting from negative summer balances. With respect to the Vernagtferner, the drastically higher ablation area ratio of almost 80 % in 1982 as compared to around 25 % in the preceding years (Escher-Vetter and Siebers, 2007) indicates that large areas of the glacier became snow-free in 1982. This entails crucial changes in albedo and therefore intensified ice melt and thinning of firn areas due to rising energy absorption at the glacier surface (Escher-Vetter, 2007; Braun et al., 2007). We interpret this as a regime shift as summer mass balances continue to be lower than before 1981 although (July) temperatures decrease again, with the exception of 1984, which was characterized by a relatively high number of snowfall days during the ablation period (Escher-Vetter and Siebers, 2007).
Similar step-like increases in sediment export have been reported from the upper Rhône basin in Switzerland (Costa et al., 2018) (only slightly later, around 1985, which is likely due to lower average temperatures) and for the headwaters of the Yangtze River on the Tibetan Plateau (Li et al., 2020; Zhang et al., 2021) and 28 headwater basins in High Mountain Asia (Li et al., 2021). Thus, we deem it unlikely that the change points identified in the present study are related, e.g., to a change in measurements, also because they were detected at both gauges in the same year and in both (July) temperature and discharge. More generally, substantial increases in sediment export in response to climate change have been reported from cold environments around the globe (e.g., Bogen, 2008; Koppes et al., 2009; Bendixen et al., 2017b; Singh et al., 2020; Vergara et al., 2022) and are in accordance with state-of-the-art conceptual models, which expect that phases of glacier retreat (and re-advance) will lead to the highest increase in sediment yield across glacial cycles (Antoniazza and Lane, 2021), as has been confirmed by several studies (Lane et al., 2017, 2019; Micheletti and Lane, 2016).
Interestingly, we found opposing trends in the time after 1981, with strongly increasing annual sSSY at gauge Vernagt and decreasing sSSY at gauge Vent. To some extent, this is reflected in changes in discharge, where we found no significant trend (and Sen's slope close to zero) after 1981 at gauge Vent but a strong positive trend at gauge Vernagt. Altogether, this could indicate different timings of “peak sediment” (Ballantyne, 2002; Antoniazza and Lane, 2021) and thus a stabilization or compensation of the larger Vent catchment as opposed to the nested Vernagt catchment. Some independent observations support this notion: for example, the glacier tongue of the Hochjochferner (denoted as “HJF” in Fig. 1), located in the southernmost tributary valley to the Rofental, has retreated by about 2 km since the 1970s (determined based on historic aerial image collection of the State of Tyrol, Laser- und Luftbildatlas Tirol, 2022) and has now retreated behind a rock sill. Additionally, several small lakes have formed in its glacier foreland that likely act as sediment sinks. Another two smaller glaciers in the northeastern part of the Vent catchment, Mitterkarferner and Platteiferner (“MKF” and “PF” in Fig. 1), have disappeared almost completely since the 1970s. Conversely, the Vernagtferner glacier has also experienced considerable loss in area and volume but lacks such pronounced qualitative changes thus far.
In the presented study, we tested a quantile regression forest (QRF) model, which enabled the estimation of sediment export of the past five decades for the two gauges Vent and Vernagt. This allowed us to analyze annual specific suspended sediment yields (sSSYs) for trends and change points. Annual sSSYs show positive trends as well as step-like increases after 1981 at both gauges. As this coincides with exceptionally high July temperatures in 1982 and 1983, distinct changes in the glacier mass balances of the two largest glaciers in the catchment area and a sudden increase in the ablation area of one of the glaciers, we conclude that temperature-driven enhanced glacier melt is responsible for the step-like increase in sSSYs. This is also mirrored in discharge measurements at both gauges, which show change points around 1981 as well. Opposing trends after 1981 at the two gauges could indicate different timings of “peak sediment”. These analyses also demonstrated the value of assessing change points in addition to trend analyses in order to detect sudden changes in the analyzed geomorphic systems and thus facilitate a better understanding of critical time periods.
Further, we explored advantages and limitations of the QRF approach. As a
major limitation, QRF may yield underestimates if predictors exceed the
range of values represented in the training dataset. To estimate the effect
on the model results, we suggest assessing the number of such exceedances, which can also be leveraged to evaluate the representativity of the training
dataset. Further, we found that events with very high SSC and
Characteristics of input data.
Discharge (
Trends and change points in monthly discharge sums at gauge Vent.
Dashed red lines indicate change points according to Pettitt's test
(significant at
Trends and change points in monthly discharge sums at gauge
Vernagt. Dashed red lines indicate change points according to Pettitt's test
(significant at
The code of the quantile regression forest model, including the preprocessed data and raw model results, is available on B2Share (
See column “data availability” in Table A1 in Appendix A.
LKS developed the general idea and conceptualized the study with TF and PMG, with mentoring and reviewing by AB. LKS gathered the raw data. LKS, PMG and TF installed and maintained the automatic water sampler for gauge Vernagt, heavily supported by the installations run by CM, who also supplied past data. PMG prepared the input data, adapted and extended the model code and performed modeling experiments with the support and supervision of TF and LKS. LKS conducted the statistical analyses with support by TF. CM reviewed and evaluated the results in the glaciological context. LKS prepared the original draft, including all the figures, and all the authors contributed to the writing of this paper.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We were supported by the Hydrographic Service of Tyrol, Austria, by providing data as well as logistical support and fruitful discussions.
We thank Oliver Korup for his encouraging advice on statistical analyses, Matthias Siebers, Andreas Bauer, Irene Hahn, Theresa Hofmann, Marvin Teschner, Nina Lena Neumann and Joseph Pscherer for their help and support during fieldwork and laboratory work, and Stefan Achleitner, Carolina Kinzel and the Environmental Engineering laboratory at the University of Innsbruck for their kind support during fieldwork and laboratory work. We thank Anna Costa, Dongfeng Li and an anonymous referee for their thorough and valuable comments that helped enhance and refine the manuscript.
This research has been supported by the Deutsche Forschungsgemeinschaft research training group “Natural Hazards and Risks in a Changing World” (grant nos. NatRiskChange GRK 2043/1 and NatRiskChange GRK 2043/2) and a fieldwork fellowship of the German Hydrological Society (DHG).
This paper was edited by Roberto Greco and reviewed by Dongfeng Li, Anna Costa, and one anonymous referee.