Simulating sediment discharge at water treatment plants under different land use scenarios using cascade modelling with an expert-based erosion-runoff model and a deep neural network

Excessive sediment discharge in karstic regions can be highly disruptive to water treatment plants. It is essential for catchment stakeholders and drinking water suppliers to limit the impact of high sediment loads on potable water supply, but their strategic choices must be based on simulations integrating surface and groundwater transfers and taking into account possible changes in land use. Karstic environments are particularly challenging as they face a lack of accurate physical descriptions for the modelling process, and they can be particularly complex to predict due to the non-linearity of the processes generating sediment discharge. The aim of the study was to assess the sediment discharge variability at a water treatment plant according to multiple realistic land use scenarios. To reach that goal, we developed a new cascade modelling approach with an erosionrunoff geographic information system (GIS) model (WaterSed) and a deep neural network. The model was used in the Radicatel hydrogeological catchment (106 km2 in Normandy, France), where karstic spring water is extracted to a water treatment plant. The sediment discharge was simulated for five design storms under current land use and compared to four land use scenarios (baseline, ploughing up of grassland, eco-engineering, best farming practices, and coupling of eco-engineering/best farming practices). Daily rainfall time series and WaterSed modelling outputs extracted at connected sinkholes (positive dye tracing) were used as input data for the deep neural network model. The model structure was found by a classical trial-and-error procedure, and the model was trained on 2 significant hydrologic years. Evaluation on a test set showed a good performance of the model (NSE= 0.82), and the application of a monthly backwardchaining nested cross-validation revealed that the model is able to generalize on new datasets. Simulations made for the four land use scenarios suggested that ploughing up 33 % of grasslands would increase sediment discharge at the water treatment plant by 5 % on average. By contrast, ecoengineering and best farming practices will significantly reduce sediment discharge at the water treatment plant (respectively in the ranges of 10 %–44 % and 24 %–61 %). The coupling of these two strategies is the most efficient since it affects the hydro-sedimentary production and transfer processes (decreasing sediment discharge from 40 % to 80 %). The cascade modelling approach developed in this study offers interesting opportunities for sediment discharge prediction at karstic springs or water treatment plants under multiple land use scenarios. It also provides robust decisionmaking tools for land use planning and drinking water suppliers. Published by Copernicus Publications on behalf of the European Geosciences Union. 6224 E. Patault et al.: Simulating sediment discharge at water treatment plants under different land use scenarios

Abstract. Excessive sediment discharge in karstic regions can be highly disruptive to water treatment plants. It is essential for catchment stakeholders and drinking water suppliers to limit the impact of high sediment loads on potable water supply, but their strategic choices must be based on simulations integrating surface and groundwater transfers and taking into account possible changes in land use. Karstic environments are particularly challenging as they face a lack of accurate physical descriptions for the modelling process, and they can be particularly complex to predict due to the non-linearity of the processes generating sediment discharge. The aim of the study was to assess the sediment discharge variability at a water treatment plant according to multiple realistic land use scenarios. To reach that goal, we developed a new cascade modelling approach with an erosionrunoff geographic information system (GIS) model (Wa-terSed) and a deep neural network. The model was used in the Radicatel hydrogeological catchment (106 km 2 in Normandy, France), where karstic spring water is extracted to a water treatment plant. The sediment discharge was simulated for five design storms under current land use and compared to four land use scenarios (baseline, ploughing up of grassland, eco-engineering, best farming practices, and coupling of eco-engineering/best farming practices). Daily rainfall time series and WaterSed modelling outputs extracted at connected sinkholes (positive dye tracing) were used as input data for the deep neural network model. The model structure was found by a classical trial-and-error procedure, and the model was trained on 2 significant hydrologic years. Evaluation on a test set showed a good performance of the model (NSE = 0.82), and the application of a monthly backwardchaining nested cross-validation revealed that the model is able to generalize on new datasets. Simulations made for the four land use scenarios suggested that ploughing up 33 % of grasslands would increase sediment discharge at the water treatment plant by 5 % on average. By contrast, ecoengineering and best farming practices will significantly reduce sediment discharge at the water treatment plant (respectively in the ranges of 10 %-44 % and 24 %-61 %). The coupling of these two strategies is the most efficient since it affects the hydro-sedimentary production and transfer processes (decreasing sediment discharge from 40 % to 80 %). The cascade modelling approach developed in this study offers interesting opportunities for sediment discharge prediction at karstic springs or water treatment plants under multiple land use scenarios. It also provides robust decisionmaking tools for land use planning and drinking water suppliers.

Introduction
In karstic environments, erosion and runoff can lead to a high load of sediments in surface and underground streams. Sediment discharge (SD) can occur through rapid and direct transfer via sinkholes and/or via re-suspension of sediment in the karst network itself (Masséi et al., 2003). For suppliers of drinking water, excessive sediment input can be highly disruptive, requiring additional treatment or, in the worst cases, temporary shutdowns of a water treatment plant (WTP; Stevenson and Bravo, 2019). Impacts can be significant, including restrictions on the use of drinking water. Upper Normandy (France) is particularly affected, and the economic cost linked to the restrictions on the use of drinking water due to excessive sediment inputs in raw water was estimated at EUR 5 million during the period 1992-2018 (Patault et al., 2021a); 10 000 to 20 000 people are still affected every year by restrictions on the use of drinking water in the region (ARS, 2013). Reducing sediment delivery to sinkholes is therefore essential for catchment managers in order to reduce the impact on potable water supply. One way to achieve this would be to build a complex decision-making process modelling chain, integrating surface and groundwater transfers. Several approaches have been proposed to model erosion/runoff and the karstic response induced by rainfall, but these approaches are often treated separately, which does not make it possible to evaluate the impact of a change in land use on the sediment load delivered to a WTP. Many studies in the literature have focused on hillslope erosion processes using different types of erosion models (Merritt et al., 2003;De Vente et al., 2013). Empirical models, such as RUSLE (Renard and Freidmund, 1994), are frequently used because of limited data requirements but are not able to fully represent spatial and temporal dynamics of erosion processes at the catchment scale (Verstraeten et al., 2007). Physical models, such as WEPP or LISEM (Laflen et al., 1991;Takken et al., 1999), can more accurately describe processes but may require many input parameters that are not available for application to large areas. Expert-based models (e.g. STREAM, WaterSed; Cerdan et al., 2001;Landemaine, 2016) offer an interesting compromise focusing on the main driving factors of runoff and erosion. These models have been designed with cultivated areas of the European loess belt in mind and are particularly efficient where Hortonian overland flow dominates (Souchère et al., 2005;Evrard et al., 2010;Landemaine, 2016). Some hillslope erosion studies conducted in similar karstic environments do not account for the transmissivity loss of water and sediment in sinkholes. Other studies that have focused on modelling karst processes have mainly examined karstic floods but, for the most part, have overlooked sedimentary fluxes (see the review of Hartmann et al., 2014). Due to the non-linearity of the processes generating sediment at karstic springs and the lack of accurate physical descriptions of karstic environments, modelling surface-subsurface interactions with physical models can therefore be a difficult task (Savary et al., 2017;Jourde et al., 2018). Based on systemic approaches, as initiated by Mangin (1984), karst can be considered a system able to transform an input (rainfall) into an output (discharge), and the input-output relation can be evaluated using mathematical functions. This approach can be considered a "blackbox" model to some extent, and recent research emphasized the advantages of using data-driven techniques, such as deep neural networks (DNNs) in similar situations (Yaseen et al., 2015;Kratzert et al., 2018Kratzert et al., , 2019. DNNs are advanced artificial neural networks (ANNs) which have been gaining momentum since 2012 in the computer sciences and whose adoption has been gradual in hydrology (Shen, 2018). DNNs are now commonly used for modelling real-world problems due to their ability to represent and generalize complex nonlinear relationships between inputs and outputs (Meyers et al., 2016(Meyers et al., , 2017Hafeez et al., 2019). DNNs can help by providing both stronger predictive capabilities and a complementary avenue toward knowledge discovery in the hydrologic sciences . Limited but conclusive applications were made to predict inflows to reservoirs, water levels of combined sewage outflow structures, turbidity, and flood forecasting in karst regions and integrated in rainfall-runoff modelling to better predict streamflows (Siou et al., 2011;Bai et al., 2016;Savary et al., 2017;Hu et al., 2018;Kratzert et al., 2018). The main objectives of this study were (i) to develop a cascade modelling approach able to simulate hydro-sedimentary transfer at a WTP for specific daily rainfall events and (ii) to evaluate the impact of different land use scenarios on the SD variability. This study was conducted in the Radicatel hydrogeological catchment (Normandy, France), where spring water is extracted to a WTP. We benefited from the use of an existing expert-based geographic information system (GIS) model (WaterSed), developed and successfully applied to the studied area, to simulate the impact of land use management on hydro-sedimentary transfers to connected sinkholes. Rainfall event characteristics and WaterSed outputs were then used as input for a datadriven model (i.e. a DNN) to simulate SD at the WTP. The cascade modelling approach was applied to multiple design storms (DS) under different scenarios in order to simulate hydro-sedimentary transfer in the hydrogeological catchment and evaluate the efficiency of different land use management strategies.

Study site
The study site is in the Pays de Caux (Normandy, France) on the right bank of the Seine River about 30 km from the Seine estuary (Fig. 1a). The climate is temperate with an average temperature of 11 • C. Annual precipitation ranges between 600 and 1100 mm yr −1 with a rainy season occurring between October and May. The karst is typical of the geological setting of the lower Seine Valley. The karstic chalk plateaus of the north-eastern side of the Normandy region are part of the western edge of the Paris Basin. The elevation ranges from 138 to 0 m a.s.l. (above sea level), and the median slope is 5.9 %. The geology consists of Cenomanian to Campanian chalk overlaid by thick surficial formations. The major formation is composed of clay with flints, which results from the weathering of the chalk (Laignel, 2003), loess, and tertiary sands (Lautridou, 1985). The thickness of the formation overlying the chalk ranges from 5 to 10 m. Water infiltrates from the uplands of the chalk aquifer to the valleys via rapid transfer through a highly developed karstic network and via slower drainage through the thick surficial formations.
The hydrogeological catchment of Radicatel covers 106 km 2 and the WTP is located at the interface of the Seine alluvium and the karstic chalk (Fig. 1b). Water is pumped from four different springs and seven pumping wells near the WTP (Chédeville, 2015). According to the information system for groundwater management in Seine-Normandy (http://sigessn.brgm.fr/, last access: 1 September 2020), hydrogeological investigations reported seven sinkholes posi-tively connected to the springs (in situ dye tracing already performed and confirming the connection). The WTP of Radicatel is exploited by the Le Havre-Seine Métropole (LHSM) to supply Le Havre inhabitants. Turbidity (NTU) is measured with a nephelometer at the inlet of the pumping station and has been recorded since 1987 (Fig. 2). The maximum turbidity value is recorded every day ( t = 1 d) and LHSM provided access to the entire turbidity time series . Incomplete periods or hydrologic years with missing data were discarded. Twenty-two hydrologic years with a complete time series of turbidity were kept for the rest of the study.

Methodology
The proposed cascade modelling for the simulation of SD at a WTP incorporates two different models that can be used separately or as part of an integrated modelling framework. These components are described in detail in the following sections.

Expert-based GIS model
The WaterSed model uses a raster-based distributed approach to model the spatial distribution of runoff and soil erosion within a catchment for a given rainfall event. The WaterSed model is an upgrade of the STREAM model (Souchère et al., 1998;Cerdan et al., 2001) simulating hydrological processes by conceptualizing each raster grid cell as a reservoir whose properties are calculated at the event scale and by routing both water and sediment according to the surface flow network. The WaterSed model requires several datasets to compute runoff and erosion for any location in the catchment: (i) a digital elevation model (DEM) to extract the slope and the runoff flow network, (ii) a stream network (used to modify the DEM to ensure a steady downstream gradient according to the observed hydrographic network) and river width, (iii) a land cover and soil texture map to associate each land cover in the catchment with the appropriate soil surface characteristics, (iv) decision tables, adapted for the local conditions, to associate each soil surface characteristic (soil surface crusting, surface roughness, and crop cover) observed in the land cover or field with a steady-state infiltration rate, a Manning roughness coefficient, a single potential sediment concentration, and a soil erodibility value (Cerdan et al., 2002a), and (v) rainfall events including total precipitation, antecedent moisture (48 h), and effective rainfall duration.

Deep neural network
In this study, a multi-layer feed-forward network (DNN) was built under Python version 3.6 using the high-level API Keras (Chollet, 2015) and Tensorflow (Abadi et al., 2016) as a background engine. A multi-layer feed-forward neural network is an interconnection of perceptrons in which data and calculations flow in a single direction, from the input data to the outputs. The number of layers in a neural network is the number of layers of perceptrons. The number of hidden layers and perceptrons depends on the characteristics of the input data, and there is no specific rule for selecting these parameters (Le et al., 2019). Neural networks are adjusted during a training stage when the parameters are calculated iteratively in a gradient descent that seeks to minimize a mean squared error. For efficient learning, input variables were rescaled between 0 and 1 using normalization and re-transformed for the simulations using the normalization parameters. The mean squared error (MSE) was chosen as a loss function, and the adaptive moment estimation (Adam) was adopted as the model optimization algorithm. After training, the quality of the model is evaluated on a test set. The ability of the model on the test set is known as generalization, and it can be influenced by overfitting during the training stage. To avoid overfitting and increase model robustness, we used two regularization methods in this study: (i) early stopping and (ii) cross-validation. Early stopping is an efficient regularization method that prevents the model from overfitting (Sjörberg and Ljung, 1992). We visually monitored the learning curves and stopped the training as soon as the validation error reached a minimum. Then, we rolled back the model parameters to the point where the validation error was at the minimum. The choice of the training/test set remains arbitrary and may need to be evaluated to compute a robust estimate of model error. With time series data, care must be taken when splitting the data in order to prevent data leakage (Cochrane, 2018). To address this issue, we adapted a monthly backward-chaining nested crossvalidation procedure, which provides an almost unbiased estimate of the true error (Varma and Simon, 2006). The procedure contains an inner loop for parameter tuning and an outer loop for error estimation (see Fig. B1). The parameters that minimize error are chosen on the inner loop, and we add an outer loop, which splits the data into multiple training/test sets. Then, the error is averaged on each split to evaluate the overall performance of the model. The optimum structure and configuration of the network (model design) were found by a classical trial-and-error procedure (training-evaluation process through optimization of errors; Ortiz-Rodriguez et al., 2013).

Performance evaluation
The performance of the model was evaluated through the Nash-Sutcliffe efficiency (NSE) and the root mean square error (RMSE). The RMSE corresponds to the standard deviation of the residuals (prediction errors). The residuals are a measure of the distance from the data points of the regression line. They evaluate how the predictions match to the observations, and values may range from no fit (+∞) to perfect fit (0) based on the relative change in the data. The NSE (Nash and Sutcliffe, 1970) indicates the model's ability to predict variables different from the mean and gives the proportion of the initial variance accounted for by the model. NSE values vary between −∞ (poor model) and 1, indicating a perfect fit between observed and predicted values. Finally, to ensure that the model does not suffer from a weakness when making simulations during extreme events, we performed an additional evaluation using the generalized extreme value (GEV) distribution that is broadly applied to extreme events such as rainfall or river discharges (Carreau et al., 2013;De Michele and Avanzi, 2018). The GEV distribution was fitted to SD time series at the WTP of Radicatel. The maximum annual SD observed at the WTP was extracted for all complete hydrologic years (i.e. annual maximal blocks; n = 22) and the distribution was fitted using the package "extRemes" of the R software (R Development Core Team, 2008; Gilleland and Katz, 2016; see Fig. C1). We compared the SD simulated at the WTP by the DNN to the values calculated by the GEV.

Data
Appropriate data handling can help address various concerns in DNN modelling, such as its ability to generalize beyond training limits (Kourgialas et al., 2015). Moreover, cascade modelling with a GIS and a DNN can require important computational effort; therefore, the input data must be carefully selected.

WaterSed data
To compute erosion and runoff, the WaterSed model needs a DEM. The DEM (5 m resolution) was retrieved from BD Alti ® . Depressions in the DEM were filled according to the algorithm developed by Wang and Liu (2006). The stream network location and width were provided by BD TOPO ® . To define the soil surface characteristics needed by the WaterSed model, we computed a land cover map and a soil texture map. The land cover map was developed for 2016 by combining two national databases: the French Land Parcel Identification System (RPG) and the Soil Observatory of Upper Normandy (OSCOM). The soil texture map (three classes: clay, silt, and sand) was derived from the Regional Pedological Referential (RRP) with a precision of 1 : 250 000. The parametrization consists of a characterization of the main parameters influencing runoff and infiltration in the studied context: soil surface crusting, surface roughness, and crop cover. These soil surface characteristics were defined for each month and for each crop class according to the cropping calendar developed by Evrard et al. (2010) and Delmas et al. (2012). Steady-state infiltration and potential sediment concentration were assigned to each soil surface characteristic according to Cerdan et al. (2001Cerdan et al. ( , 2002b). The WaterSed model requires Manning's roughness coefficient in order to compute flow velocity. Based on soil surface characteristics, Manning's values were derived from surface roughness (Morgan, 2005) and the percentage of crop cover (Gilley et al., 1991). Last, soil surface characteristics were also used to define the soil erodibility factor, varying in the [0-1] interval, by adapting the table developed by Souchère et al. (2003). Calibration parameters were extracted from a previous study (Landemaine, 2016;Landemaine et al., 2020b) conducted in the Austreberthe catchment, located 30 km east of the Radicatel catchment. For the training phase of the DNN, erosion and runoff were calculated with the Wa-terSed model for 269 events to reduce computational efforts, considering that erosion and runoff occur only for significant rainfall events (P > 2.5 mm d −1 , the threshold below which no effective rainfall is generated, and runoff and sediment discharge entering connected sinkholes were set to 0).

DNN data
Cumulative daily rainfall (mm) from 1987 to 2017 was extracted from the SAFRAN database over the hydrogeological catchment of Radicatel. SAFRAN data are hydroclimatic data covering France at a resolution of 8 km on an extended Lambert-II projection and produced by Météo-France (Quintana-Segui et al., 2008;Vidal et al., 2010). We used the rainfall time series (i.e. daily cumulative rainfall, P, and 48 h antecedent rainfall, P 48 ) retrieved from the SAFRAN database and WaterSed modelling outputs -runoff (R WS ) and sediment discharge (SD WS ) -at connected sinkholes as input data for the DNN model. The SD time series at the Radicatel WTP was retrieved from turbidity data and considered the output for the DNN. In this study, turbidity data distribution was explored using scatter plots (Fig. A1). The turbidity data were chosen so that they include the best dispersion of values (0-370 NTU) and a strong recurrence of extreme turbidity values (> 150 NTU). Thus, 2 significant hydrologic years (H12-H13; from 1 October 1998 to 30 September 2000) were selected, accounting for 731 daily events. These 2 hydrologic years were selected as the training/test set of the model. Data were split as a training set (70 %) and a test set (30 %) while respecting the chronology of daily rainfall amounts. The training set was also split into training and stop sets to fine-tune the hyperparameters (70 %-30 %). At  Masséi et al. (2006) and Hanin (2011) in a karstic environment in Normandy, a lag of 1 d was applied to the SD time series to properly match the rainfall input.

Design storms and land use scenarios
Five design storms were constructed for the cascade modelling approach based on expert knowledge and depthduration-frequency curves of the French Meteorological Survey (Météo-France) available from 1996 to 2006 on the studied area (Table 1). We considered winter events for low return period rainfall (0.5 and 2 years). In the studied region, even low daily rainfall depths can lead to severe erosion by water runoff due to saturated soils induced by heavy cumulated rainfall on antecedent days (Le Bissonnais et al., 2002;Evrard et al., 2010). For higher return periods (i.e. 10, 50, and 100 years), we considered spring events that are characterized by stronger rainfall intensities and that can be particularly damaging in this region (Evrard et al., 2007).
Four land use change scenarios for the year 2050 were investigated and compared to a baseline land use scenario (2018). The scenarios were incorporated into the model in order to simulate SD variability at the WTP and evaluate their impacts. All scenarios are described below.
-Baseline scenario (S_base): this scenario served as a reference and was built considering the latest available land use data on the catchment (see Sect. 3.3). Existing erosion control measures in 2018 were considered and extracted from a regional database (BD CASTOR; http://bdcastor.fr/, last access: 1 September 2020). The database contained 45 dams/retention ponds, 16 ponds, 1 fascine, and 4 hedges for the actual land use scenario on the Radicatel catchment, which have been included in the WaterSed model.
-Ploughing up of grassland (S_grass): based on regional benchmarks (DRAAF; https://draaf.normandie. agriculture.gouv.fr/, last access: 1 September 2020) for the 1970-2010 period in the studied region (Pays de Caux), we observed an average conversion rate of grasslands up to 900 ha yr −1 . Extrapolation by 2050 led to the conversion of 33 % of existing grasslands. The extrapolated rate was applied on the Radicatel catchment. Grasslands were ploughed up based on a slope criterion, taking into account the working conditions of farmers and prioritizing those with mild slopes (< 12 %) and therefore mainly located on the plateau upstream of the catchment.
-Eco-engineering (S_engi): based on expert knowledge, 181 fascines and 13.1 ha of grass strips were implemented in addition to existing erosion control measures to mitigate runoff/erosion on the catchment and reduce rapid transfer via the connected sinkholes. Grass strips were deployed on the flow paths in the vicinity of the sinkholes. Fascines were deployed throughout the catchment, also on flow paths and along plot boundaries. This scenario allows for a shift from a 0.19 per 30 ha erosion control measure density to nearly 1 per 30 ha, which is advised to promote sedimentation and landscape restructuring (Ouvry et al., 2019). The localization was optimized according to the baseline scenario simulations.
-Best farming practices (S_farm): this scenario promotes the adoption of farming practices improving infiltration on the catchment (increasing crop cover or delaying the formation of the slaking crust). Fifty percent of the plots were randomly selected and applied a 15 % increase in infiltration capacity, respecting the actual proportions of winter and spring crops on the catchment. The applied value was set based on the study of Maetens et al. (2012), who synthesized the reduction in erosion and runoff following different agricultural practices across Europe. According to their results, a 15 % increase in infiltration capacity can be regarded as a conservative assumption that can be easily achieved through simplified agricultural techniques (e.g. minimum tillage, no till, direct seeding, crop cover).
-Coupling eco-engineering and best farming practices (S_farm + engi): both scenarios, S_farm and S_engi, were combined. Experiments carried out in the study area suggested that combining both approaches is necessary to reduce the impact on sensitive or vulnerable areas (Ouvry et al., 2012).

Modelling the inputs
In order to feed the DNN, erosion and runoff were simulated with the WaterSed model for 269 rainfall events over the entire Radicatel catchment. We used the WaterSed parametrization that was carried out on the adjacent catchment (La Lézarde; Landemaine et al., 2020a) and that had already proved to be valid in the same area of the Pays de Caux (Landemaine, 2016;Landemaine et al., 2020b) and Belgium (Baartman et al., 2020). For each event, Wa-terSed outputs -i.e. runoff (R WS ; m 3 ) and sediment discharge (SD WS ; kg d −1 ) values -were independently extracted over the connected sinkholes and summed to consider a unique contribution from the seven sinkholes to the spring. Figure 3 shows that from October 1998 to September 2000, rainfall events led to a significant cumulative runoff and sediment discharge to the sinkholes (R

DNN: calibration and generalization
The final structure of the DNN was composed of one input layer with four variables, two hidden layers, and one output layer with the targeted variable. We set 40 neurons in the hidden layers as follows: 30-10. We used the rectified linear (ReLU) activation function for the two hidden layers. The optimal number of iterations was set to 15 and the batch size to 1. We used the runoff and sediment discharge simulated by the WaterSed model for the 2 selected hydrologic years as inputs for the DNN. The rainfall time series (P , P 48 ) available on the catchment were also considered as inputs. A turbidity time series recorded at the WTP was transformed into sediment discharge knowing the mean pumping flow rate at the Radicatel WTP and considering a mean sediment concentration of 1 mg L −1 from field investigations made by Hanin (2011). Thus, sediment discharge at the WTP was considered as output of the DNN. The DNN was trained from October 1998 to February 2000 (n = 511 daily events). The remaining period from March 2000 to September 2000 was Figure 5. Boxplots of the performance metrics over the training and test sets using the monthly backward-chaining nested cross-validation. used as the test set (n = 220 daily events). Modelling results (Fig. 4) suggested a good agreement between observed and simulated SD at the WTP over the test set, with a NSE criterion reaching 0.82 and a RMSE around 383 kg. The temporal variability was also well reproduced. The results over the training set were slightly worse and reached a NSE of 0.6 and a RMSE of 420 kg. Over the investigated period, the cumulative SD observed at the WTP reached 158 611 kg, whereas the simulated cumulative SD reached 129 253 kg (underestimation of 19 %). Comparing SD WS and observed SD at the WTP resulted in a 60 % recovery rate from WaterSed outputs, which was consistent with previous hydrogeological tracing results on connected sinkholes. Hanin (2011) suggested the existence of fast karstic connections between sinkholes and springs on the Radicatel catchment, with a 62 % recovery rate. For the 126 events for which erosion and runoff occurred on the catchment, the cumulative amount for both observed and simulated SD was estimated to 88 114 and 78 870 kg respectively, suggesting that direct transfers account for 55 %-61 %. These results were consistent with previous published results of Masséi et al. (2003) in the same karstic environment (Norville catchment, Normandy, France), which evaluated the proportion of direct transfers at 73 % during erosive events. The monthly backward-chaining nested cross-validation procedure made it possible to assess the generalization capacity of the DNN on a new dataset. This procedure removed 1 month from the initial dataset (i.e. 731 events) and was repeated 12 times while keeping at least 1 full hydrologic year as input for the modelling. The modelling results were more efficient for the inner loop (Fig. 4) than for the outer loops (Figs. 5 and B1). Overall, the median NSE value for the training and test sets for the outer loops was above 0.5. The median RMSE value on the training and test sets was on the same order of magnitude as for the complete dataset (300-500 kg). The difficulties of the model in generalizing suggested a classical problem in machine learning, the biasvariance dilemma (Geman et al., 1992). Ideally, the model should accurately reflect irregularities in the training data while generalizing to data testing, but it was not possible to achieve both at the same time. The higher NSE and lower RMSE values on training sets suggest a complex model with high variance and a reliable bias that represents the training data fairly well but that presents a learning risk on the test data. The model can thus represent part of the random noise of the learning dataset.
To validate the approach for extreme values, we simulated overall SD for the baseline scenario at the connected sinkholes and for the five DS using the WaterSed model. Then, we used these modelling results as inputs for the DNN model and applied them to the five DS. Secondly, we compared the results with the calculated GEV distribution (Fig. 6). The SD simulated at the WTP by the DNN model can be considered to be in good agreement with the values calculated by the GEV, even if the latter shows a high dispersion for higher return periods.
In parallel, we selected the 1 % highest sediment discharge on records from October 1998 to September 2000, which represented 28 % of the total sediment discharge observed at the WTP, and compared the results with simulated sediment discharge in a scatter plot. We observed a good relation between those two variables (R 2 = 0.72; Fig. D1), which strengthens our confidence in the model for simulating extreme events. While the performance of deep learning-based methods in modelling extreme events is discussed (Zhang et al., 2019), the results obtained here provide confidence in the model's ability to simulate them due to a careful selection of input data that allow the model to learn patterns of extreme events in historical time series.

Prospective analysis
Once the calibration was completed and the architecture of the DNN defined, we applied the cascade modelling for all predefined scenarios. As a first step, and using the WaterSed model, we simulated SD WS and R WS at the connected sinkholes for the three additional scenarios and the five DS, comparing them to the baseline scenarios ( Fig. 7a and b). For the first scenario (S_grass), 33 % of grasslands were ploughed up, which led to an increase in the spatial extent of runoff generation on the catchment. SD S_grass ranged from 8468 to 188 264 kg for the five DS, with an average increase of 4.74 % compared to the results of the baseline scenario. The effect on runoff was higher, with an average increase of 8.4 % for the five DS, ranging from 2211 to 67 097 m 3 . The second scenario (S_engi) considered the implementation of erosion control measures by 2050 (i.e. 181 fascines and 13.1 ha of grass strips), which led to a global decrease in sediment flux rates. Simulated SD S_engi to connected sinkholes ranged from 1659 to 133 458 kg, resulting in an average decrease of 44 %. This scenario was more effective in small return periods (i.e. 0.5 and 2 years), with a SD decrease between 79 % and 55 % respectively. Erosion control measures were less effective on the three other DS, with decreases ranging from 25 % to 34 %. The effects on runoff were not as important, with an average decrease of 7 %. This was not surprising because the considered management plan only included 13.1 ha of grass strips, and the main interest of the fascines lies in their ability to trap suspended sediments, es-pecially their coarsest elements (AREAS, 2012). The third scenario (S_farm) represented the adoption of good farming practices (i.e. +15 % infiltration capacity) on 50 % of the plots. It can be observed that this scenario significantly reduced sediment production on the hillslopes ( Fig. 8a and b). The simulated SD S_farm ranged from 741 to 134 001 kg, leading to an average decrease of 49 % compared to the baseline scenario results. The simulated values on runoff ranged from 720 to 47 442 m 3 . This scenario was more effective in reducing SD and runoff, on average, than the eco-engineering scenario. The SD decrease induced by good farming practices ranged from 25 % to 90 % at the connected sinkholes. The fourth scenario (S_farm + engi) combined both effects of the erosion control measures and good farming practices. We observed the highest decrease in sediment discharge and runoff at connected sinkholes. The simulated values on SD ranged from 547 to 97 739 kg and from 598 to 45 051 m 3 for runoff.
DS characteristics and simulated SD/runoff by the Wa-terSed model at connected sinkholes, and for all scenarios, were injected as inputs in the DNN model. In the baseline scenario, simulated SD at the WTP ranged from 576 to 12 200 kg (see Sect. 4.2), and Fig. 9 shows the global trends of the scenario modelling results. For the S_grass scenario, simulated SD at the WTP was slightly higher than the baseline scenario, reaching 12 894 kg for the 100-year DS. This scenario resulted in an average increase of 4.5 % for the SD at the WTP. The SD increase was less significant in small return periods (+0.42 %-3.29 %) than longer return periods (+5.68-6.82 %). In the second scenario (S_engi), the modelling results suggested a significant SD decrease at the WTP. The simulated SD ranged from 515 to 9810 kg, which led to an average decrease of 25.4 %. The SD reduction was particularly effective on the 2-year DS, reaching a decrease of 44.7 %. The smallest SD reduction was observed for the 0.5-year DS (10 %) and ranged from 19 % to 30 % on longer return periods. These results were consistent with those found by Fournier et al. (2008), who showed by a parametric interpolation model that erosion control measures at connected sinkholes on the Caumont aquifer led to a 36 % decrease in the turbidity in the Varras WTP (Normandy, France). The third scenario (S_farm) was more efficient, with simulated SD ranging from 223 to 9165 kg and an average decrease of 43 %. The SD reduction ranged from 28.9 % to 61.3 % and was more efficient for the 0.5and 2-year return periods (55.6 %-61.3 %). For the last scenario (S_farm + engi), we observed a global decrease of 59.6 %. The simulated SD values at the WTP ranged from 167 to 7298 kg. This scenario was more efficient for the 0.5-, 2-, and 10-year return periods (70 % on average).    The original cascade modelling approach developed in this study allowed the evaluation of surface and subsurface sediment transfer processes. As karstic processes are complex and difficult to model due to the lack of knowledge of their geometry, the cascade modelling with a process-based GIS model and a data-driven model (i.e. DNN) proved to be a powerful tool. This approach is efficient at assessing the impact of different land use scenarios on drinking water quality at drinking water treatment plants. Despite very encouraging results, one may notice that the cascade model was trained on only 2 years of complete selected measurements and therefore may not have captured the full distribution of possible cumulative daily rainfall on the catchment and/or turbidity values at the drinking water treatment plant. The degree of confidence in the model's output could be further improved with longer time series. Two limitations of the presented cascade approach are the data availability and its ability to be validated at other study sites, as the turbidity data backups are not always properly done in all WTPs. One specific issue for DL applications in the hydrological sciences, as mentioned by Sit et al. (2020), is that data provided by the authorities are dispersed and that they occasionally have mismatches in temporal or spatial coverage. We thus encourage the drinking water supplier to keep turbidity records to allow the application of these data-driven models and to ensure that the time series do not have interruptions.

Implications for future land use strategies
The modelling results in this study suggest that two different land use strategies (i.e. increase in infiltration capacity and eco-engineering) can significantly reduce the SD incoming to connected sinkholes during extreme rainfall events (up to 90 %) and therefore decrease the SD at the WTP (from 10 % to 61 %). The first strategy leads to a decrease in sediment production through simplified cultural techniques, while the second affects transfer processes. The adoption of better farming practices (e.g. increase in crop cover, no-tillage, reduced tillage) inducing an increase of 15 % of the infiltration capacity on 50 % of the agricultural plots of the catchment appears to be slightly more efficient than the implementation of eco-engineering infrastructures (181 fascines + 13.1 ha of grass strips) at reducing SD at the WTP. These results thus show that it is more interesting to adopt a land use strategy aimed at reducing hydro-sedimentary production directly on cultivated plots than transfer processes on the catchment, but these strategies could be combined. Additional simulations integrating both effects of best farming practices on cultivated plots and the implementation of eco-engineering infrastructures show a 10-fold effect. As illustrated in Fig. 9, the coupling of the two strategies makes it possible to reduce significantly the SD at the WTP. The combined effect will not add up, but we can expect an improved decrease in sediment discharge at the WTP from 40 % to 80 %. Both land use strategies can also provide interesting effects on biodiversity and ecosystem services (Posthumus et al., 2015) and/or can improve soil resilience and promote sustainable agriculture (Lal, 2014). Public policies leading to the implementation of erosion control measures can be economically viable and efficient (Patault et al., 2021b). Even if some simplified cultural techniques may imply negative returns for farmers, they may be eligible for agri-environment payments (Posthumus et al., 2015). The implementation of these land use strategies may also require specific maintenance to keep their initial performance (Frankl et al., 2018) or specific training and machines for farmers. According to the simulations, the ploughing up of 33 % of grasslands for the benefit of agricultural plots on the catchment by 2050 will not increase significantly the SD at the WTP (less than 5 % on average). Our simulations just ex-tended the current trend observed in the studied region. Despite everything, some precautions must be taken regarding the results of this scenario, which could be higher or lower depending on the localization of the ploughed-up plots. The same observations were made by Souchère et al. (2005), who suggested, according to their results, that the efficiency of all developments in reducing erosion and runoff is linked to their location within the catchment. Indeed, it is well known that hydrologic connectivity may lead to increments in runoff (Appels et al., 2011).

Conclusions
In this study, a new cascade modelling approach was developed in order to help decision-makers choose an adapted erosion and runoff management strategy to reduce the impact of sediment discharge on drinking water supply. The expert-based GIS model (WaterSed) was used to simulate erosion and runoff at connected sinkholes (positive dye tracing) on the Radicatel catchment (Normandy, France) and used to feed a data-driven model (i.e. deep neural network) to simulate karstic transfers. This new approach does not require knowledge of the geometry of the karstic system studied and demonstrated the value of understanding hillslope erosion and runoff processes to model underground hydrosedimentary transfers in karstic systems. Our modelling results suggest that the cascade model performed well during the calibrating phase. The cascade model was able to generalize on unknown datasets through an adapted monthly backward-chaining nested cross-validation procedure, and the cascade model was efficient at simulating extreme events. The results also suggest that a land use change scenario considering the adoption of simplified cultural techniques can significantly reduce sediment discharge at the water treatment plant (up to 61 %). This scenario also outperformed the one which considered the implementation of eco-engineering control measures reducing erosion and runoff on the catchment (up to 45 % at the water treatment plant). The coupling of the two previous land use strategies can be even more effective since it operates on both the hydro-sedimentary production and transfer processes (decreasing SD at WTP from 40 % to 80 %). Finally, ploughing up 33 % of actual grasslands on the catchment will not significantly increase sediment discharge at the water treatment plant. However, the results might be influenced by the spatial organization of grasslands on the catchment that is a key parameter for hydrosedimentary connectivity and hydro-sedimentary transfers to the sinkholes. In the framework of this study, we suggest conducting a specific study on this hypothesis.
Appendix A: Scatter plot of turbidity time series Figure A1. Scatter plot of turbidity time series recorded at the water treatment plant in the Radicatel catchment, Normandy, France.
Appendix B: Monthly backward-chaining nested cross-validation Figure B1. Monthly backward-chaining nested cross-validation procedure developed to test the generalization capacity of the model (modified after Cochrane, 2018).