Impairment of water quality by organic micropollutants such as pesticides, pharmaceuticals or household chemicals is a problem in many catchments worldwide. These chemicals originate from different urban and agricultural usages and are transferred to surface waters from point or diffuse sources by a number of transport pathways. The quantification of this form of pollution in streams is challenging and especially demanding for diffuse pollution due to the high spatio-temporal concentration dynamics, which require large sampling and analytical efforts to obtain representative data on the actual water quality.
Models can also be used to predict to what degree streams are affected by these pollutants. However, spatially distributed modelling of water quality is challenging for a number of reasons. Key issues are the lack of such models that incorporate both urban and agricultural sources of organic micropollutants, the large number of parameters to be estimated for many available water quality models, and the difficulty to transfer parameter estimates from calibration sites to areas where predictions are needed.
To overcome these difficulties, we used the parsimonious iWaQa model that simulates herbicide transport from agricultural fields and diffuse biocide losses from urban areas (mainly façades and roof materials) and tested its predictive capabilities in the Rhine River basin. The model only requires between one and eight global model parameters per compound that need to be calibrated. Most of the data requirements relate to spatially distributed land use and comprehensive time series of precipitation, air temperature and spatial data on discharge. For larger catchments, routing was explicitly considered by coupling the iWaQa to the AQUASIM model.
The model was calibrated with datasets from three different small catchments
(0.5–24.6 km
Limited spatio-temporal data, first on the use of the selected pesticides and second on their concentrations in the river network, restrict the possibilities to scrutinize model performance. Nevertheless, the results strongly suggest that input data and model structure are major sources of predictive uncertainty. The latter is for example seen in background concentrations that are systematically overestimated in certain regions, which is most probably linked to the modelled coupling of background concentrations to land use intensity.
Despite these limitations the findings indicate that key drivers and processes are reasonably well approximated by the model and that such a simple model that includes land use as a proxy for compound use, weather data for the timing of herbicide applications and discharge or precipitation as drivers for transport is sufficient to predict the timing and level of peak concentrations within a factor of 2 to 3 in a spatially distributed manner at the scale of large river basins.
Mankind uses thousands of synthetic chemicals for many different purposes in households, industries or agriculture (Schwarzenbach et al., 2006; Bernhardt et al., 2017). Many of these compounds reach water bodies during some stage of their life cycle. Accordingly, the impairment of water quality caused by substances such as pharmaceuticals, household chemicals or pesticides is a problem of many catchments worldwide. From an ecological point of view, pesticides are often of special concern because they have been designed to harm a wide range of organisms.
Pesticides are used for different purposes. In agriculture, they are used to protect crops from weeds, pests or diseases. However, the same compounds may be also used to fight unwanted organisms on materials such as roofs, façades or ships. Depending on where pesticides are used, they may reach water bodies via different pathways. Although pesticides may be ecotoxicologically relevant chemicals even in treated wastewater discharged from point sources (Munz et al., 2017; Müller et al., 2002), diffuse pollution is often dominant for these compounds (Moschet et al., 2014). The quantification of this form of pollution in streams is challenging due to the high spatio-temporal concentration dynamics, which require large sampling and analytical efforts (e.g., Wittmer et al., 2010; Leu et al., 2004b).
As a consequence, the water quality status of many water bodies is not quantified sufficiently for properly addressing management and research questions that require a sound understanding about spatio-temporal patterns of pesticides occurring in streams. There may be deficits with regard to the spatial or temporal coverage of data as well as coverage of all chemicals of interest (Moschet et al., 2014).
Spatially (semi-)distributed models can potentially fill such gaps and have been developed and used for decades to do so (Borah and Bera, 2004). Some of these models (e.g., SWAT, Arnold et al., 2011, MONERIS, Behrendt et al., 2002, GREAT-ER, Kehrein et al., 2015, or MACRO, Steffens et al., 2015; Larsbo et al., 2005) have been widely used, and many others have been developed and used in specific research contexts (e.g., ZIN-AgriTra, Gassmann et al., 2013, SPIDER, Renaud et al., 2008; Villamizar and Brown, 2017, or DRIPS, Röpke et al., 2004). One of the challenges related to modelling diffuse pesticide losses is the necessity to cover all relevant sources and flow paths. Many models, for example, do not simulate urban and agricultural processes with the same level of detail. This may pose a serious problem in regions that are characterized by a mixed land use of urban and agricultural areas such as in many parts of densely populated central Europe.
Models differ widely in the degree to which they aim to represent the relevant processes explicitly. On the one hand, so-called physically based models try to describe them with equations in such a way that the model parameters should have a real physical, chemical, or biological meaning independent of the model application with the goal to provide causal system understanding (Bossel, 1994; Beck, 1987). Generally, running such highly parameterized models comes with a huge data demand, and – as this demand usually cannot be covered – many model parameters cannot be estimated from independent observations. In the end, this leads to either the use of potentially unrealistic parameter values or calibration, the latter facing the problem that many of the parameter values cannot be properly identified, possibly inducing large uncertainties during a validation or prediction phase (Beck, 1987; Brun et al., 2001).
On the other hand, more conceptual, parsimonious models try to cope with the lack of (spatially distributed) data by dramatically reducing the number of parameters. This comes at the cost that model parameters may lose their direct physical or chemical interpretation. Such parsimonious models basically assume that essential aspects of the response of a complex (real) system can be represented by some rather simple mathematical descriptions that incorporate the effects of major external drivers, such as precipitation. Such types of models are frequently used in hydrology for simulating discharge (e.g., Beven and Kirkby, 1979) as well as for water quality simulations (Hahn et al., 2013; Jackson-Blake et al., 2017), but only a few models are used for simulating pesticide transport to surface waters (Honti et al., 2017).
Here we present a model that covers major urban and agricultural sources for pesticides in streams that can be applied to large water basins, provides high spatial and temporal resolution (hourly to daily), and is still parsimonious. It is similar to the iWaQa model approach in Honti et al. (2017) but adapted for large basins by including an explicit routing component by coupling it to the AQUASIM model. It differs from many other model concepts in that it does not include a rainfall–runoff module but directly links agricultural pesticide losses in a novel way to measured discharge and urban biocide losses directly to precipitation.
Specifically, the paper has the following objectives:
description of the model concepts and their implementation; calibration of the model on selected small catchments and selected
pesticides representing agricultural herbicides and urban biocides; evaluating the performance of the calibrated model with blind predictions on
a large set of validation catchments – this step includes a pronounced spatial
upscaling of the model by 3 to 4 orders of magnitude.
We have used the Rhine basin upstream of Emmerich (see Fig. 1) as a case
study to investigate these questions. Due to lack of data, the biocide part
was only tested within Switzerland.
Map of the Rhine basin. The study area covers the part upstream of Emmerich indicated by the red circle. The different colours represent the sub-basins according to the International Commission for the Protection of the Rhine (ICPR) with the an additional distinction of the Aare basin in Switzerland. Base data: Vogt et al. (2007); Swisstopo (2007). AT: Austria, BE: Belgium, CH: Switzerland, DE: Germany, FL: Liechtenstein, FR: France, IT: Italy, LU: Luxembourg, NL: the Netherlands.
The study is carried out in the Rhine basin upstream of the gauging station
Emmerich am Rhein (Germany; see Fig. 1). We limited the analysis to this part
of the basin because the model structure does not cover complex, strongly
managed flow regimes such as those that are prevalent in the Dutch part of
the basin. Even with these
restrictions, the study area is one of the largest drainage basins in Europe,
with an area of 160 000 km
Altitude ranges from above 4200 m a.s.l. in the Bernese Alps in the south to 17 m a.s.l. at Emmerich in the north. Accordingly, the hydrological regime varies strongly across the basin. The discharge regime in the southern part of the basin is largely influenced by snow accumulation and melt. As a consequence, most southern rivers are of pluvio-nival type with low water periods during winter and flood events occurring mainly in summer. In contrast, sub-basins further north (Neckar, Main, Moselle, Ruhr, etc.) are characterized by a pluvial regime with winter floods and low water levels in summer. Similarly, temperature regimes show important differences, which may be reflected in shifts in phenology of crops and hence in application periods of agricultural pesticides.
The basin is densely populated (290 inhabitants km
The Rhine River is heavily used by hydropower plants upstream of Iffezheim along the main channel and main tributaries. However, the effects on travel times is rather moderate on the streams of interest and was neglected during the routing calculations.
The river basin is discretized into sub-catchments based on the CCM2
database. To reduce the number of sub-catchments and ensure a reasonable
minimum size, CCM2 catchments smaller than 2 km
The model consists of two principal components. The first component – the
We assume that sub-catchments are laterally disconnected from each other, and therefore simulations of the substance transfer module can be run separately for each sub-catchment. Subsequently, the routing module integrates all outputs of the substance transfer module by processing sub-catchments from up- to downstream.
For the routing, the main river (and optionally also tributaries) is split into river segments (see Sect. S10, Fig. S9). Each segment receives input from upstream and lateral directions as well.
Global model parameters.
This module consists of several independent parts that describe the transfer of chemicals from the different pesticide sources in the catchment. In particular, it consists of the iWaQa model, which describes substance transfer for herbicides (Sect. 3.2.1) and for biocides (Sect. 3.2.2). These models treat sub-catchments as spatially lumped units. The models are very parsimonious, such that they only require one to eight empirical yet global model parameters per simulated chemical (Table 1). All other model inputs consist of (generally) available statistical data on chemical consumption, spatial data on land use and hydro-climatic time series.
This section describes first the system of the herbicide model and subsequently the input and output of the system.
This model consists of two spatially lumped storage terms representing the
dissolved and sorbed fractions of the total herbicide mass
The exchange between the two storages is described by two kinetic rate
parameters: sorption to the soil matrix is described with the transfer rate
The mass balance and the two first-order differential equation describing the
change in stock of herbicide masses
Crop development and hence also the timing of herbicide applications is
strongly controlled by temperature conditions in any particular year. As
application dates are generally unknown, a temperature sum model is used to
simulate crop growth and the related herbicide applications, which is linked
to specific growth stages of the crops. In particular, we assume that
application of herbicides starts when the daily temperature sum at a given
location reaches a crop-specific temperature threshold (Honti et al., 2017).
Daily mean values of temperature are summed up
The concept to describe the transfer of the applied herbicides from the
fields to the river is based on the empirical observation that herbicide
concentrations increase with flow during discharge events during the
application period (Leu et al., 2010). Mechanistically this can be explained
by the occurrence of fast transport processes (with high herbicide
concentrations) such as surface runoff and fast subsurface flow through
drainage systems or macropores (Leu et al., 2004a) during discharge events.
Hence the concentration (
The proportionality coefficient depends on
Certain herbicides are present in significant concentration outside of the
application period too (see for example, Leu et al., 2004a). Therefore, we
added a constant background concentration (
Biocides are applied in the urban settlement on façades, flat roofs, basement seals and underground parking lots. Due to the potential year-round application and the long-term protection purpose of biocides, it is assumed that the stock in the urban settlement is constant over time (Wittmer et al., 2010).
The leaching of biocides in urban areas is a complex process and several
studies provide quantitative information on loss rates, dynamics and driving
factors (Jungnickel et al., 2008; Burkhardt et al., 2008; Wittmer et al.,
2011). The process is mainly driven by precipitation that occurs when water
flows over the treated surfaces, and it was observed that concentration
patterns of urban compounds follow the rainfall pattern more than the river
discharge (Wittmer et al., 2010). Therefore, the current model simplifies
the processes by assuming that the release is proportional to precipitation
and assuming instantaneous transport to the rivers. The following equation
thus describes the resulting modelled load
Concentrations of micropollutants at the outlet of any catchment composed of several sub-catchments are predicted by aggregating the loads from the output of the substance transfer module and division by the actual total discharge. The approach considers the local availability of sources and the spatial distinctions of the driving factors (discharge or precipitation). However, instantaneous aggregation assumes no in-stream losses, such as degradation, sedimentation or diffusion taking place during the transport. Furthermore, it implies that the temporal resolution should be larger than the longest travel time of a component during a rain or discharge event. Otherwise the concentration dynamics are affected.
A special situation is caused by the presence of the large pre-alpine lakes (Lake Constance, Lake Lucerne etc.) in the river network. Because of the long water residence time in these water bodies (months to years), the concentration dynamics in the lake outlet are strongly dampened and differ substantially from other river sections. To account for these different dynamics, we simulated the input into each of these lakes separately by the substance transfer module. We assumed complete mixing into the volume corresponding to 1 year of discharge and used the resulting concentration as a constant value in the river water flow out of the respective lakes. The load varied accordingly with discharge from the lake. A different case was Lake Biel, which was not treated as a mixing reactor because of the short spatial distance between the inflow and the outflow of the river Aare.
In larger river basins the effects of travel time, dispersion and degradation during pollutant transport in the river system become more important. The assumption of instant arrival of pollutants at the outlet within daily time steps does not hold true anymore and hydraulic routing becomes indispensable.
To that end the load output from the substance transfer module was used as input into the program AQUASIM (Reichert, 1994) that was used for describing the transport and fate processes within the main rivers. Flow was described with the kinematic wave approximation of the Saint Venant equations. Transformation and sedimentation through sorption was neglected because the model compounds are sufficiently stable and show only weak sorption.
Characterization of calibration and validation catchments. NADUF: National long-term surveillance of Swiss rivers, NAWA SPEZ: NAWA SPEZ: National Surface Water Quality Monitoring Programme – Special Campaigns, IRMS: International Rhine Monitoring Station (Basel).
Hourly discharge data were obtained for 1033 stations from federal and national agencies (see Sect. S4, Table S2) to derive two kinds of discharge time series for all sub-catchments. The first, termed local runoff, refers to surface and subsurface runoff originating from the specific sub-catchment and is used in the substance transfer module for herbicides. The other is the streamflow at the outlet of a sub-catchment required in the routing module to calculate the concentrations of catchments or as input to AQUASIM. For headwater sub-catchments without any further upstream connections, the local runoff is identical to the streamflow.
Time series of local runoff are derived from the records of gauging stations measuring rivers with a Strahler stream order (Strahler, 1957) less than 5 (804 out of 931 or 86 % of the available gauging stations). Using gauging stations at larger rivers would not accurately reproduce the high temporal variations of the local runoff. The recorded discharge is allocated to the sub-catchments upstream according to the drainage area ratio method, which assumes that discharge scales proportionally to catchment area (Hirsch, 1979). Unfortunately, many sub-catchments remain ungauged hereby. On the one hand this method does not provide time series for sub-catchments downstream of the stream gauges with Strahler order larger than 4, on the other hand numerous ungauged tributaries join the river network downstream of the selected stream gauges. In both cases a nearby reference station (with Strahler order < 5) is selected and the area ratio method is applied to calculate local runoff. Selection of the reference stations is based on the map-correlation method from Archfield and Vogel (2010). This geostatistical method calculates the correlation between discharge time series at observed stream gauges and estimates the station with the most correlated discharge at the ungauged catchment.
The stream flow time series for all sub-catchments were deduced in a similar way. Upstream of stream gauges with Strahler order less than 5, the discharge is allocated according to the drainage area ratio and accumulated downstream. The discharge of any stream gauge is passed on to the downstream sub-catchments and accumulated with the streamflow of converging tributaries. Likewise to the local runoff, the streamflow for ungauged tributaries is adapted from reference stations selected with the map-correlation method.
Hourly precipitation data for the study area are available for Switzerland
from MeteoSwiss CombiPrecip (Sideris et al., 2014) and for the rest of the
Rhine basin from RADOLAN (Bartels et al., 2004), a product of the German
Meteorological Office (DWD). Both are raster datasets (with a spatial
resolution of 1 km
Raster temperature data with daily mean values are retrieved from MeteoSwiss
TabsD (Begert et al., 2003) with a spatial resolution of 0.02
Given that the Swiss temperature dataset has a finer grid size than the
average area of the sub-catchments (8.8 km
Herbicides are applied on specific crops, and therefore detailed, spatially
distributed agricultural land use data were required. The dataset
“Agricultural Landuse2000” from the JRC AFOLU project (Leip et al., 2008)
classifies agricultural land use into 30 crops and for a grid with a
resolution of 1 km
The European dataset on
agricultural land use does not cover Switzerland. In order to have a dataset
with the same crop categories and a similar spatial resolution, a harmonized
dataset was created from the Land Use Statistics of Switzerland (Swiss
Federal Statistical Office FSO, 2012) and the census of agricultural
enterprises (Swiss Federal Statistical Office FSO, 2011). The cultivation
areas of 60 listed crops reported in each municipality in the census were
distributed on the grid cells of the Land Use Statistics belonging to the
three agricultural land use classes, leading to an average fraction of
cultivated area of crop
Land cover of housing and settlements is available with vector-based maps,
where every building is precisely represented by a polygon and in some cases
with knowledge about its height.
France: Institute géographique nationale (IGN) BD TOPO®
(with height); Germany: Arbeitsgemeinschaft der Vermessungsverwaltungen Deutschland (ADV)
ALKIS®; Luxembourg: Administration du Cadastre et de la Topographie (ACT)
BD-L-TC; Switzerland: Federal Office of Topography (swisstopo) swissTLM 3D (with
height).
Façade surfaces are calculated by multiplying the contours of buildings
with their height where available (CH, FR). For the other countries (DE, LU)
the façade areas within a sub-catchment are estimated from the footprints
areas and the population. Footprint and façade follow a linear relation,
whereas the relationship between population
Five model compounds (see Table S1) have been selected for this study: three agricultural herbicides (isoproturon: IPU, S-metolachlor: MEC, terbuthylazine: TBA) and two (dual-use) biocides (carbendazim: CBZ, diuron: DIU). The biocides are mainly used in urban environments to protect materials. They may also have some agricultural use in some regions of the basin (e.g., in Switzerland) but the usage is of minor relevance and is neglected here.
Use and consumption data for the chemicals are not available in a spatially distributed manner. To provide input for all spatial model units, we proceeded in two steps. First, we obtained statistical data on pesticide use and consumption for the relevant regions or countries. Subsequently, we downscaled these statistical data based on land use or population.
Annual sales data of herbicides were available from the countries of Switzerland
(Agroscope ZA-AUI, Spycher and Daniel, 2013) and Germany (Federal Office of
Consumer Protection and Food Safety, Federal Office of Consumer Protection
& Food Safety BVL 2008–2012) as well as the French regions Alsace (Office
national de l'eau et des milieux aquatique, Office national de l'eau et des
milieux aquatique ONEMA, 2014) and Lorraine (Groupe Régional d'Action
contre la Pollution Phytosanitaires des Eaux Lorraine, Groupe Régional
d'Action contre la Pollution Phytosanitaires des Eaux Lorraine GRAPPE
Lorraine, 2005) for the years 2008–2012 (except for the study for Lorraine, which was
only issued for 2005). The spatial coverage area of the statistics varied
strongly, ranging from 357 300 km
Only one source for the use and sale of biocides was at hand. The survey of Burkhardt and Dietschwiler (2013) investigated the consumption rates in Switzerland of various biocides in antifouling paints, masonry and wood protection agents. The use rates have been applied to the entire study area.
The mass distributed on the agricultural fields respectively applied on
houses of each catchment was estimated by downscaling regional or national
sales data
To calibrate the model, data from field studies were used that simultaneously provided
data on application amounts of substances as well as on losses
to the rivers. Such studies are rare and we used the following studies
situated in the north-eastern part of Switzerland. The sampling campaigns from
Gomides Freitas et al. (2008) and Doppler et al. (2012) measured herbicide
concentrations at the small-scale agricultural catchments Summerau and
Ossingen, respectively, after a controlled herbicide application. Wittmer et
al. (2010) monitored the mass and dates of herbicide applications in a
slightly larger catchment, Mönchaltorf (25 km
The substance-specific parameter sets for herbicides
The model parameters were calibrated using a Bayesian inference approach. The
likelihood function accounted for the parameter uncertainty and the
structural model errors. For herbicides, model errors were assumed to deviate
stronger during the application season. Therefore an error-scaling function
was added, depending on the substance input to the system, as well as a
driver imitating the approximate substance application to the fields. The
error scaling function makes the standard deviation of herbicide errors
proportional to the remaining field stock to reflect that errors are larger
in the application period than afterwards, when the compound is present in
negligible amounts. The additional parameters to calibrate, resulting from
the error-scaling function, were
Measured peak concentrations of herbicides in the calibration studies occurring before the monitored application period were excluded from the calibration procedure as they represent accidental spills or runoff from hard surfaces. As such events are not represented in the model, including them would have spoiled the identification of model parameters.
The likelihood function used in this study is based on the assumption that
Box–Cox-transformed (Box and Cox, 1964) time series of concentration data
The Jacobian of the transformation
Calibration and validation catchments in Switzerland. Base data: Vogt et al. (2007); Swisstopo (2007).
Priors for the substance-specific loss rates were estimated based on reported
information in the calibration studies (see Sect. S8, Table S4). Estimation
for the substance-specific
Priors for the substance-specific loss rates of the biocide model were the
total loss rate reported in Wittmer et al. (2010) divided by the yearly sum
of precipitation. Having multiple study catchments or ranges of loss rates
allowed a distribution of the priors for
Prior distributions for the parameters describing pesticide fate in the soil
(
The maximum of the posterior parameter distribution was found by performing a Nelder–Mead simplex optimization. The maximum likelihood parameter set was used as a prior for the Markov chain–Monte Carlo (MCMC) simulation using the Metropolis algorithm (Gamerman, 1997). The developed posterior parameter distributions were used to predict the parameter and model uncertainty. The procedure was repeated for every calibration site separately.
Several comprehensive sampling campaigns from the Swiss “National Surface Water Quality Monitoring Program – Special Campaigns – NAWA SPEZ” (Federal Office for the Environment FOEN, 2013) and data from a continuous monitoring station were selected to evaluate the model.
The first campaign (NAWA SPEZ) comprised five catchments (Fig. 2) ranging
from 39 to 105 km
The second survey was the “National long-term surveillance of Swiss
rivers”, termed NADUF, where weekly or biweekly mixed-samples were taken during 2009
(Stamm et al., 2012). The monitoring sites were in the north-eastern
part of Switzerland and quantified the concentrations of several organic
micropollutants in five nested catchments. These nested catchments have a
large range of size from 74 to 14 718 km
A third validation was conducted with data for 2011 from the continuous
measurement program of the International Rhine Monitoring Station (IRMS) near
Basel. With five probes distributed over the cross section, daily
discharge-proportional pollutant levels are evaluated. The upstream area of
the Rhine at this point covers almost 36 000 km
Modelled hourly concentrations were adapted to the sampling periods of the respective validation surveys. According to the aggregation periods of mixed samples in the measurement surveys, the modelled concentrations were averaged over the sampling time periods, such that the resulting time series were fully comparable.
The issue of routing arises for larger catchments where the transport time is longer and also the processes along the way become more significant. For the sites of the NADUF survey the concentrations at the outlets were first modelled with load aggregation and in a second step river segments were defined where the routing with AQUASIM was calculated. Thus the influence of a physically based hydraulic routing can be compared to the situation where in-stream transport and processes are neglected.
In the case of the IRMS, measuring a large sub-basin of the Rhine, the catchment model is applied for 5950 sub-catchments. Downstream of the lakes the substance transport was modelled with AQUASIM for the larger rivers (such as the Rhine, Aare; Sect. S10, Fig. S9). The simplistic approach with load aggregation was applied on this large scale as well.
Examples of the comparison between simulated and observed concentration time series during the calibration step for each compound. IPU: isoproturon, MEC: S-metolachlor, TBA: terbuthylazine, CBZ: carbendazim, DIU: diuron. Moe: Mönchaltdorf, sum: Summerau. The full set of calibrations is shown in the Supplement (Fig. S12).
The calibrated model was finally applied to the Rhine and the major tributaries to characterize the pollutant dynamics of herbicides. These simulations were real predictions without any further adjustments of model parameters. Due to the lack of statistical input data on the use of biocides in France and Germany, predictions for the Rhine basin were not possible for carbendazim and diuron.
The iWaQa model is written in C
All modules are executable individually. Preprocessing succeeds within 30 min to sort the hourly input data for all 18 240 sub-catchments of the Rhine basin on an Intel x86 8-core processor. The substance transfer module takes approximately 1 h to run and sort the output by both sub-catchments and time steps. Run times of the routing options differ substantially depending on the size of the considered catchment and the parameterization of AQUASIM. Generally the load aggregation is calculated within a few minutes and the simulation of the main tributaries of the Rhine basin with AQUASIM is completed within 6 h.
Metrics used for quantifying model performance.
Besides the likelihood used for parameter calibration, there are many metrics for evaluating model performance of hydrological and water quality models (Jachner et al., 2007; Smith and Rose, 1995; Reusser et al., 2009; Moriasi et al., 2007). Out of those, we have selected some frequently used statistics (Table 3) that allow for a comparison with other studies. In addition, we have included some metrics that are more specifically designed to analyse aspects that are of special relevance for this work. These measures include the geometric reliability index (GRI) of the cumulative distribution of the simulated concentrations to see how well the overall concentration level is met or the fold difference between the observed and simulated maximum concentration during the simulation period (see Table 3).
The calibration was carried out for all catchment–compound combinations for which observations are available (see Table 2). For the agricultural herbicides this provides several alternative calibration sets (Tables S6, S7). The final decision of which set to use for further predictions was based on the performance in the validation step with the NAWA SPEZ sites (see below).
For the agricultural herbicides, the calibration resulted in a reasonable
simulation of the observed concentration dynamics (Fig. 3, Figs. S12,
S33–S34, Tables S8–S10). The calibrated uncertainty bands also followed the
expected seasonal patterns: they were large during the application periods
and decreased with time. The model, however, poorly captured the exact timing
of the concentrations, as one can see from the low Nash–Sutcliffe (NSE)
coefficients (ranging between
For the biocides, the model predicted a rather uniform distribution of
concentration peaks around the year, reflecting the precipitation patterns.
The observations, however, suggest a bi-modal seasonal pattern with higher
concentrations in spring and autumn. This pattern resulted in low
correlations (
Over- or underestimation of maximum concentrations (site–compound combinations) in percentage of the observations. For the herbicides, only the peaks during spring application were considered. IPU: isoproturon, MEC: S-metolachlor, TBA: terbuthylazine, CBZ: carbendazim, DIU: diuron. NADUF: National long-term surveillance of Swiss rivers, NAWA SPEZ: NAWA SPEZ: National Surface Water Quality Monitoring Programme – Special Campaigns, IRMS: International Rhine Monitoring Station (Basel).
The residuals pointed to systematic deviations between observed and modelled concentrations (Fig. S11). The data are grouped into two clusters. One of the clusters showed systematic underestimations of the observations, while the other showed the opposite. Comparison with the time series revealed on the one hand that for most compounds, the highest observed concentrations peaks were (substantially) underestimated during calibration (see for example S-metolachlor or terbuthylazin in Fig. 3). These peak concentrations were underestimated by 13 % to 83 % (Table 4). On the other hand, the second cluster of data points indicates that concentrations of some (smaller) events were overestimated. This pattern suggests that the model structure did not capture the full dynamic range of the pesticide concentrations.
Overview of the overall predictive power to simulate the concentrations levels during the calibration and validation phase as quantified by the geometric reliability index (GRI). A value of 1 (green horizontal line) indicates a perfect match; the larger the value, the stronger the deviation. Cal: calibration; Val: validation; NW: NAWA SPEZ: National Surface Water Quality Monitoring Programme – Special Campaigns, ND: National long-term surveillance of Swiss rivers (NADUF), RM: International Rhine Monitoring Station (Basel), Time: evaluation of concentration time series; -Cum: evaluation of cumulative concentration distributions (sorted according to size); blue: agricultural herbicides; red: dual-use (urban and agricultural) biocides.
Despite these limitations, the concentrations were reasonably well represented by the model. The GRI indicates that the predicted concentrations of the agricultural herbicides were within a range of 1.9 to 2.5 of the observations (Figs. 4–5). When the timing is ignored and only the accuracy of the simulation of the cumulative concentration distributions is considered – this is generally relevant for water quality assessment – these values range between 1.4 and 2.2. As can be seen from Figs. 4–5, the performance for the biocides was considerably poorer but the cumulative distribution was also reproduced better that the concentration time series.
Cumulative distribution of the fold difference between observed
and simulated concentrations
Based on the relative RMSE one can compare the calibration performance across sites. Mönchaltdorf and Summerau yielded better calibrations for S-metolachlor and terbuthylazin than the Ossingen dataset (Sect. S15, Table S8) The opposite was true for isoproturon. In the case of Ossingen, a long dry period followed after the isoproturon application, resulting in very low concentrations without a pronounced peak related to the recent application. This last aspect points to the fact that single-calibration datasets may represent special situations, hampering the predictive power during normal conditions. The application of S-metolachlor and terbuthylazine in Ossingen, for example, took place just before an intensive precipitation event. Through direct shortcuts, such as manholes of drainage systems and storm drains, the transfer to the river was accelerated and very high concentrations were measured (Doppler et al., 2012).
So far, we have compared the observations to the deterministic model predictions. Comparing the observations to the simulations including the prediction uncertainties due to the estimated parameter uncertainty (of the deterministic model) and the total predictive uncertainty accounting for input and model structure deficits reveals that the parameter uncertainty contributes only a small fraction. Taking into consideration all sources of uncertainty leads to uncertainty bands that include most of the observations, as can be seen from the cumulative concentration distributions depicted in Sect. S14 and Figs. S27–S28.
All calibrated parameters of the deterministic model had priors based on
physical reasoning or empirical data; hence, the maximum likelihood values are
not expected to deviate strongly. This held true for the decay rate, the loss
rates (
The different calibrated parameter sets were used to predict the corresponding concentrations for the validation case studies. To that end, the model output having a daily resolution was aggregated to the time periods of the real sampling strategies at the respective sites. In contrast to the calibration procedure, the validation step also included the simulation of the compound input. This included the estimation of the applied amounts and the timing of the applications.
For the agricultural herbicides, several calibration datasets were available. All of them were first tested on the NAWA SPEZ sites. Based on their performance, one set per compound was used for simulating the larger NADUF and IRMS sites. Based on the correlation coefficients and the NSE criterion the parameter sets calibrated at Mönchaltorf for the compounds isoproturon and terbuthylazine and the parameter set from Summerau for S-metolachlor were used for the other catchments (see Sect. S14, Table S8).
At the IRMS, the validation of the model was partially restricted due to the
low concentrations that often remained below the limit of quantification
(LOQ) of 5 ng L
The quality of the predictions varied between compound use classes and the validation catchments. The GRI values demonstrate that the agricultural herbicides were simulated better than were the biocides (Fig. 5, compare red and blue distributions). The (cumulative) distribution of observed concentrations was better represented by the model compared to the actual time series. Interestingly, the model performed better in the larger catchments (Fig. 5) despite the fact that calibrations were up-scaled to areas that are between 4 and 70 000 times larger than the calibration catchment (Table 2). This might be explained by averaging out regional differences and variabilities in local application dates – and hence also input uncertainty – across larger scales.
The quality of the predicted maximum concentrations changed from the calibration to the validation step. While the values were systematically underestimated during calibration, this pattern changed substantially for the validation. Depending on the site–compound combination, the maximum concentrations were either clearly under- or overestimated (Table 4). Irrespective of the sign of the deviation, the fold difference between observed and simulated concentrations – indicating by which factor observations were over- or underestimated – mostly ranged between 1 and 4 (Fig. 6). However, there were a few cases of extreme deviations because of either the observation of a pronounced and very high peak or very low measured values hardly exceeding the observed background. Again, the model performed better for the herbicides where for 50 % of the predictions (site–compound combinations) the maximum concentrations were predicted within a factor of 2.0 deviation from the observations. For the biocides, the value was larger (> 3.0). We also observed clear compound-specific patterns such as systematic overestimation of diuron concentrations (see e.g., Sect. S13, Fig. S32).
Comparison of predicted isoproturon concentrations along the River Rhine for 2011 compared to the observations at the measuring site at Lobith. ICPR: International Commission for the Protection of the Rhine.
As during the calibration step, the Nash–Sutcliffe values were low, pointing
again to the problem of properly simulating the exact timing of concentration
peaks (Fig. S33). This was very pronounced for the biocides. The correlation
coefficients provided a mixed picture. For some compounds
such as diuron, the correlations coefficient range between 0.29 and 0.68
(median
For the IRMS measuring site, we compared the performance of the simple load
aggregation procedure and the explicit routing with AQUASIM (see Table S10).
Differences between both approaches were moderate. The routing yielded better
results because some of the pronounced concentration peaks predicted by load
aggregation were substantially smoothed. Therefore, the maximum
concentrations were overestimated to a lesser degree. The median difference
between observed and simulated maximum concentrations with and without
routing were 3.1- and 3.4-fold, respectively (averages: 2.6 and 4.8,
respectively). The slightly better NSE values also suggest a better
performance with an explicit routing. These results provided evidence that at
the scale of such large basins of 30 000 km
Based on the findings reported above on the effects on routing, we only report the findings based on AQUASIM for the predictions of the main tributaries (Aare, Neckar, Main and Moselle) and the further measuring sites downstream of Basel. The total river length for which the routing was explicitly simulated with this module was 1773 km. We focus here on the three herbicides (isoproturon, S-metolachlor and terbuthylazine) because for them a minimum set of observations was available.
The observed isoproturon concentrations revealed the two peaks in spring and autumn, as also measured at the other locations (Fig. 6). The model predicted the timing of the spring peak very well. The absolute concentrations level of the peak was also simulated well (within 30 % of the observation). Concentrations during the summer months were slightly underestimated; the autumn peak was missed because no application was included in the model (see above).
The comparison of the simulated chemographs along the Rhine show some slight temporal shifts of the peaks caused by different application periods. Despite of the size of the basin, however, these shifts due to varying phenology were small, corresponding to a few days only.
The simulations show very similar patterns for the other two herbicides in
the different tributaries (see Sect. S13, Figs. S25–S26). The time shifts
between the different sub-basins were also very small. Unfortunately, these
findings cannot be tested against observations because the LOQ (10
and 50 ng L
In our simulations, we have assumed that the compounds behave like conservative tracers without degradation or sorption taking place. Although this is not completely true, the travel times through the river network are so short that relevant dissipation can be expected to be negligible for the model compounds considered in this paper.
We presented here a series of predictions for herbicide and biocide
concentrations in streams without any local calibration or model adaptations.
In this sense, the results correspond to predictions in ungauged catchments
covering tens of thousands of square kilometres based on calibration catchments
covering less than 30 km
Examples of reported Nash–Sutcliffe efficiency values and Pearson correlation coefficients between observations and simulations reported for a selection of water quality modelling studies. C: calibration, S: scenario calculations, V: validation.
Despite these positive aspects, one has to be clear about the limitations of
the model and the resulting predictions. Deficiencies are obvious when
evaluating the performance metrics. The NSE or correlation coefficients
obtained are low compared to values typically called satisfactory or good for
hydrological models. However, our results need to be put into the context of
comparable water quality studies dealing with diffuse pollution. As pointed
out by Pullan et al. (2016), there is a lack of studies in this field
reporting quantitative performance metrics such as NSE or
The fact that a parsimonious model such as the iWaQa model presented here was able to yield meaningful predictions suggests that the model concept represents the effects of the major drivers controlling the degree and dynamic of herbicide – and to some degree biocide – inputs into streams. It also indicates that these drivers remain constant over considerable spatial areas and that one can use findings from small study areas to extrapolate to larger basins as long as the first-order controls do not strongly change. For the iWaQa model as implemented here this means that the herbicide input for example is mainly triggered by discharge events. However, in drier regions it may be possible that point sources play a dominant role (Müller et al., 2002). In this case, the model concept had to be complemented to account for this input pathway, as discussed in Honti et al. (2017).
The observation that findings from small catchments can be extrapolated to larger areas in a meaningful manner may be considered a contradiction to earlier work where important spatial differences between herbicide loss rates within catchments were demonstrated (Doppler et al., 2014; Leu et al., 2010). However, the data suggest that spatial heterogeneity at small scales is averaged out at larger ones such that it does not dominate the large scale patterns.
Spatial and temporal resolution of input variables (blue), calibration data (yellow), validation data (red) and forecasts (green). Abbreviations: prec: precipitation, temp: temperature, herb: herbicide, cons: consumption, bioc: biocides, catch: catchment.
Despite the positive aspects mentioned above, there are several (major) model limitations one has to be aware of. First, the parsimonious and empirical structure of the model requires compound-specific calibration. This generally implies that field data are available at the catchment scale with sufficiently well quantified input and output fluxes.
While this calibration step is also necessary for other (more complex) models there are also model limitations related to the model structure. During calibration, we have noticed that the model was not able to fully represent the observed concentration peaks (see Table 4). This suggests that the model structure misses important processes that control concentrations during rainfall events. A possible candidate for such a process is drift deposition on roads and subsequent runoff during rainfall (e.g., Lefrancq et al., 2013). Interestingly, this systematic problem during the calibration phase was only partially observed during validation. This was possibly due to the (much) larger scale of the validation catchments that average over many temporally independent application events.
Other structural model limitations are too-high herbicide background concentrations in some sub-basins, the lack of an isoproturon application in autumn, or seasonal biocide concentration peaks that were not represented by the model. These limitations were rather easy to identify but not very easy to solve. The first problem of herbicide background concentration levels would require a more explicit modelling of the long-term fate of these compounds in the coupled unsaturated and saturated zone. To keep such a model parsimonious one had to test whether these background concentrations could be empirically linked to some simple catchment characteristics. Second, the herbicide application in autumn is much more difficult to predict compared to the spring application because it not only depends on a single variable such as the temperature sum over the year but it is also influenced by all the climatic variables determining the time of harvest of the previous crop and potential intercropping. In the future, this deficit may be overcome by deriving a stochastic application model based on application data obtained from national surveys. Regarding the third aspect, the seasonal biocide patterns, we lack any information about biocide use on buildings that could explain the observed seasonality. Targeted surveys on actual use across the year might be a solution. Better input data could then allow further structural deficits of this very simple biocide model to be studied in more detail.
These examples demonstrate that the model limitations are often a mixture between a too-simplistic model structure and a lack of input data. This agrees with the findings from the error models (see Sect. S14 and Figs. S27–S28). The predictive uncertainty due to poorly identified parameters only explain a small fraction of the deviations between observations and the deterministic model predictions in the calibration phase. The estimated uncertainty for the full error model, however, covers most of the data. Nevertheless, one should not overstate this observation because the fraction of uncertainty assigned to different sources depends on how the error model was formulated (Honti et al., 2014).
One could conclude that a more complex model was warranted to overcome such
limitations. While this would be definitely worth considering one should be
aware of the severe limitations that come with the input uncertainty
regarding the chemicals to be modelled. To illustrate this point, we have
quantified the spatial and temporal density of input data needed for the
model (Fig. 7). Compared to the drivers of the hydrological part such as
precipitation the density of data on biocide and herbicide input into the
system was orders of magnitude lower. While there is hourly precipitation
data available on a
Given this level of input uncertainty, it comes as no surprise that the observed concentrations may be substantially over- or underestimated in a given sub-catchment. The degree of mismatch between observations and simulations was still in a range that allowed the model to be used as a screening tool for identifying potentially critical catchments in a basin. This was probably thanks to the widespread use of the selected model compounds. For less frequently used compounds, one can assume that the input estimates based on sales statistics would be even more uncertain due to, for example, region-specific application patterns. Accordingly, the predictive uncertainty would increase further.
Our findings suggest that even a very parsimonious model with a maximum of
eight global parameters that need to be calibrated is sufficient to capture
the key drivers and processes for diffuse agricultural herbicide and urban
biocide losses reasonably well, so as to predict levels of peak
concentrations within a factor of 2 to 4. This demonstrates that land use as
a proxy for compound use, weather data for the timing of herbicide
applications, and discharge or precipitation as drivers for fast transport are
first-order controls for diffuse pollution for the compounds in our study
area. The results further demonstrate that the impact of these factors can be
scaled spatially across at least 4 orders of magnitude (from < 3 km
At the same time the results also point to clear model deficiencies such as the simulation of background concentrations or the lack of the autumn application of certain herbicides. Unfortunately, the analysis of model performance is limited by the lack of adequate validation data that have to combine reliable information on the timing and amount of the chemicals used as well as on concentrations in the streams. Progress in modelling and in measuring will remain closely coupled in this area and mutually benefit from each other.
Finally, it should be recognized that despite using a very parsimonious model, collecting the necessary input data and bringing it into a consistent form across a large water basin such as the Rhine is very time consuming. Hence, sharing model codes and even more importantly the required data will benefit the scientific community by not having to re-invent the wheel.
The source code and the input data for the models have been placed in GitHub at
The supplement related to this article is available online at:
CS designed the initial study design. AM, RS, FF and CS continuously discussed and guided the project progress. DW, RS and AM prepared the input data. DW and AM did most of the model coding with essential support from MH. AM and CS did most of the data analysis, and figures were provided by RS, AM and CS. AM and CS prepared the paper with contributions from all co-authors.
Fabrizio Fenicia and Christian Stamm are both on the editorial board of HESS. All other authors declare that they have no conflict of interest.
The CrossWater project was financed by the Swiss National Science Foundation (grant no. 406140-125866) and builds on previous work funded by the Swiss Federal Office for the Environment (contribution Devon Wemyss). Hans-Peter Bader supported the project by regularly discussing important issues during the entire duration of the project. We would like to thank Mike Müller for getting us started in Python. Edited by: Nunzio Roman Reviewed by: three anonymous referees