Trajectories of nitrate input and output in three nested catchments along a land use gradient

Increased anthropogenic inputs of nitrogen (N) to the biosphere during the last few decades have resulted in increased groundwater and surface water concentrations of N (primarily as nitrate), posing a global problem. Although measures have been implemented to reduce N inputs, they have not always led to decreasing riverine nitrate concentrations and loads. This limited response to the measures can be caused either by the accumulation of organic N in the soils (biogeochemical legacy) or by long travel times (TTs) of inorganic N to the streams (hydrological legacy). Here, we compare atmospheric and agricultural N inputs with long-term observations (1970–2016) of riverine nitrate concentrations and loads in a central German mesoscale catchment with three nested sub-catchments of increasing agricultural land use. Based on a data-driven approach, we jointly assess the N budget and the effective TTs of N through the soil and groundwater compartments. In combination with long-term trajectories of the C–Q relationships, we evaluate the potential for and the characteristics of an N legacy. We show that in the 40-year-long observation period, the catchment (270 km²) with 60 % agricultural area received an N input of 53 437 t, while it exported 6592 t, indicating an overall retention of 88 %. Removal of N by denitrification could not sufficiently explain this imbalance. Log-normal travel time distributions (TTDs) that link the N input history to the riverine export differed seasonally, with modes spanning 7–22 years and the mean TTs being systematically shorter during the high-flow season than during low-flow conditions. Systematic shifts in the C–Q relationships were observed over time that could be attributed to strong changes in N inputs resulting from agricultural intensification before 1989, the breakdown of East German agriculture after 1989 and the seasonal differences in TTs.
A chemostatic export regime of nitrate was only found after several years of stabilized N inputs. The changes in C–Q relationships suggest a dominance of the hydrological N legacy over the biogeochemical N fixation in the soils, as we expected to observe a stronger and even increasing dampening of the riverine N concentrations after sustained high N inputs. Our analyses reveal an imbalance between N input and output, long time lags and a lack of significant denitrification in the catchment. All these suggest that catchment management needs to address both a longer-term reduction of N inputs and shorter-term mitigation of today's high N loads. The latter may be covered by interventions triggering denitrification, such as hedgerows around agricultural fields, riparian buffer zones or constructed wetlands. Further joint analyses of N budgets and TTs covering a greater variety of catchments will provide a deeper insight into N trajectories and their controlling parameters.


Introduction
In terrestrial, freshwater and marine ecosystems, nitrogen (N) species are essential and often limiting nutrients (Webster et al., 2003; Elser et al., 2007). Changes in the strength of their different sources, such as atmospheric deposition, wastewater inputs and agricultural activities, have caused major changes in the N cycle (Webster et al., 2003). Two major innovations from the industrial age in particular accelerated anthropogenic inputs of reactive N species into the environment: artificial N fixation and the internal combustion engine (Elser, 2011). As a result, the amount of reactive N that enters the element's biospheric cycle has doubled in comparison to the preindustrial era (Smil et al., 1999; Vitousek et al., 1997). However, the different input sources of N show diverging rates of change over time and space. While the atmospheric emissions of N oxides and ammonia have strongly declined in Europe since the 1980s (EEA, 2014), the agricultural N input through fertilizers has declined but is still at a high level (Federal Ministry for the Environment and Federal Ministry of Food, 2012). In the cultural landscape of Western countries, most of the N emissions to surface and groundwater bodies stem from diffuse agricultural sources (Bouraoui and Grizzetti, 2011; Dupas et al., 2013).
The widespread consequences of these excessive N inputs are significantly elevated concentrations of dissolved inorganic nitrogen (DIN) in groundwater and connected surface waters (Altman and Parizek, 1995; Sebilo et al., 2013; Wassenaar, 1995), leading to increased riverine DIN fluxes (Dupas et al., 2016) and causing the ecological degradation of freshwater and marine systems. This degradation is caused by the ability of N species to increase primary production and to change food web structures (Howarth et al., 1996; Turner & Rabalais, 1991). Coastal marine environments in particular, where nitrate (NO₃) is typically the limiting nutrient, are affected by these eutrophication problems (Decrem et al., 2007; Prasuhn and [...] measures can partly be explained by nutrient legacy effects, which stem from an accumulation of excessive fertilizer inputs over decades, creating a strongly dampened response between the implementation of measures and water quality improvement (van Meter & Basu, 2015). Furthermore, the multi-year travel times (TTs) of nitrate through the soil and groundwater compartments cause large time lags (Howden et al., 2010; Melland et al., 2012) that can substantially delay the riverine response to applied management interventions. For targeted and effective water quality management, we therefore need a profound understanding of the processes and controls of time lags of N from the source to groundwater and surface water bodies. Bringing together N balancing and accumulation with estimations of N TTs from application to riverine export can help to close this knowledge gap. Estimation of the water or solute TTs is essential for predicting the retention, mobility and fate of solutes, nutrients and contaminants at catchment scale (Jasechko et al., 2016).
Time series of solute concentrations and loads that cover both the input to the geosphere and the subsequent riverine export can be used not only to determine TTs (van Meter & Basu, 2017), but also to quantify mass losses in the export as well as the behaviour of the catchment's retention capacity (Dupas et al., 2015).
Knowledge of the TT of N would therefore improve the understanding of N transport behaviour, helping to define the fate of the N mass injected into the system and its contribution to the riverine N response. The mass of N being transported through the catchment storage can be referred to as hydrological legacy. Data-driven or simplified mechanistic approaches have often been used to derive stationary and seasonally variable travel time distributions (TTDs) using input and output signals of conservative tracers or isotopes (Jasechko et al., 2016; Heidbüchel et al., 2012) or chloride concentrations (Kirchner et al., 2000; Bennettin et al., 2015). Recently, van Meter & Basu (2017) estimated the solute TTs for N transport at several stations across a catchment located in Southern Ontario, Canada, showing decadal time lags between input and riverine exports. Moreover, systematic seasonal variations in the NO₃-N concentrations have been found, which were explained by seasonal shifts in the N delivery pathways and connected time lags (van Meter & Basu, 2017). Despite the determination of such seasonal concentration changes and age dynamics, there are relatively few studies focussing on their long-term trajectory under conditions of changing N inputs (Dupas et al., 2018; Howden et al., 2010; Minaudo et al., 2015; Abbott et al., 2018). Seasonally differing time shifts, resulting in changing intra-annual concentration variations, are of importance to aquatic ecosystem health and functionality. Seasonal concentration changes can also be directly connected to changing concentration-discharge (C-Q) relationships, a tool for classifying observed solute responses to changing discharge conditions and for characterizing and understanding anthropogenic impacts on solute input, transport and fate (Jawitz & Mitchell, 2011; Musolff et al., 2015).
Investigations of temporal dynamics in the C-Q relationship are a valuable addition to approaches based on N balancing only (e.g. Abbott et al., 2018) when evaluating the effect of management interventions.
The C-Q relationships can, on the one hand, be classified in terms of their pattern, characterized by the slope b of the ln(C)-ln(Q) regression (Godsey et al., 2009): enrichment (b > 0), dilution (b < 0) or constant (b ≈ 0) patterns (Musolff et al., 2017). On the other hand, C-Q relationships can be classified according to the ratio between the coefficients of variation of concentration (CV_C) and of discharge (CV_Q; Thompson et al., 2011). This export regime can be either chemodynamic (CV_C/CV_Q > 0.5) or chemostatic, where the variance of the solute load is dominated more by the variance in discharge than by the variance in concentration (Musolff et al., 2017). Both patterns and regimes are dominantly shaped by the spatial distribution of solute sources (Seibert et al., 2009; Basu et al., 2010; Thompson et al., 2011; Musolff et al., 2017). High source heterogeneity and consequently high concentration variability is thought to be characteristic for nutrients under pristine conditions (Musolff et al., 2017; Basu et al., 2010). It was shown in Germany and the United States that catchments under intensive agricultural use evolve from chemodynamic to more chemostatic behaviour regarding nitrate export (Thompson et al., 2011; Dupas et al., 2016). Several decades of human N inputs seem to dampen the discharge-dependent concentration variability, resulting in chemostatic behaviour, where concentrations are largely independent of discharge variations (Dupas et al., 2016). Thompson et al. (2011) also presented observational and model-based evidence of an increasingly chemostatic response of nitrate with increasing agricultural intensity. This shift in the export regimes is caused by a long-term homogenisation of the nitrate sources in space and/or in depth within soils and aquifers (Dupas et al., 2016; Musolff et al., 2017).
However, effective denitrification in the subsurface can create concentration variability over depth and flow path age and has thus been shown to result in chemodynamic exports even with intensive agriculture (Van der Velde et al., 2010; Musolff et al., 2017). Long-term N inputs lead to a loading of all flow paths in the catchment with mobile fractions of N and thereby to the formation of a hydrological N legacy (van Meter & Basu, 2015) and chemostatic riverine N exports. On the other hand, excessive fertilizer input is linked to the above-mentioned build-up of legacy N stores in the catchment, changing the export regime from a supply-limited to a transport-limited chemostatic one. This legacy is manifested as a biogeochemical legacy in the form of increased, less mobile, organic N content within the soil (Worral et al., 2015; van Meter & Basu, 2015; van Meter et al., 2017a). This type of legacy buffers biogeochemical variations, so that management measures can only show their effect if the built-up source is substantially depleted.
Depending on the catchment configuration, both forms of legacy, hydrological and biogeochemical, can exist with different shares of the total N stored in a catchment (van Meter et al., 2017a). However, biogeochemical legacy is hard to distinguish from hydrological legacy when looking only at time lags between N input and output or at catchment-scale N budgets (van Meter & Basu, 2015). One way to better disentangle the N legacy types is to apply the framework of C-Q relationships as defined by Jawitz & Mitchell (2011), Musolff et al. (2015) and Musolff et al. (2017). In the case of a hydrological legacy, strong changes in fertilizer inputs (such as increasing inputs in the initial phase of intensification and decreasing inputs as a consequence of measures) will temporarily increase spatial concentration heterogeneity (e.g. comparing young and old water fractions in the catchment storage), and therefore also shift the export regime to more chemodynamic conditions. On the other hand, a dominant biogeochemical legacy will lead to sustained concentration homogeneity in the N source zone in the soils and to an insensitivity of the riverine N export regime to fast changes in inputs.
Common approaches to quantify catchment-scale N budgets and to characterize legacy or to derive TTs are based either on data-driven (Worral et al., 2015; Dupas et al., 2016) or on forward modeling (van Meter & Basu, 2015; van Meter et al., 2017a) approaches. So far, data-driven studies have focused either solely on N budgeting and legacy estimation or on TTs. Here, we conducted a joint data-driven assessment of the catchment-scale N budget, the potential and characteristics of an N legacy, and the TTs of the riverine exported N. We utilized the trajectory of agricultural catchments in terms of C-Q relationships, their changes over longer time scales and their potential evolution to a chemostatic export regime.
The novel combination of long-term N budgeting, TT estimation and C-Q trajectory will help in understanding the differentiation between biogeochemical and hydrological legacy, both of which are reasons for missed targets in water quality management. This study addresses the following research questions:
1. How high is the retention potential for N of the studied mesoscale catchment and what are the consequences in terms of a potential build-up of an N legacy?
2. What are the characteristics of the TTD for N that links changes in the diffuse anthropogenic N inputs to the geosphere and their observable effect in riverine NO₃-N concentrations?
3. What are the characteristics of a long-term trajectory of C-Q relationships? Is there an evolution to a chemostatic export regime that can be linked to a biogeochemical or hydrological N legacy?
To answer these questions, we used time series of water quality data over four decades, available from a mesoscale German catchment, as well as estimated N input to the geosphere. We linked N input and output on annual and intra-annual time scales through consideration of N budgeting and the use of TTDs. This input-output assessment uses time series of the Holtemme catchment (270 km²) with its three nested sub-catchments along a land use gradient from pristine mountainous headwaters to a lower basin with intensive agriculture and associated increases in fertilizer applications. This catchment, with its pronounced increase in anthropogenic impacts from upstream to downstream, is quite typical for many mesoscale catchments in Germany and elsewhere. Moreover, this catchment offers a unique possibility to analyze the system response to strong changes in fertilizer usage in East Germany before and after reunification. We therefore anticipate that the improved understanding gained through this study in these catchment settings is transferable to similar regions. In comparison to spatially and temporally integrated water quality signals stemming solely from the catchment outlet, the higher spatial resolution with three stations and the unique length of the monitoring period allow for a more detailed investigation of the fate of N, and consequently the findings may provide guidance for effective water quality management.
[...] is representative of other German and central European regions showing similar vulnerability (Zacharias et al., 2011). The observatory is one of the meteorologically and hydrologically best-instrumented catchments in Germany (Zacharias et al., 2011; Wollschläger et al., 2017), and provides long-term data for many environmental variables including water quantity (e.g. precipitation, discharge) and water quality at various locations.
The Holtemme catchment has its spring at 862 m a.s.l. in the Harz Mountains and extends to the northeast into the Central German Lowlands, with an outlet at 85 m a.s.l. The long-term annual mean precipitation shows a remarkable decrease from the colder and humid climate in the Harz Mountains (1262 mm) down to the warmer and drier climate of the Central German Lowlands on the leeward side of the mountains (614 mm; Rauthe et al., 2013; Frick et al., 2014). Discharge [...]
The geology of the catchment is dominated by late Paleozoic rocks in the mountainous upstream part that are largely covered by Mesozoic rocks as well as Tertiary and Quaternary sediments in the lowlands (Frühauf & Schwab, 2008; Schuberth, 2008). Land use in the catchment changes from forests in the pristine, mountainous headwaters to intensive agricultural use in the downstream lowlands (EEA, 2012). According to Corine Land Cover (CLC) from different years (1990, 2000, 2006, 2012), the land use change over the investigated period is negligible. Overall, 60 % of the catchment is used for agriculture, with a crop rotation of wheat, barley, triticale, rye and rapeseed (Yang et al., 2018b), while 30 % is covered by forest (EEA, 2012). Urban land use occupies 8 % of the total catchment area (EEA, 2012), with two major towns (Wernigerode, Halberstadt) and several small villages. Two wastewater treatment plants (WWTPs) discharge into the river. The town of Wernigerode had its WWTP within its city boundaries until 1995, when a new WWTP was put into operation about 9.1 km downstream in the smaller village of Silstedt, replacing the old WWTP. The WWTP in Halberstadt was not relocated but was renovated in 2000. Nowadays, the total nitrogen load (TNb) in cleaned water is approximately 67.95 kg d⁻¹ (WWTP Silstedt: NO₃-N load 55 kg d⁻¹) and 35.09 kg d⁻¹ (WWTP Halberstadt: NO₃-N load 6.7 kg d⁻¹; mean daily loads 2014; Müller et al., 2018).
Referring to the last 5 years of observations, the NO₃-N load from wastewater made up 17 % of the total observed NO₃-N flux at the midstream station (see below) and 11 % at the downstream station. Despite this point source N input, the major nitrate contribution is due to inputs from agricultural land use (Müller et al., 2018), which is predominant in the mid- and downstream parts of the catchment (Fig. 1).
The Holtemme River has a length of 47 km. Along the river, the LHW Saxony-Anhalt maintains long-term monitoring stations, providing daily mean discharge and biweekly to monthly water quality measurements covering roughly the last four decades. Three of the water quality stations along the river were selected to represent the characteristic land use and topographic gradient in the catchment. From upstream to downstream, the stations are named Werbat, Derenburg and Nienhagen (Fig. 1) and are referred to in the following as Upstream, Midstream and Downstream. The pristine headwaters upstream represent the smallest (6 % of total catchment area) and the steepest area among the three selected sub-catchments, with a mean topographic slope about three times higher than in the downstream parts (DGM25; Table 1). According to the latest Corine land cover dataset (CLC, 2012; EEA, 2012), the land use is characterized by forest only. The larger midstream sub-catchment, which represents one third of the total area, is still dominated by forests, but with growing anthropogenic impact due to increasing agricultural land use and the town of Wernigerode. More than half of the agricultural land in this sub-catchment [...] (Table 1; S1.1). The largest sub-catchment (61 %) constitutes the downstream lowland areas, which are predominantly covered by Chernozems (Schuberth, 2008), among the most fertile soils in Germany (Schmidt, 1995). Hence, the share of agricultural land use in this sub-catchment is the highest (81 %) in comparison to the two upstream sub-catchments (EEA, 2012).

Nitrogen input
The main N sources were quantified over time to support the data-based input-output assessment and to address the three research questions regarding the N budgeting, effective TTs and C-Q relationships in the catchment.
A recent investigation in the study catchment by Müller et al. (2018) showed that the major nitrate contribution stems from agricultural land use and the associated application of fertilizers. This contribution is quantified by the N-surplus (also referred to as agricultural surplus), which reflects the N input that is in excess of crop and forage needs. For Germany, there is no consistent data set available for the N-surplus that covers all land use types and is sufficiently resolved in time and space. Therefore, we combined the available agricultural N input (including atmospheric deposition) dataset with another dataset of atmospheric N deposition rates for the non-agricultural land.
The annual agricultural N input for the Holtemme catchment was calculated using two different data sets of agricultural N-surplus across Germany provided by the University of Gießen (Bach & Frede, 1998; Bach et al., 2011). Surplus data [kg N ha⁻¹ a⁻¹] were available on the federal state level for 1950-2015 and on the county level for 1995-2015, with an accuracy level of 5 % (see Bach & Frede, 1998 for more details). We used the data from the overlapping time period (1995-2015) to downscale the state-level data (state: Saxony-Anhalt) to the county level (county: Harzkreis). Both data sets (the state level and the county level aggregated to the state level) show high correspondence with a coefficient of determination (R²) of 0.85, but they differ slightly in their absolute values (by 6 % of the mean annual values). The mean offset of 3.85 kg N ha⁻¹ a⁻¹ was subtracted from the federal state level data to yield the surplus in the county before 1995.
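The offset-based downscaling described above can be illustrated with a minimal Python sketch. The function name and all example values are hypothetical and are not taken from the N-surplus data sets; only the procedure (mean offset over the overlapping years, subtracted from the state-level series) follows the text.

```python
from statistics import mean


def downscale_state_to_county(state, county):
    """Extend a county-level series backwards using offset-corrected state data.

    state, county: dicts mapping year -> N-surplus in kg N ha^-1 a^-1.
    Returns (combined series, mean offset over the overlapping years).
    """
    overlap = sorted(set(state) & set(county))
    if not overlap:
        raise ValueError("the two series share no years")
    # Mean difference between state-level and county-level values.
    offset = mean(state[y] - county[y] for y in overlap)
    # Offset-corrected state values for all years ...
    combined = {y: v - offset for y, v in state.items()}
    # ... overwritten by the county values wherever they exist.
    combined.update(county)
    return combined, offset


# Hypothetical illustration values (not from the study).
state = {1993: 30.0, 1994: 32.0, 1995: 54.0, 1996: 56.0, 1997: 52.0}
county = {1995: 50.0, 1996: 52.5, 1997: 48.0}
series, offset = downscale_state_to_county(state, county)
```

A constant offset preserves the temporal dynamics of the state-level series, which is a reasonable simplification when the two data sets are highly correlated, as reported here.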
Both of the above datasets account for the atmospheric deposition, but only on agricultural areas. For non-agricultural areas (forest and urban landscapes), the N source stemming from atmospheric deposition was quantified based on datasets from the Meteorological Synthesizing Centre - West (MSC-W) of the European Monitoring and Evaluation Programme (EMEP). The underlying dataset consists of gridded fields of EU-wide wet and dry atmospheric N depositions from a chemical transport model that assimilates different observational records on atmospheric chemicals (e.g. Bartnicki & Benedictow, 2017; Bartnicki & Fagerli, 2006). This dataset is available at annual time steps since 1995, and at every 5 years between 1980 and 1995. Data between the 5-year time steps were linearly interpolated to obtain annual estimates of N deposition between 1980 and 1995. For years prior to 1980, we made use of global gridded estimates of atmospheric N deposition from the three-dimensional chemistry-transport model (TM3) for the year 1860 (Dentener, 2006; Galloway et al., 2004). In the absence of any other information, we performed a linear interpolation of the N deposition estimates between 1860 and 1980.
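The linear interpolation between sparse time steps can be sketched as follows. This is a minimal illustration in Python; the function name and the deposition values are hypothetical and do not reproduce the EMEP or TM3 data.

```python
def interpolate_annual(known):
    """Linearly interpolate a sparse {year: value} series to annual steps."""
    years = sorted(known)
    annual = {}
    for y0, y1 in zip(years, years[1:]):
        v0, v1 = known[y0], known[y1]
        for y in range(y0, y1):
            # Straight line between the two neighbouring support points.
            annual[y] = v0 + (v1 - v0) * (y - y0) / (y1 - y0)
    annual[years[-1]] = known[years[-1]]
    return annual


# Hypothetical 5-year deposition estimates in kg N ha^-1 a^-1.
deposition = {1980: 20.0, 1985: 18.0, 1990: 15.0, 1995: 12.0}
annual = interpolate_annual(deposition)
```

The same function would cover the 1860 to 1980 period by passing only those two support years.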
To quantify the net N fluxes to the soil for non-agricultural land use types, the terrestrial biological N fixation had to be added to the atmospheric deposition. Based on a global inventory of terrestrial biological N fixation in natural ecosystems, Cleveland et al. (1999) estimated the mean uptake for temperate (mixed, coniferous or deciduous) forests and (tall/medium or short) grassland as 16.04 kg N ha⁻¹ a⁻¹ and 2.7 kg N ha⁻¹ a⁻¹, respectively. The atmospheric deposition and biological fixation for the different non-agricultural land uses were added to the agricultural N-surplus to obtain the total N input per area. In contrast to the widely applied term net anthropogenic nitrogen input (NANI), we do not account for wastewater fluxes in the N input but rather focus on the diffuse N input and connected flow paths, where legacy accumulation and time lags between input and output potentially occur.

Discharge and water quality time series
Discharge and water quality observations were used to quantify the N load and to characterize the trajectory of NO₃-N concentrations and the C-Q trajectories in the three sub-catchments.
The data for water quality (biweekly to monthly) and discharge (daily) from 1970 to 2016 were provided by the LHW Saxony-Anhalt. The biweekly to monthly sampling was done at gauging stations defining the three sub-catchments. The data sets cover a wide range of instream chemical constituents including major ions, alkalinity, nutrients and in situ measured parameters (pH, O₂, water temperature, electrical conductivity). As this study only focuses on N species, we restricted the selection of parameters to nitrate (NO₃; Fig. 2), nitrite (NO₂; supplement, S1.2.2) and ammonium (NH₄; supplement, S1.2.1).

Figure 2: NO₃-N concentration and discharge (Q) time series: Upstream (a), Midstream (b) and Downstream (c).
Discharge time series at daily time scales were measured at two of the water quality stations (Upstream, Downstream; Fig. 2). Continuous daily discharge series are required to calculate flow-normalized concentrations (see the following section 2.3.2 for more details). To derive the discharge data for the midstream station and to fill measurement gaps at the other stations (2 % Upstream, 3 % Downstream), we used simulations from the grid-based distributed mesoscale hydrological model mHM (Samaniego et al., 2010; Kumar et al., 2013). Daily mean discharge was simulated for the same time frame as the available measured data. We used a model set-up similar to Müller et al. (2016), with robust results capturing the observed variability of discharge in the studied, nearby catchments. We note that the discharge time series were used as weighting factors in the later analysis of flow-normalized concentrations. Consequently, it is more important to capture the temporal dynamics than the absolute values. Nonetheless, we performed a simple bias correction by applying the regression equation of simulated and measured values to reduce the bias of the modelled discharge. After this correction, the simulated discharges could be used to fill the gaps in the measured data. The midstream station (Derenburg) for the water quality data is 5.6 km upstream of the next gauging station. Therefore, the nearest station (Mahndorf) with simulated and measured discharge data was used to derive the bias correction equation, which was subsequently applied to correct the simulated discharge data at the midstream station, assuming the same bias between modelled and observed discharges as at the gauging station. The study of Müller et al. (2018) indicated the dominance of N from diffuse sources in the Holtemme catchment, but also stressed an impact of wastewater-borne nitrate during low flow periods.
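The regression-based bias correction and gap filling can be sketched as below. This is a minimal illustration, assuming an ordinary least-squares line of measured on simulated discharge; the text does not specify the exact regression form, and the function names and values are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fill_gaps(measured, simulated, slope, intercept):
    """Keep measured values; replace gaps (None) by bias-corrected simulations."""
    return [m if m is not None else slope * s + intercept
            for m, s in zip(measured, simulated)]


# Hypothetical daily discharge in m^3 s^-1; None marks a measurement gap.
simulated = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [1.5, 2.5, None, 4.5, 5.5]

# Fit the regression on days where both series are available.
pairs = [(s, m) for s, m in zip(simulated, measured) if m is not None]
slope, intercept = fit_linear([p[0] for p in pairs], [p[1] for p in pairs])
filled = fill_gaps(measured, simulated, slope, intercept)
```

For a station without its own gauge, the same slope and intercept derived at a neighbouring gauge would be applied to the simulated series, mirroring the assumption of equal bias made in the text.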
Because our purpose was to balance and compare N inputs and outputs from diffuse sources only, the provided annual flux of total N from the two WWTPs was used to correct flow-normalized fluxes and concentrations derived from the WRTDS assessment. We argue that the annual wastewater N flux is robust enough to correct the flow-normalized concentrations, but it does not allow for the correction of measured concentration data on a specific day. Both treatment plants provided snapshot samples of both NO₃-N and total N fluxes to derive the fraction of N that is discharged as NO₃-N into the stream. This fraction is 19 % for the WWTP Halberstadt (384 measurements between January 2014 and July 2016) and 81 % for Silstedt (eight measurements from February 2007 to December 2017). We argue that the fraction of N leaving as NH₄, NO₂ and organic N does not interfere with the NO₃-N flux in the river due to the limited length, and therefore limited nitrification potential, of the reach of the Holtemme River impacted by wastewater (see also supplement, S1.2.3). We related the wastewater-borne NO₃-N flux to the flow-normalized daily flux of NO₃-N from the WRTDS method to get a daily fraction of wastewater NO₃-N in the river, which we used to correct the flow-normalized concentrations. Note that this correction was applied to the midstream station from 1996 on, when the Silstedt treatment plant was put into operation. At the downstream station, we additionally applied the correction for the Halberstadt treatment plant, renovated in the year 2000. Before that, we assume that wastewater-borne N dominantly left the treatment plants as NH₄-N (see also supplement, S1.2.1).
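The wastewater correction can be illustrated with a short Python sketch. It assumes that the flow-normalized concentration is reduced in proportion to the wastewater share of the flow-normalized NO₃-N flux; this proportional form, the function name and the river flux and concentration values are assumptions for illustration. Only the Silstedt total N load (67.95 kg d⁻¹) and its NO₃-N fraction (81 %) are taken from the text.

```python
def correct_for_wastewater(conc_fn, flux_fn, ww_total_n_flux, nitrate_fraction):
    """Remove the wastewater-borne share from a flow-normalized concentration.

    conc_fn: flow-normalized NO3-N concentration (mg L^-1)
    flux_fn: flow-normalized NO3-N flux in the river (kg d^-1)
    ww_total_n_flux: total N flux released by the WWTP (kg d^-1)
    nitrate_fraction: share of the WWTP total N released as NO3-N (0..1)
    Returns (corrected concentration, wastewater share of the river flux).
    """
    if flux_fn <= 0:
        raise ValueError("river flux must be positive")
    ww_nitrate_flux = ww_total_n_flux * nitrate_fraction
    # The share cannot exceed the total riverine flux.
    share = min(ww_nitrate_flux / flux_fn, 1.0)
    return conc_fn * (1.0 - share), share


# River values (3.0 mg L^-1, 300 kg d^-1) are hypothetical.
corrected, share = correct_for_wastewater(3.0, 300.0, 67.95, 0.81)
```

With these inputs, the wastewater NO₃-N flux is 67.95 × 0.81 ≈ 55 kg d⁻¹, which matches the NO₃-N load reported for the Silstedt plant.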
Based on the daily resolved flow-normalized and wastewater-corrected concentration and flux data, descriptive statistical metrics were calculated on an annual time scale. Seasonal statistics of each year were also calculated for winter (December, January, February), spring (March, April, May), summer (June, July, August) and fall (September, October, November).
Note that statistics for the winter season incorporate December values from the calendar year before.
Following Musolff et al. (2015, 2017), the ratio CV_C/CV_Q and the slope (b) of the linear relationship between ln(C) and ln(Q) were used to characterize the export pattern and the export regimes of NO₃-N along the three study catchments.
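The two C-Q metrics can be computed as in the following minimal Python sketch. The function names and the example data are hypothetical; the threshold of 0.5 for CV_C/CV_Q follows the regime definition given in the introduction, and the use of population standard deviations is an assumption.

```python
import math
from statistics import mean, pstdev


def cq_metrics(conc, discharge):
    """Return (slope b of ln(C) vs ln(Q), ratio CV_C / CV_Q)."""
    ln_c = [math.log(c) for c in conc]
    ln_q = [math.log(q) for q in discharge]
    mc, mq = mean(ln_c), mean(ln_q)
    # Least-squares slope of ln(C) against ln(Q).
    b = (sum((q - mq) * (c - mc) for q, c in zip(ln_q, ln_c))
         / sum((q - mq) ** 2 for q in ln_q))
    cv_c = pstdev(conc) / mean(conc)
    cv_q = pstdev(discharge) / mean(discharge)
    return b, cv_c / cv_q


def export_regime(cv_ratio, threshold=0.5):
    """Classify the export regime from the CV ratio."""
    return "chemodynamic" if cv_ratio > threshold else "chemostatic"


# Hypothetical data: a power-law C-Q relation with exponent 0.5 (enrichment).
q_demo = [1.0, 2.0, 4.0, 8.0]
c_demo = [2.0 * q ** 0.5 for q in q_demo]
b_demo, ratio_demo = cq_metrics(c_demo, q_demo)

# Hypothetical data: constant concentration, i.e. a chemostatic regime.
b_flat, ratio_flat = cq_metrics([3.0, 3.0, 3.0, 3.0], q_demo)
regime_flat = export_regime(ratio_flat)
```

In the power-law example the recovered slope equals the exponent, so b > 0 indicates an enrichment pattern; the constant-concentration example yields b = 0 and a CV ratio of zero.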

Input-output assessment: Nitrogen budgeting and effective travel times
The input-output assessment is needed to estimate the retention potential for N in the catchment as well as to link temporal changes in the diffuse anthropogenic N inputs to the observed changes in the riverine NO₃-N concentrations. The stream concentration of a given solute at any time is assumed to be the convolution of the TTD and the rainfall concentration throughout the past, as shown, for example, by Kirchner et al. (2000). This study applies the same principle with the N input as the incoming time series that, when convolved with the TTD, yields the stream concentration time series. We selected a log-normal distribution function (with two parameters, µ and σ) as the convolution transfer function, based on a recent study by Musolff et al. (2017), who successfully applied this form of transfer function to represent TTs. The two free parameters were obtained through optimization by minimizing the sum of squared errors between observed and simulated N exports. The form of the selected transfer function is in line with Kirchner et al. (2000), who stated that exponential TTDs are unlikely at catchment scale and that a skewed, long-tailed distribution is more plausible. Note that we used the log-normal distribution as a transfer function between the temporal patterns of input (N load per area) and flow-normalized concentrations on an annual time scale only, and not as a flux-conservative transfer function. TTDs were inferred based on median annual and median seasonal flow-normalized concentrations and the corresponding N input estimates. To account for the uncertainties in the flow-normalized concentrations, we additionally derived TTDs for the confidence bands of the concentrations (5th and 95th percentiles) estimated through the bootstrap method (see section 2.3.2 for more details). Here, we assumed that the width of the confidence bands provided for the annual concentrations also applies to the seasonal concentrations of the same year.
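The convolution approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the TTD is discretized at mid-year lags, a least-squares scale factor links the convolved input pattern to concentrations (reflecting the non-flux-conservative use of the transfer function), and a coarse grid search stands in for the unspecified optimizer. All names and the synthetic input series are hypothetical.

```python
import math


def lognormal_weights(mu, sigma, n_lags):
    """Discretized log-normal TTD at mid-year lags, normalized to sum to 1."""
    lags = [k + 0.5 for k in range(n_lags)]
    pdf = [math.exp(-(math.log(t) - mu) ** 2 / (2 * sigma ** 2))
           / (t * sigma * math.sqrt(2 * math.pi)) for t in lags]
    total = sum(pdf)
    return [p / total for p in pdf]


def convolve(inputs, weights):
    """Output in year i = sum over lags k of weight[k] * input[i - k]."""
    return [sum(w * inputs[i - k] for k, w in enumerate(weights) if k <= i)
            for i in range(len(inputs))]


def sse(mu, sigma, inputs, observed, n_lags):
    """Sum of squared errors after optimal linear scaling of the simulation."""
    sim = convolve(inputs, lognormal_weights(mu, sigma, n_lags))
    den = sum(s * s for s in sim)
    scale = sum(s * o for s, o in zip(sim, observed)) / den if den > 0 else 0.0
    return sum((scale * s - o) ** 2 for s, o in zip(sim, observed))


def fit_ttd(inputs, observed, n_lags=40):
    """Grid search for (mu, sigma) minimizing the SSE."""
    grid = [(m / 10, s / 10) for m in range(5, 36) for s in range(2, 16)]
    return min(grid, key=lambda p: sse(p[0], p[1], inputs, observed, n_lags))


def lognormal_mode(mu, sigma):
    """Mode of a log-normal distribution: exp(mu - sigma^2)."""
    return math.exp(mu - sigma ** 2)


# Synthetic input history (rise, plateau, collapse, partial recovery).
inputs = [20.0 + 4.0 * y for y in range(25)] + [110.0] * 12 + [25.0] * 6 + [50.0] * 17
# Synthetic "observations" generated with known parameters mu = 2.0, sigma = 0.5.
observed = convolve(inputs, lognormal_weights(2.0, 0.5, 40))
mu_hat, sigma_hat = fit_ttd(inputs, observed)
mode_hat = lognormal_mode(mu_hat, sigma_hat)
```

Because the synthetic observations are generated from a grid point, the search recovers the generating parameters; with real data a continuous optimizer and the bootstrap confidence bands would be needed to judge parameter uncertainty.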

Input assessment
In the period from 1950 to 2015, the Holtemme catchment received a cumulative diffuse N input (excluding the wastewater point sources) of 80 055 t, with the majority of this associated with agriculture-related N application (74 %). Within the period when water quality data were available, the total sum is 63 396 t, with 76 % agricultural contribution. The N input showed a remarkable temporal variability (see Fig. 6; purple, dashed line). From 1950 to 1976, the input was characterized by a strong increase (slope of linear increase = 2.4 kg N ha⁻¹ a⁻¹ per year) with a maximum annual agricultural input of 132.05 kg N ha⁻¹ a⁻¹ (1976), which is twenty times the agricultural input in 1950. After more than 10 years of high but more stable inputs, the N-surplus dropped dramatically with the peaceful reunification of Germany and the collapse of the established agricultural structures in East Germany (1989/1990; Gross, 1996). In the time period afterwards (1990-1995), the N-surplus was only one-sixth (20 kg N ha⁻¹ a⁻¹) of the previous input. After another 8 years of increased agricultural inputs (1995-2003) of around 50 kg N ha⁻¹ a⁻¹, the input slowly decreased with a mean slope of -0.8 kg N ha⁻¹ a⁻¹ per year, but showed distinctive changes in the input between the years.
The median N input upstream (53 t a⁻¹) is less than 7 % of the total catchment input (760 t a⁻¹). Hence, the input to the upstream area was only minor in comparison to the inputs further downstream, which are dominated by agriculture.
As land use change over the investigated period is negligible, the N input from biological fixation stayed constant.

Discharge time series and WRTDS results on decadal statistics
Discharge was characterized by a strong seasonality throughout the entire data record, which divided the year into a high flow season (HFS) during winter and spring, accounting for two-thirds of the annual discharge, and a low flow season (LFS) during summer and fall. Average discharge in the sub-catchments mainly reflects the strong spatial precipitation gradient across the study area, which lies on the leeward side of the Harz Mountains. The upstream sub-catchment contributed 21 % of the median discharge measured at the downstream station (Table 2). The midstream station, representing the cumulated discharge signal from the upstream and midstream sub-catchments, accounted for 82 % of the median annual discharge at the outlet. Although the upstream sub-catchment had the highest specific discharge, the major fraction of total discharge (61 %) was generated in the midstream sub-catchment. The seasonality in discharge was also dominated by this major midstream contribution, especially during high flow conditions. Conversely, during HFSs the median downstream contribution was less than 10 %, while during low flow periods it accounted for up to 33 % (summer). The flow-normalized NO₃-N concentrations in each sub-catchment showed strong differences in their overall levels and temporal patterns over the four decades (Fig. 3a; see also Fig. 2 and Fig. 6 for details). The smallest decadal concentration changes and the earliest decrease in concentrations were found in the pristine catchment. Median upstream concentrations were highest in the 1980s (1987), with a reduction to about one half in the later decades. Over the entire period, the median upstream concentrations were smaller than 1 mg L⁻¹, so that the described changes are small compared to the NO₃-N dynamics of the stations further downstream.
Strong changes over time were observed at the two downstream stations, with a tripling of concentrations between the 1970s and 1990s, when maximum concentrations were reached. While median concentrations Downstream decreased slightly after this peak (1995/1996), those Midstream (peak: 1998) stayed constantly high. At the end of the observation period, the median annual concentrations at the outlet (Downstream) did not decrease below 3 mg L⁻¹ NO₃-N, a level that was exceeded after the 1970s. The differences in NO₃-N concentrations between the pristine upstream and the downstream station evolved from an increase by a factor of 3 in the 1970s to a factor of 7 after the 1980s. Calculated loads (Fig. 3b) also showed a drastic change between the beginning and the end of the time series. The daily upstream load contribution was below 10 % of the total annual export at the downstream station in all decades, and the estimates decreased from 9 % (1970s) to 4 % (2010s). Between the 1970s and 1990s, the median daily load tripled Midstream (0.1 t d⁻¹ to 0.3 t d⁻¹) and more than doubled Downstream (0.2 t d⁻¹ to 0.5 t d⁻¹). In the 1990s, the Holtemme River exported on average more than 0.5 t d⁻¹ of NO₃-N, which, related to the agricultural area in the catchment, translates into more than 3.1 kg N km⁻² d⁻¹ (maximum 13.4 kg N ha⁻¹ a⁻¹ in 1995).

Input-Output-balance: N budget
We jointly evaluated the estimated N inputs and the exported NO₃-N loads to enable an input-output balance. This comparison on the one hand allowed for an estimation of the catchment's retention potential, and on the other hand enabled us to estimate future exportable loads. The load stemming from the most upstream, pristine catchment accounted for less than 10 % of the exported riverine load at the outlet. To focus on the anthropogenic impacts, the data from the upstream station are not discussed on their own in the following. At the midstream station, a total input of 16 441 t was compared with 4 109 t of exported NO₃-N for the overlapping time period of input and output. The midstream sub-catchment received 73 % (Table 3) more N mass than it exported at the same time. Note that the exported N is not necessarily the N applied in the same period due to the temporal offset, as discussed later in detail. With the assumption that 43 % (agricultural N input of sub-catchment N input) of the diffuse input resulted from agriculture, the sub-catchment exported 616 kg N ha⁻¹ (537-719 kg N ha⁻¹) from agricultural areas. The cumulated N input from the entire catchment (measured Downstream) from 1976 to 2015 (overlapping time of input and output) was 53 437 t, while the riverine export in the same time was only 12 % (6 kg N ha⁻¹ a⁻¹; 11-14 %), implying an agricultural export of 370 kg N ha⁻¹ (325-415 kg N ha⁻¹; Fig. 4). This mass discrepancy between input and output translates into a retention rate in the entire Holtemme catchment of 88 % (86-89 %). In relation to the entire sub-catchment area (not only agricultural land use), the annual retention rate of NO₃-N was around 28 kg N ha⁻¹ a⁻¹ (27-30 kg N ha⁻¹ a⁻¹) in the midstream sub-catchment and 59 kg N ha⁻¹ a⁻¹ (59-59 kg N ha⁻¹ a⁻¹) in the flatter and more intensively cultivated downstream sub-catchment.
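The budget arithmetic behind the export and retention percentages can be retraced with a short sketch. The function name is hypothetical; the midstream totals are those given above, and the downstream export of 6592 t is the figure stated in the abstract.

```python
def export_and_retention(n_input_t, n_export_t):
    """Return (export fraction, retention fraction) of a cumulative N budget."""
    export_fraction = n_export_t / n_input_t
    return export_fraction, 1.0 - export_fraction

# Midstream: 16 441 t input vs. 4 109 t riverine export (values from the text)
mid_export, mid_retention = export_and_retention(16441, 4109)

# Downstream: 53 437 t input; the export of 6592 t is taken from the abstract
down_export, down_retention = export_and_retention(53437, 6592)

print(f"Midstream export:   {mid_export:.0%}")
print(f"Downstream export:  {down_export:.0%}, retention: {down_retention:.0%}")
```

The resulting export fractions of roughly one quarter Midstream and one eighth Downstream correspond to the export rates of 25 % and 12 % discussed for the two stations.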

Effective TTs of N
We approximated the effective TTs for all seasonal NO₃-N concentration trajectories at the midstream and downstream stations by fitting the log-normal TTDs (Fig. 5; Table 4). Note that the upstream station was not used here due to the lack of temporally resolved input data on the atmospheric N deposition (estimated linear input increase between 1950 and 1979). In general, the optimized distributions were able to sufficiently capture the time lag and smoothing between the input and output concentrations (R² ≥ 0.72; see also supplement, S2.1, S2.2). Systematic differences between stations and seasons can be observed, best represented by the mode of the distributions (peak TTs). The average deviation between the best and worst case estimation of the fitted TTDs from their respective average value was only 4 % with respect to the mode of the distributions (Table 4).

Figure 5: Seasonal variations in the fitted log-normal distributions of effective travel times between nitrogen input and output responses for Midstream (a) and Downstream (b).
The TTDs for all seasons taken together showed longer TTs for the midstream in comparison to the downstream station. The comparison of the TTD modes for the different seasons Midstream showed distinctly differing peak TTs between 11 years (spring) and 22 years (fall), which represented a doubling of the peak TT. The fastest times appeared in the HFSs, while the modes of the TTDs were longer in the LFSs. Note that the shape factor σ of the effective TTs also changed systematically: the HFS spring exhibited a higher shape factor than the other seasons. This corresponds to a change in the coefficient of variation of the distributions Midstream from 0.6 in spring to 0.2 in fall.
The modes of the fitted distributions for the downstream station were shorter in each season than those at the midstream station. The mode of the TTs ranged between 7 years (spring) and 15 years (winter, fall). The shape factors of the fitted TTDs ranged between 0.8 (spring) and 0.3 (summer) for the downstream station. In summary, the HFS spring in both sub-catchments had shorter TTDs than the other seasons, and Midstream showed longer TTDs than Downstream.
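The relation between the mode, the shape factor and the other summary statistics of a log-normal TTD can be made explicit with the standard closed-form expressions. The sketch below assumes that the reported shape factor is the parameter σ of the underlying normal distribution; the function names are hypothetical.

```python
import math

def lognormal_summary(mu, sigma):
    """Closed-form summary statistics of a log-normal distribution."""
    return {
        "mode": math.exp(mu - sigma ** 2),          # peak travel time
        "median": math.exp(mu),                      # 50th percentile
        "mean": math.exp(mu + sigma ** 2 / 2),       # mean travel time
        "cv": math.sqrt(math.exp(sigma ** 2) - 1),   # coefficient of variation
    }

def mu_from_mode(mode, sigma):
    """Recover the location parameter mu from a reported mode and sigma."""
    return math.log(mode) + sigma ** 2

# Example: a peak travel time of 7 years with shape factor 0.8
# (the downstream spring values reported in the text)
summary = lognormal_summary(mu_from_mode(7.0, 0.8), 0.8)
```

Because the distribution is right-skewed, the median and mean travel times are always longer than the mode, and the gap widens with increasing σ. This is why a season with a short peak TT but a large shape factor can still carry a long tail of old water.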

Seasonal NO 3 -N concentrations and C-Q relationships over time
As described above, the Holtemme catchment showed a pronounced seasonality in discharge conditions, producing the HFS in December-May (winter + spring) and the LFS in June-November (summer + fall). Therefore, changes in the seasonal concentrations of NO₃-N are also reflected in the annual C-Q relationship. Analysing the changing seasonal dynamics therefore provides a deeper insight into N trajectories in the Holtemme catchment.

Figure 6: Annual N input (referred to the whole catchment, 2nd y-axis) to the catchment and measured median NO₃-N concentrations in the stream (1st y-axis) over time at three different locations. Upstream (a, d), Midstream (b, e), Downstream (c, f). Lower panels show plots of slope b vs. CV_C/CV_Q for NO₃-N for the three sub-catchments following the classification scheme provided in Musolff et al. (2015). X-axis gives the coefficient of variation of concentrations (C) relative to the coefficient of variation of discharge (Q). Y-axis gives the slope b of the linear ln(C)-ln(Q) relationship. Colours indicate the temporal evolution from 1970-2015 starting from red to yellow.
In the pristine upstream catchment, no temporal changes in the seasonal differences of riverine NO₃-N concentrations could be found (Fig. 6a). The C-Q relationship (Fig. 6d) also showed a steady pattern (moderate accretion) with the highest concentrations in the HFSs, i.e. winter and spring. The ratio CV_C/CV_Q indicates a chemostatic export regime and changed only marginally (amplitude of 0.2) over time.
At the midstream station (Fig. 6b), the early 1970s showed an export pattern with the highest concentrations during HFSs, similar to the upstream catchment, but with a general increase of concentrations from 1970-1995. During the 1980s, the increase of concentrations in the HFS was faster than in the LFS, which changed the C-Q pattern to a strongly positive one (b_max = 0.42, 1987; red to orange symbols in Fig. 6e). This development was characterized by a tripling of intra-annual amplitudes (C_spring - C_fall) of up to 2.4 mg L⁻¹ (1987). With a lag of around 10 years, the LFSs also exhibited a strong increase in concentrations in the 1990s (C_max = 3.1 mg L⁻¹, 1998, Fig. 6b). The midstream concentration time series shows bimodality. The C-Q relationships (Fig. 6e) evolved from an intensifying accretion pattern in the 1970s and 1980s (red to orange symbols in Fig. 6e) to a constant pattern between C and Q in the 1990s and afterwards (yellow symbols). The CV_C/CV_Q increased during the 1970s and afterwards decreased strongly by 0.4 between 1984 and 1995, showing a trajectory from a more chemostatic to a chemodynamic, and then back to a chemostatic export regime.
At the downstream station (Fig. 6c), the concentrations in the HFSs were found to be comparable to the ones observed at the midstream station. As seen Midstream, the N concentrations during the LFSs peaked with a delay compared to those of the HFSs. The resulting intra-annual amplitude showed a maximum of 2.4 mg L⁻¹ in the 1980s (1983/84), with strongly positive C-Q patterns (b_max = 0.4, 1985; red symbols in Fig. 6f). In contrast to the bimodal concentration trends in the midstream and downstream HFSs, the LFSs Downstream showed a unimodal pattern peaking around 1995/96 with concentrations above 6 mg L⁻¹ NO₃-N (C_max = 6.9 mg L⁻¹). In the 1990s, the concentrations in the LFSs were higher than those in the HFSs, causing a switch to a dilution C-Q pattern (orange symbols in Fig. 6f). Due to the strong decline of LFS concentrations after 1995 (Fig. 6c), the dilution pattern evolved to a constant C-Q pattern (yellow symbols in Fig. 6f) from the 2000s onward. After an initial phase with chemostatic conditions (1970s), the CV_C/CV_Q strongly increased to a chemodynamic export regime in the 1980s (max. CV_C/CV_Q = 0.8, 1984). Later on, CV_C/CV_Q declined by 0.8 between 1984 and 2001 (min. CV_C/CV_Q = 0.03), which indicates that the C-Q trajectory returned to a chemostatic nitrate export regime.
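The two metrics used to describe the export patterns and regimes, the slope b of the ln(C)-ln(Q) relationship and the ratio CV_C/CV_Q, can be computed as in the following minimal sketch. The function name and the use of ordinary least squares with population standard deviations are assumptions for illustration; the study's exact estimation procedure may differ.

```python
import math
import statistics

def cq_metrics(conc, discharge):
    """Return (b, cv_ratio) for paired concentration and discharge samples.

    b        : least-squares slope of ln(C) against ln(Q)
    cv_ratio : coefficient of variation of C divided by that of Q
    """
    ln_c = [math.log(c) for c in conc]
    ln_q = [math.log(q) for q in discharge]
    mean_c = statistics.fmean(ln_c)
    mean_q = statistics.fmean(ln_q)
    sxy = sum((x - mean_q) * (y - mean_c) for x, y in zip(ln_q, ln_c))
    sxx = sum((x - mean_q) ** 2 for x in ln_q)
    b = sxy / sxx
    cv_c = statistics.pstdev(conc) / statistics.fmean(conc)
    cv_q = statistics.pstdev(discharge) / statistics.fmean(discharge)
    return b, cv_c / cv_q
```

A positive b corresponds to the accretion (enrichment) pattern described above, a negative b to a dilution pattern, and a small CV_C/CV_Q to a chemostatic export regime. Applying the function to the samples of each year separately yields one point per year in the lower panels of Fig. 6.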

Catchment scale N budgeting
Based on the calculated budgets of N inputs and riverine N outputs for the three sub-catchments within the Holtemme catchment, we discuss here differences between the sub-catchments and potential main reasons for the missing part in the N budget: 1) permanent N removal by denitrification or 2) the build-up of N legacies.
The N load stemming from the most upstream, pristine catchment accounted for less than 10 % of the exported annual load over the entire study period. This minor contribution can be attributed to the lack of agricultural and urban land use as dominant sources of N. Consequently, the N export from the upstream sub-catchment was dominantly controlled by N inputs from atmospheric deposition and biological fixation.
The total input over the whole catchment area was quantified as more than 53 000 t N and, compared to the respective output over the same time period, yielded export rates of 25 % (22-29 %) at the midstream and 12 % (11-14 %) at the downstream station (Table 3). There can be several reasons for the difference in export rates between the two sub-catchments. The most likely ones are differences in discharge, topography and denitrification capacity among the sub-catchments, which are discussed in the following.
Load export of N from agricultural catchments is assumed to be mainly discharge-controlled. Many solutes show a lower variance in concentrations compared to the variance in stream flow, which makes the flow variability a strong surrogate for load variability (Jawitz & Mitchell, 2011). This can also be seen in the Holtemme catchment, which evolved over time to a more chemostatic export regime with high N loads (Fig. 6b). The highest N export and lowest retention were observed in the midstream sub-catchment, where the overall highest discharge contribution can be found. Besides discharge quantity, we argue that the midstream sub-catchment favors a more effective export of NO₃-N. The higher percentage of artificial drainage by tiles and ditches (59 % vs. 21 %; supplement, S1.1) as well as the steeper terrain slopes (3.2° vs. 1.9°) in the non-forested area of the midstream catchment promote rapid, shallow subsurface flows. These flow paths can more directly connect agricultural N sources with the stream and in turn cause elevated instream NO₃-N concentrations (Yang et al., 2018a). In addition, the steeper surface topography suggests a deeper vertical infiltration (Jasechko et al., 2016) and thereby a wider range of flow paths of different ages than those observed in the flatter terrain areas. Conversely, fewer drainage installations, a flatter terrain and thus generally shallower flow paths may decrease the N export efficiency (and increase the retention potential) Downstream.
The only process able to permanently remove N input from the catchment is denitrification in soils, aquifers (Seitzinger et al., 2006; Hofstra & Bouwman, 2005), and at the stream-aquifer interface such as in the riparian (Vidon & Hill, 2004; Trauth et al., 2018) and hyporheic zones (Vieweg et al., 2016). As the riverine exports are signals of the catchment or sub-catchment processes, integrated in time and space, separating a build-up of an N legacy from a permanent removal via denitrification is difficult. A clear separation of these two key processes, however, would be important for decision makers as both have different implications for management strategies and different future impacts on water quality. Even if groundwater quality measurements that indicate denitrification were available, using this type of local information for an effective catchment scale estimation of N removal via denitrification would be challenging (Green et al., 2016; Otero et al., 2009; Refsgaard et al., 2014). Therefore, we discuss the denitrification potential in the soils and aquifers of the Holtemme catchment based on a local isotope study and a literature review of studies in similar settings. A strong argument against a dominant role of denitrification is provided by Müller et al. (2018) for the study area. On the basis of a monitoring of nitrate isotopic compositions in the Holtemme River and in tributaries, Müller et al. (2018) stated that denitrification played no or only a minor role in the catchment. However, we still see the need to carefully check the potential of denitrification to explain the input-output imbalance considering other studies.
If 88 % of the N input (53 437 t, dominantly agricultural input) to the catchment between 1976 and 2015 (39 years) were denitrified in the soils of the agricultural area (161 km²), it would need a rate of 74.9 kg N ha⁻¹ a⁻¹. Considering the derived TTs, denitrification of the convolved input would need a slightly lower rate (66.7 kg N ha⁻¹ a⁻¹, 1976-2015). Denitrification rates in soils for Germany (NLfB, 2005) have been reported to range between 13.5 and 250 kg N ha⁻¹ a⁻¹, where rates larger than 50 kg N ha⁻¹ a⁻¹ may be found in carbon-rich and waterlogged soils in the riparian zones near rivers and in areas with fens and bogs (Kunkel et al., 2008). As water bodies and wetlands make up only 1 % of the catchment's land use (Fig. 1; EEA, 2012), and consequently the extent of waterlogged soils is negligible, denitrification rates larger than 50 kg N ha⁻¹ a⁻¹ are highly unlikely. In a global study, Seitzinger et al. (2006) assumed a denitrification rate of 14 kg N ha⁻¹ a⁻¹ for agricultural soils. With this rate, only 19 % of the retained (88 %) N input of the study catchment can be denitrified. On the basis of a simulation with the modeling framework GROWA-WEKU-MEPhos, Kuhr et al. (2014) estimated very low to low denitrification rates of 9-13 kg N ha⁻¹ a⁻¹ for the soils of the Holtemme catchment. Based on the above discussion, we find for our study catchment that denitrification in the soils, including the riparian zone, may partly explain the retention of NO₃-N, but is unlikely to be the single explanation for the observed imbalance between input and output.
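The required soil denitrification rate quoted above follows from simple unit conversion, which the sketch below retraces. The function name is hypothetical; all numbers are taken from the text.

```python
def required_rate_kg_ha_a(n_input_t, retained_fraction, area_km2, years):
    """Areal rate (kg N per ha per year) needed to remove the retained N mass."""
    retained_kg = n_input_t * retained_fraction * 1000.0  # t -> kg
    area_ha = area_km2 * 100.0                            # km2 -> ha
    return retained_kg / area_ha / years

# 88 % of 53 437 t over 39 years on 161 km2 of agricultural land
rate = required_rate_kg_ha_a(53437, 0.88, 161, 39)

# Share of the retained N that a rate of 14 kg N/ha/a
# (Seitzinger et al., 2006) could account for
share_seitzinger = 14.0 / rate

print(f"required rate: {rate:.1f} kg N/ha/a")
print(f"share explained at 14 kg N/ha/a: {share_seitzinger:.0%}")
```

The same function can be used to test other literature rates, for example the 9-13 kg N ha⁻¹ a⁻¹ estimated by Kuhr et al. (2014), against the retained mass.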
Regarding the potential for denitrification in groundwater, the literature provides denitrification rate constants of a first-order decay process between 0.01 and 0.56 year⁻¹ (van Meter et al., 2017b; van der Velde et al., 2010; Wendland et al., 2005). We derived the denitrification constant by distributing the input according to the fitted log-normal distribution of TTs, assuming a first-order decay along the flow paths (Kuhr et al., 2014; Rode et al., 2009; van der Velde, 2010). The denitrification of the 88 % of input mass would require a rate constant of 0.14 year⁻¹. This constant is in the range of values reported by the mentioned modelling studies. However, in a regional evaluation of groundwater quality, Hannappel et al. (2018) provide strong evidence that denitrification in the groundwater of the Holtemme catchment is not a dominant retention process. More specifically, Hannappel et al. (2018) assessed denitrification in over 500 wells in the federal state of Saxony-Anhalt on the basis of nitrate, oxygen and iron concentrations and redox potential, and connected the results to the hydrogeological units. Within the hard rock aquifers that are present in our study area, only 0-16 % of the wells showed signs of denitrification. Taking together the local evidence from the nitrate isotopic composition (Müller et al., 2018), the regional evidence from groundwater quality (Hannappel et al., 2018) and the rates provided in literature for soils and groundwater, we argue that denitrification in groundwater is unlikely to explain the observed imbalance between N input and output.
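How a first-order rate constant can be derived from a TTD may be sketched as follows: each parcel of input decays by exp(-k t) over its travel time t, and k is adjusted until the surviving fraction matches the observed export fraction. This is an illustration only; the function names, the annual discretisation, the bisection solver and the TTD parameters used in any example are assumptions and not the study's fitted values.

```python
import math

def lognormal_weights(mu, sigma, n_years):
    """Annual (midpoint) weights of a log-normal TTD, normalised to sum 1."""
    w = [math.exp(-(math.log(t) - mu) ** 2 / (2 * sigma ** 2)) / t
         for t in (k + 0.5 for k in range(n_years))]
    total = sum(w)
    return [x / total for x in w]

def surviving_fraction(k, weights):
    """Fraction of input mass left after first-order decay along all flow paths."""
    return sum(w * math.exp(-k * (j + 0.5)) for j, w in enumerate(weights))

def solve_rate_constant(target, weights, lo=0.0, hi=5.0, tol=1e-10):
    """Bisection for the rate constant k (1/year) giving the target fraction.

    The surviving fraction decreases monotonically with k, so the root is
    bracketed as long as target lies between the values at lo and hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if surviving_fraction(mid, weights) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `solve_rate_constant(0.12, lognormal_weights(2.7, 0.5, 200))` returns the rate constant for which 12 % of the input would survive transport under a purely hypothetical TTD with µ = 2.7 and σ = 0.5.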
Lastly, assimilatory NO₃ uptake in the stream may be a potential contributor to the difference between input and output. But even with maximal NO₃ uptake rates as reported by Mulholland et al. (2004; 0.14 g N m⁻² d⁻¹) or Rode et al. (2016; max. 0.27 g N m⁻² d⁻¹; estimated for a catchment adjacent to the Holtemme), the annual assimilatory uptake in the river would be a minor removal process, estimated to contribute only 3 % of the 88 % discrepancy between input and output. According to the rates reported by Mulholland et al. (2008; max. 0.24 g N m⁻² d⁻¹), the Holtemme River would need a 45-times larger area to be able to denitrify the retained N. Therefore, denitrification in the stream can be excluded as a dominant removal process. In summary, the precise differentiation between the accumulation of an N legacy and removal by denitrification cannot be fully resolved on the basis of the available data. A mix of both may also account for the missing 88 % (86-89 %, Downstream) or 75 % (71-78 %, Midstream) in the N output. Input-output assessments with time series from different catchments, as presented in van Meter & Basu (2017), covering a larger variety of catchment characteristics, hold promise for an improved understanding of the controlling parameters and dominant retention processes. The fact that current NO₃ concentration levels in the Holtemme River still show no clear sign of a significant decrease calls for a continuation of the NO₃ concentration monitoring, ideally extended by additional monitoring in soils and groundwater.
Despite strong reductions in agricultural N input since the 1990s, the annual N-surplus (e.g. 818 t a⁻¹, 2015) is still much higher than the highest measured export (load_max = 216 t a⁻¹, 1995) from the catchment. Hence, the difference between input and output is still high, with a mean factor of 6 during the past 10 years (mean factor of 7 with the input shifted according to 12 years of TT). Consequently, either the legacy of N in the catchment keeps growing instead of getting depleted, or the system relies on a potentially limited denitrification capacity. Denitrification may irreversibly consume electron donors like pyrite for autolithotrophic denitrification or organic carbon for heterotrophic denitrification (Rivett et al., 2008).
Based on the analyses and literature research, there is evidence but no proof on the fate of the missing N, although a directed water quality management would need a clearer differentiation between N mass that is stored and N mass that is denitrified. However, neither tolerating the growing build-up of legacies nor relying on finite denitrification represents a sustainable and adapted agricultural management practice. Hence, future years will also face increased NO₃-N concentrations and loads exported from the Holtemme catchment.

Linking effective TTs, concentrations and C-Q trajectories with N legacies
Based on our data-driven analyses, we propose the following conceptual model (Fig. 7) for N export from the Holtemme catchment, which is able to plausibly connect and synthesize the available data and findings on TTs, concentration trajectories and C-Q relationships, and allows for a discussion of the type of N legacy.
Figure 7: Conceptual model of nitrogen legacy and exports from the Midstream and the Downstream catchment. The four stacked boxes refer to the dominant source layer of nitrate that is activated with changing water level and catchment wetness during low flow seasons fall (red) and autumn (orange) as well as high flow seasons winter (blue) and spring (green). Numbers in the boxes refer to peak travel times of each season. The percentages refer to the N imbalance between input and output explainable by travel times (hydrological legacy). Background map created from ATKIS data.

Over the course of a year, different subsurface flow paths are active, which connect different subsurface N source zones with different source strengths (in terms of concentration and flux) to streams. These flow paths transfer water and NO₃-N to streams, predominantly from shallower parts of the aquifer when water tables are high during HFSs and exclusively from deeper groundwater during low flows in LFSs (Rozemeijer & Broers, 2007; Dupas et al., 2016; Musolff et al., 2016). This conceptual model allows us to explain the observed intra-annual concentration patterns and the distinct clustering of TTs into low flow and high flow conditions. Furthermore, it can explain the mobilization of nutrients from spatially distributed NO₃-N sources by temporally varying flow-generating zones. Spatial heterogeneity of solute source zones can be a result of downward migration of the dominant NO₃-N storage zone in the vertical soil-groundwater profile (Dupas et al., 2016). Moreover, a systematic increase of the water age with depth would, if denitrification in groundwater takes place uniformly, lead to a vertical concentration decrease. Given the stable hydroclimatic conditions without changes in land use, topography or the river network during the observation period, long-term changes of flow paths in the catchment are unlikely. Assuming that flow contributions from the same depths do not change between the years, the observed decadal changes in the seasonal concentrations cannot be explained by a stronger imprint of denitrification with increasing water age.
Under such conditions one would expect a more steady seasonality in concentrations and C-Q patterns over time, with NO₃-N concentrations that are always similarly high in HFSs and similarly low in LFSs, which we do not see in the data. Additionally, previous findings have indicated no or only a minor role of denitrification in the catchment (Hannappel et al., 2018; Kunkel et al., 2008; Müller et al., 2018). In line with Dupas et al. (2016), we instead argue that the vertical migration of a temporally changing NO₃-N input is one of the most plausible explanations for our observations with regard to N budgets, concentrations and C-Q trajectories.
The faster TTs observed at the midstream station during HFSs are assumed to be dominated by discharge from shallow (near-surface) source zones. This zone is responsible for the fast response of instream NO₃-N concentrations to the increasing N inputs (1970s to mid-1980s). This faster lateral transfer, especially in spring (shortest TT), may also be enhanced by the presence of artificial drainage structures such as tiles and ditches. In line with the longer TTs during the LFSs, low flow NO₃-N concentrations were less impacted in the 1970s to mid-1980s, as deeper parts of the aquifer were still less affected by anthropogenic inputs. With ongoing time and a downward migration of the high NO₃-N inputs before 1990, those deeper layers and thus longer flow paths also delivered increased concentrations to the stream (1990s). In parallel with the increasing low flow concentrations (in the 1990s), the spring concentrations of NO₃ decreased, caused by a depletion of the shallower NO₃-N stocks (see also Dupas et al., 2016; Thomas & Abbott, 2018). This depletion of the stock was a consequence of the drastically reduced N input after the German reunification in 1989. This conceptual model of N trajectories is supported by the changing C-Q relationship over time. The seasonal cycle started with increasing NO₃-N maxima during high flows and minima during low flows, since shallow source zones were the first to be loaded with NO₃. Consequently, the accretion pattern was intensified in the first decades, accompanied by an increase of CV_C/CV_Q. The resulting positive C-Q relationship on a seasonal basis has been found in many agricultural catchments worldwide (e.g. Aubert et al., 2013; Martin et al., 2004; Mellander et al., 2014; Rodriguez-Blanco et al., 2015; Musolff et al., 2015).
However, after several years of deeper migration of the N input, the catchment started to exhibit a chemostatic NO₃-N export regime (after the 1990s), which was manifested in the decreasing CV_C/CV_Q ratio. This stationarity could have been caused by a vertical equilibration of NO₃-N concentrations in all seasonally activated depth zones of the soils and aquifers after a more stable long-term N input after 1995. According to the 50th percentile of the derived TT, after 20 years only 50 % of the input had been released Midstream.
Therefore, without any strong changes in input, the chemostatic conditions caused by the uniform, vertical NO₃-N contamination will remain. At the same time, this chemostatic export regime supports the hypothesis of an accumulated N legacy rather than denitrification as the dominant reason for the imbalance between input and output.
At the downstream station, the riverine NO₃ concentrations during high flows were dominated by inputs from the midstream sub-catchment, which explains the similarity with the midstream bimodality in concentrations as well as the comparable TTs. The reason for these dominating midstream flows is the strong precipitation gradient resulting in a runoff gradient on the leeward side of the mountains. During low flows, the downstream sub-catchment can contribute much more to discharge and therefore to the overall N export. During the LFSs, we observed higher NO₃-N concentrations with a unimodal trajectory, and shorter TTs compared to the midstream sub-catchment. We argue that the lowland sub-catchment supports higher water levels and thus faster TTs during the low flows. A greater prevalence of young streamflow in flatter lowland terrain was also described by Jasechko et al. (2016). But besides the earlier peak time during low flows, the concentration was found to be much higher than Midstream. To cause such high intra-annual concentration changes, the downstream NO₃-N load contribution, e.g. during the concentration peak 1995/96, had to be high: the summer season export was 46 t, which is more than twice the median contribution during summer (22 t). A more effective export from the downstream catchment happened mainly during LFSs, which is also supported by the narrower TTD (small shape factor σ) in the summer and fall (Fig. 5b). The difference between the 75th and 25th percentiles (5 years) was also the smallest of all seasons in the summer at the downstream station. This could be one reason for the high concentrations in comparison to the midstream catchment and to the HFSs.
In contrast to the midstream catchment, the C-Q trajectory in the downstream catchment temporarily switched from an enrichment pattern, dominated by the high concentrations during high flows from Midstream, to a dilution pattern and a chemodynamic regime, when the high concentrations in the LFS from the downstream sub-catchment dominated. As the low flow concentrations slowly decreased in the 2000s and 2010s, the downstream catchment also finally evolved to a chemostatic NO₃ export regime as observed Midstream (Fig. 6f).
Our findings support the evolution from chemodynamic to chemostatic behaviour in managed catchments, but also emphasize that changing inputs of N into the catchment can lead to fast changing export regimes even in relatively slowly reacting systems. Our findings expand on previous knowledge (Dupas et al., 2016), as we could show systematic inter-annual C-Q changes that are in line with a changing input and a systematic seasonal differentiation of TTs.
Although our study showed chemostatic behaviour towards the end of the observation period (Midstream and Downstream; Fig. 6e-f), this export regime is not necessarily stable, as it depends on a continuous replenishment of the legacy store. Changes in the N input translate to an increase of spatial heterogeneity in NO3-N concentrations in soil and groundwater with contrasting water ages. The seasonally changing contribution of different water ages thus results in more chemodynamic NO3-N export regimes. As described in Musolff et al. (2017), both export regimes and patterns are therefore controlled by the interrelation of TT and source concentrations. We argue that a hydrological legacy of NO3-N has been established in the catchment, resulting in the pseudo-chemostatic export behaviour we observe nowadays. This supports the notion that a biogeochemical legacy corresponding to the build-up of organic N in the root zones of the soil (van Meter et al., 2016) is less probable. If we assume that all of the 88 % of the N input is accumulating in the soils, we cannot explain the observed shorter-term inter-annual concentration changes and trajectory in the C-Q relationships. We would rather expect a stronger and even growing dampening of the N input to the subsurface with the build-up of a biogeochemical legacy in the form of organic N. However, we cannot fully exclude the accumulation of a protected pool of soil organic matter with very slow mineralization rates as described in van Meter et al. (2017). Our conceptual model assigns the missing N to the long TTs of NO3-N in soil and groundwater and in turn to a pronounced hydrological legacy. In the midstream sub-catchment, the estimated TTD explains 40 % of the retained NO3-N, comparing the convolution of TTD with the N input time series to the actual riverine export. The remaining 60 % cannot be fully explained at the moment and may be assigned to a permanent removal by denitrification (see discussion above), to a fixation due to biogeochemical legacy, or to more complex, e.g. longer-tailed, TTDs, which are not well represented by our assumed log-normal distribution. In the downstream sub-catchment, our approach explains 29 % of the observed export. This could in principle be caused by the same processes as described for the midstream sub-catchment. A hydrological legacy store in deeper zones without significant discharge contribution is also possible (Fig. 7). That mass of N is either bypassing the downstream monitoring station (note that the downstream station is still 3 km upstream of the Holtemme catchment outlet) or is affected by a strong time delay and dampening not captured by our approach. Consequently, future changes in N inputs will also change the future export patterns and regimes, since this would shift the homogeneous NO3-N distributions in vertical soil and groundwater profiles back to more heterogeneous ones.
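The convolution of a log-normal TTD with an annual N input series, as referred to above, can be illustrated with a minimal sketch. It assumes yearly time steps and a truncated, renormalised distribution. The function names and the parameters `mode_years`, `sigma` and `max_lag` are hypothetical choices for illustration and are not the fitted values of this study.

```python
import numpy as np

def lognormal_ttd(mode_years, sigma, max_lag):
    """Yearly weights of a log-normal TTD for lags 0..max_lag (lag 0 gets weight 0)."""
    # The mode of a log-normal distribution is exp(mu - sigma^2).
    mu = np.log(mode_years) + sigma ** 2
    lags = np.arange(1, max_lag + 1, dtype=float)
    pdf = np.exp(-(np.log(lags) - mu) ** 2 / (2 * sigma ** 2)) / (
        lags * sigma * np.sqrt(2 * np.pi)
    )
    weights = np.concatenate(([0.0], pdf))
    # Renormalise so that the truncated weights sum to 1 (mass conservation).
    return weights / weights.sum()

def route_input(n_input, weights):
    """Convolve an annual N input series with the TTD weights."""
    return np.convolve(n_input, weights)[: len(n_input)]

# Hypothetical example: a single 100 t input pulse in year 0, followed for 60 years.
n_input = np.zeros(60)
n_input[0] = 100.0
weights = lognormal_ttd(mode_years=12, sigma=0.5, max_lag=59)
export = route_input(n_input, weights)
peak_year = int(np.argmax(export))  # year of maximum export after the pulse
```

With a real input history, the share of retained N explained by the TTD would follow from comparing the routed series with the observed riverine load. That comparison step is not shown here.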

Conclusion
In the present study we used a unique time series of riverine N concentrations over the last four decades from a mesoscale German catchment as well as estimated N input to discuss the linkage between the two on annual and intra-annual time scales. From the input-output assessment, the build-up of a potential N legacy was quantified, effective TTs of nitrate were estimated and the temporal evolution to chemostatic NO3-N export was investigated. This study provides four major findings that can be generalized and transferred to other catchments of similar hydroclimatic and landscape settings as well.
First, the retention capacity of the catchment for N is 88 % of the N input (input and output referring to 1976 to 2015), which can either be stored as a legacy or denitrified in the terrestrial or aquatic system. Although we could not fully quantify denitrification, we argue that this process is not the dominant one in the catchment to explain input-output differences. The observed N retention can be more plausibly explained by legacy than by denitrification. In consequence, the hydrological N legacy, i.e. the load of nitrate still on the way to the stream, may have strong effects on future water quality and long-term implications for river water quality management. With a median export rate of 162 t N a⁻¹ (1976-2016, downstream station, 6 kg N ha⁻¹ a⁻¹), a depletion of this legacy (< 46 000 t N) via baseflow would maintain elevated riverine concentrations for the next decades. Although N-surplus strongly decreased after the 1980s, during the past 10 years there was still an imbalance between agricultural input and riverine export by a mean factor of 5 (assuming the temporal offset of peak TTs between input and output of 12 years). This is a non-sustainable condition, regardless of whether the retained nitrate is stored or denitrified. Export rates as well as retention capacity derived for this catchment were found to be comparable to findings of other studies in Europe (Worrall et al., 2015; Dupas et al., 2015) and North America (van Meter et al., 2016).
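The input-output arithmetic behind this first finding can be written out as a short sketch. The input and export totals (53 437 t and 6592 t) are those given in the abstract. The depletion horizon is a deliberately crude upper bound and not a prediction: it assumes that the whole upper-bound legacy mass is exported at the constant median rate, with no denitrification and no new input. Function names are hypothetical.

```python
def retention_fraction(n_input_t, n_export_t):
    """Share of the N input that did not leave the catchment via the river."""
    return 1.0 - n_export_t / n_input_t

def depletion_horizon_years(legacy_t, export_rate_t_per_year):
    """Naive horizon: legacy mass divided by a constant annual export rate."""
    return legacy_t / export_rate_t_per_year

# Totals from the abstract: N input and riverine N export in tonnes.
retention = retention_fraction(53437.0, 6592.0)   # about 0.88, i.e. 88 %

# Upper-bound legacy (46 000 t N) and median export rate (162 t N per year).
horizon = depletion_horizon_years(46000.0, 162.0)
```

Any denitrification or biogeochemical fixation would reduce the exportable mass, while continued N input would extend the horizon, so the value only indicates the order of magnitude of the time scales involved.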
Secondly, we derived peak time lags between N input and riverine export of 7-22 years with systematic differences among the different seasons. Catchment managers should be aware of these long time frames when implementing measures and when evaluating them. This study explains the seasonally differing lag times and temporal concentration evolutions with the vertical migration of the nitrate and its changing contribution to discharge by seasonally changing aquifer connection. Hence, inter-annual concentration changes are not dominantly controlled by inter-annually changing discharge conditions, but rather by the seasonally changing activation of subsurface flows with differing ages and thus differing N loads. As a consequence of this activation-dependent load contribution, an effective, adapted monitoring needs to cover different discharge conditions when measures are to be assessed for their effectiveness. In the light of comparable findings of long time lags (van Meter & Basu, 2017; Howden, 2011), there is a general need for sufficient monitoring length and appropriate methods for data evaluation like the seasonal statistics of time series.
Third, in contrast to a more monotonic change from a chemodynamic to a chemostatic nitrate export regime that was observed previously (Dupas et al., 2016; Basu et al., 2010), this study found a systematic change of the nitrate export regime from accretion over dilution to chemostatic behavior. Here, we can make use of the unique situation in East German catchments where the collapse of agriculture in the early 1990s provided a large-scale "experiment" with abruptly reduced N inputs. While previous studies could not distinguish between biogeochemical and hydrological legacy as the cause of chemostatic export behavior, our findings support a hydrological legacy in the study catchment. The systematic inter-annual changes of C-Q relationships of NO3-N were explained by the changes in the N input in combination with the seasonally changing effective TTs of N. The observed export regime and pattern of NO3-N suggest a dominance of a hydrological N legacy over the biogeochemical N legacy in the upper soils. In turn, observed trajectories in export regimes of other catchments may be an indicator of their state of homogenization and can be helpful to classify results and predict future concentrations.
Fourth, although we observed long TTs, significant input changes also created strong inter-annual changes in the export regime. The chemostatic behavior is therefore not necessarily a persistent endpoint of intense agricultural land use, but depends on steady replenishment of the N store. Therefore, the export behavior can also be termed pseudo-chemostatic and may further evolve in the future (Musolff et al., 2015) under the assumption of a changing N input. Depending on the legacy size, a significant reduction or increase of N input can cause an evolution back to more chemodynamic regimes with dilution or enrichment patterns. Simultaneously, input changes affect the homogenized vertical nitrate profile, resulting in larger intra-annual concentration differences and consequently chemodynamic behavior. Hence, chemostatic behavior and homogenization may be characteristics of managed catchments, but only under constant N input.
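The export patterns named in the findings above (accretion, dilution, chemostatic) are commonly read from the slope of the C-Q relationship in log-log space. The following minimal sketch assumes a simple power law C = a·Q^b fitted by least squares. The tolerance `tol` and the function names are hypothetical, and the classification used in this study may rely on additional metrics.

```python
import numpy as np

def cq_slope(conc, discharge):
    """Slope b of log10(C) = log10(a) + b * log10(Q), fitted by least squares."""
    b, _intercept = np.polyfit(np.log10(discharge), np.log10(conc), 1)
    return float(b)

def classify_pattern(b, tol=0.1):
    """Label the export pattern from the C-Q slope (tol is an assumed threshold)."""
    if b > tol:
        return "accretion"    # concentration rises with discharge
    if b < -tol:
        return "dilution"     # concentration falls with discharge
    return "chemostatic"      # concentration nearly independent of discharge

# Synthetic discharge values (arbitrary units) and three idealised responses.
q = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
pattern_accretion = classify_pattern(cq_slope(3.0 * q ** 0.3, q))
pattern_dilution = classify_pattern(cq_slope(3.0 * q ** -0.3, q))
pattern_static = classify_pattern(cq_slope(np.full_like(q, 3.0), q))
```

Applying such a fit to moving time windows or to individual seasons would trace a trajectory of the slope over time, which is the kind of shift discussed in the third and fourth findings.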
Recommendations for a sustainable management of N pollution in the studied Holtemme catchment, also transferable to comparable catchments, focus on two aspects:

- Our findings could not prove a significant loss of NO3-N by denitrification. To deal with the past inputs and to focus on the depletion of the N legacy, end-of-pipe measures such as hedgerows around agricultural fields (Thomas & Abbott, 2018), riparian buffers or constructed wetlands may initiate N removal by denitrification (Messer et al., 2012).

- We could show that there is still an imbalance of agricultural N input and riverine export by a mean factor of 5. A reduced N input due to better management of fertilizer and the prevention of N losses from the root zone in present time is indispensable to enable depletion instead of a further build-up or stabilization of the legacy.

The combination of N budgeting and effective TTs with long-term changes in C-Q characteristics proved to be a helpful tool to discuss the build-up and type of N legacy at catchment scale. This study strongly benefits from the availability of long time series in nested catchments with a hydroclimatic and land-use gradient. This wealth of data may not be available everywhere. However, we see the potential to transfer this approach to a much wider range of catchments with long-term observations for understanding the spatial and temporal variation and type of legacy build-up, denitrification and TTs as well as their controlling factors. Data-driven analyses of differing catchments covering a higher variety of characteristics may provide a more comprehensive picture of N trajectories and their controlling parameters. In addition to data-driven approaches, emphasis should also be put on robust estimations of water TT in catchments to constrain reaction rates. Recent studies present promising approaches to derive TTs in groundwater (Marcais et al., 2018; Kolbe et al., 2019) and at catchment scale (Jasechko et al., 2016; Yang et al., 2018a).

Data availability
Discharge data (for all dates) and water quality data (from 1993) can be accessed at the websites of the State Office of Flood