Nitrogen surface water retention in the Baltic Sea drainage basin

In this paper, we estimate the surface water retention of nitrogen (N) in all the 117 drainage basins to the Baltic Sea with the use of a statistical model (MESAW) for source apportionment of riverine loads of pollutants. Our results show that the MESAW model was able to estimate the N load at the river mouth of 88 Baltic Sea rivers, for which we had observed data, with a sufficient degree of precision and accuracy. The estimated retention parameters were also statistically significant. Our results show that around 380 000 t of N are annually retained in surface waters draining to the Baltic Sea. The total annual riverine load from the 117 basins to the Baltic Sea was estimated at 570 000 t of N, giving a total surface water N retention of around 40 %. In terms of absolute retention values, three major river basins account for 50 % of the total retention in the 117 basins; i.e. around 104 000 t of N are retained in Neva, 55 000 t in Vistula and 32 000 t in Oder. The largest retention was found in river basins with a high percentage of lakes as indicated by a strong relationship between N retention (%) and share of lake area in the river drainage areas. For example in Göta älv, we estimated a total N retention of 72 %, whereof 67 % of the retention occurred in the lakes of that drainage area (Lake Vänern primarily). The obtained results will hopefully enable the Helsinki Commission (HELCOM) to refine the nutrient load targets in the Baltic Sea Action Plan (BSAP), as well as to better identify cost-efficient measures to reduce nutrient loadings to the Baltic Sea.


Introduction
Expanding human activities have had a great impact on nutrient dynamics and nutrient export from watersheds (Hill and Bolgrien, 2011;Mayorga et al., 2010).Increased population densities, food production, sewage emissions and fossil fuel combustion are among the driving forces causing increased nutrient mobilisation and alterations to hydrological systems (Mayorga et al., 2010).Increased nutrient export from coastal watersheds has had severe impacts on the ecological functions and community composition of estuaries, with algal blooms, increased water turbidity, oxygen depletion, and severe fish deaths as the most prominent consequences (Kellogg et al., 2010;Mayorga et al., 2010;Hoffmann et al., 2009).
Several geomorphic, hydraulic and biological factors may interact to reduce nutrient export from watersheds (Wollheim et al., 2006).Hejzlar et al. (2009) define retention as the fraction of external nutrient inputs that is retained within watersheds, either in absolute values or relative to the input.For nitrogen (N), the term retention is widely used to describe the processes leading to a temporary immobilisation of reactive (non-N 2 ) N by incorporation into biomass or sedimentation, or the permanent loss of reactive N by conversion into the non-reactive atmospheric form (N 2 ) by denitrification (Billen et al., 2009).Nitrogen is primarily removed (or retained) from surface water by denitrification (i.e. the microbial production of N 2 from fixed N), followed by processes such as sorption to sediment or organic matter, and biological uptake (Hejzlar et al., 2009).Results from mass-balance studies across a wide range of geographic scales indicate that P. Stålnacke et al.: Nitrogen surface water retention in the Baltic Sea watersheds could retain as much as 60-90 % of total N inputs (Kellogg et al., 2010).Reduced N export can be achieved by increasing N retention in soils, sediments and biomass, reducing atmospheric and terrestrial N sources, and increasing in-stream N removal and retention processes (Hill and Bolgrien, 2011).
Water residence time is a major factor determining the retention of nutrients in watersheds (Hejzlar et al., 2009), while Hesse and co-workers emphasised the need for better understanding of terrestrial retention (i.e. in soils; Hesse et al., 2013).Watershed characteristics, such as hydrology and geomorphology, strongly control water residence time, and increased water residence time can enhance denitrification processes and thereby reduce N loads to coastal waters (Kellogg et al., 2010;Behrendt and Opitz, 2000).Total N inputs influence denitrification rates, whereas hydrology and geomorphology (or water residence time) influence the proportion of N inputs that are denitrified (Seitzinger et al., 2006).Certain areas within watersheds can be identified as sink areas with regard to N export, often being areas with a relatively long water residence time where biogeochemical processes can transform reactive N into organic N in biomass, or N gases via denitrification (Kellogg et al., 2009), or burial of N in sediments (Harrison et al., 2009).The mitigating effect of these sink areas could in some cases be negligible, especially in cases where such areas are bypassed by N-carrying water flows due to specific land management practices (e.g.tile drains or storm water overflows) (Kellogg et al., 2009).Denitrification processes are favoured in sediments and hypoxic or anoxic bottom waters, particularly in systems with abundant organic carbon (C) and nitrate (Harrison et al., 2009;Mulholland et al., 2008).
The question on how to quantify the retention of nutrients from source to river mouth remains one of the largest uncertainties in river basin management.Several authors (e.g.Mayorga et al., 2010;Seitzinger, 2008) emphasise the need for advances in methods and models for determining the impacts of human activities on nutrient inputs to coastal waters, and a better understanding of the processes leading to retention of N in watersheds.Seitzinger et al. (2002) argue that studies generally have focused on N removal in shorter sub-sections of rivers and emphasise the need for a river network approach if we are to quantify the retention of nutrients relative to total inputs.In later years, a number of models of different complexity have been developed for estimating surface water N retention (e.g.Billen et al., 2009;Grimvall and Stålnacke, 1996;Hejzlar et al., 2009;Hill and Bolgrien, 2011;Jung and Deng, 2011;Mayorga et al., 2010;Seitzinger et al., 2002).
In this paper, we estimate the surface water retention of N in the Baltic Sea drainage basin with the use of a statistical model for source apportionment of riverine loads of pollutants, the MESAW model (Grimvall and Stålnacke, 1996).Scientifically, estimation of retention is one of the largest challenges in river basin nutrient accounting (i.e.source ap-portionment and budget calculations) and, therefore, also the case in the Baltic Sea drainage basin.

Materials and methods
The Baltic Sea, together with the lakes and watercourses in its drainage basin, represents one of the most intensively monitored aquatic systems in the world and eutrophication has been identified as a major threat to this system.The total area of the Baltic Sea drainage basin is 1 745 000 km 2 , which is around four times the area of the sea itself.The long-term average inflow of freshwater with the rivers is 475 km 3 yr −1 or 15 130 m 3 s −1 (Bergström and Carlsson, 1994;Mörth et al., 2007).Details on population and land use characteristics in the Baltic Sea drainage area can be found in Mörth et al. (2007).The input data for all basins is found in Table A1.

MESAW input data
For the estimation of WWTP emissions, we created a spatially distributed data set of people "connected" or "not connected" to WWTPs (primary, secondary and tertiary) within the Baltic Sea river basins.For this, we used spatially distributed population data and national level statistics on WWTP connection.Population numbers for the year 2005 divided into urban and rural population were obtained from the HYDE database (http://themasites.pbl.nl/en/themasites/hyde/).These data were redistributed into a 10 × 10 km grid.Percentages of population "connected" and type of waste water treatment were compiled from EUROSTAT (European Commission) and the Organisation for Economic Co-operation and Development (OECD).For Russia, Belarus, Ukraine and Slovakia, only percentage of people "connected" to any type of waste water treatment was available, so the distribution between primary, secondary and tertiary treatment was based on assumptions and expert judgement.
Based on these national statistics, the total number of "connected" people in each country was calculated.The number of "connected" people was then spatially distributed to the grid cells.The distribution was made based on the assumption that urban populations and grid cells with higher population numbers would be more likely to have a municipal WWTP connection than rural and smaller populations.Applying this principle, the grid cells for each country were classified as "connected" starting with urban populations in a descending order, and continuing with rural population in the same way until the number of "connected" people reached the number specified by the national statistics.This procedure was carried out for all three treatment types; first tertiary, then secondary and last primary.The number of people "not connected" to any type of treatment plant was also calculated for each grid cell.Total N emission from WWTPs was then calculated for each grid cell based on the approach of Mörth et al. (2007).

The MESAW model and model parameterisation
MESAW is a statistical model for source apportionment of riverine loads of pollutants developed by Grimvall and Stålnacke (1996).This model approach uses non-linear regression for simultaneous estimation of export coefficients to surface waters for the different specified land cover or soil categories and retention coefficients for pollutants in river basins.Examples of application of the MESAW model are given in Lidèn et al. (1999), Vassiljev andStålnacke (2005), Vassiljev et al. (2008) and Povilaitis et al. (2012).MESAW has many features in common with the more well-known SPARROW model developed in the USA (Smith et al., 1997;Alexander et al., 2000).
The basic principles and major steps in the procedure included (i) estimation of mean annual riverine N loads for a fixed time period (i.e. the years 1994-2006) at each of the 88 monitoring sites; (ii) derivation of statistics on land cover, lake area, point source emissions and atmospheric deposition (see Sect. 2.1) for each river basin; and (iii) use of a general non-linear regression expression with N loads at each river basin as the dependent/response variable and basin characteristics as covariates/explanatory variables.This gave the following generalised form of the model (Eq.1): where L i is the load at outlet of basin i; S i is total losses from soil to water in basin i; P i is the point source discharges (WWTP and industry) to waters in basin i; D i is the atmospheric deposition on surface waters in sub-basin i; R denotes the retention for the source emissions S, P and D, respectively; n is the number of basins, and ε i is the statistical error term.
The parameterisation of the model is flexible and studyarea specific depending on the data and expert knowledge.The model is fitted by minimising the sum of squares for the differences between observed and estimated loads.The model can be run based on absolute or relative values.If based on absolute values, the optimisation procedure finds the minimum sum of squares of the absolute differences between observed and estimated transport.This procedure implies that the influence of the different rivers/basins will be a function of size.If relative values are used, the optimisation procedure finds the minimum sum of squares of relative differences between observed and estimated transport.This procedure assumes that all rivers have the same weight in the optimisation routine.In this study, we used relative values in order to give equal weight to small and large river basins.
The total diffuse loss of N from soil to water, S i , in the ith sub-basin was assumed to be a function of the land cover (Eq.2): where a 1i , a 2i and a 3i in our study refer to the areas of three land cover classes, i.e. cultivated land, wetlands and other land (mainly forests), respectively.θ 1 , θ 2 and θ 3 are unknown emission coefficients for the three land use categories that are statistically estimated in MESAW jointly with the retention (see Eq. 3 below).The point source emissions, P i , and atmospheric deposition on surface waters, D i , were assumed to be known (see Sect. 2.1).
Throughout the exploratory analysis we found that certain basins deviated from the relationship and in most cases also where geographically located near to each other.Thus, we introduced a "grouping variable" according to the following: where each group j consisted of 2 or more basins depending on the model run (see Table 1) and where ω is the unknown coefficient(s).The model was run with different combinations of basin sub-groups in order to obtain reasonable model coefficients and load estimates (i.e.little deviation between predicted and observed loads).The grouping of basins was based on prior knowledge of similarities between basins as well as geographic location.For example, the 10 smaller Danish sub-basins formed one group, as a residual analysis showed that these sub-basins deviated from the general relationships.In its practical meaning, we simply adjusted the "global" diffuse emission coefficients to the local conditions (despite not knowing the underlying causes).This can be justified since applying the same coefficient to such a large drainage basin (1 745 000 km 2 ) seem less logic.
Retention was in our study used as a summarising expression for all hydrological and biogeochemical processes that may decrease or retard the transport of N, e.g.denitrification, sedimentation and biological uptake.Irrespective of the exact Table 1a.Results from the different MESAW model runs for estimation of total nitrogen (N) retention with different combinations of basin sub-groups.Results include estimated export coefficients (kg km −2 ) from different land use classes (i.e.cultivated, wetland and other), estimated retention coefficients (dimension-less) for lake area and total drainage area, and the coefficient of determination (R 2 ) between observed and predicted annual loads.Standard error (SE) and t ratio of the estimated coefficients are given for each model run.

Diffuse emissions Retention
Cultivated retention mechanism, the parameterisation of the retention in the different basins was after several exploratory runs with alternative models done with the following empirical function (Eq.3): where λ 1 and λ 2 denote a non-negative parameter and R i denotes the retention in the ith basin.The empirical function were in our case derived from the conception that the removal of N takes place primarily in the surface waters (both in-streams and in lakes).The first part of the function reflects the in-stream retention whereas the second part reflects the retention in lakes and reservoirs.Our assumption was that the removal rate is proportional to drainage basin size and the ratio of the lake to the total drainage area.Subsequently this can be seen as an indirect expression of the water residence time in the river basin.
Moreover, for the sake of simplicity, we assumed that the retention is the same for source categories D, S and P .Finally, by combining the parametric expressions of losses from soil to waters and retention with the empirical data, the θ 1 , θ 2 , θ 3 , λ 1 , λ 2 and ω parameters were estimated simultaneously.

Parameterisation results
For estimation of total N retention, the model was first run including only 88 river basins for which we had observed annual N load (Table 1).Among these 88 were 10 smaller Danish sub-basins, all with available monitoring data, but which only constitute parts of the major Danish river basins draining to the Baltic Sea.As stated in Sect.2.2, the model was run as exploratory in order to obtain reasonable parameter coefficients and load estimates (i.e.little deviation between predicted and observed loads).In the final model run (no.4 in Table 1 with nine estimated parameters), including all 88 basins with observed N load, both retention parameters (λ 1 and λ 2 ) and land use category "cultivated" (i.e.θ 1 ) were statistically significant (p < 0.05).The land use category "other" (θ 2 which basically is the forest land) was very close to being statistically significant (p < 0.06)."Wetland" (θ 3 ) was not statistically significant, but this land use category accounts for less than 4 % of the total drainage area in the Baltic Sea drainage basin.It should be noted that the classification of wetlands is rather rough from the data source and given as joint expression of all wetlands ranging from marshes to peatland bogs.All four grouping parameters ω 1ω 4 were statistically significant.
It is worth noting that these diffuse losses parameters (θ 1 − θ 3 ) all are given in kilograms per square kilometres and thus can be interpreted as export or unit-area loss coefficients.Interestingly, our estimates corroborate well with the results of monitored losses from small catchments with relative uniform land use.For example, the point estimate and standard error for cultivated land gave an estimate of 1073 and 109 kg km −2 , respectively (model run no.4; Table 1).Stålnacke and co-workers compiled data from 35 small agricultural catchments in the Nordic and Baltic regions (Stålnacke et al., 2014).They found that a majority of these catchments had a unit-area loss of between 600 and 2500 kg km −2 .In addition, our results showed that the nitrogen losses from agricultural land were almost four times higher than the corresponding losses from forested land (Table 1), which is found to be realistic and in line with other results (Lidèn et al., 1999;Vassiljev and Stålnacke, 2005;Vassiljev et al., 2008)

Major retention estimate results
The final model parameterisation using the 88 river basin data (i.e.model run no. 4 in Table 1) was used to determine the surface water retention of N in all 117 major river basins in the Baltic Sea drainage area.This included 78 river basins with observed N load (excluding the 10 smaller Danish subbasins), and also an additional 39 unmonitored river basins.
The total annual riverine load from the 117 basins to the Baltic Sea was estimated at 570 000 t of N compared to the model-estimated gross load of 950 000 t of N (Fig. 2).Thus, our results show that around 380 000 t of N are annually retained in surface waters draining to the Baltic Sea (streams, rivers, reservoirs and lakes; Fig. 2), giving a total surface water N retention of around 40 %.This is substantially higher than that given by Mörth et al. (2007), who reported a mean N retention of 15 % in the Baltic Sea rivers.The spatial distribution of the relative surface water retention is shown in Fig. 3. Averaged over all basins, mean lake retention is 25 % whereas the estimated in-stream retention is 5 % (Table A2).In terms of absolute retention values, three major river basins account for 50 % of the total retention in the 117 basins; i.e. around 104 000 t of N are retained in Neva, 55 000 t in Vistula and 32 000 t in Oder (Table A2).
Most of the retention occurs in lakes, as indicated by a strong relationship between N retention (%) and share of lake area in the river drainage areas (up to 20 % lake area; Fig. 4).In Göta älv, we estimated a total N retention of 72 %, whereof 67 % occurred in the lakes of that drainage area (Lake Vänern primarily).Other river basins with high retention were Kymijoki (70 %), Motalaström (73 %) and Neva (74 %).All these basins are characterised by a high percentage of lakes.Low retention was estimated for lake-poor basins, e.g.Aurajoki (2 %), Kasari (4 %) and Kelia (3 %).This is in accordance with earlier studies, where the highest N retention has been found in river basins with a large proportion of lakes.In a comparison of N retention in four selected watersheds in Europe, representing a wide range in climate, hydrology and nutrient loads, Hejzlar et al. (2009) found the highest retention values in the two watersheds with lakes as compared to the two other mostly or entirely lake-less watersheds.A globalscale analysis by Harrison et al. (2009) indicated that lakes and reservoirs are important sinks for N in watersheds, with small lakes (<50 km 2 ) retaining about half of the global total.Despite the fact that reservoirs occupy only 6 % of global lentic surface area, the reservoirs were estimated to retain about 33 % of the total N retained by lentic systems.Relationship between estimated retention (%) and total drainage area (km 2 ; upper panel) and share of lake area (% of total drainage area; lower panel) for 117 Baltic Sea basins.

Uncertainty aspects and outlook
It should be noted that there are considerable uncertainties related to estimates of nutrient loads and especially retention at the watershed scale.In a study comparing nutrient retention estimates by catchment-scale models of different complexity, Hejzlar et al. (2009) showed a large variation in nutrient retention values as estimated by the different models in four selected catchments in Europe.They further showed that retention values were directly proportional to nutrient sources within catchments, indicating a close relationship between uncertainties in quantification of diffuse nutrient sources and nutrient retention determination.They concluded that realistic modelling of nutrient export from large catchments is only possible with a certain level of measured data.However, modelling efforts that combine comprehensive data sets on population, land cover, water discharge and quality, etc., may serve as important tools for improved watershed management and for better identification of cost-efficient measures to reduce nutrient loading.In our study, the MESAW model was apparently able to estimate the N load at the river mouth of 88 Baltic Sea rivers for which we had observed data with a sufficient degree of accuracy (Fig. 1, upper panel; Table 1).However, when we show the obtained relationships using unit-area (specific) load, the model underestimates the load (Fig. 1, lower panel).It is also worth noting that the 10 Danish sub-basins included (despite the effort with the grouping) deviate from the general relationship.These 10 smaller subbasins have a high observed specific N load, which is not well predicted by the model.
Figure 5a-d show the relationships between observed specific N load (kg N km −2 ) and share of various land cover categories and lake area in the 88 (78 for wetland) Baltic Sea basins with observed N loads.A high specific N load was generally found in river basins with a large share of cultivated land, as indicated by a strong positive relationship between specific N load (kg N km −2 ) and share of cultivated land (%; Fig. 5a).In contrast to cultivated land, the specific N load was found to be negatively correlated with the share of "other land" (i.e.primarily forest; Fig. 5d).
In their modelling of riverine N transport to the Baltic Sea, Mörth et al. (2007) found diffuse sources contribute the most to the overall simulated riverine N loads.A review by Stål-nacke et al. (2009) also emphasized the importance of diffuse sources (or share of cultivated land) in contributing to N loads in watersheds.HELCOM (2011) reports that 45-61 % of the total waterborne inputs of N to the Baltic Sea are from diffuse sources.The importance of wetlands in determining N loads seems highly variable, with no apparent relationship between specific N load and share of wetland area in the river basins (Fig. 5b).This was less surprising since this land cover class included all kinds of wetlands (from marshes to peatlands).The low specific load for drainages in basins with a wetland coverage exceeding 15 % are all located in middle/northern Finland and also in the northern part of Sweden (Table A1).These basins are all characterised by low population density and low share of cultivated land.In a meta-analysis of the importance of wetlands for the removal of inorganic N and reduction of N export from watersheds, Jordan et al. (2011) found a large variation (0.25-100 %) in N removal efficiency between individual wetlands.When grouped into different wetland classes, mean efficiency was highest for palustrine forested wetlands (63 %) and lowest for estuarine emergent wetlands (33 %).
Regarding statistical uncertainty in our study, the three land cover classes (and the surface water area) add up to 100 % and apparently these explanatory variables are intercorrelated.This will have less influence on the method applied although there is always a risk of multicollinearity in these kinds of regression-type of models.It should be noted that the model inputs are areas of the land cover and not the percentages which will decrease the risk of multicollinearity.Experiences with the MESAW models, as also given in the previously quoted papers, in different geographical areas (Lidèn et al., 1999;Vassiljev and Stålnacke, 2005;Vassiljev et al., 2008;Povilaitis et al., 2012) have not indicated any problem with possible interrelated explanatory variables.In addition, the parameter estimates showed reasonable stability; little change occurred in the values of the most statistically significant model coefficients when additional variables were added in exploratory regressions (Table 1).
The predicted climate change is an additional factor that may significantly affect nutrient loads and retention in watersheds (Jeppesen et al., 2011).Changes in temperature and precipitation will most likely induce changes in agricultural land use, e.g.type of crops grown, rates and timing of fertiliser use, and thereby influence N cycling and export to coastal waters.However, given the uncertainties in predicting future climate and land use on a regional level, the predicted effects on nutrient budgets in watersheds remain highly uncertain (Jeppesen et al., 2011).Further studies on these issues are needed.
Despite these discerned uncertainties, it seems that the MESAW model seems to be a reliable tool for simultaneous estimation of sources and retention in a river basin.It was also evident that MESAW is flexible and can accommodate many functional relationships and explanatory variables.In addition, MESAW can be used to identify measurements and P. Stålnacke et al.: Nitrogen surface water retention in the Baltic Sea basins that are outside the general patterns and relationships.The main advantages with the model are (i) the simple structure of the model, (ii) the simple input data, (iii) that all unknown parameters are derived from empirical data, (iv) that information from all water quality monitoring sites are used in an optimal way, and (v) that the model yields results on the base of all available measured data, which is more optimal than applying emission coefficients from the literature; which is normally even extrapolated from other regions or upscaled from small watersheds.

Conclusions
We claim that one of the largest scientific and management uncertainties is devoted to the question of how to quantify the retention from source to river mouth.In this study, we used the MESAW statistical model to estimate the surface water N retention in the 117 river basins draining to the Baltic Sea.The MESAW model was able to estimate the N load at the river mouth of 88 Baltic Sea rivers, for which we had observed data, with a sufficient degree of accuracy.The estimated retention parameters were also statistically significant.Our results show that around 380 000 t of N are annually retained in surface waters draining to the Baltic Sea.The total annual riverine load from the 117 basins to the Baltic Sea was estimated at 570 000 t of N, giving a total surface water N retention of around 40 %.The largest retention was found in river basins with a high percentage of lakes.
The obtained results will hopefully enable the Helsinki Commission (HELCOM) to refine the nutrient load targets in the Baltic Sea Action Plan (BSAP), and to better identify cost-efficient measures to reduce nutrient loadings to the Baltic Sea.

Figure 1 .
Figure 1.Relationship between observed and predicted annual N load (kt N yr −1 ; upper panel) and specific observed and predicted N load (kg N yr −1 km −2 ; lower panel) in the 88 Baltic Sea basins with observed N load (lower panel).

Figure 2 .
Figure 2. Total estimated nitrogen (N) load (kt N yr −1 ) in the 117 basins of the Baltic Sea drainage area.Total retention is given as the difference between the estimated total load if no retention and the estimated total riverine net N load.

Figure 3 .
Figure 3. Relative total nitrogen (N) retention in the Baltic Sea drainage basins.

Figure 4 .
Figure 4. Relationship between estimated retention (%) and total drainage area (km 2 ; upper panel) and share of lake area (% of total drainage area; lower panel) for 117 Baltic Sea basins.

Figure 5 .
Figure 5. Relationship between specific N load (kg N km −2 ) and share of (a) cultivated, (b) wetland, (c) lake and (d) other areas (in % of total drainage area) in the 88 (78 for wetland) Baltic Sea basins with observed N load.

Table 1b .
Estimated group ratios ± standard error for diffuse emissions coefficients.

Stålnacke et al.: Nitrogen surface water retention in the Baltic Sea 989 Appendix ATable A1 .
Input data to the MESAW model for estimation of total nitrogen (N) retention.Input data include land cover (cultivated, wetland, lake area, other and total drainage area; km 2 ) and point source emissions (WWTP and industry; kg N yr −1 ).Observed annual loads are given with the retention results in TableA2.

Stålnacke et al.: Nitrogen surface water retention in the Baltic Sea 991Table A1 .
Continued.