Modelling water provision as an ecosystem service in a large East African river basin

Reconciling limited water availability with an increasing demand in a sustainable manner requires detailed knowledge on the benefits people obtain from water resources. A frequently advocated approach to deliver such information is the ecosystem services concept. This study quantifies water provision as an ecosystem service for the 43 000 km² Pangani Basin in Tanzania and Kenya. The starting assumption that an ecosystem service must be valued and accessible by people necessitates the explicit consideration of stakeholders, as well as fine spatial detail in order to determine their access to water. Further requirements include the use of a simulation model to obtain estimates for unmeasured locations and time periods, and uncertainty assessment due to limited data availability and quality. By slightly adapting the hydrological model Soil and Water Assessment Tool (SWAT), developing and applying tools for input pre-processing, and using Sequential Uncertainty Fitting ver. 2 (SUFI-2) in calibration and uncertainty assessment, a watershed model is set up according to these requirements for the Pangani Basin. Indicators for water provision for different uses are derived from model results by combining them with stakeholder requirements and socio-economic datasets such as census or water rights data. Overall water provision is rather low in the basin, but shows large spatial variability. On average, for domestic use, livestock, and industry, 86–105 l per capita and day (95 % prediction uncertainty, 95 PPU) are available at a reliability level of 95 %. Per farming household, 1.19–1.50 ha (95 PPU) of farmland are available on which a growing period with sufficient water of 3–6 months is reached at the 75 % reliability level, which is suitable for the production of staple crops, as well as 0.19–0.51 ha (95 PPU) of farmland with a growing period of ≥6 months, suitable for the cultivation of cash crops.
The indicators presented reflect stakeholder information needs and can be extracted from the model for any physical or political spatial unit in the basin.


Introduction
Water is becoming increasingly scarce in many regions of the world, especially arid and semi-arid ones (IPCC, 2007b; Millennium Ecosystem Assessment, 2005; UNESCO-WWAP, 2003). Climate change and growing water demand due to population increase and economic development threaten to worsen the situation in coming decades (Hulme et al., 2001; IPCC, 2007a; Liu et al., 2008). Water scarcity impedes development, provokes food shortages and conflicts, and has adverse implications for human and ecosystem health. River basin management thus faces the challenge of reconciling water availability and demand in a sustainable manner.
The concept of "ecosystem services" has in recent years been regarded as a promising way to mitigate problems arising from unsustainable management of natural resources, including water (Belluzzo, 2010; Daily et al., 2009; Nelson et al., 2009). Ecosystem services, according to the Millennium Ecosystem Assessment (2005, p. 40), are "the benefits people obtain from ecosystems". The idea that has made the concept popular is that if people are informed of these benefits and their value, they will be more willing to pay for the conservation of ecosystems. However, apart from helping to raise funds, knowledge about the benefits people obtain from their environment forms the basis for any decision-making process towards sustainable development (Hurni and Wiesmann, 2010).

B. Notter et al.: Modelling water provision as an ecosystem service
While the ecosystem services concept has undoubtedly succeeded in raising awareness of the importance of ecosystems for human well-being (Costanza et al., 1997; Daily et al., 2009; Millennium Ecosystem Assessment, 2003), its application in river basin management has so far met with mixed success. Even though numerous projects targeting watershed services have been initiated in many countries around the globe, the evidence of service delivery remains elusive in many cases (Porras et al., 2008; Carpenter et al., 2009). This calls for systematic quantification approaches.
Part of the problem may be that the term ecosystem services has become a buzzword used to convey a wide range of interests. The original definition of ecosystem services from the Millennium Ecosystem Assessment (2005, see above) is rarely applied systematically. It follows from that definition that only components and processes of ecosystems that are perceived and valued by humans as benefits can be regarded as ecosystem services; and for people to obtain the benefit, it must be accessible in time and space (Notter, 2010). On the one hand, this makes the explicit consideration of stakeholders and their demands a critical input to any quantification approach. On the other hand, spatial and temporal resolution of the assessment must be sufficient to (a) capture variability in demand as well as water availability, (b) determine access of stakeholders to water sources, and (c) predict the temporal reliability at which water for given uses is available.
Besides allowing the distinction between actual benefits and non-valued or non-accessible resources, a quantification approach should also provide for the possibility of predicting service availability for future scenarios; this results in the need to use simulation models. Furthermore, the information needs of stakeholders need to be considered. On the one hand, results should be available at the scale of decision-making, which mostly corresponds to the aggregation level of political units (Ngana et al., 2010). On the other hand, the aim of transparency towards the consumers of research outputs about data and model uncertainty calls for systematic uncertainty assessment (Balin et al., 2010).
In this paper we present an approach to quantifying water provision as an ecosystem service in the East African Pangani Basin according to the above-mentioned requirements using the SWAT hydrological model (Andersson et al., 2009; Arnold et al., 1998; Betrie et al., 2011; Easton et al., 2010; Gassman et al., 2005). The following steps were taken to reach this goal:
1. To make the best possible use of available data, efforts were made to optimise input datasets. For meteorological point data, the most appropriate interpolation technique was determined and applied. Spatial data layers were optimised by combining information from different sources.
2. The Soil and Water Assessment Tool (SWAT) was set up as hydrological simulator and calibrated at the required resolution, according to a spatial discretization scheme that allows extracting modelling results for physical as well as political spatial units. A slightly modified version of SWAT2005, SWAT-P, was developed and applied in order to be able to simulate a large number of spatial units, as well as processes specific to the Pangani Basin. Uncertainty was assessed using Sequential Uncertainty Fitting ver. 2 (SUFI-2) (Abbaspour et al., 2007; Schuol et al., 2008), including additional SWAT-P parameters to assess input uncertainty.
3. Quantitative indicators for water provision for different uses in the Pangani Basin around the year 2000 were derived from model results and socio-economic data, based on criteria of valuation and accessibility by stakeholders.

The study area
The Pangani Basin stretches over 43 000 km² between Kilimanjaro and the Indian Ocean (Fig. 1). 95 % of its area is located in Tanzania (Arusha, Kilimanjaro, Tanga and Manyara Regions), the remaining 5 % in Kenya. Most of the discharge in perennial rivers originates from the humid mountain ranges, while the surrounding lowlands have a semiarid climate (Ngana, 2001a). Crops (coffee, bananas, maize, flowers, sugarcane, and rice) are grown mostly on the mountain slopes and footzones, or in irrigated schemes in the river plains; the Upper Basin includes some of the economically most productive areas of Tanzania, with growing international investments in large-scale agriculture and industries. Hydropower generation along Pangani River satisfies a significant share of Tanzania's electricity demand (IUCN, 2003). Growing water demand increasingly leads to conflicts between water users (Mujwahuzi, 2001). A number of studies with a hydrological focus have been conducted in the basin. A joint river basin management project by the Norwegian NTNU and the University of Dar es Salaam around the year 2000 resulted in a number of case studies (compiled in Ngana, 2001b, 2002). Modelling studies in the basin include, for example, Røhr's (2003) investigations on the hydrology of the Kilimanjaro slopes; Moges' (2003) development of a decision support system for the basin; Ndomba's (2008) successful use of SWAT2005 to predict sediment yield in the Upper Pangani Basin; and a scenario report by IUCN and Pangani Basin Water Office making use of the WEAP model (IUCN and PBWO, 2008).

Data collection and quality control
Three types of data were required for the study:
1. Hydro-meteorological data: daily rainfall, minimum/maximum temperatures and river discharge data were obtained from the University of Dar es Salaam and the Tanzania Ministry of Water, and quality-controlled using methods described by Feng et al. (2004). Information on point source inputs (large springs, boreholes) and granted diversion amounts was available from catchment authorities and case studies (Jalon and Mezer, 1971; Ngana, 2001b, 2002; United Republic of Tanzania, 1977).
2. Spatial data: For the Digital Elevation Model (DEM), the 90 m resolution dataset of NASA's Shuttle Radar Topography Mission (SRTM) (Farr et al., 2007) was used. A soil map was compiled from the FAO maps for Southern Africa (source scale 1:2 000 000) and North-Eastern Africa (source scale 1:1 000 000) (Dijkshoorn, 2003; FAO, 1997

Time-series data
Meteorological time series inputs, especially precipitation data, have repeatedly been identified as one of the main limiting factors in hydrologic modelling, due to spatial patterns not captured by wide-meshed monitoring networks (e.g. Notter et al., 2007). Depending on the chosen interpolation technique, better or poorer spatial representations of meteorological variables can be achieved (e.g. Goovaerts, 2000). The SWAT model itself uses no explicit internal interpolation algorithm: the value from the station closest to each subbasin center is used (Neitsch et al., 2005). However, it is possible to carry out interpolation to subbasin areas outside SWAT and then use the calculated rainfall amounts as pseudo-gauge inputs (Zhang, 2006). For the purpose of testing the performance of different interpolation techniques and pre-processing the meteorological inputs for SWAT, a time-series interpolation tool was developed using ArcGIS and ArcObjects (ESRI Inc.), which interpolates time series data of meteorological variables to raster or polygon geometries, using the interpolation techniques available in ArcGIS (Inverse Distance Weighting, IDW; Spline; and Kriging). Additionally, a secondary variable such as elevation can be included in the interpolation, using an absolute (Eq. 1) or relative lapse rate (Eq. 2). In these equations, $V_{\mathrm{int2},i}$ is the interpolated value of the variable of interest at location $i$, calculated by inclusion of the secondary variable; $V_{\mathrm{int1},i}$ is the interpolated value at location $i$, calculated using a univariate technique; $LR$ is the lapse rate (an absolute or percent increase per unit of the secondary variable); $SV_{\mathrm{real},i}$ is the observed value of the secondary variable at location $i$; and $SV_{\mathrm{int},i}$ is the value of the secondary variable at location $i$, interpolated from those locations where observed values of the variable of interest are available. The performance of the different interpolation techniques was assessed by cross-validation.
First, the univariate algorithms (IDW, Kriging and Spline) were tested against each other with varying parameter settings using samples of daily, monthly and annual time steps of precipitation and temperature data between 1981 and 2000. The technique obtaining the lowest Root Mean Square Error (RMSE) between interpolated and measured values was then combined with elevation as secondary variable and again tested using cross-validation. The resulting best technique was used to pre-process SWAT climate inputs.
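The lapse-rate adjustment and the cross-validation described above can be illustrated with a minimal Python sketch. It is not the ArcGIS/ArcObjects tool used in the study. The function names are hypothetical, and the two adjustment forms are an assumed reading of the symbol definitions: the absolute form adds LR times the difference between observed and interpolated secondary variable, and the relative form scales the univariate estimate by (1 + LR times that difference). IDW stands in for any univariate technique.

```python
import math

def idw(x, y, points, power=1.0, k=24):
    """Inverse-distance-weighted estimate at (x, y).

    points: list of (x, y, value) tuples; only the k nearest are used.
    """
    nearest = sorted((math.hypot(x - px, y - py), v) for px, py, v in points)[:k]
    if nearest[0][0] == 0.0:          # exact hit on a station
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

def lapse_adjust(v_int1, sv_real, sv_int, lapse_rate, relative=False):
    """Adjust a univariate estimate with a secondary variable (assumed forms)."""
    diff = sv_real - sv_int
    if relative:                      # lapse_rate as a fraction per unit
        return v_int1 * (1.0 + lapse_rate * diff)
    return v_int1 + lapse_rate * diff  # lapse_rate in value units per unit

def loo_rmse(stations, lapse_rate=None, relative=False, power=1.0, k=24):
    """Leave-one-out RMSE; stations is a list of (x, y, elevation, value)."""
    sq_errors = []
    for i, (x, y, elev, observed) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        estimate = idw(x, y, [(sx, sy, v) for sx, sy, _, v in others], power, k)
        if lapse_rate is not None:
            # Elevation interpolated from the same stations as the variable.
            elev_int = idw(x, y, [(sx, sy, e) for sx, sy, e, _ in others], power, k)
            estimate = lapse_adjust(estimate, elev, elev_int, lapse_rate, relative)
        sq_errors.append((estimate - observed) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Synthetic stations whose temperature falls by 0.6 degC per 100 m (made-up data).
stations = [
    (0.0, 0.0, 500.0, 27.0),
    (10.0, 0.0, 1500.0, 21.0),
    (0.0, 10.0, 2500.0, 15.0),
    (10.0, 10.0, 900.0, 24.6),
    (5.0, 5.0, 3000.0, 12.0),
]
rmse_plain = loo_rmse(stations)
rmse_lapse = loo_rmse(stations, lapse_rate=-0.006)  # degC per metre
```

On this synthetic example the lapse-rate variant reproduces the withheld values almost exactly, because the data were generated with the same lapse rate; with real station data the improvement depends on how strongly the variable depends on elevation.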

Spatial data
Given the requirement of high spatial detail, efforts were made to improve the spatial input data. The available digital river network for the basin, at a source scale of 1:250 000, showed deviations of >1 km; the DEM derived from SRTM data, on the other hand, was spatially accurate, but river directions could not be reliably determined from it in flat areas. Therefore, a combined approach based on the GeoCover satellite image and the DEM was chosen: in flat areas, where water courses are distinguishable, they were classified from the satellite image using the maximum likelihood classification method. The stream lines thus obtained were then used to "burn in" the DEM using the "Agree" algorithm (Hellweger, 1997). In more mountainous areas, the stream delineation was done based on the DEM. This method left only a few streams, which were neither captured by the satellite image classification nor accurately delineated based on the DEM, to be corrected manually.

The hydrological model: from SWAT2005 to SWAT-P
SWAT (Soil and Water Assessment Tool; Arnold et al., 1998; Gassman et al., 2005) was chosen as hydrological simulator for several reasons: its comprehensiveness in simulating watershed processes (runoff generation and routing, crop growth, nutrient cycling, and erosion are included); its flexibility in spatial discretization, which provides for detailed assessments at plot scale as well as more generalized continental-scale applications (Neitsch et al., 2005; Schuol et al., 2008); the availability of tools for uncertainty assessment (Abbaspour et al., 2007; van Griensven and Meixner, 2006); open access to the source code; and its successful application in other studies in data-limited environments, specifically on the African continent (Betrie et al., 2011; Easton et al., 2010; Schuol et al., 2008; Ndomba et al., 2008). SWAT is a physically-based, distributed model. It includes three levels of spatial aggregation. A watershed is divided into a number of subbasins (typically on a topographical basis), and these are in turn divided into Hydrological Response Units (HRUs) on the basis of land use, soil type and optionally slope. In each HRU, hydrological and biological processes are simulated on a daily or hourly time step. Incoming rainfall is partitioned into surface runoff and infiltration based on the Curve Number method. Flow is aggregated at subbasin level and then routed through the stream network (Neitsch et al., 2005).
For the application of quantifying water provision as an ecosystem service in the Pangani Basin, a few modifications to the source code of SWAT2005 became necessary. The modified version developed and used in the study was called SWAT-P (see also Supplement).
On the technical side, the limit of 6300 HRUs per watershed was removed; SWAT-P is able to simulate configurations with more than 10 000 (up to 99 999) HRUs. The dormancy threshold for tropical latitudes (20° N–20° S) was decreased from 0 to −1 in order to avoid unintended dormancy of plants, which can occur due to differences between the latitudes of the closest weather station and the subbasin center. Some additional output variables such as actual consumptive water use were introduced, and an error in the auto-irrigation routine in SWAT2005, which caused flow in a reach to be set to zero even if only a part of it was removed for irrigation, was corrected.
Regarding process simulation, a simple floodplain routine was introduced that simulates the spilling of water from reaches over adjacent HRUs. This was necessary to model the flooding of Kirua Swamp along the middle reaches of Pangani River, which is relevant for both water users and hydrology in the basin. The new routine works essentially the same way as the existing SWAT wetland routine, but instead of receiving water from a fraction of the subbasin like a "wetland", the "floodplain" receives water when flow in the main reach of the subbasin spills over the banks.
An irrigation efficiency parameter was introduced in order to account for low irrigation efficiency (27 % on average in the basin; IUCN and PBWO, 2008). The order of removal for consumptive use and irrigation was changed so consumptive use gets first priority.
Since some model input data with a high sensitivity (rainfall, temperature, point source discharges, and granted diversion amounts) for the Pangani Basin are of low reliability, correction factors for these inputs were introduced. These correction factors could then be varied during the calibration and uncertainty assessment with SUFI-2 (see Table 1 and text below) like other model parameters, and their uncertainty could be included in the prediction uncertainty.

Model configuration
The model application in the current study posed challenges related to subbasin delineation that could not be handled with existing GIS interfaces. On the one hand, since SWAT does not allow climatic differentiation within subbasins, a large number of very small subbasins would have to be created due to steep ecological gradients, which would increase computation time. On the other hand, the necessary inclusion of stakeholder data required model outputs to be spatially compatible with available stakeholder data (usually linked to the geometry of administrative units). To automate subbasin delineation taking into account these factors and at the same time minimizing the number of subbasins created, a script in AML (Arc Modelling Language, ESRI Inc.) was created.
Inputs into the script include a weighted flow accumulation raster, which allows creating smaller subbasins in more heterogeneous areas and larger subbasins in more homogeneous areas (in this study, weights were based on slope and mean annual rainfall in order to reach higher spatial detail in the humid mountainous regions of the basin than in flat and dry areas); two datasets of land units, one of which creates subbasin outlets at the intersection of streams with its land unit borders (useful for including political land units), and the other one creates subbasin boundaries along the input unit borders (useful for subdividing subbasins into elevation bands); and a raster of lake areas, which ensures that subbasin boundaries do not divide lakes in the model. The additional elevation-band subbasins created within the topographical subbasins with this tool are linked through zero-length "pseudo-streams" in order to ensure correct water routing. When using this option, it should be noted that after running the model, reach outputs should only be used from the outlets of topographical subbasins; when setting up the model, model parameters for surface runoff and erosion (longest tributary channel length, slope length etc.) must be determined based on topographical subbasins, and their values for the elevation-band subdivisions determined the same way that SWAT internally assigns these parameter values to HRUs based on model subbasin inputs. In the current study, 400 m elevation bands were used for this option.
The outputs of the tool were then processed together with the mentioned spatial and time-series input data using the ArcSWAT interface (Winchell et al., 2008) in order to create the input files required by SWAT.
The following points regarding model configuration can further be noted:
1. Large springs with known constant discharge in the footzones of Kilimanjaro and Mt. Meru were modelled as point sources. Water feeding these springs was assumed to originate from deep aquifer recharge on the upper mountain slopes (controlled by the deep aquifer recharge parameter RCHRG_DP). The SWAT-P point source correction factor PSCOR was used to account for the uncertainty in these inputs during calibration with SUFI-2.
2. For crop areas, the land use inputs only allowed differentiation between herbaceous crops like maize, rice or sugarcane, and tree/shrub crops like coffee or bananas (FAO, 2002). Therefore, generic land use classes were created in the SWAT crop database. They were parametrised based on existing classes, hydrological modelling work done in Kenya (McMillan and Liniger, 2005; Notter et al., 2007), and local experience (R. Daluti, personal communication, 2007). Plants were constantly kept at maximum growth stage for two reasons. First, planting dates vary spatially within the basin (in most areas, the main rainy season starts in March/April, while on the eastern slopes of mountain ranges, it starts in November), as well as temporally, depending on the onset of the rainy season in a particular year. Second, appropriate observed data were not available to calibrate plant growth. With irrigated agriculture being modelled using the SWAT auto-irrigation routine, this approach resulted in the maximum possible irrigation water demand. However, diversion amounts for irrigation were capped using the variable DIVMAX, based on the water rights (maximum allowed abstractions) kept in the Pangani Basin Water Office water rights database. To account for uncertainty due to the possibly incomplete database and lacking enforcement of abstraction limitations, the variable DIVCOR, also introduced with SWAT-P, was included in calibration with SUFI-2 (see next section). As the location of water sources was only known for the large-scale irrigation schemes, for all other irrigated areas, the path distance function in ArcGIS (ESRI Inc.) was used to determine the nearest irrigation water source, using the option that there should be no upward paths (since water in canals cannot move upward).
Table 1. Parameters sensitive to discharge calibrated using SUFI-2 (prefix v indicates that the parameter value is replaced by a given value; prefix r indicates that the parameter value is multiplied by (1 + a given value); Abbaspour et al., 2007).
4. Water transfers for domestic, livestock, and industrial use were conceptualised as "consumptive" use, i.e. constant rates were removed from streams and aquifers through the SWAT *.wus input files.
5. Natural lakes and small dams were simulated as unmanaged reservoirs; Nyumba ya Mungu, located in the middle of the basin (see Fig. 1) was simulated as a managed reservoir with monthly target storages obtained from Moges (2003), but measured outflow data were used as inputs into the downstream reaches for periods for which they were available.

Calibration, validation, and uncertainty assessment
Model calibration, validation, and uncertainty assessment were carried out using the Sequential Uncertainty Fitting ver. 2 (SUFI-2) algorithm (Abbaspour et al., 2007). SUFI-2 aggregates uncertainties in model concept, inputs and parameters and aims to obtain the smallest possible uncertainty (range) of predictions while bracketing most of the observed data (Schuol et al., 2008). Starting with large, physically meaningful parameter ranges, SUFI-2 decreases these ranges iteratively. The "P-factor" describes the percentage of data bracketed by the 95 % prediction uncertainty and should be as large as possible, up to a maximum value of 100. The "R-factor" describes the width of the 95 % uncertainty interval in standard deviations of measured data and should therefore be minimised. SWAT was calibrated and validated against measured monthly discharge data from 16 stations in the basin, mostly from the period 1980-2005; measured data to calibrate other model output variables were not available in time-series form. Due to numerous gaps in the measured series, the criterion was formulated that 3 yr of data had to be available for both calibration and validation. At four stations, the available data series was long enough for calibration only. At three further stations, the data series ended at the beginning of the 1980s; therefore, earlier data (from 1960 onward) were used for calibration and validation at these locations. A 10 % measurement error was included in the P- and R-factor calculations (compare e.g. Abbaspour et al., 2009; Andersson et al., 2009; Schuol et al., 2008).
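As an illustration of the two uncertainty measures, the following Python sketch computes a P-factor and an R-factor from an observed series and the lower and upper limits of a 95 % prediction band. It is a simplified sketch, not the SUFI-2 implementation: the function name is hypothetical, the sample standard deviation is assumed for the measured data, and the measurement error is assumed to widen the acceptance band around each observation by a fixed fraction of its value.

```python
import statistics

def p_and_r_factor(observed, lower, upper, measurement_error=0.0):
    """Return (P-factor in %, R-factor) for a 95 % prediction band.

    observed, lower, upper: equally long sequences of values per time step.
    measurement_error: assumed relative error of the observations (e.g. 0.1).
    """
    n = len(observed)
    bracketed = 0
    for obs, lo, hi in zip(observed, lower, upper):
        tolerance = measurement_error * abs(obs)
        if lo - tolerance <= obs <= hi + tolerance:
            bracketed += 1
    p_factor = 100.0 * bracketed / n
    # Mean band width, expressed in standard deviations of the measurements.
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    r_factor = mean_width / statistics.stdev(observed)
    return p_factor, r_factor

# Made-up monthly discharges and band limits.
obs = [10.0, 20.0, 30.0, 40.0]
band_lower = [8.0, 15.0, 31.0, 35.0]
band_upper = [12.0, 25.0, 36.0, 45.0]
p_strict, r_strict = p_and_r_factor(obs, band_lower, band_upper)
p_tolerant, _ = p_and_r_factor(obs, band_lower, band_upper, measurement_error=0.1)
```

A narrow band gives a small R-factor but usually brackets fewer observations, so the two measures have to be judged together.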
Sixteen parameters sensitive to discharge were identified in an initial sensitivity analysis (as described in Abbaspour et al., 2008; Table 1). These included the correction factors for measured inputs introduced with SWAT-P (compare Sect. 3.3). The parameters were grouped into 11 zones on the basis of climate, topography, and geology (compare Faramarzi et al., 2009; Fig. 1). For the groundwater parameters, the zones in mountain areas were internally further differentiated into a higher and a lower zone in order to capture areas dominated by groundwater recharge and discharge, respectively (Ngana, 2001b).
With the fine spatial detail and the consequently large number of spatial units to be simulated, computing time was considerable: almost 6 h for one simulation run of the entire basin for the 1981-2005 period on a computer with two 2.53 GHz CPUs. To make the best possible use of available computing resources, SUFI-2 iterations were split (see Abbaspour et al., 2008) and also run on Linux, for which SUFI-2 was adapted by making use of the open-source Mono and Wine packages.
The Nash-Sutcliffe efficiency (Nash and Sutcliffe, 1970) was used as the objective function to evaluate the performance of each simulation at each discharge station. The parameter ranges obtained in the last iteration of SUFI-2 represented the posterior parameter space on which subsequent analysis was carried out (see next section).
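For reference, the Nash-Sutcliffe efficiency compares the squared simulation errors with the variance of the observations around their mean. A minimal Python sketch follows; the function name is illustrative and the handling of gaps in the measured series is left out.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency of a simulated against an observed series.

    1.0 is a perfect fit; 0.0 means the simulation is no better than
    using the mean of the observations as the prediction.
    """
    mean_obs = sum(observed) / len(observed)
    error_sum = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_sum / variance_sum
```

Values below zero indicate that the simulation performs worse than the observed mean.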

Derivation of indicators for water provision
Indicators for the ecosystem service of water provision were defined according to stakeholder requirements. These were established based on previous studies (IUCN, 2003; IUCN and PBWO, 2008; Msuya, 2010; Ngana, 2001b, 2002; Turpie et al., 2005) and validated during a workshop in December 2009 in collaboration with 25 representatives of the relevant stakeholder groups (small- and large-scale farmers, urban residents, water and agricultural sector authorities, and TANESCO, the Tanzania Electricity Supply Company). Information was collected on the quantity, location, and timing in which water needs to be available in order to be valued (i.e. perceived as a benefit) for a given use and accessible (i.e. obtained) by the different stakeholder groups. The water uses considered included domestic supply, livestock, industry, agriculture, and hydro-electric power production (Table 3). Stakeholder requirements regarding water quality were not considered, since SWAT model parameters affecting water quality had not been calibrated (see Sect. 3.5).
The indicators were derived from a SUFI-2 iteration varying model parameters within the posterior parameter space established in calibration. The simulations comprised 25 yr using meteorological data from 1981 to 2005. Water management inputs were based on socio-economic data from 2002 for Tanzania and from 1999 for Kenya. The resulting simulation reflects the socio-economic situation around the year 2000, and includes the climatic variability of 25 yr of weather data.
Domestic supply, livestock and industry require modest but constant water supply; therefore, actual water use at the points of diversion at the 95 % reliability level (i.e. the amount available at least 95 % of the time) was chosen as the indicator of water provision for these uses. For each river reach, this was determined as the diversion amount modelled at the time step at which the 95th percentile of simulated monthly discharges from the entire simulation period, sorted in descending order, was reached (based on Estoppey et al., 2000). The location of water diversions is known from the Pangani Basin Water Office (PBWO) water rights database for the approximately 72 % of the basin population served by water supply authorities or water projects (which includes industrial use); the amount that can be diverted for use is also known at these points. For the remaining 28 % of the basin population obtaining water from unimproved sources (percentages by Ward are known from 2002 census data), it was assumed that water for domestic use and livestock is fetched from the nearest surface source, which was determined using the Path-Distance function in ArcGIS (ESRI Inc.). The maximum amount removed was estimated at 20 litres per capita and per day (lcd) for humans (based on Turpie et al., 2005; this corresponds to the amount that can be carried over large distances), 50 lcd for cattle and 10 lcd for sheep and goats (figures according to stakeholder workshop participants). It was also assumed that this portion of the population is equally distributed over each Ward area, excluding protected areas.
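The selection of the time step at a given reliability level can be sketched in Python as follows. This is a minimal illustration under the assumption that the time step is found by ranking all monthly discharges in descending order and taking the one at the rank covering the required share of time steps; the function name and the example series are hypothetical.

```python
import math

def reliability_timestep(discharges, reliability=0.95):
    """Return (index, discharge) of the time step at a reliability level.

    The discharge returned is equalled or exceeded in at least the given
    share of all time steps of the simulation period.
    """
    order = sorted(range(len(discharges)),
                   key=lambda i: discharges[i], reverse=True)
    # 1-based rank in the descending series; the small offset guards
    # against floating-point noise in reliability * n.
    rank = max(1, math.ceil(reliability * len(discharges) - 1e-9))
    index = order[rank - 1]
    return index, discharges[index]

# Twenty made-up monthly discharges and the diversions simulated alongside.
monthly_q = [float(v) for v in range(1, 21)]
monthly_diversion = [min(q, 5.0) for q in monthly_q]
step, q95 = reliability_timestep(monthly_q)
diversion_at_q95 = monthly_diversion[step]
```

The diversion amount at the returned index then serves as the water use available at the chosen reliability level.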
In summary, water provision from a given subbasin $i$ can be expressed in terms of the following variables: $wus_i$ is the amount of water provision for domestic, livestock, and industrial use from subbasin $i$ (corresponding to the variable WUS_EFF introduced in SWAT-P in the .rch and .sub output files), $Q95_i$ is the simulated water availability in subbasin $i$ at 95 % reliability, $MaxDiversion_i$ is the sum of water rights for domestic, livestock, and industrial use in subbasin $i$, $NoPeople_i$ is the number of people without water rights fetching water in subbasin $i$, $NoCattle_i$ is the number of cattle not included in water rights data obtaining water from subbasin $i$, and $NoShoats_i$ is the number of sheep and goats not included in water rights data obtaining water from subbasin $i$. Deficits were calculated by comparing actual water availability to a theoretical per capita demand of 135 lcd (litres per capita per day) in urban and 65 lcd in rural areas for humans, 50 lcd for cattle, 10 lcd for sheep and goats, and the granted water amounts from the PBWO database for industries (based on IUCN and PBWO, 2008; Msuya, 2010; Turpie et al., 2005).
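One way to read the variables defined above is that water provision is the smaller of the water available at 95 % reliability and the total amount that can be withdrawn (water rights plus the per-capita amounts fetched by people and livestock). The Python sketch below encodes this reading; it is an assumption about how the terms combine, not a transcription of the study's equation, and the function name and unit convention (litres per day) are hypothetical.

```python
def water_provision(q95, max_diversion, n_people, n_cattle, n_shoats,
                    lcd_people=20.0, lcd_cattle=50.0, lcd_shoats=10.0):
    """Water provision for domestic, livestock and industrial use.

    All amounts are in litres per day. The per-capita defaults are the
    maximum amounts removed quoted in the text (20, 50 and 10 lcd).
    Assumed form: provision is capped by availability at 95 % reliability.
    """
    withdrawable = (max_diversion
                    + n_people * lcd_people
                    + n_cattle * lcd_cattle
                    + n_shoats * lcd_shoats)
    return min(q95, withdrawable)

# Made-up subbasin: provision limited by demand, then by availability.
demand_limited = water_provision(100000.0, 50000.0, 1000, 200, 500)
supply_limited = water_provision(60000.0, 50000.0, 1000, 200, 500)
```

Dividing the result by the population served would give a per-capita figure comparable to the lcd values quoted in the text.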
In order to quantify water provision for agriculture, water amounts available for irrigation could be derived from model outputs. However, farmers and authorities in the basin are much more interested in knowing how much land can be cultivated with certain crop types, regardless of whether the required water is supplied by rainfall or irrigation. Therefore, the duration of the growing period with sufficient water availability (GP) at the 75 % reliability level (i.e. reached in at least 3 out of 4 yr) was determined based on the SWAT output variable water stress (WSTRS). In a first step, the growing period duration in months in a given year $j$ was determined for each HRU as
$$GP_{j,\mathrm{HRU}} = N\left(n_1, n_2, \ldots, n_N \mid wstrs_n \geq 0.5\right)$$
where $N$ is the count of consecutive months $n_1$ to $n_N$ in year $j$ in which the ratio of water availability to plant water demand, $wstrs_n$, is equal to or above 0.5 (threshold based on Verdin and Klaver, 2002, and Smith, 1992). In a second step, the growing period duration reached or exceeded in 75 % of years was determined. Crop areas were then categorised into areas with a GP duration of 3–6 months, which is suitable for staple crops like maize, and areas with a GP duration of ≥6 months, suitable for the cultivation of cash crops like coffee (K. Nkya, personal communication, 2009). The final indicators for water for agriculture were calculated for each aggregation level (subbasin, District, Region) as the areas of farmland of the two categories available per farming household. In these indicators, ha refers to hectares of cropland, FarmHH to households with farming as the main income according to the 2002 census, N(FarmHH) to the count of farming households at a given aggregation level, and GP3-6 and GP ≥ 6 to growing period durations of 3 to 6, or at least 6 months, respectively.
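The two steps and the final per-household indicator can be sketched in Python. The sketch rests on several assumptions that are not stated in the text: the growing period of a year is taken as the longest run of consecutive months meeting the threshold, runs are not continued across the turn of the year, and a duration of exactly 6 months is assigned to the longer category. All function names are hypothetical.

```python
import math

def growing_period(wstrs_monthly, threshold=0.5):
    """Longest run of consecutive months with wstrs >= threshold in one year."""
    longest = run = 0
    for ratio in wstrs_monthly:
        run = run + 1 if ratio >= threshold else 0
        longest = max(longest, run)
    return longest

def gp_at_reliability(gp_by_year, reliability=0.75):
    """GP duration reached or exceeded in at least the given share of years."""
    ordered = sorted(gp_by_year, reverse=True)
    rank = max(1, math.ceil(reliability * len(ordered) - 1e-9))
    return ordered[rank - 1]

def gp_category(gp_months):
    """Assign a GP duration to one of the two crop categories (or None)."""
    if gp_months >= 6:
        return "GP>=6"
    if gp_months >= 3:
        return "GP3-6"
    return None

def farmland_per_household(hrus, n_farm_households):
    """hrus: list of (cropland_ha, [GP per year]); returns ha per household."""
    totals = {"GP3-6": 0.0, "GP>=6": 0.0}
    for area_ha, gp_by_year in hrus:
        category = gp_category(gp_at_reliability(gp_by_year))
        if category is not None:
            totals[category] += area_ha
    return {name: area / n_farm_households for name, area in totals.items()}

# Made-up aggregation unit with three cropland HRUs and ten farming households.
example_hrus = [(10.0, [4, 4, 5, 3]), (5.0, [7, 8, 6, 9]), (20.0, [1, 2, 0, 1])]
indicators = farmland_per_household(example_hrus, 10)
```

Because the aggregation only sums HRU areas, the same function can be applied to any subbasin, District or Region for which the HRU membership and household count are known.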
For hydropower generation, discharge at the power plant locations versus the minimum and maximum discharges required for power generation were used to calculate the percentage of time power can be generated as the main indicator for ecosystem service provision; mean discharge and discharge at 95 % reliability were also calculated for these locations.
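A minimal Python sketch of two of these indicators is given below. The function names and the use of a plain list of simulated discharges are assumptions for illustration. The sketch only uses the minimum discharge required for generation, because the text does not specify how the maximum discharge enters the calculation.

```python
import math


def percent_time_above(discharge_m3s, q_min_m3s):
    """Percentage of time steps with discharge at or above q_min_m3s."""
    hits = sum(1 for q in discharge_m3s if q >= q_min_m3s)
    return 100.0 * hits / len(discharge_m3s)


def discharge_at_reliability(discharge_m3s, reliability_pct=95):
    """Largest discharge reached or exceeded in reliability_pct % of time steps."""
    ranked = sorted(discharge_m3s, reverse=True)
    k = math.ceil(reliability_pct * len(ranked) / 100)
    return ranked[max(k, 1) - 1]
```

With 20 time steps of 1, 2, ..., 20 m³ s⁻¹, the discharge at 95 % reliability is 2 m³ s⁻¹, since 19 of the 20 values reach or exceed it.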

Pre-processing of climate input data
Among the univariate interpolation techniques, IDW (with a power of 1 and considering the 24 nearest neighbouring stations) emerged as the overall best-performing technique for both rainfall and temperature; however, the best parameter settings for Kriging performed almost as well and even outperformed IDW in the more data-scarce decade of the 1990s.
Including elevation as a secondary variable, by using a relative lapse rate of 3.6 % per 100 m of elevation for rainfall and an absolute rate of −0.6 °C per 100 m for temperature, improved interpolation performance only marginally for daily rainfall (by 0.2 %); however, for annual aggregated rainfall and for temperature, the interpolation error was reduced by 8.3 % and 48.3 %, respectively (Fig. 2). Rainfall and temperature inputs for SWAT were therefore pre-processed (by interpolation to model subbasin areas) using this method. Compared to the method internally used by SWAT (using the unmodified value from the nearest station, see Neitsch et al., 2005), the interpolation error was reduced by 16.1 % for daily rainfall, and by 58.4 % for temperature. By identifying the most appropriate interpolation technique and including the secondary high-resolution information on elevation, a more realistic spatial representation of climatic variables, and a reduction of input uncertainty for modelling, could be achieved; however, interpolation errors continued to be a significant source of uncertainty in modelling that needed to be addressed in calibration and uncertainty assessment with SUFI-2 (see Sect. 3.5).
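One possible way to combine IDW with the elevation lapse rates is sketched below in Python. The IDW settings (power 1, up to 24 nearest stations) and the lapse rates are taken from the text. The order of operations is an assumption made for illustration, namely shifting each station value to the target elevation first and then interpolating the adjusted values; the function names are also assumptions, and the pre-processing tool used in the study may work differently.

```python
import math


def adjust_temperature(value_c, z_station_m, z_target_m, lapse_c_per_100m=-0.6):
    """Shift a station temperature to the target elevation (absolute lapse rate)."""
    return value_c + lapse_c_per_100m * (z_target_m - z_station_m) / 100.0


def adjust_rainfall(value_mm, z_station_m, z_target_m, rel_lapse_per_100m=0.036):
    """Scale station rainfall to the target elevation (relative lapse rate)."""
    factor = 1.0 + rel_lapse_per_100m * (z_target_m - z_station_m) / 100.0
    return max(0.0, value_mm * factor)  # rainfall cannot become negative


def idw(target_xy, stations, power=1.0, n_neighbours=24):
    """Inverse distance weighting; stations is a list of (x, y, value)."""
    nearest = sorted(
        (math.hypot(x - target_xy[0], y - target_xy[1]), value)
        for x, y, value in stations
    )[:n_neighbours]
    if nearest[0][0] == 0.0:  # target coincides with a station
        return nearest[0][1]
    weights = [dist ** -power for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

In use, each station value would be passed through `adjust_temperature` or `adjust_rainfall` for the elevation of the target subbasin, and the adjusted values would then be handed to `idw`.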

Model configuration
The model configuration resulted in 1853 physical (topographical) subbasins, 3820 model subbasins including elevation band subdivisions, and 25 677 HRUs in the entire Pangani Basin. This configuration allowed a maximum of spatial detail and provided flexibility to use available inputs and produce the required outputs, while keeping model complexity and computational demand as low as possible. Experiments showed that, with conventional subbasin delineation tools, several tens of thousands of subbasins would have had to be created in the entire basin in order to reach a similar degree of detail in the critical areas (Fig. 3). The creation of separate stream reaches per Ward allowed using water demand inputs at the scale at which they are available, and extracting model outputs for any administrative or physical spatial unit required.
The subdivision of subbasins based on elevation bands actually results in the introduction of an additional conceptual level of aggregation in SWAT. In order to avoid model outputs being unintentionally affected by this configuration, model parameters calculated at subbasin level (such as longest flow paths and slope length) must be calculated for topographic subbasins and then apportioned to modelled subbasins according to their areal fraction in the topographic subbasin, the same way SWAT internally calculates these parameter values, which are input at subbasin level, for each HRU (Neitsch et al., 2005).

Calibration and validation results
Calibration and validation of monthly discharge yielded satisfactory results given the scarcity of available data. In the Upper Basin (upstream of the Nyumba ya Mungu Reservoir), where a higher density of climate stations is available, model performance is generally better than in the Lower Basin (downstream of Nyumba ya Mungu). The fact that similar performance measures were reached in the validation as in the calibration period indicates that there was no "overfitting" of parameters.
In the Upper Basin, Nash-Sutcliffe Efficiency (NSE) scores of ≥0.5 were achieved at 7 out of 8 gauges in the calibration period and at 4 out of 6 gauges in the validation period (Fig. 4). On average, in the calibration period, the P-factor (% of measured data bracketed by the 95 % prediction uncertainty) at all stations was 70 %, and the R-factor was 0.61. In the validation period, the average P-factor was 66 % and the average R-factor was 0.78.
In the Lower Basin, NSE values of ≥0.5 were achieved at 4 of 8 gauges in calibration, and at 4 of 7 gauges in validation. At two gauges, 1DB2A (Saseni at Gulutu) and 1DA3 (Luengera at Magoma), located at the outlets of mountainous catchments without any rain gauge in the vicinity, negative NSE scores were obtained in calibration. The low NSE score (0.11 in calibration, 0.08 in validation) achieved at the gauge on Pangani at Buiko, at the outlet of Kirua Swamp, was partly due to the low variability of discharge at this station, which makes a higher NSE value harder to achieve. In the Lower Basin, the average P-factor in calibration was 73 % and the average R-factor was 1.58; in validation, the average P-factor was 74 % and the average R-factor 1.66. These results confirm the quite large uncertainty of discharge that was to be expected in the Lower Basin based on the density and reliability of available measured data.
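The performance measures reported above can be computed as in the following Python sketch. NSE and the P-factor follow their standard definitions. The R-factor is taken here as the mean width of the 95 PPU band divided by the standard deviation of the observations; the use of the population standard deviation and the function names are assumptions for illustration.

```python
import statistics


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    mean_obs = statistics.fmean(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    observed_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / observed_spread


def p_factor(obs, lower, upper):
    """Percentage of observations bracketed by the 95 PPU band."""
    inside = sum(1 for o, lo, up in zip(obs, lower, upper) if lo <= o <= up)
    return 100.0 * inside / len(obs)


def r_factor(obs, lower, upper):
    """Mean width of the 95 PPU band relative to the observed standard deviation."""
    mean_width = statistics.fmean([up - lo for lo, up in zip(lower, upper)])
    return mean_width / statistics.pstdev(obs)
```

The denominator of `nse` is the spread of the observations around their mean, which illustrates why a high NSE is harder to reach at a gauge with low discharge variability such as Buiko: the same absolute error is divided by a smaller number.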
Calibration of a hydrological model solely based on discharge can lead to over- or underestimation of non-calibrated water balance elements such as evapotranspiration or deep aquifer recharge (compare Table 2). Therefore, modelling results from this study were compared to results of previous studies. The comparison shows that the estimation of non-calibrated elements of the water balance is consistent with previously published results: for example, the modelled average deep aquifer recharge in the Upper Basin is similar to the rate given by Ndomba et al. (2008), who estimated an average value of the RCHRG_DP (deep aquifer recharge) parameter of about 0.75 for the Kikuletwa subcatchment; and the modelled 95 % actual evapotranspiration for elevation bands on Kilimanjaro is in the range of the values resulting from an assessment by Røhr (2003) using the CRAE approach (Morton, 1983, see Fig. 6).

Domestic, livestock and industrial use
Overall water provision for domestic, livestock, and industrial use in the Pangani Basin is 2.84-3.48 m³ s⁻¹ (95 % prediction uncertainty, 95 PPU), available at least 95 % of the time. Thus the theoretical demand of 4.01 m³ s⁻¹ is not met (Table 4). It is not known which share of the available water goes to which use, but according to regulations, domestic use should be given the first and industrial use the last priority. The distribution of water provision in space is unequal. Demand is met or nearly met at the 95 % reliability level in large areas of the Upper Basin (most Districts around Kilimanjaro and Mt. Meru). Most Districts in the Lower Basin experience deficits. This spatial disparity is only partly due to natural water availability (which is low in the semi-arid Southwestern Plateau, but not in the Pare and Usambara Mountains); it is also due to differences in water infrastructure development: in the Upper Basin, the percentage of people supplied by authorities or water projects is higher, while in the Lower Basin, more people have to fetch water from streams and are therefore not able to meet the theoretical per capita demand.

Water provision for agriculture
Water provision for agriculture comprises rainfed as well as irrigated cultivation. With a reliability of 75 %, i.e. in at least 3 out of 4 yr, rainfall is sufficient for an average growing period (GP) duration of 3.2-4.3 months on rainfed cropland (95 PPU, Table 5, Fig. 7); however, in more humid highland areas (e.g. Arumeru, Arusha, Hai or Moshi Districts), 5 months and more are experienced. In the highlands, cash crops can therefore be grown with only limited irrigation to complement rainfall, while in the lowlands, even the production of staple crops is unreliable without irrigation. On irrigated land, the duration of the period with sufficient water is extended to 5.1 to 7.1 months on average (95 PPU, Table 5, Fig. 7), but again, spatial disparities are high: some irrigation schemes, mainly in the Upper Basin, can irrigate year-round, while others suffer from water scarcity and competition from upstream users. On basin average, 1.2-1.5 ha of cropland with a growing period duration of 3-6 months at the 75 % reliability level, suitable for the production of staple crops, are available per farming household (95 PPU, Table 5). Land suitable for cash crop production, reaching a growing period duration of 6 months or above, is available at 0.2-0.5 ha per farming household. Again, the Upper Basin and the highland areas are privileged in this regard compared to the Lower Basin and lowland areas.
Average areas planted with crops that require given growing period durations can also be extracted from the 2002-2003 Agricultural Sample Census (ASC), conducted by the Tanzania National Bureau of Statistics. A comparison of modelling results from this study to the ASC dataset by administrative Region indicates that the figures for cropland per household presented here are in the correct range: in Arusha and Kilimanjaro Regions, the ASC figures are within the 95 PPU range of modelling results (Fig. 8). In Tanga Region, the results obtained in this study are lower than the averages from the ASC. The probable reason is that the land cover input used for modelling (the FAO Africover map from 1997, FAO, 2002) classifies large areas in the Usambara Mountains as natural vegetation, although they are clearly used for agriculture today (as confirmed e.g. by recent Google Maps satellite images, Google ©). For Manyara Region (comprising the semi-arid Southwestern part of the Basin), too few households were interviewed during the ASC in those Districts lying within the Pangani Basin to obtain a representative average.
It can be noted in Fig. 8 that the uncertainty in the duration of the growing period is highest in Arusha Region, located in the Upper Basin. This is contrary to the width of the 95 % uncertainty range in discharge, which is higher in the Lower Basin (compare Sect. 4.3). The reason is the large share of irrigated agriculture in Arusha Region, which, in combination with a high uncertainty on the enforcement of abstraction limitations (variable DIVCOR, see Table 1), leads to a large range in growing period durations throughout the posterior parameter space.
Fig. 6. Comparison of actual evapotranspiration by elevation on the Southern Kilimanjaro slopes calculated with the CRAE method by Røhr (2003) and modelled with SWAT in this study.
(Table 6); however, power rationing due to reduced power production in dry periods does occur and has become more frequent in recent years (which is also a result of increasing demand).

Conclusions and outlook
This study has attempted to quantify water provision in an East African watershed as an ecosystem service, based on the definition of ecosystem services as benefits obtained by people. Consequently, instead of carrying out a purely biophysical assessment of water as a resource, the conditionality of the resource being a benefit, i.e. valued and accessible by stakeholders, has additionally been included. This necessitated the explicit consideration of valuation of resources by stakeholders, and a sufficiently high spatial and temporal resolution in order to determine whether access to water is given. In addition, the requirement to be able to make predictions for unmeasured locations and for the future resulted in the need to use a hydrological model, and the uncertainty about measured data and model structure, as well as the aim of transparency towards the consumers of the research results, called for systematic uncertainty assessment. The study has shown that these requirements could be fulfilled in a relatively large, data-scarce watershed, by applying and slightly adapting the well-established SWAT model, as well as developing and adapting tools for pre-processing or uncertainty assessment. This way, some limitations to the application of SWAT in data-scarce contexts could be overcome. Further, the study has demonstrated how model results can be combined with socio-economic data in order to better fulfill stakeholder information needs.
The SWAT model has proved itself a flexible and comprehensive simulation tool for the purpose of quantifying water provision explicitly in time and space and at varying scales. An additional advantage is the fact that it has been successfully tested and applied in many parts of the world (Faramarzi et al., 2009; Gassman et al., 2005; Schuol et al., 2008; Yang et al., 2006). Minor modifications of the model code were necessary for the application in the study context, especially regarding the ability to simulate a large number of spatial units, as well as some processes particular to the study area, such as the large Kirua floodplain. This led to the development of the SWAT-P version. A technical drawback of the program when simulating large watersheds and/or fine spatial detail (the fact that separate input files are required for each subbasin and HRU, which can lead to hundreds of thousands of input files) remains unresolved for the moment, as resolving it would require a concerted effort of the SWAT user community due to the numerous other programs built around the model (like the GIS interfaces or SUFI-2) that require the current input file structure.
The SUFI-2 method, in combination with the correction factors introduced in SWAT-P, allows assessment of uncertainty in inputs that are very relevant but at the same time available only in low quality in the Pangani Basin, which could very well be the case in other tropical watersheds as well. Newer versions of the software (SWAT-CUP ≥ 2.1.4) allow varying precipitation inputs directly through the interface, without the correction factor available only in SWAT-P. SUFI-2 further has the advantages that fewer simulation runs are required for calibration and uncertainty assessment than with other available algorithms, and that uncertainty analysis outputs are easily communicable to a wider, also non-scientific audience.
Efforts to optimise input data, such as pre-processing of climatic inputs based on secondary information, or combining different types of spatial information, have been shown to reduce input uncertainty and add spatial detail. Considerable uncertainty still remains in the final outputs, which could be made transparent using the SUFI-2 method. The added spatial detail was a prerequisite to come up with realistic estimates of water resources that are actually accessible to stakeholders: if water sources (e.g. streams) were spatially aggregated at a too high level in the model, the available water quantity would tend to be overestimated, since small streams with unreliable flow, that may represent the only accessible water source for parts of the population, could not be distinguished.
Data limitations also prevented calibration of the model based on additional output variables besides discharge, such as water quality parameters or crop yield. This would have added confidence in modelling results. However, where information from earlier studies was available, even if only for parts of the basin (such as evapotranspiration in the Kilimanjaro area (Røhr, 2003), aquifer recharge in the Kikuletwa subcatchment (Ndomba et al., 2008), or cropland areas per household based on the Agricultural Sample Census), it was compared to the results of this study. The comparisons show that the results presented here are consistent with earlier publications, which increases their plausibility.
Estimates of water provision as an ecosystem service, i.e. as a valued and accessible benefit, at relatively fine spatiotemporal resolution and over the entire basin as output by this study, represent an added value over the generalised estimates of demand or physical water availability, or local case studies, available so far (e.g. IUCN and PBWO, 2008; Moges, 2003; Ngana, 2001b; Turpie et al., 2005). Results are directly available for any physical or political unit required and can therefore be used for planning and policy-making from local to basin level. Since the indicators have been defined based on stakeholder preferences, they are appropriate for use in decision-making processes involving stakeholders, e.g. on future priorities of water management and allocation. Additionally, indicators expressed in physical (not dimensionless) units can be verified based on observations, and allow determining whether critical levels are reached. Finally, the indicators presented rely on data from a sufficiently long period to include interannual variability, so that the temporal reliability of water availability can be estimated (compare e.g. Andersson et al., 2009).
In other parts of the world, observed data may be available to quantify water provision at a similar level of detail as in the present study for a "historical" timeline like the year 2000 without having to use a hydrological model. However, models have the advantage of being able to make projections for future scenarios. This study provides a strong basis on which such future projections can be made for the Pangani Basin. Similar approaches could also be applied in other watersheds in data-scarce areas. The approach can also provide input data for the assessment of other ecosystem services in a given study area.