Large-scale groundwater modeling using global datasets: a test case for the Rhine-Meuse basin

The current generation of large-scale hydrological models does not include a groundwater flow component. Large-scale groundwater models, involving aquifers and basins of multiple countries, are still rare mainly due to a lack of hydro-geological data which are usually only available in developed countries. In this study, we propose a novel approach to construct large-scale groundwater models by using global datasets that are readily available. As the test-bed, we use the combined Rhine-Meuse basin that contains groundwater head data used to verify the model output. We start by building a distributed land surface model (30 arc-second resolution) to estimate groundwater recharge and river discharge. Subsequently, a MODFLOW transient groundwater model is built and forced by the recharge and surface water levels calculated by the land surface model. Results are promising despite the fact that we still use an offline procedure to couple the land surface and MODFLOW groundwater models (i.e. the simulations of both models are separately performed). The simulated river discharges compare well to the observations. Moreover, based on our sensitivity analysis, in which we run several groundwater model scenarios with various hydro-geological parameter settings, we observe that the model can reasonably well reproduce the observed groundwater head time series. However, we note that there are still some limitations in the current approach, specifically because the offline-coupling technique simplifies the dynamic feedbacks between surface water levels and groundwater heads, and between soil moisture states and groundwater heads. Also the current sensitivity analysis Correspondence to: E. H. Sutanudjaja (e.sutanudjaja@geo.uu.nl) ignores the uncertainty of the land surface model output. Despite these limitations, we argue that the results of the current model show a promise for large-scale groundwater modeling practices, including for data-poor environments and at the global scale.


Introduction
Groundwater is a vulnerable resource, and in many areas, groundwater is being consumed faster than it is being naturally replenished (e.g.Rodell et al., 2009;Wada et al., 2010).Given increased population and heightened variability and uncertainty in precipitation due to climate change, the pressure upon groundwater resources is expected to intensify.These issues make monitoring and predicting groundwater changes, especially over large areas, imperative.
Changes in groundwater resources and their causes can be inferred from groundwater models.A groundwater model has the ability to calculate and predict spatio-temporal groundwater head in a sufficiently fine resolution (e.g. 1 km resolution).However, large-scale groundwater models, especially for large aquifers and basins of multiple countries, are still rare, mainly due to lack of hydro-geological data.Some existing large-scale groundwater models, such as in the Death Valley area, USA (D'Agnese et al., 1999), and in the MIPWA region, the Netherlands (Snepvangers et al., 2007), were developed on the basis of highly detailed information (e.g.elaborate 3-D geological models).Such information may be available in developed countries but is seldom available in other parts of the world.
Published by Copernicus Publications on behalf of the European Geosciences Union.In this paper, we propose a novel approach for constructing a large-scale groundwater model by using only readily available global datasets.Note that by large scale, we mean large extent area sizes, as defined by Bierkens et al. (2000), and not map scale or resolution.Here the model proposed is a MODFLOW transient groundwater model that is coupled to a distributed land surface model.The latter is used to estimate groundwater recharge and surface water levels that are used to force the groundwater model.As the test bed of this study, we use the combined Rhine-Meuse basin (total area: ±200 000 km 2 ).This basin, located in Western Europe (see Fig. 1), is selected because it contains ample groundwater head data that can be used to verify the model output.However, while constructing the model, we use only globally available datasets that are listed as follows.We use the Global Land Cover Characteristics Data Base Version 2.0 (GLCC 2.0, http://edc2.usgs.gov/glcc/globeint.php) and FAO soil maps (1995) in order to parameterize the land cover and upper sub-surface properties.For mapping hydro-geological features and estimating their aquifer properties, we make use of the global digital elevation model of HydroSHEDS (Lehner et al., 2008) and an estimate of groundwater depth based on a simple steady-state groundwater model (see Sect. 2.2).For climatological forcing, we use the global CRU datasets (Mitchell and Jones, 2005;New et al., 2002) that are combined with the ECMWF reanalysis data of ERA-40 (Uppala et al., 2005) and operational archive (http://www.ecmwf.int/products/data/archive/descriptions/od/oper/index.html).
The goal of this paper is then to construct a largescale groundwater model on the basis of readily available global datasets and to evaluate the model performance using groundwater head observations.Here we do not intend to calibrate the model yet.Rather, we conduct a sensitivity analysis to study how changing aquifer properties influence the model outcome, specifically the resulting groundwater head time series.By this sensitivity analysis, we expect to gain insights into the model behaviour that can be used as the basis for improving the current model.
The paper is organized as follows.In the following section, we explain the model concept and structure used in this study.Then, we present the methodology to evaluate the model outcome, including the sensitivity analysis procedure.Subsequently, the results and their analyses follow.The last part of this paper is mainly devoted to a discussion about the prospects of large-scale groundwater assessment in data-poor environments and at the global scale, and to suggest ways to further improve this large-scale model.

General modeling procedure
The hydrological model developed in this study consists of two parts: (1) the land surface model (Sect.2.2 and Appendix A), which conceptualizes the hydrological processes on and in the upper-soil or unsaturated-zone layer; and (2) the groundwater model (Sect.2.3), which describes saturated flow in the deeper underground.The land surface model was adopted from the global hydrology model of PCR-GLOBWB (Van Beek and Bierkens, 2009;Van Beek et al., 2011) having two upper soil stores and a simple linear groundwater store (see Fig. 2a).In this study, we replaced the latter by the MODFLOW groundwater model (McDonald and Harbaugh, 1988).
We started this modeling exercise by modifying PCR-GLOBWB and performing the daily simulation of it to calculate groundwater recharge and river discharge.The river discharge was translated to monthly surface water levels by assuming channel dimensions and properties based on geomorphological relations to bankfull discharge (Lacey, 1930).These surface water levels and groundwater recharge were used to force a weekly transient groundwater model built in MODFLOW.The whole modeling procedure can be considered as an offline-coupling procedure between PCR-GLOBWB and MODFLOW because we separately and sequentially run both of them (see Fig. 2b).We chose this offline-coupling method to avoid expensive computational costs.This version, which takes about 1.0 h for oneyear model simulation in a single PC with AMD Athlon Dual Core Processor 5200 + 2.71 GHz 2GB RAM, is the first step into developing a fully coupled one.Using this offline-coupling version, we evaluated computational loads and identified weaknesses and possibilities in the modeling structure.

PCR-GLOBWB land surface model
PCR-GLOBWB (Van Beek and Bierkens, 2009;Van Beek et al., 2011) is a raster-based global hydrological model coded in the PCRaster scripting languange (Wesseling et al., 1996).Here we briefly describe its main features and modifications implemented for the purpose of this paper.The original PCR-GLOBWB (hereafter called as "PCR-GLOBWB-ORI") has 30 ×30 cells, while the modified one used in this study (hereafter called as "PCR-GLOBWB-MOD") has the resolution of 30 ×30 (approximately equal to 1 km × 1 km at the equator).
The full description of PCR-GLOBWB-MOD is provided in Appendix A, which mainly discusses the hydrological processes above and in the first two upper soil stores.These upper soil stores respectively represent the top 30 cm of soil (thickness Z 1 ≤ 30 cm ) and the following 70 cm of soil (Z 2 ≤ 70 cm), in which the storages are respectively symbolized as S 1 and S 2 [L].In both versions of PCR-GLOBWB (hereafter "PCR-GLOBWB" refers to both "PCR-GLOBWB-ORI" and "PCR-GLOBWB-MOD"), the states and fluxes are calculated on a daily basis.Climate forcing data are also supplied on a daily resolution (see Appendix B about the forcing data used in this study).Following Fig. 2a, the specific local runoff Q loc [L T −1 ] in each land surface cell of PCR-GLOBWB consists of three components: −1 ].Note that as the consequence of the offline-coupling procedure between the land surface model and MODFLOW, the linear reservoir concept of groundwater store (in which the storage is symbolized as S 3 [L]) is still used in PCR-GLOBWB-MOD, specifically for estimating Q bf .In addition to Fig. 2, some tables related to the model are provided: Table 1 listing the model parameters, including the global datasets used to derive them; Table 2 listing the state and flux variables; and Table 3 summarizing the most important changes introduced in PCR-GLOBWB-MOD.It is important to note that contrary to a 30 × 30 cell in PCR-GLOBWB-ORI, a 30 × 30 cell in PCR-GLOBWB-MOD has a uniform type of land cover, a uniform type of vegetation and a uniform type of soil.PCR-GLOBWB-MOD considers only the sub-grid elevation variability (based on the 3 digital elevation map of HydroSHEDS, see Eq. (A9)) to estimate the fraction of saturated soil contributing to surface runoff.
As illustrated in Fig. 2a, besides precipitation P [L T −1 ] and evaporation E [L T −1 ] fluxes, important vertical fluxes are water exchanges between the stores 1 and 2, Q 12 [L T −1 ], and between the stores 2 and 3, Q 23 [L T −1 ].Note that both Q 12 and Q 23 consist of downward percolation fluxes, Q 1→2 and Q 2→3 [L T −1 ], and upward capillary rise fluxes, Q 2→1 and Q 3→2 [L T −1 ].However, in the current PCR-GLOBWB-MOD, to force one-way coupling from the land surface model to MODFLOW, we inactivate the upward capillary rise from the groundwater to second soil stores (Q 3→2 = 0), which is one of the limitations of the current modeling approach.
The land surface model simulation was performed for the period 1960-2008.In this study, we limited the channel discharge calculation of PCR-GLOBWB-MOD to monthly resolution.Therefore, we could neglect water residence time in channels (less than a week) and obtain monthly channel discharge time series, Q chn [L 3 T −1 ] by simply knowing the surface area A cell [L 2 ] of each cell and accumulating the monthly values of the specific local runoff Q loc from all cells along the drainage network.

Groundwater model
As mentioned earlier, a MODFLOW (McDonald and Harbaugh, 1988) based groundwater model is used to replace the groundwater store (S 3 ) in the land surface model.Here we built a simple MODFLOW model that considers only a single upper aquifer (see Sect. 2.3.1).The MODFLOW model was forced by the output of PCR-GLOBWB-MOD, particularly the monthly recharge Q 23 (see Sect. 2.3.2) and the monthly channel discharge Q chn that is beforehand translated to surface water levels (see Sect. 2.3.3).We performed groundwater flow simulation for the period 1965-2008 using a weekly time step and monthly stress period, within which specific groundwater recharge and surface water levels are constant.Note that as there are no readily available global datasets about groundwater extraction by pumping, we did not include groundwater abstraction in our model yet.

Aquifer properties
To characterize the properties of the aquifer, we initially turned to two maps: (1) the global lithological map of Dürr et al. (2005) and (2) the UNESCO international hydro-geological map of Europe (http://www.bgr.de/app/fishy/ihme1500/).However, both maps are imprecise at 30 resolution employed here.The locality of units of the first map is not accurate, particularly after being checked with the 30 digital elevation map of HydroSHEDS that we used.For example, we found that the position of the Upper Rhine Graben area, a large and important groundwater body located in the central part of the study area, is inaccurate.Moreover, the first map does not include small aquifer features that are often located surrounding rivers in narrow valleys.Although the second map includes these small aquifer features, it is as yet only a scanned map (not a digital one) with all its geocoding problems.
To overcome these difficulties, we developed a procedure that classifies the model area to shallow permeable sedimentary basin aquifers and deep less permeable mountainous aquifers.Briefly stated, the method uses a steadystate groundwater model to calculate steady-state groundwater heads, a digital elevation map (DEM) to estimate groundwater depths and a drainage direction map (LDD) to incorporate the influence of river networks, that are closely related to the occurrence of groundwater bodies in their surroundings.The method is summarized as follows: 1. First, for the entire model area, we assumed a set of uniform aquifer properties, transmissivity KD = 100 m 2 day −1 and specific yield Sy = 0.25, specifically for calculating the groundwater recession coefficient J in Eq. (A28) of the land surface model.
2. Next, we ran the PCR-GLOBWB-MOD land surface model for a long period .
3. Subsequently, using the output of step 2, the long-term average recharge and discharge fields were calculated.The latter was translated to surface water level fields by using relations between discharge and channel dimensions (see Sect. 2.3.3).
4. The average water level and discharge fields derived in the step 3 were used to force the groundwater model in order to estimate a field of steady-state groundwater head.Furthermore, using the DEM 30 [L] (where the subscript 30 indicates the spatial resolution), we could derive a steady-state field of "groundwater depth" d gw [L], which is the difference between the surface level elevation and the calculated steady-state groundwater head.
5. We assumed that cells with steady-state groundwater depths d gw of less than 25 m have productive aquifers.These shallow groundwater cells, located in valleys, were classified as the "sedimentary pocket/basin" cells that most likely contain permeable materials and productive groundwater bodies.To avoid the occurrence of isolated cells due to errors and limitations in the DEM 30 (such as "blocked" rivers in narrow valleys or gorges), we used the LDD 30 to assure that downstream cells of a sedimentary basin cell are also classified as sedimentary basin cells.Moreover, because MODFLOW uses a discretization that does not allow diagonal flow across the corners (see e.g.Wolf et al., 2008) lower extents.This is done in order to ensure the flow connectivity among the cells.
6.The remaining cells were subsequently classified as "mountainous area" cells, where groundwater bodies are most likely located at greater depths.Note that here we mean real groundwater bodies, not perched groundwater storage in regolith, which is modeled in the interflow module of the land surface model (see Sect.A5).
m day −1 and from the second soil to groundwater stores: downward components of percolation fluxes, from the first to second soil m day −1 stores and from the second to groundwater stores Q 2→1 and Q 3→2 upward seepage (capillary rise) fluxes, from the second to first soil stores m day −1 and from the third groundwater to second stores.For this study, the latter is inactivated ( To force one-way offline coupling from the groundwater between the land surface model to soil stores and MODFLOW.The fields of KD and Sy were stored in the "Block-Centered Flow" (BCF) package of MODFLOW (see Mc-Donald and Harbaugh, 1988, Chapter 5).Here we defined a single aquifer layer, in which two conditions apply throughout the simulation: 1.The transmissivity KD is constant in time, independent of the actual thickness of the water table or the saturated zone.This condition is suitable for our model as groundwater head fluctuation is mostly expected to be only a small fraction of the thickness of the single aquifer layer defined in the model.This condition also implies that our MODFLOW cells are never 'dry' (i.e. the simulated groundwater heads never fall below the aquifer bottom elevation).
2. Sy is defined as the storage coefficient that remains constant in time and ignoring the fact that there might be a "transition" from a "confined groundwater" situation to a "phreatic water table" situation, or vice versa.Here we ignore the presence of confining layer and assume a phreatic groundwater throughout the simulation.
The main advantage of using this layer type is that it makes the MODFLOW iterative solver quickly converge throughout the simulation.Moreover, it circumvents the problem of having to define the aquifer top and bottom elevations, the information that is not globally available.
Attention is needed to convert the storage coefficient Sy for the variable 30 × 30 grid-size cells before using them in MODFLOW as MODFLOW normally uses a rectangular discretization with appropriate unit lengths (e.g.m).In the MODFLOW BCF package, the input values of Sy are commonly multiplied by the cell areas to create so-called E. H. Sutanudjaja et al.: Groundwater model for the Rhine-Meuse basin "storage capacities" (SC MF , unit: m 2 ) that are used for the calculation (see McDonald and Harbaugh, 1988, Chapter 5, pages 5-24 and 5-25).The MODFLOW groundwater model that we built has the same resolution as the land surface model: 30 × 30 .It means that our MODFLOW cells, which are not rectangular, have inappropriate length units and varying surface areas A cell (m 2 ).Given this fact, we have to modify the input of Sy so that SC MF has correct values and units: where Sy inp is the input supplied to the BCF package of MODFLOW, Sy act is the actual storage coefficient and A MF is the 'apparent' MODFLOW cell dimension, which is 30 × 30 .Using these Sy inp input values, the values of SC MF (internally multiplied by A MF in the BCF package) are: Note that the transmissivities KD (m 2 day −1 ) are not modified because the algorithm in the BCF package of MOD-FLOW never multiplies KD with the MODFLOW cell dimensions A MF .

Boundary conditions and recharge
No-flow boundaries were assumed at the boundaries surrounding the basin, thus assuming that topographic and groundwater divides coincide.For the "large lakes" (see Fig. 1), we assumed fixed-head boundary conditions, keeping water levels constant for the entire simulation period.
Here we define "large lakes" by selecting, from the derived surface water body f wat map (see Sect.A8), only the lakes that have surface areas at least five times of 30 × 30 gridcell.For each of those lakes, constant water levels are assumed based on the DEM 30 of HydroSHEDS.
The monthly time series of groundwater recharge Q 23 obtained from the land surface model of PCR-GLOBWB-MOD were fed to the "Recharge" (RCH) package of MODFLOW.The actual unit of Q 23 is m day −1 .In the RCH package calculation, the input values of recharge are multiplied by the MODFLOW cell dimension so that they are expressed in a volume per unit time (see McDonald and Harbaugh, 1988, Chapter 7), which is m 3 day −1 in our case.Because our MODFLOW cell dimension is 30 × 30 (A MF ), the input of Q 23 must be modified as follows: where Q 23,inp is the input introduced to the RCH package of MODFLOW and Q 23,act is the actual recharge from the land surface model output (unit: m day −1 ).

Channel dimensions and surface water levels
We used the "RIVER" (RIV) and "DRAIN" (DRN) packages of MODFLOW to accommodate (offline) interaction between groundwater bodies and surface water networks.This interaction is governed by actual groundwater heads and surface water levels.The latter can be translated from the monthly discharge Q chn by using assumed channel properties: the channel width B B chn is derived using the formula of Lacey (1930) who postulated that the width of a natural channel at bankfull flow is proportional to the root of the discharge: where P bkfl (unit: m) and Q bkfl (m 3 s −1 ) are the wetted perimeter and flow at the bankfull condition, and 4.8 is a factor with unit s 0.5 m −0.5 (see Savenije, 2003).In large natural alluvial rivers, P bkfl is slightly larger than B chn .To calculate Q bkfl , which, as a rule of thumb, occurs on average once every 1.5 yr, we used the monthly time series of Q chn calculated from the land surface model.
D chn is derived by combining the Lacey's formula with Manning's formula (Manning, 1891) and assuming a rectangular channel shape: (5) By subtracting D chn from DEM 30 , we may estimate the channel or river bed elevation, RBOT.However, due to errors in DEM 30 , a few of pixels may have unrealistic RBOT elevations.Here we implemented median filters with various window sizes to smooth the longitudinal profile of RBOT.
Given the channel properties, RBOT, n, B chn and Sl, the monthly water levels HRIV can be translated from the monthly discharge Q chn by means of Manning's formula: RBOT and monthly HRIV are used as the input for the RIV package, the principle of which is: where QRIV [L 3 T −1 ] is the flow between the stream and aquifer, taken as positive if it is directed into the aquifer, h is the groundwater head, and CRDR [L 2 T −1 ] is the estimated river conductance: where BRES [T] is the bed resistance (taken as The RIV package is defined only in cells with B chn ≥ 2 m.To simulate smaller drainage elements, the DRN package is defined for all cells without RIV package: where QDRN [L 3 T −1 ] is the flow between the drainage network and stream and DELV is the median drain elevation, which is assumed to be located half meter below the surface elevation DEM 30 .

Sensitivity analysis of aquifer properties
In groundwater modeling, the transmissivity KD and storage coefficient Sy are important parameters which are also subject to large uncertainty.In this paper, which may be considered as our first attempt to model groundwater at a large scale, we did not perform a full calibration yet.However, we did investigate the sensitivity of the model outcome to changing aquifer properties.The list of the scenarios that we simulated is given in Table 4, in which the reference scenario has KD For the sake of simplicity, we used only one fixed output from the land surface model for all scenarios.The monthly recharge Q 23 and surface water level HRIV time series fields are the same for all scenarios.To verify the land surface model output, we first compared the modeled discharge in two stations: Lobith and Borgharen, located in the downstream parts of Rhine and Meuse, respectively.Note that the baseflow component of the modeled discharge evaluated here is Q bf from the groundwater linear reservoir of the land surface model (Eq.(A27)), not -(QRIV + QDRN) from the MODFLOW model (Eqs.( 7) and ( 9)).In other words, although they are by definition the same, we ignored the discrepancies between the baseflow values of the land surface and groundwater models.
For each scenario, the simulated groundwater levels or heads h md are compared to the piezometer data h dt .We have collected more than 30 000 sets of head time series from several institutions in the Netherlands, Belgium, France, Germany and Switzerland and some individual partners.For model evaluation, we selected a subset of over about 6000 time series which are relatively recent (after 1979) and long records exceeding 5 yr that contain seasonal variations (at least there is a measurement datum for each season: winter, spring, summer and autumn).Moreover, based on the information from the data suppliers, we only selected the time The scenario A01.0 B01.0 is the reference scenario.* * NA indicates the scenarios that failed to converge, specifically the ones with low transmissivities and storage coefficients.KD: aquifer transmissivities; Sy: specific yields or storage coefficients; R cor : cross-correlation coefficients between calculated and measured groundwater head time series; QRE 7525 : the relative error of inter-quantile range of the calculated groundwater head time series (compared to the observation, see Eq. ( 10)).
series belonging to the top aquifer.Figure 7a shows the selected measurement station locations.Note that for stations located in the same pixel, we did not upscale them to the pixel resolution because they usually have different time spans.It means that all measurements are at the point scale, not at the 30 × 30 as the model resolution, and our evaluation is therefore on the conservative side because of lack of scale adjustment.
To verify the model performance of each scenario, specifically in every measurement station, we compared and evaluated modeled and observed head time series using several measures.First, we calculated the bias between both mean values, hmd − hdt , and the bias between both median values, h md 50 − h dt 50 .Also, we calculated the cross-correlation coefficient R cor between the model results and measurement data.The latter performance indicator, calculated without where IQ md 7525 and IQ dt 7525 are the inter-quantile ranges of the model result and measurement data time series.While evaluating mean and median biases, cross-correlations and interquantile range errors, we only used dates for which measurement data exist.
With so many observation points used (> 6000 points), we decided to analyze all performance indicators (biases, cross-correlations and inter-quantile range errors) at the subbasin scale.We sub-divided the model areas into several sub-basins, by using the local drainage direction map.Then, in each sub-basin, we calculated the sub-basin averages of hmd − hdt , h md 50 − h dt 50 , R cor and QRE 7525 .

Results
Figure 5a and 5b show the river discharges calculated by the land surface model and the measurement data in two locations, in Lobith (downstream of Rhine) and Borgharen (Meuse), both are in the Netherlands.The figures show that the discharge can be reasonably simulated by the model, except the summer discharge in Borgharen which is generally overestimated.This overestimation can be explained by the fact that our model did not include water extraction in Monsin, located about 25 km upstream of Borgharen.In Monsin, especially during the summer, some water from the Meuse River is diverted to sustain the navigation function of the Scheldt River, which is located outside the Rhine-Meuse basin (De Wit, 2001).Some examples of comparison of simulated head time series to measurement data are presented in Fig. 5c-5g.Here, instead of plotting actual head h md and h dt values, we plotted the model results and measurement data in their anomalies related to their mean values, hmd and hdt .Note that, while calculating hmd and hdt , we only used the dates for which measurement data exist.For the examples shown in Fig. 5c-5g, we can conclude that the model is able to capture both the timing and the amplitude of observed heads quite well.
An alternative straightforward way to evaluate the model outcome is by making scatter-plots between both mean values, hdt and hmd -as shown in Fig. 6a, and between both medians, h dt 50 and h md 50 -as shown in Fig. 6b.From both scatterplots, we see that model result average and median values correlate very well to measurement data average and median values.However, these scatter-plots should be carefully interpreted because they do not provide information about the spatial distribution of the biases between the model results and measurement data.Moreover, such scatter-plots are predominantly influenced to areas with high densities in measurement stations, which are mainly in the lowland and valley areas of the basins.Also, some data suppliers supplied enormous number of data, while others supplied only few points (see Fig. 7).Areas with sparse measurement stations may not be well-represented in the scatter-plots.
The uneven station location distribution is another reason why we performed the analyses at the sub-basin scale.Figure 8 shows the sub-basin-scale mean absolute biases ( hdt − hmd ) of the reference scenario A01.0 B01.0.Here we observe that there are large biases in some sub-basins.Explanations for these biases are model structure errors (e.g.only a single layer aquifer model used and no pumping activities simulated), parameter errors (e.g.no calibration, only two classes for classifying aquifer and only homogeneous aquifer properties assigned for each class) and discrepancies in resolutions and elevation references between the model results and point measurement data.Related to the elevation references, we acknowledge that we did not do perform any correction to the DEM of HydroSHEDS used in the model and station elevation information provided by data suppliers, who most likely do not use the same elevation references.This issue may be considered as one of the limitations of the current modeling approach.However, given the nature of a large-scale groundwater model, which covers multiple basins and countries, we have to accept that it is still difficult to define the same and consistent  6), ( 7), and ( 9)).(KD and Sy).We see that mostly the amplitude error QRE 7525 (Fig. 10) is sensitive to different aquifer properties, while the timing agreement R cor (Fig. 9) is less sensitive.The latter may be due to the fact that although we varied KD and Sy for our MODFLOW groundwater model input, we used the same land surface model output (the same recharge Q 23 and surface water levels HRIV time series) for all scenarios.It seems that, to achieve better time series timing, we should have extended our sensitivity analysis by also looking at the uncertainty of our land surface model outcome.The sensitivity of QRE 7525 and the insensitivity of R cor to the aquifer properties variation can be observed from Table 4, that summarizes the (entire) basin-scale average values for each scenario.Note that to calculate these basin-scale average values, we used the surface areas of subbasins as weight factors.From our sensitivity analysis, we see that the basin-scale average values of QRE 7525 vary from 80 % to above 450 %, while the basin-scale average values of R cor vary only from 0.32 to 0.43.oppositely, in the sense that, moving through the parameter space, a performance indicator improves whereas the other deteriorates.This condition can be regarded as an inability of the model to reproduce simultaneously different aspects of observed groundwater heads, which are related to model structural limitations that should be investigated in the future.Yet, despite the aforementioned limitations, we can still observe that our groundwater model can reasonably reproduce the time series of observed groundwater head time series.Figure 12a  the maximum values of R cor and the minimum values of QRE 7525 that are selected from the sub-basin scale values of all scenarios (from Figs. 9 and 10).We observed that more than 50 % of sub-basins have relatively good timing agreements (R cor > 0.5), and more than 50 % of sub-basins have relatively small amplitude errors ( QRE 7525 < 50 %).They include not only shallow groundwater areas, but also areas

Conclusions and discussion
This study shows that it is possible to build a simple and reasonably accurate large-scale groundwater model by using only global datasets.It suggests a promising prospect for large-scale groundwater modeling practice, including in data-poor environments.Although the model may not be suitable for karstic aquifer areas -for which MODFLOW is not suitable for modeling groundwater flow -PCR-GLOBWB-MOD can be applied in several areas that contain large sedimentary basins or pockets, such as the basins of Nile, Danube, Mekong, Yellow and Ganges-Brahmaputra Rivers.
The (Van Beek and Bierkens, 2009;Van Beek et al., 2011), WASMOD-M (Widén-Nilsson et al., 2007) and VIC (Liang et al., 1994), which do not have the ability to calculate spatio-temporal groundwater heads; therefore, do not incorporate any lateral flows in their groundwater compartments.
Although groundwater heads and lateral groundwater flows may not be important for current common global hydrological models, which usually have a spatial resolution of 25-50 km, their inclusion is relevant for future global hydrological models that may have spatial resolutions of down to 1 km (Wood et al., 2011).Several authors (Bierkens and van den Hurk, 2007;Fan et al., 2007;Miguez-Macho et al., 2007;Anyah et al., 2008;Miguez-Macho et al., 2008;Maxwell and Kollet, 2008;Fan andMiguez-Macho, 2010, 2011) have suggested that groundwater lateral flows can be important for regional climate conditions.For instance, Bierkens and van den Hurk (2007) have shown that rainfall persistence may be partly explained by groundwater confluence to discharge zones that remain wet throughout the year to sustain evaporation for longer periods of time.However, for our study area that has humid climate and relatively high drainage density, the importance of groundwater lateral flows to regional climate has to be confirmed.In such areas where rainfall may be mostly transferred as hillslope and channel flows, groundwater lateral flow or groundwater confluence to discharge zones may only be important during a long dry spell.This issue may still not be resolved from our current study, but we argue that any further investigation about it can be done by using a model such as presented here.
We realize that there are several weaknesses in the current approach.The most obvious one is the fact that we do not use a full coupling between the land surface model and the groundwater model parts.Consequently, the soil moisture of the upper soil stores calculated by the land surface model, do not interactively correlate to groundwater heads simulated with the groundwater model (as capillary rise is ignored).The omission of this interaction makes the groundwater store not have the ability to sustain soil moisture states and to fulfil evaporation demands (especially during dry seasons).
We also ignore the fluctuations of water levels in large lakes.Moreover, we disable the direct and interactive connection between the channel/surface water flows and groundwater tables.It should be also noted that the current model ignores the fact that overland flows may occur as the consequence of rising water tables above the land surface elevation (especially for phreatic aquifer locations).Such overland flow might be accommodated by using additional MOD-FLOW packages (e.g.Restrepo et al., 1998).However, they are currently irreconcilable with PCR-GLOBWB-MOD in its current form as the capillary rise from the groundwater store has been disabled.All of the aforementioned weaknesses must be addressed while building the next generation of this model that includes full coupling between the land surface model and the groundwater model parts.
We also acknowledge that the current version of PCR-GLOBWB-MOD is still not suitable for areas under heavy anthropogenic water extraction as there are no global datasets on pumping activities that can be meaningfully resolved at the model resolution (30 × 30 ).As far as we know, Wada et al. (2010) and Wada et al. (2011) are the only studies that estimated global groundwater abstraction, but at a very coarse resolution of 30 × 30 .However, our model is still useful to assess impacts under an uncertain future climate, such as changing precipitation and temperature.Moreover, if such datasets of water pumping abstraction rate are provided, they can be readily incorporated in our model.
In this study, the sensitivity analysis of the groundwater head output is still limited to the uncertainty of our aquifer properties in the groundwater model, not considering the uncertainty of the land surface model outcome.In a future study, we may want to extend the sensitivity analysis by running several scenarios with varying soil properties of the first and second soil stores (unsaturated zone) to produce several recharge and surface water level time series and using them to force the groundwater model.However, considering aforementioned weaknesses discussed previously, this extended sensitivity analysis may not be meaningful if we do not use the fully coupled model.In such a fully coupled model, the dynamic feedbacks between surface water levels and groundwater heads, and between soil moisture states and groundwater heads are expected to influence the behaviour of resulting groundwater head time series.
Moreover, for such a fully coupled large-scale model, model evaluation and calibration can be reasonably done by comparing the model soil moisture states and remote sensing soil moisture products, such as AMSR-E (e.g.Njoku et al., 2003) and ERS (Wagner et al., 1999), which are also available for the entire globe.By doing this, we anticipate that a large-scale groundwater model can be evaluated and calibrated without extensive head measurement data that are hardly available in other parts of the world.Thus, it allows the construction and verification of large-scale groundwater models in data-poor environments.

Appendix A The land surface model of PCR-GLOBWB-MOD
This Appendix A briefly describes the main features of the land-surface model of PCR-GLOBWB-MOD (which has the spatial resolution of 30 × 30 ) and explains the modifications from its original version PCR-GLOBWB-ORI (30 × 30 ), of which Van Beek and Bierkens (2009) and Van Beek et al. (2011) provide the detailed description.Note that as stated in Sect.2.2, the terms "PCR-GLOBWB-ORI" and "PCR-GLOBWB-MOD" refer to the original and modified versions, while 'PCR-GLOBWB' refer to both versions.

A1 Interception
PCR-GLOBWB includes an interception storage, Si [L], which is subject to evaporation.Precipitation, P [L T −1 ], which falls either as snow, Sn [L T −1 ] (if atmospheric temperature is below the water freezing temperature, Ta < 0 • C), or liquid rainfall, P rain [L T −1 ] (if Ta ≥ 0 • C), fills the interception storage up to a certain threshold.In PCR-GLOBWB-MOD, we use the interception definition as suggested by Savenije (2004), asserting that interception accounts not only for evaporation fluxes from leaf interception, but also any fast evaporation fluxes as precipitation may be intercepted on other places, such as rocks, bare soils, roads, litters, organic top soil layers, etc.Thus, the interception capacity is parameterized as: where T max and T min are the maximum temperature and minimum temperature assumed for the growing and dormancy Hydrol.Earth Syst.Sci., 15, 2913-2935, 2011 www.hydrol-earth-syst-sci.net/15/2913/2011/ seasons, and the monthly temperature T m fields are taken from the 10 CRU-CL2.0 dataset (New et al., 2002), containing 12 monthly fields representing the average climatology conditions over 1961-1990.Using f m , the seasonal parameters C f,m and LAI m are modeled as: The maximum and minimum values of LAI and C f are assigned based on the land cover map of GLCC 2.0 (http: //edc2.usgs.gov/glcc/globeint.php), the global ecosystem classification of Olson (1994a,b) and the improved land surface parameter table of Hagemann (2002).
The fast evaporation flux from the intercepted water, E i [L T −1 ], is limited by either available evaporation energy for wet interception areas E p,i [L T −1 ] or available water in Si: where E p,0 [L T −1 ] is the reference potential evaporation (FAO Penman-Monteith, Allen et al., 1998), Kc i [−] is the "crop factor" for interception areas and t [T] is the timestep (one day).

A2 Snow pack
If Ta < 0 • C, the surplus precipitation above Si max falls as snow and feeds the snow storage, Ss [L], which is modeled with a degree-day-factor (DDF [L −1 T −1 ]) method adapted from the HBV model (Bergström, 1995).Snow may melt (if Ta ≥ 0 • C) and melt water may refreeze (if Ta < 0 • C) with linear rate CFR [T −1 ] or evaporate (if enough energy is available).Melt water can also be stored in a liquid water storage of the snow pack, Ss sl [L], up to a certain maximum holding capacity that is proportionally to Ss and controlled by a factor CWH [−].Any surplus above this holding capacity is transferred to the soil.

A3 Direct or surface runoff
If Ta ≥ 0 • C, the net input liquid flux transferred to soil, P n [L T −1 ], consists of the surplus precipitation above the interception capacity Si max (falling as liquid rainfall) and excess melt water from the snow pack.In principle, P n infiltrates if soil is not saturated and causes direct runoff if soil is saturated.However, this principle cannot be straightforwardly implemented because we have to account for the variability of soil saturation within a 30 × 30 cell.Here we adopted the Improved Arno Scheme concept (Hagemann and Gates, 2003), in which the total soil water storage capacity of a cell consists of the aggregate of many different soil water storage capacities.Following this concept, Van Beek and Bierkens (2009) derived Eq. (A7) to estimate the fractional saturation of a PCR-GLOBWB grid-cell, x [−], as a function of gridaverage values W [L]: where W min is the grid-(local)-minimum capacity, W act and W max are respectively the grid-average-actual water storage and water capacity for the entire soil profile (W act = S 1 + S 2 and W max = SC 1 +SC 2 , where SC [L] is the soil water capacity for each layer).
Based on Eq. (A7), the net input flux P n is divided into direct runoff, Q dr [L T −1 ], and infiltration flux into the first soil layer, P 01 [L T −1 ].The direct runoff is given by: with W = W max − W min and W act = W max − W act .Equation (A8) states that an event P n , for a given cell and a given period t, only generates runoff Q dr if it brings W act above W min .It implies that W min is an important parameter governing runoff generation response time, especially for a large and highly variable 30 × 30 cell of PCR-GLOBWB-ORI consisting of several land cover, vegetation and soil types.However, W min is less important for a relatively small 30 × 30 cell of PCR-GLOBWB-MOD, for which we assumed a uniform type of land cover, a uniform type of vegetation and a uniform type of soil.Here, for the sake of simplicity, we assumed W min = 0 for all cells in PCR-GLOBWB-MOD.However, PCR-GLOBWB-MOD still considers the sub-grid elevation variability in a 30 ×30 cell by the existence of parameter b that accounts for the fact that for a given soil wetness, we expect that more runoff is produced in mountainous regions than in flat regions (see e.g.Hagemann and Gates, 2003;Van Beek and Bierkens, 2009) where σ h is the standard deviation of orography within a 30 × 30 cell calculated from the 3 DEM of Hy-droSHEDS (Lehner et al., 2008), and σ min and σ max are the model-area minimum and maximum values of the standard deviations of orography at the grid resolution.Through this scheme, the amount of infiltration P 01 transferred to the first soil store is equal to the difference between P n and Q dr (P 01 = P n − Q dr ).However, we also have to consider that the infiltration rate cannot exceed the saturated hydraulic conductivity of the first layer, K sat,1 [L T −1 ].In this case, if P 01 > K sat,1 , its excess is passed to the direct runoff Q dr .

B2 Period 2000-2008
For the precipitation and temperature forcing data during 2000-2008, we used the ECMWF operational archive (http://www.ecmwf.int/products/data/archive/descriptions/od/oper/index.html) that was constrained to the long term averages and trends of the CRU-TS2.1 data: 1.For each year y, the annual mean of the CRU-TS2.1 forcing data, F CRU-TS2.1,y (which may be either precipitation or temperature) was calculated.Subsequently, the 1961-1980 long-term mean of F CRU-TS2.1,y -symbolized as FCRU-TS2.1,61-80-was also calculated. 5.As done in the step 1, the long-term average of annual means of the ECMWF operational archive datasets (hereafter called as "ECMWF-OA") for the period 2000-2008 -symbolized as FECMWF-OA,00-08 -was calculated.
To obtain monthly reference potential evaporation E p,0 fields over the period 2000-2008, for most of which no CRU-TS2.1 datasets are available, we defined a procedure to select the corresponding data from the monthly dataset of Van Beek (2008), E p,0, * ,30 ,m , that covers the period 1901-2002.The procedure -repeated for each 30 ×30 cell, and every month m and year y in 2000-2008 -is summarized as: E. H. Sutanudjaja et al.: Groundwater model for the Rhine-Meuse basin

Fig. 1 .
Fig. 1.The combined Rhine-Meuse basin used as the test bed for this model.The bold black line indicates the scope of the model area while the bold dashed line indicates the approximate border between the Meuse and Rhine basins.The major rivers and large lakes are indicated in blue.The map shown in the upper part indicates the location of the study area in Europe.

Fig. 2 .
Fig. 2. The modeling structure and strategy for this study: (a) the concept of the land surface model of PCR-GLOBWB (Van Beek andBierkens, 2009; Van Beek et al., 2011): on the left, the soil compartment, divided in the two upper soil stores, S 1 and S 2 , and the linear groundwater store, S 3 , that is replaced by the MODFLOW(McDonald and Harbaugh, 1988) groundwater model; on the right, the total local gains from all cells are routed along the local drainage direction to yield the channel discharge, Q chn .(b) The modeling strategy used to couple the PCR-GLOBWB and MODFLOW: first, we run the PCR-GLOBWB to calculate the monthly net recharge Q 23 to groundwater store and channel discharge Q chn that can be translated into surface water levels by assuming channel dimensions.Then, the monthly net recharge and surface water levels are used to force MODFLOW.

Fig. 3 .
Fig.3.The approximate steady-state groundwater depth map that is used for aquifer classification.Here, we classified cells that have groundwater depth below 25 m and their downstream cells as the "sedimentary basin", where shallow productive aquifer pockets are usually located.Moreover, we also made sure that a sedimentary basin cell must have at least one neighboring cell in its left, right, upper, or lower extents.The remaining cells were classified as "mountainous areas", where groundwater depths are large (seeSect.2.3.1).

Fig. 4 .
Fig. 4. The average calculated groundwater head for the period 1974-2008, based on the reference scenario A01.0 B01.0.The alphabetical codes shown on the maps indicate the measurement station locations for the graphs in Fig. 5.

Fig. 5 .
Fig. 5.The comparison between measurement data (red) and model output (black): (a) the discharge in Lobith, located downstream of the Rhine.(b) The discharge in Borgharen, located downstream of Meuse.(c,d, e, f, and g) Groundwater head anomaly comparisons based on the reference scenario A01.0 B01.0 at several locations indicated in Fig.4.

Figures 9
Figures 9 and 10 show the sub-basin scale averages of R cor , which indicate the timing punctuality, and QRE 7525 , which indicate the magnitude of amplitude error.Both figures present results from several scenarios with different aquifer properties(KD and Sy).We see that mostly the amplitude error QRE 7525 (Fig.10) is sensitive to different aquifer properties, while the timing agreement R cor (Fig.9) is less

Fig. 9 .
Fig. 9.The sub-basin average of timing agreement indicators, R cor , for all scenarios.To distinguish near zero values (white), we use a yellow background.The lower right corner map is a composite map of the maximum values of all scenarios.Note: some scenarios with small tranmissivities KD and storage coefficients Sy failed to converge.

Fig. 10 .
Fig. 10.The sub-basin average of amplitude error indicators, QRE 7525 , for all scenarios.The lower right corner map is a composite map of the minimum values of all scenarios.Note: some scenarios with small tranmissivities KD and storage coefficients Sy failed to converge.

Fig. 12 .
Fig. 12. Histograms of maximum values of all maps in Fig. 9 (cross correlation R cor ) and minimum values of all maps in Fig. 10 (amplitude error QRE 7525 ).Each bar in the histogram is clustered based on approximate groundwater depths that are calculated by averaging the 34-yr 1974-2008 average modeled groundwater heads of all scenarios.Note that, to calculate these average depths, we used only cells with measuring stations.
A1)where Si max[L]  is the interception capacity of each gridcell consisting of the fractions C f [−] of vegetation cover.I nv and I veg[L]  are parameters defining the interception capacities per unit surface area in non-vegetated and vegetated areas.LAI [−] is the leaf area index, defined as the ratio of total upper leaf surface of vegetation divided by the surface area of the land on which the vegetation grows.Equation (A1), used in PCR-GLOBWB-MOD, is slightly different than its original version PCR-GLOBWB-ORI, which limits the interception capacity only to leaf canopies represented by the second term of Eq. (A1) (C f,m I veg LAI m ).The first term of Eq. (A1) ([1 − C f,m ] I nv ), introduced in PCR-GLOBWB-MOD, represents the interception capacity in the non-vegetated fraction.The subscript m, which is the monthly index, indicates that Si max,m , C f,m and LAI m show monthly or seasonal variations due to vegetation phenology.Their variations are according to a growth factor f m [−] which is a function of monthly temperature T m [ ]: 2. Next, for the period 1981-2002, we calculated the anomaly time series A y : A y = F CRU-TS2.1,y − FCRU-TS2.1,61-80(B6) 3. Furthermore, the trend of the A y time series was regressed with the following linear model: prediction, while b 0 and b 1 are the intercept and slope parameters.4. Subsequently, the 2000-2008 long-term mean, F trend CRU-TS2.1,00-08 , was estimated using Eq.(B7), FCRU-TS2.1,61-80and the year y = 2004, which is taken as the representative of the period 2000-2008: F trend CRU-TS2.1,00-08= FCRU-TS2.1,61-80(B8) + (b 0 + b 1 × 2004) where F ECMWF-OA,m are the original monthly ECMWF-OA fields and F corrected ECMWF-OA,m are their corrected ones that are used to force the land surface model (for the period[2000][2001][2002][2003][2004][2005][2006][2007][2008].

Table 1 .
List of model parameters used in the model.The parameterization of FAO map (1995) based on Table of Van Beek and Bierkens (2009).b The parameterization of GLCC 2.0 land cover map based on Table of Hagemann (2002).c The parameterization of the vegetation height h veg based on Table of Van Beek (2008).d The subscripts 1 and 2 indicate the first and second soil stores. a

Table 2 .
List of state and flux variables defined in the model.(s 1 ) and K 2 (s 2 ) unsaturated hydraulic conductivities at specific degree of saturations s 1 and s 2 * m day −1 ψ 1 (s 1 ) and ψ 2 (s 2 ) soil matric suctions (at specific degree of saturations s 1 and s 2 ) net vertical fluxes from the first to second soil stores: Q 12

Table 3 .
Important changes in PCR-GLOBWB-MOD (compared to the original PCR-GLOBWB-ORI).I nv and I veg Only I veg is used.I nv and f i are introduced.A broader definition of interception is used in PCR-GLOBWB-MOD (see Sect.A1).

Table 4 .
List of the sensitivity analysis scenarios including their performance indicators presented in basin-scale average values.
0 if P n t + W act ≤ W min P n t − W act + min < P n t + W act ≤ W max P n t − W act if P n t + W act > W max