Nonlinear effects of locally heterogeneous hydraulic conductivity fields on regional stream – aquifer exchanges

Computational experiments are performed to evaluate the effects of locally heterogeneous conductivity fields on regional exchanges of water between stream and aquifer systems in the Middle Heihe River basin (MHRB) of northwestern China. The effects are found to be nonlinear in the sense that simulated discharges from aquifers to streams are systematically lower than discharges produced by a base model parameterized with relatively coarse effective conductivity. A similar, but weaker, effect is observed for stream leakage. The study is organized around three hypotheses: (H1) small-scale spatial variations of conductivity significantly affect regional exchanges of water between streams and aquifers in river basins, (H2) aggregating small-scale heterogeneities into regional effective parameters systematically biases estimates of stream–aquifer exchanges, and (H3) the biases result from slow paths in groundwater flow that emerge due to small-scale heterogeneities. The hypotheses are evaluated by comparing stream–aquifer fluxes produced by the base model to fluxes simulated using realizations of the MHRB characterized by local (grid-scale) heterogeneity. Levels of local heterogeneity are manipulated as control variables by adjusting coefficients of variation. All models are implemented using the MODFLOW (Modular Three-dimensional Finite-difference Groundwater Flow Model) simulation environment, and the PEST (parameter estimation) tool is used to calibrate effective conductivities defined over 16 zones within the MHRB. The effective parameters are also used as expected values to develop lognormally distributed conductivity (K) fields on local grid scales. Stream–aquifer exchanges are simulated with K fields at both scales and then compared. Results show that the effects of small-scale heterogeneities significantly influence exchanges with simulations based on local-scale heterogeneities always producing discharges that are less than those produced by the base model. Although aquifer heterogeneities are uncorrelated at local scales, they appear to induce coherent slow paths in groundwater fluxes that in turn reduce aquifer–stream exchanges. Since surface water– groundwater exchanges are critical hydrologic processes in basin-scale water budgets, these results also have implications for water resources management.


Introduction
Exchanges of water between streams and aquifers are critical elements in the coupled dynamics of watersheds.Groundwater discharge to streams maintains natural hydrologic systems through base flow during periods of low streamflow, while leakage from streams is an important source of groundwater recharge (Sophocleous et al., 1995;Hantush, 2005;Newman et al., 2006).The magnitudes of such fluxes vary on scales ranging from sub-regional (or zonal) to local scales.Local variations in the spatial distribution of aquifer characteristics are known to affect groundwater flow and stream-aquifer exchanges on sub-regional scales (Schmidt et al., 2006;Kalbus et al., 2009;Mendoza et al., 2015), but it is not usually possible to measure system parameters and states with sufficient accuracy and level of detail to specify local variations throughout a watershed (Wroblicky et al., 1998; Published by Copernicus Publications on behalf of the European Geosciences Union. J. Zhu et al.: Nonlinear effects of locally heterogeneous hydraulic conductivity fields Kalbus et al., 2006).Furthermore, hydrologic system parameters can vary on multiple spatial and temporal scales and are subject to experimental error (Molz, 2000;Genereux et al., 2008).
Computational experiments, on the other hand, allow the effects of heterogeneity to be investigated by consistently varying the values of parameters, such as hydraulic conductivity, as control variables and observing the resulting effects on simulated system states (Winter et al., 2004;Bruen and Osman, 2004;Hantush, 2005).In this study the sensitivity of regional hydrologic systems to locally heterogeneous aquifer hydraulic conductivity is explored by simulating stream-aquifer exchanges in the Middle Heihe River basin (MHRB) of northwestern China, a typical semi-arid basin.Fluxes in alluvial aquifers of the MHRB are usually represented as two-dimensional processes (Huang, 2012).
In addition to their importance for water resources management, stream-aquifer exchanges are a convenient measure of overall watershed performance because they summarize the states of system fluxes at well-defined locations where they are relatively easy to quantify.Two scales of heterogeneity are represented in the simulations, zonal and local (or grid scale).Local-scale conductivity is manipulated as the only control variable in the computational experiments.All other system parameters, including streambed conductance, are the same in every simulation.
Multi-physics models of the coupled hydrologic fluxes of the MHRB are implemented using the MODFLOW (Modular Three-dimensional Finite-difference Groundwater Flow Model) simulation environment (McDonald and Harbaugh, 2003).The PEST (parameter estimation) tool (Doherty et al., 1994) is used to calibrate an effective aquifer hydraulic conductivity for each of 16 zones within the MHRB.The zones were defined in previous hydrogeological studies of the MHRB (Hu et al., 2007).The effective zonal parameters are also used as expected values to develop realizations of lognormally distributed conductivity fields on local grid scales.Standard deviations are defined by means of coefficients of variation (CV) ranging from 0.5 to 2.0.Ten realizations of a random conductivity field are produced for each CV.Random conductivity values are uncorrelated in space.Monthly stream-aquifer exchanges are calculated for each realization and for the zonally parameterized model.
The study is organized in terms of three explicit hypotheses about the effects of local-scale heterogeneity: H1.Local-scale heterogeneities significantly affect regional stream-aquifer water exchanges in the MHRB.
H2. Systematic biases arise in estimates of stream-aquifer exchanges if local-scale heterogeneities are smoothed by aggregating conductivity into effective values for the set of 16 sub-regional zones.
H3.The biases are the result of slow paths in groundwater flow that emerge due to small-scale heterogeneities in conductivity fields.
The hypotheses H1-3 are tested by comparing simulated stream-aquifer exchanges produced by the local realizations to the zonal simulation.
Hypotheses related to H1 and H2 have been investigated in a few recent studies.Kalbus et al. (2009) evaluated the relative effects of aquifer conductivity and streambed conductance by simulating stream-aquifer interactions in a 220 m reach of a small stream in Germany.They found that heterogeneity in the aquifer influenced discharge to the stream more strongly than did variations in the streambed itself.Schmidt et al. (2006) also studied the spatial distribution and magnitude of groundwater discharge to a stream with a simple analytical model that was mainly focused on groundwater discharge at the reach scale.Kurtz et al. (2013) generated multiple realizations of stream-aquifer interactions in the Limmat aquifer system in Zurich Switzerland.They allowed riverbed hydraulic conductivities to take one of four different levels of heterogeneity ranging from local variability at each grid point to effective conductances of only 5, 3, and 2 values.They found that effective conductance did not always reproduce fluxes obtained from base simulations where system parameters were perfectly known; furthermore, simulations based on effective parameters gave biased estimates of net exchanges between aquifer and stream.Mendoza et al. (2015) used a land surface model with strong a priori constraints on parameters to argue that such models can perform poorly when spatial variability and hydrologic connectivity of a region are represented coarsely.Lackey et al. (2015) used a synthetic stream-aquifer system to show that modeling streambed conductance as a homogeneous property can lead to errors in estimated stream depletion.
The current study contributes to the understanding of these topics in two ways.First, results indicate that locally heterogeneous hydraulic conductivity fields lead to systematic reductions of regional stream-aquifer exchanges, especially in the direction of aquifer-stream discharges.The causal mechanism appears to be slow paths that emerge in groundwater flows in response to local heterogeneities.These localscale fluxes are spatially coherent and are not averaged out by aggregating conductivity into a few effective zonal parameters.The nonlinear effects of these coherent structures are propagated to regional scales.Second, hypotheses H1-3 have practical implications for allocations of water resources since many resource management decisions depend on the results of computational models like the base model used here (Perkins and Sophocleous, 1999;Fleckenstein et al., 2006).A regional basin is partitioned into a few zones, on the basis of physical, geographical, and geological criteria (Eaton, 2006), and effective zonal parameters are set by calibration (Christensen et al., 1998) ties observed here, the base model is found to systematically overestimate regional stream-aquifer exchanges.This holds true for all levels of heterogeneity investigated here as controlled by CV.The paper is organized as follows.In Sect.2, the experimental framework used to model the MHRB and the stochastic methods used to produce locally heterogeneous hydraulic conductivity fields are described.Section 3 reports methods and results of investigations of H1 and H2.It presents results comparing simulations of stream-aquifer exchanges derived from (1) a base model whose effective conductivity parameters are specified by zone and (2) stochastic realizations of heterogeneous local-scale conductivity.Section 4 focuses on H3.The emergence of coherent slow paths in the flow fields is investigated as the source of the bias in base model estimates.Results are summarized and discussed in Sect.5, and a few directions for future experiments are sketched.

Experiment setting and model development
The experimental approach is to compare alternative numerical models of hydrologic system dynamics that produce stream-aquifer exchanges in the MHRB, a closed basin in northwestern China with an area of 8778 km 2 .Streamaquifer exchanges are estimated by (1) a so-called base model that specifies hydraulic conductivity on zonal scales and (2) a set of related models that incorporate local-scale heterogeneity in their parameterizations.Zonal conductivity in the base model is calibrated using the PEST parameter estimation tool (Doherty et al., 1994;Sophocleous et al., 1999), and the base model assigns the same calibrated value of hydraulic conductivity to every grid point in a given zone.
The level of heterogeneity of hydraulic conductivity is the experimental control variable in local-scale simulations.It is set by adding varying levels of randomness to the zonal values of the base model.The result is a random field of conductivities with expected values that are the same in a given zone as the base model's effective conductivity.Otherwise, system parameters including specific yield and streambed conductance and boundary conditions are the same for all simulations.All computations are conducted using the MODFLOW simulation environment (Leake and Prudic, 1991;McDonald and Harbaugh, 2003;Rodríguez et al., 2006).

Study area
The MHRB is contained in the Heihe River basin, the second largest closed river basin in China, which drains a total area of 142 900 km 2 .The MHRB is in a semi-arid region with average annual rainfall of approximately 50-200 mm yr −1 , most of which occurs from May to October.The river originates in the Qilian Mountains of Qinghai Province and then flows through Gansu Province to western Inner Mongolia.The average annual inflow from mountain areas upstream is 1.58 billion m 3 as measured at the Yingluoxia Gauge (YLX), and the discharge through the Zhengyixia Gauge (ZYX) is 0.95 billion m 3 .The 184 km long reach from YLX to ZYX defines the MHRB study area (Fig. 1), which is known to be influenced by temporal patterns and spatial distributions of stream-aquifer exchanges (Zhou et al., 2011).
Intensive stream seepage and groundwater discharge occur in the study area along the river in fluvial and alluvial fans.The exchanges proceed in two directions: groundwater flows through the streambed into the stream (discharging gaining stream), and stream water infiltrates through the streambed into the groundwater system (leaking losing stream), which is the same as general states presented in Winter (1995) and Kalbus et al. (2006).Groundwater discharge to streams maintains natural hydrologic systems during periods of low streamflow through base flow (Sophocleous et al., 1995;Hantush, 2005).Streams are an important source of groundwater recharge at higher elevations.Conjunctive use of surface and subsurface waters is a potentially important resource management tool in semi-arid regions like the Heihe River basin whose potential also depends on system state estimates.
The Middle Heihe River leaks into a sandy fluvial area immediately after YLX and then enters an alluvial plain composed of fine soil.At that point substantial stream seepage, diversion, and groundwater discharge (base flow) occur along the stream.The Middle Heihe River is a wide and shallow stream, and the area of the stream channel is much greater than the lateral areas of both banks.Thus the exchange of stream water and groundwater occurs primarily in the vertical direction.The plain of fine-grained soil along the river is the main area where springs and groundwater emerge.Thirty irrigation areas are distributed throughout the study area.Groundwater tables range from 0 to 300 m beneath the land surface, and aquifer thickness varies from tens to a couple of hundred meters.

Numerical model of stream-aquifer exchanges
All stream-aquifer interactions are simulated using the numerical modeling tool MODFLOW with the stream package (STR) for one-dimensional streamflow and two-dimensional groundwater flow in this study.The stream package (Leake and Prudic, 1991) simulates streamflow with Manning's equation and interactions with groundwater flow using Richard's equation, which assumes vertical, gravity-driven flow and neglects capillarity.This is an acceptable assumption for the typical alluvial sediments of the kind found in the MHRB (Spanoudaki et al., 2009;Huang, 2012).
The stream and aquifer systems are coupled in the model through an iterative process in which convergence of state variables linking the two domains is used as the criterion for accepting a solution.The stream stage and groundwater table are iterated at each time step until the differences between two iterations are within a small tolerance.Seep- age is calculated from the product of the head difference times a streambed conductance.In regional-scale groundwater models, it is reasonable to assume a homogeneous low-conductivity streambed within a heterogeneous aquifer (Kalbus et al., 2009).Conductance is assigned based on an existing calibration and measured data values between 1 and 3 m d −1 (He and Zhao, 2007;Zhou et al., 2011).
Since aquifer thickness is small compared to its horizontal dimensions, one layer is sufficient for the vertical discretization.Workman et al. (1997) pointed out that the discrete cell size can be 1-2 km in regional groundwater flow models, as a river can affect water table elevations as far as 1525 m from the middle of the stream.Thus, the study domain is discretized into 155 rows and 172 columns, with each cell size 1 km×1 km.The aquifer is simulated as a free-surface boundary able to fluctuate in response to recharge from irrigation fields, evapotranspiration, flow to drains, and interaction with streams.The recharge package (RCH) is used to input rainfall and irrigation seepages as vertical boundaries, and the recharge distribution is set according to the data from Hu et al. (2007).The ETS (evapotranspiration package) package is employed to simulate phreatic evaporation considering the multiple-linear relationship with groundwater depth.Based on previous studies (Zhou et al., 2011;Tian et al., 2015), spring flow is also calculated in irrigation areas by representing the drainage area as drain cells.
The groundwater table observed on 1 January 2000 is input and interpolated as initial head for all simulations.The time step is 5 days, and the simulation period is 1 year.Stream inflow at the YLX Gauge and groundwater lateral recharges from mountain areas are used as an upper boundary, and outflow at the ZYX Gauge is taken as a lower boundary (Zhou et al., 2011).In the mountain front region, the aquifer has a high proportion of pebbles, sand, and gravel, while the plain of the lower area is composed of thick unconsolidated alluvial fills with alluvial sand, silt, and clay (Hu et al., 2007).

Base model
Previous research has identified 16 major hydrologic zones within the MHRB based on variations of hydraulic conductivity and specific yield (Hu et al., 2007;Jia et al., 2009).In this study, the PEST parameter estimation system is used to calibrate aquifer parameters (zonal horizontal conductivity K z and specific yield S z ) for the different zones, z = 1, . .., 16.The parameterization of conductivity by constant effective values, one each for the 16 coarse zones, is the foundation of the base model.The 16 zonal conductivities are also used as expected values for the stochastic generation of locally refined conductivity fields as described in the next section.
The objective function for PEST estimates is the sum of squared differences between observations and simulations of groundwater tables at 34 wells distributed through the basin.Effective parameters K z and S z are optimized and estimated separately for the 16 zones of the domain due to computational limitations.The lower and upper levels of parameter settings for conductivity are 0.5 ∼ 50 m d −1 and are 0.01 ∼ 0.25 for specific yield without considering anisotropy.The procedure of PEST includes an initial MODFLOW simulation, calculation of the Jacobian matrix for each parameter, and different Lambda upgrade.After the calibration of K z values the specific yields of the system, S z , are optimized with PEST for transient conditions by minimizing the groundwater table errors at observation wells.

Stochastic realization of conductivity fields
Stochastic simulations are the basis for local-scale realizations of hydraulic conductivity fields, K(x), used in comparisons.The variable x corresponds to a point in the 155 × 172 computational grid used in all simulations of hydrologic system dynamics in the MHRB.Local parameters K(x) are defined by lognormally distributed perturbations on the zonal conductivity field of the base model, (1) The zonal mean, Y z , and variance, V z , are defined in terms of the base model parameters, K z , where C is the coefficient of variation (CV).
In order to compare the effects of different levels of total heterogeneity, realizations of conductivity fields are generated corresponding to six values C = 0.1, 0.2, 0.5, 0.8, 1.0, 2.0.Grid-scale analyses at different levels of heterogeneity are based on a sample of 10 realizations of lognormally distributed random conductivity fields for each value of C. Local-scale simulations of groundwater flow and stream-aquifer exchanges are produced using realizations of K fields based on Eqs. ( 1) and ( 2) in the general model.
Grid point values of conductivity are sampled independently, so no correlation structure is imposed on local conductivities.It is generally accepted that hydraulic conductivity is a random field that is typically correlated over a continuous range of small scales, becoming uncorrelated once a characteristic correlation length is reached (Zhang, 2002;Rubin, 2003).The effect of small-scale correlations on regional velocities and mass transport is understood through second-order for spatially stationary conductivity fields (Gelhar and Axness, 1983;Winter et al., 1984;Neuman and Orr, 1993) and for a wider range of fields that are heterogeneous at local and regional scales (Winter and Tartakovsky, 2002).The small scale of our study is set by the 1 km × 1 km size of our grid cells.Since many alluvial systems have correlation lengths that are much less than that, e.g., Rehfeldt et al. (1992) and Riva et al. (2006), it is informative, and in some cases realistic, to investigate the effect that a field of locally independent, identically distributed conductivities has on the state variables of regional models.Indeed, our experiments show that stream-aquifer exchanges estimated by a typical regional simulation like the base case can be biased even when local conductivities are uncorrelated.

Tests of hypotheses
Comparisons between simulations produced by the base model and local-scale models depend on either normalized squared departures between fluxes produced by the base model and the locally heterogeneous models or ratios between the same fluxes.The results of comparisons are said to support H1 and H2, if simulated values produced by the base model exhibit systematic bias with respect to local-scale simulations of stream-aquifer exchanges.H3 is evaluated by comparing normalized squared departures of groundwater fluxes produced by the base model with samples produced by locally resolved simulations.

Simulated stream-aquifer exchanges
This section focuses on H1 and H2, the hypotheses that simulations of exchanges of water between the stream and aquifer systems produced using locally heterogeneous conductivity fields are systematically less than estimates produced by the base model.Results are given separately for the base model and for models with parameterizations affected by local scales of heterogeneity and then compared.Levels of CV are manipulated as control parameters in the experiments.Simulated groundwater fluxes are used later (Sect.4) to evaluate H3; i.e., the hypothesis that locally heterogeneous conductivity affects the emergence of preferential paths in the aquifer.

Base model calibration and simulation results
The lower fluvial plain of the MHRB is mainly composed of relatively low-permeability silts and clay, while the upland aquifer, containing mostly sand mixed with gravel, is more permeable (Hu et al., 2007).This distribution is reflected in estimates of zonal conductivity and specific storage, K z and S z , derived from a PEST calibration based on a record of 5day groundwater table variations in the year 2000 (Fig. 1).The differences between well heights simulated using the PEST calibration and observations yield a residual mean of −0.19 m, standard error of 0.021 m, root mean square (RMS) of 1.4 m, and normalized RMS of 0.77 %.The PEST calibration results specify higher conductivity for upstream areas near mountain fronts than for the lower fluvial plain in accord with the characteristic geology of the basin.
Simulated river stages and groundwater tables for June and December along the 183 km river distance are displayed in Fig. 2a and b.The groundwater table is generally much lower than the stream stage for about 25 km below YLX Gauge, indicating a losing reach of stream, as can be seen from these two figures.The groundwater table becomes quite shallow in the plain downstream of YLX, and the stream may be losing or gaining at different seasons and locations until the outlet at the ZYX Gauge.
River stage in June is relatively higher than that in December (Fig. 2c), as the streamflow is higher in flood season.Nevertheless, the groundwater table along the river in December is a bit higher because there is almost no pumping for irrigation in that month.Stream inflow from YLX exhibits clear seasonal variations within the year with a large pulse of water in spring-summer due to snowmelt (Fig. 2d).This effect is clearly reflected by monthly stream leakage to groundwater (m 2 km d −1 ).On the other hand, groundwater discharge to the river (or base flow) is relatively stable and less controlled by the seasonal inflow from station YLX.The pattern of groundwater flow is not only controlled by the distribution of hydraulic conductivity but also by the configuration of the water table.Magnitudes of average groundwater flow velocity are higher near the mountain fronts than in the lower plains (Fig. 3a).Average monthly stream leakage rates (Q b ),where the average is taken over all monthly values, exhibit critical transition points between upwelling and downwelling zones (Fig. 3b).The stream is losing in deep groundwater areas near mountain fronts and becomes a gaining stream in the lower plain where groundwater tables are shallow.
The intensity of stream-aquifer interactions also varies seasonally.Interactions in summer and autumn show similar patterns with water exchanges of losing streams and gaining streams stronger than the monthly average.Stream inflow is higher and recharge to groundwater (precipitation and irrigation seepage) is relatively larger in these seasons.Thus, leakage from losing streams and groundwater discharge along gaining streams both become larger than the corresponding seasonal average.The magnitudes of water exchange are more spatially variable in the high-flow seasons of winter and spring: the magnitude augmentation (compared with seasonal average) of water exchanges for losing streams and gaining streams can be as large as 1041 and 317m 2 km d −1 , respectively.However, the water exchanges of the losing streams and gaining streams become much weaker in spring and winter, as the stream inflow from YLX is relatively low and the groundwater recharge (precipitation and irrigation seepage) is quite small.

Influence of local heterogeneity on simulated results
Groundwater simulations based on local (grid-scale) simulations of hydraulic conductivity K(x) display more spatially heterogeneous system states than the base model.Realizations of highly heterogeneous K(x) fields are produced by applying the method of Eqs. ( 1) and (2) in Sect. 2. Figure 4 displays grid-scale perturbations of K(x) from the zonal mean K z for CV of magnitude C = 0.1 and C = 2, as indicated by K(x)/K z .Higher perturbations from zonal mean conductivity occur for higher CV, which follows from the definition of zonal standard deviation (Eq.2).Grid-scale simulations, W G (x), of groundwater tables at the end of the simulation period (December) are more variable than the base result, W B (x). Pointwise differences between the two, W (x) = W B (x) − W G (x), are about equally distributed between grid cells where the grid-scale water table is higher than the base case ( W (x) < 0) and lower ( W (x) > 0), although there is a slightly greater chance of a rise.The number of cells where the absolute change in the water table is large (| W (x)| > 1 m) increases systematically with CV, and the number of cells where | W (x)| < 1 m decreases correspondingly.The pointwise differences exhibit a pattern of compact spatially coherent areas where water tables consistently increase or decrease together.This is despite the lack of any correlation in the values of grid-scale conductivity.
It is convenient to use an indicator, Similarly, Q m , the mean total exchange for month m averaged across all r = 1, . .., N r (= 10) locally refined models is The mean monthly value of the aquifer-stream discharge indicator is always less than 1 for all values of CV: the base model produces discharge values that are systematically larger than the locally heterogeneous models (Fig. 5).When two standard deviation confidence intervals are calculated for I m , a few intervals corresponding to the smallest CVs include 1 for some months.In those cases smallscale heterogeneities increase discharge in some local areas and the overall effect of heterogeneity is reduced.When CV equals 0.1, the monthly aquifer-stream discharge is quite close to the base model value (i.e., the expected value) since perturbations around the expected values are small in that case.On the other hand, mean indicator values decrease to nearly 0.8 as CV increases from 0.1 to 2, and standard deviations continue to increase.As heterogeneity in local models increases, the monthly aquifer-stream discharge indicator decreases and uncertainty as measured by standard deviation increases.The discharge ratio also shows some seasonal patterns, with smaller effects of heterogeneity in high-flow months, such as August and September.Compared with other months, the aquifer-stream discharge quantity is larger and the uncertainty is less in these months.
The mean monthly value of the corresponding streamaquifer leakage indicator is also less than 1 in most months, and the standard deviations show increasing trends with increasing CVs, but the relative magnitudes of leakage are less  than those of discharge and are closer to the base model (Fig. 6).Taking a CV that is equal to 2, for example, the ranges of mean values for stream-aquifer leakage and discharge are, respectively, 0.97-0.99 and 0.81-0.93,and the respective ranges of standard deviations for stream-aquifer leakage and discharge are 0.004-0.013and 0.020-0.044.
The entire Middle Heihe River includes upstream segments where the dominant exchange mechanism is leakage and downstream segments where discharge is dominant.To separate these two confounding classes of fluxes, spatial effects of heterogeneous conductivity are further investigated  Figs. 7,8).Stream 1 is an upstream segment while streams 2-4 comprise the main stem of the river in the alluvial plain.The indicator values are almost all less than 1 for average monthly discharge from these four substreams (Fig. 7), corresponding to the relatively larger impact of conductivity heterogeneity on groundwater processes leading to discharge.Stream-aquifer leakage is larger than  the corresponding base model in every sub-stream during at least some months (Fig. 8).
The stream-aquifer water exchanges of these four substreams respond differently to local heterogeneity, reflecting the different groundwater flow regimes of the different subregions.For example, stream 1 is mainly a losing stream located in a mountain front area with a deep groundwater table; thus, aquifer heterogeneity has a relatively small effect on stream-aquifer exchanges.Streams 2-4 go through the plain where water exchanges are affected by pumping and recharge schemes, and the differences between Q m and Q (B) m are more obvious.Since groundwater depths are shallow and extensive stream-aquifer interactions occur in these areas, the effects of aquifer heterogeneity on water exchanges of these streams are relatively strong.The values of Q m /Q (B) m are smaller during summer and autumn, when the stream inflow from YLX and precipitation are relatively large.

Slow paths
In this section evidence is presented for the hypothesis (H3) that the systematic bias observed in Sect. 3 is due to the response of groundwater flow to local heterogeneities in conductivity fields.Spatially coherent slow paths are seen to emerge in groundwater flows simulated in realizations of heterogeneous conductivity fields, and their density increases as the CV of experiments increases.This occurs despite the lack of any correlation in the spatial distribution of local-scale conductivity itself.

Local-scale conductivity field heterogeneity
It is well-known that groundwater system parameters exhibit heterogeneity on multiple scales ranging down to the smallest (Neuman et al., 1987;Neuman, 1990Neuman, , 1994)).Here the statistics of locally heterogeneous conductivity fields are analyzed and compared to the mean fields corresponding to the calibrated conductivity parameters of the base model.A single realization is chosen at random from the 10 available per level of CV, and its statistics are analyzed.Results for other realizations are essentially the same.
The point x is a grid cell in the zth zone.The ratio compares, K(x), the local conductivity at x to, K z , the expected value of K(x) defined by the calibrated zonal conductivity that parameterized the base model.The summary statistics of ρ z (x) describe the overall behavior of normalized local conductivity when adjusted for zone (Table 1).The arithmetic means of the ρ z (x) equal 1 for all values of CV, implying that the conductivity field is indeed a realization drawn from an ensemble of fields whose zonal expected values are the base model values derived from the initial calibration.The standard deviations of the ρ z (x) increase with CV.Both these behaviors are expected.Since hydraulic conductivity is bounded below by 0, the ranges of minimum values are also limited; nonetheless, minimum values decrease by 2 orders of magnitude, which is consistent with the increase in CV.Maximum values increase with increasing CV.Calculations of Moran's I (not shown) indicate that values of K are not spatially correlated for any CV.
The geometric mean of ρ z (x) declines from 0.96 to 0.76 as the heterogeneity of the conductivity field (CV) increases.Since the effective conductivity of a heterogeneous twodimensional field is equal to its geometric mean (Matheron, 1967;Gutjahr et al., 1978), groundwater flow is expected to be slower in more heterogeneous fields where CV is higher.A related effect is observed in the simulations of aquifer-   9).Maps for other realizations are similar.Black grid cells are locations where K(x) < K z , while white grid cells have values K(x) > K z .The number of black grids clearly increases with heterogeneity (CV), illustrating the increasing percentage of area occupied by locations with smaller conductivity than expected.

Emergence of slow paths
To evaluate hypothesis H3, we compare groundwater fluxes q(x) and q b (x) obtained, respectively, from (i) the flow simulation based on the same realization of a locally heterogeneous conductivity field, K(x), that was selected in Sect.4.1 and (ii) the simulation based on the calibrated base model field, K z .We adopt the normalized difference q(x) = (q(x) − q b (x)) / q(x) • q b (x) (7) to indicate differences between flow simulations.Here q(x) < 0 indicates a point where flow due to local variations of conductivity, q(x), is less than flow due to the base model, q b (x).The normalization removes large-scale effects due to zonal averages from the analysis.
Summary statistics of flow (Table 2) support the hypothesis that locally heterogeneous conductivity fields induce reduced overall flows even though the heterogeneous conductivities are spatially independent.The arithmetic mean difference drops from −0.05 to −0.51 as CV increases from 0.1 to 2, and the median and minimum values exhibit a declining trend as well, implying that groundwater flow estimated by the local-scale model is slower at most points than the groundwater flow estimates from the base model.Meanwhile, standard deviations steadily increase from 0.175 to 0.833, indicating that groundwater flow becomes more variable as CV increases.Additionally the percent of area that   is occupied by relatively lower groundwater flow (q(x) < q b (x)) increases with increasing heterogeneity.This fact is amplified by examining the spatial distribution of normalized differences q(x) (Fig. 10).Other realizations exhibit similar patterns.Black grid cells correspond to locations where q(x) < 0, while white grid cells are points where q(x) > 0. As aquifer heterogeneity increases, the area covered by black grid cells expands, consistent with the last line of Table 2.This is somewhat more pronounced in low-lying regions where zonal mean hydraulic conductivities are low (cf.Fig. 1).
The distribution of locations where q(x) > 0 ( q(x) < 0) is fairly uniform in the simulation with relatively low heterogeneity (CV = 0.1).Areas in zone 1 where there seems to be a slightly higher concentration of q(x) < 0 may be exceptional.It should also be noted that relatively strong local anisotropies in the direction of the stream appear throughout the basin, except in zone 1.The relatively uniform spatial distribution of the q(x) index begins to disappear as CV increases.When heterogeneity is high (CV = 1.0, 2.0), areas where the q(x) index is less than zero generally coincide with zones of low conductivity (like zones 1, 2, 3, etc. in Fig. 1).Zones of high conductivity are somewhat harder to pick out, but there are obvious areas where q(x) > 0 that correspond to areas of high zonal conductivity (like zones 10, 8, 9, etc. in Fig. 1).
Coherent areas of relatively high and low flow buildup in the groundwater simulations (Fig. 10), especially at high levels of heterogeneity (high CV).This seems related to the increase of the area occupied by points of low conductivity with increasing heterogeneity (Table 1).Yet points of relatively low conductivity do not exhibit spatial coherence (cf.Fig. 9), nor should they since the random perturbation to con- ductivity that is added at any given point is chosen independently of all other points.The spatial coherence of low-flux areas, as measured by q(x) < 0, results from the cumulative effect on continuous flows of blockages arising from spatially independent low conductivities.The flow paths of other simulations are statistically the same as this one, although their actual locations and geometry are different.

Summary and concluding remarks
This study focuses on three related hypotheses about the effects of locally variable hydraulic conductivity fields on regional exchanges of water between streams and aquifers: (H1) small-scale heterogeneities of hydraulic conductivity significantly affect simulated stream-aquifer water exchanges in river basins, and hence, computational projections of them; (H2) systematic biases arise in estimates of exchanges if small-scale heterogeneities are smoothed by aggregation into a few sub-regions; and (H3) the biases result from slow paths in groundwater flow that emerge due to small-scale heterogeneities.
The study addresses these hypotheses through computational experiments by simulating system states of the hydrologic systems of the Middle Heihe River basin (MHRB) of northern China.The study compares (i) estimates of streamaquifer exchanges and groundwater fluxes produced by a base model that reflects current practice of parameterizing hydraulic conductivity on sub-regional scales to (ii) estimates produced by a related set of models whose conductivity parameters are spatially heterogeneous on local (grid-point) scales.The computational experiments provide evidence that regional system states, specifically exchanges of water between the MHRB's streams and aquifers, respond significantly to local variations of conductivity.
Local-scale heterogeneities in conductivity fields cannot be resolved by effective zonal parameterizations used in the base model.The cumulative effects of local heterogeneities combine to produce spatially coherent "slow paths" in the resulting simulations of groundwater flows.Flow paths in locally heterogeneous systems appear to intersect enough relatively low-conductivity areas compared to the base model to reduce groundwater flow regardless of zone.The system-atic behavior observed in the local-scale experiments is also consistent with theoretical results of Matheron (1967) and Gutjahr et al. (1978).
The aquifer-stream discharge response is stronger than the response of stream-aquifer leakage.That may result from discharge's greater sensitivity to variations of hydraulic conductivity in the MHRB.Stream segments that are further downstream are also more affected by heterogeneity than upstream segments.This seems consistent with the overall pattern of leakage in the Heihe River basin where recharge occurs primarily in upland reaches that are not directly connected to the water table and average conductivities are relatively high, and thus are relatively unaffected by variations in conductivity.
The results presented here strongly suggest that localscale variations in the conductivity system parameter are not averaged out as scales of representation increase, but instead significantly affect regional exchanges of water between streams and aquifers (H1): in these experiments local heterogeneity has nonlinear effects on states of the regional system.This is the case even when the level of heterogeneity in conductivity is fairly small, as measured by the coefficient of variation of the simulated conductivity fields (CV).These effects increase with CV.This is especially true of discharge, which the sub-regionally scaled base model systematically overestimates (H2).The emergence of slow paths in the groundwater flow field (H3) appears to be the source of the nonlinearities.
These nonlinear effects have implications for water resources management in addition to their intrinsic scientific interest.Since the introduction of computational tools capable of approximating the dynamics of basin-scale hydrologic systems, it has become common to use effectively (zonally) parameterized coupled models to manage regional water resources (Chen and Shu, 2006;Werner et al., 2006;Dragoni et al., 2013).Although estimates of system states produced by such aggregated models have been assumed to approximate watershed dynamics up to reasonable amounts of unsystematic random error, this assumption is generally not tested.The base model used here, and the methods used to produce it, is typical of such zonally parameterized models.It has been an open question whether zonal parameterizations can produce estimates of system states that account for the effects of local-scale heterogeneity that are known to exist in parameters (Schmidt et al., 2006;Rodríguez et al., 2013).The results of this study reveal a systematic bias in approximations of stream-aquifer exchanges in the MHRB produced by the base model.This is especially true of discharge.Similar effects have been observed in other studies (Kalbus et al., 2009;Kurtz et al., 2013;Mendoza et al., 2015;Lackey et al., 2015).
Most of the evidence for (or against) hypotheses like H1-3 will continue to be computational.Several additional lines of investigation suggest themselves immediately.First is the extension of this kind of study to other basins, both natural and synthetic.Extensions to three dimensions may be required in some cases.The respective roles of zonal and local heterogeneity can be investigated explicitly by applying analysis of variance (Winter et al., 2006), or other multivariate statistical techniques.
Local-scale heterogeneities are independent in this study.This is a weak assumption in the sense that it places a minimal number of restrictions on the spatial structure of conductivity fields, but its influence on conclusions about H1-3 is unclear.Hydraulic conductivity fields are generally correlated, so experiments that evaluate the effects of different levels and kinds of correlation should be pursued.Synthetic conductivity fields that reproduce specified correlation structures can provide large numbers of realizations of parameter fields with given values of control variables like correlation length and spatial anisotropy.
Finally, the existence of slow paths in groundwater flows, is inferred indirectly here from summary statistics (cf. the last line in Table 2) or by qualitative interpretations of results (Fig. 10).A Langrangian analysis based on particle tracking would allow the interaction between variations in conductivity and groundwater fluxes to be quantified more directly.

Figure 1 .
Figure 1.PEST estimated aquifer conductivity K z and specific yield S z .

Figure 2 .
Figure 2. Simulated groundwater table and river stage along river distance in (a) June and (b) December; (c) difference between December and June (December − June); (d) temporal stream-aquifer exchanges within year.

Figure 3 .
Figure 3. Groundwater flow rate q 0 distribution (a); average monthly leakage rate Q b and groundwater table depth (b).

Figure 4 .
Figure 4. Grid-scale realization of K field perturbation when CV equals 0.1 and 2.
stream-aquifer exchanges produced by local grid-scale models, Q m , with monthly exchanges produced by the base model.Here m = 1, . .., 12 indicates the month.The base model simulation, Q (B) m , of total water exchanged between the stream and aquifer during month m summed over all x = 1, . .., n grid cells along the stream is

Figure 5 .
Figure 5. Means and confidence intervals of aquifer-stream discharge from local model.

Figure 6 .
Figure 6.Means and confidence intervals of stream-aquifer leakage from local model.
by calculating indicator values (Q m /Q (B) m ) for four major sub-streams in the MHRB.Results for discharge and leakage are shown for all 10 realizations of conductivity for each value of CV (

Figure 7 .
Figure 7. Aquifer-stream discharges of different streams from refined model.

Figure 8 .
Figure 8. Stream-aquifer leakages of different streams from refined model.

Figure 10 .
Figure10.The field of normalized groundwater flow differences q(x) with grid-scale K(x).

Table 1 .
Grid-scale K field statistics.

Table 2 .
Groundwater flow field statistics with local heterogeneity.