Sensitivity of hydrologic and geologic parameters on recharge processes in a highly heterogeneous, semi-confined aquifer system

An increasing reliance on groundwater resources has been observed worldwide during the past 50–70 years and has led to unsustainable groundwater abstraction in many regions, especially in semi-arid and arid alluvial groundwater basins. Managed aquifer recharge (MAR) has been promoted to replenish overdrafted groundwater basins and augment surface water supply. However, MAR feasibility in alluvial groundwater basins is complicated by complex geologic architecture that typically includes laterally continuous, fine-texture confining units that can impede both recharge rates and regional propagation of increases in the hydraulic head. A greater feasibility of MAR hinges on identifying locations where rapid, high-volume recharge that provides regional increases in pressure head are possible, but relatively little research has evaluated the factors that control MAR feasibility in alluvial groundwater basins. Here, we combine a transition probability Markov chain geostatistical model of the subsurface geologic heterogeneity of the eastern side of the northern Central Valley, California, with the three-dimensional, variably saturated water flow code ParFlow to explore the variability of MAR feasibility in this region. We use a combination of computationally efficient localand global-sensitivity analyses to evaluate the relative importance of factors that contribute to MAR feasibility. A novel proxy parameter approach was used to describe the configuration and proportions of subsurface hydrofacies and the water table depth for sensitivity analyses, and results suggest that recharge potential is relatively more sensitive to the variability of this proxy parameter than to the variability of individual hydrofacies hydraulic properties. Results demonstrate that large variability of MAR feasibility is typical for alluvial aquifer systems and that outsized recharge rates are possible in select locations where interconnected, coarse-texture hydrofacies occur.


Introduction
Geologic heterogeneity strongly affects both the movement of water in the subsurface and the exchange of water between subsurface and surface stores; however, rarely are enough data available to explicitly represent heterogeneous geologic features 20 in groundwater models (Koltermann and Gorelick, 1996;De Marsily et al., 2005). Instead, models typically simplify and/or upscale heterogeneity to represent subsurface flows for purposes of regional-scale water resources management (e.g., Fogg, 1986;Phillips and Belitz, 1991). Upscaling methods have been the focus of numerous studies (e.g., Renard and De Marsily, 1997;Fogg et al., 2000;Neuman and Di Federico, 2003;Fleckenstein and Fogg, 2008), and coarse-resolution models with upscaled (i.e., effective) hydrologic properties are often adequate for regional-scale flow studies, but typically lack enough 25 detail to reliably capture some phenomena, like recharge and transport processes, that are strongly influenced by geologic heterogeneity.
To represent the influence of geologic heterogeneity on flow and transport phenomena, many approaches have relied on stochastic methods, like transition probability based indicator geostatistics which can represent heterogeneous features while honoring measured data (Carle and Fogg, 1996;. These approaches 30 represent geologic heterogeneity with hydrogeologic facies categories, each of which is assigned effective values or probability densities for estimates of hydraulic properties. By categorizing facies according to depositional environment rather than texture alone, the predictable geometries (i.e., facies mean lengths, proportions, and juxtapositions) of these features can be more accurately represented with sparse data. Studies that rely on these methods show strong influence of subsurface heterogeneity on groundwater/surface-water interactions and recharge processes (Lee, 2004;Fleckenstein et al., 2006;Engdahl et al., 2010; A.  . Groundwater pumping in the region typically occurs at depths >30 m in the deeper semi-confined or confined portion of the aquifer system (Liu, 2014).

Hydrofacies Model Development
Transition probability Markov-chain geostatistics (TPROGS) (Carle andFogg, 1996, 1997;Carle, 1999), was used to simulate the subsurface distribution of hydrofacies in the domain area (Fig. 2). Model development is described in detail by Meirovitz (2010) and Maples et al. (2019). Conditioning data for the TPROGS model included about 1200 well logs, soil surveys, geologic cross-sections and mapped paleochannels. Geologic data were binned into four textural categories: gravel, sand, 90 muddy sand, and mud (undifferentiated silt and/or clay) [Table 1 (Fleckenstein et al., 2004;Meirovitz, 2010)]. "Mud" refers to silt and clay undifferentiated, because most of the subsurface data available only identify the fine-grained sediments and are not sufficiently detailed to distinguish silt from clay. From these data the proportions for each facies were calculated directly. Through geologic analysis of the data, additional ::::::::: Additional parameters were estimated describing the mean lengths of each hydrofacies along the principal directions and the embedded transition probabilities to represent cross-correlation 95 between different facies. Because the depositional characteristics of the American and Cosumnes fans were markedly different, individual models of each were produced and subsequently combined by Meirovitz (2010).   Three-dimensional, variably-saturated water flow was simulated with the hydrologic modeling code, ParFlow (Ashby and Falgout, 1996;Jones and Woodward, 2001;Kollet and Maxwell, 2006), which couples surface and subsurface flow with the 2-D diffusive or kinematic wave equation, and solves the 3-D mixed form of Richards' Equation for variably-saturated subsurface flow: S ss S ww : (h) ∂h ∂t + φ ∂S w (h) ∂t ∂S w (h) ∂t :::::: = ∇ · q + q r r (x, z) (1) 110 where q = φS ww : (h)v = −K ss (x)k r r (h)∇(h + z) In these equations S s :: S s is specific storage [L −1 ], S w ::: S w is relative saturation [−], h is pressure head [L], t is time [T], φ is porosity [−], q is Darcy flux [L T −1 ], q r :: q r is a source/sink term [T −1 ], z is elevation [L], v is the subsurface flow veolicty [L T −1 ], K s (x) ::::: K s (x) : is the saturated hydraulic conductivity tensor [L T −1 ], and k r is relative permeability [−]. The van 115 Genuchten relations (Van Genuchten, 1980) describe S w and k r :: S w :::: and :: k r : as a function of h in the unsaturated zone, with parameters for air entry pressure α [L −1 ], pore size distribution n [−], and residual saturation S res ::: S res [−].

Boundary Conditions
Model boundary conditions are discussed in greater detail in Liu (2014) and Maples et al. (2019). The locations of domain boundaries were chosen to simplify the assignment of boundary conditions for the flow model. The eastern boundary roughly 120 coincides with the Sierra Nevada foothills, and the northern, southern, and western boundaries roughly coincide with local surface water bodies (Fig. 1). A specified head boundary condition was applied for the eastern boundary to coincide with the local groundwater head distribution estimated from local monitoring well data (Liu, 2014). A general head boundary of 0 m amsl was set 1 km beyond the western boundary to approximate the Sacramento River and Sacramento-San Joaquin Delta along the northwestern, and southwestern portions of the western boundary, respectively. No-flow boundary conditions were 125 applied along northern, southern, and bottom boundaries because the regional groundwater flow direction is generally from east to west.
Model spin up and recharge simulations used combinations of specified-flux and specified-head upper boundary conditionsand are described in greater detail in subsequent sections.

130
Model spinup and calibration are described in greater detail in Liu (2014) and Maples et al. (2019). To summarize, a 16-yr.
simulation period was used to bring the simulated hydrology into dynamic equilibrium. Water budget components, including groundwater discharge, recharge and boundary flows along with facies hydraulic properties were estimated and adjusted manually to simulate a realistic water budget, water table configuration, and vertical hydraulic gradients during the calibration process.
: . An initial potentiometric surface was specified using interpolated groundwater level data. Monthly estimated urban 135 and agricultural groundwater pumping rates were applied as specified fluxes representing wells screened in lower portions of the domain that coincide with typical screened intervals of municipal and agricultural pumping wells in the region. Dominant sources of recharge for the region include stream recharge from the American River, Cosumnes River, and Deer Creek, as well as deep percolation from agricultural and urban return flows. Weekly estimates of spatially-distributed river stage for the streams were applied as specified heads along coincident land surface cells. Monthly estimates of urban and agricultural 140 recharge volumes were applied as specified-flux boundary condition across the top of the domain to simulate deep percolation of agricultural and urban return flows and to equilibrate soil moisture conditions in the near-surface UZ cells. :::::::::::::: unsaturated-zone :::: (UZ) ::::: cells.

Recharge Scenarios and Model Post-Processing
To evaluate the system response to recharge stress and recovery, 90-day recharge scenarios were run individually at each site, wherein recharge was simulated during the initial 30-day period followed by a 60-day recovery period. Surface ponding was approximated by a specified head boundary condition representing 10 cm of ponding at the 25 upper-boundary cells coincident with each recharge site, with no recharge specified for the remaining upper-boundary cells. An additional simulation was run in which no recharge was specified for all upper-boundary cells, i.e., as a no-recharge scenario. The initial condition for all  Table 3.

225
To better understand the relationships between site characteristics (variables :::::::: predictors) and model outputs (predictors :::::::: variables), correlations (Pearson's r, Spearman's rho, and Kendall's tau) were evaluated between all variable and predictor pairs across all 100 sites. Variables and predictors are described in Table 3. The purpose of evaluating correlations between variables and predictors was to determine if any site characteristics could be used to reasonably predict model outputs with empirical relations.
Log 10 data transformations were selectively performed on variables and predictors to improve normality prior to calculation of 230 Pearson's r. Transformations were not performed for Spearman's rho and Kendall's tau, as neither require normal distributions for prediction.

Development of a Geologic Proxy Parameter for Recharge Potential
To incorporate descriptions of geologic configuration in sensitivity analyses of recharge potential, development of a geologic proxy parameter (GP P ) was required. Correlations between select variables (R 30d , P 30d , and V fines, 90d ) and predictor pairs 235 described in section 2.4.4 were ranked, and a GP P was determined for each by developing empirical regression relations between those variables and the highest ranked predictor. Unlike other parameters, the GP P cannot be directly varied at each site, and instead was approximated either with linear regression of recharge response and GP P , or by relocating the recharge site within the domain to a location with the corresponding GP P . Previous studies have shown the importance of geologic heterogeneity on MAR feasibility (e.g., Maples et al., 2019), and the novel proxy parameter methodology described here is 240 analogous to a transfer function (e.g., Wösten et al., 2001), in that it describes the influence of complex geologic heterogeneity on recharge processes with relatively easily-derived site characteristics. By using this approach in sensitivity analyses, we are able to both capture this geologic complexity and also reduce the overall computational expense, albeit with some predictive uncertainty related to the regression relations.

245
Realistic ranges of model parameters describing eight ::: six hydraulic properties for four facies types (n = 24 model parameters) are shown in Table 4 along with an estimated range of GP P . Ranges for model parameters were chosen from literature values for the Central Valley California, and for similar alluvial systems (Anderson et al., 2015;Botros et al., 2009;Fleckenstein et al., 2004;Frei et al., 2009;Maserjian, 1993;Niswonger and Fogg, 2008;Sager, 2012). Model parameters were assumed to be distributed uniformly within each of these ranges for simplicity. The range of GP P was determined from the range observed 250 from the 100 exploratory sites described in section 2.4. The distribution GP P was observed to be approximately log-normal, so a Log 10 data transformation was performed for subsequent sensitivity analyses. All sensitivity scenarios were initialized with the h distribution from the end of the model spin up and were simulated and post-processed following the approach outlined in section 2.4.3. In this way, each scenario required two simulations be run with 255 the same parameter sets (i.e., a recharge and no-recharge simulation) which were then differenced to isolate recharge stresses from other model stimuli, including transient model responses to changes in parameter values.

Local Sensitivity Analyses
Parameter sensitivities were evaluated locally using the dimensionless-scaled sensitivity (DSS) and composite-scaled sensitivity (CSS) metrics, which are computationally-frugal screening methods used to compare the relative importance of different 260 parameters to the estimation of a simulated model output (Hill and Tiedeman, 2007). DSS for simulated output i and parameter j are calculated as where y i is the ith simulated output, b j is the jth estimated parameter, ∂y i /∂b j is the derivative (i.e., the sensitivity) of the simulated output with respect to the jth parameter, b is the vector of parameter values at which sensitivities are evaluated, and ii is the weight of the ith simulated output. For this work, simulated outputs P 30d , R 30d , V fines, 90d were weighted equally at unity.
Composite-scaled sensitivities (CSS) were calculated to estimate the total amount of sensitivity provided by each parameter across multiple sites and for multiple model outputs where DSS ij is from equation 6, and n is the total number of simulated outputs i associated with parameter j.
DSS were estimated for select model outputs R 30d , P 30d and V fines, 90d (Table 3) by perturbing each hydraulic-property parameter (n = 24) by 10% of its total range (Table 4). Results from the exploratory simulations show that recharge response is highly dependent on site choice, so DSS was evaluated at four representative sites which span a large range of recharge potential. Each of the four representative sites were chosen to correspond with the 25th, 50th, 75th and 95th percentile of 275 recharge potential, as estimated by GP P . These sites are hereto referred to as q25, q50, q75, and q95, respectively. A total of 96 model evaluations were required to estimate DSS on the three model outputs (R 30d , P 30d and V fines, 90d ) for all 24 parameters (Table 4) at these four sites. To incorporate GP P in DSS analyses, an approach was developed using the predictive regression relation between GP P and R 30d . For example, DSS require perturbation of a parameter (i.e., ∂b j ) by a percentage of the parameter range (e.g., by 10%). A corresponding 10% perturbation in R 30d (i.e., ∂y i ) was approximated using the predictive 280 regression relation between GP P and R 30d rather than by performing an additional model evaluation.
CSS were calculated for R 30d , P 30d and V fines, 90d (Table 3 by combining DSS estimates for each model output across sites q25, q50, q75, and q95 for each of 24 model parameters. The same approach was used to calculated CSS for GP P , but was only estimated for R 30d and not for P 30d and V fines, 90d .

285
A measure of global sensitivity was provided by the method of Morris (1991), which relies on the calculation of elementary effects, i.e., local derivatives sampled one-at-a-time (OAT) on a grid that covers the parameter space. The method of Morris creates a trajectory through the parameter space by perturbing each parameter x j along a grid by a step ∆ j . A sequence of p perturbations is required to obtain a one trajectory for a model with p parameters. For each trajectory, the elementary effect for a single parameter, EE j , is calculated as the ratio of the perturbation in model output to the perturbation of the parameter :::::::::::::::::::::::::: where f (x) is the evaluation of the function at the prior point in the trajectory. Calculating the elementary effects for p parameters using a single trajectory requires p + 1 model evaluations. Because the elementary effect for any single trajectory does not account for interactions between parameters and depends strongly on the location of the initial point, x, in the parameter space, the method of Morris performs the OAT approach over multiple trajectories, N , within the parameter space using a factorial 295 sampling approach. A variation of the original approach was employed to resolve issues related to opposite signs of elementary effects affecting the calculation of total-order sensitivity (Campolongo et al., 2007), in which the total-order sensitivity of each parameter, µ * j , was calculated as the mean of the absolute values of N elementary effects Unlike DSS approaches, the method of Morris required significantly greater computational resources for an equivalent 300 number of parameters, so a sub-set of three parameters were included in the Morris approach. Results from local sensitivity analyses suggest that K s and GP P are relatively important parameters for recharge potential. . : Computational expense was further reduced by pairing parameters, i.e., K s of gravel and sand, and of muddy-sand and mud, effectively reducing these four parameters into two describing 'coarse-' and 'fine-texture' facies, respectively. By pairing parameters, K s of gravel and sand (and of muddy sand and mud) are perturbed within their respective parameter ranges together, reducing the total number of 305 parameters from five to three : . :: In ::::: total, :::: three ::::::::: parameters ::::: were ::::::: included :: in ::: the :::::: Morris ::::::::: approach, :::::::: including ::: K s :: of :::::::::::: coarse-texture ::::: facies, ::: K s :: of :::::::::: fine-texture :::::: facies, ::: and :::::: GP P . Sensitivity indices were calculated using a sample size of N = 20, resulting in a total of 80 model evaluations. Herman et al. (2013) demonstrated that the method of Morris with N = 20 trajectories produced similar sensitivity results to the Sobol' method (Sobol, 2001) with >2 orders-of-magnitude fewer model evaluations.
To further reduce the computational expense, the total simulation time was reduced from 90 to 10 days, during which 310 recharge was applied for the entire simulation. Sensitivity indices, Morris µ * j , were only evaluated with respect to the effective recharge rate at the end the 10-day simulation period, R 10d [cm d −1 ]. Morris µ * j was not evaluated with respect to other model outputs describing pressure perturbation or volume of recharge accommodated by fines because GP P was determined to be an inadequate predictor of these model outputs.
To incorporate GP P in the Morris framework, a novel approach was developed in which the location of the sampling site 315 was varied to correspond with the requisite GP P parameter choice. For example, if a hypothetical sensitivity analysis required evaluation of the model with GP P at the 50th quantile (q50; i.e., the median value), the model would be run using the site with the nearest corresponding GP P value from the 100 exploratory sites described in section 2.4. In this way, the variability of GP P as identified in the exploratory simulations can be sampled directly by simply varying the location of the recharge site within the domain. The Morris approach was implemented with open-source library developed by Herman and Usher (2017).

Exploratory Simulations
Results from the exploratory simulations at the 100 selected sites show a wide range of R 30d , P 30d , and V fines, 90d across sites (Fig. 4). R 30d varied over 2 orders of magnitude and were non-normally distributed, with a maximum, minimum, and mean value of 66.4 :: cm ::: d -1 , 0.5 ::: cm ::: d -1 , and 8.6 cm d -1 . P 30d were similarly non-normally distributed and also showed a large range, varying over 4 orders of magnitude. Maximum, minimum, and mean P 30d were 1.6 × 10 5 :: m 3 , 33 :: m 3 , and 1.9 × 10 3 m 3 , respectively. These results highlight that a small number of sites have outsize recharge potential compared with most of the landscape. R 30d and P 30d were positively correlated (r > 0.70), indicating that these recharge benefits are physically related.
The proportion of recharge accommodated by fine-texture facies (V fines, 90d ) also showed large variability across sites, ranging from 0.13 to 1.00, with a mean value of 0.69. The high proportion of V fines, 90d observed here is consistent with previous 330 findings that suggest that fine-texture facies are the largest reservoir for MAR in this aquifer system (Maples et al., 2019).
V fines, 90d was negatively correlated with both R 30d and P 30d (r > 0.70 :::::::: |r| > 0.70), which indicates that when interconnected, coarse-texture pathways are present, a greater proportion of MAR is accommodated in the coarse-texture aquifer system.

Influence of Coarse-Texture Connectivity
Of the 100 exploratory sites, 23 were shown to have interconnected coarse-texture gravel and sand facies from land surface 335 to the initial water table depth. R 30d , P 30d , and V fines, 90d were parsed according to whether they were interconnected (Fig.   4). Results show that mean R 30d and P 30d were 2.2× and 2.3× greater, respectively, for interconnected sites than for noninterconnected sites (14.7 vs ::: cm ::: d -1 ::: vs. : 6.7 cm d -1 , and 1. non-interconnected distributions of V fines, 90d were not significantly different.
These results indicate that sites with interconnected coarse-texture facies have greater R 30d and P 30d potential. However, this metric is not entirely diagnostic of recharge potential. As shown in Fig. 4a,b, some interconnected sites exhibited low R 30d and P 30d . This is likely because some interconnected sites with shallow water table depths have limited unsaturated pore volume to accommodate large recharge volumes. In addition, the interconnection metric described herein only describes 345 vertical interconnection of coarse-texture facies for unsaturated-zone cells that are vertically coincident with the recharge site, and does not consider whether these coarse facies connect with the greater aquifer network outside of the unsaturatedzone control volume. Results also show that some seemingly disconnected sites have large recharge potential. Indeed, the interconnection metric described here does not account for any lateral interconnection from land surface to the greater aquifer network, which could explain this behavior. In reality, the simplified estimator of connectivity used here likely underestimates 350 the number of interconnected sites.

Ranked Correlations
Additional correlation metrics (Pearson's r, Spearman's rho, and Kendall's tau) between R 30d and site characteristics were ranked and are shown in Fig. 6. Results show that site characteristics that include some variation of K arith , K geom , or K harm were, 365 in general, more correlated with R 30d than site characteristics that only include W T D, U Z coarse , and Surf coarse . K geom ×W T D was, on average, most correlated with R 30d .
In general, site characteristics that included K geom and K harm were slightly more correlated with R 30d than site characteristics that included K arith . We speculate that this behavior is related to the dominantly vertical flow direction of recharge across typically horizontal facies configurations. Previous work has shown that K geom and K harm best describe upscaled K for these 370 flow configurations in this domain (Yunjie Liu, personal communication; Fogg, 1986).
Interestingly, site characteristics that included only W T D, U Z coarse , and Surf coarse were poorly correlated (r < 0.20) with R 30d . This finding has important implications for determining MAR site suitability because many GIS-derived indices of recharge suitability rely solely on soil and/or surface geology to determine geologic suitability for recharge. These results suggest that even more detailed geologic descriptions that estimate deeper fractions of coarse-texture facies may not fully 375 capture recharge potential. Instead, metrics that include some description of upscaled vertical K appear to be most diagnostic of recharge potential.

Recharge Extrapolation
The relation between site-averaged K geom × DT W and R 30d was determined to be the best predictor and was used to predict R 30d for subsequent sensitivity analyses by treating K geom ×DT W as a GP P (Fig. 7a). The linear regression relation between 380 K geom × DT W and R 30d was highly significant (p < 0.01), and correlation coefficients (r 2 ) showed that empirical regression explained 70% of the variation in the data. Linear regression relations for K geom × DT W and P 30d and V fines, 90d were deemed insufficient for prediction (r 2 < 0.40) and were not incorporated in sensitivity analyses. Domain-wide K geom ×W T D was converted to R 30d using the predictive relation described above (Fig. 8). Results show that 84% of the domain has R 30d potential <10 cm d -1 , while 6% of the domain has R 30d potential >25 cm d -1 , and a small portion of 385 the domain has R 30d potential >150 cm d -1 . These results show a large contrast between locations with high recharge potential and those with low recharge potential which supports previous findings indicating that a small fraction of the landscape has recharge potential that is orders-of-magnitude greater than the rest of the landscape (Maples et al., 2019;Fleckenstein et al., 2006). Deposition of IVF within the domain area has been documented by Meirovitz (2010) and explains the presence of these high recharge potential locations.

Local Sensitivity Analyses
GP P perturbations to estimate DSS for sites q25, q50, q75, and q95 using predictive regression relations are are illustrated in Fig. 7b,c. DSS and CSS results for each model parameter and GP P with respect to R 30d is shown in Fig. 9. DSS results ( Fig. 9a) show that for low recharge potential sites q25 and q50, K s of mud and muddy sand facies were the most sensitive 395 parameters with respect to R 30d . For high recharge potential sites q75 and q95, GP P was the most sensitive parameter. These findings demonstrate that K s of fine-texture facies is the dominant driver of recharge potential for low recharge potential sites and the configuration of facies and water table depth is relatively less important. However, for high recharge potential sites, which presumably have a higher proportion of coarse-texture facies, the configuration of facies and water depth becomes the dominant driver of recharge potential. In general, DSS of all parameters were greater for sites with higher recharge potential 400 than for sites with low recharge potential.
CSS results for R 30d (Fig 9b) show that, in general, GP P was the most sensitive parameter for R 30d when aggregated across all 4 sites. In general, K s and φ were also sensitive with respect to R 30d . It is unsurprising that /phi is sensitive to R 30d because specific yield (S y ), which is not explicitly parameterized in ParFlow, is closely related to φ. Moreover, Maples et al. (2019) showed that the majority of recharge volume in this alluvial system is accommodated by filling unsaturated-zone pore 405 volume, which is controlled primarily by S y , and by association in this model, by φ. Empirical fitting parameters describing unsaturated-zone texture and soil water retention, α, n, and S res , were relatively insensitive, especially for sites q75 and q95.
This suggests that while unsaturated pore volume is important for recharge, the unsaturated flow processes are not particularly important, at least when considering infiltration of ponded water, which typically allows for rapid wetting-front advancement through the unsaturated zone, especially for high recharge potential sites. Results suggest that saturated storage properties, i.e., 410 S s , were also relatively unimportant. This likely because most recharge volume is accommodated by filling unsaturated pore volume, and is thus more dependent on φ (and S y ) than on S s .
DSS and CSS results for GP P demonstrate the novel usage of empirical regression relations in a local sensitivity analysis framework. By perturbing GP P in this way, constancy of other parameters can be maintained in a way that would be otherwise difficult if GP P was perturbed by changing the location of the recharge site. Performing local sensitivity analyses at multiple sites spanning a range of recharge potential allowed for comparison of DSS sensitivities across sites and highlights differences 425 of parameter sensitivities for low-and high-recharge potential sites. Our findings demonstrate that (1) facies permeability and unsaturated-zone storage properties are important factors for recharge potential, and (2) the configuration of subsurface geology and water table depth is particularly important for the total recharge volume that can be accommodated at a particular site, especially for high recharge potential sites.

430
Results from global sensitivity analyses are shown in Fig. 11. Morris µ * values indicate that GP P is the most sensitive parameter when compared with K sat of coarse-and fine-texture facies. These results are consistent with findings from local sensitivity analyses which also showed that GP P was the most important parameter with respect to R 30d . Unlike DSS and CSS results, which compared GP P against model parameters for each facies, Morris analysis combined K s parameters for coarse-and fine-texture facies which, in turn, increased the influence of those parameters on R 30d relative to GP P . Even so, 435 results indicate that GP P is the most important parameter with respect to R 30d . These results further highlight the importance of the configuration of subsurface geology and water table depth for groundwater recharge potential.
Morris results demonstrate a novel incorporation of GP P within a global sensitivity analysis framework, and was unique as compared to incorporation of GP P in local sensitivity analyses described in section 3.4.1. Unlike the local methods, which used an empirical relation to incorporate an estimate of GP P sensitivity, the method used for the Morris approach directly varied GP P within the parameter space by moving the recharge site to the location with the requisite GP P parameter value.
Unlike the local approaches, which required constancy among all other parameters as each parameter is perturbed and thus required usage of an empirical relation to perturb GP P , the Morris approach varies all parameters globally, which allowed for GP P to be included explicitly within the approach. Consistency of the results of the local and global approaches despite these methodological differences for incorporating GP P highlights the robustness of these findings.

Discussion
Sensitivity analyses are a fundamental diagnostic tool to provide insight into the relative importance of the parameterization of aquifer properties among other inputs in complex hydrologic models (Saltelli et al., 2004). Sensitivity analyses can be broadly categorized as local or global methods, where local methods provide sensitivity evaluation at a single location in the parameter space (Hill and Tiedeman, 2007), while global approaches explore sensitivities throughout a multi-dimensional pa-450 rameter space (Saltelli et al., 2008). Many studies have shown the diagnostic utility of local approaches (e.g., Foglia et al., 2009); however, local approaches are generally less robust than global approaches, especially for non-linear models (Saltelli et al., 2008). where bars represent each µ * estimate, and whiskers represent the respective 95% confidence interval of the estimate. meability upscaling in variably-saturated flow models (e.g., Gilbert et al., 2016;Foster and Maxwell, 2019;Srivastava et al., 2014) and on MAR specifically (e.g., Rahman et al., 2013;Heilweil et al., 2015), but to our knowledge, this is the first study to use a 3-dimensional variably-saturated water flow code with a detailed representation of geologic heterogeneity to evaluate the importance of hydraulic properties and geologic configuration on MAR dynamics with a combination of local and global sensitivity analyses.

460
Our approach includes novel incorporation of geologic architecture as a geologic proxy parameter (i.e., GP P ). Of the many approaches to develop a GP P of recharge potential from descriptions of subsurface geologic and hydrologic characteristics, our results show that a GP P which combines metrics related to upscaled vertical K s and unsaturated zone thickness was most diagnostic of recharge potential. In addition, results from local and global sensitivity analyses indicate that this GP P is equally or more important as characterizing the hydraulic properties of any particular facies for recharge potential. Consistency among 465 results for both local and global approaches shows that these findings are reasonably robust and highlights the importance of accurately characterizing the subsurface configuration of coarse-texture facies in clastic sedimentary aquifer systems. While a GP P was shown to be the most important parameter with both approaches, we also show that parameters related to unsaturatedzone storage and facies permeability (i.e., φ and K s , respectively) were also important for MAR. In contrast, we show that parameters related to unsaturated-zone geologic texture and soil water retention, along with saturated-zone storage properties 470 (i.e., α, n, S res , and S s ) were relatively unimportant. We speculate that these parameters are relatively unimportant because our simulations typically showed that surface ponding initiated rapid downward wetting-front advancement through the unsaturated zone, quickly developing fully saturated conditions from land surface to the water table. In systems dominated by diffuse recharge, these parameters may be more sensitive.
Findings presented here for a semi-confined alluvial aquifer system show large spatial variability of recharge rates that are accommodating orders-of-magnitude greater recharge benefit than would be possible over the rest of the landscape.  Weissmann et al., 2005) and in other major river fans that drain highelevation, glacially-influenced catchments (e.g., Pierce and Scott, 1983). Identifying sites that can accommodate large MAR volumes during short windows is especially valuable in places like California, where excess surface water available for recharge typically occurs from a few precipitation events (Dettinger et al., 2011) during short (< 10 day) windows (Kocis and Dahlke, 2017).
Importantly, no single GP P described herein was a fully diagnostic metric of recharge potential at all sites. This result is not surprising given the complexity of geologic architecture and variability of aquifer configuration sampled across sites in the domain, which are challenging to fully captured :::::: capture with a single metric. For example, all site characteristics described here were developed only for those model cells that are vertically-coincident with each site footprint, and do not account for into this phenomena could provide additional insight into developing site-specific GP P , but is outside the immediate scope of this work. We also acknowledge some limitations of our sensitivity analyses. For example, reliance on imperfect empirical regression relations to include measures of geologic configuration in local methods likely introduced uncertainty to DSS and CSS estimates for this parameter. In addition, inclusion of all model parameters describing facies hydraulic properties in the 515 Morris approach would have been valuable, but was infeasible given the computational resources for the simulations required.
In addition, parameter-range uncertainty contributes some uncertainty to rankings of parameter importance.
Our simulations also do not consider some subsurface geologic conditions that influence MAR. Clastic sedimentary aquifer systems are typically replenished naturally over longer timescales (Taylor et al., 2013) because even productive aquifer systems are commonly composed mostly of fine-texture sediments (e.g., Fogg, 1986;Fogg et al., 2000), that form nearly ubiquitous, 520 multiple confining layers that inhibit direct recharge of the interconnected sand and gravel body networks that comprise the aquifer system. The presence of laterally-continuous aquitard facies have been well documented portions of the southern Central Valley (Phillips and Belitz, 1991;Faunt et al., 2009), and in other unconsolidated alluvial aquifer systems in California (e.g., Fisher, 1964). While not present within the domain area, these features have been shown to uniformly impede recharge to confined aquifer systems where they are present. In addition, we do not consider some surface conditions that affect real-525 world MAR, like topographic site limitations, evaporative losses, and clogging effects (Bouwer, 2002). We emphasize that this study is not a thorough site investigation of the American-Cosumnes area. The TPROGS approach is inherently stochastic and conditioning data to inform the model are sparse in places (Maples et al., 2019). In addition, the single TPROGS realization used for our simulations provides only a single representation of possible facies distributions within the domain.

535
Our findings have important implications for assessing MAR feasibility and for understanding MAR processes in clastic alluvial aquifer systems in California and globally, where accelerating groundwater overdraft and increasing water scarcity are observed (Scanlon et al., 2012;Famiglietti et al., 2011;Wada et al., 2011). Our results highlight the importance of identifying and cataloging locations with favorable geology for recharge, especially in light of recently-passed groundwater management legislation in California that mandates limiting both the "chronic lowering of groundwater levels" and "significant and unrea-540 sonable reductions in groundwater storage" (Kiparsky et al., 2016). While studies have shown that implementation of MAR can lead to more sustainable groundwater management (e.g., Niswonger et al., 2017), widespread adoption of of MAR is still hampered by a number of challenges, including institutional barriers to water-rights transference and water accounting uncertainty (Asano, 2016), infrastructure limitations, including land acquisition and water conveyance costs (Gailey, 2018), and water quality considerations (Hartog and Stuyfzand, 2017). Our approach, which combines a detailed representation of investigations and data collection methods for proposed MAR projects, and (2) improve representation of recharge processes in management-focused, typically coarse-resolution groundwater models.

Conclusions
This research explores the sensitivity of hydraulic properties and subsurface geologic architecture on MAR processes with the 550 variably-saturated water flow code, ParFlow, in a highly heterogeneous geologic domain that reflects the complex, unconsolidated alluvial geologic architecture of the northern Central Valley, CA that is consistent with many alluvial aquifer systems.
This work comprises two fundamental components. First, exploratory simulations were performed at 100 randomly-sampled sites across the domain to to evaluate the correlation between 17 geologic and hydrologic site characteristics and simulated recharge benefits. Results from the exploratory simulations show that site characteristics representing subsurface geologic con-555 figuration by upscaling vertical K can produce good correlations with the average 30-day recharge rate (R 30d ). Regression relations between site-averaged K geom × W T D and R 30d were shown to be the most correlated (r = 0.70, p < 0.01, r 2 = 0.70).
Conversely, site characteristics describing unsaturated-zone thickness (W T D), fraction of coarse-texture unsaturated-zone (U Z coarse ), and fraction of coarse-texture surface facies (Surf coarse ) alone were all poorly correlated with R 30d . These results highlight the value of characterizing subsurface geologic configuration through K upscaling. For subsequent sensitivity 560 analyses, K geom × W T D was designated as a geologic proxy parameter, GP P , for recharge potential using aforementioned predictive regression relation.
Subsequent local and global sensitivity analyses were performed for model hydraulic properties and the GP P to evaluate the relative importance of these parameters on recharge potential. Results from local sensitivity analyses indicated that GP P is the most sensitive parameter for R 30d , more so than any parameters describing hydraulic properties of each facies. Sensitivity 565 analyses also indicated that permeability and unsaturated-zone pore volume (i.e., K s and φ, respectively) were relatively more important than other hydraulic properties, including unsaturated-zone geologic texture, soil water retention, and saturated-zone storage properties (i.e., α, n, S res , and S s ) for R 30d . Results from global sensitivity analyses were consistent with local sensitivity analyses, indicating that GP P is relatively more important than K s of coarse-and fine-texture facies for R 30d . Agreement of local and global approaches regarding the importance of GP P shows a degree of robustness of these findings. The results

570
presented here demonstrate the importance of thoroughly characterizing subsurface geologic configuration when considering recharge feasibility. To our knowledge, this study is the first of its kind to incorporate of a measure of geologic configuration with a geologic proxy parameter in formal sensitivity analyses. Our approach outlines a novel combination of subsurface site characterization with simulations of variably-saturated water flow physics within a sensitivity analysis framework to (1) improve understanding the role of geologic heterogeneity on MAR processes and (2) provide insight into potential strategies to 575 characterize subsurface geologic heterogeneity when considering recharge feasibility.
and Sustainability Research Initiative (SM, GF, LF). We would like to acknowledge high-performance computing support from Cheyenne (doi:10.5065/D6RX99HX) provided by NCAR's Computational and Information Systems Laboratory, sponsored by NSF. All data used in the analysis can be made available by SM (srmap@ucdavis.edu).