Spatially distributed hydrological models are commonly employed to optimize the locations of engineering control measures across a watershed. Yet, parameter screening exercises that aim to reduce the dimensionality of the calibration search space are typically completed only for gauged locations, such as the watershed outlet, and use screening metrics that are relevant to calibration instead of explicitly describing the engineering decision objectives. Identifying parameters that describe physical processes in ungauged locations that affect decision objectives should lead to a better understanding of control measure effectiveness. This paper provides guidance on evaluating model parameter uncertainty at the spatial scales and flow magnitudes of interest for such decision-making problems. We use global sensitivity analysis to screen parameters for model calibration, and to subsequently evaluate the appropriateness of using multipliers to adjust the values of spatially distributed parameters to further reduce dimensionality. We evaluate six sensitivity metrics, four of which align with decision objectives and two of which account for the model residual error that would be considered in spatial optimizations of engineering designs. We compare the resulting parameter selection for the basin outlet and each hillslope. We also compare basin outlet results for four calibration-relevant metrics. These methods were applied to a RHESSys ecohydrological model of an exurban forested watershed near Baltimore, MD, USA.
Results show that (1) the set of parameters selected by calibration-relevant metrics does not include parameters that control decision-relevant high and low streamflows, (2) evaluating sensitivity metrics at the basin outlet misses many parameters that control streamflows in hillslopes, and (3) for some multipliers, calibrating all parameters in the adjusted set may be preferable to using the multiplier if parameter sensitivities are significantly different, while for others, calibrating a subset of the parameters may be preferable if they are not all influential. Thus, we recommend that parameter screening exercises use decision-relevant metrics that are evaluated at the spatial scales appropriate to decision making. While including more parameters in calibration will exacerbate equifinality, the resulting parametric uncertainty should be important to consider when searching for control measures that are robust to it.

Spatially distributed hydrological models are commonly employed to inform water management decisions across a watershed, such as the optimal locations of engineering control measures (e.g., green and gray infrastructure). Quantifying the impact of control measures requires accurate simulations of streamflows and nutrient fluxes across the watershed

Because there are computational limitations to calibrating hundreds of parameters, parameter screening exercises via sensitivity analysis are usually applied to reduce the dimensionality of the calibration. Recent reviews of sensitivity analysis methods for spatially distributed models

The combination of these factors could have proximate consequences on siting and sizing engineering controls if equifinal parameter sets for the watershed outlet (1) suggest different optimal sites and/or sizes due to the resulting uncertainty in model outputs across the watershed, or (2) do not consider all of the decision-relevant parametric uncertainties across the watershed. This paper provides guidance on evaluating parametric model uncertainty at the spatial scales and flow magnitudes of interest for such decision-making problems as opposed to using a single location and metrics of interest for calibration. We use three sensitivity metrics to capture differences in parameters that control physical processes that generate low flows, flood flows, and all other flows as in

We employ the RHESSys ecohydrological model for this study

The remainder of the paper is structured as follows: Section

Uncertainty sources in all environmental systems models include (e.g., Fig. 1;

In this paper, the sensitivity analyses consider parametric uncertainty for a fixed model structure and input data time series (described in Sect.

In many hydrological studies, sensitivity analysis is used to understand how input parameters influence model performance measures

Decision-relevant and calibration-relevant sensitivity metrics for daily streamflow and total nitrogen.

For the basin outlet, we used the sum of absolute error (SAE) as the performance measure for decision-relevant sensitivity metrics. For hillslopes (where observations are not available) we used the sum of absolute median deviation (SAMD), where the median value for each hillslope was computed across all model simulations. For completeness, we compared the results of using SAMD for the basin outlet to the SAE results in the Supplement (item S9). We found similar parameter selection and sensitivity ranking results for each performance measure, which demonstrates that an observation time series is not necessary to obtain the parameter set to calibrate, although observations help to check that SA model simulations are reasonable. The SAE and SAMD expressions are shown in Eqs. (
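The two performance measures described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, and it assumes that the SAMD median is taken across all model simulations at each time step, which is one reading of the description above.

```python
import numpy as np

def sum_absolute_error(simulated, observed):
    """SAE for one model run: sum over time of |simulated - observed|."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sum(np.abs(simulated - observed)))

def sum_absolute_median_deviation(simulations):
    """SAMD for every run, given an array of shape (n_runs, n_times).

    Assumption: the reference series is the median across runs at each
    time step, so no observations are needed.
    """
    simulations = np.asarray(simulations, dtype=float)
    median_series = np.median(simulations, axis=0)
    return np.sum(np.abs(simulations - median_series), axis=1)

# Toy example: three runs, three time steps (values are made up).
runs = np.array([[1.0, 2.0, 3.0],
                 [2.0, 2.0, 5.0],
                 [4.0, 1.0, 4.0]])
observed = np.array([2.0, 2.0, 4.0])

sae_per_run = [sum_absolute_error(run, observed) for run in runs]
samd_per_run = sum_absolute_median_deviation(runs)
```

In this toy case the across-run median series happens to equal the observations, so both measures give the same value for each run. With real data the two would differ, and the comparison would be made on the resulting parameter selection and rankings.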

We consider sensitivity metrics that are relevant to water quantity and quality outcomes because they are among the most common for hydrological modeling studies. For water quantity, we compute SAE (basin) and SAMD (hillslopes) for three mutually exclusive flows: (1) high flows greater than the historical
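A minimal sketch of splitting a flow record into three mutually exclusive classes and computing SAE within each class is given below. The thresholds are passed in as hypothetical arguments (`low_threshold`, `high_threshold`) because their definitions are not reproduced here, and classifying days by a single reference series is an assumption of this sketch.

```python
import numpy as np

def split_flows(reference_flow, low_threshold, high_threshold):
    """Return boolean masks for high, low, and all other flows.

    The masks are mutually exclusive as long as low_threshold is
    smaller than high_threshold.
    """
    if low_threshold >= high_threshold:
        raise ValueError("low_threshold must be smaller than high_threshold")
    q = np.asarray(reference_flow, dtype=float)
    high = q > high_threshold
    low = q < low_threshold
    other = ~(high | low)
    return {"high": high, "low": low, "other": other}

def sae_by_flow_class(simulated, observed, masks):
    """Sum of absolute error computed separately within each flow class."""
    abs_err = np.abs(np.asarray(simulated, dtype=float)
                     - np.asarray(observed, dtype=float))
    return {name: float(abs_err[mask].sum()) for name, mask in masks.items()}

# Toy daily flows (made-up values and thresholds).
observed = np.array([0.1, 0.5, 1.0, 3.0, 9.0])
simulated = np.array([0.2, 0.4, 1.5, 2.0, 7.0])
masks = split_flows(observed, low_threshold=0.2, high_threshold=5.0)
sae = sae_by_flow_class(simulated, observed, masks)
```

Each of the three SAE values then serves as a separate sensitivity metric, so a parameter that only matters on high-flow days is not masked by its small effect on the rest of the record.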

For water quality, we consider the estimated daily total nitrogen (TN) concentration. As described in Sect.

Four performance measures that are typically used to calibrate hydrological models are used as calibration-relevant sensitivity metrics
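As an illustration of a calibration-relevant measure, the Nash-Sutcliffe efficiency (NSE), which is discussed later in this paper, can be computed with its standard definition as sketched below. The function name is hypothetical and the sketch is not taken from the authors' code.

```python
import numpy as np

def nash_sutcliffe_efficiency(simulated, observed):
    """NSE = 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2)."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    residual_ss = np.sum((simulated - observed) ** 2)
    total_ss = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - residual_ss / total_ss)

# Toy record with one large flow (made-up values).
observed = np.array([1.0, 2.0, 3.0, 10.0])
perfect = nash_sutcliffe_efficiency(observed, observed)
mean_only = nash_sutcliffe_efficiency(np.full(4, observed.mean()), observed)
```

A perfect simulation gives an NSE of 1, and a simulation equal to the observed mean gives 0. Because the residuals are squared, errors on the largest flows dominate the value.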

We selected the skew exponential power (generalized normal) distribution

Sensitivity analysis methods can be local about a single point or global to summarize the effects of parameters on model outputs across the specified parameter domain

The Method of Morris is based on elementary effects (EEs) that approximate the first derivative of the sensitivity metric with respect to a change in a parameter value. The EEs are computed by changing one parameter at a time along a trajectory, and comparing the change in sensitivity metric from one step in the trajectory to the next. The change is normalized by the relative change in the parameter value (Eq.
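The one-at-a-time calculation described above can be sketched in Python for a single trajectory. This is a simplified illustration under stated assumptions: parameters are scaled to the unit interval so that the step size is a relative change, the function name and toy model are hypothetical, and the study itself used the R sensitivity package rather than this code.

```python
import numpy as np

def elementary_effects(model, trajectory):
    """Elementary effects along one Morris trajectory.

    trajectory: array of shape (k + 1, k) of points in the scaled
    parameter space; each consecutive pair of points must differ in
    exactly one parameter.
    """
    trajectory = np.asarray(trajectory, dtype=float)
    n_params = trajectory.shape[1]
    outputs = np.array([model(point) for point in trajectory])
    effects = np.full(n_params, np.nan)
    for step in range(n_params):
        diff = trajectory[step + 1] - trajectory[step]
        changed = np.flatnonzero(diff)
        if changed.size != 1:
            raise ValueError("each step must change exactly one parameter")
        i = changed[0]
        # Change in the metric divided by the change in the parameter.
        effects[i] = (outputs[step + 1] - outputs[step]) / diff[i]
    return effects

# Toy linear metric: the effects should recover the coefficients.
def toy_metric(x):
    return 3.0 * x[0] - 2.0 * x[1] + 0.0 * x[2]

trajectory = np.array([[0.0, 0.0, 0.0],
                       [0.5, 0.0, 0.0],
                       [0.5, 0.5, 0.0],
                       [0.5, 0.5, 0.5]])
ee = elementary_effects(toy_metric, trajectory)
```

Repeating this over many trajectories gives a sample of elementary effects for each parameter, from which summary statistics such as the mean absolute value are computed.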

We used 40 trajectories that were initialized by a Latin hypercube sample, and used the R sensitivity package

After the hydrological model runs completed for all trajectories, we estimated 90 % confidence intervals for each parameter's
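A percentile bootstrap of the mean absolute elementary effect could look like the sketch below. The number of bootstrap replicates, the random seed, and the use of the percentile method are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

def bootstrap_mean_abs_ee(effects, n_boot=1000, level=0.90, seed=0):
    """Bootstrapped mean of |EE| with a percentile confidence interval.

    effects: one elementary effect per trajectory for a single parameter.
    Returns (bootstrapped mean, lower bound, upper bound).
    """
    effects = np.asarray(effects, dtype=float)
    rng = np.random.default_rng(seed)
    n = effects.size
    # Resample trajectories with replacement, n_boot times.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = np.abs(effects[idx]).mean(axis=1)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(boot_means, [alpha, 1.0 - alpha])
    return float(boot_means.mean()), float(lower), float(upper)

# Toy sample of 40 elementary effects (made-up values).
toy_effects = np.linspace(-1.0, 1.0, 40)
mean_ee, lower_ee, upper_ee = bootstrap_mean_abs_ee(toy_effects)

# A parameter with identical effects has a zero-width interval.
constant = bootstrap_mean_abs_ee(np.full(40, 2.0))
```

Resampling whole trajectories keeps the calculation simple and requires no additional hydrological model runs.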

We used an EE cutoff to determine which parameters would be selected for calibration. For each sensitivity metric, we determined the bootstrapped mean EE value (Eq.

We compare the EEs for parameters that are traditionally adjusted by the same multiplier to determine if all parameter EEs are meaningfully large and not statistically significantly different from each other. This would suggest that a multiplier or another regularization method may be useful to reduce the dimensionality of the calibration problem. Parameters with large and statistically significantly different EEs are candidates for being calibrated individually, as this suggests the multiplier would not uniformly influence the model outputs across adjusted parameters. More investigation on the cause for different EEs could inform the decision to calibrate individually or use a multiplier (e.g., the difference in sensitivity could be caused by the parameters acting in vastly different proportions of the watershed area). We evaluate significance using the bootstrapped 90 % confidence intervals.
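The comparison described above can be expressed as a simple decision rule, sketched below. The rule treats non-overlapping confidence intervals as a significant difference and uses a hypothetical `min_effect` threshold to stand in for "meaningfully large"; both are assumptions of this sketch, and the parameter names are invented.

```python
from itertools import combinations

def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def assess_multiplier(stats, min_effect):
    """Suggest how to treat a set of parameters adjusted by one multiplier.

    stats: dict mapping parameter name to (mean, lower, upper) of its EE.
    min_effect: hypothetical threshold for a meaningfully large mean EE.
    """
    all_large = all(mean >= min_effect for mean, _, _ in stats.values())
    all_similar = all(
        intervals_overlap(stats[p][1:], stats[q][1:])
        for p, q in combinations(stats, 2)
    )
    if all_large and all_similar:
        return "use multiplier"
    if all_large:
        return "calibrate individually"
    return "calibrate influential subset"

# Toy sets of parameters (names and values are made up).
similar = {"soil_a": (0.50, 0.40, 0.60), "soil_b": (0.55, 0.45, 0.65)}
different = {"soil_a": (0.50, 0.40, 0.60), "soil_b": (0.90, 0.80, 1.00)}
mixed = {"soil_a": (0.50, 0.40, 0.60), "soil_b": (0.01, 0.00, 0.02)}

decisions = [assess_multiplier(s, min_effect=0.1)
             for s in (similar, different, mixed)]
```

The output is only a screening suggestion. As noted above, the cause of differing EEs (for example, parameters acting on very different proportions of the watershed area) should inform the final choice.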

We used the Regional Hydro-Ecologic Simulation System (RHESSys) for this study

For this paper, we classified RHESSys model parameters as structural or non-structural. A key structural modeling decision is whether to run the model in vegetation growth mode or in static mode. Static mode models only seasonal vegetation cycles (e.g., leaf-on, leaf-off), net photosynthesis, and evapotranspiration, and it does not provide nitrogen cycle outputs. We found that randomly sampling non-structural growth model parameters within their specified ranges commonly resulted in unstable ecosystems (e.g., very large trees or unrealistic mortality). It is beyond the scope of this paper to determine the conditions (parameter values) for which ecosystems would be stable, so we used RHESSys in static mode. Because static mode provides no nitrogen cycle outputs, we used a statistical method to estimate total nitrogen (TN) as a function of simulated streamflow, as described in Sect.

We categorized non-structural parameters according to the processes they control. Table

RHESSys parameter categories, the processes modeled in those categories for this study, the number of unique parameters in each category, and the number of parameters that can be adjusted by built-in RHESSys parameter multipliers.

RHESSys is typically calibrated using built-in parameter multipliers, which for this study would mean using 11 multipliers to adjust 40 of the 271 possible parameters. While we know that some of these parameters are more easily measured than others, we consider all 271 parameters in the sensitivity analysis. Some parameters are structurally dependent, so we aggregated EEs for these parameters, resulting in 237 unique EEs for each sensitivity metric. (Supplement item S0 describes the aggregation method.) We assume all parameters within an aggregated set would be calibrated, but only report them as one parameter. Previous studies that implemented sensitivity analyses of RHESSys generally adjusted a subset of the multipliers by limiting the analysis to process-specific parameters that are known or expected to affect outputs of interest (e.g., streamflow in

To our knowledge, this paper presents the first sensitivity analysis of all non-structural RHESSys model parameters. A global sensitivity analysis approach is used to discover which parameters and processes are most important to model streamflow for this study. Consequently, part of our discussion in Sect.

We used the Weighted Regression on Time Discharge and Season (WRTDS) method

In order to use WRTDS for any streamflow value within the observation time period, we created two-dimensional (

We apply these methods to a RHESSys model of the Baisman Run watershed, which is an approximately 4

In Sect.

We first compare the number of parameters selected for calibration based upon decision-relevant elementary effects (EEs) whose mean or

The number of parameters that would be selected for model calibration using the decision-relevant sensitivity metrics as a function of the cutoff percentage used to select parameters based on their elementary effects. The blue lines with circle points indicate the parameters that would be selected using only the basin outlet, while the gray lines correspond to using all hillslope outlets. Only streamflow metrics are considered for the hillslope outlets. Lighter line colors correspond to the bootstrapped mean and darker colors correspond to using the bootstrapped

For the selected 10 % cutoff in Fig.

Basin outlet EEs are displayed in Fig.

For the three TN metrics (Fig.

Mean absolute value of elementary effects for RHESSys model parameters evaluated for the six decision-relevant sensitivity metrics at the basin outlet. The EEs are normalized such that the maximum error bar value is 1 on each plot. Only parameters that would be selected by any metric presented in Table

For hillslope outlets, 37 unique parameters were selected using the 10 % cutoff and the

Figure

Figure

We present results for only those multipliers whose adjusted parameters all have non-zero EEs. Figure

We evaluate the appropriateness of using a parameter multiplier based on the magnitudes of the EEs and their uncertainty. Parameters within the sets adjusted by

Barplots of the mean absolute value of the elementary effects for parameters that can be adjusted by 10 RHESSys multiplier parameters (panel

Figure

Mean absolute value of elementary effects for RHESSys model parameters evaluated for the four calibration-relevant sensitivity metrics at the basin outlet. The style matches Fig.

Figure

Indicators for whether or not a parameter would be selected for calibration for each of the calibration-relevant sensitivity metrics, and separately aggregated over all calibration-relevant and decision-relevant sensitivity metrics. B and H in the

When sensitivity analysis is used to inform model calibrations, a primary goal is usually to reduce the dimensionality of the search space by screening those parameters that most affect the outputs to be calibrated. How model outputs are considered in sensitivity analyses and subsequent screening exercises can affect which parameters are selected. We found that specifically evaluating high and low flows as decision-relevant metrics provided a different parameter selection than using the calibration-relevant metrics that are often used to capture parameters that control such flows. While the NSE is mathematically sensitive (i.e., not robust) to high flows, the EE magnitudes and parameters that are selected by the NSE sensitivity metric do not match well with those selected from the high flows decision-relevant metric. Instead, the EE magnitudes and selected parameters resemble the

Calibration-relevant metrics have limited value for sensitivity analyses of spatially distributed models because they can only be computed at gauged locations. The sensitivity analyses that we completed for ungauged hillslope outlets led to the identification of more parameters to calibrate than were selected based on sensitivity analysis at the gauged basin outlet. Calibrating additional parameters that have smaller impact at the gauged location is likely to exacerbate equifinality in simulated outputs. Equifinality at the basin outlet will often result in variability in outputs at ungauged locations, such that calibration of these additional parameters should be important to better capture the physical processes in hillslopes where engineering controls could be located. Even if parameter values are unchanged from their prior distributions after calibration, locations of engineering control measures can be optimized to be robust to the resulting uncertainty in model outputs across the watershed. Spatially distributed monitoring of model parameters and streamflow gauges within sub-catchments could help to reduce this uncertainty, particularly for catchments with spatially heterogeneous characteristics. In summary, spatial evaluation of sensitivity metrics for spatially distributed models allows for the discovery of parametric sources of uncertainty across the watershed to which engineering designs would have to be robust.

Spatial sensitivity analyses also reveal opportunities to reduce parametric uncertainty by using additional data and data types. Parametric uncertainty could be reduced for any parameter by better constraining its prior range. For example, septic water loads could be constrained with household water consumption surveys. Surveys and data collection efforts for other parameters can target those hillslopes for which model sensitivity is largest. Alternatively, some of the parameters could instead be specified by additional input datasets to reduce the dimensionality of the calibration. For example, impervious surface percentage could be specified spatially from the land cover dataset, and time series of wind speed may be obtained from weather gauges or satellite data and then be processed to the spatial scale of the model. These approaches would transfer parametric uncertainty to input data uncertainty, which would ideally be negligible. Finally, uncertainty may be reduced by better capturing spatial trends in parameter values, for example, using finer-resolution soils data products, such as POLARIS estimates

Parameter multipliers and other regularization methods are a common dimensionality reduction choice for spatially distributed models. A comparison of model sensitivity results for parameters that can be adjusted by built-in RHESSys multipliers revealed opportunities for dimensionality reduction by a multiplier, and also identified some parameters that may be better to calibrate individually for this problem. Future research is needed to formally test these recommendations for their impact on model calibration.

For RHESSys streamflow simulations, the global sensitivity analysis identified some parameters for calibration that are not commonly calibrated and should therefore be assigned priors that are adjusted to local site conditions. Studies of other models, such as NOAH-MP

This paper focuses on the importance of evaluating sensitivity analyses at the spatial scale and flow magnitude that are appropriate for decision making. Selecting the appropriate temporal resolution for the sensitivity metric and the time period of sensitivity analysis is also important to inform parameter selection. All of the sensitivity metrics in this paper are temporally aggregated measures instead of time-varying measures. With this approach, two model runs could have very different simulated time series, yet could have similar metric values. Additionally, flow events that arise from different generating processes (e.g., floods from spring snowmelts vs. summer hurricanes) would not necessarily be parsed out from any one model run. For engineering problems, a magnitude-varying sensitivity analysis

A final consideration for risk-based decision making is the use of deterministic or stochastic watershed models. We found that sensitivity metrics for TN model residual error resulted in a different set of parameters to calibrate than using the mean of TN. This result suggests that sensitivity analysis of stochastic watershed models could lead to different parameter selection. Future work is needed to compare sensitivity analysis and resulting parameter selection for deterministic and stochastic watershed models.

This paper provides guidance on evaluating parametric model uncertainty at the spatial scales of interest for engineering decision-making problems. We used the results of a global sensitivity analysis to evaluate common methods to reduce the dimensionality of the calibration problem for spatially distributed hydrological models. We found that the sensitivity of model outputs to parameters may be relatively large at ungauged sites where engineering control measures could be located, even though the corresponding sensitivity at the gauged location is relatively small. The spatial variation in parameters with the largest sensitivity could be described well by variation in land cover and soil features, which suggests that different physical processes have important controls on model outputs across the watershed. Sensitivity analysis at local scales (i.e., ungauged hillslopes) selects more calibration parameters than sensitivity analysis at the watershed scale. The processes affected by the additional parameters would have a relatively small effect at the outlet location, which exacerbates the equifinality problem during calibration, but they would describe important variability in model outputs at potential engineering control locations. Thus, due to equifinality, calibration methods that estimate parameter distributions are preferable to relying upon a single “best” parameter set; considering such parametric uncertainty in optimizations of engineering control measures should help to discover solutions that are robust to it. Sensitivity analysis results were also useful to inform which parameter multipliers may be useful to employ for further dimensionality reduction.

Results from this study support two critical avenues of future research that could further inform how to employ sensitivity analyses of models that are used in decision-making problems. The literature on sensitivity analysis of hydrological models almost exclusively corresponds to deterministic outputs, whereas a stochastic framework that considers model residual error should be, and often is, used to develop engineering designs. We found that considering model error resulted in selecting additional parameters to calibrate. Future research should formally compare sensitivity analysis of deterministic and stochastic watershed models that are employed for engineering decision-making problems. We also found that the parameters screened by using common extreme streamflow calibration performance measures as sensitivity metrics do not match those parameters screened by specifically evaluating extreme flows. Future work should compare results of using screened parameters from each method to calibrate a model that is used to optimize engineering controls, evaluate which method is ultimately preferable for various decision problems, and determine whether or not there is a meaningful difference in performance of the resulting controls.

The code and data used for this study are made available in a HydroShare data repository (

The supplement related to this article is available online at:

JDS contributed to writing the original draft, research conceptualization, methodology, formal analysis, visualization, software development, and data curation. LL contributed to reviewing and editing the manuscript, software development, and data curation. JDQ contributed to reviewing and editing the manuscript, research conceptualization, methodology, visualization, and supervision. LEB contributed to reviewing and editing the manuscript, research conceptualization, and supervision.

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors acknowledge Research Computing at The University of Virginia for providing computational resources and technical support that have contributed to the results reported within this publication (

This paper was edited by Christa Kelleher and reviewed by Fanny Sarrazin and three anonymous referees.