Hierarchical sensitivity analysis for a large-scale process-based hydrological model applied to an Amazonian watershed

Sensitivity analysis methods have recently received much attention for identifying important uncertainty sources (or uncertain inputs) and improving model calibrations and predictions for hydrological models. However, it is still challenging to apply the quantitative and comprehensive global sensitivity analysis method to complex large-scale process-based hydrological models (PBHMs) because of its variant uncertainty sources and high computational cost. Therefore, a global sensitivity analysis method that is capable of simultaneously analyzing multiple uncertainty sources of PBHMs and providing quantitative sensitivity analysis results is still lacking. In an effort to develop a new tool for overcoming these weaknesses, we improved the hierarchical sensitivity analysis method by defining a new set of sensitivity indices for subdivided parameters. A new binning method and Latin hypercube sampling (LHS) were implemented for estimating these new sensitivity indices. For test and demonstration purposes, this improved global sensitivity analysis method was implemented to quantify three different uncertainty sources (parameters, models, and climate scenarios) of a three-dimensional large-scale process-based hydrologic model (Process-based Adaptive Watershed Simulator, PAWS) with an application case in an ∼ 9000 km2 Amazon catchment. The importance of different uncertainty sources was quantified by sensitivity indices for two hydrologic outputs of interest: evapotranspiration (ET) and groundwater contribution to streamflow (QG). The results show that the parameters, especially the vadose zone parameters, are the most important uncertainty contributors for both outputs. In addition, the influence of climate scenarios on ET predictions is also important. Furthermore, the thickness of the aquifers is important for QG predictions, especially in main stream areas. These sensitivity analysis results provide useful information for modelers, and our method is mathematically rigorous and can be applied to other large-scale hydrological models.

mate scenarios and simulation of the basin-to continentalscale ecosystem energy balance, biogeochemistry, and ecological functioning (Vertessy et al., 1993;Parkin et al., 1996;Bixio et al., 2002;Oogathoo et al., 2011;Weill et al., 2011;Shen et al., 2013;Maxwell et al., 2014;Riley and Shen, 2014). Specific to the hydrological process, PBHMs are capable of simulating the surface water processes of ET, overland flow, channel runoff, and so on (Beven, 2002). For subsurface water, PBHMs can simulate complex hydrological processes in the soil, such as root extraction, infiltration, soil evaporation, and groundwater discharge and recharge in the vadose zone, by solving the Richards equation (Maxwell et al., 2014). However, these complex processes and governing equations embedded in the PBHMs inevitably induce large uncertainties in the modeling predictions (Neuman, 2003;Rojas et al., 2010;Lu et al., 2012;Shen et al., 2014;Gupta, 2015, 2016;Qiu et al., 2019). How to efficiently decrease these large uncertainties becomes an essential problem for modelers. Sensitivity analysis aims to identify the most influential sources of uncertainty and is therefore an important tool (Saltelli and Sobol, 1995;Saltelli et al., 2000Saltelli et al., , 2010Song et al., 2015). The sensitivity analysis results assist modelers and managers in focusing on observing and calibrating the uncertain inputs that have the greatest influences on model outputs. Thus, the sensitivity analysis process saves resources (e.g., funding and labor) used for calibration and significantly improves the efficiency of reducing the uncertainty of PBHM predictions.
In general, sensitivity analysis methods can be divided into local and global categories. The main limitation of the local sensitivity analysis is that its results are only valid for a small range of parameter values (Gedeon and Mallants, 2012;King and Perera, 2013;Wainwright et al., 2014;Dai and Ye, 2015). Compared with the local method, global sensitivity analysis can provide sensitivity estimates for the entire range of uncertain parameter values (Saltelli et al., 2000(Saltelli et al., , 2010Gupta, 2015, 2016). Because of this advantage, global sensitivity analysis has gained popularity in recent modeling works despite its high computational cost (Hamby, 1994;van Griensven et al., 2006;Sulis et al., 2011;Baroni and Tarantola, 2014). Common global sensitivity analysis methods include screening methods, regression-based methods, variance-based methods, meta-model methods (Song et al., 2013), and information-entropy-based methods . Among the different global sensitivity analysis methods, the variance-based method has been widely accepted and used because of its ability to accurately quantify the importance of uncertain parameters while considering their interactions (Saltelli and Sobol, 1995;Zhang et al., 2013;Dai and Ye, 2015).
To date, considerable research has been conducted to reduce the uncertainties in hydrological models by using local or global sensitivity analysis methods (e.g., Nijssen et al., 2001;Chávarri et al., 2013;de Paiva et al., 2013). However, conducting a comprehensive global sensitivity analysis, es-pecially variance-based sensitivity analysis on PBHMs, remains a challenge, and there are two main obstacles. The first obstacle is the high computational cost arising from two sources: the high complexity of the model itself and the method requirement of variance-based global sensitivity analysis. A PBHM usually has a very large number of parameters and multiple high-order nonlinear governing equations. These facts combined with a large-scale model domain cause the running of a PBHM itself to be very computationally expensive. For the sensitivity analysis method, compared with the local sensitivity analysis, which can only provide results valid in a certain range of parameter values (e.g., the derivative of the model prediction with respect to parameter A at a certain value point can be a measurement of A's local sensitivity at this point), the global sensitivity analysis is more comprehensive because its results are valid for the whole range of parameter values. To achieve this goal, the methods of global sensitivity analysis are all relatively computationally expensive, especially for the variance-based method, which uses complex sampling techniques, and its computational cost grows exponentially with the number of parameters (Saltelli et al., 2000(Saltelli et al., , 2010. Therefore, the implementation of a global sensitivity analysis for a PBHM leads to an extremely high computational cost considering that we have to run a large number of simulations for a complex PBHM using different parameter samples. The second obstacle of implementing the global sensitivity analysis method in PBHMs is the variant uncertainty sources included in the model. Conventional global sensitivity analysis generally considers only uncertainty from model parameters and ignores other important hydrological model uncertainties. However, for PBHMs, uncertainties usually arise from three different sources, including parametric uncertainty, model structural uncertainty (induced through multiple different plausible conceptual or mathematical models), and scenario uncertainty (caused by alternative unpredictable future climate conditions) (Ye et al., 2005;Makler-Pick et al., 2011;Neumann, 2012;Dai and Ye, 2015;Song et al., 2015;Dai et al., 2017a, b;Zeng et al., 2018;Pan et al., 2020). To overcome these two obstacles, Dai et al. (2017a) developed a new hierarchical sensitivity analysis method that integrates the variance-based method and hierarchical uncertainty framework. By combining uncertain inputs based on their characteristics and dependencies, hierarchical sensitivity analysis can quantify the sensitivity of different sources of uncertainty involved in hydrological models (e.g., parameters, models, and climate scenarios) and dramatically reduce the computational cost. However, the original hierarchical sensitivity analysis method is limited to considering parameters as a whole, and the sensitivity indices of different parameters cannot be defined or estimated. This simple strategy may be adequate for a groundwater modeling case, but it cannot provide detailed information for a PBHM that includes multiple hydrological processes. This research presents a new tool of the improved hierarchical sensitivity analysis method and demonstrates its implementation to a pilot example for comprehensive global sensitivity analysis of large-scale PBHMs. A new set of subdivided parametric sensitivity indices was defined to quantify the importance of a physical process involving only partial model parameters. A new binning method was implemented with the Latin hypercube sampling (LHS) method to estimate these subdivided parameter sensitivity indices. The LHS method also makes the assessment of hierarchical sensitivity analysis for large-scale PBHMs more computationally affordable compared with the original Monte Carlo method. This new and flexible hierarchical sensitivity analysis method provides modelers with the novel capability of analyzing sensitivity from the physical process viewpoint and estimating accurate importance for further subdivided parameter groups.
The Process-based Adaptive Watershed Simulator (PAWS) model was first developed in Shen and Phanikumar (2010); PAWS is capable of simulating large catchments and long-term frames by efficiently coupling surface and subsurface hydrological processes. Coupling PAWS with the CLM (Community Land Model) can enable the model to describe vegetation respiration and evapotranspiration in a physics-based manner Niu et al., 2017). The model has been applied extensively in many watersheds, e.g., the large-scale watersheds in Michigan, USA (Shen et al., 2013(Shen et al., , 2016Niu et al., 2014Niu et al., , 2017Ji et al., 2015;Qiu et al., 2019), and the watershed in the Amazon basin (Niu et al., 2017), and the model has presented good performances in these watersheds. PAWS can also estimate multiple key variables of hydrological states and fluxes at different spatiotemporal scales. The high efficiency, great performance, and complex variables all make PAWS an excellent model choice for PBHMs to evaluate and demonstrate the sensitivity analysis method.
The PAWS model with the new hierarchical sensitivity analysis method was implemented in a study area of the ∼ 9000 km 2 Amazon catchment located in northern Manaus, Brazil, for the purposes of evaluation and demonstration. Three different types of uncertainty sources (climate scenario, model, and parameters) were all included in this test case. The parameters were further divided into three groups (vadose zone parameters, groundwater parameters, and overland flow parameter) to investigate the detailed importance information of the model parameters through the new subdivided parameter sensitivity indices. By developing the new hierarchical sensitivity analysis method and implementing it in this test case, we aim to (1) provide a new tool and pilot example of comprehensive global sensitivity analysis for the PBHMs, (2) identify the most important uncertainty sources for modeling hydrological processes in the Amazon, and (3) investigate the possible patterns for sensitivity analysis results of PBHMs. We introduce the study area and the numerical model in Sect. 2.1. Section 2.2 and 2.3 present the improved hierarchical sensitivity analysis method and its algorithms in detail. Then, we describe the generation of uncertainty sources based on the study site information in Sect. 2.4. We present and discuss the results in Sect. 3. Finally, Sect. 4 summarizes the key findings of this research.

Study site and numerical model
The study site is located in northern Manaus, Brazil (Fig. 1), and the site has a drainage area of ∼ 9000 km 2 . Within the central Amazon, the watershed is mostly covered by tropical forest, with ∼ 12 % cropland and ∼ 5 % wetland (based on CLM land surface data; Niu et al., 2017). With the relatively high elevation (90-210 m) of the upper landscape and relatively low elevation (45-55 m) of the swampy valleys, a dense drainage network formed in the region. The watershed has four rivers: the Urubu, Preto da Eva, Tarumã-açu, and Tarumã-mirim rivers. The average precipitation in this region has large seasonal variability. December to May is the wet season, and June to November is the dry season.
The modeling tool used in this study is the PAWS model (Shen and Phanikumar, 2010;Shen et al., 2014;Niu et al., 2017). The main reason for choosing PAWS as the pilot example of PBHMs is that, compared with other PBHMs, PAWS is a comprehensive and representative large-scale hydrological model that can be applied to large catchments and long-term frames by efficiently coupling both surface and subsurface hydrological processes (Shen and Phanikumar, 2010). The complexity and parameter dimensionality of PAWS are high enough to test and demonstrate our new global sensitivity analysis method. Furthermore, PAWS was previously applied to the studied watershed, and it was capable of simulating multiple key variables of hydrological states and fluxes at different spatiotemporal scales and presented good model performance validated by various ground and satellite observation data (Niu et al., 2017). This previous model application provides a solid basis for our uncertainty identification and sensitivity analysis study.
The details of the numerical implementation and the governing equations of PAWS can be found in Appendix A. Briefly, four flow domains are simulated in PAWS, including the stream channel, overland flow, vadose zone, and saturated groundwater. The structured grid-based finite-volume method is the main numerical scheme applied to discretize the governing equations of the various hydrologic components. PAWS also simulates two land surface subdomains, i.e., infiltration and evaporation, which are depicted in the ponding subdomain, while overland flow occurs in the surface flow subdomain. PAWS considers the horizontal interaction of both surface runoff and groundwater flow between model grids, which represents the actual hydrological processes and is often ignored by other regional and global hydrologic models. The 1-D diffusive wave equation is solved to simulate channel flow, and the 2-D version is used for overland flow. The "leakance" concept is applied to explicitly simulate the exchange between the channel and groundwater. PAWS has been coupled with the CLM , which calculates the surface energy balance and soil and plant carbon and nitrogen cycles. Canopy interception and ET demand (both transpiration and soil evaporation) are also computed in the CLM at each time step.
For the numerical model case in this study, a 1 km× 1 km grid is used for horizontal discretization, resulting in 118 × 122 grid cells for the study site. In this model, 20 vertical layers are defined to discretize the vadose zone, and for the fully saturated groundwater, there are two layers: the unconfined aquifer at the top and the confined aquifer at the bottom. In this study, the 90 m resolution NASA Shuttle Radar Topography Mission (SRTM) (US Geological Survey; http://eros.usgs.gov, last access: 27 July 2020) data are applied as DEM input, but for the channel network and watershed boundary delineation, the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) provides the 30 m resolution Global Digital Elevation Model Version 2 (GDEM V2). CLM land surface data are applied as landuse and land-cover (LULC) inputs. Details regarding these data can be found in Niu et al. (2017). More information on the governing equations of PAWS can be found in Shen and Phanikumar (2010) and Niu et al. (2014).

Hierarchical sensitivity analysis method with subdivided parametric sensitivity
The essential concept of the hierarchical sensitivity analysis method involves categorizing and quantifying different complex uncertainties of certain model systems while considering their dependence relationships. Different uncertainty sources (or uncertain inputs) are placed in different layers of a hierarchical uncertainty framework, which is then integrated with the variance-based global sensitivity analysis method to form a new set of sensitivity indices to accurately quantify the importance of different uncertainty sources. In this study, climate scenarios, different aquifer thicknesses, and parameters are treated as random uncertain inputs, and they represent the climate scenario uncertainty, model uncertainty, and parameter uncertainty, respectively. Notably, the thicknesses of aquifers here represent the model uncertainty because the different thicknesses of distinct types of aquifers lead to different conceptual hydrological models, and a similar concept (different thicknesses were used for two underground geological formations) for the model uncertainty was used in previous work (Dai et al., 2017a). Six model parameters are included in this test case, and they are divided into three groups. The first group includes vadose zone parameters (PR VDZ ): soil saturated hydraulic conductivity, K s (m d −1 ); the van Genuchten equation parameters α (m −1 ); and N (unitless) (van Genuchten, 1980). The second group is composed of groundwater parameters (PR GW ): unconfined aquifer hydraulic conductivity, K 1 (m d −1 ), and confined aquifer hydraulic conductivity, K 2 (m d −1 ). The third group is the overland flow parameter (PR OVN ): the length of the flow path for runoff contribution to the overland flow domain, L (m). Here, we consider the van Genuchten parameters α and N, because the correlation between α and N can largely affect the soil water release and infiltration processes in the vadose zone (Pan et al., 2011). In the hierarchical uncertainty framework, all these uncertainties are placed into the proper levels based on their dependence relationships. The climate scenario uncertainty is at the top layer, and the model uncertainty and parameter uncertainty are at the middle and bottom layers, respectively (Fig. 2), which is because the climate scenarios (CSs) are the driving forces of the hydrological model system, and multiple models can be built under a single scenario. Similar to the model and parameters, each model can contain a different set of parameters (Meyer et al., 2007). According to the hierarchical sensitivity analysis method, the partial variances contributed by the climate scenario uncertainty, model uncertainty, and parameter uncertainty can be expressed as Eqs. (1)-(3), respectively (see Appendix B for more details). Figure 2. The framework of the hierarchical sensitivity analysis developed for PAWS and applied to the central Amazon basin. The three uncertainty source types are placed into the appropriate hierarchical level according to their dependence relationships. The left part of this figure shows the sources of these uncertainties, and the right side shows the abbreviations and the structural relationships among the various uncertainties. The number of climate scenarios (CSs) in this study is six; the number of plausible numerical models under each climate scenario is three; and the number of parameter sets under each numerical model is 600. Notably, the parameter uncertainty sources are further divided into three parts: vadose zone parameters, groundwater parameters, and the overland flow parameter.
where is the model output, CS represents the set of alternative climate scenarios, NM represents the multiple plausible models with different aquifer thicknesses, and PR represents the multiple parameter sets under a certain model. The notations of "NM|CS" and "PR|NM, CS" indicate the hierarchical relationships that models are conditioned on climate scenarios and parameters are conditioned on models and climate scenarios. The term " |NM, CS" indicates that the output is calculated using the parameter sets that are conditioned on climate scenarios and models. The sensitivity indices for the climate scenarios S CS , models S NM , and parameters S PR are expressed as Eqs. (4)-(6), following the hierarchical sensitivity analysis method: where V ( ) is the total variance in the model output (Eq. B5). The above equations are directly derived based on the hierarchical sensitivity analysis method. Notably, the parameter sensitivity index in Eq. (6) includes the influence of all parameters. However, to explore the detailed parameter sensitivity, the total parameter uncertainty is further decomposed into three components, representing the uncertainties contributed from vadose zone parameters (PR VDZ ), groundwater parameters (PR GW ), and the overland flow parameter (PR OVN ). Using the variance decomposition method (Eq. B1), the partial variance in the parameters can be further decomposed as follows: where the notation PR ∼VDZ refers to other uncertain parameters excluding vadose zone parameters, which are groundwater parameters and the overland flow parameter. The first term of Eq. (7) on the right-hand side is the partial variance contributed by PR VDZ , and the second term represents the partial variance in the other parameters, which are groundwater parameters and the overland flow parameter. Note that Eq. (7) is decomposed based on the vadose zone parameters; when we decompose the partial variance in parameters based on the groundwater parameters or the overland flow parameter, the partial variance in the parameters can be further decomposed as Eqs. (8) and (9): The first terms in Eqs. (8) and (9) represent the partial variances contributed by the groundwater and overland flow parameters, respectively. Then, we can define a new set of subdivided parameter sensitivity indices for PR VDZ , PR GW , and PR OVN following the first-order sensitivity index definition (Eq. B2):

Sensitivity index estimation using the LHS and binning method
The hierarchical sensitivity analysis method proposed by Dai et al. (2017a) was sampled using the conventional Monte Carlo random sampling method, which is computationally expensive for the sensitivity analysis of large-scale PBHMs. In this study, the different parameters were simultaneously sampled by the LHS method (Zhang and Pinder, 2003;Kanso et al., 2006). Compared with the conventional Monte Carlo method, the LHS method can guarantee space filling and noncollapsing of parameter samples (Grosso et al., 2009;Crombecq et al., 2011;Husslage et al., 2011;Damblin et al., 2013;Ba et al., 2015;Qian, 2012), which means that the sampling points can be evenly distributed throughout the sampling region and that there are no two sampling points with the same value. Thus, LHS is a sampling method with higher sampling efficiency (Helton and Davis, 2003). The convergence rate of the conventional Monte Carlo method is However, for a system where the parameters are simply distributed (e.g., uniformly distributed), the convergence rate of LHS can reach O(N −3 ) (Iman and Conover, 1980). The LHS method has been one popular sampling technique used to reduce computational cost. For the function Y = f (X), the input vector X consists of k parameters (i.e., X = (X 1 , X 2 , . . ., X k )). By using the LHS method, the range of X i , i = 1, 2, . . ., k can be divided into n nonoverlapping intervals with equal probabilities. The n values obtained from X 1 are randomly paired with n values obtained from X 2 ; these n paired values are then combined with those n values from X 3 . We repeat this process until the new n × k sample matrix A is developed. This sample matrix A can be used to calculate the sensitivity index for the model output. More details regarding LHS are described in previous studies (McKay et al., 1979;Owen, 1998;Helton and Davis, 2003).
Using the variance definition, the partial variance in V (PR) can be first expressed as follows: After applying the formula of expectation and the LHS method, the terms V (PR), V (NM) and V (CS) can be ex-pressed as follows: where n and j represent the total sample number of LHS and the index of LHS samples, respectively, P (NM k |CS l ) represents the prior weight of model NM k under climate scenario CS l with k P (NM k |CS l ) = 1, and P (CS l ) is the prior weights of different CS satisfying l P (CS l ) = 1. The values of the weights for alternative models or CS could be selected using prior knowledge or objective criteria, e.g., posterior probabilities of the Bayesian theorem (Neumann, 2012;Schöniger et al., 2014).
To calculate the subdivided parametric sensitivity indices, i.e., the sensitivity indices for vadose zone parameters, groundwater parameters, and overland flow parameter, a binning method was implemented in this study. This binning method was designed to estimate the partial variance terms of subdivided parameter groups with paired LHS samples of parameters. Using the sensitivity index of vadose zone parameters as an example, the range of vadose zone parameters was divided into multiple equal bins, and the partial variance term |PR bin VDZ , NM, CS using the model outputs calculated by those parameter sample pairs that contain vadose zone parameters in the same bin (noted as PR bin VDZ ). Then, the partial variance term in Eq. (10) can be computed as follows: The subscript PR bin VDZ |NMCS represents the vadose zone parameters in the same bins under the fixed model and fixed climate scenario. The subscript PR GW , PR OVN |PR bin VDZ , NM, CS represents the change in the combination of PR GW and PR OVN sets belonging to a specific PR VDZ bin under a fixed model and fixed climate scenario. The term |PR bin VDZ , NM, CS represents the output under the fixed vadose zone parameter, subsurface stratigraphy model, and climate scenario. P PR bin VDZ |NM, CS refers to the weights of different bins for PR VDZ under the fixed model and fixed climate scenario.
The procedures for calculating the subdivided parametric sensitivity indices for PR VDZ using the combined LHS and binning methods are listed as follows: (1) where the symbol U refers to the number of combinations of PR GW and PR OVN in bin PR bin w VDZ , i.e., the size of the parameter set in bin PR bin w VDZ , and the symbol u is the index for these combinations. w represents the index for the bins of vadose zone parameters, and W is the total number of bins. After applying the LHS sampling method and the same binning method, the partial variance for PR GW and PR OVN can be estimated as follows: The binning method is a rigorously derived mathematical technique designed to separate and estimate the partial variances contributed from different parameters of one LHS method sampled parameter set. Because the mathematical equations are general and rigorous, this method can be applied to any modeling case with LHS parameter samplings. However, when the samplings for different parameters are totally random and unrelated, such as the conventional Monte Carlo simulation, the binning method is not applicable. Using LHS and the binning method, the number of realizations is reduced to the size of the parameter sets obtained from the LHS method. Thus, the computation cost for estimating the subdivided parametric indices can be highly reduced. Dai et al. (2017b) confirmed a similar accuracy of 36 000 000 Monte Carlo realization results with 16 000 realizations when applying only the binning method for a synthetic example. The combination of the LHS method with the binning method makes it computationally affordable to analyze the detailed parametric sensitivity for such a large-scale complex hydrologic model.

The generation of uncertain inputs
For the CS, we generated six typical and alternative scenarios based on NASA's Tropical Rainfall Measuring Mission (TRMM) data (https://trmm.gsfc.nasa.gov/publications_dir/ TRMM_Reentry_Risk_Assessment_FINAL_20150604.pdf) and the default CLM CRU-NCEP (CRUNCEP) dataset (Piao et al., 2012) from 1998 to 2013. We considered five climate variables: daily precipitation, temperature, solar radiation, humidity, and wind speed. The precipitation data were obtained from TRMM, while the temperature, solar radiation, humidity, and wind speed data are based on the CRUNCEP, because the model fails to capture the peak stream discharges using the CRUNCEP rainfall data (Niu et al., 2017). We first divided the annual climate dataset into dry and wet seasons according to the precipitation values (6 months for each season). Then, we sorted the wet and dry seasons according to their total precipitation values during the whole season. Next, we divided these wet and dry Figure 3. We identified six CSs based on precipitation data for 1998-2013 from NASA's TRMM data (https://trmm.gsfc.nasa.gov/ publications_dir/TRMM_Reentry_Risk_Assessment_FINAL_20150604.pdf). The first climate scenario (CS 1 ) is the wettest one, and the sixth climate scenario (CS 6 ) is the driest one. seasons into three different groups representing six climate scenarios from wet to dry (Fig. 3). The mean and SD of the values of the different climate variables (e.g., precipitation, maximum temperature) for each group were calculated using the daily data (Table 1). Finally, we generated random daily weather data for each climate scenario based on these mean and SD data using a normal distribution. The mean and SD for each climate scenario's daily data are listed in Table 1, and Fig. 3 displays a box plot of the precipitation data for the six climate scenarios (CS 1 -CS 6 ). For the model uncertainty, the research of Brunke et al. (2016) shows that the shallow bedrock depth or deep bedrock depth has a great influence on surface runoff and base flow in CLM. Therefore, in this study, we will consider the effects of different aquifer models. Niu et al. (2017) simulated an unconfined aquifer with 100 m depth and 200 m thick-ness for the confined aquifer. Considering that (1) the stratification of the soil and aquifer is relatively stable (and the thickness does not change much) (do Rosário et al., 2016), (2) there is a lack of actual measurements in this area to determine the stratification of unconfined aquifers and confined aquifers, and (3) according to Pelletier et al. (2016), the thickness of the unconfined aquifer in the central Amazon is larger than 50 m (and the depth of the bedrock is very deep), three aquifer models involving different thicknesses of the unconfined and confined aquifers were generated to investigate the sensitivity of the model outputs to aquifer thickness. These three aquifer models involving different thicknesses of the unconfined and confined aquifers are (1) 100 and 200 m (NM 1 ), (2) 50 and 250 m (NM 2 ), and (3) 250 and 50 m (NM 3 ), respectively. These three models represent the situations of (i) similar thickness of the unconfined and con-fined aquifer, with medium bedrock depth; (ii) thick confined aquifer, with low bedrock depth; and (iii) thick unconfined aquifer, with large bedrock depth.
The six different model parameters were sampled by LHS within the feasible range (Table 2), and 600 samples of the parameter set were generated. The reasons for using 600 parameter samples in this study are because, first, based on the experiences of previous research cases (Emery et al., 2016;Dai et al., 2019), 600 is an adequate parameter sample size for this research considering the model domain and number of uncertain parameters and, second, considering the computational cost, 600 parameter samples is an appropriate sample size for this study. By combining model uncertainty and climate scenario uncertainty, there are 600 × 3 × 6 = 10 800 simulations in total. The pure simulation time without analyzing data is already very time consuming even when using the best high-performance computing (HPC) platform we have.
According to the study of Cuartas et al. (2012), the clay content in the northwestern part of Manaus is very high (65 %-90 %). Considering the difference in regional soil texture (Fisher et al., 2008;Teixeira et al., 2014), the allowable range of soil saturated conductivity, K s , selected in this study is between 0 and 10 m d −1 . The ranges of unconfined aquifer conductivity, K 1 , and confined aquifer conductivity, K 2 , are chosen as 0-10 and 0-60 m d −1 , respectively. The results of the model calibration in Niu et al. (2017), which are related to the characteristics of the soil and groundwater layers in the watershed (Oleson et al., 2008;Christoffersen et al., 2014), are used to define the mean values of distributions used for these uncertain parameters. The soil saturated conductivity, K s ; unconfined aquifer conductivity, K 1 ; and confined aquifer conductivity, K 2 , were assumed to follow lognormal distributions (log-N (1.6094, 0.4214 2 ), log-N (3.4012, 0.4214 2 ), and log-N (1.6094, 0.4214 2 ), respectively). The remaining three parameters (α, N , and L) were assumed to follow a uniform distribution: U (0.1, 4), U (1.03, 5), and U (20, 700). The allowable ranges of these six parameters are listed in Table 2.
From Sect. 3.1 to 3.4, we assumed that the different scenarios have equal probability. Moreover, three models under each climate scenario were also assumed to have equal weights, i.e., P (CS l ) = 1/6, and P (NM k |CS l ) = 1/3. However, the weights for models and scenarios may affect the output results. We investigated the variability in the results to the changing weights for NM 1 , CS 1 (the wettest climate scenario), and CS 6 (the driest climate scenario) in Sect. 3.5. This experiment is helpful for improving our understanding of sensitivity analysis results.

Model predictions
As mentioned in the above section, the total number of PAWS+CLM simulations considering all possible combinations of the three uncertain factors is 6 × 3 × 600 = 10 800, which represents six climate scenarios, three model conceptualizations of aquifer thickness, and 600 sampled parameter sets. We used the parallel computing technique for running these simulations through the HPC platform (13 cores of Xeon 2.8 GHz CPU). The average time spent on a single simulation was 2.8 min, and a total of 10 800 simulations were run for 3 weeks. The simulation time for all the simulations was 6 months (180 d, 4320 h), which is the length of the dry or wet season in the central Amazon region. The results given by PAWS were represented in two forms: (1) spaceaccumulative output values over the whole grid at each time step and (2) time-accumulative output values over the whole simulated period for each grid. In this study, the time step is 1 h. Figure 4 depicts the space-accumulative model predictions for the two outputs of interest, ET and Q G , using different inputs of scenarios, models, and parameter sets. All prediction results are grouped into 24 groups based on local time, which represent 01:00 to 24:00 (all times are given in local time, LT, unless stated otherwise) throughout the day (Fig. 4). Each box in Fig. 4 describes the prediction results estimated using all the combinations of 600 parameter sets, three models, and six CSs at the same local times. Figure 4a shows that the ET predictions throughout a day have a timevarying pattern, and their values are significantly larger during the daytime and smaller at night. This pattern coincides with the physical fact that sunlight leads to higher temperature and more plant transpiration. The uncertainty of ET predictions during the daytime is also larger than that during the night. Figure 4b shows that the predictions of Q G have no significant time-varying pattern throughout the day. However, the prediction results of ET and Q G both demonstrate great variability or uncertainty for each time group. Further quantitative sensitivity analysis is necessary to identify the most important sources of uncertainty for these predictions.

Sensitivity indices for evapotranspiration
First, we calculated the sensitivity indices for the spaceaccumulative ET over the whole watershed at all time steps using Eqs. (4)-(6). Figure 5a shows the sensitivity indices for the whole simulation period of 4320 time steps. All the sensitivity indices fluctuate strongly with time, except for the sensitivity indices of the models. The sensitivity indices for the models (S NM ) are always close to zero at every time step, indicating that aquifer thickness has little influence on space-accumulative ET. Figure 5b-g plot the sensitivity indices across six periods, exhibiting the details at each time step. Every period lasts for 3 d. The patterns of the sensi- Table 2. Six chosen parameters to be included in parameter uncertainty.

Group
Parameter tivity indices have a daily cycle, but specific values of the sensitivity indices at the same wall-clock time on different days are distinguished. Figure 5 indicates that the sensitivity to various factors is strongly time dependent. Notably, at 12:00-13:00, the CSs are always the most important factors affecting the sensitivity of ET (Fig. 5b-g), which may be because ET is directly influenced by solar radiation values, and the radiation forcing used in this study reaches its maximum value at approximately 12:00. Therefore, the CSs dominate the uncertainties at 12:00-13:00. Another finding is that at 24:00-01:00 the sensitivity indices for the parameters (S PR ) show absolute dominance, but the sensitivity in-dices for the climate scenarios (S CS ) are decreased. A possible explanation for this result might be that precipitation and radiation forcing all decrease to zero during this period, leading to a decrease in the sensitivity indices for the climate scenarios (S CS ). In contrast, the importance of parameters is greatly increased. Six time points (simulation times are 1428, 1440, 2868, 2880, 4308, and 4320 h) were chosen as examples to show the specific sensitivity indices (Fig. 6). Simulation times of 1428, 2868, and 4320 h belonged to different days, but all corresponded to 12:00 LT. At these time points, the climate scenario uncertainty (S CS ) is the most important contributor to the total ET prediction uncertainty, accounting for 54 %-77 % of the total uncertainty, and parameters (S PR ) contribute the second most to uncertainty. However, at different time points (1440, 2880, and 4320 h, corresponding to 24:00 LT), the parameters are the dominant uncertainty contributor, with S PR ranging from 89 % to 92 %. We also calculated the sensitivity indices for every grid cell within the model domain using the time-accumulative ET predictions over all simulation periods (4320 h). Figure 7 shows the spatial variability in the sensitivity indices for the temporal mean ET predictions. The maps demonstrate that for most grids, parameters are the most important uncertainty contributor to ET predictions (S PR > 0.50), and CS are the second most important contributor to uncertainty. However, for stream grid cells, the importance of aquifer thicknesses increases. Therefore, the parameters and aquifer thicknesses are both important. Here, aquifer thicknesses refer to the average aquifer thickness for the whole watershed. The increase in the model sensitivity indices indicates that the structure of the aquifer significantly affects the baseflow and then influences the river evaporation predictions. Figure 7 shows that the parameter uncertainty within the overall watershed is important for ET, and for river evaporation, the aquifer thicknesses are also important.

Sensitivity indices for groundwater contribution to streamflow
Groundwater has been demonstrated to be crucial for soil moisture in the Amazon region by previous research (Miguez-Macho and Fan, 2012b). Meanwhile, it also exerts a significant buffering effect on maintaining evapotranspiration during dry seasons (Miguez-Macho and Fan, 2012a;Pokhrel et al., 2014). The model PAWS uses the output of Q G to quantify the variation in groundwater volumes and measure the interaction process between groundwater and rivers. It is essential to implement the sensitivity analysis to investigate which factor is most influential on this groundwater exchange process. The same sensitivity analysis procedures were also conducted for the model predictions of Q G . Figure 8a shows the sensitivity indices for the whole simulation period of 4320 time steps for Q G predictions. This figure indicates that regardless of the time steps, parameters are always the dominant contributor to the total Q G prediction uncertainty. This result may be explained by the fact that soil parameters strongly affect the soil water redistribution process, including infiltration into groundwater. We selected the same period as Fig. 5b-g to display the more detailed results for Q G predictions in Fig. 8b-g. As shown in these figures, the sensitivity indices of the models (S NM ) and climate scenarios (S CS ) reach peak values at approximately 01:00. In terms of S CS , this may be because the exchange between groundwater and river flow occurs hours later than the rainfall process, and the amount of water during the exchange process always reaches its peak at night, at approximately 01:00. The S NM might be because the thickness of aquifers will greatly influence the water redistribution process in the aquifer. Another pattern demonstrated in Fig. 8 is that the values of S CS generally increase with time. This trend may be caused by the seasonality effect of CS or the long-term cumulative influence of CS on the groundwater flow.
Because groundwater exchange with stream flow occurs only at grid cells along the streams, the sensitivity indices only have valid values in those stream grid cells (Fig. 9). Our results indicate that considering most grid cells, the parameters are the most important contributor to the uncertainty of time-accumulative Q G predictions, and the second most important factor is aquifer thickness. However, if we divide the grid cells into groundwater and stem river grid cells based on their location relative to the river and aquifer type, the sensitivity analysis results are totally different in these two types of grid cells. The model parameter uncertainty is usually the most important in stem river grid cells; in contrast, the aquifer thicknesses contribute the largest portion of the uncertainty in groundwater grid cells. This pattern of results may be caused by the unconfined aquifer and river being unconnected in the stem river grid cells, and there is an unsaturated zone between two of them. Therefore, the soil parameters affect Q G predictions in stem river grid cells. Moreover, the groundwater table is relatively high, and the groundwater is directly connected with rivers in the groundwater grid cells. Thus, the aquifer thicknesses are more important under this condition.

Sensitivity indices for subdivided parameters
Based on the sensitivity analysis for ET and Q G predictions, the results show that parameters are important uncertain inputs for both the space-accumulative and time-accumulative uncertainties. In this study, we used Eqs. (10)-(12) to further calculate the subdivided parametric sensitivity indices, which can provide a more detailed sensitivity analysis for model simulation. Through this investigation, the parametric sensitivity was subdivided into three groups: (1) the sensitivity for vadose zone parameters (S PR VDZ ), (2) the sensitivity for groundwater parameters (S PR GW ), and (3) the sensitivity for the overland flow parameter (S PR OVN ). Using the binning method, we calculated the space-accumulative and time-accumulative subdivided parametric sensitivity indices for ET and Q G . We plotted frequency histograms of the subdivided parametric sensitivity indices over 4320 h in Fig. 10. Figure 10a depicts the results for ET. The value of S PR VDZ is concentrated in the range of 0.1-0.9, and S PR GW is concentrated in the range of 0.003-0.032. The value of S PR OVN is so small that the influence of the overland flow parameter can be ignored. This indicates that vadose zone parameters (PR VDZ ) dominate the total parametric uncertainties for ET. Figure 10b shows the frequency histogram of spaceaccumulative subdivided parametric sensitivity results for Q G ; S PR VDZ is still concentrated in the larger number range (0.2-0.53), and the value of S PR GW changes from 0.04 to 0.3. . S PR is the sensitivity index for parameters. S NM is the sensitivity index for models and represents the influence of aquifer thickness. The S CS is the sensitivity index for climate scenarios. The bottom x axis of (b-g) represents the simulated time steps, and the upper x axis of (b-g) represents the local time.
The number of S PR OVN is also the lowest, indicating that the overland flow parameter has little effect on Q G . The order of importance of the uncertain inputs is the same for both ET and Q G predictions. However, it is significantly different from ET in that although PR GW is the second most important parameter group, the value of S PR GW in the Q G results is an order of magnitude higher than that in the ET results. In the Q G results, the range of S PR GW is concentrated in the range of 0.05-0.2, while in the ET results, the value of S PR GW is concentrated in the range of 0.003-0.032.
We plotted the time-accumulative subdivided parametric sensitivity indices for ET in Fig. 11a and for Q G in Fig. 11b. Considering ET as our output, for most grids, the vadose zone parameters are the most important contributor to parametric uncertainties. Compared with that on other grids, the influence of groundwater parameters on the river grids is more significant (Fig. 11a). For the Q G results, the vadose zone parameters generally dominate the parametric sensitivities for most grids (Fig. 11b). However, if considering different types of grid cells, we find that the vadose zone parameters mainly affect the stem river grid cells and have a relatively small influence on the groundwater grid cells. This pattern coincides with our hypotheses that there is an unsaturated zone between the stem rivers and groundwater. More detailed sensitivity indices for all six parameters are demonstrated in Appendix C.

Effects of prior weights on sensitivity indices
In this section, we changed the prior weights of the CS and numerical models to investigate their influences on the space-accumulative sensitivity indices. Because the number of space-accumulative results for ET and Q G is too large to S PR is the sensitivity index for the parameters. S NM is the sensitivity index for the numerical models, and S CS is the sensitivity index for the climate scenarios. be well exhibited, we chose one time step (4308 h, 12:00 wall-clock time) to show the trend. We randomly changed the values of the weights for NM 1 (the thickness of the unconfined aquifer is 50 m and that of the confined aquifer is 250 m), CS 1 (the wettest climate scenario), and CS 6 (the driest climate scenario) to between 0 and 1. If the weight for NM 1 , CS 1 , or CS 6 is p, then the weight of the remaining climate scenarios or models will be assumed to be (1 − p)/n, where n is the number of the remaining climate scenarios or models. Figure 12a indicates that when we consider ET as our output, with the increase in the prior weight of NM 1 , the uncertainty of the CS will decrease to 50 %, while the uncertainty of the parameters will increase to 50 %. Both parameters and CSs have important effects on ET. Different from the results for ET, with the increase in the prior weights of NM 1 , the sensitivity index of the numerical models for Q G decreases to 0 (because only one model exists under this condition), and the scenario uncertainty changes only slightly. Moreover, the uncertainty of parameters always dominates the total uncertainty for Q G (Fig. 12b) regardless of the prior weight value. In general, the different prior weight values for the numerical models only slightly change the sensitivity analysis results. Figure 12c-f exhibits the influences of prior weights for the wettest and driest CS on ET. These figures first demonstrate that changing the values of the prior weights of CS 1 and CS 6 has larger impacts on ET predictions than on Q G predictions. This pattern coincides with the fact that the parameter uncertainty dominates the total predictive uncertainty of Q G and that the scenario uncertainty is relatively . S PR is the sensitivity index for parameters. S NM is the sensitivity index for models and represents the influence of aquifer thickness. The S CS is the sensitivity index for climate scenarios. The bottom x axis of (b-g) represents the simulated time steps, and the upper x axis of (b-g) represents the local time.    small. Therefore, the selection of prior weight values for the scenarios does not have a significant effect on the sensitivity analysis results for the Q G predictions, and the parameter sensitivity index is always the largest (Fig. 12d and f). For the sensitivity analysis results pertaining to ET predictions, changing the values of the weights for CS 1 and CS 6 has different effects. The sensitivity index values of the climate scenarios for ET predictions monotonically decrease, while the importance of parameters continues to increase as the prior weight of CS 1 is larger than 40 %, which reflects that when the probability of extreme humid seasons in the Amazon is greater than 40 %, the importance of parameters takes precedence over the importance of climate scenarios for ET. However, the value of S CS for ET predictions first increases when the prior weight of CS 6 approaches 10 % and then decreases after the prior weight of CS 6 approaches 90 %, and S PR shows the opposite trend (Fig. 12e). This shows that when the probability of occurrence of extreme dry seasons is between 10 % and 90 %, the climate scenario is always the most important uncertain input unless the occurrence probability of the extreme dry season is greater than 90 %.

Discussion
The results from this case study exhibit the importance of parameters, especially the vadose zone parameters, for ET and Q G predictions. Furthermore, according to the spaceaccumulative results, the climate scenario is also an important uncertainty source for ET predictions, especially at 12:00. Meanwhile, the thickness of the aquifer has a nonignorable influence on the Q G predictions on the groundwater grid cells. Moreover, according to the results of adjusting the climate scenario and model weights, the change in model (aquifer thickness) weights only has a small impact on the importance of different uncertainties. When the probability of occurrence of the extreme humid season is high, the importance of the parameters increases significantly. However, when the probability of occurrence of the extreme dry season is high, the main factors affecting ET predation are still the climate scenario unless the probability of occurrence of CS is greater than 90 %. Although these patterns of sensitivity analysis results may not be universally correct, they can still provide useful insights for other modelers with similar cases and models.
In addition to the specific results, we also have some new insights into the general patterns of sensitivity analysis for the PBHMs provided by this pilot case. For instance, first, the ranks of importance of uncertain inputs are totally different for different model outputs; e.g., CSs have a large impact on ET predictions but a small impact on Q G predictions. There is no one set of results that are valid for all different model outputs. Second, the sensitivity analysis results of ET and Q G predictions show that the uncertainty has high temporal and spatial variability, which reflects that for very complex hydrological models, such as PBHMs, it is incorrect to gen-eralize the sensitivity analysis results of a grid or a time step to the entire watershed or the entire simulation cycle. Third, it is necessary to implement such a comprehensive global sensitivity analysis method that considers more than parametric uncertainty for the large-scale PBHMs since the sensitivity analysis results showed that other sources of uncertainty (e.g., climate scenario and model uncertainties) are essential as well for model predictions. Finally, evaluating the sensitivity of the parameters in detail is essential for PBHMs. For such a complex surface-subsurface coupling model, the new sensitivity analysis method can efficiently identify the uncertain inputs that have the greatest impact on the model outputs. This process can greatly improve our understanding of the complex model system and save time that is normally spent calibrating the model.

Conclusions
This research presented an improved hierarchical sensitivity analysis method for comprehensive global sensitivity analysis of large-scale complex PBHMs. Developed based on the previous hierarchical framework of Dai et al. (2017a), this new methodology can simultaneously consider various types of uncertainty sources and estimate the importance of different processes involved in the modeling work. A new set of sensitivity indices of subdivided parameters was defined to quantify the importance of processes that only involve partial parameters. The highly efficient sampling algorithm of the LHS and binning method were implemented for the estimation of sensitivity indices to reduce computational cost. For evaluation and demonstration purposes, we implemented the new sensitivity analysis method into a real-world case of large-scale complex PBHM (PAWS), which was applied to a large Amazon catchment. Three common groups of uncertainty sources or uncertain inputs were considered in this study, including six CSs, three plausible aquifer models, and six uncertain parameters (i.e., soil saturated conductivity, van Genuchten α and N, unconfined aquifer conductivity, confined aquifer conductivity, and the length of the flow path for runoff contribution to the overland flow domain). A new set of subdivided parametric sensitivity indices was defined for three groups of parameters (i.e., vadose zone, groundwater, and overland flow parameters).
The sensitivity analysis results in this study first demonstrate the necessity of implementing such a comprehensive global sensitivity analysis for PBHMs, because uncertainty sources other than parameters (e.g., CS and models) are also important for model predictions. Furthermore, the values of model weights have a small impact on the sensitivity analysis results, but the selections of weights for extreme CSs may change the ranks of importance for uncertain inputs. Moreover, the sensitivity analysis results are both temporally and spatially dependent and have distinct patterns for different model outputs. Therefore, there is no single conclusion for all model outputs considering different times and locations. In general, the parameter uncertainty is important for both ET and Q G predictions. Among all the parameters, the vadose zone parameters are the most important, and the parameter of overland flow is negligible. The CSs are also important uncertainties for ET predictions, especially at 12:00. Along the river grid cells, the thickness of the aquifer has a significant influence on both ET and Q G predictions. Although the patterns of sensitivity analysis results found in this study may not be universally valid, they can still provide useful insights for modelers with similar problems. For instance, we can suggest that when modelers apply sophisticated hydrological simulators, such as PAWS, they should pay more attention to the values of weather variables at approximately 12:00 (the daily peak values) and focus more on estimating the thicknesses of groundwater aquifers near rivers and adjusting the vadose zone parameters.
Through this pilot example of comprehensive global sensitivity analysis, this study proves that using the new improved hierarchical sensitivity analysis method is a computationally affordable and useful way to identify the most important uncertain inputs for large-scale complex PBHMs. The sensitivity analysis results can provide key information on uncertainty sources for modelers and greatly improve the model calibration and uncertainty analysis processes. The proposed method is mathematically rigorous and general and can be applied to extensive large-scale hydrological or environmental models with different sources of uncertainty.

Appendix A
The governing equations of PAWS are presented in detail in Shen and Phanikumar (2010) and Shen et al. (2013). Here, we will mainly introduce the equations describing the processes involved in this article.
In PAWS, the soil moisture in the vadose zone is calculated according to the Richards equation. The vertical movement of fluid between saturated and unsaturated soil is calculated based on the mixed form of the Richards equation (Celia et al., 1990;van Dam and Feddes, 2000): where h represents the soil water pressure head, z is the elevation (positive upward), K(h) represents the soil unsaturated conductivity, and W (h) is the source or sink term, including the influence of evaporation, root extraction, and lateral flow. The differential water capacity can be described as C(h) = ∂θ/∂h, where h is the soil pressure head and θ is the water content. The pressure head, h, is related to the unsaturated hydraulic conductivity, K(h). According to the Mualem-van Genuchten formula (Mualem, 1976;van Genuchten, 1980), the soil saturated hydraulic conductivity, K s , and van Genuchten parameters α and N will influence the unsaturated conductivity, K(h): where S is the relative saturation, θ s is the saturated water content, θ r is the residual water content, N is related to the pore size distribution, α indicates the reciprocal of air suction, and λ is a parameter measuring pore connectivity. The aquifers in PAWS are depicted as a series of 2-D layers . In each layer, the 2-D groundwater equation is used to describe the water movement: where S is the storability and T is the transmissivity of the aquifer; T = Kb, where K is the aquifer conductivity and b is the saturated thickness of the aquifer; H is the aquifer hydraulic head; R is recharge or discharge; W is the source and sink term; and D p is percolation into deeper aquifers. PAWS applies one-dimensional diffusive wave equations to portray the channel flow model (Shen and Phanikumar, 2010;Shen et al., 2014). After calculating the channel flow, the exchange between groundwater and channel flow (Q G ) is immediately computed. The calculation of Q G is based on the leakance concept (Shen and Phanikumar, 2010): where h * r is the river level calculated from the channel flow model, K r is the riverbed conductivity, Z b is the elevation of the riverbed, Z b is the thickness of the riverbed, and H * is the groundwater table. Note that H * can also be described as Eq. (A5). By solving these two equations together, we can obtain H * and h n+1 r . Then, the value of Q G can be calculated as follows (Shen and Phanikumar, 2010): where w is the wetted perimeter. If the river width is greater than 10 m, w can be approximated as the river width. PAWS retains its own flow scheme, but the surface processes use the CLM 4.0 model, which enables the simulation of detailed surface processes, such as surface heat flux, water vapor flux, surface radiation balance, crop growth, and plant photosynthesis. The calculation of ET demand is performed in the CLM model based on the climate data, and then, ET demand will be transferred to PAWS as a source term for the vadose zone. More details about the calculation of ET (both evaporation and transpiration information can be found in the technical note of CLM 4.0, http://www.cesm.ucar.edu/models/cesm1.1/clm/CLM4_ Tech_Note.pdf, last access: 27 July 2020). The coupling with the CLM makes PAWS a more comprehensive and robust surface-subsurface hydrological model.

Appendix B
For the model, = f (X) = f (X 1 , . . ., X m ), where is the model output and X = {X 1 , . . ., X m } is a group of uncertainty inputs; using the law of total variance, the total variance in can be decomposed as follows (Dai et al., 2017a): where the first term of partial variance on the right-hand side is the within-X i partial variance and represents the variance contribution by X i , and X ∼i represents all the inputs except X i . The second term on the right-hand side represents the variance contributed by the model inputs excluding X i as well as the interactions of all the inputs. Based on the definition of the first-order sensitivity index (Saltelli and Sobol, 1995), The percentage of uncertainty contributed by input X i can be accurately quantified.
For the hierarchical framework in Fig. 2, the variancebased sensitivity analysis method enables decomposition of the total variance into individual contributors as follows: The first term of partial variance on the right-hand side of this equation represents the variance caused by multiple CSs. The second term on the right-hand side is the partial variance caused by other uncertain inputs and can be further decomposed as follows: where the first partial variance term on the right-hand side of this equation represents the uncertainty contributed by multiple plausible models. The second term represents the withinmodel partial variance caused by the uncertain parameters.
The three terms on the right-hand side of Eq. (B5) represent the partial variances contributed by the parameters, models, and CSs, respectively. The equation indicates that the total variance can be decomposed into the variances contributed by the alternative climate scenarios, denoted by CS; plausible numerical models, denoted by NM; and uncertain parameters, denoted by PR. Then, following the first-order sensitivity index definition (Eq. B2), the hierarchical sensitivity analysis method defines the indices for PR, NM, and CS, respectively, as shown in Eqs. (4)-(6).

Appendix C
To conduct a more comprehensive analysis of all parameters and to compare the impact of two aquifers on Q G , we estimated the sensitivity indices of the six parameters according to Eq. (C1). The difference between this equation and the previous ones is that Eq. (C1) no longer groups the parameters, and it calculates the sensitivity indices individually for six parameters.
In this equation, θ refers to one of the six parameters, i.e., K s , α, N , K 1 , K 2 , or L. The subscript "θ |NM, CS" represents the change in one parameter under a fixed model and a climate scenario. The subscript "∼ θ|θ, NM, CS" refers to the other five uncertain parameter inputs excluding θ parameter. The term " |θ, NM, CS" represents the output under the fixed θ , model, and climate scenario. The spatial distribution of the sensitivity indices of six parameters for Q G is shown in Fig. C1. According to Fig. C1, the importance of the van Genuchten parameter N in the stem grid cells is significant. The conductivity of unconfined aquifer K 1 has a certain impact on Q G in most river grid cells. Additionally, it can also be seen from Fig. C1 that for most grids the influence of K 1 is greater than K 2 , which implies that the unconfined aquifer has a greater influence on baseflow. Figure C1. Maps of six parameter sensitivity indices for groundwater contribution to stream flow (Q G ) predictions.
Data availability. The weather dataset is available from the default CLM CRU-NCEP (CRUNCEP) dataset and https://disc.gsfc. nasa.gov/mirador-guide?project=TRMM&tree=project (NASA Orbital Debris Program Office, 2015). The sensitivity data and the source code used for analyzing the sensitivity data in this study are freely available upon request.
Author contributions. HD and JN were involved in the project conceptualization and formulating the methodology. HL performed the experiments and analyzed and visualized the data with guidance from HD and JN. HL prepared the original paper, with contributions from HD, JN, HQ, and WR. DG, BXH, MY, JZ, CW, and XC helped shape the initial ideas for this research.
Competing interests. The authors declare that they have no conflict of interest.