The diversity in hydrologic models has historically led to great controversy on the “correct” approach to process-based hydrologic modeling, with debates centered on the adequacy of process parameterizations, data limitations and uncertainty, and computational constraints on model analysis. In this paper, we revisit key modeling challenges on requirements to (1) define suitable model equations, (2) define adequate model parameters, and (3) cope with limitations in computing power. We outline the historical modeling challenges, provide examples of modeling advances that address these challenges, and define outstanding research needs. We illustrate how modeling advances have been made by groups using models of different type and complexity, and we argue for the need to more effectively use our diversity of modeling approaches in order to advance our collective quest for physically realistic hydrologic models.
The research community exhibits great diversity in its approach to hydrologic modeling, with different models positioned at different points along a continuum of complexity. Models can be defined both in terms of process complexity (i.e., to what extent do different models explicitly represent specific processes) and spatial complexity (i.e., to what extent do different models explicitly represent details of the landscape and the lateral flow of water across model elements). Such model diversity has led to great community debates on the “correct” approach to process-based hydrologic modeling (Wood et al., 1988, 2011, 2012; Grayson et al., 1992a, b; Famiglietti and Wood, 1995; Reggiani et al., 1998; Beven, 2002; Sivapalan et al., 2003; Maxwell and Miller, 2005; Beven and Cloke, 2012), with the debate centered around issues of the adequacy of process parameterizations, data limitations and uncertainty, and computational constraints on model analysis.
The purpose of this paper is to revisit the historical
debates on process-based hydrologic modeling and ask the following question:
We posit that these questions, published almost 50 years ago, are very
relevant today and nicely frame the debates on process-based hydrologic
modeling.
We organize the paper around the three questions posed by Freeze and Harlan, on (1) model structure, (2) model parameter values, and (3) model execution (computing). We discuss these modeling challenges separately, while recognizing that these modeling challenges are strongly interdependent (e.g., a complex model structure may have large computing demands, restricting the extent to which it is possible to explore alternative model parameter sets). We will touch on these interdependencies in the individual sections of the paper.
For each question posed by Freeze and Harlan, we define the major research challenges, and we provide examples of different ways that the community has risen to meet these challenges, considering modeling approaches of varying complexity. We do not mean to provide a comprehensive review; rather, we present possible solutions to outstanding modeling problems, focusing attention on the research sphere of Eric F. Wood. Our overall intent in writing this paper is to demonstrate how diverse hydrologic modeling approaches advance the collective quest for physically realistic hydrologic models, and to define additional research that is necessary to further advance process-based hydrologic models.
The first question posed by Freeze and Harlan (1969) focuses on the adequacy of the mathematical descriptions of system of interest. Such mathematical descriptions define the structure of a model. They include both the equations used to parameterize individual processes and the interactions among processes and across scales.
A major research challenge addressed by Eric F. Wood is the problem of scaling, or closure (Wood et al., 1988; Blöschl and Sivapalan, 1995; Reggiani et al., 2001; Beven, 2006), i.e., how best to represent the influence of small-scale heterogeneities on large-scale fluxes, and how best to represent interactions among processes and the connectivity of water across the landscape. The scaling challenge is ubiquitous. For example, Mahrt (1987) demonstrates how localized areas of instability can dominate large-scale energy fluxes; Scott et al. (2008) demonstrate that transpiration from narrow riparian corridors in arid regions is much greater than the local precipitation; Seyfried et al. (2009) demonstrate that deep snow drifts produce local runoff “hotspots” that generate a disproportionate amount of the catchment runoff; Tromp-van Meerveld and McDonnell (2006a, b) demonstrate that the water stored in bedrock depressions must be raised to a sufficient level in order to connect bedrock depressions and generate hillslope outflow. The community has risen to meet these scaling challenges in very different ways – different models use very different sets of equations to describe the large-scale manifestation of spatial heterogeneity, process interactions, and connectivity.
The different solutions to the scaling and closure problem can be distinguished by the extent to which the effort is focused on developing new large-scale flux parameterizations or numerically integrating the small-scale heterogeneities across space. Such differences are perhaps best illustrated by considering the different approaches used to simulate the transmission of water through catchments. In bucket-style rainfall–runoff models – at the simplest end of the complexity continuum – the large-scale transmission of water is often defined as a linear (or near-linear) function of water storage (e.g., see the synthesis in Clark et al., 2008 and the recent review by Hrachowitz and Clark, 2017). Such large-scale closure relations implicitly represent the small-scale heterogeneity of flow paths, including the localized areas of high conductivity (e.g., macropores) that dominate the large-scale response (Beven and Germann, 1982; McDonnell, 1990). By contrast, the more complex 3-D variably saturated flow models typically use small-scale closure relations (Maxwell and Miller, 2005; Rigon et al., 2006), where unsaturated hydraulic conductivity is defined as a highly nonlinear function of soil moisture (e.g., Van Genuchten, 1980). These 3-D models compute large-scale fluxes by spatially integrating the small-scale heterogeneities (Maxwell and Kollet, 2008; Kollet et al., 2010). The differences in solutions to the scaling problem are not mutually exclusive, as many models include a mix of small-scale and large-scale flux parameterizations (e.g., VIC, the variable infiltration capacity model, uses a large-scale parameterization of infiltration, yet relies on small-scale equations to simulate the storage and transmission of water through the upper portion of the soil matrix).
When viewed in this way, the different solutions to the scaling and closure
problem can be shared among different modeling groups that employ very
different modeling approaches. To explain this perspective, consider the
inequality that describes ideal relationships between the model resolution
and the length scale of resolved and unresolved processes (Wood et al.,
1988):
The trend towards “hyper” resolution land models (Wood et al., 2011), e.g., 1 km or 100 m over large geographical domains, emphasizes the need for general parameterizations of hydrological processes on this scale. However, this is still an unsolved problem: we do not have firm evidence that the structure and parameter values of our element-scale equations correspond to hydrologic reality at those scales. One of the most important causes of this difficulty is the spatial heterogeneity in the initial and boundary conditions, and in the material properties of the medium. This heterogeneity occurs at multiple spatial scales, and has multiple physical causes (Seyfried and Wilcox, 1995). The multiple scales of heterogeneity are manifest as multiple dominant processes (Grayson and Blöschl, 2001), and also as processes without a well-defined spatial scale (e.g. preferential flow in the snowpack, on the land surface, in the subsurface). These problems cannot be solved solely by numerical integration across space. The next section summarizes recent advances in developing large-scale flux parameterizations and in effectively resolving dominant processes.
The common challenge of developing large-scale flux parameterizations has been addressed in a number of ways. One class of methods is that of statistical–dynamical flux parameterizations, where large-scale fluxes are defined based on probability distributions of subgrid or subelement model state variables. For example, area-average infiltration can be parameterized based on probability distributions of water table depth (Beven and Kirkby, 1979; Sivapalan et al., 1987) or on probability distributions of soil moisture (Moore and Clarke, 1981; Wood et al., 1992). Statistical–dynamical approaches are also used to parameterize the impact of frozen soils on area-average infiltration (Koren et al., 1999) and the impact of spatial variability of snow on area-average energy fluxes (Luce et al., 1999; Liston, 2004; Clark et al., 2011a). Another class of methods consists of scale-dependent parameterizations, where new flux parameterizations are defined directly on the scale of interest. Examples of this class of methods include the empirically derived storage–discharge relationships described earlier, where the large-scale transmission of water is often defined as a linear (or near-linear) function of water storage (Ambroise et al., 1996; Clark et al., 2008; Fenicia et al., 2011; Brauer et al., 2014). Similarly, large-scale stability corrections, used in computations of land–atmosphere energy fluxes, implicitly represent the impact of local pockets of instability on large-scale fluxes (Mahrt, 1987). There is a strong need to synthesize, evaluate, and compare these large-scale parameterizations in order to improve the physical realism of hydrologic models (Clark et al., 2011b, 2015b, 2016).
Statistical–dynamical flux parameterizations rely on the assumption that the
model-scale
The challenge of effectively resolving dominant processes has also been tackled in different ways. While one tactic is to simply discretize the domain into the highest resolution grid that modern computers allow (the numerical integration across space described above; Freeze and Harlan, 1969; Maxwell et al., 2015), this approach constrains capabilities to extensively experiment with alternative model configurations and to characterize model uncertainty (Beven and Cloke, 2012; Wood et al., 2012). Hence, for practical reasons, the challenge of spatial integration is commonly met using concepts of hydrologic similarity, often implemented at multiple levels of granularity within the same model. At a fine level of granularity, Wang and Leuning (1998) make separate stomatal conductance calculations on sunlit and shaded leaves to improve scaling from the leaf to the canopy. Similarly, Swenson and Lawrence (2012) make separate energy balance calculations over snow-covered and snow-free terrain to improve estimates of large-scale energy fluxes. On the system scale, many models spatially integrate across discrete landscape types to capture the large-scale manifestation of small-scale heterogeneity (e.g., Flügel, 1995; Tague and Band, 2004). For example, Newman et al. (2014) spatially integrate across a small number of discrete landscape types in order to reproduce the local runoff “hotspots” described by Seyfried et al. (2009). More recently, Chaney et al. (2016a) demonstrate that the use of spatially interacting hydrologic response units can reduce computational cost of a fully distributed hydrologic model by 3 orders of magnitude without appreciable losses in information. Like the large-scale flux parameterizations, there is a strong need to rigorously compare different approaches to explicitly resolve dominant processes.
An interesting twist is the interplay between explicitly representing small-scale processes and avoiding or reducing redundant calculations across large model domains. For example, in the push for hillslope-resolving models across large geographical domains, one approach is to use the concept of representative hillslopes (Troch et al., 2003; Hazenberg et al., 2015; Ajami et al., 2016). The representative hillslope has a length dimension much smaller than the length scale of the model element, and the hillslope is discretized into columns along an axis perpendicular to the stream to explicitly resolve lateral flow processes. The hydrologic and energy fluxes from the single hillslope, or averaged across local hillslopes of different types, are then considered representative of the model element as a whole. This approach spatially integrates both along a hillslope and among hillslopes. Such multiscale approaches show considerable promise and will likely be increasingly used to represent how small-scale heterogeneities, interactions among processes, and the connectivity of water across the landscape affects large-scale behavior.
A broader challenge is to simulate the myriad controls on catchment evolution, e.g., to predict how energy gradients dictate landscape evolution, how natural selection favors plants that make optimal use of available resources, and how the dynamic interactions between humans and the environment shapes the storage and transmission of water across the landscape (Rodríguez-Iturbe et al., 1992; Eagleson, 2002; Schymanski et al., 2009, 2010; Sivapalan et al., 2012; Harman and Troch, 2014; Zehe et al., 2014; Clark et al., 2016; Grant and Dietrich, 2017). Addressing this challenge requires shifting focus from traditional approaches at short timescales where “properties define processes” (Gupta et al., 2012) towards approaches on longer timescales that focus on predicting how “processes define properties” (Rodríguez-Iturbe et al., 1992; Eagleson, 2002; Harman and Troch, 2014). Importantly, it requires the treatment of humans as an endogenous component of the Earth system (Sivapalan et al., 2012; Clark et al., 2015a).
There are of course multiple possible approaches available to simulate dominant hydrologic processes. A useful path forward is to isolate and scrutinize alternative modeling approaches to represent scaling and heterogeneity. Peters-Lidard et al. (2017) propose the idea that the approximations in our models can be treated as hypotheses that can be tested in an information-based framework. Such advances in model evaluation methods will be critical in order to accelerate advances in process-based hydrologic models.
The second question posed by Freeze and Harlan (1969) focuses on the availability of data to define system properties (model parameter values).
A key part of this modeling challenge revolves around the availability and quality of spatial information on model parameters. For some model parameters, spatial information is not readily available. Examples of missing parameters include those that define the temporal decay of snow albedo and the recession characteristics of shallow aquifers. In such situations process-based hydrologic and land models often treat these uncertain parameters as physical constants, adopting hard-coded parameters that are selected based on order-of-magnitude considerations or on limited experimental data (Mendoza et al., 2015; Cuntz et al., 2016). For other parameters the available spatial information is limited to broad landscape characteristics, e.g., the parameters controlling carbon assimilation and stomatal conductance are typically tied to vegetation type (Bonan et al., 2011; Niu et al., 2011), or the available soil maps impose the same hydraulic properties over large areas (Miller and White, 1998). Such ill-defined information on vegetation and soils greatly underestimates the tremendous spatial heterogeneity that occurs in nature. Finally, when spatial information does exist it may have limited spatial representativeness and relevance – for example, the information on hydraulic conductivity from soil pits may only have weak relations with the transmission of water throughout catchments (Beven, 1989).
Such limitations notwithstanding, the challenge, really, is to make the most of the information we do have, and generate new information where we can (e.g., new observations), in order to improve estimates of the spatial variations in the storage and transmission properties of the landscape, including the scale dependence of these properties and their transferability across spatiotemporal scales (Klemeš, 1986; Samaniego et al., 2010; Melsen et al., 2016). The next section summarizes how the hydrologic modeling community is rising to this challenge.
The solutions to improving information on model parameters are general and can be applied across multiple models of different type and complexity. We see three specific paths forward. First, there are numerous opportunities to improve information on geophysical properties, including estimates of vegetation structure (Simard et al., 2011), soil depth (Pelletier et al., 2016), soil properties (Chaney et al., 2016b), bedrock depth and permeability (Fan et al., 2015), and the physical characteristics of rivers (Gleason and Smith, 2014).
Second, it is possible to improve the way that geophysical information is used to estimate model parameters. For example, the multiscale parameter regionalization (MPR) approach of Samaniego et al. (2010) focuses attention squarely on the transfer functions that relate geophysical attributes to model parameters – Samaniego et al. (2010) apply transfer functions on the finest spatial scale of the geophysical data (e.g., the soil polygons) and then apply parameter-dependent operators to upscale the fine-scale model parameters to the resolution of the model. The parameter estimation in MPR is hence centered on the coefficients in the transfer functions used to relate geophysical attributes to model parameters, maximizing the information extracted from the geophysical data. Much research has focused on pedotransfer functions to relate soil properties to soil parameters (e.g., Schaap et al., 2001; Soet and Stricker, 2003), and there has been limited work to relate geophysical attributes to other model parameters such as those controlling the impact of soil moisture on saturated areas (Balsamo et al., 2011).
Third, there is considerable scope to improve the way that multivariate data
are used to constrain model parameter values. A key path forward is to
identify different signatures from the data that can be used to improve
parameter values in different parts of the model (Gupta et al., 2008; Yilmaz
et al., 2008; Pokhrel et al., 2012; Vrugt and Sadegh, 2013; Rakovec et al.,
2015). For example, Troy et al. (2008) use regionalized estimates of the
precipitation
A very different solution is stochastic modeling (e.g., see Kim et al., 1997). Stochastic modeling accepts that many parameters are impossible to measure or estimate and instead generates synthetic model parameter fields using probability distributions with assumed length scales. For example, Maxwell and Kollet (2008) use spatially correlated random fields of saturated hydraulic conductivity to define the fine-scale spatial structure of their model domain, and evaluate the impact of this fine-scale structure on hillslope runoff. Similar approaches were used by Kollet et al. (2010) in their proof-of-concept study illustrating the spatial integration of fine-scale 3-D variably saturated flow simulations. These approaches derive from the indirect upscaling methods (numerical integration across space) developed in stochastic subsurface hydrology. The downside of such stochastic simulation approaches is that multiple realizations are necessary to separate the signals from the imposed random variability, making such approaches computationally challenging for fine-scale simulations over large geographical domains (Fatichi et al., 2016).
A major challenge is the parameterization of the deeper subsurface on regional to continental scales in order to support large-domain groundwater modeling (Bierkens, 2015; Clark et al., 2015a). Advances in estimating parameters of the subsurface may profit from new technologies. For example, it will be possible to use monitoring and exploration technologies (e.g., geophysics) to generate ensembles of hydraulic conductivity fields. Once these fields are estimated at high resolution, MPR could be used to estimate effective hydraulic conductivity values to characterize the required subsurface parameters. Also, stochastic methods need to be extended to capture the large structural variability in the formations and layers that dominate continental domains (Baroni et al., 2017; Schalge et al., 2016).
Recent attempts to parameterize the subsurface are a good first step. These include maps of global permeability and porosity for the upper 50 m of the world's aquifers (Gleeson et al., 2014), soil characteristics and regolith thickness (Pelletier et al., 2016; Shangguan et al., 2016; Hengl et al., 2017), and global thickness of the upper aquifers (de Graaf et al., 2015; Fan et al., 2015; Fan, 2016). However, these datasets have been globally extrapolated from locally established empirical relationships between subsurface properties and surface lithology (Hartmann and Moosdorf, 2012). None of these approaches resolve the multilayer structure of aquifers and aquitards. As a consequence, they provide useful information on the interaction between groundwater and evaporation, but have limited use for resolving true hydrogeological challenges such as assessing global groundwater depletion, groundwater age, and land subsidence related to groundwater pumping. Concerted efforts are needed to compile a global hydrogeological multilayer model based on national geological maps and archives and local- and regional-scale groundwater modeling studies, providing the kind of rich information on the subsurface that already exists for soils.
In their final question Freeze and Harlan (1969) ask if the computer limitations of storage capacity and speed of computation have been overcome, and if their blueprint for process-based hydrologic modeling is now economically feasible. Interestingly, we have made substantial (and economically feasible) advances in computing, yet we have also pushed beyond what they could envision with model resolution and process complexity. As a result, computing remains, ironically, a present-day challenge, and we still routinely push available computing resources to their limit (Kollet et al., 2010; Wood et al., 2011). We still struggle with tradeoffs between process complexity, spatial complexity, domain size, ensemble size, and the time period of the model simulation. We also still struggle to run our most complex models for a large number of model configurations, for example, experimenting with different model parameter sets, different process parameterizations, and different spatial architectures. To answer Freeze and Harlan's question: the computing limitations have not been overcome.
The challenge is as follows: as we push our models to their computational limit, the expense of these complex configurations may permit only a single deterministic simulation for a short time period (e.g., Maxwell et al., 2015; Fatichi et al., 2016). Such preferences for complexity and large-domain simulations arguably sacrifice opportunities for model analysis, model improvement, and uncertainty characterization. Complex models may struggle with physical realism because computational limitations constrain capabilities to identify and resolve model weaknesses. Paradoxically, more complex models may achieve less physical realism than computationally frugal alternatives. This is a critical concept – though a counterintuitive one – that ideally should guide the development of new model applications.
There are several solutions to these computational challenges, all of which are now being advanced by leading process-based hydrologic modeling groups. The first solution, and the most obvious, is to exploit advances in massively parallel (e.g., exa-scale) computation (Kollet et al., 2010; Wood et al., 2011; Paniconi and Putti, 2015; Fatichi et al., 2016). This solution is often implemented by running a complex model at the finest grid resolution possible over the domain of interest (e.g., Maxwell et al., 2015; Maxwell and Condon, 2016). A key reason for conducting such spatially resolved simulations is to understand explicit spatial controls on hydrologic processes. For example, Maxwell and Condon (2016) use high-resolution continental-domain ParFlow simulations to understand the controls of groundwater flow on the partitioning of evapotranspiration into bare soil evaporation and transpiration.
A second (related) solution to the computing challenge is to improve numerical solvers. In simpler models the need for robust numerical methods is often undervalued, and numerical errors in simple models contaminate model analysis and complicate model calibration (Kavetski et al., 2006b; Kavetski and Clark, 2010, 2011). For example, the “pits” in model parameter surfaces have been shown to be an artifact of numerical solution methods, requiring development of elaborate and time-consuming parameter estimation strategies that are not necessary in models with robust numerical solutions (Kavetski et al., 2006a; Clark and Kavetski, 2010; Kavetski and Clark, 2010). In more complex models, advances in solution methods are an active area of research, with several recent advances in numerical solvers and parallelization strategies (Qu and Duffy, 2007; Kumar et al., 2009; Kollet et al., 2010; Maxwell, 2013). Across all models there is a need to improve numerical solution methods (e.g., evaluate accuracy–efficiency tradeoffs) to support efficient model analysis and calibration strategies.
A third solution to the computing challenge is to identify model configurations that avoid redundant calculations while still capturing dominant processes. This can be accomplished using the concept of hydrologic similarity, i.e., recognizing that there is no need to repeat calculations for areas of the landscape with very similar forcing and geophysical properties (e.g., Flügel, 1995; Tague and Band, 2004). As noted earlier, recent applications of hydrologic similarity methods have shown that it is possible to reduce run times by 2 to 3 orders of magnitude, without any loss in information content (Newman et al., 2014; Chaney et al., 2016a). Also, hydrologic similarity concepts can be effectively applied using multiscale methods to resolve the dominant spatial gradients that drive flow; for example, using representative hillslopes to explicitly resolve lateral flow processes (Troch et al., 2003; Berne et al., 2005; Hazenberg et al., 2015; Ajami et al., 2016). In exploring these solutions, we recognize that there is not necessarily a tradeoff between physical realism and computational efficiency – the linkage between spatial complexity and process complexity may be rather weak, as models that are run using a large number of spatial elements may still miss dominant processes (e.g., Hartmann et al., 2017).
Schematic summarizing some key research priorities to advance the physical realism of process-based hydrologic models. Note that the framework accommodates multiple models (model 1, …, model n).
A fourth solution to the computing challenge, especially the concern that the computational cost of complex models sacrifices opportunities for analysis, is to focus on improving model analysis methods. Analysis of complex models is possible by developing surrogate models, i.e., models that emulate the behavior of complex models and run very quickly (Razavi et al., 2012). Analysis of complex models is also possible through computationally frugal model analysis methods that require fewer model simulations (Rakovec et al., 2014; Hill et al., 2015). A way to support these types of methods is to use quasi-scale invariant parameterizations (e.g., MPR) to estimate transfer function parameters at coarser resolutions instead of using a high-resolution model setting. Since parameters obtained with the MPR technique are transferable across scales without significant performance loss, models can be applied at higher spatial resolutions as shown by Kumar et al. (2013). This alternative would lead to computationally efficient large-scale hydrologic predictions and allows for the performance of parameter estimation over large domains.
In short, solving computing challenges will require judiciously combining emerging computing capabilities, advanced numerical methods, justifiable model simplifications, and extensive use of computationally frugal model analysis methods.
In this paper, we review key advances in process-based hydrologic models. We see that the community has risen to meet major hydrologic modeling challenges in diverse and productive ways. Specifically, the community has made noteworthy advances in improving mathematical descriptions of hydrologic processes, in parameter estimation, and in identifying justifiable model simplifications that make more effective use of available computing resources. Many of these modeling advances are general, and can be applied across multiple models of different type and complexity.
To summarize, there are three general opportunities to improve the physical realism of hydrologic models. First, there is still considerable scope to improve mathematical descriptions of hydrologic processes. A major research challenge is the scaling and closure problem, i.e., how to represent how small-scale heterogeneities shape large-scale fluxes, interactions among processes, and the connectivity of water across the landscape. While the hydrological modeling community has made progress in this challenge, through statistical–dynamical models, stochastic upscaling theory, scale-appropriate flux parameterizations, and spatial integration across discrete landscape types, much work is still required both to develop new closure schemes and to systematically compare existing modeling approaches. Second, there is considerable scope to improve information on model parameter values and their associated uncertainties. Advances in parameter estimation will require a focused effort to improve the available geophysical information (e.g., through improved observations), improve the links between geophysical information and model parameters, and advance methods to use multivariate data to constrain model parameter values. Third, there is a strong need to more effectively use the available computing resources. We argue here that in addition to exploiting advances in massively parallel computation and numerical solution methods, we can also make much more effective use of the available computing through more efficient and/or agile models (e.g., use of hydrologic similarity concepts). More effective use of available computing resources can increase capabilities for model analysis and uncertainty characterization, and shine the light toward further model improvements.
We see several specific needs underlying these general research themes (see
Fig. 1 for the general framework):
We need to improve the theoretical underpinnings of our hydrologic models
(Clark et al., 2016). Most discussions of intermodel differences focus on a
discussion of algorithms rather than a discussion of processes. While there
have been some calls in the past to improve the “dialog” between
experimentalists and modelers (Seibert and McDonnell, 2002) – e.g., to focus
more on processes – much of the interaction between experimentalists and
modelers is focused on individual watersheds (e.g., Tromp-van Meerveld and
Weiler, 2008; Hopp and McDonnell, 2009). More work is needed to synthesize
process explanations from research watersheds to develop more general
theories of hydrologic processes (e.g., Tetzlaff et al., 2009) and test
these alternative process descriptions in models. We need to expand our prominence in community hydrologic modeling (Wood et
al., 2005; Weiler and Beven, 2015), both by providing accessible and
extensible modeling tools, and also providing key research datasets and model
test cases to scrutinize alternative modeling approaches. Such community
activities will result in greater engagement of field scientists in model
development and greater collaboration across diverse modeling groups,
resulting in substantial improvements in the physical realism and predictive
capabilities of hydrologic models. Advancing such community activities
requires that we are much more effective and efficient in sharing data and
model source code. This goes beyond just making models and data publicly
available by, critically, integrating models and data in widely used
analysis frameworks and developing model standards to simplify the sharing of
source code in models developed by different groups (Clark et al., 2015b, 2016). We need to systematically and comprehensively explore the benefits of
competing modeling approaches (Clark et al., 2015a, b, 2016). A key need is
to systematically evaluate information gains and losses using models of
varying complexity, exploring the interplay between changes in process
complexity and changes in spatial complexity. These assessments will help
identify useful model configurations for specific applications. Another need
is to scrutinize models using data from research watersheds, using data on
both internal states or fluxes and intervariable relationships, in order to
understand the benefits of competing process parameterizations. More generally, and as emphasized by Peters-Lidard et
al. (2017), it is important to use applications of information theory to
quantify how effectively models use the available information, i.e., to
provide an estimate of system predictability and identify opportunities to
improve models. We need to substantially advance the development of new modeling approaches
that simulate the temporal dynamics of environmental change. Key challenges
include predicting how energy gradients dictate landscape evolution, how
natural selection favors plants that make optimal use of the available
resources, and how the dynamic interactions between humans and the
environment shapes the storage and transmission of water across the landscape
(Rodríguez-Iturbe et al., 1992; Eagleson, 2002; Schymanski et al., 2009,
2010; Sivapalan et al., 2012; Harman and Troch, 2014; Zehe et al., 2014;
Clark et al., 2016; Grant and Dietrich, 2017). We must advance research on process-oriented approaches to estimate spatial
fields of model parameters. The challenge is to estimate spatial variations
in the storage and transmission properties of the landscape. Advances are
possible through the development of new data sources on geophysical attributes
(Simard et al., 2011; Gleason and Smith, 2014; Fan et al., 2015; Chaney et
al., 2016b; Pelletier et al., 2016; De Graaf et al., 2017), new approaches to
link geophysical attributes to model parameters (Samaniego et al., 2010;
Kumar et al., 2013; Rakovec et al., 2015), and new diagnostics to infer model
parameters (Gupta et al., 2008; Yilmaz et al., 2008; Pokhrel et al., 2012).
Such focus will give the parameter estimation problem the scientific
attention that it deserves, rather than the far-too-common approach where
parameter estimation is relegated to a “tuning exercise” in model
applications. This focus on parameter estimation is necessary to improve the
physical realism and applicability of process-based models. We need to obtain better data on hydrologic processes. Field campaigns to
obtain new data to understand hydrologic processes are less supported and
supportable than before (Tetzlaff et al., 2017); thus a key need is to
motivate and design new field experiments to advance understanding of the
terrestrial component of the water cycle across scales and locations. Such
work is critical to ensure model development is not unduly constrained by the
limited experimental field data that we have at present. We need to advance methods for model analysis, especially for complex models.
As mentioned above, analysis of complex models is possible by both
(a) developing surrogate models, i.e., models that emulate the behavior of
complex models and run very quickly (Razavi et al., 2012); and (b) applying
computationally frugal model analysis methods that require fewer
model simulations (Rakovec et al., 2014; Hill et al., 2015). These advances
in model analysis are important because complex models are typically
calibrated or analyzed using semimanual or manual strategies, largely due to
their immense computational cost (it is only possible to run a handful of
simulations). We have very little insight into process–parameter dominance
and process–parameter interactions in very complex models; however, such
information is desperately needed in order to inform meaningful parameter
estimation strategies. Finally, and most importantly, we need to improve the construction of
hydrologic models. Many of today's models have developed somewhat of a
“shantytown” appearance, where a succession of students and post-docs
bolted on new components to suit the needs of their particular project, and
the overall construction of the model has become rather messy. Clark et
al. (2015b) define some key requirements as follows: (a) impose modularity at the
level of the individual fluxes, to enable greater model extensibility and
code reuse, as it is straightforward to combine different flux
parameterizations to form alternative conservation equations; (b) separate
the physical processes from their numerical solution, to enable
experimentation with alternative numerical solution methods, e.g., evaluating
accuracy–efficiency tradeoffs; and (c) use hierarchal data structures, to
enable representation of spatial variability and connectivity across a range of
spatial scales. Such improvements in model construction are a critical
underpinning activity that is critical to accelerate advances in hydrologic
science.
In addressing these research tasks it is important to take a unified
perspective – it is important to deliberately depart from previous debates
on the “correct” approach to hydrologic modeling, and focus instead on more
effective use of the diversity of modeling tools to advance our collective
quest for physically realistic hydrologic models.
No data sets were used in this article.
The authors declare that they have no conflict of interest.
This article is part of the special issue “Observations and modeling of land surface water and energy exchanges across scales: special issue in Honor of Eric F. Wood”. It is a result of the Symposium in Honor of Eric F. Wood: Observations and Modeling across Scales, Princeton, New Jersey, USA, 2–3 June 2016.
We thank John Ding, Murugesu Sivapalan, Thorsten Wagener, Eric Wood, and an anonymous referee for their constructive comments on an earlier version of this paper. Edited by: Dennis Lettenmaier Reviewed by: Thorsten Wagener, Murugesu Sivapalan, Eric Wood, and one anonymous referee