Interactive comment on “Replication of ecologically relevant hydrological indicators following a covariance approach to hydrological model parameterisation”

Abstract. Hydrological models can be used to assess the impact of hydrologic alteration on the river ecosystem. However, there are considerable limitations and uncertainties associated with the replication of the required, ecologically relevant hydrological indicators. Vogel and Sankarasubramanian's covariance approach to model parameterisation represents a shift away from the traditional calibration-validation goodness-of-fit paradigm. Using the covariance structures of the observed input and simulated output time-series, the region of parameter space which best captures (replicates) the characteristics of a hydrological indicator may be identified. Through a case study, a modified covariance approach is applied with a view to replicating a suite of seven ecologically relevant hydrological indicators. Model performance and consistency are assessed relative to four comparative studies. The ability of the approach to address the limitations associated with traditional calibration-validation is further considered. Benefits of the approach include an overall reduction in model uncertainty whilst also reducing overall time-demands. Difficulties in the replication of complex indicators, such as rate of change, are in line with prior work. Nonetheless, the study illustrates that consistency in the replication of hydrological indicators is achievable; additionally, the replication of magnitude indices is markedly improved upon.



Introduction
Water is the most essential natural resource (World Water Assessment Programme, 2009; Vörösmarty et al., 2010). The principal sources of freshwater, rivers, lakes and groundwater, hold only 0.7% of the water on the planet (Shiklomanov, 1993).
Rivers support prosperity, health, and well-being through the provision of ecosystem services; examples include water security, energy production, hydro-hazard regulation, and water purification (Gilvear et al., 2017). The river flow regime is considered the major determinant in the structuring and functioning of the riverine ecosystem and provision of these services (Poff et al., 1997). Despite this, relentless pressures from both societal water demand and climate change raise significant questions over the long-term sustainability of this resource (Gleick, 1998, 2016; Klaar et al., 2014).
The need to balance the conflicting demands of both human society and those of the ecosystem has led to recent environmental flows research. Environmental flows are defined as: "…the quantity, timing, and quality of water flows required to sustain freshwater and estuarine ecosystems and the human livelihood and well-being that depend on…" (Brisbane Declaration, 2007). Richter et al. (1996) identified five facets of the flow regime required to support the riverine ecosystem: magnitude, frequency, duration, timing and rate of change. To date, over 200 ecologically relevant hydrologic indices (HIs) have been proposed (Olden and Poff, 2003; Monk et al., 2006; Thompson et al., 2013). Determined from flow time-series simulated via a hydrological model, these HIs may be used to assess the impact of future hydrological change on the river ecosystem; for examples, see Richter et al. (1996), Carlisle et al. (2010), Poff and Zimmerman (2010), Murphy et al. (2012), You et al. (2014), De Girolamo et al. (2017) and Williams (2017). Example applications include understanding the effect of a changed climate or engineering intervention, and establishing environmental flow limits.
A hydrological (rainfall-runoff) model is essentially a simplification of the hydrologic system: hydroclimatological variables, such as temperature and precipitation, are used to estimate river flow. The paramount aim of hydrological modelling is to determine a suitable model structure and corresponding parameter set that provides a realistic representation of the hydrological processes in the catchment of interest (Seibert, 2000; Beven, 2012a). Structural deficiencies, as opposed to inadequate calibration (Beven, 2010; Beven, 2012b), should be the principal cause for model rejection (Westerberg et al., 2011).
Traditionally, hydrological models are parameterised following a calibration-validation approach based on Klemeš's (1986) split-sample technique (Vogel and Sankarasubramanian, 2003), often with multiple calibration-validation trials. The calibration of the hydrological model focuses on the goodness-of-fit (GOF) between the observed and simulated flow time-series for a defined objective function; the Nash-Sutcliffe model efficiency criterion (NSE) is among the most widely used. The sensitive nature of these traditional measures of GOF (objective functions) has been addressed through the consideration of modified NSE criteria and multi-criteria calibration (for example, Gupta et al., 1998; Seibert, 2000; Efstratiadis and Koutsoyiannis, 2010; Pushpalatha et al., 2012). However, despite improvements, certain problems remain, including (Westerberg et al., 2011): (1) the potential for bias in the model parameterisation as a result of measurement error (due to poor accuracy and/or calibration) and uncertainty in the flow data (Pelletier, 1988; Montanari et al., 2013); (2) the arbitrary nature of the GOF behavioural thresholds; and (3) the problem of equifinality, where similar GOF may be achieved across multiple calibration trials, but with different parameter sets (Beven, 2006).
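For reference, the standard form of the NSE can be written in a few lines. The following is a minimal Python sketch, not code from the study (which used R); the function name and the example values are hypothetical. It illustrates why the criterion is sensitive to individual large errors: residuals are squared before being summed.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared residuals
    to the squared deviations of the observations from their mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - residual / variance


# Hypothetical flows (illustrative values only).
obs = [1.0, 2.0, 3.0, 4.0]
perfect_fit = nse(obs, obs)            # a perfect simulation scores 1
mean_benchmark = nse(obs, [2.5] * 4)   # simulating the observed mean scores 0
one_bad_day = nse(obs, [1.0, 2.0, 3.0, 8.0])  # a single large error dominates
```

A score of 1 indicates perfect agreement, 0 indicates a simulation no better than the observed mean, and negative values indicate a simulation worse than that benchmark.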
The use of hydrological models to determine HIs implicitly premises that the underlying hydrological processes of the catchment are sufficiently captured. Where this premise proves false, it directly impacts accuracy in the HIs, leading to high levels of variability (Shrestha et al., 2014, 2016; Vis et al., 2015; Pool et al., 2017). For example, Shrestha et al. (2014) evaluated the ability of the VIC hydrologic model to replicate a number of HIs; HIs relating to annual and peak flows were simulated well whilst minimum flows and flow pulses were not. A focus on the characteristics of the flow regime, or hydrological signatures, has been shown to limit the influence of input uncertainties on the performance and consistency of hydrological models (Westerberg et al., 2011; Euser et al., 2013). Vogel and Sankarasubramanian's (2003) covariance approach to model parameterisation without calibration addresses many of these problems. The objective is to identify the region of parameter space which captures (replicates) the characteristics of a specified HI. This is achieved by focussing on the ability of the hydrological model to capture the observed covariance structure of the input and output time-series. Presently, the approach is limited by its focus on a single HI, preventing its use for the determination of a suite of ecologically relevant HIs. This paper builds on the covariance approach, adapting the methodology to consider multiple ecologically relevant HIs. To determine the applicability of this modified covariance approach, the method is applied to a case study using the four-parameter hydrological model GR4J. The modelling objective is the replication of seven ecologically relevant HIs (as part of a larger work determining the impact of climate change on instream hydroecological response; Visser et al., 2018b). The performance and consistency of the modified covariance approach are evaluated in terms of the replication ability of the hydrological model and with reference to prior studies with similar modelling objectives (Shrestha et al., 2014; Vis et al., 2015; Shrestha et al., 2016; Pool et al., 2017).
Three specific research questions are answered in this paper:
1) Is the modified covariance approach able to satisfactorily replicate a suite of hydrological indicators? (With regards to performance and consistency; definitions below.)
2) How do the outcomes (replication of the ecologically relevant hydrological indicators) compare with those studies with similar modelling objectives?
3) Does the covariance approach advance progress towards addressing the limitations inherent in traditional hydrological model calibration (as described above)?
Throughout this paper we refer to model performance and consistency. After Euser et al. (2013), model performance is defined as the ability to mimic the behaviour of catchment hydrological processes; consistency represents the ability of the hydrological model to reproduce the suite of HIs (i.e. multiple hydrological signatures simultaneously using the same parameter set).

Study area
The River Nar is a chalk stream located in Norfolk, south-east England. With two distinct river units, the River Nar has been designated a Site of Special Scientific Interest (SSSI); the upper Nar overlies a chalk scarp to Marham, whilst the lower alluvial reach forms a fen basin. Despite its high conservation value, the River Nar is subject to significant pressures, inhibiting the ecological potential of the river (NRT, 2012). The river has been subject to continuing research into its flow regime and its governing factors (Garbe et al., 2016; Visser et al., 2017, 2018a).
The focus of this paper is on the 153.3 km² upper catchment (chalk reach) only. Flow is primarily sustained by springs at West Lexham and near Castle Acre (Fig. 1), and upstream of Lexham, through groundwater seepage and surface water runoff; these upstream reaches are considered particularly vulnerable to low flows (Sear et al., 2005). With a highly seasonal flow regime, the hydrology of the River Nar is characteristic of pure chalk streams (Sear et al., 2005); aquifer recharge occurs in the autumn months, with a progressive rise in flow through to March/April. Flow is relatively low: over the available 1961-2015 record the median flow is 1.11 m³ s⁻¹, whilst Q10 and Q90 flows are 1.96 and 0.47 m³ s⁻¹ respectively. As of September 2017 (the most recent data currently available), the year 1991 saw the most extreme hydrological drought recorded at the Marham gauge (Garbe et al., 2016), with flow falling below Q95 values for 178 consecutive days.

Hydrological model
The principle of parsimony, known as Occam's razor, posits that a solution should be no more complex than necessary. In the context of hydrological modelling, model simplicity relative to performance is thus key (Kokkonen and Jakeman, 2002; Perrin et al., 2003; Beven, 2012a). To this end, GR4J, a four-parameter model from the GR-J series of hydrological models, was selected (Perrin et al., 2003). The GR-J series of models has been applied in a variety of hydrological contexts. The model GR4J is a lumped model based on soil moisture accounting (Fig. 2). The model inputs, P, the catchment rainfall depth, and E, the average depth of (potential) evapotranspiration, fill the production store, which has a capacity of x1 mm. The routed depth of water, Pr, is determined by the rate of percolation, F(S, x1), as well as water in excess of the storage capacity.
To simulate the time difference between rainfall event and flow peak, Pr is divided into two flow components and routed through unit hydrographs, time base F(x4) days. Finally, the groundwater exchange term gw, F(x2), acts on the routed, Qr, and direct flow, Qd, components; a positive value indicates inflow from groundwater whilst a negative value represents water export.
The total flow, Q, is determined by summing the routed and direct flows. The model is applied using the R package airGR (Coron et al., 2017; Coron et al., 2018). Parameter limits are summarised in Table 1.

Data
Continuous (daily) time-series of mean flow, precipitation and potential evapotranspiration for the period 1961-2015 serve as model input. Flow data from the Marham gauge (Fig. 1) were provided by the National River Flow Archive (CEH, 2018). The required climate data were computed from daily average rainfall and hourly temperature recordings at 5 MIDAS stations (Fig. 1; Met Office, 2016); potential evapotranspiration was estimated using a temperature-based PE model (Oudin et al., 2005). The ecologically relevant HIs were determined in Visser et al. (2018b) as part of the development of a hydroecological model following an Information Theory (IT) approach (Visser et al., 2018a). The selected HIs are summarised in Table 2 along with their relative importance (according to IT). These seven were selected from a set of 63 ecologically relevant HIs, based on Olden and Poff (2003), Monk et al. (2006) and Thompson et al. (2013). To reflect seasonality in the flow regime, the indices are differentiated by season (Table 2): winter (October-March) and summer (April-September).

Covariance approach
The covariance approach was first developed by Vogel and Sankarasubramanian (2003), where the aim was to replicate a HI rather than the flow time-series. Here, modification of the covariance approach allows for the consideration of a suite of ecologically relevant HIs. The modified covariance approach is implemented over three stages (Fig. 3).
Stage 1: The complete parameter space of the hydrological model was sampled; the number of parameter sets considered is dependent upon the number of free parameters in the hydrological model and the accepted level of uncertainty. To address the issue of parameter sensitivity (Tong and Graziani, 2008; Wu et al., 2017), the parameter space was sampled uniformly based on Sobol quasi-random sequences (a Quasi-Monte Carlo method). Here, 100,000 independent parameter sets were selected.
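The idea of quasi-random sampling of a bounded parameter space can be sketched briefly. The Python example below is illustrative only: it substitutes a Halton sequence for the Sobol sequence used in the study, because a Halton sequence needs only a few lines of standard-library code, and the parameter bounds are hypothetical placeholders rather than the limits of Table 1.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_parameter_sets(n_sets, bounds, primes=(2, 3, 5, 7)):
    """Return n_sets parameter dictionaries spread quasi-uniformly over bounds.

    Stand-in for Sobol sampling: one prime base per parameter dimension.
    """
    names = list(bounds)
    if len(names) > len(primes):
        raise ValueError("provide one prime base per parameter")
    samples = []
    for i in range(1, n_sets + 1):
        point = {}
        for name, base in zip(names, primes):
            low, high = bounds[name]
            point[name] = low + radical_inverse(i, base) * (high - low)
        samples.append(point)
    return samples


# Hypothetical bounds for the four GR4J parameters (NOT the values in Table 1).
HYPOTHETICAL_BOUNDS = {
    "x1": (10.0, 1500.0),  # production store capacity, mm
    "x2": (-5.0, 5.0),     # groundwater exchange coefficient, mm per day
    "x3": (10.0, 500.0),   # routing store capacity, mm
    "x4": (0.5, 4.0),      # unit hydrograph time base, days
}
parameter_sets = halton_parameter_sets(1000, HYPOTHETICAL_BOUNDS)
```

Unlike pseudo-random draws, low-discrepancy sequences of this kind fill the parameter space evenly, which is the property the Sobol sampling relies on.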
Stage 2: For each parameter set, flow time-series were simulated based on the observed climate data.For each of these flow time-series, a corresponding set of covariances (between observed climate and simulated flow) and HIs was computed.The observed covariance and HIs are also determined.
Stage 3: Before a parameter set was selected and evaluated, it was necessary to determine if the observed moments lie within the bounds of the simulated moments (sampled parameter space). This was facilitated through plots (Fig. A2) of the observed and simulated moments. Selection of a model parameter set was based on a specified limit of acceptability, i.e. the ability to replicate, or minimise the error between, the observed and simulated covariance structures and HIs. In Vogel and Sankarasubramanian (2003) the focus was on the replication of a single index. Here, the objective was the replication of multiple indices; to account for this, a limit of acceptability was specified per index, with the indices assigned maximum error thresholds based on their relative importance (Fig. 4). The index importance (Table 2) was normalised (rescaled to a range from zero to one), allowing the covariances to be assigned a relative importance of one (equal to the most important index). The limits of acceptability were determined through the linear relationship between the relative importance and a user-specified allowable error range (minimum and maximum; see Fig. 4). Parameter sets which failed to meet this limit of acceptability were rejected. Here, the minimum and maximum error were specified as 17.5% and 35% (twice the minimum) respectively; see also Table A1 in the appendix.
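The selection rule in Stage 3 can be illustrated with a short Python sketch. It is not the authors' implementation: the importance weights and percentage errors are hypothetical, and it assumes that the most important index (relative importance of one) receives the minimum allowable error of 17.5% while the least important (relative importance of zero) receives the maximum of 35%, with a straight line in between.

```python
def normalise(importance):
    """Rescale importance weights to the range zero to one."""
    low, high = min(importance.values()), max(importance.values())
    if high == low:
        raise ValueError("importance weights must not all be equal")
    return {name: (value - low) / (high - low) for name, value in importance.items()}


def error_limit(relative_importance, min_error=17.5, max_error=35.0):
    """Allowable absolute percentage error for a given relative importance.

    Assumption: the limit falls linearly from max_error (importance 0)
    to min_error (importance 1).
    """
    return max_error - relative_importance * (max_error - min_error)


def is_acceptable(percent_errors, limits):
    """A parameter set is retained only if every error is within its limit."""
    return all(abs(percent_errors[name]) <= limits[name] for name in limits)


# Hypothetical importance weights (the real values are given in Table 2).
importance = {"10R90Log": 5.0, "RevPos": 4.0, "riseMn": 1.0}
limits = {name: error_limit(r) for name, r in normalise(importance).items()}
limits["covariance"] = error_limit(1.0)  # covariances share the top importance

# Hypothetical percentage errors for two candidate parameter sets.
candidate_a = {"10R90Log": 12.0, "RevPos": -20.0, "riseMn": 34.0, "covariance": 5.0}
candidate_b = {"10R90Log": 18.0, "RevPos": -20.0, "riseMn": 34.0, "covariance": 5.0}
accepted = [is_acceptable(c, limits) for c in (candidate_a, candidate_b)]
```

Under these assumptions the first candidate is retained and the second is rejected, because its error on the most important index exceeds 17.5%.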

Model evaluation
The performance and consistency of the modified covariance approach were evaluated with reference to the seven HIs (Table 2). The seven HIs were calculated annually for both the observed and simulated flow datasets over the 54-year period. To allow comparison with the prior studies (Shrestha et al., 2014, 2016; Vis et al., 2015; Pool et al., 2017), this study applied the same (non-parametric) evaluation metrics (Table 3). Two additional measures, Cramér-von Mises and the mean arctangent absolute percentage error (MAAPE), were considered to address limitations associated with certain metrics; see Table 3 for discussion. The modified covariance approach is considered relative to the above-mentioned studies; an overview of these works is provided in Tables B1 and B2.

Table 3 (extract). Evaluation metrics.
Spearman correlation: A correlation coefficient. The strength of the linear relationship between two variables. Nonparametric, applicable when data is not normally distributed. R function: stats::cor(…, method = "spearman").
Hydrologic alteration factor (HAF): A factor developed as part of the Indicators of Hydrologic Alteration (Mathews and Richter, 2007). Tests the replicability of sections of the probability distribution (lower-tail, IQR and upper-tail) for a given index.

Model parameters
The relationship between the covariance of the input and output time-series and each HI, for all 100,000 simulations, is summarised in Fig. A2. The observed moments can be seen to lie within the simulated moments, validating the use of the GR4J hydrological model. The parameters of the production (x1) and routing (x3) store capacities were estimated as 511 and 311 mm respectively; time elapsed for the routing of the flow is 1.17 days (x4). A positive groundwater exchange coefficient (x2) of 2.84 mm per day represents the inflow from the chalk aquifer. The percentage error for the covariances and HIs are where the error is equal to the upper threshold of +35%.

Model evaluation
The model is evaluated with reference to the distribution of each HI as well as the evaluation metrics used in Shrestha et al. (2014, 2016), Vis et al. (2015) and Pool et al. (2017) (Table 3). The ability of the model to replicate the seven HIs is considered in terms of performance and consistency.
The distribution of the HIs is presented in Fig. 5. In the case of the empirical cumulative distribution functions (ECDF; top row), the level of agreement between the observed and simulated HIs is indicative of good overall performance. However, the probability density functions (PDF; bottom row) indicate a lack of consistency across the indicator RevPos, and the low flow indicators (Q70Q50, Q80Q50 and Q90Q50). With the tails being well replicated, issues lie principally within the central distribution. For RevPos this is confirmed by a lack of correlation (Spearman; Table A2) and the statistical tests, where the null hypothesis was not rejected (observed-simulated HI do not agree in terms of mean and distribution).
A summary of the NSE values and measures of error is provided in Fig. 6; optimal values are indicated by the dashed line, and the numerical values are given in Table A2. With a maximum of 0.54, the NSE values are suggestive of relatively poor model performance, again highlighting difficulties in replicating the index RevPos. The metrics MARE and MAAPE suggest a more positive outcome; however, consistency across the HIs is lacking. In contrast to the prior results, MARE indicates that the index RevPos is well replicated, with riseMn exhibiting the poorest performance. Conversely, MAAPE, which is intended to reduce the bias of large error values (see Table 3), identifies riseMn as the best performing index, whilst 10R90Log is deemed worst.
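The contrasting behaviour of the two error measures can be seen in a small Python sketch. It assumes that MARE denotes the mean absolute relative error (the definition in Table 3 is not reproduced here) and uses the published form of MAAPE, the mean of the arctangent of the absolute relative error; the function names and data are hypothetical.

```python
import math


def mare(observed, simulated):
    """Mean absolute relative error (assumed definition); unbounded above."""
    return sum(abs(s - o) / abs(o) for o, s in zip(observed, simulated)) / len(observed)


def maape(observed, simulated):
    """Mean arctangent absolute percentage error; each term lies in [0, pi/2)."""
    return sum(math.atan(abs((o - s) / o)) for o, s in zip(observed, simulated)) / len(observed)


# Hypothetical index values: the last observation is small, so a modest
# absolute error produces a very large relative error.
obs = [1.0, 1.0, 1.0, 0.1]
sim = [1.1, 0.9, 1.0, 1.1]

mare_value = mare(obs, sim)    # dominated by the single relative error of 10
maape_value = maape(obs, sim)  # the arctangent caps that term below pi/2
```

Both functions require non-zero observed values. Because the arctangent bounds each term, a single extreme relative error cannot dominate MAAPE as it does MARE, which is one way the two measures can rank the same indices differently.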
The normalised error associated with each HI, a measure of the difference between observed-simulated values relative to the observed range, is summarised in Fig. 7. The range of values that the indices 10R90Log and RevPos can take is low (±0.25 and ±0.5 respectively); consequently, no inference can be made. In contrast to the ECDFs and PDFs, a high level of consistency is observed for the index logQVar and the low flow indicators (across mean, range, distribution and bias of the normalised error). A strong negative bias, which was not clear in the Fig. 5 distributions, is in evidence for the least important index (Table 2), riseMn.
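One plausible reading of the normalised error is sketched below in Python. The sign convention (simulated minus observed, so that underestimation appears as negative bias) and the use of the observed maximum minus minimum as the range are assumptions, not details confirmed by the text; the data are hypothetical.

```python
def normalised_errors(observed, simulated):
    """Difference between simulated and observed values, scaled by the observed range."""
    observed_range = max(observed) - min(observed)
    if observed_range == 0:
        raise ValueError("normalised error is undefined for a constant observed series")
    return [(s - o) / observed_range for o, s in zip(observed, simulated)]


# Hypothetical annual values of a single index.
obs = [2.0, 4.0, 6.0]
sim = [3.0, 4.0, 5.0]
errors = normalised_errors(obs, sim)
mean_bias = sum(errors) / len(errors)
```

Scaling by the observed range makes the errors comparable across indices with different units, but it also explains why little can be inferred for indices whose observed range is very narrow.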
The hydrologic alteration factor (HAF) is adapted from the IHA approach (Table 3). It is a measure of the simulated and observed frequencies of values within three target percentile ranges: 0-25th, 25-75th and 75-100th. As a measure of distribution, HAF is essentially a simplification of the distribution functions in Fig. 5. The acceptable range of HAF values is defined as ±0.33 (Mathews and Richter, 2007). HAF values for each HI are presented in Fig. 8.
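A minimal Python sketch of a HAF-style calculation follows. It assumes the factor is the simulated frequency minus the observed frequency, divided by the observed frequency, within ranges bounded by the observed 25th and 75th percentiles, and it places values equal to a boundary in the central range; both choices are assumptions, and the data are hypothetical.

```python
from statistics import quantiles


def category_counts(values, lower, upper):
    """Count values below, within (inclusive of the bounds) and above the range."""
    low = sum(1 for v in values if v < lower)
    high = sum(1 for v in values if v > upper)
    return {"0-25th": low, "25-75th": len(values) - low - high, "75-100th": high}


def hydrologic_alteration_factor(observed, simulated):
    """(simulated frequency - observed frequency) / observed frequency per range."""
    q25, _, q75 = quantiles(observed, n=4, method="inclusive")
    obs_counts = category_counts(observed, q25, q75)
    sim_counts = category_counts(simulated, q25, q75)
    return {
        name: (sim_counts[name] - obs_counts[name]) / obs_counts[name]
        if obs_counts[name] else float("nan")  # undefined if no observed values
        for name in obs_counts
    }


# Hypothetical annual index values.
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sim = [1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
haf = hydrologic_alteration_factor(obs, sim)
outside_acceptable_range = {name: abs(value) > 0.33 for name, value in haf.items()}
```

In this example the central range is reproduced exactly while both tails fall outside ±0.33. The treatment of values that coincide with a percentile boundary matters for indices that take integer values, since many observations can then sit exactly on a boundary.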
The central distribution (25-75th percentiles) of all HIs lies within this acceptable range, indicating good performance and consistency. Performance across the tails of the distributions is generally good (lying within or on the bounds of the acceptable range), though there is a lack of consistency in the direction of bias. With a HAF value of 1.82 (not pictured to preserve figure resolution), the positive bias in the index RevPos exhibits the largest such deviation. With bias at both tails, inconsistencies in the replication of the index riseMn are again highlighted.

Modified covariance approach
There is a clear need to understand the impact of hydrologic change on the river ecosystem. To assess this, hydrological models are used to simulate flow time-series from which HIs of ecological relevance are derived. In this study, a modification of Vogel and Sankarasubramanian's (2003) covariance approach was considered, with a focus on the replication of a suite of seven ecologically relevant HIs.
The first aim of this study was to determine whether this modified covariance approach is able to satisfactorily replicate the suite of HIs. The hydrological model was successfully parameterised with observed moments lying within the bounds of the simulated moments for all HIs. Overall, replication of the HIs was good. Indices related to magnitude were best replicated, whilst difficulties were observed in replicating rate of change and integer indices.
In terms of distribution (Fig. 5) and evaluation metrics (Table 3), the best performing and most consistent HIs are the measures of magnitude: logQVar, Q80Q50 and Q90Q50. This is a clear indication that the model can successfully replicate the variation in flow and quantiles (specifically low flows, Q80 and Q90). Measures of model performance associated with the remaining four indices present a conflicting picture, leading to a lack of clarity as to the full capacities of the hydrological model and relative success of the covariance approach. The two most important indices, 10R90Log and RevPos, are among the less well replicated; however, this may be due to their inherent complexity (discussed further below).

Comparison
A number of studies have investigated the ability of hydrological models to replicate ecologically relevant HIs. Comparing the outcomes of this work with those studies is the second objective of this study. The comparative studies, Shrestha et al. (2014, 2016), Vis et al. (2015) and Pool et al. (2017), follow a traditional calibration-validation approach, considering an array of objective functions and performance metrics (Table 3); see Appendix B for details. Comparison is made with reference to the facets of the flow regime, specifically magnitude and rate of change (Table 2).
Both Shrestha et al. (2014) and Vis et al. (2015) observed poor model performance in the replication of low flow HIs. Consistent with the literature (Westerberg et al., 2011; Pushpalatha et al., 2012), Shrestha et al. (2014) attribute this to the use of objective functions tuned to high-flow periods (i.e. NSE and volume error; Table B1). As shown in this study and Pool et al. (2017), this is largely redressed through explicit consideration of low-flow HIs in parameterisation of the hydrological model. Mean flows are similarly accounted for in Pool et al. (2017). Performance and consistency in the replication of high flow indicators was consistent across all studies, with the exception of Shrestha et al. (2016), where summer (June-September) high flows exhibit a distinct negative bias. No studies, this work inclusive, observed difficulties in replicating indicators related to flow variability directly. Whilst inconsistency in the replication of the rate of change indicator RevPos is clear, a lack of agreement in the evaluation metrics leads to difficulties in assessing the performance of riseMn. Such observations are found consistently across three out of the four studies: Shrestha et al. (2016) excluded frequency and rate of change indicators due to large negative bias observed in Shrestha et al. (2014), whereas Vis et al. (2015) saw inconsistencies across the calibration-validation and performance metrics (NSE and Spearman). Performance improvements were, however, seen in Pool et al. (2017) when the HI was considered as the objective function.
Of the studies considered, only Vis et al. (2015) and Pool et al. (2017) are directly linked to the outcomes of hydroecological modelling, replicating a suite of ecologically relevant HIs. In Vis et al. (2015) performance was inconsistent, varying considerably across HIs and objective functions; model evaluation was limited due to the focus on model efficiency (NSE) as an evaluation metric. Pool et al. (2017) concluded that the choice of objective function strongly influenced the accuracy in replication, with the best results achieved when the models were calibrated on the HI of interest.
It is clear that no approach has been able to achieve adequate performance and consistency in the replication of more complex HIs, specifically those related to rate of change. Whilst Pool et al. (2017) saw improvements, the need to calibrate the model to each HI in question would strongly call into question the reliability of the hydrological model (due to the inability of the hydrological model to simulate catchment hydrological processes simultaneously). The consistency with which (the majority of the) HIs are replicated here illustrates that this is not a necessary limitation of hydrological models.

Evaluation metrics
This work, and the comparative studies, highlight a number of shortcomings in the evaluation metrics used in the evaluation of hydrological models. One consequence is difficulty in evaluating the most important indices. Considerable conflicts have arisen between and across the measures; the following problems are highlighted: (1) the statistical tests of agreement (Table 3) are generally limited to the mean or central distribution; (2) the error measures NSE and MARE exhibit known bias (Table 3; Vis et al., 2015; Kim and Kim, 2016); (3) the HAF index is not well-suited for the evaluation of HIs which are integer counts or dimensionless. The third problem is best illustrated through consideration of the index RevPos, an integer count of the number of positive reversals in the summer season. When assessing RevPos based on percentile ranges (as per HAF), integers equal to the percentile boundary are not considered (see Fig. A1). This suggests that HAF may not be applicable for those indices which take integer values (i.e. counts, days or Julian day time of year). Indeed, the difficulties observed here and in the comparative studies suggest that certain HIs, such as rate of change indicators like RevPos, may simply not be practically replicable by a hydrological model. Given redundancy in many HIs, it may be possible to identify another more suitable index capable of providing the same information. Such efforts would not arbitrarily improve the replication of the HIs but would, rather, improve confidence in the performance of the outputs.

Addressing the limitations of traditional hydrological model calibration
As discussed at the outset of this paper, a number of limitations of a traditional approach to hydrological model calibration have been identified. These include: (1) bias and uncertainty as a result of measurement error, i.e. disinformative data; (2) the arbitrary nature of GOF behavioural thresholds; and (3) equifinality. Determining whether the modified covariance approach serves to address any of these limitations represents the final aim of this study.

Disinformative data
Models calibrated following a traditional approach are particularly sensitive to measurement error (Westerberg et al., 2011). Lack of agreement in the observed-simulated time-series, even for a single event, may bias the objective function, leading to rejection of an otherwise well-performing parameter set (Beven, 2010; Westerberg et al., 2011). Methods which do not focus on the replication of time-series, such as the modified covariance approach, limit the influence of input uncertainty (Westerberg et al., 2011; Euser et al., 2013). Additionally, length of the time-series is a significant factor, with shorter time-series featuring greater bias (Westerberg et al., 2011); depending on the area of application, long-term climate variability (e.g. El Niño/La Niña) may exacerbate this. It is worth noting that, amongst the comparative studies, none feature data in excess of 29 years; it is possible that an alternative approach to parameterisation that does not focus on the time-series, such as the covariance approach, may reduce input uncertainty, leading to improvements in the replication of indicators.

Behaviour thresholds
The observed and simulated moments (Fig. A2) clearly illustrate whether a given hydrological model structure is able to capture the hydrological processes in the catchment. In this way, the modified covariance approach is not dependent on an arbitrary behavioural threshold to validate the use of the hydrological model. However, in the absence of a numerical measure of the relative importance of each HI, an element of subjectivity is necessarily introduced into the parameterisation of the model. An approach such as the Generalised Likelihood Uncertainty Estimation (GLUE) framework (Beven and Binley, 2014) may represent a viable alternative where HI importance is unspecified or irrelevant.

Equifinality and parameter space
Equifinality, reaching the same outcome by different means, is a major challenge of hydrological modelling. In the modified covariance approach the entire parameter space is considered. The range of possible solutions, i.e. parameter sets, is reduced by focussing on the region which is best able to replicate the characteristics of the HIs, thereby reducing the uncertainty associated with equifinality (Wu et al., 2017). Additionally, the approach ensures the selection of the global optimum.
In hydrological uncertainty analyses, the size of the parameter space is highly variable; for example, Wilby (2005) considered 10,000 simulations, whilst Ballio and Guadagnini (2004) looked at 200,000. Here, a total of 100,000 simulations were considered in order to verify the method of investigation; upscaling this for the 16-parameter HBV model in Vis et al. (2015) and Pool et al. (2017) would necessitate 400,000 simulations. Whilst the large number of simulations may seem prohibitive, this demand may be offset. Unlike the calibration-validation paradigm, where selection algorithms may introduce issues of speed and accuracy (Seibert, 2000), a finite amount of time is needed to apply the covariance approach. All simulations of the hydrological model are performed at the outset; once the full suite of parameter sets has been simulated the hydrological model need not be run again. In traditional calibration-validation, where the HI serves as the objective function (e.g. Pool et al., 2017), the HIs must be specified at the outset. This is not the case in the modified covariance approach; the n Monte Carlo simulations can be performed in advance of HI selection. Further, multiple sets of HIs may be considered at a time (e.g. all rate of change or magnitude indicators), or at a later date, with limited additional time outlay.

Additional limitations
The outcomes of this, and comparative studies, highlight the present inability of hydrological models to simulate a wide range of HIs concurrently (e.g. rate of change plus other facets of the flow regime). Any attempted improvements may come at the cost of parsimony and equifinality. Additionally, as a simplification of the hydrologic system, it is impossible, by definition, for a hydrological model to accurately replicate all aspects of the flow regime simultaneously (Beven, 2012b). For this reason, it may be that a focus on the replication of HIs leads to a poor representation of the flow hydrograph (Seibert, 2000), limiting the use of such models to the initial modelling objective (replication of HIs).

Figure 1. River Nar case study catchment focussed on the chalk reach; the locations of the flow gauge and MIDAS climate stations are detailed. The three major springs sustaining flow in the reach are also indicated.

Figure 3. Overview of the three stages of the modified covariance approach to model parameterisation.

Figure 4. Boundaries of the limit of acceptability (shaded) for the case study selected parameter set. The lines indicate the relationship between the allowable error thresholds and relative importance.

Figure 5. Empirical cumulative distribution functions (top) and probability density functions (bottom) for observed-simulated HIs.

Figure 6. Evaluation metric summary by HI; optimal values are indicated by the dashed line.

Figure 8. Hydrologic alteration factor (HAF) values for the three percentile ranges for each HI. The acceptable range of HAF values is defined as ±0.33; HAF > 0 represents an increase in frequency relative to the observed whilst HAF < 0 represents a decrease.

Figure A1. Histogram of the observed (green) and simulated (purple) values for the index RevPos. The dashed lines indicate the lower and upper boundaries for the HAF.