In many simulations of historical daily streamflow, distributional bias arising from the distributional properties of residuals has been noted. This bias often presents itself as an underestimation of high streamflow and an overestimation of low streamflow. Here, 1168 streamgages across the conterminous USA, having at least 14 complete water years of daily data between 1 October 1980 and 30 September 2013, are used to explore a method for rescaling simulated streamflow to correct the distributional bias. Based on an existing approach that separates the simulated streamflow into components of temporal structure and magnitude, the temporal structure is converted to simulated nonexceedance probabilities and the magnitudes are rescaled using an independently estimated flow duration curve (FDC) derived from regional regression. In this study, this method is applied to a pooled ordinary kriging simulation of daily streamflow coupled with FDCs estimated by regional regression on basin characteristics. The improvement in the representation of high and low streamflows is correlated with the accuracy and unbiasedness of the estimated FDC. The method is verified by using an idealized case; however, with the introduction of regionally regressed FDCs developed for this study, the method is only useful overall for the upper tails, which are more accurately and unbiasedly estimated than the lower tails. It remains for future work to determine how accurate the estimated FDCs need to be to be useful for bias correction without unduly reducing accuracy. In addition to its potential efficacy for distributional bias correction, this particular instance of the methodology also represents a generalization of nonlinear spatial interpolation of daily streamflow using FDCs. Rather than relying on single index stations, as is commonly done to reflect streamflow timing, this approach to simulation leverages geostatistical tools to allow a region of neighbors to reflect streamflow timing.

Simulation of historical daily streamflow at ungauged locations is one of the
grand challenges of the hydrological sciences

As defined here, distributional bias in simulated streamflow is an error in
reproducing the tails of streamflow distribution. As attested to by many
researchers focused on the reproduction of historical streamflow, this bias
commonly appears as a general overestimation of low streamflow and
underestimation of high streamflow

Because of the importance of accurately representing extreme events, it is necessary to consider how the distributional bias of streamflow simulations can be reduced. The approach presented here assumes that, although a historical simulation may produce a distribution of streamflow with biased tails, the temporal structure of the simulation, that is, the sequence of relative rankings or nonexceedance probabilities of the simulated streamflows, is relatively accurate and retains valuable information. With this assumption, it can be hypothesized that distributional bias can be reduced without negatively impacting overall performance by using a sufficiently accurate, independently estimated representation of the period-of-record flow duration curve (FDC) to rescale each simulated streamflow value to the value of the regional FDC at the corresponding nonexceedance probability (see Sect. 2 below).

The approach presented here can be perceived as a generalization of the
nonlinear spatial interpolation of daily streamflow using FDCs as conceived
by

Furthermore, though necessarily explored in this study through the use of a
single technique for hydrograph simulation, this approach may be a means to
effectively bias-correct any simulation of streamflow, including those from
rainfall–runoff models, as presented by

The remainder of this work is organized in the following manner. Section 2 describes the retrieval of observed streamflow, the estimation of simulated streamflows, the calculation of observed FDCs, the estimation of simulated FDCs, and the application and evaluation of the bias correction. Section 3 documents the bias in the original simulated streamflows, analyzes the potential bias correction that could be achieved if it were possible to know the observed FDC at an ungauged location, and then analyzes the bias correction that would be realized through an application of regional regression. Section 4 considers the implications of these results and hypothesizes how the methodology might be applied and improved. The major findings of this work are then summarized in Sect. 5.

This section, which is divided into four subsections, provides a description of the methods applied here. The first subsection describes the collection of observed streamflow as well as the initial simulation of streamflow. As the approach used here is applicable to any simulated hydrograph, the details of hydrograph simulation are not exhaustively documented. Instead, beyond a brief introduction, the reader is directed to relevant citations, as no modifications to previous methods are introduced here. The second subsection discusses the use of regional regression to define independently estimated FDCs. Again, as any method for the estimation of FDCs could be used and this application is identical to previously reported applications, following a brief introduction, the reader is directed to the relevant citations. The third subsection provides a description of how bias correction was executed, and the fourth subsection describes how the performance of this approach to bias correction was assessed.

Map of the locations of 1168 reference quality streamgages from the
GAGES-II database

The proposed approach was explored using daily mean streamflow data from the
reference quality streamgages included in the GAGES-II database

To control for streamflow distributions that vary over orders of magnitude, the simulation and analysis of streamflow at these streamgages is best explored through the application of logarithms. To avoid the complication of taking the logarithm of zero, a small value was added to each streamflow observation. The US Geological Survey rounds all mean daily streamflow to two decimal places in units of cubic feet per second (cfs, which can be converted to cubic meters per second using a factor of 0.0283). As a result, any value below 0.005 cfs is rounded to and reported as 0.00 cfs. Because of this rounding procedure, the small additive value applied here was 0.0049 cfs. While there may be some confounding effect produced by the use of an additive adjustment, as long as this value is not subtracted on back transformation, the following assessment of bias and bias correction will remain robust. That is, rather than evaluating bias in streamflow, technically this analysis is evaluating the bias in streamflow plus a correction factor. The conclusions remain valid as the assessment still evaluates the ability of a particular method to remove the bias in the simulation of a particular quantity.
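The transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_log10` and `from_log10` are hypothetical, and the only values taken from the text are the 0.0049 cfs offset and the choice not to subtract it on back transformation.

```python
import numpy as np

# Additive offset from the text: just under the 0.005 cfs rounding threshold.
OFFSET_CFS = 0.0049


def to_log10(q_cfs):
    """Common logarithm of streamflow (cfs) after adding the small offset."""
    q = np.asarray(q_cfs, dtype=float)
    return np.log10(q + OFFSET_CFS)


def from_log10(log_q):
    """Back transformation; the offset is deliberately NOT subtracted,
    so the result is streamflow plus the offset, as in the text."""
    return 10.0 ** np.asarray(log_q, dtype=float)
```

With this sketch, a reported zero flow maps to the logarithm of 0.0049 cfs rather than to negative infinity, and a round trip returns flow plus the offset.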

Though the potential for distributional bias applies to any hydrologic
simulation

Daily period-of-record FDCs were developed independently of the streamflow
simulation procedure by following a regionalization procedure similar to that
of

A regional regression across the streamgages in each two-digit hydrologic unit
of each of the 27 FDC percentiles was developed using best subsets
regression. Best subsets regression is a common tool for exhaustive exploration
of the space of potential explanatory variables. All models with a given
number of explanatory variables are computed, exploring all combinations of
variables. The top models for a given number of explanatory variables are
then identified by a performance metric like the Akaike information
criterion. This is repeated for several model sizes to fully explore the
possibilities for variables and regression size. For each regression, the
drainage area was required as an explanatory variable. At a minimum, one
additional explanatory variable was used. The maximum number of explanatory
variables was limited to the smaller of either six explanatory variables or
5 % of the number of streamgages in the region, rounded up to the next larger
whole number. The maximum of six arises from what is computationally feasible
for the best subsets regression function used, whereas the maximum of 5 % of
streamgages was determined from a limited exploration of the optimal number
of explanatory variables as a function of the number of streamgages in a
region. Explanatory variables were drawn from the GAGES-II database
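The search described above can be illustrated with a small Python sketch. It is a simplified stand-in, not the procedure used in this study: it fits ordinary least squares rather than censored Gaussian (Tobit) regression, ranks models by a Gaussian AIC, and uses hypothetical names (`max_predictors`, `best_subsets`). Only the cap on model size (the smaller of six or 5 % of streamgages, rounded up), the required drainage area, and the minimum of one additional variable come from the text.

```python
from itertools import combinations
import math

import numpy as np


def max_predictors(n_gages):
    """Smaller of six or 5 % of the streamgages, rounded up."""
    five_percent_rounded_up = -(-n_gages // 20)  # integer ceiling of n / 20
    return min(6, five_percent_rounded_up)


def best_subsets(X, y, names, required=0, max_size=None):
    """Exhaustively fit OLS models that always include column `required`
    (drainage area in the text) plus at least one other variable.

    Returns {model size: (AIC, variable names)} for the best model per size.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    if max_size is None:
        max_size = max_predictors(n)
    optional = [j for j in range(p) if j != required]
    best = {}
    for size in range(2, min(max_size, p) + 1):
        for extra in combinations(optional, size - 1):
            cols = [required, *extra]
            design = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = float(np.sum((y - design @ coef) ** 2))
            # Gaussian AIC up to an additive constant; guard against rss == 0.
            aic = n * math.log(max(rss, 1e-300) / n) + 2 * (len(cols) + 1)
            if size not in best or aic < best[size][0]:
                best[size] = (aic, tuple(names[j] for j in cols))
    return best
```

For a region with 60 streamgages, `max_predictors(60)` gives 3, so this sketch would compare all two- and three-variable models that include drainage area.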

In order to allow different explanatory variables to be used to explain
percentiles at different streamflow regimes, the percentiles were grouped
into a maximum of three contiguous streamflow regimes based on the behavior
of the unit FDCs (i.e., the FDCs divided by drainage area) in the two-digit
hydrologic units. The regimes are contiguous in that only consecutive
percentiles from the list above can be included in the same regime; the
result is a maximum of three regimes that can be considered “high”,
“medium”, and “low” streamflows, though the number of regimes may vary across two-digit
hydrologic units. The percentiles in each regime were estimated by the same
explanatory variables, allowing only the fitted coefficients to change. The
final regression form for each regime was selected by optimizing the average
adjusted coefficient of determination, based on censored Gaussian (Tobit)

When estimating a complete FDC as realized through a set of discrete points,
non-monotonic behavior is likely

Diagram showing the bias correction methodology applied here. The
simulated daily hydrograph at the ungauged site is presented in

To implement bias correction, the initial predictions of the daily streamflow
values using the ordinary kriging approach were converted to streamflow
nonexceedance probabilities using the Weibull plotting position
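A minimal Python sketch of this rescaling step follows. The Weibull plotting position, rank divided by (n + 1), is standard; everything else is an assumption made for illustration: the function names are hypothetical, ties are broken by order of occurrence, and the FDC is interpolated linearly in log10 flow against nonexceedance probability, which may differ from the interpolation used in this study. Note that `np.interp` holds the end values constant outside the range of the supplied FDC probabilities.

```python
import numpy as np


def weibull_nonexceedance(q):
    """Weibull plotting position rank / (n + 1), with rank 1 for the
    smallest flow. Ties are broken by order of occurrence (an assumption)."""
    q = np.asarray(q, dtype=float)
    order = np.argsort(q, kind="stable")
    ranks = np.empty(q.size, dtype=float)
    ranks[order] = np.arange(1, q.size + 1)
    return ranks / (q.size + 1)


def rescale_with_fdc(q_sim, fdc_prob, fdc_flow):
    """Replace each simulated flow with the FDC flow at the same
    simulated nonexceedance probability.

    fdc_prob : increasing nonexceedance probabilities of the FDC points
    fdc_flow : strictly positive flows at those probabilities
    """
    p_sim = weibull_nonexceedance(q_sim)
    log_flow = np.interp(p_sim, fdc_prob, np.log10(np.asarray(fdc_flow, dtype=float)))
    return 10.0 ** log_flow
```

Because only the magnitudes are replaced, the rank order of the simulated days, and therefore the temporal structure, is unchanged by the rescaling wherever the FDC is strictly increasing.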

Figure

As can be seen in Fig.

The hypothesis of this work, that distributional bias in the simulated
streamflow can be corrected by applying independently estimated FDCs, was
evaluated by considering the performance of these bias-corrected simulations
at both tails of the distribution. The differences in the common logarithms
of both high and low streamflow were used to understand and quantify the bias
(simulation minus observed) and the correction thereof. That is,

Distributional bias and improvement of that bias were considered in both the high and low tails of the streamflow distribution. Two methods were used to capture the bias in each tail. One method, referred to herein as an assessment of the observation-dependent tails, considers the observed nonexceedance probabilities to identify the days on which the highest and lowest 5 % of streamflow occurred. For each respective tail, the errors were assessed based on the observations and simulations of those fixed days. The other method, referred to herein as an assessment of the observation-independent tails, compares the ranked top and bottom 5 % of observations with the independently ranked top and bottom 5 % of simulated streamflow. Errors in the observation-dependent tails are an amalgamation of errors in the sequence of nonexceedance probabilities (the temporal structure) and in the magnitude of streamflow, whereas errors in the observation-independent tails only reflect bias in the ranked magnitudes of streamflow. In the same fashion, evaluation of the complete hydrograph can be assessed sequentially (sequential evaluation), retaining the contemporary sequencing of observations and simulations, or distributionally (distributional evaluation), considering the observations and simulations ranked independently. Though the overall accuracy will vary between the sequential and distributional case, overall bias will be identical in both cases.
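The two tail definitions can be made concrete with a short Python sketch. The function name `tail_bias`, the dictionary keys, and the rule for converting 5 % into a number of days are assumptions for illustration; the bias itself is the mean difference of common logarithms (simulated minus observed) as defined above.

```python
import numpy as np


def tail_bias(obs, sim, frac=0.05):
    """Mean log10 bias (simulated minus observed) in both tails.

    Observation-dependent (od): days are fixed by the observed ranks.
    Observation-independent (oi): both series are ranked separately.
    """
    obs_log = np.log10(np.asarray(obs, dtype=float))
    sim_log = np.log10(np.asarray(sim, dtype=float))
    k = max(1, int(round(frac * obs_log.size)))  # days per tail (assumed rule)

    by_obs = np.argsort(obs_log, kind="stable")
    low_days, high_days = by_obs[:k], by_obs[-k:]
    obs_sorted, sim_sorted = np.sort(obs_log), np.sort(sim_log)

    return {
        "od_lower": float(np.mean(sim_log[low_days] - obs_log[low_days])),
        "od_upper": float(np.mean(sim_log[high_days] - obs_log[high_days])),
        "oi_lower": float(np.mean(sim_sorted[:k] - obs_sorted[:k])),
        "oi_upper": float(np.mean(sim_sorted[-k:] - obs_sorted[-k:])),
        "overall_sequential": float(np.mean(sim_log - obs_log)),
        "overall_distributional": float(np.mean(sim_sorted - obs_sorted)),
    }
```

In this sketch the two overall biases agree because a mean of differences does not depend on how the days are paired, whereas the tail values differ whenever the simulated sequencing is wrong.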

With an analysis of both observation-dependent and observation-independent tails, it is possible to begin to tease out the effect of temporal structure on distributional bias. The bias in observation-independent tails is not directly tied to the temporal structure, or relative ranking, of simulated streamflow. That is, if the independently estimated FDC is accurate, then even if the relative sequencing of streamflow is badly flawed, the bias correction of observation-independent tails will be successful. However, even if the distribution is accurately reproduced after bias correction, the day-to-day performance may still be poor. For observation-dependent tails, the temporal structure plays a vital role in the effect of bias correction. If the temporal structure is inaccurate in the underlying hydrologic simulation, then the bias correction of observation-dependent tails will be less successful.

The bias correction approach was first tested with the observed FDCs. These observed FDCs would be unknowable in the truly ungauged case, but this test allows for an assessment of the potential utility of this approach. This examination is followed by an application with the regionally regressed FDCs described above, demonstrating one realization of this generalizable method. This general approach to bias correction could be used with other methods for estimating the FDC and could also be used with an observed FDC for record extension, though neither of these possibilities is explored here.

Distribution of logarithmic bias, measured as the mean difference between the common logarithms of simulated and observed streamflow (simulated minus observed) at 1168 streamgages across the conterminous USA. Orig. refers to the original simulation with pooled ordinary kriging, BC-RR refers to the Orig. hydrograph bias-corrected with regionally regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph bias-corrected with observed flow duration curves. The tails of the box plots extend to the 5th and 95th percentiles of the distribution; the ends of the boxes represent the 25th and 75th percentiles of the distribution; the heavier line in the box represents the median of the distribution; the open circle represents the mean of the distribution; outliers beyond the 5th and 95th percentile are shown as horizontal dashes.

Figures

Distribution of logarithmic accuracy, measured as the root mean squared error between the common logarithms of observed and simulated streamflow at 1168 streamgages across the conterminous USA. Orig. refers to the original simulation with pooled ordinary kriging, BC-RR refers to the Orig. hydrograph bias-corrected with regionally regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph bias-corrected with observed flow duration curves. Sequential indicates that contemporary days were compared, while distributional indicates that days of equal rank were compared. The tails of the box plots extend to the 5th and 95th percentiles of the distribution; the ends of the boxes represent the 25th and 75th percentiles of the distribution; the heavier line in the box represents the median of the distribution; the open circle represents the mean of the distribution; outliers beyond the 5th and 95th percentile are shown as horizontal dashes.

There is statistically significant overall bias at the median (

Distribution of logarithmic bias, measured as the mean difference between the common logarithms of simulated and observed streamflow at 1168 streamgages across the conterminous USA for observation-dependent and observation-independent upper and lower tails. Observation-dependent tails retain the ranks of observed streamflow, while matching simulations by day. Observation-independent tails rank observations and simulation independently. The upper tail considers the highest 5 % of streamflow, while the lower tail considers the lowest 5 % of streamflow. Orig. refers to the original simulation with pooled ordinary kriging, BC-RR refers to the Orig. hydrograph bias-corrected with regionally regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph bias-corrected with observed flow duration curves. The tails of the box plots extend to the 5th and 95th percentiles of the distribution; the ends of the boxes represent the 25th and 75th percentiles of the distribution; the heavier line in the box represents the median of the distribution; the open circle represents the mean of the distribution; outliers beyond the 5th and 95th percentile are shown as horizontal dashes.

In both observation-dependent and observation-independent cases, downward bias in the
upper tail is more probable than upward bias in the lower tail. For the
observation-dependent tails, approximately 89 % of streamgages show downward
bias for the upper tail (Fig.

With respect to their central tendencies, these results show upward bias in lower tails and downward bias in upper tails of the distribution of streamflows from the original simulations for both observation-dependent and observation-independent cases. There is, of course, a great degree of variability around this central tendency. With these baseline results, the bias correction method presented here seeks to mitigate these biases.

Distribution of logarithmic accuracy, measured as the root mean squared error between the common logarithms of simulated and observed streamflow (simulated minus observed) at 1168 streamgages across the conterminous USA for observation-dependent and observation-independent upper and lower tails. Observation-dependent tails retain the ranks of observed streamflow, while matching simulations by day. Observation-independent tails rank observations and simulation independently. The upper tail considers the highest 5 % of streamflow, while the lower tail considers the lowest 5 % of streamflow. Orig. refers to the original simulation with pooled ordinary kriging, BC-RR refers to the Orig. hydrograph bias-corrected with regionally regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph bias-corrected with observed flow duration curves. The tails of the box plots extend to the 5th and 95th percentiles of the distribution; the ends of the boxes represent the 25th and 75th percentiles of the distribution; the heavier line in the box represents the median of the distribution; the open circle represents the mean of the distribution; outliers beyond the 5th and 95th percentile are shown as horizontal dashes.

The results for this idealized case, which could not be applied in practice,
provide clear evidence that distributional bias in simulated streamflow can
be reduced by rescaling using independently estimated FDCs. This evidence is
apparent in the reduction of the magnitude and variability of overall bias
(Fig.

Maps showing the distribution of logarithmic bias, measured as the mean difference between the common logarithms of simulated and observed streamflow (simulated minus observed) at 1168 streamgages across the conterminous USA for observation-dependent and observation-independent upper and lower tails. Observation-dependent tails retain the ranks of observed streamflow, while matching simulations by day. Observation-independent tails rank observations and simulation independently. The upper tail considers the highest 5 % of streamflow, while the lower tail considers the lowest 5 % of streamflow. The bias is derived from the original simulation of daily streamflow using pooled ordinary kriging at 1168 sites regionalized by the two-digit hydrologic units (polygons).

Whereas the measures of bias and accuracy are summarized in Tables

Measures of the distribution of logarithmic bias, computed as the
mean difference between the common logarithms of simulated and observed
streamflow (simulated minus observed) at 1168 streamgages across the
conterminous USA for observation-dependent and observation-independent upper and
lower tails. Orig. refers to the original simulation with pooled ordinary
kriging, BC-RR refers to the Orig. hydrograph bias-corrected with regionally
regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph
bias-corrected with observed flow duration curves. Observation-dependent (OD)
tails retain the ranks of observed streamflow, while matching simulations by
day. Observation-independent (OI) tails rank observations and simulation
independently. The upper tail considers the highest 5 % of streamflow, while
the lower tail considers the lowest 5 % of streamflow. Significance is the

Measures of the distribution of logarithmic accuracy, computed as the root mean squared error between the common logarithms of observed and simulated streamflow at 1168 streamgages across the conterminous USA for observation-dependent and observation-independent upper and lower tails. Orig. refers to the original simulation with pooled ordinary kriging, BC-RR refers to the Orig. hydrograph bias-corrected with regionally regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph bias-corrected with observed flow duration curves. Observation-dependent (OD) tails retain the ranks of observed streamflow, while matching simulations by day. Observation-independent (OI) tails rank observations and simulation independently. The upper tail considers the highest 5 % of streamflow, while the lower tail considers the lowest 5 % of streamflow.

Measures of the distribution of changes in absolute logarithmic bias
with bias correction, for which absolute logarithmic bias is computed as the
absolute value of the mean difference between the common logarithms of
bias-corrected and simulated streamflow at 1168 streamgages across the
conterminous USA for observation-dependent and observation-independent upper and
lower tails, for which the simulated streamflow was obtained with pooled ordinary
kriging. BC-RR refers to the Orig. hydrograph bias-corrected with regionally
regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph
bias-corrected with observed flow duration curves. Observation-dependent (OD)
tails retain the ranks of observed streamflow, while matching simulations by
day. Observation-independent (OI) tails rank observations and simulation
independently. The upper tail considers the highest 5 % of streamflow, while
the lower tail considers the lowest 5 % of streamflow. Significance is the

Measures of the distribution of changes in logarithmic accuracy between
original and bias-corrected simulations, for which the logarithmic accuracy is
computed as the root mean squared error between the common logarithms of
bias-corrected and simulated streamflow at 1168 streamgages across the
conterminous USA for observation-dependent and observation-independent upper and
lower tails, for which the simulated streamflow was obtained using pooled ordinary
kriging. BC-RR refers to the Orig. hydrograph bias-corrected with regionally
regressed flow duration curves, and BC-Obs. refers to the Orig. hydrograph
bias-corrected with observed flow duration curves. Observation-dependent (OD)
tails retain the ranks of observed streamflow, while matching simulations by
day. Observation-independent (OI) tails rank observations and simulation
independently. The upper tail considers the highest 5 % of streamflow, while
the lower tail considers the lowest 5 % of streamflow. Significance is the

With the use of a perfect, observed FDC for bias correction, one would expect that nearly all bias would disappear, but the results do not show this. The temporal structure of the simulated hydrograph continues to play a role in the bias of observation-dependent tails. Less intuitively, the observation-independent tails also continue to exhibit a small degree of residual bias. This residual bias arises from the effect of representing the FDC as a set of discrete points and interpolating between them. There may be some additional effect from the small value added to avoid zero-valued streamflows or from the censoring procedure, but initial exploration found little impact.

The overall sequential performance (Fig.

To understand the effect of errors in the temporal structure further,
consider Fig.

Distribution of mean error in the simulated nonexceedance probabilities of the lowest and highest 5 % of observed daily streamflow (simulated minus observed) at 1168 streamgages across the conterminous USA. The upper tail considers the highest 5 % of streamflow, while the lower tail considers the lowest 5 % of streamflow. The tails of the box plots extend to the 5th and 95th percentiles of the distribution; the ends of the boxes represent the 25th and 75th percentiles of the distribution; the heavier line in the box represents the median of the distribution; the open circle represents the mean of the distribution; outliers beyond the 5th and 95th percentile are shown as horizontal dashes.

When the uncertainty of regionally regressed FDCs is introduced into the bias
correction procedure, the potential value of the bias correction procedure is
not as convincing. There is a slight, but significant, increase in the
overall bias (Table

The observation-independent tails, which are not affected by errors in
temporal structure, show a divergence in performance between the results
obtained using observed FDCs and those obtained using regionally regressed
FDCs. With observed FDCs, both tails demonstrated substantial reductions in
absolute bias and improvements in accuracy. With regionally regressed FDCs,
the upper observation-independent tails continue to show reductions in
absolute bias (Table

The effects of the rescaling with FDCs estimated with regional regression on
overall and observation-independent tail bias and accuracy can be better
understood if the properties of the estimated FDCs are considered.
Figure

The results are similar for the observation-dependent tails produced after
bias correction with regionally regressed FDCs, even when complicated by the
addition of temporal uncertainty as discussed in Sect. 3.2 with reference
to Fig.

Distribution of logarithmic bias

The introduction of uncertainty from regionally regressed FDCs diminishes the advantages gained by bias correction with observed FDCs. Considering the observation-independent lower tails, 55 % of streamgages show reductions in absolute bias with observed FDCs that were reversed into increases of absolute bias by the introduction of regionally regressed FDCs. Another 43 % of streamgages show smaller reductions in absolute bias when observed FDCs were replaced with regionally regressed FDCs. For the observation-dependent lower tails, 37 % of streamgages have reversals and 31 % show smaller reductions in absolute bias. For the observation-independent upper tails, 41 % show reversals and 56 % yield smaller reductions in absolute bias. For the observation-dependent upper tails, 24 % produce reversals and 40 % provide smaller reductions in absolute bias. Results are similar with respect to accuracy: while many streamgages saw reversals, a large proportion of streamgages continue to demonstrate improvements.
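One way to tabulate such proportions is sketched below in Python. The definitions encoded here are an interpretation of the wording above rather than the study's own code: a "reversal" is taken to be a streamgage whose absolute bias decreased with observed FDCs but increased with regionally regressed FDCs, and a "smaller reduction" is one whose absolute bias decreased in both cases, but by less with regionally regressed FDCs. The function name and argument names are hypothetical.

```python
import numpy as np


def classify_changes(bias_orig, bias_bc_obs, bias_bc_rr):
    """Percent of streamgages showing reversals and smaller reductions.

    Each argument is an array of per-streamgage logarithmic bias for the
    original simulation, the correction with observed FDCs, and the
    correction with regionally regressed FDCs, respectively.
    """
    abs_orig = np.abs(np.asarray(bias_orig, dtype=float))
    # Change in absolute bias; negative values are reductions.
    d_obs = np.abs(np.asarray(bias_bc_obs, dtype=float)) - abs_orig
    d_rr = np.abs(np.asarray(bias_bc_rr, dtype=float)) - abs_orig

    improved_with_obs = d_obs < 0
    reversal = improved_with_obs & (d_rr > 0)
    smaller_reduction = improved_with_obs & (d_rr < 0) & (d_rr > d_obs)

    n = abs_orig.size
    return {
        "reversal_pct": 100.0 * reversal.sum() / n,
        "smaller_reduction_pct": 100.0 * smaller_reduction.sum() / n,
    }
```

Streamgages that fall in neither class (for example, those not improved by observed FDCs, or improved more by regionally regressed FDCs) are simply left uncounted in this sketch.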

Though the first analysis presented, which utilized observed FDCs for bias
correction, represents only an assessment of hypothetical potential of this
general approach, the approach to bias correction presented here produced
near universal and substantial reduction in bias and improvements in
accuracy, overall and in each tail, for both observation-dependent and observation-independent evaluation cases when the uncertainty in independently
estimated FDCs was minimized. For the observation-independent evaluation
case, the errors are removed almost completely, and the remaining errors in
the observation-dependent case mimic the temporal structure (nonexceedance
probability) errors. These results, which are not applicable under the
conditions of the true ungauged problem, demonstrate that the bias correction
approach introduced here is theoretically valid. However, this improvement
becomes inconsistent with respect to bias and generally reduces the accuracy
when the biased and uncertain regionally regressed FDCs are used.
Furthermore, in both the observation-dependent and observation-independent
tails in the case of rescaling by regionally regressed FDCs, the improvements
in the lower tails are much more variable than the improvements in the upper
tail (Figs.

The regional regressions developed here were much better at estimating the upper tail of the streamflow distribution than estimating the lower tail. This provides a convenient comparison: the bias correction of lower tails with regionally regressed FDCs only improved the bias in the observation-dependent case when the low bias of the regionally regressed FDC offset the high bias of the observation-dependent tails, and did not improve accuracy in either case. However, the bias correction of upper tails with regionally regressed FDCs, which produced the upper tails with much less bias, continued to show, as in the case of observed FDCs, improvements in bias and accuracy, though to a much smaller degree than the improvements produced by observed FDCs.

Particularly in the lower tail of the distribution, the effectiveness of this
bias correction method is strongly influenced by the accuracy of the
independently estimated FDC. The change in the absolute bias of the
observation-independent lower tail has a 0.72 Pearson correlation with the
absolute bias of the lowest eight percentiles of the FDC estimated with
regional regression, showing that the residual bias in the FDC of the
bias-corrected streamflow simulations is strongly correlated with the bias in
the independently estimated FDC. The analogous correlation for the upper tail
is 0.31. For the observation-dependent tails, these correlations are only 0.33 for
each tail, the reduced correlation for the lower tail being a result of the
combination of the uncertainty in the temporal structure and in the
regionally regressed FDC. Therefore, as regional regression is not the only
tool for estimating FDCs

While this method of bias correction, as implemented here using regionally
regressed FDCs, improves the bias in the upper tails, it has a negative
impact on the lower tails. This makes the question of application or
recommendation more pressing. Under what conditions might this approach be
worthwhile? Initial exploration did not find a strong regional component to
performance of the bias correction method. Figure

The results of this work were also discussed in reference to earlier work
that suggested a prevalence, though not a universality, of underestimation of
high streamflows and overestimation of low streamflows. Similarly, the bias
correction approach produced a wide variability of results; where the high
tails might have been improved, the lower tails might have been degraded.
Figure

Scatter plots showing the correspondence of logarithmic bias, measured as the mean difference between the common logarithms of simulated and observed streamflow (simulated minus observed) at 1168 streamgages across the conterminous USA for observation-dependent and observation-independent upper and lower tails. Observation-dependent tails retain the ranks of observed streamflow, while matching simulations by day. Observation-independent tails rank observations and simulation independently. The upper tail considers the highest 5 % of streamflow, while the lower tail considers the lowest 5 % of streamflow. Orig. refers to the original simulation with pooled ordinary kriging, and BC-RR refers to the Orig. hydrograph bias-corrected with regionally regressed flow duration curves.

When looked at from the point of view of the estimated FDCs that need
temporal information in order to simulate streamflow, this approach to bias
correction is akin to an extension of the nonlinear spatial interpolation
using FDCs developed by

That this approach to bias correction does improve the observation-dependent
tails and the overall performance when observed FDCs are used shows that the
temporal structure of the underlying simulation retains useful information,
even if the tails of the original simulation are biased. However, some error
remains in the simulated nonexceedance probabilities. A natural extension
would be to investigate if it might be more reasonable to estimate
nonexceedance probabilities directly rather than extracting their implicit
values from the estimated streamflow time series as was done here. Here, the
nonexceedance probabilities were derived from a simulation of the complete
hydrograph. In this alternative approach, the discharge volumes would not be
estimated but rather only the daily nonexceedance probabilities.

As mentioned earlier, recent work by

Although the results presented here are promising, they demonstrate that the
performance of two-stage modeling, where temporal structure and magnitude are
largely decoupled, is limited by the weaker of the two modeling stages.
In this case, alternative methods for estimating the FDC might prove
worthwhile

Regardless of the underlying methodology, simulations of historical streamflow often exhibit distributional bias in the tails of the distribution of streamflow, usually an overestimate of the lower tail values and an underestimate of the upper tail values. Such bias can be extremely problematic, as it is often these very tails that affect human populations and other water management objectives the most and, thus, these tails that receive the most attention from water resources planners and managers. Therefore, a bias correction procedure was conceived to rescale simulated time series of daily streamflow to improve simulations of the highest and lowest streamflow values. Being akin to a novel implementation of nonlinear spatial interpolation using flow duration curves, this approach could be extended to other methods of streamflow simulation.

In a leave-one-out fashion, daily streamflow was simulated in each two-digit hydrologic unit code using pooled ordinary kriging. Regional regressions of 27 percentiles of the flow duration curve in each two-digit hydrologic unit code were independently developed. Using the Weibull plotting position, the simulated streamflow was converted into nonexceedance probabilities. The nonexceedance probabilities of the simulated streamflow were used to interpolate newly simulated streamflow volumes from the regionally regressed flow duration curves. Assuming that the sequence of relative magnitudes of streamflow retains useful information despite possible biases in the magnitudes themselves, it was hypothesized that simulated magnitudes can be corrected using an independently estimated flow duration curve. This hypothesis was evaluated by considering the performance of simulated streamflow observations and the performance of the relative timing of simulated streamflow. This evaluation was primarily focused on the examination of errors in both the high and low tails of the streamflow distribution, defined as the lowest and highest 5 % of streamflow, and considering changes in both bias and accuracy.
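The rescaling step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scripts used in this study: the function names are hypothetical, tied values are given their average rank, interpolation between FDC quantiles is done linearly in probability and in the common logarithm of streamflow, and probabilities outside the range of the estimated FDC are clamped to its end points. Flows on the FDC are assumed to be strictly positive.

```python
import math
from bisect import bisect_right


def weibull_nonexceedance(flows):
    """Weibull plotting position rank / (n + 1) for each value in series order.

    Ties receive their average rank (an assumption of this sketch).
    """
    n = len(flows)
    order = sorted(range(n), key=lambda i: flows[i])
    probs = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and flows[order[j + 1]] == flows[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            probs[order[k]] = avg_rank / (n + 1)
        i = j + 1
    return probs


def interpolate_fdc(p, fdc_probs, fdc_flows):
    """Flow at nonexceedance probability p from an FDC given as sorted quantiles.

    Linear in probability and in log10(flow); clamped outside the FDC range.
    Both choices are assumptions made for illustration.
    """
    if p <= fdc_probs[0]:
        return fdc_flows[0]
    if p >= fdc_probs[-1]:
        return fdc_flows[-1]
    hi = bisect_right(fdc_probs, p)
    lo = hi - 1
    w = (p - fdc_probs[lo]) / (fdc_probs[hi] - fdc_probs[lo])
    log_q = (1 - w) * math.log10(fdc_flows[lo]) + w * math.log10(fdc_flows[hi])
    return 10 ** log_q


def bias_correct(simulated, fdc_probs, fdc_flows):
    """Keep the timing (ranks) of the simulation; take magnitudes from the FDC."""
    probs = weibull_nonexceedance(simulated)
    return [interpolate_fdc(p, fdc_probs, fdc_flows) for p in probs]
```

For example, bias_correct([5.0, 20.0, 10.0, 40.0], [0.1, 0.5, 0.9], [1.0, 10.0, 100.0]) returns a series in which the days keep their original ordering, while every magnitude is drawn from the supplied curve. Passing observed quantiles as the FDC corresponds to the idealized case, and passing regionally regressed quantiles corresponds to the practical case.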

When observed flow duration curves were used for bias correction, representing a case with minimal uncertainty in the independently estimated flow duration curve, bias and accuracy of both tails were substantially improved and overall accuracy was noticeably improved. The use of regionally regressed flow duration curves, which were observed to be approximately unbiased in the upper tails but were biased low in the lower tails, corrected the upper tail bias but failed to consistently correct the lower tail bias. Furthermore, the use of the regionally regressed flow duration curves degraded the accuracy of the lower tails but had relatively little effect on the accuracy of the upper tails. Combining the bias correction and accuracy results, the test with regionally regressed flow duration curves can be said to have been successful with the upper tails (for which the regionally regressed flow duration curves were unbiased) but unsuccessful with the lower tails. The effect on accuracy of the bias correction approach using estimated flow duration curves was correlated with the accuracy with which each tail of the flow duration curve was estimated by regional regression.

In conclusion, this approach to bias correction has significant potential to improve the accuracy of streamflow simulations, though the potential is limited by how well the flow duration curve can be reproduced. While conceived as a method of bias correction, this approach is an analog of a previously applied nonlinear spatial interpolation method using flow duration curves to reproduce streamflow at ungauged basins. While using the nonexceedance probabilities of kriged streamflow simulations may improve on the use of single index streamgages to obtain nonexceedance probabilities, further improvements are limited by the ability to estimate the flow duration curve more accurately.

The data and scripts used to produce the results
discussed herein can be found in

WHF, TMO, and JEK jointly conceived of the idea. WHF designed the experiments and carried them out through the development of model code. WHF prepared the manuscript with contributions from all co-authors.

The authors declare that they have no conflict of interest.

This research was supported by the US Geological Survey's National Water Census. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the US Government. We are very grateful for the comments of several reviewers, among whom was Benoit Hingray. The combined reviewer comments helped to greatly improve early versions of this manuscript.

Edited by: Monica Riva

Reviewed by: Benoit Hingray and three anonymous referees