https://doi.org/10.5194/hess-2016-639
19 Dec 2016
Status: this preprint was under review for the journal HESS but the revision was not accepted.

Regional regression models of percentile flows for the contiguous US: Expert versus data-driven independent variable selection

Geoffrey Fouad1, André Skupin2, and Christina L. Tague3
  • 1Geography Program, Monmouth University, West Long Branch, NJ, USA
  • 2Department of Geography, San Diego State University, San Diego, CA, USA
  • 3Bren School of Environmental Science and Management, University of California, Santa Barbara, CA, USA

Abstract. Percentile flows are statistics derived from the flow duration curve (FDC) that describe the flow equaled or exceeded for a given percentage of time. These statistics provide important information for managing rivers, but are often unavailable because most basins are ungauged. A common approach for predicting percentile flows is to deploy regional regression models based on gauged percentile flows and related independent variables derived from physical and climatic data. The first step of this process identifies groups of basins through a cluster analysis of the independent variables, followed by the development of a regression model for each group. This entire process hinges on the independent variables selected to summarize the physical and climatic state of basins. Distributed physical and climatic datasets now exist for the contiguous United States (US). However, it remains unclear how best to represent these data for the development of regional regression models. The study presented here developed regional regression models for the contiguous US, and evaluated the effect of different approaches for selecting the initial set of independent variables on the predictive performance of the regional regression models. An expert assessment of the dominant controls on the FDC was used to identify a small set of independent variables likely related to percentile flows. A data-driven approach was also applied to evaluate two larger sets of variables that consist of either (1) the averages of data for each basin or (2) both the averages and statistical distribution of basin data distributed in space and time. The small set of variables from the expert assessment of the FDC and the two larger sets of variables for the data-driven approach were each applied in a regional regression procedure. Differences in predictive performance were evaluated using 184 validation basins withheld from regression model development.
The small set of independent variables selected through expert assessment performed as well as, if not better than, the two larger sets of variables. This parsimonious set consisted only of mean annual precipitation, potential evapotranspiration, and baseflow index. Additional variables in the two larger sets added little to no predictive information. Regional regression models based on the parsimonious set of variables were developed using 734 calibration basins, and were converted into a tool for predicting 13 percentile flows in the contiguous US. Supplementary Material for this paper includes an R graphical user interface for predicting the percentile flows of basins within the range of conditions used to calibrate the regression models. The equations and performance statistics of the models are also supplied in tabular form.
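
To make the definition at the start of the abstract concrete, the following is a minimal sketch of how a percentile flow could be read off a gauged flow record. It is an illustration only and not the authors' procedure: the function name `percentile_flow`, the linear interpolation between ranked flows, and the example record are all assumptions introduced here.

```python
def percentile_flow(flows, exceedance_pct):
    """Return the flow equaled or exceeded `exceedance_pct` percent of the time.

    Assumption: linear interpolation between ranked observations. Other
    plotting-position conventions exist and give slightly different values.
    """
    if not flows:
        raise ValueError("flows must not be empty")
    if not 0 <= exceedance_pct <= 100:
        raise ValueError("exceedance_pct must be between 0 and 100")

    ordered = sorted(flows)  # ascending: lowest flow first
    # A flow exceeded p% of the time is the (100 - p)th non-exceedance percentile.
    rank = (100 - exceedance_pct) / 100 * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical flow record (arbitrary units), used only to exercise the function.
example_flows = [3.0, 11.0, 7.0, 1.0, 9.0, 5.0, 2.0, 10.0, 6.0, 4.0, 8.0]

# A few illustrative exceedance percentages; the paper's 13 percentiles are
# not listed in the abstract, so these are placeholders.
for pct in (5, 50, 95):
    print(pct, percentile_flow(example_flows, pct))
```

Low exceedance percentages correspond to high flows and high percentages to low flows, which is why the rank is computed from `100 - exceedance_pct`.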

Status: closed

Viewed

Total article views: 1,089 (including HTML, PDF, and XML)
  • HTML: 717
  • PDF: 320
  • XML: 52
  • Total: 1,089
  • Supplement: 94
  • BibTeX: 55
  • EndNote: 77
Views and downloads (calculated since 19 Dec 2016)

Viewed (geographical distribution)

Total article views: 1,059 (including HTML, PDF, and XML). Of these, 1,058 have a defined geography and 1 is of unknown origin.
Latest update: 08 Dec 2022
Short summary
Regression models were developed to predict streamflow variables (i.e. flow duration curve percentile flows) for the contiguous US. More than 35 independent variables were evaluated, but a set of only three, selected through expert assessment, performed better than most combinations of the other variables. Simple regression models consisting of mean annual precipitation, potential evapotranspiration, and baseflow index were converted into a tool for predicting percentile flows for ungauged basins in the contiguous US.
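
As a rough illustration of the regional regression idea summarized above, the sketch below fits one regression per group of basins using three predictors. It is not the authors' model: the summary does not give the functional form, so ordinary least squares on untransformed values is an assumption, as are the group labels (taken as given rather than derived from a cluster analysis), the function names, and the synthetic data generated from made-up coefficients.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b1, b2, ...]. Assumed functional form, for illustration.
    """
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Augmented normal equations: (X^T X | X^T y)
    aug = [
        [sum(x[i] * x[j] for x in design) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear or too few basins")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coeffs, predictors):
    """Apply fitted coefficients to one basin's predictor values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], predictors))


def fit_regional(basins):
    """Fit one regression per group; group membership is assumed to be known."""
    by_group = {}
    for basin in basins:
        by_group.setdefault(basin["group"], []).append(basin)
    return {
        group: fit_ols([b["predictors"] for b in members],
                       [b["flow"] for b in members])
        for group, members in by_group.items()
    }


# Synthetic, made-up coefficients and predictor values, used only so the
# sketch has something to recover. Predictor order (assumed): precipitation,
# potential evapotranspiration, baseflow index.
TRUE_COEFFS = {"group_a": [2.0, 0.5, -0.3, 4.0], "group_b": [0.5, 0.2, -0.1, 1.0]}
PREDICTORS = [(1.0, 0.8, 0.5), (1.2, 0.7, 0.6), (0.8, 0.9, 0.3),
              (1.5, 0.6, 0.7), (0.9, 1.0, 0.4)]
basins = [
    {"group": g, "predictors": p, "flow": predict(c, p)}
    for g, c in TRUE_COEFFS.items()
    for p in PREDICTORS
]

models = fit_regional(basins)
# Predict for a hypothetical ungauged basin assigned to group_a.
print(predict(models["group_a"], (1.1, 0.75, 0.55)))
```

In practice a separate set of coefficients would be fitted for each percentile flow, and a new basin would first need to be assigned to a group before the matching equation is applied.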