Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: data set characteristics and assessment of regional variability in hydrologic model performance
- 1 National Center for Atmospheric Research, Boulder CO, USA
- 2 United States Geological Survey, Modeling of Watershed Systems, Lakewood CO, USA
- 3 United States Geological Survey, Center for Integrated Data Analytics, Middleton WI, USA
- 4 US Department of Interior, Bureau of Reclamation, Denver CO, USA
- 5 US Army Corps of Engineers, Institute for Water Resources, Seattle WA, USA
- 6 Beijing Normal University, Beijing, China
Abstract. We present a community data set of daily forcing and hydrologic response data for 671 small- to medium-sized basins across the contiguous United States (median basin size of 336 km²) that spans a very wide range of hydroclimatic conditions. Area-averaged forcing data for the period 1980–2010 were generated for three basin spatial configurations – basin mean, hydrologic response units (HRUs) and elevation bands – by mapping daily, gridded meteorological data sets to the subbasin polygons (Daymet) and basin polygons (Daymet, Maurer and NLDAS). Daily streamflow data were compiled from the United States Geological Survey National Water Information System. This paper (1) presents the data set for community use and (2) provides a model performance benchmark using the coupled Snow-17 snow model and the Sacramento Soil Moisture Accounting Model, calibrated using the shuffled complex evolution global optimization routine. After optimization minimizing daily root mean squared error, 90% of the basins have Nash–Sutcliffe efficiency scores ≥ 0.55 for the calibration period and 34% have scores ≥ 0.8. This benchmark provides a reference level of hydrologic model performance for a commonly used model and calibration system, and highlights some regional variations in model performance. For example, basins with a more pronounced seasonal cycle generally have a negative low flow bias, while basins with a smaller seasonal cycle have a positive low flow bias. Finally, we find that data points with extreme error (defined as individual days that account for a high fraction of total error) are more common in arid basins with limited snow and that, for a given aridity, fewer extreme error days are present as the basin snow water equivalent increases.
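The two scores named in the abstract, daily root mean squared error (RMSE) and Nash–Sutcliffe efficiency (NSE), can be illustrated with a minimal Python sketch using their standard textbook definitions. The function names, the toy flow values, and the per-day squared-error share used here as a stand-in for "fraction of total error" are assumptions made for illustration, not the authors' code or their exact extreme-error definition.

```python
import math

def rmse(obs, sim):
    """Root mean squared error between observed and simulated daily flows."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared model
    error to the squared deviation of observations from their mean."""
    mean_obs = sum(obs) / len(obs)
    err = sum((s - o) ** 2 for o, s in zip(obs, sim))
    var = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - err / var

def squared_error_fraction(obs, sim):
    """Share of the total squared error contributed by each day
    (an assumed, illustrative reading of 'fraction of total error')."""
    sq = [(s - o) ** 2 for o, s in zip(obs, sim)]
    total = sum(sq)
    return [e / total for e in sq]

# Toy example (hypothetical values): one bad day dominates the error.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.0, 2.0, 3.0, 4.0, 7.0]
print(rmse(obs, sim), nse(obs, sim), squared_error_fraction(obs, sim))
```

For a fixed observed series, NSE equals 1 minus n·RMSE² divided by the sum of squared deviations of the observations from their mean, so minimizing RMSE during calibration is equivalent to maximizing NSE over the same period.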