The CAMELS data set: catchment attributes and meteorology for large-sample studies

We present a new data set of attributes for 671 catchments in the contiguous United States (CONUS) minimally impacted by human activities. This complements the daily time series of meteorological forcing and streamflow provided by Newman et al. (2015b). To produce this extension, we synthesized diverse and complementary data sets to describe six main classes of attributes at the catchment scale: topography, climate, streamflow, land cover, soil and geology. The spatial variations among basins over the CONUS are discussed and compared using a series of maps. The large number of catchments, combined with the diversity of the attributes we extracted, makes this new data set well suited for large-sample studies and comparative hydrology. In comparison to the similar MOPEX (Model Parameter Estimation Experiment) data set, this data set relies on more recent data, covers a wider range of attributes, and its catchments are more evenly distributed across the CONUS. This study also involves assessments of the limitations of the source data sets used to compute catchment attributes, as well as detailed descriptions of how the attributes were computed. The hydrometeorological time series provided by Newman et al. (2015b, https://dx.doi.org/10.5065/D6MW2F4D) together with the catchment attributes introduced in this paper (https://dx.doi.org/10.5065/D6G73C3Q) constitute the freely available CAMELS data set, which stands for Catchment Attributes and MEteorology for Large-sample Studies.

Short summary. We introduce a data set describing the landscape of 671 catchments in the contiguous USA: we synthesized various data sources to characterize the topography, climate, streamflow, land cover, soil and geology of each catchment. This extends the daily time series of meteorological forcing and discharge provided by an earlier study. The diversity of these catchments will help improve our understanding and modeling of how the interplay between catchment attributes shapes hydrological processes.


Introduction
Catchment attributes are descriptors of the landscape. Their interplay shapes catchment behavior by influencing how catchments store and transfer water. To synthesize the multifaceted composition of catchments, catchment attributes necessarily cover a wide range of features, such as the catchment climate, hydrology, land cover, soil, geology, topography, and river network. Over the last decades, catchment attributes have been developed in a variety of ways and are the building blocks of countless hydrological studies.
A fruitful research direction is to explore interrelationships among catchment attributes. Key examples include how the interaction of climate and topography influences vegetation productivity (Voepel et al., 2011), how aridity affects the angle of stream intersections, thereby constraining the shape of the river network (Seybold et al., 2017), or the extent to which land cover influences annual streamflow (Oudin et al., 2008) or evapotranspiration (Zhang et al., 2001). Catchment attributes are also a standard way to characterize catchment (dis)similarities and are consistently employed to develop catchment classifications (e.g., McDonnell and Woods, 2004; Wagener et al., 2007; Sawicz et al., 2011; Berghuijs et al., 2014). Furthermore, there have been considerable efforts to use catchment attributes to reflect the structure of the landscape in models. One approach is to infer hydrological model parameter values from catchment attributes (Abdulla and Lettenmaier, 1997; Seibert, 1999; Hundecha et al., 2008; Samaniego et al., 2010; Hrachowitz et al., 2013) with the parallel objectives of accounting for landscape characteristics in an explicit way (not only implicitly by calibration), and of implementing hydrological models in ungauged basins. Another approach is to base not only parameter values but also the choice of model structure on catchment attributes (Clark et al., 2011; McMillan et al., 2011; Fenicia et al., 2014). Both approaches provide guidance on how to deal with geophysical characteristics that vary dramatically within the model domain, for instance, in the context of continental-scale modeling.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Although catchment attributes are routinely used when working with a handful of catchments, there is a growing recognition that a large sample of catchments can provide insights that cannot be gained from a small sample (Gupta et al., 2014). Large-sample data sets enable us to concentrate on catchment similarities and on the formulation of conclusions that are valid for a large number of (gauged and ungauged) catchments. Individual catchments can then be considered to be part of a continuum of catchment attributes, which vary in space along several gradients (such as aridity or soil depth). Working with a large number of catchments enables us to study changes along different gradients and to better disentangle the effects of catchment attributes on catchment behavior. This is particularly useful for comparative hydrology, i.e., to identify how similarities and differences between locations influence ecohydrological processes (Falkenmark and Chapman, 1989; Troch et al., 2009; Thompson et al., 2011; Harman and Troch, 2014). Further, large-sample hydrology opens new opportunities for data analysis and, for instance, makes it possible to explore interrelationships between catchment attributes on the basis of their spatial patterns, as exemplified later in this study using map comparisons.
Several data sets of catchment attributes for large-sample hydrology now exist (see the review by Gupta et al., 2014). The large-sample data set introduced in this paper is an extension of the Newman et al. (2015b) data set, referred to as N15 hereafter. N15 covers 671 catchments in the contiguous USA (CONUS), for which it provides daily meteorological forcing from three data sets, Daymet (Thornton et al., 2012), Maurer (Maurer et al., 2002), and NLDAS (Xia et al., 2012), as well as daily streamflow measurements from the United States Geological Survey (USGS). All those catchments have 20 years of continuous discharge records from 1990 to 2009 and are minimally impacted by human activities (see Sect. 2.1 in Newman et al., 2015b). Here, we cover the same catchments and provide additional quantitative estimates of a wide range of catchment attributes. We named this extended N15 data set the CAMELS data set, which stands for Catchment Attributes and MEteorology for Large-sample Studies.
Section 2 explains the motivations to extend the N15 data set. Sections 3-8 present six classes of basin characteristics: topographic characteristics, climate indices, hydrological signatures, and land cover, soil, and geology characteristics, respectively. These six sections are organized using the following structure. We first provide some research background on this class of basin characteristics, introduce the attributes we selected, and explain the reasons behind their selection. Since these attributes are well established, we briefly introduce them in the main text and provide further details in tables, which contain units and abbreviations for the attributes, as well as references to the equations and data sets used for their computation. We follow by discussing the spatial variations of these attributes across the CONUS and by assessing the main limitations of the source data sets. Section 9 compares the CAMELS data set to the Model Parameter Estimation Experiment data set (MOPEX; Duan et al., 2006; Schaake et al., 2006), another large sample of catchments for the CONUS. Section 10 discusses the online availability of the CAMELS data set and possible future extensions. Conclusions are presented in Sect. 11.
2 Motivations to extend the Newman et al. (2015b) data set
In creating the CAMELS data set, we seek to achieve the following objectives:
1. In order to make a wide range of geophysical data sets available and comparable at the catchment scale, we compiled complementary catchment attributes from diverse data sources and synthesized them into a single coherent data set. These attributes have been available separately for some time, but comprehensive multivariate catchment-scale assessments have so far been difficult, because disparate data sets have different spatial configurations, are stored in different archives, and use different data formats. By creating catchment-scale estimates of these attributes, we simplify the assessment of their interrelationships.
2. To summarize meteorological forcing and discharge daily time series, we derived climate indices and hydrological signatures using the daily time series from N15. We selected climate indices and hydrological signatures that reduce the dimensionality of the hydroclimatic data sets, while preserving most of their information content. In other words, daily time series are rich in information, but summarizing this information makes catchment comparison easier.
3. To characterize catchment land cover, soil, and geology, we leveraged data sets not used in N15. The attributes we extracted are commonly used to explore catchment behavior and to support parameter estimation for hydrological and land surface models. A goal is to better assess how well those data sets capture the landscape features that matter for the storage and transfer of water across the landscape.
4. To define limitations in catchment attributes, our intention is not only to provide quantitative estimates of diverse catchment characteristics but also to explore and discuss limitations of those estimates. Catchment attributes are uncertain for different reasons, so we provide metadata of different kinds (e.g., the difference in basin area estimated using different data sources, or the fraction of soil poorly characterized). Our aims are (i) to raise awareness of uncertainties in geophysical attributes, which are frequently considered in a purely deterministic way, and (ii) to facilitate catchment selection based on the reliability of their attributes.
5. In order to ensure spatial consistency across the domain, we reduce the risk of generating artificial regional variations by using only data sets that cover the entire CONUS and not different data sets for different parts of the domain.
For most variables and catchments, the grid cells or polygons of the source data set (e.g., the remotely sensed land cover characteristics) are smaller than the catchments, making upscaling necessary. By default, the upscaling was done using the arithmetic mean, except where indicated otherwise.

Location and topography
Location information and topographic indices were extracted for each catchment by N15 (Table 1). We display these attributes on maps to introduce the main topographic features of the CONUS. Elevation obviously exerts a key control on catchment behavior (Fig. 1a), as it strongly influences a wide range of catchment attributes that we present in this paper, such as soil depth, land cover, the fraction of the precipitation falling as snow, or streamflow seasonality. Figure 1b illustrates that the eastern half of the CONUS is, with the exception of the Appalachian Mountains, much flatter than its western counterpart. Figure 1c shows the spatial distribution of catchment size and highlights that there are some large catchments: five catchments have an area greater than 10 000 km², and four of those are located in the Great Plains. Since we compute the catchment average of every attribute presented in this paper, it is important to keep in mind that those catchment averages become less meaningful as the catchment area increases. In the context of hydrological modeling, the larger the catchment, the greater the need to account for spatial heterogeneity using some kind of spatially distributed representation.
As explained earlier, our aim is to reveal weaknesses in catchment characteristics and to discuss the impacts of such weaknesses for hydrological modeling. One way to do so is to compare different estimates of the same quantity, for instance, catchment area. Two methods were used to determine the contours of each catchment: geospatial fabric (Viger, 2014; Viger and Bock, 2014) and GAGES II (Falcone, 2011). The polygons from geospatial fabric were instrumental in producing the N15 data set, since they were used to clip the gridded forcing data sets and the digital elevation model (from which elevation bands were derived), and importantly, they were used to estimate the area of each catchment, which enabled the conversion of discharge at the catchment outlet to average runoff depth over the catchment. It is hence essential to determine if the area computed from the geospatial fabric polygon is reliable. We compared it to the area computed using the GAGES II data set and computed the absolute relative error between the two estimates. In eight catchments, the error is greater than 100 % (red dots in Fig. 1d), and in 62 catchments, the relative error is greater than 10 % (red and orange dots). Several of these catchments are located in the Great Basin and in California where the geospatial fabric had difficulty identifying watershed boundaries. Additionally, the geospatial fabric was not designed to exactly replicate basin area above gauging locations, but rather its development focused on continental-scale hydrologic modeling; thus, some basin area discrepancies are inherent in the development of the geospatial fabric. We recommend not using catchments whose area differs strongly from the GAGES II estimate, as they are most likely erroneous in the geospatial fabric (e.g., Bock et al., 2016). Note that, in general, catchment delineation is more challenging in flat areas, but here errors in flat areas are relatively well contained, except in Florida.
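The area comparison described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and using the GAGES II area as the reference (denominator) is an assumption, since the text does not state which estimate is used to normalize the error.

```python
def area_relative_error(area_geospatial_km2, area_gages_km2):
    """Absolute relative error (%) between two catchment area estimates.

    Assumption: the GAGES II area is taken as the reference value.
    """
    if area_gages_km2 <= 0:
        raise ValueError("reference area must be positive")
    return 100.0 * abs(area_geospatial_km2 - area_gages_km2) / area_gages_km2


def area_flag(error_pct):
    """Map an error to the color classes mentioned for Fig. 1d."""
    if error_pct > 100.0:
        return "red"      # error greater than 100 %
    if error_pct > 10.0:
        return "orange"   # error between 10 and 100 %
    return "ok"


if __name__ == "__main__":
    err = area_relative_error(250.0, 100.0)
    print(err, area_flag(err))
```

A filter of this kind makes it straightforward to exclude catchments with unreliable areas before converting discharge to runoff depth.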

Data and methods
Climatic indices were derived using the N15 meteorological forcing data. N15 includes forcing from three data sets (NLDAS, Maurer, and Daymet), but for the computation of the indices only Daymet data were used. All the climate indices and hydrological signatures (Sect. 5) were computed for the period 1 October 1989 to 30 September 2009 (hydrological years 1990 to 2009). The choice of this period was based on the proportion of missing daily discharge measurements (the forcing time series were all extracted from gridded data sets and are all complete). We consider this period to be long enough to derive climatological indices (in particular when rare events are characterized) and short enough to be little impacted by the lack of daily discharge measurements at the beginning and end of the period covered in N15 (1980-2014; see Fig. 2).
There is a wide range of climatic indices in the literature. We selected indices with the goal of synthesizing this myriad of possibilities and of providing direct support to the study of hydrological processes (Table 2). These indices characterize dry periods, high precipitation events, and the baseline over two timescales: the daily timescale (e.g., frequency of high precipitation events) and the seasonal timescale (e.g., the proportion of precipitation falling as snow). At the seasonal timescale, we computed three indices: aridity, the fraction of the precipitation falling as snow, and the seasonality and timing of precipitation. These three indices were previously used for the classification of 321 catchments across the CONUS and were shown to provide relevant insights into the relationship between catchment behavior and their physiographic characteristics (Berghuijs et al., 2014; note that we use slightly different formulations of these indices; see Table 2). Aridity is defined as the ratio of mean annual potential evapotranspiration over the mean annual precipitation. The occurrence of snow was estimated for daily time steps using a temperature threshold of 0 °C.
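As a minimal sketch of the two simplest seasonal indices, the snippet below computes aridity and the snow fraction from daily series. The function names are hypothetical, and treating days strictly below the threshold as snow days is an assumption; the exact formulations are those referenced in Table 2.

```python
import numpy as np


def aridity_index(pet_mm_day, precip_mm_day):
    """Ratio of mean potential evapotranspiration to mean precipitation."""
    return float(np.mean(pet_mm_day) / np.mean(precip_mm_day))


def snow_fraction(precip_mm_day, temp_c, threshold_c=0.0):
    """Fraction of total precipitation falling on days colder than the threshold.

    Assumption: a day counts as a snow day when its temperature is
    strictly below threshold_c.
    """
    precip = np.asarray(precip_mm_day, dtype=float)
    temp = np.asarray(temp_c, dtype=float)
    return float(precip[temp < threshold_c].sum() / precip.sum())


if __name__ == "__main__":
    precip = [10.0, 0.0, 5.0, 5.0]
    temp = [-2.0, 3.0, 1.0, -1.0]
    pet = [2.0, 2.0, 2.0, 2.0]
    print(aridity_index(pet, precip), snow_fraction(precip, temp))
```

Because both series cover the same period, the ratio of daily means equals the ratio of mean annual totals.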
The seasonality and timing of precipitation are combined into a single metric, which relies on sine curves representing the annual cycle of precipitation and temperature. Note that sine curves do not necessarily provide a good fit to the annual precipitation cycle, for instance, in areas experiencing a strong annual cycle and multiple consecutive months with low precipitation, such as California (see Berghuijs and Woods, 2015 for a solution to this issue), yet they enable a first-order characterization of the dominant climatological features of diverse locations, which is useful for studies such as this one. These three seasonal indices provide a good overview of the mean and seasonal climatic conditions but do not explicitly consider dry periods and intense precipitation events, which occur at different timescales and are key drivers of droughts and floods. To fill this gap, we considered the frequency of dry days and high precipitation events, as well as the mean duration of these events, and determined the season during which most of the high precipitation events and dry days occur. This provides some insights into the precipitation regime (convective or stratiform) and phase (liquid or snow).
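The frequency and mean duration of dry days and high precipitation events can be sketched as follows. The thresholds are assumptions made for illustration (a dry day has less than 1 mm of precipitation; a high precipitation day has at least five times the mean daily precipitation) and should be checked against the definitions in Table 2. The function names are hypothetical, and the dominant season is not computed here.

```python
import numpy as np


def event_stats(is_event):
    """Return (frequency in days per year, mean duration in days)."""
    flags = np.asarray(is_event, dtype=bool)
    # Pad with False so every run of consecutive event days has a start and an end.
    padded = np.concatenate(([False], flags, [False])).astype(int)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    durations = ends - starts
    frequency = float(flags.sum() / flags.size * 365.25)
    mean_duration = float(durations.mean()) if durations.size else float("nan")
    return frequency, mean_duration


def precipitation_events(precip_mm_day, dry_threshold_mm=1.0, high_factor=5.0):
    """Dry-day and high-precipitation statistics (thresholds are assumptions)."""
    precip = np.asarray(precip_mm_day, dtype=float)
    dry = precip < dry_threshold_mm
    high = precip >= high_factor * precip.mean()
    return {"dry": event_stats(dry), "high": event_stats(high)}


if __name__ == "__main__":
    print(precipitation_events([0, 0, 20, 0, 0, 0, 0, 0, 0, 0]))
```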

Spatial variability in climatic indices
The annual precipitation cycle is strongest over the Pacific coast (maximum in winter), over the northern Great Plains, and Florida (maximum in summer) and weakest along the Atlantic coast (Fig. 3a). The fraction of precipitation falling as snow is highest over the Rocky, Cascade, and Sierra Nevada mountain ranges, followed by the Northeast and the Great Lakes regions (Fig. 3b). Aridity is highest over the Southwest, High Plains, and Great Plains, whereas the Northwest, Northeast, and the Appalachians are the most humid regions (Fig. 3c). High precipitation events occur most frequently in winter along the Pacific coast (Fig. 3f) and are relatively long lasting (Fig. 3e), which reflects their large (synoptic)-scale nature. In contrast, summertime convective systems (e.g., mesoscale systems) over the High Plains, Great Plains, and the upper and middle Mississippi Valley generate the most frequent high precipitation events. In the band stretching from Louisiana to Georgia, high precipitation events are most frequent in winter, as the result of the intense extratropical cyclone activity. The frequency of dry days (Fig. 3g) is closely related to aridity (Fig. 3c). Catchments located in the region stretching from California to Texas typically experience the longest periods of successive dry days, while those in the Northeast are at the other end of the spectrum (Fig. 3h). Dry days are particularly frequent in summer west of the Rocky Mountains, in winter in the Great Plains and Mississippi Valley, and in autumn in the Atlantic coast states (Fig. 3i).

Data and methods
Hydrological signatures were chosen using a similar rationale as for climate indices: we aimed to capture the hydrological baseline, as well as low-flow and high-flow events. All signatures were computed using daily discharge time series retrieved by N15 from the USGS for the period 1 October 1989 to 30 September 2009 (Fig. 2).
We selected signatures from the set that Sawicz et al. (2011) used to explore the similarity between 280 catchments in the eastern US and classify them (Table 3). The runoff ratio indicates how much of the long-term precipitation leaves the catchment as streamflow, thereby reflecting losses to evapotranspiration and groundwater. We use the slope of the flow duration curve to characterize streamflow variability: steeper flow duration curves indicate greater variability over the year. The slope is computed between the log-transformed 33rd and 66th streamflow percentiles. In intermittent streams, the frequency of days with no flow can be greater than 33 %, so that Q33 = 0 mm day⁻¹. Since the logarithm of zero is undefined, the slope of the flow duration curve is then undefined. The contribution of baseflow to the total discharge is estimated by the baseflow index computed by hydrograph separation using a digital filter implemented by Ladson et al. (2013). It has to be recognized that the technique used for the separation influences the estimated baseflow index (see Beck et al., 2013 and Ladson et al., 2013 for recent examples), yet hydrograph separation can provide valuable insights into catchment behavior (e.g., Harman et al., 2011), and the baseflow index has proven to be a useful variable to compare and classify large samples of catchments (e.g., Sawicz et al., 2011; Beck et al., 2016). Further, catchment response to a change in precipitation, which is particularly relevant in the context of climate change (e.g., Vano et al., 2015), was evaluated by computing the elasticity between annual precipitation and discharge. Finally, we characterized discharge seasonality using the half-flow date. This indicator is frequently used to quantify the impacts of climate change on the hydrology of snow-dominated catchments (e.g., Court, 1962; Stewart et al., 2005; Addor et al., 2014). The half-flow dates have been shown to occur earlier, as temperature increases can force both an earlier onset of snowmelt and a higher proportion of precipitation falling as rain.
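Three of the signatures described above can be sketched as follows. This is an illustration under stated assumptions, not the exact formulation of Table 3: the function names are hypothetical, the slope is normalized by the difference in percentile fractions (0.66 − 0.33) with a sign chosen so that more variable flow gives a larger positive slope, and the half-flow date is returned as a day count from the start of the hydrological year.

```python
import numpy as np


def runoff_ratio(q_mm_day, p_mm_day):
    """Mean daily discharge divided by mean daily precipitation."""
    return float(np.mean(q_mm_day) / np.mean(p_mm_day))


def fdc_slope(q_mm_day):
    """Slope between the log-transformed 33rd and 66th streamflow percentiles.

    Returns NaN when the 33rd percentile is zero (intermittent streams),
    because the logarithm of zero is undefined.
    """
    q33, q66 = np.percentile(np.asarray(q_mm_day, dtype=float), [33, 66])
    if q33 <= 0.0:
        return float("nan")
    return float((np.log(q66) - np.log(q33)) / (0.66 - 0.33))


def half_flow_day(q_one_hydro_year):
    """Day (1 = 1 October) on which cumulative discharge reaches half the annual total."""
    cumulative = np.cumsum(np.asarray(q_one_hydro_year, dtype=float))
    return int(np.searchsorted(cumulative, 0.5 * cumulative[-1])) + 1


if __name__ == "__main__":
    q = np.arange(1.0, 101.0)
    print(runoff_ratio(q, 4.0 * q), fdc_slope(q), half_flow_day(q))
```

The baseflow index and the precipitation elasticity are omitted because they depend on the specific filter and estimator cited in the text.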
Since the hydrological signatures introduced so far do not explicitly consider low- and high-flow events, we defined high- and low-flow days using thresholds based on the median and mean daily flow, respectively (Clausen and Biggs, 2000; Olden and Poff, 2003; Westerberg and McMillan, 2015). We computed the average duration and average frequency of high- and low-flow events. We also extracted 5th and 95th percentiles (Q5 and Q95, respectively) from the flow duration curve to characterize those events.
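A possible implementation of these event-based signatures is sketched below. The multipliers are assumptions introduced for illustration (high-flow days exceed nine times the median daily flow; low-flow days fall below 0.2 times the mean daily flow); the text only states that the thresholds are based on the median and the mean. The function names are hypothetical.

```python
from itertools import groupby

import numpy as np


def run_lengths(flags):
    """Lengths of the runs of consecutive True values in a list of booleans."""
    return [sum(1 for _ in group) for is_event, group in groupby(flags) if is_event]


def flow_event_signatures(q_mm_day, high_factor=9.0, low_factor=0.2):
    """Frequency and mean duration of high- and low-flow events, plus Q5 and Q95."""
    q = np.asarray(q_mm_day, dtype=float)
    masks = {
        "high": (q > high_factor * np.median(q)).tolist(),  # assumed threshold
        "low": (q < low_factor * q.mean()).tolist(),        # assumed threshold
    }
    out = {}
    for name, flags in masks.items():
        runs = run_lengths(flags)
        out[name + "_freq_days_per_year"] = 365.25 * sum(flags) / len(flags)
        out[name + "_mean_duration_days"] = (
            sum(runs) / len(runs) if runs else float("nan")
        )
    q5, q95 = np.percentile(q, [5, 95])
    out["q5"], out["q95"] = float(q5), float(q95)
    return out


if __name__ == "__main__":
    print(flow_event_signatures([1.0] * 8 + [100.0, 100.0]))
```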

Spatial variability in hydrological signatures
The mean daily discharge and runoff ratio are strongly correlated (Fig. 4a and b) and present clear similarities to catchment aridity (Fig. 3c). In the Great Plains, where the evaporative demands exceed available precipitation (aridity > 1), more than 80 % of the precipitation is evaporated (runoff ratio < 0.2) and the mean annual discharge is often as low as 0.3 mm day⁻¹. In contrast, in the Pacific Northwest, precipitation is often twice as high as PET (aridity < 0.5). Both the runoff ratio and the mean annual discharge are higher in the Pacific Northwest than in the Northeast, as can be expected from the seasonality of precipitation, which peaks in winter in the Pacific Northwest (Fig. 3a). Most of the discharge flows during the first half of the year in the Pacific Northwest (Fig. 4c). Streamflow is in contrast delayed by snow accumulation in the Rocky Mountains; it is also late in the Midwest (in part because of the seasonality of the precipitation, which peaks in summer), and in contrast early in the band stretching from eastern Texas to South Carolina (which is consistent again with the seasonality of precipitation). Similarities exist between the patterns of the slope of the flow duration curve and the baseflow index (Fig. 4d and e), with lower baseflow index and higher slopes both indicating flashy catchments, a clear example being the area stretching from east Kansas to Kentucky. Finally, at the annual scale, the discharge of more arid catchments tends to react more strongly to annual precipitation anomalies (Fig. 4f; see also Harman et al., 2011).
The frequencies of low- and high-flow events are correlated (Fig. 4g and j) and, by definition, both frequencies are low in catchments with a low slope of the flow duration curve (Fig. 4d). High flows are least frequent and most short-lived in the Pacific Northwest and in the Appalachian Mountains, and when they occur, their absolute discharge is higher than in other regions (Fig. 4i). Q5 is more than 10 times higher in the Pacific Northwest and in the Appalachian Mountains than in the most arid catchments, which reflects the capacity of these humid catchments to sustain baseflow.
Note that even though spatial patterns emerge from the maps in Fig. 4, they tend to be less smooth than those of climate indices shown in Fig. 3. In other words, there can be some strong variations over short distances, for instance, in the slope of the flow duration curve or in signatures related to extreme (high and low) streamflow conditions. Plausible explanations are that (i) hydrological signatures are the end result of the interactions between several non-linear processes (as opposed to the smaller number of processes controlling, for instance, the fraction of precipitation falling as snow), and (ii) hydrological signatures are sensitive to uncertainties in discharge measurements (Westerberg et al., 2016), which we suspect to contribute to sudden variations over short distances in the maps of particularly sensitive signatures, such as the slope of the flow duration curve or signatures related to extreme streamflow conditions.
6 Land cover characteristics

Data and methods
We considered two key indicators of vegetation density: the leaf area index (LAI) and the green vegetation fraction (GVF), which approximate the vertical and horizontal density of vegetation, respectively. We used the 1 km land cover products derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) data to estimate their climatological monthly values over 2002-2014. LAI is defined as the one-sided green leaf area per unit ground area in broadleaf canopies and as half the total needle surface area per unit ground area in coniferous canopies. We extracted the maximum monthly LAI, which can be used to constrain the maximum evaporative capacity and vegetation interception capacity in models. Seasonal variations in LAI are principally related to trees growing and shedding their leaves. To quantify these variations, we computed the difference between the maximum and minimum monthly LAI. In absolute terms, these variations are highest in areas of deciduous broadleaf forest. The GVF can be used in models to estimate the proportion of each grid cell covered by vegetation (1 minus the GVF gives the fraction of the grid cell from which evaporation occurs directly from the soil). Variations in the GVF are particularly high for croplands as a result of the growing and harvesting of the crops. As for the LAI, we extracted the maximum monthly value of the GVF and the difference between the maximum and minimum monthly values.
Additionally, we included the land cover class based on the International Geosphere-Biosphere Programme (IGBP) classification (Belward, 1996) derived from MODIS data. For each catchment, we defined the dominant land cover class as the most frequent class based on all the grid points fully or partially contained in the basin (each grid cell was weighted based on how much of it was contained within the basin boundaries). The fraction of the catchment covered by the dominant class is an indicator of the representativeness of the dominant class for the whole catchment.
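The weighting scheme for the dominant land cover class can be illustrated with a short sketch. The function name and the input layout are hypothetical, and grid cells are assumed to have equal area, so that the fraction of each cell inside the basin can be used directly as its weight.

```python
from collections import defaultdict


def dominant_land_cover(cells):
    """Return (dominant class, fraction of the catchment it covers).

    cells: iterable of (land_cover_class, fraction_of_cell_inside_basin).
    Assumption: all grid cells have the same area.
    """
    totals = defaultdict(float)
    for land_cover_class, weight in cells:
        totals[land_cover_class] += weight
    covered = sum(totals.values())
    dominant = max(totals, key=totals.get)
    return dominant, totals[dominant] / covered


if __name__ == "__main__":
    cells = [
        ("deciduous broadleaf forest", 1.0),
        ("deciduous broadleaf forest", 0.5),
        ("croplands", 1.0),
        ("grasslands", 0.5),
    ]
    print(dominant_land_cover(cells))
```

A low returned fraction signals that the dominant class is a poor summary of a heterogeneous catchment.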
Finally, based on the IGBP land cover class of each grid point, we approximated the root-depth distribution based on Zeng (2001). The distribution is estimated using a two-parameter equation, the value of these parameters being dependent on the IGBP land cover. The root fraction decreases exponentially with soil depth: the depth of the soil layer encompassing the top 50 % of the root system is typically between 0.12 and 0.26 m depending on the land cover, and for the top 99 % of the root system, this depth is typically between 1.4 and 2.4 m and is often named "rooting depth".
We computed the root-depth distribution for each grid point based on its land cover. We then extracted the values associated with the following percentiles: 10, 25, 50, 75, and 99 %. For each percentile, the catchment average was estimated using the arithmetic mean. Table 4 provides the complete list of land cover attributes that we considered.
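The root-depth computation can be sketched as follows, assuming the two-parameter form Y(d) = 1 − 0.5 (exp(−a d) + exp(−b d)) for the cumulative root fraction Y above depth d. Both this form and the parameter values in the usage example are assumptions made for illustration and should be checked against Zeng (2001); the function names are hypothetical.

```python
import math


def root_fraction(depth_m, a, b):
    """Cumulative root fraction above depth_m (assumed two-parameter form)."""
    return 1.0 - 0.5 * (math.exp(-a * depth_m) + math.exp(-b * depth_m))


def depth_for_fraction(fraction, a, b, max_depth_m=10.0, tol=1e-6):
    """Depth (m) containing the given root fraction, found by bisection."""
    low, high = 0.0, max_depth_m
    while high - low > tol:
        mid = 0.5 * (low + high)
        if root_fraction(mid, a, b) < fraction:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def catchment_root_depths(cell_params, fractions=(0.10, 0.25, 0.50, 0.75, 0.99)):
    """Arithmetic mean over grid cells of the depth for each root fraction.

    cell_params: list of (a, b) pairs, one per grid cell.
    """
    return {
        f: sum(depth_for_fraction(f, a, b) for a, b in cell_params) / len(cell_params)
        for f in fractions
    }


if __name__ == "__main__":
    # Illustrative parameter values only (assumption, in 1/m).
    print(catchment_root_depths([(6.706, 2.175), (6.706, 2.175)]))
```

With these illustrative parameters, the 50 % and 99 % depths fall within the ranges quoted in the text, which is a useful sanity check on the assumed form.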

Spatial variability in land cover characteristics
The maximum LAI and GVF are highly correlated (Fig. 5a and d; see also the mean value for each land cover class in Fig. 5j), which reflects that short vegetation tends to be sparse and forests of taller trees tend to be dense, but could also indicate that the MODIS data used to compute these two fields do not enable us to fully differentiate between vertical and horizontal vegetation density. The spatial patterns of these two fields are similar to those of the fraction of forest (Fig. 5c, positive correlation) and aridity (Fig. 3c, negative correlation), with arid catchments typically associated with a lower LAI and lower GVF. Note that because the catchments selected are minimally impacted by human activities, none of them are classified as predominantly urban (Fig. 5f).
Hydrol. Earth Syst. Sci., 21, 5293-5313, 2017, www.hydrol-earth-syst-sci.net/21/5293/2017/
The amplitude of the seasonal variations of LAI is strongly linked to the LAI maximum (Fig. 5a and b). Overall, catchments dominated by land cover classes with a high LAI (e.g., deciduous broadleaf forest or mixed forest; Fig. 5j) tend to experience a significant increase and drop of LAI, a clear exception to this rule being evergreen broadleaf forests (Fig. 5j). The seasonal variations of GVF are particularly high for croplands, which is expected and reflects the growing-harvesting cycle. However, note that there are also important seasonal changes in the catchments dominated by deciduous broadleaf forests (Fig. 5j), although the horizontal tree density does not change significantly. Again, this suggests that the MODIS data do not enable us to fully differentiate between vertical and horizontal vegetation density. Users might hence decide to consider LAI only to summarize seasonal land cover variations.
To explore spatial variations in rooting depth, we used aridity and depth to bedrock (introduced in Sect. 7.1). Figure 5k shows that in water-limited catchments (aridity > 1) the rooting depth increases with aridity, which can be interpreted as a sign that trees increase their root-zone storage capacity to compensate for the overall lack of water. In more humid catchments (aridity < 1), shallow soils seem to constrain the vertical development of roots of tall trees like evergreen needleleaf forest and deciduous broadleaf forest, and in contrast, mixed forest and evergreen broadleaf forest can develop deeper roots. These hypotheses illustrate how the CAMELS data set enables the joint exploration of diverse attributes for a large number of catchments.
7 Soil characteristics

Data and methods
The soil characteristics we derived are principally based on the State Soil Geographic Database (STATSGO) data set post-processed by Miller and White (1998). Miller and White (1998) discretized the top 2.5 m of soil into 11 layers, whose thickness increases with depth (from 5 cm for the two top layers to 50 cm for the three deepest ones). For each layer, they relied on the original STATSGO data to determine the dominant soil texture class. They considered a total of 16 classes: the 12 standard United States Department of Agriculture (USDA) soil texture classes plus 4 additional non-soil classes characterized as organic material, water, bedrock, and other. We estimated the saturated hydraulic conductivity and porosity (saturated volumetric water content) of each layer using the multiple regressions relying on sand and clay fraction originally proposed by Cosby et al. (1984) and now commonly used for land surface modeling (e.g., Lawrence and Slater, 2008). For organic material, we used default values for the saturated hydraulic conductivity and porosity based on Lawrence and Slater (2008). Then, for each STATSGO polygon, we computed the average of each soil characteristic (see list in Table 5) over the top 1.5 m of soil using the following weighted mean:

$$X_p = \frac{1}{S_{\mathrm{depth}}} \sum_{i=1}^{9} X_i \, T_i,$$

where $X_p$ designates the mean value of the variable $X$ over the top 1.5 m of soil (nine top layers; see first and second points below), $X_i$ is its value over layer $i$, $T_i$ is the thickness of layer $i$, and $S_{\mathrm{depth}}$ is the cumulative depth of the layers. Then, for each catchment, we computed the weighted mean of the soil characteristics of the STATSGO polygons within the catchment, the weight being the fraction of the catchment covered by each polygon. For hydraulic conductivity, the harmonic mean was used instead of the arithmetic mean for the averaging along each soil column and across the catchment (see Samaniego et al., 2010 for a discussion on upscaling operators).
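The two averaging operators can be sketched as below. The layer thicknesses are an assumption chosen to be consistent with the description above (5 cm for the two top layers, 50 cm for the ninth, 1.5 m in total) and should be checked against Miller and White (1998); the function names are hypothetical. The same functions apply to the catchment-scale step when the weights are the fractions of the catchment covered by each polygon.

```python
import numpy as np

# Assumed thicknesses (m) of the nine top layers; they sum to 1.5 m.
LAYER_THICKNESS_M = np.array([0.05, 0.05, 0.10, 0.10, 0.10, 0.20, 0.20, 0.20, 0.50])


def weighted_arithmetic_mean(values, weights=LAYER_THICKNESS_M):
    """Weighted mean: sum(X_i * T_i) / sum(T_i)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(values * weights) / np.sum(weights))


def weighted_harmonic_mean(values, weights=LAYER_THICKNESS_M):
    """Weighted harmonic mean, used here for hydraulic conductivity.

    All values must be strictly positive.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights) / np.sum(weights / values))


if __name__ == "__main__":
    porosity = [0.45] * 9
    conductivity = [10.0, 10.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0, 0.5]
    print(weighted_arithmetic_mean(porosity), weighted_harmonic_mean(conductivity))
```

The harmonic mean is dominated by the least conductive layers, which matches the intuition that vertical flow through a soil column is limited by its tightest layer.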
Before we start interpreting the results of the aggregation of STATSGO data to the catchment scale, we consider it important to discuss some key limitations of STATSGO. Those limitations were already underscored by Miller and White (1998) and also affect more recent soil data sets. It is our impression that although they reduce our ability to correctly reflect soil properties in hydrological models, they are commonly overlooked.
1. Limited depth. Miller and White (1998) note that "[...] only about 2.5 % of the components have layers extending below 203 cm (80 in). Accordingly, the bottom two standard layers contain meaningful data only for a minority of the map units". In other words, although the STATSGO data set is often perceived as describing the top 2.5 m of soil, over the majority of the CONUS only the top 1.5 m are covered, and data from the bottom 1 m in those areas are potentially misleading.

2. Low information content in the deepest layers. They warn the reader that "for approximately half the components, the minimum and maximum depth to bedrock [...] both have the value 152 cm (60 in); in the great majority of these cases, this indicates that this was the maximum depth to which soil was normally examined and bedrock was not actually encountered". This means that when the two last layers (1.5 to 2.5 m deep) are marked as bedrock, in about half of the cases, the bedrock has not actually been reached, which leads to an underestimation of the soil depth. Given these limitations, we decided to restrict our attention to the top 1.5 m of soil (i.e., to the top nine layers).

3. Only fine fraction characterized. The STATSGO sand, clay, and silt fractions only describe the portion of soil that is finer than 2 mm. That is, STATSGO data should certainly not be considered to be representative of the whole soil column, but it is also important to keep in mind that they do not completely characterize its top part either, since only the soil fraction finer than 2 mm is considered.

4. Lack of representativeness of the dominant soil texture class. Miller and White (1998) stress that STATSGO "units may be quite internally heterogeneous, with as much as 50 % of the map unit having soil properties that differ significantly from the map unit description".

5. Scale inadequacy. Although soil hydraulic properties can be measured in a lab, it is still unclear how to meaningfully upscale them to the catchment scale (these quantities can be characterized as incommensurate; Beven, 2012).
In a general sense, soil data sets only characterize the top soil layers, even when the soil can be much deeper (first and second points raised above). In fact, the soil depth of a catchment indicated as 1.5 m in STATSGO can be an order of magnitude greater. Uncertainties in soil depth are critical for hydrological modeling, in particular because they hamper the determination of the root-zone storage capacity (Boer-Euser et al., 2016). To explore those uncertainties, we included a recently released soil data set (Pelletier et al., 2016, hereafter referred to as P16), from which we extracted the thickness of the permeable layers above bedrock, i.e., the depth to bedrock. The principal advantage of this data set is that it covers the top 50 m of soil. It comes on a global 30 arcsec (∼ 1 km) grid. We estimated the catchment average by computing the mean from all the grid points falling within each catchment. However, P16 does not provide information on soil texture classes, so it cannot be used to estimate variables like the saturated hydraulic conductivity or the porosity. Another key difference is that P16 leveraged geomorphological principles to obtain more precise estimates than what would be obtained by interpolating soil pit observations. We do not explicitly deal with the third and fifth points in this study but expect them to cause lower-than-expected performance when hydrological modeling relies on STATSGO or similar data sets.

Spatial variability in soil characteristics
Once aggregated to the catchment scale, STATSGO data reveal the following features. Catchments with a sand fraction greater than 50 % are predominantly located along the Gulf Coast and the Atlantic coast, and in the Great Lakes region (Fig. 6a). This leads to a relatively low porosity fraction and high saturated hydraulic conductivity (Fig. 6d and e). Conversely, catchments with a silt fraction greater than 50 % are mostly located in a band stretching from Kansas to New York (Fig. 6b). Catchments in this band also tend to feature a comparatively large clay fraction (Fig. 6c). This implies a higher-than-average porosity fraction (Fig. 6d). Although variables like porosity and saturated hydraulic conductivity are commonly relied upon for parameter estimation, we note that their value in terms of process understanding at the catchment scale should not be overestimated, given the limitations outlined in Sect. 7.1. As for soil depth, STATSGO and P16 both indicate that the soil is shallower in the Appalachian Mountains than along the Gulf and Atlantic coasts. There are, however, disagreements in the Rocky Mountains (e.g., in Colorado) and in the Pacific Northwest: STATSGO indicates a soil depth equal to or greater than 1.5 m when the depth to bedrock according to P16 is smaller than 1 m. The lack of quantitative agreement between the two data sets appears clearly in Fig. 6i. The left-hand part of the figure (orange background) includes all the catchments in which there can potentially be an agreement between STATSGO and P16, since the estimated depth to bedrock is equal to or smaller than 1.5 m. There is, however, considerable scatter around the 1 : 1 orange curve, which illustrates the uncertainty in estimates of the soil depth, which directly impact estimates of maximum water content of soils (Fig. 6f). This issue is even clearer when the right-hand side of Fig. 6i is considered. In about half of the catchments (47 %), the depth to bedrock is greater than 1.5 m, so it cannot be covered by STATSGO. For 24 % of the catchments, the depth to bedrock estimated using P16 is greater than 15 m, i.e., 10 times the depth covered by STATSGO. This underscores the inability of data sets like STATSGO to provide a realistic characterization of soils in areas of high sedimentary deposition.
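The kind of comparison summarized in Fig. 6i can be reproduced for any pair of depth estimates with a few lines of code. The sketch below is illustrative only: the function name, the dictionary keys, and the input values are hypothetical, and the only quantities taken from the text are the 1.5 m limit and the factor of 10.

```python
import numpy as np

STATSGO_MAX_DEPTH_M = 1.5  # depth covered by the nine top STATSGO layers


def compare_depth_estimates(p16_depth_m, statsgo_depth_m, limit=STATSGO_MAX_DEPTH_M):
    """Summarize how far two catchment-scale depth estimates can agree."""
    p16 = np.asarray(p16_depth_m, dtype=float)
    sgo = np.asarray(statsgo_depth_m, dtype=float)
    # Agreement is only possible where both estimates are within the limit.
    comparable = (p16 <= limit) & (sgo <= limit)
    diff = np.abs(p16 - sgo)[comparable]
    return {
        "frac_comparable": float(comparable.mean()),
        "frac_p16_beyond_limit": float((p16 > limit).mean()),
        "frac_p16_beyond_10x_limit": float((p16 > 10 * limit).mean()),
        "mean_abs_diff_comparable_m": float(diff.mean()) if diff.size else float("nan"),
    }


# Usage with made-up depths (m) for four catchments.
summary = compare_depth_estimates([0.5, 1.0, 2.0, 20.0], [1.5, 1.0, 1.5, 1.5])
```

Applied to the full set of catchments, the second and third entries correspond to the type of percentages reported above for depths beyond 1.5 m and beyond 15 m.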
Finally, we considered three metrics that can be considered as metadata. The fraction of the catchment characterized as "water" is relevant because it indicates the presence of lakes (Fig. 6j). The "organic" fraction, which strongly affects soil hydraulic properties, is negligible in most catchments, but is non-negligible in many catchments in Florida and in the Great Lakes region (Fig. 6k). The fraction of soil marked as "other" (for which no soil characteristics are available and which is ignored in the computation of all soil attributes) is significant in many catchments (Fig. 6l). How detrimental that is will depend on the application. One way to assess this effect when using soil characteristics to explain the performance of hydrological models would be to test whether a clearer relationship is obtained by progressively excluding catchments with the highest fraction of soil marked as "other".
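The screening test suggested above could be set up along the following lines. This is only a sketch under stated assumptions: the function and argument names are hypothetical, the thresholds are arbitrary, and the Pearson correlation is used as a simple stand-in for whatever measure of relationship strength suits the application.

```python
import numpy as np


def correlation_by_other_threshold(soil_attribute, model_score, other_frac,
                                   thresholds=(1.0, 0.5, 0.25, 0.1)):
    """Correlation between a soil attribute and a model score, recomputed
    after excluding catchments whose "other" fraction exceeds each threshold."""
    a = np.asarray(soil_attribute, dtype=float)
    s = np.asarray(model_score, dtype=float)
    o = np.asarray(other_frac, dtype=float)
    results = []
    for t in thresholds:
        keep = o <= t
        n = int(keep.sum())
        # At least three catchments are required for a meaningful correlation.
        r = float(np.corrcoef(a[keep], s[keep])[0, 1]) if n >= 3 else float("nan")
        results.append((t, n, r))
    return results


# Usage with made-up values: the last two catchments have a large "other" fraction.
results = correlation_by_other_threshold(
    soil_attribute=[1, 2, 3, 4, 5, 6],
    model_score=[1, 2, 3, 4, 0, 9],
    other_frac=[0.0, 0.0, 0.0, 0.0, 0.8, 0.9],
    thresholds=(1.0, 0.5),
)
```

A correlation that strengthens as the threshold is lowered would suggest that the unclassified soil fraction was obscuring the relationship, keeping in mind that the sample size shrinks at the same time.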

Data and methods
We used two complementary global data sets to characterize the geology of each catchment. The first data set is the Global Lithological Map (GLiM) by Hartmann and Moosdorf (2012). GLiM synthesizes lithological data from 92 regional maps spread across the globe. The spatial resolution is remarkable, as GLiM relies on ∼ 1.2 million polygons to discretize the Earth's surface. Three levels of detail are available. In this study, we focus on the first level, while the two other levels provide further details that could be processed at a later stage. The first level differentiates between 16 lithological classes (see the list of classes in the legend of Fig. 7). We determined the contribution of each lithological class to the area of each catchment, and recorded the first and second most frequent classes within the catchment, as well as the fraction of the catchment they cover. The class "carbonate sedimentary rocks" is particularly relevant from a hydrological perspective (it designates areas likely to host karst systems); we hence also recorded the fraction of each catchment associated with this class. Finally, note that although a 0.5° gridded version of GLiM is available, we used the more detailed polygon-based version for this study.
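As an illustration of the lithological summary described above, the sketch below derives the first and second most frequent classes and the carbonate fraction from a list of polygon pieces. The function name, the input layout (class name and area of each polygon piece intersecting the catchment), and the output keys are assumptions made for this example; the intersection of GLiM polygons with catchment boundaries is assumed to have been done beforehand.

```python
from collections import defaultdict

CARBONATE = "carbonate sedimentary rocks"


def lithology_summary(polygon_pieces):
    """Summarize lithology from (class_name, area) pairs within one catchment.

    Assumes at least one piece with a positive area.
    """
    area_by_class = defaultdict(float)
    for class_name, area in polygon_pieces:
        area_by_class[class_name] += area
    total_area = sum(area_by_class.values())
    fractions = {c: a / total_area for c, a in area_by_class.items()}
    # Rank classes by the fraction of the catchment they cover.
    ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
    first_class, first_frac = ranked[0]
    second_class, second_frac = ranked[1] if len(ranked) > 1 else (None, 0.0)
    return {
        "first_class": first_class,
        "first_frac": first_frac,
        "second_class": second_class,
        "second_frac": second_frac,
        "carbonate_frac": fractions.get(CARBONATE, 0.0),
    }


# Usage with made-up areas (km2).
summary = lithology_summary([
    ("siliciclastic sedimentary rocks", 3.0),
    ("carbonate sedimentary rocks", 3.0),
    ("siliciclastic sedimentary rocks", 2.0),
    ("metamorphic rocks", 2.0),
])
```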
The second data set we used to characterize catchment geology is the GLobal HYdrogeology MaPS (GLHYMPS) of the subsurface permeability and porosity by Gleeson et al. (2014). GLHYMPS is based on GLiM spatial polygons, so its level of spatial detail is equally high. Gleeson et al. (2014) principally relied on GLiM lithologic classes to derive quantitative estimates of two key characteristics of the geologic units below soil horizons: porosity and permeability (i.e., the ease of fluid flow through porous rocks and soils). For CAMELS, we produced catchment averages of these two variables, the contribution of each spatial polygon being weighted by the fraction of catchment it covers. The arithmetic mean was used for porosity, but for permeability, we followed Gleeson et al. (2011) and used the geometric mean instead. The geological attributes are summarized in Table 6.
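The difference between the two averaging operators can be illustrated with a short sketch. The function names and input values are hypothetical, and permeability is assumed here to be given in linear units (m2) rather than as a logarithm.

```python
import numpy as np


def catchment_porosity(porosity, area_fraction):
    """Area-weighted arithmetic mean of polygon porosities."""
    return float(np.average(porosity, weights=area_fraction))


def catchment_permeability(permeability_m2, area_fraction):
    """Area-weighted geometric mean of polygon permeabilities."""
    k = np.asarray(permeability_m2, dtype=float)
    w = np.asarray(area_fraction, dtype=float)
    w = w / w.sum()  # normalize weights so that they sum to 1
    # Geometric mean = exponential of the weighted mean of the logarithms.
    return float(np.exp(np.sum(w * np.log(k))))


# Usage: two polygons covering half of the catchment each.
phi = catchment_porosity([0.1, 0.3], [0.5, 0.5])
k_geo = catchment_permeability([1e-12, 1e-14], [0.5, 0.5])
k_arith = float(np.average([1e-12, 1e-14], weights=[0.5, 0.5]))
```

Because permeability spans several orders of magnitude, the arithmetic mean (`k_arith`) is dominated by the most permeable polygon, whereas the geometric mean (`k_geo`) averages the orders of magnitude.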
A clear advantage of these high-resolution global lithological maps is that they can be used to extract catchment-scale attributes for diverse parts of the globe. Yet, data quality is spatially variable, and caveats of GLiM and GLHYMPS (outlined in Sect. 3 of Gleeson et al., 2014) should be kept in mind. In particular, there are unrealistic spatial discontinuities coinciding with jurisdictional boundaries in GLiM maps, which by construction also affect GLHYMPS maps (for instance, in the region of North and South Dakota).

Spatial variability in geological characteristics
The four most frequent dominant geological classes in CAMELS catchments are siliciclastic sedimentary rocks (34 % of the catchments), unconsolidated sediments (19 %), metamorphic rocks (16 %), and carbonate sedimentary rocks (12 %). Unconsolidated sediments dominate in catchments along the Gulf Coast and along the southern to middle Atlantic coast (Fig. 7a). In those catchments, both the subsurface porosity (Fig. 7f) and permeability (Fig. 7g) are high. The Pacific coast and the region north of the Appalachian Mountains feature catchments rich in siliciclastic sedimentary rocks, leading to a comparatively low subsurface permeability. To the south of the Appalachian Mountains, metamorphic rocks are dominant, resulting in a particularly low subsurface porosity. Finally, the catchments with the highest proportion of carbonate sedimentary rocks are principally located in central-western Texas, in the region stretching from Lake Michigan to and including Missouri (Fig. 7a and e) and to some extent in the Appalachian Mountains (Fig. 7b). In addition to these three main regions, there are also isolated catchments with a high proportion of carbonate rocks, for instance, in Florida, Nevada, and Vermont. The subsurface permeability of those catchments is high. Overall, in 18 % of the CAMELS catchments, there is only one GLiM lithological type (Fig. 7c), while in 11 % of the catchments, the dominant geological class accounts for less than 50 % of the catchment area (Fig. 7d).

Comparison with the MOPEX data set
The CAMELS data set is similar to the data set produced for MOPEX (Duan et al., 2006; Schaake et al., 2006) in that it provides hydroclimatic time series and geophysical attributes for a large number of basins in the CONUS. MOPEX data have been used in a large number of studies, including two catchment similarity studies mentioned earlier (Sawicz et al., 2011; Berghuijs et al., 2014). For CAMELS, we use different criteria for catchment selection than those used for MOPEX, which leads to a relatively small overlap between the two data sets (they have 52 catchments in common; see Fig. 8). The main differences between MOPEX and CAMELS are summarized in Table 7.
Both MOPEX and CAMELS require long observation time series and exclude catchments subject to human influence, but they use different approaches to characterize these aspects. For MOPEX, the stations that are part of the hydroclimatic data network (HCDN; Slack and Landwehr, 1992) together with those selected by Wallis et al. (1991) were con-  Lins, 2012): some catchments were excluded (e.g., because they no longer met the minimal disturbance criteria defined in the original HCDN report) and other catchments were added (e.g., because their streamflow records, which were considered too short when the original HCDN report was published, became long enough).
For a catchment to be part of the MOPEX data set, an essential criterion was that its number of rain gauges had to be higher than a threshold based on the catchment area. This led to the exclusion of 77 % of the potential MOPEX basins, resulting in only 438 basins considered to have a dense enough network of gauges. Although we do recognize the importance of reliable precipitation data for hydrological modeling, we did not exclude catchments based on their rain gauge density. We argue that uncertainties in precipitation estimates (and in forcing in general) can now be assessed using independent data sets (e.g., Newman et al., 2015a; see also Sect. 10 of this paper). We also consider that uncertainties in observed time series (in particular in discharge records; see Coxon et al., 2015; McMillan and Westerberg, 2015) and uncertainties in catchment attributes (e.g., soil depth; see discussion in Sect. 7) can also lead to biased conclusions on hydrological processes and hence should also be considered in the catchment selection processes. Yet the influence of these sources of uncertainties on research results will depend on the catchments and variables of interest, so we leave it to the users to define their own criteria.
Overall, the data used for CAMELS are more recent than those used for MOPEX. The period covered by hydrometeorological time series is 1948-2003 for MOPEX and 1980-2015 for CAMELS, so given the fast rate of human development and the impacts caused by climate change, CAMELS provides a more current picture of hydrological processes in the United States. Further, CAMELS leverages new data sets, which were not available when the MOPEX data were released, for instance, to characterize soils (Pelletier et al., 2016) and geology: GLiM (Hartmann and Moosdorf, 2012) and GLHYMPS (Gleeson et al., 2014). Importantly, data used for CAMELS are not only more recent but also tend to be better documented. A clear example is that CAMELS meteorological time series come from three widely used gridded data sets (Daymet, Maurer, and NLDAS), while for MOPEX, station measurements were aggregated to provide catchment-scale estimates (Schaake et al., 2006).
Finally, we present a more detailed and transparent description of the origins and limitations of the data sets used to derive catchment attributes. A substantial part of this paper is dedicated to the discussion of the limitations of the source data sets, and we use competing approaches to estimate the same quantity, thereby revealing uncertainties in those attributes. This is motivated by the belief that identifying weaknesses in catchment attributes helps us to anticipate how they might bias conclusions of hydrological studies.
10 Online availability and possible future extensions

In summary, the CAMELS data set is the combination of two data sets, which are available for download separately:
1. the hydrometeorological time series introduced in N15 (Newman et al., 2014; https://doi.org/10.5065/D6MW2F4D) and
2. the catchment attributes introduced in this paper (Addor et al., 2017; https://doi.org/10.5065/D6G73C3Q).
Our intention with this paper is to provide quantitative estimates of key geophysical attributes that shape catchment behavior. We see the data set in its current state as a starting point and anticipate that it will keep evolving and become more exhaustive. In particular, our next priority is to compute and make available network characteristics and descriptors of catchment geometry, such as drainage density and stream order statistics, which are important for the understanding and simulation of hydrological processes (Rodríguez-Iturbe and Valdés, 1979; Gupta et al., 1980; Rinaldo et al., 1991).
One of our goals is to enable users to assess the reliability of the attributes and to select catchments and interpret results accordingly, and more work is necessary for a complete

Gridded data from Daymet (Thornton et al., 2012), Maurer (Maurer et al., 2002), and NLDAS (Xia et al., 2012)
Streamflow data USGS streamflow measurements
Soil data STATSGO (Miller and White, 1998) STATSGO (Miller and White, 1998)

Atmospheric forcing
The N15 data set provides forcing from three data sets (Daymet, NLDAS, and Maurer) but in this study, we only use Daymet. Using the two other data sets might lead to some differences in climate indices, particularly when it comes to heavy precipitation events and/or to catchments with a sparse observation network. Another option to characterize the uncertainty in the forcing is to use the ensemble of gridded forcing produced by Newman et al. (2015a).

Discharge measurements
Some hydrological signatures are more sensitive than others to uncertainties in discharge measurements (Westerberg and McMillan, 2015). Methods to characterize those uncertainties and explore their propagation into hydrological signatures in a large sample of catchments exist (Coxon et al., 2015) but require detailed information on the rating curves used for discharge estimation, which were not readily available for this study.

Soils
The STATSGO data set is subject to several critical limitations, many of them being overcome by the recently released POLARIS data set (Chaney et al., 2016) and SoilGrids (Hengl et al., 2017). A key advantage of these two data sets is that they describe soil attributes at high resolution, using machine learning algorithms to estimate uncertainty.
We introduced a new set of attributes for 671 catchments in the contiguous USA. These attributes, together with the hydrometeorological time series provided by Newman et al. (2015b) for the same catchments, constitute the CAMELS data set. The wide range of geophysical characteristics covered by these basins opens new opportunities to quantitatively explore how the interplay between topography, climate, land cover, soil, and geology shapes hydrological behavior. This enables us to test hypotheses and formulate conclusions valid in diverse conditions and not limited to a few specific locations.
We produced a series of maps depicting catchment attributes over the contiguous USA. We used these maps to examine regional variations of a wide range of attributes and to illustrate the relationships between them. From a practical perspective, our synthesis of several data sources into a single data set at the catchment scale greatly simplifies the comparative study of catchment characteristics and the exploration of their influence on hydrological processes.
An essential feature of this work is that it involves a critical assessment of the limitations of the data and methods used to derive catchment attributes, and a discussion of their consequences for process understanding and hydrological modeling. We highlight, in particular, uncertainties in soil attributes. By reviewing the assumptions made during the production and processing of the STATSGO data set, we aim to provide the context necessary to adequately manipulate and interpret these attributes. Other data sets also provide characteristics for a large number of catchments but usually deliver them without explicitly acknowledging their uncertainties.
The version of CAMELS introduced in this paper is a starting point. We plan to expand this data set by adding new catchment attributes and refining our characterization of the uncertainties in catchment attributes, forcing, and streamflow measurements. Furthermore, we designed the tables of this paper so that they fully describe the methods and data used to compute each attribute, in an effort to make our work transparent and reproducible.
To conclude, we envision that the CAMELS data set will enable progress on a wide range of hydrological challenges related to catchment similarity, model parameter estimation based on geophysical characteristics, model benchmarking, regional variations of model performance, and to the information content of geophysical data sets.

Figure 1. (a-d) Maps of topographic characteristics over the CONUS. The histograms indicate the number of catchments (out of 671) in each bin. (e) Map of the regions referred to in this study (source: NOAA National Centers for Environmental Information; https://www.ncdc.noaa.gov/temp-and-precip/drought/nadm/geography).
p_seasonality: seasonality and timing of precipitation (estimated using sine curves to represent the annual temperature and precipitation cycles; positive (negative) values indicate that precipitation peaks in summer (winter); values close to 0 indicate uniform precipitation throughout the year); unit: -; data source: N15 - Daymet *; reference: Eq. (14) in Woods et al. (2009)
frac_snow: fraction of precipitation falling as snow (i.e., on days colder than 0 °C); unit: -; data source: N15 - Daymet *
high_prec_freq: frequency of high precipitation days (≥ 5 times mean daily precipitation); unit: days yr−1; data source: N15 - Daymet *
high_prec_dur: average duration of high precipitation events (number of consecutive days ≥ 5 times mean daily precipitation

Figure 2. Availability of streamflow measurements for periods of different lengths (colors) centered on different years (x axis). The symbols (crosses and circles) indicate the number of catchments with at most 1 or 5 % of daily streamflow measurements missing, respectively. The shape of the curves indicates that the proportion of missing data decreases from 1980 to 1990, stays low, and then increases after 2010. Note that years are hydrological years (starting on 1 October).

Figure 3. Maps of climatic indices over the CONUS. The histograms and bar plots indicate the number of catchments (out of 671) in each bin or category.

Figure 4. Maps of hydrological signatures over the CONUS. The histograms indicate the number of catchments (out of 671) in each bin.

Figure 5. (a-i) Maps of vegetation characteristics over the CONUS. The histograms indicate the number of catchments (out of 671) in each bin. (j) Comparison of the LAI and GVF (maximum and difference between maximum and minimum) based on the dominant land cover class. (k) Comparison of the mean aridity, mean total rooting depth, and mean depth to bedrock for each land cover class; the colored dots all have the same size.

Figure 6. Panels (a-h) and (j-l) show maps of soil characteristics over the CONUS. The histograms indicate the number of catchments (out of 671) in each bin. (i) Comparison of the estimates of the soil depth to bedrock from STATSGO and depth to bedrock from P16. The orange area includes all the catchments for which there can be an agreement between the two data sets (i.e., estimates from both data sets are smaller than or equal to 1.5 m). The orange curve is a 1 : 1 curve (note the logarithmic scale on the x axis).

Figure 7. Maps of geological characteristics over the CONUS. The histograms indicate the number of catchments (out of 671) in each bin.

Figure 8. A comparison of the spatial distribution of the catchments from (a) the MOPEX data set and (b) the CAMELS data set.
* Computed over the period 1 October 1989 to 30 September 2009.
* Only covers the top 1.5 m.

Table 7. Main differences and similarities between MOPEX and CAMELS.
Number of catchments and spatial distribution
MOPEX: 438 catchments, principally from the eastern half of the CONUS and with an underrepresentation of the Rocky Mountains
CAMELS: 671 catchments, with a relatively even distribution over the CONUS