Articles | Volume 26, issue 22
Hydrol. Earth Syst. Sci., 26, 5721–5735, 2022
Hydrol. Earth Syst. Sci., 26, 5721–5735, 2022
Research article
14 Nov 2022
Research article | 14 Nov 2022

Precipitation biases and snow physics limitations drive the uncertainties in macroscale modeled snow water equivalent

Precipitation biases and snow physics limitations drive the uncertainties in macroscale modeled snow water equivalent
Eunsang Cho1,2, Carrie M. Vuyovich1, Sujay V. Kumar1, Melissa L. Wrzesien1,2, Rhae Sung Kim1,3, and Jennifer M. Jacobs4,5 Eunsang Cho et al.
  • 1Hydrological Sciences Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD, USA
  • 2Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA
  • 3Goddard Earth Sciences Technology and Research II, University of Maryland Baltimore County, Baltimore, MD, USA
  • 4Department of Civil and Environmental Engineering, University of New Hampshire, Durham, NH, USA
  • 5Earth Systems Research Center, Institute for the Study of Earth, Oceans, and Space, University of New Hampshire, Durham, NH, USA

Correspondence: Eunsang Cho (


Seasonal snow is an essential component of regional and global water and energy cycles, particularly in snow-dominant regions that rely on snowmelt for water resources. Land surface models (LSMs) are a common approach for developing spatially and temporally complete estimates of snow water equivalent (SWE) and hydrologic variables at a large scale. However, the accuracy of the LSM-based SWE outputs is limited and unclear by mixed factors such as uncertainties in the meteorological boundary conditions and the model physics. In this study, we assess the SWE, snowfall, precipitation, and air temperature products from a 12-member ensemble – with four LSMs and three meteorological forcings – using automated SWE, precipitation, and temperature observations from 809 Snowpack Telemetry stations over the western US. Results show that the mean annual maximum LSM SWE is underestimated by 268 mm. The timing of peak SWE from the LSMs is on average 36 d earlier than that of the observations. By the date of peak SWE, winter accumulated precipitation is underestimated (forcings mean: 485 mm vs. stations: 690 mm). In addition, the precipitation partitioning physics generates different snowfall estimates by an average of 113 mm with the same forcing data. Even though there are widespread cold biases (up to 3 C) in the temperature forcings, larger ablations and lower ratios of SWE to total precipitation are found even in the accumulation period, indicating that melting physics in LSMs drives some SWE uncertainties. Based on the principal component analysis, we find that precipitation bias and partitioning methods have a large contribution to the first principal component, which accounts for about half of the total variance. The results provide insights into prioritizing strategies to improve SWE estimates from LSMs for hydrologic applications.

1 Introduction

Seasonal snow plays a critical role in hydrologic and climatologic processes globally (Sturm et al., 2017). It benefits 1.2 billion of the world's population via its seasonal cycle, which retains water for release during warm and dry periods (Barnett et al., 2005; Viviroli et al., 2007). Also, the impact of seasonal snowpack on extreme events such as snowmelt and rain-on-snow floods (Davenport et al., 2020; Li et al., 2019; Musselman et al., 2018), drought (Huning and AghaKouchak, 2020), and wildfires (Westerling et al., 2006) are of considerable interest to the water resources planners and decision-makers. Despite being a vital component of water balances and extreme events, estimating the spatiotemporal change in snow water storage referred to as snow water equivalent (SWE), remains a significant challenge for the snow hydrology community (Cho et al., 2021b; Dozier et al., 2016; Kim et al., 2021).

Land surface models (LSM) are commonly used to quantify distributed estimates of SWE and other hydrologic variables with high spatiotemporal resolution over large spatial extents. Because LSMs enable the simulation of the physical interactions between water, energy, and carbon cycle processes on the land surface, they are widely used for hydrologic and climate research, as well as operational applications related to river flow forecasting and weather prediction (Mitchell et al., 2004), among others. However, recent studies revealed that there are large uncertainties in SWE estimates from LSMs (Broxton et al., 2016; Kim et al., 2021; Pan et al., 2003), which lead to uncertainties in other relevant variables and processes such as snowmelt runoff, spring soil moisture, and groundwater recharge. Broxton et al. (2016) found that the LSM-based reanalysis SWE products were largely underestimated as compared to the high-resolution reference SWE data sets and suggested that more rapid snow ablation in LSMs especially at near-freezing temperatures is the primary source of the SWE underestimates. In contrast, some studies found that biases in atmospheric forcing data, particularly precipitation, were a major driver of SWE error (Pan et al., 2003; Raleigh et al., 2015), and reanalysis products, often used as meteorological forcing for LSMs, underestimated precipitation, particularly in mountainous areas (Enzminger et al., 2019; Wrzesien et al., 2017). Typically, reanalysis products are limited in capturing the orographic effects near mountainous areas, subsequently resulting in underestimates of precipitation (Wrzesien et al., 2017).

Several snow model intercomparison exercises provided a series of valuable findings according to their objectives. Phases 1 and 2 of the Snow Model Intercomparison Project (SnowMIP) focused on snow energy-budget simulations (Etchevers et al., 2004) and forest snow processes (Essery et al., 2009; Rutter et al., 2009), respectively, based on site-scale reference simulations. SnowMIP1 found that, while the complex snow models were better able to simulate net longwave radiation, the model complexity had relatively little impact on albedo simulation based on two mountainous alpine sites (Etchevers et al., 2004). In SnowMIP2, Essery et al. (2009) found that the snow models generally captured the large differences in albedo and surface temperature quite well between snow-covered and snow-free surfaces and between forested and open sites at three open and forested site pairs. They also concluded that there was little consistency in model performance among sites and years, so no “best” model was identified. Recently, the Land Surface, Snow and Soil moisture MIP (LS3MIP v1.0; van den Hurk et al., 2016) and Earth system models-SnowMIP (ESM-SnowMIP; Krinner et al., 2018 and Menard et al., 2021), an extension of LS3MIP, focused on assessing snow models and snow–climate interactions at local and global scales. Krinner et al. (2018) found that albedo simulation was a major source of uncertainty in the context of snow-related climate feedback based on ten site experiments for a local-scale assessment. However, these model intercomparison projects did not evaluate SWE at the macro-scales needed for operational water resources management. Krinner et al. (2018) and these project exercises suggest that longer periods and larger study domains (and more sites over various snow environments) would improve our understanding of snow models and land surface schemes in Earth system models and reduce the uncertainties associated with snow processes and feedback on the climate.

While the previous studies of continental (or global) scale SWE evaluations outline sources of the SWE errors (Broxton et al., 2016; Cho et al., 2020; Kim et al., 2021; Pan et al., 2003; Mortimer et al., 2020), comprehensive quantification of the relative contribution of these error sources is still required. Furthermore, most of the prior studies used a single or multiple LSMs with one meteorological forcing and either simulated or reanalysis SWE with relatively coarse spatial resolutions (e.g., 12.5 to 50 km), which impedes the quantification of the contributions by producing additional uncertainties. To overcome this, we examine data from a multi-LSM, multi-forcing ensemble at 5 km spatial resolution (called the Snow Ensemble Uncertainty Project; SEUP) developed by Kim et al. (2021), which includes 12 combinations of 5 km resolution gridded SWE products from four LSMs and three meteorological forcings. Kim et al. (2021) quantified the spatial and temporal uncertainties in SWE and total snow storage uncertainty across North America (they referred to “uncertainty” as the range of SWE estimates across the 12 ensemble members). The largest uncertainty in the modeled SWE was found in mountainous regions. They speculated that this was due to the relative deep snow, meteorological forcing uncertainties, and variability among the different snow physics in the LSMs over complex terrain. Also, they demonstrated that SWE uncertainty drives runoff uncertainty, suggesting that improved SWE observations are required to reduce the SWE and runoff uncertainty, particularly during the melt season in high-latitude regions (e.g., northern Canada) and the western mountain regions.

This study seeks to identify the primary sources of the errors and to quantify their contributions to the modeled SWE uncertainty (forcing errors vs. snow-related physics) during the accumulation periods for 8 years (2010 to 2017) against the 809 Snowpack Telemetry (SNOTEL) stations. We acknowledge that some SNOTEL stations are not spatially representative of the surrounding areas (Meromy et al., 2013; Molotch and Bales, 2006), potentially leading to uncertainties in the comparison. Nevertheless, we assume that more than 800 SNOTEL sites across the western US would be sufficient to quantify macroscale LSM SWE uncertainties. This study focuses on the snow accumulation period, which is defined as 1 October to the date of the annual maximum SWE of each station. For the winter temperature comparison only, a period from 1 October to 31 May is used by assuming a typical winter period (Trujillo and Molotch, 2014). Previous studies have investigated LSM uncertainty during the melting season, such as too rapid snowmelt (e.g., Broxton et al., 2016), although less work has considered the accumulation season. We aim to answer the following questions. (1) How large is the LSM SWE uncertainty as compared to SNOTEL measurements? (2) How much does underestimation in precipitation forcing contribute to snow uncertainty? (3) Do precipitation partitioning methods contribute to the SWE underestimation? (4) How much does LSM melt physics contribute to the SWE underestimation? (5) What are the relative contributions among the sources of the error?

2 Data and methods

2.1 Snow Ensemble Uncertainty Project (SEUP)

The SEUP ensemble developed by Kim et al. (2021) is comprised of 12 ensemble members, created by the combination of four different LSMs – Noah with Multi-Parameterization version 3.6 (hereafter Noah-MP; Niu et al., 2011), Catchment version 2.5 (hereafter Catchment; Koster et al., 2000), Joint UK Land Environment Simulator (JULES; Best et al., 2011), and Noah version 2.7.1 (hereafter Noah; Ek et al., 2003) – and three different forcing data sets – European Centre for Medium-Range Weather Forecasts (ECMWF; Molteni et al., 1996), Global Data Assimilation System (GDAS; Derber et al., 1991), and Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA2; Gelaro et al., 2017). These models are selected to provide a baseline of operational LSM capabilities for SWE estimation because they are used for operational purposes at major modeling centers, such as the US National Centers for Environmental Prediction, NASA Global Modeling and Assimilation Office, and the United Kingdom Met Office (note that the model versions and detailed configurations used in this study could differ from what the centers are currently using). In the SEUP analysis, 3-hourly SWE estimates were generated for the 12 combinations using the NASA Land Information System (LIS; Kumar et al., 2006). All models were run at a 5 km resolution from 2000 to 2017 at 15 min time steps. To achieve initial hydraulic and thermal equilibrium states for each run, the first 10 water years from 2000 to 2009 were used as a model spin-up period and the remaining 8 water years from 2010 to 2017 were used for the evaluation in this study. Noah LSM uses a one-layer snow model to simulate SWE by calculating snowfall minus the sum of sublimation and snowmelt. More detailed descriptions of Noah's physics and development are presented by Ek et al. (2003) and Koren et al. (1999). Noah-MP LSM is the advanced Noah model with new multiple options for selected processes (Niu et al., 2011). Noah-MP includes up to three layers of snowpack, depending on snow depth. The snow scheme calculates snow compaction from the weight of overlying snow layers and melting metamorphism. The Catchment LSM includes a three-layer snowpack model that incorporates snow physics, including densification, snowmelt, refreeze, and snow insulating properties (Lynch-Stieglitz, 1994). The prognostic state variables for each layer include snow depth, snow cold content, and SWE. JULES LSM is run in stand-alone mode with a multi-layer snow scheme driven by forcing data (Best et al., 2011). In the multi-layer snow scheme, three layers of snowpack are set. For each layer, snow density is dynamically calculated by snow temperature over time, and thermal conductivity is calculated by using snow density. For further details, refer to Sect. 3 in the JULES model description paper (Best et al., 2011). To improve the spatial representativeness of the coarse resolution forcing inputs, the precipitation forcings were downscaled with the WorldClim monthly precipitation climatology (Fick and Hijmans, 2017). Other forcing variables including near-surface temperature were downscaled to 5 km by applying a constant lapse rate of 6.5 K km−1, hypsometric adjustments using the Shuttle Radar Topography Mission (5 km; SRTM) and the USGS Global 30 arcse Elevation (GTOPO30) data sets (Kim et al., 2021).

In the SEUP ensemble, two different methods are used for partitioning precipitation into rain or snow: (1) a single threshold method is used in the Catchment, JULES, and Noah, and (2) Jordan's fractioning method, which is used in Noah-MP (Jordan, 1991). A single threshold approach simply uses Tair to determine the precipitation phase (Motoyama, 1990). Snowfall occurs whenever there is nonzero precipitation, and the near-surface air temperature is less than 0 C. Jordan's method assumes that any precipitation is rainfall when Tair> 2.5 C, snowfall when Tair< 0.5 C, a snowfall fraction (Fsnow) of 0.6 when 2.0 C <Tair≤2.5C, and a linear equation (i.e., Fsnow= 1–0.2 Tair) from rainfall to snowfall between 0.5 C <Tair< 2.0 C. Schematic diagrams of each method are provided in the Supplement (Fig. S1). A detailed explanation of the SEUP framework can be found in Kim et al. (2021). For the comparison with daily SNOTEL observations in this study, we used the 3-hourly averaged SWE output at 00:00 Coordinated Universal Time (UTC). The air temperature (Tair) outputs from 00:00 and 12:00 UTC are averaged into a single daily average.

2.2 Snow telemetry (SNOTEL)

For reference data sets, we use daily SWE observations measured at the 809 SNOTEL stations across the western US operated by the Natural Resources Conservation Service (NRCS). These stations include accumulated precipitation and daily mean temperature observations co-located with the snow pillows and were used to evaluate the forcing precipitation and temperature data sets. Previous studies have identified a warm bias ranging from +0.5 to 2.0 C at cold temperatures (less than 10 C) in the SNOTEL temperature data (Fig. S4 in Oyler et al., 2015). This bias is found to be temperature-dependent, such that positive biases occur at colder temperatures (less than 12 C), and negative biases occur at warmer temperatures (above 12 C). In winter, the median biases for daily maximum and minimum observations were +1.25 and +1.75 C. For this work, we used a bias-corrected temperature applied by a linear equation used in previous studies (Currier et al., 2017; Sun et al., 2019).

(1) T corr = 1.03 T ori - 0.9 ,

where Tcorr is the bias-corrected temperature (C) and Tori the originally observed temperature (C). While we acknowledge that the difference in spatial representativeness (point vs. grid) may lead to some uncertainties in the comparison, this topic is out of the scope of this study. We assume that more than 800 SNOTEL sites across the western US would be sufficient to quantify macroscale LSM SWE uncertainties.

2.3 Principal component analysis

Principal component analysis (PCA) is a multi-variate technique that is widely used to analyze a data table containing several variables that are generally dependent and inter-correlated (Abdi and Williams, 2010). It extracts important information about relationships among the variables and expresses them as a set of new orthogonal variables, called principal components (PCs). The PC loading measures the correlation between the PC and variables. Therefore, variables with similar loadings for a given PC can be dependent on the same common factor and are positively correlated. The goal of using PCA in this study is to identify the spatial similarity of the SWE difference between LSMs and observations to that of the potential error sources. In our case, the data matrix arranges the SWE errors, SWE_bias, and four potential error sources, including winter total precipitation bias, P_bias, precipitation difference between two precipitation partitioning methods, P_phase, snow ablation difference for the accumulation period as a proxy of melting physics, Ablation, and mean bias in winter air temperature, T_bias, in columns and the 809 SNOTEL sites in rows. The “SWE_bias” is calculated by the ensemble mean SWE subtracted from the corresponding SNOTEL SWE for each station. The potential sources of the error are obtained from the comparison between SEUP and SNOTEL observations. Data in the matrix was standardized (so-called “scaling”) such that the mean and standard deviation of each variable is 0 and 1, respectively.

3 Results

3.1 How large is the LSM's SWE uncertainty?

To quantify differences between LSM SWE estimates and ground observations, the mean magnitude and timing of annual maximum SWE estimates for the four LSMs are compared to the corresponding SNOTEL SWE observations across the western US from October 2009 to May 2017 (Fig. 1). The actual magnitude and dates of the four LSMs are provided in Fig. S2 (the same maps but for annual mean, instead of annual maximum, and 1 April SWE difference are also provided in Fig. S3). There is a widespread underestimation of the annual maximum SWE for all LSMs with averages of 198, 311, 273, and 288 mm for Noah-MP, Catchment, JULES, and Noah, respectively. The amount of the underestimation is regionally dependent on elevation ranges. Larger underestimates of the maximum SWE occur in mountainous areas with higher elevation ranges, such as the Pacific Northwest, the Sierra Nevada, and the Rockies, where larger SWE generally occurs (see the elevation map of Fig. S4). The mean timings of the annual maximum LSM's SWE are on average 36 d earlier than the observation (21, 45, 38, and 42 d for Noah-MP, Catchment, JULES, and Noah, respectively). The patterns of the earlier timings of the maximum SWE are more apparent in areas with higher elevation ranges than with lower elevation. Generally, Noah-MP provides relatively smaller differences as compared to the other three LSMs, although in some Pacific Northwest stations where there is more winter precipitation, Noah-MP SWE exceeds the SNOTEL SWE. Of the four LSMs, the snow-related physics in Noah-MP generates SWE values the most similar to observations.

Figure 1(a) Mean annual maximum SWE map of SNOTEL observations and SWE difference maps of the four land surface model (LSM)'s annual maximum values from the SNOTEL observations with density plots of the SWE difference for each LSM by four elevation ranges of SNOTEL sites (a solid, vertical, black line in the density plots represents median value) and (b) mean annual maximum SWE date map of SNOTEL and date difference maps along density plots when the annual maximum SWE for each LSM occurs from 2010 to 2017 across the western US. Each LSM result is averaged over the three different forcings. Elevation map with the four ranges is provided in Fig. S4.

3.2 Is precipitation forcing data underestimated?

As compared to the SNOTEL observations, ECMWF, GDAS, and MERRA2 forcings have widespread underestimates of winter accumulated precipitation (which is from 1 October to the date of the maximum SWE of each ensemble member for each station) in 98 %, 77 %, and 85 % of the total stations, respectively. Of the stations, average biases (forcing minus SNOTEL) are 356, 262, and 292 mm (maximum biases: 1432, 1523, and 1466 mm) for ECMWF, GDAS, and MERRA2, respectively (Fig. 2). The largest underestimates were found in the Pacific Northwest, the Sierra Nevada, and the Southern Rockies, particularly areas with higher elevation ranges (> 3000 m a.s.l.; Fig. 2b). Among the three forcing data sets, GDAS provides a relatively lower overall difference as compared to ECMWF and MERRA2. In some stations in Washington and Wyoming, GDAS precipitation exceeds the SNOTEL observations. In stations with lower elevation ranges (0–1000 m a.s.l.), dominantly located in the Pacific Northwest regions, there is a wider range of the precipitation difference between the forcings and SNOTEL as compared to higher elevation ranges (Fig. 2b).

Figure 2(a) Mean difference maps (three forcings ensembled by four LSMs minus SNOTEL) in winter accumulated precipitation by the date of the maximum SWE of each ensemble member from 2010 to 2017 across the western US and (b) density plots of the precipitation difference by four elevation ranges of SNOTEL sites.

3.3 Does precipitation partitioning (rain vs. snowfall) contribute to the SWE underestimation?

The two different precipitation partitioning methods used in the LSMs generate different amounts of snowfall by region. Figure 3 provides mean annual total snowfall maps of Jordan's fractioning method (Jordan, 1991; used in Noah-MP) and a single threshold method (other three LSMs) and the difference map between the two. The annual mean difference between the two partitioning methods from 2010 to 2017 is about 113 mm for the 809 stations (24 % of the winter accumulated precipitation from 1 October to the max SWE dates [472 mm]). The differences are regionally dependent on elevation. The spatial patterns of the three difference maps are similar regardless of meteorological forcing sources (Fig. S5). This is not surprising because the fractioning method partitions precipitation amounts with air temperatures ranging from 0 to 2.5 C as snowfall, which would be classified as liquid rainfall with a single threshold method that uses 0 C as the rain-snow threshold. The patterns of the larger snowfall are apparent in the Pacific Northwest, where more precipitation occurs, and temperatures are frequently close to 0 C in winter, as compared to other regions.

Figure 3(a) Mean annual total snowfall maps during the accumulation period (1 October to the date of the maximum SWE of each ensemble member) from 2010 to 2017 using Jordan's (1991) fractioning method in Noah-MP and a single threshold method (0 C) in the other three LSMs, as well as the difference map across the western US and (b) density plots of the precipitation difference by four elevation ranges of SNOTEL sites. The snowfall maps are ensembled by three forcings

The sensitivity of snowfall amounts due to the partitioning methods draws our attention to the role of air temperature differences among the three meteorological forcing data sets and the SNOTEL observations. In Fig. 4, the mean annual temperatures from three forcing data sets are compared to the bias-corrected SNOTEL air temperature for the study period. In 68 %, 86 %, and 53 % of stations, air temperatures from ECMWF, GDAS, and MERRA2 forcings exhibit negative (cold) biases as compared to the bias-corrected SNOTEL temperatures, respectively. Larger cold biases of ECMWF and GDAS are found in the continental regions with higher elevation ranges (e.g., 1.3 and 2.8 C of the median biases for stations with > 3000 m a.s.l., respectively). MERRA2 has small biases for all elevation ranges even though there exist contrasting biases between continental (cold) versus maritime (warm) regions.

Figure 4Mean differences in daily mean air temperature for the winter period (1 October to 31 May) between three forcings (ECMWF, GDAS, and MERRA2) and bias-corrected SNOTEL data sets for 8 water years from 2010 to 2017.

3.4 Does melting physics in LSMs contribute to the SWE underestimation?

To examine the contribution of melting physics in LSMs to the SWE underestimation, the mean annual snow ablations during the accumulation period, i.e., from 1 October to the date of the maximum SWE for each year for each SEUP member, are compared to the corresponding amount of snow ablation from SNOTEL (Fig. 5). The reason why the spring melt ablation period was excluded in the analysis is that it would include the impact of precipitation underestimation (i.e., the LSM ablation would always be lower than the SNOTEL ablation simply because there was less total snowfall to begin with). The same maps but for the accumulated snow ablation from 1 October to 1 April are provided in Fig. S6. Even though the LSMs underestimate the SWE, and the peak SWE dates are generally earlier than the observations, there are larger ablations from all LSMs as compared to the observations at 68 % to 87 % of the total stations by up to 188 mm (Noah-MP; mean difference: 6.5 mm) to 314 mm (Catchment; mean difference: 52 mm), except for regions with lower elevation ranges (< 1000 m a.s.l.). The magnitude of the ablations is highly region-specific. While larger ablations are found in mountainous regions with higher elevation ranges (> 2000 m a.s.l.)m such as the Middle and Southern Rockies and the Sierra Nevada, there is a tendency toward smaller ablations than the observations in the Pacific Northwest. In fact, this is because there was lower SWE accumulation from LSMs than observed, resulting in less snow ablation.

Figure 5(a) Mean difference maps (four LSMs ensembled by three forcings minus SNOTEL) in the accumulated snow ablation during the snow accumulation periods from 1 October to the date of the maximum SWE of each ensemble member for each year from 2010 to 2017 across the western US and (b) density plots of the snow ablation difference by four elevation ranges of SNOTEL sites.

The ratios of SWE to total precipitation (fSWE,precip) for each LSM and SNOTEL are also compared to examine the proportion of early snow ablation to precipitation (Fig. 6). For the ratio map from SNOTEL, about 80 % to 90 % of the precipitation is accumulated as snowpack in the continental regions including the Sierra Nevada. Whereas SNOTEL has about 20 % to 40 % precipitation accumulated as SWE during the accumulation period in the Pacific Northwest and southern regions, including Arizona and New Mexico. Results show that most LSMs underestimate the fSWE,precip, and melting losses occur too frequently in LSMs during the accumulation period. Among the LSMs, the spatial distribution of the Noah-MP's ratio closely follows the SNOTEL observations, followed by Catchment, JULES, and Noah. However, Noah-MP still has lower ratios, particularly in the Middle and Northern Rockies, as well as Arizona and New Mexico Mountains.

Figure 6Maps of the mean ratio of the annual maximum SWE to total accumulated precipitation for the SNOTEL and four LSMs by the date of the maximum SWE of each ensemble member from 2010 to 2017. Each LSM result was ensembled by three forcings.

3.5 What is the relative contribution of potential causes on SWE uncertainties?

To quantify the relative contributions of the error sources on the SWE uncertainties, a principal component analysis (PCA) is conducted with the SWE biases (SWE_bias) and four identified error sources (P_ bias, P_phase, T_bias, and Ablation) for the 809 stations (Fig. 7). The standardized maps for the variables are provided in Fig. S7. The PC loadings for the four prominent PCs jointly account for about 95 % of the total spatial variance in the PCA data. With respect to the PCA with SWE_bias in Fig. 1a, the spatial variability in the SWE errors is prominently featured in the first three PCs that account for 86.8 % of the total variance. In the first PC (PC1; 46.1 % explained), SWE_bias has positive correlations with P_ bias but negative correlation with P_phase. That is, areas of larger P_bias and smaller P_phase are associated with larger SWE_bias, and these areas correspond to the continental regions (e.g., Middle and Southern Rocky Mountains). P_bias (0.53) has a slightly larger contribution to PC1 than P_phase (0.52). The second PC (PC2; 24.3 % explained) describes the negative correlations between SWE_bias (0.61) and Ablation (0.59). PC2 indicates that there are areas where the SWE errors and ablation are negatively correlated, and these areas correspond to the Pacific Northwest and some Northern Rocky Mountains in Fig. 5. Next, the third PC (PC3; 16.4 % explained) shows negative correlations between SWE_bias and T_bias. That is, areas of larger T_bias are associated with the SWE underestimation (negative), and the areas correspond to Arizona and New Mexico Mountains.

Figure 7Principal component analysis (PCA) on the relations between the SWE uncertainty (SWE_bias; values of LSM SWE differences from the observations from Fig. 1b) and the four potential causes (P_bias: precipitation bias in Fig. 2; P_phase: precipitation difference between two partitioning methods in Fig. 3; T_bias: mean bias in winter air temperature in Fig. 4; and Ablation: snow ablation difference during the accumulation period as a proxy of melting physics in Fig. 5).


4 Discussion and future perspectives

The results that the LSMs driven by meteorological forcing data from global models and observations systematically underestimate SWE are consistent with and complement the findings of related studies (Pan et al., 2003; Broxton et al., 2016). Pan et al. (2003) evaluated SWE products from four LSMs with a single forcing data from the North American Land Data Assimilation System (NLDAS) at approximately 12.5 km spatial resolution. They found that LSMs show systematic bias in the annual maximum SWE when compared to 110 SNOTEL stations with larger differences in the Pacific Northwest and the Sierra Nevada regions. The NLDAS forcing precipitation was consistently lower than the observations by up to 2000 mm annually at certain stations. Brown et al. (2018) also demonstrated insufficient solid precipitation from gridded reanalysis, and model products led to systematic errors in SWE estimations in southern Québec, indicating the need for improving estimates of snowfall. These results correspond to the widespread underestimates of precipitation up to more than 1500 mm for all meteorological forcings from this study, particularly for the mountainous areas with complex terrains. Previous studies discussed the uncertainty and challenges in estimating precipitation in gridded data sets over complex terrain (Gutmann et al., 2012; Livneh et al., 2014; Lundquist et al., 2015). Because topography information (e.g., elevation) is used to interpolate gauge-based precipitation and/or disaggregate course resolution precipitation data sets to generate higher resolution gridded products, precipitation uncertainties are larger over complex terrain at higher elevations (Henn et al., 2018). Our results provide similar findings that ECMWF and GDAS precipitation biases in Fig. 2 are dependent on elevations (larger differences at higher elevations), except for MERRA2. The PCA results reveal that the larger precipitation bias is a major contributor in the first PC (>46 % of the total explained variance) to the SWE uncertainties, which is consistent with previous studies (e.g., Raleigh et al., 2015).

In this study, the temperature biases depend on elevation, and the level of the dependence differs by forcing sources (see Fig. 3). While MERRA2 biases are relatively constant with elevation, GDAS has colder biases with increasing elevation. The results differ from previous findings of Pan et al. (2003), who concluded that the winter NLDAS temperature bias was generally constant by elevation. They stated that the constant lapse rate (6.5 C km−1) used to downscale the coarse meteorological fields to a 12.5 km grid is suitable (Cosgrove et al., 2003). However, applying the constant lapse rate (6.5 C km−1) for temperature downscaling may introduce temperature uncertainty. The surface temperature lapse rate varies differently in time and space by region (Blandford et al., 2008). For example, Minder et al. (2010) revealed that in the Cascade Mountains annual mean lapse rates were 3.9–5.2 C km−1, which is substantially smaller than the 6.5 C km−1 used in this study with substantial geographic differences in lapse rates. Although sensitivity studies with dynamic or static lapse rates were not conducted in this study, it is reasonable to assume that the cold biases in higher elevations found in this study could be partially corrected by applying more appropriate lapse rates regionally. This can result in corrected proportions of precipitation phases as well as the magnitude and timing of SWE and snowmelt (Lute and Abatzoglou, 2021).

As shown earlier, the underestimation of SWE is smaller in Noah-MP compared to the other LSMs despite the underestimated forcing precipitation (e.g., ECMWF) in the regions. The smaller underestimates may be partially attributed to the Jordan fractioning method compensating for precipitation error by allowing for snowfall when the air temperature is above 0 C. The result is that while Noah-MP still underestimates SWE as compared to the SNOTEL observations, it performs better in this region than the other three LSMs. For other LSMs, a single threshold method for precipitation partitioning largely contributes to the SWE underestimation. This is also supported by examples of time series of accumulated snowfall from LSMs (Fig. S8). Snowfall estimates of Catchment, JULES, and Noah LSMs with the GDAS forcing are much less than that of Noah-MP with the same forcing (see the snowfall time series in Fig. S8). In this study, the two precipitation partitioning methods generated differences in annual snowfall by up to 847 mm with an average of 117 mm across the SNOTEL sites in the western US, particularly larger in the Pacific Northwest. However, in the eastern US, Jordan's method generated too much snowfall and subsequently overestimated SWE as compared to observations from the New York State Mesonet (Letcher et al., 2022). This suggests that applying a certain partitioning method across larger areas (at least the continental US) can generate spatially different errors. Jennings et al. (2018) found that air temperature partitioning rain and snowfall varies across the Northern Hemisphere, ranging from 0.4 to 2.4 C (average: 1.0 C) for 95 % of the study's observations. This implies that the higher air temperature threshold ( 2.5 C) of Jordan's fractional method may overestimate snowfall.

The two precipitation partitioning approaches used in this study may have limitations. A new precipitation partitioning method incorporating humidity performed better than air temperature-only methods (Jennings et al., 2018). Also, solid precipitation simulations were improved when the wet-bulb temperature, defined as the temperature to which air can be cooled to saturation by the evaporation of water into the air, was used, particularly in the drier, higher elevation continental regions in the western US. This was because, as compared to air temperature, the wet-bulb temperature was closer to the actual temperature of a falling hydrometeor (Sims and Liu, 2015; Wang et al., 2019). Considering that the wet-bulb temperature is affected by surface skin temperature and vertical lapse rate (Sims and Liu, 2015), future comparison studies with multiple precipitation partitioning methods should consider humidity, wet-bulb temperature, and/or other meteorological variables in various environments in developing the best partitioning approach for the land surface and hydrological modeling communities. Since there are many differences among the LSMs beyond the partitioning scheme, further studies are also required to consider the single and combined impacts of other model physics on the SWE estimates (i.e., albedo and snow–soil interactions).

The SWE underestimates will likely increase if the cold biases in temperature forcings are corrected. This indicates that even though the cold biases may be fortuitously helping to generate larger snowfall instead of rainfall, contributions of the precipitation underestimation as well as melting physics to the SWE underestimates are still too large to simulate enough SWE. Note that the Colorado Snow Survey Program with the USDA Natural Resources Conservation Service has ongoing work to correct the air temperature for the entire SNOTEL network, beyond the linear equation correction used here and in other studies (Currier et al., 2017; Sun et al., 2019).

It is likely that early-season melting losses are mainly due to melting physics in LSMs (Broxton et al., 2016). In general, simplified snow layering schemes with a single snow density (e.g., a single snow layer) may cause rapid snowmelt (Suzuki and Zupanski, 2018) because the snow layering schemes influence thermodynamics in the snowpack and the subsequent timing and presence of melt (Dutra et al., 2011). Considering the layering schemes (a single-layer scheme in Noah vs. three-layer schemes in both Noah-MP, Catchment, and JULES), it is perhaps expected that Noah showed relatively poor spatial agreements with the SNOTEL SWE/total precipitation ratio (Fig. 6).

A future study investigating other melting physics in LSMs by comparing energy balance components (e.g., net radiation, snow albedo changes, and heat transfer with the ground) might provide a better understanding of the SWE uncertainty issues. Typically, errors in modeled winter albedo were linked to errors in snow cover fraction (SCF; Roesch, 2006) and tree cover fraction (TCF; Wang et al., 2016). Better quantification of the SCF and/or TCF in LSMs could improve the albedo and, consequently, snowmelt and SWE simulations. Recent snow model intercomparison exercise results (e.g., ESM-SnowMIP) provide insights into relationships between surface temperature and SWE estimations (Krinner e al., 2018; Menard et al., 2021). Menard et al. (2021) found clear differences in model ranking between SWE and snow surface temperature (e.g., four of the best models for SWE estimation were among the worst for snow surface temperature). They stated that underestimating surface temperature leads to a colder snowpack that remains on the ground for longer and results in less snowmelt (Conway et al., 2018). While the current results may not be addressed by the surface temperature issue, because our modeled SWE outputs were generally underestimated despite cold biases in air temperature (Fig. 5), it is worth exploring snow and soil temperature variations along with the SWE changes in models with various environments to explicitly understand internal snow processes (Gouttevin et al., 2012).

While the wind-driven processes are typically considered as smaller-scale issues, we acknowledge that this may partially influence the current results because the station observations may be affected by wind redistributions and their impact on the sublimation of snow (Groot Zwaaftink et al., 2013; Hood et al., 1999). This may cause increased biases, particularly at higher elevations because regions with higher elevations generally have more complex terrains where blowing/drifting snow frequently occurs due to wind and avalanches (Mott et al., 2018). Most LSMs, including those used in this study, do not have physical processes to simulate preferential deposition and snow redistribution that are crucial in mountain environments (Dadic et al., 2010; Mott et al., 2018; Lehning et al., 2008); further investigations are required to quantify uncertainties driven by these “missing” physics, along with the continuous efforts to include relevant model physics into LSMs for better SWE simulations.

Because this study relies on the use of point-based SNOTEL measurements, different spatial representativeness between the 5 km gridded outputs and in situ measurements is an inevitable issue preventing a definitive evaluation of the SWE, precipitation, and temperature simulations. In terms of the physiographic characteristics of SNOTEL sites, the stations tend to be located on flat and sheltered terrain in small forest gaps due to logistical difficulties in steep and densely forested areas (Meromy et al., 2013; Molotch and Bales, 2006), potentially leading to relatively large snow accumulation as compared to surrounding areas. However, Meromy et al. (2013), using more than 30 000 field observations, found that there were no consistent biases in SNOTEL snow depth values relative to the surrounding mean observed values in California, Colorado, Wyoming, Idaho, and Oregon. Considering that SNOTEL observations have been widely used for snow research, further investigations for quantifying the spatial representativeness of each station with surrounding areas will be helpful.

Regarding the uncertainty in complex terrain, the spatial resolution of the model simulation can add uncertainty to SWE outputs. Despite the relatively high resolution, the 5 km grid of the SEUP ensemble is still too coarse to fully represent local heterogeneity for microphysical features and snow processes in mountainous regions. Within a few square kilometers, many physical controls, including terrain, vegetation, and soils, play a role in forming spatial heterogeneity in the snowpack (Currier and Lundquist, 2018; Cho et al., 2021a; Meromy et al., 2013; Neumann et al., 2006). Lettenmaier et al. (2015) stated that spatial resolution of snowpack is ideally required to be no coarser than 100 m to characterize mountain snowpack. Also, in regional climate model experiments, high-resolution simulations (4 km or lower) enable microphysical features such as orographic updrafts generating clouds and precipitation to be captured (e.g., Ikeda et al., 2021). This has relevance to multi-resolution snow modeling studies (e.g., Pavelsky et al., 2011; Wrzesien et al., 2017), which showed better agreement between model and point (or reference) SWE when the model was run at a finer resolution (9 and 3 km). Although the relatively high-resolution simulations (5 km) in this study may partially alleviate the issue, we cannot fully address the issue. Future studies may focus on accounting for the impact of spatial representativeness on the errors identified in the gridded SWE.

Overall, the primary value of this study is to provide a big picture of LSM-based SWE and error sources of the uncertainty derived from meteorological forcings and limited physical processes of the LSMs using a large ensemble with four LSMs and three meteorological forcings. As compared to previous studies, this work not only identified primary sources of uncertainty but also quantified the relative contributions of the error sources on the uncertainty of macroscale modeled SWE. Thus, this study helps us to comprehensively understand current state of SWE performances from well-known LSMs and reanalysis products and provides insights into future improvement of snow modeling in LSMs.

5 Conclusions

This study identifies the dominant sources and relative contributions of SWE errors using the SEUP ensemble recently developed by Kim et al. (2021), including four LSMs, Noah-MP, Catchment, JULES, and Noah, with three different meteorological forcings, ECMWF, GDAS, and MERRA2, across western US. There is a widespread underestimation of SWE from all LSMs up to more than 1000 mm, although the uncertainty is regionally dependent. Substantial underestimations of precipitation for all meteorological forcings are found to be a primary source of the SWE underestimation, particularly in the Pacific Northwest and Southern Rockies with higher elevation ranges (> 3000 m). The precipitation partitioning approach generates different snowfall estimates by up to 800 mm with the same forcing data. In most LSMs, there are large melting losses during the accumulation period, contributing to the underestimation of SWE. Lastly, there are regionally different biases of air temperature from the forcings up to 3.5 C in areas with high elevation. Considering that the temperature biases contribute to determining the precipitation phase and subsequent uncertainty in snowfall and SWE, particularly in maritime regions, reducing the temperature bias from meteorological forcings and/or an improved temperature downscaling approach (e.g., a time or spatially varying lapse rate) is required to improve the SWE estimation. Among these LSMs, Noah-MP shows the best performance in simulating SWE (less underestimation), likely due to larger snowfall amounts from Jordan's method and better melting physics, even though there are several limitations in the current physics. The results from this study provide insights needed to guide the improvement of LSM's SWE for snow science and climate research. Further studies with sensitivity analysis for each process with relevant parameters in LSMs might be helpful to explicitly quantify their contributions and to ensure that improvements to LSMs in one region do not adversely impact another region.

Code and data availability

To reproduce results including figures, R codes and processed data are openly available at (Cho et al., 2022). The SNOTEL SWE and accumulated precipitation data are available from the U.S. Department of Agriculture Natural Resources Conservation Service National and Climate Center (, Serreze et al., 1999). The bias-corrected air temperature data for the SNOTEL sites are available from Pacific Northwest National Laboratory at (Sun et al., 2019). While the SEUP simulation outputs are too large to be publicly archived with available resources, data are available from the corresponding author upon request. To replicate the model simulation, users can freely access the NASA Land Information System at (Kumar et al., 2006). The MERRA2 forcing data set is distributed by the NASA Goddard Global Modeling and Assimilation Office (GMAO;, Gelaro et al., 2017). The GDAS forcing data are publicly available from the US National Centers for Environmental Prediction (NCEP;, Derber et al., 1991). The ECMWF forcing data are not publicly available but made available under license (, Molteni et al., 1996).


The supplement related to this article is available online at:

Author contributions

EC led the investigation, conceptualized the research, did the formal analysis, and wrote the initial draft. CMV and SVK conceptualized the research, took responsibility for the investigation, acquired the funding and the resources, supervised the project, and reviewed and edited the paper. RSK did the model simulations, helped with the investigation, and reviewed and edited the paper. MLW and JMJ helped with the investigation, provided technical and scientific inputs, and reviewed and edited the paper.

Competing interests

The contact author has declared that none of the authors has any competing interests.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


The authors are grateful to all colleagues who contributed to the SEUP project. Computing resources to run the NASA Land Information System (LIS) were supported by the NASA Center for Climate Simulation.

Financial support

This research has been supported by the National Aeronautics and Space Administration (grant no. NNH16ZDA001N).

Review statement

This paper was edited by Jan Seibert and reviewed by two anonymous referees.


Abdi, H. and Williams, L. J.: Principal component analysis, Wiley Interdiscip. Rev. Comput., 2, 433–459, 2010. 

Blandford, T. R., Humes, K. S., Harshburger, B. J., Moore, B. C., Walden, V. P., and Ye, H.: Seasonal and synoptic variations in near-surface air temperature lapse rates in a mountainous basin, J. Appl. Meteorol. Clim., 47, 249–261, 2008. 

Barnett, T. P., Adam, J. C., and Lettenmaier, D. P.: Potential impacts of a warming climate on water availability in snow-dominated regions, Nature, 438, 303–309, 2005. 

Best, M. J., Pryor, M., Clark, D. B., Rooney, G. G., Essery, R. L. H., Ménard, C. B., Edwards, J. M., Hendry, M. A., Porson, A., Gedney, N., Mercado, L. M., Sitch, S., Blyth, E., Boucher, O., Cox, P. M., Grimmond, C. S. B., and Harding, R. J.: The Joint UK Land Environment Simulator (JULES), model description – Part 1: Energy and water fluxes, Geosci. Model Dev., 4, 677–699,, 2011. 

Brown, R., Tapsoba, D., and Derksen, C.: Evaluation of snow water equivalent datasets over the Saint-Maurice river basin region of southern Québec, Hydrol. Process., 32, 2748–2764, 2018. 

Broxton, P., Zeng, X., and Dawson, N.: Why do global reanalyses and land data assimilation products underestimate snow water equivalent?, J. Hydrometeorol., 17, 2743–2761,, 2016. 

Cho, E., Jacobs, J. M., and Vuyovich, C.: The value of long-term (40 years) airborne gamma radiation SWE record for evaluating three observation-based gridded SWE datasets by seasonal snow and land cover classifications, Water Resour. Res., 56, e2019WR025813,, 2020. 

Cho, E., Hunsaker, A. G., Jacobs, J. M., Palace, M., Sullivan, F. B., and Burakowski, E. A.: Maximum entropy modeling to identify physical drivers of shallow snowpack heterogeneity using unpiloted aerial system (UAS) lidar, J. Hydrol., 602, 126722,, 2021a. 

Cho, E., McCrary, R. R., and Jacobs, J. M.: Future Changes in Snowpack, Snowmelt, and Runoff Potential Extremes Over North America, Geophys. Res. Lett., 48, e2021GL094985,, 2021b. 

Cho, E., Vuyovich, C. M., Kumar, S. V., Wrzesien, M. L., Kim, R. S., and Jacobs, J. M.: Data and code for: Precipitation Biases and Snow Physics Limitations Drive the Uncertainties in Macroscale Modeled Snow Water Equivalent, Zenodo [data set] and and [code],, 2022. 

Conway, J. P., Pomeroy, J. W., Helgason, W. D., and Kinar, N. J.: Challenges in modelling turbulent heat fluxes to snowpacks in forest clearings, J. Hydrometeorol., 19, 1599–1616,, 2018. 

Cosgrove, B. A., Lohmann, D., Mitchell, K. E., Houser, P. R., Wood, E. F., Schaake, J. C., Robock, A., Marshall, C., Sheffield, J., Duan, Q., Luo, L., Higgins, R. W., Pinker, R. T., Tarpley, J. D., and Meng, J.: Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project, J. Geophys. Res., 108, 2002JD003118,, 2003. 

Currier, W. R. and Lundquist, J. D.: Snow depth variability at the forest edge in multiple climates in the western United States, Water Resour. Res., 54, 8756–8773, 2018. 

Currier, W. R., Thorson, T., and Lundquist, J. D.: Independent evaluation of frozen precipitation from WRF and PRISM in the Olympic Mountains, J. Hydrometeorol., 18, 2681–2703, 2017. 

Dadic, R., Mott, R., Lehning, M., and Burlando, P.: Parameterization for wind–induced preferential deposition of snow, Hydrol. Process., 24, 1994–2006, 2010. 

Davenport, F. V., Herrera-Estrada, J. E., Burke, M., and Diffenbaugh, N. S.: Flood Size Increases Nonlinearly Across the Western United States in Response to Lower Snow-Precipitation Ratios, Water Resour. Res., 56, e2019WR025571,, 2020. 

Derber, J. C., Parrish, D. F., and Lord, S. J.: The new global operational analysis system at the National Meteorological Center, Weather Forecast., 6, 538–547, 1991 (data available at:, last access: 7 November 2022). 

Dozier, J., Bair, E. H., and Davis, R. E.: Estimating the spatial distribution of snow water equivalent in the world's mountains, Wiley Interdiscip. Rev. Water, 3, 461–474, 2016. 

Dutra, E., Kotlarski, S., Viterbo, P., Balsamo, G., Miranda, P. M. A., Schär, C., Bissolli, P., and Jonas, T.: Snow cover sensitivity to horizontal resolution, parameterizations, and atmospheric forcing in a land surface model, J. Geophys. Res.-Atmos., 116, D21109,, 2011. 

Ek, M. B., Mitchell, K. E., Lin, Y., Rogers, E., Grunmann, P., Koren, V., Gayno, G., and Tarpley, J. D.: Implementation of Noah land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model, J. Geophys. Res.-Atmos., 108, 8851,, 2003. 

Enzminger, T. L., Small, E. E., and Borsa, A. A.: Subsurface water dominates Sierra Nevada seasonal hydrologic storage, Geophys. Res. Lett., 46, 11993–12001, 2019. 

Essery, R., Rutter, N., Pomeroy, J., Baxter, R., Stähli, M., Gustafsson, D., Barr, A., Bartlett, P., and Elder, K.: SNOWMIP2: An Evaluation of Forest Snow Process Simulations, B. Am. Meteorol. Soc., 90, 11201136,, 2009. 

Etchevers, P., Martin, E., Brown, R., Fierz, C., Lejeune, Y., Bazile, E., Boone, A., Dai, Y.-J., Essery, R., Fernandez, A., Gusev, Y., Jordan, R., Koren, V., Kowalczyk, E., Nasonova, N. O., Pyles, R. D., Schlosseer, A., Shmakin, A. B., Smirnova, T. G., Strasser, U., Verseghy, D., Yamazaki, T., and Yang, Z.-L.: Validation of the surface energy budget simulated by several snow models (SnowMIP project), Ann. Glaciol., 38, 150158,, 2004. 

Fick, S. E. and Hijmans, R. J.: WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas, Int. J. Climatol., 37, 4302–4315, 2017. 

Gelaro, R., McCarty, W., Suárez, M. J., Todling, R., Molod, A., Takacs, L., Randles, C. A., Darmenov, A., Bosilovich, M. G., Reichle, R., Wargan, K., Coy, L., Cullather, R., Draper, C., Akella, S., Buchard, V., Conaty, A., da Silva, A. M., Gu, W., Kim, G.- K., Koster, R., Lucchesi, R., Merkova, D., Nielsen, J. E., Partyka, G., Pawson, S., Putman, W., Rienecker, M., Schubert, S. D., Sienkiewicz, M., and Zhao, B.: The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), J. Climate, 30, 54195454,, 2017 (data available at:, last access: 7 November 2022). 

Groot Zwaaftink, C. D., Mott, R., and Lehning, M.: Seasonal simulation of drifting snow sublimation in Alpine terrain, Water Resour. Res., 49, 1581–1590, 2013. 

Gouttevin, I., Menegoz, M., Dominé, F., Krinner, G., Koven, C., Ciais, P., Tarnocai, C., and Boike, J.: How the insulating properties of snow affect soil carbon distribution in the continental pan-Arctic area, J. Geophys. Res.-Biogeo., 117, 1–11,, 2012. 

Gutmann, E. D., Rasmussen, R. M., Liu, C., Ikeda, K., Gochis, D. J., Clark, M. P., Dudhia, J., and Thompson, G.: A comparison of statistical and dynamical downscaling of winter precipitation over complex terrain, J. Climate, 25, 262281,, 2012. 

Henn, B., Newman, A. J., Livneh, B., Daly, C., and Lundquist, J. D.: An assessment of differences in gridded precipitation datasets in complex terrain, J. Hydrol., 556, 1205–1219, 2018. 

Hood, E., Williams, M., and Cline, D.: Sublimation from a seasonal snowpack at a continental, mid-latitude alpine site, Hydrol. Process., 13, 1781–1797, 1999. 

Huning, L. S. and AghaKouchak, A.: Global snow drought hot spots and characteristics, P. Natl. Acad. Sci. USA, 117, 19753–19759, 2020. 

Ikeda, K., Rasmussen, R., Liu, C., Newman, A., Chen, F., Barlage, M., Gutmann, E., Dudhia, J., Dai, A., Luce, C., and Musselman, K.: Snowfall and snowpack in the Western U.S. as captured by convection permitting climate simulations: Current climate and pseudo global warming future climate, Clim. Dynam., 57, 2191–2215,, 2021. 

Jennings, K. S., Winchell, T. S., Livneh, B., and Molotch, N. P.: Spatial variation of the rain–snow temperature threshold across the Northern Hemisphere, Nat. Commun., 9, 1–9, 2018. 

Jordan, R. E.: A one-dimensional temperature model for a snow cover: Technical documentation for SNTHERM, 89, (last access: 30 September 2022), 1991. 

Lehning, M., Löwe, H., Ryser, M., and Raderschall, N.: Inhomogeneous precipitation distribution and snow transport in steep terrain, Water Resour. Res., 44, W07404,, 2008. 

Letcher, T. W., Minder, J. R., and Naple, P.: Understanding and improving snow processes in Noah-MP over the Northeast United States via the New York State Mesonet. Technical Report, Technical Report, U.S. Engineer Research and Development Center,, 2022. 

Li, D., Lettenmaier, D. P., Margulis, S. A., and Andreadis, K.: The role of rain-on-snow in flooding over the conterminous United States, Water Resour. Res., 55, 8492–8513, 2019. 

Livneh, B., Deems, J. S., Schneider, D., Barsugli, J. J., and Molotch, N. P., Filling in the gaps: Inferring spatially distributed precipitation from gauge observations over complex terrain, Water Resour. Res., 50, 8589–8610, 2014. 

Lundquist, J. D., Hughes, M., Henn, B., Gutmann, E. D., Livneh, B., Dozier, J., and Neiman, P.: High-elevation precipitation patterns: Using snow measurements to assess daily gridded datasets across the Sierra Nevada, California, J. Hydrometeorol., 16, 1773–1792, 2015. 

Lynch-Stieglitz, M.: The development and validation of a simple snow model for the GISS GCM, J. Climate, 7, 1842–1855, 1994. 

Kim, R. S., Kumar, S., Vuyovich, C., Houser, P., Lundquist, J., Mudryk, L., Durand, M., Barros, A., Kim, E. J., Forman, B. A., Gutmann, E. D., Wrzesien, M. L., Garnaud, C., Sandells, M., Marshall, H.-P., Cristea, N., Pflug, J. M., Johnston, J., Cao, Y., Mocko, D., and Wang, S.: Snow Ensemble Uncertainty Project (SEUP): quantification of snow water equivalent uncertainty across North America via ensemble land surface modeling, The Cryosphere, 15, 771–791,, 2021. 

Koren, V., Schaake, J., Mitchell, K., Duan, Q. Y., Chen, F., and Baker, J. M.: A parameterization of snowpack and frozen ground intended for NCEP weather and climate models, J. Geophys. Res.-Atmos., 104, 19569–19585, 1999. 

Koster, R. D., Suarez, M. J., Ducharne, A., Stieglitz, M., and Kumar, P.: A catchment-based approach to modeling land surface processes in a general circulation model: 1. Model structure, J. Geophys. Res.-Atmos., 105, 24809–24822,, 2000. 

Krinner, G., Derksen, C., Essery, R., Flanner, M., Hagemann, S., Clark, M., Hall, A., Rott, H., Brutel-Vuilmet, C., Kim, H., Ménard, C. B., Mudryk, L., Thackeray, C., Wang, L., Arduini, G., Balsamo, G., Bartlett, P., Boike, J., Boone, A., Chéruy, F., Colin, J., Cuntz, M., Dai, Y., Decharme, B., Derry, J., Ducharne, A., Dutra, E., Fang, X., Fierz, C., Ghattas, J., Gusev, Y., Haverd, V., Kontu, A., Lafaysse, M., Law, R., Lawrence, D., Li, W., Marke, T., Marks, D., Ménégoz, M., Nasonova, O., Nitta, T., Niwano, M., Pomeroy, J., Raleigh, M. S., Schaedler, G., Semenov, V., Smirnova, T. G., Stacke, T., Strasser, U., Svenson, S., Turkov, D., Wang, T., Wever, N., Yuan, H., Zhou, W., and Zhu, D.: ESM-SnowMIP: assessing snow models and quantifying snow-related climate feedbacks, Geosci. Model Dev., 11, 5027–5049,, 2018. 

Kumar, S. V., Peters-Lidard, C. D., Tian, Y., Houser, P. R., Geiger, J., Olden, S., Lighty, L., Eastman, J. L., Doty, B., Dirmeyer, P., Adams, J., Mitchell, K., Wood, E. F., and Sheffield, J.: Land information system: An interoperable framework for high resolution land surface modeling, Environ. Modell. Softw., 21, 1402–1415,, 2006 (code available at:, last access: 7 November 2022). 

Lettenmaier, D. P., Alsdorf, D., Dozier, J., Huffman, G. J., Pan, M., and Wood, E. F.: Inroads of remote sensing into hydrologic science during the WRR era, Water Resour. Res., 51, 7309–7342,, 2015. 

Lute, A. C. and Abatzoglou, J. T.: Best practices for estimating near-surface air temperature lapse rates, Int. J. Climatol., 41, E110–E125, 2021. 

Menard, C. B., Essery, R., Krinner, G., Arduini, G., Bartlett, P., Boone, A., Brutel-Vuilmet, C., Burke, E., Cuntz, M., Dai, Y., Decharme, B., Dutra, E., Fang, X., Fierz, C., Gusev, Y., Hagemann, S., Haverd, V., Kim, H., Lafaysse, M., Marke, T., Nasonova, O., Nitta, T., Niwano, M., Pomeroy, J., Schädler, G., Semenov, V., Smirnova, T., Strasser, U., Swenson, S., Turkov, D., Wever, N., and Yuan, H.: Scientific and human errors in a snow model intercomparison, B. Am. Meteorol. Soc., 102, 146,, 2021. 

Meromy, L., Molotch, N. P., Link, T. E., Fassnacht, S. R., and Rice, R.: Subgrid variability of snow water equivalent at operational snow stations in the western USA, Hydrol. Process., 27, 2383–2400, 2013. 

Minder, J. R., Mote, P. W., and Lundquist, J. D.: Surface temperature lapse rates over complex terrain: Lessons from the Cascade Mountains, J. Geophys. Res.-Atmos., 115, D14122,, 2010. 

Mitchell, K. E., Lohmann, D., Houser, P. R., Wood, E. F., Schaake, J. C., Robock, A., Cosgrove, B. A., Sheffield, J., Duan, Q., Luo, L., Higgins, W., Pinker, R. T., Tarpley, J. D., Lettenmaier, D. P., Marshall, C. H., Entin, J. K., Pan, M., Shi, W., Koren, V., Meng, J., Ramsay, B. H., and Bailey, A. A.: The multi-institution North American Land Data Assimilation System (NLDAS): Utilizing multiple GCIP products and partners in a continental distributed hydrological modeling system, J. Geophys. Res., 109, D07S90,, 2004. 

Molotch, N. P. and Bales, R. C.: SNOTEL representativeness in the Rio Grande headwaters on the basis of physiographics and remotely sensed snow cover persistence, Hydrol. Process., 20, 723–739, 2006. 

Molteni, F., Buizza, R., Palmer, T. N., and Petroliagis, T.: The ECMWF ensemble prediction system: Methodology and validation, Q. J. Roy. Meteorol. Soc., 122, 73–119, 1996 (data available at:, last access: 5 February 2021). 

Mortimer, C., Mudryk, L., Derksen, C., Luojus, K., Brown, R., Kelly, R., and Tedesco, M.: Evaluation of long-term Northern Hemisphere snow water equivalent products, The Cryosphere, 14, 1579–1594,, 2020. 

Motoyama, H., Simulation of seasonal snowcover based on air temperature and precipitation, J. Appl. Meteorol. Clim., 29, 1104–1110, 1990. 

Mott, R., Vionnet, V., and Grünewald, T.: The seasonal snow cover dynamics: review on wind-driven coupling processes, Front. Earth Sci., 6, 197,, 2018. 

Musselman, K. N., Lehner, F., Ikeda, K., Clark, M. P., Prein, A. F., Liu, C., Barlage, M., and Rasmussen, R.: Projected increases and shifts in rain-on-snow flood risk over western North America, Nat. Clim. Change, 8, 808–812, 2018. 

Neumann, N. N., Derksen, C., Smith, C., and Goodison, B.: Characterizing local scale snow cover using point measurements during the winter season, Atmos.-Ocean, 44, 257–269, 2006. 

Niu, G. Y., Yang, Z. L., Mitchell, K. E., Chen, F., Ek, M. B., Barlage, M., Kumar, A., Manning, K., Niyogi, D., Rosero, E., Tewari, M., and Xia, Y: The community Noah land surface model with multiparameterization options (Noah-MP): 1. Model description and evaluation with local-scale measurements, J. Geophys. Res.-Atmos., 116, D12109,, 2011. 

Oyler, J. W., Dobrowski, S. Z., Ballantyne, A. P., Klene, A. E., and Running, S. W.: Artificial amplification of warming trends across the mountains of the western United States, Geophys. Res. Lett., 42, 153–161, 2015. 

Pan, M., Sheffield, J., Wood, E. F., Mitchell, K. E., Houser, P. R., Schaake, J. C., Robock, A., Lohmann, D., Cosgrove, B., and Duan, Q.: Snow process modeling in the North American Land Data Assimilation System (NLDAS): 2. Evaluation of model simulated snow water equivalent, J. Geophys. Res.-Atmos., 108, 8850,, 2003. 

Pavelsky, T. M., Kapnick, S., and Hall, A.: Accumulation and melt dynamics of snowpack from a multiresolution regional climate model in the central Sierra Nevada, California, J. Geophys. Res.-Atmos., 116, D16115,, 2011. 

Raleigh, M. S., Lundquist, J. D., and Clark, M. P.: Exploring the impact of forcing error characteristics on physically based snow simulations within a global sensitivity analysis framework, Hydrol. Earth Syst. Sci., 19, 3153–3179,, 2015. 

Roesch, A.: Evaluation of surface albedo and snow cover in AR4 coupled climate models, J. Geophys. Res.-Atmos., 111, D15111,, 2006. 

Rutter, N., Essery, R., Pomeroy, J., Altimir, N., Andreadis, K., Baker, I., Barr, A., Bartlett, P., Boone, A., Deng, H., Douville, H., Dutra, E., Elder, K., Ellis, C., Feng, X., Gelfan, A., Goodbody, A., Gusev, Y., Gustafsson, D., Hellström, R., Hirabayashi, Y., Hirota, T., Jonas, T., Koren, V., Kuragina, A., Lettenmaier, D., Li, W.-P., Luce, C., Martin, E., Nasonova, O., Pumpanen, J., Pyles, R. D., Samuelsson, P., Sandells, M., Schädler, G., Shmakin, A., Smirnova, T. G., Stähli, M., Stöckli, R., Strasser, U., Su, H., Suzuki, K., Takata, K., Tanaka, K., Thompson, E., Vesala, T., Viterbo, P., Wiltshire, A., Xia, K., Xue, Y., and Yamazaki, T.: Evaluation of forest snow processes models (SnowMIP2), J. Geophys. Res.-Atmos., 114, D06111,, 2009.  

Serreze, M. C., Clark, M. P., Armstrong, R. L., McGinnis, D. A., and Pulwarty, R. S.: Characteristics of the western United States snowpack from snowpack telemetry (SNOTEL) data, Water Resour. Res., 35, 21452160,, 1999 (data available at:, last access: 7 November 2022). 

Sims, E. M. and Liu, G.: A parameterization of the probability of snow–rain transition, J. Hydrometeorol., 16, 1466–1477, 2015. 

Sun, N., Yan, H., Wigmosta, M. S., Leung, L. R., Skaggs, R., and Hou, Z.: Regional snow parameters estimation for large-domain hydrological applications in the western United States, J. Geophys. Res.-Atmos., 124, 5296–5313, 2019 (data available at:, last access: 7 November 2022). 

Suzuki, K. and Zupanski, M.: Uncertainty in solid precipitation and snow depth prediction for Siberia using the Noah and Noah-MP land surface models, Front. Earth Sci., 12, 672–682,, 2018. 

Sturm, M., Goldstein, M. A., and Parr, C.: Water and life from snow: A trillion dollar science question, Water Resour. Res., 53, 3534–3544, 2017. 

Trujillo, E. and Molotch, N. P.: Snowpack regimes of the western United States, Water Resour. Res., 50, 5611–5623,, 2014. 

van den Hurk, B., Kim, H., Krinner, G., Seneviratne, S. I., Derksen, C., Oki, T., Douville, H., Colin, J., Ducharne, A., Cheruy, F., Viovy, N., Puma, M. J., Wada, Y., Li, W., Jia, B., Alessandri, A., Lawrence, D. M., Weedon, G. P., Ellis, R., Hagemann, S., Mao, J., Flanner, M. G., Zampieri, M., Materia, S., Law, R. M., and Sheffield, J.: LS3MIP (v1.0) contribution to CMIP6: the Land Surface, Snow and Soil moisture Model Intercomparison Project – aims, setup and expected outcome, Geosci. Model Dev., 9, 2809–2832,, 2016. 

Viviroli, D., Messerli, H. H. D. B., Meybeck, M., and Weingartner, R.: Mountains of the world, water towers for humanity: Typology, mapping, and global significance, Water Resour. Res., 43, W07447,, 2007. 

Wang, L., Cole, J., Bartlett, P., Verseghy, D., Arora, V., Derksen, C., Brown, R., and von Salzen, K.: Investigating the spread of surface albedo in snow covered forests in CMIP5 models, J. Geophys. Res., 121, 11041119,, 2016. 

Wang, Y. H., Broxton, P., Fang, Y., Behrangi, A., Barlage, M., Zeng, X., and Niu, G. Y.: A wet-bulb temperature-based rain-snow partitioning scheme improves snowpack prediction over the drier western United States, Geophys. Res. Lett., 46, 13825–13835, 2019. 

Westerling, A. L., Hidalgo, H. G., Cayan, D. R., and Swetnam, T. W.: Warming and earlier spring increase western US forest wildfire activity, Science, 313, 940–943, 2006. 

Wrzesien, M. L., Durand, M. T., Pavelsky, T. M., Howat, I. M., Margulis, S. A., and Huning, L. S.: Comparison of methods to estimate snow water equivalent at the mountain range scale: A case study of the California Sierra Nevada, J. Hydrometeorol., 18, 1101–1119, 2017. 

Short summary
While land surface models are a common approach for estimating macroscale snow water equivalent (SWE), the SWE accuracy is often limited by uncertainties in model physics and forcing inputs. In this study, we found large underestimations of modeled SWE compared to observations. Precipitation forcings and melting physics limitations dominantly contribute to the SWE underestimations. Results provide insights into prioritizing strategies to improve the SWE simulations for hydrologic applications.