Effect of meteorological forcing and snow model complexity on hydrological simulations in the Sieber catchment ( Harz Mountains , Germany )

Detailed physically based snow models using energy balance approaches are spatially and temporally transferable and hence regarded as particularly suited for scenario applications including changing climate or land use. However, these snow models place high demands on meteorological input data at the model scale. Besides precipitation and temperature, time series of humidity, wind speed, and radiation have to be provided. In many catchments these time series are rarely available or provided by a few meteorological stations only. This study analyzes the effect of improved meteorological input on the results of four snow models with different complexity for the Sieber catchment (44.4 km) in the Harz Mountains, Germany. The Weather Research and Forecast model (WRF) is applied to derive spatial and temporal fields of meteorological surface variables at hourly temporal resolution for a regular grid of 1.1 km× 1.1 km. All snow models are evaluated at the point and the catchment scale. For catchment-scale simulations, all snow models were integrated into the hydrological modeling system PANTA RHEI. The model results achieved with a simple temperature-index model using observed precipitation and temperature time series as input are compared to those achieved with WRF input. Due to a mismatch between modeled and observed precipitation, the observed melt runoff as provided by a snow lysimeter and the observed streamflow are better reproduced by application of observed meteorological input data. In total, precipitation is simulated statistically reasonably at the seasonal scale but some single precipitation events are not captured by the WRF data set. Regarding the model efficiencies achieved for all simulations using WRF data, energy balance approaches generally perform similarly compared to the temperature-index approach and partially outperform the latter.


Introduction
The water balance of mountainous and sub-Arctic catchments is strongly influenced by snow processes.Due to the accumulation of snow during the winter season, a considerable fraction of precipitation is stored in the snow pack.Rapid snowmelt, partially superimposed by rain events, can cause extensive flooding with high potential of damage.Hence, hydrological modeling systems have to incorporate snow accumulation and melt for water resources management in regions that are affected by snow processes.In general, two types of snow models are available: -temperature-index model (also known as the degree-day method) and extended index models (e.g., Anderson, 1973): these models are "simple and modest" (Rango and Martinec, 1995) because only temperature and precipitation time series are needed.Some extensions also incorporate shortwave radiation input (Hock, 1999;Pellicciotti et al., 2005).
Published by Copernicus Publications on behalf of the European Geosciences Union.
-energy balance models include a detailed physically based description for the surface energy balance of a snow pack (e.g., Anderson, 1968).The process-based equations require further meteorological time series as input (e.g., humidity, radiation, wind speed).
When using energy balance approaches instead of index melt models, many more processes such as sublimation losses can be considered.These could be relevant for the Alpine water balance (Strasser et al., 2008).Even though some parameterizations of the energy balance can also be considered inductive empirical functions (Beven, 2012), energy balance approaches are considered to be more reliable for scenario simulations than simpler index methods due to their physical basis (e.g., Charbonneau et al., 1981;Walter et al., 2005;Pomeroy et al., 2007;Singh et al., 2009;Barry and Gan, 2011;Warscher et al., 2013;Marke et al., 2014b).
There are many physically based snow models available for point-scale applications (see, e.g., Etchevers et al., 2004;Rutter et al., 2009).Nevertheless, the temperature-index method is widely accepted for hydrological modeling at the catchment scale (Rango and Martinec, 1995;Seibert, 1999;Beven, 2001).Simple index approaches outperform energy balance approaches in many cases because the latter are more sensitive to the quality of meteorological input data (Zappa et al., 2003).Hence, "inadequate basin-scale hydrologic observations" (Franz et al., 2008) restrict catchmentscale applications of energy balance approaches.However, some studies prove that the energy balance approaches provide better results than temperature-index models, even at the catchment scale (see, e.g., Kuchment and Gelfan, 1996;Fuka et al., 2012).
Authors like Bales et al. (2006), Franz (2006), and Jost et al. (2012) suggest finding new strategies to provide suitable data to drive process-based snow models for catchmentscale applications.For instance, Franz (2006) and El-Sadek et al. (2011) propose (re-)analysis data as input for hydrological modeling.However, this type of data does not provide a sufficient spatial and temporal resolution for typical applications in hydrological modeling for complex topographies of mountain regions.Considering topography, land use and soil type, regional models of the atmosphere that are also referred to as limited-area models (LAM) are suitable tools to "add regional detail" to these global-scale (re-)analysis data (Giorgi, 2006).This procedure is called dynamical downscaling.Besides hydrological forecasts and climate impact studies, successful applications of LAM in hydrological modeling and cryospheric research using (re-)analysis data were reported by Kunstmann and Stadler (2005), Rögnvaldsson et al. (2007a), Rögnvaldsson et al. (2007b), Bernhardt et al. (2010), Liu et al. (2011), Marke et al. (2011), Maussion et al. (2011), Pavelsky et al. (2011), Mölg et al. (2012), and Marke et al. (2014a).In addition, globally available data sets can also be considered valuable information for predictions in ungauged basins (Sivapalan et al., 2003) and ungauged climates (Merz et al., 2011).The application of meteorological fields derived through dynamical downscaling might become increasingly relevant in this context since this procedure is mostly independent of local observations and it is expected that the accuracy of this method will further improve in the future as a result of intensive research in this field of science (see, e.g., Giorgi, 2006;Rummukainen, 2010;Warner, 2011).
The main questions to be answered in this study are: -Given the possible lack of observed data with respect to several meteorological variables (e.g., humidity, wind speed, and sometimes even temperature), does downscaled LAM data represent a valuable alternative to observations for modeling snow processes?
-Does increasing snow model complexity using these data increase model performance?
A LAM of the atmosphere is applied to obtain hourly spatially distributed meteorological time series for snowmelt simulations at both the point and the catchment scale.It is assumed that the meteorological forcing data generated by a LAM, by means of dynamical downscaling, are physically consistent in space and time (Giorgi, 2006) -a basic requirement for application as input for physically based snow models.As reported in the above-cited studies, analysis data are used herein as input for the LAM.In contrast or extension to other studies that address LAM input, four independent snow models including the temperature-index model and three energy balance models are applied in order to investigate different degrees of model complexity on the quality of snow model results.Moreover, using three energy balance models representing more complex models is seen as a multiple hypothesis framework (Clark et al., 2011).Since most energy balance approaches have been developed for point-scale applications but hydrological models require upscaling of these models, all snow models are tested at the point and at the catchment scale.

Study area
All simulations were carried out in the Harz Mountains, a low mountain range in the northern part of Germany (see Fig. 1).The study area covers elevations ranging from 300 to 1100 m a.s.l.The Brocken (1142 m a.s.l.) is the highest peak of both the Harz Mountains and northern Germany.The Harz Mountains delineate the northern boundary of low mountain ranges in Germany merging into the North European Plain with elevations below 200 m a.s.l.
Due to altitudinal differences, considerable gradients in meteorological fields reflecting different climates exist.The mean annual temperature and precipitation depth at the Clausthal climate station (585 m a.s.l.) are 6.2 • C and 1326 mm yr −1 , respectively.However, the climate at the Brocken is considerably different (2.9 • C, 1814 mm yr −1 ).All values provided above refer to the period 1961-1990.Two sites were selected for the simulations in this study: -Torfhaus meteorological station (805 m a.s.l.) for pointscale applications: besides precipitation, temperature and snow-depth recordings, a snowmelt lysimeter is available to provide melt rates.This snowmelt lysimeter is an unenclosed type with a small rim above the collector, which covers an area of 2 m 2 .A tipping bucket is installed below the collector in order to continuously record melt rates.Based on recordings from 2004 to 2012 the mean annual precipitation depth can be approximated as 1430 mm yr −1 .
-Sieber catchment upstream of the Pionierbrücke gauging station (340 m a.s.l.) for catchment-scale applications: the upstream Sieber catchment covers an area of 44.4 km 2 with elevations ranging from 340 m to 920 m a.s.l.Soil types of sandy loam and loamy sand over bedrock are prevailing.About 75 % of the catchment is covered by coniferous forest.Norway spruce (Picea abies) is the predominant wood species.The remaining parts of the area are covered by deciduous trees, meadows, upland moors, and minor settlements.Due to former mining activities in the Harz Mountains, a system of channels redirects water across the watershed boundary (Upper Harz Water Management System).The mean annual runoff depth, which was calculated for the period 1930-2013 based on long-term observations, is 1097 mm yr −1 .Even though average monthly precipitation depth is highest in December, the maximum value of mean monthly runoff depth typically occurs in April.This shift between the mean annual courses of precipitation and runoff depth could be related to snow accumulation during the winter and snowmelt in spring.

Selected winter seasons
As will be described later in Sect.2.3 in much more detail, computation time still limits the application of LAMs with respect to the number of runs, spatial resolution, or length of LAM runs (Rummukainen, 2010).Therefore, apart from large climate change impact projects, many studies that investigate the usage of LAMs with higher spatial and temporal resolution in hydrological modeling generally do not consider long-term simulations.The previously cited studies include LAM applications of 1 month (Maussion et al., 2011), 3 months (Liu et al., 2011), 9 months (Pavelsky et al., 2011), and 1 year (Kunstmann and Stadler, 2005), respectively.Only Rögnvaldsson et al. (2007b) consider 16 years (using a coarser grid of 8 km × 8 km), and the simulations of Mölg et al. (2012) cover 3 years.In this study, two winter seasons with different meteorological conditions were selected for further investigation.This selection was restricted to the last decade, for which hourly observations are available.Unless otherwise stated, the following comments refer to the Braunlage meteorological station (607 m a.s.l., Fig. 1).The winter season 2005-2006 was colder than average.This holds especially for the period from January 2006 until March 2006.Moreover, in February and March, precipitation of 30 % above average was recorded.In the beginning of April, rapid snowmelt occurred due to a significant rise in temperature accompanied by rain.The winter season 2010-2011 on the other hand differs from average with respect to timing of accumulation and melt.In December 2010 the observed temperature was 5 K colder than the average while precipitation was above average indicating ideal conditions for intensive snowfall.The highest daily snow depths in the second half of December observed at Clausthal for the period 1951-2011 were recorded in December 2010.In early January 2011 the weather changed and above average temperatures as well as rainfall were observed throughout the remaining winter season.In 2011, spring was exceptionally dry and warm.It is assumed that these differences in meteorological boundary conditions are sufficient to fulfill the prerequisites of a differential split sample test (Klemeš, 1986).With this type of test, different conditions for calibration and validation periods are presumed, and therefore, it "allows testing the 'risky' predictions of a model rather than the 'safe' ones" (Seibert, 2003).This approach allows one to consider two different winter seasons at still reasonable computational costs.

Dynamical downscaling using WRF
In order to provide meteorological data fields with high resolution in space and time, a non-hydrostatic limited-area model (LAM) was applied.Non-hydrostatic LAMs are considered preferable for spatial resolutions of less than 10 km, as also suggested by Warner (2011).The freely available LAM Advanced Research WRF (Weather Research and Forecast modeling system, Skamarock et al., 2008) was chosen for this reason.
A multi-nesting approach setup of the model was established to provide a spatial resolution of 1.1 km for the study area (see Fig. 2).The first domain ( x = y = 30 km) covers central Europe bounded by the North Sea and the Baltic Sea to the north and the Alps to the south.Due to further refinements of the grid resolution ( x = y = 1.1 km), major characteristics of the topography of the study area are captured in the fourth domain.
The parameterization schemes of the model used in this study are listed in Table 1.A detailed description of the listed parameterization approaches can be found in the user's guide (Wang et al., 2012) or in the cited literature.Convection parameterizations are only applied to domain 1 and 2.Moreover, four-dimensional data assimilation (FDDA, Stauffer and Seaman, 1990) was activated to keep the simulations close to the input data.This approach is proposed by Lo et al. (2008) although other approaches like the re-initialization are also common (Maussion et al., 2011;Hines et al., 2011).
Initial and boundary conditions were obtained from the National Center for Environmental Prediction (NCEP) Final Analysis (FNL) Operational Model Global Tropospheric Analyses data set (ds083.2) using the WRF Preprocessing System (WPS) which is also described by Wang et al. (2012).This data set contains major meteorological variables at the surface and mandatory vertical levels of the atmosphere on 1 • × 1 • grids prepared operationally every 6 h (NCEP, 2012a).Its temporal availability covers the range from July 1999 to the present.To account for temperature changes in the North Sea and Baltic Sea, additional sea surface temperature data available at daily grids with a spatial resolution of 0.5 • × 0.5 • (RTG-SST data set, Thiebaux et al., 2003;NCEP, 2012b) were also prepared using WPS.Finally, hourly gridded meteorological fields with grid spacings of 1.1 km were simulated using WRF for both the winter season 2005-2006 and 2010-2011.Surface meteorological variables that were derived in this manner are precipitation intensity, temperature (2 m), specific and relative humidity (2 m), wind speed (10 m), and shortwave as well as longwave radiation.
Depending on the computer hardware, WRF applications run for 3 days (24 cores) up to 20 days (4 cores) in order to simulate the meteorological fields of one single winter season for the domains shown in Fig. 2. A long-term WRF simulation, as would be desirable in this study, requires the application of a large-capacity computer resulting in relatively high computation costs.In order to avoid such costs and to show a pragmatic alternative for future users who want to apply WRF downscaling but do not have access to large-scale computers, we selected only two winter seasons that represent different meteorological conditions as described in Sect.2.2.

Snow models
Three energy balance snow models and a temperature-index model were tested at the point and at the catchment scale.All algorithms were originally available at the point scale.
The following list accounts for the major characteristics of each model.The reader is referred to the cited literature for a detailed description.Because large parts of the Sieber catchment are covered by forests, the short descriptions emphasize the consideration of forest effects on the energy balance.
-The Utah Energy Balance Model version 2.2 (Tarboton and Luce, 1996) is available as open source software and can be obtained via the internet (Tarboton, 2012).Originally, the model was designed to carry out simulations for time steps of 1 and 6 h.Snow water equivalent (SWE) and energy content are the prognostic variables calculated for each time step using a predictor corrector approach.The model accounts for liquid water and incorporates a simple approach to simulate the ground heat flux.Among others, the most important parameters for calibrating the model are as follows (Tarboton and Luce, 1996): roughness length, thermal conductivity of snow, albedo of fresh snow, and another parameter that accounts for forest cover effects.The latter parameter scales wind speed and solar irradiance.Snow accumulation is parameterized using two temperaturerelated thresholds, which also enables the occurrence of mixed liquid and solid precipitation (mixed phase transition temperature range).
-ESCIMO (Energy balance Snow Cover Integrated MOdel, see, Strasser et al., 2002;Strasser and Marke, 2010) was originally developed for point-scale applications.The model was extended to the fully distributed hydro-climatological model AMUNDSEN (Alpine MUltiscale Numerical Distributed Simulation ENgine, Strasser et al., 2004Strasser et al., , 2008)), which includes detailed process descriptions, e.g., for topographydependent radiative transfer, gravitational and windinduced redistribution of snow, technical snow production (Hanzer et al., 2014) and runoff concentration.For both models, a detailed description of snow canopy interaction was added recently (Strasser et al., 2008(Strasser et al., , 2011)).This approach includes sub-canopy modifications to open-site meteorological conditions and incorporates a canopy interception model (Hedstrom and Pomeroy, 1998), which builds upon the scaling approach of Pomeroy and Schmidt (1993) and Pomeroy et al. (1998).The dependence of the leaf area index (LAI) on the interception capacity is accomplished according to Liston and Elder (2006).All these processes were also successfully tested at the catchment scale (Warscher et al., 2013).In this study, the point-scale model ESCIMO including the canopy model is used.
The parameter set that is altered in order to calibrate the model includes albedo recession constants as well as the LAI.ESCIMO includes a fixed wet-bulb temperature threshold of 2.0 • C in order to separate rain and snow.
- Walter (2012) provides a spreadsheet version of the algorithms described in Walter et al. (2005).The basic idea of their investigation was to develop a physically based alternative to the temperature index method without any additional data requirements.Regardless of its simplicity, the algorithm accounts for all relevant components of the energy balance as well as for liquid water storage and variable density.The original algorithm assumes daily time steps.In this study, all simulations were carried out for hourly time steps.Therefore, an adaptation of the original algorithm was necessary.Only the radiation balance has been modified in this study.Instead of using daily minimum and maximum temperature as proposed by Walter et al. (2005), time series of shortwave and longwave radiation are used.Several equations, such as the albedo recession, were adopted for hourly time steps.As with the Utah Energy Balance model, the snow model from Walter et al. (2005) accounts for forest canopy effects by providing one parameter to scale shortwave radiation.Furthermore, a temperature threshold for separating rain and snow as well as albedo recession constants can be altered in order to calibrate the model.All other processes like, e.g., turbulent fluxes, are originally parameterized using fixed values.
-For reasons of comparability and because of its popularity, the temperature-index approach was also considered.Forest effects on snowmelt can be taken into consideration by calibration of the degree-day value that is typically lower in forests compared to open sites.The degree-day factor is assumed to be independent of time.Snow is accumulated if the temperature falls below a given threshold temperature.Other enhancements, which are sometimes used for this approach, such as cold content, radiation input, or liquid water storage for example, are neglected in order to test the most basic version of this model.
The parameters for each snow model are listed in Table 2.In order to make the snow models available for catchment-scale applications, they have been implemented into a hydrological modeling system.This step is described in the next section.

Hydrological modeling
All simulations at the catchment scale were carried out using the hydrological modeling system PANTA RHEI for the Sieber catchment.PANTA RHEI has been developed by the Department of Hydrology, Water Management and Water protection, Leichtweiss Institute for Hydraulic Engineering and Water Resources, University of Braunschweig, in corporation with the Institut für Wassermanagement IfW GmbH, Braunschweig, Germany (LWI-HYWAG and IfW, 2012).At present, the model is used for a wide range of tasks in national and international projects: Table 2. Model parameters that were altered throughout the calibration process for both the point and the catchment scale.Please note that the catchment-scale parameters refer to the entire Sieber catchment upstream of the Pionierbrücke gauging station.] 0.12 0.12 leaf area index scaling factor -1

Snow model
-science: long-term water balance simulations and optimization of reservoir cascade operation for climate change impact studies (Meon and Gocht, 2012).
-engineering practice: integrated flood protection concepts, hydrological design floods for hydraulic structures.
-engineering practice: online flood prediction of medium to small catchments, e.g., by the Flood Early Warning Centre of the German federal state of Lower Saxony.
Since PANTA RHEI features a graphical user interface (GUI) and geographic information system (GIS) data exchange capabilities, it is a fourth generation hydrological modeling system according to the classification of Refsgaard (1996).
With respect to the level of sophistication of processes, PANTA RHEI is a conceptual deterministic modeling system although physically based enhancements exist like the energy balance snow models described herein or physically based algorithms for evapotranspiration and the soil water balance (Förster et al., 2012;Kreye et al., 2012).It can also be characterized as a semi-distributed modeling system, subdividing the watershed into highly resolved sub-catchments and subsequently into hydrological response units (HRUs).The choice of the simulation time step depends on the task.Common settings range from some minutes to 1 day.For flood simulations, 1 h is a typical time step (LWI-HYWAG and IfW, 2012).Observed precipitation is corrected with respect to systematic errors assuming correction factors for snow and rain.To provide meteorological data for each subcatchment, a simple distance-related distribution method for point-scale observations is applied.A constant lapse rate is assumed in order to account for altitudinal variations of temperature.
The vertical column of hydrological processes including snowmelt, interception, infiltration, and soil water is calculated for every HRU (Fig. 3).Runoff concentration (including several runoff components) and routing calculations are carried out for the entire sub-catchment presuming an aggregation of all associated HRUs.For this study we used the following configuration: the snow models described in Sect.2.4 have been integrated into the modeling system as individual components, each representing a fully functional snow model.The respective parameter accounting for forest cover effects, which is included in each of the models, is provided through lookup tables in PANTA RHEI in order to account for different stand densities in forested areas.
None of the other hydrological process descriptions were changed throughout the study.Interception is calculated according to the Rutter et al. (1971) model.Infiltration and the soil water balance were simulated in a simplified way using a modified curve number approach which was adopted for continuous simulations (Riedel, 2004).Potential evapotranspiration is derived using the Penman-Monteith method (Monteith, 1965).For several land use classes, respective vegetation parameters (e.g., LAI) are provided in terms of lookup tables that were derived from literature review (e.g., Breuer et al., 2003).The calculation of actual evapotranspiration depends on the amount of intercepted water and soil water storage.For each HRU, the runoff is subdivided into surface runoff, interflow, and base flow for each time step.A subdivision assuming two groundwater storages is also possible.Then these runoff components are aggregated to obtain the respective values for the superordinate sub-catchment.Routing is not only available for the river reaches of the subcatchments but also for reservoirs, retention structures, culverts and other features.
A detailed model setup of the Sieber catchment for PANTA RHEI derived in an ongoing climate change research project was available for this study (Hölscher et al., 2012).The catchment was sub-divided into 73 sub-catchments which account for a mean area of 0.6 km 2 .This sub-division into sub-catchments and HRUs was carried out based on digital elevation data, land use, and soil information.Redirections of water due to channels associated to the Upper Harz Water Management System were also considered in this context presuming mean annual and seasonal flow rates, which have been observed using gauging stations (Eggelsmann and Lange, 2011).

Model calibration
The calibration of the hydrological model and the snow models at both scales was carried out through maximizing the Nash-Sutcliffe model efficiency (see, e.g., Hall, 2001) manually.The hydrological model of the Sieber catchment was calibrated using long-term simulations based on daily meteorological data ranging from 1971 to 1991 (Nash-Sutcliffe model efficiency E for daily time step E 24h = 0.72).In total, 10 parameters were altered for the entire catchment (one parameter for the scaling of distributed degree-day values as well as a temperature threshold separating rain and snow, three parameters of the soil model, three recession constants for runoff concentration, and two parameters of the routing model).Subsequently, the model was validated using daily time series from 1991 to 2001 (E 24h = 0.79).In order to adapt the model for hourly time series of precipitation, a second calibration step was considered using hourly precipitation time series ranging from 2002 to 2008 (E 1h = 0.71).This second step only included the adjustment of one parameter that separates infiltrated water and surface runoff in the model.Hence, it is assumed that the altered parameterization is suitable for simulations using hourly time steps.
The snow models were calibrated separately for the winter season [2005][2006] (1 November 2005-1 May 2006), whereas the winter season 2010-2011 (1 November 2010-1 April 2011) is viewed as the validation period.Nash-Sutcliffe model efficiencies are calculated for melt runoff at the point scale and streamflow at the catchment scale.As described in Sect.2.4, the data requirement of models differ with respect to meteorological input.For the temperature-index approach only precipitation and temperature time series are mandatory, whereas the energy balance approaches require radiation, humidity, and wind speed (modified Walter approach: only shortwave and longwave radiation are additional requirements).Besides melt runoff, only precipitation and temperature are recorded at Torfhaus.We therefore decided to prepare a uniform data set, which was used to calibrate each of the snow models at both scales.This data set includes all relevant meteorological time series derived by WRF except for precipitation in order to avoid misleading configurations caused by precipitation uncertainty at small temporal scales.Observed precipitation time series are used instead.
All relevant parameters that were altered for each snow model in order to maximize the model efficiency with respect to melt runoff and streamflow, respectively, are summarized in Table 2.For point-scale applications, none of the adjustable parameters of ESCIMO were altered at this stage.Calibrating the Utah Energy Balance model was most timeconsuming since six parameters had to be altered.All parameters are within the range of values given by Tarboton and Luce (1996).
Since three-quarters of the Sieber catchment is covered by forest, an adaptation to the point-scale parameter sets was necessary in order to calibrate each of the snow models, which are components of PANTA RHEI at the catchment scale.Thus, the calibration procedure for catchment-scale applications focused on the consideration of these effects.PANTA RHEI provides look-up tables including land-usedependent parameters that are relevant for the snow models including degree-day values, vegetation fraction, and LAI (among others).In order to calibrate the temperature-index model, a scaling factor for the HRU-based degree-day values was altered.Similarly, the land-use-dependent vegetation fraction values needed to be adjusted for the modified Walter approach and the Utah Energy Balance Model using a scaling factor.
In contrast to all other tested snow models, ESCIMO includes a detailed model component to simulate snow processes in canopies (thus, the model is hereinafter referred to as ESCIMO + Canopy).This model utilizes the spatial  Table 3 summarizes the model efficiency achieved for each of the snow models at the point and the catchment scale.The model performance is generally better for the catchment scale compared to point-scale considerations.Moreover, the range of model efficiencies is also smaller at the catchment scale.These findings also hold true for the results achieved for the independent validation period (Table 4) underlining the generally good model performance.

Results and discussion
This section is organized as follows: first, the results of simulations achieved using observed meteorological input are described for both the point and the catchment scale.Then, the respective results using downscaled WRF meteorological fields are presented including a brief evaluation of the downscaled meteorological fields.

Point scale
As described earlier, only temperature and precipitation time series are available for the Torfhaus meteorological station.Hence, this data only allows the application of the temperature-index approach.All simulations were carried out using a 1 h time step for the entire winter season 1 November 2005-1 May 2006 as well as 1 November 2010-1 April 2011.Observed precipitation was corrected to account for systematic errors.Figure 4 shows SWE simulations as well as observed snow depth, which is recorded using an ultrasonic sensor.The course of SWE, as modeled using the temperature-index model, tracks the observed snow depth very well for both seasons.However, the shallow snow cover observed in February and March 2011 is not captured by the simulation.The corresponding time series of melt runoff for the same configuration are depicted in Fig. 5.In contrast to the SWE plot, only the melt event is shown here.Every sub-plot shows the melt runoff recorded by the lysimeter.Cumulative runoff time series for both model and observation can easily be related to SWE and cumulative precipitation.In general, simulations match observed time series well, which holds also for diurnal features of melt runoff.However, some of the melt peaks in 2006 are overestimated, whereas one peak is not captured by the model.The total melt depth is simulated in accordance with observations.The model efficiency for the winter season 2005-2006, which is calculated for a 1 h time step, is E = 0.45.
The model efficiency achieved for the winter season 2010-2011 accounts for E = 0.75.This value is better than the respective model efficiency of the winter season 2005-2006 although the total melt depth is underestimated.These results emphasize the applicability of the previously calibrated temperature-index model using local observations at the point scale with respect to both SWE and melt runoff.

Catchment scale
According to the previous section, PANTA RHEI using the temperature-index approach was run using observed meteorological time series for the winter season 2005-2006.Since basin-scale observations of SWE are not available the results are displayed for streamflow for the entire catchment only.Using the temperature index model at the catchment scale yields E 2006 = 0.91 for the winter season 2005-2006 (Fig. 6).Modeled streamflow tracks the observed time series with high accuracy.However, the rain-on-snow event, which occurred from the end of March to the beginning of April 2006 yielding 29.5 m 3 s −1 , is underestimated by the model, which only accounts for 24 m 3 s −1 .
The winter season 2010-2011 is also simulated in accordance with observations (E 2011 = 0.83).The characteristics of the model performance is similar to the results obtained for the winter seasons 2005-2006.Modeled and observed time series coincide well, which is expressed in the high model efficiency.However, the melt peak is also underestimated.

Meteorological fields
To assess the accuracy of the downscaled WRF data, the meteorological data fields derived by WRF were compared to observations from meteorological stations.Figure 7 depicts the cumulative mean areal precipitation based on observations from the precipitation gauge network and the WRF simulations for the winter season 2005-2006.The cumulative precipitation is derived by calculating the arithmetic mean of 19 stations and 19 corresponding grid points of the atmospheric model.These stations are situated in the Harz Mountains and its vicinity (see, Fig. 1).The temporal course coincides well with observations.However, the total sum of simulated precipitation only accounts for 81 % of the observed areal mean, which is not corrected with respect to systematic errors for this comparison.During the melt season, which falls at the end of March and the beginning of April 2006, simulated precipitation intensities are substantially lower than observed intensities.The correlation coefficient of hourly intensities is r = 0.52.
Table 5 provides a statistical evaluation of hourly tation intensity for the stations that were included to calculate the areal mean.WRF underestimates the precipitation depth for the winter season 2005-2006.The mismatch is greatest for the Brocken meteorological station, which is located at the highest summit of the study area (1142 m a.s.l.).It is suspected that the complex topography is not resolved properly by the WRF model setup since the corresponding 1.1 km grid cell is located at 1008 m a.s.l.The Nash-Sutcliffe model efficiency E of most stations is below zero indicating a low model performance with respect to precipitation, even though the results are statistically reasonable, in particular at larger scales.For example, the statistical evaluation of the areal mean yields E=0.14.This value also increases if the temporal scale is shifted towards longer time steps (e.g., to 0.36 if a time step of 6 h is specified).At the daily timescale a model efficiency of 0.52 can be achieved.The reasonably good match of the precipitation time series plotted in Fig. 7 is also reflected by the model efficiency of 0.80 calculated for a monthly time step.
The deviation between observed and simulated precipitation is obviously caused by uncertainties involved in the original analysis data and the downscaling process with regard to the heterogeneous orographic conditions of the small project region (upwind and downwind effects).It is not clear whether the generally lower WRF performance at the station locations in the lowlands is caused by lower model performance due to the nearby model domain boundary or due to weaker representation of hydrometeorological processes in this elevation band.
In general, temperature simulations match observations better than precipitation simulations do.The temperature time series of the WRF grid point that corresponds to the location of the Torfhaus station was evaluated (Fig. 8).The simulated time series match observations at Torfhaus very closely, which is reflected in the high correlation of r = 0.93.The model efficiency E amounts to 0.81.Nevertheless, the statistics reveal that the simulation is biased towards colder temperatures.The mean simulated temperature is 0.7 K lower than the mean value (−1.2 • C) of the observed time series.This also becomes evident when considering the underestimation of maximum temperatures as shown in Fig. 8.
All other relevant meteorological variables were also well simulated.In accordance with Table 5 the statistical evaluation of all remaining variables that are relevant for hydrological modeling are summarized in Table 6.The model performance achieved for the variables strongly depends on the statistical characteristics of the meteorological variables.The model performance of meteorological variables that are subjected to fluctuations at small temporal scales is generally lower when compared to variables that do not show this feature.This holds especially for wind speed and for downwelling longwave radiation.The latter for instance is strongly influenced by the sequence of presence and absence of single clouds.However, mean and standard deviation of these variables are simulated in accordance with   observations.Moreover, modeled temperature and humidity time series coincide very well when compared to observations.Corresponding plots of all these meteorological time series are shown by Förster (2013).

Point scale
In contrast to the previously described model setup, which involves observed meteorological time series, the full range of meteorological surface variables derived through dynamical downscaling enables the application of energy balance snow models.Thus, this model setting is not restricted to the temperature-index approach.
The calibrated snow models were applied using all meteorological variables derived through dynamical downscaling using WRF for the winter season 2005-2006, which is shown in Fig. 9a for SWE and Fig. 10 for melt runoff.All snow models are in agreement with observations when modeled SWE is compared to observed snow depth (Fig. 9a).Accumulation and snow cover depletion are simulated very well with respect to the timing of these processes.However, it is not clear which model is most accurate in modeling maximum SWE.
Figure 10 shows the results for the melt event in spring of 2006 at Torfhaus (calibration period).While the timing of melt runoff including diurnal features is modeled well by all snow models, the peak runoff at 31 March 2006 is not captured.As explained earlier, this mismatch could be related to the fact that precipitation is underestimated by WRF throughout this period.There is still remaining snow pack at Table 5.Comparison of observed and simulated precipitation time series for each station.This table includes the elevation of each station (z), simulated and observed precipitation depth for the winter season 2005-2006 (P s and P o ), the corresponding standard deviation of simulated and observed precipitation intensity ( σ P s and σ P o ), correlation between simulated and observed precipitation (r), root mean square error (RMSE), and the model efficiency (E).The temporal resolution of this evaluation is t = 1 h, which is available for 18 of 19 stations (performance measures for the Brocken are excluded since the temporal resolution of these station recordings is t = 6 h).Observed values are not corrected for this comparison.the end of the melt event in all simulation runs, whereas the automated snow depth recordings indicate a complete ablation of the snow cover.In contrast, melt runoff was still observed even though no precipitation was recorded during this period.Hence, the results from the simulation runs seem to be realistic.The validation of the models was performed using data of the winter season 2010-2011 (Fig. 11).Modeled melt runoff time series coincide well compared to the recorded lysimeter observations.In particular, the Utah Energy Balance Model and ESCIMO closely match observations.However, the peak runoff is generally overestimated by all snow models.At the seasonal scale, the snow cover dynamics are captured very well by all models.The courses of SWE are in accordance with the observed snow depth time series (Fig. 9b).When comparing the model efficiencies for each run, it is evident

Catchment scale
According to the previous simulations at the point scale, all snow models were likewise applied using PANTA RHEI for the Sieber catchment.Figure 12 shows the results of each snow model according to the previous section but for the entire winter season 2005-2006 including the accumulation period.All results plotted in Fig. 12 were derived using downscaled meteorological variables.The simulated streamflow tracks the observed streamflow reasonably well for all snow models.The rain-on-snow event which occurred from the end of March to the beginning of April yielding 29.5 m 3 s −1 is remarkably underestimated by the models.As already explained, this issue is related to the fact that WRF underestimates precipitation intensity during this period.However, the snowpack evolution is simulated in accordance with all models.
The results of the validation period (winter season 2010-2011) are shown in Fig. 13.The tested snow models perform well compared to observations, although none of the models is able to capture the first flood peak in November 2010.Possible explanations for this mismatch could include a lack in representativeness of simulated precipitation.However, all models track the observed time series very well during the melt period.The flood peak in January is captured by all models.
The assumption that the mismatch between modeled and observed snowmelt can at least partly be explained by biases in simulated precipitation is also supported by a study by Förster (2013).In this study, simulated precipitation has been substituted by observations with all the other meteorological variables still provided by WRF.While this approach might be useful in practice due to the fact that observed precipitation data are usually easier available than other meteorological data, the combined application of observed and simulated meteorological input does not allow one to fully assess the potential of WRF data for snowmelt modeling.It is therefore not followed up further in the present study.
It is worth noting that the application of energy balance approaches leads to slightly improved model efficiency when compared to the temperature-index approach.For example, the model efficiencies achieved using the modified Walter approach, which performs best at catchment scale, are E 2006 = 0.76 for the calibration period and E 2011 = 0.70 for the validation period.For every simulation run, downscaled precipitation depth (P sim ), observed streamflow (Q obs , blue line), observed runoff depth (blue dashed line), simulated streamflow (Q sim , red line), simulated runoff depth (red dashed line), and simulated snow water equivalent (SWE s ) are plotted.Model efficiencies E are also given for each run.
When comparing results from both scales, it is interesting to note that the application of energy balance models enables reliable simulations of snowmelt processes.As explained in the previous section, model ranking should be interpreted with caution due to the fact that only two sites and two winter seasons were considered for this study.

Summary and conclusions
To investigate the effect of meteorological forcing and snow model complexity on hydrological simulations, two types of meteorological forcing data and four different snow models were tested for the Sieber catchment in the Harz Mountains, Germany.Three snow models include a detailed description of the energy balance.Moreover, a temperatureindex approach was also considered.All models were set up for point-and catchment-scale analyses.For the latter, the energy balance models were added to the hydrological modeling system PANTA RHEI, which already included the temperature-index approach.While calibration was based on observed precipitation due to biases in simulated precipitation, the huge efforts currently undertaken to improve LAM simulations, particularly with respect to an accurate simulation of precipitation, might allow calibration based on simulated meteorological fields only in future studies.
Observed meteorological time series including precipitation and temperature were used as input for the temperatureindex approach.Since other meteorological observations are hardly available, a dynamical downscaling approach using global atmospheric analysis data to force more complex energy balance snow models at different scales was also considered.In order to derive meteorological data fields, we applied the Weather Research and Forecast model (WRF) driven by NCEP analysis data.The meteorological data fields including precipitation, temperature, humidity, wind speed, shortwave radiation, and longwave radiation were prepared as gridded data sets for hourly time steps.
The meteorological data fields were extensively evaluated using time series from the existing meteorological station network.The accuracy of precipitation simulations strongly depends on the considered scales.At the seasonal scale, downscaled areal precipitation matches observations statistically reasonably, whereas at smaller spatial and temporal scales some events are not captured by the model.For all other meteorological variables relevant for hydrological modeling, the simulated time series match surface observations well.
The temperature-index model performs well at both scales if observed time series of temperature and precipitation are used as meteorological forcing.When observed time series are replaced by downscaled meteorological fields derived through WRF, the model performance of the temperatureindex model decreases.However, WRF provides a full set of meteorological fields, which are physically consistent, even for 1 h time steps.Thus, energy balance approaches can be applied successfully using this configuration of input data.The model performance of the energy balance models is comparable to their respective values achieved for the temperature-index model.In some cases, energy balance approaches perform better than the temperature-index model.
Most uncertainty in modeled melt runoff and streamflow arises from the mismatch in downscaled precipitation at small timescales although precipitation was modeled reliably at the seasonal scale.Even though deterministic precipitation simulations are generally seen as major challenge since modeling cloud microphysical processes is "founded upon less than perfect observational and theoretical base" (Stensrud, 2007), atmospheric models provide physically consistent meteorological fields, which holds also with respect to precipitation at larger temporal and spatial scales.These meteorological fields, in turn, enable the applicability of energy balance approaches for scenario simulations.Since these types of snow models are generally seen as more reliably for changing boundary conditions than temperature-index models (e.g., Charbonneau et al., 1981;Walter et al., 2005;Pomeroy et al., 2007;Singh et al., 2009;Barry and Gan, 2011;Warscher et al., 2013;Marke et al., 2014b), this study encourages the application of energy balance approaches for scenarios with changing boundary conditions.
To conclude, LAMs are valuable tools for deriving meteorological fields as input for hydrological modeling.In our study we have consistently analyzed hydrological simulations using either simulated or observed meteorological input.However, we have replaced simulated precipitation by observations in the calibration period to avoid biases in the parameter set originating from biases in simulated precipitation.For practical applications, a combined utilization of meteorological simulations and observations (e.g., using observed precipitation in combination with otherwise only simulated variables) might be a valuable alternative that can often even enhance the efficiency of the hydrological models (see Table 3 or Förster, 2013).

Figure 1 .
Figure 1.Map of the study area including the Sieber catchment (watershed boundary is magenta colored) located in the Harz Mountains, northern Germany.The station network shown on the map provides hourly time series for precipitation only (circles), precipitation and temperature (triangles), and additional observations (squares).The Harz National Park boundary is provided by OpenStreetMap, © OpenStreetMap contributors (http:// www.openstreetmap.org/copyright).

Figure 2 .
Figure 2. WRF domains of the study area.The resolution of the first (outer) domain is 30 km, the fourth and final (inner) domain's resolution is 1.1 km.

Figure 3 .
Figure 3. Conceptualization of hydrological processes in the hydrological modeling system PANTA RHEI (adapted from LWI-HYWAG and IfW, 2012).According to Sect.2.4, the snowmelt process can be simulated by four different models.

Figure 4 .
Figure 4. Comparison of simulated SWE and observed snow depth time series at the Torfhaus meteorological station: (a) winter season 2005-2006, (b) winter season 2010-2011.Both simulations were performed using observed meteorological time series.The shaded area refers to the corresponding melt period (see, Fig. 5).

Figure 5 .Figure 6 .
Figure 5. Snowmelt simulations at the point scale using observed meteorological time series at the Torfhaus meteorological station: (a) winter season 2005-2006, (b) winter season 2010-2011.For every simulation run, observed precipitation depth (P obs ), observed runoff ( r obs , blue line), observed runoff depth (blue dashed line), observed snow depth (H obs ), simulated runoff ( r sim , red line), simulated runoff depth (red dashed line), and simulated snow water equivalent (SWE s ) are plotted.Model efficiencies E are also given for each run.

Figure 7 .
Figure 7.Comparison of observed and simulated mean areal precipitation depth.The cumulative precipitation depth is derived by calculating the arithmetic mean of 19 stations and 19 corresponding grid points of the limited-area model.At this stage, no correction approach was applied to the observed time series, e.g., to correct wind induced errors.Please refer to Table5for further details including performance measures.

Figure 8 .
Figure 8.Comparison of simulated and observed temperature time series at the Torfhaus meteorological station.

Figure 9 .
Figure 9.Comparison of simulated SWE and observed snow depth time series at the Torfhaus meteorological station: (a) winter season 2005-2006, (b) winter season 2010-2011.This plot includes time series of SWE for all tested snow models using downscaled WRF meteorological input.

Figure 12 .
Figure 12.Streamflow simulations at the catchment scale using downscaled meteorological time series for the winter season 2005-2006: (a) temperature-index approach, (b) modified Walter et al. (2005) approach, (c) Utah Energy Balance model, and (d) ESCIMO + Canopy.For every simulation run, downscaled precipitation depth (P sim ), observed streamflow (Q obs , blue line), observed runoff depth (blue dashed line), simulated streamflow (Q sim , red line), simulated runoff depth (red dashed line), and simulated snow water equivalent (SWE s ) are plotted.Model efficiencies E are also given for each run.

Figure 13 .
Figure 13.Streamflow simulations at the catchment scale using downscaled meteorological time series for the winter season 2010-2011: (a) temperature-index approach, (b) modified Walter et al. (2005) approach, (c) Utah Energy Balance model, and (d) ESCIMO + Canopy.For every simulation run, downscaled precipitation depth (P sim ), observed streamflow (Q obs , blue line), observed runoff depth (blue dashed line), simulated streamflow (Q sim , red line), simulated runoff depth (red dashed line), and simulated snow water equivalent (SWE s ) are plotted.Model efficiencies E are also given for each run.

Table 3 .
Nash-Sutcliffe model efficiencies E derived through calibration at the point and the catchment scale, respectively.Statistics refer to the winter season 2005-2006 (calibration period).
distribution of LAI values for all areas covered by forests in the Sieber catchment.There is one adjustable parameter available that linearly scales distributed LAI values.For catchment-scale applications, no adaptation of parameters was necessary in order to run ESCIMO + Canopy.

Table 4 .
Nash-Sutcliffe model efficiencies E at the point and the catchment scale, respectively.Statistics refer to the winter season 2010-2011 (validation period).

Table 6 .
Comparison of observed and simulated time series for the most relevant meteorological variables.This table includes simulated and observed mean of each variable for the winter season 2005-2006 (x s and x o ), the corresponding standard deviation ( σ x s and σ x o ), root mean square error (RMSE), correlation (r), and the model efficiency (E).The superscripts denote the station for which the comparison was carried out.

. Earth Syst. Sci., 18, 4703-4720, 2014 www.hydrol-earth-syst-sci.net/18/4703/2014/ K. Förster et al.: Effect of meteorological forcing and snow model complexity 4715
The corresponding model efficiencies of the temperature-index approach are E 2006 = 0.51 and E 2011 = 0.66, respectively.Snowmelt simulations at the point scale using downscaled meteorological time series for the winter season 2005-2006.Statistics refer to the period 25 March-9 April 2006: (a) temperature-index approach, (b) modified Walter et al. (2005) approach, (c) Utah Energy Balance model, and (d) ESCIMO.For every simulation run, downscaled precipitation depth (P sim ), observed runoff ( r obs , blue line), observed runoff depth (blue dashed line), observed snow depth (H obs ), simulated runoff ( r s , red line), simulated runoff depth (red dashed line), and simulated snow water equivalent (SWE s ) are plotted.Model efficiencies E are also given for each run.