A comparison of hydrological models with different level of complexity in Alpine regions in the context of climate change

. This study compares the ability of two degree-day models (Poli-Hydro and a degree-day implementation of Alpine3D) and one full energy-balance melt model (Alpine3D) to predict the discharge on two partly glacierized Alpine catchments of different size and intensity of exploitation, under present conditions and climate change as projected at the end of the century. For present climate, the magnitude of snow melt predicted by Poli-Hydro is sensibly lower than the one predicted by the other melt schemes, and the melting season is delayed by one month. This difference can be explained by the combined effect of the 5 reduced complexity of the melting scheme and the reduced computational temporal resolution. The degree-day implementation of Alpine3D reproduces a melt season closer to the one obtained with its full solver; in fact, the onset of the degree-day mode still depends upon the full energy-balance solver, thus not bringing any particular beneﬁt in terms of inputs and computational load, unlike with Poli-Hydro. Under climate change conditions, Alpine3D is more sensitive than Poli-Hydro, reproducing discharge curves and volumes shifted by one month earlier as a consequence of the earlier onset of snow melt. Despite their 10 beneﬁts, the coarser temporal computational resolution and the ﬁxed monthly degree-days of simpler melt models like Poli-Hydro make them controversial to use for climate change applications with respect to energy-balance ones. Nevertheless, under strong river regulation, the inﬂuence of calibration might even overshadow the beneﬁts of a full energy-balance scheme.


Introduction
The hydrology of high Alpine catchments is dominated by the melt of seasonal snow cover and glaciers, and thus particularly sensitive to climate change (Barnett et al., 2005). The amount of runoff and its seasonal pattern is likely to be heavily modified in the future, impacting ecology, water resources management and the overall quality of life in inhabited areas (Yvon-Durocher et al., 2010;. Change in summer discharge in Alpine areas will also increase the sensitivity to air tem- 20 perature, enhancing the warming of Alpine rivers with climate change (Michel et al., 2021a). Therefore, the development of models reproducing reliable predictions of the response of Alpine catchments discharge to climate change is a crucial step.  Meteorological data over Switzerland are partly acquired from the MeteoSwiss (MCH) automatic monitoring network (IDAWEB, 2019). Since no MCH station is installed within the Dischma catchment itself, the nearby stations of Davos (DAV) and Weissfluhjoch (WFJ) provide daily values for air temperature (TA), precipitation (PSUM), wind velocity (VW), relative humidity 5 (RH) and incoming shortwave solar radiation (ISWR). The same variables, except for PSUM, are used for the Swiss side of the Mera catchment from the station Samedan (SAM) in the upper Engadine.

IMIS stations
The second part of the meteorological data used for Switzerland comes from the Intercantonal Measurement and Information System (IMIS) station network (IMIS, 2019), operated by the WSL Institute for Snow and Avalanche research (SLF). The 10 IMIS station network does not provide ISWR measures. Additionally, IMIS rain gauges are not heated, therefore winter precipitation is obtained indirectly based on snow depth variations and snow settling computed by the SNOWPACK model (?).
The information used in this study comes from the IMIS stations located in Davos (DAV), Flüelapass (FLU) and Zernez (ZNZ) (see Fig. 2). All data coming from Swiss stations (IMIS, MCH) are presented in Tab. 2. 15 The first automatic meteorological stations was installed in the Italian region of Lombardy only in the late 80s. Thus, Lombardy can only count on observational time series which, at most, date back to those years. Nowadays, ARPA counts on a network of Table 3. Details about Italian meteo stations (ARPA). The * symbol next to some of the available variables indicates that such variables started to be recorded later than the first recorded year for TA and PSUM. The climate change section of this paper depends upon the CH2018 climate change scenarios (MeteoSwiss, 2018). These scenarios were derived from the new EURO-CORDEX ensemble of climate simulations with Regional Climate Models (RCMs) Kotlarski et al., 2014). The RCM simulations of EURO-CORDEX were performed using a common model domain centered on western Europe. RCMs dynamically downscale coarser global projections (from Global Climate Mod-10 els, GCMs) into a resolution which is more suitable to represent the topography of Switzerland. However, RCM simulations from EURO-CORDEX do provide information at a spatial resolution of 0.11 or 0.44 • km, which is still too coarse for local impact assessments and might lead to biases, especially for complex orographies. CH2018 applied quantile mapping for spatial downscaling, to both observations at station's scale and gridded observations at 2 km to derive site-specific projections.

Station
Thus, CH2018 can provide projections at daily resolution for the MCH automatic weather station network. Available variables 15 are the mean, minimum and maximum near-surface air temperature, relative humidity, wind speed, cumulative precipitation and incoming shortwave radiation. 68 model chain outputs are provided under three Representative Concentration Pathways: RCP8.5, RCP4.5 and RCP2.6. In this paper, we considered a selected subset of 17 out of the original ensemble, resumed in Tab. 4. The subset was selected to represent the spread between models but to limit computational demand at the same time.

Quantile mapping on MCH, IMIS and ARPA data
The availability of CH2018 downscaled scenarios is restricted to the MCH station network. We adopt the methodology presented by Rajczak et al. (2016) and apply it on IMIS stations (Michel et al., 2021b). This Quantile Mapping (QM) methodology 5 enables the spatial transfer of the future climate projections from the set of CH2018 MCH stations to a different network. We extend the data set from Michel et al. (2021b) to ARPA network to have consistent climate change data for all our stations.
The procedure is summarized in the following: as a first step, the so-called Most Representative Station (MRS) is selected out of the MCH CH2018 network for each ARPA station. The MRS is selected by maximising a combined correlation score between observed mean daily temperature and cumulative daily precipitation time series. The combined correlation score is 10 the observed one. The QM spatial transfer is thus applied to the ensemble of ARPA stations, resulting in climate projections spanning to the end of the century.
Climate change simulations presented in Section 4.3 are run for the hydrological years 1991-2000 and 2081-2090. We will refer to the first decade as of "reference period", whereas to the last one as of "climate change period". For simplicity, we use full decade names further in the paper (e.g. 1990-2000 for the hydrological decade 1991-2000, meaning October 1990 to 5 September 2000).

Temporal downscaling
CH2018 scenarios for Switzerland only provide data at daily resolution. However, such temporal resolution can be too coarse for this data to be used as input to physically-based models such as Alpine3D and StreamFlow. This paper follows the method recently proposed by Michel et al. (2021b), which provides hourly downscaled climate change time series for MCH and IMIS 10 stations. The same approach was used to obtain hourly climate projections for ARPA stations by first disaggregating the MRS time series (according to the approach by Rajczak et al. (2016) described in the previous Section 2.5.2) and then applying the trend to ARPA stations. This approach is based on the delta-change method, which was also used in the previous CH2011 scenario (MeteoSwiss, 2011). In Michel et al. (2021b), this procedure is further developed, especially in the sense of the quality assessment of the generated time series and the validation of the parameters. This enhanced method highlights that the original 15 approach used in CH2011 did not represent correctly the seasonal cycle of the climate change scenarios. Furthermore, the method originally developed was validated only for temperature and precipitation, whereas the new method is validated on the 5 variables that are used in this paper. Time periods of 10 years are used (using the period 2005-2015 for historical baseline).
The work of Michel et al. (2021a) shows that using 10 years time periods for applications such as in this study leads to similar results than using 30 year periods. This finding is highly relevant for this paper for a twofold reason. On the one hand, this 20 allows to include data from IMIS stations. Although IMIS data are not available over the last 30 years to well represent climatic cycles, the 10 years available have proven to improve discharge simulations over Alpine areas (Schlögl et al., 2016). On the other hand, without this finding, no climate change study would be possible within poorly gauged catchments whose surveying does not date 30 years back, such as Mera. For the flow routing model StreamFlow (see Section 3.1.2), watershed, sub-watershed and stream network are derived using the TauDEM software (Tarboton, 1997) coupled with a wrapper (Michel et al., 2021a) to force it to reproduce exactly the stream network provided by the FOEN (of the Environment, 2013).

Geographical data
Conversely, Poli-Hydro model simulates the hydrological balance and the flow routing for the catchment area (through flow directions and flow accumulations) which is identified from the DEM using ArcGIS software (ESRI, 2012). The DEM used is 5 previously upscaled to match the scale of interest (500 m).
It is important to underline here that the two approaches prepare the DEM in different ways, thus justifying slight differences in basin shape hereafter.

Glacier data
For both catchments, we use glacier height maps from Zekollari et al. (2019). By means of an innovative model called Glo-GEMflow, the authors have developed the first regional Alpine glacier modelling making use of high-resolution climate model outputs from EURO-CORDEX Kotlarski et al., 2014). Glacier height needs to be understood as the ice depth above the surface. Such maps are used as initial condition of each past and future (climate change) simulation. For historical periods, all calibrations are run with the 2005 glacier map as initial condition. Glacier evolution is then treated and simulated according to the two models and melt schemes. A due consideration is that glacier maps do overwrite the CLC land 15 cover classes, and in the case that a pixel is classified as glacier by CLC but not by Zekollari et al. (2019), that pixel is simply converted into bare rock.

Models description
In this study, two different models are compared: the degree-day model Poli-Hydro (PH hereafter) and the process-based model 20 chain Alpine3D+StreamFlow (A3D+SF hereafter). Interestingly, both models have been used recently to perform climate change studies. In Michel et al. (2021a), A3D+SF chain is used to investigate future water temperature of Swiss rivers under climate change. Fuso et al. (2021) used PH to evaluate future hydrological scenarios for the Lake Como catchment in Italy according to three GCMs of the Sixth Assessment Report (AR6) of the IPCC driven by four shared-socio-economic-pathwaysbased scenarios. PH solves within the same model the surface mass and energy balance in order to obtain soil runoff and the 25 hydrological routing. For the process based side, A3D is first run in order to obtain the soil runoff, and in a second step SF is run for the hydrological routing using A3D output as input.
The forcing meteorological data at station location are extrapolated to the grids in A3D using distance weighted method with vertical lapse rate corrections (see details in Michel et al. (2021a)). The grids of forcing data are written out and also used in PH, in order to use exactly the same forcing data. Longwave radiation is also computed in A3D using TA, RH and cloud cover 30 derived from ISWR using the approach described in Omstedt (1990). The grids are produced at 500 m resolution. The same forcing grids are then used in PH.

Alpine3D
Alpine3D model (Lehning et al., 2006) is a deterministic and spatially distributed model designed for high-resolution simulation of snow processes in topographically complex areas. In A3D, the SNOWPACK model (Lehning et al., 2002) is applied 5 to each cell of the catchment (here a 500x500 m grid). SNOWPACK is a physical 1-D multi-layer snow and soil model formerly developed for avalanche warning. It comes with a detailed description of the snow micro-structure and it resolves phase changes in snow on a physical basis. SNOWPACK also contains a two-layer canopy module simulating the micro-meteorology in the forest, the evapotranspiration, and the interaction between trees and snow, including snow interception (Gouttevin et al., 2015).

10
A3D can also be run with a simpler melt-factor energy balance mode (Shakoor et al., 2018), this setup is called A3D DD hereafter. In A3D DD , the melt rate is linearly linked to air temperature by the Temperature Melt Factor (TMF). In A3D, A3D DD can be used within SNOWPACK as an energy boundary condition at the snow-atmosphere interface. When a melt phase occurs, i.e. when water and ice are coexisting in the snow element and air temperature is larger than snow surface temperature, the energy entering the snowpack is computed with A3D DD . Radiative fluxes are set to zero in the uppermost snow element and 15 the turbulent fluxes are not computed anymore. When the snowpack is not in a melting phase, the net energy flux at the snow surface is computed by solving the standard energy-balance boundary condition (Schlögl et al., 2016). Ice and snow are not distinguished by A3D DD melt model: a single TMF value is used for both of them, although such approach has been shown to be oversimplified for glacierized catchments (Hock, 1999). The energy balance (EB) in case of melting is computed by A3D DD as of Equation 1.
In Equation 1, T M F is the Temperature Melt Factor; T air and T snow the air and snow temperature, respectively.
Within A3D, the SNOWPACK model is run for every cell of the catchment at a 15 minutes time step, while the forcing boundary conditions are updated on an hourly basis. A3D also has a radiative sub-module enabling the consideration of topographic shading, terrain reflection and atmospheric cloudiness on shortwave radiation distribution. Topographic effects deeply 25 influence radiation balance in mountain regions. In fact, the intensity of incoming and outgoing radiative fluxes depends on many factors such as surface inclination angle, shading and surface properties (von Rütte et al., 2021). In mountain terrain, incoming longwave radiation decreases with elevation because higher areas have an enhanced angular exposition to the radiating sky, which is colder than the terrain (Lehning et al., 2006). A3D is run with the same setup described in detail in Michel et al. (2021a), to which we refer for further details.

StreamFlow
StreamFlow is a semi-distributed hydrological model, described in (Gallice et al., 2016). The runoff output produced in A3D is collected into each sub-watershed in SF and the residence time in the soil is determined with an approach using two linear reservoirs. The water is then routed to the river. While SF offers multiple routing scheme, the simple instant routing scheme shows a similar performance as more complicated ones for Alpine catchments studies (Michel et al., 2021a), thus we use this 5 scheme in the present work. SF is run at hourly timestep and at a resolution of 100 m. Further details about the setup used for SF can be found in (Michel et al., 2021a).

Poli-Hydro
Poli-Hydro is a spatially semi-distributed, cell based (here 500x500 m) hydrological model, suitable for the simulation of highaltitude, poorly gauged catchments (Bocchiola et al., 2018;Casale et al., 2020). It is able to reproduce deposition of snow and In Equation 2, M s and M i are the melt factors for snow and ice respectively; T the mean daily temperature; DD s and DD i 20 the temperature melt factors for snow and ice (Tab. 7); T t the threshold temperature equal to 0°C, RM F s and RM F i the radiation melt factors; α s,i the snow and ice albedo (Soncini et al. (2017)); q sw the shortwave radiation flux.
Ice and snow melt occur only when the average daily temperature is higher than the threshold temperature. Ice melt occurs on ice covered domain cells, once snow melt is completed.
Overland flow Q s is only generated in the condition of saturated soil (Equation 3).
In Equation 3, S is the soil water content evaluated from the mass balance equation for each time step, and S max is the greatest potential soil storage calculated by the Curve Number method (Usda., Scs., 1986), from a land cover map for each cell of catchment. The sub-superficial discharge Q g is computed as of Equation 4 In Equation 4, K is the saturated permeability and k is a power exponent.

5
Then, for the flow routing, two parallel systems are considered, one for superficial flow and one for sub-surface flow. Two instantaneous unit hydrograms u(t) (IUH) are evaluated for each cell using the Nash approach (Rosso, 1984) for superficial and sub-surface flow respectively.

Statistical scores
Statistical scores are used to assess the models' predictive performance. In this work, three metrics are used: the RMSE for 10 the snow cover analysis in Section 4.2.1, the Nash-Sutcliffe Efficiency (NSE, Nash and Sutcliffe (1970)) and the Kling-Gupta Efficiency (KGE, Gupta et al. (2009)) for the comparison of predicted discharge in section 4.2.
With respect to NSE, the value N SE = 0 is commonly regarded as an inherent benchmark for models' performance (Schaefli and Gupta, 2007;Knoben et al., 2019), as it describes the situation where the average of the observations has the same explanatory power as the model's predictions. 15 The KGE score further extends NSE into three different components: the correlation r, the flow variability error α and the mean flow bias β. It is important to remark that, as explained in Knoben et al. (2019), it is impossible to define within KGE a benchmark value that acts as a threshold between a "good" and a "bad" model. To keep consistency with NSE, here we will consider as a benchmark the case where Q sim = Q obs , which yields KGE = −0.41 (Knoben et al., 2019) and corresponds to N SE = 0. 20 The need for considering the KGE in addition to the widely-used NSE mostly arises from two important drawbacks of the NSE. In the first place, The use of the observed mean annual as a benchmark can be a mediocre predictor (Schaefli and Gupta, 2007), especially for strongly seasonal discharge time series, as those dominated by snow melt, leading to the overestimation of the model efficiency. Secondly, the NSE computes squared differences between the observed and predicted values. As a result, the metric becomes excessively sensitive to extreme values by enhancing higher magnitude streamflows and neglecting 25 lower ones (Legates and McCabe, 1999;Criss and Winston, 2008). It is important to note that PH is calibrated using the NSE metric as the score to maximise, while SF uses the KGE metric, so each model might be biased toward better performance in its respective calibration metric. The simulated snow water equivalent is compared with measured snow depth at the AWS. Within Dischma catchment itself, MCH/IMIS snow depth data are not available, therefore only SCA is verified. Detailed approach and results are reported for Mera catchment in Fuso et al. (2021). Modelled snow validation is performed comparing observed and modelled time series of snow depth. To this aim, the daily modelled snow water equivalent is converted to snow depth using the approach proposed 1.14 1.07 2.00 3.50 9.00 9.00 0.5 0.5 1.00 0.80 1.00 1.14 by Martinec and Rango (1989) and more recently implemented in Aili et al. (2019). It is important to remark that the approach by Martinec and Rango (1989) is only used to verify the modelled snow water equivalent against the measured snow depth.

Model comparison
As mentioned in the Introduction, the use of degree-day models might be preferred for the lighter computational load and the lowest amount of input data required, but on the other hand, their use for climate change impact studies is disputable.
With the aim of performing climate change impact studies, the objective is to assess the models' efficiency in reproducing discharge and total volumes at the river section of interest. However, high-performance discharge simulations strongly depend 20 on the correct representation of runoff formation. Specifically, runoff in Alpine catchments can be separated in two main components: the fast runoff, which is mainly due to precipitation and surface flow, and the slow runoff, generated by snow and ice melt and sub-surface flow.
Thus, in this section we compare the models presented in Sections 3.1.1, 3.1.2 and 3.1.3 in three different ways: (1) by means of a snow cover analysis, (2) before flow routing (referring to the water volume released at each pixel of the catchment 25 domain) considering the different contributions to the total runoff (precipitation, snow melt and glacier melt) and (3) after flow routing (referring to discharge and volumes). When discussing runoff contributions, before the flow routing, we refer to models    as A3D, A3D DD and PH respectively for Alpine3D (in its full solver and degree-day version) and Poli-Hydro (Sections 3.1.1, and 3.1.3). Conversely, for discharge and volumes, after the flow routing, we refer to the same models as SF, SF DD and PH R respectively for Streamflow (forced with A3D and A3D DD) and the flow routing scheme of PH (Sections 3.1.2 and flow routing scheme in Section 3.1.3).

5
In the first place, the two models are implemented with different temperature thresholds for rain-snow separation, 0°C for PH (Fuso et al., 2021) and 1.5°C for A3D and A3D DD (Michel et al., 2021a), as this is the way they are typically used. As a result, PH may simulate more liquid precipitation than A3D in winter which does not accumulate as snow (see Fig. 3). for this station. A similar analysis was not possible for Dischma, because the MCH/IMIS dataset used for this paper did not contain snow height or snow water equivalent measurements within the basin itself. Models' performance in reproducing snow height is rated by means of RMSE. Snow height is best reproduced by A3D, which gives the lowest RMSE (top right of Fig.   4 panel (a)). A3D DD tends to underestimate snow height, however, RMSE is only slightly higher than A3D. PH shows the highest RMSE. Common errors for A3D seem to be related to compaction, for A3D DD to an excessively fast melting, whereas 15 PH shows errors in accumulation and melting. The poorer performance of PH may be explained by the relative simplicity of the San Giacomo Filippo is used to calibrate/validate snow water equivalent in model PH but not in A3D/A3D DD . On the one hand the comparison is uneven, but on the other hand it conceives even more relevance to A3D and A3D DD outperforming PH there.
Modelled snow water equivalent at basin scale is shown in Fig. 5 and Fig. 6. As already observed in Fig. 4, A3D reproduces systematically higher snow water equivalent compared to A3D DD and PH. Snow water equivalent peaks are equally reproduced by all models in terms of seasonality. However, regardless of the catchment and year, PH pictures a longer melting season. The 10 reason might be a protracted snow melt process due to PH's degree-day scheme and the resolution of meteorological forcing: as mentioned in Section 3.1.3, the melting scheme only takes into account a mean daily temperature value, which needs to be higher than the melting threshold for any snow melt to be represented. This condition is likely to be met only after melting has already set in reality, delaying the process of snow ablation and leading to more snow mass predicted for summer months.
These findings are confirmed by the work of Terzago et al. (2020), where melt models of lower complexity did show higher biases against observations with respect to more complex ones like SNOWPACK. In Terzago et al. (2020), high biases for simpler models are related both to their underestimation of snow water equivalent peak values and to the protraction of snow 5 melt season that they reproduce. Maps of summer months in Fig. 5 and Fig. 6 corroborate that PH reproduces higher snow water equivalent at high elevation areas, especially in early summer.

Runoff formation
The different components of runoff are shown in Fig. 7 and Fig. 8. Throughout Section 4.2.1, we underlined major differences among models in how they reproduce the snow cover. Such differences lead to discrepancies in modeled snow melt. As a 10 consequence of the model's structure described in Section 3.1.3 and 4.2.1, PH reproduces a delayed and reduced melt season compared to both versions of A3D. Interestingly, A3D DD reproduces a melt season closer to the one obtained with the full solver and more important melt events during the winter.
An altered reproduction of snow melt seasonality affects models' performance in reproducing discharge. In Fig. 7

Statistical performance of runoff simulations
In this section, performance metrics described in 3.2 are used to evaluate discharge simulations. Despite showing that simpler 30 melt schemes conceive an altered representation of runoff seasonality, the previous section did not quantify such alterations. Table 9 shows models' performance scores over the validation period. In general, better performances are obtained over Dischma catchment than over Mera, in agreement with what can be seen in Fig. 7 and Fig. 8. The Mera river being regulated     by multiple dams (see Fig. 1 and Tab. 1), this result is not surprising and illustrates the need of explicitly considering such infrastructures in models (which is not the case for the two models used here). With PH R , better scores are obtained both in terms of annual NSE and KGE over the Mera, although we previously showed that the predicted melt dynamics is substantially wrong and Fig. 7 shows a clear overestimation of runoff during the summer season. Table 9 also shows the r, α and β components of KGE scores. Over the Mera, flow variability predicted by the models is higher than the observed one (positive α), suggesting again the influence of the regulation.
Over the Dischma catchment, both indicators show a lower performance of PH R compared to SF, as we can expect since PH R tends to delay the melt season by one month. As mentioned earlier, for catchments with strong seasonal signals using quality metrics on a yearly basis is not optimal. Many authors have addressed this issue by using those metrics on a seasonal basis (Garrick et al., 1978;Martinec and Rango, 1989;Legates and McCabe, 1999;Schaefli and Gupta, 2007). We thus evaluate the

Discussion
The annual scores shown in Tab

25
In general, both versions of A3D used reproduce the melt dynamics more correctly than PH, the main issues with PH being the shifted seasonality of the melting and the absence of a peak behaviour. Indeed, PH tends to reproduce a rather constant melt.
This result is similar to the findings by Magnusson et al. (2010). The degree-day version of A3D obtains results surprisingly close to the full energy balance solver. In the Mera catchment, snow melt events are more important in winter with A3D DD , leading to a reduced snow height and thus snow melt later in the year. In the Dischma catchment, the melt is slightly enhanced in 30 spring and thus reduced in summer in A3D DD . Besides, the in-depth snow cover analysis in Section 4.2.1 shows the superiority of both versions of A3D against PH, which however does not model snow depth directly. However, the difference in discharge between the two versions of A3D are rather low and the seasonal scores obtained are similar. This shows that part of the difference between outputs of A3D can be absorbed in the calibration of the soil reservoir residence time in SF, leading in the end to similar discharge simulations.
A key point regarding the higher performances of both A3D versions on Dischma compared to PH is the higher temporal resolution at which the energy balance is solved. The melt scheme used by PH is based on the mean daily temperature, which means that if the mean is lower than the melting threshold, the model does not simulate any snow melt, whereas temperatures 5 might well reach higher values during the daytime and melting could happen instead. As a consequence, the melt during the spring season is delayed in PH. Previous studies showed that adopting shorter time steps in hydrological modelling can be beneficial for runoff simulations, not only for short-duration events, but also for the analysis of outputs at a larger time scale (Ficchi et al., 2016). This however requires to have forcing time series at hourly resolution, which is not always the case, especially for climate change scenarios. In addition to the shorter time step, the degree-day scheme of A3D benefits from the 10 enhanced complexity of the model. In particular, this scheme is enabled only when the melting starts (i.e. liquid and solid water coexist in the snowpack). The onset of melting is thus still determined by the full energy balance solver. Moreover, A3D DD remains a multi-layer snow model, which is generally desired for representing snow processes in a physically based model, given the significant vertical variations of snow properties Arduini et al. (2019). This version of A3D is thus not at all similar to a simple degree-day model such as PH. It is important to note that, unlike PH, the degree-day scheme in A3D does not bring any particular advantage in terms of required input variables and computational speed, but it is only used here for sensitivity analysis purposes.
All models used here obtain lower performance in the Mera catchment than in the Dischma. The explanation is twofold.
First, the Mera catchment is highly regulated by dams, which is not accounted for in the models. The errors of the models (overestimation of runoff in summer and underestimation in winter) match the pattern of dams usage: water retention in summer to produce electricity in winter during the demand peak. Besides regulation, another key point affecting the representation of discharge in the Mera catchment could be the scarcity of input data. Lack of data might be an impairment for models with higher degree of complexity like A3D, as they are more sensitive to poor interpolation of meteorological data (Bougamont et al., 2007;Magnusson et al., 2011;Schlögl et al., 2016). The scarcity of meteorological input forcing in the Mera catchment, compared to 10 the Dischma, could explain why A3D does not outperform PH model there. In addition to the different representation of snow, SF and PH R use completely different water routing schemes. While SF uses a semi-distributed approach, the approach of PH R is purely conceptual. The important difference we observe in the snow melt seasonality between the two models do not allow comparison on the performance of the water routing schemes. Such comparison should be done by forcing both water routing model with the same input, which was not possible here since the water routing module of PH is deeply linked to the snow 15 melt module in the model. In summary, from models comparison over the validation period, we conclude that: -Both versions of A3D outperform PH in the simulation of snow cover. The PH model is effectively underestimating the contribution of snow and ice melt during the melting season; -Such underestimation has strong repercussions on the correct representation of the sharp and narrow discharge peak characterising nival rivers' hydrograms;

20
-The degree-day version of A3D obtains slightly lower performances in the representation of snow, but these differences are compensated during the calibration of SF and the simulated runoff is then very similar; -Input data scarcity in the Mera catchment leads to lower performances with A3D, while all models fails at correctly representing the discharge in the Mera catchment (possibly also in response to water regulation from dams).
For the upcoming application to climate change, the differences observed between PH and A3D must be considered. Indeed, 25 the model PH, despite an extensive calibration and the usage of ad-hoc values for many parameters corresponding to nowadays conditions (see Tab. 7), exhibits a poorer representation of the snow melt contribution to runoff. This is a strong indication that snow melt may be incorrectly captured in the future by such models when the climate regime will exhibit substantial change.
Additionally, the temperature melt factors are assumed to be constant for the whole basins and throughout the years, even if this approach could be wrong, especially for climate change applications. However, large basins such as Mera often range 30 from low to high and inaccessible elevations, where data accessibility is likely limited, thus justifying the need to use simpler schemes for such applications. A comparison of the full solver of A3D and PH when forced with climate change scenarios is presented in the next section.

Climate change
Climate change simulations are run over the period 2080-2090 and forced by the set of RCMs listed in Tab. 4. In the following, we present the predicted evolution of precipitation, snow melt (as of the most significant seasonal forcing for total runoff and the major difference between the two melt schemes), discharge and total volumes with the aim of discussing the sensitivity of the models to climate change. Glacier melt will not be covered because its influence by 2050-2060 and later is predicted 5 to be extremely small due to glacier retreat. In the following, A3D's degree-day configuration will not be used anymore since (1) as mentioned before, there are no real computational advantages in using such scheme with respect to the full solver and (2) differences with respect to A3D are very small, although they could as well increase in a future changed climate. For precipitation and snow melt, climate change simulations are compared against the mean over the reference period 1990-2000, whereas for discharge and volumes they are compared to the observed time series. The reference period is used here with 10 the only aim of showing seasonality changes induced by climate change, as the models' differences in reproducing melt were already discussed throughout Section 4.2.
Predicted evolution of precipitation, snow melt, discharge and volume patterns are shown in Fig. 10 and Fig. 11 for Mera and Dischma respectively. Boxplots in Fig. 12 and Fig. 13 show deviations of the RCP8.5 mean from the mean over the reference period.
Changes in snow melt predicted by A3D and PH are consistent at a seasonal level: with respect to the reference mean over 1990-2000, a net increase is predicted in spring (most evidently in the nivo-glacial Dischma catchment), a net decrease in late summer (August to September) and late autumn (October to November), an increase during winter months (December 20 to March). Such findings are in line with previous studies, e.g. Bavay et al. (2013) and Kobierska et al. (2013). Although tendencies are similar amongst the two melt models, the magnitude of changes is rather different. The energy-balance model A3D appears to be more sensitive to climate change than the degree-day model PH: not only the predicted snow melt is significantly higher (up to 50%), but also changes with respect to the reference period are more marked, regardless of the catchment (see Fig. 12 and Fig. 13, panels (b)). Differences in melting magnitude and change are mostly ascribed to spring and summer 25 months. To explain this, the basis of the degree-day melt scheme needs to be discussed again. The main physical processes involved in the energy balance are incoming and outgoing longwave radiation, absorbed shortwave radiation, turbulent heat fluxes and melt (Ohmura, 2001;Zappa et al., 2003). The degree-day estimation approach of snow/glacial melt of PH relies on air temperature, which is considered to be a good predictor for the majority of energy fluxes taking part in the balance, and on absorbed shortwave radiation (see Equation 2), which is incompletely described by air temperature and whose effect 30 is included through calibrated parameters. Therefore, there are two substantial limitations in the use of a degree-day melting scheme for climate change applications. On the one hand, snow degree-days parameters of PH are calibrated and fixed for each month (see Tab. 6). As a consequence, the model gains less sensitivity towards air temperature changes and it cannot capture changes in melt seasonality under climate change conditions, leading to possibly altered results. On the other hand, the gener- (panels (a)), snow melt (panels (b)), discharge (panels (c)) and volumes (panels (d)) predicted by A3D (left panels) and PH (right panels).  (panels (a)), snow melt (panels (b)), discharge (panels (c)) and volumes (panels (d)) predicted by A3D (left panels) and PH (right panels).
Precipitation and volumes are presented as average monthly cumulates. Snow melt is presented as the average year. Discharge is presented as the average year, smoothed by a moving average with a window of 30 days.
The predicted shift in melting season affects the seasonality of the discharge regime for both catchments. Figure 10 and  at A3D over Dischma catchment. Even if the discharge peak timing is not expected to shift significantly, the RCP8.5 mean reproduces a discharge curve which is shifted by one month with respect to the average observations (see Figure 11, left panel (c)). The same pattern is reproduced by the predicted monthly volumes (see Figure 11, left panel (d)). A greater magnitude of change in spring and summer months predicted by A3D is also shown in Fig. 12 and Fig. 13, especially for the nivo-glacial Dischma catchment. 15 Another interesting aspect that this analysis brings to light concerns predicted discharge magnitude for the Mera catchment.
As shown by panels (c) and (d) in Fig. 10 illustrating discharge and volumes, both models reproduce a significant reduction in river's streamflow magnitude. This behaviour is only ascribed to Mera river as no mass seems to be lost for Dischma (Fig. 11,panels (c) and (d)), where the curve is shifted by new climatic conditions but the volumes are overall preserved.
Discharge magnitude reductions go as far as -85% for RCP2.6, -82% for RCP4.5, -86% for RCP8.5 predicted by A3D, and 20 -77%, -78% and -82% predicted by PH R . We ascribe this peculiar inability of both models to correctly reproduce base flow in Mera catchment under climate change to the river's regulation and to human activity in general. If both models show coherent volumes for a nearly natural catchment such as Dischma, it is likely that for a strongly regulated catchment such as Mera, the calibration process is the one that most influences the models' ability to reproduce volumes correctly. Under changed climatic condition, the parameters that have been calibrated or fixed under current climate may fail to reproduce the base flow correctly.

25
To test this hypothesis, one may refer to the predicted evapotranspiration over the catchments. At the time when these simulations were run, this output was not yet implemented within A3D, so that it is impossible to make a comparison with the output from PH. In the meantime, this information was outputed (Michel et al., 2021a), but nevertheless it is shown there that evapotranspiration term will not have a great influence in cryospheric areas.
To conclude, our findings are twofold. On the one hand, the analysis on the almost natural Dischma catchment confirmed 30 what previous studies discussed before (Hock, 2005;Magnusson et al., 2010), i.e. that the use of degree-day models for future hydro-climatic studies is questionable since they rely on parameters which are calibrated in current climatic conditions and on a partial representation of the energy balance, whose inner physical processes won't change under changing climate. On the other hand, the case of Mera suggested that the use of either complex or simple melt models coupled with routing modules alone might be disputable when applied to strongly regulated, extensive catchments, because the calibration process might 35 influence models' predictive ability more profoundly than the energy balance alone. Henceforth, our guess is that hydropower reservoir operation cannot be neglected in similar contexts and need to be accounted for in modelling.

Conclusions
This paper compares the discharge response of two Alpine catchments to present conditions and climate change, predicted by one energy balance and two degree-day melt models: A3D, A3D DD and PH respectively. The two catchments, Mera and Dis-5 chma, are different in size and extent of water resources exploitation. Under present climatic condition, both the full energy balance and the degree-day versions of A3D outperform PH in reproducing the melt dynamics, especially over the almost-natural, nivo-glacial Dischma catchment, where snow melt is severely underestimated by PH. Over the Mera catchment, monthly volumes are underestimated in winter and overestimated in summer by all models, suggesting that regardless of the melt scheme, hydropower operations (i.e. water release in winter to produce electricity when demand is peaking and subsequent withholding 10 in summer for accumulation purposes) can reduce model's discharge predicting capacity. The superiority of both versions of A3D compared to PH is particularly evident when analyzing snow depth and spatial distribution.
In terms of predicted discharge, seasonal performance scores over the entire validation period don't show significant difference between models for Mera, with scores being satisfactory but not outstanding. The explanation is twofold. On the one hand, flow regulation might alter monthly volumes relatively, but the impact on daily flow regimes is certainly heavy, thus 15 hindering each model and melt scheme in reaching high scores at all. On the other hand, data scarcity over Mera is a bigger problem for the more complex energy balance approach, which may explain why A3D does not outperform the simpler melt scheme there. Conversely, performance metrics over the well-gauged, almost-natural Dischma catchment show better performance for both versions of A3D+SF over PH R . Seasonal scores, however, show that both versions of A3D+SF chain outperform PH R in about all seasons and all catchments. Interestingly, in terms of snow melt magnitude/seasonality and dis-20 charge, results from the degree-day version of A3D+SF are very similar to those obtained from its full energy balance one.
However, the scheme A3D DD is only enabled at the melting onset, so it is always determined by a full energy balance model in the first place and it cannot be compared to a simpler degree-day model as PH, which lacks A3D's predicting capacity but brings desirable advantages such as reduced input detail and computational load. However, A3D DD also carries the advantages of being a simplification of a multi-layer snow model in the first place.

25
Under climate change, end-of-century changes in snow melt seasonality predicted by A3D and PH are qualitatively the same: a net increase in spring and winter, a net decrease in summer and autumn. However, A3D's melt scheme appears to be more sensitive to climate change than PH's, as the discharge curve predicted by A3D+SF is shifted by one month under RCP8.5 scenario. Likely, the use of a degree-day melt scheme like PH for climate change studies is not suitable, since (1) fixed monthly degree-days compromise the model's ability to perceive seasonal changes in snow melt and (2)  Ficchi, A., Perrin, C., and Andréassian, V.: Impact of temporal resolution of inputs on hydrological model performance: An analysis based