General comments
The main objective of the study was to develop relatively simple extended degree-day snow models driven by freely available snow-cover images. The authors see the novelty of their research in the independent calibration of the snowmelt models on snow-cover images, which allows standalone estimation of the associated parameters and thus a better representation of snow processes. The output from these snow models was later used as input to a modified HBV model for streamflow simulation in five selected catchments in Germany and Switzerland.
First, it should be noted that the paper has already been reviewed by two reviewers and the authors prepared a new version of the manuscript. After reading the reviewers' comments and the authors' replies, it becomes clear that the study has been significantly revised. Nevertheless, I did not base my review on the earlier reviews and rather tried to comment on the revised study without bias.
In my opinion, the authors did interesting work. I certainly agree that the focus on testing different variants of degree-day models and their calibration against snow-cover area using MODIS data is important, although not fully novel. Similarly, the de-coupling of the snow routine from the selected hydrological model and its standalone calibration might bring some new insight into calibration procedures and model equifinality, although many hydrological models are nowadays calibrated using more variables in addition to streamflow (SWE, snow cover, groundwater levels, etc.). Therefore, I found the study important and partly novel, and I agree with the previous reviews that the study is worth publishing in HESS. However, I have several specific comments and questions regarding the methodological approach and the quality of presentation. These comments should be carefully addressed before I can recommend the manuscript for publication. I only partly checked the original manuscript (before the revisions), so I hope I will not contradict the initial reviews.
Specific comments
In my opinion, the introduction section still needs partial improvement. In particular, the part within lines 45-66 reads like a list of studies with only short descriptions and no synthesis of the individual findings. I read the reviewers' comments from the first round of reviews and their concerns regarding the introduction, as well as the authors' response. I therefore appreciate that the authors extended the introduction section, but in my opinion it resulted in only a partial improvement. Although I understand the authors' earlier argument against writing long reviews that cite unrelated studies (and with deep respect for the second author's experience), I still think it should be possible to write a relatively short and focused introduction that presents the state of the art and the research gaps and helps readers understand the topic. I would therefore like to encourage the authors to improve the introduction once again and to better relate the individual pieces of information to each other.
Two study catchments, Reuss and Aare, are partly covered by glaciers, and the glacier cover of the Aare is relatively high (15.5%), so glacier melt considerably influences catchment runoff. Was a glacier routine included in the HBV model structure the authors used to simulate streamflow? I did not find this information in the text, so it seems that no glacier routine was used. If true, I am not sure to what degree the simulations reflect the observed values (at least for the Aare catchment). Could this influence the interpretation of the results? While I think the missing glacier component is not a problem for the snow models and the related interpretation, I think it might be important for interpreting the results related to the "standard" HBV and the "modified" HBV (although the authors assessed NSE values only for the cold-season months, I assume the simulations themselves were run over the whole period 2010-2018). The most straightforward solution would be to include a glacier component for the two glaciated catchments (at least for the Aare catchment), or at least I would ask the authors to carefully address this point in the discussion section.
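For reference, a commonly used simple extension (generic notation, not taken from the manuscript) computes ice melt over the glacierised, snow-free part of the catchment with a separate degree-day factor:

M_{ice} = DDF_{ice} \, (T - T_0) \qquad \text{for } T > T_0 \text{ and } SWE = 0,

where DDF_{ice} is typically larger than the degree-day factor for snow because of the lower albedo of bare ice. Even such a crude term would capture the late-summer glacier-melt contribution to runoff in the Aare catchment.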
L 313-317: It is not fully clear to me how exactly the authors proceeded when creating the variants of the hydrological (HBV) model. If I understood correctly, the authors created six HBV model variants for each catchment (named "modified HBV"). These six HBV variants did not contain snow routines, since the snowmelt simulations from the six previously defined snow routines were used directly as input to the HBV model. The last (seventh) variant was simply the "true" HBV with its snow routine in its original structure (which partly differs from the other snow routines because it includes, e.g., water holding capacity and refreezing). Is that right? If so, please consider reformulating the respective part of the methods section to make this clearer. The fact that you used the HBV snow routine to compare it with the other six snow-routine variants became clear to me only from the results section (mainly from Fig. 7). Therefore, to improve the clarity of the methods section, I would suggest describing seven variants of the snowmelt model (Model 1 - Model 7): the six you already have and a seventh representing the original HBV snow routine. The seventh variant should be calibrated in the same way as the other six, and the comparison should be plotted in Fig. 7. In my opinion, this procedure would make it clearer how well or badly your snow routines perform compared to the original snow-routine structure implemented in HBV.
Related to the comment above, can the differences between the "modified" and the original HBV (shown in Fig. 9) be attributed to the separate calibration of the snow routines, to the different structures of the snow routines, or to both? Can you differentiate between these two influences? Maybe it would be methodologically clearer if, as a first step, you calibrated the "modified" HBV model against discharge separately for all seven snowmelt inputs (as described in my comment above; this is probably what you did). This way you can better compare which snow routine performs best when implemented in the HBV model. In a second step, you could select only the "modified" HBV model with snowmelt input from the separately calibrated HBV snow routine (snow-model variant 7, as suggested in my comment above) and compare it with the calibration of the original HBV model. This way, the first step shows the differences between the individual snow-routine structures (including the original HBV snow routine), and the second step shows the advantages/disadvantages of separate snow-routine calibration compared to the "normal" calibration (against discharge only) of the complete HBV model.
An important question is also whether the model performance should be assessed using NSE only. Current best practice is to use multiple criteria to make the results more robust. Would the interpretation of the results change if you used different objective criteria (logarithmic NSE, volume error, etc.)? With this comment I come back a little to what was mentioned by Reviewer 2 in the first round of reviews, namely to what degree the value of a single objective function (NSE in this case) can really tell us whether one model is better than another (especially when the differences are small).
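To illustrate the kind of complementary criteria I have in mind, a minimal sketch (the implementation is mine, not the authors'; obs and sim stand for daily observed and simulated discharge) could be:

import numpy as np

def nse(obs, sim):
    # Nash-Sutcliffe efficiency, dominated by high-flow errors
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def log_nse(obs, sim, eps=0.01):
    # NSE on log-transformed flows, emphasising low-flow periods;
    # eps avoids log(0) on zero-flow days
    return nse(np.log(np.asarray(obs, float) + eps),
               np.log(np.asarray(sim, float) + eps))

def volume_error(obs, sim):
    # relative volume (bias) error over the evaluation period
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return (np.sum(sim) - np.sum(obs)) / np.sum(obs)

Reporting at least these three values side by side would make the ranking of the model variants considerably more convincing than NSE alone.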
In my opinion, the discussion section should be improved, since it does not seem clearly linked to the results. It is certainly a matter of personal preference, but I prefer using the results section only for the description and basic interpretation of results tied to a single figure or table, while everything that goes beyond the interpretation of a single figure (i.e., interpretation of the results in the wider context of all presented results and the literature) should be placed in the discussion section. In this respect, the discussion section should be comparable to the results in its extent and may (though not necessarily) follow a structure similar to that of the results section.
Overall, the text is often difficult to follow, since there are many unclear statements and it is often not clear how exactly the authors proceeded (see also the points above). This is also the case for some figures and tables, which are not clearly linked to the text and do not provide the reader with all the needed information, such as an informative caption or a correct legend. Please see also my detailed comments in the list below. Maybe my comments and criticism stem from these unclear issues rather than from real problems in the methodological approach and the interpretation of the results. In any case, I would like to encourage the authors to go carefully through the text and make it clearer and more consistent.
Technical corrections
L15: Please use "Nash-Sutcliffe efficiency" instead of NSE in the abstract.
L17: Two full stops at the end of the sentence.
L 88-93: I would omit this paragraph since I find it too general. In fact, this is how all scientific papers are organized; a specific description is therefore not necessary here.
Fig. 1: The elevation legend for the three inset figures (study area) does not seem correct to me. As far as I can tell, the colour scale in these small inset maps is continuous, so the legend should be displayed accordingly (there are not only four or five colours in the figures, right?). Besides, if intervals are used for the colour scale (which is, to my knowledge, best cartographic practice), the legend should be displayed without spaces between the individual coloured rectangles. Additionally, use "Elevation [m a.s.l.]" as the legend caption and add a graphic scale (to all inset maps and the main map).
L 99, 101: Please use "m a.s.l." instead of "masl" (please also check other occurrences in the text where relevant).
L 101: The highest point of Switzerland is 4634 m a.s.l. (Dufourspitze, Monte Rosa massif). This should also be reflected in the legend of Fig. 1 (the last number in the legend). In this context, I would prefer the "real" highest point rather than the highest cell of the DTM raster you used to create the map.
Table 1: Please use correct unit conventions (km², m a.s.l.).
Section 2.2: Why not use the official MeteoSwiss and DWD gridded products (which are available at a much finer spatial resolution than your interpolations)? Was it because you also needed Tmax and Tmin, while the official gridded products are provided only for daily Tmean and P? Or was there another reason? Please clarify briefly.
L 182: The authors mention that their "Basic Degree-day model" (Model 1) is the same model as implemented in HBV. However, this is not fully true, since the snow routine implemented in the HBV model also accounts for liquid water holding capacity (which delays the water release from the snowpack and thus directly influences daily SWE values) and refreezing (which usually has only a small effect on the SWE calculation, at least at seasonal time scales). Please also see my specific comment related to "Model 7".
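For clarity, the terms I refer to can be sketched as follows (generic notation, not taken from the manuscript). Both models compute melt as

M = DDF \, (T - T_0) \qquad \text{for } T > T_0,

but the HBV snow routine additionally refreezes liquid water in the snowpack,

R = C_{FR} \, DDF \, (T_0 - T) \qquad \text{for } T < T_0,

and retains liquid water up to a fraction C_{WH} (typically around 0.1) of the snow water equivalent before any release occurs. These two terms are what distinguish the HBV snow routine from a purely "basic" degree-day model.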
L 197-198: “falling on the snowpack”.
While I fully agree that topography (e.g., slope orientation) is important for the snowmelt distribution, I would not say it also impacts snowfall temperatures (shortwave radiation does not differ much between north- and south-facing slopes during snowfall events). Therefore, I think Model 4 does not make much sense. Nevertheless, I accept the authors' decision to include it.
L 257: “grid” instead of “gird”.
L 262: I would prefer "seamless" numbering, meaning that heading "3.1" should follow immediately after heading "3". Therefore, I suggest using a heading (3.1) for the general methodological approach (including Fig. 2 and the list of parameters), continuing with heading 3.2 named something like "Snow routine variants" (or similar), followed by "3.3 Data requirement of the models", etc.
Chapter 3.4 would perhaps fit better in the discussion.
L 350 and 362: There are no Figs. 4a and 4b. Alternatively, and perhaps better, add a) to g) labels to the individual panels of Fig. 4.
Fig. 5: Please add colour scale captions.
L 359: Typo in “efficiency”.
L 408: Delete "below" after "Fig. 7" (the figures are positioned during post-production and may end up elsewhere, not necessarily "below").
Fig. 6: Is the colour scale needed? If I understood correctly, the colour scale used here simply follows the parameter values, but the parameters have different physical meanings and magnitudes and are thus not comparable to each other. Therefore, I think the colour scale is rather confusing in this context. It would also be good to add units for each parameter. Additionally, please make the figure caption more informative; figures and their captions should be understandable even without the related text. For example, which model variant is shown here? Why does the last line represent a specific date rather than a year, as the other lines do?
Fig. 7: Same as above, please make the figure caption more informative. Among other things, which scores are included in the individual plots? Those resulting from the 1000 parameter sets? What is represented by the width of the individual plots? Please provide a clear description in the figure caption.
Fig. 8, 9, Table 4 and 5: Same as above, please provide more informative captions. For the tables, it is not clear what numbers are shown (the fact that they are Brier scores is mentioned only in the text).
L 425: Typo in "hydrological". Besides, perhaps "Hydrological model validation" would sound better.
L 426-431: This part would fit better in the methods section.
Fig. 10: Why is this figure actually shown here? And why specifically the Horb catchment and the 2012/13 season? Please explain this better in the text. I understand that this might be an example to support your conclusion about calibrating the snow routines separately and then the rest of the hydrological model. However, without any other information it looks as if you selected the "best" result to support your conclusion, without any evidence that other catchments/years performed similarly well or badly. I would strongly suggest either putting this figure into a wider context or removing it.
Fig. 10: The y-axis labels should contain units.
All figures: Besides the specific comments above, please check the font size in all figures.