Articles | Volume 30, issue 8
https://doi.org/10.5194/hess-30-2337-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
Technical note: High Nash–Sutcliffe Efficiencies conceal poor simulations of interannual variance in seasonal regimes
Download
- Final revised paper (published on 23 Apr 2026)
- Supplement to the final revised paper
- Preprint (discussion started on 20 Oct 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-3851', Anonymous Referee #1, 26 Nov 2025
- AC2: 'Reply on RC1', Sacha Ruzzante, 19 Dec 2025
- RC2: 'Comment on egusphere-2025-3851', Anonymous Referee #2, 02 Dec 2025
- AC1: 'Reply on RC2', Sacha Ruzzante, 19 Dec 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Publish subject to revisions (further review by editor and referees) (13 Jan 2026) by Elena Toth
AR by Sacha Ruzzante on behalf of the Authors (20 Feb 2026)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (07 Mar 2026) by Elena Toth
RR by Anonymous Referee #2 (10 Mar 2026)
RR by Anonymous Referee #1 (20 Mar 2026)
ED: Publish subject to technical corrections (21 Mar 2026) by Elena Toth
AR by Sacha Ruzzante on behalf of the Authors (09 Apr 2026)
Author's response
Manuscript
The authors address a very important and, fortunately, increasingly discussed topic: should we blindly trust our traditional performance metrics for hydrological modelling? Among other very interesting insights, they discuss a sad (although necessary) truth: high NSE (or even KGE) values do not necessarily mean that the simulations are adequate. This underlines the need for us, as modellers, to improve our optimization metrics. The paper is definitely a fit for HESS and should be published, but, as is to be expected, some concerns should be clarified/corrected/improved first, alongside several suggestions.
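To make this central point concrete, the following is a purely synthetic sketch (made-up flows and amplitudes, not the paper's data, models, or method): a "simulation" that reproduces only the long-term mean seasonal regime, and none of the year-to-year variability, can still score a high NSE when the seasonal cycle dominates the variance.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic daily flow for 20 years: a strong seasonal cycle whose
# amplitude varies from year to year (the interannual variance).
n_years, n_days = 20, 365
doy = np.arange(n_days)
seasonal = 1.0 + 0.8 * np.sin(2 * np.pi * doy / n_days)
amplitude = rng.normal(1.0, 0.15, size=n_years)
obs = np.concatenate([a * seasonal for a in amplitude])

# A "simulation" that only reproduces the mean seasonal regime:
# every year is identical, so interannual variability is zero.
sim = np.tile(seasonal, n_years)

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of obs about its mean."""
    obs, sim = np.asarray(obs), np.asarray(sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# High NSE despite the simulation carrying no interannual signal at all.
print(f"NSE of climatology-only simulation: {nse(obs, sim):.3f}")
```

Because the seasonal cycle explains most of the total variance, the NSE stays high even though the simulated annual series are identical every year; this is exactly the failure mode the title describes.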
1. I believe that the methodology used for the time-series decomposition needs to be explained in more detail; if needed, the authors could make use of an Appendix or the Supporting Information. This is a crucial part of the paper and must be easy for readers to follow.
2. Related to this, I feel the authors could better justify the choice of decomposition. Was it motivated by previous work? Are there further references? This should be made clear in the text.
3. The authors call the seasonal component the long-term seasonality of the basins. Our rivers are undergoing change, and seasonality is consequently shifting in many of them. I think this deserves fuller treatment in the text. I understand the choice (L85-89), and I believe much of the change is captured in the irregular component, but the text would benefit from some clarification of these choices.
4. Simulations: If I understood correctly, the authors used simulated data from several models (and in one case ran the simulations themselves). Did the authors check the calibration/evaluation/test periods of all the models, or check for an overlapping period? Did they use only what was classified as the test period? My main concern is that, during the model comparison, the authors might be using streamflow simulated in the test period for some models but in the calibration period for others, or mixing single-basin and regional simulations. I see no problem in using different settings, but this needs to be reported and discussed extensively in the results. For example, I have the feeling that for the PREVAH-CH simulations the authors may have used the entire simulation (including calibration) rather than only the evaluation period (I might be wrong). My suggestion is to review these aspects and incorporate this information into the manuscript.
5. Regarding Figure 3 (and also L275 onwards): were the models that performed better for highly seasonal catchments the ones with the lowest performance overall, or is that just my impression? I think this should be discussed more thoroughly, perhaps by showing median performances, or a box plot in an appendix: something to clarify whether the models that do better on the seasonal component simply had poor overall performance. Also, touching on point 4: how were these simulations obtained by the original authors? Did they report them for the evaluation phase, or are they actually for the calibration period? This would be worth clarifying for the readers.
6. L328-332: may need to be rephrased after reviewing points 4 and 5.
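Points 1-3 concern the time-series decomposition into a long-term seasonal component and an irregular remainder. As a hedged illustration only (a generic day-of-year climatology split; the paper's own decomposition, described around L85-89, may differ in detail), one common way to perform such a split is:

```python
import numpy as np

def decompose(daily_flow):
    """Split a (years, 365) array of daily flows into a long-term seasonal
    component (the day-of-year mean across all years) and an irregular
    residual. A generic climatology split, not necessarily the paper's
    exact method.
    """
    flow = np.asarray(daily_flow, dtype=float)
    seasonal = flow.mean(axis=0)      # one value per day of year
    irregular = flow - seasonal       # year-by-year departures
    return seasonal, irregular

# Demo on synthetic flows: 10 years of noisy seasonal discharge.
rng = np.random.default_rng(0)
doy = np.arange(365)
flow = 1.0 + 0.5 * np.sin(2 * np.pi * doy / 365) + rng.normal(0, 0.1, (10, 365))
seasonal, irregular = decompose(flow)
# seasonal + irregular reconstructs the original series exactly.
```

Under this construction the seasonal component is time-invariant by design, so any change in seasonality over the record necessarily ends up in the irregular component, which is the trade-off point 3 asks the authors to clarify.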