Interactive comment on "Usefulness of four hydrological models in simulating high-resolution discharge dynamics of a catchment adjacent to a road" by Z. Kalantari

The presented article is a systematically performed study on modelling runoff from a small catchment adjacent to a road. The value and novelty of the article lie mainly i) in concentrating on winter-time conditions, which are often neglected even in studies in cold-climate regions, and ii) in using and comparing the performance of four structurally different hydrological models. The main concerns related to the overall quality of the article are i) the difficulty of picturing runoff generation in the study area, and thereby the difficulty of assessing the results and the presented analysis of the performance of the


General comments
The present manuscript does not constitute a very innovative contribution. Studies that compare several hydrological model structures, or use modular structures for the same purpose, have become quite frequent in recent years (Breuer et al., 2009; Clark et al., 2008; Holländer et al., 2009; Plesca et al., 2012; Reed et al., 2004; Refsgaard and Knudsen, 1996). The main difference here is that the authors target peak flow events in hourly time-step simulations, whereas most previous studies consider long-term simulations. I appreciate that the authors place their study in a real-world frame, with the stated aim of correctly simulating hydrological extreme events in order to predict their effect on road infrastructure. However, this point is forgotten throughout the manuscript and only briefly evoked in the conclusions, without addressing the problem introduced in part 1. If flood warning levels exist in the area, it would also be good to assess the ability of each model to predict them correctly, for example with hit rates and false alarm rates (e.g. Roulin, 2007).

The authors focus too much on, and are satisfied with, correctly reproducing the magnitude and timing of the peak flows, but they say little about total volumes, which may also be relevant for extreme event prevention and infrastructure design. This may be due to the model evaluation, which is based only on a quadratic measure of the error; this makes the Nash-Sutcliffe efficiency (NSE) more sensitive to peak values (Legates and McCabe Jr, 1999). Introducing some bias information in the evaluation (e.g. Plesca et al., 2012) could better constrain parameter sets.

The usefulness and correctness of the GLUE methodology adopted for CoupModel and HBV are debatable. First, these two models are calibrated over one period (Oct-07 to Apr-08) but run over the same 16-month period as LISEM and MIKE-SHE. Do you use the remaining period (Jan-07 to Sep-07) to spin up those models? This calibration period covers Periods I, II and III, which are examined in more detail later. A validation of any sort is lacking for these calibrated models. Second, while the authors introduce a threshold of R² > 0.6 and NSE > 0.6 to discriminate between behavioural and non-behavioural parameter sets, this aspect is skipped in the presentation of results. The advantage of having fast models probably lies in the possibility of addressing predictive uncertainty, especially if the calibration strategy is based on a GLUE approach, which does not aim at finding a single best parameter set. Therefore, I would expect uncertainty bounds rather than single predictions in the hydrographs. Third, 1,000 Monte-Carlo runs is probably far too few to find a global optimum with 17 parameters. Finally, if the aim of the study is to correctly predict high flows, why not target the evaluation of these events alone in the uncertainty analyses? Good metrics over a 6-month period do not necessarily imply a correct representation of individual events. It would make the results more comparable with LISEM's.
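To make the evaluation suggestions above concrete, the following is a minimal sketch of the three kinds of criteria mentioned: the Nash-Sutcliffe efficiency, a volume bias, and hit rate / false alarm rate against a warning level. The function names, the sign convention of the bias (negative means underestimated volume), the example series and the warning threshold are all assumptions made for illustration; none of them are taken from the manuscript.

```python
from math import fsum


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = fsum(obs) / len(obs)
    squared_error = fsum((o - s) ** 2 for o, s in zip(obs, sim))
    variance_term = fsum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance_term


def percent_bias(obs, sim):
    """Volume bias in percent (assumed convention: negative = simulated volume too low)."""
    return 100.0 * fsum(s - o for o, s in zip(obs, sim)) / fsum(obs)


def warning_scores(obs, sim, threshold):
    """Hit rate and false alarm rate for exceedance of a warning threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for o, s in zip(obs, sim):
        if o >= threshold and s >= threshold:
            hits += 1
        elif o >= threshold:
            misses += 1
        elif s >= threshold:
            false_alarms += 1
        else:
            correct_negatives += 1
    # Return None when a score is undefined (no observed events or no non-events).
    hit_rate = hits / (hits + misses) if hits + misses else None
    non_events = false_alarms + correct_negatives
    false_alarm_rate = false_alarms / non_events if non_events else None
    return hit_rate, false_alarm_rate


# Hypothetical hourly discharge values, for illustration only.
observed = [1.0, 2.0, 8.0, 3.0, 1.0, 6.0]
simulated = [1.0, 3.0, 6.0, 2.0, 1.0, 4.0]

print("NSE:", round(nse(observed, simulated), 3))
print("Percent bias:", round(percent_bias(observed, simulated), 1))
print("Hit rate, false alarm rate:", warning_scores(observed, simulated, 5.0))
```

In this made-up example the NSE looks acceptable while the volume is underestimated by roughly a fifth and one of the two threshold exceedances is missed, which is the kind of information a purely quadratic criterion does not reveal.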
Based on these comments, I do not think that the manuscript is suitable for publication in HESS. The authors will find specific comments below that may help them improve the presentation of their work.

Introduction
A recent paper by Coumou and Rahmstorf (2012) tends to lower the influence of climate change on extreme event occurrences.

Material and Method
The Material and Method part needs major reshaping. The input data part (2.4) should be merged with the catchment description (2.1), although some model-specific information (e.g. P5133 L10-15) should be placed with the model description. Similarly, part 2.3 should be merged with part 2.2: since there are substantive differences from model to model, the setup procedures would fit better alongside the model descriptions. In general, more details are required about the setup of CoupModel and HBV, compared with the extended description of LISEM and MIKE SHE: are they lumped models, semi-distributed, etc.?

P5125 L25: To which period does this average correspond?
P5126 L5: Please indicate with which method PET was calculated.
P5126 L25: Initial conditions are still important regardless of the simulation length. The state-of-the-art way to lower their influence is to use a spin-up period. This needs to be clarified here.
P5130 L22-24: Wrong. Please check Fig. 3, p. 280 of Lindström et al. (1997): there is an exponential parameter "BETA" that is used to calculate the amount of soil water recharging the flow generation boxes even when moisture conditions are below field capacity. This makes sense; otherwise the model would only simulate saturation excess processes.
P5132 L16: Why not use the same delineation as in LISEM? This would make the two models more comparable.
P5132 L18: Why is this depth map not used in LISEM?
P5134 L24: Cite Nash and Sutcliffe (1970) here.
P5135 L15: Figure 3
P5137 L5-10: A thousand Monte-Carlo runs for 17 parameters is very low! There is a high risk of non-uniqueness of parameter sets. How did you choose the threshold of NSE > 0.6? Since NSE is biased toward peak values, a higher threshold is probably more appropriate.
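The remark on BETA (P5130 L22-24) can be illustrated with a short sketch of the HBV soil routine as it is commonly written, where the fraction of rain or snowmelt passed on to the response function is (SM/FC)^BETA. The function name and the values used (a field capacity of 200 mm and BETA = 2) are hypothetical and chosen only to show the behaviour below field capacity.

```python
def recharge_fraction(soil_moisture, field_capacity, beta):
    """Fraction of rain or snowmelt passed on to the flow generation boxes.

    Commonly written HBV form: (SM / FC) ** BETA, with SM capped at FC.
    """
    relative_wetness = min(soil_moisture, field_capacity) / field_capacity
    return relative_wetness ** beta


# Hypothetical parameter values, for illustration only.
FIELD_CAPACITY_MM = 200.0
BETA = 2.0

for soil_moisture_mm in (0.0, 50.0, 100.0, 150.0, 200.0):
    fraction = recharge_fraction(soil_moisture_mm, FIELD_CAPACITY_MM, BETA)
    print(f"SM = {soil_moisture_mm:5.1f} mm -> recharge fraction = {fraction:.4f}")
```

The fraction is non-zero for any soil moisture above zero, so recharge occurs well before field capacity is reached; it only equals one at field capacity.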

P5138 L1: See previous comment on the number of Monte-Carlo runs and the threshold used.
P5138 L9-13: This should go in the Results part with some more details (number of accepted parameter sets, etc.).
P5138 L14-24: This part is not necessarily needed. Authors should mention which operating system they used.
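As an illustration of the reporting requested for the GLUE analysis (number of accepted parameter sets, uncertainty bounds instead of single hydrographs), here is a simplified sketch. It assumes each Monte-Carlo run is stored as a pair of NSE score and simulated series, and it uses unweighted nearest-rank quantiles; GLUE as usually applied weights the behavioural runs by their likelihood measure, so this is a simplification. All names and numbers are hypothetical.

```python
import math


def behavioural_runs(runs, nse_threshold=0.6):
    """Keep the simulated series of runs whose NSE exceeds the threshold."""
    return [series for score, series in runs if score > nse_threshold]


def nearest_rank(sorted_values, q):
    """Nearest-rank quantile of an ascending list, for q in (0, 1]."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]


def prediction_bounds(series_list, lower_q=0.05, upper_q=0.95):
    """Per-time-step (lower, upper) bounds across the behavioural runs."""
    bounds = []
    for values_at_step in zip(*series_list):
        ordered = sorted(values_at_step)
        bounds.append((nearest_rank(ordered, lower_q),
                       nearest_rank(ordered, upper_q)))
    return bounds


# Hypothetical Monte-Carlo results: (NSE, simulated discharge per time step).
runs = [
    (0.72, [1.0, 5.0, 2.0]),
    (0.55, [0.5, 9.0, 1.0]),  # rejected: NSE below the threshold
    (0.65, [1.5, 4.0, 2.5]),
    (0.81, [1.2, 6.0, 2.2]),
]

accepted = behavioural_runs(runs)
print("Accepted parameter sets:", len(accepted), "of", len(runs))
print("Bounds per time step:", prediction_bounds(accepted))
```

Reporting the accepted count and plotting such bounds around the observed hydrograph would show directly how well the behavioural sets bracket the peak flow events.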

P5139 L2-18: These 3 paragraphs are redundant with the methods.
P5140 L3-4: Please indicate to which model these values of NSE and R² should be attributed.

P5140 L10-12: This should be included in the model description. One could ask why this model is used at all.
P5143 L4-5: Please quantify this difference. Although HBV and CoupModel use the same evapotranspiration module, the calibration is carried out to match runoff, so this difference is not very surprising, as some compensation in the water balance can occur.
P5143 L12-14: Please describe the acknowledged process.
P5143 L21-23: Large errors in MIKE-SHE and LISEM are negative ones, i.e. due to flow underestimation. This may be related to the Nash-Sutcliffe evaluation of the models, which is biased toward high values.

Discussion
The authors should place their results in the frame of previously published studies rather than provide a simple summary of the results part. They should also put more emphasis on the effect of floods on roads in their catchment, the actual purpose of the modelling effort as stated in the abstract and introduction. Are there specific water level thresholds, and how well do the models predict them?
P5145 L13: Seventeen calibrated parameters is NOT low, especially with only 1,000 Monte-Carlo runs.
P5147 L11-15: Is it a planned improvement?
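The remark on P5145 L13 can be made concrete with simple arithmetic. The sketch below uses a grid analogy, which is an assumption for illustration only (Monte-Carlo sampling is random, not gridded): it asks how many distinct levels per parameter 1,000 runs would allow if spread evenly over 17 dimensions, and how many runs even a two-level design would require.

```python
N_RUNS = 1000      # Monte-Carlo runs reported in the manuscript
N_PARAMS = 17      # calibrated parameters reported in the manuscript

# Equivalent number of levels per parameter if the runs formed a full grid.
levels_per_parameter = N_RUNS ** (1.0 / N_PARAMS)

# Runs needed to combine only two levels (e.g. low and high) of every parameter.
runs_for_two_levels = 2 ** N_PARAMS

print(f"Equivalent levels per parameter: {levels_per_parameter:.2f}")
print(f"Runs needed for two levels per parameter: {runs_for_two_levels}")
```

With about 1.5 equivalent levels per parameter, against 131,072 runs for a mere two-level design, the parameter space is very sparsely explored.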

Tables
In Table 3, why is there no NSE for LISEM? I do not think you should compare criteria between models when considering different time periods.

Figures
P5135 L15: Figure 3 is described before Figure 2.
P5135 L19: Do you refer to Table 2?
P5135 L23-27: Please give more details on the calibration procedure: is it automatic, manual, etc.?
P5136 L10-23: This should go in the Results part.
P5137 L4: Please specify the distributions and ranges.

Figure 2 is not necessary. Consider deleting it.