Remote sensing-aided rainfall–runoff modeling in the tropics of Costa Rica
Saúl Arciniega-Esparza
Christian Birkel
Andrés Chavarría-Palma
Berit Arheimer
José Agustín Breña-Naranjo
Download
- Final revised paper (published on 21 Feb 2022)
- Supplement to the final revised paper
- Preprint (discussion started on 17 Aug 2021)
- Supplement to the preprint
Interactive discussion
Status: closed
RC1: 'Comment on hess-2021-428', Anonymous Referee #1, 15 Sep 2021
The authors selected Costa Rica as a case study to evaluate the performance of a global hydrological model, aiming to show that a coarse scale model can be effectively calibrated and used to model streamflow at finer scales in the humid tropics. The study includes the comparison of 4 different calibration strategies which were used to generate a well-calibrated version of the HYPE model targeted to the domain of Costa Rica. Additionally, the authors demonstrate how remotely-sensed data can effectively be bias-corrected using simple strategies to generate input data of sufficient quality for hydrological modelling. Such information is highly valuable for local management of water resources and of general interest to the hydrological community. The methods used are sufficiently described and the results clearly presented but a revision of the paper could further strengthen it.
After reading the title, I expected to be presented with a modelling study covering larger areas of the humid tropics. I was thus surprised to find, when reading the abstract, that the manuscript only discusses the case study of Costa Rica. I therefore suggest changing the title, replacing "humid tropics" with "Costa Rica".
Line 121 states that delineation of the catchments was performed using “the terrain analysis toolset from SAGA GIS”. Were the standard settings used?
The description of the 4 calibration strategies and the associated schematic in Figure 3 left me somewhat confused. Looking at the figure, I assumed that M2 was a stepwise calibration in which a first iteration calibrated against monthly streamflow, followed by a second calibration against daily streamflow. I thus wonder what the “first streamflow” in line 307 refers to. Furthermore, the colour coding in Figure 3 left me wondering how M2 and M4 differ from each other and why M4 was similar to M3. The schematic would be clearer if a 4th row could be added, so that each row represents one calibration scheme.
Both NSE and KGE values are presented for comparing the performance of the 4 calibration strategies with each other. In line 437, values of KGE < 0 are deemed to be poor, and in lines 474 and 476, values of KGE > 0.6 are said to be acceptable. How is the choice of these ranges justified? As Knoben et al. (2019) show, even negative KGE values can represent an improvement over using the mean flow as a predictor. At the same time, there is no guarantee that KGE > 0.6 is linked to an improvement over a specific benchmark. While the given values clearly show which of the methods provides an improvement over the other, it remains unclear how good the performance actually is. This is particularly relevant in lines 516-521 where an acceptable performance of KGE > 0.5 is linked to both underestimated high and low flows. I would thus like to see a purpose-based KGE benchmark specified against which the results can be compared.
Technical corrections
Line 274: The abbreviation IDW needs to be defined.
Figure 5: Please extend the y-axis so that the values for Rancho Ray M1 become visible as well.
All figures: Unfortunately, the colour scheme used is often not colour-blind friendly. Particularly the lines in Figures 8 and 9 are barely distinguishable. Also, the colour gradient green-yellow-red (e.g. in Figure 1f) or the multicolour gradient (e.g. Figures 4a, 6) generate maps which are very hard to read. I thus suggest switching to a different colour scheme and to use different line shapes (dotted, dashed) to further improve the readability.
References:
Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019.
Citation: https://doi.org/10.5194/hess-2021-428-RC1
AC1: 'Reply on RC1', Saul Arciniega, 25 Oct 2021
Thanks for your suggestions and comments. We attach our responses to each point, starting each response with "R:".
After reading the title, I expected to be presented with a modelling study covering larger areas of the humid tropics. I was thus surprised to find, when reading the abstract, that the manuscript only discusses the case study of Costa Rica. I therefore suggest changing the title, replacing "humid tropics" with "Costa Rica".
R: We agree that Costa Rica is a limited geographical space, but we would argue that the many tropical climates covered within its territory, from seasonally dry and high-elevation temperate to very humid, allow a more general reference to the tropics already in the title. Therefore, we will change the title of the revised paper to "Remote sensing-aided large-scale rainfall-runoff modelling in the tropics of Costa Rica".
Line 121 states that delineation of the catchments was performed using “the terrain analysis toolset from SAGA GIS”. Were the standard settings used?
R: Thank you for this comment. In the revision, we will fully describe the corresponding settings used with SAGA tools as they deviate from the default due to the strong topographical gradient that influences channel initiation.
The description of the 4 calibration strategies and the associated schematic in Figure 3 left me somewhat confused. Looking at the figure, I assumed that M2 was a stepwise calibration in which a first iteration calibrated against monthly streamflow, followed by a second calibration against daily streamflow. I thus wonder what the “first streamflow” in line 307 refers to. Furthermore, the color coding in Figure 3 left me wondering how M2 and M4 differ from each other and why M4 was similar to M3. The schematic would be clearer if a 4th row could be added, so that each row represents one calibration scheme.
R: Thanks for this comment. In the revised paper, we will modify Figure 4 to add a new row clarifying the different model setups used. We will also modify the colour scheme using a colour-blind palette as recommended in Nature Communications (https://www.nature.com/articles/s41467-020-19160-7).
Both NSE and KGE values are presented for comparing the performance of the 4 calibration strategies with each other. In line 437, values of KGE < 0 are deemed to be poor, and in lines 474 and 476, values of KGE > 0.6 are said to be acceptable. How is the choice of these ranges justified? As Knoben et al. (2019) show, even negative KGE values can represent an improvement over using the mean flow as a predictor. At the same time, there is no guarantee that KGE > 0.6 is linked to an improvement over a specific benchmark. While the given values clearly show which of the methods provides an improvement over the other, it remains unclear how good the performance actually is. This is particularly relevant in lines 516-521 where an acceptable performance of KGE > 0.5 is linked to both underestimated high and low flows. I would thus like to see a purpose-based KGE benchmark specified against which the results can be compared.
R: Thank you for raising this important point. In the revised paper, we will clarify our choice of the single discharge metric KGE from Kling et al. (2012) to evaluate the performance of the different HYPE model setups, as opposed to a multi-criteria calibration. We will refer to the work by Garcia et al. (2017), which showed that the KGE is a relatively balanced metric with slightly more focus on high flows. However, Santos et al. (2018) advise against using log-transformed discharge with the KGE for low-flow evaluation due to potential numerical issues. The latter argues against a multi-criteria evaluation, although, as Ding (2018) shows, this issue remains an ongoing scientific debate in the community. Nevertheless, we will provide a clearer description and discussion of this issue and also report other performance metrics (KGE, Pearson correlation coefficient, MAE, NSE) for comparison purposes in the supplementary material.
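For readers less familiar with the metric, a minimal sketch of the modified KGE of Kling et al. (2012) is given below; the function and variable names are illustrative and are not taken from our model code.

```python
import numpy as np

def kge_2012(sim, obs):
    """Modified Kling-Gupta efficiency (Kling et al., 2012):
    KGE' = 1 - sqrt((r - 1)^2 + (beta - 1)^2 + (gamma - 1)^2)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = np.corrcoef(sim, obs)[0, 1]       # linear correlation
    beta = sim.mean() / obs.mean()        # bias ratio
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())  # variability (CV) ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)
```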
The second issue, a benchmark evaluation, will be addressed using the %-deviation from the model calibrated with daily streamflow data (M1), which is common practice. These %-deviation values will be added to Figure 8 in the revised paper.
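For illustration, the benchmark comparison could be computed as in the sketch below; the KGE values are placeholders, not results from the paper.

```python
# Placeholder KGE scores for the four calibration strategies (illustrative only).
kge_scores = {"M1": 0.62, "M2": 0.65, "M3": 0.70, "M4": 0.71}

baseline = kge_scores["M1"]   # M1 (daily streamflow calibration) as benchmark
pct_deviation = {m: 100.0 * (v - baseline) / abs(baseline)
                 for m, v in kge_scores.items()}
print(pct_deviation)          # e.g. M3 deviates ~+12.9 % from the M1 benchmark
```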
References:
Kling, H., Fuchs, M., and Paulin, M.: Runoff conditions in the upper Danube basin under ensemble of climate change scenarios, J. Hydrol., 424–425, 264–277, https://doi.org/10.1016/j.jhydrol.2012.01.011, 2012.
Garcia, F., Folton, N., and Oudin, L.: Which objective function to calibrate rainfall–runoff models for low-flow index simulations?, Hydrol. Sci. J., 62, 1149–1166, https://doi.org/10.1080/02626667.2017.1308511, 2017.
Santos, L., Thirel, G., and Perrin, C.: Technical note: Pitfalls in using log-transformed flows within the KGE criterion, Hydrol. Earth Syst. Sci., 22, 4583–4591, https://doi.org/10.5194/hess-22-4583-2018, 2018.
Ding, J.: Interactive comment on "Technical note: Pitfalls in using log-transformed flows within the KGE criterion" by Léonard Santos et al., Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2018-298-SC2, 2018.
Technical corrections
Line 274: The abbreviation IDW needs to be defined.
R: In the revised paper, we will spell out Inverse Distance Weighted (IDW) interpolation at first mention.
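For context, a minimal NumPy sketch of IDW interpolation is given below; a power of 2 is a common default, and the GIS implementation used in the study may differ in its settings.

```python
import numpy as np

def idw(xy_obs, z_obs, xy_targets, power=2.0):
    """Inverse Distance Weighted interpolation: each target value is a
    weighted mean of station values, with weights 1 / distance**power."""
    d = np.linalg.norm(xy_targets[:, None, :] - xy_obs[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)       # guard against zero distance at a station
    w = 1.0 / d ** power
    return (w * z_obs).sum(axis=1) / w.sum(axis=1)

# Example: interpolate rainfall from three stations to one target point.
stations = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
rain_mm = np.array([5.0, 12.0, 8.0])
print(idw(stations, rain_mm, np.array([[2.0, 2.0]])))
```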
Figure 5: Please extend the y-axis so that the values for Rancho Ray M1 become visible as well.
R: Thanks, we will modify the figure to improve the visualization of the Rancho Rey M1 metrics.
All figures: Unfortunately, the colour scheme used is often not colour-blind friendly. Particularly the lines in Figures 8 and 9 are barely distinguishable. Also, the colour gradient green-yellow-red (e.g. in Figure 1f) or the multicolour gradient (e.g. Figures 4a, 6) generate maps which are very hard to read. I thus suggest switching to a different colour scheme and to use different line shapes (dotted, dashed) to further improve the readability.
R: Thank you for this suggestion. In the revised paper, we will modify the figures using colour schemes that are friendlier to colour-blind readers, following the colour-blind palette recommended in Nature Communications (https://www.nature.com/articles/s41467-020-19160-7).
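As an illustration of the intended change (our sketch, not the actual figure code), a colour-blind-safe line palette such as Wong (2011) can be combined with distinct line styles in matplotlib, as the referee suggests:

```python
import matplotlib.pyplot as plt
import numpy as np

# Wong (2011) colour-blind-safe palette (hex values as published).
WONG = ["#0072B2", "#D55E00", "#009E73", "#CC79A7"]
STYLES = ["-", "--", ":", "-."]   # distinct line shapes, per the referee

t = np.arange(100)
fig, ax = plt.subplots()
for i, (c, ls) in enumerate(zip(WONG, STYLES)):
    ax.plot(t, np.sin(0.1 * t + i), color=c, linestyle=ls, label=f"M{i + 1}")
ax.legend()
plt.show()
```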
Citation: https://doi.org/10.5194/hess-2021-428-AC1
RC2: 'Comment on hess-2021-428', Anonymous Referee #2, 18 Sep 2021
The overarching goal of the manuscript was to use the HYPE model to simulate Costa Rica's catchments with four different model configurations. The authors used remote sensing data as the forcings of the model, in particular, the CHIRPS data for rainfall and MODIS for evaporation. In addition to the usual daily streamflow calibration, the authors incorporated monthly streamflow, PET, and AET data into the model calibration. The use of remote sensing data in hydrological models and their performance is of significant interest to the hydrological community and is important to local governments with limited access to ground monitoring. The methods are well presented, but the manuscript could benefit from additional information. In particular, a clear evaluation of the model differences and a better time series evaluation would improve it significantly.
1. There is a lack of connection between the supplementary section and the main text. For instance, when the authors introduced the CHIRPS product (~L240), they could link it to Fig. S1 to give a clear picture of the improvement. Other examples are Tables S1 and S2, which are not mentioned anywhere in the text but would be a useful reference in the discussion section where the authors discuss these hydrological signatures for all models.
2. The abstract needs to be improved by including some of the nice statistics and results from the paper that quantify the improvements. Around L23, the authors talk about the hydrological signatures and that using both daily and monthly streamflow is better than just using the daily flows. However, it is not clear by how much.
3. The authors do not specify which model configuration is the baseline (which I assume is M1). Furthermore, while they present performance statistics, it is unclear if these differences are statistically significant enough to merit the additional data. Moreover, when they discuss the time-series analysis and the differences between the models, they do so in a descriptive manner; they should quantify it better, for instance, by using a distance metric to evaluate series similarity to the observed data. See DOI: 10.1016/j.rse.2011.06.020 for a summary of some useful metrics. My suggestion would be to make plots of the Mahalanobis distance rather than presenting the original time series (or in addition to Fig. 8).
4. I believe that the first objective should be merged into the other objectives. Running the model (independently of the computer language used) is a trivial objective as it is met from the start of the project.
5. The authors need to explain how they did the catchment extraction in GRASS by providing additional detail into the used parameters. Also, they need to explain the IDW method in the methods section, define the acronym, and add a reference.
6. The authors need to improve Fig. 3; interpreting it is confusing. Perhaps it would be best to have it with 4 rows rather than arrows, even if there is a degree of repetition.
7. Can the authors modify the presentation of the 86 parameters in L331? It is hard to understand; I would suggest presenting the numbers in parenthesis as the main parameter numbers and then elaborating on how many were linked to soil types, land cover, etc.
8. Can the authors add box plots of the other statistics as supplementary? It is hard to visualize them as isolated numbers. Again, can the authors perform tests of significance on the statistics to determine a significant difference between them?
9. Can the authors mention what the criteria for defining a KGE of 0.5 as acceptable were?
10. Around L605, the authors mention that the corrected temperature improved model performance. The authors need to quantify this performance increase.
11. The authors mention that the streamflow overestimation can be related to a precipitation bias in CHIRPSc. However, from Fig. S1, this does not seem to be the case.
12. When discussing model improvement, please quantify it. The authors mention in L635 that M3 and M4 showed better and more realistic results but failed to quantify the improvement. Moreover, from Fig. 10, it seems that even though KGE was higher for M3 and M4, M1 was able to reproduce the actual spatial distributions of PET and AET better, overlapping more with the observed ranges.
13. In the discussion section, the authors mention that adding PET and AET to the calibration improved model representativeness and link earlier studies. The authors need to also link this assertion to their study, which is one of their objectives.
14. Due to missing tests, I do not see how the authors can conclude that M3 and M4 are better configurations since the statistical significance of the differences has not been evaluated. And in fact, for a lot of the variables, it seemed that M1 performed adequately well compared to M3 and M4. The authors can further support the increased accuracy of M3 and M4 by their link to the FDC information.
15. Finally, I suggest adding ": A case study in Costa Rica." to the title since it was the only region analyzed in the manuscript.
Around L54, the authors mention the opportunities from including additional variables. Please specify which variables or give a few examples.
Technical corrections:
Around L56, the authors mention that more realistic hydrological partitioning comes at the expense of increased computational cost. Can the authors quantify the time penalties involved?
Around L77, do the authors mean simple bucket models? Any model can be a black-box model.
Around L87, the authors mention that the coarse spatial resolution of the climatological data is an important source of error. Can the authors mention the related uncertainty in the data? (i.e., how much of the model error is associated with the coarse spatial resolution).
L135, the authors mentioned that they merged land covers. Can the authors include how much each merged class contributed to the overall classification?
L159, please remind the reader what both sides are.
L175, please quantify the statement; how well did MODIS compare with the ground data? State r2 or another statistic.
L209, can the authors mention how they chose the soil layer thickness?
L245, the correction factor appears as B in equation 1; it appears as BF and BF2 in equations 2 and 3.
L266, from the text, it is somewhat ambiguous if y refers to each year or the whole period.
L288, the authors should mention that the parameters for correction are part of a Monte Carlo simulation and are set to the ranges in Table 3.
L290, sine function.
L320, can the authors justify why only two years were used as a warm-up?
L361, please remind the reader which time series.
L375-379, this information should appear in the introduction.
L405, can the authors normalize the MAE by the mean precipitation? Doing so would help the reader to understand the relative magnitude of the MAE.
L415, please, specify how they affect the performance.
L532, from Fig. 9, it seems that all the models underestimated the real flows to some extent. Is this due to CHIRPSc?
L575, can the authors increase the border thickness of the catchments of Fig. 10? It isn't easy to see them.
L619, is the deviation a positive or negative bias?
L644, what do the authors mean by increased parameter sensitivity?
L650, Can the authors comment why none of the models at the best performing catchment could reproduce the decrease in water content between 2014-2015?
Citation: https://doi.org/10.5194/hess-2021-428-RC2
AC2: 'Reply on RC2', Saul Arciniega, 26 Oct 2021
Thanks for your suggestions and comments. We attach our responses to each point, starting each response with "R:".
- There is a lack of connection between the supplementary section and the main text. For instance, when the authors introduced the CHIRPS product (~L240), they could link it to Fig. S1 to give a clear picture of the improvement. Other examples are Tables S1 and S2, which are not mentioned anywhere in the text but would be a useful reference in the discussion section where the authors discuss these hydrological signatures for all models.
R: Thank you for pointing out that we missed linking the supplementary material to the text. In the revised paper, we will refer to the additional supplementary materials to improve the description of the results and the discussion.
- The abstract needs to be improved by including some of the nice statistics and results from the paper that quantify the improvements. Around L23, the authors talk about the hydrological signatures and that using both daily and monthly streamflow is better than just using the daily flows. However, it is not clear by how much.
R: Thanks for your suggestion. In the revised paper, we will include more details about the improvements of the model configurations already in the abstract, and we will show the comparison of model performance using a %-deviation from the model M1 calibrated with daily streamflow as a benchmark. The latter metric, along with others added to the KGE used for calibration, will be reported already in the abstract and throughout the text, together with a new table.
- The authors do not specify which model configuration is the baseline (which I assume is M1). Furthermore, while they present performance statistics, it is unclear if these differences are statistically significant enough to merit the additional data. Moreover, when they discuss the time-series analysis and the differences between the models, they do so in a descriptive manner; they should quantify it better, for instance, by using a distance metric to evaluate series similarity to the observed data. See DOI: 10.1016/j.rse.2011.06.020 for a summary of some useful metrics. My suggestion would be to make plots of the Mahalanobis distance rather than presenting the original time series (or in addition to Fig. 8).
R: Thank you for raising this important point. In the revised paper, we will clarify the statistical improvement in model performance with respect to the baseline (M1), calibrated with daily streamflow data, which is common practice. The %-deviation from this benchmark for all models will be reported, together with comparative metrics such as XXXX, in a new table. With respect to your final suggestion, we consider that distance metrics such as the Mahalanobis distance (as summarised in the suggested reference, DOI: 10.1016/j.rse.2011.06.020) are interesting but rather hard to interpret for hydrological time series; instead, we will modify Figure 8 to better explain the differences between model configurations.
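For reference, the referee's suggested distance could be applied to vectors of hydrological signatures; the sketch below uses synthetic placeholder data and illustrative signature choices only, not values from the study.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(0)
# Placeholder annual signatures (e.g. mean flow, Q5, Q95) for 15 years.
obs_sig = rng.normal(loc=[10.0, 25.0, 2.0], scale=[2.0, 5.0, 0.5], size=(15, 3))
sim_sig = obs_sig + rng.normal(0.5, 0.2, size=(15, 3))   # slightly biased model

VI = np.linalg.inv(np.cov(obs_sig, rowvar=False))  # inverse covariance of observations
d = mahalanobis(sim_sig.mean(axis=0), obs_sig.mean(axis=0), VI)
print(f"Mahalanobis distance between mean signature vectors: {d:.2f}")
```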
- I believe that the first objective should be merged into the other objectives. Running the model (independently of the computer language used) is a trivial objective as it is met from the start of the project.
R: Thank you for this suggestion. In the revised version, we will merge the first two objectives into a single objective.
- The authors need to explain how they did the catchment extraction in GRASS by providing additional detail into the used parameters. Also, they need to explain the IDW method in the methods section, define the acronym, and add a reference.
R: Thank you for your point. To clarify, SAGA GIS was used to delineate the catchment boundaries, while GRASS GIS was used to process the space-time series of precipitation and temperature. In the revised version, we will describe the parameters used in SAGA, as well as add a description of the IDW interpolation.
- The authors need to improve Fig. 3; interpreting it is confusing. Perhaps it would be best to have it with 4 rows rather than arrows, even if there is a degree of repetition.
R: We agree and will add a row for model configuration in Figure 3 to clarify the different model configurations in the revised paper.
- Can the authors modify the presentation of the 86 parameters in L331? It is hard to understand; I would suggest presenting the numbers in parenthesis as the main parameter numbers and then elaborating on how many were linked to soil types, land cover, etc.
R: Thanks for your suggestion. In the revised version, we will clarify the presentation of the model parameters.
- Can the authors add box plots of the other statistics as supplementary? It is hard to visualize them as isolated numbers. Again, can the authors perform tests of significance on the statistics to determine a significant difference between them?
R: We agree. In the revised version, we will replace Table 1 with a boxplot in the supplementary materials section.
- Can the authors mention what the criteria for defining a KGE of 0.5 as acceptable were?
R: Thank you for raising this important point. In the revised paper, we will provide a clearer description and discussion of this issue based on recent literature. We will also introduce a clear description of the other performance metrics (KGE, Pearson correlation coefficient, MAE, NSE) used for comparison purposes in the supplementary material.
- Around L605, the authors mention that the corrected temperature improved model performance. The authors need to quantify this performance increase.
R: Thank you for this suggestion. We will state the performance obtained with the original temperature time series and the comparison using the corrected temperature.
- The authors mention that the streamflow overestimation can be related to a precipitation bias in CHIRPSc. However, from Fig. S1, this does not seem to be the case.
R: Thank you for raising this important point. We found that precipitation overestimation persisted in drier environments despite the bias correction. The overestimate was associated with the lack of ground precipitation records to correct the CHIRPS product in headwater catchments such as Rancho Rey. Our Fig. S1 in the supplementary section shows that, in many cases, differences between the water balance fluxes (P, ET, Q) were reduced. In the revised manuscript, we will clarify this point to highlight the issues associated with precipitation bias correction.
- When discussing model improvement, please quantify it. The authors mention in L635 that M3 and M4 showed better and more realistic results but failed to quantify the improvement. Moreover, from Fig. 10, it seems that even though KGE was higher for M3 and M4, M1 was able to reproduce the actual spatial distributions of PET and AET better, overlapping more with the observed ranges.
R: That is indeed an interesting point, and we thank you for the comment. Only to clarify, M1 showed high performance for PET but a lower performance for ET in comparison with M3 and M4 (shown in Figure 5). In the revised paper, we will include a statistical comparison between model performances to clarify the improvements between model configurations.
- In the discussion section, the authors mention that adding PET and AET to the calibration improved model representativeness and link earlier studies. The authors need to also link this assertion to their study, which is one of their objectives.
R: Thank you for this suggestion. We will include such statements and refer to the performance improvements in the text of the revised manuscript.
- Due to missing tests, I do not see how the authors can conclude that M3 and M4 are better configurations since the statistical significance of the differences has not been evaluated. And in fact, for a lot of the variables, it seemed that M1 performed adequately well compared to M3 and M4. The authors can further support the increased accuracy of M3 and M4 by their link to the FDC information.
R: Thank you for your suggestion. In the revised paper, we will include a statistical comparison using the %-deviation with respect to the baseline model (M1), as well as adding the corresponding FDC metrics to Figure 9.
- Finally, I suggest adding ": A case study in Costa Rica." to the title since it was the only region analyzed in the manuscript.
R: We agree that Costa Rica is a limited geographical space and will change the title of the revised paper to "Remote sensing-aided large-scale rainfall-runoff modelling in the tropics of Costa Rica".
Around L54, the authors mention the opportunities from including additional variables. Please, specify which variables or give a few examples.
R: We agree. In the revised paper, we will include some examples that were found useful for model calibration in the recent literature and how this could be applied to Costa Rica.
Technical corrections:
Around L56, the authors mention that more realistic hydrological partitioning comes at the expense of increased computational cost. Can the authors quantify the time penalties involved?
R: Thanks for your comment. In the revised manuscript, we will clarify that including additional variables to provide a more realistic hydrological partitioning relates more to an increase in complexity for model calibration and validation than to an increase in computational cost.
Around L77, do the authors mean simple bucket models? Any model can be a black-box model.
R: We agree and will modify this sentence accordingly in the revised paper.
Around L87, the authors mention that the coarse spatial resolution of the climatological data is an important source of error. Can the authors mention the related uncertainty in the data? (i.e., how much of the model error is associated with the coarse spatial resolution).
R: That’s an interesting point. In our research, we did not compare different products of precipitation and temperature with different spatial resolutions; however, in the revised paper, we will mention the errors found in previous research for Costa Rica and elsewhere.
L135, the authors mentioned that they merged land covers. Can the authors include how much each merged class contributed to the overall classification?
R: In the revised paper, we will include a short description of the percent of major classes merged in the land cover classification.
L159, please remind the reader what both sides are.
R: Thanks for your suggestion; this will be corrected in the revised paper.
L175, please quantify the statement; how well did MODIS compare with the ground data? State r2 or another statistic.
R: Thanks for your comment. In the revised paper, we will include metrics of MODIS performance from previous research.
L209, can the authors mention how they chose the soil layer thickness?
R: Thanks for your comment. Due to the lack of accurate soil thickness maps in Costa Rica, we considered a maximum soil thickness of 3 m in forest land cover and a minimum of 2 m in bare land cover, following the recommendation in the model configuration shown by Arheimer et al. (2020). In the revised paper, we will clearly state these boundary conditions.
Arheimer, B., Pimentel, R., Isberg, K., Crochemore, L., Andersson, J. C. M., Hasan, A., and Pineda, L.: Global catchment modelling using World-Wide HYPE (WWH), open data and stepwise parameter estimation, Hydrol. Earth Syst. Sci., 24, 535–559, https://doi.org/10.5194/hess-24-535-2020, 2020.
L245, the correction factor appears as B in equation one; it appears as BF and BF2 in equations 2 and 3.
R: Thanks for your detailed review. In the revised paper, we will correct this error in equation 1.
L266, from the text, it is somewhat ambiguous if y refers to each year or the whole period.
R: Thanks for your comment. We agree and will clarify the equations and text in the revised paper.
L288, the authors should mention that the parameters for correction are part of a monte Carlo simulation and are set to the ranges in Table 3.
R: Thanks for your suggestion. This will be included in the revised paper.
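A minimal sketch of how such a Monte Carlo sampling of the correction factors might look is shown below; the parameter names echo the correction factors mentioned in the equations above, but the ranges are placeholders rather than the actual values of Table 3.

```python
import numpy as np

rng = np.random.default_rng(42)
n_runs = 1000

# Placeholder ranges standing in for Table 3 (not the paper's actual values).
param_ranges = {"BF": (0.5, 1.5), "BF2": (0.5, 1.5)}

# Draw uniform Monte Carlo samples within each parameter range.
samples = {name: rng.uniform(lo, hi, n_runs)
           for name, (lo, hi) in param_ranges.items()}

# Each sampled set would drive one bias-corrected forcing realisation,
# e.g. P_corrected = BF * P_CHIRPS, scored against gauges to keep the best set.
```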
L290, sine function.
R: We will correct this error in the revised paper, thank you.
L320, can the authors justify why only two years were used as a warm-up?
R: Thanks for your suggestion. The warm-up period was established as the two years before the calibration period in order to avoid issues with the initial conditions of water content in soil layers, rivers, and reservoirs. We tested warm-up periods of one to three years, and the results did not change between two and three years. In the revised paper, we will include some references to recommended warm-up periods, and we will clarify our selection of two years.
L361, please remind the reader which time series.
R: Thanks. In the revised paper, we will include an explanation of which series are compared.
L375-379, this information should appear in the introduction.
R: Thanks for your suggestion. In the revised paper, we will move this sentence to the introduction section.
L405, can the authors normalize the MAE by the mean precipitation? Doing so would help the reader to understand the relative magnitude of the MAE.
R: Thanks for this great suggestion. In the revised paper, we will include the normalized MAE in addition to the current value.
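As a brief sketch of the intended normalisation (dividing the MAE by the observed mean gives a unitless relative error); the function name is illustrative:

```python
import numpy as np

def nmae(sim, obs):
    """Mean absolute error normalised by the observed mean (unitless)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return np.mean(np.abs(sim - obs)) / np.mean(obs)
```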
L415, please, specify how they affect the performance.
R: Thanks for your suggestion. We describe the impact of precipitation issues in different sections, such as L536, where we highlight that higher model performance was obtained in catchments with high cross-correlation on the Pacific slope. In the revised paper, we will link that section with a sentence in L415 to clarify this statement.
L532, from Fig. 9, it seems that all the models underestimated the real flows to some extent. Is this due to CHIRPSc?
R: Thanks for your comment. As you mention, precipitation from CHIRPS is an important source of error in our results. In the discussion section, we explained how catchments on the Pacific slope showed higher performance than catchments on the Caribbean slope, related to the ability of CHIRPS to detect rainy and dry years on both slopes. In the revised version, we will clarify the limitation in simulating low flows and its relationship to precipitation performance, and we will also refer to Fig. S2 of the supplementary material.
L575, can the authors increase the border thickness of the catchments of Fig. 10? It isn't easy to see them.
R: Thanks for your suggestion. In the revision, we will increase the line thickness to emphasize the borders.
L619, is the deviation a positive or negative bias?
R: Thanks for your suggestion. In the revision, we will include such information.
L644, what do the authors mean by increased parameter sensitivity?
R: Thanks for your comment. We agree that this is a confusing message and will change this sentence to clarify the implications of MODIS PET and ET on model parameters.
L650, Can the authors comment why none of the models at the best performing catchment could reproduce the decrease in water content between 2014-2015?
R: Thanks for your comment. We assume you refer to the water content of the Rancho Ray catchment in Figure 8. In this case, Rancho Ray showed the poorest performance of all catchments evaluated, which we will highlight more clearly. The limited capacity of most model configurations to reproduce soil water content is related to the precipitation overestimation, which stores excess water in the soil buckets.
Citation: https://doi.org/10.5194/hess-2021-428-AC2