Drainage assessment of irrigation districts: on the precision and accuracy of four parsimonious models

Laluet, Pierre; Olivera-Guerra, Luis; Altés, Víctor; Rivalland, Vincent; Jeantet, Alexis; Tournebize, Julien; Cenobio-Cruz, Omar; Barella-Ortiz, Anaïs; Quintana-Seguí, Pere; Villar, Josep Maria; Merlin, Olivier

doi:https://doi.org/10.5194/hess-28-3695-2024

Articles | Volume 28, issue 16

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.


Research article

16 Aug 2024

Final revised paper (published on 16 Aug 2024)
Preprint (discussion started on 24 Apr 2023)

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2023-543', Anonymous Referee #1, 08 May 2023

(1) Table 6: Why are the parameter values different between the 2021 and 2022 periods? I assume that the 2021 period is the calibration period and the 2022 period is the validation period. It seems odd to calibrate on both periods.
(2) The four models evaluated herein result from the combination of two water balance models (RU and SAMIR) and two drainage discharge models (Reservoir and SIDRA). I'm confused why you don’t use the same parameter values to describe the same physical process. For example, the values of parameter S_inter (mm) used in the RU-Reservoir and RU-SIDRA are different in Table 6. It feels like a black box model study, if we don't use the same parameter values to describe the same process.
(3) I suggest adding more comparison between the results of the two sub-basins AB1 and AB2, as the latter is more than ten times the size of the former.
(4) I agree that the simple model RU-Reservoir presents a better precision in your study. But I still think that in general the more complex the model, the better the simulation result.
(5) I am confused why you simulated the models with the default values. Is it to reduce the work of parameter calibration?

Citation: https://doi.org/10.5194/egusphere-2023-543-RC1
- AC1: 'Reply on RC1', Pierre Laluet, 25 Mar 2024
  
  We thank the reviewer for his/her helpful feedback. The relevant inputs help improve our work's quality and clarity.
  We addressed all the questions and proposed additions and changes in the paper. The reviewer's specific comments are listed below, followed by our responses in italics and the quotations we propose to add to the manuscript in bold.
  
  (1) Table 6: Why are the parameter values different between the 2021 and 2022 periods? I assume that the 2021 period is the calibration period and the 2022 period is the validation period. It seems odd to calibrate on both periods.
  Thank you for this question. The two periods analyzed (2021-2022 and 2022) were calibrated independently. This was done because the aim of the precision evaluation was to assess the models' ability to reproduce observed drainage when calibrated on the same period (and therefore with optimal calibration conditions) and not to assess their ability to simulate drainage outside this calibration period. The accuracy evaluation assessed the models' ability to simulate drainage under non-optimal calibration conditions (default calibration, i.e., with values taken from the literature).
  To clarify this point, we propose adding the following sentences to line 106:
  “The idea of precision evaluation is to find out whether the models' formalism can reproduce the observed drainage under optimal calibration conditions, i.e., when data are available for calibration for the period analyzed. The idea of accuracy evaluation is to determine i) whether it is possible to estimate drainage when no in situ drainage data is available for calibration (which is the case for most irrigation districts), i.e. under non-optimal calibration conditions, and ii) which of the models evaluated performs best under these conditions.”
  It is true that the parameter values obtained after calibration differ between 2021 and 2022. This indicates that the models do not perfectly reproduce certain processes that vary from one year to the next, either because they represent them too empirically or because they neglect them (such as subsurface lateral flows). As a result, parameter values differ between the two periods to compensate for these processes.
  To emphasize this point, we propose adding the following sentences at line 442:
  “The parameter values obtained after calibration of a given model differ between the two periods, illustrating the semi-empirical nature of the models evaluated. Indeed, parameter values vary between the two periods to compensate for the fact that certain physical processes (which vary between the two periods) are either too empirically simulated, or neglected.”
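The distinction drawn in this reply can be illustrated with a toy example. The sketch below is not the paper's Reservoir or SIDRA formalism: it assumes a generic single linear reservoir with a hypothetical depletion coefficient `omega`, uses a plain grid search in place of NSGA-II, and scores with RMSE as a stand-in metric. The recharge series, the "observed" series, and the default value are all invented for illustration.

```python
import math

def linear_reservoir(recharge, omega, storage=0.0):
    """Toy linear reservoir: each step releases a fraction omega of storage.
    Hypothetical stand-in, not the formalism evaluated in the paper."""
    discharge = []
    for r in recharge:
        storage += r
        q = omega * storage
        storage -= q
        discharge.append(q)
    return discharge

def rmse(sim, obs):
    """Root-mean-square error between two equally long series."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))

# Invented recharge series (mm/day) and synthetic "observations".
recharge = [0, 5, 0, 0, 10, 0, 0, 0, 3, 0]
observed = linear_reservoir(recharge, omega=0.30)

# "Precision": calibrate omega on the same period that is evaluated.
# A grid search is used here purely for brevity instead of NSGA-II.
candidates = [i / 100 for i in range(1, 100)]
calibrated_omega = min(
    candidates, key=lambda w: rmse(linear_reservoir(recharge, w), observed)
)
precision_error = rmse(linear_reservoir(recharge, calibrated_omega), observed)

# "Accuracy": no calibration, omega fixed to an assumed literature default.
default_omega = 0.10  # hypothetical default value
accuracy_error = rmse(linear_reservoir(recharge, default_omega), observed)

print(f"calibrated omega={calibrated_omega:.2f}, error={precision_error:.3f}")
print(f"default omega={default_omega:.2f}, error={accuracy_error:.3f}")
```

In this toy setting the calibrated error is a lower bound on what the model structure can achieve for the period, while the default-value error shows what a user without drainage observations could expect.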
  
  (2) The four models evaluated herein result from the combination of two water balance models (RU and SAMIR) and two drainage discharge models (Reservoir and SIDRA). I'm confused why you don’t use the same parameter values to describe the same physical process. For example, the values of parameter S_inter (mm) used in the RU-Reservoir and RU-SIDRA are different in Table 6. It feels like a black box model study, if we don't use the same parameter values to describe the same process.
  We would like to remove any confusion about the objective of the study. As highlighted in lines 104-106, 567, and 575, and in Sections 2.3 and 2.4, the four models are evaluated in two steps. The first step evaluates the precision of each model by calibrating the model parameters and comparing the modeled drainage to in situ measurements. The second step evaluates the accuracy (robustness) of each model by setting the model parameters to default values found in the literature and comparing the modeled drainage to in situ measurements. In that sense, the models are not treated as black boxes because 1) physical interpretations are given for the variability of the model parameters retrieved in the first step and 2) physical or standard parameter values are used in the second step.
  Furthermore, we argue that a parameter having different values between models does not discredit the validity of the calibrated models. The models evaluated are semi-empirical, and their parameters are imperfect representations of the studied system and its physical processes. Each model simulates these processes in its own way, according to its own hypotheses. The values of a parameter may vary from one model to another to account for these hypotheses. For example, the Reservoir's formalism generates less reactive drainage than SIDRA, requiring a lower S_inter value than SIDRA to simulate more recharge.
  We propose adding the following sentence at line 442 to clarify this point:
  “The parameter values also vary between models for the same period, illustrating again the semi-empirical nature of the models and their parameters.”
  
  (3) I suggest adding more comparison between the results of the two sub-basins AB1 and AB2, as the latter is more than ten times the size of the former.
  Thank you for this suggestion. We believe that this diversity in terms of surface area and crop type is a strength of this study, as it allows the models to be tested in different contexts.
  We showed in Table 1 the differences in terms of area and crop type between AB1, AB2, and the entire Algerri-Balaguer district. This comparison shows that AB1 has a proportion of Double crop and Summer cereal closer to AB than AB2. We also showed in Figure 2 the comparison of daily drainage (discharge flow divided by surface area) for the two sub-basins.
  We propose adding the following sentence on line 171 to provide quantitative information on the drainage of the two sub-basins.
  "The observed drainage at AB1 was 29 mm in 2021 and 46 mm in 2022, while it was 61 mm in 2021 and 46 mm in 2022 at AB2."
  We showed in Table 4 the comparison between the irrigation simulated by SAMIR on AB1 and AB2 after calibration, for each of the two periods. AB2's irrigation is higher than AB1's, consistent with the fact that AB2 has more Double Crops and observed drained volume than AB1, as explained in line 386.
  To better compare the irrigation of AB1 and AB2, we propose to add a column to Table 4 presenting the irrigation observed at the pumping station supplying the entire Algerri-Balaguer district (area including non-irrigated plots). The modified Table 4 is shown in the supplementary file.
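Because the two sub-basins differ so much in size, the comparison relies on area-normalized drainage (discharge flow divided by surface area). A minimal sketch of that conversion is given below; the function name, discharge values, and areas are hypothetical and are not taken from the paper.

```python
def discharge_to_mm_per_day(discharge_m3_s: float, area_ha: float) -> float:
    """Convert an outlet discharge (m3/s) into a drainage depth (mm/day)
    by dividing the daily volume by the contributing area."""
    seconds_per_day = 86_400
    area_m2 = area_ha * 10_000  # 1 ha = 10,000 m2
    depth_m_per_day = discharge_m3_s * seconds_per_day / area_m2
    return depth_m_per_day * 1_000  # m -> mm

# Hypothetical sub-basins: the second is ten times larger and discharges
# ten times more, so both have the same area-normalized drainage.
small_basin = discharge_to_mm_per_day(discharge_m3_s=0.002, area_ha=100.0)
large_basin = discharge_to_mm_per_day(discharge_m3_s=0.020, area_ha=1000.0)
print(f"small: {small_basin:.4f} mm/day, large: {large_basin:.4f} mm/day")
```

Expressing drainage as a depth per unit area is what allows annual totals from sub-basins of very different sizes to be compared on the same scale.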
  
  (4) I agree that the simple model RU-Reservoir presents a better precision in your study. But I still think that in general the more complex the model, the better the simulation result.
  Thank you for your comment.
  We agree that a model with more equations, more processes simulated, and more parameters to calibrate can better simulate a given process than simple models when calibrated on observed data. However, when no data is available for calibration, the most complex models are also the most prone to uncertainty (Puy et al., 2022; https://doi.org/10.1126/sciadv.abn9450). Since we assessed the ability of models to simulate drainage without calibration with in situ data (accuracy assessment), we chose to focus on four parsimonious models.
  The least complex of these four parsimonious models, RU-Reservoir, was found to be the most precise, showing that in our study the most complex model is not the most precise. As explained in lines 421 and 446, the RU-Reservoir model performs better than the other models because the way it simulates recharge and converts it into drainage is particularly well suited to the relatively low reactivity observed at AB1 and AB2. However, a more complex model describing recharge and drainage processes in more detail could have been even more precise than RU-Reservoir.
  The fact that RU-Reservoir, the simplest model, performed better than the others is an important result of our study, since we were interested in models balanced between performance and simplicity (because we use them with a default calibration that can lead to large uncertainties if too complex).
  To clarify this point, we propose to add the following sentences to line 574:
  “Complex models, with many processes simulated and parameters to calibrate, are likely to simulate drainage more precisely than simple models when calibrated on observed data. However, when no data is available for calibration, the most complex models are also the most prone to uncertainty (Puy et al., 2022). Moreover, drainage observations required for site-specific calibration are rarely available.”
  
  (5) I am confused why you simulated the models with the default values. Is it to reduce the work of parameter calibration?
  Thank you for this question. Indeed, we have used default values for the model's most sensitive parameters. We did this to determine to what extent the models can simulate the observed drainage without using in situ drainage data for an automatic calibration step (such as NSGA-II). In most cases, drainage data is not available for calibration (as mentioned on lines 94 and 574). This evaluation is, therefore, a first step towards finding out 1) whether it is possible to retrieve drainage without an automatic calibration step requiring in situ drainage discharge data and 2) which of the four evaluated models performs best under these conditions.
  We propose adding the following sentences to line 106 to clarify this point.
  “The idea of accuracy evaluation is to find out i) whether it is possible to estimate drainage when no in situ drainage data is available for calibration (which is the case for the vast majority of irrigation districts), i.e., under non-optimal calibration conditions, and ii) which of the models evaluated perform best under these conditions.”
  
  Citation: https://doi.org/10.5194/egusphere-2023-543-AC1
RC2:
'Comment on egusphere-2023-543', Anonymous Referee #2, 05 Dec 2023

Review of
Drainage assessment of irrigation districts: on the precision and accuracy of four parsimonious models
by P. Laluet et al.

The paper calibrates four relatively simple models to predict drainage from weather/irrigation data and crop information for two irrigation districts in a semi-arid climate. Uncalibrated versions of the model are also tested. The best performing models are identified, and explanations for good/poor performance are given based on the model structures.
Main comments
The science of the paper is sound, and the presentation is OK by and large, although English editing is required. My main reservation is that I would like to see a more elaborate analysis of the resources required to run the models and the level of training and expertise of the model operators. For practical application of these models (of which the authors are keenly aware), this information is very relevant. I think this can be accomplished without additional simulations.

Detailed comments not limited to a single line
The text is wordy at times, and the English needs some work. I did not edit this because of time constraints.
The equations use long variable names that sometimes seem to be meant to be subscripts. Sometimes, the same or very similar symbols are used for different variables.

Further detailed comments
L.40: What about drainage to keep down the groundwater level if the groundwater is saline? See also line 149.
L.44: Why can you not simply measure the discharge by installing a weir in the discharge canal? If you analyze water samples taken at the same locations, it serves all objectives you listed so far. Later it becomes clear that you need estimates for scenario studies, but the text so far seems to aim towards a measurement system.
L.84: Please elaborate a little on the FAO-56 method, or at least provide a reference.
L.90: What do you mean with the start of drainage? Figure 3, for instance, shows that drainage never stops.
L.101-102: I would also be interested in an evaluation of the level of the complexity of model calibration (in terms of the number of parameters and the difficulty to obtain them) and the benefit in terms of precision as defined by you. Perhaps you can also evaluate the required computational resources and the necessary skill level needed to run the models.
Figure 1: Typo: litterature -> literature. I had some difficulty with all the abbreviations, perhaps add some kind of legend.
L.145-146: I am not sure I understand what main drains are. A tube-drained field needs a ditch into which the tube drains discharge. Is this ditch a main drain? If fields are so small that they can be drained by ditches only, then you call these drains. If these ditches all drain into a ditch that is ‘one hierarchy level up’, then this ditch is a main drain too, if I follow you correctly. But I suppose these main drains themselves discharge into the next hierarchy level (and so on, perhaps) so that all water can leave the area through a single discharge point. What do you call these higher-level ditches/canals?
L. 165: ‘Data’ is not a very descriptive title for a section of a research paper.
L. 259: What is an intense drainage season?
Eq. (11): Here, the water table appears to be a function of time only, implying that it is flat. The Boussinesq equation is an expression for the water table in (one-dimensional) space and time. How come the water table is assumed flat here?

L. 331-336 and Table 3. You only partially explain how you arrived at the distributions of the fitting parameters. I think it is very useful for users of your approach to have some information on how to arrive at plausible parameter distributions. Perhaps it is even worth devoting a subsection of the Methodology section to that.
L. 366-372: I like this approach, but the explanation is not so clear. In one sentence you say you use the distributions of Table 3, in another you say you fixed L. These statements cannot both be true. Perhaps you found that L is not a sensitive parameter and could be fixed. How did you determine whether a parameter was sensitive or not? And did you check if fixing the non-sensitive parameters affected the results, for instance by generating a population with all parameters drawn from their distributions?
I presume you assumed no correlations between the parameters when selecting the parameter combinations during the Monte Carlo sampling.
Table 4: Please include the observed irrigation amounts for comparison.
L. 438-440: This explanation sounds plausible, and I am not disputing it. But do you know how the discharge network of ditches of various hierarchies affects the discharge response at the outlet to possibly non-uniform irrigation and rainfall?
L. 459-461: Do the fitted values of S_inter not expose the models to the risk of being ‘right for the wrong reason’? This might become an issue for scenario studies, if the model is used outside the data range for which it was calibrated. You seem to allude to this in L. 535, but for another model.
L. 473-474: I think the required resources and training to run the models is very relevant for a study of this nature. As I commented above, I would like to have a more thorough treatment of these aspects.
L. 480-483: Good point, food for thought.
Fig. 6, 7: In most of the panels (6a, 6c, 6e, 6f, 6g, 7a, 7e, 7g, 7h), for at least a section of the time line, the observed drainage appears to be among the outliers of the 2000 simulations. I do not know what to think of this. It perhaps suggests that the models need extreme parameter values to be able to simulate this, which may indicate that the model or the parameter distributions may not be that good. But I am not sure if I am not over-interpreting this. Any thoughts?
L. 513-515: I do not follow your line of thought here. Perhaps rephrase.
L. 539-541: Not only that, lateral fluxes across the system boundary will be difficult to estimate accurately. But I agree with you they can be an important factor.
L. 554: Rename ‘Summary and Conclusions’.

Citation: https://doi.org/10.5194/egusphere-2023-543-RC2
- AC2: 'Reply on RC2', Pierre Laluet, 25 Mar 2024
  
  We thank the reviewer for his/her helpful feedback. The relevant inputs help improve our work's quality and clarity. The reviewer's specific comments are listed below, followed by our responses in italics and the quotations we propose to add to the manuscript in bold.
  
  The paper calibrates four relatively simple models to predict drainage from weather/irrigation data and crop information for two irrigation districts in a semi-arid climate. Uncalibrated versions of the model are also tested. The best-performing models are identified, and explanations for good/poor performance are given based on the model structures.
  Main comments
  The science of the paper is sound, and the presentation is OK by and large, although English editing is required. My main reservation is that I would like to see a more elaborate analysis of the resources required to run the models and the level of training and expertise of the model operators. For practical application of these models (of which the authors are keenly aware), this information is very relevant. I think this can be accomplished without additional simulations.
  We appreciate the reviewer's recommendation to add elements concerning the complexity of use and calibration of the various models, inviting us to place ourselves as potential users. We have addressed this issue in our responses below.
  Detailed comments not limited to a single line
  The text is wordy at times, and the English needs some work. I did not edit this because of time constraints.
  Thank you for raising this point. We've reworked the English and made it less wordy.
  The equations use long variable names that sometimes seem to be meant to be subscripts. Sometimes, the same or very similar symbols are used for different variables.
  Thank you for pointing this out. We propose renaming the depletion coefficient parameter from "k" to "ω", since "k" is too easily confused with the hydraulic conductivity parameter "K".
  
  Further detailed comments
  L.40: What about drainage to keep down the groundwater level if the groundwater is saline? See also line 149.
  We fully agree that drainage generally consists in lowering the groundwater level, regardless of its salt concentration. The fact that groundwater is saline is an additional argument for draining the water table to prevent the groundwater's salt from accumulating on the surface via capillary rise.
  We propose modifying the sentence at line 41 as follows:
  "Drainage systems are generally installed with the aim of preventing waterlogging during heavy rainfall events (by the sudden rise of groundwater levels) and facilitating salt leaching (particularly where irrigation water and/or groundwater has high salt concentrations). "
  
  L.44: Why can you not simply measure the discharge by installing a weir in the discharge canal? If you analyze water samples taken at the same locations, it serves all objectives you listed so far. Later it becomes clear that you need estimates for scenario studies, but the text so far seems to aim towards a measurement system.
  We agree that if weirs were available at the outlets of drained perimeters, they would provide sufficient drainage discharge data to meet the objectives previously mentioned. But the point is that very few drainage areas are instrumented.
  In this paper, we are not interested in understanding a specific irrigation district where drainage flow is monitored but in knowing whether models can estimate drainage, including in areas where no drainage data is available (which represents most situations). This is why we carried out the accuracy assessment.
  To clarify this point, we propose adding the following sentences to line 44:
  "Measuring drainage discharge is an effective way of monitoring the quantity and quality of drainage to help address the three challenges mentioned above. However, the proportion of drained irrigation districts equipped with such instruments is very low. In this context, estimating the drained water in irrigated areas, including those not instrumented, is of major importance. Some work has been done in this direction, focusing on modeling the quantity and quality of the drained water discharged (e.g., Negm et al., 2017), or on developing drainage scenarios that integrate changes in agricultural practices (e.g., Tournebize et al., 2004) or in climatic conditions (e.g., Golmohammadi et al., 2020; Jeantet et al., 2022)".
  
  L.84: Please elaborate a little on the FAO-56 method, or at least provide a reference.
  Thank you for your comment. We propose to add the reference "Allen et al., 1998" to line 84.
  Allen, R. G., Pereira, L. S., Raes, D., and Smith, M.: Crop evapotranspiration: Guidelines for computing crop water requirements, FAO Irrigation and Drainage Paper No. 56, Food and Agriculture Organization of the United Nations, Rome, 1998.
  
  L.90: What do you mean with the start of drainage? Figure 3, for instance, shows that drainage never stops.
  Thank you for this comment. Jeantet et al. (2021) applied RU-SIDRA only on non-irrigated sites where drainage was null in summer and started to increase in autumn. We referred to this start when we wrote "start of drainage."
  We propose to modify the following sentence in line 90 to clarify this point:
  "Indeed, in RU, the start of drainage, occurring in autumn in the non-irrigated sites studied by Jeantet et al. (2021), is not systematically well reproduced from year to year".
  
  L.101-102: I would also be interested in an evaluation of the level of the complexity of model calibration (in terms of the number of parameters and the difficulty to obtain them) and the benefit in terms of precision as defined by you. Perhaps you can also evaluate the required computational resources and the necessary skill level needed to run the models.
  Thank you for this comment. We believe that providing readers with this information is highly relevant.
  We propose adding a section "2.5 Complexity of models calibration" in line 373:
  "2.5 Complexity of models calibration
  The models involving SAMIR are more complex to calibrate than those involving RU. This is due in particular to the fact that 1) SAMIR-based models involve input data that are not always readily available at the required fine resolution (in particular land use maps), 2) SAMIR is spatialized and therefore requires more computing resources (several hours of computation for 2000 simulations on the AB district with 8 GB RAM and 4 CPUs running in parallel). The RU-based models are simpler to calibrate than the SAMIR-based ones because RU requires only meteorological data as input, they are not spatialized and demand fewer computing resources (a few minutes to run 2000 simulations on the AB district with the same computing configurations).
  The two subsurface models are simple and require few computing resources. SIDRA-based models are slightly more complex to calibrate than Reservoir-based models as they require two drainage network characteristic parameters (mid-drain spacing and depth of drains) as input. However, they are not the most sensitive parameters in the SIDRA model (Henine et al., 2022; Chelil et al., 2022).
  We see a gradient in terms of the level of complexity and expertise required to calibrate the models: from SAMIR-SIDRA to RU-Reservoir. The four models are coded in Python, as is the NSGA-II calibration algorithm provided by the Python package spotpy."
  
  Figure 1: Typo: litterature -> literature. I had some difficulty with all the abbreviations, perhaps add some kind of legend.
  Thank you for this remark. We have corrected the typo and added a legend of abbreviations to Figure 1. The modified Figure 1 is shown in the supplementary file.
  
  L.145-146: I am not sure I understand what main drains are. A tube-drained field needs a ditch into which the tube drains discharge. Is this ditch a main drain? If fields are so small that they can be drained by ditches only, then you call these drains. If these ditches all drain into a ditch that is ‘one hierarchy level up’, then this ditch is a main drain, too, if I follow you correctly. But I suppose these main drains themselves discharge into the next hierarchy level (and so on, perhaps) so that all water can leave the area through a single discharge point. What do you call these higher-level ditches/canals?
  Thank you for these questions. To clarify how the drainage system in the Algerri-Balaguer district is organized, we suggest adding more descriptive information to lines 144-147:
  "Field drains (underground perforated plastic pipes) are connected to collectors (underground concrete pipes larger than field drains). These collectors, in turn, are connected to main drains (either larger underground concrete pipes or open ditches). The main drains ultimately convey the water to general outlets (green dots in Fig. 2). During irrigation implementation, the collectors and main drains were installed in the first few years. Since then, field drains have been installed progressively at the initiative of each farmer according to his needs."
  
  L. 165: ‘Data’ is not a very descriptive title for a section of a research paper.
  Thank you for your comment. We propose to change the title to "2.1.2 Description of the data used in this study".
  
  L. 259: What is an intense drainage season?
  The "intense drainage season reservoir level" in RU is the soil reservoir level at which Smax is reached, meaning that the soil profile is saturated. It is called the "intense drainage season reservoir level" because this level is reached at the time of the agricultural season when any rainfall or irrigation is converted into drainage discharge with a restitution coefficient close to 1.
  We propose adding the following sentence to line 259:
  “Where SIDS (mm) is the intense drainage season reservoir level, reached at the season period where drainage discharge is close to the amounts of precipitation and/or irrigation (Jeantet et al., 2021). During this time, the level of the reservoir reaches the threshold of Smax.”
  
  Eq. (11): Here, the water table appears to be a function of time only, implying that it is flat. The Boussinesq equation is an expression for the water table in (one-dimensional) space and time. How come the water table is assumed flat here?
  Thank you for your comment. Equation 11 calculates the depth of the water table at mid-drain spacing. In fact, equation 11 does not assume that the water table is flat: at the drain locations, the water table is lower than at mid-drain spacing. This horizontal variation in the water table is accounted for in equation 11 by using a water table shape factor C to describe a quarter-elliptical water table shape, as described in Zimmer et al. (2023) (https://doi.org/10.5802/crgeos.194).
  Nevertheless, equation 11 assumes that two adjacent drains lie at the same depth, a condition that is met in the AB district given its flat topography. Indeed, the installation of the drains in 1998 was accompanied by land-leveling work to consolidate the plots.
  We propose replacing the sentence in line 143 with the following:
  "In 1998, modernization works were carried out in the AB district, including flattening the land for plot consolidation and installing irrigation systems and a drainage network. The drainage network consists of surface (open ditches) and subsurface (buried pipes) drains (Altés et al., 2022).”
  We also propose to modify the sentence in line 272:
  “where h is the water table (m), K the horizontal hydraulic conductivity (m d⁻¹), μ the drainable porosity (m³ m⁻³), and C a water table shape factor (-) equal to 0.904 (which simulates a quarter-elliptical water table shape; Zimmer et al., 2023). h is bounded between 0 and 1.5 m (the average drain depth assumed at AB). Eq. (11) is based on the assumption that adjacent drains have the same depth, which is respected in the AB district due to its flat topography.”
  
  L. 331-336 and Table 3. You only partially explain how you arrived at the distributions of the fitting parameters. I think it is very useful for users of your approach to have some information on how to arrive at plausible parameter distributions. Perhaps it is even worth devoting a subsection of the Methodology section to that.
  Thank you for your comment. We agree that readers would benefit from more information on parameter distributions.
  We propose dividing section “2.3 Strategy for evaluating the models' precision” into three subsections:
  On line 309, we propose to add the sub-section title "2.3.1 Calibration strategy".
  On line 331, we propose to add the sub-section title "2.3.2 Parameters distribution for calibration" with the following text:
  "NSGA-II requires a distribution provided by the user for each calibrated parameter. The distribution and references used in this study are provided in Table 3.
  The distribution of aKcb of the SAMIR model is based on Laluet et al. (2023a), who obtained aKcb values for 37 agricultural seasons (mainly maize and wheat, being widely present in the AB district) from the linear relationship Kcb = aKcb ⋅ NDVI + bKcb. Knowing the value of NDVI at bare soil (where Kcb is zero) and at full vegetation (where Kcb is equal to Kcbmax), aKcb and bKcb can be inferred.
  The distribution of Zrmax is derived from tables provided by Allen et al. (1998) and Pereira et al. (2021). This study uses the mean and standard deviation of Zrmax for maize, being the most present and irrigated crop type at AB1 and AB2.
  The distribution of Sinter of the RU model is taken from Jeantet et al. (2021), who calibrated this parameter on drainage discharge in situ data from 22 French drained sites.
  The distribution of K and μ of the SIDRA model is also based on Jeantet et al. (2021), who derived the mean and standard deviation of these parameters from field measurements performed on 15 silty soils in France, similar to the soil texture of AB1 and AB2. The mid-drain-spacing parameter L distribution was chosen to be wide (uniform distribution between 4 m and 60 m) to consider that some plots are not drained at AB1 and AB2. As a comparison, Chelil et al. (2022) used a uniform distribution between 3.5 m and 6 m for fully drained sites.
  The distribution of the k parameter of the Reservoir model is derived from Cenobio-Cruz et al. (2023), who calibrated this parameter on discharge flow data of about 25 catchments in northern Spain.”
  We also propose adding the sub-section title "2.3.3 Metrics used for calibration and validation" in line 337.
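As an illustration of how aKcb and bKcb follow from the two NDVI anchor points mentioned in the proposed text for Section 2.3.2, the calculation can be sketched as below. This is only a sketch and is not taken from the manuscript: the function name and the numerical anchor values (bare-soil NDVI, full-vegetation NDVI, Kcbmax) are hypothetical and serve only to show the arithmetic.

```python
def kcb_coefficients(ndvi_soil, ndvi_full, kcb_max):
    """Return (a_kcb, b_kcb) for the line Kcb = a_kcb * NDVI + b_kcb.

    The line is fixed by two anchor points:
    Kcb = 0 at ndvi_soil and Kcb = kcb_max at ndvi_full.
    """
    if ndvi_full <= ndvi_soil:
        raise ValueError("ndvi_full must be larger than ndvi_soil")
    a_kcb = kcb_max / (ndvi_full - ndvi_soil)  # slope of the line
    b_kcb = -a_kcb * ndvi_soil                 # intercept so that Kcb(ndvi_soil) = 0
    return a_kcb, b_kcb


# Hypothetical anchor values, chosen only to illustrate the calculation.
a_kcb, b_kcb = kcb_coefficients(ndvi_soil=0.15, ndvi_full=0.90, kcb_max=1.15)
kcb_at_soil = a_kcb * 0.15 + b_kcb  # should be zero at bare soil
kcb_at_full = a_kcb * 0.90 + b_kcb  # should equal kcb_max at full vegetation
```

Repeating this calculation for each agricultural season would give one aKcb value per season, from which a mean and a standard deviation could be derived.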
  
  L. 366-372: I like this approach, but the explanation is not so clear. In one sentence you say you use the distributions of Table 3, in another you say you fixed L. These statements cannot both be true. Perhaps you found that L is not a sensitive parameter and could be fixed. How did you determine whether a parameter was sensitive or not? And did you check if fixing the non-sensitive parameters affected the results, for instance by generating a population with all parameters drawn from their distributions?
  I presume you assumed no correlations between the parameters when selecting the parameter combinations during the Monte Carlo sampling.
  We used a uniform distribution for L between 4 m and 60 m in the precision assessment because we know that the AB district has the particular feature of not having its entire surface drained. For sites where the surface is completely drained, L appears not to be very sensitive, as shown in Chelil et al. (2022) and Henine et al. (2022), and does not need to be calibrated. However, when assessing accuracy, we consider a case where the user has no a priori knowledge of the geometry of the drainage network. Consequently, the value of 6 meters generally found in the literature has been chosen.
  We suggest modifying the paragraph at line 366 to clarify this point as follows:
  “To this end, for each of the four models, 2000 sets of their most sensitive parameters are generated randomly using Monte Carlo sampling, with the distributions presented in Table 3, except for the mid-drain spacing L. Indeed, for the accuracy evaluation, we consider a situation where we have no information on the geometry of the drainage network and therefore on L. In this hypothetical situation we do not know that a portion of the surface is not drained, resulting potentially in high L values. Therefore, we use a value of 6 meters for the accuracy evaluation, which is a value frequently found in the literature.”
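The sampling described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: parameters are drawn independently (no correlations), negative draws are rejected, and the (mean, standard deviation) pairs for K and μ are hypothetical placeholders, since their Table 3 values are not quoted in this reply. Only the Sinter distribution (mean 138 mm, standard deviation 53 mm), the 2000 sets, and the fixed L of 6 m come from the responses.

```python
import random

N_SETS = 2000      # number of parameter sets, as stated in the reply
L_FIXED_M = 6.0    # L is fixed for the accuracy evaluation, not sampled

# (mean, standard deviation) of the normal distribution of each sampled parameter.
NORMAL_PARAMS = {
    "S_inter_mm": (138.0, 53.0),  # values quoted in the reply (Table 3)
    "K_m_per_d": (0.5, 0.2),      # hypothetical placeholder
    "mu": (0.05, 0.02),           # hypothetical placeholder
}


def draw_positive_normal(rng, mean, std):
    """Draw from a normal distribution, rejecting non-positive values (assumption)."""
    while True:
        value = rng.gauss(mean, std)
        if value > 0.0:
            return value


def sample_parameter_sets(n_sets=N_SETS, seed=0):
    """Generate n_sets independent parameter sets with L held constant."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_sets):
        params = {
            name: draw_positive_normal(rng, mean, std)
            for name, (mean, std) in NORMAL_PARAMS.items()
        }
        params["L_m"] = L_FIXED_M
        sets.append(params)
    return sets


parameter_sets = sample_parameter_sets()
```

Each dictionary in `parameter_sets` would then drive one model run, so that the spread of the 2000 simulated drainage series reflects the parameter uncertainty with L held at its literature value.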
  
  Table 4: Please include the observed irrigation amounts for comparison.
  Thank you for this suggestion. We propose to add a column to Table 4 showing in situ irrigation averaged over the whole AB district for 2021 and 2022 (area including the non-irrigated plots). The modified Table 4 is shown in the supplementary file.
  
  L. 438-440: This explanation sounds plausible, and I am not disputing it. But do you know how the discharge network of ditches of various hierarchies affects the discharge response at the outlet to possibly non-uniform irrigation and rainfall?
  Thank you for your comment. We agree that drainage from a highly irrigated drained plot close to a collector will reach the outlet faster than a similar plot located further away.
  However, this phenomenon is unlikely to affect the drainage observed at AB1 and AB2, as irrigation and rainfall are fairly homogeneous. Firstly, double-cropped plots, being the most represented and the most irrigated at AB1 and AB2, are homogeneously distributed (cf. Figure 2). Some double-cropped plots can be close to a collector and others further away, balancing out the potential effect of collector distance on drainage. Secondly, the surface areas of AB1 and AB2 are small enough for precipitation to be considered uniform, thus affecting all plots equally, whether far from collectors or not.
  
  L. 459-461: Do the fitted values of S_inter not expose the models to the risk of being ‘right for the wrong reason’? This might become an issue for scenario studies if the model is used outside the data range for which it was calibrated. You seem to allude to this in L. 535, but for another model.
  Thank you for this comment. Indeed, a value of a calibrated parameter outside the distribution range is problematic in the case of accuracy assessment or for scenario studies. Using default Sinter values (higher than the calibrated optimal Sinter), RU-based models do not simulate sufficient recharge and ultimately drainage (as shown in Figures 6 and 7).
  These very low calibrated Sinter values provide valuable information on the functioning of the RU-based models. Indeed, Sinter is low to reduce the size of the reservoir and force the model to simulate sufficient recharge to compensate for the fact that irrigation and recharge are not spatialized in RU, unlike in SAMIR, as explained at line 510.
  To clarify this point, we propose adding this sentence at line 461:
  "These low Sinter values do not pose a problem for precise drainage simulation. However, they can be problematic in terms of accuracy when no drainage data is available to calibrate Sinter and to find these low Sinter values. This is particularly the case for scenario studies."
  
  L. 473-474: I think the required resources and training to run the models is very relevant for a study of this nature. As I commented above, I would like to have a more thorough treatment of these aspects.
  We've already addressed the issue in response to your fifth comment "L.101-102: I would also be interested in evaluating the level of the complexity of model calibration...".
  
  L. 480-483: Good point, food for thought.
  
  Fig. 6, 7: In most of the panels (6a, 6c, 6e, 6f, 6g, 7a, 7e, 7g, 7h), for at least a section of the time line, the observed drainage appears to be among the outliers of the 2000 simulations. I do not know what to think of this. It perhaps suggests that the models need extreme parameter values to be able to simulate this, which may indicate that the model or the parameter distributions may not be that good. But I am not sure if I am not over-interpreting this. Any thoughts?
  Thank you for this question.
  It is true that the observed drainage appears among the outliers of the 2000 simulations for RU-Reservoir (Fig. 6a, Fig. 7a) and RU-SIDRA (Fig. 6c, Fig. 7c) for the period 2021.
  This is explained by the Sinter parameter values used for the accuracy evaluation. The optimal Sinter values obtained after calibration during the precision evaluation are low (between 10.5 mm and 28 mm for the 2021 period and between 27 mm and 94 mm for the 2022 period) compared with the distribution provided in the literature (mean of 138 mm and standard deviation of 53 mm; cf. Table 3). However, the accuracy evaluation consisted of using the high default values from the literature, which do not allow sufficient drainage to be simulated. Therefore, only a few of the 2000 simulations used low Sinter values (linked to the normal distribution used), and these are the outliers you mentioned.
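To give a rough sense of how rare such low Sinter draws are, the share of a normal distribution with mean 138 mm and standard deviation 53 mm that falls below the upper ends of the calibrated ranges (28 mm for 2021, 94 mm for 2022) can be computed as sketched below. This is an illustrative calculation only: it assumes an untruncated normal distribution and independent draws, which may differ from the sampling actually used.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


SINTER_MEAN_MM = 138.0  # literature mean quoted in the reply
SINTER_STD_MM = 53.0    # literature standard deviation quoted in the reply
N_SIMULATIONS = 2000

# Upper ends of the calibrated optimal Sinter ranges quoted in the reply.
share_below_2021 = normal_cdf(28.0, SINTER_MEAN_MM, SINTER_STD_MM)
share_below_2022 = normal_cdf(94.0, SINTER_MEAN_MM, SINTER_STD_MM)

# Expected number of the 2000 simulations with Sinter below each threshold.
expected_runs_2021 = N_SIMULATIONS * share_below_2021
expected_runs_2022 = N_SIMULATIONS * share_below_2022
```

Under these assumptions, the 2021 threshold of 28 mm lies about two standard deviations below the mean, so only on the order of 2 % of the draws fall below it.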
  To clarify this in the text, we propose modifying the following sentence line 510:
  "This is related to the fact that, in a context where RU is not spatialized while irrigation is spatially heterogeneous, the optimal Sinter value to generate sufficient recharge is lower than those given in the literature (optimal Sinter values are shown in Table 5)".
  
  L. 513-515: I do not follow your line of thought here. Perhaps rephrase.
  Thank you for this suggestion. We propose the following reformulation:
  "For the 2022 period, more drainage is simulated than in 2021 because the irrigation amounts applied by farmers in 2022 (587 mm between May and October 2022 on average over the AB district) are significantly larger than in 2021 (509 mm between May and October 2021)."
  
  L. 539-541: Not only that, lateral fluxes across the system boundary will be difficult to estimate accurately. But I agree with you they can be an important factor.
  Thank you for this remark. We propose adding the following sentences to line 539 to emphasize this point:
  "Note that representing subsurface lateral flows would be challenging, especially to estimate them accurately across system boundaries. Furthermore, this would require more complex models than those tested in this study and additional data (e.g., piezometric), which are currently not available in the study area."
  
  L. 554: Rename ‘Summary and Conclusions’.
  Done.
  
  Citation: https://doi.org/10.5194/egusphere-2023-543-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Publish subject to revisions (further review by editor and referees) (27 Mar 2024) by Gerrit H. de Rooij

Dear authors,

I agree with your assessment that the reviewers provided constructive feedback. I studied your replies, and by and large you seem to have a good idea about how to revise your paper. I have some doubts about a few of your replies though, and I explain those below. I therefore request you to have a closer look at those to see if they merit a somewhat different response.

Yours sincerely,

Gerrit de Rooij
Editor

Reviewer 1:

The comments about calibration vs validation worry me a little. Successful calibration alone is not a measure of model performance or quality. Your use of default/literature parameters is a sort of validation, but still: what is the point of calibrating a model if it has to be recalibrated for every season? You argue this is due to the semi-empirical nature of the models, but is this not beside the point? If a model is successful at calibration but fails at validation, it can only fit a curve well. In that case, the parameters have high descriptive value, but low predictive value. This may be the fundamental reason for the good performance of the simplest model: by sacrificing descriptive value, the model may have gained predictive value because it models the key processes with some accuracy.

I think you need to better explain how models that have fluctuating parameter values can be used in practice. It might affect the usefulness of the models as decision support tools, for instance. This needs to be addressed thoughtfully.

In your response to the comment by the reviewer about the vast difference in size between two irrigation districts, you suggest an edit to clarify the differences between the two that does not mention the size.

Reviewer 2:

Your proposed edit to the comment about saline groundwater is incomplete. You are correct that irrigated fields with relatively shallow groundwater tables need drainage to allow leaching of salts from the root zone. But the reviewer alluded to the need to lower the groundwater table when the groundwater is saline to such a level that capillary rise into the root zone is no longer possible (salinization from below).

The reply to the comment about the groundwater table only being a function of time includes a suggested edit that does not address this. The new text does not explain that ‘groundwater table’ refers to the phreatic level at the midpoint between drains. The accompanying explanation makes this clear, but this information belongs in the main text.

In your response to the comment on the Monte Carlo-type simulation, which concerned the reviewer’s confusion about the way L was treated, you use the term ‘mid-drain spacing’ to define L. This will add to the confusion. Either it is the ‘drain spacing’, or simply ‘half the drain spacing’. If you state that 6 m is a value for L frequently found in the literature, you may want to back that up by a few references.

In your reply to the comment about Figs. 6 and 7, you state that the calibrated Sinter values are much lower than those reported in the literature. The edit you suggest indicates that this is the result of spatially heterogeneous irrigation. Does this not imply that RU is of limited use for irrigation districts that are not under monoculture? If so, this should be discussed.

AR by Pierre Laluet on behalf of the Authors (23 May 2024) Author's response Author's tracked changes Manuscript

ED: Publish as is (28 May 2024) by Gerrit H. de Rooij

AR by Pierre Laluet on behalf of the Authors (07 Jun 2024) Manuscript

Short summary

Monitoring agricultural drainage flow in irrigated areas is key to water and soil management. In this paper, four simple drainage models are evaluated on two irrigated sub-basins where drainage flow is measured daily. The evaluation of their precision shows that they simulate drainage very well when calibrated with drainage data and that one of them is slightly better. The evaluation of their accuracy shows that only one model can provide rough drainage estimates without calibration data.