FarmCan: a physical, statistical, and machine learning model to forecast crop water deficit for farms

Sadri, Sara; Famiglietti, James S.; Pan, Ming; Beck, Hylke E.; Berg, Aaron; Wood, Eric F.

doi:https://doi.org/10.5194/hess-26-5373-2022

Articles | Volume 26, issue 20

© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Special issue:

Experiments in Hydrology and Hydraulics

Research article | 27 Oct 2022

Final revised paper (published on 27 Oct 2022)
Preprint (discussion started on 31 Mar 2022)

Interactive discussion

Status: closed

RC1:
'Comment on hess-2022-96', Anonymous Referee #1, 29 Apr 2022
This paper presents a study of using a machine learning framework, FarmCan, to forecast irrigation demand in 4 farms in Canada. Based on the machine learning modeling results, the authors find that soil moisture shows a strong correlation with precipitation. Also, evaporation and potential evaporation are effective predictors of NI. The study shows the potential of using machine learning models to improve the timing of irrigation and therefore to save water and achieve sustainable agricultural production. The manuscript is on a topic of interest to the audience of HESS. I only have a few minor comments that I hope the authors could address in their revision.

Specific comments:

Lines 51-58: In this part, the authors could add a few more references and add more in-depth discussion about the current stage of ML models for irrigation water demand.

Line 101: I checked the citation (FAO, 2021), which has the equation as: ICU = ET – P – dS. Please revise equation (1).
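The water balance form quoted in this comment can be illustrated with a short numeric sketch. The function name, the use of millimetres, and the sample values are assumptions made for illustration only; they are not taken from the manuscript or from FAO (2021). The sketch simply evaluates the expression as written and does not settle the sign convention of dS.

```python
def irrigation_consumptive_use(et_mm, p_mm, ds_mm):
    """Evaluate ICU = ET - P - dS exactly as quoted in the comment (all terms in mm)."""
    return et_mm - p_mm - ds_mm


# Hypothetical 8-day totals in mm (illustrative values, not data from the paper).
icu = irrigation_consumptive_use(et_mm=30.0, p_mm=12.0, ds_mm=5.0)
print(icu)  # 13.0
```

With these made-up numbers the result is positive. If P exceeds ET the same expression returns a negative value, which is the kind of sign question the equation's definition would need to address.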

Line 167: There is a question mark here, which I assume is a place holder for references.

Lines 171-175: This description suggests that the FarmCan model is site-specific. The authors could add some discussion here to explain the flexibility of the model. Also, the authors can explain how the model can be transferred to other farm fields.

In Figure 6, I would suggest changing the color scheme. It is a bit confusing with ET and SM both presented in reddish colors.

At the end of the results section, maybe the authors can add a subsection to discuss the practical application of the FarmCan model. For example, how can we use the model to improve agricultural water use management?
Citation: https://doi.org/10.5194/hess-2022-96-RC1
- AC2: 'Reply on RC1', Sara Sadri, 25 Aug 2022
  
  The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2022-96/hess-2022-96-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC2
RC2:
'Comment on hess-2022-96', Geoff Pegram, 06 May 2022

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Review

This is a short, informative and, to my mind, original article on the development of a tool to improve grain farming in Canada. The topic is new to me among the reviews I have done, and I found it a sharp learning experience.

There is nothing seriously poor herein that needs to be attended to. Most of my remarks are attached to the Figures and the odd Table, to make reading easier. To repeat a passage I wrote as a comment after the conclusion, I make an appeal which I hope will help the readers of the article: “Most potential readers will probably scan the Abstract, look at the Figures and possibly read the Conclusion, before they decide to read the whole. Please repeat the text referred to by the acronyms in this passage. Because you have a lot of them, please add them in an appendix for reference below the text. Your article is relatively short, so an extra page will not hurt!”

After some tidying up, a revised version will likely be acceptable to the Editor. I would be happy to see the revision. My comments to the Authors follow my Signature below this passage, which is my wont.

Geoff Pegram

6 May 2022

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Details of comments inserted in the article. Clips of your text are numbered and my remarks follow introduced by #. I will not copy the trivial suggested corrections, but I will take the more pithy selections and add them here.

11 MSWEP

# Multi-Source Weighted-Ensemble Precipitation

24 population of 9.1 billion (UN/ISDR, 2007; FAO, 2009

# I checked and found World Population Clock 2 May 2022: They give 7.9 Billion People (2022) – Worldometer

Table 1

# What do these numbers mean? Please make your caption more informative. Add PET = Potential ET. Rotate the table so we don't have to crane our necks! It fits if you make the columns a bit thinner and deeper.

Page 8

# At this stage, I have copied as many acronyms as I can find, some of them having their meaning explained, but after listing 23 at this stage, I ask for a list of acronyms that we can check out after the conclusion, before Refs.

149 ……0.1 is the scale factor meaning that the data had to be corrected by multiplying them by 0.1 (Running et al., 2019).

# I do not understand this sentence; reducing the data by a factor of 10? What for?

167 V280 combined with the MSWX product (?)

# In place of (?) I suggest “as they match in frequency (3 hr) and pixel size (0.1°)”

175

# Getting info from the farmers is very smart

Fig.2

# Good informative layout

2.5 Relative importance of FarmCan inputs to P

# Does not make sense - inputs Precipitation ?

230 variables (ET, PET, SM, and RZSM) are used first as predictants

# "Predictants" is not a word in the Oxford English Dictionary, nor could I find it on the Web. Nice try, but you might substitute: "seen as items to be estimated"

Fig. 3

# That is seriously good corroboration cell for cell - almost identical - by eye (I did it in one minute) and I would estimate a cross correlation average of 95%

Figure 4. Spatial patterns of climatology. Data was collected from 2015-2020 for the agricultural months (Apr-Oct).

# Please expand the legend in this relatively short article, as most readers will check the abstract, then possibly the figures which need to be self-explanatory. Then they might take the challenge of the text if they have been enticed! Expand the acronyms here, as well as listing them at the end of the text.

Fig. 5

# In the caption please change Apr—Oct to “April and October”. Also, please give horizontal definition of columns in legend - it took me a while to unpack ...

Fig. 6

# Make these sample bars thicker as in the figure - their colours are indistinguishable in this legend; Fig. 7 gets it right. What is 'Teal'? Light green? Make the bar-chart thicker? The dates are unpackable - they are a jumble. In my first look I had no clue as to which is day, month nor year and what the numbers below the blank spaces are designed to tell the reader. Why not give dates, of start and finish, of the readings?

Fig. 8

# What about (b) & (c)? Nevertheless, your figures are well laid out, embedded in the text. Also, the 3 & D are chopped off ... the images are very readable and can be reduced in size without loss of message - same for Fig. 7 which I missed

Fig. 9

# Enlarge the words Predicted as they are unreadable at an A4 size - Observed as well. There's enough space. Also please make the caption more informative

4 Conclusions

# Most potential readers will probably scan the Abstract, look at the Figures and possibly read the Conclusion, before they decide to read the whole. Please repeat the text referred to by the acronyms in this passage. Because you have a lot of them, please add them in an appendix for reference below the text. Your article is relatively short at 350 lines including Figs & Tables, so an extra page will not hurt!

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Citation: https://doi.org/10.5194/hess-2022-96-RC2
- AC3: 'Reply on RC2', Sara Sadri, 25 Aug 2022
  
  The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2022-96/hess-2022-96-AC3-supplement.pdf
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC3
RC3:
'Comment on hess-2022-96', Anonymous Referee #3, 10 May 2022
Hydrological forecasts provide valuable information for agricultural planning and management. This paper has developed a physical, statistical and machine learning model, called FarmCan, to forecast crop water deficit at farm scales. One feature of FarmCan is the integration of remote sensing datasets, including soil moisture, root zone soil moisture, precipitation, evapotranspiration and potential evapotranspiration. The usefulness of FarmCan is demonstrated through the case study of four farms in Canada.

There are three comments for further improvements of the paper.

Firstly, there is a gap between rainfed farms and needed irrigation. Specifically, four rainfed farms are investigated in this paper (Lines 85 to 86) and the attention is paid to the needed irrigation (Lines 107 to 112). It is noted that rainfed and irrigated systems are two distinct approaches to agricultural production and that irrigation is generally not involved in rainfed systems. Please clarify the issue of needed irrigation in rainfed farms.

Secondly, the irrigation if applied would augment soil moisture and then affect evaporation. In Eq. (1) on Page 7, the needed irrigation is calculated by using evaporation and soil moisture. The calculation seems to mix independent and dependent variables. Specifically, from the perspective of statistical modelling, if x depends on y then it may be improper to regress y against x.

Thirdly, the algorithm of FarmCan accounts for 4 phenological stages of crop growth (Lines 179 to 180). It is known that crop water requirements vary across the different stages even under the same background climate (https://www.sciencedirect.com/topics/agricultural-and-biological-sciences/crop-water-requirement). In addition, the analysis involves multiple crops, including soybeans, oats, spring wheat, etc. Please illustrate how the different crops and crop growth stages are considered under the same framework of FarmCan. Given that there are numerous combinations of crops/stages, can the data presented in this paper provide enough samples to train FarmCan? What are the sampling variability and parametric uncertainty of FarmCan?

Below are a few minor comments:

Please add a flowchart of the steps of data processing and the dataset involved.

In Fig. 9, it seems the uncertainty ranges are determined by linear regression models. Can the FarmCan quantify the uncertainty by itself?
Citation: https://doi.org/10.5194/hess-2022-96-RC3
- AC4: 'Reply on RC3', Sara Sadri, 25 Aug 2022
  
  The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2022-96/hess-2022-96-AC4-supplement.pdf
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC4
RC4:
'Comment on hess-2022-96', Anonymous Referee #4, 12 May 2022

In this paper a study of a physical, statistical, and machine learning model to forecast crop water deficit is presented. With this model, called FarmCan, the authors show that remote sensing datasets such as soil moisture, root zone soil moisture, precipitation, and evapotranspiration can predict the needed irrigation.

This paper shows that the study of using machine learning models to improve the use of water resources for a sustainable agriculture has great potential.

Here in the following some comments to improve the presentation of the paper:

- In general, some figures and tables are not very clear. If the information in Table 1 is divided into two parts, the two new parts will fit horizontally, which improves readability. Figure 6 contains a lot of information, and the colours and the bars confuse the reader: maybe here too it is possible to split the figure into two parts. Figures 7-8: I do not understand why you show the “observed variables for farmS2” separately and do not do this for the other farms; maybe the “growth stage” (it is simply a line) is useless.

- There are really a lot of acronyms: please list them with the corresponding explanations at the end.

- Line 167: there is (?) - please correct it.

- In the abstract you state “...our algorithm was able to forecast crop water requirements 14 days in advance…”: I do not understand why in the rest of the paper (for example Figures 7-8) the predictions are up to 10 days.

In conclusion, my opinion is that the paper is of interest to HESS and, with some minor revisions, will be acceptable to the Editor.

Citation: https://doi.org/10.5194/hess-2022-96-RC4
- AC5: 'Reply on RC4', Sara Sadri, 23 Sep 2022
  
  please see collated response on Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC5
RC5:
'Comment on hess-2022-96', Anonymous Referee #5, 12 May 2022
The manuscript is good and it can be publishable in the HESS journal. The specific comments of each section of the manuscript are given below:

Details of the research findings could be included in the abstract.

A comprehensive literature review is needed considering recent developments.

The study area section needs to be more explanatory such as past climatic scenarios which will directly or indirectly affect the crop.

Figure 1 should include the scale and North direction symbol.

Figure 3 explains day 8, PET, ET, etc. Why are the day 8 parameters important, and what about other days? Figure 6 also describes only the 8-day variability. What is the significance of the day 8 events?

The conclusion section should be more informative.
Citation: https://doi.org/10.5194/hess-2022-96-RC5
- AC6: 'Reply on RC5', Sara Sadri, 23 Sep 2022
  
  please see the collated response on Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC6
RC6:
'Comment on hess-2022-96', Anonymous Referee #6, 13 May 2022

1. What would happen with NI-prediction accuracies if RF is directly applied to model the P-NI relationship?

2. The authors may consider other NI-related climate variables in the modeling process.

3. Land use changed for the farms in the past years. Could FarmCan adapt to this?

4. A portion of samples in one day, one year, etc. are used to demonstrate the accuracy of FarmCan. Using all samples may avoid misestimation of the accuracy under temporal dynamics.

5. It would be very helpful for readers if the principle of FarmCan were clarified as early as possible.

6. Please carefully polish the manuscript. Some errors exist.

Citation: https://doi.org/10.5194/hess-2022-96-RC6
- AC7: 'Reply on RC6', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC7
RC7:
'Comment on hess-2022-96', Anonymous Referee #7, 13 May 2022

This manuscript is well-prepared and concise. The authors provide a thorough description of a random-forest model for irrigation need prediction. They demonstrate the relevance in the literature, assess correlations that make an RF feasible, provide a brief case study, and provide some validation. I see no reason to prevent publication once the comments of other reviewers have been addressed. I see, as I will describe in my comments, places where the work could be extended to greater impact if desired.

My main concern is that most of the paper is devoted to description of the model and a grounding in its anticipated utility; comparatively little space is given to thorough validation or exploration of the nuances of performance. Further investigation could lead to great impact.

The model, for example, produces a KGE for fourteen-day predictions of something like 0.4. The authors provide limited discussion of the implications of this performance. What level of uncertainty does that imply? What could be the social and economic costs (crop loss, reduced yield, water costs, etc.)? How does this prediction accuracy vary over the season? How does accuracy vary over the prediction horizon? These questions could be discussed qualitatively (in leaving work for later) or quantitatively (in trying to add more work here). They provide, either way, a more robust understanding of the utility of this model.
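For readers weighing the KGE value mentioned above, the following is a minimal sketch of how the Kling-Gupta efficiency is commonly computed. It assumes the original three-component formulation (correlation, variability ratio, bias ratio); the manuscript may use a variant, and the function name and sample series are hypothetical.

```python
import math
from statistics import mean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency, assuming the original three-component form."""
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    # Population covariance, consistent with the population standard deviations.
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical observed and predicted series (illustrative values only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
print(kge(observed, observed))
print(kge(predicted, observed))
```

A perfect prediction gives a value of 1. The second call shows that a perfectly correlated prediction can still score poorly when its mean and spread are both doubled, so a single KGE value does not by itself say which component drives the error.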

I’d also be interested in, if space allows, a qualitative discussion of the accuracy of the forecast data. That is, what hypotheses can we make about the decay in performance as a function of forecast uncertainty? Obviously, a lot of work would be required to answer this question quantitatively, and I’m not asking for that necessarily. I still think it important to understand what questions can now be posed because of the development of FarmCan.

All in all, a great piece of work, and I’m looking forward to reading more. Thank you.

Citation: https://doi.org/10.5194/hess-2022-96-RC7
- AC8: 'Reply on RC7', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC8
RC8:
'Comment on hess-2022-96', Anonymous Referee #8, 14 May 2022

In my opinion the term “Needed Irrigation” is not appropriately used in this paper and could be misleading.

Since FarmCan has been applied to rainfed cropping system, I think that the term “water deficit” (as in the title) is more appropriate.

The computed “crop water deficit” could be smaller than the actual crop water demand for achieving an “optimal yield”. As far as I know, MODIS PET is influenced by vegetation indices. In rainfed cropping systems, crop canopy development could be suboptimal, as it might be affected by crop water stress. Thus, the estimated PET is smaller than the crop ET under standard conditions (i.e., no soil water constraints with respect to what is required for optimal crop development): MODIS PET cannot be used for assessing the “water needed ….to reach an optimal yield”. The paper should be revised by carefully considering this point. The attribute “optimal” should be carefully applied.

I am not sure about Equation 1: is NI negative when PET is larger than P?

The paper should also clarify how this tool could be practically employed in “near real time”: what kind of strategies could be implemented “to minimize potential crop failure and losses” in rainfed cropping systems?

Citation: https://doi.org/10.5194/hess-2022-96-RC8
- AC9: 'Reply on RC8', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC9
RC9:
'Comment on hess-2022-96', Anonymous Referee #9, 15 May 2022
Review of “FarmCan: A Physical, Statistical, and Machine Learning Model to Forecast Crop Water Deficit at Farm Scales” by Sadri et al.

This paper describes a climate-informed machine learning (ML) framework to predict crop water demand at the farm scale that was tested at four farms in the Canadian Prairies Ecozone.

The topic is of general interest due to the increasing drought and overexploitation of water resources in many regions of the world due to global climate change and fits well within the scope of the journal. The manuscript is written in a comprehensible way for the most part, but it needs to be revised as it often contains incorrect wording and phrasing.

Unfortunately, in my opinion, this work is not yet mature enough to be recommended for publication because of numerous methodological shortcomings (see general and specific comments). Furthermore, the presentation of the results and the discussion are not adequate and need to be revised.

General comments:

There are already numerous tools available to predict water demand for crop management. The novelty of this study is for the user to select a specific farm location, which alone is not sufficient for publication. Therefore, the novelty of this study needs to be better explained.

It is problematic that the ML method was only tested at sites with relatively high precipitation and thus no need for irrigation. In particular, the method should be tested in climate zones where irrigation is actually needed because of the climate conditions.

The modelling procedure shows some weaknesses:

The model uses P to predict ET and PET, although both variables were found not to correlate well with P in the datasets used.

The model predicts 8 day-sums of NI, which are then disaggregated to daily values based on daily P values. This seems to be an unnecessary oversimplification. A more accurate way would be to directly forecast daily NI values.

The model uses PET to calculate the NI, which can lead to an overestimation of the NI, as PET represents the maximum ET at optimal soil water supply. In this way, NI will also be greater than 0 when the crops are not under water stress. Water stress only sets in when soil moisture falls below a certain value called MAD (maximal allowable demand), which is crop and soil specific (see e.g. Taghvaeian et al., 2020).
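The distinction drawn in this comment can be sketched numerically. The code below is a simplified stand-in and not the manuscript's Eq. (1): the PET-minus-P deficit, the threshold expressed as a fraction of the water held between field capacity and wilting point, all function names, and all values are assumptions chosen only to show that a PET-based deficit can be positive while a threshold-based check reports no stress.

```python
def pet_based_deficit(pet_mm, p_mm):
    """Assumed simple deficit: PET minus P, floored at zero (mm)."""
    return max(pet_mm - p_mm, 0.0)


def is_water_stressed(sm, field_capacity, wilting_point, mad_fraction):
    """Assumed threshold check: stress once depletion exceeds a fraction of available water."""
    available = field_capacity - wilting_point  # volumetric water between the two limits
    depletion = field_capacity - sm             # how far the soil has dried below field capacity
    return depletion > mad_fraction * available


# Hypothetical day: PET above P, but the soil is still fairly wet.
print(pet_based_deficit(pet_mm=5.0, p_mm=2.0))  # 3.0
print(is_water_stressed(sm=0.35, field_capacity=0.40,
                        wilting_point=0.20, mad_fraction=0.5))  # False
```

Under these assumed numbers the PET-based value is 3 mm while the threshold check reports no stress, which is the overestimation the comment describes.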

Switching from SM to RZSM after the second harvest stage is questionable, as the plants reach a greater rooting depth than 5 cm much earlier.

The analysis in Chapter 3.2 shows mainly weak correlations and incomprehensible argumentations. In addition, it is detached from the modelling part. Therefore, I suggest removing it.

A discussion of the results in the light of existing literature and potential limitations is missing.

Specific comments:

L14: “four” instead of “4”

L16-18: This statement already indicates that the ML method was not sufficiently tested.

L32-35: It is unclear why and to what extent irrigation demand forecasts are important to rainfed farmers. It is very unlikely that they would change their farm management just because irrigation demand forecasts are available. This statement is also not included in the citations.

L60: Use consistent capitalization

L64: I wonder why the length of the growing season cannot be determined by the inputs, i.e. coordinates and crop type.

L66: What do you mean by “subfield”? Please consider that the SMAP data has only 36 km resolution.

L67: I don't understand why this method is tailored to this area, as the method seems to be generic.

L110: This equation does not correspond to the one shown in the cited reference, e.g. effective precipitation is used. In addition, using actual ET instead of potential ET provides more realistic estimates of irrigation demand and should therefore be used when available, as in this study.

L118: “highly accurate” seems to be exaggerated. Please use quantitative information on accuracy from SMAP validation studies, e.g. Montzka et al., 2017.

L148-150: Please remove the 0.1 factor discussion, which is not important.

L167: There seems to be a citation missing.

L173: What do you mean by “relevant depth of soil moisture”?

L174: The total number of expected growing days seems to be very uncertain. How will this uncertainty influence the forecast results?

L181-185: This procedure is not clear to me. How do you calculate the radii? How do you account for the different spatial resolution of the different data sets? Are you averaging over the SMAP grid? How representative is the SMAP soil moisture for a specific field?

L187-189: It is unclear how the extension of the SMAP data to 2010 was done in detail. Furthermore, it is likely that a machine learning method will lead to very uncertain estimates of soil moisture, and I therefore do not see its benefit for the predictive modelling. Either explain in more detail how and why the extension was done, or better leave it out.

L190-192: From the viewpoint of a soil hydrologist, it is very strange and arbitrary to first predict RZSM and then use this for the prediction of SM, as SM should be much more strongly controlled by P than RZSM due to infiltration processes. Please explain the reasoning behind this in more detail. In addition, explain why you are predicting SM at all.

L194: Again, why are you not using ET instead of PET, see comment L110?

L195-202: Eq. 2 is not correct. To obtain the correct weights, the result from Eq. 2 must be divided by 800. Furthermore, this procedure is a strong simplification, as it does not distinguish on which day P fell within the 8 days, which makes a big difference in reality. Let us assume, for example, that 100 mm of P fell on the first day of the week. In this case, the irrigation requirement for the crops for the following days would be much lower than if 100 mm of P fell on the last day.
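The timing argument in this comment can be made concrete with a toy daily bucket model. This is a sketch under stated assumptions, not the FarmCan procedure or its Eq. 2: the constant daily demand, the storage capacity, the empty initial storage, and the function name are all hypothetical.

```python
def daily_deficits(precip_mm, demand_mm_per_day, capacity_mm, initial_storage_mm=0.0):
    """Toy bucket model: rain fills storage, demand draws it down, the shortfall is the deficit."""
    storage = initial_storage_mm
    deficits = []
    for p in precip_mm:
        storage = min(storage + p, capacity_mm)      # rain beyond capacity is lost
        supplied = min(storage, demand_mm_per_day)   # the crop takes what is available
        storage -= supplied
        deficits.append(demand_mm_per_day - supplied)
    return deficits


# Same 8-day total (100 mm), different timing.
rain_first_day = [100.0] + [0.0] * 7
rain_last_day = [0.0] * 7 + [100.0]

print(sum(daily_deficits(rain_first_day, 5.0, 150.0)))  # 0.0
print(sum(daily_deficits(rain_last_day, 5.0, 150.0)))   # 35.0
```

Both windows have the same 8-day precipitation total, yet the accumulated deficit differs by 35 mm under these assumptions. Any weighting that uses only the 8-day totals cannot distinguish the two cases.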

L229: No precipitation predictions are used in this study.

L240-242: Given the data shown in this paper, only the first explanation seems plausible. This, however, questions the motivation of this study, i.e. that the CPE region is susceptible to droughts and that predictions of NI are important for agricultural management.

L259: This statement is an exaggeration, because this is clearly not always the case. Otherwise, farmers would use irrigation regularly. The available soil water at the beginning of the growing season also contributes to meeting the water needs of the plants because the soil type in this region is predominantly chernozemic clay, which has an extremely high water storage capacity.

L260-261: Just because irrigation is not used is not an argument for drought vulnerability in agriculture. See also comment above.

L262-263: One should be careful with these claims, because if this were indeed the case, farmers in this region would have already started irrigating to achieve high crop yields. Conversely, this may also indicate large uncertainties in the used P and PET products.

L281-282: This argumentation is not plausible.

L284-285: But high PET values can be found also for positive delta values of RZSM and SM for all regions.

L292-294: This statement contradicts the time series of soil moisture shown in Fig. 6. If it were true, one would expect soil moisture to decrease continuously during the growing season. However, the delta values of SM and RZSM both show fluctuations around zero, suggesting that P is sufficient to compensate for ET losses. This discrepancy may be due to uncertainties in the PET and ET data.

L309: In Fig. 8, M1 and M2 show strong P events of about 30 mm on 1 July. This results in an increase of SM to about 0.5 m³/m³ indicating soil saturation. Nevertheless, the model predicts a decrease in ET which is not plausible.

Table 1: The source for the crop water needs is missing and not all crops are covered. The values for P are much too low (in the text values between 400-1100 during the vegetation period are given). The values of P/PET seems to be too low as well.

Figure 4: I don’t see the benefit for showing the spatial distribution of P, ET, PET, SM and RZSM as this study only concerns time series analysis. Instead, monthly climate diagrams would be very useful to better understand the climate and soil hydrological situation in the four farms used in this study to develop and test the model.

Figure 6: Colours for ET and SM not distinguishable.

Figure 7: The two subplots are redundant and should be merged.

Figure 8: There should be no subtitles under the subplots. The plots are difficult to understand because of the large number of symbols. The meaning of the growth stage line is unclear.

Figure 9: Font size is too small. NI is not an observed value.

Literature

Taghvaeian, S., Andales, A. A., Allen, L. N., Kisekka, I., O’Shaughnessy, S. A., Porter, D. O., ... & Aguilar, J. (2020). Irrigation scheduling for agriculture in the United States: The progress made and the path forward. , (5), 1603-1618.
Citation: https://doi.org/10.5194/hess-2022-96-RC9
- AC10: 'Reply on RC9', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC10
RC10:
'Comment on hess-2022-96', Anonymous Referee #10, 16 May 2022

The study is about short-term prediction of water available for crop production in Canada: a new FarmCan model (tool) is suggested for prediction of a set of variables needed by farmers. FarmCan allows forecasting up to 14 days ahead of the evapotranspiration [potential evapotranspiration] and the soil moisture content [root soil moisture content] on the basis of remote sensing products (available operationally), the short-term precipitation forecast, and farm-specific information such as location, crop type, etc. The manuscript discusses the results of applying FarmCan to predict the water demand for crop production at four farms located in Canada. The FarmCan tool implements machine learning methods; the model is trained and run on data covering the period 2015 - 2020.

In my opinion, the manuscript requires a major revision in order to better show the progress achieved and originality of findings obtained by the authors. I would suggest extending the introduction by explaining the diversity of the methods (models) applied in short-term forecasts for the agricultural needs with focus on their disadvantages and gaps. It is now not clear what gaps were filled with the newly developed FarmCan model/software/tools.

It seems that many references are missing, and it makes it difficult to understand whether authors developed this particular method or just implemented the existing method into the developed software (FarmCan model). There are too many abbreviations in the text, and many of them are given without explanation; and it makes reading the text difficult. I would suggest putting all notations into the table in the Annex.

The text includes many specific terms often used by modelers (e.g. dynamic, static variables or “the hindcast”) which were left without explanation; this makes the text difficult to understand for the wider scientific community. I suggest presenting the results more in tables instead of duplicating the figures (Fig. 7 and 8, and the subfigures in Fig. 9).

In my opinion, the manuscript requires a major revision. I am wondering if a manuscript describing developed software is relevant for HESS: https://www.hydrology-and-earth-system-sciences.net/about/manuscript_types.html. This manuscript would probably better fit the requirements of the journal Geoscientific Model Development.

Specific comments:

Lines 2-21: I would suggest removing the notations from the text of the abstract. It is better to introduce them later in the text and/or place them in the special section. Check the abbreviations MSWEP, R2, RMSE KGE: they are given without the explanation.

Line 23, …FAO…: I would suggest to always give first the explanation of any notation that will be used further in the text.

Line 46: what are “the crop stress models”? How do they differ from the crop-water model (line 43) and plant hydraulic model?

Line 52: the reference to ML is needed.

Line 60: Does it mean that you developed the software (FarmCan) allowing using a specific set of available data and methods implemented? Is it relevant for HESS?

Lines 61-62: I would suggest removing the repetition of the abbreviations if they were already introduced in the previous text.

Lines 78-79: Please, provide the numbers while describing the climatology of the region (length of winter and summer seasons, [annual/growing period] amount of precipitation, relative humidity, etc.).

Lines 84-85: The notations first used in the text must be explained for readers (RISMA and NASA).

Table 1: I would suggest providing the numbers for the precipitation rounded to whole mm (e.g. 122.45 -> 122).

Line 95: please give the reference providing the amount of precipitation in Table 1.

Lines 117: The notations first used in the text must be explained for readers (SMAP?).

Line 125: Please explain what “a dynamic parameter” is.

Line 135: Please, check the reference (A1…

Lines 140-143: Check where the definition of the abbreviation GEOS is given first.

Line 156: the abbreviation P was previously explained in the text, there is no need to repeat it here.

Line 163: Remove L. in reference to Chen and …

Line 165: Please explain the abbreviation MSWX, which first appears here.

Line 167: What does “... MSWX product (?) ... ” mean?

Line 207: Please give a reference describing the RF method (algorithm).

Line 233: Pearson correlation analysis is mentioned here for the first time in the text. Please give a reference for it.

Line 243: (I. Sevevi…) -> (Sevevi..)

Line 243: please explain what is “a transition time” or exclude this term from the text.

Lines 255-256: I would suggest not using the term “climatology” when discussing statistics estimated from the 5-year period.

Line 258: “from 400-1200 mm” -> from 400 to 1200 mm

Lines 255-268: The numbers are only given for the precipitation in the growing period while the “hydrological variables” are promised in the name of the paragraph.

Lines 260, 264: please use the names instead of abbreviations.

Figure 4: I would suggest adding the explanation of all abbreviations used in the plots. Also, please remove the term “climatology” from the notation to the figure.

Figure 5: Please, add the unit for X,Y variables and the explanation of the abbreviations used in the figure.

Fig. 6: What is 8-day variability analysis? It was not given in the section of methods?

Table 3: In my opinion, the information in table 3 is better given in the text, and I suggest removing the table.

Line 303: I would suggest not using specific terms (like “the hindcast”) without explaining them.

Figure 7: I would suggest rounding predicted values for PET, ET, P and NI: like P=30.46 mm -> P=30 mm since the precision of the modeled value never becomes better than the observed (measured) value. The precipitation is usually measured with the precision of 1 mm; if the precision of the observed precipitation is better than 1 mm, please provide the description of the measuring technique or the reference describing it.

Figure 8: In my opinion it is better to give the results in a table. The explanation of the results is very poor.

Table 4: I would suggest giving the results for the KGE together with the other estimates (R2, RMSE).

Figure 9: I would suggest showing the plots for only one farm; the rest of the results are better shown in a table. What does [2020] mean in the figure caption?

Line 326: It seems that the discussion section is missing. In such a section, newly obtained findings are discussed in comparison with already known facts. I would suggest discussing the effectiveness of the developed model/software FarmCan for short-term forecasting (up to 14 days) of four variables in terms of its computational costs and the accessibility of its input data sources. Do you know of any other models/software/methods allowing the short-term prediction of PET, NI, etc.? What are the advantages of FarmCan compared to the existing models?

Line 376: I would suggest adding the information on the data and code availability for the results shown in this study. This information is missing.

Citation: https://doi.org/10.5194/hess-2022-96-RC10
- AC11: 'Reply on RC10', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC11
RC11:
'Comment on hess-2022-96', Anonymous Referee #11, 17 May 2022
The manuscript authored by Sara Sadri et al. presents an interesting study that uses a machine learning modelling technique to predict a challenging quantity: crop water deficit. The authors showed that by utilising several freely available remote sensing and model-derived datasets, they were able to reach their forecast goal with a 2-week lead time and reasonable performance. The presentation of the manuscript is well organised, and I found it easy to follow, although certain parts do need to be improved further.

It is also clear that the manuscript has received a great deal of interest from reviewers (I was the 11th reviewer by the time I submitted my review). I don't think I need to repeat the points other reviewers have already mentioned, but I would like to highlight the one that concerns me most, i.e., the fact that the model in question was built without using any physical observations for model validation and calibration. This, in my opinion, must be properly justified. I therefore think the following should be provided in any future revision (or revisions):

How reliable are the datasets used in this study in terms of representing the physical quantities they are supposed to represent (for example, ET, PET, P, SM, etc.)? This should be assessed for the study area rather than as overall performance, since reliability can be very sensitive to spatial location.

The precipitation P has been found to be a very critical variable (which is not surprising). However, the current version does not show how accurate it is compared with real gauge measurements at the study locations.

The model calibration/validation is not clearly presented. We need to see both the training and testing of the models. This could well be accompanied by a flow chart.

Finally, I don’t see how the forecast target, NI, can be checked against field measurements. If, again, the NI value is obtained from other models/remote sensing, the authors should make this very clear.

Another general observation, from a modelling perspective, is that although the study seems very interesting and probably of great practical value, it remains a ‘fitting process’ of several datasets from other modelling processes, which produces very limited insight into the underlying physical processes. The ‘forecast’ capacity is highly dependent on the climate, and I doubt whether the whole methodology can be applied elsewhere, for example where the temporal variation of precipitation is much larger.
Citation: https://doi.org/10.5194/hess-2022-96-RC11
- AC12: 'Reply on RC11', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC12
RC12:
'Comment on hess-2022-96', Anonymous Referee #12, 18 May 2022

In this paper, the authors proposed a machine learning method to predict soil moisture, evapotranspiration and potential evapotranspiration from precipitation forecast. Based on this statistical forecast, the real-time prediction of needed irrigation can be achieved.

General comments:

The topic of this paper is suitable for HESS. However, I believe there are some fatal flaws in this paper. I recommend that the editor reject this paper.

First, it is unclear to me whether their statistical model adds significant value for predicting future conditions. Since they found no significant relationship between P and PET (or ET), I guess that the initial conditions of ET and PET are their major source of predictability (although the authors did not clarify this point). In this case, the authors may be able to replace their prediction of ET and PET with a persistence model. I think Figure 7 also implies that a persistence model is effective for predicting them and that it is not strictly necessary to predict the dynamic change in ET and PET. In addition, the temporal change in soil moisture is also important, but the skill of their model in predicting soil moisture is not actually good according to Figure 9. Although the precipitation prediction contributes significantly to the prediction of needed irrigation through equation (1), the precipitation prediction comes from existing data and is not a contribution of this work. In summary, without more detailed comparisons between their prediction and benchmarks such as a persistence model, I cannot be convinced that the authors’ statistical model really provides added value.

Second, it is unclear to me how this work contributes to estimating crop water conditions at farm scale, since it relies fully on satellite observations with coarse grid sizes. Specifically, the original footprint of SMAP is approximately 50 km, which is clearly not farm scale. Although it might be possible to integrate local information into the authors’ proposed framework, I could not find any contribution to farm-scale water resource management in the present work.
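
The persistence benchmark requested in the first general comment can be illustrated with a minimal sketch. It assumes a regularly spaced series (for example 8-day composites of ET or PET) and a model forecast aligned with the observations; the function names and the RMSE-based skill score are illustrative choices, not taken from the manuscript.

```python
import numpy as np

def rmse(obs, pred):
    """Root-mean-square error between two equally long sequences."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def skill_vs_persistence(obs, model_pred, lead=1):
    """Skill score 1 - RMSE_model / RMSE_persistence.

    obs[t] is the observed value at step t and model_pred[t] is the
    model forecast valid at step t (hypothetical alignment).
    The persistence forecast for step t is simply obs[t - lead].
    A score above 0 means the model beats persistence; 0 means no gain.
    Undefined if persistence is perfect (RMSE_persistence == 0).
    """
    obs = np.asarray(obs, dtype=float)
    model_pred = np.asarray(model_pred, dtype=float)
    target = obs[lead:]          # values that can be forecast by persistence
    baseline = obs[:-lead]       # last known value, carried forward
    return 1.0 - rmse(target, model_pred[lead:]) / rmse(target, baseline)
```

A score close to zero for ET and PET would support the reviewer's suspicion that the initial condition carries most of the predictability.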

Specific comments:

Major points:

L165-167: This description is a bit ambiguous to me. I believe that the authors used the forecast of P. Please state explicitly that the P prediction was used in this paper.

L171-175: It is necessary to show more clearly the advantage of integrating the local information into the model. I am not convinced that this information contributes to water resource management, since the prediction itself is not provided at farm scale.

L310: In Figures 7 and 8, please show the prediction of SM and RZSM in addition to ET and PET.

L320: KGE of soil moisture is not included in Table 4. Why? Please include it.
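
For reference, the three scores discussed for Table 4 can be computed together for any variable, including soil moisture. The sketch below uses the original Kling-Gupta efficiency formulation (correlation, variability ratio, bias ratio) and takes R2 as the squared Pearson correlation; both are assumptions, since the exact variants used in the manuscript are not stated in this discussion.

```python
import numpy as np

def forecast_scores(obs, sim):
    """Return KGE, R2 and RMSE for observed and simulated series.

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), with
    r     = Pearson correlation,
    alpha = std(sim) / std(obs)   (variability ratio),
    beta  = mean(sim) / mean(obs) (bias ratio).
    Requires a non-constant obs with non-zero mean.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    kge = 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    rmse = np.sqrt(np.mean((sim - obs) ** 2))
    # R2 here is the squared correlation (an assumption about the definition).
    return {"KGE": float(kge), "R2": float(r ** 2), "RMSE": float(rmse)}
```

Reporting all three per variable in one table row makes it easy to see cases where the correlation is high but the bias or variability term lowers the KGE.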

Minor points:

L167: Please remove “?”.

L181: “P” appears twice.

Citation: https://doi.org/10.5194/hess-2022-96-RC12
- AC13: 'Reply on RC12', Sara Sadri, 23 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC13
CC1:
'Comment on hess-2022-96', Nils-Otto Kitterød, 18 May 2022

The authors address an important issue, and the topic deserves more attention in the hydrological community. The article is generally well written, and I recommend that the editor publish the manuscript after some minor corrections and clarifications. I have not worked through the other comments on this article, and I expect there will be some overlapping viewpoints and suggestions. Hence, my independent comments follow below.
General comments:
The manuscript would benefit from a clearer distinction between modelling results, discussion, and conclusions. As it is now, the conclusion sounds more like a discussion. For conclusions in a scientific paper, you are not supposed to “… speculate …” (line 341), or “.. expect …” (line 343). If the results are unclear or uncertain, this should be elaborated in the discussion. The phrase “FarmCan can become a promising tool …” (line 352) also belongs in the discussion. What are the necessary requirements for FarmCan to be successful? Or why might FarmCan fail to meet the requirements? What are the critical factors? Everything written in the conclusions should be evident from the results and the discussion. The abstract should be consistent with the conclusions. The current abstract states that “FarmCan is a promising tool for use in any region of the world …”, but this is not the same as what is written in the conclusions.
As far as I can see, the work is based on estimated data derived from remote sensing (Tab.2, line 131). The selected farms were parts of a network of field stations for soil moisture observations (line 82-86), but it is not clear to me whether validation data is empirical observations taken from the selected farms or extracted from remote sensing data. Is empirical data included in Fig.6 (for example) or not? If not, it should be written explicitly in the text and explained why, and also elaborated in the discussion.
I would also expect some kind of cross-validation exercise in a study like this. As far as I understand, this is in principle done in Figs. 7 and 8, where simulation results after the given date (2020/07/02) are illustrated. But I cannot see how these results fit the observations. The cross-validation results are easier to understand in Fig. 9, but the results need a more thorough discussion.
In general, I would also recommend the authors to elaborate the main results in the figure texts. What specific results (or features) should the reader appreciate in the figure?
The definition of Potential Evapotranspiration (PET) needs clarification to avoid ambiguities. FAO recommends to substitute ‘PET’ with ‘Reference Evapotranspiration’, which I support. In this case, the authors use pre-calculated data, but the term should be defined explicitly so that others can reproduce the results.

Specific comments:
Equation 1 (line 110) needs some clarification. Why is NI approximately equal to the right-hand side of the equation? I understand it is a sum over 8 days, but that should also be evident from the equation. I guess the sum also goes over ΔSM? That also needs to be written explicitly. I would also appreciate it if ΔSM were defined explicitly, because a negative ΔSM will increase NI.
In Tab. 1 it is written that the unit of the (P/PET) ratio is mm/day, but this is (probably?) a misprint. The units of P and PET are [Length/Time], thus the ratio must be unitless [-].
Figure 6. The first y-axis label says ΔSM and ΔRZSM. The legend says SM and RZSM, the figure text says ΔSM and ΔRZSM, please correct the legend. PET looks more like “teal green” than P, which is green bars (same color as in Fig.7).
Figure 5. What is the “take away” message of this figure? It’s said that ΔSM is plotted against 8-days P, but it is also data for ΔRZSM. Please, also consider elaborating the figure text (c.f. general comments above).
Figure 7. Why does past NI decrease from approx. 9 mm/d to approx. 1 mm/d from June 25th to 26th?
There is a misprint in line 250, “0.5x1000”; I think the correct phrase is “0.2x1000”.
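
One possible explicit reading of Equation 1, following the first specific comment above, is sketched here. It assumes the water-balance form NI = ET - P - ΔSM accumulated over one 8-day window, with ΔSM defined as soil moisture at the end of the window minus soil moisture at the start, and with all terms expressed as water depth in mm. The function and argument names are hypothetical and not taken from the manuscript.

```python
def needed_irrigation(et_daily, p_daily, sm_start, sm_end):
    """Needed irrigation (mm) over one window, assuming NI = sum(ET) - sum(P) - dSM.

    et_daily, p_daily : daily ET and precipitation (mm/day) for the window
    sm_start, sm_end  : soil water storage (mm) at the start and end of the window
    dSM = sm_end - sm_start, so a drying soil (negative dSM) increases NI.
    """
    delta_sm = sm_end - sm_start
    return sum(et_daily) - sum(p_daily) - delta_sm

# Eight days with ET = 4 mm/day, P = 1 mm/day, and 10 mm of soil water lost.
ni = needed_irrigation([4] * 8, [1] * 8, sm_start=100, sm_end=90)
```

Under this definition, summing daily soil moisture changes over the window gives the same ΔSM as the end-minus-start difference, so the question of whether the sum "goes over" ΔSM does not change the result.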

Citation: https://doi.org/10.5194/hess-2022-96-CC1
- AC14: 'Reply on CC1', Sara Sadri, 24 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC14
CC2:
'Comment on hess-2022-96', Panayiotis Dimitriadis, 22 May 2022

Dear Authors,
Apologies for not sending my review on time. Please see three minor and four major comments below to initiate a fruitful discussion:
Major comments:
1) It is mentioned in the text that the key processes of the FarmCan model are P, ET, PET, SM and RZSM. It is also mentioned that PET and SM are the key climatic variables linking the water cycle, and that the total energy of ET is more dependent on SM. Please mention that the most important process linking atmospheric water to surface water is humidity (and all the related quantities, such as specific humidity, dew point, etc.). This process is often either misused or forgotten; however, it is the main link that drives the water cycle (please see details and importance in a recent global analysis in https://hess.copernicus.org/articles/24/3899/2020/). Also, please consider mentioning how the FarmCan model takes changes in humidity into account, or whether it solely predicts precipitation.
2) It is mentioned in the analysis that daily precipitation is predicted based on the Multi-Source Weighted-Ensemble Precipitation (MSWEP). Please mention that while these meteorological models are powerful in predicting changes in temperature, they often perform very poorly for precipitation (for example, see the discussion, references and examples in https://www.tandfonline.com/doi/full/10.1080/02626667.2010.513518).
3) The so-called Hurst phenomenon (https://ascelibrary.org/doi/10.1061/TACEAT.0006518; a power-law autocorrelation function across lags and scales, as compared to the zero autocorrelation of white noise) seems not to be taken into account in the analysis. This phenomenon (also known as long-term persistence or long-range dependence) is found in all key hydrological-cycle processes, including the ones used by the authors (see review, references, and results in https://www.mdpi.com/2306-5338/8/2/59). The Hurst phenomenon has been shown to explain a vast portion of the variability observed in these hydrological-cycle processes. Its existence is one of the reasons that it is difficult (or even impossible) to predict the value of a hydrometeorological process beyond a specific time window (also called the time window of predictability; https://www.tandfonline.com/doi/full/10.1080/02626667.2015.1034128). For example, in this work, the authors propose a 14-day window. Finally, please note that the authors have probably not identified this phenomenon because they use only 5 years of data, whereas the impact of the Hurst phenomenon appears at long-term scales (e.g., more than 10-30 years). Therefore, it is expected that a predictive model that does not take it into account would, in the long run, end up underestimating the correlation of precipitation, evapotranspiration, etc.
4) Besides the Hurst phenomenon, which is responsible for the long-term autocorrelation function of each hydrometeorological process, there is also the short-term autocorrelation structure, which is far from zero (zero being the case for independent variables). In the analysis, the authors mention that their random forest method can de-correlate the trees and tackle the 'noise' sensitivity of the prediction. However, please note that even without the existence and impact of the Hurst phenomenon, a strong short-term autocorrelation function (i.e., at small lags and scales) cannot easily be removed by non-linear transformations. Therefore, the appearance of 'noise' is probably due to this effect, since all the processes used by the authors in FarmCan (e.g., precipitation, evapotranspiration, PET, etc.) are shown to have a strong short-term autocorrelation function (for example, in https://www.mdpi.com/2306-5338/8/4/177/htm, Figure 12, even after a 10-month period the correlation function of PET, as expressed through the climacogram, exhibits a value of more than 0.5). Please consider estimating the autocorrelation functions (for several lags) of all processes included in FarmCan, so that more light is shed on their impact on the predicted values and this issue can be discussed further.
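
The autocorrelation estimate suggested in major comment 4 could be obtained along the following lines. This is a minimal sketch of the standard sample autocorrelation function for a single regularly spaced series; the function name is illustrative, and applying it to each FarmCan process (P, ET, PET, SM, RZSM) is left to the reader.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (max_lag < len(series)).

    Uses the common biased estimator: the lag-k covariance is divided by
    the lag-0 sum of squares, so the value at lag 0 is exactly 1.
    The series must not be constant.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                      # remove the sample mean
    denom = np.dot(x, x)                  # lag-0 sum of squares
    n = len(x)
    return [float(np.dot(x[: n - k], x[k:]) / denom) for k in range(max_lag + 1)]
```

Values that stay well above zero over many lags would indicate the kind of short-term dependence the comment describes; detecting long-range (Hurst-type) behaviour would need a much longer record than five years.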
Minor comments:
1) In the Introduction, the water-food nexus is mentioned as an important impact of climatic variability; however, the water-food-energy nexus is more appropriate in my opinion (there are many works in literature about this triangle; see for example discussion in a recent one: https://www.mdpi.com/2673-4060/2/2/11/htm).
2) It is mentioned that the FarmCan model (ii) establishes a methodology to forecast PET, SM, and RZSM using P prediction. How about ET? Also, how is it possible to derive the SM and RZSM values from the precipitation prediction? These two points are not very clear to me in the text; please consider giving more information.
3) In the text it is mentioned that the assumption of an evenly distributed soil moisture across depth is used. Please consider giving some examples of how this assumption may affect the result and validity of the FarmCan prediction.
4) Please consider replacing (for the P, PET, ET, RZSM, and SM) 'variables' with 'processes', since all these processes are found to have strong auto-correlation structures and therefore, they cannot be mentioned as stochastic variables but rather as stochastic processes (the word 'variables' is used when there is absence of correlation, i.e., a white-noise behaviour; please see definitions and discussion in http://www.itia.ntua.gr/en/docinfo/2000/).
Sincerely,
Panayiotis Dimitriadis

Citation: https://doi.org/10.5194/hess-2022-96-CC2
- AC15: 'Reply on CC2', Sara Sadri, 24 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC15
- AC16: 'Reply on CC2', Sara Sadri, 24 Sep 2022
  
  please see the collated response under Reply to EC1.
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC16
EC1:
'Editor comment', Daniel Green, 08 Jun 2022

Dear authors,
This paper presents a study of using a machine learning framework, “FarmCan”, to forecast irrigation demand in 4 farms in Canada. The authors find that soil moisture shows a strong correlation with precipitation and that ET and PET are effective predictors of NI. The study shows the potential of using machine learning models to improve the timing of irrigation and therefore to save water and achieve sustainable agricultural production.
The Editors would like to acknowledge the efforts of all the reviewers who have commented on this manuscript. Due to an editorial issue during the reviewer invitation phase, we have received 12 reviewer comments on the submitted manuscript and an additional two community comments (14 reports to address in total). Invitations for review were sent out to a large number of reviewers, and an unusually large number of review requests were accepted. Although this highlights the novelty and importance of the work undertaken, it is unrealistic to expect you to address all of these comments in turn.
In light of the editorial issues, I recommend that the Authors reply individually to three Reviewer comments (RC 1, RC 2, RC 3), which request relatively minor revisions. Further, I recommend that the Authors also reply to this Editor comment addressing the specific and general comments which have been summarised and taken from the remaining reviewer comments below (divided into general and specific comments, as well as those relating to the figures/tables). Please see attached document.
Generally, most reviewers recommend that a revised version should be accepted after minor revisions (with the exception of Reviewers 9 and 12; RC9 recommends work due to some methodological shortcomings, and RC12 recommends rejection of the paper as they do not recognise the added value for predicting future conditions).
Once again, we apologise for the unusually large number of reviewer comments received, but we hope this solution helps in responding to all of the reviewers’ comments whilst recognising the inputs from all reviewers. Thank you again for submitting your manuscript to this Special Issue in HESS and we look forward to receiving your response.
Kind regards,
Dr. Daniel Green

Citation: https://doi.org/10.5194/hess-2022-96-EC1
- AC1:
  'Reply on EC1', Sara Sadri, 21 Aug 2022
  Dear Dr. Green,
  
  First, we thank you for handling our manuscript, Hess-2022-96. We hereby respond to the reviewers' comments in green font below. We responded to the first three of the twelve reviews, as instructed. Additionally, we responded to the other reviewers' specific and general statements you summarized. The main changes are as follows:
  
  added information to the maps showing the scale and north sign (Figure 1), broke Table 1 into two tables, showed NI values in the pair correlation analysis (Figure 4), enhanced the font and clarity (Figures 5 and 9), corrected the legends (Figure 6), redid the analysis for Table 3 and Figure 9, and added a list of acronyms at the end of the manuscript;
  
  improved the explanations of several caveats and limitations;
  
  added citations to several relevant studies;
  
  clarified the equations and the results in a more consistent manner; and
  
  improved the introduction of FarmCan and the contributions of the algorithm.
  
  Sincerely,
  
  Sara Sadri (on behalf of all co-authors)
  
  Citation: https://doi.org/10.5194/hess-2022-96-AC1

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Publish as is (04 Oct 2022) by Daniel Green

ED: Publish as is (04 Oct 2022) by Louise Slater (Executive editor)

AR by Sara Sadri on behalf of the Authors (14 Oct 2022) Manuscript

Short summary

A farm-scale hydroclimatic machine learning framework to advise farmers was developed. FarmCan uses remote sensing data and farmers' input to forecast crop water deficits. The 8 d composite variables are better than daily ones for forecasting water deficit. Evapotranspiration (ET) and potential ET are more effective than soil moisture at predicting crop water deficit. FarmCan uses a crop-specific schedule to use surface or root zone soil moisture.