Advancing flow duration curve prediction in ungauged basins using machine learning and deep learning

Yi, Sooyeon; Yoon, Jeongin; Lee, Chulhee; Lee, Seonmi; Ji, Jungwon; Lee, Eunkyung; Yi, Jaeeung

doi:10.5194/hess-2024-355

Preprints

21 Jan 2025

Status: this preprint has been withdrawn by the authors.

Abstract. The flow duration curve (FDC) represents the distribution of streamflow, providing vital information for managing river systems. Constructing FDC is especially challenging in ungauged basins where streamflow data are lacking. This study addresses key gaps by utilizing machine learning and deep learning models to predict FDC in ungauged basins. The objectives include: (a) identifying influential hydrologic, meteorological, and topographic factors, (b) evaluating various combinations of predictor variables, (c) assessing the effects of different precipitation metrics on flow predictions, and (d) comparing ML and DL model performance. We developed and evaluated random forest (RF), deep neural network (DNN), support vector regression (SVR), and elastic net regression (ENR) models using historical data from 140 streamflow stations. Feature importance analysis revealed that watershed area and precipitation were the key factors for high discharge percentiles, whereas land use and basin characteristics gained greater importance for medium and low flows. Scenario analysis showed that combining all variables yielded the highest accuracy in predicting FDC. Different precipitation metrics had minimal impact on streamflow predictions, indicating that other factors played a more significant role. The DNN outperformed RF, SVR, and ENR in predicting low (Q₉₅), medium (Q₅₀), and high flows (Q₅), achieving an average coefficient of determination that was 8.03 % higher, a root mean square error that was 227.4 % lower on average, and a standard deviation that was 46.4 % lower. This study demonstrates the effectiveness of advanced ML and DL approaches for predicting FDC in ungauged basins, offering a foundation for advancing hydrological prediction.

Received: 12 Nov 2024 – Discussion started: 21 Jan 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Interactive discussion

Status: closed

RC1: 'Comment on hess-2024-355', Anonymous Referee #1, 05 Mar 2025

The manuscript “Advancing flow duration curve prediction in ungauged basins using machine learning and deep learning” by Yi et al. discusses different approaches to predict the quantiles of the flow duration curve (FDC) at a generic ungauged basin. All these approaches belong to the family of machine learning models.
The topic of predicting FDCs in ungauged basins has attracted great interest over the last decade, but it remains a challenge and further research is needed, especially towards a robust use of machine learning methods (e.g., replicability, calibration on small datasets, analysis of uncertainty).
Despite this context, the manuscript lacks a clear advance in the field and, more importantly, it has several methodological flaws, as described below. My suggestion is thus to reject the paper.

General comment
The manuscript shows a very simplified application of a regionalisation approach, where none of the steps are well described: from data selection, to model fitting and model validation, as described in more detail in the main comments. Not even a comparison with more traditional regionalisation approaches is presented to support the claim in the title of "advancing" flow duration curve prediction.
Last but not least, the paper does not provide any practical information on how to pragmatically apply the model to an ungauged catchment, thus making the procedure effectively inapplicable.

Major comments
Starting from the point list on page 3 (points b and c), and throughout the text, the analysis considers the precipitation characteristics of the basin separately from the other descriptors (area, LULC, slope and elevation). Clearly, precipitation is one of the most important characteristics, but there is no reason to treat it separately. In a data-driven procedure such as a machine learning approach (and the same holds for simpler models such as multiple regression), the model should be able to select precipitation when it is relevant and discard it when it is not.

Whatever the model adopted, the procedure is based on the prediction of different discharge percentiles (see e.g. P7 L128, or table 4, or figure 6). This choice poses the problem of the congruence of the various estimates: as the FDC is a non-increasing function, the percentiles should be jointly estimated or their congruence (Q95 <= Q90 <= Q80 … <= Q5) should be checked/constrained.
In the literature, other approaches that do not have this problem are predictions made by computing the parameters or the moments of a distribution function that represents the FDC.
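
The congruence check described above can be illustrated with a minimal sketch. It assumes the quantiles are predicted independently and ordered by increasing exceedance probability; the values and the function names `is_non_increasing` and `enforce_non_increasing` are hypothetical and do not come from the manuscript. The running-minimum repair is only one simple post-hoc option, not a substitute for jointly estimating the quantiles.

```python
from itertools import accumulate

# Exceedance probabilities (%) in increasing order; chosen for illustration only.
EXCEEDANCE = [5, 10, 20, 50, 80, 90, 95]

def is_non_increasing(flows):
    """True if every quantile is no larger than the one before it (Q5 >= ... >= Q95)."""
    return all(a >= b for a, b in zip(flows, flows[1:]))

def enforce_non_increasing(flows):
    """Cap each quantile at the running minimum of the quantiles before it."""
    return list(accumulate(flows, min))

# Hypothetical independent predictions (m3/s) in which Q20 > Q10 and Q95 > Q90.
predicted = [120.0, 80.0, 85.0, 30.0, 12.0, 6.0, 7.5]
repaired = enforce_non_increasing(predicted)

for p, raw, fixed in zip(EXCEEDANCE, predicted, repaired):
    print(f"Q{p}: predicted={raw}, repaired={fixed}")
```

A check of this kind only detects or masks the inconsistency; fitting the parameters of a distribution that represents the FDC, as mentioned above, avoids it by construction.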

The authors develop four scenarios (Section 3.4) where different combinations of predictors are used to feed the models. These scenarios are extremely simplified and reduce to whether or not the LULC and slope-elevation basin descriptors are used. It is surprising that a framework based on machine learning techniques relies on such a small number of basin characteristics and that the scenarios (i.e., pre-selection of the characteristics accounted for by the model) are defined arbitrarily. I would have expected that i) the algorithm is fed with a very large set of basin characteristics and ii) the algorithm automatically selects the most useful subset of characteristics. This approach is typical of regionalization methods developed in the past decades, where more traditional approaches to select subsets of descriptors were used (e.g., stepwise regression, multicollinearity tests, etc.).

P15 L304: the authors state that the model with all the independent variables performs best. This is expected because the number of parameters is higher and the fit is better, but the model could be subject to overfitting. This issue is not mentioned by the authors and does not seem to be investigated in the paper. For instance, table 6 (summary of fitting performance of the models) should compare performance indicators between calibration and validation.
Moreover, validation has been done (section 5.1) introducing a new station not included in the original calibration set. While this is not incorrect, it is common practice in hydrological analysis, due to the limited number of available data, to perform a leave-one-out cross-validation that allows one to test the model performances in a more comprehensive way.
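
A minimal sketch of the leave-one-out cross-validation mentioned above follows. It deliberately swaps in a one-predictor least-squares line for the manuscript's models, and the station values, variable names and function names are hypothetical. The point is only the procedure: each station is withheld in turn, the model is refitted on the remaining stations, and the withheld station is predicted.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def fit_rmse(x, y):
    """Calibration RMSE: fit on all stations and evaluate on the same stations."""
    a, b = fit_line(x, y)
    residuals = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    return math.sqrt(sum(residuals) / len(residuals))

def loo_rmse(x, y):
    """Leave-one-out RMSE: refit without station i, then predict station i."""
    squared_errors = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - (a + b * x[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical stations: log10(area in km2) and log10(Q50 in m3/s).
log_area = [1.8, 2.1, 2.4, 2.6, 2.9, 3.2, 3.5]
log_q50 = [0.1, 0.5, 0.6, 1.0, 1.1, 1.6, 1.7]

calibration_rmse = fit_rmse(log_area, log_q50)
validation_rmse = loo_rmse(log_area, log_q50)
print(f"calibration RMSE={calibration_rmse:.3f}, leave-one-out RMSE={validation_rmse:.3f}")
```

Reporting both numbers side by side is one way to make the calibration versus validation comparison requested for table 6; a large gap between them would point to overfitting.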

Minor comments
In the data collection and preprocessing section, there is no mention of whether an “annual FDC” or a “total FDC” is used. Most of the cited literature describes well the two methods for obtaining sample FDCs.
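
The difference between the two kinds of sample FDC can be sketched as follows, assuming the common definitions: a total (period-of-record) FDC ranks all flows pooled over the record, whereas an annual FDC is built for each year and then summarised across years, here by the median. The flows and the function names are hypothetical, and the simple rank-based quantile is only one of several possible plotting conventions.

```python
from statistics import median

def flow_exceeded(flows, p):
    """Flow equalled or exceeded in at least p percent of the records (integer p)."""
    ranked = sorted(flows, reverse=True)
    # ceil(p * n / 100) in integer arithmetic, never below rank 1.
    rank = max(1, (p * len(ranked) + 99) // 100)
    return ranked[rank - 1]

def total_fdc(flows_by_year, p):
    """Period-of-record quantile: pool every year, then rank once."""
    pooled = [q for flows in flows_by_year.values() for q in flows]
    return flow_exceeded(pooled, p)

def annual_fdc(flows_by_year, p):
    """Median annual quantile: rank within each year, then take the median over years."""
    return median(flow_exceeded(flows, p) for flows in flows_by_year.values())

# Hypothetical record with four flows per year (m3/s), kept tiny for readability.
flows_by_year = {
    2001: [10.0, 8.0, 3.0, 1.0],
    2002: [20.0, 6.0, 4.0, 2.0],
    2003: [5.0, 4.0, 2.0, 1.0],
}

print("total Q50:", total_fdc(flows_by_year, 50))
print("median annual Q50:", annual_fdc(flows_by_year, 50))
```

Even on this toy record the two definitions give different Q50 values, which is why the choice should be stated in the data section.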

About the applicability of the model, the paper does not provide practical information on the model usage; the selected model parameterization should be presented to make the procedure applicable at an ungauged basin.

Section 4.1 describes a regression-based screening to highlight which basin characteristics have the greatest impact on each quantile of the FDC. However, the results of this investigation are not used anywhere in the study, and it is not clear whether they are useful for the application.

In the validation section (5.1) seven other basins are mentioned for a comparison of results. Although these basins have similar areas, discharge values are not normalized, making the comparison very qualitative.
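
One common way to normalize discharge for such a comparison is to divide by drainage area, or to express it as an equivalent runoff depth. The sketch below assumes discharge in m3/s and area in km2; the function names and the example values are hypothetical.

```python
def specific_discharge(q_m3s, area_km2):
    """Discharge per unit drainage area, in m3/s per km2."""
    return q_m3s / area_km2

def runoff_depth_mm_per_day(q_m3s, area_km2):
    """Equivalent runoff depth over the basin, converting m3/s to mm/day."""
    seconds_per_day = 86_400
    m2_per_km2 = 1_000_000
    mm_per_m = 1_000
    return q_m3s * seconds_per_day / (area_km2 * m2_per_km2) * mm_per_m

# Hypothetical basin: 10 m3/s draining 864 km2.
print(specific_discharge(10.0, 864.0))
print(runoff_depth_mm_per_day(10.0, 864.0))
```

Quantiles expressed in mm/day can be compared directly across basins of different size, and against precipitation in the same units.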

Typos
Table 1: unit of “average precipitation” is missing

Citation: https://doi.org/10.5194/hess-2024-355-RC1

RC2: 'Comment on hess-2024-355', Anonymous Referee #2, 04 Jun 2025

The present work presents a methodology to predict the flow duration curve (FDC) in ungauged basins employing machine learning algorithms, including deep learning methods. The work is complemented by a feature importance analysis to understand which climate and catchment properties are the most relevant towards satisfactory predictions of the FDC.
It is my opinion that the work should be rejected. Below is a list of comments in support of the latter.
Comment 1
Section 3.3 regarding the machine learning algorithm is at a very low level in terms of the presentation of (at least the fundamental) mathematical aspects of the diverse models considered.
Comment 2
Formula (3): to me this is the standard deviation of the available observations, and it is not a metric for assessing the goodness of model predictions. Moreover, after eq. (2) the Authors write: ‘to calculate the standard deviation, which measures the amount of variation or dispersion of a set of values, you can use a similar approach to the RMSE formula you provided’. This is clear evidence of the misuse of generative AI by the Authors, which is apparent throughout the whole work!
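
The distinction drawn in Comment 2 can be made concrete with a small sketch. It assumes the textbook definitions of RMSE and of the population standard deviation, not the manuscript's exact formulas, and all values are hypothetical. The standard deviation of the observations is the same whichever model is evaluated, whereas the RMSE changes with the predictions.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def std_dev(values):
    """Population standard deviation of a single set of values."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Hypothetical observations and two hypothetical sets of predictions.
observed = [2.0, 4.0, 6.0, 8.0]
good_model = [2.1, 3.9, 6.2, 7.8]
poor_model = [8.0, 6.0, 4.0, 2.0]

print("std of observations:", std_dev(observed))
print("RMSE, good model:", rmse(observed, good_model))
print("RMSE, poor model:", rmse(observed, poor_model))
```

In this toy case the poor model has exactly the same spread as the observations, so a standard deviation alone cannot separate it from the good one.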
Comment 3
Section 4.1: After introducing machine learning algorithms, the Authors evaluate the relevance of diverse catchment and weather features on the basis of a simple linear regression coefficient analysis between the former and diverse quantiles of the FDC. Why not ground the feature relevance in the machine learning algorithm itself?

Moreover, at lines 291-293 the Authors state that watershed area and precipitation exhibit the highest relevance for high discharge percentiles. This is in contrast with results in table 5 where the elevation and slope features are consistently ranked as the second most relevant features.
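
On the first point of Comment 3, one model-agnostic way to ground feature relevance in a fitted model is permutation importance: shuffle one predictor at a time and record how much the prediction error grows. The sketch below is an illustration under stated assumptions, not the procedure used in the manuscript. The function `predict` is a hypothetical stand-in for any fitted model, and the three columns (area, slope and an unused descriptor) and their values are invented.

```python
import random

def mse(rows, targets, predict):
    """Mean squared error of predict over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, predict, n_repeats=5, seed=0):
    """Mean increase in MSE when each column is shuffled, one column at a time."""
    rng = random.Random(seed)
    baseline = mse(rows, targets, predict)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(shuffled, targets, predict) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def predict(row):
    """Hypothetical fitted model: strong area effect, weak slope effect, third column ignored."""
    return 10.0 * row[0] + 0.1 * row[1]

# Hypothetical basins: [area, slope, unused descriptor].
rows = [[float(i), (i % 3) / 2.0, float(8 - i)] for i in range(1, 9)]
targets = [predict(r) for r in rows]

importance = permutation_importance(rows, targets, predict)
print(importance)
```

Because the ranking comes from the same model that makes the predictions, it can be checked directly against statements such as the one at lines 291-293.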
Comment 4
Section 4.4: the Authors introduce a Taylor diagram to visualize the quality of the model results. There is no clear explanation of how to read and interpret such a diagram, or of what the ‘reference point’ means in this context.
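
For readers unfamiliar with the diagram, the statistics it summarises can be sketched as follows, assuming the standard construction. Each model is placed at a radius equal to the standard deviation of its predictions and at an angle given by its correlation with the observations. The reference point represents the observations themselves (correlation 1, their own standard deviation), and the distance from a model to that point equals the centred RMS difference. The data and the function name are hypothetical.

```python
import math

def taylor_statistics(reference, model):
    """Return (std of reference, std of model, correlation, centred RMS difference)."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_m = sum(model) / n
    dev_r = [r - mean_r for r in reference]
    dev_m = [m - mean_m for m in model]
    std_r = math.sqrt(sum(d * d for d in dev_r) / n)
    std_m = math.sqrt(sum(d * d for d in dev_m) / n)
    corr = sum(a * b for a, b in zip(dev_r, dev_m)) / (n * std_r * std_m)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_r, dev_m)) / n)
    return std_r, std_m, corr, crmsd

# Hypothetical observed and predicted flows (m3/s).
observed = [1.0, 3.0, 5.0, 7.0]
predicted = [2.0, 2.5, 6.0, 6.5]

std_r, std_m, corr, crmsd = taylor_statistics(observed, predicted)
# Law-of-cosines relation that makes the two-dimensional diagram possible.
identity_rhs = std_r ** 2 + std_m ** 2 - 2.0 * std_r * std_m * corr
print(std_r, std_m, corr, crmsd, identity_rhs)
```

The closer a model plots to the reference point, the better it reproduces both the variability and the pattern of the observations. The diagram does not show bias, because all three statistics are computed after removing the means.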

Citation: https://doi.org/10.5194/hess-2024-355-RC2

Viewed

Total article views: 1,054 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
826	196	32	1,054	44	58

Views and downloads (calculated since 21 Jan 2025)

Month	HTML	PDF	XML	Total
Jan 2025	85	13	4	102
Feb 2025	39	10	1	50
Mar 2025	30	6	3	39
Apr 2025	16	6	2	24
May 2025	18	6	2	26
Jun 2025	62	8	6	76
Jul 2025	40	4	0	44
Aug 2025	70	8	0	78
Sep 2025	259	14	0	273
Oct 2025	24	6	0	30
Nov 2025	24	30	0	54
Dec 2025	29	13	4	46
Jan 2026	44	25	6	75
Feb 2026	65	40	3	108
Mar 2026	21	7	1	29

Viewed (geographical distribution)

Total article views: 1,018 (including HTML, PDF, and XML) Thereof 1,018 with geography defined and 0 with unknown origin.

Latest update: 10 Mar 2026

Short summary

Our study explores how advanced machine learning and deep learning models can predict river flow patterns in areas lacking direct measurements. We combined several data types like rainfall, land use, and topography to improve accuracy. The results show that our methods can effectively estimate river flow, which is crucial for water management and preparing for floods and droughts, especially in regions with limited data. This work could lead to better decision-making in managing water resources.

