Integrated Catchment Classification Across China Based on Hydroclimatological and Geomorphological Similarities Using Self-Organizing Maps and Fuzzy C-Means Clustering for Hydrological Modeling

Niu, Jiefan; Zhang, Ke; Li, Xi; Bao, Hongjun

doi:https://doi.org/10.5194/hess-2024-304

Preprints

28 Nov 2024

Status: this preprint is currently under review for the journal HESS.

Abstract. Accurately identifying similar catchments is crucial for transferring model parameters and improving hydrological modeling, especially in ungauged regions with varied climates and topographies. This study presents an integrated method for catchment classification by combining Self-Organizing Maps artificial neural network (SOM) and Fuzzy C-Means clustering (FCM), utilizing hydrometeorological and geomorphological data. We evaluated six climate indices and fifteen landscape characteristics for catchments across China, identifying key variables through correlation and principal component analyses. The optimal classification produced six distinct climate regions and 35 catchment types with unique streamflow patterns. Validation using ten catchments confirmed the effectiveness of the SOM-FCM approach. The study underscores the importance of considering both climate and landscape factors for a comprehensive classification of catchments, offering valuable insights for hydrological model predictions in ungauged areas and enhancing our understanding of hydrological processes at various timescales.

Received: 04 Oct 2024 – Discussion started: 28 Nov 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 4600 KB)

Supplement (857 KB)

Status: open (extended)

CC1: 'Comment on hess-2024-304', Huan Xu, 08 Feb 2025

I'm interested in catchment classification, so I took a look at this paper. It utilizes SOM and FCM to classify catchments in China based on climate and geomorphic attributes. The paper validates its findings using small watersheds, and the results are quite interesting. However, there are some parts I didn't fully understand, and I have a few suggestions for the author to consider.

Table 3
Why are the attributes in Table 3 selected based on the coefficient of variation?

Figure 2
It is suggested to include an explanation of the d-matrices in the methodology. Consider moving the statement “Vesanto (1999) suggested that SOM results can be expressed in the form of two types…” from L279 to section 2.1.2 and expand on it in more detail.

Figure 3
Consider adjusting the color band so that the color corresponding to 0.5 is set to white. This would better highlight basins belonging to a cluster with higher confidence.

Figure 6
It is recommended to use different color schemes for the third and fourth categories, as their current colors are too similar and not effective.

Figure 7
Consider clearly marking the boundaries of each climate zone in the figure and labeling the basin class in the subplots.

Introduction
The paper overlooks previous catchment classification studies conducted in China:
Luo, K. (1954) Draft of natural geography regionalization of China. (in Chinese)
罗开富,1954. 中国水文区划草案.

Xiong, Y., Zhang, J., et al. (1995) Hydrology Regionalization of China, Science Press, Beijing. (in Chinese)
熊怡,张家桢,等,1995. 中国水文区划. 科学出版社

Liu, C., Zhou, C., et al. (2014) Chinese Hydrological Geography, Science Press, Beijing
刘昌明，周成虎等，2014. 中国水文地理. 科学出版社

Xu, H., Wang, H., Liu, P. (2024). Identifying control factors of hydrological behavior through catchment classification in Mainland of China. Journal of Hydrology, 645, 132206. DOI: 10.1016/j.jhydrol.2024.132206

Methodology
In L180, the author claims FCM has “low sensitivity to initialization.” I am curious if this is the case, and it might be beneficial to demonstrate FCM results under multiple initializations.
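
One way to check the initialization question above is to rerun FCM from several random starts and compare the converged objective values and cluster centres. The sketch below is illustrative only: it is a plain NumPy implementation of standard FCM with fuzzifier m = 2 on synthetic two-dimensional data, not the authors' code, data, or settings, and the function name `fuzzy_c_means` is hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Standard FCM from a random membership start (illustrative sketch).

    Returns (centers, memberships, objective).
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1

    def update_centers(U):
        W = U ** m
        return (W.T @ X) / W.sum(axis=0)[:, None]

    def squared_distances(centers):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.maximum(d2, 1e-12)  # guard against division by zero

    for _ in range(max_iter):
        d2 = squared_distances(update_centers(U))
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break

    centers = update_centers(U)
    objective = float(((U ** m) * squared_distances(centers)).sum())
    return centers, U, objective

# Synthetic data: three hypothetical, well-separated groups (not catchment data).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=mu, scale=0.3, size=(50, 2))
               for mu in ([0.0, 0.0], [3.0, 3.0], [0.0, 4.0])])

objectives, sorted_centers = [], []
for seed in range(10):
    centers, U, obj = fuzzy_c_means(X, c=3, seed=seed)
    objectives.append(obj)
    # Sort centres by their second coordinate so runs can be compared
    # regardless of the arbitrary cluster order.
    sorted_centers.append(centers[np.argsort(centers[:, 1])])

objective_spread = max(objectives) - min(objectives)
center_shift = max(np.abs(sc - sorted_centers[0]).max() for sc in sorted_centers)
print(f"objective spread across 10 starts: {objective_spread:.3e}")
print(f"largest centre shift across 10 starts: {center_shift:.3e}")
```

An objective spread that is small relative to the objective itself, together with a small centre shift, would be consistent with low sensitivity to initialization on that data set. A large spread would instead argue for reporting the best of several runs.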

Methodology
It is suggested that the methods used in the results section be introduced in the methodology, highlighting the logic and approach rather than just detailing the SOM and FCM algorithms. A flowchart would be helpful if possible.

Methodology
How are catchments classified from climate region down to basin class? With FCM? If so, are the inputs to FCM the features in Table 1 or their principal components?

Results
Were the selected 10 small watersheds affected by human activities, such as agricultural water use or urban consumption? Would this impact the results?
Using 10 small watersheds for validation might be insufficient. If the author is willing, more runoff data can be found in NESSDC (https://www.geodata.cn), such as:
DOI: 10.12041/geodata.30184613892738.ver1.db
DOI: 10.12041/geodata.69811525443157.ver1.db
DOI: 10.12041/geodata.31258482188424.ver1.db

Discussion
The discussion needs to emphasize the connection with the results. Currently, the discussion section seems to introduce existing knowledge within the basin. Perhaps discussing similarities and differences with similar studies, limitations, and potential applications would be more effective.

Discussion
The features used in this paper do not consider any human activities. How might this affect the results of catchment classification? Given the significant human activities in many regions of China, how should we interpret or use the classification results obtained without considering human activities?

L460-471
This part is not easy to understand. In particular, I didn't understand this sentence at L464: “The flow regime in climate region II presented multiple peaks following multiple peaks in precipitation in June and July during the same period.”

L495-498
What do “combined indicators” refer to? What does “at different scales” mean? Basin area? Time?

L560-561
What does “There is no particular classification for one catchment that allows greater flexibility in the selection of a catchment for comparative studies or parameter transplantation in ungauged catchments” mean?

L556-557
The statement “Moreover, climate-homogeneous regions respond to hydrological behaviors at medium- or longtime scales, whereas catchment classification regulates hydrological processes at the flood event scale” needs to be strengthened in the results to support this conclusion.

Citation: https://doi.org/10.5194/hess-2024-304-CC1
- AC1: 'Reply on CC1', Ke Zhang, 16 Feb 2025
  
  Thank you for your comments. We have responded to each of your comments in detail (see attachment) and will make revisions in the manuscript.
  If you have further questions, we would be pleased to engage in a more detailed discussion.
  
  Citation: https://doi.org/10.5194/hess-2024-304-AC1
  - CC2: 'Reply on AC1', Huan Xu, 16 Feb 2025
    
    Thank you for the detailed reply. I appreciate the clarification.
    
    Citation: https://doi.org/10.5194/hess-2024-304-CC2
RC1: 'Comment on hess-2024-304', Anonymous Referee #1, 16 May 2025
The manuscript, “Integrated Catchment Classification Across China Based on Hydroclimatological and Geomorphological Similarities Using Self-Organizing Maps and Fuzzy C-Means Clustering for Hydrological Modeling” by Niu et al., introduces a catchment classification method, which combines the SOM for climate zone classification and the FCM for further classification based on topographic, soil, vegetation, and topological features for catchments. After classification, 10 watersheds in 5 classification groups were chosen to compare their within-group similarity and inter-group distinction in hydrological signatures (flow duration curve).
Although the topic of this manuscript fits the scope of HESS, I have some concerns about the study, detailed below. I therefore suggest major revisions at this stage.

General comments
The novelty of the study is unclear to me. Is this the first study of fuzzy classification for small and medium-sized watersheds in China? Compared with other classified results, what are the major differences (not methodology) or improvements, such as hydrologic signatures? This should be elaborated in the Introduction and Discussion sections.

The structure of the paper needs to be reorganized. The Results section contains a lot of text that belongs in Methods or Discussion. For example, Lines 302-306, which explain how the optimal number of clusters is chosen, should be moved to the part of Methods that explains FCM; Lines 407-498, on the flow duration curve and hydrologic signatures, make a great point that should be moved to the Discussion section. Overall, the current organization, with discussion inside Results, makes the manuscript long and disjointed to read. The manuscript should be organized more neatly: the Results should focus on presenting numbers, with interpretations and implications moved to and consolidated in the Discussion.

The classification is validated on only 10 watersheds across all of China, which I think is insufficient. Figure 6 shows many same-class watersheds that are fairly distant from each other. However, although the selected watersheds show similar FDCs within each class, they might be insufficient to support the conclusion, because the watersheds chosen within each class are spatially too close to each other. I therefore wonder how similar the FDCs would be if more spatially distant watersheds were chosen for evaluation.

The application of the study is not thoroughly discussed. Section 4.1 focuses on the advantages of the probabilistic FCM approach over hard-boundary classification, and on the potential for improving regional hydrological modeling. However, the potential for 1) transferring model parameters from calibrated to ungauged watersheds and 2) estimating floods under various design storms based on the similarity of flow duration curves could also be discussed, which would improve the novelty and value of the research.

I think the references are not in the style required by HESS (https://www.hydrology-and-earth-system-sciences.net/submission.html#references).

Specific comments
Line 69: “an indisputable fact” looks like a strange statement. Do you mean machine learning is now widely used for regionalization studies?

Line 102-103: need to add references.

Line 117-126: The organization of this paragraph needs improvement. The six indices should be stated before reasoning why they are selected. It would make the flow more logical, rather than making readers wonder what indices are chosen (line 119, three indices but not stating what they are).

Line 174: reference for FCM?

Line 178: change “may be the most” to “is a”.

Line 207: reference for Penman-Monteith equation

Line 214: What are the average size and the range of catchment sizes?

Figure 2: Although interpretations of the hexagon values are provided, I still don’t quite get the physical meaning of these plots and wonder how they should be interpreted spatially on maps (adding a basemap, if possible, would be helpful). In the figure’s caption, briefly explain the legends and how readers should interpret the figure. Also, Lines 276-285 should be moved to Methods.

Line 304: What is the AP algorithm? I don’t think this was mentioned in Methods, and a reference should be added.

Line 369: Two questions here: 1) For soil & veg characteristics, the second PC has an eigenvalue of 0.91. Why did you choose this PC when its eigenvalue is below one? 2) Improve the writing: instead of using semicolons, first state clearly which class (topographic, soil & veg, or topological) you are discussing (For topographic, XXX. For soil & veg, XXX.), then state how each PC is correlated with the input indices.
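
The eigenvalue question in the Line 369 comment relates to two common rules for retaining principal components: the Kaiser criterion (keep eigenvalues above 1) and a cumulative explained-variance target. The sketch below shows how the two rules can disagree. Only the value 0.91 comes from the comment; the other eigenvalues, the 75% target, and the function names are hypothetical, and the manuscript may use a different rule.

```python
import numpy as np

def pca_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X (columns = variables), largest first."""
    corr = np.corrcoef(X, rowvar=False)
    return np.linalg.eigvalsh(corr)[::-1]

def select_components(eigvals, kaiser_threshold=1.0, variance_target=0.8):
    """Number of components kept by (Kaiser criterion, cumulative-variance rule)."""
    eigvals = np.asarray(eigvals, dtype=float)
    n_kaiser = int((eigvals > kaiser_threshold).sum())
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # First component count whose cumulative share reaches the target.
    n_variance = int(np.searchsorted(cumulative, variance_target) + 1)
    return n_kaiser, n_variance

# Hypothetical eigenvalues for four standardized variables (they sum to 4).
example = [2.10, 0.91, 0.60, 0.39]
print(select_components(example, variance_target=0.75))  # -> (1, 2)
```

With these made-up values the Kaiser criterion keeps one component, while a 75% variance target keeps two, because the first two components together explain 3.01 / 4 = 75.25% of the variance. A second PC with an eigenvalue just below one can therefore be retained under a variance-based rule, which is the kind of justification the comment asks the authors to state.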

Line 449: space between class and (Li et al)

Figure 7: some recommendations: 1) List each site/catchment's classification in the line charts, and 2) maybe consider another color scheme to present the variation range. The grey colors are hard to differentiate. Also, briefly describe the ranges in the legend within the caption for people to understand.

Line 495: correct the citation

Figure 8: What will FDCs look like if using discharge in mm/day (normalized by drainage area)? I am wondering this because these watersheds vary significantly in drainage size, and normalizing discharge by size may allow for expanding the validation to more gauged watersheds.
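
The normalization suggested in the Figure 8 comment can be sketched as follows: discharge in m³/s is converted to a runoff depth in mm/day by dividing by drainage area, and the flow duration curve is then built from the sorted depths. The function names, the Weibull plotting position, and the two example catchments are assumptions made for illustration; they are not taken from the manuscript.

```python
import numpy as np

def discharge_to_mm_per_day(q_m3s, area_km2):
    """Convert discharge (m^3/s) to area-normalized runoff depth (mm/day).

    m^3/s * 86400 s/day / (area_km2 * 1e6 m^2) * 1000 mm/m = q * 86.4 / area_km2
    """
    return np.asarray(q_m3s, dtype=float) * 86.4 / area_km2

def flow_duration_curve(q):
    """Return (exceedance probability in %, flows sorted from high to low)."""
    q = np.asarray(q, dtype=float)
    q = q[~np.isnan(q)]                # drop missing days
    q_sorted = np.sort(q)[::-1]
    n = q_sorted.size
    # Weibull plotting position: rank / (n + 1)
    exceedance = np.arange(1, n + 1) / (n + 1) * 100.0
    return exceedance, q_sorted

# Two hypothetical catchments with a tenfold difference in area and discharge.
small = discharge_to_mm_per_day([1.0, 2.0, 4.0], area_km2=100.0)
large = discharge_to_mm_per_day([10.0, 20.0, 40.0], area_km2=1000.0)

p_small, fdc_small = flow_duration_curve(small)
p_large, fdc_large = flow_duration_curve(large)
print(np.allclose(fdc_small, fdc_large))  # the normalized curves coincide
```

In this constructed case the raw discharges differ by a factor of ten, but the normalized curves are identical, which is why runoff depth makes catchments of very different drainage size directly comparable.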

Line 517: The statement, “leading to errors”, should be more evidence-based. What specific errors could inaccurate classification result in? What consequences/risks will these errors cause? Provide references of previous studies showing so.

Line 546: Is this a possible reason why some watersheds are hard to assign to a single dominant group?

Citation: https://doi.org/10.5194/hess-2024-304-RC1
- AC2: 'Reply on RC1', Ke Zhang, 28 May 2025
  
  Thank you for your comments. We have responded to each of your comments in detail (see attachment) and will make revisions in the manuscript.
  If you have further questions, we would be pleased to engage in a more detailed discussion.
  
  Citation: https://doi.org/10.5194/hess-2024-304-AC2
- AC3: 'Reply on RC1', Ke Zhang, 28 May 2025
  
  Thank you for your comments. We have responded to each of your comments in detail (see attachment) and will make revisions in the manuscript.
  If you have further questions, we would be pleased to engage in a more detailed discussion.
  
  Citation: https://doi.org/10.5194/hess-2024-304-AC3

Supplement

https://doi.org/10.5194/hess-2024-304-supplement

Viewed

Total article views: 391 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
288	77	26	391	37	16	28

Views and downloads (calculated since 28 Nov 2024)

Month	HTML	PDF	XML	Total
Nov 2024	50	15	2	67
Dec 2024	40	8	3	51
Jan 2025	16	7	0	23
Feb 2025	37	14	4	55
Mar 2025	27	4	2	33
Apr 2025	22	7	4	33
May 2025	40	12	4	56
Jun 2025	29	6	7	42
Jul 2025	23	4	0	27
Aug 2025	4	0	0	4

Viewed (geographical distribution)

Total article views: 384 (including HTML, PDF, and XML) Thereof 384 with geography defined and 0 with unknown origin.

Latest update: 07 Aug 2025

Short summary

This study developed a new method for classifying catchments, combining machine learning techniques with climate and landscape data. By analyzing catchments across China, we identified six climate regions and 35 unique catchment types, each with distinct streamflow patterns. This classification method improves hydrological predictions, especially in areas lacking direct data.

