Closing the data gap: runoff prediction in fully ungauged settings using LSTM
Abstract. Prediction in ungauged basins (PUB), where flow measurements are unavailable, is a critical need in hydrology and has been a focus of extensive research over the past two decades. From the perspective of deep learning, PUB can be viewed as a scenario where the generalization capability of a pretrained neural network is employed to make predictions on samples that were not included in its training data set. This paper adopts this view and conducts genuine PUB using long short-term memory (LSTM) networks. Unlike PUB approaches based on a k-fold training-test technique, where an arbitrary catchment B is treated as gauged in k−1 rounds and as ungauged in one round, our approach ensures that the sample for which the PUB is conducted (the UNGAUGED sample) is completely independent from the sample previously used to train the LSTMs (the GAUGED sample). The UNGAUGED sample includes 379 catchments from five hydrological regimes: Uniform, Mediterranean, Oceanic, Nivo-Pluvial, and Nival. PUB predictions are conducted using LSTMs trained both at the regime level (using only gauged catchments within a specific regime) and at the national level (using all gauged catchments). To benchmark the performance of LSTMs in PUB, four regionalized variants of the GR4J conceptual model are considered: spatial proximity, multi-attribute proximity, regime proximity, and IQ-IP-Tmin proximity, where IQ, IP, and Tmin are the indices defining the five hydrological regimes. To align with the study's fully ungauged context, the IQ index, which is also an input feature for the LSTMs, and the regime classification, which is crucial for the REGIME LSTMs, are reproduced under ungauged conditions using a regime-informed neural network and an XGBoost multi-class classifier, respectively. The results demonstrate the overall superior performance of NATIONAL LSTMs compared to REGIME LSTMs. Among the four regionalization approaches tested for GR4J, the IQ-IP-Tmin proximity approach proves to be the most effective when analyzed on a regime-wise basis. When comparing the best-performing LSTM with the best-performing GR4J model within each regime, LSTMs show superior performance in both the Nival and Mediterranean regimes.
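The difference between the k-fold protocol and the fully independent GAUGED/UNGAUGED design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the catchment identifiers, the sample sizes, the random assignment, and the function names `k_fold_roles` and `independent_split` are all assumptions made for illustration only.

```python
import random


def k_fold_roles(catchments, k, seed=0):
    """Count how often each catchment is 'gauged' vs 'ungauged' under k-fold.

    Hypothetical helper: every catchment ends up ungauged in exactly one
    round and gauged in the remaining k - 1 rounds.
    """
    ids = list(catchments)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k disjoint folds covering all ids
    roles = {c: {"gauged": 0, "ungauged": 0} for c in ids}
    for test_fold in folds:
        held_out = set(test_fold)
        for c in ids:
            roles[c]["ungauged" if c in held_out else "gauged"] += 1
    return roles


def independent_split(catchments, n_ungauged, seed=0):
    """Split once into disjoint GAUGED and UNGAUGED samples.

    Hypothetical helper: the random assignment is an assumption; the point
    is only that UNGAUGED catchments are never used for training.
    """
    ids = list(catchments)
    random.Random(seed).shuffle(ids)
    ungauged = set(ids[:n_ungauged])
    gauged = set(ids[n_ungauged:])
    return gauged, ungauged


# Illustrative identifiers, not real catchment codes.
catchments = [f"C{i:03d}" for i in range(20)]

roles = k_fold_roles(catchments, k=5)
gauged, ungauged = independent_split(catchments, n_ungauged=5)
```

Under the k-fold scheme every catchment contributes to training in most rounds, whereas in the independent design a model trained on `gauged` never sees any member of `ungauged`, which is the property the abstract refers to as genuine PUB.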
Status: open (until 13 Mar 2024)