Semi-supervised learning approach to improve the predictability of data-driven rainfall-runoff model in hydrological data-sparse regions
Abstract. Numerous data-driven models have been introduced to predict the rainfall-runoff relationship reliably. Most of these models are trained with a supervised learning (SL) approach on paired observed samples of climate and streamflow data. In practice, however, such paired observations are often scarce, because streamflow gauge records worldwide are sparse and typically cover only a few years. This limited number of paired samples can significantly impede the learning ability of a data-driven model. Semi-supervised learning, an emerging machine learning paradigm that additionally incorporates unpaired samples, has the potential to be a highly effective method for modeling rainfall-runoff relationships. In this study, we present a novel semi-supervised learning-based framework for rainfall-runoff modeling. The framework introduces a loss function designed to handle two distinct types of samples, paired and unpaired, during training. To validate the framework, we conducted an extensive set of experiments with a diverse range of designs, all of which used the LSTM network. The experiments are based on 531 basins from the freely available CAMELS dataset, which spans the entire contiguous United States. Results indicate that the proposed framework shows significantly enhanced performance compared to the baseline models. Results also show that the framework can serve as a viable alternative to previously developed fully supervised approaches. Lastly, we discuss potential avenues for enhancing the model and outline our future research plans in this domain.