29 Mar 2022
Status: this preprint is currently under review for the journal HESS.

The Great Lakes Runoff Intercomparison Project Phase 4: The Great Lakes (GRIP-GL)

Juliane Mai1,*, Hongren Shen1,*, Bryan A. Tolson1,*, Étienne Gaborit2,*, Richard Arsenault3, James R. Craig1, Vincent Fortin2, Lauren M. Fry4, Martin Gauch5, Daniel Klotz5, Frederik Kratzert5,6, Nicole O'Brien7, Daniel G. Princz8, Sinan Rasiya Koya9, Tirthankar Roy9, Frank Seglenieks7, Narayan K. Shrestha7, André G. T. Temgoua7, Vincent Vionnet2, and Jonathan W. Waddell10
  • 1Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, ON, Canada
  • 2Meteorological Research Division, Environment and Climate Change Canada, Dorval, QC, Canada
  • 3Department of Construction Engineering, École de technologie supérieure, Montreal, QC, Canada
  • 4Great Lakes Environmental Research Laboratory, National Oceanic and Atmospheric Administration, Ann Arbor, MI, USA
  • 5Institute for Machine Learning, Johannes Kepler University, Linz, Austria
  • 6Google Research, Vienna, Austria
  • 7National Hydrological Service, Environment and Climate Change Canada, Burlington, ON, Canada
  • 8National Hydrological Service, Environment and Climate Change Canada, Saskatoon, SK, Canada
  • 9Department of Civil and Environmental Engineering, University of Nebraska–Lincoln, Lincoln, NE, USA
  • 10Great Lakes Hydraulics and Hydrology Office, U.S. Army Corps of Engineers, Detroit, MI, USA
  • *Lead contributors. Other authors are ordered alphabetically by last name.

Abstract. Model intercomparison studies are carried out to test and compare the simulated outputs of various model setups over the same study domain. The Great Lakes region is such a domain of high public interest: it is not only a challenging region to model, with its trans-boundary location, strong lake effects, and regions of strong human impact, but also one of the most densely populated areas in the United States and Canada. This study brought together a wide range of researchers who set up their models of choice in a highly standardized experimental setup using the same geophysical datasets, forcings, common routing product, and locations of performance evaluation across the 1 million square kilometer study domain. The study comprises 13 models covering a wide range of model types, from Machine Learning based to basin-wise, subbasin-based, and gridded models, that are either locally or globally calibrated or calibrated separately for each of six predefined regions of the watershed. Unlike most hydrologically focused model intercomparisons, this study not only compares models regarding their capability to simulate streamflow (Q) but also evaluates the quality of simulated actual evapotranspiration (AET), surface soil moisture (SSM), and snow water equivalent (SWE). The latter three outputs are compared against gridded reference datasets. The comparisons are performed in two ways: either by aggregating model outputs and the reference to basin level or by regridding all model outputs to the reference grid and comparing the model simulations at each grid cell.

The main results of this study are: (1) The comparison of models regarding streamflow reveals the superior quality of the Machine Learning based model in all experiments; even for the most challenging spatio-temporal validation, the ML model outperforms every physically based model. (2) While the locally calibrated models lead to good performance in calibration and temporal validation (even outperforming several regionally calibrated models), they lose performance when they are transferred to locations at which they have not been calibrated. This is likely to be improved with more advanced strategies to transfer these models in space. (3) The regionally calibrated models – while losing less performance in spatial and spatio-temporal validation than locally calibrated models – exhibit low performance in highly regulated and urban areas as well as agricultural regions in the US. (4) Comparisons of additional model outputs (AET, SSM, SWE) against gridded reference datasets show that aggregating model outputs and the reference dataset to basin scale can lead to different conclusions than a comparison at the native grid scale. This is especially true for variables with large spatial variability such as SWE. (5) A multi-objective-based analysis of the model performances across all variables (Q, AET, SSM, SWE) reveals that both locally calibrated models (i.e., HYMOD2-lumped) and regionally calibrated models (i.e., MESH-SVS-Raven and GEM-Hydro-Watroute) perform excellently overall, for varying reasons. The Machine Learning based model was not included here as it is not set up to simulate AET, SSM, and SWE. (6) All basin-aggregated model outputs and observations for the model variables evaluated in this study are available on an interactive website that enables users to visualize results and download data and model outputs.
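
Result (4) can be illustrated with a small synthetic sketch. It is not the study's evaluation code: the SWE values, the two hypothetical models, the equal-area basin average, and the use of RMSE as the score are all assumptions made for illustration only. Model A reproduces the basin mean exactly but reverses the spatial pattern, while model B has the correct pattern with a constant bias.

```python
from math import sqrt

def rmse(sim, ref):
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((s - r) ** 2 for s, r in zip(sim, ref)) / len(ref))

def basin_mean(field):
    """Average all grid cells of one time step (equal cell areas assumed)."""
    return sum(field) / len(field)

# Synthetic SWE [mm] for one basin with 4 grid cells and 3 time steps
# (illustrative numbers, not taken from the study).
reference = [[10.0, 30.0, 50.0, 70.0],
             [20.0, 40.0, 60.0, 80.0],
             [5.0, 15.0, 25.0, 35.0]]

# Hypothetical model A: correct basin mean, spatial pattern reversed.
model_a = [list(reversed(step)) for step in reference]
# Hypothetical model B: correct spatial pattern, +5 mm bias in every cell.
model_b = [[v + 5.0 for v in step] for step in reference]

def basin_scale_rmse(model):
    """Aggregate model and reference to basin level, then compare."""
    return rmse([basin_mean(s) for s in model],
                [basin_mean(s) for s in reference])

def grid_scale_rmse(model):
    """Compare model and reference cell by cell on the common grid."""
    flat_sim = [v for step in model for v in step]
    flat_ref = [v for step in reference for v in step]
    return rmse(flat_sim, flat_ref)

for name, model in (("A", model_a), ("B", model_b)):
    print(name, basin_scale_rmse(model), grid_scale_rmse(model))
```

In this toy setup model A scores better at basin scale because its spatial errors cancel in the average, whereas model B scores better at grid scale, so the ranking of the two models flips depending on the level of comparison.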


Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on hess-2022-113', Anonymous Referee #1, 29 Apr 2022
  • CC1: 'Comment on hess-2022-113', John Ding, 18 May 2022
  • RC2: 'Comment on hess-2022-113', Matteo Giuliani, 24 May 2022



Total article views: 949 (including HTML, PDF, and XML)
  • HTML: 704
  • PDF: 227
  • XML: 18
  • Total: 949
  • Supplement: 66
  • BibTeX: 8
  • EndNote: 8
Views and downloads (calculated since 29 Mar 2022)

Viewed (geographical distribution)

Total article views: 897 (including HTML, PDF, and XML), of which 897 have a defined geography and 0 are of unknown origin.
Latest update: 26 May 2022
Short summary
Model intercomparison studies are carried out to test various models and compare the quality of their outputs over the same domain. In this study, 13 diverse models set up using the same input data are evaluated over the Great Lakes region. Various model outputs – such as streamflow, evaporation, soil moisture, and amount of snow on the ground – are compared using standardized methods and metrics. The basin-wise model outputs and observations are made available through an interactive website.