Global catchment modelling using World-Wide HYPE (WWH), open data and stepwise parameter estimation

Recent advancements in catchment hydrology (such as understanding hydrological processes, accessing new data sources, and refining methods for parameter constraints) make it possible to apply catchment models for ungauged basins over large domains. Here we present a cutting-edge case study applying catchment-modelling techniques at the global scale for the first time. The modelling procedure was challenging but doable, and even the first model version shows better performance than traditional gridded global models of river flow. We used the open-source code of the HYPE model and applied it for >130 000 catchments (with an average resolution of 1000 km²), delineated to cover the Earth's landmass (except Antarctica). The catchments were characterized using 20 open databases on physiographical variables, to account for spatial and temporal variability of the global freshwater resources, based on exchange with the atmosphere (e.g. precipitation and evapotranspiration) and related budgets in all compartments of the land (e.g. soil, rivers, lakes, glaciers, and floodplains), including water stocks, residence times, interfacial fluxes, and the pathways between various compartments. Global parameter values were estimated using a step-wise approach for groups of parameters regulating specific processes and catchment characteristics in representative gauged catchments. Daily time-series (>10 years) from 5338 gauges of river flow across the globe were used for model evaluation (half for calibration and half for independent validation), resulting in a median monthly KGE of 0.4. However, the World-Wide HYPE (WWH) model shows large variation in model performance, both between geographical domains and between various flow signatures. The model performs best in Eastern USA, Europe, South-East Asia, and Japan, as well as in parts of Russia, Canada, and South America. The model shows overall good potential to capture flow signatures of monthly high flows, spatial variability of high flows, duration of low flows and constancy of daily flow. Nevertheless, there remains large potential for model improvements, and we suggest both redoing the calibration and reconsidering parts of the model structure for the next WWH version. The calibration cycle should be repeated a couple of times to find robust values under new fixed parameter conditions. For the next iteration, special focus will be given to precipitation, evapotranspiration, soil storage, and dynamics from hydrological features, such as lakes, reservoirs, glaciers, and floodplains. This first model version clearly indicates challenges in large-scale modelling, the usefulness of open data and current gaps in process understanding. Parts [...]

Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2019-111. Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 1 April 2019. © Author(s) 2019. CC BY 4.0 License.



Hydrological models are useful tools to better understand the processes behind observations, to reconstruct past events and to predict future events, as well as to explore the impact of various scenarios of change in flow-controlling factors, such as climate or human activities. Catchment models were traditionally often applied in small, well-monitored rivers under pristine conditions, to understand mechanisms in flow generation (e.g. Bergström and Forsman, 1973; Beven and Kirkby, 1979; Lindström et al., 1997) or to support flow forecasts at warning services (e.g. Arheimer et al., 2011). However, a combination of societal requests and scientific initiatives has recently changed this context for catchment modelling. As catchment models mimic observations through calibration procedures, they have high credibility among practitioners and water managers. Hence, they are used operationally in many societal sectors, to provide for instance design values for infrastructure, water allocation schemes, navigation routes, flood warnings, environmental-status indices or optimal industrial-water use. Currently, all these users of catchment model outputs also face climate change and seek data and information to best implement climate adaptation for their specific business. Hence, catchment models are also used to estimate climate change impact.

The catchment research community has embraced this applied focus and, at the same time, expanded the geographical domain to multi-catchments. The applied focus is illustrated by the new decade of the International Association of Hydrological Sciences (IAHS) called "Panta Rhei", which addresses change in hydrology and society (Montanari et al., 2013) and focuses on the human impact on the water cycle instead of traditional pristine conditions. The spatial expansion, on the other hand, is driven by accelerating advances in hydrological research as described by Archfield et al. (2015). For instance, comparative hydrology (Falkenmark and Chapman, 1989) [...] parameters are transferred to areas without observed time-series of river flow, such as regionalization, parameter constraints, and Monte Carlo approaches for empirical quality control, to ensure that the process description is realistic and accounts for uncertainties. This opened up for catchment models to be tested and applied also at the continental scale (e.g. Pechlivanidis [...]

In the early 1970s, model parameters were calibrated using rather simple curve fitting towards observed time-series of river flow at a specific catchment outlet (e.g. Bergström and Forsman, 1973 [...] wide covering the entire globe with relatively high resolution, providing an average subbasin size of ~1000 km² (WWH version 1.3). Our specific objective is to provide a harmonized way to predict hydrological variables (especially river flow and the water balance) globally, which can also be shared for further refinement to assist in regional and local water management wherever hydrological models are currently lacking.

Catchment delineation and characteristics
Catchment borders were delineated using the World Hydrological Input Set-up Tool (WHIST), software developed at SMHI that is linked to the Geographic Information System (GIS) ArcGIS from ESRI. By defining force-points for catchment outlets in the resulting topographic database (cf. Table 1) and criteria for minimum and maximum ranges in catchment size, the tool delineates catchments and the link (routing) between them. By adding information from other types of databases, WHIST also aggregates data or uses the nearest grid for assigning characteristics to each catchment. WHIST handles both gridded data and polygons, and was used to link all data described in Section 2, such as land-cover, river width, precipitation, temperature, and elevation, to each delineated catchment. WHIST then compiles the input data files to a format that can be read by the HYPE source code. The software runs automatically, but also has a visual interface for manual corrections and adjustments. It may also adjust the position of the gauging stations to match the river network of a specific topographic database.

When setting up WWH, force-points for catchment delineation were defined according to:

• Locations of gauging stations in the river network: in total, catchments were defined for all 21 704 gauging stations which had an upstream area greater than 1000 km² (500-1000 km² in data-sparse regions). Their coordinates were corrected to fit the river network of the topographic data, using WHIST and manually. Quality checks of catchment delineation were done against station metadata, and 88% of the estimated catchment areas were within ±10% of the metadata. These catchments were used in further analysis for parameter estimation or model evaluation; however, not all of these sites provided open access to time-series (see Section 2.3).

• Outlets of large lakes/reservoirs: new lake delineation was done to solve the spatial mismatch between data on the water bodies from various sources (cf. Table 2). The centroids of the lakes included in GLWD and GRanD were used as initialization points for a Flood Fill algorithm, applied over the ESA CCI Water Bodies, followed by manual quality checks. The outlet location was defined using the maximum upstream area for each lake. In total, around 13 000 lakes and 2500 reservoirs >10 km² were identified globally. The new dataset was tested against detailed lake information for Sweden, which represents one of the most lake-dense regions globally. Merging data from the two databases and adjusting to the topographic data used was judged more realistic for the global hydrological modelling than only using one dataset.

• Large cities and cities with high flood risk: the UNEP/GRID-Europe database (Table 1) [...]

• Catchment size: the goal was to reach an average size of some 1000 km², for practical (computational) and scientific reasons, reflecting uncertainty in input data. Criteria in WHIST were set to reach a maximum catchment size of 3000 km² in general and 500 km² in coastal areas with <1000 m elevation (to avoid crossing from one side to another of a narrow and high island or peninsula). Post-processing was then done for the largest lakes, deserts, and floodplains, following specific information on their character (see data sources in Table 2).

Using this approach, the land surface of the Earth (i.e. 135 million km² when excluding Antarctica) was divided into 131 296 catchments with an average size of 1020 km². Flat land areas of deserts and floodplains ended up with somewhat larger catchments, about 4500 km² and 3500 km², respectively. Around 23.8% of the land surface did not drain to the sea but to sinks (Fig. 2), the largest single one being the Caspian Sea.
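The lake delineation step in the list above (a flood fill started from each lake centroid over a water-body grid, followed by choosing the outlet as the lake cell with the largest upstream area) can be illustrated with a minimal sketch. It is not the WHIST implementation: the function names `flood_fill` and `lake_outlet`, the toy grids, and the choice of 4-neighbour connectivity are all assumptions made for illustration.

```python
from collections import deque

def flood_fill(water_mask, seed):
    """Return the set of (row, col) water cells connected to the seed cell.

    water_mask is a list of rows with 1 for water and 0 for land (hypothetical
    stand-in for a water-body raster). Connectivity is 4-neighbour, which is
    an assumption of this sketch.
    """
    rows, cols = len(water_mask), len(water_mask[0])
    if not water_mask[seed[0]][seed[1]]:
        return set()  # the seed (e.g. a lake centroid) does not fall on water
    lake = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and water_mask[nr][nc] and (nr, nc) not in lake:
                lake.add((nr, nc))
                queue.append((nr, nc))
    return lake

def lake_outlet(lake_cells, upstream_area):
    """Pick the lake cell with the maximum upstream area as the outlet."""
    return max(lake_cells, key=lambda cell: upstream_area[cell[0]][cell[1]])

# Toy example: two separate water bodies on a 4 x 5 grid.
water_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
upstream_area = [
    [0, 5, 9, 0, 0],
    [0, 7, 12, 0, 3],
    [0, 0, 40, 0, 4],
    [0, 0, 0, 0, 8],
]
lake = flood_fill(water_mask, seed=(1, 1))
outlet = lake_outlet(lake, upstream_area)
print(len(lake), outlet)
```

A seed that falls on land returns an empty set, which in practice would be one of the cases flagged for the manual quality checks mentioned in the text.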
The water reaching these sinks evaporated from water surfaces but also percolated to groundwater reservoirs. Moreover, several areas across the globe have karst geology with wide underground channels, which do not follow the land-surface topography. Sinks within karst areas according to the World Map of Carbonate Rock Outcrops (Table 1) were linked to the "best neighbour" and inserted into the river network. The Canadian prairie also encompasses a lot of sinks due to climate and topography, but here we could apply a national dataset from Canada with well-defined non-contributing areas to adjust the routing in this area.

The land-cover data from ESA CCI LC v1.6 (Table 2) was used as the baseline for HRUs. It has 36 classes and subclasses, and three of these were adjusted using additional data to improve the quality: (1) by using glacier outlines from RGI v5 we avoided overestimation of the glacier area; (2) by using GMIA [...] and land-cover globally. The second aim was addressed by applying a step-wise approach following the HYPE process description along the flow paths, only calibrating a few parameters governing a specific process at a time (Arheimer and Lindström, 2013). The estimated parameter values were then applied wherever relevant in the whole geographical domain, i.e. world-wide.

Different catchments were selected globally to best represent each process calibrated (Fig. 3). For HRUs, separate calibration was done for the snow-dominated areas (>10% of precipitation falling as snow), as the snow processes give such strong character to the runoff response that simultaneous calibration with catchments lacking snow may underestimate other flow-controlling processes. The HRUs based on the ESA CCI 1.6 data were aggregated from 36 classes into 10 (Table 4) for more efficient calibration and to ensure that some 50% of the gauged catchment selected represented the appointed land-cover. Some local hydrological features, such as large lakes and floodplains, were calibrated individually. When evaluating the effect of this, we discovered some major bias for the Great Lakes in North America and Lakes Malawi and Victoria in Africa. Finally, we introduced an 11th step to calibrate the evaporation of these separately (Fig. 3). [...] parameter ranges were defined as the median and the 3rd quartile of the 10% best agreements between HYPE and MODIS in terms of RE. The first selection was done with 400 runs and then repeated for a second round. In addition, a priori parameters (
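The constraint of parameter ranges described above (keep the 10% of runs that agree best in terms of RE, then take the median and the 3rd quartile of their parameter values) can be sketched as follows. This is only an illustration under stated assumptions: the function name `constrained_range`, the representation of a run as a `(parameter_value, relative_error)` pair, ranking by absolute RE, and the "inclusive" quartile method are all hypothetical choices and are not taken from the WWH calibration code.

```python
from statistics import median, quantiles

def constrained_range(runs, best_fraction=0.10):
    """Return (median, third quartile) of the parameter values of the best runs.

    runs is a list of (parameter_value, relative_error) pairs, e.g. from a
    Monte Carlo sample. "Best" is taken here as the smallest absolute relative
    error, which is an assumption of this sketch.
    """
    n_best = max(2, round(len(runs) * best_fraction))  # quantiles needs >= 2 values
    best = sorted(runs, key=lambda run: abs(run[1]))[:n_best]
    values = [value for value, _ in best]
    _, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3
```

With 400 runs, as in the first selection round mentioned in the text, `best_fraction=0.10` keeps the 40 best runs; a second round could then resample within the returned range.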

The model was evaluated against independent observed river flow, which was not used in the calibration procedure. The agreement between modelled and observed time-series was evaluated using the statistical metric KGE and its components r, α, and β, which are directly linked with CC ( [...] In addition, a number of flow signatures ( [...]

Global river flow and general model performance

[...] (2016) tested a scheme for global parameter regionalization world-wide; in an ensemble of ten global water allocation or land surface models, the median performance of monthly KGE was found to be 0.22 using 1113 river gauges. The best median monthly KGE was then 0.32 for catchment-scale calibration of regionalized parameters, using a gridded HBV model globally (Beck, 2016). Even though it is difficult to compare results when not using the same validation sites or time-period, the catchment modelling approach of the present study seems to have better performance than gridded global modelling concepts of river flow.

The red spots in Figure 6 indicate where the HYPE model fails, such as in the US Midwest (Kansas, to be precise), north-east Brazil and parts of Africa, Australia and central Asia. When decomposing the KGE, it was found that the correlation was in general fine. However, the relative error in standard [...] calibration varied a lot for each hydrological process considered in the step-wise parameter estimation (Table 6). Although a large number of river gauges were collected for parameter estimation, only a few could be considered representative with enough quality assurance. More gauges in the calibration procedure would probably have given another result. Nevertheless, the results show promising potential in applying the process descriptions of catchment models also at the global scale.

In spite of the wide spread in geographical locations across the globe, a priori values were reasonable for hydrological processes describing glaciers and soils. As shown in Table 6, the water balance (RE) was improved considerably by first calibrating PET globally, and then precipitation vs. altitude of catchment and land-cover type. Simultaneous calibration of soil storage and discharge in HRUs increased the KGE both in areas with and without snow by 0.1 on average. For calibration of river routing and rating curves of lake outflows, the correlation coefficient was used to avoid erroneous [...] and not volume. Lake processes in particular benefited from calibration. Less convincing were the metrics from calibration of the floodplains, which were not always improved by the floodplain routine applied.
Overall, the results indicate that global parameters are to some extent possible for describing hydrological processes world-wide, using a catchment model and globally available data of physiographic characteristics to describe spatial variability. Nevertheless, the WWH v.1.3 model still has considerable potential for improvement, and to really make use of more advanced calibration techniques, the water balance needs to be improved first, as too much volume error makes the tuning of dynamics difficult.

Table 6. Metrics of model performance before and after calibrating various hydrological processes simultaneously at a number of selected river gauges, using the stepwise parameter-estimation procedure globally. Parameter values and names in the HYPE model are given in Appendices.

Hydrological process | No. gauges | Median value of metric(s): before | Median value of metric(s): after
Potential evapotranspiration (3 PET algorithms: median of ranges constrained with MODIS)
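The KGE and its decomposition used in the evaluation above can be written out as a short sketch. It assumes the widely used formulation of Gupta et al. (2009), in which KGE combines the linear correlation r, the variability ratio α (simulated over observed standard deviation) and the bias ratio β (simulated over observed mean). The function name `kge_components` and the expression of RE and of the relative error in standard deviation (RESD) as percentages are assumptions for illustration, not taken from the HYPE code.

```python
import math
from statistics import mean, pstdev

def kge_components(sim, obs):
    """Return KGE and its components for paired simulated/observed flows.

    Assumes obs has a non-zero mean and both series have non-zero variance.
    """
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = mean((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs))
    r = cov / (sd_s * sd_o)   # correlation coefficient (CC)
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    kge = 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
    return {
        "kge": kge,
        "cc": r,
        "re_percent": (beta - 1.0) * 100.0,     # relative volume error
        "resd_percent": (alpha - 1.0) * 100.0,  # relative error in standard deviation
    }
```

Under this formulation a perfect simulation gives KGE = 1, and a large volume error (β far from 1) lowers KGE even when the correlation is high, which is consistent with the remark above that the water balance needs to be improved before the dynamics can be tuned.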

The WWH1.3 is more prone to success or failure in simulating specific flow signatures than in specific physiographic conditions, which is visualized by vertical rather than horizontal stripes in Figure 7. In general, the model shows reasonable KGE and CC for spatial variability of flow signatures across the globe (i.e. a lot of blue in the two panels to the left in Fig. 7). However, the RE and the standard [...] land-cover, and little or much of snow and ice. This shows where efforts need to be taken to improve the model in its next version. For other water-balance indices, it was interesting to note that the ratio between precipitation and river flow (RunoffCo) shows good results (RE within ±25%) all over Köppen region C (Temperate) but otherwise is often underestimated for some parts of the quartile range of physiographic variables studied. On the contrary, precipitation minus flow (ActET) is over-estimated in parts of the quartile range, except for the good results in Köppen region C, needle-leaved, deciduous trees (TreeNeDec) and regions with snow and ice (i.e. where mean specific runoff failed). Figure 7 clearly shows the compensating errors between processes governing the runoff coefficient and actual evapotranspiration, with one being over-estimated when the other is underestimated for the same specific physiographic conditions. This indicates the need for recalibrating the HRUs of WWH in its next version, but also reconsidering the initial parameters for evapotranspiration and the quality of the precipitation grid and its linkage with the catchments.

[...] improvements. The WWH model has severe problems with dry regions and base flow conditions, especially where the flow is sporadic (e.g. red areas in Fig. 5). These are difficult areas to model and they will need special analysis, so instead, we suggest starting with improvements that can be undertaken relatively quickly and easily.
These mainly focus on the overall water balance.

Firstly, the global water balance can be improved through re-calibration, but some basic concepts need to be adjusted accordingly: (i) more careful analyses indicate that the choice of climate regions based on Köppen's classification for applying the different PET algorithms was not optimal and needs some adjustments, (ii) linking the centroid of the catchments to the nearest precipitation grid seems to remove a lot of the spatial variation, and instead an average of the nearest grids should be tried.

Secondly, the HRUs can be recalibrated and reconsidered, and we suggest (i) testing a calibration scheme based on regionalized parameters rather than global, using clustering based on physiographic similarities (e.g. Hundecha [...]

Finally, more observations can be included, both in situ by adding more gauges to the system and from global Earth Observation products, for instance on water levels and storage. Hence, each step in Fig. 3 still has potential for model improvements.

The stepwise parameter-estimation approach should ideally be cycled a couple of times to find robust values under new fixed parameter conditions. However, as the model was carefully evaluated during the calibration, a lot of bug fixing, corrections and additional improvements resulted between the steps, and time was spent on this rather than on several complete iterations. Therefore, the stepwise calibration was subjected to several re-takes and shifts between steps until it could successfully fulfil all the calibration steps in one entire sequence (Fig. 8). Hence, only one loop was done for parameter estimations in this study. The procedure was judged as very useful for the model to be potentially right for the right reason, but also very time-consuming.
However, applying a catchment modeller's approach, this is inevitable for reliably integrated catchment modelling, and both the step-wise calibration and iterative model corrections will continue with new model versions. [...] The analysis of WWH model performance shows that also this first version can to some extent be useful for water managers in several regions globally. [...]

However, the WWH resulting from this first model version should be used with caution (especially in dry regions) as the performance may still be of low quality for local or regional applications in water management. Geographically, the model performs best in Eastern USA, Europe, South-East Asia and Japan, as well as parts of Russia, Canada, and South America. The model shows overall good potential to capture flow signatures of monthly high flows, spatial variability of high flows, duration of low flows and constancy of daily flow. Nevertheless, there remains large potential for model improvements, and it is suggested both to redo the calibration and reconsider parts of the model structure for the next WWH version.

The step-wise calibration procedure was judged as very useful for the model to be potentially right for the right reason, but also very time-consuming. The calibration cycle is suggested to be repeated a couple of times to find robust values under new fixed parameter conditions, which is a long-term commitment of continuous model refinement. The model set-up will be released in new model versions during this incremental improvement. For the next version, special focus will be given to the water balance (i.e. precipitation and evapotranspiration), soil storage and dynamics from hydrological features, such as lakes, reservoirs, glaciers and floodplains.
The model will be shared by providing a piece of the world to modellers working at the regional scale to appreciate local knowledge, establish a critical mass of experts from different parts of the world and improve the model in a collaborative manner. The model can serve as a fast track to a model environment for users who do not have this ready at hand, and in return the WWH can be improved from feedback on hydrological processes from local experts across the world. Potentially it will accelerate scientific advancement if more researchers start using the same tools and data, which makes it easier to be transparent when evaluating and comparing scientific results.

Code availability
Hypecode.smhi.se

Data availability
Hypeweb.smhi.se

Appendices
The