Global catchment modelling using World-Wide HYPE (WWH), open data and stepwise parameter estimation



Recent advancements in catchment hydrology (such as understanding hydrological processes, accessing new data sources, and refining methods for parameter constraints) make it possible to apply catchment models for ungauged basins over large domains. Here we present a cutting-edge case study applying catchment-modelling techniques at the global scale for the first time. The modelling procedure was challenging but doable, and even the first model version shows better performance than traditional gridded global models of river flow. We used the open-source code of the HYPE model and applied it for >130 000 catchments (with an average resolution of 1000 km²), delineated to cover the Earth's landmass (except Antarctica). The catchments were characterized using 20 open databases on physiographical variables, to account for spatial and temporal variability of the global freshwater resources, based on exchange with the atmosphere (e.g. precipitation and evapotranspiration) and related budgets in all compartments of the land (e.g. soil, rivers, lakes, glaciers, and floodplains), including water stocks, residence times, interfacial fluxes, and the pathways between various compartments. Global parameter values were estimated using a step-wise approach for groups of parameters regulating specific processes and catchment characteristics in representative gauged catchments. Daily time-series (> 10 years) from 5338 gauges of river flow across the globe were used for model evaluation (half for calibration and half for independent validation), resulting in a median monthly KGE of 0.4. However, the World-Wide HYPE (WWH) model shows large variation in model performance, both between geographical domains and between various flow signatures. The model performs best in Eastern USA, Europe, South-East Asia, and Japan, as well as in parts of Russia, Canada, and South America. The model shows overall good potential to capture flow signatures of monthly high flows, spatial variability of high flows, duration of low flows and constancy of daily flow. Nevertheless, there remains large potential for model improvements and we suggest both redoing the calibration and reconsidering parts of the model structure for the next WWH version. The calibration cycle should be repeated a couple of times to find robust values under new fixed parameter conditions. For the next iteration, special focus will be given to precipitation, evapotranspiration, soil storage, and dynamics from hydrological features, such as lakes, reservoirs, glaciers, and floodplains. This first model version clearly indicates challenges in large-scale modelling, usefulness of open data and current gaps in process understanding. Parts of the WWH can be shared with other modellers working at the regional scale to appreciate local knowledge, establish a critical mass of experts and improve the model in a collaborative manner. Setting up a global catchment model has to be a long-term commitment of continuous model refinements to achieve successful and more useful results for water management.

Hydrological models are useful tools to better understand processes behind observation, to reconstruct past events and to predict future events, as well as to explore the impact of various scenarios of change in flow controlling factors, such as climate or human activities. Catchment models were traditionally often applied in small well-monitored rivers under pristine conditions, to understand mechanisms in flow generation (e.g. Bergström and Forsman, 1973; Beven and Kirkby, 1979; Lindström et al., 1997) or to support flow forecasts at warning services (e.g. Arheimer et al., 2011). However, a combination of societal requests and scientific initiatives has changed this context for catchment modelling recently. As catchment models mimic observations through calibration procedures, they have high credibility among practitioners and water managers. Hence, they are used operationally in many societal sectors, to provide for instance design values for infrastructure, water allocation schemes, navigation routes, flood warnings, environmental-status indices or optimal industrial-water use. Currently, all these users of catchment model outputs also face climate change and seek data and information to best implement climate adaptation for their specific business. Hence, catchment models are also used to estimate climate change impact.

The catchment research community has embraced this applied focus and, at the same time, expanded the geographical domain to multi-catchments. The applied focus is illustrated by the new decade of the International Association of Hydrological Sciences (IAHS) called "Panta Rhei", which addresses change in hydrology and society (Montanari et al., 2013) and focuses on the human impact on the water cycle instead of traditional pristine conditions. The spatial expansion, on the other hand, is driven by accelerating advances in hydrological research as described by Archfield et al. (2015). For instance, comparative hydrology (Falkenmark and Chapman, 1989) or large sample hydrology (Gupta et al., 2014) show the potential to advance science by addressing a larger domain with multiple catchments rather than exploring one single catchment at a time. Similarly, the previous scientific decade of IAHS, "Predictions in Un-gauged Basins", PUB (Hrachowitz et al., 2013; Bloeschl et al., 2013), resulted in methods to maintain the procedures typical for catchment modelling when parameters are transferred to areas without observed time-series of river flow, such as regionalization, parameter constraints, and Monte Carlo approaches for empirical quality control, to ensure that the process description is realistic and to account for uncertainties. This opened up for catchment models to be tested and applied also at the continental scale (e.g. Pechlivanidis [...]

[...] and to assign time-series of observed flow at some catchment outlets. This enables the use of recognised methods in catchment modelling for parameter estimation and model evaluation, as described in the following paragraphs. Using catchments instead of grids as a calculation unit also makes it possible to apply an ecosystem approach and account for spatial co-evolution of processes at the landscape scale (e.g. Bloeschl et al., 2013). Model parameters can thus be linked to catchment state from interacting entities and not only to aggregation of separated building blocks of the catchment.

In the early 1970s, model parameters were calibrated using a rather simple curve fitting towards observed time-series of river flow in a specific catchment outlet (e.g. Bergström and Forsman, 1973 [...]

[...] representative basins across the entire globe, although some hydrological features, such as large lakes and floodplains, were calibrated individually.
The hypothesis tested in the present study states that it is now possible and timely to apply catchment modelling techniques at the global scale. We address this hypothesis by applying a catchment model world-wide and then evaluating the results using statistical metrics for time-series and flow signatures. To our knowledge, this is the first time a catchment model was applied world-wide covering the entire globe with relatively high resolution, providing an average subbasin size of ~1000 km² (WWH version 1.3). Our specific objective is to provide a harmonized way to predict hydrological variables (especially river flow and the water balance) globally, which can also be shared for further refinement to assist in regional and local water management wherever hydrological models are currently lacking.

[...] readily available open data sources globally (Table 3). In total, information from 21 704 gauging stations could be assigned to a catchment outlet. Of these, time-series could be downloaded for 11 369 while 10 336 could only assist with metadata, such as upstream area, river name, elevation or natural or regulated flow. The time-series were screened for missing values, inconsistency, skewness, trends, inhomogeneity, and outliers (Crochemore et al., 2019). Only stations representing the resolution of the model (≥1000 km²) and with records of at least 10 consecutive years between 1981 and 2012 were considered for model evaluation. With these criteria, 5338 time-series were finally used for evaluating model performance, of which 2863 represented completely independent model validation and 2475 were also involved when estimating some of the model parameters.

[...] information from other types of databases, WHIST also aggregates data or uses the nearest grid for assigning characteristics to each catchment.
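As a rough illustration of the nearest-grid assignment mentioned above, the following Python sketch picks, for a catchment centroid, the closest grid-cell centre by great-circle distance. The function names, the data layout (latitude/longitude tuples in degrees) and the use of the haversine formula are assumptions made for this example; they are not taken from WHIST.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for the distance calculation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_grid_cell(centroid, cell_centres):
    """Return the index of the grid-cell centre closest to the catchment centroid."""
    lat, lon = centroid
    distances = [haversine_km(lat, lon, c_lat, c_lon) for c_lat, c_lon in cell_centres]
    return min(range(len(distances)), key=distances.__getitem__)


# Hypothetical example: one centroid and three 0.5-degree cell centres.
cells = [(59.25, 17.75), (59.25, 18.25), (59.75, 18.25)]
print(nearest_grid_cell((59.3, 18.1), cells))
```

A single nearest cell is simple and fast, but it discards variation among neighbouring cells, which matters most for catchments that are large relative to the grid spacing.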
WHIST handles both gridded data and polygons, and was used to link all data described in Section 2, such as land-cover, river width, precipitation, temperature, and elevation, to each delineated catchment. WHIST then compiles the input data files to a format that can be read by the HYPE source code. The software runs automatically, but also has a visual interface for manual corrections and adjustments. It may also adjust the position of the gauging stations to match the river network of a specific topographic database.

When setting up WWH, force-points for catchment delineation were defined according to:

• Locations of gauging stations in the river network: in total, catchments were defined for all 21 704 gauging stations which had an upstream area greater than 1000 km² (or 500-1000 km² in data-sparse regions). Their coordinates were corrected to fit with the river network of the topographic data, using WHIST and manually. Catchment delineation was checked against station metadata, and 88% of the estimated catchment areas were within ±10% of the metadata values. These catchments were used in further analysis for parameter estimation or model evaluation; however, not all of these sites provided open access to time-series (see Section 2.3).

• Outlets of large lakes/reservoirs: New lake delineation was done to solve the spatial mismatch between data of the water bodies from various sources (cf. Table 2). The centroids of the lakes included in GLWD and GRanD were used as initialization points for a Flood Fill algorithm, applied over the ESA CCI Water Bodies, followed by manual quality checks. The outlet location was defined using the maximum upstream area for each lake. In total, around 13 000 lakes and 2500 reservoirs > 10 km² were identified globally.
The new dataset was tested against detailed lake information for Sweden, which represents one of the most lake-dense regions globally. Merging data from the two databases and adjusting to the topographic data used was judged more realistic for the global hydrological modelling than only using one dataset.

• Large cities and cities with high flood risk: The UNEP/GRID-Europe database (Table 1) was used to define flood-prone areas for which the model may be useful in the future. The criteria for assigning a force point were city areas of > 100 km² (regardless of the risks on the UNEP scale) or city areas of 10-100 km² with risk 3-5 and an upstream area > 1000 km². This was only considered if there was no gauging station within 10 km from the city. This gave another 2 439 force points to the global model.

• Catchment size: the goal was to reach an average size of some 1000 km², for practical (computational) and scientific reasons, reflecting uncertainty in input data. Criteria in WHIST were set to reach a maximum catchment size of 3000 km² in general and 500 km² in coastal areas with < 1000 m elevation (to avoid crossing from one side to another of a narrow and high island or peninsula). Post-processing was then done for the largest lakes, deserts, and floodplains, following specific information on their character (see data sources in Table 2).

Using this approach, the land surface of the Earth (i.e. 135 million km² when excluding Antarctica) was divided into 131 296 catchments with an average size of 1020 km². Flat land areas of deserts and floodplains ended up with somewhat larger catchments, about 4500 km² and 3500 km², respectively. Around 23.8% of the land surface did not drain to the sea but to sinks (Fig. 2), the largest single one being the Caspian Sea. This water evaporated from water surfaces but also percolated to groundwater reservoirs. Moreover, several areas across the globe are of Karstic geology with wide underground channels, which do not follow the land-surface topography.
Sinks within Karst areas according to the World Map of Carbonate Rock outcrops (Table 1) were linked to "best neighbour" and inserted to the river network. The Canadian prairie also encompasses a large number of sinks due to climate and topography, but here we could apply a national dataset from Canada with well-defined non-contributing areas to adjust the routing in this area.

[...] The land-cover data from ESA CCI LC v1.6 (Table 2) was used as the baseline for HRUs. It has 36 classes and subclasses and three of these were adjusted using additional data to improve the quality: (1) by using glacier outlines from the RGI v5 and comparing spatially the outlines of both sources, we avoided overestimation of the glacier area; (2) by using GMIA and MIRCA in a data fusion algorithm to create a more robust new irrigation database, we added irrigation where this information was missing or underestimated; (3) by combining several sources of water bodies (see Table 2) and spatial analyses (e.g. a flood fill algorithm and geospatial tools) we differentiated one general class of water bodies into four: large lakes, small lakes, rivers, and coastal sea, which makes more sense in catchment modelling. Five elevation zones were derived to differentiate land-cover classes with altitude (0-500 m, 500-1000 m, 1000-2000 m, 2000-4000 m and 4000-8900 m), as the hydrological response may be very different at different altitudes due to vegetation growth and soil properties. The land-cover at these elevations was thus treated as a specific HRU globally. In total, this resulted in 169 HRUs.

All catchments were characterized according to Köppen-Geiger ([...] Gao et al., 2014). HYPE has a maximum of three soil layers and these were all applied in the WWH, with a different hydrological response from each one for each HRU.
The first layer corresponds to some 25 cm, the second to some 1-2 meters and the third can be deep, also accounting for groundwater. A specific routine can account for deep aquifers, but this was not applied in the WWH due to lack of local or regional information on aquifer behavior. HYPE has a snow routine to account for snow storage and melt, while a glacier routine accounts for ice storage and melt. Mass balances of glaciers were based on the observations provided in the Randolph Glacier Inventory [...]

3.34.2
Step-wise parameter estimation

The method to assign parameter values for the global model domain aimed at finding (i) robust values also valid for ungauged basins, as well as (ii) reliable process description of dominating flow generation processes and water storage along the flow paths. The first aim was addressed by simultaneous calibration in multiple representative catchments world-wide. Spatial heterogeneity was accounted for by separate calibration of catchments representing different climate, elevation, and land-cover globally. The second aim was addressed by applying a step-wise approach following the HYPE process description along the flow paths, only calibrating a few parameters governing a specific process at a time (Arheimer and Lindström, 2013). The estimated parameter values were then applied wherever relevant in the whole geographical domain, i.e. world-wide.

Different catchments were selected globally to best represent each process calibrated (Fig. 3). Processes were assumed to be linked to different physiographic characteristics (Kuentz et al., 2017) and catchments with gauging stations where these characteristics were most prominent in the upstream area were selected (i.e. the representative gauged basin method). For HRUs, separate calibration was done for the snow-dominated areas (>10% of precipitation falling as snow), as the snow processes give such strong character to the runoff response and simultaneous calibration with catchments lacking snow may thus underestimate other flow-controlling processes. The HRUs based on the ESA CCI 1.6 data were aggregated from 36 classes into 10 (Table 4) for more efficient calibration and to ensure that some 50% of the selected gauged catchment represented the appointed land-cover. Some local hydrological features such as large lakes and floodplains were calibrated individually.
When evaluating the effect of this, we discovered some major bias for the Great Lakes in North America and the Malawi and Victoria lakes in Africa. Finally, we introduced the 11th step to calibrate the evaporation of these separately (Fig. 3).

[...] as the median and the 3rd quartile of the 10% best agreements between HYPE and MODIS in terms of RE. The first selection was done with 400 runs and then repeated for a second round. In addition, a priori parameters ( [...]

The model was evaluated against independent observed river flow by using the remaining gauges, which were not used in the calibration procedure. The agreement between modelled and observed time-series was evaluated using the statistical metric KGE and its components: the correlation (r), the relative error in standard deviation, and the relative error. [...] In addition, a number of flow signatures ( [...]

[...] hydrography, resulted in a new database with delineated hydrographical features (e.g. Fig. 4) of major importance for hydrological modelling. The merging of several data sources resulted in consistency between available information on water bodies, topographic data and the river network (e.g. for glaciers, floodplains, lakes, and gauging stations) so that this information can be used in catchment modelling and provide results of river flow at a resolution of some 1000 km² globally.

The WWH version 1.3 resulted in a realistic spatial pattern of river flow world-wide, clearly identifying desert areas and the largest rivers (Fig. 5). Compared to other global estimates of average water flow in major rivers, HYPE gives results in the same order of magnitude, but comparisons should be based on the same time period to account for natural variability due to climate oscillations. The Amazon, Congo and Orinoco rivers came out as the three largest ones, where the river flow of the Amazon is almost six times larger than that of any other river.
This indicates that the model results are robust and the same model performance can be assumed also in ungauged basins. Catchment modellers would normally judge these results as poor, but given that global open input data was used for model setup and rough assumptions were made when generalizing hydrological processes across the globe, the overall model performance meets the expectations. Similar results were recently achieved when Beck et al. (2016) tested a scheme for global parameter regionalization; in an ensemble of ten global water allocation or land surface models, the median performance of monthly KGE was found to be 0.22 using 1113 river gauges for mesoscale catchments globally (median size 500 km²). The best median monthly KGE was then 0.32 for catchment scale calibration of regionalized parameters, using a gridded HBV model globally (Beck et al., 2016). It is difficult to compare results when not using the same validation sites or time-period, and more concerted actions for model inter-comparison are needed at this scale. Nevertheless, the catchment modelling approach of the present study seems to have better performance than other gridded global modelling concepts of river flow.

The red spots in Figure 6 indicate where the HYPE model fails, such as in the US Midwest (especially Kansas), north-east Brazil and parts of Africa, Australia and central Asia. When decomposing the KGE, it was found that the correlation was in general fine. However, the relative error in standard deviation caused the main problems, showing that the HYPE model does not capture the variations of the hydrograph and instead generates too even a flow. The relative error also seemed problematic, which indicates problems with the water balance.
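To make the decomposition discussed above concrete, the sketch below computes KGE together with three components, assuming the commonly used formulation KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of simulated to observed standard deviation, and beta the ratio of simulated to observed mean. Whether WWH uses exactly this variant is not stated in this section, so the formula, the symbol names and the function name are assumptions.

```python
import math
from statistics import mean, pstdev


def kge_components(sim, obs):
    """Return (KGE, r, alpha, beta) for paired simulated and observed flows.

    Assumes both series have non-zero variance and a non-zero observed mean.
    """
    mu_sim, mu_obs = mean(sim), mean(obs)
    sd_sim, sd_obs = pstdev(sim), pstdev(obs)
    cov = sum((s - mu_sim) * (o - mu_obs) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_sim * sd_obs)   # timing and shape
    alpha = sd_sim / sd_obs       # variability of the hydrograph
    beta = mu_sim / mu_obs        # volume (water balance)
    kge = 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
    return kge, r, alpha, beta


# Toy series: the simulation has the right mean and timing but half the variation.
observed = [1.0, 3.0, 2.0, 5.0, 4.0]
mean_obs = mean(observed)
too_even = [mean_obs + 0.5 * (q - mean_obs) for q in observed]
print(kge_components(too_even, observed))
```

In this toy case the correlation and the mean are perfect, so the whole penalty comes from the damped variability, which is the behaviour the text describes as too even a flow.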
The model has severe problems with dry regions and areas with large impact from human alteration and water management, where the model underestimates the river flow. Such regions are known to be more difficult for hydrological modelling in general (Bloeschl et al., 2013), but in addition, precipitation data do not seem to fully capture the influence of topography and mountain ranges.

[...] calibration varied a lot for each hydrological process considered in the step-wise parameter estimation (Table 6). Although a large number of river gauges was collected for parameter estimation, only a few could be considered as representative with enough quality assurance. More gauges in the calibration procedure would probably have given another result. Nevertheless, the results show promising potential in applying the process descriptions of catchment models also at the global scale.

In spite of the wide spread in geographical locations across the globe, a priori values were reasonable for hydrological processes describing glaciers and soils. As shown in Table 6, the water balance (RE) was improved considerably by first calibrating PET globally, and then precipitation vs altitude of catchment and land-cover type. Simultaneous calibration of soil storage and discharge in HRUs increased the KGE both in areas with and without snow by 0.1 on average. For calibration of river routing and rating curves of lake outflows, the correlation coefficient was used to avoid erroneous compensation of the water balance, as the parameters involved should only set the dynamics of flow and not volume. Lake processes in particular benefited from calibration. Less convincing were the metrics from calibration of the floodplains, which were not always improved by the floodplain routine applied.
Overall, the results indicate that global parameters are to some extent possible for describing hydrological processes world-wide, using a catchment model and globally available data of physiographic characteristics to describe spatial variability. Nevertheless, the WWH v.1.3 model still has considerable potential for improvement; to really make use of more advanced calibration techniques, the water balance needs to be improved first, as too much volume error makes the tuning of dynamics difficult.

Table 6. Metrics of model performance before and after calibrating various hydrological processes simultaneously at a number of selected river gauges, using the stepwise parameter-estimation procedure globally. Parameter values and names in the HYPE model are given in Appendices.

Hydrological process | No. gauges | Median value of metric(s): Before | After
Potential evapotranspiration (3 PET algorithms: median of ranges constrained with MODIS) [...]

Success or failure of WWH 1.3 depends more on the flow signature simulated than on physiographic conditions, which is visualized by vertical rather than horizontal stripes in Figure 7. In general, the model shows reasonable KGE and CC for spatial variability of flow signatures across the globe (i.e. a lot of blue in the two panels to the left in Fig. 7). However, the RE and the standard deviation of the RE (RESD) are less convincing (i.e. the two panels to the right). This means that the model can capture the relative difference in flow signature and the spatial pattern globally, but not always the magnitudes, nor the spread between highest and lowest values. The relative errors are mostly due to underestimations, except for skewness, low flows and actual evapotranspiration; the two latter are always over-estimated when not within ±25% bias.
Overall, the model shows good potential to capture spatial variability of high flows (Q95), duration of low flows (LowDurVar), monthly high flows (Mean30dMax) and constancy of daily flows (Const). These results were found robust and independent of metrics or physiography.

The model shows most difficulties in capturing skewness in observed time-series (skew), the number of high flow occurrences (HighFrVar), and base flow as average (BFI), or absolute low flows (Q5). Short-term fluctuations (RevVar and RBFlash) are also rather difficult for the model to capture. Some results are not consistent between metrics; for coefficient of variation (CVQ) the RE was good while the RESD was poor. This indicates that the model does not capture the amplitude in variation between sites even if the bias is small. The opposite was found for high flow discharge (HFD) and low-flow spells (LowFr), i.e. poor performance in volumes but RESD showing that the variability is captured.

For the remaining flow signatures studied, it was interesting to note that the model performance could be linked to physiographic characteristics, indicating that the model structure and global parameters are valid for some environments but not for others. For instance, the volume of mean specific flow (RE of MeanQ) is especially difficult to capture in regions with needle-leaved, deciduous trees (TreeNeDec) and for medium and large flows in the Köppen region B (Arid), large flows in D (Cold-continental) and small flows in E (Polar). Moreover, the analysis shows that the model tends to fail with the mean flow in catchments with high elevation, high slope, small fraction of water and urban land-cover, and little or much snow and ice. This shows where efforts are needed to improve the model in its next version.
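The ±25% criterion used when reading Figure 7 can be illustrated with a short sketch. It computes a few signatures for an observed and a simulated series, then classifies the relative error (RE) of each. The percentile definition (Q95 taken as the flow not exceeded 95% of the time, with linear interpolation), the helper names and the toy data are assumptions for illustration only.

```python
from statistics import mean, pstdev


def percentile(values, p):
    """Percentile p (0-100) with linear interpolation between sorted values."""
    ordered = sorted(values)
    k = (len(ordered) - 1) * p / 100.0
    lower = int(k)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (k - lower)


def signatures(flow):
    """A few flow signatures, named as in the text (MeanQ, Q95, Q5, CVQ)."""
    return {
        "MeanQ": mean(flow),
        "Q95": percentile(flow, 95),
        "Q5": percentile(flow, 5),
        "CVQ": pstdev(flow) / mean(flow),
    }


def relative_error(sim_value, obs_value):
    """Relative error in percent; negative values mean underestimation."""
    return 100.0 * (sim_value - obs_value) / obs_value


def classify(re_percent, tolerance=25.0):
    """Label a relative error as within tolerance, over- or underestimated."""
    if abs(re_percent) <= tolerance:
        return "within"
    return "over" if re_percent > 0 else "under"


def evaluate(sim, obs):
    """Classify the RE of every signature for one simulated/observed pair."""
    sig_sim, sig_obs = signatures(sim), signatures(obs)
    return {name: classify(relative_error(sig_sim[name], sig_obs[name])) for name in sig_obs}


# Toy data: the simulation gives half the observed flow at every time step.
obs_flow = [float(v) for v in range(1, 102)]
sim_flow = [0.5 * v for v in obs_flow]
print(evaluate(sim_flow, obs_flow))
```

Halving the flow leaves the coefficient of variation unchanged while every volume-based signature is underestimated, a simple case of a signature being captured in relative terms but not in magnitude.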
For other water-balance indices, it was interesting to note that the ratio between precipitation and river flow (RunoffCo) shows good results (RE ± 25%) all over Köppen region C (Temperate) but otherwise is often underestimated for some parts of the quartile range of physiographic variables studied. On the contrary, precipitation minus flow (ActET) is over-estimated in parts of the quartile range, except for the good results in Köppen region C, needle-leaved, deciduous trees (TreeNeDec) and regions with snow and ice (i.e. where mean specific runoff failed). Figure 7 clearly shows the compensating errors between processes governing the runoff coefficient and actual evapotranspiration, with one being over-estimated when the other is underestimated for the same specific physiographic conditions. This indicates the need for recalibrating the HRUs of WWH in its next version, but also reconsidering the initial parameters for evapotranspiration and the quality of the precipitation grid and its linkage with the catchments.

[...] improvements. A thorough analysis of spatial patterns would benefit from evaluation against independent data of spatial patterns of hydrological variables, for instance from Earth Observations.

In general, the WWH model has severe problems with dry regions and base flow conditions, especially where the flow is sporadic (e.g. red areas in Fig. 5). Such areas are known to be difficult to model (Bloeschl et al., 2013). For instance, most model concepts, and also the WWH, have problems with the Great Plains of the US (e.g. Mizukami et al., 2017; Newman et al., 2017), where the terrain is complex with prairie potholes, which are disconnected from the rivers, and precipitation comprises a major source of hydrologic model error (e.g. Clark and Slater, 2006).
Poor model performance was also found for the tundra and deserts, but it should then be recognized that the parameters for these regions were estimated using only four time-series for bare soils (Table 6); including more gauging stations would be a way to improve the model here. In large parts of Africa, however, model errors could be linked to the soil-runoff parameters, and local calibration based on catchment similarities has already been found to improve the performance a lot in West Africa.

In the snow-dominated part of the globe, extensive hydropower regulation changes the natural variability of river discharge, but the global databases miss all medium and small dams that may affect discharge along these river networks. A general problem with modelling river regulation is that reservoirs can have multiple purposes and must be examined individually to understand the regulation schemes applied. Such analyses have started and shown potential to improve the global model a lot, as the poorest model results are often linked to river regulations. However, individual reservoir calibration will be very time-consuming, so instead we suggest starting with improvements that can be undertaken relatively quickly and easily. These mainly focus on the overall water balance.

Firstly, the global water balance can be improved through re-calibration, but some basic concepts need to be adjusted accordingly: (i) more careful analyses indicate that the choice of climate regions based on Köppen's classification for applying the different PET algorithms was not optimal and needs some adjustments, and (ii) linking the centroid of the catchments to the nearest precipitation grid seems to remove a lot of the spatial variation, and instead an average of nearest grids should be tried.
Secondly, the HRUs can be recalibrated and reconsidered, and we suggest (i) testing a calibration scheme based on regionalized rather than global parameters, using clustering based on physiographic similarities (e.g. Hundecha et al., 2016), and (ii) including soil properties in the HRU concept again (as in the original version of HYPE, see Lindström et al., 2010) to account for spatial variability in soil-water discharge linked to porosity in addition to vegetation and elevation. Thirdly, the behavior of hydrological features, such as lakes, reservoirs, glaciers, and floodplains, can be evaluated and calibrated separately, after categorizing them more carefully or through individual tuning. Finally, more observations can be included, both in situ by adding more gauges to the system and from global Earth Observation products, for instance on water levels and storage. Hence, each step in Fig. 3 still has potential for model improvements.

The stepwise parameter-estimation approach should ideally be cycled a couple of times to find robust values under new fixed parameter conditions. However, as the model was carefully evaluated during the calibration, many bug fixes, corrections and additional improvements were made between the steps, and time was spent on these rather than on several complete iterations. Therefore, the stepwise calibration was subjected to several re-takes and shifts between steps until it could eventually complete all the calibration steps in one entire sequence (Fig. 8). Hence, only one loop was done for parameter estimation in this study. The procedure was judged as very useful for the model to be potentially right for the right reason, but also very time-consuming.
However, applying a catchment modeler's approach, this is inevitable for reliable integrated catchment modelling, and both the step-wise calibration and the iterative model corrections will continue with new model versions.

we currently lack such studies for global hydrological models. Focus should then be on comparing model performance in general, but also on input data and the performance of specific hydrological processes, to understand differences between various model concepts. The latter could be done by using the representative gauged basin approach, as in this study, to evaluate model performance for sites where flow is dominated by certain processes, or by analyzing specific parts of the hydrograph or flow signatures that represent time periods when specific processes dominate the flow generation. In addition to river gauges, other data sources should be used for model evaluation of spatial patterns, e.g. Earth Observations. Specific areas that are intensively managed and impacted by humans should also be distinguished and evaluated separately to better understand process variability vs human impacts. Various sources of input data (from which errors may propagate) should also be evaluated to improve global hydrological modelling.
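The water-balance indices discussed in this section (RunoffCo, ActET and the relative error RE with its ± 25% band) can be illustrated with a minimal Python sketch. It is not taken from the HYPE code. The function names are hypothetical, the runoff coefficient is assumed here to be flow divided by precipitation, and the long-term means are invented numbers chosen only to show how an underestimated flow produces compensating errors in the two indices.

```python
def relative_error(sim, obs):
    """Relative error in percent: 100 * (sim - obs) / obs."""
    if obs == 0:
        raise ValueError("observed value must be non-zero")
    return 100.0 * (sim - obs) / obs


def runoff_coefficient(precip, flow):
    """Runoff coefficient, assumed here to be flow / precipitation."""
    return flow / precip


def actual_et(precip, flow):
    """Long-term actual evapotranspiration approximated as precipitation minus flow."""
    return precip - flow


def within_band(re_percent, band=25.0):
    """True if the relative error lies within +/- band percent."""
    return abs(re_percent) <= band


# Hypothetical long-term means in mm/yr (illustrative values only).
precip = 800.0
q_obs = 300.0
q_sim = 240.0  # simulated flow is too low

re_runoffco = relative_error(runoff_coefficient(precip, q_sim),
                             runoff_coefficient(precip, q_obs))
re_actet = relative_error(actual_et(precip, q_sim),
                          actual_et(precip, q_obs))

print(f"RE RunoffCo: {re_runoffco:.1f}%  within band: {within_band(re_runoffco)}")
print(f"RE ActET:    {re_actet:.1f}%  within band: {within_band(re_actet)}")
```

With the same precipitation on both sides, the two relative errors necessarily have opposite signs, which is the compensating pattern described for Figure 7. In this sketch a mismatch in the precipitation input itself would have to be represented by using different precipitation values for the simulated and observed indices.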

5.26.2
Model usefulness
Catchment models are often applied by water managers and usefulness is part of the concept. The analysis of WWH model performance shows that even this first version can to some extent be useful for water managers in several regions globally. For instance, long-term averages are rather reliable in Eastern USA, Europe, South-East Asia and Japan, as well as in most of Russia, Canada, and South America. Here the model could thus be used for e.g. analyzing shifts in water resources between different climate periods. For high flows, monthly values and the spatial pattern of relative values both show good performance. This implies that the model could already be used for seasonal forecasting of recharge to hydropower reservoirs, for which these variables are often used. Accordingly, the model has been applied for producing water-related climate impact indicators and it is set up operationally to provide monthly river-flow forecasts for 6 months ahead (http://hypeweb.smhi.se/).

In many areas, HYPE should still be considered as a scientific tool and cannot be used locally by water managers because of poor performance. However, the model provides a first platform for catchment modelling to be further refined and experimented with at the global, regional and local scales. Parts of the model can be extracted (e.g. specific catchments or countries) and used as infrastructure when starting the time-consuming process of setting up a catchment model.

However, the WWH resulting from this first model version should be used with caution (especially in dry regions) as the performance may still be of low quality for local or regional applications in water management. Geographically, the model performs best in Eastern USA, Europe, South-East Asia and Japan, as well as in parts of Russia, Canada, and South America.
The model shows overall good potential to capture flow signatures of monthly high flows, spatial variability of high flows, duration of low flows and constancy of daily flow. Nevertheless, there remains large potential for model improvements, and it is suggested both to redo the calibration and to reconsider parts of the model structure for the next WWH version.

The step-wise calibration procedure was judged as very useful for the model to be potentially right for the right reason, but also very time-consuming. The calibration cycle is suggested to be repeated a couple of times to find robust values under new fixed parameter conditions, which is a long-term commitment of continuous model refinement. The model set-up will be released in new model versions during this incremental improvement. For the next version, special focus will be given to the water balance (i.e. precipitation and evapotranspiration), soil storage and dynamics from hydrological features, such as lakes, reservoirs, glaciers and floodplains.

The model will be shared by providing a piece of the world to modellers working at the regional scale, in order to benefit from local knowledge, establish a critical mass of experts from different parts of the world and improve the model in a collaborative manner. The model can serve as a fast track to a model environment for users who do not have one ready at hand, and in return the WWH can be improved from feedback on hydrological processes from local experts across the world. Potentially, it will accelerate scientific advancement if more researchers start using the same tools and data, which makes it easier to be transparent when evaluating and comparing scientific results.
Code availability
Hypecode.smhi.se

Data availability
Hypeweb.smhi.se

Appendices

Acknowledgements
We would like to thank all data providers listed in Tables 1-3 who make their results and observations readily available for re-purposing; without you, global hydrological modelling would not be possible at all. In particular, we would like to express our gratitude to Dr. Dai Yamazaki, University of Tokyo, for developing and sharing the global width database for large rivers, which we found very useful. The WWH was developed at the SMHI Hydrological Research unit, where much work is done in common, taking advantage of previous work and several projects running in parallel in the group. It was indeed teamwork. We would especially like to acknowledge contributions from our colleagues Jörgen Rosberg, Lotta Pers, David Gustafsson and Peter Berg, who provided much of the model infrastructure.