High-resolution global topographic index values for use in large-scale hydrological modelling

Modelling land surface water flow is of critical importance for simulating land surface fluxes, predicting runoff and water table dynamics and for many other applications of Land Surface Models. Many approaches are based on the popular hydrology model TOPMODEL (TOPography-based hydrological MODEL), and the most important parameter of this model is the well-known topographic index. Here we present new, high-resolution parameter maps of the topographic index for all ice-free land pixels calculated from hydrologically conditioned HydroSHEDS (Hydrological data and maps based on SHuttle Elevation Derivatives at multiple Scales) data using the GA2 algorithm (GRIDATB 2). At 15 arcsec resolution, these layers are 4 times finer than the resolution of the previously best-available topographic index layers, the compound topographic index of HYDRO1k (CTI). For the largest river catchments occurring on each continent we found that, in comparison with CTI, our revised values were up to 20 % lower (e.g. in the Amazon). We found the highest catchment means were for the Murray–Darling and Nelson–Saskatchewan rather than for the Amazon and St. Lawrence as found from the CTI. For the majority of large catchments, however, the spread of our new GA2 index values is very similar to that of CTI, yet with more spatial variability apparent at fine scale. We believe these new index layers represent greatly improved global-scale topographic index values and hope that they will be widely used in land surface modelling applications in the future.


Introduction
Land surface models (LSMs) are widely used for predicting the effects of global climate change on vegetation development, runoff and inundation, evapotranspiration rates and land surface temperature (Gerten et al., 2004; Prentice et al., 2007; Clark and Gedney, 2008; Dadson and Bell, 2010; Dadson et al., 2010, 2011; Wainwright and Mulligan, 2013; IPCC, 2013). However, the simulation of hydrological dynamics within LSMs remains relatively simplified because these models are usually run at coarse spatial resolution (up to 300 km grid boxes) and the physics they follow is based predominantly on approximations of processes that occur at much finer spatial scales (Ducharne, 2009; Wainwright and Mulligan, 2013). Correctly characterising hydrology is very important because landscape-scale water movements (∼10-100 km scale) and changes in the water cycle control many effects ranging from local energy and carbon fluxes to land-atmosphere feedbacks to the climate system to potentially catastrophic changes in vegetation distributions.
Land Surface Models require a representation of surface and subsurface runoff. Models of runoff production used in regional and continental applications typically contain parameterised physics based on statistical representations of processes known to operate at finer scales (Ward and Robinson, 2000; Clark and Gedney, 2008), which can lead to inaccurate predictions in data-sparse regions and generally high uncertainty. The large quantity of detailed topographic information now widely available at sub-landscape-scale resolutions offers an opportunity to improve the fidelity of large-area simulations of the hydrological cycle, for the benefit of both climate and hydrological models (Dharssi et al., 2009; Wainwright and Mulligan, 2013).
Currently, the most common approach to inundation prediction is to use a runoff production scheme such as TOPMODEL (TOPography-based hydrological MODEL), which partitions runoff from the soil column into surface and subsurface components (Beven and Kirkby, 1979; Quinn et al., 1991, 1995; Beven, 1997, 2012; e.g. MacKellar et al., 2013). One of the most important configurational parameters for TOPMODEL is the well-known topographic index (defined in Appendix A), which is widely used in hydrology and terrain-related applications (Ward and Robinson, 2000; Wilson and Gallant, 2000).
The HYDRO1k global values for the compound topographic index (CTI) were released by USGS in 2000 (USGS, 2000) and they have since become the most commonly used global ancillary files for topographic index values. HYDRO1k was a great step forward in the development of global hydrological modelling applications: it allowed spatially explicit hydrological routines to be incorporated into LSMs for the first time and large-scale applications of the TOPMODEL hydrological model to become standard (Beven, 2012). The recent availability of topographic maps at even higher spatial resolution with globally consistent coverage builds on this and means that further improvements can now be made.
The limitations of HYDRO1k CTI values become most apparent when considering wetland areas. Wetlands are critical nodes in the Earth system where land-atmosphere fluxes are strongly dependent on seasonal and interannual hydrological variability (Coe, 1998; Baker et al., 2009; O'Connor et al., 2010; Dadson et al., 2010). In wetlands, the availability of water introduces important feedbacks on climate via surface fluxes of energy and water and these areas form a key link between the hydrological and carbon cycles (Ward and Robinson, 2000; Gedney et al., 2004; Seneviratne et al., 2006, 2010; Coe et al., 2009; Dadson et al., 2010). Some analyses based on CTI values have consistently overestimated the extent and duration of tropical wetlands of various types. Notably, simulations using the Earth system model HadGEM2, which is parameterised using CTI (Collins et al., 2011), predict much larger and more persistent Amazonian wetlands than actually exist according to current surveys (e.g. Lehner and Döll, 2004; Prigent et al., 2007; Junk et al., 2011), which may at least partly be caused by the limited resolution and quality of the HYDRO1k CTI.
In the context of LSMs, the need for high-resolution topographical data across wide spatial domains has recently been highlighted (Lehner et al., 2008; Wood et al., 2011; Lehner and Grill, 2013). With the advent of satellite-based global mapping, notably the Shuttle Radar Topography Mission (SRTM), there has been a significant improvement in the availability of high-resolution data with continental coverage, such as in the high-resolution global HydroSHEDS (Hydrological data and maps based on SHuttle Elevation Derivatives at multiple Scales) data layers (Lehner et al., 2008), but such data have generally not yet been utilised to support large-scale hydrological modelling studies (Wood et al., 2011, 2012).
In this study, we respond to the need for higher-resolution data for use in LSMs. We have three main aims: (1) to calculate the topographic index using the GA2 algorithm based on high-resolution global HydroSHEDS data; (2) to compare our values to the current standard for values of this index (the CTI of HYDRO1k); and (3) to discuss current developments in large-scale hydrological modelling and how models can benefit from higher-resolution parameter maps such as these.

Topographic index
The topographic index is a parameter of the TOPMODEL hydrological model (Beven and Kirkby, 1979; Quinn et al., 1991, 1995; Beven, 1997, 2012). The algorithm required for calculating this index is relatively simple (Appendix A), but it has not previously been applied to generate a global data layer at very high spatial resolution because (1) the index must be calculated from harmonised topographic information, which only became available in the 2000s and (2) LSMs have only recently become sophisticated enough to make use of such a high-quality layer (Prentice et al., 2007).

The HydroSHEDS "hydrologically conditioned" layers
Grid-based topographic index calculations require a digital elevation model (DEM) and we have used the HydroSHEDS DEM (Lehner et al., 2008; http://www.hydrosheds.org/). The HydroSHEDS data layers were derived from raw SRTM data at 3 arcsec pixel resolution (approximately 90 m at the Equator) through the application of hydrological conditioning in a sequence of correction steps (Lehner, 2013), which resulted in a globally consistent suite of grid layers that were subsequently upscaled to a resolution of 15 arcsec (approximately 450 m at the Equator). We acquired the HydroSHEDS DEM and also a layer of pre-calculated contributing upstream catchment areas for each 15 arcsec pixel (UPLAND, in m^2; Lehner, 2013). As of April 2014, HydroSHEDS data have only been produced at the highest quality for all land areas south of 60° N (the limit of SRTM); so for areas at higher latitude we substituted the HYDRO1k DEM to provide seamless global grids (more specifically, the GTOPO30 DEM underlying HYDRO1k disaggregated to 15 arcsec resolution by tiling the larger pixels and applying a 3 × 3 kernel average filter to smooth the resulting surface).
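The disaggregation step described above for areas north of 60° N can be illustrated with a minimal sketch. The function names, the tiling factor of 2 (30 arcsec GTOPO30 pixels split into 15 arcsec pixels), the elevation values and the treatment of grid edges are assumptions made for illustration only; the processing applied to the published layers may differ in these details.

```python
def disaggregate_by_tiling(grid, factor):
    """Split every coarse pixel into factor x factor identical fine pixels."""
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        for _ in range(factor):
            fine.append(list(fine_row))
    return fine


def smooth_3x3(grid):
    """Replace each pixel by the mean of its 3 x 3 neighbourhood.

    Edge pixels average over the neighbours that exist (an assumption;
    the edge treatment is not stated in the text).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            window = [
                grid[r][c]
                for r in range(max(0, i - 1), min(n_rows, i + 2))
                for c in range(max(0, j - 1), min(n_cols, j + 2))
            ]
            out[i][j] = sum(window) / len(window)
    return out


# Hypothetical 2 x 2 block of coarse elevations (m).
coarse_dem = [[100.0, 200.0],
              [300.0, 400.0]]
fine_dem = disaggregate_by_tiling(coarse_dem, 2)  # 4 x 4 grid of tiled values
smoothed_dem = smooth_3x3(fine_dem)               # steps between tiles are softened
```

Tiling alone leaves artificial steps at the boundaries of the original coarse pixels; the averaging filter spreads each step over neighbouring fine pixels, which is presumably why the two operations are applied together.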

Generating the ancillary files
Our calculations had to be carried out over domains composed of complete watersheds, so we mosaicked both the DEM and UPLAND tiles into a global data layer using ArcGIS 10.1 (Esri Inc., Redlands, California). These two input layers were then converted to NetCDF format using GDAL (Geospatial Data Abstraction Library; OSGF, 2011).
Topographic index values were calculated using the GA2 algorithm, which is the widely used GRIDATB algorithm with some modifications for use with HydroSHEDS data (see Appendix A for details). Resulting index values for the global land surface were then filtered to remove areas for which topographic index values are invalid or meaningless, including lakes and reservoirs (masked out using the Global Lakes and Wetlands Database; Lehner and Döll, 2004), mountain glaciers and ice caps (using the Randolph Glacier Inventory; Pfeffer et al., 2014) and the Greenland ice sheet (using Lewis, 2009).
Because of the large layer size (1.2 × 10^9 land pixels), all continental-scale GA2 calculations were run on the ARCUS server, a 1344-core computer cluster at the Oxford e-Research Centre (OeRC). Zonal histograms were plotted using ArcGIS 10.1 and subsequent statistics calculated using R (R Development Core Team, 2013).

Results
We produced a layer of topographic index (TI) values following the GA2 algorithm for all ice-free land pixels worldwide (Fig. 1). TI values calculated this way are not just relative measures but consistent and comparable between catchments (Appendix A), so we may compare global values: as expected, TI values are low at ridge-tops (minimal catchment area) and high in valleys (along drainage paths and in zones of water concentration in the landscape; Wilson and Gallant, 2000), yielding a global range of 0.00-25.00 and average of 5.99 (Fig. 2).
Wetter areas of the globe generated generally higher TI values (Fig. 1), although there are many exceptions to this (e.g. in desert areas where high TI values do not correlate with high flow accumulation, at least in the present climate). Zonal statistics calculated for the various lake and wetland types of the world (as defined by the Global Lakes and Wetlands Database, see Table 1) show that pixels representing rivers had the highest TI values (global mean 8.81 over 0.42 × 10^6 km^2), but also the highest variance with some river pixels scoring below the global mean for ice-free land outside lakes, reservoirs, wetlands and wetland complexes (global mean 5.88 over 128.99 × 10^6 km^2). In terms of TI, wetland complexes in Asia (mostly occurring in India and Tibetan China; Table 1) and mires (mostly occurring in boreal Canada and the Russian Federation) were indistinguishable from dry land (Fig. 3), indicating that wetlands in both these areas are maintained by factors other than topography (e.g. rainfall and evapotranspiration).
Table 1. Topographic index values from the GA2 algorithm applied to HydroSHEDS data (Appendix A) compared to CTI values from HYDRO1k for all global river basins larger than 10^6 km^2 (SD: standard deviation). Note that some sources quote much higher index values, but these are often not scale-corrected values and are therefore not directly comparable (e.g. Yang et al., 2007). Column groups: CTI of HYDRO1k; GA2.
In comparison to HYDRO1k, the new TI values from GA2 based on HydroSHEDS were higher for river pixels and slightly higher for intermittent wetlands and lakes. TI values were lower at pixels in tropical swamp forests and inundated forests and also slightly lower in coastal wetlands.
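The zonal statistics reported above (mean and SD of the index within each wetland class) can be sketched as follows. This is only an illustration: the function name, the class labels and the index values are hypothetical, and every pixel is given equal weight, whereas pixel areas on a geographic grid vary with latitude.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zonal_statistics(index_values, zone_labels):
    """Return {zone: (mean, SD, pixel count)} for two flattened, aligned rasters."""
    groups = defaultdict(list)
    for value, zone in zip(index_values, zone_labels):
        if value is None:  # masked pixel (e.g. lake, reservoir, glacier, ice sheet)
            continue
        groups[zone].append(value)
    return {
        zone: (mean(values), pstdev(values), len(values))
        for zone, values in groups.items()
    }


# Hypothetical index values and wetland classes for six pixels.
ti = [8.0, 10.0, 5.0, 7.0, None, 6.0]
zones = ["river", "river", "dry land", "dry land", "lake", "dry land"]
stats = zonal_statistics(ti, zones)
```

Masked pixels are skipped before grouping, so a class that contains only masked pixels (here "lake") does not appear in the result at all.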
The new TI values from GA2/HydroSHEDS were in line with HYDRO1k values for CTI (USGS, 2000) at most global pixels, but in certain areas there were significant divergences. Considering all river catchments larger than 10^6 km^2 in particular (Table 2), values from GA2 were lower for many basins, most notably the Amazon, Congo, Paraná, Niger and St. Lawrence rivers; in the case of the Amazon as much as 20 % lower than the CTI values (Fig. 4). According to our calculations, the catchments with the highest spatially averaged TI values were the Murray-Darling, Nelson-Saskatchewan, Nile and Niger (compared to the order Amazon, St. Lawrence, Niger and Nelson under the CTI calculations, although n.b. HYDRO1k's CTI included no estimates for the Murray-Darling; Table 2). Although it might be expected that the size of the Amazon floodplain would be enough to ensure it scored the highest TI, note that (i) there is no globally consistent correlation between wetland area and TI (Fig. 3) and (ii) because these are spatial averages, the density of wetland within each catchment is more important than the absolute wetland size (and the Nelson-Saskatchewan, for example, is known for a high density of wetland terrain).
Table 2. Topographic index values from the GA2 algorithm applied to HydroSHEDS data (Appendix A). For a map of the extent of these wetland types, see Lehner and Döll (2004).
Table footnote (continued; cf. Lehner et al., 2008): Permanent lakes and reservoirs cover 1.23 × 10^6 km^2 globally (Lehner and Döll, 2004), the Greenland ice sheet covers 1.99 × 10^6 km^2 (Lewis, 2009) and all glaciers cover 0.80 × 10^6 km^2 (Pfeffer et al., 2014).
As expected, our new index values reflect the same pattern of below-average values in mountainous areas and above-average values in lowland areas as seen in HYDRO1k; however, more variability is visible in the histograms for GA2 because the higher resolution means that more of the smaller river valleys within the mountain ranges become visible (leading to an increase in the mean and spread of index values, e.g. in the Mackenzie Mountains; Fig. 5c). Also visible on the zoomed comparison maps of the Rocky Mountains (Fig. 6) is an example of differing qualities of HYDRO1k vs. HydroSHEDS data: on the eastern half of the maps, the CTI version shows a series of blue, lake-shaped objects with topographic indices in the range of 10 (also visible as a small rise in the corresponding histogram; Fig. 5a), while the GA2 version does not show these features. These objects represent valleys that are drained, in reality, through narrow gorges or river channels. The higher-resolution data of HydroSHEDS (and possibly manual corrections) are capable of resolving this issue correctly. Yet due to the coarser resolution of HYDRO1k, the valleys would appear as closed depressions in the DEM; the standard GIS solution to enforce continued drainage in such cases is to lift the topography until overflow occurs (using a sink-filling algorithm); the resulting (erroneous) flat topography then leads to overestimated CTI values.
Figure 3. (..., 2000), both of which applied the scale correction of Ducharne (2009). Boxes show mean ± SD index values for the global distribution of that wetland type. For reference, the mean topographic index value for ice-free land outside wetlands is shown by a broken line on all panels (Table 1).
Index values at 15 arcsec resolution are available at http://doi.org/10/t7d in NetCDF format (a version in GeoTIFF format, translated using GDAL (OSGF, 2011), is available on request).
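To make the sink-filling effect discussed above concrete, the following sketch raises every closed depression in a small DEM to the level of its lowest outlet. It uses a priority-queue ("priority-flood" style) approach, which is one common way of filling sinks and is not necessarily the procedure used to produce HYDRO1k; the function name and the elevation values are hypothetical.

```python
import heapq


def fill_sinks(dem):
    """Raise every closed depression to the level of its lowest outlet."""
    n_rows, n_cols = len(dem), len(dem[0])
    filled = [list(row) for row in dem]
    visited = [[False] * n_cols for _ in range(n_rows)]
    heap = []
    # Seed the queue with the grid boundary, where water can always leave.
    for i in range(n_rows):
        for j in range(n_cols):
            if i in (0, n_rows - 1) or j in (0, n_cols - 1):
                heapq.heappush(heap, (filled[i][j], i, j))
                visited[i][j] = True
    # Work inwards, always expanding from the lowest cell reached so far.
    while heap:
        level, i, j = heapq.heappop(heap)
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                r, c = i + di, j + dj
                if 0 <= r < n_rows and 0 <= c < n_cols and not visited[r][c]:
                    visited[r][c] = True
                    # A cell lower than its route to the edge is lifted to that level.
                    filled[r][c] = max(filled[r][c], level)
                    heapq.heappush(heap, (filled[r][c], r, c))
    return filled


# Hypothetical valley (floor at 1 to 5 m) whose real outlet gorge is too narrow
# to be resolved: the lowest rim cell is at 7 m.
dem = [
    [9.0, 9.0, 9.0, 9.0, 9.0],
    [9.0, 5.0, 5.0, 5.0, 9.0],
    [9.0, 5.0, 1.0, 5.0, 9.0],
    [9.0, 5.0, 5.0, 5.0, 7.0],
    [9.0, 9.0, 9.0, 9.0, 9.0],
]
filled_dem = fill_sinks(dem)
```

In this example the whole valley floor is lifted to 7 m, so the filled surface is perfectly flat. With near-zero slopes and a large contributing area, such pixels receive high index values, which is consistent with the lake-shaped objects seen in the CTI map.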

Discussion
Modelling soil water flow and runoff generation is of critical importance for simulating land surface fluxes, predicting water table dynamics, wetland inundation and river routing and, at a regional scale, quantifying surface evaporation rates and the growth, transpiration and seasonality of vegetation (Ward and Robinson, 2000; Baker et al., 2009; Dadson et al., 2010; Marthews et al., 2014). Landscape-scale hydrological processes are therefore key elements in modelling land surface-atmosphere exchange processes and critical to the successful use of coupled LSMs to predict the effects of climate change at larger scales.
The hydrological routines of LSMs have undergone steady improvement in recent years (Wood et al., 2011; Zulkafli et al., 2013; Wainwright and Mulligan, 2013). However, these landscape-scale processes remain generally less well-modelled than processes operating at the finer local scale (e.g. photosynthesis models) or larger continental scale (e.g. general circulation models). Arguably, the development of landscape-scale processes has been relatively slow not just because of a lack of complete understanding of the processes involved, but also, more simply, because of the limited availability of high-resolution parameter maps for the models concerned (Wood et al., 2011; Wainwright and Mulligan, 2013; Marthews et al., 2014). Because LSMs are now being applied at increasingly high spatial resolution in order to analyse the distribution and movement of water resources, model development is gaining momentum. Large-scale gridded simulations based on high-resolution drivers are now becoming routine, and this has led to an increasingly recognised need for the high-resolution data sets required to drive those simulations (e.g. Wood et al., 2011, 2012; Beven and Cloke, 2012; Castanho et al., 2013).

High-resolution hydrological modelling
TOPMODEL was originally applied at the scale of small catchments, using pixels smaller than 50 m × 50 m in extent (Quinn et al., 1991, 1995; Ward and Robinson, 2000; Beven, 2012), with the index values understood to have relative significance only (i.e. similar values calculated in different catchments do not necessarily imply hydrological similarity; see Chappell et al., 2006). There have been many developments from this basic framework over the years (e.g. see Wolock, 1993; Wilson and Gallant, 2000; Hjerdt et al., 2004; Beven, 2012) and this study has likewise taken a novel approach. Notably, we have applied our calculations at continental scales with larger pixels (approximately 450 m × 450 m at the Equator), using the resolution correction of Ducharne (2009; also see Moore et al., 1993; Wolock and McCabe, 1995; Clark and Gedney, 2008). Additionally, because our calculations are carried out over complete continental land masses, the index values derived may be considered to be consistent and comparable between catchments.
Although we accept the arguments of Beven and Cloke (2012) that moving to higher-resolution data sets is not the only line of development that should be followed, ultimately we support the ideas of Wood et al. (2011, 2012) that increasing the resolution at which global hydrological simulations are carried out will have many benefits including the more realistic representation of processes currently at subgrid resolution and, ultimately, better weather and inundation prediction (Wood et al., 2011). Methane production in wetlands, for example, is critically dependent on the level of the water table (Gedney et al., 2004; O'Connor et al., 2010; Pangala et al., 2013), models of which are in turn dependent on accurate representation of the topography; therefore, higher-resolution simulations involving improved topographic index values should of necessity improve the representation of wetland fluxes of heat, water and trace gases to the atmosphere (Gedney et al., 2004) and overall estimates of methane release.
In this study we have refined the standard topographic index calculations and greatly improved their spatial resolution. We have presented our new maps of topographic index values both by wetland type (using the Global Lakes and Wetlands Database; Lehner and Döll, 2004) and also in terms of the largest river catchments occurring on each continent, finding that, in comparison to our revised values, HYDRO1k's CTI topographic index values were significantly higher in some catchments (Table 2). In most large catchments, however, the spread of our new GA2 index values was very similar to that of CTI, yet with more spatial variability apparent at fine scale (Figs. 4, 5).

Limitations of the GA2 algorithm
The topographic index is a measure of the relative propensity for soil to become saturated to the surface as a result of local topography (Beven, 2012).We have calculated it using a robust algorithm (GA2) based on the original implementation of these calculations (GRIDATB; Appendix A).
Although topographic index values are comparable between different areas, it is important to remain careful when interpreting their meaning in different regions, such as arid vs. humid, or shallow vs. deep soils (i.e. when factors other than topography influence water accumulation in the landscape).
In regions where saturation-excess overland flow (the component of runoff most affected by topography) is not the dominant runoff generation mechanism, uncertainties in inundation predictions based on TOPMODEL must be carefully calculated and predictions interpreted with care (see Beven, 2012).
A well-known limitation of topographic index values is that they are not absolute because the maximum value in any particular catchment is dependent on the catchment's area and slope profile. Therefore, we could not carry out more than an ad hoc comparison between TI and CTI (because there is no independent baseline to refer to other than HYDRO1k itself). Histograms of TI and CTI values correspond closely (e.g. Fig. 5), though, and the consistency and rigour of the algorithm we have applied as well as the improved HydroSHEDS base data used for the calculation lead us to believe that our values are at least as robust as CTI at all spatial points.
A second limitation of our method is that we have used global base elevation data that is not on an equal-area projection. The HydroSHEDS data layers are projected using a geographic coordinate system (GCS, with WGS84 datum), i.e. a grid of unrotated cells that become increasingly distorted in the east-west direction as latitude increases. This implies that slopes will be underestimated in east-west directions at higher latitudes as true pixel distances are getting shorter (Appendix A). There is no appropriate method, however, to avoid uncertainty completely in the slope calculations as the underlying SRTM elevation measurements are already unequally spaced, and as there is no commonly agreed method to calculate slopes or drainage (flow) directions (see Appendix A). We assume that our calculations of the steepest gradients with average pixel distances provide a reasonable compromise to approximate the real slope of each pixel (see Appendix A).
Finally, as a related issue, we used HydroSHEDS UPLAND data which are ultimately based on the underlying D8 concept of deriving drainage directions from the steepest slopes. We acknowledge that recent advances in creating DEM-based drainage networks (e.g. D8-LTD or other options such as FD8 or MD8; Orlandini et al., 2014) provide avenues to alter and potentially improve the drainage-direction calculations and, in consequence, the topographic index values, but testing for the individual effects is beyond the scope of this project due to the multiscale complexity involved (see Appendix B for further explanations). We believe, however, that while these methods may have a significant effect on local drainage directions and channel routing, the cumulative calculation of "contributing upstream area" is less affected.
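The size of the projection effect described above can be estimated with a short sketch. It assumes a spherical Earth of radius 6371 km, treats each 15 arcsec cell as a small rectangle, and takes the average pixel distance DX to be the square root of the cell area (as described in Appendix A); the function and constant names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0           # mean spherical radius (assumption)
CELL_RAD = math.radians(15.0 / 3600)   # 15 arcsec expressed in radians


def pixel_dimensions(lat_deg):
    """Return (east-west width, north-south height, DX) in metres at a latitude."""
    height = EARTH_RADIUS_M * CELL_RAD
    width = height * math.cos(math.radians(lat_deg))
    dx = math.sqrt(width * height)     # side of a square with the same area
    return width, height, dx


def slope_bias(lat_deg):
    """Ratio of a slope computed with DX to the true slope, per direction.

    A rise dz over a true distance d gives a true slope dz / d, whereas the
    calculation uses dz / DX, so the ratio is d / DX.
    """
    width, height, dx = pixel_dimensions(lat_deg)
    return width / dx, height / dx     # (east-west, north-south)


bias_equator = slope_bias(0.0)   # both ratios equal 1: cells are square
bias_60n = slope_bias(60.0)      # east-west below 1, north-south above 1
```

Under these assumptions the east-west ratio falls to about 0.71 at 60° N and the north-south ratio rises to about 1.41, which illustrates why the averaged distance is described as a compromise rather than an exact correction.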

Conclusions
LSMs have now been applied over many years to the problem of explaining and predicting global climate change (Prentice et al., 2007; IPCC, 2013). Recent developments in land surface modelling and Earth observation have attempted to incorporate better hydrological understanding into these applications, with a particular focus on a better characterisation of the physical processes that control the water cycle (Coe, 1998; Gedney and Cox, 2003; Coe et al., 2009; Dadson and Bell, 2010; Dadson et al., 2010, 2011; Zulkafli et al., 2013). This study offers a new high-resolution, spatially consistent data layer of topographic index values for all ice-free land pixels worldwide based on the hydrologically conditioned HydroSHEDS data (Lehner et al., 2008). These data layers are at 4 times the resolution of the HYDRO1k compound topographic index layers (USGS, 2000) and we believe represent the most accurate global-scale calculation of topographic index values that exists to date.

Appendix A
The topographic index is a measure of the relative propensity for the soil to, at a point, become saturated to the surface, given the area that drains into it A and its local outflow slope β (Beven, 2012; increasing A will tend to increase the accumulation of water, but increasing β will tend to reduce it by increasing gravitational outflow; Quinn et al., 1991). The index is often calculated using an algorithm called GRIDATB, originally written in 1983 by K. Beven of the Hydrology Group, University of Lancaster (revised for distribution 1993-1995 by P. Quinn and J. Freer and described in Quinn et al., 1991, 1995; for alternative calculations see e.g. Wolock, 1993).
We calculated topographic index values for each pixel using the GA2 algorithm, which is a slightly modified version of GRIDATB version 95.01 that has been written specifically for this study based on the basic loop structure implemented in Buytaert (2011) with some modifications to allow for the use of HydroSHEDS data. GA2 calculates the outflow gradient of each pixel (Fig. A1) and uses precalculated UPLAND values from HydroSHEDS for the catchment area A of each pixel (corrected for latitudinal projection distortions, B. Lehner, 2013).
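For reference, the standard form of the index (Beven and Kirkby, 1979) can be written in the notation of this appendix as follows. The symbol c for the outflow contour length is shorthand introduced here for illustration (it corresponds to clout in Fig. A1); this is the uncorrected index, before any resolution correction or slope threshold is applied.

```latex
\mathrm{TI} = \ln\!\left(\frac{a}{\tan\beta}\right),
\qquad
a = \frac{A}{c}
```

Here A is the upstream catchment area (m^2), c the outflow contour length (m), a the specific catchment area (m) and β the local outflow slope angle, so a larger A raises the index and a steeper slope lowers it.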
Because of the use of the HydroSHEDS DEM, we made three small modifications in GA2 to the standard GRIDATB calculations.
- We applied the correction for the DEM resolution suggested by Ducharne (2009) to allow calculations to be carried out at continental scales (see A1 below).
- The HydroSHEDS DEM does not have uniformly sized grid cells because of its native geographic projection (GCS WGS84) where pixel dimensions vary with latitude (i.e. the real width of a pixel gets increasingly shorter than the height towards the poles). Because slope directions are restricted to the eight cardinal and diagonal directions, we account for varying pixel dimensions in our slope calculations by taking an average distance between neighbouring pixels (rather than direction-dependent): we approximated DX as the square root of the area of each cell (with latitude-corrected pixel areas calculated using the Met Office Unified Model routine realat1.f90 written by T. Oki in 1996; Dadson and Bell, 2010). When away from the Equator, this implies that slopes will be slightly overestimated in north-south directions and underestimated in east-west directions.
- Finally, because the value of dfltsink is undefined on plains (i.e. areas of no outflow and no inflow, which occur more often when vertical resolution is lower) we followed USGS (2000) and Evans (2003) in applying a minimum of 0.001 to tan(β ).
A question arises when comparing catchments digitised at different resolutions (e.g. Chappell et al., 2006): how to compare topographic index values calculated from DEMs at different resolutions? Although not part of the original topographic index calculations, it has become accepted that topographic index values as calculated above should be reduced to the "equivalent" value for a 1 m resolution DEM by subtracting ln(DX) (and restricting the result to be ≥ 0). Applying this scale correction is becoming standard; e.g. see Ducharne (2009; also see Moore et al., 1993; Wolock and McCabe, 1995; Clark and Gedney, 2008).
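The slope threshold and the scale correction described above can be combined in a minimal per-pixel sketch. This is not the GA2 source code: the function name, argument names and example values are hypothetical, and applying the 0.001 minimum to the outflow slope term is an assumption about where the threshold enters the calculation.

```python
import math

MIN_TAN_BETA = 0.001  # minimum slope term, following the value quoted in the text


def topographic_index(upstream_area_m2, contour_length_m, tan_beta, dx_m):
    """Scale-corrected topographic index for one pixel.

    ln(a / tan(beta)) is reduced by ln(DX) to give the equivalent value for a
    1 m resolution DEM, and the result is restricted to be >= 0.
    """
    a = upstream_area_m2 / contour_length_m   # specific catchment area (m)
    slope = max(tan_beta, MIN_TAN_BETA)       # avoids division by zero on plains
    raw_index = math.log(a / slope)
    return max(raw_index - math.log(dx_m), 0.0)


# Hypothetical valley pixel: 100 km^2 upstream, gentle slope, 450 m cell.
valley = topographic_index(1.0e8, 450.0, 0.01, 450.0)
# Hypothetical flat pixel: the slope term falls back to the minimum.
plain = topographic_index(1.0e8, 450.0, 0.0, 450.0)
# Hypothetical steep ridge pixel draining only itself: clipped at zero.
ridge = topographic_index(450.0 * 450.0, 450.0, 5.0, 450.0)
```

With these inputs the valley pixel scores about 10.8, the flat pixel scores higher because of the minimum slope, and the ridge pixel is clipped to 0, matching the expected pattern of low values at ridge-tops and high values in valleys.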

Appendix B: Drainage direction and UPLAND calculations
Our calculations of topographic index values depend on the HydroSHEDS UPLAND layer containing the upstream catchment area draining into each point, and this layer in turn depends on the underlying drainage direction grid of HydroSHEDS. At its highest resolution of 3 arcsec, HydroSHEDS follows the D8 algorithm to determine drainage directions based on steepest slopes, which is considered the standard for use with large-scale routing models (e.g. TRIP, Grid-2-Grid, Dadson and Bell, 2010). However, in areas where turbulence or diffusional effects lead to significant hydrologic dispersion, flow lines may not coincide uniformly with slope lines (Rice et al., 2008; Orlandini et al., 2014). Deriving channel networks from terrain data has been an area of active recent research (e.g. Orlandini et al., 2014) and there is not yet universal agreement between the many different methods for calculating drainage/flow directions from DEM data (see discussions in Wilson and Gallant, 2000; Zhao et al., 2009; Orlandini et al., 2014).
At the upscaled 15 arcsec resolution of HydroSHEDS, the D8 concept is still valid in terms of providing one of eight possible neighbour pixels as the downstream direction; however, the direction values are not based on steepest slopes alone but also incorporate information from the 3 arcsec flow accumulation maps (Lehner, 2013). Additionally, a large number of manual corrections have been implemented over several years which modify the native DEM values ("hydrological conditioning"; Lehner, 2013). As a consequence, our use of HydroSHEDS has unavoidably involved an acceptance of these algorithms and manipulations, and testing alternative settings to derive drainage directions or routing schemes is beyond the logistical limits of this study as it would require coordinated changes in slope, upscaling, and correction procedures at the multiple scales involved.
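The basic D8 idea and the accumulation of upstream area that it implies can be sketched as follows. This is a plain textbook version on a square grid with uniform cells; it is not the HydroSHEDS procedure, which additionally involves hydrological conditioning and upscaling, and all names and elevation values are hypothetical.

```python
import math

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_directions(dem, cell_size=1.0):
    """For each cell, return the (row, col) of its steepest downslope neighbour.

    Cells with no lower neighbour (pits and flats) are given None.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    downstream = [[None] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            best_slope = 0.0
            for di, dj in NEIGHBOURS:
                r, c = i + di, j + dj
                if 0 <= r < n_rows and 0 <= c < n_cols:
                    distance = cell_size * math.hypot(di, dj)  # diagonals are longer
                    slope = (dem[i][j] - dem[r][c]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        downstream[i][j] = (r, c)
    return downstream


def upstream_area(dem, cell_area=1.0, cell_size=1.0):
    """Accumulate contributing area along D8 paths (each cell counts itself)."""
    downstream = d8_directions(dem, cell_size)
    n_rows, n_cols = len(dem), len(dem[0])
    area = [[cell_area] * n_cols for _ in range(n_rows)]
    # Visit cells from highest to lowest so that every donor is complete
    # before it passes its total on to the cell below it.
    order = sorted(
        ((dem[i][j], i, j) for i in range(n_rows) for j in range(n_cols)),
        reverse=True,
    )
    for _, i, j in order:
        target = downstream[i][j]
        if target is not None:
            area[target[0]][target[1]] += area[i][j]
    return area


# Hypothetical 3 x 3 valley sloping towards the bottom-centre cell.
dem = [
    [5.0, 4.0, 5.0],
    [4.0, 3.0, 4.0],
    [3.0, 2.0, 3.0],
]
directions = d8_directions(dem)
area = upstream_area(dem)
```

In this example all nine cells eventually drain to the bottom-centre cell, so its contributing area equals the whole grid, while the top corners contribute only their own area. Because each cell passes all of its area to a single neighbour, small changes in direction mainly reroute flow locally.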

Figure 1. Global topographic index values based on GA2 applied to HydroSHEDS base data (Appendix A). Blue shades indicate pixels with index values above the global mean (5.99) and brown shades indicate below-average values.

Figure 2. Histogram of global topographic index values (vertical line shows global mean of 5.99; global maximum is 25.0044 at a pixel within a river island at the confluence of the Amazon and Xingú rivers in Brazil).

Figure 4. Comparison of the CTI and GA2 calculations of the topographic index (from Table 2), showing that CTI values are larger for some catchments, most notably the Amazon, Congo, Paraná, Niger and St. Lawrence. Circle areas are proportional to catchment area and a one-one line is shown for reference. The largest catchments tend to be closest to the global average index value of 5.99 (also shown for reference). Histograms are shown for six catchments: the Rhine, Amazon, Lena, Congo, Yangtze and Mississippi-Missouri (each grey histogram shows CTI values, hatched histogram shows GA2; axes on all histograms are omitted: all are topographic index (horizontal) and fraction of pixels (vertical)): for catchments close to the one-one line, the corresponding histograms were closely similar.

Figure 5. Comparison of the CTI and GA2 calculations of the topographic index for four example areas of 10^6 km^2 each: (a) an area of the Rocky Mountains (USA), (b) the Lower Ob-Irtysh (Russian Federation), (c) an area of the Mackenzie Mountains (Canada) and (d) the Congo Basin (Democratic Republic of the Congo, Republic of the Congo, Cameroon and the Central African Republic) (see inset). These examples were chosen so that two are mountainous, two lowland plains, two are north of 60° N and two south, to demonstrate that the new topographic index values are a refinement on the CTI values of HYDRO1k. On each histogram, grey bars show CTI values, hatched bars show GA2 and a red broken line shows the global average index value of 5.99 for reference. Axes on all histograms are omitted: all are topographic index (horizontal) and fraction of pixels (vertical).

Figure 6. Comparison of the CTI and GA2 calculations of the topographic index for an area of the Rocky Mountains (USA). Maps of the CTI (left panel) and GA2 (right panel) values are shown (from which the histograms of Fig. 5a were calculated), with identical colour scale to Fig. 1. Note the 4400 km^2 Great Salt Lake, Utah, to the N of the area (which is masked out of the GA2 map (light blue) but included in CTI as if it were a flat plain) and the San Luis Valley, Colorado, to the SE, being the headwaters of the Rio Grande, USA.

Figure A1. Illustration of the topographic index calculation of GA2 for one pixel of a DEM (the black square) downstream from a catchment area A (in m^2, defined to include the area of the pixel itself, which is usually negligible in comparison to A). The inflow contour of the pixel is shown in blue, the outflow contour in orange and the remaining perimeter of the octagon is shown in green (q.v. the octagon of contour lengths shown in Quinn et al. (1991), Fig. 1). We calculate DX (pixel side length in m), tan(β) (mean slope across the outflow contour), tan(β ) (mean slope across the non-outflow contour (blue + green)), clout (outflow contour length in m), a (specific catchment area in m) = A/clout (n.b. called an "area" but units are m^2 m^-1, i.e. m) and dfltsink = ln A
Area of lakes, reservoirs, glaciers and ice sheets within the basin given in parentheses (the topographic index is not evaluated at these pixels by GA2, whereas the HYDRO1k CTI calculation assigns values to lakes as if they are flat plains, Appendix A).
b HYDRO1k did not include mainland Australia, therefore no CTI values are available for the Murray-Darling (USGS, 2000).
a These areas sum to 143.43 × 10^6 km^2 which is the global extent of land not covered by lakes, reservoirs, glaciers or ice sheets that lies outside Antarctica and other islands excluded from HydroSHEDS (viz. Antarctica, Polynesia east of the 180° meridian line, the Azores, St Helena, Ascension Is., Tristan da Cunha, South Georgia, the South Sandwich Is., the Kerguelen Archipelago and some smaller oceanic islands).
a Following the Global Lakes and Wetlands Database (GLWD; Lehner and Döll, 2004).