A Global Data Set of the Extent of Irrigated Land from 1900 to 2005

Irrigation intensifies land use by increasing crop yield but also impacts water resources. It affects water and energy balances and consequently the microclimate in irrigated regions. Therefore, knowledge of the extent of irrigated land is important for hydrological and crop modelling, global change research, and assessments of resource use and management. Information on the historical evolution of irrigated lands is limited. The new global historical irrigation data set (HID) provides estimates of the temporal development of the area equipped for irrigation (AEI) between 1900 and 2005 at 5 arcmin resolution. We collected sub-national irrigation statistics from various sources and found that the global extent of AEI increased from 63 million ha (Mha) in 1900 to 111 Mha in 1950 and 306 Mha in 2005. We developed eight gridded versions of time series of AEI by combining sub-national irrigation statistics with different data sets on the historical extent of cropland and pasture. Different rules were applied to maximize consistency of the gridded products to sub-national irrigation statistics or to historical cropland and pasture data sets. The HID reflects very well the spatial patterns of irrigated land as shown on historical maps for the western United States (around year 1900) and on a global map (around year 1960). Mean aridity on irrigated land increased and mean natural river discharge on irrigated land decreased from 1900 to 1950 whereas aridity decreased and river discharge remained approximately constant from 1950 to 2005. The data set and its documentation are made available in an open-data repository at https:


Introduction
Since the beginning of crop cultivation, irrigation has been used to reduce crop drought stress by compensating for low precipitation. In rice cultivation, irrigation is also used to control the water level in the paddy fields and to suppress weed growth. Crop yields are therefore higher in irrigated agriculture than in rainfed agriculture, often by a factor of 2 or more (Bruinsma, 2009; Colaizzi et al., 2009; Siebert and Döll, 2010). In many regions, irrigation is required to grow an additional crop in the dry season and therefore helps to increase land productivity. Around the year 2000, about 43 % of global cereal production was harvested on irrigated land, whereas eliminating irrigation would reduce cereal production by ∼ 20 % (Siebert and Döll, 2010). To achieve this gain in agricultural production, large volumes of freshwater are consumed and consequently irrigation represents the largest anthropogenic global freshwater use. Estimates of total water withdrawal for irrigation range from 2217 to 3185 km³ yr⁻¹ (Döll et al., 2012, 2014; Frenken and Gillet, 2012; Hoogeveen et al., 2015; Wada et al., 2011, 2014) and additional crop evapotranspiration ranges from 927 to 1530 km³ yr⁻¹ (Döll et al., 2014; Hoff et al., 2010; Wada et al., 2014). Globally, irrigation accounts for about 60 % of total freshwater withdrawals and 80 % of total freshwater consumption (Döll et al., 2014). To ensure water supply for irrigation, a large infrastructure of man-made reservoirs (Lehner et al., 2011), channels, pumping networks, and groundwater wells is required, markedly modifying global freshwater resources, negatively impacting ecologically important river flows (Döll et al., 2009; Steffen et al., 2015) and depleting groundwater (Döll et al., 2014; Konikow, 2011). These impacts raise concerns about the sustainability of water extraction for irrigation (Gerten et al., 2013; Gleeson et al., 2012; Konikow, 2011; Lehner et al., 2011; West et al., 2014). Irrigation of agricultural land also has major impacts on the temperature in the crop canopy and crop heat stress (Siebert et al., 2014), and on regional climate and weather conditions by changing water and energy balances (Han et al., 2014). Increased evapotranspiration due to irrigation results in surface cooling and considerable reduction in daily maximum temperatures (Kueppers et al., 2007; Lobell et al., 2008; Puma and Cook, 2010; Sacks et al., 2009). These impacts on water and energy balances are considered to affect the dynamics of the South Asian monsoon (Saeed et al., 2009; Shukla et al., 2014), while a large part of the increased evapotranspiration being recycled to terrestrial rainfall also affects non-agricultural biomes and glaciers (Harding et al., 2013).
Because of the diverse impacts of irrigation and its importance for food security and global change research, many assessments require knowledge about where cropland is irrigated and how the spatial pattern of irrigated land has changed over time. Understanding the past evolution of irrigated regions may also improve projections of future irrigation required to meet rising food demands. High-resolution data sets on the extent of irrigated land have been developed at global (Salmon et al., 2015; Siebert et al., 2005; Thenkabail et al., 2009) and regional scales (Ozdogan and Gutman, 2008; Siebert et al., 2005; Wriedt et al., 2009; Zhu et al., 2014) for a certain historic time period, but little is known about spatio-temporal changes in irrigated land at large scales. The statistical database FAOSTAT of the Food and Agriculture Organisation (FAO) of the United Nations (FAO, 2014b) includes annual data on area equipped for irrigation (AEI) at the country level for the period since 1961. This information and data collected from many other sources were harmonized to develop an annual time series of AEI per country for the period 1900-2003 (Freydank and Siebert, 2008).
Since then, these country-level time series have been used in many global change studies to describe effects of irrigation on various parts of the global water and energy cycles such as river discharge, water withdrawals, water storage changes, evapotranspiration, or surface temperature (Biemans et al., 2011; Döll et al., 2012; Gerten et al., 2008; Haddeland et al., 2007; Pokhrel et al., 2012; Puma and Cook, 2010; Wisser et al., 2010; Yoshikawa et al., 2014). The method used in these studies to estimate the spatial pattern of irrigated land over historical time periods was to multiply current values of AEI in each grid cell of a country (Siebert et al., 2005, 2007) by a scaling factor computed from the time series of AEI per country, from either FAO (2014b) or Freydank and Siebert (2008). This scaling method may result in considerable inaccuracies, in particular for large countries such as the USA, India, China, Russian Federation, or Brazil, because changes in the spatial pattern of irrigated land within countries are not represented. Another disadvantage is that the historical extent of irrigated land generated in this way is not consistent with other historical data sets of agricultural land use, e.g. extent of cropland or pasture. For studies requiring such consistency, e.g. on crop productivity or water footprints, several adjustments were required (Fader et al., 2010).
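The national scaling approach described above can be illustrated with a minimal sketch. The function name, the three-cell country, and all numbers are hypothetical and serve only to show why the spatial pattern within a country cannot change under this method: every cell is multiplied by the same factor.

```python
def national_scaling(aei_cells_now, aei_country_now, aei_country_past):
    """Scale present-day cell AEI (ha) by the country-level ratio past/present.

    Illustrative sketch of the national scaling approach; names are hypothetical.
    """
    if aei_country_now <= 0:
        raise ValueError("present-day country AEI must be positive")
    factor = aei_country_past / aei_country_now
    # Every grid cell of the country receives the same scaling factor.
    return [cell * factor for cell in aei_cells_now]


# Hypothetical country with three grid cells (AEI in ha).
cells_2005 = [100.0, 50.0, 0.0]
cells_1950 = national_scaling(cells_2005, aei_country_now=150.0, aei_country_past=60.0)
```

In this example the cell shares of national AEI are identical in both years, and a cell without irrigation today can never have had irrigation in the past, which is the limitation the text points out.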
The objectives of this study were to improve the understanding of the historical evolution in the extent of irrigated land by (i) developing a new data set of sub-national statistics on AEI from 1900 to 2005, with 10-year steps until 1980 and 5-year steps afterward, and (ii) developing and applying a methodology to derive gridded AEI (spatial resolution 5 arcmin × 5 arcmin, ∼ 9.2 km × 9.2 km at the Equator) that is consistent with sub-national irrigation statistics and with existing global spatial data sets on cropland and pasture extent, using a hindcasting methodology starting with present-day global irrigation maps. Considering the high level of uncertainty in the data, we did not develop a best-estimate time series of gridded AEI but instead developed eight alternative products (Table 1). In addition, we analyzed the derived products to identify differences in the development of AEI in arid regions, humid or sub-humid rice production systems, as well as other humid or sub-humid regions, and estimated changes in mean aridity and mean river discharge in AEI as indicators of changes in water requirements and freshwater availability.
The data set of sub-national statistics on AEI since year 1900 and the derived gridded versions at 5 arcmin × 5 arcmin resolution form the historical irrigation data set (HID), which is made available as Supplements S1-S7 at https://mygeohub.org/publications/8 (doi:10.13019/M20599).

Development of a spatial database of sub-national irrigation statistics
An extensive amount of statistical data from multiple sources, such as national agricultural census information or international databases (e.g. FAOSTAT), were collected to develop the HID. The input data varied in scale (extent and resolution), completeness, reference years, and terminology.
To develop a joint database with global coverage, high spatial resolution, and consistent terminology, the input data had to be combined and harmonized. Below we describe the terminology, data, and methods used to develop a global database of sub-national statistics on the extent of AEI for 1900-2005.

Terminology
The time series developed in this study refers to the AEI, i.e. the area of land that is equipped with infrastructure [...], 2009). This resulted in an estimate of irrigated land for year 2007 that is 31 % larger than the area actually irrigated (AAI) reported by the agricultural census for the same year.
To ensure categorical consistency in reported variables, international databases such as Aquastat (FAO, 2014a) use similar methods to estimate AEI for countries where only AAI is available. In contrast, historical irrigation statistics or historical reports often simply refer to irrigated land without defining the term; therefore, comparisons with other data sources and knowledge of the statistics system in the corresponding country are required to infer whether AAI or AEI is meant. Although AAI differs from AEI, we also used statistics on AAI to develop this inventory because trends in AEI are often similar to trends in AAI. Furthermore, data on AAI at high spatial resolution were used to estimate the spatial pattern in AEI when AEI was only available at low resolution. The methods used to estimate AEI based on AAI are described below (Sect. 2.1.3).

Description of input data and sources of information
To develop the spatial database of sub-national irrigation statistics, we combined sub-national irrigated area statistics with consistent geographic data describing the administrative [...]
[Figure: time steps 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1985, 1990, 1995, 2000, 2005; legend entries: Census, Irrigation data (sub-national level), Irrigation data (country level)]
In addition to these international databases, we also used data collected in national surveys and derived from census reports or statistical yearbooks for most of the countries because the spatial detail is often higher in national data sources than in the international databases. For the period before 1950, availability of national census data on AEI was limited to a few countries. Therefore, we also used secondary sources from the literature, e.g. scientific publications or books with reported data from primary national sources.
Many of the irrigation statistics for year 2005, as the starting point of the hindcasting, were derived from the database used to develop version 5 of the Global Map of Irrigation Areas (GMIA5). This data set, which is described in detail in Sect. 2.2.1, contains several layers describing AEI, AAI and the water source for irrigation at a global scale in 5 arcmin resolution (Siebert et al., 2013). We used the data layer on AEI for downscaling the irrigation statistics to 5 arcmin grid cells (see Sect. 2.2); therefore, the sub-national irrigation statistics used to develop the GMIA5 data set were automatically incorporated into this HID. However, for many countries, the sub-national irrigation statistics used to develop the GMIA5 data set referred to a year different from 2005. Therefore, the difference between AEI in the year taken into account in GMIA5 and the year 2005 was derived from other sources, e.g. FAOSTAT, Eurostat or data derived from national statistical offices. For most of the years, between 50 and 75 % of the global AEI was derived from sub-national statistics, most of it provided by reports of national surveys (Fig. 1). The data sources are described in detail for each country and time step in Supplement S1.
To map the reported AEI and to use the data in the downscaling to 5 arcmin (described in Sect. 2.2), it was necessary to link the irrigation statistics to geographic data describing the boundaries of the administrative units. We used version 1 of the Global Administrative Areas database GADM (GADM.org, 2009) for this purpose. Because GADM refers to the current administrative units, the shapefile had to be modified with the administrative unit boundaries for each time step, taking into account the historical changes in the administrative set-up. Boundaries had to be adjusted at the sub-national level (e.g. federal states, districts, provinces) but many country boundaries changed as well. For sub-national level changes we used information obtained from the administrative sub-divisions of countries database (statoids.com), which lists the changes in administrative divisions of countries. To adjust country boundaries, e.g. for the Indian Empire or Germany, we used historical maps to create the administrative area boundaries for the available data. For each time step we created a unique administrative boundary layer, depending on the level of data available for each country and changes in administrative units. These layers were converted to grids with 5 arcmin resolution and are provided as Supplement S2.

Methods used to harmonize data from different sources
For most of the countries, we used data derived from different sources with different temporal and spatial resolution and sometimes different definitions used for irrigated land, which resulted in inconsistencies among the input data sets (Supplement S1). Moreover, national irrigation surveys were often undertaken for years that differ from the time steps used in this inventory. This resulted in data gaps, which needed to be filled by interpolation or scaling. Therefore, it was important to harmonize the data, particularly within a country as well as among countries. The three main harmonising procedures were (i) data type harmonising, (ii) temporal harmonising, and (iii) infilling data gaps. These procedures are briefly introduced below with detailed information in Supplement S1 on procedures, assumptions, and data sources used for each country and time step.
- Data type harmonising was used when statistics referred to terms different from the definition used in this inventory for AEI. One example is China where the irrigated area reported in statistical yearbooks refers to the so-called effective irrigation area which includes annual crops but excludes irrigated orchard and pasture. In these situations we used the closest time step in which we had both data, AEI and the effective irrigation area or other terms used in original data sources, to calculate a conversion factor. This conversion factor was then applied at the sub-national level assuming that the ratio between AEI and the term reported in the original data source did not change over time. For other countries, e.g. Argentina, Australia, India, Syria, the USA, or Yemen, the national databases referred to AAI. AEI was then estimated as the maximum of the AAI reported at high spatial resolution (e.g. county or district level) for different years around the reference year. Again a conversion factor (estimated AEI divided by reported AAI for the reference year) was calculated and applied to estimate AEI based on reported AAI for historical years. Data sources and procedures to estimate or derive AEI and AAI for each country around year 2005 are described in detail in the report documenting the development of GMIA5 (Siebert et al., 2013), while the method used for data type harmonising in historical years is described in Supplement S1.
- Temporal harmonising was used when the input data did not exactly correspond with the predefined time steps and thus data needed to be interpolated between years to match with the exact year in question. For this purpose we used a linear interpolation between the two closest data points on each side of the time step in question.
- Filling of data gaps was required when irrigation statistics were not available either for a specific time step or for the time step before or after. In this case we used, similar to the method applied for temporal harmonising, a linear interpolation between two existing data points, or we estimated AEI based on other information (e.g. trend in AEI in neighbouring countries, or trend in cropland extent). In cases where we had reliable data from a neighbouring country, where irrigation development is known to be similar to the country in question, we used the trend in the neighbouring country to scale the evolution of irrigation in that particular country. In some cases with gaps in sub-national data we used cropland extent development data to fill these gaps. We did this for example in China for years 1910-1930, where we used cropland development based on the History Database of the Global Environment HYDE (Klein Goldewijk et al., 2011) to fill gaps in the sub-national data set (Buck, 1937) that did not have information for all of the provinces in China.
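Two of the harmonising operations listed above, the conversion factor between AEI and a differently defined reported quantity and the linear interpolation to a predefined time step, reduce to simple arithmetic. The sketch below illustrates them with hypothetical function names and invented numbers; the country-specific procedures actually applied are documented in Supplement S1.

```python
def conversion_factor(aei_reference, reported_reference):
    """Ratio of AEI to the reported term (e.g. AAI) in the reference year."""
    if reported_reference <= 0:
        raise ValueError("reported reference value must be positive")
    return aei_reference / reported_reference


def interpolate_linear(year, year_before, value_before, year_after, value_after):
    """Linear interpolation between the two closest data points around `year`."""
    if not (year_before <= year <= year_after) or year_before == year_after:
        raise ValueError("year must lie between two distinct data points")
    weight = (year - year_before) / (year_after - year_before)
    return value_before + weight * (value_after - value_before)


# Data type harmonising (invented numbers, ha): the factor is assumed constant in time.
factor = conversion_factor(aei_reference=1200.0, reported_reference=1000.0)
aei_1950 = factor * 500.0  # reported AAI of 500 ha in a historical year

# Temporal harmonising: surveys in 1957 and 1964, time step 1960.
aei_1960 = interpolate_linear(1960, 1957, 400.0, 1964, 540.0)
```

With these inputs the conversion factor is 1.2, so a reported AAI of 500 ha corresponds to an estimated AEI of about 600 ha, and the interpolated value for 1960 is about 460 ha.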

Downscaling of irrigation statistics to 5 arcmin resolution
The spatial database of sub-national irrigation statistics was developed as described in the previous section, including data on the AEI per country or sub-national unit and the corresponding geospatial data describing the administrative set-up (boundaries of national or sub-national units in each time step). To derive AEI on a 5 arcmin resolution and thus the final product of HID, additional data were required. Further, we developed a downscaling method to spatially allocate changes in AEI for each time step.

Data used for downscaling
As a starting point for the hindcasting in 2005 we used AEI data from GMIA5 (Siebert et al., 2013). This data set combines statistics on AEI for 36 090 sub-national administrative units with a large number of irrigation maps or remote-sensing-based land use inventories. The reference year differed among countries, with about 90 % of global AEI assigned according to statistical data from the period 2000-2008. By using GMIA5 as a starting point for the downscaling, the underlying data were automatically introduced into the HID. One objective of the downscaling of sub-national irrigation statistics (AEI_SU) to 5 arcmin resolution was to maximize consistency with other data sets on the historical extent of cropland and pasture. We demonstrate our method in this study by using cropland and pasture extent derived from version 3.1 of the History Database of the Global Environment HYDE (http://themasites.pbl.nl/tridion/en/themasites/hyde/index.html) and by using the Earthstat global cropland and pasture data set developed by the Land Use and Global Environment Research Group at McGill University (http://www.earthstat.org/). Both data sets have a spatial resolution of 5 arcmin and cover the period 10 000 BC-AD 2005 (HYDE) or 1700-2007 (Earthstat). The hindcasting methodology developed in this study can be applied to any global historical data set if the extent of cropland and pasture is reported for the time steps considered in this study.
The HYDE cropland data set was developed by assigning cropland reported in historical sub-national cropland statistics to grid cells based on two weighting maps. One weighting map was based on satellite imagery and showed the cropland extent in year 2000, while the second map was developed by considering urban built-up areas, population density, soil suitability for crops, extent of coastal areas and river plains, slope, and annual mean temperature. The influence of the satellite map (first weighting map) increased gradually from 10 000 BC to AD 2005 while the impact of the second weighting map declined over time (Klein Goldewijk et al., 2011). Allocation of pasture to specific grid cells was similar but the second weighting map considered additional information on the biome type (Klein Goldewijk et al., 2011). To account for uncertainties in historical land use, mainly caused by assumptions on historical per capita cropland and pasture demand, the HYDE database also provides upper and lower bounds on cropland and pasture use. Consequently, we used three HYDE versions as input data for our historical irrigation database: the best guess called HYDE_FINAL and the upper and lower estimates HYDE_UPPER and HYDE_LOWER, resulting in separate gridded products of our historical irrigation database.
The Earthstat Global Cropland and Pasture Data 1700-2007 represents a complete revision of the historical cropland data set developed previously at the Center for Sustainability and the Global Environment (SAGE) at University of Wisconsin-Madison (Ramankutty and Foley, 1999). Based on remote-sensing data and land use statistics, a cropland and pasture map for year 2000 was created (Ramankutty et al., 2008). Historical (and future) changes in cropland and pasture extent were then estimated using a simple scaling approach that combined the maps for year 2000 with historical (and future) sub-national cropland and pasture extent statistics and estimates, using the same method as Ramankutty and Foley (1999). The data have been made available by the Earthstat group (http://www.earthstat.org/). In the subsequent sections we refer to the data set as EARTHSTAT.

Description of downscaling method
The objective of the downscaling procedure was to assign AEI to 5 arcmin grid cells and to ensure that the sum of the AEI assigned to specific grid cells is similar to the AEI reported by the sub-national statistics for the corresponding sub-national administrative unit and year. In addition, we wanted to ensure that for each grid cell AEI did not exceed the sum of cropland and pasture extent in that year. Further, irrigated land in the past is preferably assigned to grid cells where we find it presently. However, it was impossible to generate layers of historical irrigation extent that were completely consistent with both the historical irrigation statistics and the historical cropland and pasture maps because of differences in methodology, input data, and assumptions used to generate the HID and the historical cropland and pasture maps. For some administrative units and years, for example, AEI is larger than the sum of cropland and pasture extent. Because of spatial mismatch between AEI and agricultural land, these inconsistencies are even larger at the grid cell level. In many grid cells, AEI according to the GMIA5 exceeds the sum of cropland and pasture area in year 2005 according to the two historical land use inventories.
To account for these inconsistencies, we developed a stepwise approach to maximize the consistency with either the sub-national irrigation statistics (AEI_SU) or with the historical cropland and pasture data (Fig. 2). Therefore, eight separate time series of gridded data were developed which differed with respect to the historical cropland and pasture data set used (HYDE_LOWER, HYDE_FINAL, HYDE_UPPER, or EARTHSTAT) and with regard to the consistency with either the sub-national irrigation statistics (suffix _IR) or with the historical land use (suffix _CP) (Table 1, Fig. 2).
The downscaling procedure marched back in time starting with year 2005. A nine-step procedure was repeated for each sub-national statistical unit, each year in the time series and each of the gridded products (Fig. 2). For each step and grid cell, a maximum irrigation area IRRI_max was calculated according to the criteria described in Fig. 2. The criteria were defined in a way that IRRI_max increased with each of the nine steps by considering more and more areas outside the extent of irrigated land in the previous hindcasting time step. The basic assumptions underlying the rules shown in Fig. 2 are that irrigated areas in historical periods are more likely to occur at places where irrigated areas are today, that irrigation of cropland is more likely than irrigation of pasture and that irrigation of pasture is more likely than irrigation of non-agricultural land.
In many administrative units the downscaling procedure terminated in the first step (Fig. B1) because, for most of the countries, AEI was much lower in historical periods than it is today (Fig. 3). Consequently, IRRI_max calculated for the first step had to be reduced to match the AEI reported for the administrative unit. The reduction was performed half in relative terms (equal fraction of cell-specific IRRI_max) and half in absolute terms (equal area in each grid cell). Performing half of the reduction as an area equal for each grid cell ensured that cell-specific AEI became 0 in many grid cells with little AEI in the previous time step and that, consequently, the number of irrigated grid cells declined in the hindcasting process. Different from the national scaling approach, the decrease of irrigated area in each grid cell is not the same within a sub-national unit because in step 1 of the downscaling approach, information on cropland area in the grid cell at time t is also taken into account (Fig. 2).
Figure 2. Illustration of the rules used to assign irrigated area to specific grid cells. The maximum irrigated area in each grid cell (IRRI_max) is calculated in steps S1-S9 depending on irrigated area assigned to the grid cell in the previous time step (IRRI_t+1), cropland extent in the current time step (CROP), pasture extent in the current time step (PAST) and total land in the grid cell (LAND). The assignment terminates when the sum of IRRI_max for all grid cells belonging to an administrative unit is greater than or equal to the irrigated area reported in the sub-national statistics for the administrative unit. Please note that the previous time step is t + 1 (and not t − 1) as the procedure is marching back in time. The downscaling procedure is described based on seven examples in Supplement S4.
Figure 3. Evolution of regional (thin lines; right y axis) and global (thick line and symbols; left y axis) area equipped for irrigation (AEI) for the 20th century based on the sub-national historical irrigation statistics (AEI_SU) collected for the historical irrigation data set (HID), and of global AEI in Freydank and Siebert (2008) and FAOSTAT (FAO, 2014b).
When the sum of IRRI_max in the administrative unit calculated for a specific step was less than the AEI reported in the historical database, AEI in each grid cell was set to IRRI_max and the routine proceeded to the next step. The procedure was terminated and the subsequent steps discontinued when the sum of IRRI_max in the administrative unit exceeded the AEI_SU reported in the historical database. Half of the increment in AEI still required in the present step was assigned in relative terms (equal fraction of the grid-cell-specific IRRI_max after the previous step) and the other half of the required increment was assigned as an area equal for each grid cell. The downscaling procedure is explained in more detail in Supplement S4 where we describe the specific steps and calculations using seven examples.
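The half-relative, half-absolute reduction used when the procedure terminates in the first step can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure documented in Supplement S4: the function name is hypothetical, the cropland information that also enters step 1 is ignored, and the repeated redistribution of the remaining excess after cells are clamped at zero is an assumption made here so that the unit total matches the target.

```python
def reduce_to_target(irri_max, target, tol=1e-9):
    """Reduce cell values (ha) so that their sum matches `target` (ha).

    Half of the excess is removed as an equal fraction of each cell value,
    the other half as an equal area per irrigated cell. Cells that would
    become negative are set to zero (assumption: the remaining excess is
    then redistributed over the still irrigated cells in a further pass).
    If the sum is already at or below the target, values are returned unchanged.
    """
    cells = list(irri_max)
    for _ in range(len(cells) + 1):
        total = sum(cells)
        excess = total - target
        if excess <= tol:
            break
        active = [i for i, value in enumerate(cells) if value > 0]
        relative_fraction = 0.5 * excess / total   # same fraction for every cell
        absolute_cut = 0.5 * excess / len(active)  # same area for every irrigated cell
        for i in active:
            cells[i] = max(0.0, cells[i] * (1.0 - relative_fraction) - absolute_cut)
    return cells


# Hypothetical administrative unit with four grid cells and a reported AEI of 120 ha.
reduced = reduce_to_target([100.0, 60.0, 30.0, 10.0], target=120.0)
```

In this example the cell with only 10 ha drops to zero in the first pass, which mirrors the decline in the number of irrigated grid cells described in the text, while the remaining three cells share the rest of the reduction.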
The rules applied in specific steps of the downscaling procedure differed between gridded time series maximizing consistency with historical cropland and pasture and gridded time series maximizing consistency with AEI_SU (Fig. 2). The gridded products maximizing consistency with historical cropland and pasture extent (AEI_HYDE_LOWER_CP, AEI_HYDE_FINAL_CP, AEI_HYDE_UPPER_CP, AEI_EARTHSTAT_CP) ensured that AEI was less than or equal to the sum of cropland and pasture extent, for each time step and grid cell. Therefore, the AEI in the gridded products is less than the AEI reported in the sub-national statistics for administrative units in which AEI_SU exceeded the sum of cropland and pasture extent (Table 1). In the gridded products maximizing consistency with the historical irrigation statistics (AEI_HYDE_LOWER_IR, AEI_HYDE_FINAL_IR, AEI_HYDE_UPPER_IR, AEI_EARTHSTAT_IR) AEI can exceed the sum of cropland and pasture extent. Therefore, the AEI reported in the sub-national irrigation statistics was completely assigned to the gridded products (Table 1), with the exception of a few administrative units that were so small that they disappeared in the conversion of the administrative unit vector map to 5 arcmin resolution grids (mainly very small islands).

Comparison of the historical irrigation database with other data sets and maps
Validation of HID against historical statistical data was not possible because all historical irrigation statistics available to us were used as input data to develop the HID. However, we compared our spatial database of sub-national irrigation statistics to AEI reported in other inventories at the national scale to highlight the differences (FAO, 2014b; Freydank and Siebert, 2008). We found two historical maps showing the major irrigation area in the western part of the USA in year 1909 (Whitbeck, 1919) and year 1911 (Bowman, 1911) and compared our 5 arcmin irrigation map for year 1910 visually with these two maps. In addition, we compared our map to a global map showing the extent of the major irrigation areas and interspersed irrigated land at the beginning of the 1960s (Highsmith, 1965). A strict numerical comparison was not useful because the way irrigated land is shown on these maps is incompatible with our product. Historical irrigation maps include shapes of regions in which major irrigation development took place, resulting in a binary yes or no representation (see also the maps shown in Achtnich, 1980; Framji et al., 1981-1983; Whitbeck, 1919). But even within the areas shown on these maps as irrigated there were sub-regions that were not irrigated (e.g. buildings, roads, rainfed cropland or pasture). In addition, many minor irrigation areas with small extent were not represented on these maps because of the limited accuracy of the historical drawings (Highsmith, 1965). In contrast, the gridded product developed in this study shows the percentage of the grid cell area that is equipped for irrigation and thus provides a discrete data type. Therefore a visual comparison was preferred to a numerical one. We also compared our new product (HID) to maps derived by multiplying the GMIA5 with scaling factors derived from historical changes in AEI at country level, as this procedure has been used in previous studies (Puma and Cook, 2010; Wisser et al., 2010; Yoshikawa et al., 2014).

Gridded area equipped for irrigation in the different product lines
Differences in AEI across gridded products were evaluated by pair-wise calculation of cumulative absolute differences (AD) (ha) as
$$AD = \sum_{c=1}^{n} \left| AEI\_A_c - AEI\_B_c \right|$$
where $AEI\_A_c$ is the AEI in cell c and product A, $AEI\_B_c$ the AEI in cell c and product B, and n is the number of grid cells. In addition, we calculated the relative difference (RD) (-) as the ratio between AD and total AEI in the corresponding year. To identify the reasons behind the differences we calculated the mean of AEI for each grid cell and year of the six HYDE products and the mean of the two EARTHSTAT products. Then we compared each of the six HYDE products to the HYDE mean, the two EARTHSTAT product lines to the EARTHSTAT mean and the two means across HYDE and EARTHSTAT products. In addition, we compared the mean across HYDE and the mean across EARTHSTAT to a grid obtained by multiplying the GMIA5 with scaling factors derived from historical changes in AEI at the country level (national scaling approach used in previous studies). These comparisons were undertaken at the native 5 arcmin resolution while many potential applications of the HID (e.g. global hydrological models) often use a coarser resolution. Therefore, the historical irrigation maps were aggregated to a resolution of 30 arcmin where the sum of AEI in 6 × 6 grid cells at 5 arcmin resolution resulted in the AEI of one corresponding grid cell at 30 arcmin resolution. All the pair-wise comparisons described were then repeated at 30 arcmin resolution.
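The comparison metrics and the aggregation from 5 to 30 arcmin can be written compactly. The following is a minimal sketch in plain Python with hypothetical function names and invented toy grids; the total AEI used as the denominator of RD is passed in explicitly because the text does not state which product's total is meant.

```python
def absolute_difference(product_a, product_b):
    """Cumulative absolute difference AD (ha) between two flat lists of cell AEI."""
    if len(product_a) != len(product_b):
        raise ValueError("both products must cover the same grid cells")
    return sum(abs(a - b) for a, b in zip(product_a, product_b))


def relative_difference(product_a, product_b, total_aei):
    """Relative difference RD (-): AD divided by total AEI of the year."""
    return absolute_difference(product_a, product_b) / total_aei


def aggregate(grid, factor=6):
    """Sum blocks of factor x factor cells (6 x 6 for 5 arcmin to 30 arcmin)."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    return [
        [
            sum(grid[r + i][c + j] for i in range(factor) for j in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


# Toy example: three cells, AEI in ha.
ad = absolute_difference([10.0, 0.0, 5.0], [4.0, 6.0, 5.0])
rd = relative_difference([10.0, 0.0, 5.0], [4.0, 6.0, 5.0], total_aei=15.0)

# A 6 x 12 grid with 1 ha per 5 arcmin cell becomes a 1 x 2 grid at 30 arcmin.
coarse = aggregate([[1.0] * 12 for _ in range(6)])
```

Because differences of opposite sign within a 6 × 6 block cancel when the cells are summed, AD computed on aggregated grids can never exceed AD at the native resolution.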

Irrigation evolution by irrigation category
We divided the irrigation areas of the world into three categories, namely (i) irrigation in arid regions, (ii) irrigation in humid or sub-humid rice production systems, and (iii) irrigation in other humid or sub-humid regions (Fig. 4a). Aridity was defined as the ratio between annual precipitation sum and annual sum of potential evapotranspiration (UNEP, 1997) derived from the CGIAR-CSI Global Aridity and PET Database (CGIAR-CSI, 2014; Zomer et al., 2008). In this paper, all regions with an aridity index less than 0.5 are termed dry. Therefore, dry zones defined in this study include hyper arid, arid, and semi-arid zones according to the classification used by UNEP (UNEP, 1997). Irrigated humid or sub-humid (wet) rice production systems were defined by selecting grid cells with an aridity index greater than 0.65 and a harvested area of irrigated rice that was at least 30 % of the total harvested area of irrigated crops according to the MIRCA2000 data set (Portmann et al., 2010). To fill the gaps between grid cells that are not irrigated according to MIRCA2000 irrigation is mainly used to increase crop yields by reducing drought stress during occasional dry periods. We calculated the change in AEI in the three zones at the global scale and in addition the number of people in the distinct irrigation zones based on the HYDE population density (Fig. 4c) (Klein Goldewijk, 2005).
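The thresholds quoted above can be turned into a small classification sketch. It only illustrates the stated rules: the function names and category labels are hypothetical, the gap-filling step between MIRCA2000 cells is not reproduced, and assigning every cell that is neither dry nor a wet rice system to the third category is an assumption.

```python
DRY_MAX_AI = 0.5         # aridity index below this value is termed dry
WET_RICE_MIN_AI = 0.65   # aridity threshold quoted for wet rice systems
RICE_SHARE_MIN = 0.30    # minimum share of irrigated rice in irrigated harvested area


def aridity_index(precip_mm, pet_mm):
    """Ratio of annual precipitation to annual potential evapotranspiration."""
    if pet_mm <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    return precip_mm / pet_mm


def irrigation_category(ai, rice_share):
    """Assign a grid cell to one of the three irrigation categories.

    ai: aridity index (-); rice_share: irrigated rice harvested area divided
    by total irrigated harvested area (0..1).
    """
    if ai < DRY_MAX_AI:
        return "dry"
    if ai > WET_RICE_MIN_AI and rice_share >= RICE_SHARE_MIN:
        return "wet_rice"
    # Assumption: all remaining cells count as other humid or sub-humid.
    return "other_wet"
```

For example, a cell with 400 mm precipitation and 1000 mm potential evapotranspiration has an aridity index of 0.4 and is classified as dry regardless of its rice share.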

Change in climatic water requirements and freshwater availability in areas equipped for irrigation
As a final step in analysing our gridded historical irrigation maps, we calculated the change in mean aridity and mean natural river discharge on irrigated land as indicators of changes in climatic water requirements and freshwater availability for irrigation. Global means were derived for both indicators. Mean aridity on irrigated land was computed by weighting cell-specific aridity with AEI within the cell as

$$\overline{\mathrm{AI}} = \frac{1}{\mathrm{AEI}} \sum_{c=1}^{n_{5}} \mathrm{AI}_{c} \, \mathrm{AEI}_{c} \qquad (2)$$

where $\overline{\mathrm{AI}}$ is the mean aridity index on irrigated land (-), $\mathrm{AI}_{c}$ is the aridity index in grid cell c derived from the CGIAR-CSI Global Aridity and PET Database (Fig. 4e) (CGIAR-CSI, 2014; Zomer et al., 2008), $\mathrm{AEI}_{c}$ is the AEI in cell c (ha), AEI is the total AEI (ha), and $n_{5}$ is the number of 5 arcmin grid cells with irrigation.
Similarly, mean natural river discharge on irrigated land $\overline{Q}$ (km³ yr⁻¹) was calculated as

$$\overline{Q} = \frac{1}{\mathrm{AEI}} \sum_{c=1}^{n_{30}} Q_{c} \, \mathrm{AEI}_{c} \qquad (3)$$

where $Q_{c}$ is the mean annual river discharge in the period 1961-1990 (Fig. 4g) (km³ yr⁻¹) calculated with the global water model WaterGAP 2.2 (Müller Schmied et al., 2014) at a 30 arcmin resolution by neglecting anthropogenic water extractions and by using GPCC precipitation and CRU TS3.2 (Harris et al., 2014) for the other climate input data, and $n_{30}$ is the number of 30 arcmin grid cells with irrigation. To perform these calculations on a 30 arcmin grid, the historical irrigation maps were aggregated as described in Sect. 2.3.2. Q in this study refers to the entire river discharge that would be potentially available for the irrigated areas if there were no human water abstractions in the upstream basin.
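Both indicators are AEI-weighted means over the irrigated grid cells, so one helper covers the aridity index at 5 arcmin and the river discharge at 30 arcmin. The sketch below is illustrative only; the function name and the list-based inputs are assumptions.

```python
def aei_weighted_mean(values, aei):
    """Mean of a cell attribute weighted by area equipped for irrigation.

    values: attribute per grid cell (e.g. aridity index or discharge).
    aei:    AEI per grid cell in ha, in the same cell order.
    """
    if len(values) != len(aei):
        raise ValueError("values and aei must have the same length")
    total_aei = sum(aei)
    if total_aei <= 0:
        raise ValueError("total AEI must be positive")
    return sum(v * a for v, a in zip(values, aei)) / total_aei
```

With two cells of aridity index 0.2 and 0.8 holding 300 ha and 100 ha of AEI, the weighted mean is (0.2 × 300 + 0.8 × 100) / 400 = 0.35, so the result is pulled towards the cell with more irrigated area.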

Irrigation evolution over the 20th century
The pace of irrigation evolution can clearly be divided into two eras, with the year 1950 being the breakpoint. Prior to 1950, the AEI gradually increased, whereas since the 1950s the AEI increased extremely rapidly until the end of the century before somewhat levelling off within the first 5 years of the 21st century (Fig. 3). According to the AEI_SU of the HID database, the global AEI covered an area of 63 Mha in year 1900, nearly doubled to 111 Mha within the first 50 years of the 20th century and approximately tripled within the next 50 years to 306 Mha by year 2005 (Fig. 3). More variation can be seen in the historical trends when those are explored for regions or countries separately (Fig. 3, Table A1). In many regions irrigation increased more rapidly (relative to year 1950) than the global average since the 1950s (most rapidly in Australia and Oceania, southeastern Asia, Middle and South Africa, Central America, and eastern Asia), while irrigation development has been much slower than the global average in North America and North Africa. AEI development in eastern Europe and central Asia is unique, with a slow decrease due to the collapse of the former irrigation infrastructure since 1990.
When AEI is compared across world regions, South Asia and eastern Asia have had the largest shares in global irrigation over the entire study period, ranging from 26 to 33 and 20 to 34 %, respectively (Supplement S3).Other world regions with substantial AEI include North America, Middle East, eastern and central Asia, and Southeast Asia, with shares on global AEI between 7 and 12 %.
AEI at the grid cell level in year 1900 shows concentrations of irrigated land mainly on arid cropland, e.g.western North America, the Middle East and central Asia, along the Nile and Indus rivers or the upstream region of the river Ganges (Figs. 4e and 5a).In China, Japan, Indonesia and western Europe irrigated land was mainly in humid regions and served watering of rice fields (Asia) or meadows (western Europe).In Africa, important irrigation infrastructure was found only in Egypt and South Africa.In eastern Europe, the extent of irrigated land was limited to the southern part of Russia and the Ukraine (Fig. 5a).In 12 countries the extent of irrigated land exceeded 1 Mha in the year 1900: India (17.8 Mha), China (17.6 Mha), the USA (4.5 Mha), Japan (2.7 Mha), Egypt (2.3 Mha), Indonesia (1.4 Mha), Italy (1.3 Mha), Kazakhstan (1.2 Mha), Iran (1.2 Mha), Spain (1.2 Mha), Uzbekistan (1.1 Mha), and Mexico (1.0 Mha) (Supplement S3).
In year 1960, irrigated land exceeded 1 Mha in 23 countries (Supplement S3). In the western part of the USA, Canada, and Mexico but also in South America and the Caribbean, e.g. in Argentina, Brazil, Colombia, Chile, Cuba, Ecuador, Peru, and Venezuela, irrigation was already widespread (Fig. 5g). In Europe, irrigated land increased mainly in the southern part, e.g. in Albania, Bulgaria, France, Greece, Italy, Portugal, Romania, Russia, Serbia, Spain, and the Ukraine. Irrigated land began to develop at a large scale in Australia and New Zealand but also in several African countries such as Algeria, Libya, Madagascar, Morocco, and Nigeria. In Asia, irrigation was already developed on cropland in all the arid regions but extended also to more humid regions with rice irrigation.

Figure 5. Gridded maps for the years 1900, 1930, 1960, 1980, and 2005 based on the product AEI_HYDE_FINAL_IR of the historical irrigation data set (HID). The maps are presented at global scale and for two selected close-up areas, namely western USA and South Asia, for each time step.
Until year 1980 AEI continued to increase, reaching its maximum extent in some countries in eastern Europe, Africa, and Latin America (Belarus, Bolivia, Botswana, Estonia, Hungary, Mozambique, and Poland) (Fig. 5j).Until year 2005 AEI increased further in many countries and extended also to the more humid eastern part of the USA (Fig. 5).
In Australia, Bangladesh, Brazil, China, France, India, Indonesia, Iran, Iraq, Mexico, Myanmar, Pakistan, Thailand, Turkey, the USA, and Vietnam AEI increased by more than 1 Mha between 1980 and 2005 (Supplement S3). In contrast, AEI decreased between 1980 and 2005 in many European countries such as Albania, Belarus, Bulgaria, Czech Republic, Estonia, Germany, Hungary, Latvia, Lithuania, the Netherlands, Poland, Portugal, Romania, Russia, and Serbia but also in Bolivia, Botswana, Israel, Japan, Kazakhstan, Mauritania, Mozambique, South Korea, and Taiwan (Supplement S3).

Gridded area equipped for irrigation in the different gridded products
The rules used to downscale AEI_SU to grid cells (Fig. 2) resulted in differences in AEI per grid cell but also in differences in the total AEI assigned in the gridded products. The main reason is that AEI in each grid cell was constrained to the sum of cropland and pasture for the product lines that maximize consistency with the land use data sets (right column in Fig. 2, see Sect. 2.2.2). In particular in very small sub-national administrative units in arid regions, where most of the agricultural land is irrigated, AEI based on irrigation statistics was larger than the sum of cropland and pasture in the corresponding administrative unit. Consequently, in the downscaling process this difference between AEI and the sum of cropland and pasture was not assigned to grid cells. In the product lines maximizing consistency with the irrigation statistics (left column in Fig. 2) AEI was constrained by total land area only. Therefore, if required (in step 9 of the allocation), AEI exceeded the sum of cropland and pasture. In the gridded products based on HYDE land use the AEI not assigned to grid cells was smallest in year 2005 (10 529 ha or 0.003 % of total AEI) and largest in year 1900 (2.6 % of total AEI in AEI_HYDE_LOWER_CP, 2.1 % of total AEI in AEI_HYDE_FINAL_CP, and 1.6 % of total AEI in AEI_HYDE_UPPER_CP). In AEI_EARTHSTAT_CP, the extent of AEI not assigned to grid cells was largest in year 1990 (2.3 Mha) while the extent relative to total AEI was largest in year 1920 (0.9 %). While differences in total AEI per administrative unit across gridded time series are relatively low, differences at the grid cell level are considerable (Supplement S5). This reflects different patterns in historical cropland and pasture extent and varying downscaling rules. Cumulative absolute differences, calculated according to Eq. (1), increase in the hindcasting process from 2005 to 1970 and decrease prior to that until year 1900 (Fig. B2a and c). In contrast, relative differences are lowest in year 2005 and increase continuously until year 1900 (Fig. B2b and d). Differences among the six gridded products based on HYDE land use are relatively low, similar to differences among the two products based on EARTHSTAT land use. In contrast, differences between the specific gridded products and the mean of all gridded products are much larger but still lower than the difference between the mean of the gridded products of the HID and AEI derived from the national scaling approach (Fig. B2). Thus, differences between the HYDE land use and the EARTHSTAT land use seem to have a larger effect than differences between the UPPER, LOWER and FINAL HYDE land use variants. Aggregation of the data to 30 arcmin resolution reduced AD and RD by about one-third (Fig. B2) but differences at the grid cell level are considerable even at this resolution. This shows the importance of using different land use data sets for the development of historical irrigation data and the need to develop specific gridded products to be used in conjunction with specific cropland and pasture data sets. To describe and map our results in more detail, for the next sections we used the product AEI_HYDE_FINAL_IR which has maximum consistency with the sub-national irrigation statistics.
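The difference between the two families of product lines comes down to which upper bound limits AEI in a grid cell. The following sketch shows only that bound and the resulting unassigned remainder for one administrative unit; it does not reproduce the stepwise allocation of Fig. 2. The function name, the dictionary keys, and the rule labels "CP" and "IR" (borrowed from the product name suffixes) are illustrative assumptions.

```python
def unassigned_aei(aei_su, cells, rule):
    """AEI (ha) of one administrative unit that cannot be placed on the grid.

    aei_su: AEI of the unit from the sub-national statistics (ha).
    cells:  list of dicts with 'cropland', 'pasture' and 'land_area' in ha.
    rule:   'CP' caps each cell at cropland + pasture,
            'IR' caps each cell at its total land area.
    """
    if rule == "CP":
        capacity = sum(c["cropland"] + c["pasture"] for c in cells)
    elif rule == "IR":
        capacity = sum(c["land_area"] for c in cells)
    else:
        raise ValueError("rule must be 'CP' or 'IR'")
    # Anything above the unit's capacity stays unassigned.
    return max(0.0, aei_su - capacity)
```

For a unit reporting 200 ha of AEI whose cells hold only 170 ha of cropland and pasture, the land-use-consistent rule leaves 30 ha unassigned, whereas the statistics-consistent rule can place all 200 ha as long as the land area suffices.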

Irrigation evolution by irrigation category
In year 1900 about 48 % of the global AEI was in dry areas, 33 % in wet areas with predominantly rice irrigation and 19 % in other wet areas (Fig. 4b).In contrast, only 19 % of the global population lived in dry regions while 35 % of the global population lived in wet areas with predominantly rice irrigation and 46 % in other wet areas (Fig. 4d).The reason for differences between AEI and population in these zones is likely that the majority of the rainfed cropland was located in wet regions whose carrying capacity without irrigation was higher and consequently a higher population density could be supported.While the share of AEI in dry regions remained quite stable varying around 50 % through the entire study period (e.g.46 % in year 2005 and 48 % in year 1900), the share of AEI in wet regions with predominantly rice irrigation decreased from 33 to 26 % while the share of AEI in other wet irrigation areas increased from 19 to 28 % between 1900 and 2005 (Fig. 4b).The share of the global population living in dry regions increased between 1900 and 2005 from 19 to 26 % while the population living in other wet regions decreased from 46 to 35 % (Fig. 4d).

Change in mean aridity and river discharge in areas equipped for irrigation
The global mean aridity index on AEI declined from year 1900 to 1950 from 0.66 to 0.60, indicating that new irrigation was developed on land with higher aridity. After 1950 the mean aridity index increased to 0.63 until year 2005 (Fig. 4f). Global mean natural river discharge on AEI declined by 4-5 km³ yr⁻¹ in the period 1900-1950 (EARTHSTAT and HYDE gridded products) and increased then again by 2 km³ yr⁻¹ (7.8 %) (EARTHSTAT products) or remained more or less stable (HYDE products) (Fig. 4h). For 2005, all products converge to a mean natural river discharge on irrigated land of 24-25 km³ yr⁻¹.

Data set comparison
For most of the countries, AEI in the HID is similar or very close to the data reported in the FAOSTAT database (FAO, 2014b) for the period since 1961 or to the AEI in the inventory by Freydank and Siebert (2008), although there are differences for some countries (Tables A1 and A2).
There are three major reasons for these differences between HID and FAOSTAT: First, there are countries in which statistics on AEI are not collected by the official statistics departments such as Australia, Canada, New Zealand, Pakistan, and Puerto Rico. In these countries statistics on irrigated land refer to the AAI in the year of the survey. Many factors can result in only a part of the irrigation infrastructure being actually used for irrigation, such as failure in water supply or damaged infrastructure. In other, mainly humid and sub-humid regions, only specific high value crops, such as vegetables, are irrigated (Siebert et al., 2010). For many of these countries FAOSTAT reports the AAI instead of AEI while the statistics used for the HID were adjusted (as described in Sect. 2.1.3) to account for the difference between AEI and AAI. Consequently, AEI in the HID is higher than the irrigated area reported by FAOSTAT (Table A1).
Another group of countries in which AEI in the HID differs from the data reported by FAOSTAT is developed regions, e.g. in Europe, North America or Oceania, such as Austria, Canada, Germany, Greece, Italy, Portugal.FAO is collecting detailed country-specific information on water management and irrigation in its Aquastat program and provides this information in country profiles (http://www.fao.org/nr/water/aquastat/countries_regions/index.stm).These country profiles are based on information obtained from different national data sources and are compiled and revised by FAO consultants from the respective country.This detailed information collected in the Aquastat program is also used to improve and update the time series in FAOSTAT.The mandate of FAO is, however, focused on developing and transition countries; therefore, these detailed country profiles are not available for developed countries and consequently, less effort is made to improve historical data for these countries.In contrast, for many of the developed countries, the HID is based on information obtained from historical national census reports (Table A1).
A third group of countries with differences between AEI in the HID and FAOSTAT are the former socialist countries in eastern Europe.In these countries large-scale irrigation infrastructure was developed with centralized management structure.After the transition to a market-based economy, most of this former infrastructure was not used anymore and it is a matter of definition to decide whether these areas should still be considered as areas equipped for irrigation or not.For most of these countries the HID shows a major decline in AEI after 1990 (based on national surveys or statistics on irrigable area provided by Eurostat) while FAOSTAT still includes the former irrigation infrastructure in some countries (Table A1).
The main reason for differences between AEI per country in the HID and the inventory of AEI per country (Freydank and Siebert, 2008) (Table A2) is that the number of references used to develop the HID was much larger than the number of historical reports used by Freydank and Siebert (2008).Many assumptions used in Freydank and Siebert (2008) were thus replaced by real data.This also includes changes for the year 2000 (e.g. for Australia, Bulgaria, Canada, China, Indonesia, Kazakhstan, Russia, Ukraine, and the USA; see Table A2) because the recent extent of irrigated land in Freydank and Siebert (2008) was based on the statistical database used to develop version 4 of the Global Map of Irrigation Areas (Siebert et al., 2007), while the HID is consistent with the updated and improved version 5 of this data set (Siebert et al., 2013).In addition, the HID explicitly accounts for the historical practice of meadow irrigation used mainly in central and northern Europe resulting in higher estimates of AEI, in particular for year 1900 for many European countries, e.g.Austria, Germany, Norway, Poland, Sweden, Switzerland, and UK (Table A2).
To verify the spatial patterns in historical irrigation extent in the gridded product, we compared the product AEI_HYDE_FINAL_IR for year 1910 (Fig. 6a) to two historical maps of the major irrigation areas in the United States in years 1909 (Fig. 6b) and 1911 (Fig. 6c). We found that our product represents remarkably well the spatial pattern of the major irrigation areas shown on the historical maps, in particular in states such as Idaho, Utah, and Wyoming. In some states, the pattern of AEI in the HID differs from the pattern shown in the historical maps. This can be expected because of the simplicity of our downscaling approach and because of difficulties of showing minor irrigation sites on the historical maps. However, based on visual comparison of the maps it seems that for many states the agreement is even better than the match between the two historical maps, and that the agreement of the pattern shown in the HID and in the map for year 1909 is best. One exception is California, where the HID and the historical map for year 1911 show irrigation development over the entire Central Valley (Fig. 6a and c) while the historical map for year 1909 shows irrigation only in the southern part of the Central Valley (Fig. 6b).
The good agreement between the spatial pattern in the HID, that is mainly determined by the current pattern of irrigated land in the GMIA5 (Siebert et al., 2013), and the pattern shown in historical maps indicates that most of the major irrigation areas today in the western USA were already irrigated in year 1910, although the total extent of irrigated land in the United States at that time was only about 20 % of the current extent. This may not be the case for other countries where most of the irrigation infrastructure was developed more recently. A systematic validation of the HID to historical maps was however not possible because of the limited availability of historical large-scale irrigation maps. A comparison of the HID for year 1960 with a historical global irrigation map (Highsmith, 1965) shows, however, that the main irrigation areas shown in the historical global map are also present in the HID (Supplement S6). A very good agreement of the historical irrigation map (Highsmith, 1965) and our gridded product for year 1960 (AEI_HYDE_FINAL_IR) was found for the major irrigation areas in the Central Valley in California, along the Yakima River in Washington, at the High Plains aquifer in Texas, along the Colorado River and the Rio Grande (the United States, Mexico), in Alberta (Canada), the Pacific Coast and along Rio Lerma in Mexico, in Honduras and Nicaragua, in Peru, Chile and Argentina, in Spain, along the French Mediterranean coast, in northern Italy, Bulgaria and Romania, along the Nile River in Egypt and Sudan, in South Africa and Zimbabwe, in the Euphrates-Tigris region and the Aral Sea basin, in Azerbaijan, Pakistan, northern India and eastern India, in the area around Bangkok (Thailand), in Vietnam, Taiwan, North Korea, South Korea and Japan, in the North China Plain, on the island of Java (Indonesia), in the Murray-Darling Basin (Australia), and on the southern island of New Zealand (Supplement S6). However, there are also some regions that show differences in the two products. For example, the map published by Highsmith (1965) shows very little irrigation in the eastern United States, northern and central Europe, Portugal, southwest and northern France, southern Brazil, the Fergana Valley in Uzbekistan, the interior of Turkey, western China and Sumatra (Indonesia), while the sub-national statistics used to develop the HID indicate that there was irrigation already developed at this time (Supplement S6). For other regions, such as northeast Brazil or Namibia, the extent of irrigated land seems to be larger in the historical drawings relative to the newly developed HID (Supplement S6). The general impression from the comparison of the two map products is that there is a very good agreement for most of the major irrigation areas while there is less agreement for the minor irrigation areas. Some of the differences may be related to difficulties with drawing interspersed small-scale irrigation on the historical maps. In other cases it may be that the newly developed HID shows irrigation in areas where infrastructure was not developed at this time, e.g. because the resolution of the sub-national irrigation statistics was not sufficient.

Improvements in mapping of historical irrigation extent by the new inventory
In previous studies only changes in irrigated land at the country level were considered: gridded data showing the percentage of irrigated land under current conditions were multiplied by a factor representing the change in irrigated land at the country level to derive patterns of irrigated land for historical periods. Consequently, the relative contribution of specific grid cells to the national sum of irrigated land remained the same through the entire study period and the number of irrigated grid cells only changed when irrigated land in a country decreased to zero. The HID improves on the representation of the historical development of irrigated land in previous studies by considering sub-national data on the extent of irrigated land. In addition, when irrigated land declined historically, the number of irrigated grid cells is reduced and irrigated land is concentrated into smaller regions in the HID (Figs. 6a, 7a, c and e) while there were many irrigated cells with very small irrigated areas in the historical layers when the national scaling approach was used (Figs. 6d and 7b-d).
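The national scaling approach used in those earlier studies can be sketched in a few lines, which also makes its main limitation visible. The function name and the list-based grid are assumptions for illustration; the sketch is not taken from any of the cited studies.

```python
def national_scaling(current_grid, national_aei_hist, national_aei_current):
    """Scale the present-day AEI pattern of one country to a historical total.

    current_grid:         AEI per grid cell under current conditions (ha).
    national_aei_hist:    national AEI in the historical year (ha).
    national_aei_current: national AEI under current conditions (ha).
    """
    if national_aei_current <= 0:
        raise ValueError("current national AEI must be positive")
    factor = national_aei_hist / national_aei_current
    # Every cell is multiplied by the same factor, so its share of the
    # national total never changes.
    return [cell * factor for cell in current_grid]
```

Scaling a grid of 100, 50, 0 and 50 ha from a national total of 200 ha back to 50 ha yields 25, 12.5, 0 and 12.5 ha: the same three cells remain irrigated, each with a smaller area, instead of irrigation contracting into fewer cells.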
At least for the USA, the historical pattern derived with the new method (HID) agrees much better with the pattern shown on historical maps (Fig. 6), particularly in central US states (e.g. Texas, Kansas, and Nebraska). In the USA, irrigation developed first in the arid western part of the country. While this is reflected well in the HID, a national scaling approach would also assign irrigated land to grid cells that are currently irrigated and located in the eastern part of the country, e.g. to the lower Mississippi Valley (Figs. 6 and 7). Similar to this, historical irrigated land in India was mainly located in the northwest of the country and in China more in the south of the country, while the national scaling approach would also assign irrigated land to the eastern part of India and the northeast of China (Fig. 7). Consideration of sub-national statistics therefore resulted in a clear improvement in the historical irrigation layers, in particular for these large countries.
Differences in sub-national patterns of irrigated land in the HID, as compared to the maps obtained with a national scaling approach, also affected the weighted mean aridity and river discharge on irrigated land (Fig. 4f and h) which were computed as indicators of irrigation water requirement and irrigation water availability. In year 2005, the global mean ratio of annual precipitation and annual potential evapotranspiration (aridity index) was 0.63 on irrigated land (Fig. 4f). Going back in time to year 1950, the aridity index decreased to 0.60 (HID products) or 0.59 (national scaling approach), and it then increased again going back to year 1900, to 0.65-0.66 (HID products) or 0.63 (national scaling approach). Mean weighted river discharge decreased from 27-29 km³ yr⁻¹ in year 2005 to 23-25 km³ yr⁻¹ (HID product lines) in 1950 or from 24 to 20 km³ yr⁻¹ in the national scaling approach in the same period (Fig. 4h). Application of the new methodology used to develop the HID therefore resulted, at global scale, in more humid conditions with higher river discharge on irrigated land in year 1900 as compared to the means computed with the national scaling approach. We therefore expect that the use of the new HID will also lead to different results for major applications such as estimates of irrigation water use, water scarcity, terrestrial water flows, or crop productivity.

Determinants of the fraction of irrigated cropland
The indicators including AEI by irrigation category, change of mean aridity and of mean river discharge in AEI presented in Sect.3.3 and 3.4 can also be associated with the fraction of irrigated cropland to better describe reasons for spatial differences in densities of irrigated land and of trends in irrigation development (Fig. 8).Irrigation is a measure of land use intensification because it is used to increase crop yields (Siebert and Döll, 2010).Therefore, a high density of irrigated land can be expected in regions where high crop yields (in kcal per ha and year) are required to meet the demand for food crops due to high population densities, e.g. in South Asia, East Asia, and Southeast Asia (compare Fig. 8c and f).Consequently, a large part of the spatial patterns in the use of irrigated land can be explained by population density (Neumann et al., 2011).However, there are also other methods of land use intensification, e.g.multiple cropping, fertilization, or crop protection from pests.The highest benefit from using irrigation is achieved in arid and semi-arid climates because of the reduction of crop drought stress and in paddy rice cultivation because rice is an aquatic crop and irrigation is also used to suppress weed growth by controlling the water table in the rice paddies.The high aridity explains the high fraction of irrigated cropland in central Asia, on the Arabian Peninsula, in Egypt, Mexico, the USA, Peru, and Chile (compare Fig. 8c and h) while the importance of traditional paddy rice explains high fractions of irrigated cropland in tropical regions, e.g. in Southeast Asia, Suriname, French Guyana, Colombia, or in Madagascar and Japan (compare Fig. 8c and g).
Large volumes of water are required for irrigation; therefore, the extent of irrigated land is also constrained by available and accessible freshwater resources. For example, the mean annual discharge weighted with AEI is relatively low in several countries in North and East Africa, but also in Mongolia, Mexico, and Australia, which may be a barrier for the establishment of large-scale irrigation infrastructure (Fig. 8i). In contrast, annual discharge weighted with AEI is high in most of the humid rice cultivation regions (compare Fig. 8g and i) and in some regions where arid irrigation areas (Fig. 8h) are connected by a river to more humid upstream areas (Fig. 8i), e.g. the river Nile basin in Egypt, the Indus and Ganges basins in Pakistan and India, the Aral basin in central Asia, or the Tigris and Euphrates basins in Turkey and Iraq. Much of the historical cultivation in these regions benefited greatly from irrigation and abundant water resources (Fig. 8a). In contrast, trends in the share of irrigated cropland between 1970 and 2005 (Fig. 8a-c) seem to be more closely associated with changes in cropland productivity, shown here as kcal produced per year and hectare of cropland (Fig. 8d-f). Large increases in cropland productivity in South America, Southeast Asia, Mexico, the USA, and parts of western Europe (Fig. 8d-f) are consistent with increases in irrigated cropland fraction (Fig. 8a-c), while regions with a decline in irrigated cropland fraction, e.g. in the period 1990 to 2005 in eastern Europe, some countries of the former Soviet Union or Mongolia (Fig. 8b and c), agree with regions with a similar trend in crop productivity (Fig. 8e and f).
The relationships between irrigated cropland fraction and cropland productivity, aridity, rice cultivation, and river discharge raise the question of whether these relationships can be used to predict future spatio-temporal changes in the extent of irrigated land.Such information could improve climate impact assessments or global change studies, which assume in most cases a fixed extent of irrigated land in the coming decades.A key question for such applications will be to determine the drivers of land productivity.In historical periods the majority of crops were produced close to the region of consumption; therefore, cropland productivity was mainly driven by population density (Boserup, 1965;Kaplan et al., 2011).More recently, regions of crop production and consumption are increasingly decoupled by trade flows (Fader et al., 2013;Kastner et al., 2014).World food supply has increased within the last 50 years but food self-sufficiency has not improved for most countries (Porkka et al., 2013).Furthermore, only a few countries, such as the USA, Canada, Brazil, Argentina, or Australia have been net food exporters while most other countries have been net food importers (Porkka et al., 2013).Most of these net food exporting countries are characterized by an increase in cropland productivity and irrigated land (compare Fig. 8d-e to Fig. 3 in Porkka et al., 2013).These net food exporting countries also supply crop products to net importing countries with low cropland productivity and a low extent of irrigated land, e.g. in Africa.In a globalizing world these long distance links are expected to become even stronger and need to be considered when projecting future extent of irrigated land.

Limitations and recommended use of the data set
The uncertainty in estimated AEI is driven by uncertainties in input data (statistics on AEI, cropland and pasture extent) and by the assumptions made when harmonizing input data or when disaggregating AEI per administrative unit to grid cells. In particular, for the period before 1960 availability of survey-based first-hand statistics on AEI was limited and missing data had to be replaced by expert guesses or assumptions (Fig. 1). Therefore, AEI for the period before 1960 is expected to be less accurate than afterwards. Similar to this, the trend for the development of global AEI may be less certain for the most recent years in the time series because detailed agricultural census surveys are typically undertaken only every 5-10 years and there is an additional 2-5-year lag before the survey results become available. For many countries the latest detailed survey data were available for the period around year 2000 and sometimes it was assumed that AEI did not change afterwards until year 2005 (Supplement S1). Therefore the slowing increase in AEI for the period 1998-2005 (Fig. 3) could be an artefact of the data constraints for the most recent years. Data availability also differed across countries (Supplement S1). In addition, boundaries of nations have been changing. For example, AEI for countries belonging to the former Soviet Union or the former Socialist Federal Republic (SFR) of Yugoslavia is reported since the beginning of the 1990s, while for the period before 1990 the trend in AEI was estimated based on the trend reported for the USSR or for the SFR of Yugoslavia unless sub-national information for historical years could be used. These changes in the extent of nations add another source of uncertainty to AEI.
Uncertainty in input data also impacts disaggregation of AEI into 5 arcmin resolution. In countries with a high resolution of sub-national irrigation statistics, uncertainty in the gridded product lines is expected to be less than in countries where data are available at the national scale only, in particular in the case of large countries. The resolution of sub-national irrigation statistics has also been higher for the more recent time steps (Supplement S2). In the disaggregation process, sub-national irrigation statistics were combined with gridded land use data sets (Sect. 2.2). Therefore, uncertainties in these input data are also introduced into the gridded product lines described in this article. Furthermore, it is possible that the census-based land use statistics used as input to develop the HYDE and EARTHSTAT data layers on cropland and pasture extent may be inconsistent with the irrigation statistics used in this study, in particular for small sub-national statistical units. This issue cannot be avoided because often the institutions responsible for collecting land use data (e.g. Ministries of Agriculture) differ from institutions collecting irrigation data (e.g. Ministry of Water Resources; Ministry for Environment), resulting in different sampling strategies and different survey years.
The assumptions and rules used in the disaggregation to 5 arcmin resolution (Fig. 2) may not be appropriate for all the countries and time steps. For specific countries such as Australia, New Zealand, or Switzerland it is known, for example, that irrigation has been used mainly for grassland or fodder; therefore, the rule to preferentially assign irrigation to cropland (S1, S5, S7 in Fig. 2) resulted in incorrect assignment of irrigation in some places. The disaggregation could therefore be improved by applying country-specific disaggregation rules; these rules should also be time specific to reflect time-varying differences in irrigation infrastructure development across regions. However, the country-specific information on the historical development of irrigation was insufficient to develop these rules for this global-scale study.
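To illustrate what a cropland-first rule implies, the following sketch distributes the AEI of one administrative unit over its grid cells. It is a deliberate simplification and not the rule set of Fig. 2: it assumes that AEI is spread in proportion to cropland area first and to pasture area second, and that any remainder cannot be placed. The function names and the `pasture_first` switch are hypothetical; the switch only shows how a country-specific rule could reverse the priority.

```python
def _spread(amount, capacity):
    """Place `amount` on cells in proportion to `capacity`.

    Returns (per-cell values, remainder that did not fit).
    """
    total = sum(capacity)
    placed = min(amount, total) if total > 0 else 0.0
    if placed <= 0:
        return [0.0] * len(capacity), amount
    return [placed * c / total for c in capacity], amount - placed


def allocate_aei(aei_total, cropland, pasture, pasture_first=False):
    """Distribute AEI (ha) of one unit over grid cells, by land use priority.

    cropland and pasture are per-cell areas (ha) in the same cell order.
    Returns (per-cell AEI, AEI that could not be placed).
    """
    if aei_total < 0:
        raise ValueError("AEI must be non-negative")
    if len(cropland) != len(pasture):
        raise ValueError("cropland and pasture must cover the same cells")
    first, second = (pasture, cropland) if pasture_first else (cropland, pasture)
    on_first, rest = _spread(aei_total, first)
    on_second, lost = _spread(rest, second)
    return [a + b for a, b in zip(on_first, on_second)], lost


# Hypothetical unit with two grid cells (areas in ha).
cells, lost = allocate_aei(150.0, cropland=[60.0, 40.0], pasture=[20.0, 0.0])
print(cells, lost)
```

In this toy case the reported AEI exceeds the combined cropland and pasture area, so part of it remains unplaced, which is analogous to the "AEI lost in downscaling" reported in Table 1.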
Differences in spatial pattern of disaggregated AEI among the gridded products based on HYDE cropland and pasture extent and EARTHSTAT cropland and pasture extent suggest that the products developed in this study are only compatible with the specific land use data set used in this study as input. Application in studies in which both land use and irrigation data are required as input may result in inconsistencies when other land use information is used. However, the method and rules applied here are sufficiently general that they can easily be applied on request for other land use data sets reporting the extent of cropland and pasture at the required spatial and temporal resolution.
Because of differences in AEI at the grid cell level among the gridded products, we suggest that more than one specific gridded product should be used in typical applications such as global hydrological modelling to get a better understanding of differences in model outputs caused by using different input data. At this stage, we cannot make a general recommendation on which HID product may be most appropriate for different applications or better represents patterns in AEI in a region. When complete coverage of the global irrigation extent is most important, use of AEI_HYDE_FINAL_IR or AEI_EARTHSTAT_IR is recommended. When, in contrast, consistency with cropland or pasture data is more important, the corresponding CP-product (AEI_HYDE_LOWER_CP, AEI_HYDE_FINAL_CP, AEI_HYDE_UPPER_CP, AEI_EARTHSTAT_CP) may be more appropriate.
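A simple first step when working with several products is to look at how much they disagree per grid cell. The sketch below assumes that each product has already been read into a flat list of per-cell AEI values in the same cell order; the reading step, the function name, and the numbers are hypothetical, and only the product names come from the data set.

```python
def cellwise_spread(products):
    """Return per-cell (min, max, range) of AEI across gridded products.

    products maps a product name to a list of per-cell AEI values (ha);
    all lists must describe the same cells in the same order.
    """
    names = list(products)
    if not names:
        raise ValueError("at least one product is required")
    n_cells = len(products[names[0]])
    if any(len(products[name]) != n_cells for name in names):
        raise ValueError("all products must cover the same cells")
    result = []
    for i in range(n_cells):
        values = [products[name][i] for name in names]
        result.append((min(values), max(values), max(values) - min(values)))
    return result


# Hypothetical per-cell AEI values (ha) for three cells.
products = {
    "AEI_HYDE_FINAL_IR": [100.0, 0.0, 50.0],
    "AEI_EARTHSTAT_IR": [80.0, 10.0, 50.0],
    "AEI_HYDE_FINAL_CP": [90.0, 0.0, 40.0],
}
for cell_index, stats in enumerate(cellwise_spread(products)):
    print(cell_index, stats)
```

Cells with a large range are those where model outputs are most likely to depend on the choice of product.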
We are unable to quantify the uncertainties in our map because this would require defining error ranges and probabilities for each specific source of uncertainty, e.g. all the sources used as input. However, we expect that uncertainties are scale dependent, with higher uncertainty for specific grid cells than for entire countries or the whole globe, and that estimates for specific years are less certain than trends for longer time periods. Therefore, we recommend application of the data set mainly for global-scale research or for continental studies. Use of the data set for studies constrained to single countries is only suggested after carefully checking the resolution and origin of the input data used for the specific country (Supplement S1), checking the assumptions made to fill data gaps (Supplement S1), and testing whether the rules and assumptions made in the downscaling (Fig. 2) are appropriate for that specific case.
The data set presented in this study shows AEI, and we need to highlight that the spatio-temporal patterns in the development of AEI cannot directly be translated into patterns in area actually irrigated (AAI), information that is required for many applications. The main reason is that the percentage of AEI that is actually used for irrigation differs across countries (Siebert et al., 2010, 2013). In addition, interannual variability in AAI is higher than that in AEI because the area that is actually irrigated in a specific season also depends on the specific weather conditions (supplementary irrigation) or on the performance of the water supply infrastructure (in particular in arid regions). Data on the percentage of AEI that is actually being used for irrigation, e.g. provided by GMIA5 (Siebert et al., 2013) for year 2005, could be used as a starting point for long-term studies but modification would be required for historical years to account for dynamics in construction and abandonment of irrigation infrastructure. The same applies to irrigation water use. A study on global groundwater depletion, in which hydrological modelling was combined with groundwater well observations and total water storage trends as derived from observations by the GRACE (Gravity Recovery and Climate Experiment) satellites, found that independent estimates of groundwater depletion could best be simulated by the model if it is assumed that farmers in groundwater depleted areas only use 70 % of the optimal irrigation water volume (Döll et al., 2014). However, it is not known whether this reduction in water use was achieved by reducing AAI or by reducing irrigation water application on AAI.
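As a starting point of the kind described above, AAI can be approximated by applying a fixed percentage of AEI actually irrigated to the AEI time series. The sketch below makes the strong assumption that the percentage observed for one year holds for all years, which the text explicitly cautions against for historical periods; the function name and all numbers are hypothetical.

```python
def estimate_aai(aei_by_year, pct_actually_irrigated):
    """Approximate area actually irrigated (AAI) from AEI.

    aei_by_year maps year -> AEI (ha); pct_actually_irrigated is the
    percentage of AEI in use, assumed here to be constant over time.
    """
    if not 0.0 <= pct_actually_irrigated <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    return {
        year: aei * pct_actually_irrigated / 100.0
        for year, aei in sorted(aei_by_year.items())
    }


# Hypothetical AEI values (ha) and a hypothetical percentage in use.
aai = estimate_aai({1980: 200.0, 2005: 300.0}, pct_actually_irrigated=80.0)
print(aai)
```

For historical years the constant percentage would need to be replaced by a time-varying one that reflects construction and abandonment of irrigation infrastructure.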
Despite the uncertainties and limitations described earlier, we are convinced that application of HID will improve model results in many fields of research, e.g. for all applications that have so far used the national scaling approach to derive trends in irrigated land (examples are described in Sect. 1). In addition, the data set may also be used in sociohydrological research (Baldassarre et al., 2013; Sivapalan et al., 2014) to study two-way interactions between humans and water resources or in sustainability research more generally. One advantage of HID is that trends in irrigated land are determined by the official land use data and are therefore independent of trends in socio-economic variables, such as gross domestic product (GDP), prices, or population density, which makes it possible to study interactions between the extent of irrigated land and socio-economic development. For studying relationships with physical properties, such as soil suitability, slope, or climate, it is however recommended to use the sub-national inventory of historical statistics (AEI_SU) or the gridded products based on EARTHSTAT historical cropland extent, because some relationships with physical variables have been used to develop the HYDE cropland data set (Klein Goldewijk et al., 2011), so that our gridded products based on HYDE are not completely independent of these variables.

Conclusions
The historical irrigation data set (HID) describes the development of area equipped for irrigation (AEI) for the period 1900-2005. For the first time, sub-national historical irrigation statistics were collected and incorporated into the data set, resulting in an improved consideration of changes in the spatial pattern of irrigated land. A new method was developed and applied to downscale the sub-national irrigation statistics to 5 arcmin resolution. In contrast to previous approaches, the downscaling method aims to harmonize the downscaled irrigated area with historical cropland and pasture data, which represents an important improvement for many potential applications of the data set, including global hydrological modelling, modelling of changes in crop productivity, or climate impact assessments.

Figure 4. (a) Classification of irrigation areas in dry areas, wet rice cultivation areas, and other wet irrigation areas and (b) development of global AEI for historical irrigation data set (HID; AEI_HYDE_FINAL_IR) in these zones for the period 1900-2005; (c) population density for year 2005 according to the HYDE database (Klein Goldewijk et al., 2010) and (d) number of people in the three different irrigation zones in the period 1900-2005; (e) aridity index according to the CGIAR-CSI Global Aridity and ET database (CGIAR-CSI, 2014; Zomer et al., 2008) and (f) change in mean aridity on irrigated land in the period 1900-2005; (g) mean annual river discharge in the period 1961-1990 calculated with WaterGAP 2.2 (Müller Schmied et al., 2014) and (h) change in global mean of natural river discharge on irrigated land in the period 1900-2005.

(
but may have been irrigated in the past), we used a Euclidean allocation routine which assigned to each grid cell without irrigation the irrigated rice share of the nearest grid cell with irrigation. All the other grid cells were classified as wet and include humid or sub-humid regions in which

Figure 5. Spatial and temporal evolution of global area equipped for irrigation (AEI) for five time steps (1900, 1930, 1960, 1980, and 2005) based on the product AEI_HYDE_FINAL_IR of the historical irrigation data set (HID). The maps are presented at global scale and for two selected close-up areas, namely western USA and South Asia, for each time step.

Figure 6. Comparison of the historical irrigation data set (HID) for year 1910 (developed using HYDE land cover, central estimate; AEI_HYDE_FINAL_IR) (a) with a map showing irrigated area in the western part of the USA in year 1909 (Whitbeck, 1919) (b), a map showing irrigated area in the western part of the USA in year 1911 (Bowman, 1911) (c), and an irrigation map for year 1910 developed by multiplying area equipped for irrigation (AEI) in year 2005 with scaling factors derived from historical changes of AEI at country level (d).

Figure 7. Comparison of the historical irrigation data set (HID) (developed using HYDE land cover, central estimate; AEI_HYDE_FINAL_IR) (a, c, e) to irrigation maps developed by multiplying area equipped for irrigation (AEI) in year 2005 with scaling factors derived from historical changes of AEI at country level (b, d, f) for years 1900 (a, b), 1960 (c, d) and 1980 (e, f).

Figure 8. Ratio between the area equipped for irrigation according to the sub-national irrigation statistics (AEI_SU) used for the historical irrigation data set (HID) and cropland extent (a-c) or total cropland productivity (kcal ha−1 yr−1, d-f) per country for years 1970 (a, d), 1990 (b, e) and 2005 (c, f); fraction of AEI in regions with mainly rice irrigation (g), mean aridity weighted with AEI (h), and mean river discharge weighted with AEI (km3 yr−1, i). Cropland productivity (kcal ha−1 yr−1) was calculated based on crop production data for years 1969-1971 (d), 1989-1991 (e) and 2004-2006 (f) and cropland extent for years 1970, 1990 and 2005 extracted from the FAO FAOSTAT database (FAO, 2014b).

Table 1. Spatial resolution, land use data used in downscaling of area equipped for irrigation (AEI) compiled from sub-national statistics (AEI_SU), consistency rules in downscaling and AEI lost in downscaling for products in the global historical irrigation data set HID.
a Klein Goldewijk et al. (2011), b http://www.earthstat.org.

to provide water to crops. It includes area equipped for full/partial control irrigation, equipped lowland areas, and areas equipped for spate irrigation (FAO, 2014a), but it excludes rainwater harvesting. AEI is reported in many national census databases and international databases such

when irrigation occurs during the year of inventory, or during ≥ 2 of the 4 years prior to the inventory (US Department of Agriculture