SACRA – a method for the estimation of global high-resolution crop calendars from a satellite-sensed NDVI

To date, many studies have performed numerical estimations of biomass production and agricultural water demand to understand the present and future supply–demand relationship. A crop calendar (CC), which defines the date or month when farmers sow and harvest crops, is an essential input for the numerical estimations. This study aims to present a new global data set, the SAtellite-derived CRop calendar for Agricultural simulations (SACRA), and to discuss advantages and disadvantages compared to existing census-based and model-derived products. We estimate global CC at a spatial resolution of 5 arcmin using satellite-sensed normalized difference vegetation index (NDVI) data, which corresponds to vegetation vitality and senescence on the land surface. Using the time series of the NDVI averaged from three consecutive years (2004–2006), sowing/harvesting dates are estimated for six crops (temperate-wheat, snow-wheat, maize, rice, soybean and cotton). We assume time series of the NDVI represent the phenology of one dominant crop and estimate CCs of the dominant crop in each grid. The dominant crops are determined using harvested areas based on census-based data. The cultivation period of SACRA is identified from the time series of the NDVI; therefore, SACRA considers current effects of human decisions and natural disasters. The difference between the estimated sowing dates and other existing products is less than 2 months (< 62 days) in most of the areas. A major disadvantage of our method is that the mixture of several crops in a grid is not considered in SACRA. The assumption of one dominant crop in each grid is a major source of discrepancy in crop calendars between SACRA and other products. The disadvantages of our approach may be reduced with future improvements based on finer satellite sensors and crop-type classification studies to consider several dominant crops in each grid. The comparison of the CC also demonstrates that identification of wheat type (sowing in spring or fall) is a major source of error in global CC estimations.


Introduction
Recent population growth has increased biomass demand significantly, and humans have expanded cropland globally.Agriculture occupies more than 70 % of world water usage and has a large impact on the water cycle (Rost et al., 2008).Consequently, simulations of biomass production and agricultural water demand are necessary to understand the present and future supply-demand relationship.To date, many studies have estimated biomass accumulation (Fischer et al., 2000;Tan and Shibasaki, 2003;Stehfest et al., 2007) and agriculture water demand (Döll et al., 2002;Hanasaki et al., 2008;Rockström et al., 2009;Siebert and Döll, 2010;Pokrel et al., 2012).Those studies estimated biomass production and agricultural water demand with numerical models using meteorological forcing data and land surface parameters.A crop calendar (CC) is an essential input to estimate biomass production and agricultural water demand accurately with those numerical models.CCs define the date or month when farmers sow and harvest crops.There are three major approaches to developing CC data sets: census-based; model-based; and Earth observation-based.
The first approach, the census-based method, estimates CCs by collecting and integrating agricultural census data provided by international and national organizations such as the Food and Agriculture Organization (FAO) and the United Published by Copernicus Publications on behalf of the European Geosciences Union.1), and the other boxes indicate the results from our data processes.
States Department of Agriculture (USDA).The census-based CC products are represented by MIRCA2000 (Portmann et al., 2010; Monthly Irrigated and Rain-fed Crop Areas around the year 2000) and Sacks et al. (2010).The census-based products have the advantage of high reliability in regions that have sufficient census data.However, they also have the disadvantage of low reliability in regions that have no cen-sus data.Additionally, the spatial resolution of census-based products is limited because of the sampling scheme (Portmann et al., 2010).Because only one CC is defined per administrative unit for each crop, differences in CCs for the same administrative unit are not considered.
Model-based approaches generate CCs using crop growth models.These models simulate crop growth based on meteorological forcing data such as temperature, solar radiation, and soil moisture.In particular, accumulated temperature is widely used to indicate phenological progress.Hanasaki et al. (2008) estimated global CCs for several crops using the Soil and Water Integrated Model (SWIM; Krysanova et al., 2000).Waha et al. (2012) simulated the sowing dates of major annual crops based on climatic conditions and cropspecific temperature requirements.The crop growth models have the advantage of accurate crop growth simulation in cases of well-calibrated parameters.However, proper calibration is difficult in areas where observation data are insufficient.Additionally, the crop growth model, being based on environmental processes, is of limited accuracy with respect to the identification of sowing dates, because the sowing date is heavily affected by human decisions.
Finally, Earth observation-based studies estimate the CC using time series from satellite observations.Time series of vegetation indices (VIs) correspond well to vegetation vitality and senescence on the land surface.In this context, satellite-derived VIs have been widely used to classify crop type and to monitor crop growth at the regional scale (Mingwei et al., 2008;Sakamoto et al., 2005Sakamoto et al., , 2010;;Wardlow and Egbert, 2008;Wardlow et al., 2007).An advantage of satellite-derived data is its spatial resolution (less than 1 km).However, a few studies have estimated global CCs with satellite-derived data.Yorozu et al. (2005) estimated a global CC using the normalized difference vegetation index (NDVI), but they did not compare their results to other global CC data sets.
In this paper, we present a new global data set, the SAtellite-derived CRop calendar for Agricultural simulations (SACRA).Using satellite-sensed NDVI data, we estimate the global CC at a spatial resolution of 5 arcmin (∼ 9.2 km at the Equator).This study aims to develop a high-resolution and highly accurate CC product by combining the satellitederived NDVI with a census-based product.We also aim to discuss the advantages and disadvantages of our satellitederived CC, compared to existing census-based and modelderived products.The products are available for download at http://data-assimilation.jp/opendata/sacra/sacra_des.html.

Materials and methods
This section describes the methods applied to produce SACRA according to a defined data processing scheme (Fig. 1).SACRA is produced from four different data sets: time series of the NDVI; land cover data; reanalysis temperature data; and census-based agricultural data (Table 1).This study estimates the CC for six crops (temperate-wheat, snowwheat, rice, maize, soybean, and cotton) that are widely cultivated around the world (Table 2a).We treat temperate-wheat and snow-wheat separately because our method is unsuitable for estimating the sowing date in grid areas where the surface is covered by snow during the cultivating period (e.g., Russia and North China; see Sect.2.3 for details).

Dominant crop and census-based sowing/harvesting data
Firstly, we identify the dominant crop at a spatial resolution of 5 arcmin using MIRCA2000.The grid of SACRA is set to that of MIRCA2000.Portmann et al. (2010) compiled irrigated and rain-fed areas of 26 crop types at a spatial resolution of 5 arcmin (cf.Table 4 in Portmann et al., 2010).In other words, we can obtain 52 classes of crop areas at each grid (i.e., both irrigated and rain-fed areas of 26 crop types).Their crop calendars in major and second cultivation seasons are also defined in MIRCA2000.Since our method cannot consider the mixture of several crops in a grid (see Sect. 2.2.2 for details), we consider only one dominant crop in each grid.
We define the dominant crop in the major cultivation season as that which has the maximal harvested area in the grid, out of 52 possible crops (considering rain-fed and irrigated areas separately; cf.Appendix I in Portmann, 2011).If more than two crops have identical harvested areas, the early order of the crops is chosen to be the dominant crop (e.g., for irrigated wheat in India (Uttar Pradesh) with two identical areas with different cropping periods; cf.Table I -211 of Appendix I in Portmann, 2011).The dominant crop in the second cultivation season is determined from those crops whose cultivation periods do not overlap more than 3 months with that of the dominant crop in the major cultivation season.Secondly, we obtain the sowing and harvesting months of the dominant crop in both major and second cultivation seasons, using MIRCA2000.At each grid, we use the sowing and harvesting months.The census-based sowing and harvesting months are used to calibrate crop calendar parameters in Sect.2.3.
Finally, we classify temperate-wheat and snow-wheat (originally classified as "wheat" in MIRCA2000) using reanalysis temperature data (Table 2).Again, our method is unsuitable for the estimation of sowing dates for grids where the surface is covered by snow during cultivation.If the minimum monthly averaged temperature during the cultivating period is below 5.0 • C, the wheat is categorized as snow-wheat.In this categorization, we use MIRCA2000-derived cultivating periods (from sowing month to harvesting month) and reanalysis temperature (Hirabayashi et al., 2008).Hirabayashi et al. (2008) compiled 3-hourly surface temperature data by statistical methods, the parameters of which had been obtained from available surface observations.Here, we simply use the reanalyzed temperature of the nearest 30 arcmin grid from MIRCA2000's 5 arcmin grids.
The resulting global distribution of dominant crops in SACRA in the major cultivation season is shown in Fig. 2a.The minimum monthly averaged temperature during the cultivation period of the dominant crop is shown in Fig. 2b.Regions showing the minimum monthly averaged temperatures below 5.0 • C in Fig. 2b are categorized as snow-wheat (purple) or other crops (grey) in Fig. 2a.The categories of temperate-wheat and snow-wheat classify whether or not the surface is covered by snow during cultivation.Note that the classification of temperate-wheat and snow-wheat is independent of the classification of spring-wheat and winterwheat.The classification of spring-wheat and winter-wheat depends on the sowing season (spring or fall).

VEGETATION/SPOT NDVI data
Vegetation indices are simple, graphic indicators to assess whether the targeting area contains live, green vegetation or not.In this study, we use the NDVI defined by the following: where VIS and NIR indicate the spectral reflectance in the visible and near-infrared bands.The formula is based on the fact that chlorophyll absorbs VIS, whereas the mesophyll leaf structure scatters NIR (Pettorelli et al., 2005).The NDVI correlates with the accumulation and decomposition of leaf cell tissue.Therefore, we are able to detect crop growth with the time series of the NDVI over the cropland.The time series of a satellite-sensed NDVI at a double-cropping pixel in China is shown in Fig. 3a.As shown in Fig. 3a, peak dates can be clearly identified from the time series of the NDVI.In this study, we use a 10-day composite NDVI provided by VEGETATION/SPOT (Maisongrande et al., 2004).
To reduce the effect of clouds, the best index slope extraction (BISE) method (Viovy et al., 1992) is applied to the time series of the NDVI (Fig. 3a).To estimate the CC with the smooth time series of the NDVI, we use an averaged NDVI over 3 years (2004)(2005)(2006).Hereafter, this averaged NDVI is indicated by SPOT-NDVI in this paper.The time series of the NDVI has inter-annual variability as shown in Fig. 3b.

Aggregation of the NDVI
Two NDVI data sets (NDVI-Pure and NDVI-Crop; 5 arcmin resolution) are aggregated from the original SPOT-NDVI (1 km resolution) using two land cover data sets: Global Land Cover Characterization, version 2.0 (GLCC; Loveland et al., 2000) and Ecoclimap, version 2.0 (Faroux et al., 2013).The GLCC and Ecoclimap data are provided by the US Geological Survey and Meteo France, respectively.Schematic imagery of the aggregated NDVI-Pure and NDVI-Crop data is shown in Fig. 4. The NDVI-Pure and NDVI-Crop data are aggregated by averaging 1 km NDVI pixels where both GLCC and Ecoclimap agree on the cropland (i.e., at a higher level confidence; Fig. 4a).However, it is possible for there to be no pixel where both GLCC and Ecoclimap agree on the cropland.In this case, only the NDVI-Crop is aggregated by averaging the pixels where the GLCC and Ecoclimap disagree, but where one of them agrees on cropland (i.e., a lower level confidence; Fig. 4b).The NDVI-Pure is undefined in the latter case.The NDVI-Pure, containing only  higher confidence grids, is used to identify the two CC parameters (Sect.2.3).The NDVI-Crop is used to produce the global CC in Sect.2.4.The two aggregations (spatial and temporal) aim to obtain a smoother time series of the NDVI by removing the phenology of non-dominant and voluntary crops.

Normalization of the NDVI
Absolute peak values of the NDVI differ depending on climate conditions and density of crops.Therefore, we normalize NDVI data to consider variety over a wide range of environmental conditions at the global scale.First, we identify cropping intensity using the time series of the NDVI.We define the peak of the NDVI (NDVI pk ) and the date of the peak (t pk ) if the time series of the NDVI satisfy Eqs. ( 2) and (3): where the boundary is cyclic (i.e., NDVI 0 = NDVI 36 and NDVI 1 = NDVI 37 ) since we have 36 NDVI data per year from 10-day composite data of the SPOT-NDVI.We assume an increase/decrease in the NDVI before/after the peak of the NDVI, as shown in Eqs. ( 2) and (3).The cropping intensity is equal to the number of peaks of the NDVI, up to three times per year.Second, we detect the lowest NDVI between peaks (NDVI btm ).Finally, NDVI data are normalized using the following equations: where nNDVI represents the normalized NDVI.Subscripts btm, bas, and snow denote bottom, base and snow, respectively.The NDVI snow is a parameter to avoid a remnant irregular NDVI mainly caused by snow cover reflection.The NDVI snow is set at 0.20, which corresponds to 40 % of the snow cover over the land surface (Dye and Tucker, 2003).
Figure 3c shows a schematic image of the normalization of the NDVI at the double-cropping pixel in China.As shown in Fig. 3c, the NDVI bas can be different for each peak.We do not need to avoid the negative nNDVI in this normalization process.The normalization is applied for both the NDVI-Pure and the NDVI-Crop.
The detected cropping intensity with the NDVI-Crop is compared with a climate-based estimation (Zabel et al., 2014).Zabel et al. (2014) estimated potential cropping inten-  Zabel et al. (2014), and MIRCA2000 simplified cropping intensity for irrigated (IRC) and rain-fed (RFC) crop classes in six administrative units (six boxes A-F in Fig. 5a and b).Averaged cropping intensities over the administrative units are shown as "Cropping intensity"."Code" and "Crop" represent the assigned code of the administrative unit in SACRA and the dominant crop in the major cultivation season in this study.The simplified cropping intensities are annual harvested area divided by maximum monthly cropped area, which are defined for both irrigated and rain-fed classes of 26 crop types.Table 3 shows a comparison of estimated cropping intensity in this study and that of Zabel et al. (2014) for the six administrative units (six boxes A−F in Fig. 5a and b).For the comparison, a simplified cropping intensity for irrigated (IRC) and rain-fed (RFC) crop classes from MIRCA2000 is also described.Here, simplified cropping intensities are defined as "annual harvested area" divided by "maximum monthly cropped area", which are defined for both irrigated and rain-fed classes of 26 crop types in MIRCA2000.The averaged cropping intensities over the administrative units are shown in the table.We illustrate the time series of the NDVI-Crop in the six administrative units in Fig. 6 to investigate the difference in cropping intensity.The time series of the NDVI-Crop, averaged over the administrative units, are shown in Fig. 6 for 2004, 2005, and 2006, and averaged from 2004 to 2006 (red, blue, magenta, and green lines in Fig. 6).

Cropping intensity (yr
In Brazil Rio Grande do Sul (box A) and US Mississippi (box F), the average cropping intensity in this study is smaller than in Zabel et al. (2014).On the other hand, our estimations are close to the simplified cropping intensity by MIRCA2000.Zabel et al. (2014) estimated potential cropping intensity, which provides a reason for the overestimation of cropping intensity compared to our study.A mixture of bimodal and nearly constant NDVI-Crop (black lines) is shown in Brazil Rio Grande do Sul (Fig. 6a).The nearly constant NDVI is characteristic of a tropical forest.The NDVI-Crop data may not represent the phenology of the cropland  2014) and MIRCA2000, with the only exception being irrigated crop in MIRCA2000 in India Uttar Pradesh.In India Uttar Pradesh, the harvested area of irrigated crops is larger than that of rainfed crops (cf.Table I -211 in Portmann, 2011), suggesting our estimation of cropping intensity corresponds to the irrigated crop.We see clear trimodal and bimodal NDVI-Crop (green lines) in China Henan and Kenya (Fig. 6b and e).Again, Fig. 5b shows the average cropping intensity for the dominant crop in the major cultivation season, according to Zabel et al. (2014).Generally, farmers do not conduct multiple cropping with wheat only.Zabel et al. (2014) reported multiple-cropping intensities for other crops in two administrative units.The other possibility is that overestimations of cropping intensity derive from a mixture of phenology from different crops or vegetation in France (box C).On the other hand, Zabel et al. (2014) and MIRCA2000 may underestimate cropping intensity in Kenya, where a clear bimodal NDVI-Crop (green line) is detected in Fig. 6e.It is also possible that the bimodal NDVI-Crop can be derived from mixture of phenology from different crops.

First estimation of global crop calendar
This study estimates sowing and harvesting dates (t sw and t hv ) using two CC parameters (nNDVI sw and nNDVI hv ) and a time series of nNDVI data.The sowing and harvesting dates are determined by the following: where subscripts sw and hv denote sowing and harvest, respectively.Figure 7a and b show schematics of identification of sowing and harvesting dates for temperate crops (temperate-wheat, rice, maize, soybean, and cotton) and snow-wheat.In the multiple cropping grids, sowing and harvesting dates are also determined for each cropping season (except for the sowing date of snow-wheat).Our method is unsuitable for the estimation of sowing dates of snow-wheat  2a).
because we assume an increase in the NDVI from sowing date to peak in Eq. ( 6).However, the NDVI decreases if the surface is covered by snow (Fig. 7b).Therefore, in this process, we determine both sowing and harvesting dates for temperate crops, and only the harvesting date for snow-wheat.
The two CC parameters used for the determination of sowing and harvest dates are defined for each crop type with the exception of the nNDVI sw of snow-wheat.We calibrated the two CC parameters (nNDVI sw and nNDVI hv ) for each crop type as described in the following paragraph.
To remove the noise of the time series of NDVI data as much as possible, we use limited grids (hereafter, calibration grids) to estimate the two CC parameters.The calibration grids satisfy the following conditions: (1) single cropping defined by cropping intensity; (2) dominant crop occupying more than 25 % of the total cropland area (using land cover fraction data from 26 crop types in MIRCA2000); (3) up to five grids from the same administrative unit of MIRCA2000; (4) NDVI pk is larger than NDVI snow and (5) contains NDVI-Pure (i.e., using only higher-level confident grids).Once the parameters nNDVI pl and nNDVI hv are determined, sowing and harvesting dates can be determined using Eq. ( 6).The values of the two CC parameters nNDVI pl and nNDVI hv are calibrated for each crop to minimize the errors over calibra-Figure 8. Scheme used to adjust the cultivation period of SACRA to that of MIRCA2000.t sw (t hv ) and t sw−adj (t hv−adj ) denote sowing (harvesting dates) for the first estimation and subsequent to the adjustment, respectively.For temperate crops, sowing and harvesting dates are moved (advanced or postponed) to adjust the cultivation period to MIRCA2000.In this treatment, the ratio of t pk − t sw to t hv − t pk is preserved as the ratio of t pk − t sw−adj to t hv−adj − t pk .For snow-wheat, the harvesting date has not changed (i.e., t hv = t hv−adj ).Sowing date is determined by the cultivation period of MIRCA2000 and the harvesting date of the first estimation.
tion grids between determined sowing/harvesting dates and MIRCA2000 (see Appendix A for details).Table 4 shows the number of calibration grids, the calibrated two CC parameters and averaged errors in sowing/harvesting dates for six crop types.

SACRA data sets
The global sowing and harvesting dates are determined by Eq. ( 6) using time series of the nNDVI-Crop and two CC parameters (first estimation in Fig. 1, except for the sowing date of snow-wheat).Our method detects the cultivation season using time series of the satellite-sensed NDVI.However, our algorithm carries the possibility of overestimating or underestimating cultivation periods.The cultivation period (from sowing date to harvesting date) in our scheme is largely affected by the shape of the NDVI (i.e., kurtosis of the NDVI curve).Therefore, our method can result in unrealistic cultivation periods in some grids (e.g., less than 60 days) if NDVI-Crop does not represent the phenology of the dominant crops due to mixture of phenology from other crops and vegetation.Therefore, we adjust the length of the cultivation period to be equal to MIRCA2000 to avoid the unrealistic cultivation periods.For the temperate crops, sowing and harvesting dates are moved (advanced or postponed) to adjust the cultivation period to MIRCA2000.In this treatment, the ratio of t pk − t sw to t hv − t pk is preserved as the ratio of t pk − t sw−adj to t hv−adj − t pk , (Fig. 8a), where t sw (or t hv ) and t sw−adj (or t hv−adj ) denote sowing (or harvesting) dates for the first estimation and after the adjustment, respectively.For snow-wheat, the harvesting date is fixed (i.e., t hv = t hv−adj ).The sowing date is determined by the cultivation period of MIRCA2000 and the harvesting date of the first estimation (Fig. 8b).Here, we use the cultivation period in MIRCA2000 from the 15th of the sowing month to the 15th of the harvesting month.For multiple-cropping grids, the corresponding cultivation season in MIRCA2000 (i.e., major or second cultivation seasons) from each cropping is determined by the following: major season second season when Mon sw(major),1st ≤ t pk < Mon sw(second),1st Mon sw(second),1st ≤ t pk < Mon sw(major),1st , where Mon sw,1st denotes the 1st of the sowing month in MIRCA2000.Subscripts major and second denote major and second cultivation seasons, respectively.Here, we consider the cyclic boundary of the calendar.We apply the cultivation period of the major cultivation season in grids where no dominant crop in the second cultivation season is defined.
The adjusted sowing and harvesting dates are referred to as SACRA and are discussed in the next section.

Results and discussion
This section provides validation and discussion regarding the produced SACRA data set.However, true validation is hard to achieve in global studies.Therefore, we compare the estimated CC with other CC data produced using other estimations, either census-based or model-based.

Comparison with census-based and model-based approaches
We compare SACRA with two CC data sets: MIRCA2000  subscript ave denotes average.F and F −1 define functions to compute η from DOY and DOY from η, respectively.Equations ( 8) and ( 9) are used to compute the averaged sowing date considering the cyclic boundary of the calendar.The spatial distributions of the sowing dates for the dominant crops in the major cultivation season for SACRA, MIRCA2000 and W12 are shown in Fig. 9.The sowing dates are illustrated in grids where the dominant crop of SACRA in the major cultivation season is temperate-wheat, snowwheat, maize, rice, soybean or cotton.If multiple sowing exists in the SACRA dates for the major cultivation season, we illustrate the sowing dates derived from the largest NDVI pk among the sowing dates.For MIRCA2000, we illustrate the 15th of the sowing month.Although three different sets of data are produced from the different approaches (Earth observation-based, census-based, and model-based), they have similar spatial patterns (Fig. 9a-1, b-1, and c-1).Their sowing dates generally represent spring in their grids.Figure 9a-2, b-2, and c-2 show the sowing dates in South Asia, selected arbitrarily to highlight the higher spatial variability in SACRA.Since SACRA uses high-resolution satellite data, it reflects a variety of sowing dates in the same administrative unit, as shown in Fig. 9a-2 (e.g., Thailand, Vietnam, and Laos).W12 also resulted in a variety of sowing dates for Vietnam (Fig. 9c-2) due to the estimation being based on climatic data.The detection of variability in the CC within an administrative unit is an advantage of Earthobservation-based and model-based approaches compared to census-based methods.On the other hand, SACRA carries the disadvantage of undetection of a crop calendar in grids where the NDVI-Crop is not defined (boxes in Fig. 9a-2, b-2,  and c-2).
While SACRA can detect the variability of the CC within administrative units, it is difficult to demonstrate whether the variability is correct around the globe without knowledge of the local CC information.Therefore, the following subsection discusses the differences in the CCs among the three products, with sowing dates.We compare the sowing dates of the three products averaged over administrative units defined in MIRCA2000.

Comparison of averaged CC over MIRCA2000 administrative units
To investigate the characteristics of the three approaches, we compare the averaged sowing dates over administrative units.
The averaged sowing dates of the dominant crop in the major cultivation season are computed by three products (SACRA, MIRCA2000, and W12) using Eqs.( 8) and ( 9), averaging not over years but over administrative units.We assign the 15th of the sowing month for MIRCA2000.The sowing dates of temperate-and snow-wheat in SACRA are compared with the sowing dates of "wheat" in MIRCA2000 and W12.Here, only single cropping grids are used to compute the averaged sowing date for SACRA.We suppose that NDVI bas represents the condition with little vegetation in the winter (or dry) season.In the multiple cropping grids, the NDVI bas in the summer (or wet) season can be higher than other seasons due to mixture of phenology from other crops and vegetation (e.g., NDVI bas in June is higher than that in December in Fig. 3c).Therefore, accuracy of CCs in multiple cropping grids may be lower than that in single cropping grids.
The differences in the sowing dates of the dominant crop are shown in Fig. 10.The administrative units are illustrated if their dominant crop in the major cultivation season is temperate-wheat, snow-wheat, maize, rice, soybean or cotton.The difference for each specific crop type is shown in Fig. A3.The difference between the two data sets is less than 2 months (< 62 days; yellow-or green-colored units) in most of the administrative units in Fig. 10a and b. Figure A3 shows that wheat contains the largest number of units, with a large difference in sowing dates (> 93 days; red-or blue-colored units).We observe a later signalling trend in sowing dates in SACRA than in W12 (Fig. 10b; green-or blue-colored units).The direction of the later signalling trend is dominant in wheat, maize, rice and soybean (Fig. A3-a2, A3-b2, A3-c2, and A3-d2).
Table 5 compares the sowing dates of the three products in 16 administrative units that fall into the category of Table 5. Administrative units with large absolute differences in sowing date (> 150 days) between SACRA and MIRCA2000 or SACRA and Waha et al. (2012).SCR, MRC, and W12 in the table represent SACRA, MIRCA2000, and Waha et al. (2012), respectively."Code" and "Crop" represent the assigned code of the administrative unit in SACRA and the dominant crop in the major cultivation season in this study.Bold numbers denote large absolute differences in sowing date between SACRA and the other data sets.The table compares sowing dates averaged over the administrative units.Only single cropping grids are used to compute the averaged sowing date for SACRA.disagreement (more than 150 days) between SACRA and MIRCA2000 or SACRA and W12.We present the cultivation seasons (from sowing to harvesting dates) in 16 administrative units in Fig. A4 to understand the discrepancies in the CCs of the three products.To interpret the disagreements in Table 5 and Fig. A4, we use Fig. 11, which shows the time series of the NDVI-Crop, average NDVI-Crop, average NDVI-Forest, and average temperature data.Here, average means the average over administrative units.NDVI-Forest is produced by following NDVI-Crop production, but with forest pixels using GLCC and Ecoclimap land cover data.That is, NDVI-Forest is aggregated by averaging the pixels where the GLCC or Ecoclimap agree on forest.We observe disagreements in 12 administrative units where the dominant crops in the major cultivation season are temperate-or snow-wheat, shown in Table 5. Cultivated wheat in the world can be classified into two types depending on the sowing season.The FAO (2002) notes the following: (1) the first type of wheat is planted in the fall to germinate and develop into young plants that remain in the vegetative phase during the winter and resume growth in the early spring; (2) the second type of wheat is usually planted in the spring and matures in late summer, but can be sown in the fall in countries that experience mild winters, such as in South Asia, North Africa, the Middle East, and at lower latitudes.
In Azerbaijan (code 8) and Kazakhstan (code 225), large differences (> 150 days) are observed between SACRA and W12, while the differences between SACRA and MIRCA2000 are < 50 days.In Australia Queensland (code 36), China Gansu (code 102), China Ningxia (code 117), and India Himachal Pradesh (code 192), a large difference is observed between SACRA and MIRCA2000.In the above six administrative units, the assumed wheat type may be incorrectly identified in MIRCA2000 and W12.On the other hand, SACRA's sowing dates differ from both MIRCA2000 and W12 for Beijing (code 99), Hebei (code 107), Henan (code 109), Shaanxi (code 119), Shandong (code 120), and Shanxi (code 122) in China.In these six administrative units, SACRA has possibly detected incorrect signals of the NDVI (e.g., signals of forest or other crops).As shown in Fig. A3, wheat is related to the largest number of units with disagreements in sowing dates.Disagreements in sowing dates are also observed between MIRCA2000 and W12.The identification of wheat type (sowing in spring or fall) may be a major source of error in global CC estimations.
In Bhutan (code 49), India Sikkim (code 207), and Uruguay (code 394), clear unimodal NDVI-Crops are not observed in Fig. 11.The accuracy of SACRA is affected by the accuracy of the land cover data sets.It is known that the 1 km land cover data sets contain uncertainties (Herold et al., 2008;Nakaegawa, 2012).For example, forests may be classified as croplands in the 1 km land cover data sets.Also, NDVI and land cover data sets at 1 km resolution may be insufficient to detect the phenology of the dominant crop in the administrative units.
In China Yunnan (code 126), we observe disagreements between SACRA and MIRCA.In China Yunnan, we observe that some of the grids have bimodal NDVI-Crop (black lines) in Fig. 11.It is possible that NDVI-Crop represents a mixed phenology of non-dominant and voluntary crops.Our approach is unable to consider a mixture of phenology.This may explain the disagreement between SACRA's sowing dates and those of other products.
A major discrepancy in crop calendars between SACRA and other products can be due to the selection of one dominant crop in each administrative unit.It is possible that SACRA detects the CC of similar maximal harvested area or another sub-crop in MIRCA2000 (e.g., for irrigated wheat in India Uttar Pradesh with two identical areas with different cropping periods; cf.Table I -211 of Appendix I in Portmann, 2011).The disadvantages of our approach may be reduced with future improvements based on finer satellite sensors to avoid mixture of phenology from other crops and vegetation, and crop-type classification studies to consider several dominant crops in each grid.
Taking into account the extreme disagreement between SACRA and MIRCA2000 or W12 in some regions (Table 5 and Fig. 10), it becomes important to determine which CC is more reliable.However, it is difficult to decide which data set is more accurate in global studies.For example, the identification of the wheat type (sowing in spring or fall) is difficult, as shown in disagreements among the three products in 12 administrative units (Table 5).Also, it is possible that both are correct, e.g., if they referred to different time periods.MIRCA2000 possibly used the conditions of nearby administrative units because of a lack of more detailed reference information.Therefore, it is difficult to determine the absolute accuracy of the products through comparison.However, combined application of several products is useful for taking the uncertainty of the CC into account.Since SACRA, MIRCA and W12 detect the CC from different approaches, a comparison of their results is useful for cross-validation.

Advantages and disadvantages of SACRA
This subsection discusses the advantages and disadvantages of SACRA compared to two other approaches: census-based and model-based methods.Additionally, this subsection also discusses possible improvements of SACRA.Table 6 summarizes the advantages and disadvantages of the censusbased methods, model-based methods, and SACRA.
An advantage of SACRA is its fine spatial resolution compared to the other two data sets.Therefore, different CCs in the same administrative unit are considered in SACRA (Fig. 9a-2).The model-based method can also result in a variety of CCs.However, it is difficult to demonstrate that the variability is correct around the globe without knowledge of local CC information.
The spatial resolution of SACRA is equal to the maximum resolution of the satellite-sensed NDVI and the crop classification map.At present, the NDVI from the MODerateresolution Imaging Spectroradiometer (MODIS) is available at a spatial resolution of 250 m (e.g., Zhang et al., 2006).However, present studies provide global crop classification maps at a spatial resolution of 5 arcmin (e.g., Monfreda et al., 2008;Portmann et al., 2010).Present land cover data sets, such as GLCC and Ecoclimap, only contain a small number of coarse agricultural classes.At the regional scale, many studies have been performed to classify crops using satellitesensed data (e.g., Mingwei et al., 2008;Wardlow and Egbert, 2008;Wardlow et al., 2007).In this study, we use the crop classification map from MIRCA2000 at a spatial resolution of 5 arcmin.SACRA can be recalculated with higherresolution remote sensing data (e.g., from future Sentinel-2 data; Drusch et al., 2012) if higher-resolution land cover maps become available.The higher-resolution CC products can contribute to hydrological/agricultural studies that aim to conduct simulations at a spatial resolution of 1 km (e.g., Wood et al., 2011;Kotsuki et al., 2015).
A second advantage of SACRA is its easy detection of cultivation using time series of the NDVI.Because agriculture is controlled by human decisions, it is difficult to estimate from the census-based and model-based methods whether or not farmers actually perform cultivation.Additionally, agriculture is affected by disasters, such as droughts, inundations, heat waves, and cool summer damages.The satellite-sensed NDVI can be used to detect whether the managed land is currently being cultivated or is temporarily in disuse.It is also possible to identify cropping intensity with time series of the NDVI (Fig. 5).
However, SACRA has the disadvantage that it is inapplicable to future simulations such as impact assessments of climate change because SACRA is produced using past observational data.Future changes in agricultural water demand and biomass production are major issues in assessment studies of climate change (Hanjra and Qureshi, 2010).An advantage of SACRA compared to MIRCA2000 is that SACRA provides not only sowing/harvesting dates but also the peak date from the time series of the NDVI.The peak date can be used to calibrate the parameters of crop growth models that simulate the growing stage during cultivation (e.g., Horie 1987).SACRA can contribute to future assessment studies indirectly by being utilized to calibrate their model parameters.
It should be noted that our method is unsuitable for detecting the sowing dates of snow-wheat.Furthermore, our algorithm carries the possibility of overestimating or underestimating cultivation periods in the first estimation.Therefore, we adjusted the length of the cultivation period of SACRA to MIRCA2000.For the temperate crops, sowing and harvesting dates are moved (advanced/postponed) to adjust to the cultivation period.For snow-wheat, the sowing date is defined with respect to the cultivation period of MIRCA2000 and the harvesting date of the first estimation.The adjustment indicates that the cultivation period of SACRA completely relies on that of MIRCA2000.However, the cultivation period can be different in the same administrative unit because of different climates.We plan to utilize both censusbased and model-based cultivation periods for the adjust- ment.Also, utilization of snow cover products from satellite (e.g., the MODIS snow cover product; Hall et al., 2002) or land surface data assimilation (e.g., the global land data assimilation system; Rodell et al., 2004) would help to adjust the sowing date of snow-wheat appropriately.
Our method has the disadvantage that the mixture of several crops in a grid is not considered.Therefore, we assume that the NDVI-Crop represents the phenology of the dominant crop at each grid.Because of this assumption, our approach contains the following disadvantages: (1) the censusbased and model-based approaches can contain CCs for more than one crop for every unit (e.g., MIRCA2000 and W12), while SACRA only contains the CC for the dominant crop in a given unit; (2) census-based data can deliver a CC for either irrigated or rain-fed crops, while SACRA cannot separate them.In fact, CCs for irrigated and rain-fed cropland can be different; and (3) our approach cannot consider the mixture of phenology from several crops and voluntary crops.It should also be noted that the length of the cultivation period in SACRA is adjusted to MIRCA2000 to avoid the unrealistic cultivation periods.A major discrepancy in crop calendars between SACRA and other products can be due to the selection of one dominant crop in each administrative unit.The disadvantages of our approach would be reduced with future improvements based on finer satellite sensors and crop-type classification studies to consider several dominant crops in each administrative unit.
The idea behind CC estimation in SACRA is very simple, and therefore easily applicable to the global cropland and additional satellite observations.Due to data scarcity, we resort to averaged data from three consecutive years (2004)(2005)(2006).The data product generated from this study therefore is of limited use for the direct parameterization of global growth models.However, taking into account the current development in Earth observation (e.g., the development of the European Space Agency's Sentinel series), data scarcity will soon be less of an issue.The proposed method represents a simple and thus easily applied approach that can potentially make use of large amounts of temporally, highly resolved, global, optical, Earth observation data, and may provide interesting input parameters for global land surface models.For example, the estimation of an annual crop calendar is a major part of our scope.
Finally, the accuracy of SACRA depends on the accuracy of the NDVI and land cover data sets.The wavelengths required for the calculation of the NDVI are relatively easy to measure from satellite sensors.Therefore, the accuracy of the NDVI largely depends on the temporal resolution of adequate observations (e.g., the revisiting time of the applied systems and weather at satellite observation, such as cloud cover).Usage of several satellite sensors (e.g., MODIS) would help to reduce the uncertainty of the NDVI.With respect to the accuracy of land cover data, we combine two land cover data sets to reduce the uncertainty of the land cover data.The land cover data sets, however, contain uncertainties (Herold et al., 2008;Nakaegawa, 2012).The land cover data sets could be improved by developing new algorithms, increasing the amount of supervised data, and utilizing multi-spectrum information.Further improvements of the land cover data sets would contribute to improvement of SACRA.

Summary
This study aimed at producing a new crop calendar, SACRA, using a satellite-sensed NDVI.This paper describes the methods to produce SACRA from the following four data sets: time series of the NDVI, land cover data sets, reanalysis temperature, and census monthly agricultural data.The resulting SACRA data set included three products at a spatial resolution of 5 arcmin: (1) the spatial distribution of the dominant crop in major and second cultivation seasons; (2) time series of the NDVI of the cropland; and (3) sowing, peak, and harvesting dates of the dominant crop.The advantages and disadvantages of SACRA compared to other global crop calendars are summarized as follows.
First, an advantage of SACRA is its finer spatial resolution compared to other existing global crop calendars.However, a disadvantage is that the mixture of several crops in a grid is not considered in SACRA.Second, the cultivation period of SACRA is identified from the time series of the NDVI, which corresponds to vegetation vitality.Therefore, SACRA considers current effects of human decisions and natural disasters.Satellite-sensed NDVI data enable detection of whether the managed land is currently cultivated or temporarily in disuse.Finally, SACRA is inapplicable to future simulations because it is based on Earth-observation data.However,

Figure 1 .
Figure 1.Data processing scheme for the production of the global satellite-derived crop calendar (SACRA).The bold numbers inside the boxes indicate the subsections in this paper where the different processing steps are described.The numbers outside the boxes indicate the spatial resolution of the respective data sets.The top four boxes indicate input data (Table1), and the other boxes indicate the results from our data processes.

Figure 2 .
Figure 2. Global distribution of (a) dominant crops in SACRA, and (b) minimum monthly averaged temperature ( • C) during the cultivation period of the dominant crops.Both panels represent the dominant crop in the major cultivation season.

Figure 3 .
Figure 3.Time series of the NDVI at a double-cropping pixel in China (116.76 • E, 32.60 • N).(a) represents the original NDVI and the NDVI with the BISE correction.(b) represents the NDVI with the BISE correction from 2004 to 2006.(c) represents the NDVI average over 2004-2006 and the normalized NDVI (nNDVI).

Figure 4 .
Figure 4. Schematic image of the aggregation of NDVI-Pure and NDVI-Crop from the 1 km resolution original NDVI.Small-sized squares with thin lines represent pixels of the original SPOT-NDVI (1 km resolution).Large-sized squares with bold lines represent grids of SACRA and MIRCA2000 (5 arcmin resolution).Pixels with diagonal lines (from upper left to bottom right and from bottom left to upper right) show where GLCC and Ecoclimap agree on the cropland.

Figure 5 .
Figure 5. Global distribution of (a) detected cropping intensity in this study and (b) climate-based estimation of cropping intensity suitability (i.e., maximal cropping intensity; Zabel et al., 2014).The cropping intensity of the dominant crop is illustrated in Fig. 5(b).

Figure 6 .
Figure 6.Time series of the NDVI for six administrative units in Table3; Brazil Rio Grande do Sul (code 73), China Henan (code 109), France (code 157), India Uttar Pradesh (code 211), Kenya (code 227), and US Mississippi (code 364).Black lines show time series of NDVI-Crop averaged over 2004-2006 in the administrative units (i.e., the NDVI of all grids in the administrative units).Green lines show the average of black lines (i.e., averaged over the administrative units).Red, blue, and magenta representNDVI-Crop in 2004, 2005, and 2006, respectively, averaged over the administrative units.

Figure 7 .
Figure 7. Scheme of identification of sowing and harvesting dates in this study.Sowing and harvesting dates (t sw and t hv ) are identified together with a vegetation index time series (black lines) and two crop calendar (CC) parameters: nNDVI sw and nNDVI hv .(a) and (b) indicate temperate crops (temperate-wheat, maize, rice, soybean, and cotton) and snow-wheat, respectively.The two CC parameters are defined for the six crop types (Table2a).

Figure 9 .
Figure 9. Sowing dates (unit: day of year) of dominant crops in the major cultivation season for (a) SACRA, (b) MIRCA2000, and (c) Waha et al. (2012).Left panels (a-1, b-1, and c-1) and right panels (a-2, b-2, and c-2) show global and South Asian maps, respectively.Sowing dates are illustrated in grids where the dominant crop is temperate-wheat, snow-wheat, maize, rice, soybean or cotton.The major seasons at multiple cropping grids are determined by Eq. (7) for SACRA.Panels of SACRA contain sowing dates of the major cultivation season for both single and multiple cropping grids.Boxes in right panels represent the area where SACRA did not detect the crop calendar in some grids while MIRCA2000-and Waha et al. (2012)-defined.

Figure 10 .
Figure 10.Differences in sowing dates of the dominant crop in the major cultivation season (a: SACRA-MIRCA2000; b: SACRA-Waha et al., 2012).The administrative units are illustrated if their dominant crop in the major cultivation season is temperate-wheat, snow-wheat, maize, rice, soybean or cotton.The differences for each specific crop type are shown in Fig. A3.Only single cropping grids are used to compute the averaged sowing date for SACRA.

Figure A1 .
Figure A1.Average error of sowing/harvesting dates (blue/orange lines; unit days) among calibration grids for six crops (a: temperate-wheat; b: snow-wheat; c: maize; d: rice; e: soybean; and f: cotton).Dots in the figures represent minimized errors and nNDVI sw or nNDVI hv (i.e., the two calibrated parameters in Table4).

Figure A2 .Figure A4 .
Figure A2.Global distribution of calibration grids for the six crops.The calibration grids are illustrated larger than the real grid size (5 arcmin) for emphasis.

Table 1 .
Characteristics and sources of the four global input data sets.

Table 2 .
List of crops.Checkmarks denote crops used for (a) estimation of crop calendar and (b) comparison of cropping intensity.

Table 3 .
Comparison of estimated cropping intensity in this study,

Table 4 .
Number of calibration grids, two calibrated crop calendar parameters (nNDVI sw and nNDVI hv ), and averaged errors (nNDVI) in sowing/harvesting dates between determined crop calendar and MIRCA2000 among calibration grids of the six crop types.Unit Temp.-wheatSnow-wheat Maize Rice Soybean Cotton

Table 6 .
Advantages and disadvantages of three types of global crop calendars: census-based, model-based, and Earth observation-based.