Improved Representation of Agricultural Land Use and Crop Management for Large Scale Hydrological Impact Simulation in Africa using SWAT+

To date, most regional and global hydrological models either ignore the representation of cropland or consider crop cultivation in a simplistic way or in abstract terms without any management practices. Yet, the water balance of cultivated areas is strongly influenced by applied management practices (e.g. planting, irrigation, fertilization, harvesting). The SWAT+ model represents agricultural land by default in a generic way where the start of the cropping season is driven by accumulated heat units. However, this approach does not work for tropical and sub-tropical regions such as sub-Saharan Africa, where crop growth dynamics are mainly controlled by rainfall rather than temperature. In this study, we present an approach to incorporate crop phenology using decision tables and global datasets of rainfed and irrigated croplands with the associated cropping calendar and fertilizer applications in a regional SWAT+ model for Northeast Africa. We evaluate the influence of the crop phenology representation on simulations of Leaf Area Index (LAI) and Evapotranspiration (ET) using LAI remote sensing data from Copernicus Global Land Service (CGLS) and WaPOR ET data respectively. Results show that a representation of crop phenology using global datasets leads to improved temporal patterns of LAI and ET simulations, especially for regions with a single cropping cycle. However, for regions with multiple cropping seasons, global phenology datasets need to be complemented with local data or remote sensing data to capture additional cropping seasons. In addition, the improvement of the cropping season also helps improve soil erosion estimates, as the timing of crop cover controls erosion rates in the model. With more realistic growing seasons, soil erosion is largely reduced for most agricultural Hydrologic Response Units (HRUs), which can be considered a move towards substantial improvements over previous estimates. We conclude that regional and global hydrological models can benefit from improved representations of crop phenology and the associated management practices. Future work on incorporating multiple cropping seasons in global phenology data is needed to better represent cropping cycles in regional to global hydrological models.

https://doi.org/10.5194/hess-2021-247 Preprint. Discussion started: 25 May 2021. © Author(s) 2021. CC BY 4.0 License.


Introduction
Even though cropland cultivation covers over 40 % of the planet's ice-free land surface, most regional and global hydrological models either ignore the representation of cropland or consider crop cultivation in a simplistic way or in abstract terms without any management practices (Sood and Smakhtin, 2015; Srivastava et al., 2020). In most cases, the models neither address crop phenological development nor distinguish between different crops and the associated management practices (e.g. planting, irrigation, fertilization, harvesting) (Chen and Xie, 2012; Srivastava et al., 2020). Yet, the water balance of cultivated areas is strongly influenced by applied management practices and their precise timing (Twine et al., 2004). In the context of global change studies, realistic representation of agricultural systems is a major concern as changes in climatic factors affect crop growth and productivity of agricultural systems (Makowski et al., 2014). Therefore, hydrological models that simulate cropland ecosystems should have a reasonable representation of crop phenology and the associated management practices of these ecosystems (Lokupitiya et al., 2009).
The SWAT+ model (Bieger et al., 2017; Arnold et al., 2018), which is a restructured version of SWAT (Soil and Water Assessment Tool; Arnold et al., 1998), utilizes the principles of the EPIC crop growth model (Williams and Singh, 1995) to simulate agricultural land by default in a generic way where the phenological development of crops from planting is driven by accumulated heat units (Arnold et al., 1998). However, the primary controlling factor for the start of the growing season in tropical and sub-tropical regions such as sub-Saharan Africa is rainfall (Lotsch et al., 2003; Alemayehu et al., 2017). Waha et al. (2013) describe the crop growing season in sub-Saharan Africa as the period in which temperature and moisture are suitable for growth, determined by the start and end of the main rainy season. Zhang et al. (2005) showed that the onset of seasonal vegetation green-up across Africa can be directly linked to rainfall seasonality. Studies (e.g. Msigwa et al., 2019; Nkwasa et al., 2020) have further pointed out how the existing multiple cropping seasons in tropical and subtropical climates within an agricultural year coincide with the rainfall and irrigation patterns. Therefore, the use of heat units to trigger the start of the cropping seasons could lead to inconsistencies in crop phenology simulations for tropical and sub-tropical regions.

Croplands include various types with associated differences in crop physiology and management practices (Lokupitiya et al., 2009; Yin and Struik, 2009). The phenological change during the vegetation cycle of crop types actively controls the ET process through internal physiology by increasing the amount of leaf stomata with canopy growth (Gong et al., 2014). In the SWAT+ model, plant transpiration is simulated as a linear function of Leaf Area Index (LAI) and Potential Evapotranspiration (PET) (Neitsch et al., 2005). Thus, inconsistencies in crop simulations could lead to inaccurate estimates of canopy properties such as LAI and canopy height, resulting in uncertain estimates of ET (Alemayehu et al., 2016). Accurate estimations of ET in a hydrological model are important because ET is the central flux that defines land-atmosphere interactions (Mueller et al., 2011; Fisher et al., 2017).
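The dependence of transpiration on LAI described above can be illustrated with a minimal sketch. It assumes the LAI-based relation given in the SWAT theoretical documentation, where maximum transpiration scales linearly with LAI up to a value of 3.0 and equals PET above it; the threshold of 3.0, the function name and the example values are assumptions for illustration and are not taken from this study.

```python
def max_transpiration(pet_mm: float, lai: float) -> float:
    """Maximum plant transpiration (mm/day) as a linear function of LAI, capped at PET.

    Assumption: transpiration = PET * LAI / 3.0 for LAI <= 3.0, and PET otherwise.
    """
    if pet_mm < 0 or lai < 0:
        raise ValueError("PET and LAI must be non-negative")
    return pet_mm * min(lai, 3.0) / 3.0


# Hypothetical day with 6 mm of PET: a canopy that develops late transpires less.
for lai in (0.5, 1.5, 3.0, 4.5):
    print(f"LAI={lai:.1f} -> transpiration={max_transpiration(6.0, lai):.1f} mm/day")
```

Under this relation, a cropping season simulated in the wrong months places low LAI values in the period of high atmospheric demand, which is one way a phenology error could propagate into the ET estimate.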
Additionally, changes in cropland use and crop management have received little attention in hydrological impact assessments, yet these may have more significant impacts on model outputs such as soil erosion and sediment yield than rainfall and temperature (O'Neal et al., 2005). Abaci and Papanicolaou (2009) [...] significantly affect the impact of precipitation on soil erosion. Cropland practices cause great variations in the erodibility of cropland since soil erosion depends on what crop is grown and the crop cover density (Sundborg and White, 1982). The crop cover is crucial in the estimation of the C (crop management) factor in erosion models such as the Modified Universal Soil Loss Equation (MUSLE) used by SWAT+ (Lin et al., 2014). Other crop management practices, such as the amount of fertilizer applied, alter the soil's ability to produce biomass and thus its resistance to erosion (Souza et al., 2017). The timing and duration of soil cover on cropland are affected by the planting and maturity dates of the crop.

Previous studies have applied the SWAT model at a regional scale within and including sub-Saharan Africa (Schuol and Abbaspour, 2006; Schuol et al., 2008). However, these studies utilized the default generic way of representing agricultural land use without any management practices. Yet, Arnold et al. (2012) emphasized the need for realistic representation of local and regional crop processes to reliably simulate the water balance, erosion and nutrient yields in a SWAT model. It is therefore unclear whether these regional studies accurately represent the internal catchment processes of crop phenology and vegetation dynamics. Chawanda et al. (2020) describe one of the few regional applications of the latest SWAT+ version in a tropical region. The study highlighted that the inclusion of irrigation and reservoirs in the model set up using decision tables (Arnold et al., 2018) led to an improvement in the simulations of discharge and ET.
Regional cropping phenology datasets and management practices have been developed using remote sensing approaches (Li et al., 2014; Estel et al., 2016; Xiong et al., 2017) and non-remote sensing approaches, including observational census data (Potter et al., 2010; Portmann et al., 2010; Lu and Tian, 2017; Iizumi et al., 2019; Hurtt et al., 2020; Jägermeyr et al., in revision), to integrate into regional agricultural and hydrologic modelling frameworks. However, remote sensing approaches have been criticized as not being able to detect crop types and cropping sequences without local knowledge or ground truth data (Bégué et al., 2018). Nevertheless, these spatially explicit global cropping phenology data sets have not been utilized in regional hydrological models to improve the land use and crop representation.

The novelty of this study is in improving land use and crop process representation for large scale hydrological modelling using SWAT+ by (1) proposing an approach that reasonably incorporates crop phenology using decision tables and global datasets of rainfed and irrigated croplands with the associated management practices in a regional SWAT+ model for Northeast Africa, (2) evaluating model improvements of crop representation by using the remote sensing LAI from Copernicus Global Land Service (CGLS) and ET derived from WaPOR (Water Productivity through Open access of Remotely sensed derived data, FAO, 2018), and (3) evaluating how the consideration of crop phenology and the associated management practices affects long term water-driven soil erosion estimates. We do not intend to fully model soil erosion but show how improvements in crop representation can impact soil erosion estimates.

Study area
Our study area (Figure 1) is the north-eastern part of Africa, which covers 4,489,000 km². This area includes, wholly or partially, countries of the Nile basin: Uganda, Kenya, Tanzania, Rwanda, Burundi, Sudan, South Sudan, Ethiopia and Egypt. The area includes the main Nile basin with sub-basins such as the Victoria Nile, Blue Nile, White Nile, Atbara, Baro-Akobo-Sobat, Bahr El Jebel and Bahr El Ghazal. The agricultural sector is responsible for nearly 75 % of the water withdrawal within the basins (Swain, 2011). A strong latitudinal wetness gradient characterizes the climate of the region. The areas north of 18° N remain dry for most of the year while there is a gradual increase of monsoon precipitation amounts towards the south (Camberlin, 2009).

Modelling approach using SWAT+
SWAT+ is a revised version of SWAT that offers greater flexibility in connecting spatial units in the representation of management operations (Bieger et al., 2017; Arnold et al., 2018). It is a semi-distributed river basin scale model that relies on the physical characteristics of a catchment. It divides a basin into sub-basins connected by a stream network, which are further divided into Hydrologic Response Units (HRUs). HRUs represent areas within the sub-basin that comprise the same land use, soil, slope and management practices (Neitsch et al., 2005). SWAT+ also introduces landscape units (LSU) to allow separation of lowland (wetland) processes from upland processes (Bieger et al., 2017). SWAT+ applies the hydrological water balance concept, Eq. (1), as the basic driver of all hydrological processes.

$$SW_t = SW_0 + \sum_{j=1}^{t} \left( R_{day,j} - Q_{surf,j} - E_{a,j} - w_{seep,j} - Q_{gw,j} \right) \Delta t \qquad (1)$$

where $SW_t$ and $SW_0$ are the final and initial soil water content respectively (mm d⁻¹), $R_{day}$ is the amount of rainfall (mm d⁻¹), $Q_{surf}$ is the amount of surface runoff (mm d⁻¹), $E_a$ is the ET amount (mm d⁻¹), $w_{seep}$ is the percolation amount (mm d⁻¹), $Q_{gw}$ is the return flow amount (mm d⁻¹), $\Delta t$ is the change in time (day) and $j$ is the index.

The model estimates erosion and sediment yield for each HRU using the Modified Universal Soil Loss Equation (MUSLE) (Williams and Berndt, 1977), Eq. (2). The MUSLE uses runoff energy rather than rainfall to estimate sediment yields, making it suitable at a daily time scale.

$$Sed = 11.8 \left( Q_{surf} \cdot q_{peak} \cdot Area_{hru} \right)^{0.56} \cdot K_{USLE} \cdot C_{USLE} \cdot P_{USLE} \cdot LS_{USLE} \cdot CFRG \qquad (2)$$

where $Sed$ is the sediment yield (tonnes/day), $Q_{surf}$ is the surface runoff volume (mm/day), $q_{peak}$ is the peak runoff rate (m³/s), $Area_{hru}$ is the area of the HRU (ha), $K_{USLE}$ is the USLE soil erodibility factor, $C_{USLE}$ is the USLE crop management factor, $P_{USLE}$ is the USLE support practice factor, $LS_{USLE}$ is the USLE topographic factor and $CFRG$ is the coarse fragment factor.
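To make the role of the crop management factor concrete, the following is a minimal sketch of the MUSLE calculation. It assumes the standard form of the equation in the SWAT theoretical documentation, with a coefficient of 11.8 and an exponent of 0.56; the function name and all input values are hypothetical and chosen only for illustration.

```python
def musle_sediment_yield(q_surf_mm, q_peak_m3s, area_ha,
                         k_usle, c_usle, p_usle, ls_usle, cfrg):
    """Daily sediment yield (tonnes) from the MUSLE, assuming the standard SWAT form."""
    # Runoff energy term: surface runoff volume x peak runoff rate x HRU area.
    runoff_energy = (q_surf_mm * q_peak_m3s * area_ha) ** 0.56
    return 11.8 * runoff_energy * k_usle * c_usle * p_usle * ls_usle * cfrg


# Hypothetical HRU on a wet day: same runoff, different crop cover (C factor).
sparse_cover = musle_sediment_yield(10.0, 2.0, 100.0, 0.3, 0.5, 1.0, 1.2, 1.0)
dense_cover = musle_sediment_yield(10.0, 2.0, 100.0, 0.3, 0.2, 1.0, 1.2, 1.0)
print(f"sparse cover: {sparse_cover:.1f} t, dense cover: {dense_cover:.1f} t")
print(f"ratio: {dense_cover / sparse_cover:.2f}")
```

Because the sediment yield is directly proportional to the C factor, whether the canopy is present during the high-runoff days of the rainy season matters as much as the runoff itself.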
Land use and management operations in SWAT+ can be scheduled using either or both decision tables and management schedules. However, decision tables enable the user to model intricate sets of rules and their subsequent actions by allowing them to add conditions for scheduling management (Arnold et al., 2018). Nkwasa et al. (2020) compared the use of decision tables to management schedules and concluded that decision tables provided higher flexibility in representing agricultural practices.
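The idea of a decision table, a set of conditions that must all hold before an action is taken, can be sketched in a few lines. The code below is a Python analogue only and is not SWAT+ decision table syntax; the variable names, the planting window and the 20 mm rainfall threshold are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Rule = Tuple[str, List[Callable[[State], bool]]]

# Hypothetical rules: each action fires only when every one of its conditions is true.
RULES: List[Rule] = [
    ("plant", [
        lambda s: s["crop_growing"] == 0,                 # field is currently fallow
        lambda s: s["day_of_year"] >= s["window_start"],  # inside the calendar window
        lambda s: s["day_of_year"] <= s["window_end"],
        lambda s: s["rain_7day_mm"] >= 20.0,              # assumed rainfall trigger
    ]),
    ("harvest", [
        lambda s: s["crop_growing"] == 1,
        lambda s: s["phu_fraction"] >= 1.0,               # crop has reached maturity
    ]),
]


def evaluate(state: State) -> List[str]:
    """Return the actions whose conditions are all satisfied for the current day."""
    return [action for action, conditions in RULES if all(c(state) for c in conditions)]


wet_day = {"crop_growing": 0, "day_of_year": 165, "window_start": 150,
           "window_end": 200, "rain_7day_mm": 32.0, "phu_fraction": 0.0}
dry_day = dict(wet_day, rain_7day_mm=4.0)
mature_day = dict(wet_day, crop_growing=1, phu_fraction=1.02)

print(evaluate(wet_day))     # planting conditions met
print(evaluate(dry_day))     # inside the window but too dry
print(evaluate(mature_day))  # crop mature
```

Combining a calendar window from a global cropping calendar with a moisture condition is one way such rules could tie planting to the onset of the rains rather than to temperature.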

Crop growth cycle with heat unit scheduling
SWAT+ uses the simplified version of the EPIC growth model to simulate plant growth (Neitsch et al., 2005). As in the EPIC model, phenological plant development is based on the daily accumulated heat units or on calendar dates, while plant growth can be inhibited by temperature, water, nitrogen and phosphorus stress (Neitsch et al., 2005; Arnold et al., 2012). The heat unit theory assumes that plants have heat requirements that can be quantified and linked to maturity. The total number of heat units required by the plant to start growing or to reach maturity is calculated as in Eq. (3).

$$PHU = \sum_{d=1}^{m} \left( T_{av} - T_{base} \right), \qquad T_{av} > T_{base} \qquad (3)$$

where $PHU$ is the total heat units required for plant maturity, $T_{av}$ is the mean daily temperature (°C), $T_{base}$ is the plant's minimum temperature for growth (°C), $d = 1$ is the day of planting and $m$ is the number of days required for a plant to reach maturity.

This heat index is solely a function of climate, calculated by SWAT+ using the provided long-term weather data (Neitsch et al., 2005).
While scheduling by heat units is convenient for temperate regions that are mainly driven by temperature, users need to consider that cropping seasons in tropical and sub-tropical regions are primarily driven by water availability (Alemayehu et al., 2017). Hence, the use of heat units causes incorrect cropping seasons for these regions.
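The heat unit bookkeeping described above can be sketched as follows. This is a minimal illustration, assuming daily heat units of max(T_av - T_base, 0); the base temperature of 8 °C, the requirement of 1600 heat units and the constant 24 °C temperature series are hypothetical values, not parameters from this study.

```python
def accumulated_heat_units(daily_mean_temp_c, t_base_c):
    """Sum of daily heat units; days at or below the base temperature add nothing."""
    total = 0.0
    for t_av in daily_mean_temp_c:
        total += max(t_av - t_base_c, 0.0)
    return total


def days_to_maturity(daily_mean_temp_c, t_base_c, phu):
    """Number of days after planting at which accumulated heat units reach PHU."""
    total = 0.0
    for day, t_av in enumerate(daily_mean_temp_c, start=1):
        total += max(t_av - t_base_c, 0.0)
        if total >= phu:
            return day
    return None  # maturity not reached within the series


# Hypothetical warm tropical site: about 24 degC every day of the year.
tropical_temps = [24.0] * 365
print(days_to_maturity(tropical_temps, t_base_c=8.0, phu=1600.0))
```

With nearly constant temperatures, heat units accumulate at the same rate whichever day is chosen as the planting date, so temperature alone gives no indication of when the season should start. This is consistent with the point that water availability, not temperature, sets the cropping season in these regions.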

Default Model set up
The SWAT+ model was set up with the QGIS interface using the data in Table 1. An approach suggested by Chawanda et al. (2020) was used in the model set up since the state-of-the-knowledge harmonized land use product that is formatted in NetCDF was adapted in this study. By default, the cropland was represented in a generic way using heat units to trigger the cropping seasons. The study area was discretized into 768 landscape units and 12526 unsplit HRUs. The USDA Soil Conservation Service (SCS) curve number method was used to estimate surface runoff, the variable storage method was selected for flow routing and the Penman-Monteith method (Monteith, 1965) was used to calculate the potential evapotranspiration.

[...] was made on a pixel-by-pixel basis. Whichever crop layer fraction occupied the largest percentage of the rainfed and irrigated agricultural areas within a pixel was selected to represent cropland for irrigated and rainfed areas in that pixel. For example, in Figure 2, if the C4ann and C3nfx crops occupied a larger fraction within a pixel than the other cropland use fraction layers for irrigated and rainfed cropland respectively, they were selected to represent cropland use in that pixel. A crop map was developed from this pixel-by-pixel analysis and a representative crop was selected for each cropland use fraction based on literature (Leff et al., 2004), as shown in Table 2.
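The pixel-by-pixel selection of the dominant cropland layer can be sketched as below. C4ann and C3nfx are the layers named in the text; the other layer names and all fraction values are assumptions for illustration, and the handling of pixels without cropland is a design choice of this sketch rather than something stated in the study.

```python
from typing import Dict, Optional


def dominant_crop(fractions: Dict[str, float]) -> Optional[str]:
    """Return the cropland layer with the largest fraction, or None if the pixel has no cropland."""
    cropped = {name: value for name, value in fractions.items() if value > 0.0}
    if not cropped:
        return None
    return max(cropped, key=cropped.get)


# Hypothetical fractions for one pixel, kept separately for irrigated and rainfed cropland.
irrigated = {"C3ann": 0.02, "C4ann": 0.08, "C3per": 0.00, "C4per": 0.01, "C3nfx": 0.00}
rainfed = {"C3ann": 0.10, "C4ann": 0.05, "C3per": 0.02, "C4per": 0.00, "C3nfx": 0.30}

print("irrigated:", dominant_crop(irrigated))
print("rainfed:", dominant_crop(rainfed))
```

Each selected layer would then be mapped to a representative crop, as described for Table 2.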

Validation of model results
Our study focused on improved cropland use representation. We evaluated our simulations for LAI and ET for a period of 7 years (2009-2015) using remote sensing products from CGLS (https://land.copernicus.vgt.vito.be/) and WaPOR respectively. Studies (e.g., Alemayehu et al., 2017; Ha et al., 2018; Nkwasa et al., 2020) have demonstrated the capability of using remote sensing products to evaluate hydrological model outputs. Representative basins in the model, as shown in Figure 3, were selected to highlight the importance of incorporating global phenology datasets on LAI simulations in regional hydrological modelling. The selected basins were based on the reported cropping patterns that start with the rainy season (Waha et al., 2013), i.e. the Upper Blue Nile basin with a predominantly single cropping season, the Victoria basin with a double cropping season and the Nile delta with mainly a double irrigated cropping season (Sugita et al., 2017; M. El-Marsafawy et al., 2018). Crop HRUs within the selected sub-basins that occupied the largest areas were selected to reduce the effect of mixed LAI from different land cover classes when comparing with the remote sensing LAI.

To illustrate the impact of revised cropland use representation on model outputs, we compare the differences in soil erosion simulations between the default and the revised SWAT+ models. However, due to the sparse and poor quality records of erosion and sediment yield in this region (Haregeweyn et al., 2017), it was not possible to quantitatively validate erosion model results. Instead, we adopted a 'scientific validation' approach that is suitable for cases when observations for comparison with model outputs are limited and when the model is utilized to advance the knowledge of physical processes (Biondi et al., 2012).
We compared our erosion estimates for some catchments e.g. Upper Blue Nile with those from a few previous studies (Hurni, (Arnold et al., 1998;Srinivasan et al., 2010).
In addition, the uncalibrated models already had good water balance estimates (0.5 % and 0.4 % for the default and revised models respectively) calculated using Eq. (1) for the simulated period. We assume that the differences seen in the model setups originate primarily from the crop representation and management practices. Hence, we do not address issues concerning the SWAT+ model calibration and validation in this paper.
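A water balance check of the kind quoted above can be sketched as follows. The text does not state how the percentages were derived, so this sketch assumes they are closure errors: the residual of the water balance terms of Eq. (1), expressed as a percentage of total rainfall. The function name and all input values are hypothetical.

```python
def water_balance_error_percent(rain, q_surf, et, percolation, return_flow,
                                sw_initial, sw_final):
    """Water balance residual as a percentage of total rainfall (assumed definition)."""
    inputs = sum(rain)
    outputs = sum(q_surf) + sum(et) + sum(percolation) + sum(return_flow)
    storage_change = sw_final - sw_initial
    residual = inputs - outputs - storage_change
    return 100.0 * residual / inputs


# Hypothetical two-period totals in mm.
error = water_balance_error_percent(
    rain=[600.0, 400.0],
    q_surf=[100.0, 50.0],
    et=[350.0, 300.0],
    percolation=[60.0, 40.0],
    return_flow=[50.0, 35.0],
    sw_initial=120.0,
    sw_final=130.0,
)
print(f"closure error: {error:.1f} % of rainfall")
```

A small residual indicates that the simulated fluxes and the storage change account for nearly all of the rainfall input, which is the sense in which an uncalibrated model can still have a good water balance.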

LAI simulations
The simulated LAI from both the default and revised SWAT+ models was compared with the remote sensing LAI extracted for the maize, wheat and soy HRUs in the 3 selected sub-basins (Upper Blue Nile, Lake Victoria and Nile Delta). In the Upper Blue Nile basin, Figure 4(a) and Figure 4(c), there is an improved LAI simulation in the revised SWAT+ model, with the phenological development being captured in the correct major cropping season within the rainy season for both the rainfed and irrigated maize HRUs. Additionally, the revised SWAT+ model LAI strongly correlates (rd > 0.5) with the remote sensing (RS) LAI. Figure A1 [...] causes some errors in the model outputs. The global crop calendars also lack a time series dimension, which could be a substantial source of uncertainty in predicting phenological events of croplands. Another source of uncertainty could be the use of remote sensing LAI data (1 km resolution) in the evaluation, which does not represent a pure crop signal but rather the mixed vegetation within the pixel. Nkwasa et al. (2020) highlighted these scaling issues when using remote sensing products in model evaluation. Nevertheless, the remote sensing data still provides insights into the relationship between temporal vegetation growth and seasonal weather patterns.
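The comparison of temporal LAI patterns can be illustrated with a simple correlation calculation. The statistic behind the reported "rd" is not defined in the text, so the Pearson correlation coefficient is used here as an assumption; the monthly LAI values are invented for illustration and are not results from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length and at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly LAI for one crop HRU (January to December).
rs_lai = [0.3, 0.3, 0.4, 0.6, 1.0, 1.8, 2.6, 3.0, 2.4, 1.2, 0.6, 0.4]
revised_lai = [0.2, 0.2, 0.3, 0.5, 0.9, 1.6, 2.4, 2.8, 2.2, 1.0, 0.5, 0.3]
default_lai = [2.4, 1.0, 0.5, 0.3, 0.2, 0.2, 0.3, 0.5, 0.9, 1.6, 2.4, 2.8]

print(f"revised vs RS: r = {pearson_r(rs_lai, revised_lai):.2f}")
print(f"default vs RS: r = {pearson_r(rs_lai, default_lai):.2f}")
```

A season simulated in the wrong months yields a low or even negative correlation although the peak LAI is similar, which is why a timing-sensitive measure is informative for evaluating phenology.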

ET simulations
The annual average simulated agricultural ET increases from 732 mm y⁻¹ in the default SWAT+ model to 837 mm y⁻¹ in the revised model, bringing it closer to the WaPOR agricultural ET of 936 mm y⁻¹. Figure [...]

The inclusion of the global phenology and management practices shows that ET is one of the major components of a basin water balance that is greatly influenced by the seasonal vegetation growth cycles. Although the agricultural ET is improved with the incorporation of the global crop phenology, there is still an underestimation. This underestimation could be attributed to the missing multiple cropping seasons, especially in areas that are irrigated. Additionally, automatic irrigation was specified in the model, which applies water from an unlimited source to the field when the water stress is below a specified threshold (0.7) of the field capacity. However, this may not be realistic for all irrigation sites, causing uncertainties in irrigation applications, which affect the ET estimates. Nevertheless, the ET estimates could be further improved by model calibration.
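The size of the improvement can be checked with a short calculation using the three annual ET values quoted above. The function name is hypothetical; the arithmetic uses only numbers from the text.

```python
WAPOR_ET = 936.0    # mm per year, remote sensing reference
DEFAULT_ET = 732.0  # mm per year, default SWAT+ model
REVISED_ET = 837.0  # mm per year, revised SWAT+ model


def deficit_reduction_percent(reference, before, after):
    """Percentage by which the gap to the reference shrinks between two model versions."""
    deficit_before = reference - before
    deficit_after = reference - after
    return 100.0 * (deficit_before - deficit_after) / deficit_before


print(f"default deficit: {WAPOR_ET - DEFAULT_ET:.0f} mm per year")
print(f"revised deficit: {WAPOR_ET - REVISED_ET:.0f} mm per year")
print(f"reduction: {deficit_reduction_percent(WAPOR_ET, DEFAULT_ET, REVISED_ET):.1f} %")
```

The deficit falls from 204 mm y⁻¹ to 99 mm y⁻¹, a reduction of roughly one half.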

Erosion simulations
LAI is directly related not only to processes such as rainfall interception, evaporation, transpiration, soil evaporation and root depth, but also to soil erosion through canopy cover, which varies during the growth cycle of the plant. With a better [...] Additionally, with more biomass, more residue is generated, which could be more effective in reducing soil erosion even after the cropping season. Residue intercepts rain droplets so near the soil surface that the drops regain no fall velocity (Neitsch et al., 2011). For the Nile delta in Figure 6(b), the soil erosion estimates reduced further even though they were already insignificant.
The soil erosion estimates are reduced by a maximum of 625 t km⁻² y⁻¹, Figure 8(a), or up to 90 %, Figure 8(b), in some areas within the region when using the revised SWAT+ model as compared to the default model. The average regional soil erosion yield is reduced by 16 %. This reduction is attributed to the improved timing of the cropping seasons in correspondence with the start of the rainy season, which provides more canopy cover to intercept the raindrops. However, in some isolated regions, the revised SWAT+ model simulated an increase in soil erosion estimates as compared to the default model. In most of those regions, the global phenology data captures the irrigated cropping season, which often occurs in the dry season (Figure A2(c) and Figure A3(a)), and therefore does not represent the major growing season in the rainy season. This is because the global phenology data provides a single cropping season per pixel per year.
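The absolute and relative differences reported above can be computed per HRU as in the following sketch. The HRU names and erosion rates are hypothetical and only illustrate how a reduction in t km⁻² y⁻¹ and a reduction in percent relate to each other.

```python
from typing import Optional, Tuple


def erosion_change(default_rate: float, revised_rate: float) -> Tuple[float, Optional[float]]:
    """Absolute (t km-2 y-1) and relative (%) change from the default to the revised model."""
    absolute = revised_rate - default_rate
    # The relative change is undefined where the default model simulates no erosion.
    relative = 100.0 * absolute / default_rate if default_rate > 0 else None
    return absolute, relative


# Hypothetical HRUs: (default, revised) erosion rates in t km-2 y-1.
hrus = {"hru_a": (1000.0, 375.0), "hru_b": (40.0, 4.0), "hru_c": (200.0, 230.0)}
for name, (default_rate, revised_rate) in hrus.items():
    absolute, relative = erosion_change(default_rate, revised_rate)
    print(f"{name}: {absolute:+.0f} t km-2 y-1 ({relative:+.1f} %)")
```

A large percentage change can correspond to a small absolute change where erosion is already low, so the two measures are best read together.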
In order to validate the regional soil erosion estimates, the simulated soil loss from the revised SWAT+ model was compared with the spatial patterns in erosion rates from the literature. In the published literature, Ethiopia is one of the best documented countries in Northeast Africa, with only marginal information existing for other countries (Molina, 2009). The revised SWAT+ model shows that the regional soil erosion extent varies from zero to over 20500 t km⁻² y⁻¹ (Figure 9), revealing the severity of soil erosion in the Blue Nile basin (Ethiopian highlands) as compared to the other parts of the region.
For the Upper Blue Nile basin, the model estimated an erosion yield extent from 0 to 13000 t km⁻² y⁻¹ and a mean of 701 t km⁻² y⁻¹, which is slightly lower than but comparable to the net soil erosion mean of 734 t km⁻² y⁻¹ reported by Haregeweyn et al. (2017) and the soil erosion yield extents from zero to over 15000 t km⁻² y⁻¹ reported by Hurni (1985), Betrie et al. (2011) and Haregeweyn et al. (2017). Tamene and Le (2015) reported a net soil loss of 8500 t km⁻² y⁻¹ and 600 t km⁻² y⁻¹ in the Blue Nile and White Nile basins respectively. These estimates should be considered indicative, as comparing these values with the Northeast African regional model estimates can be challenging, mainly due to the differences in the sizes of the areas involved, which result from the different delineation procedures.

Even though the regional model underestimates the soil erosion in comparison with these localized studies, the order of magnitude is within the same range. The underestimation can be attributed to the finer resolution of datasets utilized by the local studies as compared to the coarse datasets utilized in the regional model. For example, Molnár and Julien (1998) calculated soil erosion using different DEM grid sizes and concluded that the estimated slope gradients decreased as the cell size increased, which influenced the topographic factor (LS) estimation. Additionally, the input global weather data has a resolution of 0.5°, which is too coarse to capture the spatial variability of weather at a finer scale. This has been a challenge for large scale hydrological modelling (Chawanda et al., 2020) that needs to be addressed for better performance.
Against that background, the soil erosion estimates in this study should not be regarded as exact quantifications but rather as close approximations. It is worth noting that the focus of this study was not soil erosion estimation but to illustrate a concept.

Conclusion
In this work, an approach has been developed for an improved representation of crop phenology and management in a regional SWAT+ model using decision tables and global datasets. In addition, global remote sensing datasets of LAI and ET have been used for model evaluation. A comparison of the simulated LAI revealed improved temporal growth patterns in agreement with remote sensing LAI, especially for regions with a single cropping cycle. However, for regions with multiple cropping cycles, only one cropping cycle was represented as most global phenology datasets provide a single cropping cycle per year.
The improvements in the SWAT+ model reduced the agricultural ET deficit by 50 % in comparison with the WaPOR ET, showing a strong linkage between hydrological response and agricultural land use representation. Additionally, this improvement in ET estimates is expected to reduce any calibration efforts needed to obtain the maximum possible ET as the physical process representation of crops is improved. A considerable reduction of 16 % in the average regional soil erosion estimates was noticed after implementing this approach. This impact on soil erosion estimates shows that a proper representation of crop processes is an important element for minimizing errors in soil erosion estimates.
There is a need for global phenology datasets with multiple cropping seasons for further improvements in the crop representation, especially for improving crop processes in irrigated areas or areas with multiple rainy seasons. The approach developed in this research lays a foundation for improved agricultural land use representation with associated management practices at regional and global scales, which will further improve regional to large scale hydrological and water quality impact assessments of global change.

Author contributions: AN and AG designed this study. JJ provided the phenology datasets. CJC and AN set up the model. AN performed the model simulations, primary analysis and drafted the paper. All authors contributed to results interpretation and reviewed the paper.

Competing Interests: The authors declare that they have no conflict of interest.