Drought impact links to meteorological drought indicators and predictability in Spain

. Drought affects many regions worldwide, and future climate projections imply that drought severity and frequency will increase. Hence, the impacts of drought on the environment and society will also increase considerably. Monitoring and early warning systems for drought rely on several indicators; however, assessments of how these indicators are linked to impacts are still lacking. Here, we explore the links between different drought indicators and drought impacts within six sub-regions in Spain. We used impact data from the European Drought Impact Report Inventory database and provide a new case study to evaluate these links. We provide evidence that a region with a small sample size of impact data can still provide useful insights regarding indicator–impact links. As meteorological drought indicators, we use the Standardised Precipitation Index and the Standardised Precipitation Evapotranspiration Index; as agricultural and hydrological drought indicators, we use a Standardised Soil Water Content Index and a Standardised Streamﬂow Index and a Standardised Reservoir Storage Index. We also explore the links between drought impacts and teleconnection patterns and surface temperature by conducting a correlation analysis, and then we test the predictability of drought impacts using a random forest model. Our results show that meteorological indices are best linked to impact occurrences overall and at long timescales between 15 and 33 months. However, we also ﬁnd robust links for agricultural and hydrological drought indices, depending on the sub-region. The Arctic Oscillation, Western Mediterranean Oscillation, and the North Atlantic Oscillation at long accumulation periods (15 to 48 months) are top predictors of impacts in the northwestern and northeastern regions, the community of Madrid, and the southern regions of Spain, respectively. We also ﬁnd links between temperature and drought impacts.

Abstract. Drought affects many regions worldwide, and future climate projections imply that drought severity and frequency will increase. Hence, the impacts of drought on the environment and society will also increase considerably. Monitoring and early warning systems for drought rely on several indicators; however, assessments of how these indicators are linked to impacts are still lacking. Here, we explore the links between different drought indicators and drought impacts within six sub-regions in Spain. We used impact data from the European Drought Impact Report Inventory database and provide a new case study to evaluate these links. We provide evidence that a region with a small sample size of impact data can still provide useful insights regarding indicator-impact links. As meteorological drought indicators, we use the Standardised Precipitation Index and the Standardised Precipitation Evapotranspiration Index; as agricultural and hydrological drought indicators, we use a Standardised Soil Water Content Index and a Standardised Streamflow Index and a Standardised Reservoir Storage Index. We also explore the links between drought impacts and teleconnection patterns and surface temperature by conducting a correlation analysis, and then we test the predictability of drought impacts using a random forest model. Our results show that meteorological indices are best linked to impact occurrences overall and at long timescales between 15 and 33 months. However, we also find robust links for agricultural and hydrological drought indices, depending on the subregion. The Arctic Oscillation, Western Mediterranean Oscillation, and the North Atlantic Oscillation at long accumulation periods (15 to 48 months) are top predictors of impacts in the northwestern and northeastern regions, the community of Madrid, and the southern regions of Spain, respectively. We also find links between temperature and drought impacts. The random forest model produces skilful models for most sub-regions. When assessed using a cross-validation analysis, the models in all regions show precision, recall, or R 2 values higher than 0.97, 0.62, and 0.68, respectively. Thus, our random forest models are skilful in predicting drought impacts and could potentially be used as part of an early warning system.

Introduction
Drought, as defined by Wilhite and Glantz (1985), "is a condition relative to some long-term average condition of balance between rainfall and evapotranspiration in a particular area, a condition often perceived as 'normal"'. The prediction of drought onset or end is a complex task. Drought severity is also difficult to measure or quantify. This is because drought depends on several factors, for instance, the duration, intensity, and the geographical extent of the event. Additional factors specific to each region also play a large role, such as the water demand with respect to water supply. All of these characteristics make drought difficult to identify and quantify. Drought has far-reaching impacts on society and the environment that may last for long time periods (Wilhite and Glantz, 1985).
There is not a common and straightforward definition of drought; however, all types of drought originate from a lack of precipitation. Many different definitions of drought have been developed by different disciplines. There are four main types of disciplinary definitions, i.e. meteorological, agricultural, hydrological, and socioeconomic. Meteorological drought often uses precipitation as its atmospheric parameter. Agricultural drought considers links between meteorological drought and its impacts on agriculture. Hydrological drought accounts for the repercussions of dry periods on surface and subsurface hydrology, for instance, on streamflow, groundwater, and reservoirs. Finally, socioeconomic drought considers the effects that drought has on the supply and demand of economic goods (Wilhite and Glantz, 1985).
There is already evidence that climate change, as a result of anthropogenic actions, has increased the risk of meteorological drought in southern Europe . Similarly, warmer temperatures have increased atmospheric evaporative demand, which has in turn increased drought severity over the past 50 years (Vicente-Serrano et al., 2014b). Climate change projections point towards a reduction in the water resources in Spain. Hydraulic infrastructures have been designed with safety margins; however, these may be surpassed due to the effects of climate change. Increased evapotranspiration, as a result of increased temperatures, together with a possible increase in the length of the irrigation period, might increase the water demand for irrigation and for agricultural use, which currently accounts for more than 70 % of total water demand. In addition, the energy sector is also dependent on the availability of water, which also makes it vulnerable to increased drought risk (Ministerio para la Transición Ecológica y el Reto Demográfico, 2020). Vulnerability to water scarcity and drought is also likely to increase due to challenges such as a growing population, population migration to more arid regions, urbanisation, increasing tourism, and pollution (Rossi and Cancelliere, 2013). As a result, the consequences of drought on the environment and society are becoming more important. Spain is a country that already has an intense use of water resources; hence, it is crucial to reinforce water management to provide future water security.
In order to lessen the impacts of drought on society and the environment, efficient mitigation measures are necessary. This means that effective drought monitoring and early warning systems (DEWSs) are essential. DEWSs reduce societal vulnerability to drought by maximising the lead time of early warnings to allow more time for the implementation of mitigation measures (Pozzi et al., 2013). These systems usually rely on different drought indicators that represent different parts of the water cycle. Drought indicators describe drought conditions, and examples of commonly used variables are precipitation, temperature, streamflow, groundwater and reservoir levels, soil moisture, and snowpack (Svoboda et al., 2016). DEWSs use a large variety of drought indicators, and the most commonly used one is the Standardised Precipitation Index (SPI; McKee et al., 1993). DEWSs usually use indicators based on variables that can be measured with ease and that are readily obtainable in time (Bachmair et al., 2016a). However, Bachmair et al. (2016a) revealed that although there has been increasing efforts on the research and practice of drought indicators, DEWSs are still not well linked with assessments on how drought impacts the environment and society. This is because links between drought indicators and impacts have not been sufficiently studied. This means that impact data are not being used to determine whether indicators are linked to the impacts of drought. The authors call for drought to be framed as a coupled dynamical system of the environment and society to fully understand drought impacts.
Drought-related impacts are complex to study and document because many sectors depend on water availability to produce goods and provide services. Bachmair et al. (2016a) also revealed that there are very few systematic approaches for the collection of impact data, except for agricultural drought. A good example of efforts to improve such documentation exists in Europe, where a drought impact inventory has been created, i.e. the European Drought Impact Report Inventory (EDII; Stahl et al., 2016). This database collects reported drought impacts for different European countries. Impacts are classified into major impact categories (e.g. agriculture and livestock farming, wildfires, public water supply, and forestry), and each category has several subtypes. Also, each drought impact event has, at the very least, information on the source of information, location, duration, and impact category and has a description.
Here, we investigate the links between different drought indicators and reported impacts from the EDII database for Spain. The time period studied is from August 1975 to May 2013. We aim to assess two meteorological indicators, the SPI and the Standardised Precipitation Evapotranspiration Index (SPEI; Vicente-Serrano et al., 2010), two hydrological indicators, streamflow and reservoir storage levels, and an agricultural indicator, soil water content. The main motivation for this study is to provide a new case study for the evaluation of these links and to test the usefulness of impact data in this region. Impact data from the EDII has already shown links with drought indicators in similar studies (Bachmair et al., 2015(Bachmair et al., , 2016bStagge et al., 2015).
The first objective of the study is to investigate how strong and robust the link between drought indicators and drought impacts is. We conduct a correlation analysis, and to further examine these indicator-impact links, we then use machine learning (a random forest model) to model and predict drought impact occurrences. Another goal of this analysis is to test the potential of such a method, with the available data, to predict future impacts. We aim to determine whether drought impacts can be skilfully predicted in this region.
To our knowledge, links between teleconnection patterns or temperature and drought impacts using the EDII database have not been studied before. We therefore investigate five teleconnection patterns (Feldstein and Franzke, 2017), the El Niño-Southern Oscillation (ENSO), the North Atlantic Os-cillation (NAO), the East Atlantic (EA) pattern, the Western Mediterranean Oscillation (WeMO), and the Arctic Oscillation (AO), as well as a surface temperature index, as possible predictors of drought impact occurrences. We aim to investigate whether these climate indices show links and are better predictors of impacts than the previously presented drought indicators. Our next objective is to determine what type of indicators are the best predictors of impacts in each sub-region. Last, the incompleteness of the impact reports from the EDII database challenges the quantification of impact occurrences, hence, we also investigate different ways to take incomplete reports into account. We investigate whether different impact quantification methods change the results in a significant way and, if so, how.
Our paper is structured as follows: in Sect. 2, we introduce the study area, the drought indicators, climate indices ,and vulnerability factors used and their data sets. Also, we show how we deal with incomplete impact data and the methods for the data analysis. In Sect. 3, we present the results of our correlation and random forest analyses. In Sect. 4, we provide a discussion of our results and conclude.
2 Methods and data

Study area
Spain is located in a geographical area with a high recurrence of drought events due to it being in a transition zone between polar and subtropical atmospheric circulation influences (Sivakumar et al., 2011). Its precipitation and runoff patterns are highly diverse and complex, which is characteristic of the Mediterranean area. As described by Vide (1994), the characteristic climate of Spain has the following: modest rainfall overall, high interannual variability, with a high amount of rainfall occurring during relatively few days, long dry periods, an arid climate, meaning the amount of potential evapotranspiration is greater than rainfall, large regional variations in seasonal rainfall patterns, and anomalies that may be related to atmospheric teleconnection patterns.
Because of the chaotic patterns of precipitation, Spaniards have been attempting to increase water availability for at least the last 2000 years (del Moral and Saurí, 1999). Water scarcity and frequent droughts are recurrent problems that Spain suffers, and this is mainly because the spatial and temporal distribution of its water resources is irregular. This also occurs because water demands are highest in the more water scarce areas and during the seasons when precipitation is the lowest and evapotranspiration the highest. Climate change will most likely exacerbate the existing problems with water resources (Estrela and Vargas, 2012;Ministerio de Medio Ambiente, 2005). Droughts cause extensive impacts in Spain; for instance, during a very intense drought event in 1991-1995, the water supply was restricted for more than 25 % of the total population (12 million people). In the most affected regions, evacuation plans were enacted. Agricultural production was also severely affected (del Moral and Hernandez-Mora, 2015). The major drought event in 2004-2005 led to social unrest and created disputes over future water infrastructure (Iglesias et al., 2009). Drought periods can impose significant costs on farmers and affect crop productivity (e.g. Iglesias et al., 2003;Austin et al., 1998;Páscoa et al., 2017;Peña-Gallardo et al., 2019). Moreover, vegetation activity has been shown to be linked to the interannual variability in drought , and drought has been related to burned areas from wildfires . Drought events have also shown links with daily mortality across Spain (Salvador et al., 2020). Overall, the economic damages of drought are severe. According to the international Emergency Events Database (EM-DAT; Guha-Sapir et al., 2016), Spain ranked fourth worldwide and first in Europe for total economic damages resulting from drought events from 1990 to 2018 (USD 7.7 billion).
We chose to perform this study in the specific region of Spain because this region is severely impacted by droughts; hence, a better understanding of indicator-impact links here is urgently needed. Such links have been already been successfully investigated in five other European regions (Bachmair et al., 2015(Bachmair et al., , 2016bStagge et al., 2015). We focus on characterising these links in six sub-regions within Spain because studying a small region means that indicator-impact links can be better detected and quantified .
We also chose this region to investigate indicator-impact links in a region that has a very dense network of reservoirs and to investigate the timescales at which drought conditions lead to drought impacts. As described by González-Hidalgo et al. (2018), Spain has a high density of hydraulic reservoirs. Spain has, after China, the second largest number of dams in the world. This is because Spain's climate is characterised by dry summers and high interannual variability. Such a dense network increases Spain's resilience to short-term droughts by guaranteeing water supply during these events. However, at longer timescales, drought conditions still produce severe drought impacts to water supply; conditions that last more than 2 or 3 years have been shown to reach the limits of the capacity of these infrastructures (González-Hidalgo et al., 2018).

Drought indicators and data sets
As drought indicators, we considered the SPI, the SPEI, and a streamflow index because these are commonly used in DEWSs (Bachmair et al., 2016a). The SPI and especially SPEI have also shown higher correlations than other drought indices with crop yields in Spain . Furthermore, we included an additional hydrological indicator (using reservoir storage data) and an agricultural drought index (using soil water content data) to compare their performance against the other types. Soil moisture has been found to be an important factor when studying drought impacts on the productivity of some agricultural crops in Spain (Sainz de la Maza and Del Jesús, 2020). Furthermore, the multiscalar nature of all these drought indicators is useful when assessing the timescales of drought impacts. We also explored the use of several teleconnection patterns (Feldstein and Franzke, 2017) as predictors of drought impacts.
The SPI is a commonly used drought index that is simple to compute. It can be used to compare droughts in different regions, and it can be temporally aggregated over different timescales (Guttman, 1999). It calculates "the precipitation deviation for a normally distributed probability density with a mean of zero and standard deviation of unity" (Mc-Kee et al., 1993). It is computed by fitting precipitation data to a distribution and then transforming it to a normal distribution (McKee et al., 1993). In this study, we calculated the SPI using the SPEI R package Beguería et al., 2014). We used a gamma probability distribution to model the observed precipitation values.
The SPEI is a similar index to the SPI, but instead of being computed with precipitation values only, it is based on climatic water balance. The climatic water balance is a monthly difference between precipitation and potential evapotranspiration (PET) at different timescales. This provides a measure of the accumulated water surplus or deficit. We used the approach of Vicente-Serrano et al. (2014b) to calculate the PET; this is a simple approach that only requires data for monthly mean temperature and uses the Thornthwaite equation (Thornthwaite, 1948). To obtain the final index, the same procedure as for the SPI was followed; however, a log-logistic probability distribution was used to model the precipitation-PET values. We also calculated the SPEI with the SPEI R package. This index accounts for the effects of temperature variability on drought. The advantages of this index, especially under global warming conditions, are that it identifies increased drought severity when the water demand is higher as a result of increased evapotranspiration. In addition, its multi-scalar nature allows its use for drought analysis and monitoring .
Volumetric soil water is the volume of water (m 3 ) in a soil layer (m 3 ). We used this variable to create an agricultural drought indicator, a Standardised Soil Water Content Index (SSWI), using the Standardised Drought Analysis Toolbox Farahmand and AghaK-ouchak, 2015a). This toolbox provides a generalised framework for deriving nonparametric univariate indices that can be interpreted similarly to the rest of the indices used in this study. This index was calculated at four different soil depths (defined by the data set used), hereafter referred to as SSWI1, SSWI2, SSWI3, and SSWI4. To create a Standardised Streamflow Index (SSFI) and a Standardised Reservoir Storage Index (SRSI), we standardised streamflow and reservoir values using the same methodology as for SPI. We also computed a Standardised Temperature Index (STI) using the same methodology, with surface temperature data only.
All of the indices were aggregated over different timescales. The aggregation of the SPI, SPEI, SSFI, and SRSI was done prior to fitting them to a distribution and transforming to the normal distribution. This means that the data for the current month and past X months was used to compute the value for a given month. The aggregation period is hereafter labelled with the suffix "-X" (e.g. SPI-X). Similarly, for the SSWI, the aggregation was done prior to computing the empirical distribution.
We chose teleconnection patterns that could potentially be relevant as drought predictors. The chosen climate indices have shown correlations with precipitation in Spain (e.g. Rodó et al., 1997;Martinez-Artigas et al., 2021;Ríos-Cornejo et al., 2015), and the NAO, EA, AO, and WeMO have also shown links with the drought index, SPEI (Manzano et al., 2019). The effects of the NAO on droughts using the SPI and/or SPEI at the European scale are also well studied Kingston et al., 2015). Furthermore, empirical links between drought impacts and ENSO and NAO in Spain have already been established. For instance, Gimeno et al. (2002) looked at the influence of ENSO and the NAO on the most important Spanish crops. They detected significant effects on the yield for most of these crops. They found low yields during La Niña years and higher yields during positive NAO phases. All of the climate indices were aggregated by computing a moving average over X months.
We used the Iberia01 daily precipitation and temperature observational gridded data set to calculate the SPI and SPEI (http://hdl.handle.net/10261/183071, last access: 9 March 2020; Gutiérrez et al., 2019;. This is a high-resolution data set produced using a dense network of stations over the Iberian Peninsula, with 3481 and 276 stations for precipitation and temperature, respectively. Gridded values are provided at a spatial resolution of 0.1 • , and they cover the entire time period studied here. This data set has been shown to produce more realistic patterns in the case of precipitation than other frequently used data sets. We used data from individual streamflow and reservoirlevel monitoring stations from Ministerio para la Transición Ecológica y el Reto Demográfico (2022) to calculate the SSFI and SRSI. There were a total of 1447 and 367 streamflow and reservoir storage monitoring stations, respectively; however, after removing stations with more than 20 % miss-ing data, data from 786 and 322 stations remained. We obtained volumetric soil water content data from the ERA5-Land data set (Muñoz Sabater, 2019). This is a reanalysis data set that provides estimates for land variables. It has a horizontal spatial resolution of 0.1 • × 0.1 • and has a vertical resolution that consists of four levels of the surface at 0-7 cm (layer 1), 7-28 cm (layer 2), 28-100 cm (layer 3), and 100-289 cm (layer 4). We obtained data for the NAO, EA, AO, and ENSO from the NOAA Climate Prediction Center (https://psl.noaa.gov/data/climateindices/list/, last access: 2 August 2021 and https://www.cpc.ncep.noaa.gov/ data/teledoc/ea.shtml, last access: 2 August 2021) and data for the WeMO (Martin-Vide and Lopez-Bustins, 2006) from http://www.ub.edu/gc/wemo/ (last access: 2 August 2021). We used data from these data sets for the period 1975-2013, except for volumetric soil water content data, where we used data from 1981-2013, since data were only available starting from 1981. We also aggregated all the indices over a range of timescales. These ranges differed depending on up to which timescales the indicator-impact correlation strengths were greatest. These were 1-33 months for the SPI, SPEI, and SRSI and 1-48 months for the SSWI, STI, and teleconnection patterns.
The nomenclature of territorial units for statistics (Nomenclature des Unités territoriales statistiques -NUTS) classification divides economic territories of the European Union (Eurostat, 2020). NUTS-1 regions represent major socioeconomic regions, and these were the sub-regions we considered in this study. These were the northwest (NW), northeast (NE), community of Madrid (MA), centre (CE), east (E), and south (S). The Canary Islands were excluded due to a lack of impact data in this region. We aggregated all of the indicators studied over each NUTS-1 region and produced a mean monthly time series for each sub-region using the R package panas (De Felice, 2020).

Drought vulnerability and data sets
To understand how a region is impacted by drought, drought risk needs to be considered as a function of the hazard, vulnerability, and exposure to drought events. "Vulnerability refers to the propensity of exposed elements such as human beings, their livelihoods, and assets to suffer adverse effects when impacted by hazard events" and "exposure refers to the inventory of elements in an area in which hazard events may occur" (Cardona et al., 2012). Exposure varies spatially, and vulnerability depends on social and economic factors of a region, which can greatly change over time (Wilhite, 2000).
To explain the drought impacts beyond the hazard, we also investigated whether adding vulnerability factors as drought impact predictors would increase the predictability of drought impact models. We used data for public water supply, unemployment rate, population density, gross domestic product (GDP) per capita, and gross value added (GVA) by industry (except construction), by agriculture, forestry, and fishing, and by the Statistical Classification of Economic Activities in the European Community (NACE) activities. We also used land cover data from the CORINE Land Cover data set and calculated the percentage of each land cover class per NUTS-1 region. A monthly time series was created for each vulnerability factor from yearly or 6year data by linearly interpolating the data points. We obtained the data from the Instituto Nacional de Estadística (https://www.ine.es, last access: 6 September 2021), Eurostat (https://ec.europa.eu/eurostat/, last access: 10 November 2021), and the CORINE Land Cover data set (https:// land.copernicus.eu/pan-european/corine-land-cover, last access: 12 November 2021). Most of these factors have been reviewed and tested as drought vulnerability factors by Blauhut et al. (2016).

Drought impact data
We retrieved drought impact information from the EDII. This database had 388 impact report entries for Spain, which covered the time period from August 1975 to May 2013. Each reported impact has three spatial references which correspond to the three levels of the NUTS regions. We aggregated the impact information by NUTS-1 region and did not differentiate impact types from one another; hence, we treated all impacts as equal and of a general type. The impact categories considered in the EDII were as follows: -Agriculture and livestock farming  In order to evaluate links between indicators and impacts, we mainly followed the methodology by Bachmair et al. (2015Bachmair et al. ( , 2016b, who assessed links between hydrometeorological indicators and impacts for Germany and the UK. We first explored correlations between indicators and impacts, and we then used a random forest model to evaluate the predictive potential and predictor importance of the different indicators. Before conducting the analysis, we first converted impact reports into a monthly time series of the number of drought impact occurrences for each sub-region. To do this, we imposed criteria to convert a single drought impact report (an entry from the EDII) into a drought impact occurrence, which will be referred to as DIO. We converted impact reports into a monthly time series by creating a DIO for every month in between the start and end date. However, a large proportion of the reports were incomplete. The data had the following five main problems: 1. The specific sub-region affected was not indicated.
2. The start and the end year were indicated, but there was no indication of the start and end month.
3. The start year was indicated, but there was no indication of the end year or start and end month.
4. The start month was indicated, but there was no indication of the end month and year.
5. The start month and the end year were indicated, but there was no indication of the end month.
In total, 24 % of the reports had problem 1, 33 % had problem 2, 37 % had problem 3, 9 % had problem 4, and 3 % had problem 5. Therefore, to overcome and estimate the uncertainty in our analysis as a result of the incompleteness of the data, we developed different counting methods. This meant that we tested the effects of including or excluding reports with these problems. The counting methods (CM) are described in Table 1. We visually examined impact reports, specifically their durations and descriptions, to make sure that their quantification to impact occurrences was sensible regarding the nature of each impact. For instance, we made sure that impacts that usually do not last more than 1 month by nature were reported in such a way. Some of these impact types were wildfires, air quality, human health, and public safety. Quantifying more long-lasting impacts, such as impacts on agriculture and livestock and freshwater and terrestrial ecosystems, was more challenging, since determining their exact start and end dates is not possible. However, we still used the start and end dates of these reports, since we believe that these represent the period during which sectors were most affected by drought impacts. For example, for a report that stated that "livestock farming economic losses are estimated to be EUR 39.372 million for the period 1 November 2004-30 April 2005" , we assumed that the occurrence of these economic losses corresponds to the period during which the sector was most affected by drought impacts. Figure 1 shows the distribution of impacts types for each NUTS-1 region and for the whole of Spain using the most and least censoring counting methods. It shows that most of the impacts recorded in the EDII for Spain were on agriculture and livestock farming, public water supply, and freshwater ecosystems. Also, depending on the censoring criteria, the distribution of impact types varied slightly. For instance, the most censoring counting methods showed a larger proportion of impacts on terrestrial ecosystems in the S region and for the whole of Spain. The least censoring methods also had a larger variety of impact types.
Aggregating all impact types to a single category is not an optimal choice, since focusing on sector-specific drought impact occurrences separately would produce more informative results. However, each sub-region contains three or fewer types of impact categories, except the S region which contains eight (Fig. 1, CM1). This meant that there were not enough data points to investigate categories separately. Figure 2 shows the time series of total DIOs for Spain and also shows identified precipitation deficit episodes in which Spain suffered major impacts due to severe drought and water scarcity events (Hervás-Gámez and Delgado-Ramos, 2019; Ministerio de Agricultura, Pesca y Alimentación). Historical drought periods identified by Sainz de la Maza and Del Jesús (2020), determined from economic impacts of past droughts, are also shown. The latter authors identified these by using data from the EM-DAT and from another study (Ollero Lara et al., 2018) that used insurance data by the Entidad Estatal de Seguros Agrarios (National Agricultural Insurance Agency). Figure 2 shows that most of the DIOs occurred during the identified historical drought periods by the authors mentioned. A slight disagreement occurs in 2008 until late 2009, where DIOs continue to occur even though the reported drought episodes end in 2007. Moreover, Fig. 3 shows the time series of DIOs for all sub-regions (see Fig. S1 in the Supplement for DIOs when using the two least censoring counting methods). We observe that, during each drought episode, DIOs do not always occur in all sub-regions, and that the amounts and patterns of DIOs do change between counting methods, depending on the harshness of the censoring criteria.

Correlation analysis
For each NUTS-1 region, we selected a subset of years from August 1975 to December 2013 for the analysis. This selection excluded years where no impact occurrences were reported. We then included each month of each selected year in a censored time series. We did this to exclude years in which regions may have experienced drought impacts but which may not have been recorded in the EDII. The lengths of the censored time series are shown in Table 2. To determine the relationship between drought indicators and drought impacts,  we first conducted a cross-correlation analysis. We calculated the Spearman rank correlation coefficients (Spearman, 1961) and the significance levels for the time series of different indicator versus the time series of DIOs for each NUTS-1 region. For each indicator, we spatially aggregated the indicators over each NUTS-1 region using their mean.

Random forest analysis
A random forest (Breiman, 2001) is a machine learning approach that uses ensemble trees. This approach has already been used to link drought indicators to impacts (Bachmair et al., 2016b(Bachmair et al., , 2017 and even to forecast drought impacts (Sutanto et al., 2019). A random forest is a tree-based ensemble; each tree depends on a random sample of predictor variables and a random response variable. The model then  creates a prediction function to predict the response variable by constructing ensembles of trees. The predictions over all trees are combined by being averaged (in regression models) or by selecting the class that is more frequently predicted (in classification models) (Cutler et al., 2012). Random forest models are appealing because the predictor and response variables can either be continuous or categorical variables. It is a fast model to run computationally, and very few tuning parameters are required. Apart from training and making predictions, they can also provide variable importance measures. Random forest models also require minimal human supervision (Cutler et al., 2012).
In this study, we used random forest models to model drought impacts, using the R package randomForest (Liaw and Wiener, 2002). We trained two models for each NUTS-1 region, i.e. a regression and a classification model. For the regression models, we used a time series of normalised DIOs as the response variable and the monthly time series of drought indicators as predictors. The number of DIOs was normalised by dividing them by the total number of DIOs for each region. This allowed for a fair comparison of the model errors between regions and counting methods. For the classification models, we used a binary time series of impacts, which was constructed by categorising the response variable by setting DIO = 0 to no impact and DIO > 0 to impact. Including this binary signal time series also considered the fact that the reporting of drought impacts might have changed over the years due to improvements in reporting and data collection. For instance, it excluded a potential bias in DIOs (upward trend observable in Fig. 2).
Unlike the correlation analysis, the DIO time series were not censored in this analysis. As predictors, we included all the indicators mentioned earlier aggregated over the different timescales. We identified the best predictors for each region using the variable importance feature. The algorithm estimates the variable importance by examining by how much the prediction error increases when that variable is excluded from the model (Liaw and Wiener, 2002).
The main input parameters in a random forest model are the number of trees, ntree, and the number of variables randomly sampled at each split, mtry. As Breiman (2001) mentions, for a large number of trees, and as the number of trees increases, the generalisation error converges to its limiting value. We set ntree equal to 1000. In order to select the optimal mtry parameter for each model, we used the R package caret (Kuhn, 2008) for tuning the models. We also used this package to perform a cross-validation analysis.
To assess the predictive potential of the random forest models, we first conducted a 10-fold cross-validation analysis using all of the available data (June 1983 to December 2013) repeated five times. We used the root mean squared error (RMSE) and R 2 performance metrics to evaluate the re- gression models. To assess the classification models, we used receiver operating characteristic (ROC) curves and area under the ROC curve (AUC). The AUC assesses the quality of a forecast of binary outcomes. These measures consider the proportion of impact occurrences correctly predicted and the proportion of (wrongly) predicted occurrences when there were none (Mason and Graham, 2002). Because we found large imbalances in the event classes (a large number of no impact events with respect to impact events), we also used precision, recall, and F score metrics to assess the models. These metrics better assess model performance in this case (Davis and Goadrich, 2006). Precision is the number of impact occurrences correctly predicted as a proportion of the total impact occurrence predictions made, and recall is the proportion of impact occurrences correctly predicted (Davis and Goadrich, 2006). The F score is a combination of precision and recall as their harmonic mean (Hripcsak and Rothschild, 2005).
In order to further evaluate the performance of the models, we randomly partitioned the data into a training and testing part, with a 75:25 split. We tuned and trained the models (using a 10-fold repeated cross-validation) to then predict the testing set. In this analysis, we also compared the two types of models, i.e. classification and regression. We did this by converting the outputs of the regression model into binary classes. The threshold to classify the outputs was set to 0.5, 1, 1.25, and 1.5; outputs below each threshold were classified as no impact, and outputs above each threshold as impact. These thresholds were then normalised by dividing by the total number of impact occurrences in each region. Overall, the results from this analysis show that drought impact occurrences are negatively correlated to the drought indicators studied, and when not, the correlation values were usually weak or not significant. Drought indices are negative during a period of drought; this means that when the drought indicator severity increases, then the impact occurrences tend to increase and vice versa. The NE and CE regions show the lowest correlations for all indicators; this indicates that these regions show the weakest links with impact occurrences. In this analysis, neither the total number of impact occurrences nor the length of the censored time series (see Table 2) seemed to be related to correlation strength. For instance, these two regions were the regions with the fourthand second-highest number of impact occurrences, respectively, but they showed the weakest links. As mentioned earlier, the drought risk is also affected by a region's exposure to drought. This could explain why we find the weakest hazardimpact correlations in two of the least populated regions (NE and CE), since the impacts are reported by a region only when it has been exposed to the hazard and has been vulnerable to it.

Correlation analysis
The SPI and SPEI showed a similar performance to one another, with the exception of the S region, where the SPI showed a larger number of significant and strong correlations. Aggregations over timescales of 18-21 months showed the highest correlations for both indicators. Moreover, the agricultural indicator, SSWI3 showed the greatest number of significant and strong correlations out of the remaining soil layers. SSWI4 outperformed SPI and SPEI in the S region at a timescale of 18 months; however, its overall performance was lower, especially in the CE and NE regions. The hydrological indicator, SSFI, showed strong and significant correlations in most regions but underperformed in the CE region, when compared to the SPI and SPEI. It also showed very similar patterns to the SSWI but with a slightly better performance in the NE region. SRSI showed strong significant correlations in the MA and S regions and had slightly lower strength in the NE and E regions.
When comparing the correlation patterns across different counting methods, we found that, overall, the two most censoring counting methods had the highest average correlation coefficients and smallest average p values over all regions, indicators, and aggregation timescales. Correlation patterns remained very similar for the two most censoring counting methods, except for the SRSI, in that one method showed significant correlations in one region (NW) and did not show this when using the other method. The two least censoring methods showed similar patterns to the two other methods in four sub-regions (MA, CE, E, and S); however, most significant correlations disappeared in the remaining two regions (NW and NE; see Fig. S2). We found that, generally, the less censoring the method, the lower the correlation strengths. These results indicate that counting methods do, to some extent, affect correlation patterns and strengths between indicators and impact occurrences. The results obtained here indicate that it is important to investigate different counting methods when working with incomplete impact data. In the following, we will use the two most censoring counting methods to determine the links between indicators and impacts, since the results were most consistent when using these two counting methods. The predictors that showed the highest correlation strengths using these counting methods are displayed in Table 3. A comparison of the correlations of CM1 and CM2 in Table 3 shows that correlations are usually very close in value. The two counting methods also show similar results, since the same indicators are usually present in both columns, especially for the drought indices.
Overall, the teleconnection patterns and STI (Fig. 5) showed strong and significant correlations with impact occurrences for many sub-regions and timescale aggregations when using the two most censoring counting methods. However, the main difference when compared to the correlation patterns with the drought indicators in Figs. 4 and S2, is that the correlation directions varied across the different subregions and counting methods here. This behaviour can be due to there being a negative linear relationship between the two variables or due to a lag between indices and impact occurrences. For example, although STI correlated negatively with impacts in the MA region, when we compared both time series (not shown) we saw that impact occurrences appeared during an abnormally hot period but appeared at a time when there was a short period of decreasing temperatures. This suggests that there is lag between elevated temperatures and impact occurrences in this region. Excluding the MA and S regions, the STI and impact occurrences showed positive correlations. This suggests that there is a positive linear relationship between temperature anomalies and drought impact occurrences.
The correlation patterns using the two most censoring counting methods (Fig. 5) showed that EA at 24-39 months has the strongest correlations in the NW region, with AO at 39-48 months and STI at 6-12 months in the NE region, EA at 27-39 months, AO at 9 months and STI at 27-30 months in the MA region, WeMO at 12-24 months and AO at 30-39 months in the CE region, WeMO at 27-36 months in the E region, and ENSO at 30-36 months, STI at 27-30 months, and NAO at 48 months in the S region. These results are also displayed in Table 3. Out of all the teleconnection patterns, the NAO and AO showed very similar correlation patterns to each other, which is to be expected since both are dynamically related (Feldstein and Franzke, 2006).
The third most censoring counting method showed similar patterns to the two most censoring methods, except for  two sub-regions (NW and NE) where the most significant correlation patterns disappeared or changed direction. The least censoring method overall still showed significant correlations but with decreased strength. Therefore, we again find that counting methods sometimes affected correlation patterns. We must also take into account that a correlation analysis only assesses links between two variables; however, the pathways by which these patterns affect drought conditions and the propagation to impacts are usually complex. Possible interactions between different teleconnection patterns or other atmospheric phenomena were not modelled in this analysis due to its bivariate nature.

Cross-validation analysis
The performance of the regression random forest models is shown in Fig. 6. RMSE values ranged from 0.0008 to 0.007 across all counting methods. Since impact occurrences were normalised in this analysis, these values should be interpreted as a fraction of the total number of the impact occurrences for each model. Overall, R 2 values ranged from 0.68 to 0.97; this meant that the models explained the variance observed relatively well. However, when interpreting these results, one must note that a metric such as RMSE is expected to be small. This is because (as explained further in this section) the models tended to underpredict the number of DIOs.
We need to consider that the time series of occurrences does not contain a lot of DIOs compared to the (larger) number of time steps with zero DIOs. The RMSE is expected to be relatively small, especially because the model is good at not predicting DIOs when there are none (high specificity) and because there were more actual DIOs than those predicted. In Fig. 7 we assess the performance of the classification of random forests. The classification model outputs predictions as either probabilities for each class or directly outputs the class. For this reason, the performance measures are different than in Fig. 6. We see that, generally, the precision of all models was higher than the recall. This means that the models predicted fewer than actual impact occurrences (low/moderate recall), but the predictions of impact occurrences were usually correct (high precision). In other words, the model had a low false positive rate and a slightly higher false negative rate. Moreover, recall generally appeared to be highest for the models with more balanced data sets; this was not the case for precision. AUC values were very high for all models (> 0.95). However, this seems to be because specificity values were very high when compared to sensitivity values, and AUC is based on both of these measures. This means that there were very few incorrect predictions of DIOs, and there were more DIOs than those predicted.
Precision values were very high for all the models; when the models predicted an impact occurrence, the predictions were correct for 97 %-100 % of the time. Recall values var- Figure 5. Correlation coefficients (ρ) between time series of climate indices and impact occurrences for each sub-region using one of the most (CM1) and the third most (CM3) censoring counting methods. Asterisks indicate the significance (p < 0.05).
ied more than precision between regions, and the standard errors were higher. However, all models predicted at least 62 % and up to 98 % of the impact occurrences. Since the F score metric combines both precision and recall, we used this measure to conclude that the best-performing counting method was the one with the least censoring criteria. However, the rest of the methods had very similar average F scores, and their censoring level did not notably affect their F scores. Furthermore, the results from the two most censoring counting methods showed that the three regions with the largest number of impact occurrences showed higher recall values than the rest of the regions. The precision across the different sub-regions did not show much variation. These regions also showed the best performance overall when using the regression models (Fig. 6). This suggests that, in order to have models with better skill and especially better recall, we need data sets with a greater number of impact occurrences, which means longer impact time series. When interpreting the model performance metrics in Figs. 6, 7, and 8, one must be aware that a new model was run each time (for each plot), which was tuned using the corresponding performance metric (see Sect. S1 for further details).
When comparing counting methods, we also found that a small sample size (low numbers of impact occurrences) limits the model's performance. We found that, generally, the less strict the censoring criteria, the better the model performance; this is displayed in Figs. 6 and 7. When using the classification models, the balance between the two class types affected model performance; the more balanced the class type is, the better the performance will be. However, although the least censoring method shows the best performance, this method could be excluding specific indicatorimpact links at the NUTS-1 level, since this method counts impacts that affected the whole country as impacts that occurred independently in all sub-regions. This shows that the Figure 6. Performance metrics (RMSE and R 2 ) of the regression random forest model when performing a repeated 10-fold cross-validation using all the data. All counting methods are shown. The error bars show the standard deviation of these metrics. Figure 7. Performance metrics (recall, precision, F score, and AUC) of the classification random forest model when performing a repeated 10-fold cross-validation using all the data. All counting methods are shown. The error bars show the standard deviation of these metrics. choice of a counting method is important when modelling impacts and can significantly affect model performance. Figure 8 shows the performance of the regression and classification random forest models after being trained and tested on 75 % and 25 % of the data, respectively. To do this, the regression model's output was converted to the categorical classes of impact or no impact. We tested different thresholds to convert the model outputs and found that, when the threshold was lowered, the recall increased, and the precision decreased. The opposite happened when the threshold was increased. We tested the following thresholds: 0.5, 1, 1.25, and 1.5. Model performance in this analysis was found to be slightly worse than in the cross-validation analysis. It is important to note that model performance for a region can vary, depending on how the data are partitioned for testing and training the model. Hence, model performance does not only depend on the strengths between the predictor and the response variable but also on the particular splitting. For instance, if an important event or pattern remains in the testing set, it will not be used in the training of the model; hence, it will decrease the model's predictive ability.

Comparison of regression and classification models using a train-test analysis
Although classification models appear to be better at predicting impact events in this analysis, these models did not contain any information on impact severity, whereas regression models did. Since both the model types performed well, we suggest further evaluation of these to predict impacts, for example, by using a classification model with more classes (to represent different impact severities) or using a different threshold to divide the two classes in this study. Also, similar to the previous cross-validation analysis (Sect. 3.1), here we see more clearly that, overall, regions with a higher number of total impact occurrences performed best here; these are the S, CE, and E regions. Figure 9 shows the predictor importance using the two most censoring counting methods and regression random forest models. The overall patterns of predictor importance did not change significantly when we compared them to the classification models (not shown). This, and the fact that both regression and classification models showed similar performance (Sect. 3.2.2), also confirms that the discussed potential bias in reporting culture over time does not seem to affect our random forest results. The top predictors for each region are summarised in Fig. 10.

Predictor importance
Overall, the meteorological indicator, SPEI, was a top predictor in the CE and MA, at timescales between 24-33 and 15-18 months, respectively. In the E and in MA, agricultural indices were top predictors at timescales of 12-21 and 18 months, respectively. The hydrological indicator, SSFI, was a top predictor in the NW at a timescale of 21 months. The other hydrological indicator, SRSI, was a top predictor in the S region, at timescales of 1-6 months. Out of all the teleconnection patterns, AO, NAO, and WeMO were the top predictors, with the AO in the NW and NE at timescales between 15-21 months, the NAO in the S region between 36-48 months, and the WeMO in MA aggregated at 12 months.
We found that both analyses (correlation strength and random forest variable importance) tended to show similar predictor importance results in most sub-regions. When assessing the best drought indicators using both methods, the MA, CE, E, and S regions showed similar results. For the climate indices, the NE, followed by the S, showed the most agreement. These results indicate that, for these top predictors, the indicator-impact relationship is linear.
The results presented here are based on the most censoring counting methods (CM1 and CM2), since these showed the highest correlation strengths in the correlation analysis. Even though these two methods showed a lower predictive skill in the random forest models, we attribute this to the reduced number of impact occurrences. If we compare these results to the ones using CM3, the results remain mostly the same in half of the regions. We excluded CM4 from this analysis.

Drought vulnerability analysis
Including drought vulnerability factors when modelling drought impacts has been shown to increase model performance ; however, because most of the vulnerability factors studied here (e.g. GDP per capita, public water supply, unemployment, GVA by industry except construction, agriculture, forestry, and fishing, and by all NACE activities) were only available starting from the years 1999 or 2000, models built with these factors were not as robust as the rest of our models. Therefore, we cannot assume that the results found would reproduce themselves for the rest of the study period.
However, from an exploratory analysis, we find that vulnerability factors, in particular the land cover types of forest and seminatural areas and agricultural areas, and factors, such as unemployment rate and GVA by industry (except construction), do increase the accuracy of the models when they are included in especially two regions (CE and E). Also, the exclusion of the drought indices does not substantially decrease the model performance in either the regression or classification random forest models. Therefore, we conclude that including drought vulnerability factors, in some cases, does seem to improve the accuracy of some models. This is shown in the variable importance results (Fig. 11). Regions that had very few DIOs (NW and S) were not considered in this analysis. Since this analysis is limited by the availability of vulnerability data (the period studied here missed the first two drought events), we cannot assume that these conclusions hold true for our main results, especially considering that these models are not as robust. Figure 8. Performance metrics (recall, precision, and F score) of the random forest models when training and validating the models on 75 % and 25 % of the data. All counting methods are shown. The regression model test predictions were converted into binary outcomes using a threshold of 1.25 DIOs.

Discussion and conclusion
In this study, we systematically investigated the link between drought impacts in Spain and drought indicators and teleconnection patterns. We also investigated the potential for vulnerability factors to be used as impact drivers. This means that we used a hybrid data approach, as defined by Blauhut (2020), to investigate drought risk as a function of hazard indicators, exposure and vulnerability factors, and impact information using a statistical model. We found significant links between the drought indicators and climate indices, and drought impact reports from the EDII database. We assessed these links by, first, using a correlation analysis and, second, by modelling drought impacts using drought indicators and teleconnection patterns as predictors in a random forest model. While random forest models seemed to be limited by the amount of impact occurrence data, they were skilful in predicting drought impact occurrences. Furthermore, we have shown that using drought impact reports from the EDII with a random forest model for Spain, a region with a reduced number of impact report entries, already provides good predictability of impacts for several sub-regions, making us confident in the robustness of our results. Drought impact information from this database has already been successfully linked to drought hazards and shown to have potential for impact forecasting (Blauhut et al., 2015Stagge et al., 2015;Sutanto et al., 2019;Bachmair et al., 2015Bachmair et al., , 2016b; it also proves to be useful here when assessing links between different drought indicators and impacts. We found strong and significant correlations between drought indicators and reported impacts in most regions. We also found spatial differences in indicator-impact correlations. Out of all the indices, the SPI (followed by SPEI) showed the strongest correlations overall and significant correlations in all regions. Therefore, we recommend the use of these indicators if only one indicator is to be used for predictive purposes. These meteorological indices have already been linked to drought impacts in Spain or the Mediterranean region in several studies. For example, the SPI has been used to successfully forecast above-normal summer wildfire activity (Gudmundsson et al., 2014). The SPEI has been used to detect drought impacts on vegetation activity , and the SPI has been correlated to tree ring widths to determine the impacts of drought on forest growth .
When comparing the most important predictors from both analyses (correlation strength and random forest variable importance), we found a general agreement for most subregions. In the random forest analysis, the top predictors for each region were the SSFI and AO in the NW, AO in the NE, SSWI1-2, SPEI and WeMO in MA, SPEI in the CE, SSWI2-4 in the E, and SRSI and NAO in the S region (see Fig. 10). Our results show strong links between impacts and drought indices and teleconnection patterns. Drought and its links to teleconnection patterns is studied well in the literature (see Figure 9. Predictor importance when using the regression random forest models and the two most censoring counting methods. The top 10 predictors for each counting method and sub-region are shown. Sect. 2.2); however, links between EDII impacts and teleconnection patterns have not been investigated before, and here we show that, in some regions, they are better predictors of drought impacts than commonly used drought indices.
By including the STI we also investigated links between temperature and impact occurrences. The correlation results showed mainly positive and significant correlations, which suggest a relationship between these two variables. However, the STI neither showed the strongest correlations nor greatest variable importance (in the random forest analysis) when compared to other drought indicators or teleconnection patterns, except for the NE region, where it showed higher correlation strengths than the rest of the indicators. Although we do not recommend the use of this index as a single drought predictor, we believe that its observed connection to drought impacts is important and might become more important as temperatures in Spain continue to increase, especially since there already is evidence of increasing trends in evapotranspiration in most meteorological stations in Spain due to decreased relative humidity and increased maximum temperature that has occurred since the 1960s (Vicente-Serrano et al., 2014a). Moreover, González-Hidalgo et al. (2018) pointed out that, since 1990, the role of atmospheric evaporative demand has been playing a large role in drought development. They state that drought is being driven by temperature conditions that affect atmospheric evaporative demand independently of precipitation evolution. In our study, the SPEI, which includes the effects that temperature has on evapotranspiration, showed higher correlations than the SPI in four out of six regions (NW, NE, CE, and E), which again suggests that including the effects of temperature when investigating drought and its impacts is important.
Adequate drought management requires knowledge of the time that different drought types take to propagate through Figure 10. Map with the top four predictors for each sub-region when using the regression random forest models and the two most censoring counting methods, in blue and black, respectively. The best type of predictors for each sub-region are in red. different water resource systems. Both of our analyses mostly agreed on the timescales at which different types of drought started to cause impacts. The timescales that showed the strongest links with impact occurrences depended on the sub-region and the method for its analysis. However, using both analyses, we found the strongest links overall to be at timescales between 15-33 months for the meteorological indices, between 6-33 months for the hydrological indicator SSFI, between 1-18 months for the hydrological indicator SRSI, and between 6-21 months for layers 1-3 of the agricultural index. For the deepest soil layer, the correlation analysis showed strongest correlations at shorter timescales, i.e. from 1-9 months. The timescales at which the meteorological indices showed the strongest links were usually longer than those found in Germany and similar to the UK (Bachmair et al., 2016b). In these regions, SPI and SPEI showed the best links with impact occurrences at accumulation periods of 12-24 months for the UK and at accumulation periods of 2-4 months for Germany. Stagge et al. (2015) found that Norway, Bulgaria, and Slovenia responded even more rapidly to meteorological drought than Germany and the UK, which shows that Spain has the longest impact response out of these countries.
Furthermore, our results show that systems that respond to precipitation anomalies at the shortest timescales take longer to propagate to impacts. For instance, we have shown that the meteorological indices correlated with impacts at long timescales. The agricultural index in the top three layers (1-3) showed correlations at long timescales. Viewed differently, the hydrological index showed strong correlations overall at the earliest timescales. In our analysis, this indicates that drought impacts respond to hydrological droughts faster than meteorological and shallow layer soil moisture droughts. This drought propagation chain has been suggested by Van Loon and Laaha (2015), who detected propagation signals from meteorological to hydrological droughts. Our results indicate that if we want to predict drought impacts at short timescales, we should use hydrological drought indices.
The agricultural index showed more significant and negative correlations in the two shallowest soil layers. A weaker link between lower layer soil moisture and drought impacts can be explained by the fact that the soil moisture content of these layers, which are usually below the root zone of most crops, has a slower and more aggregated behaviour. Aggregating these indices then creates a more averaged time series with fewer anomalies, which then leads to lower correlations. The agricultural indicator also revealed an anomalous pattern in the NE region; it showed positive and significant correlations at all soil depths. This may be explained by a lower exposure to drought due to lower population density, which also explains the weaker indicator-impact correlations shown by the meteorological and hydrological drought indicators. However, this pattern may also be (partly) explained by the fact that this region reported drought impact occurrences in July-October 2009 (see Fig. 3), a period in which there were no reports in other regions. A region can still be suffering from drought impacts during a period where precipitation levels are above the climatological mean due to long-lasting impacts of a previous drought (e.g. Boletín Oficial del Estado, 2009) and, hence, results in positive (or weak negative) correlations.
Spain's resilience to short-term droughts, due to its extensive network of hydraulic reservoirs, could explain why we found most indicator-impact links at long timescales (especially meteorological indicators and teleconnection patterns). We found that most of the links between meteorological indicators and teleconnection patterns, and impact occurrences were strongest at timescales between 1-3 years and 1-4 years, respectively, depending on the specific indicator and sub-region. As mentioned earlier, drought conditions that last more than 2 or 3 years have been shown to limit the capacity of Spain's hydraulic infrastructures (González-Hidalgo et al., 2018). In addition to reservoir systems, groundwater storage also provides resilience (water supply to satisfy demands) during periods of drought. Therefore, groundwater droughts may play a role and be an additional factor that contributes to these long accumulation periods. Especially since 15 %-20 % of all water used in Spain is provided by groundwater (Hernández-Mora et al., 2003).
It is important to note that considering that there are not many drought events within the study period and that impacts are often clustered due to the nature of drought events, the observed tendency for longer aggregation periods to show stronger links to impacts can also be attributed to the methods used. This could be due to (or partly due to) aggregated indices being more smoothed out and consequently extending the periods that are above and below the normal.
The most frequent types of reported impacts were agriculture and livestock farming and public water supply (Fig. 1). Both of these sectors depend on reservoir systems for storing Figure 11. Predictor importance when using regression random forest models built with vulnerability factors in addition to the drought indices. Counting method CM1 is used. Models are built with data from 2000-2012. water, since irrigation and public water supply are the two sectors that consume most of the stored water from reservoirs. Our results show that SRSI is the best predictor of impacts, outperforming all other indicators, in the S region. The correlation analysis also showed strong and significant correlations between SRSI and impacts for several regions. This suggests that drought impacts in Spain depend on reservoir resilience, and this could explain why it takes a long time for precipitation anomalies to propagate to impacts (and the response to less frequent but longer drought periods). Reservoir storage has been shown to respond to anomalies in SPI and SPEI at long timescales in some Spanish regions (Vicente-Serrano and López-Moreno, 2005;Lorenzo-Lacruz et al., 2010). This further demonstrates that, to understand drought impacts at local scales, we need to consider the effects of local reservoir systems in addition to studying other water resource systems.
The accuracy of our results is dependent on the accuracy of the impact data used which is, specifically, the method of quantification, the completeness of the data, and potential sources of error. Since many impact reports were incomplete and their quantification is subjective, we tested four different versions of counting methods and investigated whether they had an effect on the results. We mainly tested (1) whether to count an impact that affects the entire country equally as if it only occurred in one sub-region and (2) whether and how to count impacts that lacked information on the start or end date of the impact report. In the correlation analysis, different counting methods mainly produced differences in the strength of the correlations. The least censoring counting methods showed weaker correlations overall, and significant correlations disappeared in one-third of the regions. However, in the random forest analysis, the least censoring counting methods produced models with higher predictive skill than the more censoring counting methods. Regions with the most impact data also performed best. We infer that this is because the performance of a random forest model highly depends on the quantity of data used for its training. We therefore conclude that, when working with impact data, it is important to compare counting methods and to investigate their effect on the results to overcome potential biases due to subjectivity.
The vulnerability analysis revealed that some vulnerability factors may also be appropriate drought impact drivers. When we compare our vulnerability analysis results to the results from Blauhut et al. (2016), who investigated drought risk in Europe using vulnerability factors and drought hazard indices, we find some similarities. For instance, they found that, for the western Mediterranean region, some of the best performing vulnerability factors included the area of agriculture, seminatural areas, and wetlands. These results agree with our findings (see Fig. 11). They also did not find factors such as GDP per capita or public water supply to be good predictors. They also showed that, overall, vulnerability factors improved model accuracy.
The results of this study are limited by the availability of drought impact data because there was just a small number of impact occurrences recorded for each drought event (when compared to other countries) and because the data used only covered events until 2013. A later major drought event (2017-2018) was not included because those data are, so far, not publicly available and quality checked. Therefore, we encourage future studies to (1) focus on conducting sector-specific analyses of impacts (e.g. Blauhut et al., 2015Blauhut et al., , 2016Stagge et al., 2015;Bachmair et al., 2016b), (2) explore different types of impact data, for instance, agricultural and economic data (e.g. Sainz de la Maza and Del Jesús, 2020), and (3) model exposure and vulnerability (in addition to drought hazard) to understand how future drought risk will change (Blauhut, 2020).
Author contributions. HTS and CLEF contributed to the design of the study, the interpretation of the results, and the writing of the paper. HTS performed the analysis and prepared the paper.
Competing interests. The contact author has declared that neither they nor their co-author has any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.