Mediterranean Specific Climate Classification and Future Evolution Under RCP Scenarios

The Mediterranean is one of the regions most sensitive to anthropogenic and climatic changes, which mostly affect its water resources and related practices. With multiple studies raising serious concerns about climate shifts and aridity expansion in the region, this study aims to establish a new high-resolution classification for hydrological purposes based on Mediterranean-specific climate indices. This classification is useful for hydrological applications (water resources management, floods, droughts, etc.) and ecohydrological applications such as Mediterranean agriculture, for example olive cultivation, and other environmental practices. The proposed approach includes the use of classic climatic indices and the definition of new ones, mainly the precipitation seasonality index I_s and the evapotranspiration threshold S_PET, both in line with river flow regimes; a Principal Component Analysis to reduce the number of indices; a K-Means classification to distribute them into classes; and finally the construction of a decision tree based on the distances to class kernels, so that the classification can be reproduced without repeating the whole process. The classification was established and validated with WorldClim-2 gridded data at 1-km resolution for the 1970-2000 baseline period and with data from 144 stations covering 30 to 120 years, both at monthly time steps. Climatic classes followed a geographical distribution across the Mediterranean, ranging from the most seasonal and driest class in the south to the least seasonal and most humid class in the north, which reveals the climatic continuity from one place to another and makes change trends more visible. MED-CORDEX ALADIN historical and projected data at 12-km resolution, simulated under the RCP 4.5 and 8.5 scenarios for the 2070-2100 period, served to assess the climate change impact on this classification by superimposing the projected changes on the high-resolution baseline classification.
Both RCP scenarios showed a 7% to 9% increase in the average seasonality index I_s and a 3% to 20% increase in the average aridity index I_Arid for the least seasonal classes. These classes, located in the north, are slowly evolving towards moderate coastal classes, which might affect hydrological regimes through shorter humid seasons and earlier snowmelt. This kind of classification might be reproduced at the global scale, using the same or other climatic indices specific to each region and highlighting their physiographic characteristics and hydrological response.


Introduction
Mediterranean climate is the result of a complicated cyclonic system sweeping a large evaporative basin. The distribution of marine and continental air masses creates an alternation of low-pressure zones arriving from Iceland and the Persian Gulf and high-pressure zones from Siberia and the Azores. The seasonal shifts of these zones are magnified by the North Atlantic Oscillation.
-The climatic boundary could be defined according to Köppen's classification, where a set of regions share similar temperature and precipitation characteristics and are known for their warm, dry summers and cold, humid winters. It is limited by the African desert to the south and the temperate European countries to the north. This boundary might change according to how this similarity is defined. Some regions share a similar Mediterranean climate although they are located far outside the Ecumene, such as Chile, California or South Africa.

-The topographic boundary is defined by the set of catchments draining towards the Mediterranean Sea (Milano, 2013). This definition neglects some Mediterranean-climate regions such as Portugal and part of Spain, and favours geographically adjacent regions such as Egypt and Libya.
-The agricultural-bioclimatic boundary consists of the set of regions sharing the same types of vegetation considered as indicators of the Mediterranean region, such as olives (Moreno, 2014). This definition is linked to human activity, with the same nuances as the climatic limit.
-The administrative boundary of countries adjacent to the Mediterranean Sea has a problematic definition, independent of any natural basis (Wainwright and Thornes, 2004). These boundaries include several climatic classes and cover larger areas than the topographic limits.

Since the geographic extent of the study is too wide for the authors to delineate catchments themselves, the delimitation of catchments was imported from international references. The European Commission, through the Joint Research Centre (JRC), has done extensive and elaborate work on the delimitation of catchments in Europe and some adjacent countries as part of the "Catchment Characterization and Modelling" (CCM) project (De Jager and Vogt, 2010). For the Middle East and Northern Africa, catchments from HydroSHEDS, the World Wildlife Fund's project, were used (Lehner and Grill, 2013). According to these databases, the total number of catchments exceeding 1 km² and having an outlet to the Mediterranean Sea is 3681, covering a total area of 1,781,645 km². It should be noted that the Nile was omitted because it extends 3500 km to the south of the Mediterranean. The distribution of catchment areas is shown in Table 1, where mid-range catchments, between 100 and 3000 km², constitute 35% of the total number and cover 28% of the total area.

Climatic data
Three climatic datasets were used for this study. (1) WorldClim-2, a new climate surface dataset at 1-km spatial resolution, consists of long-term average monthly temperature, precipitation, solar radiation, vapor pressure and wind speed data, aggregated across a target temporal range of 1970-2000 using data from 9000 to 60000 weather stations (Fick and Hijmans, 2017). The WorldClim-2 database is a refined and expanded version of the 2005 WorldClim-1 database (Hijmans et al., 2005). It covers the whole study area, which made the climatic classification of Mediterranean catchments possible. Monthly precipitation and temperature were averaged for each catchment, and climatic indices were then calculated at both catchment and grid scales. Both classifications were compared for validation.
The climatic characteristics of Mediterranean catchments are summarised in Table 2 and illustrated in Figure 2. They reflect the wide variability of mean annual precipitation, ranging between 5 and 3000 mm, and of mean annual temperature, ranging between -14 and +26°C: some catchments receive 50 times more precipitation than others while being 4 times colder.
(2) Data from 144 ground weather stations covering the whole study area served to validate the Mediterranean climate classification, with 105 stations located within the catchments boundary and 39 outside. In addition, 102 of these stations are located within Köppen's (Csa) and (Csb) Mediterranean climates and 42 outside. These stations are recognized by the World Meteorological Organization (WMO), and their data are freely accessible on the portal of the US National Oceanic and Atmospheric Administration (NOAA). The length of the data series ranges between 30 and 120 years at a monthly time step.
(3) The MED-CORDEX climate projection, computed with the RCM ALADIN-Climate v5.2 on a 12-km spatial resolution grid, was used to analyse the climate change impacts on the climatic classification for the end-of-century projection period 2070-2100 under two Representative Concentration Pathway scenarios (RCP 4.5 and 8.5), in comparison with the historical 1970-2000 baseline period, which was also taken from the ALADIN historical run (Tramblay et al., 2013).

Methodology
Taxonomy aims to separate a population into several groups with similar characteristics. It was mainly developed by naturalists (Linnaeus, 1748). Thornthwaite, however, pointed out that climate classification does not follow the same approach, since one goes from one climate to another continuously, whereas the various species of fish, for example, are all different and in fact individualized (Thornthwaite, 1948). This continuity can be demonstrated using a fine intra-climate classification. To achieve this fine classification, it is essential to introduce measurable indices that ensure a continuous variable scale.
Automatic classification methods partition a set of objects, knowing their pairwise distances, so as to keep the classes as homogeneous as possible while remaining distinct from each other. Like any classification, the adopted method depends on the objective and on its author. There are several modes of climatic classification: (a) Genetic classifications related to meteorological causes and the origin of air masses (Bergeron, 1928; Barry and Chorley, 2009). (b) Bioclimatic classifications based on the interrelation between vegetation type and climate (Holdridge, 1947; Mather and Yoshioka, 1968; Harrison et al., 2010). (c) Agro-climatic methods based on the assessment of the rainfall-evapotranspiration balance for the estimation of agricultural productivity (Thornthwaite, 1948). (d) Climatic methods based on precipitation and temperature indices, such as the classification of Köppen (Köppen, 1936), updated by Peel (Peel et al., 2007), which remains the most used; this method divides the globe into thirty climate zones and is based on a hierarchical approach. The Mediterranean climate corresponds to a dry hot or dry warm summer, where either the precipitation in the driest summer month is below 40 mm or below a third of the precipitation in the wettest winter month (Cs), and the air temperature of the warmest month is above 22°C (Csa) or the number of months with air temperature above 10°C exceeds 4 (Csb).
The (Cs) climate does not prevail all over the Mediterranean region; some exceptions can be observed. A Desertic climate

Principal Component Analysis
Principal Component Analysis (PCA) is widely applied to reduce the dimensionality of datasets while keeping the most representative and uncorrelated variables. This section presents a brief description of the method along with some of its applications in hydrology. For an extensive mathematical description and demonstration of the method, we advise consulting Krzanowski's Principles of multivariate analysis: a user's perspective (Krzanowski, 1988) and Jolliffe's book Principal Component Analysis, which includes a wide range of applications (Jolliffe, 2002).
PCA was first introduced by Karl Pearson (Pearson, 1901) and then developed by Harold Hotelling (Hotelling, 1933). Hotelling's motivation was that there may be a smaller fundamental set of independent variables which determine the values of the original variables and conserve the maximum amount of their information (Jolliffe, 2002). This is achieved by transforming a vector of p random variables into a new set of variables, named Principal Components (PCs), by looking for a linear function of the elements having maximum variance, then looking for another linear function, uncorrelated with the first and having maximum variance, and so on up to p PCs. It is hoped in general that most of the variation will be accounted for by m PCs, where m < p.
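The transformation described above can be sketched with NumPy as an eigen-decomposition of the correlation matrix. This is only an illustrative sketch on synthetic data, not the SPSS procedure used in the study; the function name, the array layout and the three synthetic "indices" are assumptions made for the example.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA on standardized data (hypothetical helper, not the study's SPSS run).

    X: (n_samples, p) array with one column per climatic index.
    """
    # Standardize each index so the analysis works on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)       # p x p correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh because R is symmetric
    order = np.argsort(eigvals)[::-1]      # sort PCs by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                   # uncorrelated principal components
    explained = eigvals / eigvals.sum()    # share of total variance per PC
    return eigvals, eigvecs, scores, explained

# Synthetic example: two strongly related indices and one independent index.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(200, 1)),
               -base + 0.1 * rng.normal(size=(200, 1)),
               rng.normal(size=(200, 1))])
eigvals, eigvecs, scores, explained = pca_from_correlation(X)
print(np.round(explained, 2))
```

In this sketch the first component absorbs most of the variance shared by the two related columns, which is the situation in which keeping m < p components loses little information.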

K-Means clustering technique
Cluster analysis consists of partitioning data points into isolated groups while minimizing the distance between data points of the same cluster and maximizing it between different clusters. One of the most popular clustering methods is the K-Means method introduced by Edward Forgy (Forgy, 1965) and MacQueen (MacQueen, 1967). It aims to minimize a squared-error objective function for distance optimization. The optimization steps are (1) kernel initialization, the kernel being a virtual point representing the statistical centre of a class, (2) updating of classes, (3) re-evaluation of kernels and (4) repetition of steps (2) and (3) until stabilization. The quality of the solution thus found strongly depends on the initial kernels. In turn, kernel initialization is sensitive to the data dimensionality. The application of K-Means requires setting the number of classes in advance; otherwise the optimization leads to as many classes as individuals.
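The four steps listed above can be sketched as follows. This is a minimal NumPy illustration on synthetic points; the random choice of initial kernels among the data points, the function name and the test data are assumptions for the example and do not reproduce the SPSS settings used in the study.

```python
import numpy as np

def k_means(points, k, n_iter=100, seed=0):
    """Basic K-Means iteration (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # (1) Initialise kernels with k distinct data points (one possible choice).
    kernels = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # (2) Update classes: assign every point to its nearest kernel.
        dists = np.linalg.norm(points[:, None, :] - kernels[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # (3) Re-evaluate each kernel as the mean of its members
        #     (an empty class keeps its previous kernel).
        new_kernels = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else kernels[j]
            for j in range(k)
        ])
        # (4) Repeat until the kernels stop moving.
        converged = np.allclose(new_kernels, kernels)
        kernels = new_kernels
        if converged:
            break
    return labels, kernels

# Synthetic example: two groups of points in a 2-D index space.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0.0, 0.2, size=(50, 2)),
                 rng.normal(5.0, 0.2, size=(50, 2))])
labels, kernels = k_means(pts, k=2)
```

Because the result depends on the initial kernels, a practical run would typically repeat the procedure with several seeds and keep the solution with the lowest squared error.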
K-Means has gained popularity in recent decades and has been widely applied in hydrology: for cloud classification from satellite imagery (Desbois et al., 1982), for climatic classification using measured and simulated time series (Moron et al., 2008; Carvalho et al., 2016) and for catchment classification based on streamflow characterization and precipitation (Toth, 2013).

K-Means classification was applied, and catchments were distributed based on their distances to the kernels of 5 classes, a number chosen for its geographical suitability, to determine whether or not they belong to a Mediterranean climate and, if so, to which type.

Decision Tree
The purpose of a decision tree analysis is to classify a population into groups by predicting values of a dependent variable based on values of predictor variables. This procedure provides validation tools for exploratory and confirmatory classification analysis. In our case, the dependent variable is the climatic class obtained from K-Means clustering, while the predictor variables are the distances to each cluster's kernel. This procedure was carried out for both the catchment and the gridded classifications.
The decision tree generates a set of classification rules usually used to classify new stations based on their distances to class kernels. In this study, these rules were used in section 5 to classify the RCP 4.5 and 8.5 projected indices. In this way, we fixed the class kernel indices of the 1970-2000 baseline period and calculated the distances of the 2070-2100 projected grid to the baseline kernels to compare both the classification indices and their spatial evolution. The decision tree might have differed if another kernel had been forced into the first node, but kernel 1 was adopted as it yielded the highest accuracy rate.

Adopted methodology
The proposed methodology consists of calculating the climatic indices using WorldClim-2 monthly gridded data averaged at the catchment scale using ArcGIS zonal statistics. The climatic indices were then reduced by PCA and classified using K-Means clustering. The classification was validated on WorldClim-2 gridded indices and ground station indices. In addition, a decision tree was built to classify projected indices and to avoid repeating the whole process. The PCA, the K-Means clustering and the decision tree were all computed using SPSS® software.
For climate change assessment and for better comparison, temperature and precipitation delta changes were calculated between the MED-CORDEX RCM ALADIN grids for the baseline 1970-2000 and projected 2070-2100 periods under RCP 4.5 and 8.5, and then superimposed on the WorldClim-2 grid through proximity analysis and spatial join. The indices of the projected grids were then re-classified using the decision tree and compared to the baseline grid.
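A delta-change transfer of this kind can be sketched as below. The text does not give the exact form of the deltas, so the sketch assumes an additive delta for temperature and a relative (ratio) delta for precipitation, and it stands in for the proximity analysis with a simple nearest-node search. All coordinates and values are hypothetical.

```python
import numpy as np

# Coarse RCM grid (hypothetical values): two nodes, planar coordinates in km.
coarse_xy = np.array([[0.0, 0.0], [12.0, 0.0]])
t_hist, t_fut = np.array([15.0, 10.0]), np.array([17.5, 13.0])   # deg C
p_hist, p_fut = np.array([50.0, 80.0]), np.array([45.0, 88.0])   # mm

# Assumed delta forms: additive for temperature, relative for precipitation.
delta_t = t_fut - t_hist
ratio_p = p_fut / p_hist   # undefined where the baseline value is zero

# Fine baseline grid (hypothetical values): three cells.
fine_xy = np.array([[1.0, 0.0], [5.0, 0.0], [11.0, 0.0]])
t_base = np.array([16.0, 14.0, 9.0])
p_base = np.array([40.0, 60.0, 100.0])

# Proximity step: index of the nearest coarse node for every fine cell.
dist = np.linalg.norm(fine_xy[:, None, :] - coarse_xy[None, :, :], axis=2)
nearest = dist.argmin(axis=1)

# Superimpose the coarse deltas on the fine baseline values.
t_proj = t_base + delta_t[nearest]
p_proj = p_base * ratio_p[nearest]
print(nearest, t_proj, p_proj)
```

Applying the change signal to the fine baseline, rather than using the raw 12-km projections, keeps the 1-km spatial detail while importing only the modelled change.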

Hydrology driven climatic indices
The hydrology-driven independent climatic indices were subjectively developed from WorldClim-2 monthly average data and divided into four groups to highlight the Mediterranean seasonality hypothesis of the climate and its corresponding hydrological response. While flow seasonality is clearly affected by precipitation seasonality, the other indices, such as monthly temperature and potential evapotranspiration variation, help in fine-tuning this theory. A complete list of indices with a description of each is given in Table 3.
Group I: indices based on monthly precipitation, among which we mention the seasonality index I_s, the peak indices S_P1.5 and S_P2, and the frequency indices P_25% and P_75%. I_s is directly linked to Mediterranean flow regimes, as it expresses the precipitation ratio between the 3 most humid months and the 3 driest months, with values ranging from 0 to 1 (Hreiche, 2003). I_s values tending towards 0 express a uniform distribution of precipitation along the year, with a hydrological response lacking flood and drought seasons, while I_s values tending towards 1 correspond to a normal distribution of precipitation, with a hydrological response more likely to show flood and drought seasons.
Group II: indices based on monthly temperature, expressed by the temperature lag between the coldest and warmest months ∆T_1, the frequency index T_25% and the number of months exceeding the average Mediterranean temperature S_Tm.
Group III: indices based on both temperature and precipitation, expressed by I_Decal, the time lag between the coldest and most humid months.
Group IV: indices based on precipitation and evapotranspiration, expressed by the aridity index I_Arid.

PCA Results
The number of indices was reduced in two steps, first based on the correlation matrix and then based on the PCA results. In the first step, we eliminated the strongly correlated indices (absolute correlation higher than 0.85), and 11 indices were kept.

• I_s and P_75% are strongly inversely correlated (-0.959). I_s was kept.
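A correlation screening of this kind can be sketched as a greedy filter on the correlation matrix. The text does not state how the index to keep in each correlated pair was chosen, so the rule used here (keep the index listed first) is an assumption, as are the function name, the short index labels and the synthetic data.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.85):
    """Keep an index only if |r| with every already kept index is <= threshold.

    Hypothetical helper: the order of `names` decides which member of a
    correlated pair survives.
    """
    R = np.corrcoef(X, rowvar=False)
    kept = []
    for j in range(len(names)):
        if all(abs(R[j, i]) <= threshold for i in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Synthetic example: the second column is almost the negative of the first.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
X = np.column_stack([a, -a + 0.05 * rng.normal(size=300), rng.normal(size=300)])
kept = drop_correlated(X, ["I_s", "P_75", "T_25"])
print(kept)
```

Using the absolute value of the correlation matters here, since a strongly negative pair such as the one reported above carries the same redundancy as a strongly positive one.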
Once the correlation matrix was transformed into a diagonal one, it was possible to find the eigenvalues representing the projection from p to k dimensions. The eigenvector matrix is the linear expression of the indices with respect to the principal components. The first eigenvalue, 6.36, represents 58% of the variability and the second, 1.31, represents 12%. The first two factors F1 and F2 thus carry the two greatest variabilities, and 70% of the total variability is preserved with this choice. Following the PCA, the number of indices was reduced to 7, showing that I_s, P_25%, S_P1.5, I_Arid, T_25%, S_PET and S_Tm were the most contributing climatic indices, with 70% of the total variance explained by the first two components. Statistical summaries are shown in Table 4, with I_s values ranging between 0.2 and 1 with an average of 0.8, highlighting Mediterranean seasonality.
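The reported shares of variability can be checked with a short calculation, assuming the PCA was run on the correlation matrix of the 11 retained indices, so that the eigenvalues sum to 11. The variable names are illustrative.

```python
p = 11                       # indices kept after the correlation screening
eigenvalues = [6.36, 1.31]   # first two eigenvalues reported in the text

# With standardized indices, each eigenvalue's share is eigenvalue / p.
shares = [ev / p for ev in eigenvalues]
cumulative = sum(shares)

print([round(100 * s) for s in shares], round(100 * cumulative))
# prints: [58, 12] 70
```

Under that assumption the arithmetic reproduces the 58%, 12% and 70% figures quoted above.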

Climatic Classification of WorldClim-2 catchments indices
The K-Means classification shown in Figure 4   This variability is highlighted in the class kernel indices (Figure 3). It is mainly due to the complex seasonality across the Mediterranean. This complexity is captured here in finer detail than in Köppen's definition, which is climate-oriented only and limited to the simple criteria of a wet winter and a dry or temperate summer. Therefore, we think that a hydrology-oriented climatic classification should account for intra-climate characteristics expressed by specific indices like the one shown here, specific to the Mediterranean and expressed by I_s. The continuous evolution of climate across the Mediterranean was demonstrated by index values uniformly increasing or decreasing from north to south. Seasonality is highest in the south and lowest in the north, and the same holds for the other precipitation indices and aridity.

Validation for WorldClim-2 gridded indices
The K-Means clustering of WorldClim-2 gridded data resulted in a similar spatial distribution, where class 1 dominates the south, class 5 the north and classes 2, 3 and 4 the middle (Figure 5). This classification offers better resolution than the catchment-based one, which relies on an averaging approximation over catchments, and it revealed some shifts to adjacent classes. Class 4 climate appeared on the Spanish coasts, class 3 climate in Sardinia and Greece, class 2 in Syria, and a limited spread of classes 4 and 5 in Eastern Turkey. However, climate continuity is conserved in this classification, as the indices gradually increase or decrease from north to south.

Motivated by the quest for coupled physiographic-climatic models, we believe that this classification is useful both for hydrological and for ecohydrological applications such as cultivation and other related environmental practices affected by water resources and river flows. Olive is one of the best Mediterranean-specific physiographic indices, and we noticed that its cultivation boundary is limited by those of classes 1 and 5, with 13% in class 2, 49% in class 3 and 34% in class 4. This observation gives a good indication of the ideal climate conditions for olive cultivation, suggesting that extreme seasonality combined with very high aridity (south) or very low seasonality combined with high humidity (north) is avoided by olive trees. In a similar way, other tree types such as pine trees also characterise the Mediterranean landscape, putting forward the need for a physiographic classification to be interpreted in parallel with this climatic classification under the umbrella of hydrological characterisation. The future of Mediterranean cultivation under climate change is examined under the RCP 4.5 and 8.5 scenarios in the next section.

The 144 stations were also K-Means clustered based on the indices selected from the PCA. The resulting geographical distribution differed only by some shifts due to averaging and normalization, as the sample is much smaller than the number of grid cells. There is no coverage of class 1, as no weather station was found in that region (Figure 6). Despite the shifts, the accuracy rate is 82%: 86 out of the 105 stations located within the catchments boundary matched the gridded distribution, and the rest are located within the boundaries of adjacent classes. As for the olive boundary, only one class 5 station, corresponding to Firenze, was located within it.

Decision tree analysis validation
The total population of the gridded classification was divided into two equal subsets, one for training and the other for testing. The predicted class values of both sets were then compared to the original classification, and both yielded an overall accuracy of 93% for the gridded classification (Table 5). We notice that some grid cells have joined one of the adjacent classes due to interclass connectivity; this confirms once more the continuity of climate. The generated decision tree has 3 levels and includes 75 nodes in total, owing to the large population, with 75 classification rules, a sample of which is given in Table 6. As an example, for class 1, if the distance to kernel 1 (D1) is below 3.5 and the distance to kernel 2 (D2) is above 2.2, then the grid cell belongs to class 1.
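The class 1 rule quoted above, together with an accuracy check of the kind used for validation, can be written as a short sketch. Only that single rule comes from the text; the function names, the distances and the class lists in the example are hypothetical.

```python
def classify(d1, d2):
    """Return 1 if the class 1 rule quoted in the text applies, else None."""
    if d1 < 3.5 and d2 > 2.2:
        return 1
    return None  # the other classes need the remaining rules, not reproduced here

def accuracy(predicted, reference):
    """Share of cells whose predicted class matches the reference class."""
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)

# Hypothetical distances to kernels 1 and 2 for two grid cells.
print(classify(3.0, 2.5))   # rule applies
print(classify(4.0, 2.5))   # D1 too large, rule does not apply

# Hypothetical predicted and reference classes for four grid cells.
print(accuracy([1, 1, 2, 3], [1, 2, 2, 3]))
```

Since the rules only need distances to fixed kernels, a new or projected grid cell can be classified without rerunning the PCA and the clustering.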

RCP 4.5 and 8.5 Scenarios Climate Evolution
For climate change impact assessment, temperature and precipitation delta changes were calculated between the baseline period 1970-2000 and the projected period 2070-2100 for the MED-CORDEX RCM ALADIN grids and for two Representative Concentration Pathway scenarios (RCP 4.5 and RCP 8.5). Those delta changes were then transposed to the WorldClim-2 grid through proximity analysis and spatial join. The decision tree rules from Table 6 were then applied to the projected period; the resulting climate change under the RCP scenarios is illustrated in Figure 7 and expressed as the evolution of indices between classes in Table 7.
Under the RCP 4.5 scenario, the Mediterranean region temperature increases by 1.4 to 3.5°C, and precipitation decreases by 10% in one third of the region but increases by 10% in the other two thirds. Overall, the Mediterranean is evolving towards a moderate/arid region under the RCP 4.5 scenario: no major area change has occurred, but the seasonality index I_s of classes 4 and 5 increases by 7% and 9%, while that of classes 1, 2 and 3 remains constant. Also for classes 4 and 5, S_P1.5 increases strongly (70%) while P_25% stays almost the same (2-3%), which means that the precipitation change was temporally distributed in such a way that more months exceed the average monthly precipitation by a factor of 1.5, so the humid season has become shorter, enhancing seasonality variation. Another remarkable change is the 20% increase in class 5 I_Arid, pushing it towards class 4. In detail, even though no major area change was observed, the distribution of classes has changed: class 5 has reduced its extent in Greece and Albania in favour of classes 3 and 4, compensated by the appearance of class 5 in central Spain. The extent of class 3 has decreased in Turkey and Corsica in favour of class 4 in Lebanon and class 2 in Cyprus.
The case is similar but accentuated under the RCP 8.5 scenario: temperature increases by 2.5 to 5.6°C, and precipitation decreases by up to 20% in half the region and increases by up to 25% in the other half. The difference from the RCP 4.5 scenario lies first in the evolution of the indices, where I_s increases by 9% in class 5 and S_P1.5 increases strongly, by 96%. This change has caused an area change of 2% towards class 4, mainly in Spain, Greece and Albania. Another change occurred in class 3, where I_Arid has increased by 19% and S_PET has decreased by 10%, which means that this moderate region is shifting towards a more arid climate. As for the spatial distribution, although its area has not changed, class 3 is taking over the south-eastern coast of Spain but retreating in favour of class 4 in North West Africa and Turkey.
In their evaluation of uncertainties in mean precipitation over the 1981-2010 period, Colmet-Daage et al. (2018) found that the total monthly precipitation in spring and summer is overestimated over the mountainous regions and underestimated over the coastal region. The mean and spread for the future period remain unchanged under the RCP 4.5 scenario and decrease under the RCP 8.5 scenario. However, precipitation events are infrequent during the spring and summer seasons in the Mediterranean, except for the class 5 region, which is characterized by the lowest seasonality.
The RCP 4.5 and 8.5 scenarios might look Mediterranean-friendly, as the seasonality indices of classes 4 and 5 are evolving towards those of class 3, in addition to some spatial expansion, which makes conditions more favourable for Mediterranean cultivation. They are less favourable for Mediterranean hydrology and water resources management, as the temperature increase might affect snowmelt runoff discharge and consequently the hydrological regimes as per the Haines classification, mainly with class 13 (Extreme Winter) taking over class 14 (Early Spring). The RCP 8.5 impact on hydrology is even more accentuated, with fewer available resources caused by lower precipitation.

Conclusions
The Mediterranean climate characteristics, and specifically precipitation seasonality, the main contributor according to the PCA, play an important role in the hydrological mechanisms of Mediterranean catchments and flow intermittence. A decision tree makes it possible to determine, from distances to class kernels, whether any place has a Mediterranean climate and to which type of Mediterranean climate it belongs, for present and future scenarios. The interclass connectivity showed that climate is continuous from one place to another, since some catchments or grid cells can meet the membership criteria of adjacent classes. On the other hand, the superposition of the olive cultivation boundary as a Mediterranean-specific physiographic index highlighted the utility and importance of coupled physiographic-climatic scenario models that could be extended to other Mediterranean physiographic or bioclimatic indices. The climatic classification and the corresponding evolution of indices under RCP scenarios helped in identifying the general climate change impact on Mediterranean seasonality, which might uncover valuable findings about water balance, floods and droughts for water sector stakeholders. Both the RCP 4.5 and 8.5 scenarios showed an increase in the average seasonality and aridity indices, affecting hydrological regimes through shorter humid seasons and earlier snowmelt. The results of this study are useful for future water resources and cultivation management policies, to identify the most impacted zones and propose preventive and adaptive measures for a more resilient region. This kind of classification might be reproduced at the global scale, using the same or other region-specific climatic indices highlighting their physiographic characteristics and hydrological response.

Table 6. Sample of the decision tree set of rules for the gridded classification (D1, D2, D3, D4 and D5 correspond to the distance to the kernel of class 1, 2, 3, 4 and 5)
(23 rules)
5.5 < (D1) < 5.9 and 1.3 < (D5) < 1.7 and 1.2 < (D4) < 1.5
5.5 < (D1) < 5.9 and (D5) > 1.7
5.1 < (D1) < 5.5 and 1 < (D4) < 1.2 and (D5) < 1.3
5.1 < (D1) < 5.5 and 1.5 < (D4) < 1.5 and (D5) < 1.7
CLASS 5 (12 rules)
5.9 < (D1) < 6.5 and 1.5 < (D4) < 2.4 and (D5) > 1.7
5.9 < (D1) < 6.5 and (D4) > 2.4