Coherence of Global Hydroclimate Classification Systems

Climate classification systems are useful for investigating future climate scenarios, water availability, and even socioeconomic indicators as they relate to climate dynamics. These classification systems typically use various forms of water and energy indicators to create zone boundaries. However, there has yet to be a classification framework that includes evapotranspiration (ET) rates as a governing principle, nor has there been an effort to simultaneously compare the structure and function of multiple existing classification schemes. Here, we developed three new classification systems based on ET rates and one new system based on precipitation and potential evapotranspiration, and we compared these four new systems against four previously established climate classification systems. The within-zone similarity, or coherence, of long-term water budget components was evaluated for each system based on the premise that the application of a climate classification framework should correspond to those variables that are most coherent. Additionally, the complexity of zone boundaries in each system was assessed. The most frequently used system, Köppen-Geiger, had high hydroclimate coherence but also high spatial complexity. This study produced classification systems of improved coherence for individual water budget components, lower spatial complexity, and fewer parameters needed for their construction. The Water-Energy Clustering classification system is the primary framework proposed here for future investigations in which regions of interest include zones of differing hydrologic dynamics.


Introduction
A variety of classification schemes have been introduced to categorize specific biophysical characteristics of Earth systems, including those based on climatic behavior (Beck et al., 2018; Berghuijs and Woods, 2016; Holdridge, 1967), biodiversity (Olson et al., 2001), plant-climate interactions (Papagiannopoulou et al., 2018), or plant hardiness (Magarey et al., 2008; McKenney et al., 2007). These frameworks classify elements of a system based on common atmospheric or terrestrial characteristics to maximize their within-zone similarity, or coherence, which allows for a transfer of understanding across zones of similar attributes (Lanfredi et al., 2019). This study focuses specifically on climate classification schemes, which have provided a climatic context for a variety of applications, including socioeconomic assessments of human health conditions (Boland et al., 2017; Jagai et al., 2007; Lloyd et al., 2007), economic development (Mellinger et al., 2000; Richards et al., 2019), and evaluating anticipated biophysical and climatic changes (Chen and Chen, 2013; Tapiador et al., 2019).
https://doi.org/10.5194/hess-2020-522 Preprint. Discussion started: 26 October 2020 © Author(s) 2020. CC BY 4.0 License.
Different climate classification systems have emerged based on framework-specific suites of hydroclimatic variables used to define the climate zone boundaries. Therefore, users should consider how the intended application of a classification system corresponds to the variables used to create it (Knoben et al., 2018; Meybeck et al., 2013). Climate classification systems are usually based in part on annual and seasonal water-energy budgets (Beck et al., 2018; Berghuijs and Woods, 2016; Holdridge, 1967; Knoben et al., 2018; Meybeck et al., 2013). The Köppen-Geiger classification system, the most widely used climate framework (Peel et al., 2007), was developed to regionalize climatic variables (specifically accounting for seasonal precipitation and temperature) and is often employed to compare the output of global climate models (Peel et al., 2007; Tapiador et al., 2019). Another common system is the Holdridge Life Zones scheme, which was created to classify land area with respect to vegetation and soil (Holdridge, 1967). This system subdivides zones based on thresholds of annual precipitation (P), potential evapotranspiration (PET), biotemperature (growing season length and temperature), latitude, and altitude.
Recent work has extended climate classification frameworks to specifically encompass hydrological attributes, since water resources-based analyses should take place within relevant hydrologic boundaries (Knoben et al., 2018; Meybeck et al., 2013). For example, Meybeck et al. (2013) proposed a global zoning system that was primarily based on the mean temperatures and average locally generated runoff (Q) of river basins. They compared the resulting boundaries against the Köppen-Geiger and Holdridge frameworks to assess zone overlaps. Those authors also evaluated the within-zone coherence of mean annual temperature, P, and Q, concluding that the latter two were most coherent in dry zones and least coherent in equatorial zones, while temperature was most coherent in equatorial zones. However, Meybeck et al. (2013) did not compare their zone coherence to that of previously established systems. Similarly, Knoben et al. (2018) formed zone boundaries based on climate indices (average aridity, seasonality of aridity, and P as snow) with the objective of minimizing within-zone Q variability (i.e., maximizing Q coherence). Those authors compared their results to the Köppen-Geiger framework and found theirs to be more coherent with respect to Q, but they did not evaluate other water budget components or other climate classification systems.
Although the P and Q components of the long-term water budget have been extensively considered in climate classification schemes (Beck et al., 2018; Berghuijs and Woods, 2016; Holdridge, 1967; Knoben et al., 2018; Meybeck et al., 2013), notably absent is a system that is directly based on actual evapotranspiration (ET) rates. This gap is likely because ET traditionally has been the least empirically identified element of regional to global water budgets (Zhang et al., 2016).
Moreover, there has been no comparison of within-zone hydroclimate coherence across climate classification systems, and evaluation of ET rates in particular is lacking. Furthermore, the spatial complexity of climate classification systems has not been systematically examined. Assessing the spatial structure of a biophysical system is a concept that most notably originates from landscape ecology (O'Neill et al., 1988), which provides a suite of shape metrics that can be applied across disciplines. Quantifying shape pattern and spatial configuration of climate classification systems is important for understanding the interactions between governing hydroclimatic characteristics.
This work seeks to provide empirical support for application-dependent selection among possible climate classification systems. We suggest that a classification system should have high within-zone coherence for variables that are related to the system's intended use, as well as relatively low shape complexity across zones, which is useful for ease of interpretation within management and policy contexts. As such, we hypothesized that for a given climate classification system, within-zone hydrologic coherence and inter-zone shape complexity will be closely related to the organizing principle of that system. For example, the Köppen-Geiger and Meybeck et al. (2013) systems are based in large part on P and Q, respectively, and therefore these systems should show high coherence for these variables. Similarly, zone shape complexity will be lower in classification systems that include spatial contiguity in the organizing criteria (e.g., Meybeck et al., 2013). Given the major gap regarding the inclusion of ET in climate classification systems, we also propose a series of ET-based global classifications that should yield comparatively higher ET coherence than other systems.
We tested our hypothesis by evaluating within-zone coherence of long-term water budget components (mean annual ET, P, and Q) and synchronous P and PET seasonality, as well as zone shape complexity for our four new global classification systems, and further compared these against four previously established systems (Beck et al., 2018; Holdridge, 1967; Knoben et al., 2018; Meybeck et al., 2013). The primary zone shape complexity metrics were zone area and zone fragmentation (i.e., number of patches comprising each zone). This work presents novel approaches to determine appropriate applications and boundary complexities of classification frameworks. Understanding the relevance of a climate classification system is important since such frameworks are used in multi-disciplinary contexts to examine hydrological, ecological, and societal phenomena.

Database construction
We evaluated global gridded monthly P and PET and mean annual ET and Q between 1980 and 2018 at a 0.5° x 0.5° spatial resolution. The Climate Research Unit TimeSeries V4.04 supplied monthly P and PET (Harris et al., 2020), while mean annual ET and Q were constructed from aggregated TerraClimate daily data (Abatzoglou et al., 2018). Long-term mean values were used to mute interannual variability. Annual ET and Q were resampled from their original 1/24° x 1/24° resolution to the 0.5° x 0.5° resolution of P and PET. Spatial analysis R packages raster (Hijmans, 2017), sp (Bivand et al., 2013) and ncdf4 (Pierce, 2017) were used to build the database of long-term monthly and annual averages. The spatial extent of this study comprised all global land areas, excluding Antarctica, which resulted in a total of 61,701 pixels.
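The change of resolution described above can be illustrated with a small sketch. It assumes plain block averaging, in which each 0.5° cell is the mean of the 12 x 12 block of 1/24° cells it covers and missing cells (e.g., ocean) are skipped; the text does not state which resampling method was actually used, and the function name `aggregate_mean` is hypothetical.

```python
import math

# Number of fine cells along one side of a coarse cell: 0.5 / (1/24) = 12
FACTOR = round(0.5 / (1 / 24))


def aggregate_mean(grid, factor):
    """Average the non-missing cells in each factor x factor block.

    `grid` is a 2-D list of floats with NaN marking missing cells.
    Block averaging is an assumption of this sketch, not a documented
    step of the study.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            vals = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
                if not math.isnan(grid[r][c])
            ]
            # A block with no valid cells stays missing
            row.append(sum(vals) / len(vals) if vals else math.nan)
        coarse.append(row)
    return coarse
```

For a global raster this would be applied once per variable (mean annual ET and Q) before combining them with the P and PET layers.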

Sinusoidal functions as a descriptor of seasonality
The seasonal dynamics of monthly P and PET were also considered in this analysis, as they are also included in the Köppen-Geiger (KPG) framework, which considers temperature as a general proxy for PET (Beck et al., 2018). Sine functions, and their corresponding parameters, can be used to describe intra-annual climate behavior. Sine functions were fitted to the long-term monthly distribution (following Berghuijs and Woods, 2016):
$$X(t) = \bar{X}\left[1 + \delta_X \sin\left(\frac{2\pi\,(t - \theta_X)}{12}\right)\right] \qquad (1)$$
where $X$ is P or PET (mm month⁻¹) for each month, $t$, with monthly mean denoted by the overbar, and $\delta_X$ is dimensionless amplitude. Phase angle, $\theta_X$, is the offset (months) from the reference time, January ($t = 1$), with absolute value of phase difference $|\Delta\theta| \le 6$. Phase difference, ∆θ, describes the synchronization of P and PET throughout the year as
$$\Delta\theta = \theta_P - \theta_{PET} \qquad (2)$$

Equation 1 showed overall good fits to the long-term mean monthly distributions of P and PET, with R² = 0.67 ± 0.28 and 0.85 ± 0.17, respectively (mean ± standard deviation across all pixels). To constrain −6 ≤ ∆θ ≤ 6, 12 was either added to or subtracted from ∆θ values outside of these bounds (e.g., ∆θ = 8 months is translated to −4 months). Because a constant reference time of January does not describe the water year for each zone, the ∆θ distributions were normalized by centering around the mode and correcting to contain only positive values (0 to 12 months).
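A sketch of the fitting and wrapping steps follows. It assumes the sine model has the Berghuijs and Woods (2016) form, X(t) = mean x [1 + amplitude x sin(2π(t − phase)/12)], and that the phase difference is taken as the P phase minus the PET phase; both the symbol choices and the sign convention are assumptions here. Because the 12 monthly values are equally spaced over one full cycle, the least-squares fit reduces to projecting the relative anomalies onto sine and cosine terms. The function names are hypothetical.

```python
import math

MONTHS = range(1, 13)
OMEGA = 2 * math.pi / 12  # angular frequency of a 12-month cycle


def fit_sine(monthly):
    """Return (mean, amplitude, phase in months) for 12 monthly means."""
    x_bar = sum(monthly) / 12
    # Relative anomaly: y(t) = X(t)/mean - 1 = a*sin(wt) + b*cos(wt)
    y = [x / x_bar - 1 for x in monthly]
    # sin and cos are orthogonal over 12 equally spaced months, each with
    # sum of squares 6, so least squares reduces to these projections
    a = sum(v * math.sin(OMEGA * t) for v, t in zip(y, MONTHS)) / 6
    b = sum(v * math.cos(OMEGA * t) for v, t in zip(y, MONTHS)) / 6
    amplitude = math.hypot(a, b)
    phase = (math.atan2(-b, a) / OMEGA) % 12
    return x_bar, amplitude, phase


def wrap_phase_difference(delta):
    """Shift a phase difference by multiples of 12 months into [-6, 6]."""
    while delta > 6:
        delta -= 12
    while delta < -6:
        delta += 12
    return delta
```

With these helpers, a per-pixel phase difference would be `wrap_phase_difference(theta_p - theta_pet)`, where each phase comes from `fit_sine` applied to the corresponding long-term monthly means.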

Established climate classification systems
Four previously established climate classification schemes were assessed in this analysis. We included two veteran schemes, Köppen-Geiger (KPG) and Holdridge Life (HDL) zoning systems, and two recently proposed frameworks, here referred to as Meybeck Hydroregion (MHR, Meybeck et al., 2013) and Knoben Hydroclimate (KHC, Knoben et al., 2018) systems. Note that the original KHC zones created by Knoben et al. (2018) were not delineated by discrete boundaries but were instead represented as a probability continuum of belonging to a zone. In the present study, KHC pixel values were rounded to create 30 separate zones, equal to the number of KPG zones, and similar to HDL (n=38) and MHR (n=27). This KHC zone distinction was made so that each zone could be independently evaluated. Note that the very small KPG zones "Csc" and "Cwc" did not appear in the 0.5° x 0.5° resolution KPG output created by Beck et al. (2018) that was used in this study, resulting in 28 KPG zones analyzed here. As in other climate classification studies (Knoben et al., 2018; Meybeck et al., 2013), KPG was considered here to be the standard to which other systems are primarily compared.

Proposed univariate ET climate classification systems
This study establishes and verifies ET-relevant climate classification frameworks by creating zones primarily based on ET rates and comparing ET coherence between systems. Three of the four systems proposed in this study were univariate (formatted from global mean annual ET rates) and uni-conditional (incorporating an additional system-dependent single condition). The additional conditions were included to emphasize a specific optimization goal.
The first two proposed univariate classification systems were based on the global ET empirical cumulative distribution function (CDF). The first classification system, ET Area-optimizing (ETA), was created with the objective of having an equal number of pixels in each ET-based zone. This was motivated by the relatively high spatial nonuniformity of the KPG system, which results in highly variable relevance for regional analyses. Additionally, it is useful to have a simple baseline framework against which to compare other systems. This type of spatial condition is similar to the prioritizations of the MHR framework that state zones should ideally be "delineated in one piece" (Meybeck et al., 2013). The cumulative probability interval [0,1] was divided into 15 equal parts, and the corresponding upper and lower bounds of ET thresholds for each zone were determined from the CDF of mean annual ET for all global land pixels (Figure S1-A). The number of ETA zones was chosen based on the number of zones in previously established systems, the relative improvement of ET coherence with the addition of more zones (Figure S1-B), and the objective of having equal or fewer zones than the standard KPG framework.
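The equal-area idea can be sketched as a rank-based split of the empirical CDF: pixels are ordered by mean annual ET and the ordered list is cut into zones of (nearly) equal count. This is a minimal illustration under that reading of the text; the function name `equal_count_zones` is hypothetical, and ties are broken by input order, which the source does not specify.

```python
def equal_count_zones(values, n_zones):
    """Assign each value a zone index 0..n_zones-1 with near-equal counts.

    Zone 0 holds the lowest values. Splitting by rank is equivalent to
    cutting the empirical CDF at multiples of 1/n_zones.
    """
    n = len(values)
    # Indices of the pixels sorted from lowest to highest value
    order = sorted(range(n), key=lambda i: values[i])
    zones = [0] * n
    for rank, idx in enumerate(order):
        zones[idx] = rank * n_zones // n
    return zones
```

Calling `equal_count_zones(et_values, 15)` on a list of per-pixel mean annual ET (a hypothetical variable) would mirror the 15-zone setup; the ET thresholds of each zone are then the minimum and maximum values of its members.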
The second proposed classification system, ET Variability-optimizing (ETV), was based on the principle of maximizing within-zone ET coherence to the greatest feasible extent, considering the tradeoffs with increasing complexity by adding zones. By fitting the empirical CDF with a continuous distribution, zone boundaries can be determined analytically for the minimum desired CVmin. For simplicity, and also supported by empirical evidence (Figure S1 [...] where the upper and lower limits of sequential zones are shared (i.e., a limit of zone i − 1 coincides with the adjoining limit of zone i). The largest value of b = 1,454 mm yr⁻¹ was based on the maximum ET for all pixels, and CVmin = 0.075 was chosen based on the marginal decrease with increasing number of zones (Figure S2), which resulted in 29 zones. This method produces nearly equal CV in all zones.
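How a fixed CV can determine zone limits analytically can be shown under one explicit assumption: that ET is uniformly distributed between the lower limit a and upper limit b of each zone. This distributional choice is an assumption of the sketch and is not confirmed by the text above. For a uniform distribution, CV = (b − a) / (√3 (b + a)), so each lower limit follows from the upper limit as a = b (1 − √3 CV) / (1 + √3 CV). The function names are hypothetical.

```python
import math


def uniform_cv(a, b):
    """CV of a uniform distribution on [a, b]: sd / mean."""
    return (b - a) / (math.sqrt(3) * (b + a))


def constant_cv_bounds(b_max, cv, n_zones):
    """Zone limits from b_max downward so every zone has the same CV.

    Returns n_zones + 1 limits in decreasing order; consecutive zones
    share a limit. Assumes a uniform distribution within each zone.
    """
    k = math.sqrt(3) * cv
    ratio = (1 - k) / (1 + k)  # lower limit divided by upper limit
    bounds = [b_max]
    for _ in range(n_zones):
        bounds.append(bounds[-1] * ratio)
    return bounds
```

For example, `constant_cv_bounds(1454.0, 0.075, 29)` returns 30 limits starting at the reported maximum ET; under this assumption the limits form a geometric sequence, so zones narrow toward low ET.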
The third univariate scheme proposed here is the ET Clustering (ETC) classification system. Previous analyses have used clustering techniques for climate classification purposes (Knoben et al., 2018; Tapiador et al., 2019), and here k-means clustering based on the Hartigan and Wong (1979) algorithm was employed using the kmeans function in the R package stats (R Core Team, 2018). Zones were built by updating clustering centers iteratively until the within-zone sum of squares of mean annual ET, based on Euclidean distances, was minimized. The final number of clustering centers (i.e., zones) was 20, which was chosen because it is the fewest number of zones with ET mean CV ≤ 0.1 (Figure S3).
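The clustering step can be illustrated with a one-dimensional k-means sketch. Note the swap: this uses the simpler Lloyd iteration rather than the Hartigan and Wong (1979) algorithm behind R's kmeans, and it seeds the centers at evenly spaced quantiles, which is a choice made here for determinism. Both may converge to different local optima than the original analysis. The function names are hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd-style k-means on scalar values; returns the k cluster centers."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: centers at evenly spaced quantiles of the data
    centers = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def within_ss(values, centers):
    """Total within-cluster sum of squared distances to the nearest center."""
    return sum(min((v - c) ** 2 for c in centers) for v in values)
```

Running `kmeans_1d` for increasing k and tracking the mean within-zone CV would mimic the selection of the number of zones described above.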

Proposed multivariate climate classification systems
The final proposed system in this study is a multivariate climate clustering framework, which was created from the same k-means clustering method as the ETC framework. This new climate classification system included multiple hydroclimate variables and was designed for comparison with previously established systems that were similarly formed with multiple variables, as well as to assess against the univariate ET classification frameworks. A suite of potential multivariate systems was generated using mean annual P, PET, ∆θ, ET, and Q. The best performing multivariate classification scheme was chosen from the batch of potential systems by evaluating criteria corresponding to KPG system characteristics. The primary criteria considered were water budget coherence (mean CV of ET, P, and Q) and zone complexity. Some flexibility was allowed for the final multivariate system, with water budget coherence constrained to be within 50% of KPG coherence values.
The specific values for these elimination thresholds are listed in Supporting Information, and the resulting eligibility of the multivariate climate classification systems is shown in Tables S1 and S2. Ultimately, the climate classification system formed from clustering mean annual P and mean annual PET was chosen as the representative multivariate framework. This classification scheme was named the Water-Energy Clustering (WEC) climate classification system.

Coherence and complexity metrics
Variable coherence is defined by within-zone variability, represented by the intra-zone coefficient of variation (CV) of the variable of interest. Lower CV corresponds to higher coherence. Complexity metrics were defined with the objective to use the smallest number of metrics to characterize shape structure (Cushman et al., 2008). Classification system complexity metrics were based on three principles: 1) Classification systems should consist of a relatively even distribution of pixels across zones, avoiding disproportionately large or small zones, 2) Zones should be as hydrologically continuous as possible (Meybeck et al., 2013), minimizing patchiness or fragmentation, and 3) Classification systems should comprise no more zones than the KPG framework. Therefore, complexity was assessed based on the inter-zone distribution of the number of pixels (zone area evenness, CVz) and the number of patches in each zone (zone fragmentation). Note that the subscript z is added to differentiate between-zone complexity from the above metrics, which emphasize intra-zone coherence. The number of patches is the only primary coherence or complexity metric in which CV is not used, since the objective here was to minimize the degree of fragmentation, and not the similarity of fragmentation across zones. The number of patches was determined using the R function ClassStat in package SDMTools (VanDerWal et al., 2019). For each hydroclimate and complexity variable, differences between the KPG framework and the other classification systems were determined based on the Kolmogorov-Smirnov (K-S) test.
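The three metrics can be sketched in a few lines. This is an illustration rather than the workflow of the study: it uses the population standard deviation, counts patches with an 8-neighbor rule, and ignores wrap-around at the date line, all of which are assumptions here and may differ from what ClassStat does. The function names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def zone_cv(values, zones):
    """Intra-zone CV (sd / mean) of a variable; lower means more coherent."""
    by_zone = {}
    for v, z in zip(values, zones):
        by_zone.setdefault(z, []).append(v)
    return {z: pstdev(vs) / mean(vs) for z, vs in by_zone.items()}


def area_evenness(zones):
    """CVz: CV of the number of pixels per zone across all zones."""
    counts = {}
    for z in zones:
        counts[z] = counts.get(z, 0) + 1
    sizes = list(counts.values())
    return pstdev(sizes) / mean(sizes)


def count_patches(zone_grid):
    """Number of 8-connected patches per zone; None marks non-land cells."""
    n_rows, n_cols = len(zone_grid), len(zone_grid[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    patches = {}
    for r in range(n_rows):
        for c in range(n_cols):
            z = zone_grid[r][c]
            if seen[r][c] or z is None:
                continue
            # Found an unvisited cell: it starts a new patch of zone z
            patches[z] = patches.get(z, 0) + 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < n_rows and 0 <= nc < n_cols
                                and not seen[nr][nc]
                                and zone_grid[nr][nc] == z):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
    return patches
```

A system-level summary would then be the mean of `zone_cv(...)` values for each variable, `area_evenness(...)` for CVz, and the mean of `count_patches(...)` values for fragmentation.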

Results
This study compared four previously established climate classification systems (KPG, HDL, MHR, KHC) and four potential new climate classification systems (ETA, ETV, ETC, and WEC) to assess hydroclimate coherence as well as zone boundary complexities. The coherence and complexity metrics for the KPG system, the standard used in this study, are shown in Figure 1 for all zones. The KPG system had relatively high hydroclimate coherence, as all zones had CV < 1 for all variables except Q (Figure 1). However, the KPG spatial complexity was also relatively high. Zones in the KPG system had high variability in the number of pixels, with several Boreal zones less than 5% the size of the largest zone, polar tundra zone "ET" (Figure 1E).

Figure 1 (caption fragment): [...] D). Zone size (number of pixels) and patchiness (number of patches) are normalized to their maximum values.
Coherence and complexity results for the six best performing of the eight climate classification systems are shown in Table 1. The performance compared to KPG, in order of worst-to-best, was generally HDL, ETV, KHC, MHR, ETC, ETA, and WEC. The HDL and ETV systems performed either worse than or not statistically different from KPG in all metrics, except higher ET coherence for ETV, and these results are therefore not shown in Table 1 (see Table S3). The established KHC and MHR systems were also poor-performing, with overall worse or similar hydroclimate coherence compared to KPG.
However, KHC and MHR had improved pixel coherence and fewer patches, respectively (Table 1). The proposed ETA and ETC systems had much better ET coherence than KPG and similar P coherence to KPG. The ETA system also had the fewest number of zones and very high coherence for zone size (by far the lowest CVz), but both ETA and ETC had lower coherence than KPG for the remaining hydroclimate variables. Finally, the proposed WEC system had similar or better performance than the KPG system in all coherence and complexity metrics, except for ∆θ coherence, and can be considered a strong contender for the overall best system.
Table 1: Hydroclimate coherence (intra-zone CV for mean annual ET, P, Q, ∆θ, and PET) and complexity (inter-zone CVz for pixels, and numbers of patches and zones) in established and proposed climate classification systems. Mean (standard deviation), with significantly higher (ꜛ) or lower (ꜜ) values than KPG determined based on K-S tests. Bold indicates the best overall system for each metric (more than one system had statistically similar results for some metrics).
The coherence and complexity metrics for the WEC system are shown in Figure 2 for all zones. Like KPG, coherence was high (CV < 1) in all zones for each hydroclimate variable apart from Q, but WEC had even higher hydroclimate coherence than KPG overall. Zone area was more equally distributed in WEC, but with similar patchiness to KPG (Table 1). The KPG system groups 30 zones into 5 categories, and here the WEC zones were similarly divided into 5 groups (Figure 3) based on aridity index (P/PET) and minimizing the variability of zone areas. Groups were organized to comprise near-equal pixel distributions (11,367 to 13,895 pixels in each group) across zones of decreasing aridity index, with mean aridity index from G1 to G5 of {2.4, 1.1, ...} (increasing aridity).
While WEC groups represented similar total areas, they comprised different numbers of zones, from G3 with only two large zones, to G1 with 7 zones.

Maps of the boundaries for the proposed WEC system and the standard KPG framework are compared in Figure 3.
While there were some similarities (e.g., see the Iberian Peninsula in Figure 3), most regions are divided differently. For example, parts of northern Europe are divided into three KPG zones but five WEC zones. Similarly, the southeastern United States, excluding south Florida, is mostly one KPG Temperate zone, but is separated in the WEC system into two distinct G2 zones. Clustering centers for the WEC climate classification system are listed in Table S4.

Discussion
We hypothesized that variable coherence and zone shape complexity would be related to the governing principles of the classification systems, which was mostly supported by the results of this study. Of the four previously established systems, KPG was the most hydroclimatically coherent, but had high variability in pixel distribution across zones (Table 1). The KPG and WEC frameworks had the overall highest ∆θ coherence of the eight total compared systems, which is reasonable since KPG was the only system that accounted for monthly variability of water (P) and energy (temperature), resulting in 24 parameters (Beck et al., 2018). The WEC system was also based on water (P) and energy (PET), but from a mean annual perspective, thus requiring only two parameters. It is important to highlight that the number of required parameters, a notable aspect of system complexity, was lower for all systems proposed here (between 1 and 3 parameters) compared to the KPG system (24 parameters).
When comparing all eight systems, WEC had the highest P and PET coherence and similar ET, ∆θ, and Q coherence to KPG. It is not surprising that the WEC classification system yielded highest P and PET coherence, given these were the variables used to draw its zone boundaries, but WEC also had much more uniform pixel distribution, similar zone fragmentation, fewer zones, and required substantially fewer parameters than KPG. The MHR system used mean Q as a governing principle (Meybeck et al., 2013), and the KHC framework considered Q as independent validation for their zones (Knoben et al., 2018), but neither system was comparatively high in Q coherence. However, the principle of contiguity in the MHR system led to the lowest patchiness of all systems evaluated, so this system could be useful when continuous boundaries are important for ease of implementation or interpretation purposes. Lastly, the three univariate ET-based classification systems had the highest ET coherence, while ETA (which additionally optimized equal zone area) also had the most uniform pixel distribution across zones.
Evaluating hydroclimate coherence is important for understanding water availability distribution within a group of related zones and within individual zones to make informed management decisions. For KPG, water budget dynamics are most uniform in Tropical zones and least coherent in Polar and Arid zones (Table S5). Arid zones had the most uniform pixel distribution, while the Boreal group was least fragmented with the lowest mean number of patches, suggesting these zones are interrupted neither by other zones nor by water bodies. For WEC, group G1 had the highest mean P and Q coherence, but also the lowest mean PET coherence. The latter is likely because of the temperature variation across G1 zones, which encompass both equatorial and subarctic regions (Figure 3). Group G5, comprising the most arid zones, had the lowest water budget coherence and highest PET coherence, indicating relative uniformity in (low) rainfall and (high) temperature. Pixel distribution was most uniform in G3 and G5, while G5 was least fragmented. It is valuable to note the structural attributes of zone boundaries because these boundaries are expected to change over time (Beck et al., 2018; Knoben et al., 2018).
Of the water budget components, Q was the least coherent while ET was the most coherent across all systems except KHC and WEC (Table 1). This overall high coherence suggests that the variability of the drivers of ET (water and energy budget components) is mostly captured, even if ET itself is not a governing principle in the framework. However, there was still room for improvement within the established classification systems with respect to optimizing ET variability. Based on the climate classification system comparison presented here, it can be concluded that water and energy ET drivers are important considerations for broad hydroclimate analyses, since water budget coherence is mostly achieved when P and PET are included as governing principles. However, when specifically evaluating ET dynamics, using an ET framework is most appropriate.
Depending on which spatial complexity metric is favored, MHR (mean zone patches = 10) and ETA (CVz = 0.02) were the least complex systems. Based on both ET coherence and spatial complexity, the ETA system established here is suggested for ET-focused questions such as large-scale assessments of crop productivity (Howell et al., 2015).
This study is limited by a few factors. First, distinct climate zone boundaries, although useful in practice, do not exist in the physical system (Knoben et al., 2018). Second, this study compared averaged metrics that were applied across zones within each classification system, although some individual zones were better or worse than others with respect to coherence and complexity. Third, the focus on long-term mean annual hydroclimate attributes for zone formation does not account for interdecadal climate dynamics. It is also important to recall that Q here is based on locally generated runoff (P-ET), rather than the accumulated runoff from upstream contributing areas, which would be representative of gaged streamflow.
The KPG system is the most widely used climate classification system, and this analysis revealed that it indeed has high hydroclimatic coherence. However, WEC was either better than or not statistically different from the KPG framework in all coherence and complexity metrics. Ultimately, the WEC climate classification system is suggested for large spatial scale hydroclimate analyses, as it has lower zone area variability than KPG and requires fewer parameters and zones to create a hydroclimatically coherent classification system. Applying the most relevant framework in which hydroclimate similarities are captured by appropriate boundaries is critical to address questions of large spatial-scale climate variability and water availability (Meybeck et al., 2013).
The WEC and ETA boundary systems proposed here can be useful in determining environmental management decisions in which land area encompasses multiple climate zones.

Author Contribution
KLMP performed the analyses and led the manuscript preparation. JWJ conceived and directed the study.

Competing Interests
The authors declare that they have no conflict of interest.