Statistical approaches to study extreme events require, by definition, long
time series of data. In many scientific disciplines, these series are often
subject to variations at different temporal scales that affect the frequency
and intensity of their extremes. Therefore, the assumption of stationarity is
violated and alternative methods to conventional stationary extreme value
analysis (EVA) must be adopted. Using the example of environmental variables
subject to climate change, in this study we introduce the
transformed-stationary (TS) methodology for non-stationary EVA. This approach
consists of (i) transforming a non-stationary time series into a stationary
one, to which the stationary EVA theory can be applied, and (ii) reverse
transforming the result into a non-stationary extreme value distribution. As
a transformation, we propose and discuss a simple time-varying normalization
of the signal and show that it enables a comprehensive formulation of
non-stationary generalized extreme value (GEV) and generalized Pareto
distribution (GPD) models with a constant shape parameter. A validation of
the methodology is carried out on time series of significant wave height,
residual water level, and river discharge, which show varying degrees of
long-term and seasonal variability. The results from the proposed approach
are comparable with the results from (a) a stationary EVA on quasi-stationary
slices of non-stationary series and (b) the established method for
non-stationary EVA. However, the proposed technique comes with advantages in
both cases. For example, in contrast to (a), the proposed technique uses the
whole time horizon of the series for the estimation of the extremes, allowing
for a more accurate estimation of large return levels. Furthermore, with
respect to (b), it decouples the detection of non-stationary patterns from
the fitting of the extreme value distribution. As a result, the steps of the
analysis are simplified and intermediate diagnostics are possible. In
particular, the transformation can be carried out by means of simple
statistical techniques such as low-pass filters based on the running mean and
the standard deviation, and the fitting procedure is a stationary one with a
few degrees of freedom and is easy to implement and control. An open-source
MATLAB toolbox implementing this methodology has been developed and is
available at

Extreme value analysis (EVA) is of great importance in several applied sciences, particularly in earth science, because it is a fundamental tool for studying the magnitude and frequency of extreme events and their changes (e.g., Alfieri et al., 2015; Forzieri et al., 2014; Jongman et al., 2014; Resio and Irish, 2015; Vousdoukas et al., 2016a). Climatic extreme events are usually associated with disasters and damage with significant social and economic costs. A correct statistical evaluation of the strength of extreme events in relation to their average return period is crucial for impact assessment, for the evaluation of the risks affecting human lives and activities, and for planning actions regarding risk management and prevention (e.g., Hirsch and Archfield, 2015; Jongman et al., 2014).

Often it is necessary to apply EVA to non-stationary time series, i.e., series with statistical properties that vary in time due to changes in the dynamic system. In particular, climate change can induce variations in the statistical properties of time series of climatic variables. For example, an intensification of the meridional thermal gradient at middle latitudes on a global scale would lead to an increase of the climatic variability (e.g., Brierley and Fedorov, 2010), resulting in a reduction of the average return period of storms with a given strength. Consequently, in the study of climate change, an accurate statistical estimation of middle to long-term extremes is inherently connected to the application of non-stationary methodologies.

While a general theory about non-stationary EVA has not yet been formulated (Coles, 2001), there are several studies describing methodologies for the estimation of time-varying extreme value distributions on non-stationary time series, which rely on the pragmatic approach of using the standard extreme value theory as a basic model that can be further enhanced with statistical techniques (e.g., Coles, 2001; Davison and Smith, 1990; Hüsler, 1984; Leadbetter, 1983; Méndez et al., 2006).

An established technique consists in expressing the parameters of an extreme
value distribution as time-varying parametric functions (

A drawback of this approach is that there is no general indication on how to
formulate the function

Another commonly used approach for dealing with non-stationary series is to divide them into quasi-stationary slices and apply the stationary theory to each slice (e.g., Vousdoukas et al., 2016a). This technique is referred to in the text as “stationary on slice” (SS). Although this technique enables the detection of meaningful trends for short return periods, it has the drawback of reducing the size of the sample used for the EVA, implying larger uncertainty in the estimation of long return periods.
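As a minimal sketch of the SS idea (not the implementation used in the cited studies), the following Python snippet groups annual maxima into fixed time slices. The function names, the synthetic series, and the choice of inclusive 30-year calendar slices are illustrative assumptions. It makes the drawback explicit: each slice leaves only about 30 annual maxima for the fit, against 130 for the whole series.

```python
def annual_maxima(samples):
    """Return {year: maximum value} for an iterable of (year, value) pairs."""
    maxima = {}
    for year, value in samples:
        if year not in maxima or value > maxima[year]:
            maxima[year] = value
    return maxima


def slice_annual_maxima(samples, slices):
    """Group annual maxima into quasi-stationary slices.

    slices: list of (first_year, last_year) tuples, bounds inclusive.
    """
    maxima = annual_maxima(samples)
    grouped = {}
    for first, last in slices:
        grouped[(first, last)] = [
            maxima[year] for year in sorted(maxima) if first <= year <= last
        ]
    return grouped


# Hypothetical data: 12 values per year with a weak upward drift.
samples = [
    (year, 1.0 + 0.01 * (year - 1970) + 0.1 * month)
    for year in range(1970, 2100)
    for month in range(12)
]
slices = [(1970, 1999), (2020, 2049), (2070, 2099)]
per_slice = slice_annual_maxima(samples, slices)
sizes = {bounds: len(values) for bounds, values in per_slice.items()}
```

A stationary fit would then be run separately on each entry of `per_slice`.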

This study aims to contribute to the field of non-stationary EVA by introducing the transformed-stationary (TS) extreme value methodology, which decouples the analysis of the non-stationary behavior of the series from the fitting of the extreme value distribution. For this purpose, it introduces a standard methodology to model the variations of the statistical properties of the series.

The remainder of the paper is structured as follows. In Sect. 2, the TS methodology is described and discussed in general, theoretical terms, and implementation details are outlined. In Sect. 3, the validation of the methodology is presented. Section 4 illustrates a comparison with other common approaches for the EVA of non-stationary series, such as EM and SS, for modeling time series characterized by seasonal cycles and time series showing long-term trends. In Sect. 5, the results are discussed and in Sect. 6, the most important conclusions are drawn.

The TS methodology consists of three steps: transforming a non-stationary
time series

The transformation

Equation (2) guarantees that the average of

Once the hypothesis of stationarity of

Using Eqs. (3) and (5) in Eq. (4), we find

The same principle can be applied differentiating

The findings drawn above are general and can be applied also to peak over
threshold (POT) methodologies, because the GPD is formally derived from the
GEV as the conditional probability that an observation beyond a given
threshold

It is worth noting that the TS methodology is neutral for a stationary
series, i.e., the application of this methodology to a stationary series
leads to the same results as a stationary EVA with the same underlying
statistical model. That is because in such case

Often, we would like to model extreme events that show seasonality, for
example with local winter extremes that differ in magnitude from summer
extremes. A simple way to add the seasonal cycle to Eqs. (7)–(9) is by
expressing the trend

The implementation of the TS methodology is illustrated in Fig. 1. The
fundamental input is represented by the series itself, and the core of the
implementation consists of a set of algorithms for the elaboration of the
time-varying trend

TS methodology: block diagram.

In this study, we propose algorithms based on running means and running
statistics (see Sect. 2.2.1). Hence, an important aspect is the definition of
a time window

In this implementation of the TS methodology, the estimation of the long-term statistics is separated from the estimation of the seasonality. This makes it possible to study the long-term variability of the extreme values, as is typically done when studying extremes on an annual basis, as well as the combination of long-term and seasonal variability, to evaluate extremes on a monthly basis.
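The long-term part of such a transformation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of the toolbox: it assumes that the normalization has the form x(t) = [y(t) - T(t)] / S(t), with T(t) a centered running mean and S(t) a centered running standard deviation of the detrended series, and it truncates the window at the edges of the series. The names `running_mean`, `running_std`, `normalize`, and `half_window` are hypothetical, and seasonality is not treated here.

```python
import math
import random


def running_mean(values, half_window):
    """Centered running mean; the window is truncated at the series edges."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def running_std(values, trend, half_window):
    """Centered running standard deviation of the detrended series."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        squares = [(values[j] - trend[j]) ** 2 for j in range(lo, hi)]
        out.append(math.sqrt(sum(squares) / len(squares)))
    return out


def normalize(values, half_window):
    """Return the normalized series together with the trend and the running std."""
    trend = running_mean(values, half_window)
    std = running_std(values, trend, half_window)
    x = [(y - t) / s for y, t, s in zip(values, trend, std)]
    return x, trend, std


# Hypothetical series: linear trend plus noise whose amplitude grows in time.
rng = random.Random(0)
series = [
    2.0 + 0.002 * i + (1.0 + 0.001 * i) * rng.gauss(0.0, 1.0)
    for i in range(2000)
]
x, trend, std = normalize(series, half_window=100)
```

The normalized series `x` should have a mean close to 0 and a standard deviation close to 1 throughout, which is what makes a stationary fit plausible.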

After the estimation of

The final step of the implementation is the back-transformation of the fitted extreme value distribution into a non-stationary one as given by Eqs. (10)–(12) and (25)–(27) for GEV and by Eqs. (19)–(21) and (28)–(30) for GPD.
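For the non-seasonal GEV case, the back-transformation can be sketched as follows, assuming that the transformation is the normalization x(t) = [y(t) - T(t)] / S(t) with S(t) > 0. Under that assumption, a standard location-scale argument leaves the shape parameter unchanged, multiplies the scale by S(t), and maps the location to S(t) times the stationary location plus T(t). The function names and the numerical values are hypothetical, and the shape parameter follows the sign convention in which positive values give a heavy upper tail.

```python
import math


def gev_return_level(shape, scale, location, return_period):
    """GEV return level for a return period expressed in block lengths."""
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(shape) < 1e-12:
        # Gumbel limit of the GEV quantile function.
        return location - scale * math.log(y)
    return location - scale / shape * (1.0 - y ** (-shape))


def back_transform_gev(shape, scale_x, location_x, trend_t, std_t):
    """Map stationary GEV parameters of x to time t, assuming y = std_t * x + trend_t."""
    return shape, std_t * scale_x, std_t * location_x + trend_t


# Hypothetical stationary fit on the normalized series.
shape, scale_x, location_x = 0.1, 0.5, 2.0
# Hypothetical trend and standard deviation at two different times.
for trend_t, std_t in [(3.0, 1.0), (3.4, 1.3)]:
    params_t = back_transform_gev(shape, scale_x, location_x, trend_t, std_t)
    level_100 = gev_return_level(*params_t, return_period=100)
    print(trend_t, std_t, round(level_100, 3))
```

Because the transformation is affine, the return level at time t equals S(t) times the stationary return level plus T(t), so the stationary fit has to be performed only once.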

There are several possible ways of estimating the slowly varying trend and
standard deviation and their seasonality. We propose here a simple
methodology based on a running mean and standard deviation. We formulate the
trend

To estimate the seasonality we perform another running standard deviation

Since there is an inherent error in the estimation of the trend, standard
deviation and seasonality given by Eqs. (32)–(36), we need to estimate this
error and propagate it to the statistical error of the parameters of the
non-stationary GEV and GPD distributions. In general, given a sample

Using Eqs. (37) and (38), we can estimate the error on

To assess the generality of the approach, the TS methodology has been validated on time series of different variables, from different sources and with different statistical properties.

The analysis of annual and monthly maxima has been carried out on time series
of significant wave height at two locations: the first located in the
Atlantic Ocean, west of Ireland (^{®} (Tolman, 2014) forced by the
wind data projections of the RCP8.5 scenario (van Vuuren et al., 2011) of the
CMIP5 model GFDL-ESM2M (Dunne et al., 2012) on a time horizon spanning from
1970 to 2100. This data set is referred to from now on as GWWIII. Here, the
TS methodology is used in order to examine its applicability to climate
change studies. The annual and monthly analyses have been repeated on a
series of water level residuals offshore of the Hebrides Islands (Scotland,

For the annual maxima of the considered series, we further compare the TS methodology with the SS technique as implemented by Alfieri et al. (2015) and Vousdoukas et al. (2016a). For this purpose, we extracted time series from projections of streamflow in the Rhine and Po rivers covering a time horizon from 1970 to 2100 (Alfieri et al., 2015), from now on referred to as JRCRIVER. Also, the two series of significant wave height of west Ireland and Cape Horn extracted from the GWWIII data set have been used in this comparison.

Finally, we compare the TS methodology and the EM for monthly maxima using time series of significant wave height extracted from a 35-year wave hindcast database (Mentaschi et al., 2015) near the locations of La Spezia and Ortona. The analysis of this data set, further referred to as WWIII_MED, focuses on a comparison between seasonal cycles modeled by the two approaches.

The validation of the TS methodology was performed first on the time series
of significant wave height of west Ireland and Cape Horn from the GWWIII data
set. We verified first the non-seasonal transformation given by Eq. (2) and
the time-dependent GEV and GPD given by Eqs. (7)–(9) and (19)–(21),
respectively. By ignoring the seasonality, this formulation is suitable for
finding extremes and peaks on an annual basis. For technical reasons the two
series do not have data in two time intervals, from 2005 to 2010 and
from 2092 to 2095. The impact of the missing data on the analysis is small,
however, especially if we choose a time window

The results of the analysis for the two time series are illustrated in
Figs. 2 and 3. Panel (a) of each figure shows the original time series and
its slowly varying trend and standard deviation. Panel (b) illustrates the
normalized series obtained through the transformation given by Eq. (1),
allowing an evaluation at a glance of the stationarity of the normalized
series. The mean and the standard deviation of the normalized series plotted
in panel (b) are 0 and 1, respectively. Higher order statistics such as
skewness and kurtosis are included in the graphics to support the assumption
of stationarity of the normalized series. From the normalized time series we
extracted the annual maxima and estimated the corresponding non-stationary
GEV as given by Eqs. (7)–(9) (see Figs. 2c and 3c). Moreover, we performed a
POT selection of the extreme events on the normalized series. The threshold
was defined in order to have on average five events per year, following
Ruggiero et al. (2010), corresponding for both of the series to the
97th percentile. From the resultant POT sample we estimated the corresponding
non-stationary GPD as given by Eqs. (19)–(21) (see Figs. 2d and 3d). In
Fig. 2c and d and Fig. 3c and d, the shape parameters

Long-term analysis of the projections of significant wave height in
Cape Horn:

Long-term analysis of the projections of significant wave height in
Cape Horn:

It is worth noting that for both of the considered series, the statistical
mode of GEV and GPD grows faster in time than the slowly varying
trend

Average error components for the long-term analysis of the projections of significant wave height extracted at west Ireland and Cape Horn, for non-stationary GEV and GPD. The error is dominated by the component due to the stationary MLE.

Average error components for the seasonal analysis of the projections of significant wave height extracted at west Ireland and Cape Horn, for non-stationary GEV and GPD. The error is dominated by the component due to the stationary MLE.

Seasonal analysis of the projections of significant wave height in
west Ireland:

Seasonal analysis of the projections of significant wave height in
Cape Horn:

The impact of the statistical error in the slowly varying trend and the
standard deviation on the uncertainty of the distribution parameters has
been examined using Eqs. (48)–(50) and (51)–(53), which, for the
non-seasonal analysis, reduce to

The seasonal formulation of the approach is suitable for estimating extreme value distributions on a monthly basis. Hence, we applied Eq. (24) to estimate the normalized series, then fitted a stationary GEV to the monthly maxima by means of MLE and back-transformed it into a non-stationary GEV through Eqs. (25)–(27). It is worth stressing that for the stationary MLE, the entire normalized series was used, covering a time horizon of 130 years. For the GPD, we selected the threshold in order to have on average 12 events per year, corresponding to the 93rd percentile for both series. Results are displayed in Fig. 4 for the location of west Ireland and in Fig. 5 for Cape Horn. To make the seasonal cycle distinguishable in these figures, we plotted only a slice of 5 years from 2085 to 2090. The meaning of the four panels in Figs. 4 and 5 is the same as in Figs. 2 and 3. The non-stationary extreme value distribution estimated for the location of west Ireland presents a strong seasonal cycle with higher and more broad-banded extremes during winter. For Cape Horn, the seasonal cycle is weaker, with the extremes of significant wave height slightly lower during the local summer. The estimated PDFs for the seasonal GEV and GPD are significantly lower than those estimated for the non-seasonal analysis because in the seasonal analysis we consider monthly extremes, while in the non-seasonal one we consider annual extremes.
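The choice of a POT threshold that yields a target average number of events per year can be sketched as follows. This is an illustrative procedure, not necessarily the one used for the case studies: it scans a set of candidate percentiles, counts the peaks above each candidate threshold after a simple declustering step (exceedances separated by no more than `min_gap` samples are counted as one event), and keeps the percentile whose event rate is closest to the target. All function names, the declustering rule, and the synthetic series are assumptions.

```python
import random


def percentile(values, q):
    """Percentile q (0-100), by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q / 100.0 * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def count_peaks(values, threshold, min_gap):
    """Count clusters of exceedances; a gap larger than min_gap starts a new cluster."""
    count, last = 0, None
    for i, value in enumerate(values):
        if value > threshold:
            if last is None or i - last > min_gap:
                count += 1
            last = i
    return count


def select_threshold(values, n_years, target_per_year, min_gap, candidates):
    """Return (percentile, threshold, events per year) closest to the target rate."""
    best = None
    for q in candidates:
        threshold = percentile(values, q)
        rate = count_peaks(values, threshold, min_gap) / n_years
        if best is None or abs(rate - target_per_year) < abs(best[2] - target_per_year):
            best = (q, threshold, rate)
    return best


# Hypothetical daily series: 10 years of Gaussian noise, seeded for repeatability.
rng = random.Random(42)
series = [rng.gauss(0.0, 1.0) for _ in range(3650)]
q, threshold, rate = select_threshold(
    series, n_years=10, target_per_year=12, min_gap=3, candidates=range(90, 100)
)
```

In the TS framework the selection would be applied to the normalized series, so that a single threshold corresponds to a time-varying threshold on the original series.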

It is worth stressing that in the study of the monthly maxima, the long-term trend is also estimated even if it cannot be appreciated in Figs. 4 and 5 due to the short time horizon represented.

Table 2 reports the components of the statistical error due to the
uncertainty in the estimation of the seasonality, together with the
components of the stationary MLE. The error components relating to the
uncertainty in the estimation of

To verify the performance of the TS methodology on a series from a different
source, of a different size, and with different statistical characteristics,
we tested it on a series of water level residuals extracted from the
JRCSURGES data set for an off-shore location of the Hebrides Islands,
Scotland (

An interesting aspect is that the estimated standard deviation

A comparison was carried out between the TS methodology and the SS technique, consisting of a stationary analysis on quasi-stationary slices of data. This analysis was carried out on river discharge projections for the Po and the Rhine extracted from the JRCRIVER data set and on the projections of significant wave height extracted from the GWWIII data set for the locations of west Ireland and Cape Horn. The TS methodology was applied with a time window of 30 years to estimate a non-stationary GPD of annual maxima. The SS technique was carried out using a GPD approach on time slices of 30 years from 1970 to 2000, 2020 to 2050, and 2070 to 2100. For both methodologies, the threshold was selected to have on average five peaks per year.

Long-term analysis of the residual water levels modeled at the
Hebrides Islands:

Time-varying standard deviation

Return level plots for the discharge of the Rhine River at its mouth, TS (black continuous line), 95 % confidence interval for the TS methodology (green band), and SS methodology (black dashed line) for the time slices 1970–2000, 2020–2050, and 2070–2100.

Return levels modeled by the TS methodology (

Long-term variations of the extremes of projected river discharge for the Rhine and Po rivers, and of projected significant wave height for west Ireland and Cape Horn: normalized bias (NBI) and maximum difference (max diff) between the return levels estimated with the TS methodology and the SS approach, and mean 95 % confidence interval amplitude expressed as percentage of the return level, for return periods of 5, 10, 30, 100, and 300 years.

Results are illustrated in Fig. 8, where the
return levels of the projected discharge of the Rhine are shown for three
time slices. In Fig. 8 the continuous black line and the green band
represent the return levels and the 95 % confidence interval estimated by
the TS methodology, while the dashed black line represents the return levels
estimated by the stationary EVA on the considered slice (labeled in the
legend as SS). The return levels estimated for short return periods by the
two methodologies are close, while they tend to diverge for high return
periods. This fact is also evident from Fig. 9,
where the return levels estimated by the two methodologies are plotted
against each other for the river discharge of the Rhine and the Po and for
the significant wave height of west Ireland and Cape Horn. We can see that
for the analyzed time series the two methodologies are in good agreement for
return periods below 30 years, while they diverge for larger return periods.
This divergence is quantified in
Table 3, which reports the normalized bias NBI of
the return levels of the two methodologies, defined as

Section 3 shows that the TS methodology is mathematically equivalent to a
particular implementation of the EM methodology as described for example by
Coles (2001), Izaguirre et al. (2011), Menéndez et al. (2009), and Sartini
et al. (2015). For the sake of completeness, we show here the results of a
comparison between the performances of TS and of a different formulation of
the EM methodology. In its formulation, the parameters of the non-stationary
GEV of the monthly maxima are expressed as

Seasonal cycle estimated by TS methodology and by the EM for the series of significant wave
height of La Spezia and Ortona. The red continuous (dashed) line represents
the location parameter

Return levels for La Spezia and Ortona for the month of January, estimated by the TS methodology (black continuous line) and by the EM (black dashed line). The green area represents the 95 % confidence interval estimated by the TS approach.

In the comparison, the EM and the seasonal TS methodology (GEV only) were
applied to the same series of significant wave heights relative to the
WWIII_MED data set described in Sect. 2.3. For the transformed-stationary
approach, a 10-year time window was used for the computation of the long-term trend.
The results of the two methodologies are similar, with a roughly flat trend
and strong seasonal pattern. The comparison of the seasonal cycles estimated
by the two techniques is represented in Fig. 10
for the two series. Here, the continuous red and green lines are the
location and scale parameters (

The GEV parameters estimated by the two approaches are in good agreement.
The small differences have a relatively small impact on the return levels, as
one can see in Fig. 11, where the return levels
estimated by the two methodologies for the month of January are plotted. For
both series, the return levels estimated by EM lie within the 95 %
confidence interval estimated by TS. Table 4
reports the values of normalized bias (NBI) between the return levels
estimated by TS and EM, defined as in Eq. (61), and
the mean 95 % confidence interval amplitude expressed as a percentage of
the return level. In Table 4 the values of NBI are reported for the four
seasons for return periods of 5, 10, 30, 50, and 100 years, for both La
Spezia and Ortona. In the definition of seasons that is used, winter starts
on 1 December, spring on 1 March, summer on 1 June, and
autumn on 1 September. We did not report return levels for return periods
greater than 100 years because the data cover only 35
years; hence the estimates for such periods are inaccurate for both
methodologies. The average deviation between RL

Normalized bias between the return levels estimated by the TS methodology and the EM methodology for the estimation of the seasonal variations, and mean 95 % confidence interval amplitude expressed as percentage of the return level, for return periods of 5, 10, 30, 50, and 100 years, for the four seasons, for significant wave height in La Spezia and Ortona.

Extreme value analysis is a subject of broad interest not only for earth science but also for other disciplines such as economics and finance (e.g., Gençay and Selçuk, 2004; Russo et al., 2015), sociology (e.g., Feuerverger and Hall, 1999), geology (e.g., Caers et al., 1996), and biology (e.g., Williams, 1995), among others. As a consequence, non-stationarity of signals is a common problem (e.g., Gilleland and Ribatet, 2014). In this respect, it is important to stress that the TS methodology is general, and its applicability only requires the stationarity of the transformed signal. Therefore, even if in this study the technique was applied only to series related to earth science, it can be employed in all disciplines dealing with extremes.

Given that the extreme value statistical model is an important component of
applications such as those discussed here (e.g., Coles, 2001; Hamdi et al.,
2014), it is important to stress that the theory was formulated in a way that
is not restricted to GEV and GPD, but can be extended to any statistical
model for extreme values. In particular, since the GEV distribution is a
generalization of the Gumbel, Fréchet, and Weibull statistics, TS can be
reformulated separately for these three distributions, as well as for the
commonly used

The transformation consists of a simple, time-varying normalization of the signal through the estimation of the trend, the slowly varying standard deviation, and the seasonality, and it allows different types of analysis. The first product of the methodology is the estimation of the extreme values of the signal. Next, the TS approach enables the analysis of long-term variability. As an example, it was shown to be useful in relating the long-term trend of the signal with the NAO climatic index (see Sect. 3.3). Finding correlations of natural parameters with climatic indices is a theme of common interest in earth science, especially in view of climate change (e.g., Barnard et al., 2015; Dodet et al., 2010; Plomaritis et al., 2015). If a time series is correlated to a climatic index in the long term, an advantage of the TS methodology is that it can model extremes correlated to the index without considering it explicitly in the computation. Finally, the TS methodology allows describing the seasonal variability of extremes, which is also critical for climate studies (e.g., Sartini et al., 2015; Menéndez et al., 2009; Méndez et al., 2006).

As shown in Sect. 4, the TS methodology has advantages over SS
(e.g., Vousdoukas et al., 2016a) and EM (e.g., Cheng et al., 2014; Gilleland
and Katz, 2016; Izaguirre et al., 2011; Méndez et al., 2006; Menéndez
et al., 2009; Mudersbach and Jensen, 2010; Russo et al., 2014; Sartini et
al., 2015), both in terms of accuracy of the results and its conceptual and
implementation simplicity. In particular, in the comparison with the SS
methodology for long-term variability, the return levels estimated by the two
techniques are similar for return periods for which the SS is accurate. The
use of the whole time horizon of the series represents a major advantage of
TS over SS because it allows more accurate estimations of the return levels
associated with long return periods. A conceptual advantage of the TS
methodology over EM is that it decouples the detection of the non-stationary
behavior of the series from the fitting of the extreme value distribution.
The study of the time-varying statistical features of the series is delegated
to the transformation, and takes place before the fitting of the extreme
value distribution. This fact provides a simple diagnostic tool to evaluate
the validity of the model applied to a particular series: the model is valid
if the transformed series is stationary. This is useful for validating the
output of the approach. Moreover, the decoupling simplifies both the
detection of non-stationary patterns and the fitting of the extreme value
distribution. In particular, the detection of non-stationary patterns can be
accomplished by means of simple statistical techniques such as low-pass
filters based on the running mean and standard deviation, and the fitting of
the extreme value distribution can be obtained through a stationary MLE with
a small number of degrees of freedom that is easier to implement and control.
Moreover, unlike many implementations of EM (e.g., Cheng et al., 2014;
Gilleland and Katz, 2016; Izaguirre et al., 2011; Méndez et al., 2006;
Menéndez et al., 2009; Sartini et al., 2015; Serafin and Ruggiero, 2014),
the detection of non-stationary patterns described in this paper does not
require an input parametric function

It is worth remarking that the EM implemented, for example, using Eq. (62), is able to model a shape parameter varying in time, unlike the TS using the transformation given by Eq. (1). While in principle this is a weak point of the TS methodology described here, a constant shape parameter is a reasonable assumption for most cases, because in general simple models should be preferred to complex ones (e.g., Coles, 2001). In particular, using EM, the Akaike criterion (Akaike, 1973), which favors simple models with fewer degrees of freedom, often selects models with a fixed shape parameter (e.g., Sartini et al., 2015; Menéndez et al., 2009). Moreover, the finding that a non-stationary GEV always corresponds to a transformation of the non-stationary time series into a stationary one, shown in Appendix A, suggests that a generalization of the TS methodology is possible in order to include models with time-varying shape parameters.
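The way the Akaike criterion penalizes additional degrees of freedom can be illustrated with a short Python sketch. The criterion is AIC = 2k - 2 ln L, where k is the number of fitted parameters and L is the maximized likelihood, and the model with the lowest AIC is preferred. The two candidate models and their log-likelihood values below are hypothetical numbers chosen for illustration, not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical candidates as (label, maximized log-likelihood, number of parameters).
candidates = [
    ("constant shape", -100.0, 5),
    ("time-varying shape", -99.2, 7),
]
scores = {label: aic(log_lik, k) for label, log_lik, k in candidates}
preferred = min(scores, key=scores.get)
```

In this example the two extra parameters raise the log-likelihood by only 0.8, which does not offset the penalty of 4, so the model with a constant shape parameter is preferred.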

This paper describes the TS methodology for non-stationary extreme value analysis. The main assumption underlying this approach is that if a non-stationary time series can be transformed into a stationary one to which the stationary EVA theory can be applied, then the result can be back-transformed into a non-stationary extreme value distribution through the inverse transformation. The proposed methodology is general and, even if in this study we applied it only to series related to earth science, it can be employed in all disciplines dealing with EVA. Moreover, though we discussed it only for GEV and GPD, it can be extended to any other statistical model for extremes.

As a transformation we proposed a simple time-varying normalization of the signal estimated by means of a time-varying mean and standard deviation. This simple transformation was also adapted to describe the seasonal variability of the extremes. In addition, it was proven to provide a comprehensive model for non-stationary GEV and GPD distributions with a constant shape parameter, which means that it can be applied to a wide range of non-stationary processes. The formal duality between the TS and more established approaches has also been proven, suggesting that a complete generalization of the TS approach would allow including models with a time-varying shape parameter.

The methodology was tested on time series of different variables, sizes, and statistical properties. An evaluation of the statistical error associated with the transformation showed that, for the examined series, this is negligible with respect to the error associated with the stationary MLE (the squared error is 2 orders of magnitude smaller) and to that related to the estimation of the threshold for GPD.

The TS methodology was compared with a stationary EVA applied on quasi-stationary slices of non-stationary series (i.e., SS) for the estimation of the long-term variability of extremes, and with the established method (EM) for non-stationary EVA. The return levels estimated by TS are shown to be comparable to those obtained by these two methodologies. However, the TS approach has advantages over both SS and EM. With respect to SS, the TS uses the whole time series for fitting the extreme value distribution, guaranteeing a more accurate estimation at larger return periods. With respect to EM, the TS decouples the detection of the non-stationarity of the series from the fit of the extreme value distribution, which simplifies both steps of the analysis. In particular, the fit of the distribution can be accomplished using a simple MLE with few degrees of freedom, which is easy to implement and control. The detection of non-stationarity can be performed by means of easily implemented and fast low-pass filters, which do not require as input any parametric function for the variability. This makes the methodology well suited for massive applications where the simultaneous evaluation of several time series is required.

An implementation of the TS methodology has been developed in an open-source
MATLAB toolbox (tsEva), which is available at

The source code used to implement the case studies
presented in this work is available at

Here, we show that if the extremes of a time series

To prove this, we expand relationship GEV

In the particular case of

Equation (A2) alone is not enough to formulate a fully generalized TS
approach, because in Eq. (A2) the non-stationary GEV parameters
[

The authors would like to thank Simone Russo of the JRC and Francesco Fedele
of the GIT for their valuable suggestions, and Niall McCormick of the JRC for
the careful review of this manuscript. This work was co-funded by the JRC
exploratory research project Coastalrisk and by the European Union Seventh
Framework Programme FP7/2007-2013 under grant agreement no. 603864 (HELIX:
High-End cLimate Impacts and eXtremes;