On the return period and design in a multivariate framework

. Calculating return periods and design quantiles in a multivariate environment is a difﬁcult problem: this paper tries to make the issue clear. First, we outline a possible way to introduce a consistent theoretical framework for the calculation of the return period in a multi-dimensional environment, based on Copulas and the Kendall’s measure. Sec-ondly, we introduce several approaches for the identiﬁcation of suitable design events: these latter quantities are of utmost importance in practical applications, but their calculation is yet limited, due to the lack of an adequate theoretical environment where to embed the problem. Throughout the paper, a case study involving the behavior of a dam is used to illustrate the new concepts outlined in this work.


Introduction
The notion of Return Period (hereinafter, RP) is frequently used in hydrology (as well as in water resources and civil engineering, and more generally in geophysical and environmental sciences) for the identification of dangerous events, and provides a means for rational decision making (for a review, see Singh et al., 2007, and references therein).
The traditional definition of the RP is as "the average time elapsing between two successive realizations of a prescribed event", which clearly has a statistical base.Equally important is the related concept of design quantile, usually defined as "the value of the variable(s) characterizing the event associated with a given RP".In engineering practice, the choice of the RP depends upon the importance of the structure, and the consequences of its failure.For example, the RP of a dam design quantile is usually 1000 years or more (Midttomme et al., 2001), while for a sewer it is about 5-10 years (Briere, 1999).
Correspondence to: G. Salvadori (gianfausto.salvadori@unisalento.it)While in the univariate case the design quantile is usually identified without ambiguity -and widely used in the engineering practice (Chow et al., 1988) -in the multivariate one this is not so.Indeed, the identification problem of design events in a multivariate context is of fundamental importance, but of troublesome nature.Recently, several efforts have been spent on the issues of multivariate design and quantiles (see, e.g.Serfling, 2002;Belzunce et al., 2007;Chebana andOuarda, 2009, 2011b;Chaouch and Goga, 2010, and references therein; for a methodology to identify multivariate extremes by using depth functions see Chebana and Ouarda, 2011a).Here we address the following crucial question: "How is it possible to calculate the critical design event(s) in the multivariate case?" Below, we outline a suitable approach in order to provide consistent answers.
As we shall show later, the calculation of the RP is strictly related to the notion of Copula.The use of copulas in environmental sciences is recent and rapidly growing.Shortly, a multivariate copula C is a joint distribution on I d = [0, 1] d with Uniform margins.The link between a multivariate distribution F and the associated d-dimensional copula C is given by the functional identity stated by Sklar's Theorem (Sklar, 1959): (1) for all x ∈ R d , where the F i 's are the univariate margins of F .If all the F i 's are continuous, then C is unique.Most importantly, the F i 's in Eq. ( 1) only play the role of (geometrically) re-mapping the probabilities induced by C on the subsets of I d onto suitable subsets of R d , without changing their values: viz., the dependence structure modeled by C plays a central role in tuning the probabilities of joint occurrences.
In fact, under weak regularity conditions, any point x ∈ R d can be uniquely re-mapped onto u ∈ I d (and vice-versa) via the Probability Integral Transform: (u 1 , ..., u d ) = (F 1 (x 1 ), ..., F d (x d )). (2) Published by Copernicus Publications on behalf of the European Geosciences Union.
G. Salvadori et al.: Return period and design For a thorough theoretical introduction to copulas see Joe (1997); Nelsen (2006); for a practical approach see Salvadori et al. (2007); Jaworski et al. (2010).In order to avoid troublesome situations, hereinafter we shall assume that F is continuous (but not necessarily absolutely continuous), and strictly increasing in each marginal: these regularity constraints are rather weak, and satisfied by the majority of the distributions used in applications.Clearly, also pathological cases can be carried out, but they require suitable techniques that go beyond the scope of this work.
Later we shall use the Kendall's distribution (or measure) function K C : I → I (Genest andRivest, 1993, 2001) given by where t ∈ I is a probability level, W = C(U 1 , ..., U d ) is a univariate random variable (hereinafter, r.v.) taking value on I , and the U i 's are Uniform r.v.s on I with copula C. Note that Eq. ( 3) practically measures the probability that a random event will appear in the region of I d defined by the inequality C(u) ≤ t -see also Genest and Rivest (2001); Nappo and Spizzichino (2009).Thus, as we shall see, K C turns out to be a fundamental tool for calculating a copula-based RP for multivariate events.Unfortunately, at present no general analytical expressions of K C are known -except for special cases, like the one of bivariate Extreme Value copulas (Ghoudi et al., 1998), and some Archimedean copulas (McNeil and Nešlehová, 2009) -and it is necessary to resort to simulations (see, e.g.Algorithm 1 outlined later).
The paper is organized as follows.In Sect. 2 we first illustrate the case study.In Sect. 3 we reconsider a previously introduced notion of RP in a multivariate environment, and compare it with other approaches.In Sect. 4 we show how to calculate the corresponding quantile.Then, in Sect. 5 we present two strategies to calculate critical design events in a multivariate context.Finally, in Sect.6 we discuss the results outlined in the paper, and draw some conclusions.

The case study
Although this work is of methodological nature, we feel important to illustrate with practical examples the new concepts introduced.For this reason, we first present the case study that will be used throughout the paper.
The data are collected at the Ceppo Morelli dam, and are essentially the same as those investigated in De Michele et al. (2005), to which we make reference for further details.The dam, completed in 1929, is located in the valley of Anza catchment, a sub-basin of the Toce river (Northern Italy), and was built to produce hydroelectric energy.The dam is characterized by a small water storage of about 0.47 × 10 6 m 3 .The minimum level of regulation is 774.75 m a.s.l., while the maximum is 780.75 m a.s.l.The maximum water level is at 782.5 m a.s.l., and the dam crest level is at 784 m a.s.l.The dam has an uncontrolled spillway (84 m long) at 780.75 m a.s.l., and also intermediate and bottom outlets (the latter ones are obstructed by river sediments).
In De Michele et al. (2005), "undisturbed" flood hydrographs incoming the reservoir were fixed by using the inverse reservoir routing, the water levels in the reservoir, and the operations on the controlled outlets.Then, maximum annual flood peaks Q and volumes V were identified and selected for 49 years, from 1937 to 1994.As a result of a thorough investigation, almost all of the occurrence dates of the Q's and the V 's were the same: i.e. they happened during the same flood event.
As an improvement over De Michele et al. (2005), beyond the pair (Q, V ), also the initial water level L in the reservoir before the flood event (Q, V ) is considered in this work, in order to analyse the triplet (Q, V , L) of practical interest: in fact, on the one hand L represents the starting state of the dam; on the other hand, (Q, V ) is the hydrologic "forcing" to the structure.Clearly, there are physical reasons to assume that L is independent of (Q, V ) -see also below.The sample mean of L is about 780.44 m a.s.l., with a sample standard deviation of about 1 m.The small variability of L with respect to its range (here [774.75, 780.75] is the regulation range), is mainly due to the management policy of the reservoir: the target of the dam manager is to keep a high water level, in order to get the maximum benefit concerning the production of electric energy.
Using the pair (Q, V ), it is possible to calculate the associated flood hydrograph with peak Q and volume V , once the shape of the hydrograph has been chosen.As first approximation, it is possible to consider a triangular shape, where the base time is equal to T b = 2 V /Q, the time of rise equals T r = T b /2.67, and the time of recession is equal to 1.67 T r , (see Soil Conservation Service, 1972 andChow et al., 1988, p.229 -for a different approach see Serinaldi and Grimaldi, 2011).Consequently, the flood hydrograph q is given by Later, in Sect.5, we shall test the behavior of the dam subject to selected hydrographs.More particularly, we shall first operate the reservoir routing of the flood hydrograph (see, e.g.Bras, 1990, p.475-478 andZoppou, 1999) considering as outlet only the uncontrolled spillway, and then we shall check whether or not the reservoir level exceeds the crest level of the dam.
In Fig. 1 we show the trivariate plot of the available observations, as well as the fits of the marginal distributions.However, we shall not insist on this point, being of secondary importance with respect to the actual methodological target of the paper.The GEV law is used to model the statistics of both Q and V , since these are annual maxima: the estimates of the parameters are reported in Table 1.The fits are valuable, as they passed standard goodness-of-fit tests (namely, Kolmogorov-Smirnov and Anderson-Darling -see, e.g.Kottegoda and Rosso, 1997) at all usual levels (viz., 1 %, 5 %, and 10 %).Instead, the behavior of the variable L is quite tricky (as explained above, the water level is arbitrarily fixed by the dam manager): for this reason, its law is calculated via a non-parametric Normal Kernel estimation (Bowman and Azzalini, 1997).As a result, also in this case the Kolmogorov-Smirnov test is passed at all usual levels.
The trivariate plot of the observations, as shown in Fig. 1, is the first step usually carried out by practitioners to investigate the multivariate behavior of the phenomenon.However, we want to stress that this type of graph only provides partial information, and should not be used to draw rough conclusions about the dependence structure of (Q, V , L)see below, and also Genest and Favre (2007) for a thorough review.
In order to investigate the joint behavior of the variables (Q, V , L), as is typical in copula analysis, we shall use the normalized ranks to carry out a non-parametric study.The trivariate rank-plot shown in Fig. 2 provides some rough indications about the global dependence structure (i.e. the copula) linking the three variables (Q, V , L).
As already mentioned above, there are physical reasons to assume that L is independent of (Q, V ): the rank-plots shown in Fig. 2 support this fact.Indeed, the sample is rather uniformly sparse in both the (Q, L) and (V , L) planes.Also,  [0.37, 0.86] [1.10, 2.11] [1.26, 2.19] the estimates of the Kendall's τ and the Spearman's ρ are not statistically significant (as confirmed by the corresponding p-values), and formal tests of independence suggest to accept the hypothesis that L is independent of (Q, V ).On the contrary, the variables (Q, V ) are significantly positively associated, and thus Q and V are not independent: the estimates of both the Kendall's τ and the Spearman's ρ are large, and the corresponding p-values are negligible (see the values reported in Fig. 2).
As in De Michele et al. (2005), a Gumbel copula was used to model the dependence between Q and V , with parameter θ ≈ 3.1378, calculated via the inversion of the Kendall's τ .The ability of this copula to model the available bivariate data is checked via multivariate goodness-of-fit tests (Berg,  2009; Genest et al., 2009;Kojadinovic et al., 2011): the resulting large p-values indicate that the Gumbel copula C QV cannot be rejected at all standard levels.As a matter of facts, the analysis of the (Q, V ) rank-plot in Fig. 2 shows a significant association between these two variables in the upperright corner of the unit square: indeed, the extreme pairs practically lie on the main diagonal.Thus, it is not a surprise that the fitted Gumbel copula, having a large upper tail dependence coefficient λ Upp ≈ 0.75 (Nelsen, 2006;Salvadori et al., 2007) is suitable for modeling the dependence structure of the pair (Q, V ).In passing, note that C QV is an Extreme Value copula (Nelsen, 2006) Given the previous results, since L can be assumed to be independent of (Q, V ), it is immediate to construct a suitable trivariate copula C QV L to model the dependence structure of the triplet (Q, V , L): where (u, v, w) ∈ I 3 .As above, the ability of this copula to model the trivariate data is properly checked, and the resulting large p-values indicate that it cannot be rejected at all standard levels.In passing, note that also C QV L is an Extreme Value copula.In addition, since C QV is Archimedean (Nelsen, 2006), then C QV L is a particular case of a "nested" Archimedean copula (Joe, 1997;Grimaldi and Serinaldi, 2006;Serinaldi and Grimaldi, 2007;Härdle and Okhrin, 2010;Hering et al., 2010).However,

Return period in a multivariate framework
In order to provide a consistent theory of RP's in a multivariate environment, it is first necessary to precisely define the abstract framework where to embed the question.Preliminary studies can be found in Salvadori ( 2004 (2010).Hereinafter, we shall consider as the object of our investigation a sequence X = {X 1 , X 2 , ...} of independent and identically distributed d-dimensional random vectors, with d > 1: thus, each X i has the same multivariate distribution F as of the random vector X ∼ F = C(F 1 , ..., F d ) describing the phenomenon under investigation, with suitable marginals F i 's and d-copula C. For example, we may think of a set of flood observations given by the pairs of non-independent r.v.'s Flood Peak -Flood Volume, joined by the copula C. The case of a non-stationary sequence X is rather tricky, and will be discussed in a future work.
In applications, usually, the event of interest is of the type {X ∈ D}, where D is a non-empty Borel set in R d collecting all the values judged to be "dangerous" according to some suitable criterion.Note that the Borel family includes all the sets of interest in practice (like, e.g. the intervals (−∞, x 1 ), (x 1 , x 2 ), (x 2 , ∞), as well as the corresponding multivariate versions).Let µ > 0 be the average inter-arrival time of the realizations in X (viz., µ is the average time elapsing between X i and X i+1 ).Following, e.g.Embrechts et al. (2003), and given the fact that the sequence X is i.i.d.(and, thus, stationary), the univariate r.v.'s {B i = I D (X i )} form a Bernoulli process (where I is an indicator set function), with positive probability of "success" p D given by where we assume that 0 < p D < 1.Then, it makes sense to calculate the first random time A D that the sequence B = {B 1 , B 2 , ...}, generated by X , takes on the value 1 (viz., the first random time that X enters D): Clearly, the r.v.A D /µ follows a Geometric distribution with parameter p D , and therefore the expected value of A D is Given the well known "memoryless property" of the Geometric distribution, and the features of the Bernoulli process (see, e.g.Feller, 1971), it is clear that µ D also corresponds to the average inter-arrival time between two successive realizations of the event {X ∈ D}.Evidently, µ D ranges in [µ, + ∞): for example, if annual maxima are investigated, then µ = 1 year, and hence µ D = 1/p D ≥ µ.We are now ready to introduce a consistent notion of RP.DEFINITION 1.The RP associated with the event {X ∈ D} is given by µ D = µ/P(X ∈ D).
Definition 1 is a very general one: the set D may be constructed in order to satisfy broad requirements, useful in different applications.Indeed, most of the approaches already present in literature are particular cases of the one outlined above.
As a univariate example, let X be a r.v. with distribution F X .In order to identify a dangerous region, traditionally a prescribed critical design value x * is used.Then, D (or, better, D x * ) contains all the realizations that are judged to be "more dangerous" than x * .For instance, in hydrology, if droughts are of concern, x * may represent a small value of river flow, and the dangerous realizations of interest are those for which X ≤ x * (viz., D x * = [0, x * ]).Instead, if floods are of concern, x * may indicate a large value of river flow, and the dangerous realizations of interest are those for which According to Definition 1, the corresponding RP's are µ/F X (x * ) in the former case, and µ/(1 − F X (x * )) in the latter one.
It is important to stress that the RP is a quantity associated with a proper event.However, with a slight abuse of language, we may also speak of "the RP of a realization" (viz., x * in the example given above), meaning in fact "the RP of the event {X belongs to the dangerous region D x * identified by the given realization x * }".Indeed, in a univariate framework, the assignment of x * uniquely specifies the corresponding region D x * .
Actually, also in a multivariate framework it is possible to associate a given multi-dimensional realization x * ∈ R d with a dangerous region D x * ⊂ R d .As an illustration, consider the two different bivariate dangerous regions constructed in Salvadori (2004); Salvadori and De Michele (2004).In these papers the joint behavior of the vector (X, Y ) ∼ F = C(F X , F Y ) was analysed: for instance, in terms of variables of hydrological interest, think of the pairs flood peak-volume, or storm intensity-duration.In particular, great attention was paid to the following two sets: }, where at least one of the components exceeds a prescribed threshold (roughly, it is enough that one of the variables is too large);

("AND" case) D
}, where both the components exceed a prescribed threshold (roughly, it is necessary that both variables are too large).
Here z * = (x * , y * ) is a prescribed vector of thresholds, and ∨, ∧ are the "(inclusive) OR" and "AND" operators.
In this work we follow a different approach.The idea stems from the possibility to write, in the univariate case, the dangerous region D x * in two equivalent ways: either as Clearly, the same rationale holds by considering as a dangerous region the set D x * = {x : x≤x * }, which may be of interest, e.g. for the study of droughts.Then, by considering the above formulation as given in terms of the distribution function F X , it is clear how it can be extended in a natural way to the multidimensional case, as we shall illustrate below.First of all we need to introduce the following notion.DEFINITION 2. Given a d-dimensional distribution F = C(F 1 , ..., F d ) and t ∈ (0, 1), the critical layer L F t of level t is defined as Clearly, L F t is the iso-hyper-surface (having dimension d − 1) where F equals the constant value t: thus, L F t is a (iso)line for bivariate distributions, a (iso)surface for trivariate ones, and so on.Evidently, for any given x ∈ R d , there exists a unique critical layer L F t supporting x: namely, the one identified by the level t = F (x).Note that, thanks to Eq. ( 2), there exists a one-to-one correspondence between the two iso-hyper-surfaces The critical layer L F t partitions R d into three nonoverlapping and exhaustive regions: 2. L F t , the critical layer itself; Practically, at any occurrence of the phenomenon, only three mutually exclusive things may happen: either a realization of X lies in R < t , or over L F t , or it lies in R > t .Note that all these three regions are Borel sets.
Thanks to the above discussion, it is now clear that the following (multivariate) notion of RP is meaningful, and coincide with the one used in the univariate framework.DEFINITION 3. Let X be a multivariate r.v. with distribution F = C(F 1 , ..., F d ).Also, let L F t be the critical layer supporting a realization x of X (i.e.t = F (x)).Then, the RP associated with x is defined as 2. for the region In the sequel we shall concentrate only upon R > t : the corresponding formulas for R < t could easily be derived.Note that R > t may be of interest, e.g. when floods are investigated, while R < t may be appropriate if droughts are of concern.Now, in view of the results outlined in Nelsen et al. (2001Nelsen et al. ( , 2003)), it is immediate to show that where ν F is the probability measure induced by F over R d , and K C is the Kendall's distribution function associated with C (see Eq. 3 and the ensuing discussion).Clearly, T > x is a function of the critical level t identified by the relation t = F (x).It is then convenient to denote the above RP via a special notation as follows.DEFINITION 4. The quantity κ x = T > x is called the Kendall's RP of the realization x belonging to L F t (hereinafter, KRP).An advantage of the approach outlined in this work is that realizations lying over the same critical layer do always generate the same dangerous region.Evidently, this is not the case considering the "OR-AND" approach discussed above.
Furthermore, in the "AND" case, it may happen that realizations not lying in the dangerous region D z * of interest have a RP larger than the one of z * .More specifically, as graphically illustrated in Fig. 3, for a given realization z * lying on the isoline of level t ∈ (0, 1) (where F ≡ t), the dangerous region D ∧ z * is given by the shaded area.However, given another realization w * , lying on the isoline of level s > t, the corresponding RP may be larger than the one of z * , but w * does not belong to D ∧ z * .A similar rationale also holds for the "OR" case.Instead, in the approach outlined in this work, all the realizations y having a KRP κ y < κ x must lie in R < t , whereas all those y having a KRP κ y > κ x must lie in R > t -clearly, all the realizations lying over L F t share the same KRP κ x .
For the sake of convenience, we report below the algorithm explained in Salvadori and De Michele (2010) for the calculation of K C (see also Genest and Rivest, 1993;Barbe et al., 1996), which yields a consistent Maximum-Likelihood estimator of K C .Here we assume that the copula model is well specified, i.e. it is available in a parametric form.ALGORITHM 1. Calculation of the Kendall's measure function K C .

For
Hydrol.Earth Syst.Sci., 15, 3293-3305, 2011 www.hydrol-earth-syst-sci.net/15/3293/2011/As an illustration, in Fig. 4a we plot an estimate of the function K C associated with the copula C QV L : here Algorithm 1 is used, running a simulation of size m = 5 × 10 7 .Also shown is the empirical estimate of K C calculated by using the available observations: the horizontal patterns are simply due to the small sample size.

Quantiles associated with the KRP
Traditionally, in the univariate framework, once a RP (say, T ) is fixed (e.g. by design or regulation constraints), the corresponding critical probability level p is calculated as 1 − p = P(X > x p ) = µ/T , and by inverting F X it is then immediate to obtain the quantile x p = F (−1) X (p), which is usually unique.Then, x p is used in practice for design purposes and rational decision making.As shown below, the same approach can also be adopted in a multivariate environment (to be compared with Belzunce et al., 2007).DEFINITION 5. Given a d-dimensional distribution F = C(F 1 , ..., F d ) with d-copula C, and a probability level p ∈ I , the Kendall's quantile q p ∈ I of order p is defined as where is the inverse of K C .Definition 5 provides a close analogy with the definition of univariate quantile: indeed, recall that K C is a univariate distribution function (see Eq. 3), and hence q p is simply the quantile of order p of K C .Thanks to Eq. (2), it is clear that the critical layer L F q p is the iso-hyper-surface in R d where F takes on the value q p , while L C q p is the corresponding one in I d where the related copula C equals q p .Now, let L F q p be fixed.Then, according to Eq. ( 3), p = K C (q p ) = P(C(F 1 (X 1 ), ..., F d (X d )) ≤ q p ).Therefore, p is the probability measure induced by C on the region R < q p , while (1 − p) is the one of R > q p .From a practical point of view this means that, in a large simulation of n independent d-dimensional vectors extracted from F , np realizations are expected to lie in R < q p , and the others in R > q p .REMARK 1.It is worth stressing that a common error is to confuse the value of the copula C with the probability induced by C on I d (and, hence, on R d ): on the critical layer L C q p it is C = q p , but the corresponding region R < q p has probability p = K C (q p ) = q p , since K C is usually nonlinear (the same rationale holds for the region R > q p ).In other words, while in the univariate case the value p = F X (x p ) corresponds to the probability induced on the region R < p , where x p is the quantile of X of order p, this is not so in the multivariate case.
Since K C is a probability distribution, and q p is the corresponding quantile of order p, we could use a standard bootstrap technique (see, e.g.Davison and Hinkley, 1997) to estimate q p if it cannot be calculated analytically.The idea is simple, and stems directly from the very definition of q p : viz., to look for the value q p of C such that, in a simulation of size n, np realizations show a copula value less than q p .Then, by performing a large number of independent simulations of size n, the sample average of the estimated q p 's is expected to converge to the true value of q p .A possible algorithm is given below, most suitable for vectorial software.
Here we assume that the copula model is well specified, i.e. it is available in a parametric form.ALGORITHM 2. Calculation of q p .First of all, choose a sample size n, a critical probability level p, the total number of simulations N, and fix the critical index k = np .
; % store new estimate of q p into vector E end q = Mean(E); % calculate the estimate of q p Then, once the loop is completed, q provides an estimate of q p .Practically, Algorithm 2 does the "inverse" task of Algorithm 1.The bootstrap method may also yield an approximate confidence interval for q p (see DiCiccio and Efron, 1996 for more refined solutions): for instance, at a 10 % level, the random interval (q 0.05 , q 0.95 ) can be used, where q 0.05 and q 0.95 are, respectively, the quantiles of order 5 % and 95 % extracted from the vector E. Using Algorithm 2 (and setting n = 10 4 , p = 0.999, and N = 10 7 , for a total of 10 11 simulated triplets), we estimated q 0.999 ≈ 0.946537 for the copula C QV L of interest here, and a 10 % confidence interval (0.946110, 0.946973), a process that took about 48 h of CPU time on a iMac equipped with a 3.06 GHz Intel Core 2 Duo processor and 8 GB RAM.As an illustration, in Fig. 4b we show an estimate of the function K C , associated with the copula C QV L , at t * ≈ 0.946537 (corresponding to a millenary KRP): as expected, the value K C (t * ) is almost exactly equal to 99.9 %.
As a further illustration, in Fig. 5 we plot the critical isosurface L C t * of the trivariate copula C QV L for the critical level t * ≈ 0.946537, corresponding to a regulation return period of 1000 years (viz., all the realizations on L C t * have a KRP equal to 1000 years).Then, C QV L = t * for all points belonging to L C t * .Instead, C QV L < t * (and κ x < 1000 years) in the region R < t * "below" L C t * , the one containing the origin 0 = (0, 0, 0), whereas C QV L > t * (and κ x > 1000 years) in the region R > t * "above" L C t * , the one containing the upper corner 1 = (1, 1, 1).On average, only 0.1 % of the realizations extracted from a simulation of C QV L are expected to lie in R > t * .However, the level of the critical layer is t * = q 0.999 ≈ 0.946537 < p = 0.999, as indicated by the diamonds markers in the plot.
As a further example, consider that the region R > 0.999 identified by the critical layer L F 0.999 (where the multivariate distribution F , or, equivalently, the copula C, takes on the value 0.999) has an estimated probability smaller than 10 −6 , and a corresponding KRP of about 3 × 10 6 years: practically, only one realization of C QV L out of 3 × 10 6 simulations is expected to lie in R > 0.999 (instead of 1 out of 1000).Evidently, if F (or C) were substituted for K C in Eq. ( 11) during the design phase, then the structure to be constructed would result over-sized (being expected to withstand stunning extreme events).

Design in a multivariate framework
The situation outlined in the previous section is generally similar to the one found in the study of univariate phenomena, where a single r.v.X with distribution F X is used to model the stochastic dynamics.However, as already mentioned, the multivariate case generally fails to provide a natural solution to the problem of identifying a unique design realization.In fact, even if also the layer L F t acts as a (multidimensional) critical threshold, there is no natural criterion to select which realization lying on L F t (among the ∞ d−1 possibilities) should be used for design purposes.In other words, in a multivariate environment, the sole tool provided by the RP may not be sufficient to identify a design realization, and additional considerations may be required in order to pick out a "characteristic" realization over the critical layer of interest.In the following, we outline possible ways to carry out such a selection.Clearly, several approaches can be proposed, each one possibly yielding a different solution: below, we show two possible elementary strategies to deal with the problem.
The basic idea is simply to introduce a suitable function (say, w) that "weighs" the realizations lying on the critical layer of interest.Following this approach, the practitioner can then freely choose the criterion (i.e. the function w) that best fits the practical needs.Clearly, without loss of generality, w can be assumed to be non-negative.In turn, a "design realization" can be defined as follows.DEFINITION 6.Let w : L F t → [0, ∞) be a weight function.The design realization δ w ∈ L F t is defined as provided that the argmax exists and is finite.Definition 6 deserves some comments.
-In general, the unicity of the maximum may not be guaranteed.When this happens, a recourse to physical/phenomenological considerations, or to additional procedures (like, e.g.Maximum Information/Entropy schemes; Jaynes, 2003), may help solving the problem.
-Different copulas may share the same Kendall's measure K C , and hence the same KRP (e.g.all the bivariate Extreme Value copulas with the same Kendall's τ ; Ghoudi et al., 1998).However, in general, the critical layers of such copulas will have a different geometry, and, in turn, will provide different design realizations.
-The search of the point of maximum in Eq. ( 13) can be subjected to additional constraints, in order to take into account the possible sensitivity of the structure under design to the behavior of specific marginals (see also the discussion in Remark 2): for instance, a Bayesian approach might be advisable.
-Sometimes it could be more appropriate to select a set of possible design realizations (i.e. an ensemble, rather than a single one) that should be used, together with experts' opinions, in order to better evaluate the features of the phenomenon affecting the structure under design.This procedure can be carried out by using a suitable step function in Eq. ( 13).
In passing, we note that, in the present case study, the distribution F QV L and the copula C QV L are trivariate, and hence the corresponding critical layers are simply twodimensional surfaces in R 3 and I 3 , respectively.Figure 5 shows the critical layer of level t * pertaining to C QV L , and the corresponding one pertaining to F QV L can be drawn by exploiting Eq. ( 2).Now, for the sake of graphical illustration, it is possible to parametrize L F t * in polar coordinates (say, (α, r)) via a one-to-one transformation, and thus re-map and plot any function w defined over L F t * onto the rectangle (0, π/4) × (0, r), for a suitable maximum ray r.In turn, it is rather easy to have a peek of the behavior of any weight function w over L F t * .REMARK 2. A delicate problem may arise when adopting the approach outlined above: to make the point clear, consider the following example.Suppose that we use the duration of a storm and the storm intensity as the two variables of interest.In a fast responding system (e.g. a sewer structure), a storm having short duration but high intensity may cause a failure, whereas the same storm may not cause any problem at a catchment level.In the catchment, however, a storm with long duration and intermediate to low intensity may cause a flood event, whereas the same storm does not cause any problem to the sewer system.Now, as a matter of principle, the design realization δ w for the given return period (i.e. the "typical" critical storm calculated according to the strategy illustrated here) may not cause any problem in both systems, and therefore these would be wrongly designed.Practically, the sewer systems should be designed using critical design storms of short durations and high intensities, whereas a structure in the main river of the watershed should be designed using storms of long durations and low intensities.However, the problem is more apparent than real.In fact, there are neither theoretical nor practical limitations to restrict the search for the maxima in Eq. ( 13) over a suitable sub-region of L F t * : remember that all the realizations on the critical layer share the same prescribed KRP.Thus, when a sewer system is of concern, only storms having short durations and high intensities could be considered, whereas a critical design storm for a structure in the main river could be spotted by restricting the attention to storms of long durations and low intensities.Roughly speaking, in the approach outlined here, the calculation of the critical design realization can be made dependent on both the environment in which a structure should be designed, as well as on the stochastic dynamics of the phenomenon under investigation.
Overall, the procedure to identify the design realization could be described as follows.Let X be a random vector with distribution F = C(F 1 , ..., F d ).
1. Fix a RP T .
3. Compute the Kendall's quantile q p as in Eq. ( 12), either analytically or by using Algorithm 2.
4. Fix a suitable weight function w.
5. Calculate the point of maximum δ w of w on the critical layer L F q p .
The resulting δ w represents a "typical" realization in R d with a given RP.Roughly speaking, it denotes the design realization obtained by considering the very stochastic dynamics of the phenomenon.Note that, in general, δ w (or, better, the corresponding critical layer) should be considered together with other information (e.g. the physical features of the structure) in order to be correctly used in practice.For the sake of illustration, below we introduce two weight functions.

Component-wise excess design realization
A realization lying on the critical layer L F t may be of major interest when all of its marginal components are exceeded with the largest probability.In simple words, we suggest to look for the point(s) x = (x 1 , ..., x d ) ∈ L F t such that it is maximum the probability that a dangerous realization y = (y 1 , ...,y d ) satisfies all the following component-wise inequalities: or y > x using a simplified notation.The next definition is immediate.DEFINITION 7. The Component-Wise Excess weight function w CE is defined as where X has distribution F = C(F 1 , ..., F d ), and [x, ∞) is the hyper-rectangle in R d whose points satisfy all the inequalities stated in Eq. ( 14).
where t ∈ (0, 1).REMARK 3. Via the Probability Integral Transform and Sklar's Theorem, it is easy to show that where U has the same copula C as of X and Uniform marginals, u(x) = (F 1 (x 1 ), ..., F d (x d )), and [u, 1] is the hyper-rectangle in I d with lower corner u and upper corner 1.Thus, the probabilities of interest can be directly computed in the unit hyper-cube (see, e.g.Joe, 1997) by working directly on the critical layer L C t (instead of L F t ), a solution numerically more convenient.Note that, for large d-dimensional problems, the CPU time involved may become prohibitive, though clever solutions have been proposed for large d's (see, e.g.Cherubini and Romagnoli, 2009).In some cases, δ CE can be calculated analytically; otherwise, it can be empirically estimated (e.g. by calculating it over suitable points of L C t or L F t ).In Fig. 6 we show the behavior of w CE over L F t * , as well as the Component-wise Excess design realization δ CE (t * ) calculated for the case study investigated here (see also Fig. 5).This point has the largest probability to be component-wise exceeded by an extreme realization with KRP larger than 1000 years, and therefore it should be regarded as a sort of (statistical) "safety lower-bound": viz., the structure under design should, at least, withstand realizations having (multivariate) size δ CE (t * ), as reported in Table 2.
As anticipated in Sect.2, using the design realization δ CE (t * ), we operated the reservoir routing of the corresponding flood hydrograph.Then, we checked whether or not the reservoir level exceeds the crest level of the dam  2. (i.e.784 m a.s.l.).The column"M.W. L." in Table 2 reports the value 782.08 m a.s.l.: thus, no over-topping occurs, i.e. the dam seems to be safe against Component-Wise Excess millenary realizations.

Most-likely design realization
A further approach to the definition of a characteristic design event consists in taking into account the density of the multivariate distribution describing the overall statistics of the phenomenon investigated: in fact, assuming that the density f of F is well defined over L F t , we may think of using it as a weight function.
Clearly, the restriction f t of f over L F t is not a proper density, since it does not integrate to one.However, it may provide useful information, since it induces a (weak) form of likelihood over L F t : in fact, it can be used to weigh the realizations lying on L F t , and spot those that are (relatively) "more likely" than others.Indeed, f t inherits all the features of interest directly from the true global density f .The next definition is immediate.DEFINITION 9.The Most-Likely weight function w ML is defined as where f is the density of F = C(F 1 , ..., F d ).
Then, by restricting our attention to the critical layer L F t , the following definition is immediate.
DEFINITION 10.The Most-Likely design realization δ ML of level t is defined as where t ∈ (0, 1).REMARK 4. As a rough interpretation, δ ML plays the role as of a "characteristic critical realization", i.e. the one that has to be expected if a critical event with given KRP happens.
In some cases, δ ML can be calculated analytically; otherwise, it can be empirically estimated (e.g. by calculating f over suitable points of L F t ).In general, provided that weak regularity conditions are satisfied, f can be calculated by using the marginal densities f i 's of X, and the density c Since our target is to compare the "weight" of different realizations, from a computational point of view it may be better to minimize −ln(f ) over L F t (since the maxima are preserved).
As an illustration, in the present (absolutely continuos) case, the expression of the trivariate density f QV L is given by where (x, y, z) ∈ R 3 , and c QV is the density of the Gumbel copula modeling the pair (Q, V ).In Fig. 7 we show the behavior of (the logarithm of) w ML (i.e.f QV L ) over L F t * , as well as the Most-Likely design realization δ ML (t * ) calculated for the case study investigated here (see also Fig. 5).The actual values of the function w ML are irrelevant: in fact, we are only interested in spotting where f QV L is maximal.Therefore, the Most-Likely design realization could be regarded as the "typical" realization: viz., the structure under design should be expected to withstand events having (multivariate) size δ ML (t * ), as reported in Table 2.
Again, as a test, using the design realization δ ML (t * ), we operated the reservoir routing of the corresponding flood hydrograph, and checked whether or not the reservoir level exceeds the crest level of the dam.The column "M.W. L." in Table 2 reports the value 781.98 m a.s.l.: thus, no overtopping occurs, i.e. the dam seems to be safe also against Most-Likely millenary realizations.

Additional remarks about design strategies
An interesting test concerning the misuse of univariate approaches in a multivariate framework can be carried out ≈ 0.946537 (for the sake of presentation, the surface is clipped at −20).The star marker indicates where the maximum is attained -see text and Table 2. as follows.In fact, as a further possible strategy, suppose that a critical design realization δ 1D = (x 0.999 , y 0.999 , z 0.999 ) is defined in terms of the millenary univariate quantiles of the three variables of interest here (see the last row of Table 2).In turn, the layer L F QV L t * 1D supporting δ 1D has a critical level t * 1D ≈ 0.997754 (see Fig. 4b), corresponding to a value of K C (t * 1D ) ≈ 0.999998, and a KRP of about 5 × 10 5 years.It is then immediate to realize that, in order to provide a true millenary multivariate design realization, it may not be enough (or necessary) to rely upon millenary univariate quantiles.Also, operating the reservoir routing using δ 1D , yields a reservoir level of about 784.80 m (see Table 2), which may cause an over-topping and a dam failure.
The example given above, as well as the illustrations presented in Sect.5, may suggest the following empirical consideration (which, however, should be taken with care).Both the millenary multivariate design realizations δ CE and δ ML yielded a maximum water level of about 782 m a.s.l., whereas δ 1D (with a KRP of the order of 10 5 years) generated a maximum level over-topping the dam crest by only about 80 cm.Thus, apparently, the dam is over-sized, i.e. it could withstand events with a RP much larger than 1000 years.Clearly, from the safety point of view, this is a good news.On the other hand, a smaller structure, correctly sized for withstanding true millenary multivariate events, would probably reduce the cost.This paper is of methodological nature, and introduces original techniques for the calculation of design quantiles in a multivariate environment.In this work we made an effort to reduce the troublesome nature of multivariate analysiswhich has always limited its practical application -by providing consistent frameworks (the KRP) and techniques (the weight functions on the critical layers) to address the identification of the critical design events when several dependent variables are involved.In particular, the "CE" and "ML" design values may provide basic realizations with given KRP, of interest in multivariate design problems.
It worth noting that the design phase should not be confused with risk assessment.In fact, the target of the former one is to provide characteristic realizations (e.g. the design realizations) useful for planning and managing a a structure.In this case only the hazard component is taken into account, viz. the probabilistic behavior of the r.v.'s under consideration, but no specific information is exploited about the structure (e.g. the dam) under design.The risk assessment, instead, aims at pointing out possible dangereous situations by further introducing the impact ingredient, i.e. by considering the physical influence of the variables on the structure.In other words, the design phase only identifies the set of possible realizations (namely, those on the critical layer) that are associated with a given probabilistic level of confidence.These realizations should then be carefully used, together with additional information (e.g. the morphology of the basin), in order to provide specific parameters for quantifying the risk of a structural failure.
To the best of our knowledge, this is the first time that a similar study is presented.Clearly, further research is necessary, especially concerning the introduction of alternative design strategies, and their mutual comparison.In addition, a step towards a consistent framework for dealing with risk assessment in a multivariate environment is needed.

Fig. 1 .
Fig. 1.Trivariate plot of the available (Q, V , L) observations, and fits of the marginal distributions -see text.

Fig. 2 .
Fig. 2. Trivariate rank-plot of the available (Q, V , L) observations, and bivariate rank-plots of the marginals -see text.Also shown are the estimates of the Kendall's τ and the Spearman's ρ, as well as the corresponding p-values (derived from non-parametric tests of independence based on rank statistics).

Fig. 3 .
Fig. 3. illustration of the dangerous region D ∧z * (shaded) in the "AND" case -see text.
Fig. 4. (a) Simulation-based estimate of the function K C (continuous line) associated with the copula CQV L; also shown is its empirical estimate (markers) calculated by using the available observations -see text.(b) Plot of the (millenary KRP) quantile t * ≈ 0.946537 (thick-dashed line) associated with the critical probability level p = 0.999; also shown (thin-dashed line) is the estimate of the value K C ≈ 0.999998 associated with the critical level t * 1D ≈ 0.997754 -see text.

Fig. 5 .
Fig. 5. Critical iso-surface L C t * of the copula C QV L corresponding to the (millenary KRP) critical level t * ≈ 0.946537, indicated by the diamond markers on the axes.The circle and the square markers indicate, respectively, the Component-Wise and the Most-Likely design realizations -see text.
Fig. 6.Polar re-mapped plot of the Component-Wise Excess weight function w CE over the critical layer L F t * , corresponding to the (millenary KRP) critical level t * ≈ 0.946537.The star marker indicates where the maximum is attained -see text and Table 2.
Fig. 7. Polar re-mapped plot of the (log) Most-Likely weight function w ML over the critical layer L F t * , corresponding to the (millenary KRP) critical level t *≈ 0.946537 (for the sake of presentation, the surface is clipped at −20).The star marker indicates where the maximum is attained -see text and Table2.

Table 1 .
Maximum-Likelihood estimates of the GEV parameters for Q and V , and corresponding 95% Confidence Intervals.

Table 2 .
Estimates of the critical design realizations, for a millenary return period, according to different strategies -see text.Also shown are the estimates of the univariate quantiles of order p = 0.999 of the variables of interest.The right-most column shows the Maximum Water Level of the dam associated with the flood event (Q, V , L) reported on the corresponding row.The Component-wise Excess design realization δ CE of level t is defined as