Historical records are an important source of information on extreme and rare floods and fundamental to establish a reliable flood return frequency. The use of long historical records for flood frequency analysis brings in the question of flood stationarity, since climatic and land-use conditions can affect the relevance of past flooding as a predictor of future flooding. In this paper, a detailed 400 yr flood record from the Tagus River in Aranjuez (central Spain) was analysed under stationary and non-stationary flood frequency approaches, to assess their contribution within hazard studies. Historical flood records in Aranjuez were obtained from documents (Proceedings of the City Council, diaries, chronicles, memoirs, etc.), epigraphic marks, and indirect historical sources and reports. The water levels associated with different floods (derived from descriptions or epigraphic marks) were computed into discharge values using a one-dimensional hydraulic model. Secular variations in flood magnitude and frequency, found to respond to climate and environmental drivers, showed a good correlation between high values of historical flood discharges and a negative mode of the North Atlantic Oscillation (NAO) index. Over the systematic gauge record (1913–2008), an abrupt change on flood magnitude was produced in 1957 due to constructions of three major reservoirs in the Tagus headwaters (Bolarque, Entrepeñas and Buendia) controlling 80 % of the watershed surface draining to Aranjuez. Two different models were used for the flood frequency analysis: (a) a stationary model estimating statistical distributions incorporating imprecise and categorical data based on maximum likelihood estimators, and (b) a time-varying model based on “generalized additive models for location, scale and shape” (GAMLSS) modelling, which incorporates external covariates related to climate variability (NAO index) and catchment hydrology factors (in this paper a reservoir index; RI). Flood frequency analysis using documentary data (plus gauged records) improved the estimates of the probabilities of rare floods (return intervals of 100 yr and higher). Under non-stationary modelling flood occurrence associated with an exceedance probability of 0.01 (i.e. return period of 100 yr) has changed over the last 500 yr due to decadal and multi-decadal variability of the NAO. Yet, frequency analysis under stationary models was successful in providing an average discharge around which value flood quantiles estimated by non-stationary models fluctuate through time.

Throughout Europe, national legislations on flood hazard assessment are based on flood-frequency analyses which estimate discharges associated with different return periods (usually 10, 25, 50, 100 and 500 yr). This assessment is also gathered in the European Flood Directive (2007/60/CE) together with the ordinarily used parameters (see Article 6.3 on flood hazard maps and flood risk maps), which are common practice in several continents and countries such as the USA, Australia and several South American countries. The common procedure involves the extrapolation from gauged hydrological registers, documenting 30–40 yr records, of observed floods (usually not comprising an extraordinary event) and estimate quantiles for floods. The straightforward handling of these simple statistical methods that makes them so widely used hides important uncertainties as well as scientific and technical problems which have been extensively discussed in the literature (Merz and Blöschl, 2008). This conventional method of addressing flood hazard assessment can be improved by including information of past floods (historical floods and palaeofloods), which should be accomplished using rigorous procedures of data collection and statistical modelling. Documentary records provide a catalogue of the largest flood events that occurred during periods of human settlement, and provide evidence of all other events below or above specific flow stages or thresholds (Brázdil et al., 2006). Long records of historical extreme floods have been applied successfully in hazard analysis together with the more traditional empirical, statistical and deterministic methods to estimate the largest floods (Stedinger and Cohn, 1986; Francés et al., 1994; England et al., 2003). Information on these extreme floods is highly demanded by planners and engineers and yet seldom registered in the observational record due to its short time length (Enzel et al., 1993; Benito et al., 2004).

A major issue of using long-term flood data in flood frequency analysis is the assumption of stationarity, i.e. the idea that “natural systems fluctuate within an unchanging envelope of variability” (Milly et al., 2008). The analysis of long-term historical flood records frequently reveals the occurrence of flood clusters and trends lasting from a few decades to centuries and, subsequently, the fact that the frequency of flood magnitude changes over time (Benito et al., 2003). Most of these trends observed in historical flood series are related to decadal climate variability (e.g. influence of the North Atlantic Oscillation or ENSO, El Niño) which may affect the assumption of stationary in conventional flood frequency analysis (Merz et al., 2014). Milly et al. (2008), in their study on the recent impacts of climate change in the hydrological cycle, conclude that the assumption of stationary in hydrology is dead and should no longer serve as a central, default assumption in water-resource risk assessment and planning. Yet, the application of non-stationary concepts by the engineering community has been rather challenging due to the limited number of non-stationary models and scarce long-term flood records to test the performance of such models. Centennial historical flood data provides long-term real data to test and compare stationary and non-stationary models under climate variability. The results of this analysis together with the selection of appropriate covariates explaining the changes in flood series are relevant in climate-related impacts on flood risk assessment. Moreover, the optimal merging of instrumental, historical and palaeoinformation is of great importance for understanding flood hazard and the appraisal of flood incidence through multi-decadal climate variability (Kundzewicz et al., 2014).

This paper aims at addressing stationary and non-stationary flood frequency analyses of historical floods (data censured above threshold), and to test their performance in flood hazard analyses. The specific objectives are to (1) analyse the long-term historical flood variability and identify the major drivers and covariates (climatic or human-induced environmental factors) describing the temporal changes; (2) test the stationary of flood series and, if so, estimate flood distribution of systematic (gauged) and non-systematic (historical) data under stationary assumption; (3) apply non-stationary models to historical flood data incorporating covariate indexes describing flood producing atmospheric circulation patterns (e.g. NAO) and other major human disturbance parameters (e.g. reservoir construction); and (4) compare results of using stationary and non-stationary models in flood hazards assessment.

The Tagus River basin in the Iberian Peninsula.

The Tagus River drains the central Spanish Plateau (Meseta Central) and flows
east–west into the Atlantic Ocean at Lisbon (Portugal). It is the longest
river of the Iberian Peninsula (1200 km) and the third largest in catchment
area (81 947 km

The Tagus River flood regime is influenced by the rainfall associated with the
Atlantic fronts that cross the western Iberian Peninsula during low zonal
circulation over the Atlantic (at 35–40

A historical database was compiled from direct and indirect sources (Proceedings of the City Council, diaries, chronicles, memoirs, etc.). Written documents and maps were obtained directly from the General Archive of the Royal Palace which stores data on the administration of the Spanish Royal Heritage (16th–20th centuries), as well as from the Simancas General Archive, the Spanish National Library, and the Geographic Service of the Spanish Army. Further bibliographical sources consulted include scientific and technical reports, local history works and non-systematic compilations by of Rico Sinobas (1850), Bentabol (1900), Masachs (1948, 1950), Fontana-Tarrats (1977), Gonzálvez (1977), López-Bustos (1981), Comisión Técnica de Inundaciones (1985), Font (1988), and Canales (1989). All indirect information was checked by cross-referencing between different sites (Benito et al., 2003), following the standardised methodology proposed by Barriendos and Coeur (2004).

Water stage data related to documentary descriptions and epigraphic marks were interpreted in terms of stage indicators as follows: (1) sites or landmarks reached by the flood (e.g. churches and bridges) were assumed to provide an exact discharge level (equal to the flood stage), (2) flooded areas (e.g. orchards, floodplain areas, gardens) provided a minimum flood stage, (3) non-flooded areas or landmarks (e.g. Royal Palace surrounded by water) were interpreted as a non-reached maximum flood stage, and (4) the relative importance of the event with respect to previous floods (e.g. the 1840 flood was 2 m higher than the flood occurring in 1820) was quoted as a range of discharge in the case of two recorded levels.

The water levels (flood stage) required to reach the described flooded areas or landmarks (buildings, streets, gardens, etc.) were converted into discharge values (O'Connor and Webb, 1988) using the HEC-RAS one-dimensional hydraulic model (Hydrological Engineering Center, 2010). The HEC-RAS software, allows for rapid calculation of water-surface profiles for specified discharges and energy loss coefficients. This conversion is an inverse problem, where the minimum discharge is obtained by matching the modelled flood water levels to those obtained from the elevation of sites inundated during the historical floods, as described in the documentary sources. Hydraulic modelling requires the estimation of key hydraulic characteristics of the river reaches (energy slope, roughness and cross-sectional topography) as well as the boundary conditions upstream or downstream depending on the flow type selected in the model. Detail cross-section topography of the river channel was obtained from echo-sounder and GPS field surveys performed by the LINDE Project (Spanish Ministry of Environment) for flood risk mapping. On the floodplain, cross sections were completed when needed from detailed topographical maps (1 : 500 or 1 : 1000 in scale). In the study reach, river channel modification resulting of constructions of dykes, river lining, gardens setting and other engineering–architectural works were reconstructed based on historical maps and technical reports described among others by García Tapia (1980), González Perez (1987) and Teran (1949).

Systematic streamflow data of annual peak discharge were obtained from gauge
records, namely from the Aranjuez station (9340 km

The potential of historical records is based not only in their value as documentary registers of the occurrence of a rare flood but also as repositories of the magnitude of these floods by documenting and sometimes mapping or graphically indicating the limits of every rare flood, over a centennial period. In statistical analysis, streamflow monitoring data are considered systematic information whereas flood observations reported as having occurred above some threshold are known as non-systematic (censored) data sets (Leese, 1973). Fitting a distribution to these systematic and non-systematic data sets provides a compact representation of the frequency distribution, a task that requires a method to estimate the distribution parameters so that quantiles can be calculated. The parameters set for three statistical models, namely the log-normal (LN), generalised extreme value (GEV) and two-component extreme-value (TCEV) distribution functions have been estimated by the maximum likelihood estimation (MLE) method. This method was selected based on its statistical features' performance for large samples and also because of its capacity to easily incorporate in the estimation process any additional non-systematic quantified data (Leese, 1973; Stedinger and Cohn, 1986).

Most historical information is non-systematic information of the censored type
since only floods of certain magnitudes (typically producing damages) are
registered in documentary records. The minimum flood stage (threshold
levels) required to be reported in documentary records is most commonly reported in
areas with urban settlements or economically important human activities. In
the case of the Iberian Peninsula sites studied till now, most are located in
floodplain areas (Benito et al., 2004). The flooding of sensitive areas, or
perception threshold, may change through time, according to the progressive
human occupation of the riverine areas and the socio-economic context. In
the case of Aranjuez, the perception threshold did not register significant
changes throughout time. The minimum discharge (250 m

Above: Royal Palace of Aranjuez with a view of the Tagus River
and gardens painted by Antonio Joli, ca. 1753. Bottom: longitudinal profile
of the channel bottom along the study reach and calculated water surface
profiles fitting historical flood water indicators described as inundated
during the 1866 flood. An estimated discharge of 650 m

In our study, flood discharge in years 1919, 1954 and 1941 were estimated based on epigraphic flood marks. All the historical flood data in Aranjuez provide minimum discharge estimates based on elevation points affected by flooding, although the exact water depth in those sites is unknown (lower-bound type after Naulet et al., 2005). A review of these various methods using historical data was presented by Ouarda et al. (1998) and by Francés (2004), and case study applications in Europe can be found e.g. in Francés et al. (1994), Naulet et al. (2005), Calenda et al. (2009) and Botero and Francés (2010), among others.

Statistical analyses rely on the general characteristics and stationarity of the flood. However, the temporal changes in the trajectory and statistics of a state variable may correspond to natural, low-frequency variations of the climate hydrological system or to non-stationary dynamics related to anthropogenic changes in key parameters such as land use and climate. Flood record stationarity from censored samples (systematic and/or non-systematic) was checked using Lang's test (Lang et al., 1999, 2004). This test assumes that the flood series can be described by a homogenous Poisson process. The 95 % tolerance interval of the accumulative number of floods above a threshold or censored level is computed. Stationary flood series are those remaining within the 95 % tolerance interval (Naulet et al., 2005).

Modelling of flood time series under a non-stationary assumption was carried out incorporating external covariates, which explains the time changes of the statistical parameters (mean and variance) for the selected frequency distribution. In this study, the NAO index and the reservoir index (RI) are introduced as explanatory variables of the marginal distribution parameters. The modelling framework was based on the “generalized additive models for location, scale and shape” (GAMLSS), as proposed by Rigby and Stasinopoulos (2005). GAMLSS has been successfully used in hydrological studies by Villarini et al. (2009a, b, 2010a, b, 2011) and López and Francés (2013).

In GAMLSS the response random variable

Due the characteristics of the selected probability distributions, only the non-stationarity in the first two moments are considered. If the relation between distribution parameters and selected covariates follows a smooth dependence, this tends to increase the complexity of the model. According to Eq. (1), four different models can be considered for the time-varying marginal distributions: model a – two distribution parameters are constant; model b – only the location parameter is time varying; model c – only the scale parameter is time varying; and model d – location and scale parameters are time varying. The final model is selected by comparing the corrected Akaike information criterion (AICc; Hurvich and Tsai, 1989). This AICc criterion is more strict than the classical Akaike information criterion (AIC). In the absence of a statistic to evaluate the goodness of fit of the selected models as a whole, verification was made by analysing the normality and independence of the residuals of each model, following the recommendations of Rigby and Stasinopoulos (2005). With these criteria, a final model is provided with a balance between accuracy and complexity (López and Francés, 2013).

After defining the functional dependence between distribution parameters and
each selected covariate and the effective degrees of freedom for the cubic
spline, the distribution functions

The modelling of flood frequency over the systematic record should consider
the high incidence of dam regulation on flood peak discharges. The Bolarque
reservoir, together with the Entrepeñas and Buendia reservoirs complex,
has substantially altered the streamflow regime of the upper Tagus
watershed. Thus, under the altered regime (1957–2008) the reservoir regulation
is an important variable to explain the abrupt change on flood magnitude and
frequency observed in the systematic record. A RI was used
to describe the dependence of flood variability on reservoir capacity and
regulation strategies, as proposed by López and Francés (2013), for
floods and recently used for low flows by Jiang et al. (2014):

The historical flood records of Aranjuez and Toledo constitute the best registers to be found along the Tagus River; they both gather a continuous data record on floods over a time period of up to 600 yr. The Aranjuez Royal Palace and nearby extensive gardens were commissioned by Phillip II as a winter residence of the Spanish kings. The oldest garden, named Isla Garden (AD 1387–1409), is bounded by the River Tagus and a diversion canal from a mill dam next to the Royal Palace. The other gardens (Prince's Garden of ca. 150 ha) have been progressively extended along the Tagus River floodplain since AD 1545, together with a series of bridges, fountains, museums, as well as facilities for boat trips and deer hunting. In the middle of the garden and next to the Tagus River is located the Farmer's Lodge, a neoclassic Palace (built for Charles IV in AD 1790–1803) highly vulnerable to flooding and frequently referred to in the documentary record when the floodplain is inundated. Currently, only two epigraphic flood marks are displayed at the entrance patio: the 1916 and 1924 flood levels.

The hydraulic model comprises a 13 km reach, from upstream of the Tagus's
river confluence with the Jarama River to the upstream of the Prince's
Garden at the Embocador mill dam, along which 57 cross sections were surveyed
(Fig. 2). Hydraulically, the Aranjuez reach is relatively complex due to its
wide floodplain and the near junction to the Jarama River. Discharge
estimations using one-dimensional hydraulic modelling for a floodplain river
reach may contain some uncertainties in the final result due to potential
channel migrations, avulsion and meandering cut-off. However, historical
maps show that meanders of the Tagus River along the studied reach were
stabilised since AD 1795 when a 2 km reach was straightened and numerous
dykes were built at either side of the Tagus River banks. Another source of
uncertainty is related to the numerous hydraulic constructions (e.g. dams,
bridges, dikes, dam mills) that have been set up along the Tagus River over
time. The excellent documentation of the work descriptions and their
emplacements (e.g. Díaz-Marta, 1992; García Tapia, 1980) were
enough to include the corresponding temporal geometric changes of the
cross sections used in the hydraulic modelling. The model was calibrated
using flood marks at the Labrador House Palace of the 20 December 1916 and
31 March 1924 floods. It was assumed for the calculations that flow was
subcritical along the modelled reach. Manning's

The historical flood reports quote over 100 locations or sites (buildings,
streets, orchards, gardens) to have been flooded during the historical
period. A database was implemented comprising flood dates, flooded sites,
GPS locations, maximum and minimum elevation and nearby cross sections used
in the hydraulic modelling. As a result, for each historical flood,
information of flooded and non-flooded affected sites and their elevation
(maximum and/or minimum) is given, thus making it possible to calculate the
discharge associated with the flooded sites. Discharge values associated
with each historical flood were estimated upon calculated water-surface
profiles bracketing elevations of flooded/non-flooded historical sites along
the longitudinal profile (Fig. 2). Most of the documentary descriptions
provide information on the minimum flood stage as written reports usually do
not provide information of water depth at a site. Non-flooded sites
explicitly indicated provide a maximum bound of discharge not exceeded by
the flood. Figure 2 shows the calculated water-surface profiles along the
study reach with the best match to seven documentary flood indicators of
flooded areas during the 1866 flood that provided a minimum discharge of
650 m

Reconstructed flood record in Aranjuez during the historical period (1557–1912) and annual maximum flood series during the instrumental periods (1913–2008), with vertical bars showing the peak discharge. NAO index values associated with flood dates are indicated (historical NAO index after Luterbacher et al., 1999, 2002).

The flood record for Aranjuez comprises 59 historical floods (1557–1912) and
95 gauged floods (1913–2008). In terms of flood frequency distribution, four
periods can be distinguished (Fig. 3) in Aranjuez: 1560–1611, 1739–1750,
1850–1899 and 1917–1928. The most important period in terms of flood number
and magnitude corresponds to 1850–1899 and includes a total of 18 flood
events, 6 of which showed minimum peak discharges of 500 m

The temporal and spatial variability of flood patterns are closely related to global change drivers, climate and human activity (e.g. Benito et al., 2008; Hall et al., 2014). A major challenge in non-stationary flood frequency analysis is the identification of major external and internal parameters describing the non-stationarity pattern of the flood series.

At annual timescales, moisture influx variability in the Iberian Peninsula
and therefore river flow is largely modulated by North Atlantic circulation,
and hence by the NAO (Trigo et al., 2004). In the Tagus River flooding is
highly related to persistent rainfall (several weeks) due to the successive
passage of Atlantic cold fronts over the Iberian Peninsula in winter months
(Cortesi et al., 2013). It is, therefore, natural to expect a strong impact
of the NAO on flooding in large Atlantic river catchments. Most studies
linking NAO index with floods (Benito et al., 2008; Machado et al., 2011;
Silva et al., 2012) have used seasonal and monthly correlations leading to
lack of significant connections and/or a delayed response between oscillation
mode and hydrological event. Floods are produced by rainfall excess with an
immediate hydrological response, requiring a daily to monthly resolution
(Salgueiro et al., 2013). Figure 1b shows the relationship between the
average winter (DJF) NAO index reconstructed by Luterbacher et al. (1999,
2002) since AD 1500 and the maximum discharges recorded during those months
from instrumental (gauge stations) and documentary sources for the Tagus
River at Aranjuez. A strong correlation is generally observed between
the negative monthly NAO index and flood discharges above 400 m

The temporal analysis of flood–NAO relationships (Fig. 3) shows that periods of decreasing flood frequency and magnitude are associated with the positive mode of NAO (e.g. 1650–1700, 1750–1800). The largest flood discharges occurred at periods with negative NAO phase or at periods with high NAO variability characteristic of strong meridional flow over the Atlantic. However, note that negative winter NAO index values are not always related to the existence of extraordinary floods.

Other factors affecting the stationarity of flood series are land-use changes
and reservoir construction in the upper Tagus Basin. Detailed analysis of
land-use changes during the historical period is highly complex and out of
the scope of this paper. However, the upper Tagus catchment is a mountain region
with a relatively low population density, covering a relatively high
drainage surface of 9340 km

Log-normal (LN), Generalised extreme value (GEV) and two-component extreme-value (TCEV) distributions fitted with (1) systematic data and (2) systematic and historical flood data. Dots represent the plotting positions for the systematic record (white dots) and historical plus systematic records (black dots).

Stationarity tests (Lang et al., 1999, 2004) of the combined documentary
and instrumental (under natural regime) flood series above 250 and
350 m

Flood quantiles for different return periods in Aranjuez obtained
using log-normal (LN), two-component extreme-value (TCEV) and generalised
extreme-value (GEV) distributions fitted to firstly the annual maximum
systematic records only, then the combined systematic and censored
historical floods for discharges above 350 m

The incorporation of the historical data into the FFA results in a slight
increase of the magnitude of the flood quantiles when compared with the ones
obtained with the systematic record before reservoir constructions (Fig. 5;
Table 1). Major differences are found when the combined FFA quantiles are
compared with annual maximum discharge series under the regulated systematic
period. For example, the 0.01 annual probability flood of the combined
historical and natural-systematic records is 1450 m

Variations of the RI in the Aranjuez gauge station. Note the sharp change in RI value produced in 1957.

The implementation of the non-stationary model consisted in establishing the
type of dependence of the distribution parameters with the associated
external covariates (winter NAO index and reservoir index) as a function of
time. For the selected log-normal distribution (LN with 2 parameters:

Modelling results of the variation in the marginal distributions
of the annual maximum streamflow with the NAO index as the explanatory
variable of the distribution parameters.

The reservoir index for the Aranjuez gauge station was calculated by
Eq. (2). As previously indicated, the streamflow from the Tagus Basin
headwaters is regulated by three major reservoirs (Entrepeñas, Buendia
and Bolarque). Construction of the Bolarque reservoir (capacity of 31

Modelling of maximum annual discharge (

Results of the marginal distributions with NAO and RI as the explanatory variables of distribution parameters.

The non-stationary model with NAOw as external covariate over the period 1913–1956 shows a linear dependence on both parameters, the mean and the variance of the logarithms of annual flood peak discharge (Table 2) and therefore on the mean and variance of the floods (Fig. 7a). However, for the period 1957–2008, the linear dependence was only found in relation to the log-Q mean (Table 2, Fig. 7b), indicating the strong incidence of the reservoir regulation on the peaks of annual maximum floods. Also, it is interesting to underline that, for the whole period 1913–2008, the heteroscedasticity of the log-Q data is only described by the climate, and not by the reservoir construction. However, in terms of the original variable, the heteroscedasticity of the floods is described by both the climate and the reservoir construction.

Estimated quantiles for the model with the NAO index as the
explanatory variable plotted against NAO index.

The models adequately describe the changes in the percentiles of the annual maximum flood peaks although, for some periods, a high negative NAOw index does not imply high flood discharges. Moreover, there is a higher variability for the highest percentiles (75th and 95th) whereas a low variability is observed for the 5th and 25th percentiles. Figure 8 shows the observed discharges and the estimated quantiles plotted in relation with the NAOw index. The highest annual discharges and their associated 95th percentiles show a high correlation with negative NAO index for the period with natural or low altered stream regime (1913–1956), although some negative NAO index values resulted in moderate peak discharges.

The modelling of flood frequency including the NAOw and RI as external covariates explains adequately the observed flood variability for the Aranjuez station (Fig. 9). The flood modelling incorporating additionally the RI allowed for a good characterisation of the sharp change on the flood regime associated with the construction of the upstream reservoirs system, which controls about 80 % of catchment surface draining to Aranjuez. These results show the potential of the RI as a covariate that explains the sharp change in the median and the variance, and how to incorporate this effect into the model.

Estimated quantiles for the non-stationary model with the NAO and RI as explanatory variables of the distribution parameters. Note a sharp change on the maximum annual floods dividing the record in two sets (1913–1956 and 1957–2011) due to construction of a complex of three reservoirs at the Tagus headwaters.

The modelling of non-stationary series was carried out considering
continuous annual maximum discharge records as a variable of response (

Estimated quantiles for the model using systematic information under a natural regimen (1913–1956) and the systematic information (1913–2008), with the NAO index and reservoir index as the explanatory variables of the distribution parameters.

Figure 10 shows the results of the estimates of quantiles since AD 1700, assuming the modelling relationships obtained from the systematic record with the natural regime (1913–1956). The comparison of the estimated quantiles with the non-systematic historical data shows that the reconstructed NAO index is a highly significant covariate to describe the temporal variability of historical floods recorded in Aranjuez. Similar results were obtained for the estimated quantiles with the non-stationary model based on the systematic record for the period 1913–2008 (Fig. 10), where both the NAOw index and the RI were used as covariates of the statistical parameters of the parametric distribution.

Quantile estimates of the annual maximum floods with 0.01 annual exceedance probability based on stationary and non-stationary models.

The results obtained from modelling flood frequency under non-stationary
conditions while incorporating external forcing captures more adequately the
dispersion of flood values and shows the effect of climate indices
modulating the frequency and magnitude of floods (Fig. 10). The historical
non-stationary model also captures the presence of multi-decadal temporal
trends, characteristic of the North Atlantic atmospheric circulation, which
are incorporated in the flood frequency model. Figure 11 shows the results
of FFA in stationary conditions and non-stationary conditions for an
exceedance probability of 0.01 (i.e. return period of 100 yr). It can be
seen that non-stationarity models indicate the existence of periods in which
flood frequency experienced significant variability (decreases and
increases). On the long-term, stationary FFA based on historical data
reflects the average discharge for the 100 yr flood although in terms of
flood hazards during some periods the stationary 100 yr flood discharge
(1450 m

Flood hazard analysis is critical for the elaboration of risk maps, and it is the basis of the design of hydraulic structures (e.g. dam spillways, diversion canals, dikes), urban drainage systems, land and urban planning and cross-drainage structures (e.g. bridges and culverts and dips). This probabilistic assessment of hazards and risks are routinely done assuming the stochastic nature of streamflow and their extremes. The reliability of the quantile estimator can be improved either by using the best statistical model or by increasing the amount of information (e.g. Botero and Francés, 2010). The use of historical censored data can increase the information length at the gauge station, providing, as it has been mentioned previously, that for each study case the information error and the underlying stationary hypothesis are explored.

Flood frequency analyses with systematic and additional non-systematic
information have been widely demonstrated to reduce the quantile estimate
uncertainty (e.g. Stedinger and Cohn, 1986; Francés et al., 1994;
O'Connell, 2005). In the analysis of frequency using non-systematic data
it is more important to know the number of floods exceeding a certain stage
(threshold) over a period of time than the precise value of the historical
flood peaks (Viglione et al., 2013). The location of the royal palace and
gardens on the River Tagus floodplain (Aranjuez) made it highly likely that
any flood causing damages to these royal constructions would have been
reported and recorded on administrative documents, providing an accurate
account of all floods exceeding the bankfull discharge. The statistical
analysis with non-systematic records is suitable to deal with uncertainties
related to the precise estimation of flood discharges (Stedinger and Cohn,
1986) particularly when the historical information is used as binomial
censored data (i.e. without defining their exact value, just the lower limit of
censoring) without losing statistical significance (e.g. Francés et al.,
1994). In our FFA the 32 historical floods, exceeding 350 m

In recent decades, land-use change and anthropogenic climate change impacts on hydrology have largely questioned the stationarity hypothesis (Milly et al., 2008), and several non-stationary models have been applied in the study of annual maximum flood series (e.g. Cunderlik and Burn, 2003; Ouarda and El-Aldouni, 2011; López and Francés, 2013). However, there is still a lack of research assessing the reliability of non-stationary flood frequency models to represent hydrological disturbances due to low-frequency climate variability over centennial historical periods. This study presents a unique record of censured flood discharges over a 300 yr period that made it possible to test the weight of temporal variability on statistical parameters under a non-stationary assumption.

In the Aranjuez case study, the Lang stationary test for a discharge
threshold of 350 m

A detailed analysis of the stationary test shows several periods of accumulated flood events at the limit of the confidence interval due to a reduction of flood frequency (Fig. 4a). A reduction on flood frequency of large floods occurred over the first decades of the 19th century (1810–1850). In contrast, a high flood frequency of large floods was recorded over the second half of the 19th century and beginning of the 20th century. A similar flood reactivation during this period is recorded in other Iberian rivers such as Guadalentín (Benito et al., 2010), Júcar (Francés, 1998), Llobregat (Thorndycraft et al., 2005) and Guadalquivir rivers (Benito et al., 2005). Over the whole record, three of the largest five floods of the Tagus River occurred during the historical period record (1878, 1611 and 1830). The use of this non-systematic data improved the precision of the frequency analysis for the exceedance probability of 0.01 (i.e. return period of 100 yr) for describing long-term average flood risk. The frequency analysis based solely in the systematic gauge record provided lower discharge values, both under the natural flow regime (1911–1956) and to a greater degree during the substantially altered streamflow, post-reservoir regime (1957–2008). In fact, the flood quantile estimates over the systematic period under the non-stationary model (Fig. 9) shows a sharp change in the flood regime after the construction of the reservoir system.

In the western Iberian Peninsula, moisture influx variability and,
therefore,
river flow are largely modulated by North Atlantic circulation,
characterized by the NAO (Trigo et al., 2004). In the Tagus River a strong
correlation is generally observed between flood discharges above
400 m

The application of these models as a predictive tool requires the
introduction of scenarios as well as an improved modelling of low-frequency
atmospheric circulation under anthropogenic climate change (Hall et al.,
2014; Merz et al., 2014). Our results indicate that stationary frequency
analysis using historical data provides a robust reference value of the
“average” probabilistic discharge during the considered period. For
instance, the peak flood for an annual exceedance probability of 0.01 during
the historical (1779–1912) and the systematic periods (1913–1956) under
a stationary model (log-normal distribution) is

Documentary records of the Tagus River in Aranjuez (central Spain) provide evidence of continuous and homogenous data of extreme floods over the last 300 yr. Documented reference to flooded and non-flooded sites (gardens, streets) and buildings (Royal Lodge) were successfully assessed and computed into flood discharges. The universality of this methodology depends however on the quality and continuity of the documented registers, as well as on the characteristics of the human settlement (spatial variation along time, burying processes, destruction or changes of landmarks) and the geomorphological dynamics of the area.

The discharge estimates show evidence that during the historical flood
record (1557–1912), flood events of greater magnitude than the ones recorded
in the gauging station (1913–2008) were documented. The documentary record
contains descriptions of 59 historical floods with the largest event
produced in 1878 with an estimated discharge of 1200 m

The identification of flood clusters during the historical record uncover
the potential problems of assuming a stationarity in the flood frequency
analysis. In this study, two different models were used for the flood
frequency analysis of the Aranjuez flood data sets: (1) the stationary
model, in which the distribution parameters do not depend on covariates,
i.e. the parameters are constant in time; and (2) the time-varying model
(non-stationary model) that incorporates external covariates, where the
distribution parameters can vary as a function of climate variability (North
Atlantic Oscillation index during winter months; NAOw) and the streamflow
regulation by dams (reservoir index; RI). Regarding the stationary flood
frequency analysis, a stationarity test associated with different discharge
thresholds (perception levels) was efficient to detect anomalous flood
periods/records. The Aranjuez flood record (historical and systematic) for
discharges exceeding 350 m

Flood frequency analysis using documentary data (plus gauged record)
improved the estimates of the probabilities of rare floods (return intervals
of 100 yr and higher). Moreover, historical information increases and
tests the representation of outliers in the systematic data. A flood
frequency analysis with 32 historical floods (

Although, in the long-run, the FFA (flood frequency analysis) under
stationary models provides good average
values, historical flood records indicate that secular flood occurrence
related to climate variability has changed over the last 300 yr, during
which flood quantiles have fluctuated through time at decadal time spans.
The modelling of flood time series under a non-stationary model
incorporating the NAO index and reservoir index as external covariates
demonstrate the existence of a high decadal variability on flood percentiles
reflecting the internal climatic variability in the flood occurrence.
According to non-stationary models, the peak flood associated with a 0.01
annual exceedance probability may range between 4180 and 560 m

In terms of the hydrological effects of climate change, future global circulation model projections incorporate too much uncertainty to accurately specify expected patterns of precipitation change or to estimate expected changes in the frequencies and magnitudes of extreme storm and flood events. Predictions can be improved by incorporating long-term flood records (several millennia) in climatic modelling and statistical analysis. The study of temporal variability of past climate–flood links can establish short- and long-term relationships at regional levels and in areas within different climatic zones. Regional studies of long-term climate–flood links involve calibrating the relationships, detecting trends (where they exist) and revising estimates of return periods. This integration will greatly improve our understanding of flood frequency and magnitude in the context of changing climates where the assumption of stationarity (implicit in most current flood risk models) is being questioned. The analysis of flood quantiles under stationary and non-stationary models over long-term historical periods is critical for the analysis of flood hazards of sensible infrastructure (e.g. dams, nuclear power plants) in the context of climate change.

This research was funded by the Spanish Ministry of Economy and Competitiveness through the research projects FLOODMED (CGL2008-06474-C02/BTE), SCARCE-CONSOLIDER (CSD2009-00065), CLARIES (CGL2011-29176) and PALEOMED (CGL2014-58127-C3-1-R), and by the CSIC PIE intramural project (201430E003). Edited by: G. Blöschl