The geophysical and hydrological processes governing river flow formation exhibit persistence at several timescales, which may manifest as positive seasonal correlation of streamflow at several different time lags. We investigate here how persistence propagates along subsequent seasons and affects low and high flows. We define the high-flow season (HFS) and the low-flow season (LFS) as the 3-month and the 1-month periods which usually exhibit the highest and lowest river flows, respectively. A dataset of 224 rivers from six European countries spanning more than 50 years of daily flow data is exploited. We compute the lagged seasonal correlation between selected river flow signatures, in HFS and LFS, and the average river flow in the antecedent months. Signatures are peak and average river flow for HFS and LFS, respectively. We investigate the links between seasonal streamflow correlation and various physiographic catchment characteristics and hydro-climatic properties. We find persistence to be more intense for LFS signatures than HFS. To exploit the seasonal correlation in the frequency estimation of high and low flows, we fit a bi-variate meta-Gaussian probability distribution to the selected flow signatures and average flow in the antecedent months in order to condition the distribution of high and low flows in the HFS and LFS, respectively, upon river flow observations in the previous months. The benefit of the suggested methodology is demonstrated by updating the frequency distribution of high and low flows one season in advance in a real-world case. Our findings suggest that there is a traceable physical basis for river memory which, in turn, can be statistically assimilated into high- and low-flow frequency estimation to reduce uncertainty and improve predictions for technical purposes.

Recent analyses for the Po River and the Danube River highlighted that catchments may exhibit significant correlation between peak river flows and average flows in the previous months (Aguilar et al., 2017). Such correlation is the result of the behaviours of the physical processes involved in the rainfall–runoff transformation that may induce memory in river flows at several different timescales. The presence of long-term persistence in streamflow has been known for a long time, since the pioneering works of Hurst (1951), and has been actively studied ever since (e.g. Koutsoyiannis, 2011; Montanari, 2012; O'Connell et al., 2016 and references therein). While a number of seasonal flow forecasting methods have been explored in the literature (e.g. Bierkens and van Beek, 2009; Dijk et al., 2013), attempts to explicitly exploit streamflow persistence in seasonal forecasting through information from past flows have been, in general, limited. Koutsoyiannis et al. (2008) proposed a stochastic approach to incorporate persistence of past flows into a prediction methodology for monthly average streamflow and found the method to outperform the historical analogue method (see also Dimitriadis et al., 2016, for theory and applications of the latter) and artificial neural network methods in the case of the Nile River. Similarly, Svensson (2016) assumed that the standardized anomaly of the most recent month will not change during future months to derive monthly flow forecasts for 1–3 months lead time and found the predictive skill to be superior to the analogue approach for 93 UK catchments. The above-mentioned persistence approach has also been used operationally in the production of seasonal streamflow forecasts in the UK since 2013, within the framework of the Hydrological Outlook UK (Prudhomme et al., 2017).
A few other studies have included past flow information in prediction schemes along with teleconnections or other climatic indices (Piechota et al., 2001; Chiew et al., 2003; Wang et al., 2009). Recently, it was shown that streamflow persistence, revealed as seasonal correlation, may also be relevant for prediction of extreme events by allowing one to update the flood frequency distribution based on river flow observations in the pre-flood season and reduce its bias and variability (Aguilar et al., 2017). The above studies postulated that seasonal streamflow correlation may be due to the persistence of the catchment storage and/or the weather, but no attempt was made to identify the physical drivers.

The present study aims to further inspect seasonal persistence in river flows and its determinants, by referring to a large sample of catchments in six European countries (Austria, Sweden, Slovenia, France, Spain, and Italy). We focus on persistence properties of both high and low flows by investigating the following research questions: (i) what are the physical conditions, in terms of catchment properties, i.e. geology and climate, which may induce seasonal persistence in river flow, and (ii) can floods and droughts be predicted, in probabilistic terms, by exploiting the information provided by average flows in the previous months? These questions are relevant for gaining a better comprehension of catchment dynamics and planning mitigation strategies for natural hazards. To reach the above goals, we identify a set of descriptors for catchment behaviours and climate and inspect their impact on correlation magnitude and predictability of river flows.

A few studies have analysed physical drivers of streamflow persistence on annual and deseasonalized monthly and daily time series (Mudelsee, 2007; Hirpa et al., 2010; Gudmundsson et al., 2011; Zhang et al., 2012; Szolgayova et al., 2014; Markonis et al., 2018), but the topic has been less studied on intra-annual scales relevant to seasonal forecasting of floods and droughts.

To demonstrate the high practical relevance of the identified seasonal correlations we present a technical experiment for one of the studied rivers (Sect. 7) in which the frequency distribution of both high and low flows is updated one season in advance by exploiting real-time information on the state of the catchment.

The investigation of the persistence properties of river flows focuses separately on both high and low discharges and is articulated in the following steps: (a) identification of the high- and low-flow seasons, (b) correlation assessment between the peak flow in the high-flow season (average flow in the low-flow season) and average flows in the previous months, (c) analysis of the physical drivers for streamflow persistence and its predictability through a principal component analysis (PCA), and (d) real-time updating of the frequency distribution of high and low flows for a selected case study with significant seasonal correlation by employing a meta-Gaussian approach. The above steps are described in detail in the following sections.

Season identification is performed algorithmically to identify the high-flow season (HFS) and low-flow season (LFS) for each river time series. For the estimation of HFS, we employ an automated method recently proposed by Lee et al. (2015), which identifies the high-flow season as the 3-month period centred around the month with the maximum number of occurrences of peaks over threshold (POT), with the threshold set to the highest 5 % of the daily flows. To evaluate the selection of HFS, a metric constructed as the percentage of annual maximum flows (PAMF) captured in the HFS is used. The PAMFs are classified in the subjective categories of “poor” (< 40 %), “low” (40 %–60 %), “medium” (60 %–80 %), and “high” (> 80 %) values, denoting the probability that the identified HFS is the dominant high-flow season in the record. If the identified peak month alone contains at least 80 % of the annual maximum flows, a unimodal regime is assumed and the identification procedure is terminated. In all other cases, the method allows for the search of a second peak month and the identification of a minor HFS, but we do not further elaborate on this analysis here, because we are only interested in the most extreme seasons for the purpose of predicting high and low flows.
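
The HFS selection and its PAMF score can be illustrated with a minimal Python sketch. It is not the implementation of Lee et al. (2015): the function names are hypothetical, every daily exceedance is counted as a peak (no declustering into independent events), annual maxima are taken over calendar years, and the handling of the class boundaries at exactly 60 % and 80 % is our assumption.

```python
import numpy as np

def identify_hfs(years, months, flows, pot_quantile=0.95):
    """Return (peak_month, hfs_months, pamf) for one daily flow record.

    years, months: integer arrays giving the calendar year and month (1-12)
    of each daily value in `flows`.
    """
    years = np.asarray(years)
    months = np.asarray(months)
    flows = np.asarray(flows, dtype=float)

    # Threshold separating the highest 5 % of the daily flows.
    threshold = np.quantile(flows, pot_quantile)
    exceed = flows > threshold

    # Assumption: count every daily exceedance per calendar month.
    counts = np.array([np.sum(exceed & (months == m)) for m in range(1, 13)])
    peak_month = int(np.argmax(counts)) + 1

    # 3-month window centred on the peak month, wrapping around the year.
    hfs = [((peak_month - 2 + k) % 12) + 1 for k in range(3)]

    # PAMF: percentage of annual maxima that fall inside the HFS.
    in_hfs = []
    for y in np.unique(years):
        sel = years == y
        month_of_max = months[sel][np.argmax(flows[sel])]
        in_hfs.append(month_of_max in hfs)
    pamf = 100.0 * float(np.mean(in_hfs))
    return peak_month, hfs, pamf

def classify_pamf(pamf):
    """Map a PAMF value (%) to the subjective classes used in the text."""
    if pamf < 40:
        return "poor"
    if pamf < 60:
        return "low"
    if pamf <= 80:
        return "medium"
    return "high"
```

For a record with a single dominant flood month, the sketch returns that month, its two neighbours, and a PAMF close to 100 %.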

The method proposed by Lee et al. (2015) has several advantages that make it suitable for the purpose of this research. Most importantly, it is capable of handling conditions of bimodality, which is usually a major issue for traditional methods, e.g. directional statistics (Cunderlik et al., 2004). A potential limitation is the assumption of symmetrical extension of HFS around the peak month, along with the uniform selection of its length (3-month period). The degree of subjectivity in the evaluation of the second HFS is another limitation, which is not relevant here, as we focus on the main HFS.

The LFS is herein identified as the 1-month period with the lowest amount of mean monthly flow. An alternative approach of estimating the relative frequencies of annual minima of monthly flow and selecting the month with the highest frequency as the LFS is also considered.
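
The two LFS definitions above can be compared with a short Python sketch. The helper names are hypothetical, and the sketch assumes complete calendar years of daily data with no missing months.

```python
import numpy as np

def monthly_means(years, months, flows):
    """Return an (n_years, 12) table of mean monthly flows."""
    years = np.asarray(years)
    months = np.asarray(months)
    flows = np.asarray(flows, dtype=float)
    unique_years = np.unique(years)
    table = np.full((len(unique_years), 12), np.nan)
    for i, y in enumerate(unique_years):
        for m in range(1, 13):
            sel = (years == y) & (months == m)
            if sel.any():
                table[i, m - 1] = flows[sel].mean()
    return table

def lfs_lowest_mean(table):
    """First approach: month with the lowest long-term mean monthly flow."""
    return int(np.nanargmin(np.nanmean(table, axis=0))) + 1

def lfs_most_frequent_minimum(table):
    """Alternative: month that most often holds the annual minimum of monthly flow."""
    min_month = np.nanargmin(table, axis=1)
    counts = np.bincount(min_month, minlength=12)
    return int(np.argmax(counts)) + 1
```

The two functions can disagree when a single very dry month in one year pulls down a long-term mean without that month being the usual annual minimum.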

In the case of HFS, a correlation is sought between the maximum daily flow occurring in the HFS period and the mean flow in the previous months, before the onset of HFS. For LFS, correlation is computed between the mean flow in the LFS itself and the mean flow in the previous months. We use the mean flow in the previous month as a robust proxy of “storage” in the catchment that is expected to reflect the state of the catchment, i.e. wetter or drier than usual. Since we are interested in seasonal persistence, we compute the Pearson's correlation coefficient for HFS lag up to 9 months and for LFS lag up to 11 months.
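
A minimal Python sketch of the lagged seasonal correlation follows. It assumes that a lag of k months refers to the single calendar month k months before the onset of the season, and that the seasonal signature (HFS peak or LFS mean) has already been extracted once per year; the function name and argument layout are hypothetical.

```python
import numpy as np

def lagged_seasonal_correlation(table, signature, onset_month, lag):
    """Pearson correlation between a seasonal signature and an antecedent monthly mean.

    table: (n_years, 12) mean monthly flows, one row per calendar year.
    signature: (n_years,) signature of the season starting in `onset_month`
               of the same calendar year.
    lag: number of months (1-11) before the season onset.
    """
    table = np.asarray(table, dtype=float)
    signature = np.asarray(signature, dtype=float)

    idx = onset_month - 1 - lag  # zero-based month index; negative means previous year
    if idx >= 0:
        x, y = table[:, idx], signature
    else:
        # Antecedent month lies in the previous calendar year: shift by one year.
        x, y = table[:-1, idx + 12], signature[1:]

    ok = ~np.isnan(x) & ~np.isnan(y)
    return float(np.corrcoef(x[ok], y[ok])[0, 1])
```

Calling the function for lags 1 to 9 (HFS) or 1 to 11 (LFS) yields the correlation-versus-lag profile for one river.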

An extensive investigation is carried out to identify physical drivers of seasonal streamflow correlation, in terms of catchment, geological, and climatic descriptors.

As catchment descriptors, we consider the basin area (

The area

The BI is considered based on the assumption that high groundwater storage may be a potential driver of correlation. BI is calculated from the daily flow series of the rivers following the hydrograph separation procedure detailed in Gustard et al. (2008). Flow minima are sampled from non-overlapping 5-day blocks of the daily flow series, and turning points in the sequence of minima are identified where 90 % of a given minimum is smaller than or equal to both of its adjacent values. Subsequently, linear interpolation is used in between the turning points to obtain the baseflow hydrograph. The BI is obtained as the ratio of the volume of water beneath the baseflow separation curve versus the total volume of water from the observed hydrograph, and an average value is computed over all the observed hydrographs for a given catchment. A low index is indicative of an impermeable catchment with rapid response, whereas a high value suggests high storage capacity and a stable flow regime.
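
The separation steps described above can be sketched in Python as follows. This is a simplified illustration rather than the reference procedure: the function name is hypothetical, the index is computed over the whole series between the first and last turning points instead of being averaged over individual hydrographs, and capping the interpolated baseflow at the observed flow is an assumption not stated in the text.

```python
import numpy as np

def baseflow_index(flows, block=5, factor=0.9):
    """Baseflow index from a daily flow series via block minima and turning points."""
    q = np.asarray(flows, dtype=float)

    # Minima of non-overlapping 5-day blocks and the day on which each occurs.
    n_blocks = len(q) // block
    blocks = q[: n_blocks * block].reshape(n_blocks, block)
    mins = blocks.min(axis=1)
    pos = blocks.argmin(axis=1) + np.arange(n_blocks) * block

    # Turning points: 0.9 * minimum is <= both neighbouring block minima.
    tp = [i for i in range(1, n_blocks - 1)
          if factor * mins[i] <= mins[i - 1] and factor * mins[i] <= mins[i + 1]]
    if len(tp) < 2:
        raise ValueError("need at least two turning points")

    # Linear interpolation between turning points gives the baseflow hydrograph.
    t, b = pos[tp], mins[tp]
    days = np.arange(t[0], t[-1] + 1)
    base = np.interp(days, t, b)

    # Assumption: baseflow cannot exceed the observed flow on any day.
    base = np.minimum(base, q[days])
    return float(base.sum() / q[days].sum())
```

A perfectly steady flow gives an index of 1, while a flashy series with short spikes on a low base gives a value well below 1.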

SR (m

The effect of catchment altitude is also inspected using relief maps from
the Shuttle Radar Topography Mission (SRTM) data
(

As geological descriptors we consider the percentage of catchment area with the presence of flysch (percentage of flysch – PF) and karstic formations (percentage of karst – PK) for Austrian and Slovenian catchments, respectively, where this type of information is available. A subset of Austrian catchments is characterized by the dominant presence of flysch, a sequence of sedimentary rocks characterized by low permeability, which is known to generate a very fast flow response. Karstic catchments, characterized by the irregular presence of sinkholes and caves, are also known for having rapid response times and complex behaviour; e.g. initiating fast preferential groundwater flow and intermittent discharge via karstic springs (Ravbar, 2013; Cervi et al., 2017). Geological features are also presumed to be linked to persistence properties, because geology is the main control for the baseflow index across the European continent (Kuentz et al., 2017). PK (%) and PF (%) are estimated from geological maps of Slovenia and Austria, respectively.

As climatic descriptors, the mean annual precipitation

To identify which catchment, physiographic, and climatic characteristics may
explain river memory, we attempt to regress the seasonal streamflow
correlation on the physical descriptors introduced above. We expect the
presence of multicollinearity among the predictor variables, and therefore
PCA (Pearson, 1901; Hotelling, 1933) was applied to construct uncorrelated
explanatory variables. In essence, PCA is an orthonormal linear
transformation of

PCA has useful descriptive properties of the underlying structure of the
data. These properties can be efficiently visualized in the biplot
(Gabriel, 1971), which is the combined plot of the scores of the
data for the first two principal components along with the relative position
of the
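
As an illustration of the PCA-based regression described above, the following Python sketch computes principal component scores through a singular value decomposition and regresses a response on the leading components. Standardizing the descriptors before the decomposition is our assumption, and the function names are hypothetical.

```python
import numpy as np

def pca(X):
    """Return (scores, loadings, explained_variance_ratio) for a data matrix X.

    X: (n_samples, n_variables) matrix of physical descriptors.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: descriptors are centred and scaled to unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the orthonormal principal axes.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt.T
    scores = Z @ loadings  # mutually uncorrelated explanatory variables
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained

def regress_on_components(scores, y, n_components):
    """Ordinary least squares of y on the first n_components scores (with intercept)."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), scores[:, :n_components]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef
```

Because the scores are uncorrelated, each regression coefficient can be estimated independently of the others, which is what removes the multicollinearity problem.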

In order to evaluate the usefulness of the information provided by the 1-month-lag seasonal correlation for flow signatures in HFS and LFS, we perform a real-time updating of the frequency distribution of high and low flows based on the average river flow in the previous month. A similar analysis for the high flows was carried out by Aguilar et al. (2017) for the Po and Danube Rivers. In principle, this is a data assimilation approach, since real-time information, i.e. observations of the average river flow, is used in order to update a probabilistic model and inform the forecast of the flow signature of the upcoming season.

In detail, a bi-variate meta-Gaussian probability distribution (Kelly
and Krzysztofowicz, 1997; Montanari and Brath, 2004) is fitted between the
observed flow signatures, i.e. peak flow in the HFS,

The normal quantile transform (NQT; Kelly and Krzysztofowicz, 1997) is used
in order to make the marginal probability distribution of dependent and
explanatory variables Gaussian. This is achieved as follows: (a) the sample
quantiles

In the Gaussian domain, a bi-variate Gaussian distribution is fitted between
the random explanatory variable

Summary statistics of the river descriptors. Summary statistics for
PL, PG, and PF variables are computed only for the subset of catchments with
positive values (the total number of catchments is also reported in brackets
next to the values). PK is used as a categorical
variable (PK is either higher or lower than 50 % of catchment area),
therefore sample statistics are not computed in this case, but the number of
stations with PK

Updated Köppen–Geiger climatic map for period 1951–2000 (Kottek et al., 2006) showing the location of the 224 river gauge stations.

The dataset includes 224 records spanning more than 50 years of daily river
flow observations from gauging stations, mostly from non-regulated streams. A
few catchments are impacted by regulation. Among the 224 rivers, 108 are
located in Austria, 69 in Sweden, 31 in Slovenia, 13 in France, two in Spain,
and one in Italy. Catchment areas vary significantly, the largest being the
Po River basin in Italy (70 091 km

It is relevant to note that 16 of the Austrian rivers are subject to regulation, which may alter the persistence properties of river flows. This relates to generally “mild” forms of regulation, i.e. upstream regulation with a very low degree of flow attenuation, hydropower operations, and flow diversions to and from the basin. A preliminary examination of these rivers did not reveal any significant change in the flow regime over time. The presence of regulation does not preclude the exploitation of correlation for predicting river flows in probabilistic terms, but it may affect the analysis of physical drivers, as it may enhance or reduce persistence in the natural river flow regime. Given that detailed information is generally lacking on the impact of regulation (Kuentz et al., 2017), we assume stationarity of the river flows for all the catchments herein considered and, additionally, assume that river management does not significantly affect the identification of the physical drivers.

Approximately half of the 224 rivers are characterized by at least one
high-flow season with medium or higher significance (PAMF of HFS

Regarding the LFS identification, the two considered approaches (see Sect. 2.1) agree for 139 out of 224 stations, but the first method, i.e. the 1-month period with the lowest amount of mean monthly flow, is selected as being more relevant to the purpose of computing mean flow correlations.

Box plots of seasonal correlation coefficient against lag time for
HFS

LFS correlation is markedly higher than the corresponding HFS correlation for
lags 1–6, and its median remains
higher than 0 for more lags (see Fig. 2). For the case of HFS correlation, we
focus only on the most significant first lag, for which 73 rivers are found
to have correlation significantly higher than 0 at a 5 % significance
level. In Fig. 3, the autocorrelation of the whole monthly series is compared
to the LFS correlation for lags of 1 and 2 months, in order to show that the
seasonal correlation for LFS is significantly higher than its counterpart
computed by considering the whole year. The latter is also confirmed by the
Kolmogorov–Smirnov test for both LFS lags (corresponding

Box plots of lag-1 and lag-2 correlation coefficients for LFS analysis (orange) and the whole monthly series (white) for the 224 rivers. The lower and upper ends of the box represent the first and third quartiles, respectively, and the whiskers extend to the most extreme value within 1.5 IQR (interquartile range) from the box ends.

Spatial distribution of the lag-1 correlation
coefficients for HFS

Figure 4 shows the spatial pattern of HFS and LFS streamflow correlations. It is interesting to notice the emergence of spatial clustering in the correlation magnitude, which implies its dependence on different spatially varying physical mechanisms. For example, for HFS, a geographical pattern emerges within France, since the highest correlation coefficients are located in the northern part of the country, which is characterized by an oceanic climate and higher baseflow indices.

To attribute the detected correlations to physical drivers, we define six groups of potential drivers of seasonal correlation magnitude: basin size, flow indices, the presence of lakes and glaciers, catchment elevation, catchment geology, and hydro-climatic forcing. For some of the descriptors the information is only available for a few countries.

In what follows, we will use the term “positive (negative) impact on
correlation” to imply that an increasing value of the considered descriptor
is associated with increasing (decreasing) correlation. For each descriptor, we
also report, between parentheses, the Spearman's rank correlation coefficient

Figure 5 shows that there is only a weak positive impact of the catchment
area (log transformed) on correlation for HFS (

Scatter plots of lag-1
HFS

Scatter plots of lag-1 HFS (bottom panels) and LFS streamflow
correlation

The effect of the BI and SR is shown in Fig. 6. The BI (Fig. 6a) appears to be a
marked positive driver for LFS (

Detailed information on the presence of lakes is available for the 69 Swedish
catchments, while the areal extension of glaciers is known for the 108
Austrian catchments. Figure S1 in the Supplement shows that the impact of
lake area (Fig. S1a) on correlation for LFS and HFS is not significant but
positive (

Relief maps from SRTM elevation data for the HFS and LFS lag-1 correlations of the rivers. Note that elevation scale is different for each region. Legend shows the colour assigned to each class of correlation for the data.

Digital elevation model of the Austrian river network
depicting the spatial distribution of lag-1 positive correlation for HFS

The areal coverage of the SRTM data is limited to 60

In the case of Austrian catchments, a 1 km resolution digital model is also used to extract information on elevation. Figure 8 confirms that there is a positive correlation pattern emerging with elevation for LFS. Based on local climatological information, it can be concluded that the spatial pattern for LFS correlation is reflective of the timing and strength of seasonality of the low flows in Austria, where dry months occur in lowlands during the summer due to increased evapotranspiration and in the mountains during winter (mostly February) due to snow accumulation which is characterized by stronger seasonality compared to the lowlands flow regime (Parajka et al., 2016; see Fig. 1). Concerning HFS in the same region, high flows are significantly impacted by the seasonality of extreme precipitation (Parajka et al., 2010), which is highly variable, with the exception of the rivers where high flows are generated by snowmelt. Therefore, a spatially consistent pattern does not clearly emerge.

Box plots of lag-1 correlation for Slovenian rivers with
more than 50 % presence of karstic formations (PK) and rivers with no or
less presence for HFS analysis

Two different geological behaviours are identified which may impact river
correlation. We first focus on 21 Slovenian catchments (out of 31) where
more than 50 % of the basin area is characterised by the presence of
karstic aquifers (percentage of karstic areas PK

In a second analysis, we focus on Austrian catchments and investigate the
relationship between correlation and percentage of flysch coverage, PF.
Figure S2 shows that there is not a prevailing pattern in
either case (

Figure 10 shows the lag-1 HFS and LFS correlations against estimates of the
annual precipitation

Scatter plots of lag-1 HFS and LFS correlation versus
annual precipitation

Differences in the mean values of the descriptors between the
20-highest-correlation-river group for HFS and LFS
and the remaining rivers (204).

To gain further insight into the results we select the 20 catchments with the
highest streamflow seasonal correlation coefficients for both HFS and LFS
periods in order to investigate their physical characteristics in relation to
the remaining set of rivers. Table 2 summarizes statistics for selected
descriptors in order to identify dominant behaviours. We also compare the
number of rivers with distinctive features, i.e. lakes

By focusing on HFS, one can notice that the catchments with higher seasonal correlation are characterized by larger catchment area; higher baseflow index and temperature with respect to the remaining catchments; and lower specific runoff, precipitation, and wetness. The presence of lake, glacier, karstic, and flysch areas does not appear to have a significant effect at a 5 % significance level. More robust considerations can be drawn for the LFS; higher seasonal correlation is found for larger catchments with a higher baseflow index and lower specific runoff, precipitation, and wetness. Decreasing temperature is strongly associated with higher correlation for the LFS. The presence of lakes plays a significant role, both for lag-1 and lag-2 correlations, with the latter also being significantly influenced by the presence of glaciers.

Loadings of the three principal components for ln

We attempt to fit a linear regression model to relate correlation to physical
drivers, in order to support correlation estimation for ungauged catchments.
To avoid the impact of multicollinearity in the regression while additionally
summarizing river information, we apply PCA (see Sect. 2.2). Although
correlation effects are efficiently dealt with via the PCA, we avoid
including highly correlated variables in the analysis. For example, the De
Martonne index, precipitation and SR are mutually highly correlated (all
Pearson's cross-correlations are higher than 0.6), therefore we only
consider the SR in the PCA because it shows a more robust linear relationship
with correlation magnitude. We select

Principal component distance biplot showing the
principal component scores on the first two principal axes along with the
vectors (brown arrows) representing the coefficients of the baseflow index
(BI), specific runoff (SR), natural logarithm of basin area ln

Naturally, the statistical behaviour of the indices reflects the known local controls for certain rivers. For example, the observed lowest BI in Slovenia is consistent with the presence of karstic formations for the majority of the Slovenian rivers, as is the higher BI in Sweden and Austria, which is related to the presence of lakes and glaciers in both countries.

Summary of linear regression results for the LFS model.

Diagnostic plots of linear regression for the LFS model.
Residuals versus the first

In the case of HFS, all the examined linear models (combinations of ln

We apply the technical experiment (see Sect. 2.3) for high and low flows to the Oise River in France and assess the difference in the estimated flood and low-flow magnitudes. We update the probability distribution of high and low flows after the occurrence of the upper 95 % and lower 5 % sample quantile of the observed mean flow in the previous month, respectively.

The Oise River (55 years of daily flow values) at Sempigny in France has a
basin area of 4320 km

Conditioning the frequency distributions for high and
low flows for the Oise River. Plots of the residuals of the linear
regression given by Eq. (2) for the HFS

A visual inspection of the residual plots is also performed (Fig. 13a, b) in order to evaluate the assumption of homoscedasticity of the residuals of the regression models given by Eq. (2). The residuals do not show any apparent trend, and the Gaussian linear model is therefore accepted. Figure 13c, d shows the conditioned and unconditioned probability distributions of peak and low flows in the Gaussian domain. As follows from Eqs. (3) and (4), the variance of the updated (conditioned) distributions decreases while the mean value increases.
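
The updating step can be illustrated with a short Python sketch of the normal quantile transform and of the conditioning in the Gaussian domain. It relies on the standard result for a bivariate standard Gaussian pair with correlation ρ, for which the conditional distribution given x has mean ρx and variance 1 − ρ². Whether this matches the exact notation of Eqs. (3) and (4) is an assumption, as are the Weibull plotting position i/(n + 1) and the function names.

```python
import numpy as np
from statistics import NormalDist

def nqt(sample):
    """Normal quantile transform: map a sample to standard normal scores."""
    x = np.asarray(sample, dtype=float)
    # Ranks 1..n (ties are broken by order of appearance).
    ranks = np.argsort(np.argsort(x)) + 1
    # Assumption: Weibull plotting position for the empirical probabilities.
    p = ranks / (len(x) + 1.0)
    nd = NormalDist()
    return np.array([nd.inv_cdf(pi) for pi in p])

def condition_gaussian(z_x, z_y, x_obs):
    """Mean and variance of z_y given z_x = x_obs in the Gaussian domain."""
    rho = float(np.corrcoef(z_x, z_y)[0, 1])
    cond_mean = rho * x_obs
    cond_var = 1.0 - rho**2
    return cond_mean, cond_var
```

For a positive correlation and an antecedent flow above its median (x_obs > 0), the conditional mean shifts upwards and the conditional variance falls below 1; the transformed values would then be mapped back to flow units through the inverse NQT.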

After application of the inverse NQT the conditioned peak flows are modelled
through the EV1 distribution and compared to the unconditioned (observed)
peak flows. The corresponding Gumbel probability plot for conditioned and
unconditioned distributions is shown in Fig. 13e. For the return period of
200 years, the updated distribution shows a 6 % increase in the flood
magnitude for the Oise River (307.7 to 326.44 m

The methodology presented herein aims to progress our physical understanding of seasonal river flow persistence for the sake of exploiting the related information to improve probabilistic prediction of high and low flows. The correlation of average flow in the previous months with the LFS flow and HFS peak flow was found to be relevant, with the former prevailing over the latter. This result was foreseen, since the LFS correlation refers to average flow, while the HFS correlation is related to rapidly occurring events. We also aim to investigate physical drivers for correlation and quantify their relative impact on correlation magnitude. Therefore, a thorough investigation of the geophysical and climatological features of the considered catchments was carried out.

We found that the increasing basin area and baseflow index are associated
with increasing seasonal streamflow correlation, yet the latter has a
stronger impact. In this respect, Mudelsee (2007), Hirpa et al. (2010), and
Szolgayova et al. (2014) also found positive dependencies of long-term
persistence on basin area, and Markonis et al. (2018) found a positive impact
too, but for larger spatial scales (

Previous studies also pointed out that correlation increases for groundwater-dominated regimes (Yossef et al., 2013; Dijk et al., 2013; Svensson, 2016) and slower catchment response times (Bierkens and van Beek, 2009), which concurs with the impact of the baseflow index found herein as well as with the observed impact of fast responding karst areas. The latter findings are also in agreement with our conclusion that correlation decreases with increasing rapidity of river flow formation, which, for instance, occurs in the presence of karstic areas and wet soils, explaining why persistence decreases with high specific runoff, as also confirmed by other studies (Gudmundsson et al., 2011; Szolgayova et al., 2014).

Other contributions also reported higher streamflow persistence in drier conditions, either relating to lower specific runoff or mean areal precipitation estimates (Szolgayova et al., 2014; Markonis et al., 2018). It was postulated that this is due to wet catchments showing increased short-term variability compared to drier catchments (Szolgayova et al., 2014) and having a faster response to rainfall due to saturated soil. A similar conclusion has been reached by other previous studies reporting that low humidity catchments are more sensitive to interannual rainfall variability (Harman et al., 2011), therefore leading to enhanced persistence. Yet, these studies refer to generally humid regions and cannot be extrapolated to more arid climates. A related conclusion is proposed by Seneviratne et al. (2006), who found the highest soil moisture memory for intermediate soil wetness. These results do not contrast with our findings, which refer to a wide range of climatic conditions. In fact, our finding that increased wetness has a negative impact on seasonal memory of both high and low flows extends the above results to the seasonal scale and, interestingly, to both types of extremes.

We also confirm the role of lakes in determining higher catchment storage and therefore positive correlations for the LFS, which has only been reported for annual persistence in a few sites (Zhang et al., 2012).

The effect of snow cover for lag-1 LFS correlation is also revealed by the Austrian catchments. The mountainous rivers, directly affected by the process of snow accumulation, exhibit winter LFS and higher correlation than the rivers in the lowlands, which are more prone to drying out due to evapotranspiration in the hotter summer months. The inspection of elevation data confirmed the role of high altitudes in increasing LFS correlation, which is likely related to storage effects due to snow accumulation and gradual melting. In this respect, Kuentz et al. (2017) found that topography exerts dominant controls over the flow regime in the larger European region, controlling the flashiness of flow and being a particularly important driver for other low-flow signatures too. In fact, topography may affect the flow regime directly, through flow routing, but also indirectly, because of orographic effects in precipitation and hydro-climatic processes affected by elevation (e.g. snowmelt and evapotranspiration).

Regarding atmospheric forcing, we find LFS correlation to be negatively correlated to mean areal temperature and annual precipitation. The former result may be explained, considering that increased evapotranspiration (higher temperature) is likely to dry out LFS flows while snow coverage (lower temperature) was found to be associated with higher LFS correlation. An apparently different conclusion was drawn by Szolgayova et al. (2014a) and Gudmundsson et al. (2011), who reported increasing persistence with increasing mean temperature postulating that snow-dominated flow regimes smooth out interannual fluctuations. Yet, it should be noted that they refer to interannual variability, while we refer here to seasonal correlation and therefore to shorter timescales, which imply a different dynamic of snow accumulation and snowmelt; latitude may also play a relevant role in this, since in southern Europe the complete ablation of snow can occur more than once during the cold season, and sublimation may account for 20 %–30 % of the annual snowfall (Herrero and Polo, 2016), decreasing the amount of snowmelt and impacting LFS flows in the summer season.

Snowmelt mechanisms are found to increase predictive skill during low-flow periods in some other studies (Bierkens and van Beek, 2009; Mahanama et al., 2011; Dijk et al., 2013). However, in the glacier-dominated regime of western Alpine and central Austrian catchments, it is unlikely that this is a relevant driver of higher correlation, since low flow occurs in the winter months. Yet the mountainous, glacier-dominated rivers still show increased LFS correlation compared to rivers in the lowlands, which agrees well with other studies that have found less uncertainty in the rainfall–runoff modelling in this regime owing to the greater seasonality of the runoff process and the decreased impact of rainfall compared to the rainfall-dominated regime of the lowlands (e.g. Parajka et al., 2016).

Although the considerable uncertainty of areal precipitation estimates should be acknowledged, the contribution of annual precipitation interestingly complements the negative effect of increasing specific runoff –which is highly correlated to

This research investigates the presence of persistence in river flow at the seasonal scale, the associated physical drivers, and the prospect for employing the related information to improve probabilistic prediction of high and low flows by exploring a large sample of European rivers. The main findings are summarized below:

Rivers in Europe show persistent features at the seasonal timescale, manifested as correlation between high- and low-flow signatures (peak flows in HFS and average flows in LFS) and average flows in the previous month. Correlation for LFS signatures is found to be consistently higher than for HFS signatures.

Seasonal correlation shows increased spatial variability together with spatial clustering.

Storage mechanisms, groundwater-dominated basins, and slower catchment response time, as reflected by large basin areas, a high baseflow index, and the presence of lakes, amplify seasonal correlation. On the contrary, correlation is lower in quickly responding karstic basins and increased wetness conditions, as revealed by high specific runoff.

Low mean areal temperature is associated with higher LFS correlation owing to the weaker drying-out effect of evapotranspiration and the mechanism of snow accumulation at higher altitudes. Higher mean areal precipitation is associated with lower LFS predictability, possibly due to the presence of saturated conditions and increased short-term variability in wetter climates.

The drivers of LFS predictability are easier to identify and allow for the opportunity to construct regression models for possible application to ungauged basins (see Sect. 6).

HFS and LFS correlation may directly apply to the probabilistic prediction of “extremes”, i.e. high and low flows, as increased correlation can be exploited in various stochastic models. Such an application was performed in Sect. 7 in a data assimilation setting for a river of marked technical relevance.

Regarding the last point, once a significant correlation is identified, it may be exploited in other model variants as well, e.g. adding more dependent variables of lagged flow and/or coupling with other relevant explanatory variables, such as teleconnections or antecedent rainfall, in multivariate prediction schemes. Indeed, the presence of river memory at the seasonal scale represents a possible opportunity to improve the prediction of water-related natural hazards by reducing uncertainty of associated estimates and allowing significant lag time for decision-making and hazard prevention. Besides the high relevance for extremes, this type of seasonal predictability could also be of interest to the management of water resources by, for instance, exploring the memory properties of a minor HFS.
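As an illustration of how such a correlation can be exploited, the following minimal Python sketch conditions a flow signature on the antecedent flow in the spirit of a bi-variate meta-Gaussian model. It is not the implementation used in this study: the empirical normal quantile transform with Weibull plotting positions, the function names (`normal_scores`, `pearson`, `condition_on_antecedent`), and the numerical values are all assumptions made for illustration only. Both variables are mapped to standard-normal scores, their correlation is estimated in the transformed space, and the conditional distribution of the signature score given an observed antecedent flow then follows from the bi-variate normal model, with mean equal to the correlation times the observed score and variance equal to one minus the squared correlation.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal used by the normal quantile transform


def normal_scores(values):
    """Map a sample to standard-normal scores via Weibull plotting positions.

    Assumes no ties; tied values would simply receive consecutive ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def condition_on_antecedent(antecedent, signature, observed):
    """Return (rho, mean, std) of the signature's normal score, given an
    observed antecedent flow, under a bi-variate normal model for the scores."""
    rho = pearson(normal_scores(antecedent), normal_scores(signature))
    # Empirical non-exceedance probability of the new observation,
    # clamped so that it stays strictly inside (0, 1).
    n = len(antecedent)
    rank = sum(1 for v in antecedent if v <= observed)
    z_obs = STD_NORMAL.inv_cdf(min(max(rank, 1), n) / (n + 1))
    # Guard against a tiny negative variance caused by rounding when |rho| ~ 1.
    return rho, rho * z_obs, max(0.0, 1.0 - rho ** 2) ** 0.5


# Synthetic, purely illustrative values (not data from the study).
antecedent_flow = [12.0, 30.5, 18.2, 25.1, 9.8, 21.7, 15.3, 27.9]
lfs_flow = [5.1, 11.8, 7.9, 9.4, 4.2, 9.9, 6.0, 10.7]
rho, mean_z, std_z = condition_on_antecedent(antecedent_flow, lfs_flow, observed=28.0)
print(f"rho={rho:.2f}, conditional mean={mean_z:.2f}, conditional std={std_z:.2f}")
```

Quantiles of the conditional normal distribution would then be mapped back to flow units through the inverse of the marginal distribution of the signature. A stronger correlation gives a smaller conditional standard deviation, which is the sense in which river memory reduces the uncertainty of the updated frequency distribution.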

The inspection of the physical basis, apart from advancing our understanding of the catchment dynamics and enabling predictions in ungauged basins, is highly important, as it may guide the search for other dependent variables and build confidence in the formation of process-based stochastic models (Montanari and Koutsoyiannis, 2012). A large sample of indices was herein inspected, yet more data are necessary in order to allow for more certain and generalized conclusions worldwide. An important note is the effect of regulation, which, due to the lack of objective data, is not completely understood. However, the opportunity of exploiting correlation is not affected by the presence of regulation, provided that the management of river flow does not change in time.

Our results indicate that river memory carries information with both theoretical and operational potential to improve the understanding and prediction of extremes, support decision-making, and increase the level of preparedness for water-related natural hazards.

The data and code used in this study may be made available to the readers upon request to the corresponding author.

The supplement related to this article is available online at:

The authors declare that they have no conflict of interest.

The present work was (partially) developed within the framework of the Panta Rhei Research Initiative of the International Association of Hydrological Sciences (IAHS). Part of the results were elaborated in the Switch-On Virtual Water Science Laboratory that was developed in the context of the SWITCH-ON (Sharing Water-related Information to Tackle Changes in the Hydrosphere – for Operational Needs) project, funded by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 603587. Nejc Bezak gratefully acknowledges funding by the Slovenian Research Agency (grants J2-7322 and P2-0180). María Bermúdez gratefully acknowledges financial support from the Spanish Regional Government of Galicia, Postdoctoral Grant Program 2014.

Edited by: Louise Slater

Reviewed by: three anonymous referees