Hydrol. Earth Syst. Sci., 21, 839–861, 2017
https://doi.org/10.5194/hess-21-839-2017
Copernicus Publications, Göttingen, Germany

Can assimilation of crowdsourced data in hydrological modelling improve flood prediction?

Maurizio Mazzoleni (m.mazzoleni@unesco-ihe.org), Martin Verlaan, Leonardo Alfonso, Martina Monego, Daniele Norbiato, Michele Ferri, and Dimitri P. Solomatine

UNESCO-IHE Institute for Water Education, Hydroinformatics Chair Group, Delft, the Netherlands
Deltares, Delft, the Netherlands
Alto Adriatico Water Authority, Venice, Italy
Delft University of Technology, Water Resources Section, Delft, the Netherlands

Received: 28 September 2015 – Accepted: 21 January 2017 – Published: 14 February 2017

This work is licensed under a Creative Commons Attribution 3.0 Unported License (http://creativecommons.org/licenses/by/3.0/).

Abstract
Monitoring stations have been used for decades to properly measure
hydrological variables and better predict floods. To this end, methods to
incorporate these observations into mathematical water models have also been
developed. Moreover, in recent years, continued technological advances, combined with the growing inclusion of citizens in participatory processes related to water resources management, have encouraged the growth of citizen science projects around the globe. In turn, this has stimulated the
spread of low-cost sensors to allow citizens to participate in the collection
of hydrological data in a more distributed way than the classic static
physical sensors do. However, two main disadvantages of such crowdsourced
data are the irregular availability and variable accuracy from sensor to
sensor, which makes them challenging to use in hydrological modelling. This
study aims to demonstrate that streamflow data, derived from crowdsourced
water level observations, can improve flood prediction if integrated in
hydrological models. Two different hydrological models, applied to four case
studies, are considered. Realistic (albeit synthetic) time series are used to
represent crowdsourced data in all case studies. It is found that data accuracy has much more influence on the model results than the irregular frequency at which the streamflow data are assimilated. This study demonstrates that data collected by citizens,
characterized by being asynchronous and inaccurate, can still complement
traditional networks formed by few accurate, static sensors and improve the
accuracy of flood forecasts.
Introduction
Observations of hydrological variables measured by physical sensors have
been increasingly integrated into mathematical models by means of model
updating methods. The use of these techniques allows for the reduction of
intrinsic model uncertainty and improves the flood forecasting accuracy
(Todini et al., 2005). The main idea behind model updating
techniques is to either update model input, states, parameters, or outputs as
new observations become available (Refsgaard,
1997; WMO, 1992). Input update is the classical method used in operational
forecasting, and uncertainties of the input data can be considered as the
main source of uncertainty of the model (Bergström, 1991; Canizares et al.,
1998; Todini et al., 2005). Regarding the state updating, filtering methods
such as the Kalman filter (Kalman, 1960), extended
Kalman filter (Aubert et al., 2003; Madsen and Cañizares, 1999; Verlaan, 1998), ensemble
Kalman filter (Evensen, 2006), and particle filter (Weerts and El
Serafy, 2006) are the most used approaches to update a model when new
observations are available.
Due to the complex nature of the hydrological processes, spatially and
temporally distributed measurements are needed in the model updating
procedures to ensure a proper flood prediction (Clark et al., 2008;
Mazzoleni et al., 2015; Rakovec et al., 2012). However, traditional physical
sensors require proper maintenance and personnel, which can be cost
prohibitive for a vast network. For this reason, improvements to monitoring
technology have led to the spread of low-cost sensors to measure
hydrological variables, such as water level or precipitation, in a more
distributed way. The main advantage of this type of sensor, referred to in this paper as “social sensors”, is that it can be used not only by technicians but also by regular citizens; thanks to their reduced cost and the voluntary labour of citizens, such sensors provide more spatially distributed coverage. The idea of designing these alternative networks of low-cost social sensors and using the obtained crowdsourced observations is the basis
of the European project WeSenseIt (2012–2016) and various other projects
that proposed to assess the usefulness of crowdsourced observations inferred
by low-cost sensors owned by citizens. For instance, in the project
CrowdHydrology (Lowry and Fienen, 2013), a method was developed for untrained observers to monitor stream stage at designated gauging staffs via text messages of water levels. Cifelli et al. (2005)
described a community-based network of volunteers (CoCoRaHS), engaged in
collecting precipitation measurements of rain, hail, and snow. An example of citizen-based hydrological monitoring of rainfall and streamflow, established in 2009 within the Andean ecosystems of Piura, Peru, is reported in Célleri et al. (2009). Degrossi et al. (2013)
used a network of wireless sensors to map the water level in two rivers passing through São Carlos, Brazil. Recently, the iSPUW project was
initiated to integrate data from advanced weather radar systems, innovative
wireless sensors, and crowdsourcing of data via mobile applications in
order to better predict flood events for the Dallas–Fort Worth Metroplex urban
water systems (ISPUW, 2015; Seo et al., 2014). Other examples of
crowdsourced water-related information include the so-called Crowdmap
platform for collecting and communicating the information about the floods
in Australia in 2011 (ABC, 2011) and informing citizens about the proper
time for water supply in an intermittent water system (Alfonso, 2006; Au et
al., 2000; Roy et al., 2012). Wehn et al. (2015) stressed the importance and
need of public participation in water resources management to ensure
citizens' involvement in the flood management cycle. Buytaert et al. (2014)
provide a detailed and interesting review of the examples of citizen science
applications in hydrology and water resources science. In this review paper,
the potential of citizen science, based on robust, cheap, and
low-maintenance sensing equipment, to complement more traditional ways of
scientific data collection for hydrological sciences and water resources
management is explored.
The traditional hydrological observations from physical sensors have a
well-defined structure in terms of frequency and accuracy. On the other hand,
crowdsourced observations are provided by citizens with varying experience in measuring environmental data and little connection with one another; as a consequence, low correlation between the measurements might be observed. So far, in operational hydrological practice, crowdsourced data are not integrated into forecasting models but are only used to compare model results with observations in post-event analyses. This can be attributed to the intrinsically variable accuracy of such data, the lack of confidence in the data quality from these heterogeneous sensors, and the variable life span of the crowdsourced observations.
Regarding data quality, Bordogna et al. (2014) and
Tulloch and Szabo (2012) stated that quality control
mechanisms should consider contextual conditions to deduce indicators about
reliability (the expertise level of the crowd), credibility (the volunteer
group), and performance of volunteers as they relate to accuracy,
completeness, and precision level. Bird et al. (2014) addressed the issue of
data quality in conservation ecology by means of new statistical tools to
assess random error and bias. Cortes Arevalo et al. (2014) evaluated data quality by distinguishing between in situ data collected by volunteers and by technicians and comparing the most frequent value reported at a given location. With in situ exercises, it might be possible to have an indication
of the reliability of data collected. However, this approach is not enough at
an operational level to define accuracy in data quality. For this reason, to
estimate observation accuracy in real time, one possible approach could be to
filter out the measurements following a geographic approach which defines
semantic rules governing what can occur at a given location (e.g.
Vandecasteele and Devillers, 2013). Another approach could be to compare
measurements collected within a predefined time window in order to calculate
the most frequent value, the mean, and the standard deviation.
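The second approach can be illustrated with a minimal Python sketch (the observation format, a list of hypothetical (time, value) tuples, and the rounding used to define the most frequent value are assumptions for illustration):

```python
from statistics import mean, stdev

def window_stats(observations, t_start, t_end):
    """Summarize crowdsourced values received in [t_start, t_end):
    most frequent (rounded) value, mean, and standard deviation."""
    values = [v for t, v in observations if t_start <= t < t_end]
    if not values:
        return None
    rounded = [round(v, 1) for v in values]
    mode = max(set(rounded), key=rounded.count)   # most frequent value
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"mode": mode, "mean": mean(values), "std": spread}

# hypothetical water level readings: (hour, level in m)
obs = [(0.2, 1.31), (0.5, 1.28), (0.7, 1.30), (0.9, 2.10)]
stats = window_stats(obs, 0.0, 1.0)
```

A reading far from the window statistics (such as the 2.10 m outlier above) could then be flagged or down-weighted.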
Crowdsourced observations can be defined as asynchronous because
they do not have predefined rules about the arrival frequency (the
observation might be taken once, occasionally, or at irregular time steps,
which can be smaller than the model time step) and accuracy of the
measurement. In a recent paper, Mazzoleni et al. (2015) presented results of the study of the effects of distributed
synthetic streamflow observations having synchronous intermittent temporal
behaviour and variable accuracies in a semi-distributed hydrological model.
It was shown that integrating distributed uncertain intermittent observations with single measurements coming from physical sensors would allow for further improvements in model accuracy. However, that study did not consider the possibility that asynchronous observations might arrive at moments not coordinated with the model time steps. A possible
solution to handle asynchronous observations in time with the ensemble Kalman
filter (EnKF) is to assimilate them at the moments coinciding with the model
time steps (Sakov et al., 2010). However, as
these authors mention, this approach requires the disruption of the ensemble
integration, the ensemble update, and a restart, which may not be feasible for
large-scale forecasting applications. Continuous assimilation approaches,
such as three-dimensional and four-dimensional variational methods (3D-Var
and 4D-Var), are usually implemented in oceanographic modelling in order to
integrate asynchronous observations at their corresponding arrival moments (Derber
and Rosati, 1989; Huang et al., 2002; Macpherson, 1991; Ragnoli et al.,
2012). In fact, oceanographic observations are commonly collected at
asynchronous times. For this reason, in variational data assimilation, the
past asynchronous observations are simultaneously used to minimize the cost
function that measures the weighted difference between background states and
observations over the time interval, and identify the best estimate of the
initial state condition (Drecourt, 2004; Ide et al., 1997; Li and Navon, 2001).
In addition to the 3D-Var and 4D-Var
methods, Hunt et al. (2004) proposed a four-dimensional ensemble Kalman filter (4DEnKF) which adapts EnKF to handle
observations that have occurred at non-assimilation times. Furthermore, for
linear dynamics, 4DEnKF is equivalent to the instantaneous assimilation of
the measured data (Hunt et al., 2004).
Similarly to 4DEnKF, Sakov et al. (2010)
proposed a modification of the EnKF, the asynchronous ensemble Kalman filter
(AEnKF), to assimilate asynchronous observations (Rakovec et
al., 2015). Contrary to the EnKF, in the AEnKF, current and past observations
are simultaneously assimilated at a single analysis step without the use of
an adjoint model. Yet another approach to assimilate asynchronous
observations in models is the so-called first-guess at the appropriate time
(FGAT) method. Like in 4D-Var, the FGAT compares the observations with the
model at the observation time. However, in FGAT, the innovations are assumed
constant in time and remain the same within the assimilation window
(Massart et al., 2010). In light of the reviewed approaches, and given the linearity of the hydrological models implemented here, this study uses a pragmatic method to assimilate the asynchronous crowdsourced observations.
The main objective of this study is to assess the potential use of
crowdsourced data within hydrological modelling. In particular, the specific
objectives of this study are (a) to assess the influence of different arrival
frequencies and accuracies of crowdsourced data from a single social sensor
on the assimilation performance and (b) to integrate distributed low-cost
social sensors with a single physical sensor to assess the improvement in
the streamflow prediction in an early warning system. The methodology is applied in the Brue (UK), Sieve (Italy), Alzette (Luxembourg), and Bacchiglione (Italy) catchments, using a lumped hydrological model for the first three and a semi-distributed model for the Bacchiglione. Synthetic time series, asynchronous in time and with random accuracies, are generated to imitate the crowdsourced data.
The study is organized as follows. Firstly, the case studies, the
crowdsourced data and the datasets used are presented. Secondly, the
hydrological models, the procedure used to integrate the crowdsourced data,
and the set of experiments are reported. Finally, the results, discussion,
and conclusions are presented.
Site locations and data
Case studies
Four different case studies are used to validate the obtained results for
areas having diverse topographical and hydrometeorological features and
represented by two different hydrological models. The Brue, Sieve, and
Alzette catchments are considered because of the availability of
precipitation and streamflow data, while the Bacchiglione catchment is one
of the official case studies of the WeSenseIt Project (Huwald et al., 2013).
Brue catchment
The first case study is located in the Brue catchment
(Fig. 1), in Somerset, with a drainage area of
about 135 km2 at the catchment outlet in Lovington. The Shuttle Radar
Topography Mission digital elevation model (SRTM DEM) at 90 m resolution is used to derive the topographical characteristics, the stream network, and the resulting time of concentration (about 10 h), estimated by means of the Giandotti equation (Giandotti, 1933). The hourly
precipitation (49 rainfall stations) and streamflow data used in this study
are supplied by the British Atmospheric Data Centre from the HYREX
(Hydrological Radar Experiment) project (Moore et al., 2000; Wood et
al., 2000). The average precipitation value in the catchment is estimated
using ordinary kriging (Matheron, 1963).
Representation of the four case studies considered in this
study; clockwise: Brue catchment; Sieve catchment; Alzette catchment; Bacchiglione
catchment.
Sieve catchment
The second case study is the Sieve catchment (Fig. 1), a tributary of the Arno River, located in the central Apennines in Italy. The catchment has a drainage area of about 822 km2 and a length of 56 km, and it covers mostly hilly and mountainous areas with an average elevation of 470 m above sea level. The time of
concentration of the Sieve catchment is about 12 h. Hourly streamflow data
are provided by the Centro Funzionale di Monitoraggio Meteo
Idrologico-Idralico of the Tuscany Region at the outlet section of the
catchment at Fornacina. The mean areal precipitation is calculated by the
Thiessen polygon method using 11 rainfall stations (Solomatine and Dulal, 2003).
Alzette catchment
The Alzette catchment is located largely within the Grand Duchy of Luxembourg. The drainage area of the catchment is about 288 km2, and the river has a length of 73 km across France and Luxembourg. The catchment
covers cultivated land, grassland, forest land, and urbanized land
(Fenicia et al., 2007). The Thiessen polygon method
is used for averaging the series at the individual stations and calculating
hourly rainfall series (Fenicia et al., 2007),
while streamflow data are measured at the Hesperange gauging station.
Bacchiglione catchment
The last case study is the upstream part of the Bacchiglione River basin,
located in the north-east of Italy, a tributary of the Brenta River, which flows into the Adriatic Sea at the south of the Venetian Lagoon and at the
north of the Po River delta. The study area has an extent of about 400 km2 and a river length of about 50 km (Ferri et al., 2012). The main urban area in the downstream part of the study area
is Vicenza. The analysed part of the Bacchiglione River has three main
tributaries. On the western side are the confluences with the Bacchiglione
of the Leogra and the Orolo rivers, while on the eastern side is the
Timonchio River (see Fig. 2). The Alto Adriatico
Water Authority (AAWA) has implemented an early warning system to forecast
the possible future flood events.
Structure of the hydrological model and location of the physical
(green dots), social (red dots), and Ponte degli Angeli (PA, blue dots)
sensors implemented in the Bacchiglione catchment by the Alto Adriatico Water
Authority.
Crowdsourced data
Social sensors can be used by citizens to provide crowdsourced distributed
hydrological observations such as precipitation and water level. An example
of these sensors is a staff gauge complemented by a quick response (QR) code, from which citizens can read the water level indication and send observations via a mobile phone application. Another example is the collection of rainfall data via
lab-generated videos (Alfonso et al., 2015). Recently, within the activities
of the WeSenseIt Project (Huwald et al.,
2013), one physical sensor and three staff gauges complemented by a QR code
were installed in the Bacchiglione River to measure the water level. In
particular, the physical sensor is located at the outlet of the Leogra
catchment while the three social sensors are located at the Timonchio,
Leogra, and Orolo catchments outlet, respectively (see Fig. 2).
It is worth noting that, in most cases, it is difficult to directly assimilate water level observations within hydrological models. Moreover, it is highly unrealistic to assume that citizens would observe streamflow directly. For this reason, crowdsourced observations of water level are used to calculate crowdsourced data (CSD) of streamflow by means of rating curves assessed for the specific river location, and these streamflow values can be easily assimilated into hydrological models. It is because of both the
uncertainty in rating curve estimation at the social sensor location and the
error in the water level measurements that CSD have such low and variable
accuracies when compared to streamflow data estimated from classic physical
sensors. CSD are then assimilated within mathematical models as described in
Fig. 3 (“overall information flow”).
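The conversion from a water level reading to streamflow can be sketched with a power-law rating curve of the form Q = a(h − h0)^b; the parameter values below are hypothetical and would in practice be fitted for each river section:

```python
def rating_curve(h, a=25.0, h0=0.15, b=1.8):
    """Power-law rating curve Q = a * (h - h0)**b.
    Parameters a, h0, b are illustrative, not fitted values."""
    if h <= h0:
        return 0.0          # below the gauge datum: no flow reading
    return a * (h - h0) ** b

# a water level of 1.05 m read from a staff gauge
Q = rating_curve(1.05)      # streamflow in m3/s
```

Uncertainty in a, h0, and b is one of the reasons why CSD of streamflow have lower and more variable accuracy than data from physical sensors.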
Graphical representation of the methodology proposed to estimate
streamflow from crowdsourced observations of water level: (a) crowdsourced
observations of water level are turned into streamflow crowdsourced data
(CSD) by means of rating curves assessed for the specific river location;
(b) the streamflow CSD within the hydrological model are assimilated.
In most hydrological applications, streamflow data from physical sensors are
derived (and integrated into hydrological models) at regular, synchronous
time steps. In contrast, crowdsourced water level observations are obtained
by diverse types of citizens at random moments (when a citizen decides to
send data). Thus, from the modelling viewpoint, CSD have three main
characteristics: (a) irregular arrival frequency (asynchronicity), (b) random
accuracy, and (c) random number of CSD received within two model time steps.
Because streamflow CSD are not available in the case studies at the moment
of this study, realistic synthetic CSD with these characteristics are
generated (“considered information flow” in Fig. 3).
For the Brue, Sieve, and Alzette catchments, observed hourly streamflow data
at the catchments' outlets are interpolated to represent CSD coming at arrival
frequencies higher than hourly. For the Bacchiglione catchment, synthetic
hourly CSD of streamflow are calculated using measured precipitation
recorded during the considered flood events (post-event simulation) as input
in the hydrological model of the Bacchiglione catchment. A similar approach,
termed “observing system simulation experiment” (OSSE), is commonly used
in meteorology to estimate synthetic “true” states and measurements by
introducing random errors in the state and measurement equations (Arnold
and Dey, 1986; Errico et al., 2013; Errico and Privé, 2014). OSSEs have
the advantage of making it possible to compare estimates to true states
and they are often used for validating the data assimilation algorithms.
Further details and assumptions regarding the characteristics of CSD and
related uncertainty are provided in the next sections.
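One plausible way to generate synthetic CSD with the three characteristics above is sketched below, assuming a known hourly "true" streamflow series and a uniform relative error; the exact error model and arrival process used in the experiments are described in the following sections:

```python
import random

random.seed(42)  # for reproducibility of the sketch

def synthetic_csd(hourly_q, n_obs, max_error=0.2):
    """Generate n_obs crowdsourced streamflow values at random
    (asynchronous) times: linearly interpolate the hourly 'true'
    series and perturb each value with a random relative error,
    so accuracy varies from observation to observation."""
    horizon = len(hourly_q) - 1
    csd = []
    for _ in range(n_obs):
        t = random.uniform(0, horizon)            # irregular arrival time (h)
        i = int(t)
        q_true = hourly_q[i] + (t - i) * (hourly_q[i + 1] - hourly_q[i])
        q_obs = q_true * (1 + random.uniform(-max_error, max_error))
        csd.append((t, q_obs))
    return sorted(csd)

series = [5.0, 8.0, 14.0, 11.0, 7.0]              # hourly streamflow (m3/s)
data = synthetic_csd(series, n_obs=6)
```

Because arrival times are drawn at random, several CSD may fall between two model time steps, or none at all, reproducing characteristic (c) above.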
Datasets
Three flood events for each one of the four described catchments are
considered to assess the assimilation of CSD in hydrological modelling.
For the Brue catchment, a 2-year time series (June 1994 to May 1996) of observed streamflow and precipitation data is available for model calibration and validation. For the Sieve catchment, only
3 months of hourly runoff, streamflow, and precipitation data (December 1959
to February 1960) are available (Solomatine and Shrestha, 2003). For the
Alzette catchment, 2-year hourly data (July 2000 to June 2002) are used
for the model calibration and validation (Fenicia et al., 2007). For these
catchments, the observed precipitation values are treated as the “perfect
forecasts” and are fed into the hydrological model.
For the Bacchiglione catchment, three flood events that occurred in 2013, 2014,
and 2016 are considered. In particular, the 2013 event had high intensity and caused several traffic disruptions at various locations upstream of Vicenza. The forecast time series of precipitation (3-day weather
forecast) is used as input to the hydrological model. In all the case
studies, the observed values of streamflow at the catchment outlet (Ponte
degli Angeli for the Bacchiglione) are used to assess the performance of the
hydrological model.
Methodology
Hydrological modelling
Lumped model
A lumped conceptual hydrological model is implemented to estimate the
streamflow hydrograph at the outlet section of the Brue, Sieve, and Alzette
catchments. The choice of the model is based on previous studies performed
in the Brue catchment (Mazzoleni et al., 2015). Direct runoff is the input to the conceptual model; it is estimated by means of the Soil Conservation Service curve number method (Mazzoleni et al., 2015). The average curve number value within the catchment is calibrated
by minimizing the difference between the simulated volume and observed
quick flow, using the method proposed by Eckhardt (2005),
at the outlet section.
The main module of the hydrological model is based on the
Kalinin–Milyukov–Nash (KMN; Szilagyi and Szollosi-Nagy, 2010)
equation:
Q_t = \frac{1}{k}\cdot\frac{1}{(n-1)!}\int_{t_0}^{t}\left(\frac{\tau}{k}\right)^{n-1} e^{-\tau/k}\, I_{t-\tau}\,\mathrm{d}\tau,
where I is the model forcing (in this case direct runoff),
n (number of storage elements) and k (storage capacity
expressed in hours) are the two model parameters, and Q is the model
output (streamflow in m3 s-1). In this study, the parameter k is assumed to be proportional to the time of concentration through a coefficient ck. The discrete state-space system of Eq. (1)
derived by Szilagyi and Szollosi-Nagy (2010) is used in this study
to apply the data assimilation approach (Mazzoleni
et al., 2015, 2016).
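The routing idea behind the KMN cascade can be sketched as n linear reservoirs in series, here discretized with a simple explicit Euler step (a sketch only; the study itself uses the exact discrete state-transition matrices of Szilagyi and Szollosi-Nagy, 2010):

```python
def nash_cascade(runoff, n=3, k=5.0, dt=1.0):
    """Route direct runoff through n linear reservoirs with storage
    coefficient k (hours), using an explicit Euler discretization.
    Parameter values are illustrative, not the calibrated ones."""
    storages = [0.0] * n
    outflow = []
    for i_t in runoff:
        inflow = i_t
        for j in range(n):
            q_j = storages[j] / k                 # linear reservoir outflow
            storages[j] += dt * (inflow - q_j)    # storage balance
            inflow = q_j                          # feeds the next reservoir
        outflow.append(inflow)                    # cascade output Q_t
    return outflow

q = nash_cascade([0, 10, 20, 5, 0, 0, 0, 0])      # direct runoff pulse
```

The cascade attenuates and delays the input pulse, which is the behaviour the two parameters n and k control.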
The model calibration is performed by maximizing the Nash–Sutcliffe efficiency
(NSE) and the correlation between the simulated and observed
value of streamflow, at the outlet points of the Brue, Sieve, and Alzette
catchments, using historical time series. The results of the calibration
provided a value of the parameters n and ck equal to
4 and 0.026, 1 and 0.0055, and 1 and 0.00064 for the Brue, Sieve, and
Alzette catchments, respectively.
Semi-distributed model
The hydrological and routing models used in this study are based on the
early warning system implemented by the AAWA and described in
Ferri et al. (2012). One of the goals of this study,
in the framework of the WeSenseIt Project, is to test our methodology using
synthetic CSD in the existing early warning system of the Bacchiglione
catchment.
In the schematization of the Bacchiglione catchment, the location of
physical and social sensors corresponds to the outlet section of three main
sub-catchments, Timonchio, Leogra, and Orolo, while the remaining
sub-catchments are considered as inter-catchments. For both sub-catchments
and inter-catchments, a conceptual hydrological model, described below, is
used to estimate the outflow (streamflow) hydrograph. The streamflow
hydrograph of the three main sub-catchments is considered as the upstream
boundary conditions of a routing model used to propagate the flow up to the
catchment outlet (see Fig. 2), while the outflow
from the inter-catchment is considered as an internal boundary condition to
account for their corresponding drained area. In the following, a brief
description of the main components of the hydrological and routing models is
provided.
The input for the hydrological model consists of precipitation only. The
hydrological response of the catchment is estimated using a hydrological
model that considers the routines for runoff generation and a simple routing
procedure. The processes related to runoff generation (surface, sub-surface,
and deep flow) are modelled mathematically by applying the water balance to a
control volume representative of the active soil at the sub-catchment scale.
The water content Sw in the soil is updated at each
calculation step dt using the following balance equation:
S_{w,t+\mathrm{d}t} = S_{w,t} + P_t - R_{\mathrm{sur},t} - R_{\mathrm{sub},t} - L_t - E_{T,t},
where P and ET are the components of precipitation
and evapotranspiration, while Rsur, Rsub, and L
are the surface runoff, sub-surface runoff, and deep
percolation model states, respectively (see Fig. 2). The surface runoff
Rsur is expressed by the equation
based on specifying the critical threshold beyond which the mechanism of
Dunnian flow (saturation excess mechanism) prevails:
R_{\mathrm{sur},t} =
\begin{cases}
C\cdot\dfrac{S_{w,t}}{S_{w,\max}}\cdot P_t & P_t \le f = S_{w,\max}\cdot\dfrac{S_{w,\max}-S_{w,t}}{S_{w,\max}-C\cdot S_{w,t}}\\[1ex]
P_t-\left(S_{w,\max}-S_{w,t}\right) & P_t > f,
\end{cases}
where C is a coefficient of soil saturation obtained by calibration, and
Sw,max is the content of water at saturation point which depends
on the nature of the soil and on its use.
The sub-surface flow is considered proportional to the difference between the water content S_{w,t} at time t and that at soil capacity S_c:
R_{\mathrm{sub},t} = c\cdot\left(S_{w,t}-S_c\right),
while the estimated deep flow is evaluated according to the expression
proposed by Laio et al. (2001):
L_t = \frac{K_S}{e^{\beta\left(1-S_c/S_{w,\max}\right)}-1}\cdot\left(e^{\beta\left(S_{w,t}-S_c\right)/S_{w,\max}}-1\right),
where KS is the hydraulic conductivity of the soil in
saturation conditions and β is a dimensionless exponent
characteristic of the size and distribution of pores in the soil. The
evaluation of the real evapotranspiration is performed assuming it as a
function of the water content in the soil and potential evapotranspiration,
calculated using the formulation of Hargreaves and Samani (1982).
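The balance and flux equations above can be sketched as a single update step; all parameter values below are illustrative assumptions, not the AAWA calibration:

```python
from math import exp

def water_balance_step(S, P, ET, Smax=150.0, Sc=90.0, C=0.4,
                       c=0.02, Ks=2.0, beta=12.0):
    """One step of the conceptual soil water balance.
    Units: mm per time step; parameters are illustrative."""
    f = Smax * (Smax - S) / (Smax - C * S)        # saturation threshold
    if P <= f:
        Rsur = C * (S / Smax) * P                 # saturation-excess runoff
    else:
        Rsur = P - (Smax - S)
    Rsub = c * max(S - Sc, 0.0)                   # sub-surface flow
    if S > Sc:                                    # deep percolation (Laio)
        L = Ks / (exp(beta * (1 - Sc / Smax)) - 1) * \
            (exp(beta * (S - Sc) / Smax) - 1)
    else:
        L = 0.0
    S_new = S + P - Rsur - Rsub - L - ET          # water balance update
    return S_new, Rsur, Rsub, L

S1, Rsur, Rsub, L = water_balance_step(S=100.0, P=10.0, ET=1.0)
```

The three fluxes Rsur, Rsub, and L then feed the linear-reservoir routing of Qsur, Qsub, and Qg described next.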
Knowing the values of Rsur, Rsub, and L, it is possible
to model the surface Qsur, sub-surface Qsub, and deep flow Qg routed
contributions according to the conceptual framework of the linear reservoir at
the closing section of the single sub-catchment. In particular, in the case of
Qsur, the value of the parameter k, which is a
function of the residence time in the catchment slopes, is estimated
by relating the velocity to the average slope length. However, one of the
challenges is to properly estimate such velocity, which should be calculated
for each flood event (Rinaldo and Rodriguez-Iturbe,
1996). According to Rodríguez-Iturbe et al. (1982), this
velocity is a function of the effective rainfall intensity and
the event duration. In this study, the estimation of the surface velocity is
performed using the relation between velocity and intensity of rainfall
excess proposed in Kumar et al. (2002) to
estimate the average travel time and the consequent parameter k.
However, this formulation is applied in a lumped way for a given
sub-catchment. As reported in McDonnell and Beven (2014),
more reliable and distributed models should be used to reproduce the spatial
variability of the residence times over time within the catchment. That is
why, in the advanced version of the model implemented by AAWA, in each
sub-catchment the runoff propagation is carried out according to the
geomorphological theory of the hydrologic response. The overall catchment
travel time distributions are considered as nested convolutions of
statistically independent travel time distributions along sequentially
connected, and objectively identified, smaller sub-catchments. The correct
estimation of the residence time should be derived considering the latest
findings reported in McDonnell and Beven (2014). Regarding
Qsub and Qg, the value of k is
calibrated comparing the observed and simulated streamflow at Vicenza.
In the early warning system implemented by AAWA in the Bacchiglione
catchment, the flood propagation along the main river channel is represented
by a one-dimensional hydrodynamic model, MIKE 11 (DHI, 2007). However, in order to reduce the computational time required by the analyses performed in this study, MIKE 11 is replaced by a Muskingum–Cunge model (see, e.g. Todini, 2007), considering rectangular river cross-sections for the estimation of hydraulic radii, wave celerities, and other hydraulic variables.
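The routing logic can be illustrated with the classical linear Muskingum recursion; the Muskingum–Cunge variant used in the study would estimate K and X from channel geometry and wave celerity at each step, whereas the values below are fixed and hypothetical:

```python
def muskingum_route(inflow, K=2.0, X=0.2, dt=1.0):
    """Linear Muskingum routing: O[t] = c0*I[t] + c1*I[t-1] + c2*O[t-1].
    K (h) and X are illustrative; Muskingum-Cunge derives them from
    hydraulic properties of the reach."""
    denom = 2 * K * (1 - X) + dt
    c0 = (dt - 2 * K * X) / denom
    c1 = (dt + 2 * K * X) / denom
    c2 = (2 * K * (1 - X) - dt) / denom           # c0 + c1 + c2 = 1
    out = [inflow[0]]
    for t in range(1, len(inflow)):
        out.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * out[-1])
    return out

q_out = muskingum_route([10, 30, 60, 40, 20, 10, 10])
```

The routed hydrograph is attenuated and delayed with respect to the inflow, as expected of flood-wave propagation along the reach.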
Calibration of the hydrological model parameters is performed by AAWA, and
described in Ferri et al. (2012), considering the time
series of precipitation from 2000 to 2010 in order to minimize the root mean
square error between observed and simulated values of water level at the Ponte
degli Angeli gauged station. In order to stay as close as possible to the
early warning system implemented by AAWA, we used the same calibrated model
parameters proposed by Ferri et al. (2012).
Data assimilation procedure
Kalman filter
In data assimilation, it is typically assumed that the dynamic system can be
represented in the state space as follows:
x_t = M\left(x_{t-1},\vartheta,I_t\right)+w_t, \quad w_t \sim N\left(0,S_t\right)
z_t = H\left(x_t,\vartheta\right)+v_t, \quad v_t \sim N\left(0,R_t\right),
where xt and xt-1 are state vectors at time t and
t-1, M is the model operator that propagates the state x from
its previous condition to the new one as a response to the inputs It,
while H is the operator which maps the model states into output
zt. The system and measurement errors wt and vt are
assumed normally distributed with zero mean and covariance S and
R. In a hydrological modelling system, these states can represent the
water stored in the soil (soil moisture, groundwater) or on the earth's
surface (snow pack). These states are one of the governing factors that
determine the hydrograph response to the inputs into the catchment.
For the linear systems used in this study, the discrete state-space system
of Eq. (1) can be represented as follows (Szilagyi and
Szollosi-Nagy, 2010):
x_t = \Phi x_{t-1} + \Gamma I_t + w_t
Q_t = H x_t + v_t,
where t is the time step, x is the vector of the model states
(stored water volume in m3), Φ is the state-transition
matrix (function of the model parameters n and k), Γ is the
input-transition matrix, and H is the output matrix. For example,
for n = 3, the matrix H is expressed as H = [0, 0, k]. Expressions for matrices Φ and Γ can
be found in Szilagyi and Szollosi-Nagy (2010).
For the Bacchiglione model (semi-distributed model), a preliminary
sensitivity analysis on the model states (soil content Sw and the
storage water xsur, xsub, and xL related to
Qsur, Qsub, and Qg) is performed in order to
decide on which of the states to update. The results of this analysis (shown
in the next section) pointed out that the stored water volume
xsur (estimated using Eq. 8 with n = 1, H = [k], and It
replaced by Rsur) is the most sensitive state, and for this reason
we decided to update only this state.
The Kalman filter (KF; Kalman, 1960) is a
mathematical tool which allows estimating, in an efficient computational
(recursive) way, the state of a process which is governed by a linear
stochastic difference equation. The KF is optimal under the assumption that the
error in the process is Gaussian; in this case, the KF is derived by minimizing
the variance of the system error assuming that the model state estimate is
unbiased.
The Kalman filter procedure can be divided into two steps, namely the
forecast equations (Eqs. 10 and 11) and the update (or analysis) equations
(Eqs. 12, 13, and 14):
$$x_t^- = \Phi x_{t-1}^+ + \Gamma I_t \tag{10}$$
$$P_t^- = \Phi P_{t-1}^+ \Phi^{T} + S \tag{11}$$
$$K_t = P_t^- H^{T}\left(H P_t^- H^{T} + R\right)^{-1} \tag{12}$$
$$x_t^+ = x_t^- + K_t\left(Q_t^{o} - H x_t^-\right) \tag{13}$$
$$P_t^+ = \left(I - K_t H\right) P_t^-, \tag{14}$$
where Kt is the Kalman gain matrix, P is the error covariance
matrix, and Qo is a new observation. In this study, the observed value
of streamflow Qo is equal to the synthetic CSD estimated as
described above. The prior model states x at time t are updated, as
the response to the new available observation, using the analysis
equations Eqs. (12) to (14). This allows for estimation of the values of the
updated state (with superscript +) and then assessing the background
estimates (with superscript -) for the next time step using the time update
equations, Eqs. (10) and (11). The proper characterization of the model
covariance matrix S is a fundamental issue in the Kalman filter. In
this study, in order to evaluate the effect of assimilating CSD, small
values of the model error S are considered for each case study: covariance
matrices S with diagonal values of 1, 25, and 1 m$^6$ s$^{-2}$ are
considered for the Brue, Sieve, and Alzette catchments, respectively. The
larger value of S in the Sieve catchment is due to the higher flow magnitude
in this catchment compared to the other two. A sensitivity analysis of model
performance depending on the value of S is reported in the Results section.
For the Bacchiglione catchment, S is estimated, for each given flood event,
as the variance between observed and simulated flow values.
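The forecast–update cycle of Eqs. (10)–(14) can be sketched as follows. The one-state model, its matrices, the inflow, and the observed streamflow below are illustrative values, not quantities taken from the case studies.

```python
import numpy as np

def kf_forecast(x_post, P_post, Phi, Gamma, inflow, S):
    """Forecast step (Eqs. 10 and 11): propagate state and error covariance."""
    x_prior = Phi @ x_post + Gamma * inflow
    P_prior = Phi @ P_post @ Phi.T + S
    return x_prior, P_prior

def kf_update(x_prior, P_prior, H, R, q_obs):
    """Analysis step (Eqs. 12-14): assimilate an observed streamflow q_obs."""
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)  # Kalman gain
    x_post = x_prior + K @ (q_obs - H @ x_prior)              # state update
    P_post = (np.eye(x_prior.size) - K @ H) @ P_prior         # covariance update
    return x_post, P_post

# One-state example (n = 1, so H = [k]); all numbers are illustrative
k = 0.5
Phi = np.array([[np.exp(-k)]])
Gamma = np.array([(1.0 - np.exp(-k)) / k])
H = np.array([[k]])
S, R = np.array([[1.0]]), np.array([[0.04]])

x, P = np.array([10.0]), np.array([[2.0]])
x, P = kf_forecast(x, P, Phi, Gamma, inflow=5.0, S=S)
x, P = kf_update(x, P, H, R, q_obs=np.array([4.0]))
```

Because the observation error R is much smaller than the model error S here, the update pulls the simulated streamflow H x strongly towards the observation while reducing the error covariance.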
Assimilation of crowdsourced data
As described in the previous section, a main characteristic of CSD is that
they are highly uncertain and asynchronous in time. Various methods have been
proposed to include asynchronous observations in models. Having reviewed
them, in this study, we are proposing a somewhat simpler approach of data
assimilation of crowdsourced observations (DACO). This method is based on
the assumption that the change in the model states and in the error
covariance matrices within the two consecutive model time steps
t0 and t (observation window) is linear, while the
inputs are assumed constant. All CSD received during the observation window
are individually assimilated in order to update the model states and output
at time t. Therefore, assuming that one CSD is available at time
t0*, the first step of DACO (A in Fig. 4) is the definition of the model states and
error covariance matrix at t0* as
$$x_{t_0^*}^- = x_{t_0}^+ + \left(x_t^- - x_{t_0}^+\right)\frac{t_0^* - t_0}{t - t_0} \tag{15}$$
$$P_{t_0^*}^- = P_{t_0}^+ + \left(P_t^- - P_{t_0}^+\right)\frac{t_0^* - t_0}{t - t_0}. \tag{16}$$
The second step (B in Fig. 4) is the estimation of the updated model states
and error covariance matrix in response to the streamflow CSD
$Q_{t_0^*}^{o}$. The posterior values $x_{t_0^*}^+$ and $P_{t_0^*}^+$ are
estimated by Eqs. (13) and (14), respectively. The Kalman gain is estimated
by Eq. (12), where the prior values of the model states and error covariance
matrix at $t_0^*$ are used. Knowing the posterior values $x_{t_0^*}^+$ and
$P_{t_0^*}^+$, it is possible to predict the values of the states and
covariance matrix one model step ahead, at $t^*$ (C in Fig. 4), using the
model forecast equations, Eqs. (10) and (11).
Graphical representation of the data assimilation of the
crowdsourced observations (DACO) method used in this study to
assimilate asynchronous streamflow crowdsourced data.
The last step (D in Fig. 4) is the estimation of the interpolated value of
x and P at time step t. This is performed by means of a linear
interpolation between the current values of x and P at
t0* and t*:
$$\tilde{x}_t^- = x_{t_0^*}^+ + \left(x_{t^*}^- - x_{t_0^*}^+\right)\frac{t - t_0^*}{t^* - t_0^*} \tag{17}$$
$$\tilde{P}_t^- = P_{t_0^*}^+ + \left(P_{t^*}^- - P_{t_0^*}^+\right)\frac{t - t_0^*}{t^* - t_0^*}. \tag{18}$$
The symbol ∼ is added to the new matrices x and P in order to
differentiate them from the values originally forecasted at t. Assuming
that new streamflow CSD are available at an intermediate time t1*
(between t0* and t), the procedure is repeated considering the
values at t0* and t for the linear interpolation. Then, when no
more CSD are available, the updated value of x̃t- is
used to predict the model states and output at t+ 1 (Eqs. 10 and 11).
Finally, in order to account for the intermittent behaviour of these CSD, the
approach proposed by Mazzoleni et al. (2015) is applied. In this method, the
model state vector x is updated and forecasted when CSD are available,
while without CSD the model is run using Eq. (10) and the covariance matrix
P is propagated to the next time step using Eq. (11).
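A scalar sketch of the four DACO steps (A to D in Fig. 4) for a single CSD arriving inside the observation window is given below. The model coefficients, error variances, and observation value are illustrative assumptions, and the Kalman update is written in scalar form for a single state.

```python
import numpy as np

def lerp(a0, a1, frac):
    """Linear interpolation used in steps A and D."""
    return a0 + (a1 - a0) * frac

def daco_step(x0_post, P0_post, xt_prior, Pt_prior,
              t0, t, t_obs, q_obs, phi, gamma, inflow, H, R, S):
    """Assimilate one CSD arriving at t_obs, with t0 < t_obs < t (scalar sketch).

    A: interpolate state and covariance from (t0, t) to t_obs;
    B: Kalman update with the CSD q_obs (Eqs. 12-14, scalar form);
    C: forecast one model step ahead, to t_star = t_obs + (t - t0);
    D: interpolate between t_obs and t_star back to the model time step t.
    """
    frac = (t_obs - t0) / (t - t0)
    # A: prior values at t_obs
    x_obs = lerp(x0_post, xt_prior, frac)
    P_obs = lerp(P0_post, Pt_prior, frac)
    # B: update with the CSD
    K = P_obs * H / (H * P_obs * H + R)
    x_obs = x_obs + K * (q_obs - H * x_obs)
    P_obs = (1.0 - K * H) * P_obs
    # C: forecast to t_star (Eqs. 10 and 11, scalar form)
    x_star = phi * x_obs + gamma * inflow
    P_star = phi * P_obs * phi + S
    # D: interpolate back to t; note t_star - t_obs equals the model step t - t0
    frac_d = (t - t_obs) / (t - t0)
    return lerp(x_obs, x_star, frac_d), lerp(P_obs, P_star, frac_d)

x_t, P_t = daco_step(x0_post=10.0, P0_post=1.0, xt_prior=11.0, Pt_prior=1.5,
                     t0=0.0, t=1.0, t_obs=0.4, q_obs=6.0,
                     phi=np.exp(-0.5), gamma=(1 - np.exp(-0.5)) / 0.5,
                     inflow=5.0, H=0.5, R=0.04, S=1.0)
```

For a new CSD arriving at a later intermediate time, the same routine would be called again with the interpolated values as the starting point.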
Crowdsourced data accuracy
In this section, the uncertainty related to CSD is characterized. The
observational error is assumed to be normally distributed noise with zero mean and
given standard deviation
$$\sigma_t^{Q} = \alpha_t \cdot Q_t^{o}, \tag{19}$$
where the coefficient α is related to the degree of
uncertainty of the measurement (Weerts and El
Serafy, 2006).
One of the main and obvious issues in citizen-based observations is to
maintain the quality control of the water observations
(Cortes Arevalo et al., 2014; Engel and Voshell Jr., 2002).
In the Introduction, a number of methods to model observational uncertainty
were mentioned. In this study, the coefficient α is assumed to
be a random variable uniformly distributed between 0.1 and 0.3; a more
thorough investigation of the uncertainty level of CSD is left for future
studies. The maximum value of α is assumed to be 3 times the
uncertainty of the physical sensors, to account for the additional
uncertainty of the rating curve at the social sensor location.
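The error model above can be sketched as follows; the "true" streamflow value and the random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def synthetic_csd(q_true):
    """Corrupt a 'true' streamflow value with Gaussian noise of zero mean
    and standard deviation sigma = alpha * Q (Eq. 19), where the accuracy
    coefficient alpha is drawn uniformly from [0.1, 0.3] per observation."""
    alpha = rng.uniform(0.1, 0.3)
    return q_true + rng.normal(0.0, alpha * q_true)

# 1000 synthetic CSD around a true streamflow of 50 m^3/s (illustrative)
obs = [synthetic_csd(50.0) for _ in range(1000)]
```

The generated sample is unbiased (zero-mean noise) but heteroscedastic: larger flows carry proportionally larger observation errors.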
Experimental setup
In this section, two sets of experiments are performed in order to test the
proposed method and assess the benefit of integrating CSD, asynchronous in
time and with variable accuracies, in real-time flood forecasting.
In the first set of experiments, called “Experiment 1”, assimilation of
streamflow CSD at one social sensor location is carried out in the Brue,
Alzette, and Sieve catchments to understand the sensitivity of the employed
hydrological model – KMN – under various scenarios of these data.
In the second set of experiments, called “Experiment 2”, the distributed
CSD coming from social and physical sensors, at four locations within the
Bacchiglione catchment, are considered, with the aim of assessing the
improvement in the flood forecasting accuracy.
Experiment 1: assimilation of crowdsourced data from one social
sensor
The focus of Experiment 1 is to study the performance of the hydrological
model (KMN) when assimilating CSD with arrival frequencies different from
the model time step and with random accuracies, coming from a social sensor
located at the outlet of each of the Brue, Sieve, and Alzette catchments.
To analyse all possible combinations of arrival frequencies, number of CSD
within the observation window (1 h), and accuracies, a set of scenarios are
considered (Fig. 5), changing from regular arrival frequencies of CSD with
high accuracies (scenario 1) to random and chaotic asynchronous CSD with
variable accuracies (scenario 11). In each scenario, a varying number of CSD
from 1 to 100 is considered. It is worth noting that for one CSD per hour and
regular arrival time, scenario 1 corresponds to the case of physical sensors
with observation arrival frequencies of 1 h.
Experimental scenarios representing different configurations of
arrival frequencies, number, and accuracies of streamflow crowdsourced data.
Scenario 2 corresponds to the case of CSD having fixed accuracies
(α equal to 0.1) and irregular arrival moments, but in
which at least one CSD coincides with the model time step. In particular,
scenarios 1 and 2 coincide for one CSD available within the observation
window since it is assumed that the arrival frequencies of that CSD have to
coincide with the model time step. On the other hand, the arrival
frequencies of CSD in scenario 3 are assumed random and CSD might not arrive
at the model time step.
Scenario 4 considers CSD with regular frequencies but random accuracies at
different moments within the observation window, whereas in scenario 5 CSD
have irregular arrival frequencies and random accuracies. In all the
previous scenarios, the arrival frequencies, the number, and accuracies of CSD
are assumed periodic, i.e. repeated between consecutive observation windows
along all the time series. However, this periodic repetitiveness might not
occur in real life, and for this reason, a non-periodic behaviour is assumed
in scenarios 6, 7, 8, and 9. The non-periodicity assumptions of the arrival
frequencies and accuracies are the only factors that differentiate scenarios 6,
7, 8, and 9 from scenarios 2, 3, 4, and 5, respectively. In addition,
the non-periodicity of the number of CSD within the observation window is
introduced in scenario 10.
Finally, in scenario 11, CSD, in addition to all the previous
characteristics, might have an intermittent behaviour, i.e. not being
available for one or more observation windows.
Experiment 2: spatially distributed physical and social
sensors
Synthetic CSD with the characteristics reported in scenarios 10 and 11 of
Experiment 1 are generated due to the unavailability of streamflow CSD
during this study. In order to evaluate the model performance, observed and
simulated streamflows are compared for different lead times.
Streamflow data from physical sensors are assimilated in the hydrological
model of the AMICO (Alto Adriatico Modello Idrologico e idrauliCO) system at an hourly frequency, while CSD from social sensors
are assimilated using the DACO method previously described. The updated
hydrograph estimated by the hydrological model is used as the input into
the Muskingum–Cunge model used to propagate the streamflow downstream to the
gauged station at Ponte degli Angeli, Vicenza.
The main goal of Experiment 2 is to understand the contribution of
distributed CSD to the improvement of the flood prediction at a specific
point of the catchment, in this case at Ponte degli Angeli. For this reason,
five different settings are introduced, and represented in
Fig. 6, corresponding to different types of
employed sensors.
Experiment 2: characteristics of the five experimental settings (A
to E) implemented within the Bacchiglione catchment: location of the social
and physical sensors (dots), hydrological model update based on different
sensors (coloured areas).
Firstly, only streamflow data from one physical sensor at the Leogra
sub-catchment are assimilated to update the hydrological model of
sub-catchment B (Fig. 2) of setting A
(Fig. 6). On the other hand, in setting B, CSD
from the social sensor located at the Leogra sub-catchment are assimilated.
In setting C, CSD from three distributed social sensors are integrated into
the hydrological model. Setting D accounts for the integration of CSD from
two social sensors and physical data from the physical sensor in the Leogra
sub-catchment. Finally, setting E considers the complete integration between
physical and social sensors in Leogra and the two social sensors in the
Timonchio and Orolo sub-catchments.
Results
Experiment 1: influence of crowdsourced data on flood forecasting
The observed and simulated streamflow hydrographs at the outlet section of
the Brue, Sieve, and Alzette catchments with and without the model update
(considering hourly streamflow data) are reported in Fig. 7 for nine
different flood events for 1 h lead time. As expected, it can be seen that
the updated model tends to better represent the flood events than the model
without updating in all the case studies. However, this improvement is
closely related to the value of the matrix S. The higher the
S value (i.e. the more uncertain the model), the closer the model output gets to the
observation. For this reason, a sensitivity analysis on the influence of the
matrix S on the assimilation of CSD for scenario 1, i.e. coming
and assimilated at regular time steps within the observation windows, is
reported in Fig. 8. The results of Fig. 8 are related to the first flood
events of the Brue, Sieve, and Alzette catchments. Increasing the number of
CSD within the observation window results in an improvement of the
NSE for different values of model error. However, this
improvement becomes negligible beyond a given threshold number of CSD, which
is a function of the considered flood event. This means that additional CSD
do not add information useful for improving the model performance. Overall,
increasing the value of the model error S tends to increase
NSE values as mentioned before. For this reason, to better
evaluate the effect of assimilating CSD, a small value of S, i.e.
a model more accurate than CSD, is assumed.
Observed (black line) and simulated hydrographs, with (red line)
and without (blue line) assimilation, for the flood events which occurred in the
three catchments: Brue (upper row), Sieve (middle row), and Alzette (bottom
row).
Model improvement in terms of Nash–Sutcliffe efficiency
(NSE), during flood event 1 for each case study, for different
values of the model error matrix S and 24 h lead time, assimilating
streamflow CSD according to scenario 1.
In scenario 1, the arrival frequencies are set as regular for different
model runs, so the moments and accuracies in which CSD became available are
always the same for any model run. However, for the other scenarios, the
irregular moments in which CSD become available within the observation
window and their accuracies are randomly selected and change according to
the different model runs. This results in random model performance and,
consequently, random NSE values. In order to remove such random
behaviour, different model runs (100 in this case) are carried out, assuming
different random values of arrivals and accuracies (coefficient
α) during each model run, for a given number of CSD and
lead time. The NSE value is estimated for each model run, so
μNSE and σNSE represent the mean
and standard deviation of the different values of NSE.
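The Monte Carlo procedure described above (100 runs summarized by μNSE and σNSE) can be sketched as follows. The hydrograph and the perturbation standing in for a model run with random CSD arrivals and accuracies are illustrative assumptions.

```python
import numpy as np

def nse(q_obs, q_sim):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    q_obs, q_sim = np.asarray(q_obs), np.asarray(q_sim)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

rng = np.random.default_rng(0)
q_observed = 10.0 + 40.0 * np.sin(np.linspace(0.0, np.pi, 48))  # hypothetical hydrograph

# Each "run" stands in for one model run with randomly arriving, randomly
# accurate CSD; here it is mimicked by perturbing the observations.
nse_sample = []
for _ in range(100):
    q_sim = q_observed + rng.normal(0.0, 2.0, size=q_observed.size)
    nse_sample.append(nse(q_observed, q_sim))

mu_nse, sigma_nse = np.mean(nse_sample), np.std(nse_sample)
```

The pair (μNSE, σNSE) then characterizes both the average skill of the updated model and its sensitivity to the random behaviour of the CSD.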
For scenarios 2 and 3 (represented using warm red and orange colours in
Figs. 9 and 10 for
lead times equal to 24 h), the μNSE values are smaller
but comparable to the ones obtained for scenario 1 for all the considered flood
events and case studies. In particular, scenario 3 has lower μNSE
than scenario 2. This can relate to the fact that both
scenarios have random arrival frequencies; however, in scenario 3, CSD are
not provided at model time steps, as opposed to scenario 2. From
Fig. 10, higher values of σNSE can be observed for scenario 3.
Scenario 2 has the lowest standard
deviation for low values of CSD because the arrival frequencies have to
coincide with the model time step and this stabilizes the NSE.
In particular, for an increasing number of CSD, σNSE
tends to decrease. However, a constant trend of σNSE
can be observed, due to particular characteristics of the flood events, in
the case of flood event 1 of the Sieve and flood events 2 and 3 of the Alzette. It is
worth noting that scenario 1 has zero standard deviation because CSD are
assumed to come at the same moments with the same accuracies for all 100 model runs.
Dependency of the mean of the Nash–Sutcliffe efficiency sample,
μNSE, on the number of streamflow crowdsourced data in
the experimental scenarios 1 to 9 for the considered flood events in the
three catchments: Brue (upper row), Sieve (middle row), and Alzette (bottom
row).
Dependency of the standard deviation of the Nash–Sutcliffe
efficiency sample, σNSE, on the number of streamflow
crowdsourced data in the experimental scenarios 1 to 9 for the considered
flood events in the three catchments: Brue (upper row), Sieve (middle row),
and Alzette (bottom row).
In scenario 4, represented using blue colour, CSD are considered to come at
regular time steps but have random accuracies.
Figure 9 shows that μNSE values
are lower for scenario 4 than for scenarios 2 and 3. This is related to the
higher influence of CSD accuracies if compared to arrival frequencies. High
variability in the model performance, especially for low values of CSD,
can be observed in scenario 4 (Fig. 10).
The combined effects of random arrival frequencies and CSD accuracies are
represented in scenario 5 using a magenta colour (i.e. the combination of
warm and cold colours used for scenarios 2, 3, and 4) in
Figs. 9 and 10. As
expected, this scenario has the lowest μNSE and the
highest σNSE values, compared to those reported
above.
The remaining scenarios (6 to 9) are equivalent to scenarios 2 to 5
with the only difference being that they are non-periodic in time. For this
reason, in Figs. 9 and 10, scenarios from 6 to 9 have the same
colour as scenarios 2 to 5 but indicated with a dashed line in order to
underline their non-periodic behaviour. Overall, it can be observed that
non-periodic scenarios have μNSE values similar to those of their
corresponding periodic scenarios. However, their μNSE trends are
smoother, which can be explained by the lower σNSE values:
model performance fluctuates less for non-periodic CSD than for periodic
CSD. Table 1 shows the
NSE values and model improvement obtained for the different
experimental scenarios during the different flood events. Small improvements
are obtained when NSE is already high for one CSD as for the
Sieve catchment during flood event 2 or the Alzette catchment during flood event 2.
Moreover, it can be seen that a lower improvement is achieved for the
scenarios (2, 3, 6, and 7) where arrival frequencies are random and
accuracies fixed than for the scenarios (4, 5, 8, and 9) where
arrival frequencies are regular and accuracies random.
NSE improvements (%), from 1 to 50 CSD, for
different experimental scenarios during the nine flood events that occurred in
the Brue, Sieve, and Alzette catchments.
Representation of the errors in flood peak timing, ERRT,
and intensity, ERRI (as described in Eqs. 20 and 21), as a
function of the number of streamflow crowdsourced data and the experimental
scenarios (1 to 9), for three different flood peaks that occurred during
flood event 2 in the Brue catchment.
In the previous analysis, model improvements are expressed only in terms of
NSE. However, statistics such as NSE only explain
the overall model accuracy and not the real increases/decreases in prediction
error. Therefore, increases in model accuracy due to the assimilation of CSD
have to be presented in different ways, e.g. as an increased accuracy of the
flood peak magnitude and timing. For this reason, additional analyses are
carried out to assess the change in flood peak prediction considering three
peaks that occurred during flood event 2 in the Brue catchment (see Fig. 7). Errors in
the flood peak timing, ERRT, and intensity, ERRI, are
estimated as
$$\mathrm{ERR}_T = t_P^{o} - t_P^{s} \tag{20}$$
$$\mathrm{ERR}_I = \frac{Q_P^{o} - Q_P^{s}}{Q_P^{o}}, \tag{21}$$
where $t_P^{o}$ and $t_P^{s}$ are the observed and simulated peak times (h),
while $Q_P^{o}$ and $Q_P^{s}$ are the observed and simulated peak
streamflows (m$^3$ s$^{-1}$). From the results reported in Fig. 11, considering a 12 h
lead time, it can be observed that, overall, error reduction in peak
prediction is achieved for an increasing number of CSD. In particular,
assimilation of CSD has more influence on the reduction of the peak
intensity error than of the peak timing error. In fact, only a small
reduction of ERRT, of about 1 h, is obtained even when increasing the number
of CSD. For both ERRI and ERRT, the largest error reduction is obtained
considering fixed CSD accuracies and random arrival frequencies (e.g.
scenarios 1, 2, 3, 6, and 7). In fact, the smallest ERRI values
are obtained for scenario 1, while scenarios 5 and 9 are the ones that show
the lowest improvement in terms of peak prediction. These conclusions are
very similar to the previous ones obtained analysing only NSE
as model performance measures.
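The peak-error metrics of Eqs. (20) and (21) can be computed directly from the hydrographs, as in the sketch below; the short hourly series are illustrative.

```python
import numpy as np

def peak_errors(t, q_obs, q_sim):
    """Errors in flood peak timing (ERR_T, hours) and relative intensity
    (ERR_I), following ERR_T = t_P^o - t_P^s and
    ERR_I = (Q_P^o - Q_P^s) / Q_P^o (Eqs. 20 and 21)."""
    i_obs, i_sim = int(np.argmax(q_obs)), int(np.argmax(q_sim))
    err_t = t[i_obs] - t[i_sim]
    err_i = (q_obs[i_obs] - q_sim[i_sim]) / q_obs[i_obs]
    return err_t, err_i

# Illustrative hourly hydrographs around a single flood peak
t = np.arange(10)                                           # hours
q_obs = np.array([1, 2, 5, 9, 12, 10, 7, 4, 2, 1], float)   # m^3/s
q_sim = np.array([1, 2, 4, 8, 10, 11, 8, 5, 3, 1], float)
err_t, err_i = peak_errors(t, q_obs, q_sim)
```

In this example the simulated peak arrives 1 h later than the observed one (negative ERRT) and underestimates the peak intensity by about 8 %.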
The combination of all the previous scenarios is represented by scenario 10,
where a changing number of CSD in each observation window is considered. In
scenario 11, the intermittent nature of CSD is accounted for as well. The μNSE and σNSE values of these
scenarios obtained for the considered flood events are shown in
Fig. 12. It can be observed that scenario 10
tends to provide higher μNSE and lower σNSE values, for a given flood event, if compared to
scenario 11. In fact, intermittency in CSD tends to reduce model
performance and increase the variability of NSE values for
random configuration of arrival frequencies and CSD accuracies. In
particular, σNSE tends to be constant for an increasing
number of CSD.
Dependency of the mean μNSE and standard
deviation σNSE of the Nash–Sutcliffe efficiency sample
(first row and second row, respectively) on the number of streamflow
crowdsourced data in
scenarios 10 (solid lines) and 11 (dashed lines)
for the considered flood events (black, blue, red lines) in the three
catchments: Brue (left panel), Sieve (central panels), and Alzette (right
panels).
Experiment 2: influence of distributed physical and social
sensors
Three different flood events that occurred in the Bacchiglione catchment are used
for Experiment 2. Figure 13 shows the observed and simulated streamflow
value at the outlet section of Vicenza. In particular, two simulated time
series of streamflow are calculated using the measured and forecasted time
series of precipitation as input for the hydrological model. Overall, an
underestimation of the observed streamflow can be observed when using the
forecasted input, while the results achieved using the measured
precipitation tend to properly represent the observations. In order to find
out which model states lead to a
maximum increase of the model performance, a preliminary sensitivity analysis
is performed. The four model states, xS, xsur,
xsub, and xL, related to Sw, Qsur,
Qsub, and Qg, are uniformly perturbed by ±20 %
around the true state value for every time step up to the perturbation time
(PT). No correlation between time steps is considered. After PT, the model
realizations are run without perturbation in order to assess the effect on
the system memory. No assimilation and no state update are performed at
this step. From the results reported in Fig. 14 related to flood event
1, it can be observed that the model state xsur is the most
sensitive of the four states. In addition, the
perturbations of all the states seem to affect the model output even after
the PT (high system memory). For this reason, in this experiment, only the
model state xsur is updated by means of the DACO method.
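The perturbation experiment can be sketched with a stand-in model as follows. The linear reservoir, its parameters, the inflow, and the perturbation time are illustrative assumptions; the actual analysis perturbs the four states of the Bacchiglione model by ±20 %.

```python
import numpy as np

rng = np.random.default_rng(1)
k, dt = 0.5, 1.0
phi, gamma = np.exp(-k * dt), (1.0 - np.exp(-k * dt)) / k
inflow = np.full(48, 5.0)   # hypothetical hourly inflow
pt = 24                     # perturbation time (PT), in time steps

def run(perturbed):
    """Run a stand-in linear-reservoir model, multiplying the state by a
    uniform factor in [0.8, 1.2] (i.e. +/-20 %, uncorrelated in time) at
    each step before PT; after PT the run continues unperturbed."""
    x, q = 0.0, []
    for step, i in enumerate(inflow):
        x = phi * x + gamma * i
        if perturbed and step < pt:
            x *= rng.uniform(0.8, 1.2)
        q.append(k * x)
    return np.array(q)

q_ref = run(False)
ensemble = np.array([run(True) for _ in range(50)])
spread = ensemble.std(axis=0)   # ensemble spread of the model output
```

How quickly the spread decays after PT indicates the system memory: a slow decay means a state perturbation keeps affecting the output well beyond the perturbation window.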
Observed and simulated hydrographs, without updates, using
measured input (MI) and forecasted input (FI), for the three considered
flood events which occurred in 2013 (event 1), 2014 (event 2), and 2016 (event 3)
in the Bacchiglione catchment.
Scenarios 10 and 11, described in the previous sections, are used to
represent the irregular and random behaviour of CSD assimilated in the
Bacchiglione catchment.
Figures 15 and 16 show the results obtained from the experiment settings
represented in Fig. 6 during three different flood events. Three different
lead time values are considered. Different model runs (100) are performed to
account for the effect induced by the random arrival frequencies and
accuracies of CSD within the observation window as described above. Figure 15
shows that the assimilation of streamflow from the physical sensor in the
Leogra sub-catchment (setting A) provides a better streamflow prediction at
Ponte degli Angeli if compared to the assimilation of a small number of CSD
provided by a social sensor in the same location (setting B). In particular,
Fig. 15 shows that, depending on the flood event, the same NSE
values achieved with the assimilation of physical data (hourly frequency and
high accuracy) can be obtained by assimilating between 10 and 20 CSD per
hour for a 4 h lead time. This number of CSD tends to increase for increasing
values of lead times. In the event of intermittent CSD (Fig. 16), the overall
reduction of NSE is such that even with a high number of CSD
(even higher than 50 per hour) the NSE is always lower than the
one obtained assimilating physical streamflow data for any lead time.
Effect of model state perturbation on the model output for the
Bacchiglione catchment: PT indicates perturbation time; xs indicates model
state related to Sw; xsur indicates model state related
to Qsur; xsub indicates model state related to
Qsub; xL indicates model state related to Qg.
Model performance expressed as the mean of the Nash–Sutcliffe efficiency
μNSE, assimilating a different number of streamflow
crowdsourced data during the three considered flood events for the three
lead time values (left panels: 4 h; central panels: 8 h; right panels:
12 h) of scenario 10, for the five experimental settings (A to E) in the
Bacchiglione catchment.
For setting C, it can be observed for all three flood events that
distributed social sensors in Timonchio, Leogra, and Orolo sub-catchments
allow for obtaining higher model performance than the one achieved with
only one physical sensor (see Fig. 15). However,
for flood event 3, this is valid only for small lead time values. In fact,
for 8 and 12 h lead time values, the contribution of CSD tends to decrease in
favour of physical data from the Leogra sub-catchment. This effect is
predominant for intermittent CSD, scenario 11. In this case, setting C has
higher μNSE values than setting A only during flood
event 1 and for lead time values equal to 4 and 8 h (see
Fig. 16).
It is interesting to note that for setting D, during flood event 1, the μNSE is higher than that of setting C for a low number of CSD. However,
with a higher number of CSD, setting C is the one providing the best model
improvement for low lead time values. In the event of intermittent CSD, it
can be noticed that setting D always provides a higher improvement than
setting C. For flood event 1, the best model improvement is achieved for
setting E, i.e. fully integrating physical sensor with distributed social
sensors. On the other hand, during flood events 2 and 3, setting D shows
higher improvements than setting E. For intermittent CSD, the difference
between settings D and E tends to reduce for all the flood events. Overall,
settings D and E are the ones providing the highest μNSE
in both scenarios 10 and 11. This demonstrates the importance of integrating
an existing network of physical sensors (setting A) with social sensors to
improve flood predictions.
Figure 17 shows the standard deviation of the
NSE, σNSE, obtained for the different
settings for 4 h lead time. Similar results are obtained for the three flood
events. In the case of setting A, σNSE is equal to zero
since CSD are coming from the physical sensor at regular time steps. Higher
σNSE values are obtained for setting B, while
including distributed CSD (setting C) tends to decrease the value of σNSE. It can be observed that σNSE
decreases for high values of CSD. As expected, the lowest values of σNSE are achieved including the physical sensor in the data
assimilation procedure (settings D and E). Similar considerations can be
drawn for intermittent CSD, where higher and more perturbed σNSE values are obtained.
Model performance expressed as the mean of the Nash–Sutcliffe
efficiency μNSE, assimilating different number of
streamflow crowdsourced data during the three considered flood events for
the three lead time values (left panels: 4 h; central panels: 8 h;
right panels: 12 h) of scenario 11, for the five experimental settings (A
to E) in the Bacchiglione catchment.
Variability of model performance expressed as σNSE, assimilating streamflow crowdsourced data within
settings A, B, C, and D, assuming a lead time of 4 h, for experimental scenarios
10 (upper row) and 11 (bottom row), during the three considered flood events
in the Bacchiglione catchment.
Discussion
The assimilation of CSD is performed in four different case studies
considering only one social sensor location in the Brue, Sieve, and Alzette
catchments, and distributed social and physical sensors within the
Bacchiglione catchment.
In the first three catchments, different characteristics of CSD are
represented by means of 11 scenarios. Nine different flood events are used
to assess the beneficial use in assimilating CSD in the hydrological model
to improve flood forecasting.
Overall, assimilation of CSD improves model performance in all the
considered case studies. In particular, there is a limit in the number of
CSD for which satisfactory model improvements can be achieved and for which
additional CSD become redundant. This asymptotic behaviour, when extra
information is added, has also been observed using other metrics by
Krstanovic and Singh (1992), Ridolfi et al. (2014), Alfonso et al. (2013), among others. From
Fig. 9 it can be seen that, in all the considered catchments, increasing the
value of the model error induces an increase of this asymptotic value, with
a consequent reduction of the number of CSD needed to improve model performance. For this
reason, a small value of the model error is assumed in this study. In
addition, it is not possible to define a priori the number of CSD needed to
improve a model, because this number depends on the model behaviour for a
given flood event when no update is performed. In fact, as reported in Table 1 and Fig. 8, flood events
with high NSE values even without updates tend to achieve the
asymptotic values of NSE for a small number of CSD (e.g. flood
event 1 in the Brue and flood event 2 in the Sieve), while more CSD are needed for flood events
having low NSE without updates. However, for these case studies
and during these nine flood events, an indicative value of 10 CSD can be
considered to achieve a good model improvement.
Figures 9 and 10
show the μNSE and σNSE values for
scenarios 2 to 9. Figure 9 demonstrates that for
irregular arrival frequencies and constant accuracies (e.g. scenarios 2, 3,
6, and 7) the NSE is higher than for scenarios in which
accuracies are variable and arrival frequencies fixed (e.g. scenarios 4, 5,
8, and 9). These results point out that the model performance is more
sensitive to the accuracies of CSD than to the moments in time at which the
streamflow CSD become available. Overall, σNSE tends
to decrease for a high number of CSD. The combined effects of irregular
frequencies and uncertainties are reflected in scenario 5, which has lower
mean and higher standard deviation of NSE if compared to the
first four scenarios.
An interesting fact is that, passing from periodic to non-periodic scenarios,
the standard deviation σNSE is significantly reduced,
while μNSE remains the same but with a smoother trend. A
non-periodic behaviour of CSD, common in real life, helps to reduce the
fluctuation of the NSE generated by the random behaviour of
streamflow CSD. Finally, the results obtained for scenarios 10 and 11 are
shown in Fig. 12. The assimilation of an irregular number of CSD in each
observation window (scenario 10) seems to provide μNSE values
similar to those obtained with scenario 9. One of the main
outcomes is that the intermittent nature of CSD (scenario 11) induces a
drastic reduction of the NSE and an increase in its noise in
all the considered flood events. All these previous results are consistent
across the considered catchments.
In the case of the Bacchiglione catchment, the data from physical and social
sensors are assimilated within a hydrological model to improve the poor flow
prediction in Vicenza for the three considered flood events. In fact, these
predictions are affected by an underestimation of the 3-day rainfall
forecast used as input in flood forecasting practice in this area.
One of the main outcomes of these analyses is that replacing a
physical sensor (setting A) with a social sensor at only one location
(setting B) does not improve the model performance in terms of
NSE for a small number of CSD. Figures 15 and 16
show that distributed locations
of social sensors (setting C) can provide higher values of NSE
than a single physical sensor, even for a low number of CSD, when the CSD
have the characteristics of scenario 10. For flood event 1, setting C
provides a better model improvement than setting D for low lead times
and a high number of CSD. This may be because the physical sensor at Leogra
provides a constant improvement, for a given lead time, while the social
sensor tends to achieve better results with a higher number of CSD. This
dominant effect of the social sensor, for a high number of CSD, tends to
increase for higher lead times. On the other hand, for intermittent CSD
(scenario 11) this effect decreases, in particular for flood events 2 and 3.
Integrating physical and social sensors (settings D and E) induces the
highest model improvements for all three flood events. For flood event
1, assimilation from setting E appears to provide better results than
assimilation from setting D. The opposite is obtained for flood events
2 and 3. The high μNSE values of setting D may be
explained by the fact that flood events 2 and 3 are characterized by one
main peak and a similar shape, while flood event 1 has two main peaks.
Assimilation of CSD from distributed social sensors tends to reduce the
variability of the NSE coefficient in both scenarios 10 and 11.
Conclusions
This study assesses the potential use in hydrological modelling of
crowdsourced data, which are characterized by irregular availability and
variable accuracy. We demonstrate that even data with these characteristics
can improve flood prediction if integrated into hydrological models. This
opens new opportunities for exploiting the data being collected in current
citizen science projects for modelling. Our results do not
support the idea that social sensors should partially or totally replace the
existing network of physical sensors; instead, these new data should be used
to compensate for the lack of traditional observations. In fact, given
a dense network of physical sensors, the additional information from social
sensors might not be necessary because of the high accuracy of the
hydrological observations derived from physical sensors.
Four case studies, the Brue (UK), Sieve (Italy), Alzette
(Luxembourg) and Bacchiglione (Italy) catchments, are considered, and two
types of hydrological models are used. In Experiment 1 (Brue, Sieve, and
Alzette catchments), the sensitivity of the model results to the assimilation
of crowdsourced data with different frequencies and accuracies, derived
from a hypothetical social sensor at the catchment outlet, is assessed. In
Experiment 2 (Bacchiglione catchment), the influence of the combined
assimilation of crowdsourced data from a distributed network of social
sensors and existing streamflow data from physical sensors is evaluated.
Because crowdsourced streamflow data are not yet available in all case
studies, realistic synthetic data with various characteristics of arrival
frequency and accuracy are introduced. Overall, the model behaviour when
assimilating such asynchronous data is very similar across all case studies.
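The synthetic crowdsourced data can be pictured with a short sketch: observations arriving at irregular moments within an observation window, each corrupted by its own random error. All names, window lengths, and error bounds below are illustrative assumptions, not the study's exact scenario definitions:

```python
import numpy as np

def synthetic_csd(true_flow, window_hours, n_obs,
                  err_min=0.05, err_max=0.30, seed=0):
    """Generate n_obs crowdsourced streamflow observations with irregular
    arrival times within one observation window and variable accuracy
    (a relative error standard deviation drawn per observation)."""
    rng = np.random.default_rng(seed)
    t_obs = np.sort(rng.uniform(0.0, window_hours, n_obs))  # irregular arrivals
    sigma = rng.uniform(err_min, err_max, n_obs)            # variable accuracy
    q_true = true_flow(t_obs)
    q_obs = q_true * (1.0 + sigma * rng.standard_normal(n_obs))
    return t_obs, q_obs, sigma

# Example: a triangular hydrograph peaking mid-window.
t, q, s = synthetic_csd(lambda t: 20.0 - np.abs(t - 12.0),
                        window_hours=24.0, n_obs=10)
```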
In Experiment 1, it is found that increasing the number of crowdsourced data
within the observation window increases the model performance, even if these
data have irregular arrival frequencies and accuracies. Moreover, data
accuracy affects the average value of NSE more than the moments
at which these data are assimilated. The noise in the NSE is
reduced when the assimilated data have non-periodic behaviour. In addition,
the intermittent nature of the data tends to drastically reduce the
NSE of the model for different lead times. In fact, if the
intervals between the data are too large, the abundance of crowdsourced data
at other times and places can no longer compensate for their intermittency.
Experiment 2 showed that, in the Bacchiglione catchment, the integration of
data from social sensors and a single physical sensor can improve the
flood prediction even for a small number of intermittent crowdsourced data.
When physical and social sensors are located at the same place, the
assimilation of physical data gives the same model improvement as the
assimilation of a high number of non-intermittent crowdsourced data.
Overall, the integration of existing physical sensors with a new network of
social sensors can improve the model predictions. Although the cases and
models are different, the model behaviour when assimilating asynchronous
data is very similar across them.
Although we have obtained interesting results, this work has some
limitations. Firstly, the proposed assimilation method for crowdsourced
data is applied only to the linear parts of the hydrological models; the
methodology still has to be tested on models with non-linear dynamics.
Secondly, while realistic synthetic streamflow data are used in this study,
the methodology has not been tested with data coming from actual social
sensors, so the conclusions need to be confirmed using real crowdsourced
observations of water level. Finally, better methods for assessing the
quality and accuracy of data derived from social sensors need to be
considered (e.g. a pre-filtering module that retains only data of good
accuracy and discards those of low accuracy).
Future work will be aimed at addressing the limitations formulated above, which
will allow for a better characterization of the crowdsourced data, making
them a reliable data source for model-based forecasting.
Data availability
The DEM data were downloaded from the SRTM database
(http://srtm.csi.cgiar.org). The rainfall and river discharge data were
provided by the British Atmospheric Data Centre from the NERC Hydrological
Radar Experiment Dataset (Brue catchment,
http://www.badc.rl.ac.uk/data/hyrex/), and by the Alto Adriatico Water
Authority (Bacchiglione catchment). The authors are grateful to
Marco Franchini for providing the data on the Sieve catchment.
The authors declare that they have no conflict of
interest.
Acknowledgements
This research was partly funded in the framework of the EC FP7 project WeSenseIt: Citizen
Observatory of Water, grant agreement no. 308429. The authors wish to thank the editor
and three anonymous reviewers for their insightful and useful
comments. Edited by: S. Archfield
Reviewed by: three anonymous referees