Hydrol. Earth Syst. Sci., 21, 839–861, 2017
https://doi.org/10.5194/hess-21-839-2017
Copernicus Publications, Göttingen, Germany

Can assimilation of crowdsourced data in hydrological modelling improve flood prediction?

Maurizio Mazzoleni (m.mazzoleni@unesco-ihe.org), Martin Verlaan, Leonardo Alfonso, Martina Monego, Daniele Norbiato, Michele Ferri, and Dimitri P. Solomatine

UNESCO-IHE Institute for Water Education, Hydroinformatics Chair Group, Delft, the Netherlands
Deltares, Delft, the Netherlands
Alto Adriatico Water Authority, Venice, Italy
Delft University of Technology, Water Resources Section, Delft, the Netherlands

Received: 28 September 2015 – Accepted: 21 January 2017 – Published: 14 February 2017

This work is licensed under a Creative Commons Attribution 3.0 Unported License (http://creativecommons.org/licenses/by/3.0/).

Abstract
Monitoring stations have been used for decades to properly measure
hydrological variables and better predict floods. To this end, methods to
incorporate these observations into mathematical water models have also been
developed. Moreover, in recent years, continued technological advances, combined with the growing inclusion of citizens in participatory processes related to water resources management, have encouraged the growth of citizen science projects around the globe. In turn, this has stimulated the
spread of low-cost sensors to allow citizens to participate in the collection
of hydrological data in a more distributed way than the classic static
physical sensors do. However, two main disadvantages of such crowdsourced
data are the irregular availability and variable accuracy from sensor to
sensor, which makes them challenging to use in hydrological modelling. This
study aims to demonstrate that streamflow data, derived from crowdsourced
water level observations, can improve flood prediction if integrated in
hydrological models. Two different hydrological models, applied to four case
studies, are considered. Realistic (albeit synthetic) time series are used to
represent crowdsourced data in all case studies. It is found that data accuracy has much more influence on the model results than the irregular frequency at which the streamflow data are assimilated. This study demonstrates that data collected by citizens,
characterized by being asynchronous and inaccurate, can still complement
traditional networks formed by few accurate, static sensors and improve the
accuracy of flood forecasts.
Introduction
Observations of hydrological variables measured by physical sensors have
been increasingly integrated into mathematical models by means of model
updating methods. The use of these techniques allows for the reduction of
intrinsic model uncertainty and improves the flood forecasting accuracy
(Todini et al., 2005). The main idea behind model updating
techniques is to either update model input, states, parameters, or outputs as
new observations become available (Refsgaard,
1997; WMO, 1992). Input update is the classical method used in operational
forecasting, and uncertainties of the input data can be considered as the
main source of uncertainty of the model (Bergström, 1991; Canizares et al.,
1998; Todini et al., 2005). Regarding the state updating, filtering methods
such as the Kalman filter (Kalman, 1960), extended
Kalman filter (Aubert et al., 2003; Madsen and Cañizares, 1999; Verlaan, 1998), ensemble
Kalman filter (Evensen, 2006), and particle filter (Weerts and El
Serafy, 2006) are the most used approaches to update a model when new
observations are available.
Due to the complex nature of the hydrological processes, spatially and
temporally distributed measurements are needed in the model updating
procedures to ensure a proper flood prediction (Clark et al., 2008;
Mazzoleni et al., 2015; Rakovec et al., 2012). However, traditional physical
sensors require proper maintenance and personnel, which can be cost
prohibitive for a vast network. For this reason, improvements to monitoring
technology have led to the spread of low-cost sensors to measure
hydrological variables, such as water level or precipitation, in a more
distributed way. The main advantage of this type of sensor, referred to in this paper as “social sensors”, is that it can be used not only by technicians but also by regular citizens; thanks to their reduced cost and the voluntary labour of citizens, such sensors provide more spatially distributed coverage. The idea of designing these alternative networks of low-cost social sensors and using the obtained crowdsourced observations is the basis
of the European project WeSenseIt (2012–2016) and various other projects
that proposed to assess the usefulness of crowdsourced observations inferred
by low-cost sensors owned by citizens. For instance, in the project
CrowdHydrology (Lowry and Fienen, 2013), a method was developed for untrained observers to monitor stream stage at designated gauging staffs via text messages of water levels. Cifelli et al. (2005)
described a community-based network of volunteers (CoCoRaHS), engaged in
collecting precipitation measurements of rain, hail, and snow. An example of citizen-based hydrological monitoring of rainfall and streamflow, established in 2009 within the Andean ecosystems of Piura, Peru, is reported in Célleri et al. (2009). Degrossi et al. (2013)
used a network of wireless sensors to map the water level in two rivers passing through São Carlos, Brazil. Recently, the iSPUW project was
initiated to integrate data from advanced weather radar systems, innovative
wireless sensors, and crowdsourcing of data via mobile applications in
order to better predict flood events for the Dallas–Fort Worth Metroplex urban
water systems (ISPUW, 2015; Seo et al., 2014). Other examples of
crowdsourced water-related information include the so-called Crowdmap
platform for collecting and communicating the information about the floods
in Australia in 2011 (ABC, 2011) and informing citizens about the proper
time for water supply in an intermittent water system (Alfonso, 2006; Au et
al., 2000; Roy et al., 2012). Wehn et al. (2015) stressed the importance and
need of public participation in water resources management to ensure
citizens' involvement in the flood management cycle. Buytaert et al. (2014)
provide a detailed and interesting review of the examples of citizen science
applications in hydrology and water resources science. In this review paper,
the potential of citizen science, based on robust, cheap, and
low-maintenance sensing equipment, to complement more traditional ways of
scientific data collection for hydrological sciences and water resources
management is explored.
The traditional hydrological observations from physical sensors have a
well-defined structure in terms of frequency and accuracy. On the other hand,
crowdsourced observations are provided by citizens with varying experience in measuring environmental data and little connection with one another; as a consequence, low correlation between the measurements might be observed. So far, in operational hydrological practice, crowdsourced data are not integrated into forecasting models but are only used to compare model results with observations in post-event analyses. This can be attributed to the intrinsically variable accuracy of such data, the lack of confidence in the data quality from these heterogeneous sensors, and the variable life span of the crowdsourced observations.
Regarding data quality, Bordogna et al. (2014) and
Tulloch and Szabo (2012) stated that quality control
mechanisms should consider contextual conditions to deduce indicators about
reliability (the expertise level of the crowd), credibility (the volunteer
group), and performance of volunteers as they relate to accuracy,
completeness, and precision level. Bird et al. (2014) addressed the issue of
data quality in conservation ecology by means of new statistical tools to
assess random error and bias. Cortes Arevalo et al. (2014) evaluated data quality by distinguishing between in situ data collected by volunteers and by technicians and comparing the most frequent value reported at a given location. With in situ exercises, it might be possible to have an indication
of the reliability of data collected. However, this approach is not enough at
an operational level to define accuracy in data quality. For this reason, to
estimate observation accuracy in real time, one possible approach could be to
filter out the measurements following a geographic approach which defines
semantic rules governing what can occur at a given location (e.g.
Vandecasteele and Devillers, 2013). Another approach could be to compare
measurements collected within a predefined time window in order to calculate
the most frequent value, the mean, and the standard deviation.
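The second approach can be illustrated with a minimal Python sketch (the observation format, a list of hypothetical (time, value) tuples, and the rounding used to define the most frequent value are assumptions for illustration):

```python
from statistics import mean, stdev

def window_stats(observations, t_start, t_end):
    """Summarize crowdsourced values received in [t_start, t_end):
    most frequent (rounded) value, mean, and standard deviation."""
    values = [v for t, v in observations if t_start <= t < t_end]
    if not values:
        return None
    rounded = [round(v, 1) for v in values]
    mode = max(set(rounded), key=rounded.count)   # most frequent value
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"mode": mode, "mean": mean(values), "std": spread}

# hypothetical water level readings: (hour, level in m)
obs = [(0.2, 1.31), (0.5, 1.28), (0.7, 1.30), (0.9, 2.10)]
stats = window_stats(obs, 0.0, 1.0)
```

A reading far from the window statistics (such as the 2.10 m outlier above) could then be flagged or down-weighted.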
Crowdsourced observations can be defined as asynchronous because
they do not have predefined rules about the arrival frequency (the
observation might be taken once, occasionally, or at irregular time steps,
which can be smaller than the model time step) and accuracy of the
measurement. In a recent paper, Mazzoleni et al. (2015) presented results of the study of the effects of distributed
synthetic streamflow observations having synchronous intermittent temporal
behaviour and variable accuracies in a semi-distributed hydrological model.
It was shown that integrating distributed uncertain intermittent observations with single measurements coming from physical sensors would allow for further improvements in model accuracy. However, that study did not consider the possibility that asynchronous observations might arrive at moments not coordinated with the model time steps. A possible
solution to handle asynchronous observations in time with the ensemble Kalman
filter (EnKF) is to assimilate them at the moments coinciding with the model
time steps (Sakov et al., 2010). However, as
these authors mention, this approach requires the disruption of the ensemble
integration, the ensemble update, and a restart, which may not be feasible for
large-scale forecasting applications. Continuous assimilation approaches,
such as three-dimensional and four-dimensional variational methods (3D-Var
and 4D-Var), are usually implemented in oceanographic modelling in order to
integrate asynchronous observations at their corresponding arrival moments (Derber
and Rosati, 1989; Huang et al., 2002; Macpherson, 1991; Ragnoli et al.,
2012). In fact, oceanographic observations are commonly collected at
asynchronous times. For this reason, in variational data assimilation, the
past asynchronous observations are simultaneously used to minimize the cost
function that measures the weighted difference between background states and
observations over the time interval, and identify the best estimate of the
initial state condition (Drecourt, 2004; Ide et al., 1997; Li and Navon, 2001).
In addition to the 3D-Var and 4D-Var
methods, Hunt et al. (2004) proposed a four-dimensional ensemble Kalman filter (4DEnKF) which adapts EnKF to handle
observations that have occurred at non-assimilation times. Furthermore, for
linear dynamics, 4DEnKF is equivalent to the instantaneous assimilation of
the measured data (Hunt et al., 2004).
Similarly to 4DEnKF, Sakov et al. (2010)
proposed a modification of the EnKF, the asynchronous ensemble Kalman filter
(AEnKF), to assimilate asynchronous observations (Rakovec et
al., 2015). Contrary to the EnKF, in the AEnKF, current and past observations
are simultaneously assimilated at a single analysis step without the use of
an adjoint model. Yet another approach to assimilate asynchronous
observations in models is the so-called first-guess at the appropriate time
(FGAT) method. Like in 4D-Var, the FGAT compares the observations with the
model at the observation time. However, in FGAT, the innovations are assumed
constant in time and remain the same within the assimilation window
(Massart et al., 2010). In light of the reviewed approaches, and given the linearity of the hydrological models implemented here, this study uses a pragmatic method to assimilate the asynchronous crowdsourced observations.
The main objective of this study is to assess the potential use of
crowdsourced data within hydrological modelling. In particular, the specific
objectives of this study are (a) to assess the influence of different arrival
frequencies and accuracies of crowdsourced data from a single social sensor
on the assimilation performance and (b) to integrate distributed low-cost
social sensors with a single physical sensor to assess the improvement in
the streamflow prediction in an early warning system. The methodology is applied in the Brue (UK), Sieve (Italy), Alzette (Luxembourg), and Bacchiglione (Italy) catchments, using a lumped hydrological model for the first three and a semi-distributed model for the Bacchiglione. Synthetic time series, asynchronous in time and with random accuracies, are generated to imitate the crowdsourced data.
The study is organized as follows. Firstly, the case studies, the
crowdsourced data and the datasets used are presented. Secondly, the
hydrological models, the procedure used to integrate the crowdsourced data,
and the set of experiments are reported. Finally, the results, discussion,
and conclusions are presented.
Site locations and data
Case studies
Four different case studies are used to validate the obtained results for
areas having diverse topographical and hydrometeorological features and
represented by two different hydrological models. The Brue, Sieve, and
Alzette catchments are considered because of the availability of
precipitation and streamflow data, while the Bacchiglione catchment is one
of the official case studies of the WeSenseIt Project (Huwald et al., 2013).
Brue catchment
The first case study is located in the Brue catchment
(Fig. 1), in Somerset, with a drainage area of
about 135 km2 at the catchment outlet in Lovington. The Shuttle Radar
Topography Mission digital elevation model (SRTM DEM) at 90 m resolution is used to derive the topographical characteristics, the stream network, and the resulting time of concentration (about 10 h), estimated by means of the Giandotti equation (Giandotti, 1933). The hourly
precipitation (49 rainfall stations) and streamflow data used in this study
are supplied by the British Atmospheric Data Centre from the HYREX
(Hydrological Radar Experiment) project (Moore et al., 2000; Wood et
al., 2000). The average precipitation value in the catchment is estimated
using ordinary kriging (Matheron, 1963).
Representation of the four case studies considered in this
study; clockwise: Brue catchment; Sieve catchment; Alzette catchment; Bacchiglione
catchment.
Sieve catchment
The second case study is the Sieve catchment (Fig. 1), a tributary of the Arno River, located in the central Apennines in Italy. The catchment has a drainage area of about 822 km2 and a length of 56 km, and it covers mostly hilly and mountainous areas with an average elevation of 470 m above sea level. The time of
concentration of the Sieve catchment is about 12 h. Hourly streamflow data
are provided by the Centro Funzionale di Monitoraggio Meteo
Idrologico-Idralico of the Tuscany Region at the outlet section of the
catchment at Fornacina. The mean areal precipitation is calculated by the
Thiessen polygon method using 11 rainfall stations (Solomatine and Dulal, 2003).
Alzette catchment
The Alzette catchment is located largely within the Grand Duchy of Luxembourg. The drainage area of the catchment is about 288 km2, and the river has a length of 73 km across France and Luxembourg. The catchment
covers cultivated land, grassland, forest land, and urbanized land
(Fenicia et al., 2007). The Thiessen polygon method
is used for averaging the series at the individual stations and calculating
hourly rainfall series (Fenicia et al., 2007),
while streamflow data are measured at the Hesperange gauging station.
Bacchiglione catchment
The last case study is the upstream part of the Bacchiglione River basin,
located in the north-east of Italy, a tributary of the Brenta River, which flows into the Adriatic Sea at the south of the Venetian Lagoon and at the
north of the Po River delta. The study area has an extent of about 400 km2 and a river length of about 50 km (Ferri et al., 2012). The main urban area in the downstream part of the study area
is Vicenza. The analysed part of the Bacchiglione River has three main
tributaries. On the western side are the confluences with the Bacchiglione
of the Leogra and the Orolo rivers, while on the eastern side is the
Timonchio River (see Fig. 2). The Alto Adriatico
Water Authority (AAWA) has implemented an early warning system to forecast
the possible future flood events.
Structure of the hydrological model and location of the physical
(green dots), social (red dots), and Ponte degli Angeli (PA, blue dots)
sensors implemented in the Bacchiglione catchment by the Alto Adriatico Water
Authority.
Crowdsourced data
Social sensors can be used by citizens to provide crowdsourced distributed
hydrological observations such as precipitation and water level. An example
of these sensors is a staff gauge complemented by a quick response (QR) code, from which citizens can read the water level indication and send observations via a mobile phone application. Another example is the collection of rainfall data via
lab-generated videos (Alfonso et al., 2015). Recently, within the activities
of the WeSenseIt Project (Huwald et al.,
2013), one physical sensor and three staff gauges complemented by a QR code
were installed in the Bacchiglione River to measure the water level. In
particular, the physical sensor is located at the outlet of the Leogra
catchment while the three social sensors are located at the Timonchio,
Leogra, and Orolo catchments outlet, respectively (see Fig. 2).
It is worth noting that, in most cases, it is difficult to directly assimilate water level observations within hydrological models. Moreover, it is highly unrealistic to assume that citizens would observe streamflow directly. For this reason, crowdsourced observations of water level are used to calculate crowdsourced data (CSD) of streamflow by means of rating curves assessed for the specific river location, and these streamflow values can be easily assimilated into hydrological models. It is because of both the
uncertainty in rating curve estimation at the social sensor location and the
error in the water level measurements that CSD have such low and variable
accuracies when compared to streamflow data estimated from classic physical
sensors. CSD are then assimilated within mathematical models as described in
Fig. 3 (“overall information flow”).
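The conversion from a water level reading to streamflow can be sketched with a power-law rating curve of the form Q = a(h − h0)^b; the parameter values below are hypothetical and would in practice be fitted for each river section:

```python
def rating_curve(h, a=25.0, h0=0.15, b=1.8):
    """Power-law rating curve Q = a * (h - h0)**b.
    Parameters a, h0, b are illustrative, not fitted values."""
    if h <= h0:
        return 0.0          # below the gauge datum: no flow reading
    return a * (h - h0) ** b

# a water level of 1.05 m read from a staff gauge
Q = rating_curve(1.05)      # streamflow in m3/s
```

Uncertainty in a, h0, and b is one of the reasons why CSD of streamflow have lower and more variable accuracy than data from physical sensors.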
Graphical representation of the methodology proposed to estimate
streamflow from crowdsourced observations of water level: (a) crowdsourced
observations of water level are turned into streamflow crowdsourced data
(CSD) by means of rating curves assessed for the specific river location;
(b) the streamflow CSD within the hydrological model are assimilated.
In most hydrological applications, streamflow data from physical sensors are
derived (and integrated into hydrological models) at regular, synchronous
time steps. In contrast, crowdsourced water level observations are obtained
by diverse types of citizens at random moments (when a citizen decides to
send data). Thus, from the modelling viewpoint, CSD have three main
characteristics: (a) irregular arrival frequency (asynchronicity), (b) random
accuracy, and (c) random number of CSD received within two model time steps.
Because streamflow CSD are not available in the case studies at the moment
of this study, realistic synthetic CSD with these characteristics are
generated (“considered information flow” in Fig. 3).
For the Brue, Sieve, and Alzette catchments, observed hourly streamflow data
at the catchments' outlets are interpolated to represent CSD coming at arrival
frequencies higher than hourly. For the Bacchiglione catchment, synthetic
hourly CSD of streamflow are calculated using measured precipitation
recorded during the considered flood events (post-event simulation) as input
in the hydrological model of the Bacchiglione catchment. A similar approach,
termed “observing system simulation experiment” (OSSE), is commonly used
in meteorology to estimate synthetic “true” states and measurements by
introducing random errors in the state and measurement equations (Arnold
and Dey, 1986; Errico et al., 2013; Errico and Privé, 2014). OSSEs have
the advantage of making it possible to compare estimates to true states
and they are often used for validating the data assimilation algorithms.
Further details and assumptions regarding the characteristics of CSD and
related uncertainty are provided in the next sections.
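One plausible way to generate synthetic CSD with the three characteristics above is sketched below, assuming a known hourly "true" streamflow series and a uniform relative error; the exact error model and arrival process used in the experiments are described in the following sections:

```python
import random

random.seed(42)  # for reproducibility of the sketch

def synthetic_csd(hourly_q, n_obs, max_error=0.2):
    """Generate n_obs crowdsourced streamflow values at random
    (asynchronous) times: linearly interpolate the hourly 'true'
    series and perturb each value with a random relative error,
    so accuracy varies from observation to observation."""
    horizon = len(hourly_q) - 1
    csd = []
    for _ in range(n_obs):
        t = random.uniform(0, horizon)            # irregular arrival time (h)
        i = int(t)
        q_true = hourly_q[i] + (t - i) * (hourly_q[i + 1] - hourly_q[i])
        q_obs = q_true * (1 + random.uniform(-max_error, max_error))
        csd.append((t, q_obs))
    return sorted(csd)

series = [5.0, 8.0, 14.0, 11.0, 7.0]              # hourly streamflow (m3/s)
data = synthetic_csd(series, n_obs=6)
```

Because arrival times are drawn at random, several CSD may fall between two model time steps, or none at all, reproducing characteristic (c) above.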
Datasets
Three flood events for each one of the four described catchments are
considered to assess the assimilation of CSD in hydrological modelling.
For the Brue catchment, a 2-year time series (June 1994 to May 1996) of observed streamflow and precipitation data is available for model calibration and validation. For the Sieve catchment, only
3 months of hourly runoff, streamflow, and precipitation data (December 1959
to February 1960) are available (Solomatine and Shrestha, 2003). For the
Alzette catchment, 2-year hourly data (July 2000 to June 2002) are used
for the model calibration and validation (Fenicia et al., 2007). For these
catchments, the observed precipitation values are treated as the “perfect
forecasts” and are fed into the hydrological model.
For the Bacchiglione catchment, three flood events that occurred in 2013, 2014,
and 2016 are considered. In particular, the 2013 event had high intensity and caused several traffic disruptions at various locations upstream of Vicenza. The forecast time series of precipitation (3-day weather
forecast) is used as input to the hydrological model. In all the case
studies, the observed values of streamflow at the catchment outlet (Ponte
degli Angeli for the Bacchiglione) are used to assess the performance of the
hydrological model.
Methodology
Hydrological modelling
Lumped model
A lumped conceptual hydrological model is implemented to estimate the
streamflow hydrograph at the outlet section of the Brue, Sieve, and Alzette
catchments. The choice of the model is based on previous studies performed
in the Brue catchment (Mazzoleni et al., 2015). Direct runoff is the input to the conceptual model; it is estimated by means of the Soil Conservation Service curve number method (Mazzoleni et al., 2015). The average curve number value within the catchment is calibrated
by minimizing the difference between the simulated volume and observed
quick flow, using the method proposed by Eckhardt (2005),
at the outlet section.
The main module of the hydrological model is based on the
Kalinin–Milyukov–Nash (KMN; Szilagyi and Szollosi-Nagy, 2010)
equation:
Q_t = \frac{1}{k}\cdot\frac{1}{(n-1)!}\int_{t_0}^{t}\left(\frac{\tau}{k}\right)^{n-1} e^{-\tau/k}\, I_{t-\tau}\,\mathrm{d}\tau,
where I is the model forcing (in this case direct runoff),
n (number of storage elements) and k (storage capacity
expressed in hours) are the two model parameters, and Q is the model
output (streamflow in m3 s-1). In this study, the parameter k is assumed to be proportional to the time of concentration through a coefficient ck. The discrete state-space system of Eq. (1)
derived by Szilagyi and Szollosi-Nagy (2010) is used in this study
to apply the data assimilation approach (Mazzoleni
et al., 2015, 2016).
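The routing idea behind the KMN cascade can be sketched as n linear reservoirs in series, here discretized with a simple explicit Euler step (a sketch only; the study itself uses the exact discrete state-transition matrices of Szilagyi and Szollosi-Nagy, 2010):

```python
def nash_cascade(runoff, n=3, k=5.0, dt=1.0):
    """Route direct runoff through n linear reservoirs with storage
    coefficient k (hours), using an explicit Euler discretization.
    Parameter values are illustrative, not the calibrated ones."""
    storages = [0.0] * n
    outflow = []
    for i_t in runoff:
        inflow = i_t
        for j in range(n):
            q_j = storages[j] / k                 # linear reservoir outflow
            storages[j] += dt * (inflow - q_j)    # storage balance
            inflow = q_j                          # feeds the next reservoir
        outflow.append(inflow)                    # cascade output Q_t
    return outflow

q = nash_cascade([0, 10, 20, 5, 0, 0, 0, 0])      # direct runoff pulse
```

The cascade attenuates and delays the input pulse, which is the behaviour the two parameters n and k control.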
The model calibration is performed by maximizing the Nash–Sutcliffe efficiency
(NSE) and the correlation between the simulated and observed
value of streamflow, at the outlet points of the Brue, Sieve, and Alzette
catchments, using historical time series. The results of the calibration
provided a value of the parameters n and ck equal to
4 and 0.026, 1 and 0.0055, and 1 and 0.00064 for the Brue, Sieve, and
Alzette catchments, respectively.
Semi-distributed model
The hydrological and routing models used in this study are based on the
early warning system implemented by the AAWA and described in
Ferri et al. (2012). One of the goals of this study,
in the framework of the WeSenseIt Project, is to test our methodology using
synthetic CSD in the existing early warning system of the Bacchiglione
catchment.
In the schematization of the Bacchiglione catchment, the location of
physical and social sensors corresponds to the outlet section of three main
sub-catchments, Timonchio, Leogra, and Orolo, while the remaining
sub-catchments are considered as inter-catchments. For both sub-catchments
and inter-catchments, a conceptual hydrological model, described below, is
used to estimate the outflow (streamflow) hydrograph. The streamflow
hydrograph of the three main sub-catchments is considered as the upstream
boundary conditions of a routing model used to propagate the flow up to the
catchment outlet (see Fig. 2), while the outflow
from the inter-catchment is considered as an internal boundary condition to
account for their corresponding drained area. In the following, a brief
description of the main components of the hydrological and routing models is
provided.
The input for the hydrological model consists of precipitation only. The
hydrological response of the catchment is estimated using a hydrological
model that considers the routines for runoff generation and a simple routing
procedure. The processes related to runoff generation (surface, sub-surface,
and deep flow) are modelled mathematically by applying the water balance to a
control volume representative of the active soil at the sub-catchment scale.
The water content Sw in the soil is updated at each
calculation step dt using the following balance equation:
S_{w,t+\mathrm{d}t} = S_{w,t} + P_t - R_{\mathrm{sur},t} - R_{\mathrm{sub},t} - L_t - E_{T,t},
where P and ET are the components of precipitation
and evapotranspiration, while Rsur, Rsub, and L
are the surface runoff, sub-surface runoff, and deep
percolation model states, respectively (see Fig. 2). The surface runoff
Rsur is expressed by the equation
based on specifying the critical threshold beyond which the mechanism of
Dunnian flow (saturation excess mechanism) prevails:
R_{\mathrm{sur},t} =
\begin{cases}
C\cdot\dfrac{S_{w,t}}{S_{w,\max}}\cdot P_t & P_t \le f = S_{w,\max}\cdot\dfrac{S_{w,\max}-S_{w,t}}{S_{w,\max}-C\cdot S_{w,t}}\\[1ex]
P_t-\left(S_{w,\max}-S_{w,t}\right) & P_t > f,
\end{cases}
where C is a coefficient of soil saturation obtained by calibration, and
Sw,max is the content of water at saturation point which depends
on the nature of the soil and on its use.
The sub-surface flow is considered proportional to the difference between the water content S_{w,t} at time t and that at soil capacity S_c:
R_{\mathrm{sub},t} = c\cdot\left(S_{w,t}-S_c\right),
while the estimated deep flow is evaluated according to the expression
proposed by Laio et al. (2001):
L_t = \frac{K_S}{e^{\beta\left(1-S_c/S_{w,\max}\right)}-1}\cdot\left(e^{\beta\left(S_{w,t}-S_c\right)/S_{w,\max}}-1\right),
where KS is the hydraulic conductivity of the soil in
saturation conditions and β is a dimensionless exponent
characteristic of the size and distribution of pores in the soil. The
evaluation of the real evapotranspiration is performed assuming it as a
function of the water content in the soil and potential evapotranspiration,
calculated using the formulation of Hargreaves and Samani (1982).
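The balance and flux equations above can be sketched as a single update step; all parameter values below are illustrative assumptions, not the AAWA calibration:

```python
from math import exp

def water_balance_step(S, P, ET, Smax=150.0, Sc=90.0, C=0.4,
                       c=0.02, Ks=2.0, beta=12.0):
    """One step of the conceptual soil water balance.
    Units: mm per time step; parameters are illustrative."""
    f = Smax * (Smax - S) / (Smax - C * S)        # saturation threshold
    if P <= f:
        Rsur = C * (S / Smax) * P                 # saturation-excess runoff
    else:
        Rsur = P - (Smax - S)
    Rsub = c * max(S - Sc, 0.0)                   # sub-surface flow
    if S > Sc:                                    # deep percolation (Laio)
        L = Ks / (exp(beta * (1 - Sc / Smax)) - 1) * \
            (exp(beta * (S - Sc) / Smax) - 1)
    else:
        L = 0.0
    S_new = S + P - Rsur - Rsub - L - ET          # water balance update
    return S_new, Rsur, Rsub, L

S1, Rsur, Rsub, L = water_balance_step(S=100.0, P=10.0, ET=1.0)
```

The three fluxes Rsur, Rsub, and L then feed the linear-reservoir routing of Qsur, Qsub, and Qg described next.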
Knowing the values of Rsur, Rsub, and L, it is possible
to model the surface Qsur, sub-surface Qsub, and deep flow Qg routed
contributions according to the conceptual framework of the linear reservoir at
the closing section of the single sub-catchment. In particular, in the case of
Qsur, the value of the parameter k, which is a
function of the residence time in the catchment slopes, is estimated
by relating the velocity to the average slope length. However, one of the
challenges is to properly estimate such velocity, which should be calculated
for each flood event (Rinaldo and Rodriguez-Iturbe,
1996). According to Rodríguez-Iturbe et al. (1982), this
velocity is a function of the effective rainfall intensity and
the event duration. In this study, the estimation of the surface velocity is
performed using the relation between velocity and intensity of rainfall
excess proposed in Kumar et al. (2002) to
estimate the average travel time and the consequent parameter k.
However, this formulation is applied in a lumped way for a given
sub-catchment. As reported in McDonnell and Beven (2014),
more reliable and distributed models should be used to reproduce the spatial
variability of the residence times over time within the catchment. That is
why, in the advanced version of the model implemented by AAWA, in each
sub-catchment the runoff propagation is carried out according to the
geomorphological theory of the hydrologic response. The overall catchment
travel time distributions are considered as nested convolutions of
statistically independent travel time distributions along sequentially
connected, and objectively identified, smaller sub-catchments. The correct
estimation of the residence time should be derived considering the latest
findings reported in McDonnell and Beven (2014). Regarding
Qsub and Qg, the value of k is
calibrated comparing the observed and simulated streamflow at Vicenza.
In the early warning system implemented by AAWA in the Bacchiglione
catchment, the flood propagation along the main river channel is represented
by a one-dimensional hydrodynamic model, MIKE 11 (DHI, 2007). However, in order to reduce the computational time required by the analyses performed in this study, MIKE 11 is replaced by a Muskingum–Cunge model (see, e.g. Todini, 2007), considering rectangular river cross-sections for the estimation of hydraulic radii, wave celerities, and other hydraulic variables.
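The routing logic can be illustrated with the classical linear Muskingum recursion; the Muskingum–Cunge variant used in the study would estimate K and X from channel geometry and wave celerity at each step, whereas the values below are fixed and hypothetical:

```python
def muskingum_route(inflow, K=2.0, X=0.2, dt=1.0):
    """Linear Muskingum routing: O[t] = c0*I[t] + c1*I[t-1] + c2*O[t-1].
    K (h) and X are illustrative; Muskingum-Cunge derives them from
    hydraulic properties of the reach."""
    denom = 2 * K * (1 - X) + dt
    c0 = (dt - 2 * K * X) / denom
    c1 = (dt + 2 * K * X) / denom
    c2 = (2 * K * (1 - X) - dt) / denom           # c0 + c1 + c2 = 1
    out = [inflow[0]]
    for t in range(1, len(inflow)):
        out.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * out[-1])
    return out

q_out = muskingum_route([10, 30, 60, 40, 20, 10, 10])
```

The routed hydrograph is attenuated and delayed with respect to the inflow, as expected of flood-wave propagation along the reach.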
Calibration of the hydrological model parameters is performed by AAWA, and
described in Ferri et al. (2012), considering the time
series of precipitation from 2000 to 2010 in order to minimize the root mean
square error between observed and simulated values of water level at the Ponte
degli Angeli gauged station. In order to stay as close as possible to the
early warning system implemented by AAWA, we used the same calibrated model
parameters proposed by Ferri et al. (2012).
Data assimilation procedure
Kalman filter
In data assimilation, it is typically assumed that the dynamic system can be
represented in the state space as follows:
x_t = M\left(x_{t-1},\vartheta,I_t\right)+w_t, \quad w_t \sim N\left(0,S_t\right)
z_t = H\left(x_t,\vartheta\right)+v_t, \quad v_t \sim N\left(0,R_t\right),
where xt and xt-1 are state vectors at time t and
t-1, M is the model operator that propagates the state x from
its previous condition to the new one as a response to the inputs It,
while H is the operator which maps the model states into output
zt. The system and measurement errors wt and vt are
assumed normally distributed with zero mean and covariance S and
R. In a hydrological modelling system, these states can represent the
water stored in the soil (soil moisture, groundwater) or on the earth's
surface (snow pack). These states are one of the governing factors that
determine the hydrograph response to the inputs into the catchment.
For the linear systems used in this study, the discrete state-space system
of Eq. (1) can be represented as follows (Szilagyi and
Szollosi-Nagy, 2010):
x_t = \Phi x_{t-1} + \Gamma I_t + w_t
Q_t = H x_t + v_t,
where t is the time step, x is the vector of the model states
(stored water volume in m3), Φ is the state-transition
matrix (function of the model parameters n and k), Γ is the
input-transition matrix, and H is the output matrix. For example,
for n = 3, the matrix H is expressed as H = [0, 0, k]. Expressions for matrices Φ and Γ can
be found in Szilagyi and Szollosi-Nagy (2010).
For the Bacchiglione model (semi-distributed model), a preliminary
sensitivity analysis on the model states (soil content Sw and the
storage water xsur, xsub, and xL related to
Qsur, Qsub, and Qg) is performed in order to
decide on which of the states to update. The results of this analysis (shown
in the next section) pointed out that the stored water volume
xsur (estimated using Eq. 8 with n = 1, H = [k], and It
replaced by Rsur) is the most sensitive state, and for this reason
we decided to update only this state.
The Kalman filter (KF; Kalman, 1960) is a
mathematical tool which allows estimating, in an efficient computational
(recursive) way, the state of a process which is governed by a linear
stochastic difference equation. The KF is optimal under the assumption that the
error in the process is Gaussian; in this case, the KF is derived by minimizing
the variance of the system error assuming that the model state estimate is
unbiased.
The Kalman filter procedure can be divided into two steps, namely the
forecast equations (Eqs. 10 and 11) and the update (or analysis) equations
(Eqs. 12, 13, and 14):
$$x_t^- = \Phi x_{t-1}^+ + \Gamma I_t \tag{10}$$
$$P_t^- = \Phi P_{t-1}^+ \Phi^{T} + S \tag{11}$$
$$K_t = P_t^- H^{T}\left(H P_t^- H^{T} + R\right)^{-1} \tag{12}$$
$$x_t^+ = x_t^- + K_t\left(Q_t^{o} - H x_t^-\right) \tag{13}$$
$$P_t^+ = \left(I - K_t H\right) P_t^-, \tag{14}$$
where Kt is the Kalman gain matrix, P is the error covariance
matrix, and Qo is a new observation. In this study, the observed value
of streamflow Qo is equal to the synthetic CSD estimated as
described above. The prior model states x at time t are updated, as
the response to the new available observation, using the analysis
equations Eqs. (12) to (14). This allows for estimation of the values of the
updated state (with superscript +) and then assessing the background
estimates (with superscript -) for the next time step using the time update
equations, Eqs. (10) and (11). The proper characterization of the model
covariance matrix S is a fundamental issue in the Kalman filter. In
this study, in order to evaluate the effect of assimilating CSD, small
values of the model error S are considered for each case study: covariance
matrices S with diagonal values of 1, 25, and 1 m$^6$ s$^{-2}$ are
considered for the Brue, Sieve, and Alzette catchments, respectively. The
larger value of S in the Sieve catchment is due to the higher flow magnitude
in this catchment compared to the other two. A sensitivity analysis of model
performance depending on the value of S is reported in the Results section.
For the Bacchiglione catchment, S is estimated, for each given flood event,
as the variance between observed and simulated flow values.
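The forecast–update cycle of Eqs. (10)–(14) can be sketched as follows. The one-state model, its matrices, the inflow, and the observed streamflow below are illustrative values, not quantities taken from the case studies.

```python
import numpy as np

def kf_forecast(x_post, P_post, Phi, Gamma, inflow, S):
    """Forecast step (Eqs. 10 and 11): propagate state and error covariance."""
    x_prior = Phi @ x_post + Gamma * inflow
    P_prior = Phi @ P_post @ Phi.T + S
    return x_prior, P_prior

def kf_update(x_prior, P_prior, H, R, q_obs):
    """Analysis step (Eqs. 12-14): assimilate an observed streamflow q_obs."""
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)  # Kalman gain
    x_post = x_prior + K @ (q_obs - H @ x_prior)              # state update
    P_post = (np.eye(x_prior.size) - K @ H) @ P_prior         # covariance update
    return x_post, P_post

# One-state example (n = 1, so H = [k]); all numbers are illustrative
k = 0.5
Phi = np.array([[np.exp(-k)]])
Gamma = np.array([(1.0 - np.exp(-k)) / k])
H = np.array([[k]])
S, R = np.array([[1.0]]), np.array([[0.04]])

x, P = np.array([10.0]), np.array([[2.0]])
x, P = kf_forecast(x, P, Phi, Gamma, inflow=5.0, S=S)
x, P = kf_update(x, P, H, R, q_obs=np.array([4.0]))
```

Because the observation error R is much smaller than the model error S here, the update pulls the simulated streamflow H x strongly towards the observation while reducing the error covariance.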
Assimilation of crowdsourced data
As described in the previous section, a main characteristic of CSD is that
they are highly uncertain and asynchronous in time. Various methods have been
proposed to include asynchronous observations in models. Having reviewed
them, in this study, we are proposing a somewhat simpler approach of data
assimilation of crowdsourced observations (DACO). This method is based on
the assumption that the change in the model states and in the error
covariance matrices within the two consecutive model time steps
t0 and t (observation window) is linear, while the
inputs are assumed constant. All CSD received during the observation window
are individually assimilated in order to update the model states and output
at time t. Therefore, assuming that one CSD is available at time
t0*, the first step of DACO (A in Fig. 4) is the definition of the model states and
error covariance matrix at t0* as
$$x_{t_0^*}^- = x_{t_0}^+ + \left(x_t^- - x_{t_0}^+\right)\frac{t_0^* - t_0}{t - t_0} \tag{15}$$
$$P_{t_0^*}^- = P_{t_0}^+ + \left(P_t^- - P_{t_0}^+\right)\frac{t_0^* - t_0}{t - t_0}. \tag{16}$$
The second step (B in Fig. 4) is the estimation of the updated model states
and error covariance matrix in response to the streamflow CSD
$Q_{t_0^*}^{o}$. The posterior values $x_{t_0^*}^+$ and $P_{t_0^*}^+$ are
estimated by Eqs. (13) and (14), respectively. The Kalman gain is estimated
by Eq. (12), where the prior values of the model states and error covariance
matrix at $t_0^*$ are used. Knowing the posterior values $x_{t_0^*}^+$ and
$P_{t_0^*}^+$, it is possible to predict the values of the states and
covariance matrix one model step ahead, at $t^*$ (C in Fig. 4), using the
model forecast equations, Eqs. (10) and (11).
Graphical representation of the data assimilation of the
crowdsourced observations (DACO) method used in this study to
assimilate asynchronous streamflow crowdsourced data.
The last step (D in Fig. 4) is the estimation of the interpolated value of
x and P at time step t. This is performed by means of a linear
interpolation between the current values of x and P at
t0* and t*:
$$\tilde{x}_t^- = x_{t_0^*}^+ + \left(x_{t^*}^- - x_{t_0^*}^+\right)\frac{t - t_0^*}{t^* - t_0^*} \tag{17}$$
$$\tilde{P}_t^- = P_{t_0^*}^+ + \left(P_{t^*}^- - P_{t_0^*}^+\right)\frac{t - t_0^*}{t^* - t_0^*}. \tag{18}$$
The symbol ∼ is added to the new matrices x and P in order to
differentiate them from the values originally forecasted at t. Assuming
that new streamflow CSD are available at an intermediate time t1*
(between t0* and t), the procedure is repeated considering the
values at t0* and t for the linear interpolation. Then, when no
more CSD are available, the updated value of x̃t- is
used to predict the model states and output at t+ 1 (Eqs. 10 and 11).
Finally, in order to account for the intermittent behaviour of these CSD, the
approach proposed by Mazzoleni et al. (2015) is applied. In this method, the
model state vector x is updated and forecasted when CSD are available,
while without CSD the model is run using Eq. (10) and the covariance matrix
P is propagated to the next time step using Eq. (11).
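A scalar sketch of the four DACO steps (A to D in Fig. 4) for a single CSD arriving inside the observation window is given below. The model coefficients, error variances, and observation value are illustrative assumptions, and the Kalman update is written in scalar form for a single state.

```python
import numpy as np

def lerp(a0, a1, frac):
    """Linear interpolation used in steps A and D."""
    return a0 + (a1 - a0) * frac

def daco_step(x0_post, P0_post, xt_prior, Pt_prior,
              t0, t, t_obs, q_obs, phi, gamma, inflow, H, R, S):
    """Assimilate one CSD arriving at t_obs, with t0 < t_obs < t (scalar sketch).

    A: interpolate state and covariance from (t0, t) to t_obs;
    B: Kalman update with the CSD q_obs (Eqs. 12-14, scalar form);
    C: forecast one model step ahead, to t_star = t_obs + (t - t0);
    D: interpolate between t_obs and t_star back to the model time step t.
    """
    frac = (t_obs - t0) / (t - t0)
    # A: prior values at t_obs
    x_obs = lerp(x0_post, xt_prior, frac)
    P_obs = lerp(P0_post, Pt_prior, frac)
    # B: update with the CSD
    K = P_obs * H / (H * P_obs * H + R)
    x_obs = x_obs + K * (q_obs - H * x_obs)
    P_obs = (1.0 - K * H) * P_obs
    # C: forecast to t_star (Eqs. 10 and 11, scalar form)
    x_star = phi * x_obs + gamma * inflow
    P_star = phi * P_obs * phi + S
    # D: interpolate back to t; note t_star - t_obs equals the model step t - t0
    frac_d = (t - t_obs) / (t - t0)
    return lerp(x_obs, x_star, frac_d), lerp(P_obs, P_star, frac_d)

x_t, P_t = daco_step(x0_post=10.0, P0_post=1.0, xt_prior=11.0, Pt_prior=1.5,
                     t0=0.0, t=1.0, t_obs=0.4, q_obs=6.0,
                     phi=np.exp(-0.5), gamma=(1 - np.exp(-0.5)) / 0.5,
                     inflow=5.0, H=0.5, R=0.04, S=1.0)
```

For a new CSD arriving at a later intermediate time, the same routine would be called again with the interpolated values as the starting point.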
Crowdsourced data accuracy
In this section, the uncertainty related to CSD is characterized. The
observational error is assumed to be normally distributed noise with zero mean and
given standard deviation
$$\sigma_t^{Q} = \alpha_t \cdot Q_t^{o}, \tag{19}$$
where the coefficient α is related to the degree of
uncertainty of the measurement (Weerts and El
Serafy, 2006).
One of the main and obvious issues in citizen-based observations is to
maintain the quality control of the water observations
(Cortes Arevalo et al., 2014; Engel and Voshell Jr., 2002).
In the Introduction, a number of methods to model observational uncertainty
were mentioned. In this study, the coefficient α is assumed to
be a random variable uniformly distributed between 0.1 and 0.3; a more
thorough investigation of the uncertainty level of CSD is left for future
studies. The maximum value of α is assumed to be 3 times the
uncertainty of the physical sensors, to account for the additional
uncertainty of the rating curve at the social sensor location.
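The error model above can be sketched as follows; the "true" streamflow value and the random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def synthetic_csd(q_true):
    """Corrupt a 'true' streamflow value with Gaussian noise of zero mean
    and standard deviation sigma = alpha * Q (Eq. 19), where the accuracy
    coefficient alpha is drawn uniformly from [0.1, 0.3] per observation."""
    alpha = rng.uniform(0.1, 0.3)
    return q_true + rng.normal(0.0, alpha * q_true)

# 1000 synthetic CSD around a true streamflow of 50 m^3/s (illustrative)
obs = [synthetic_csd(50.0) for _ in range(1000)]
```

The generated sample is unbiased (zero-mean noise) but heteroscedastic: larger flows carry proportionally larger observation errors.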
Experimental setup
In this section, two sets of experiments are performed in order to test the
proposed method and assess the benefit of integrating CSD, asynchronous in
time and with variable accuracies, in real-time flood forecasting.
In the first set of experiments, called “Experiment 1”, assimilation of
streamflow CSD at one social sensor location is carried out in the Brue,
Alzette, and Sieve catchments to understand the sensitivity of the employed
hydrological model – KMN – under various scenarios of these data.
In the second set of experiments, called “Experiment 2”, the distributed
CSD coming from social and physical sensors, at four locations within the
Bacchiglione catchment, are considered, with the aim of assessing the
improvement in the flood forecasting accuracy.
Experiment 1: assimilation of crowdsourced data from one social
sensor
The focus of Experiment 1 is to study the performance of the hydrological
model (KMN) when assimilating CSD with arrival frequencies different from
the model time step and with random accuracies, coming from a social sensor
located at the outlet of each of the Brue, Sieve, and Alzette catchments.
To analyse all possible combinations of arrival frequencies, number of CSD
within the observation window (1 h), and accuracies, a set of scenarios are
considered (Fig. 5), changing from regular arrival frequencies of CSD with
high accuracies (scenario 1) to random and chaotic asynchronous CSD with
variable accuracies (scenario 11). In each scenario, a varying number of CSD
from 1 to 100 is considered. It is worth noting that for one CSD per hour and
regular arrival time, scenario 1 corresponds to the case of physical sensors
with observation arrival frequencies of 1 h.
Experimental scenarios representing different configurations of
arrival frequencies, number, and accuracies of streamflow crowdsourced data.
Scenario 2 corresponds to the case of CSD having fixed accuracies
(α equal to 0.1) and irregular arrival moments, but in
which at least one CSD coincides with the model time step. In particular,
scenarios 1 and 2 coincide for one CSD available within the observation
window since it is assumed that the arrival frequencies of that CSD have to
coincide with the model time step. On the other hand, the arrival
frequencies of CSD in scenario 3 are assumed random and CSD might not arrive
at the model time step.
Scenario 4 considers CSD with regular frequencies but random accuracies at
different moments within the observation window, whereas in scenario 5 CSD
have irregular arrival frequencies and random accuracies. In all the
previous scenarios, the arrival frequencies, the number, and accuracies of CSD
are assumed periodic, i.e. repeated between consecutive observation windows
along all the time series. However, this periodic repetitiveness might not
occur in real life, and for this reason, a non-periodic behaviour is assumed
in scenarios 6, 7, 8, and 9. The non-periodicity assumptions of the arrival
frequencies and accuracies are the only factors that differentiate scenarios 6,
7, 8, and 9 from scenarios 2, 3, 4, and 5, respectively. In addition,
the non-periodicity of the number of CSD within the observation window is
introduced in scenario 10.
Finally, in scenario 11, CSD, in addition to all the previous
characteristics, might have an intermittent behaviour, i.e. not being
available for one or more observation windows.
Experiment 2: spatially distributed physical and social
sensors
Synthetic CSD with the characteristics reported in scenarios 10 and 11 of
Experiment 1 are generated due to the unavailability of streamflow CSD
during this study. In order to evaluate the model performance, observed and
simulated streamflows are compared for different lead times.
Streamflow data from physical sensors are assimilated in the hydrological
model of the AMICO (Alto Adriatico Modello Idrologico e idrauliCO) system at an hourly frequency, while CSD from social sensors
are assimilated using the DACO method previously described. The updated
hydrograph estimated by the hydrological model is used as the input into
the Muskingum–Cunge model used to propagate the streamflow downstream to the
gauged station at Ponte degli Angeli, Vicenza.
The main goal of Experiment 2 is to understand the contribution of
distributed CSD to the improvement of the flood prediction at a specific
point of the catchment, in this case at Ponte degli Angeli. For this reason,
five different settings are introduced, and represented in
Fig. 6, corresponding to different types of
employed sensors.
Experiment 2: characteristics of the five experimental settings (A
to E) implemented within the Bacchiglione catchment: location of the social
and physical sensors (dots), hydrological model update based on different
sensors (coloured areas).
Firstly, only streamflow data from one physical sensor at the Leogra
sub-catchment are assimilated to update the hydrological model of
sub-catchment B (Fig. 2) of setting A
(Fig. 6). On the other hand, in setting B, CSD
from the social sensor located at the Leogra sub-catchment are assimilated.
In setting C, CSD from three distributed social sensors are integrated into
the hydrological model. Setting D accounts for the integration of CSD from
two social sensors and physical data from the physical sensor in the Leogra
sub-catchment. Finally, setting E considers the complete integration between
physical and social sensors in Leogra and the two social sensors in the
Timonchio and Orolo sub-catchments.
Results
Experiment 1: influence of crowdsourced data on flood forecasting
The observed and simulated streamflow hydrographs at the outlet section of
the Brue, Sieve, and Alzette catchments with and without the model update
(considering hourly streamflow data) are reported in Fig. 7 for nine
different flood events for 1 h lead time. As expected, it can be seen that
the updated model tends to better represent the flood events than the model
without updating in all the case studies. However, this improvement is
closely related to the value of the matrix S. The higher the
S value (i.e. the more uncertain the model), the closer the model output gets to the
observation. For this reason, a sensitivity analysis on the influence of the
matrix S on the assimilation of CSD for scenario 1, i.e. coming
and assimilated at regular time steps within the observation windows, is
reported in Fig. 8. The results of Fig. 8 are related to the first flood
events of the Brue, Sieve, and Alzette catchments. Increasing the number of
CSD within the observation window results in an improvement of the
NSE for different values of model error. However, this
improvement becomes negligible beyond a given threshold number of CSD, which
is a function of the considered flood event. This means that additional CSD
do not add information useful for improving the model performance. Overall,
increasing the value of the model error S tends to increase
NSE values as mentioned before. For this reason, to better
evaluate the effect of assimilating CSD, a small value of S, i.e.
a model more accurate than CSD, is assumed.
Observed (black line) and simulated hydrographs, with (red line)
and without (blue line) assimilation, for the flood events which occurred in the
three catchments: Brue (upper row), Sieve (middle row), and Alzette (bottom
row).
Model improvement in terms of Nash–Sutcliffe efficiency
(NSE), during flood event 1 for each case study, for different
values of the model error matrix S and 24 h lead time, assimilating
streamflow CSD according to scenario 1.
In scenario 1, the arrival frequencies are set as regular for different
model runs, so the moments and accuracies in which CSD became available are
always the same for any model run. However, for the other scenarios, the
irregular moments in which CSD become available within the observation
window and their accuracies are randomly selected and change according to
the different model runs. This results in random model performance and,
consequently, random NSE values. In order to remove such random
behaviour, different model runs (100 in this case) are carried out, assuming
different random values of arrivals and accuracies (coefficient
α) during each model run, for a given number of CSD and
lead time. The NSE value is estimated for each model run, so
μNSE and σNSE represent the mean
and standard deviation of the different values of NSE.
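The Monte Carlo procedure described above (100 runs summarized by μNSE and σNSE) can be sketched as follows. The hydrograph and the perturbation standing in for a model run with random CSD arrivals and accuracies are illustrative assumptions.

```python
import numpy as np

def nse(q_obs, q_sim):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    q_obs, q_sim = np.asarray(q_obs), np.asarray(q_sim)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

rng = np.random.default_rng(0)
q_observed = 10.0 + 40.0 * np.sin(np.linspace(0.0, np.pi, 48))  # hypothetical hydrograph

# Each "run" stands in for one model run with randomly arriving, randomly
# accurate CSD; here it is mimicked by perturbing the observations.
nse_sample = []
for _ in range(100):
    q_sim = q_observed + rng.normal(0.0, 2.0, size=q_observed.size)
    nse_sample.append(nse(q_observed, q_sim))

mu_nse, sigma_nse = np.mean(nse_sample), np.std(nse_sample)
```

The pair (μNSE, σNSE) then characterizes both the average skill of the updated model and its sensitivity to the random behaviour of the CSD.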
For scenarios 2 and 3 (represented using warm red and orange colours in
Figs. 9 and 10 for
lead times equal to 24 h), the μNSE values are smaller
but comparable to the ones obtained for scenario 1 for all the considered flood
events and case studies. In particular, scenario 3 has lower μNSE
than scenario 2. This can relate to the fact that both
scenarios have random arrival frequencies; however, in scenario 3, CSD are
not provided at model time steps, as opposed to scenario 2. From
Fig. 10, higher values of σNSE can be observed for scenario 3.
Scenario 2 has the lowest standard
deviation for low values of CSD because the arrival frequencies have to
coincide with the model time step and this stabilizes the NSE.
In particular, for an increasing number of CSD, σNSE
tends to decrease. However, a constant trend of σNSE
can be observed, due to particular characteristics of the flood events, in
the case of flood event 1 of the Sieve and flood events 2 and 3 of the Alzette. It is
worth noting that scenario 1 has zero standard deviation because CSD are
assumed to come at the same moments with the same accuracies for all 100 model runs.
Dependency of the mean of the Nash–Sutcliffe efficiency sample,
μNSE, on the number of streamflow crowdsourced data in
the experimental scenarios 1 to 9 for the considered flood events in the
three catchments: Brue (upper row), Sieve (middle row), and Alzette (bottom
row).
Dependency of the standard deviation of the Nash–Sutcliffe
efficiency sample, σNSE, on the number of streamflow
crowdsourced data in the experimental scenarios 1 to 9 for the considered
flood events in the three catchments: Brue (upper row), Sieve (middle row),
and Alzette (bottom row).
In scenario 4, represented using blue colour, CSD are considered to come at
regular time steps but have random accuracies.
Figure 9 shows that μNSE values
are lower for scenario 4 than for scenarios 2 and 3. This is related to the
higher influence of CSD accuracies if compared to arrival frequencies. High
variability in the model performance, especially for low values of CSD,
can be observed in scenario 4 (Fig. 10).
The combined effects of random arrival frequencies and CSD accuracies are
represented in scenario 5 using a magenta colour (i.e. the combination of
warm and cold colours used for scenarios 2, 3, and 4) in
Figs. 9 and 10. As
expected, this scenario has the lowest μNSE and the
highest σNSE values, compared to those reported
above.
The remaining scenarios (6 to 9) are equivalent to scenarios 2 to 5
with the only difference being that they are non-periodic in time. For this
reason, in Figs. 9 and 10, scenarios from 6 to 9 have the same
colour as scenarios 2 to 5 but indicated with a dashed line in order to
underline their non-periodic behaviour. Overall, it can be observed that
non-periodic scenarios have μNSE values similar to those of their
corresponding periodic scenarios. However, their μNSE trends are
smoother, which can be explained by the lower σNSE values:
model performance fluctuates less for non-periodic CSD than for periodic
CSD. Table 1 shows the
NSE values and model improvement obtained for the different
experimental scenarios during the different flood events. Small improvements
are obtained when NSE is already high for one CSD as for the
Sieve catchment during flood event 2 or the Alzette catchment during flood event 2.
Moreover, it can be seen that a lower improvement is achieved for the
scenarios (2, 3, 6, and 7) where arrival frequencies are random and
accuracies fixed than for the scenarios (4, 5, 8, and 9) where
arrival frequencies are regular and accuracies random.
NSE improvements (%), from 1 to 50 CSD, for
different experimental scenarios during the nine flood events that occurred in
the Brue, Sieve, and Alzette catchments.
Representation of the errors in flood peak timing, ERRT,
and intensity, ERRI (as described in Eqs. 20 and 21), as a
function of the number of streamflow crowdsourced data and the experimental
scenarios (1 to 9), for three different flood peaks that occurred during
flood event 2 in the Brue catchment.
In the previous analysis, model improvements are expressed only in terms of
NSE. However, statistics such as NSE only explain
the overall model accuracy and not the real increases/decreases in prediction
error. Therefore, increases in model accuracy due to the assimilation of CSD
have to be presented in different ways, e.g. as an increased accuracy of the
flood peak magnitude and timing. For this reason, additional analyses are
carried out to assess the change in flood peak prediction considering three
peaks that occurred during flood event 2 in the Brue catchment (see Fig. 7). Errors in
the flood peak timing, ERRT, and intensity, ERRI, are
estimated as
$$\mathrm{ERR}_T = t_P^{o} - t_P^{s} \tag{20}$$
$$\mathrm{ERR}_I = \frac{Q_P^{o} - Q_P^{s}}{Q_P^{o}}, \tag{21}$$
where $t_P^{o}$ and $t_P^{s}$ are the observed and simulated peak times (h),
while $Q_P^{o}$ and $Q_P^{s}$ are the observed and simulated peak
streamflows (m$^3$ s$^{-1}$). From the results reported in Fig. 11, considering a 12 h
lead time, it can be observed that, overall, error reduction in peak
prediction is achieved for an increasing number of CSD. In particular,
assimilation of CSD has more influence on the reduction of the peak
intensity error than of the peak timing error. In fact, only a small
reduction of ERRT, of about 1 h, is obtained even when increasing the number
of CSD. For both ERRI and ERRT, the largest error reduction is obtained
considering fixed CSD accuracies and random arrival frequencies (e.g.
scenarios 1, 2, 3, 6, and 7). In fact, the smallest ERRI values
are obtained for scenario 1, while scenarios 5 and 9 are the ones that show
the lowest improvement in terms of peak prediction. These conclusions are
very similar to the previous ones obtained analysing only NSE
as model performance measures.
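The peak-error metrics of Eqs. (20) and (21) can be computed directly from the hydrographs, as in the sketch below; the short hourly series are illustrative.

```python
import numpy as np

def peak_errors(t, q_obs, q_sim):
    """Errors in flood peak timing (ERR_T, hours) and relative intensity
    (ERR_I), following ERR_T = t_P^o - t_P^s and
    ERR_I = (Q_P^o - Q_P^s) / Q_P^o (Eqs. 20 and 21)."""
    i_obs, i_sim = int(np.argmax(q_obs)), int(np.argmax(q_sim))
    err_t = t[i_obs] - t[i_sim]
    err_i = (q_obs[i_obs] - q_sim[i_sim]) / q_obs[i_obs]
    return err_t, err_i

# Illustrative hourly hydrographs around a single flood peak
t = np.arange(10)                                           # hours
q_obs = np.array([1, 2, 5, 9, 12, 10, 7, 4, 2, 1], float)   # m^3/s
q_sim = np.array([1, 2, 4, 8, 10, 11, 8, 5, 3, 1], float)
err_t, err_i = peak_errors(t, q_obs, q_sim)
```

In this example the simulated peak arrives 1 h later than the observed one (negative ERRT) and underestimates the peak intensity by about 8 %.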
The combination of all the previous scenarios is represented by scenario 10,
where a changing number of CSD in each observation window is considered. In
scenario 11, the intermittent nature of CSD is accounted for as well. The μNSE and σNSE values of these
scenarios obtained for the considered flood events are shown in
Fig. 12. It can be observed that scenario 10
tends to provide higher μNSE and lower σNSE values, for a given flood event, if compared to
scenario 11. In fact, intermittency in CSD tends to reduce model
performance and increase the variability of NSE values for
random configuration of arrival frequencies and CSD accuracies. In
particular, σNSE tends to be constant for an increasing
number of CSD.
Dependency of the mean μNSE and standard
deviation σNSE of the Nash–Sutcliffe efficiency sample
(first row and second row, respectively) on the number of streamflow
crowdsourced data in
scenarios 10 (solid lines) and 11 (dashed lines)
for the considered flood events (black, blue, red lines) in the three
catchments: Brue (left panel), Sieve (central panels), and Alzette (right
panels).
Experiment 2: influence of distributed physical and social
sensors
Three different flood events that occurred in the Bacchiglione catchment are used
for Experiment 2. Figure 13 shows the observed and simulated streamflow
value at the outlet section of Vicenza. In particular, two simulated time
series of streamflow are calculated using the measured and forecasted time
series of precipitation as input for the hydrological model. Overall, an
underestimation of the observed streamflow can be observed when using the
forecasted input, while the results achieved using the measured
precipitation tend to properly represent the observations. In order to find
out which model states lead to a
maximum increase of the model performance, a preliminary sensitivity analysis
is performed. The four model states, xS, xsur,
xsub, and xL, related to Sw, Qsur,
Qsub, and Qg, are uniformly perturbed by ±20 %
around the true state value for every time step up to the perturbation time
(PT). No correlation between time steps is considered. After PT, the model
realizations are run without perturbation in order to assess the effect on
the system memory. No assimilation and no state update are performed at
this step. From the results reported in Fig. 14 related to flood event
1, it can be observed that the model state xsur is the most
sensitive of the four states. In addition, the
perturbations of all the states seem to affect the model output even after
the PT (high system memory). For this reason, in this experiment, only the
model state xsur is updated by means of the DACO method.
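The perturbation experiment can be sketched with a stand-in model as follows. The linear reservoir, its parameters, the inflow, and the perturbation time are illustrative assumptions; the actual analysis perturbs the four states of the Bacchiglione model by ±20 %.

```python
import numpy as np

rng = np.random.default_rng(1)
k, dt = 0.5, 1.0
phi, gamma = np.exp(-k * dt), (1.0 - np.exp(-k * dt)) / k
inflow = np.full(48, 5.0)   # hypothetical hourly inflow
pt = 24                     # perturbation time (PT), in time steps

def run(perturbed):
    """Run a stand-in linear-reservoir model, multiplying the state by a
    uniform factor in [0.8, 1.2] (i.e. +/-20 %, uncorrelated in time) at
    each step before PT; after PT the run continues unperturbed."""
    x, q = 0.0, []
    for step, i in enumerate(inflow):
        x = phi * x + gamma * i
        if perturbed and step < pt:
            x *= rng.uniform(0.8, 1.2)
        q.append(k * x)
    return np.array(q)

q_ref = run(False)
ensemble = np.array([run(True) for _ in range(50)])
spread = ensemble.std(axis=0)   # ensemble spread of the model output
```

How quickly the spread decays after PT indicates the system memory: a slow decay means a state perturbation keeps affecting the output well beyond the perturbation window.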
Observed and simulated hydrographs, without updates, using
measured input (MI) and forecasted input (FI), for the three considered
flood events which occurred in 2013 (event 1), 2014 (event 2), and 2016 (event 3)
in the Bacchiglione catchment.
Scenarios 10 and 11, described in the previous sections, are used to
represent the irregular and random behaviour of CSD assimilated in the
Bacchiglione catchment.
Figures 15 and 16 show the results obtained from the experiment settings
represented in Fig. 6 during three different flood events. Three different
lead time values are considered. Different model runs (100) are performed to
account for the effect induced by the random arrival frequencies and
accuracies of CSD within the observation window as described above. Figure 15
shows that the assimilation of streamflow from the physical sensor in the
Leogra sub-catchment (setting A) provides a better streamflow prediction at
Ponte degli Angeli if compared to the assimilation of a small number of CSD
provided by a social sensor in the same location (setting B). In particular,
Fig. 15 shows that, depending on the flood event, the same NSE
values achieved with the assimilation of physical data (hourly frequency and
high accuracy) can be obtained by assimilating between 10 and 20 CSD per
hour for a 4 h lead time. This number of CSD tends to increase for increasing
values of lead times. In the event of intermittent CSD (Fig. 16), the overall
reduction of NSE is such that even with a high number of CSD
(even higher than 50 per hour) the NSE is always lower than the
one obtained assimilating physical streamflow data for any lead time.
Effect of model state perturbation on the model output for the
Bacchiglione catchment: PT indicates perturbation time; xs indicates model
state related to Sw; xsur indicates model state related
to Qsur; xsub indicates model state related to
Qsub; xL indicates model state related to Qg.
Model performance expressed as the mean of the Nash–Sutcliffe efficiency
μNSE, assimilating a different number of streamflow
crowdsourced data during the three considered flood events for the three
lead time values (left panels: 4 h; central panels: 8 h; right panels:
12 h) of scenario 10, for the five experimental settings (A to E) in the
Bacchiglione catchment.
For setting C, it can be observed for all three flood events that
distributed social sensors in Timonchio, Leogra, and Orolo sub-catchments
allow for obtaining higher model performance than the one achieved with
only one physical sensor (see Fig. 15). However,
for flood event 3, this is valid only for small lead time values. In fact,
for 8 and 12 h lead time values, the contribution of CSD tends to decrease in
favour of physical data from the Leogra sub-catchment. This effect is
predominant for intermittent CSD, scenario 11. In this case, setting C has
higher μNSE values than setting A only during flood
event 1 and for lead time values equal to 4 and 8 h (see
Fig. 16).
It is interesting to note that for setting D, during flood event 1, the μNSE is higher than that of setting C for a low number of CSD. However,
with a higher number of CSD, setting C is the one providing the best model
improvement for low lead time values. In the event of intermittent CSD, it
can be noticed that setting D always provides a higher improvement than
setting C. For flood event 1, the best model improvement is achieved for
setting E, i.e. fully integrating physical sensor with distributed social
sensors. On the other hand, during flood events 2 and 3, setting D shows
higher improvements than setting E. For intermittent CSD, the difference
between settings D and E tends to reduce for all the flood events. Overall,
settings D and E are the ones providing the highest μNSE
in both scenarios 10 and 11. This demonstrates the importance of integrating
an existing network of physical sensors (setting A) with social sensors to
improve flood predictions.
Figure 17 shows the standard deviation of the
NSE, σNSE, obtained for the different
settings for 4 h lead time. Similar results are obtained for the three flood
events. In the case of setting A, σNSE is equal to zero
since CSD are coming from the physical sensor at regular time steps. Higher
σNSE values are obtained for setting B, while
including distributed CSD (setting C) tends to decrease the value of σNSE. It can be observed that σNSE
decreases for high values of CSD. As expected, the lowest values of σNSE are achieved including the physical sensor in the data
assimilation procedure (settings D and E). Similar considerations can be
drawn for intermittent CSD, where higher and more perturbed σNSE values are obtained.
Model performance expressed as the mean of the Nash–Sutcliffe
efficiency μNSE, assimilating different number of
streamflow crowdsourced data during the three considered flood events for
the three lead time values (left panels: 4 h; central panels: 8 h;
right panels: 12 h) of scenario 11, for the five experimental settings (A
to E) in the Bacchiglione catchment.
Variability of model performance expressed as σNSE, assimilating streamflow crowdsourced data within
settings A, B, C, and D, assuming a lead time of 4 h, for experimental scenarios
10 (upper row) and 11 (bottom row), during the three considered flood events
in the Bacchiglione catchment.
Discussion
The assimilation of CSD is performed in four different case studies
considering only one social sensor location in the Brue, Sieve, and Alzette
catchments, and distributed social and physical sensors within the
Bacchiglione catchment.
In the first three catchments, different characteristics of CSD are
represented by means of 11 scenarios. Nine different flood events are used
to assess the beneficial use in assimilating CSD in the hydrological model
to improve flood forecasting.
Overall, assimilation of CSD improves model performance in all the
considered case studies. In particular, there is a limit in the number of
CSD for which satisfactory model improvements can be achieved and for which
additional CSD become redundant. This asymptotic behaviour, when extra
information is added, has also been observed using other metrics by
Krstanovic and Singh (1992), Ridolfi et al. (2014), Alfonso et al. (2013), among others. From
Fig. 9 it can be seen that, in all the considered catchments, increasing the
value of the model error induces an increase of this asymptotic value, with
a consequent reduction of the number of CSD needed to improve model performance. For this
reason, a small value of the model error is assumed in this study. In
addition, it is not possible to define a priori the number of CSD needed to
improve a model, because this number depends on the model behaviour for a
given flood event when no update is performed. In fact, as reported in Table 1 and Fig. 8, flood events
with high NSE values even without updates tend to achieve the
asymptotic values of NSE for a small number of CSD (e.g. flood
event 1 in the Brue and flood event 2 in the Sieve), while more CSD are needed for flood events
having low NSE without updates. However, for these case studies
and during these nine flood events, an indicative value of 10 CSD can be
considered to achieve a good model improvement.
Figures 9 and 10
show the μNSE and σNSE values for
scenarios 2 to 9. Figure 9 demonstrates that for
irregular arrival frequencies and constant accuracies (e.g. scenarios 2, 3,
6, and 7) the NSE is higher than for scenarios in which
accuracies are variable and arrival frequencies fixed (e.g. scenarios 4, 5,
8, and 9). These results point out that the model performance is more
sensitive to the accuracies of CSD than to the moments in time at which the
streamflow CSD become available. Overall, σNSE tends
to decrease for a high number of CSD. The combined effects of irregular
frequencies and uncertainties are reflected in scenario 5, which has lower
mean and higher standard deviation of NSE if compared to the
first four scenarios.
An interesting fact is that, passing from periodic to non-periodic scenarios,
the standard deviation σNSE is significantly reduced,
while μNSE remains the same but with a smoother trend. A
non-periodic behaviour of CSD, common in real life, helps to reduce the
fluctuation of the NSE generated by the random behaviour of
streamflow CSD. Finally, the results obtained for scenarios 10 and 11 are
shown in Fig. 12. The assimilation of an irregular number of CSD in each
observation window (scenario 10) seems to provide μNSE values
similar to those obtained with scenario 9. One of the main
outcomes is that the intermittent nature of CSD (scenario 11) induces a
drastic reduction of the NSE and an increase in its noise in
all the considered flood events. All these previous results are consistent
across the considered catchments.
In the case of the Bacchiglione catchment, the data from physical and social
sensors are assimilated within a hydrological model to improve the poor flow
prediction in Vicenza for the three considered flood events. In fact, these
predictions are affected by an underestimation of the 3-day rainfall
forecast used as input in flood forecasting practice in this area.
One of the main outcomes of these analyses is that replacing a
physical sensor (setting A) with a social sensor at only one location
(setting B) does not improve the model performance in terms of
NSE for a small number of CSD. Figures 15 and 16
show that distributed locations
of social sensors (setting C) can provide higher values of NSE
than a single physical sensor, even for a low number of CSD, when the CSD
have the characteristics of scenario 10. For flood event 1, setting C
provides a better model improvement than setting D for low lead times
and a high number of CSD. This may be because the physical sensor at Leogra
provides a constant improvement, for a given lead time, while the social
sensor tends to achieve better results with a higher number of CSD. This
dominant effect of the social sensor, for a high number of CSD, tends to
increase for higher lead times. On the other hand, for intermittent CSD
(scenario 11) this effect decreases, in particular for flood events 2 and 3.
Integrating physical and social sensors (settings D and E) induces the
highest model improvements for all three flood events. For flood event
1, assimilation from setting E appears to provide better results than
assimilation from setting D. The opposite is obtained for flood events
2 and 3. The high μNSE values of setting D may be
explained by the fact that flood events 2 and 3 are characterized by one
main peak and a similar shape, while flood event 1 has two main peaks.
Assimilation of CSD from distributed social sensors tends to reduce the
variability of the NSE coefficient in both scenarios 10 and 11.
Conclusions
This study assesses the potential use in hydrological modelling of
crowdsourced data, which are characterized by irregular availability and
variable accuracy. We demonstrate that even data with these characteristics
can improve flood prediction if integrated into hydrological models. This
opens new opportunities for exploiting the data being collected in current
citizen science projects for modelling. Our results do not
support the idea that social sensors should partially or totally replace the
existing network of physical sensors; instead, these new data should be used
to compensate for the lack of traditional observations. In fact, given
a dense network of physical sensors, the additional information from social
sensors might not be necessary because of the high accuracy of the
hydrological observations derived from physical sensors.
Four case studies, the Brue (UK), Sieve (Italy), Alzette
(Luxembourg) and Bacchiglione (Italy) catchments, are considered, and two
types of hydrological models are used. In Experiment 1 (Brue, Sieve, and
Alzette catchments), the sensitivity of the model results to the assimilation
of crowdsourced data with different frequencies and accuracies, derived
from a hypothetical social sensor at the catchment outlet, is assessed. In
Experiment 2 (Bacchiglione catchment), the influence of the combined
assimilation of crowdsourced data from a distributed network of social
sensors and existing streamflow data from physical sensors is evaluated.
Because crowdsourced streamflow data are not yet available in all case
studies, realistic synthetic data with various characteristics of arrival
frequency and accuracy are introduced. Overall, the model behaviour when
assimilating such asynchronous data is very similar across all case studies.
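The synthetic crowdsourced data can be pictured with a short sketch: observations arriving at irregular moments within an observation window, each corrupted by its own random error. All names, window lengths, and error bounds below are illustrative assumptions, not the study's exact scenario definitions:

```python
import numpy as np

def synthetic_csd(true_flow, window_hours, n_obs,
                  err_min=0.05, err_max=0.30, seed=0):
    """Generate n_obs crowdsourced streamflow observations with irregular
    arrival times within one observation window and variable accuracy
    (a relative error standard deviation drawn per observation)."""
    rng = np.random.default_rng(seed)
    t_obs = np.sort(rng.uniform(0.0, window_hours, n_obs))  # irregular arrivals
    sigma = rng.uniform(err_min, err_max, n_obs)            # variable accuracy
    q_true = true_flow(t_obs)
    q_obs = q_true * (1.0 + sigma * rng.standard_normal(n_obs))
    return t_obs, q_obs, sigma

# Example: a triangular hydrograph peaking mid-window.
t, q, s = synthetic_csd(lambda t: 20.0 - np.abs(t - 12.0),
                        window_hours=24.0, n_obs=10)
```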
In Experiment 1, it is found that increasing the number of crowdsourced data
within the observation window increases the model performance, even if these
data have irregular arrival frequencies and accuracies. Moreover, data
accuracy affects the average value of NSE more than the moments
at which these data are assimilated. The noise in the NSE is
reduced when the assimilated data have non-periodic behaviour. In addition,
the intermittent nature of the data tends to drastically reduce the
NSE of the model for different lead times. In fact, if the
intervals between the data are too large, the abundance of crowdsourced data
at other times and places can no longer compensate for their intermittency.
Experiment 2 showed that, in the Bacchiglione catchment, the integration of
data from social sensors and a single physical sensor can improve the
flood prediction even for a small number of intermittent crowdsourced data.
When physical and social sensors are located at the same place, the
assimilation of physical data gives the same model improvement as the
assimilation of a high number of non-intermittent crowdsourced data.
Overall, the integration of existing physical sensors with a new network of
social sensors can improve the model predictions. Although the cases and
models are different, the model behaviour when assimilating asynchronous
data is very similar across them.
Although we have obtained interesting results, this work has some
limitations. Firstly, the proposed assimilation method for crowdsourced
data is applied only to the linear parts of the hydrological models; the
methodology still has to be tested on models with non-linear dynamics.
Secondly, while realistic synthetic streamflow data are used in this study,
the methodology has not been tested with data coming from actual social
sensors, so the conclusions need to be confirmed using real crowdsourced
observations of water level. Finally, better methods for assessing the
quality and accuracy of data derived from social sensors need to be
considered (e.g. a pre-filtering module that retains only data of good
accuracy and discards those of low accuracy).
Future work will be aimed at addressing the limitations formulated above, which
will allow for a better characterization of the crowdsourced data, making
them a reliable data source for model-based forecasting.
Data availability
The DEM data were downloaded from the SRTM database
(http://srtm.csi.cgiar.org). The rainfall and river discharge data were
provided by the British Atmospheric Data Centre from the NERC Hydrological
Radar Experiment Dataset (Brue catchment,
http://www.badc.rl.ac.uk/data/hyrex/), and by the Alto Adriatico Water
Authority (Bacchiglione catchment). The authors are grateful to
Marco Franchini for providing the data on the Sieve catchment.
The authors declare that they have no conflict of
interest.
Acknowledgements
This research was partly funded in the framework of the EC FP7 project WeSenseIt: Citizen
Observatory of Water, grant agreement no. 308429. The authors wish to thank the editor
and three anonymous reviewers for their insightful and useful
comments. Edited by: S. Archfield
Reviewed by: three anonymous referees