Sensors and sensor networks play an important role in decision-making related to water quality, operational streamflow forecasting, flood early warning systems, and other areas. In this paper we review a number of existing applications and analyse a variety of evaluation and design procedures for sensor networks with respect to various criteria. Most of the existing approaches focus on maximising the observability and information content of a variable of interest. In the context of hydrological modelling, only a few studies use the performance of the hydrological simulation in terms of output discharge as a design criterion. In addition to the review, we propose a framework for classifying the existing design methods, and a generalised procedure for optimal network design in the context of rainfall–runoff hydrological modelling.
Optimal design of sensor networks is a key procedure for improved water management as it provides information about the states of water systems. As the processes taking place in catchments are complex and the measurements are limited, the design of sensor networks is (and has been) a relevant topic since the beginning of the International Hydrological Decade (1965–1974, TNO, 1986) until today (Pham and Tsai, 2016). During this period, the scientific community has not yet arrived at an agreement about a unified methodology for sensor network design due to the diversity of cases, criteria, assumptions, and limitations. This is evident from the range of existing reviews on hydrometric network design, such as those presented by WMO (1972), TNO (1986), Nemec and Askew (1986), Knapp and Marcus (2003), Pryce (2004), NRC (2004), and Mishra and Coulibaly (2009).
The design of rainfall and streamflow sensor networks depends to a large extent on the scale of the processes to be monitored and the objectives to address (TNO, 1986; Loucks et al., 2005). Therefore, the temporal and spatial resolution of measurements are driven by the measurement objectives. For example, information for long-term planning does not require the same level of temporal resolution as for operational hydrology (WMO, 2009; Dent, 2012). On the global and country scale, sensor networks are commonly used for climate studies and trend detection (Cihlar et al., 2000; Grabs and Thomas, 2002; WMO, 2009; Environment Canada, 2010; Marsh, 2010; Whitfield et al., 2012), and are denoted as National Climate Reference Networks (WMO, 2009). On a regional or catchment scale, applications require careful selection of monitoring stations, since water resource planning and management decisions, such as operational hydrology and water allocation, require high temporal and spatial resolution data (Dent, 2012).
This paper presents a review of methods for optimal design and evaluation of
precipitation and discharge sensor networks at catchment scale, proposes a
framework for classifying the design methods, and suggests a generalised
framework for optimal network design for surface hydrological modelling. It
is possible to extend this framework to other variables in the hydrological
cycle, since optimal sensor location problems are similar. The framework
introduced here is part of the results of the FP7 WeSenseIt project.
The structure of this paper is as follows: first, a classification of sensor network design approaches according to the explicit use of measurements and models is presented, including a review of existing studies. Next, a second way of classification is suggested, which is based on the classes of methods for sensor network analysis, including statistics, information theory, case-specific recommendations, and others. Then, based on the reviewed literature, an aggregation of approaches and classes is presented, identifying potential opportunities for improvement. Finally, a general procedure for the optimal design of sensor networks is proposed, followed by conclusions and recommendations.
The design of a sensor network uses the same concepts as experimental design (Kiefer and Wolfowitz, 1959; Fisher, 1974). The design should ensure that the data are sufficient and representative, and can be used to derive the conclusions required from the measurements (EPA, 2002), or to assess the water status of a river system (EC, 2000). In the context of rainfall–runoff hydrological modelling, the design should provide sufficient data for accurate simulation and forecasting of discharge and water levels at stations of interest.
The objectives of the sensor network design have been categorised into two groups, the optimality alphabet (Fedorov, 1972; Box, 1982; Fedorov and Hackl, 1997; Pukelsheim, 2006; Montgomery, 2012), which uses different letters to name different design criteria, and the Bayesian framework (Chaloner and Verdinelli, 1995; DasGupta, 1996). The alphabetic design is based on the linearisation of models, optimising particular criteria of the information matrix (Fedorov and Hackl, 1997). Bayesian methods are centred on principles of decision-making under uncertainty, seeking to maximise the gain in information (Shannon, 1948) between the prior and posterior distributions of parameters, inputs, or outputs (Lindley, 1956; Chaloner and Verdinelli, 1995). Among the most used alphabetic objectives are D-optimality, which minimises the volume of the uncertainty ellipsoid around the model parameters, and G-optimality, which minimises the maximum variance of the predicted variable; both can also be used as objective functions in Bayesian design.
These general objectives are indirectly addressed in the literature on the optimisation of hydrometric sensor networks through several functional alternatives. These approaches do not consider block experimental design (Kirk, 2009), because initial conditions cannot be replicated in uncontrolled environments such as natural catchments.
On the practical side, the design of a sensor network should start with the institutional set-up, purposes, objectives, and priorities of the network (Loucks et al., 2005; WMO, 2008b). From the technical point of view, an optimal measurement strategy requires the identification of the process for which data are required (Casman et al., 1988; Dent, 2012). Because information objectives are neither unique nor constant, and the characterisation of the processes is never complete, the sensor network design should be re-evaluated on a regular basis. Therefore, the sensor network should be re-evaluated when the studied process, information needs, information use, or modelling objectives change. Consequently, regulations regarding monitoring activities are often strict not in terms of station density, but in the suitability of data for providing information about the status of the water system (EC, 2000; EPA, 2002).
The design of meteorological and hydrometric sensor networks should consider at least three aspects. First, it should meet various objectives that are sometimes conflicting (Loucks et al., 2005; Kollat et al., 2011). Second, it should be robust in the event of failure of one or more measurement stations (Kotecha et al., 2008). Third, it must take into account different purposes and users with different temporal and spatial scales (Singh et al., 1986). Therefore, the design of an optimal sensor network is a multi-objective problem (Alfonso et al., 2010b).
The sensor network design can also be seen from an economic perspective (Loucks et al., 2005). In most cases, the main limitation in the deployment of sensor networks is related to costs, which are sometimes the main driver of decisions to reduce monitoring networks. The trade-off between the cost of the sensor network and the cost of having insufficient information is not usually assessed, because the assessment of the consequences of decisions is made a posteriori (Loucks et al., 2005; Alfonso et al., 2016). In most studies, the improvement of information content metrics (e.g. entropy, uncertainty reduction, among others) becomes marginal as the number of extra sensors increases (Pardo-Iguzquiza, 1998; Dong et al., 2006; Ridolfi et al., 2011), and thus the adequate number of sensors can be selected based on a threshold in the rate of increase of the objective function. However, in many practical applications the number of available sensors may be defined by budget limitations. Therefore, the optimal number of sensors in a network is strictly case-specific (WMO, 2008c).
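To illustrate the diminishing-returns argument, the following sketch adds sensors greedily and stops once the marginal gain falls below a threshold. The coverage-set gain used here is a hypothetical stand-in for an information metric such as joint entropy or variance reduction:

```python
# Sketch: choosing the number of sensors from diminishing returns.
# The coverage sets are illustrative; in practice the gain would be an
# information metric (e.g. joint entropy or Kriging-variance reduction).

def greedy_selection(candidates, threshold):
    """Add sensors greedily; stop when the marginal gain drops below threshold."""
    selected, covered = [], set()
    while True:
        best, best_gain = None, 0
        for name, cells in candidates.items():
            if name in selected:
                continue
            gain = len(cells - covered)      # marginal contribution of sensor
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None or best_gain < threshold:
            break  # marginal benefit no longer justifies an extra sensor
        selected.append(best)
        covered |= candidates[best]
    return selected

candidates = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {6}}
print(greedy_selection(candidates, threshold=2))  # -> ['A', 'C']
```

Sensor "B" is skipped because, once "A" is selected, it adds little that "C" does not already provide, and "D" never clears the threshold.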
Typical data flow in discharge simulation using hydrological models.
Scenarios for designing of sensor networks may be categorised into three
groups: augmentation, relocation, and reduction (NRC, 2004; Mishra and
Coulibaly, 2009; Barca et al., 2015).
The lack of data usually drives the sensor network augmentation, whereas economic limitations usually push for reduction. These costs of the sensor network usually relate to the deployment of physical sensors in the field, and transmission, maintenance, and continuous validation of data (WMO, 2008c).
Augmentation and relocation problems are fundamentally similar, as they require estimation of the measured variable at ungauged locations. For this purpose, statistical models of the measured variable are often employed. For example, Rodriguez-Iturbe and Mejia (1974) described rainfall in terms of its correlation structure in time and space, Pardo-Igúzquiza (1998) expressed areal averages of rainfall events with ordinary Kriging estimation, and Chacon-Hurtado et al. (2009) represented rainfall fields using block Kriging. In contrast, for network reduction, the analysis is driven by what-if scenarios as the measurements become available. Dong et al. (2005) employed this approach to re-evaluate the efficiency of a river basin network based on the results of hydrological modelling.
In principle, augmentation and relocation aim to increase the performance of the network (Pardo-Igúzquiza, 1998; Nowak et al., 2010). In reduction, by contrast, network performance is usually decreased. The driver of these decisions is usually related to factors such as operation and maintenance costs (Moss et al., 1982; Dong et al., 2005).
The typical data flow for hydrological rainfall–runoff modelling can be summarised as in Fig. 1. For discharge simulation, precipitation and evapotranspiration are the most common data requirements (WMO, 2008c; Beven, 2012), while discharge data are commonly employed for model calibration, correction, and updating (Sun et al., 2015). Data-driven hydrological models may use measured discharge as input variables as well (e.g. Solomatine and Xue, 2004; Shrestha and Solomatine, 2006). Methods for updating hydrological models, in particular data assimilation, have been widely used in discharge forecasting, using the model error to update the model states. In this way, more accurate discharge estimates can be obtained (Liu et al., 2012; Lahoz and Schneider, 2014). In real-time error correction schemes, typically, a data-driven model of the error is employed, which may require as input any of the mentioned variables (Xiong and O'Connor, 2002; Solomatine and Ostfeld, 2008).
In a conceptual way, we can express the quantification of discharge at a
given station as (Solomatine and Wagener, 2011)
There is a variety of approaches for the evaluation of sensor networks, ranging from theoretically sound to more pragmatic. In this section, we provide a general classification of these approaches, and more details of each method are given in the next section.
Although most of the approaches for the design of sensor networks make use of data, some rely solely on experience and recommendations. Therefore, a first tier in the proposed classification consists of recognising both measurement-based and measurement-free approaches (Fig. 2). The former make use of the measured data to evaluate the performance of the network (Tarboton et al., 1987; Anctil et al., 2006), while the latter use other data sources (Moss and Tasker, 1991), such as topography and land use.
Proposed classification of methods for sensor network evaluation.
The measurement-based approach can be further subdivided into model-free and model-based approaches (Fig. 2), depending on the use of modelling results in the performance metric.
In model-free approaches, water systems and the external processes that drive their behaviour are observed through existing measurements, without the use of catchment models. Then, metrics about the amount and quality of information in space and time are evaluated with regard to the management objectives and the decisions to be made in the system. Some performance metrics in this category are joint entropy (Krstanovic and Singh, 1992), information transfer (Yang and Burn, 1994), interpolation variance (Pardo-Igúzquiza, 1998; Cheng et al., 2007), and autocorrelation (Moss and Karlinger, 1974), among others. Figure 3 presents the flowchart for the case when precipitation and discharge, as the main drivers of catchment hydrology (WMO, 2008c), are considered in model-free network evaluation.
General procedure for model-free sensor network evaluation.
Fundamentally, the model-free approach aims to minimise the variance of the measured variable, thereby (and in theory) minimising the variance in the estimation (Eq. 3). However, a design that is optimal for estimation is not necessarily also optimal for prediction (Chaloner and Verdinelli, 1995).
In the model-based approach, the evaluation of sensor network performance is carried out using a catchment model (Dong et al., 2005; Xu et al., 2013). In this case, measurements of precipitation are used to simulate discharge, which is compared to the discharge measurements at specific locations. Therefore, any metric of the modelling error can be used to evaluate the performance of the network. Figure 4 presents a generic model-based approach for evaluating sensor networks.
General procedure for model-based sensor network evaluation.
In the model-based design of sensor networks, it is assumed that the model structure and parameters are adequate. Therefore, it is possible to identify the set of measurements that minimises the model error.
As the name suggests, this approach does not require prior collection of data on the measured variable to evaluate sensor network performance. The evaluation of sensor networks is based on either experience or physical characteristics of the area such as land use, slope, or geology. In this group of methods, the following can be mentioned: case-specific recommendations (Bleasdale, 1965; Wahl and Crippen, 1984; Karasseff, 1986; WMO, 2008a) and physiographic components (Tasker, 1986; Laize, 2004). This approach is the first step towards any sensor network development (Bleasdale, 1965; Moss et al., 1982; Nemec and Askew, 1986; Karasseff, 1986).
In this section, we classify the methods used to quantify the performance of the sensor networks based on the mathematical apparatus used to evaluate the network performance. These methods can broadly be categorised as statistics-based, information theory-based, expert recommendations, and others.
Statistics-based methods refer to methods where the performance of the network is evaluated with statistical uncertainty metrics of the measured or simulated variable. These methods aim to minimise either interpolation variance (Rodriguez-Iturbe and Mejia, 1974; Bastin et al., 1984; Bastin and Gevers, 1985; Pardo-Igúzquiza, 1998; Bonaccorso et al., 2003), cross-correlation (Maddock, 1974; Moss and Karlinger, 1974; Tasker, 1986) or model error (Dong et al., 2005; Xu et al., 2013).
Methods to evaluate sensor networks considering a reduction in the interpolation variance assume that for a network to be optimal, the measured variable should be as certain as possible in the domain of the problem. To achieve this, a stochastic interpolation model that provides uncertainty metrics is required. Geostatistical methods such as Kriging (Journel and Huijbregts, 1978; Cressie, 1993) or copula interpolation (Bárdossy, 2006) have an explicit estimation of the interpolation error. This characteristic makes them suitable for identifying areas with expected poor interpolation results (Bastin et al., 1984; Pardo-Igúzquiza, 1998; Grimes et al., 1999; Bonaccorso et al., 2003; Cheng et al., 2007; Nowak et al., 2009, 2010; Shafiei et al., 2013).
In the case of Kriging, the optimal estimate of a variable at ungauged locations is assumed to be a linear combination of the measurements, with a Gaussian probability distribution. Under the ordinary Kriging formulation, the estimation variance depends on the semi-variogram and on the relative positions of the sensors and the estimation point, but not on the measured values themselves.
Therefore, as an objective function, the optimal sensor network is the one for which the total Kriging variance (TKV), i.e. the Kriging variance summed over all estimation points in the domain, is minimum.
Bastin and Gevers (1985) optimised a precipitation sensor network at pre-defined locations to estimate the average precipitation for a given catchment. Their selection of the optimal sensor location consisted of minimising the normalised uncertainty by reducing the network. The main drawback of their approach is that the network can only be reduced and not augmented. Similar approaches have also been used by Rodriguez-Iturbe and Mejia (1974), Bogárdi et al. (1985), and Morrissey et al. (1995). Pardo-Igúzquiza (1998) advanced this formulation by removing the pre-defined set of locations (allowing augmentation); instead, rain gauges were allowed to be placed anywhere in the catchment and its surroundings. A simulated annealing algorithm was used to search for the set of sensors that minimises the interpolation uncertainty.
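For illustration, the ordinary Kriging variance and the TKV objective can be sketched as follows. The exponential variogram and its parameters are assumed for the example; in practice they would be fitted to the observed spatial correlation of the rainfall field:

```python
import numpy as np

def variogram(h, sill=1.0, rng=20.0):
    """Exponential semi-variogram; sill and range are assumed here and
    would be fitted to the observed spatial correlation in practice."""
    return sill * (1.0 - np.exp(-h / rng))

def ok_variance(stations, target):
    """Ordinary Kriging estimation variance at a target point (no nugget)."""
    n = len(stations)
    d = np.linalg.norm(stations[:, None, :] - stations[None, :, :], axis=-1)
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[-1, -1] = 0.0                              # Lagrange-multiplier block
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(stations - target, axis=-1))
    sol = np.linalg.solve(a, b)                  # weights lambda and multiplier mu
    return float(sol[:n] @ b[:n] + sol[-1])      # lambda . gamma0 + mu

def total_kriging_variance(stations, grid):
    """TKV: estimation variance summed over all points of interest."""
    return sum(ok_variance(stations, p) for p in grid)

grid = np.array([[x, y] for x in range(0, 51, 10) for y in range(0, 51, 10)])
spread = np.array([[10.0, 10.0], [40.0, 10.0], [10.0, 40.0], [40.0, 40.0]])
clustered = np.array([[24.0, 24.0], [26.0, 24.0], [24.0, 26.0], [26.0, 26.0]])
print(total_kriging_variance(spread, grid),
      total_kriging_variance(clustered, grid))   # spread network has lower TKV
```

Comparing candidate configurations this way reproduces the intuition behind the variance-minimisation designs: a network spread over the domain yields a lower TKV than a clustered one, and a search algorithm (e.g. simulated annealing, as in Pardo-Igúzquiza, 1998) can automate the selection.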
Copula interpolation is a geostatistical alternative to Kriging for the modelling of spatially distributed processes (Bárdossy, 2006; Bárdossy and Li, 2008; Bárdossy and Pegram, 2009). As a geostatistical model, the copula provides metrics of the interpolation uncertainty, considering not only the location of the stations and the model parameterisation, but also the value of the observations. Li et al. (2011) use the concept of a copula to provide a framework for the design of a monitoring network for groundwater parameter estimation, using a utility function, related to the cost of a given decision with the available information.
In the case of copulas, the full conditional probability distribution function of the variable is interpolated. As such, the interpolation uncertainty depends on the confidence interval, measured values, parameterisation of the copula, and the relative position of the sensors in the domain of the catchment. More details on the formulation of copula-based designs can be found in Bárdossy and Li (2008).
Cheng et al. (2007), as well as Shafiei et al. (2013), recognised that the temporal resolution of the measurements affects the definition of optimality in minimum interpolation variance methods. This change in the spatial correlation structure occurs because precipitation data between stations become more correlated at coarser sampling resolutions (Ciach and Krajewski, 2006). For this purpose, the sensor network has to be split into two parts, a base network and non-base sensors. The former should remain in the same position for long periods, to characterise long-term fluctuations, based on the definition of a minimum threshold for an area with acceptable accuracy. The latter are relocated whenever they no longer provide a significant contribution to the monitoring objective, in order to improve the accuracy of the whole system.
Recent efforts have used minimum interpolation variance approaches to consider the non-stationarity assumption of most geostatistical applications in sensor network design (Chacon-Hurtado et al., 2014). To this end, changes in the precipitation pattern and its effect on the uncertainty estimation were considered during the development of a rainfall event.
The objective of minimum cross-correlation methods is to avoid placing sensors at sites that may produce redundant information. Cross-correlation was suggested by Maddock (1974) for sensor network reduction, as a way to identify redundant sensors. In this scope, the objective function can be written as the minimisation of the cross-correlation between the records of pairs of stations in the network.
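A minimal sketch of this criterion, flagging the station whose record is, on average, most correlated with the rest of the network and is therefore a candidate for removal (the synthetic records are purely illustrative):

```python
import numpy as np

def most_redundant(records):
    """Index of the station most correlated, on average, with the rest
    of the network, i.e. the best candidate for removal."""
    r = np.corrcoef(records)           # station-by-station correlation matrix
    np.fill_diagonal(r, 0.0)           # ignore self-correlation
    return int(np.argmax(np.abs(r).mean(axis=1)))

records = np.array([
    [1.0, 2.0, 3.0, 4.0],      # station 0
    [2.0, 4.0, 6.0, 8.0],      # station 1: perfectly correlated with station 0
    [1.0, -1.0, 1.0, -1.0],    # station 2: independent signal
])
print(most_redundant(records))  # flags station 0 or its duplicate, station 1
```

In an actual reduction study, the flagged station would be removed and the procedure repeated until the remaining stations are sufficiently uncorrelated.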
Stedinger and Tasker (1985) introduced the method called network analysis using generalised least squares (NAUGLS), which assesses the parameters of a regression model for daily discharge simulation based on the physiographic characteristics of a catchment (Stedinger and Tasker, 1985; Tasker, 1986; Moss and Tasker, 1991). The method builds a generalised-least-squares (GLS) covariance matrix of regression errors to correlate flow records and to account for flow records of different lengths when estimating the sampling mean squared error.
A comparable method was proposed by Burn and Goulter (1991), who used a correlation metric to cluster similar stations. Vivekanandan and Jagtap (2012) proposed an alternative for the location of discharge sensors in a recurrent approach, in which the most redundant stations were removed and the most informative stations retained, using Cook's D statistic.
These methods assume that the optimal sensor network configuration is one that satisfies a particular modelling purpose, e.g. a minimum error in simulated discharge. Accordingly, the design of a sensor network should minimise the difference between the simulated and recorded variables. A widely used model performance metric is the Nash–Sutcliffe efficiency (NSE; Nash and Sutcliffe, 1970). Theoretically, this score varies from minus infinity to 1, although its practical range lies between 0 and 1. On the one hand, an NSE equal to 0 indicates that the model has the same explanatory capability as the mean of the observations. On the other end, a value of 1 represents a perfect fit between model results and observations. Model output error formulations have been used to identify the set of sensors that provides the best model performance (Tarboton et al., 1987) and to propose measurement strategies regarding the number of gauges and sampling frequency.
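For reference, the NSE can be computed directly from its definition; the short example below verifies the two anchor points mentioned above (1 for a perfect fit, 0 for a model that predicts the mean of the observations):

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    var = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / var

obs = [2.0, 4.0, 6.0, 8.0]
print(nse(obs, obs))                    # perfect fit -> 1.0
print(nse(obs, [5.0, 5.0, 5.0, 5.0]))   # mean of observations -> 0.0
```

In a model-based network evaluation, this score would be computed for the discharge simulated with each candidate sensor configuration, and the configuration with the highest NSE retained.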
Another application is provided by Dong et al. (2005), who proposed evaluating the rainfall network using a lumped HBV model (Lindström et al., 1997). They found that the model performance does not necessarily improve when extra rain gauges are added. A similar approach was presented by Xu et al. (2013), who evaluated the effect of diverse rain gauge locations on runoff simulation using a similar hydrological model. They found that rain gauge locations could have a significant impact, particularly at gauge densities below 0.4 stations per 1000 km².
Anctil et al. (2006) aimed at improving lumped neural network rainfall–runoff forecasting models through mean areal rainfall optimisation, and concluded that different combinations of sensors lead to noticeable streamflow forecasting improvements. Studies in other fields have also used this method. For example, Melles et al. (2009, 2011) obtained optimal monitoring designs for radiation monitoring networks, which minimise the prediction error of mean annual background radiation. The main drawback of this approach is that multiple error metrics may need to be considered, as different objectives relate to different processes.
The use of information theory (Shannon, 1948) in the design of sensor networks for environmental monitoring is rooted in communication theory, which studies the problem of transmitting signals from a source to a receiver through a noisy medium. Information theory makes it possible to estimate probability distribution functions in the presence of partial information with the least biased estimation (Jaynes, 1957). Some of its concepts are analogous to statistical concepts: entropy is related to uncertainty (variance), and mutual information to correlation (Cover and Thomas, 2005; Alfonso, 2010; Singh, 2013).
Information theory-based methods for designing sensor networks mainly consider the maximisation of information content that sensors can provide, in combination with the minimisation of redundancy among them (Krstanovic and Singh, 1992; Mogheir and Singh, 2002; Alfonso et al., 2010a, b, 2013; Alfonso, 2010; Singh, 2013). Redundancy can be measured by using mutual information (Singh, 2000; Steuer et al., 2002), directional information transfer (Yang and Burn, 1994), or total correlation (Alfonso et al., 2010a, b; Fahle et al., 2015), among others.
The principle of maximum entropy (POME) is based on the premise that the probability distribution with the largest remaining uncertainty (i.e. the maximum entropy) is the one that best represents the current state of knowledge. POME has been used as a criterion for the design of sensor networks by allowing the identification of the set of sensors that maximises the joint entropy of the measurements (Krstanovic and Singh, 1992), in other words, the set that provides as much information content, from the information theory perspective, as possible (Jaynes, 1988).
In the design of sensor networks, the objective is to maximise the joint entropy of the selected set of sensors, i.e. the total information content that the network jointly provides.
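A greedy sketch of this objective, in which sensors are added one at a time so as to maximise the joint entropy of quantised records (the series and quantisation below are illustrative; note that a duplicated sensor adds no joint entropy and is therefore never preferred):

```python
import math
from collections import Counter

def joint_entropy(series):
    """Joint Shannon entropy (bits) of a set of quantised time series."""
    symbols = list(zip(*series))         # one tuple of readings per time step
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())

def greedy_max_entropy(candidates, k):
    """Pick k sensors whose records jointly maximise entropy (greedy)."""
    chosen = []
    for _ in range(k):
        best = max((i for i in candidates if i not in chosen),
                   key=lambda i: joint_entropy(
                       [candidates[j] for j in chosen] + [candidates[i]]))
        chosen.append(best)
    return chosen

candidates = {
    "a": [0, 0, 1, 1],
    "b": [0, 0, 1, 1],   # duplicate of "a": adds no new information
    "c": [0, 1, 0, 1],   # complementary signal
}
print(greedy_max_entropy(candidates, 2))  # -> ['a', 'c']
```

The greedy step mirrors the POME-based designs: at each iteration the sensor that adds the most new information to the already selected set is retained, so redundant sensors are naturally excluded.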
Krstanovic and Singh (1992) presented a concise work on rainfall network evaluation using entropy. They used POME to obtain multivariate distributions to associate different dependencies between sensors, such as joint information and shared information, which were later used either to reduce the network (in the case of high redundancy) or to expand it (in the case of a lack of common information).
Fuentes et al. (2007) proposed an entropy-utility criterion for environmental sampling, particularly suited for air-pollution monitoring. This approach considers Bayesian optimal sub-networks using an entropy framework, relying on the spatial correlation model. An interesting contribution of this work is the assumption of non-stationarity, contrary to traditional atmospheric studies, and relevant in the design of precipitation sensor networks.
Hydraulic 1-D models and metrics of entropy have been used to select the adequate spacing between sensors for water level in canals and polder systems (Alfonso et al., 2010a, b). This approach is based on the current conditions of the system, which makes it useful for operational purposes, but it does not necessarily remain valid when the conditions of the water system or the operation rules change. Studies on the design of sensor networks using these methods have been on the rise in recent years (Alfonso, 2010; Alfonso et al., 2013; Ridolfi et al., 2014; Banik et al., 2017).
Benefits of POME include the robustness of the description of the posterior probability distribution, since it aims to produce the least biased outcome; this is important because neither the models nor the measurements are completely certain. Li et al. (2012) presented, as part of a multi-objective framework for sensor network optimisation, the criterion of maximum (joint) entropy as one of the objectives. Other studies in this direction have been presented by Lindley (1956), Caselton and Zidek (1984), Guttorp et al. (1993), Zidek et al. (2000), Yeh et al. (2011), and Kang et al. (2014).
More recently, Samuel et al. (2013) and Coulibaly and Samuel (2014) proposed a mixed method involving regionalisation and dual entropy multi-objective optimisation (CRDEMO), which is a step forward when compared to single-objective optimisation for sensor network design.
Mutual information is a measure of the amount of information that one variable contains about another. It can be quantified as the reduction in the entropy of one variable due to the knowledge of the other.
An optimal sensor network should avoid collecting repetitive or redundant
information; in other words, it should reduce the mutual (shared) information
between sensors in the network. Alternatively, it should maximise the
transferred information from a measured to a modelled variable at a point of
interest (Amorocho and Espildora, 1973). Following this idea, Husain (1987)
suggested an optimisation scheme for the reduction of a rain sensor network.
His objective was to minimise the trans-information between pairs of
stations. However, assumptions of the probability and joint probability
distribution functions are strong simplifications of this method. To overcome
these assumptions, the Directional Information Transfer (DIT) index was
introduced (Yang and Burn, 1994) as the inverse of the coefficient of
non-transferred information (NTI) (Harmancioglu and Yevjevich, 1985). Both DIT and NTI are normalised measures of information transfer between two variables.
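These quantities are straightforward to compute from quantised records. The sketch below estimates mutual information from plug-in entropies and normalises it by the marginal entropy, following the DIT idea of Yang and Burn (1994); the binary quantisation is illustrative:

```python
import math
from collections import Counter

def entropy(xs):
    """Plug-in Shannon entropy (bits) of a quantised series."""
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """Trans-information T(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def dit(xs, ys):
    """Directional information transfer from Y to X: the fraction of H(X)
    that can be inferred from Y (normalisation after Yang and Burn, 1994)."""
    return mutual_information(xs, ys) / entropy(xs)

print(dit([0, 0, 1, 1], [0, 0, 1, 1]))                   # fully redundant -> 1.0
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))    # independent -> 0.0
```

A pair of stations with DIT close to 1 is redundant, whereas a DIT close to 0 indicates that the stations observe essentially independent signals.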
Particularly for the design of precipitation sensor networks, Ridolfi et al. (2011) presented a definition of the maximum achievable information content for designing a dense network of precipitation sensors at different temporal resolutions. The results of this study show that there exists a linear dependency between the non-transferred information and the sampling frequency of the observations.
Total correlation generalises mutual information to more than two variables: it is defined as the difference between the sum of the marginal entropies and the joint entropy of the set of variables, and thus quantifies the overall redundancy within a group of sensors.
Recommended minimum densities of stations (area in km² per station) (WMO, 2008c).
A method to estimate trans-information fields at ungauged locations has been proposed by Su and You (2014), employing a trans-information–distance relationship. This method accounts for the spatial distribution of precipitation, supporting the augmentation problem in the design of precipitation sensor networks. However, as the trans-information between sensors decreases monotonically with distance, the resulting sensor networks are generally sparse.
Among the most used planning tools for hydrometric network design are the technical reports of the WMO (2008c), in which minimum station densities are suggested for different physiographic units (Table 1). Although these guidelines do not indicate where to place hydrometric sensors, they recommend that the distribution of sensors be as uniform as possible and that network expansion be considered. The document also encourages the use of computationally aided design and evaluation for a more comprehensive design. For instance, Coulibaly et al. (2013) used these guidelines to evaluate the Canadian national hydrometric network.
Moss et al. (1982) presented one of the first attempts to use physiographic components in the design of sensor networks in a method called Network Analysis for Regional Information (NARI). This method is based on relations of basin characteristics proposed by Benson and Matalas (1967). NARI can be used to formulate the following objectives for network design within a Bayesian framework: minimum cost of the network, maximum information, and maximum net benefit from the data-collection programme.
Laize (2004) presented an alternative for evaluating precipitation networks based on the Representative Catchment Index (RCI), a measure of how representative a given station in a catchment is for a given area, relative to the stations in the surrounding catchments. The author argues that the method, which uses datasets of land use and elevation as physiographic components, can help identify areas with an insufficient number of representative stations in a catchment.
Most of the first sensor networks were designed based on expert judgement and practical considerations. Aspects such as the objective of the measurement, security, and accessibility are decisive in selecting the location of a sensor. Nemec and Askew (1986) presented a short review of the history and development of the early sensor networks, highlighting that "basic pragmatic approaches" still received most of the attention, due to their practicality in the field and their closeness to decision-makers.
Bleasdale (1965) presented a historical review of the early development of the rainfall sensor networks in the United Kingdom. In the early stages of the development of precipitation sensor networks, two main characteristics influencing the location of the sensors were identified: sites that were conventionally satisfactory and sites where good observers were located. However, the necessity of a more structured approach to selecting the location of sensors was underlined. As a guide, Bleasdale (1965) presented a series of recommendations on the minimum density of sensors for operational purposes, summarised in Fig. 5, relating the characteristics of the area to be monitored to the minimum required number of rain gauges, as well as their temporal resolution.
Minimum number of rain gauges required in reservoired moorland areas – adapted from Bleasdale (1965).
In a more structured approach, Karasseff (1986) introduced some guidelines
for the definition of the optimal sensor network to measure hydrological
variables for operational hydrological forecasting systems. The study
specified the minimum requirements for the density of measurement stations
based on the fluctuation scale and the variability of the measured variable
by defining zonal representative areas. This author suggested the following
considerations for selecting the optimal placement of hydrometric stations:
in the lower part of inflow and wastewater canals; at the heads of irrigation and watering canals taking water
from the sources; at the beginning of a debris cone before the zone of
infiltration, and at its end, where groundwater decrement takes place; at the boundaries of irrigated areas and zones of considerable
industrial water diversions (towns); and at the sites of hydroelectric power plants and hydro-projects.
From a different perspective, Wahl and Crippen (1984), as well as Mades and Oberg (1986), proposed a qualitative score assessment of different factors related to data use and the historical availability of records, in order to evaluate the value of individual sensors. Their analyses aimed at identifying candidate sensors to be discontinued due to their limited accuracy.
These approaches aim to identify the information needs of particular groups of users (Sieber, 1970), following the idea that the location of a sensor (or group of sensors) should satisfy at least one specific purpose. To this end, surveys are conducted to identify user interest in the measurement of certain variables, considering, among other factors, the location of the sensor, the record length, the frequency of the records, and the methods of transmission.
Singh et al. (1986) applied two questionnaires to evaluate the streamflow network in Illinois. The first identified the main uses of streamflow data collected at gauging stations: participants described how the data were used and categorised each use as site-specific management, local or regional planning and design, or determination of long-term trends. The second questionnaire determined present and future needs for streamflow information. Based on the results, the network was reduced where interest in certain sensors was limited, which allowed the existing network to be enhanced with more sophisticated sensors and recording methods. Additionally, this redirection of resources increased the coverage at specific locations.
There are also other methods that cannot be easily attributed to the previously mentioned categories. Among them, value of information, fractal, and network theory-based methods can be mentioned.
The value of information (VOI, Howard, 1966; Hirshleifer and Riley, 1979) is defined as the value a decision-maker is willing to pay for extra information before making a decision. This willingness to pay is related to the reduction of uncertainty about the consequences of making a wrong decision (Alfonso and Price, 2012).
The main feature of this approach is the direct description of the benefits of additional pieces of information, compared with the costs of acquiring them (Black et al., 1999; Walker, 2000; Nguyen and Bagajewicz, 2011; Alfonso and Price, 2012; Ballari et al., 2012). The main advantage of this method is that it provides a pragmatic framework in which information has a utilitarian, usually economic, value, which is especially suited for budget-constrained conditions.
One of the assumptions of this type of method is that a prior estimation of consequences is needed. If a decision-maker chooses among actions $a$ whose utility $U(a,\theta)$ depends on an uncertain state $\theta$ with prior probability $p(\theta)$, the best attainable expected utility before any measurement is $\max_a \sum_\theta p(\theta)\,U(a,\theta)$. The value of a single message $m$, which updates the prior to the posterior $p(\theta \mid m)$, is the resulting gain in expected utility, $V(m) = \max_a \sum_\theta p(\theta \mid m)\,U(a,\theta) - \max_a \sum_\theta p(\theta)\,U(a,\theta)$. The value of information, VOI, is the expected utility of the values over all possible messages, $\mathrm{VOI} = \sum_m p(m)\,V(m)$.
Following the same line of ideas, Khader et al. (2013) proposed the use of decision trees to support the development of a sensor network for the quality of drinking groundwater. VOI offers a straightforward methodology to represent the causes and consequences of scenarios with different types of actions, including the expected effect of additional information. A recent effort by Alfonso et al. (2016) towards identifying valuable areas for collecting information for floodplain planning consists of the generation of VOI maps, in which probabilistic flood maps and the consequences of urbanisation actions are taken into account to identify the areas where extra information is most critical.
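As a minimal illustration of the VOI concept, the sketch below evaluates a hypothetical two-state flood decision; the probabilities, utilities, and sensor message likelihoods are invented for illustration and are not taken from the cited studies.

```python
# Minimal value-of-information sketch for a two-state, two-action
# decision problem (all numbers are hypothetical).
# States: flood / no flood; actions: protect / wait.

# Prior probability of each state.
prior = {"flood": 0.2, "no_flood": 0.8}

# Utility (negative cost) of each action under each state.
utility = {
    ("protect", "flood"): -10,     # protection cost, damage avoided
    ("protect", "no_flood"): -10,  # protection cost wasted
    ("wait", "flood"): -100,       # full flood damage
    ("wait", "no_flood"): 0,
}

# Likelihood of a sensor message given the true state
# (an imperfect sensor that reports correctly 90 % of the time).
likelihood = {
    ("warn", "flood"): 0.9, ("warn", "no_flood"): 0.1,
    ("clear", "flood"): 0.1, ("clear", "no_flood"): 0.9,
}

actions = ["protect", "wait"]
states = list(prior)
messages = ["warn", "clear"]

def expected_utility(p):
    """Best achievable expected utility under belief p over states."""
    return max(sum(p[s] * utility[(a, s)] for s in states) for a in actions)

# Expected utility when acting on the prior alone.
eu_prior = expected_utility(prior)

# Expected utility when the decision can depend on the sensor message.
eu_posterior = 0.0
for m in messages:
    p_m = sum(likelihood[(m, s)] * prior[s] for s in states)  # P(message)
    posterior = {s: likelihood[(m, s)] * prior[s] / p_m for s in states}
    eu_posterior += p_m * expected_utility(posterior)

voi = eu_posterior - eu_prior  # value of observing the sensor message
print(f"EU prior: {eu_prior:.2f}, EU with message: {eu_posterior:.2f}, VOI: {voi:.2f}")
```

Here the imperfect sensor raises the expected utility from -10 (always protecting) to -4.6, so a rational decision-maker would pay up to 5.4 utility units for the message.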
Fractal-based methods employ the concept of Gaussian self-affinity, whereby sensor networks show the same spatial patterns at different scales; this affinity can be measured by the fractal dimension (Mandelbrot, 2001). Lovejoy et al. (1986) proposed the use of fractal-based methods to measure the dimensional deficit between the observations of a process and its real domain. Consider a set of evenly distributed cells representing the physical space, with the fractal dimension of the network representing the number of observed cells in the correlation space; the shortfall due to non-measured cells in the correlation space is known as the fractal deficit of the network. Because a large number of stations must be available at different scales, the method is suitable for large networks, but less useful for deploying a few sensors at the catchment scale.
Lovejoy and Mandelbrot (1985) and Lovejoy and Schertzer (1985) introduced the use of fractals to model precipitation. They argued that the intermittent nature of the atmosphere can be characterised by fractal measures with fat-tailed probability distributions of the fluctuations, and stated that standard statistical methods are inappropriate to describe this kind of variability. Mazzarella and Tranfaglia (2000) and Capecchi et al. (2012) presented two different case studies using this method for the evaluation of rainfall sensor networks. The former study concludes that, for network augmentation, it is important to select the optimal locations that improve coverage, as measured by the reduction of the fractal deficit; however, it offers no practical recommendations on how to select such locations. The latter proposes the inspection of seasonal trends, as the meteorological processes of precipitation may have significant effects on the detection capabilities of the network.
A common approach for the quantification of the dimensional deficit is the box-counting method (Song et al., 2007; Kanevski, 2008), mainly used in the fractal characterisation of precipitation sensor networks. The domain is covered with boxes of decreasing side length $L$, and the number of boxes $n(L)$ containing at least one station is counted. The fractal dimension of the network ($D_b$) is the scaling exponent in $n(L) \propto L^{-D_b}$, estimated as the slope of a regression of $\log n(L)$ against $\log(1/L)$.
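The box-counting estimate can be sketched as follows; the station coordinates and box sizes are illustrative only, not taken from the cited studies.

```python
import numpy as np

# Hypothetical station coordinates (km) in a 100 x 100 km domain.
rng = np.random.default_rng(42)
stations = rng.uniform(0, 100, size=(30, 2))

def box_count(points, box_size):
    """Number of boxes of a given side length containing at least one station."""
    idx = np.floor(points / box_size).astype(int)
    return len({tuple(i) for i in idx})

# Count occupied boxes n(L) over a range of nested box sizes L.
sizes = np.array([50.0, 25.0, 12.5, 6.25])
counts = np.array([box_count(stations, L) for L in sizes])

# Box-counting dimension: slope of the regression of log n(L) on log(1/L).
slope, intercept = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
print(f"Estimated fractal dimension: {slope:.2f}")
```

Note that with few stations the count saturates at the number of stations for small boxes, which is precisely the instability discussed next.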
Due to the scarcity of measurements in precipitation networks, the quantification of the fractal dimension by box counting may be unstable. An alternative fractal dimension may be calculated using a correlation integral (Mazzarella and Tranfaglia, 2000) instead of the number of boxes, such that $C(r) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} H(r - d_{ij})$, where $N$ is the number of stations, $d_{ij}$ is the distance between stations $i$ and $j$, $H$ is the Heaviside step function, and $r$ is the scaling radius.
The consequent definition of the fractal dimension of the network is the ratio between the logarithm of the correlation integral and the logarithm of the scaling radius. This ratio is obtained from a regression between $\log C(r)$ and $\log r$ for different values of $r$.
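A minimal sketch of the correlation-integral estimate, again with hypothetical station coordinates and illustrative radii:

```python
import numpy as np

# Hypothetical station coordinates (km) in a 100 x 100 km domain.
rng = np.random.default_rng(1)
pts = rng.uniform(0, 100, size=(25, 2))
n = len(pts)

# Pairwise distances between all stations (upper triangle only).
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
pair_d = d[np.triu_indices(n, k=1)]

def correlation_integral(r):
    """C(r): fraction of station pairs closer than the scaling radius r."""
    return 2.0 * np.sum(pair_d < r) / (n * (n - 1))

# Regress log C(r) against log r over a range of radii; the slope is
# the correlation-based fractal dimension of the network.
radii = np.array([15.0, 30.0, 60.0])
c = np.array([correlation_integral(r) for r in radii])
dim, _ = np.polyfit(np.log(radii), np.log(c), 1)
print(f"Correlation dimension: {dim:.2f}")
```

For a well-distributed two-dimensional network the estimated dimension approaches 2; values well below 2 indicate clustering or coverage gaps.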
Recently, research efforts have been devoted to the use of so-called network theory to assess the performance of discharge sensor networks (Sivakumar and Woldemeskel, 2014; Halverson and Fleming, 2015). These studies analyse three main features, namely the average clustering coefficient, the average path length, and the degree distribution. Average clustering is a measure of the tendency of stations to form clusters. Average path length is the average of the shortest paths between every combination of station pairs. Degree distribution is the probability distribution of node degrees across all the stations, where the degree of a station is the number of stations to which it is connected. Halverson and Fleming (2015) observed that regular streamflow networks are highly clustered (so the removal of any randomly chosen node has little impact on the network performance) and have long average path lengths (so information may not easily be propagated across the network).
In hydrometric networks, three metrics are identified (Halverson and Fleming, 2015): degree distribution, clustering coefficient, and average path length. The first is based on the node degree, i.e. the number of nodes to which a given node is connected; the degree distribution describes the probability of a node being connected to a given number of other nodes. These quantities are calculated from the adjacency matrix $A$ (a binary matrix in which connected nodes are represented by 1 and missing links by 0). The degree of node $i$ is therefore defined as $k_i = \sum_j A_{ij}$.
The clustering coefficient is a measure of how much the nodes cluster together; high clustering indicates that nodes are highly interconnected. The clustering coefficient (CC) for a given station $i$ is defined as $\mathrm{CC}_i = 2e_i / (k_i(k_i - 1))$, where $k_i$ is the degree of station $i$ and $e_i$ is the number of links among its neighbours.
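These three metrics can be illustrated on a small hypothetical network of six stations; the adjacency matrix below is invented for illustration (in practice it would encode, e.g., sufficiently correlated station records).

```python
import numpy as np
from collections import deque

# Hypothetical adjacency matrix for six stations: two triangles of
# stations joined by a single link (1 = linked, 0 = no link).
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
n = len(A)

# Node degree: k_i = sum_j A_ij.
degree = A.sum(axis=1)

def clustering(i):
    """CC_i = 2 e_i / (k_i (k_i - 1)): fraction of realised links
    among the neighbours of node i."""
    nbrs = np.flatnonzero(A[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    e = A[np.ix_(nbrs, nbrs)].sum() / 2  # links among the neighbours
    return 2.0 * e / (k * (k - 1))

cc = [clustering(i) for i in range(n)]

def shortest_paths(src):
    """Breadth-first-search hop distances from one station to all others."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in np.flatnonzero(A[u]):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Average path length over all ordered station pairs (connected network).
d_sum = sum(d for i in range(n) for j, d in shortest_paths(i).items() if j != i)
apl = d_sum / (n * (n - 1))

print("degrees:", degree.tolist())
print("clustering:", [round(x, 2) for x in cc])
print(f"average path length: {apl:.2f}")
```

The two tightly linked triangles give high clustering for the outer stations (CC = 1.0), while the single bridge link (stations 3 and 4 in 1-based counting) gives those nodes lower clustering and high betweenness, mirroring the discussion above.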
Classification of sensor network design criteria including recommended reading.
According to Halverson and Fleming (2015), an optimal configuration of streamflow networks should consist of measurements with small membership communities, high betweenness, and index stations with large numbers of intra-community links. Small communities represent clusters of observations, thus indicating efficient measurements. Large numbers of intra-community links ensure that the network has some degree of redundancy and is thus resistant to sensor failure. High betweenness indicates that the stations with the most inter-community links are adequately connected and thus able to capture the heterogeneity of the hydrological processes at a larger scale.
Advantages and disadvantages of sensor network design methods.
Table 2 summarises the sensor network design classes and approaches, with the selected references to the relevant papers in each of the categories for further reference.
It is of special interest in this review to highlight the lack of model-based information theory methods, as well as the low number of publications on network theory-based methods. Quantitative studies comparing different methodologies for the design of sensor networks are also limited. It is suggested, therefore, that a pilot catchment be used by the scientific community to test all the available methods for network evaluation and to establish similarities and differences among them.
Table 3 summarises the main advantages and disadvantages of each of the design and evaluation methods. These recommendations are general, but take into account the most common considerations in the design of sensor networks. Some of the advantages of these methods have been exploited in combined methodologies, such as those presented by Yeh et al. (2011), Samuel et al. (2013), Barca et al. (2015), Coulibaly and Samuel (2014), and Kang et al. (2014).
Based on the presented literature review, in this section an attempt is made to present a first version of a unified, general procedure for sensor network design. Such a procedure logically links various methods in a flowchart, following the measurement-based approaches (Fig. 6). The flowchart suggests two main loops: one to optimise the network performance (optimisation loop), and a second one to represent the selection of the number of sensors in either augmentation or reduction scenarios. Most measurement-based methods, as well as most design scenarios, can be seen as particular cases of this generalised algorithmic flowchart.
Sensor network (re)design flowchart (CML: candidate measurement locations).
The general procedure consists of 11 steps (boxes in Fig. 6). In the first place, physical measurements (1) are acquired by the sensor network. These data are used to parameterise an estimator (2), which will be used to estimate the variable at the candidate measurement locations (CML) using, for instance, Kriging (Pardo-Igúzquiza, 1998; Nowak et al., 2009) or 1-D hydrodynamic models (Neal et al., 2012; Rafiee, 2012; Mazzoleni et al., 2015). The sensor network reduction does not require such estimators as measurements are already in place.
The selection of the CML should consider factors such as physical and technical availability, as well as costs related to maintenance and accessibility of stations, as illustrated by the WMO (2008c) recommendations. The selection of CML can also be based, for example, on expert judgement. These limitations may be presented in the form of constraints in the optimisation problem.
Then an optimisation loop starts (Fig. 6) with the estimation of the measured variable at the CML (3), using the estimator built in (2). Next, the performance of the sensor network at the CML is evaluated (4) using any of the previously discussed methods. The selection of the method depends on the designer and their information requirements, which also determine whether an optimal solution has been found (5). The stopping criteria in the optimisation problem can be a desired accuracy of the network, a number of iterations without improvement, or a maximum number of iterations. As pointed out in the review, these performance metrics can be either model-based or model-free, and should not be confused with the use of a (geostatistical) model of the measured variable.
If the optimisation loop has not converged, a new set of CML is selected (6). Optimisation algorithms may drive the search for the new potential CML (Pardo-Igúzquiza, 1998; Kollat et al., 2008, 2011; Alfonso, 2010). The decision about adequate performance should not only consider the expected performance of the network, but also recognise the effect of a limited number of sensors.
Once the performance is optimal, an iteration over the number of sensors is required. If the scenario is for network augmentation (7), then a possibility of including additional sensors has to be considered (8). The decision to go for an additional sensor will depend on the constraints of the problem, such as a limitation on the number of sensors to install, or on the marginal improvement of performance metrics.
The network reduction scenario (9) is inverse: for diverse reasons, mainly of a financial nature, networks require fewer sensors. Therefore, the analysis concerns which sensors to remove from the network, within the problem constraints (10).
Finally, the sensor network is selected (11) from the results of the optimisation loop, with the adequate number of sensors. It is worth mentioning that an extra loop is required for re-evaluation, typically on a periodic basis, when the objectives of the network may be redefined, new processes need to be monitored, or information from other sources becomes available, any of which can potentially modify the definition of optimality.
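The optimisation and sensor-budget loops above can be sketched as a greedy placement procedure. The performance function below (mean distance from every candidate location to its nearest sensor) is only a placeholder standing in for any of the reviewed criteria (Kriging variance, joint entropy, model error, etc.); the candidate grid and sensor budget are likewise illustrative.

```python
import numpy as np

# Candidate measurement locations (CML): a hypothetical 5 x 5 grid.
cml = [(x, y) for x in range(5) for y in range(5)]

def network_performance(sensors):
    """Placeholder performance metric: mean distance from every CML to
    its nearest sensor (lower is better). A real design would plug in
    any of the criteria reviewed above (steps 3-4 in Fig. 6)."""
    return float(np.mean([
        min(np.hypot(cx - sx, cy - sy) for sx, sy in sensors)
        for cx, cy in cml
    ]))

def design_network(n_sensors):
    """Greedy placement: repeatedly add the CML that most improves the
    metric (steps 3-6), stopping at the sensor budget (steps 7-8)."""
    network = []
    for _ in range(n_sensors):
        best = min(
            (c for c in cml if c not in network),
            key=lambda c: network_performance(network + [c]),
        )
        network.append(best)  # step 6: accept the best new CML
    return network

net = design_network(3)
print("selected sensors:", net)
print("performance:", round(network_performance(net), 2))
```

A network reduction scenario would run the same loop in reverse, greedily removing the sensor whose loss degrades the metric least.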
This paper summarised some of the methodological criteria for the design of sensor networks in the context of hydrological modelling, proposed a framework for classifying the approaches in the existing literature, and also proposed a general procedure for sensor network design. The following conclusions can be drawn.
Most of the sensor network methodologies aim to minimise the uncertainty of the variable of interest at ungauged locations, and the way this uncertainty is estimated varies between methods. In statistics-based methods, the objective is usually to minimise the overall uncertainty of precipitation fields or the discharge modelling error. Information theory-based methods aim to place measurements at locations with maximum information content and minimum redundancy. Network theory-based methods do not estimate this uncertainty directly; instead, the topology of the network (clustering, path length, degree distribution) is used as a proxy for its redundancy and coverage. In methods based on practical case-specific considerations and value of information, the critical consequences of decisions dictate the network configuration.
However, in spite of the underlying resemblances between methods, different formulations of the design problem can lead to rather different solutions. This gap has not been covered in depth in the literature, and therefore reaching general agreement on a sensor network design procedure remains an open issue.
In particular, for catchment modelling, the driving criteria should also consider model performance. This criterion ensures that the model adequately represents the states and processes of the catchment, reducing model uncertainty and leading to more informed decisions. Currently, most network design methods do not ensure minimum modelling error, as often it is not the main performance criterion for design.
Furthermore, in recent years, the rise of various sensing technologies in operational environments has promoted the inclusion of additional design considerations towards a unified heterogeneous sensor network. These new sensing technologies include, e.g., passive and active remote sensing using radars and satellites (Thenkabail, 2015), microwave links (Overeem et al., 2011), mobile sensors (Haberlandt and Sester, 2010; Dahm et al., 2014), crowdsourcing, and citizen observatories (Huwald et al., 2013; Lanfranchi et al., 2014; Alfonso et al., 2015). These non-conventional information sources have the potential to complement conventional networks by exploiting the synergies between, and reducing the limitations of, the various sensing techniques; at the same time, they require new network design methods capable of handling heterogeneous, dynamic data with varying uncertainty.
The proposed classification of the available network design methods was used to develop a general framework for network design. Different design scenarios, namely relocation, augmentation, and reduction of networks, are included for measurement-based methods. This framework is open and offers “placeholders” for various methods to be used depending on the problem type.
Concerning further research, from the hydrological modelling perspective we propose directing efforts towards the joint design of precipitation and discharge sensor networks. Hydrological models use precipitation data to provide discharge estimates; however, as these simulations are error-prone, the assimilation of discharge data, or error correction, reduces the systematic errors in the model results. The joint design of both precipitation and discharge sensor networks may help to provide more reliable estimates of discharge at specific locations.
Another direction of research may include methods for designing dynamic sensor networks, given the increasing availability of low-cost sensors, as well as the expansion of citizen-based data collection initiatives (crowdsourcing). These information sources have been on the rise in recent years, and one may foresee the appearance of interconnected, multi-sensor heterogeneous sensor networks in the near future.
The presented review has also shown that limited effort has been devoted to considering changes in long-term patterns of the measured variable in sensor network design. The commonly made assumption of stationarity has become more problematic in recent years due to new sensing technologies and increased systemic uncertainties, e.g. from climate and land use change and rapidly changing weather patterns. Although this topic has been recognised for quite some time (see e.g. Nemec and Askew, 1986), the number of publications presenting effective methods to deal with it is still limited. This problem, and the techniques to solve it, are being addressed in ongoing research.
No data sets were used in this article.
The authors declare that they have no conflict of interest.
We would like to thank Joanne Craven for the editing support in the final stages of this article, and the three anonymous referees whose comments greatly helped us to improve this document to its current form. Edited by: Laurent Pfister Reviewed by: three anonymous referees