Global spatial optimization with hydrological systems simulation : application to land-use allocation and peak runoff minimization

A general methodology is presented to integrate complex simulation models of hydrological systems into optimization models, as an alternative to scenario-based approaches. A gradient-based hill climbing algorithm is proposed to reach locally optimal solutions from distinct starting points. The gradient of the objective function is estimated numerically with the simulation model. A statistical procedure based on the Weibull distribution is used to build a confidence interval for the global optimum. The methodology is illustrated by an application to a small watershed in Ohio, where the decision variables are related to land-use allocations and the objective is to minimize peak runoff. The results suggest that this specific runoff function is convex in terms of the land-use variables, and that the global optimum has been reached. Modeling extensions and areas for further research are discussed.


Introduction
Understanding watershed hydrological processes and their linkages to land cover changes is important for controlling nonpoint source (NPS) water pollutants, which are produced by land-based activities (Novotny, 2003) and carried into waterways by stormwater runoffs.Effective management of NPS pollution calls for methods to identify pollution sources and pathways, and to minimize pollutants production and delivery.Computer simulation models of complex watershed processes can provide a better understanding of the interactions among the various physical systems in a watershed, and predict the hydrological impacts of changes in land Correspondence to: I.-Y.Yeo (iyeo@umd.edu)management practices (Beven, 1989;Grayson et al., 1992).They have been used to develop Best Management Practices (BMP) and to predict the effects of future climate changes (Quilbé et al., 2008).
However, these simulation models necessarily consider only a small number of scenarios, precluding the optimal selection and location of BMPs, and cannot explicitly link pollution sources to yields.Optimization methods have recently emerged as an alternative modeling framework to efficiently search for the best possible alternative, under given objectives (performance measures) and constraints (Haith, 1995).Srivastava et al. (2002), Nicklow and Muleta (2001), Muleta and Nicklow (2002), Seppelt and Voinov (2002), and Kaur et al. (2004) have integrated optimization methods into simulation models to overcome the limitations of a scenario-based approach.However, there is little evaluation of the obtained solution in terms of its closeness to the global one (Lee and El-Sharkawi, 2008).This would require a thorough understanding of the relationship between the watershed system and the decision variables.
The purpose of this research is primarily to propose a general methodology for the efficient use of simulation models of natural systems within an optimization framework.These simulation models can be viewed as implicit functions relating inputs to outputs, with some inputs representing decision variables and the outputs representing favorable and/or unfavorable impacts.The methodology is illustrated with a hydrological runoff simulation model applied to a small catchment in Ohio, and the decision variables are related to land-use allocation.The optimization procedure involves the simulation-based numerical approximation of gradient vectors at each step of a hill-climbing algorithm.A large number of local optima is generated, and the Weibull distribution is used to statistically assess convergence towards the global optimum.
Published by Copernicus Publications on behalf of the European Geosciences Union.
I.-Y.Yeo and J.-M.Guldmann: Global spatial optimization The remainder of this paper is organized as follows.Section 2 describes the methodology, including the optimization-simulation procedure and the statistical assessment of local optima using the Weibull distribution.Section 3 describes an application of the methodology, including the data, the hydrological simulation model, the results for a small catchment, and model extensions.The structure of the peak runoff function is further analyzed in Sect. 4 in light of the underlying mechanics of the hydrological model.Conclusions and areas for further research are presented in Sect. 5.

Optimization-simulation procedure
Most natural systems (hydrological, atmospheric, or ecological) can be represented by simulation models of varying degrees of sophistication and complexity.The inputs to these models include exogenous variables, decision variables, and parameters.In the specific case of NPS pollutants/runoff models, these could be: E= vector of exogenous variables, such as the geographic distributions of soil types, topography, shallow aquifers, surface stream network, channel length, and weather (precipitation, temperature, solar radiation, wind speed, relative humidity).For a given site/region, these variables cannot be modified, at least in the short and middle terms, and are taken as given; X= vector of decision variables, which represent various possible management and planning interventions, such as the allocation of land uses, the sitting and sizing of ecological engineering technologies (e.g., constructed wetlands or filtration systems), rainwater harvesting, and agricultural practices (application and management of nutrients and pesticides, conservation tillage, contour farming, crop residue management, irrigation); P = vector of the parameters that characterize the various equations/relationships that make up the simulation model (e.g., CN number, Manning's coefficient, canopy interception of precipitation, etc.).
Let Y be the vector of the simulation outputs.Assume that there are n outputs, with In the case of watershed simulation models such as the SWAT (Soil and Water Assessment Tool) model (Arnold et al., 1998), Y includes the (1) peak runoff, (2) sediment load, (3) phosphorous load, and (4) nitrogen load.These outputs can be watershed-wide aggregates, or disaggregated by location (e.g., Hydrologic Response Unit -HRU).The simulation model can be symbolically represented by the vector function F =(F 1 ,F 2 ...F k ...F n ), with: (1) The functional relationship (1) is implicit and cannot be expressed in closed mathematical form because of the spatial and temporal complexity of the simulation model, which is generally run for a discrete number of scenarios pertaining to the decision vector X, and/or for different geographical settings (vectors E and P ).Various technological, environmental, and socio-economic constraints must be accounted for, that limit the feasible space for X, with: G(E,X,P ) ≤ O. (2) A variety of objective functions may be considered.For instance, let Y k be the peak runoff at the watershed outlet, and assume that the only goal is to minimize this runoff.Then, the problem is to find X that minimizes subject to constraint (2).Some of the other outputs (e.g., pollutant loads) may be constrained by pollution standards Y * j (maximum load or maximum concentration), with: Finally, in the context of multi-objective optimization, the search for the Pareto frontier of non-inferior solutions (Cohon, 1978) involves the minimization of the weighted function (w i ≥0): (5) For the sake of exposition simplicity, ignore the exogenous components E and P .Let F (X) be the unidimensional function to be minimized.Consider an algorithmic process wherein a sequence of solution points X k ={x k j } is generated.Consider the vector Z={z j } in the neighborhood of the k-th trial point X k .The function F (Z) can be approximated in this neighborhood by the following linear function based on a first-order Taylor's series expansion: With X k fixed, F (Z) is linear in Z.However, this linear approximation is not straightforward to obtain, because the objective function is not analytically closed, and the partial derivatives must be approximated numerically.The ∂F (X k ) change in F X k resulting from a very small increment ∂x j in each component of X k , leaving all other components unchanged, is computed with the simulation model, and the partial derivative is estimated as the ratio of the two increments.
The goal is to minimize the second term in (6), subject to constraints.If these constraints are linear, then the implicitly nonlinear program can be solved using sequential linear Hydrol.Earth Syst.Sci., 14, 325-338, 2010 www.hydrol-earth-syst-sci.net/14/325/2010/ programming, also known as the method of convex combinations (Venkataraman, 2001).If the constraints are nonlinear, various gradient-based feasible directions algorithms can be used (Bazaraa et al., 2006).
In the case of minimization, the algorithm produces the globally optimal solution only if (1) the constraint domain is convex, and (2) the objective function is convex over the feasible domain, which requires that its Hessian matrix be positive definite everywhere (Intriligator, 1976).The Hessian matrix is made up of the second-order partial derivatives of the objective function.Since the objective function cannot be expressed analytically, it is impossible to directly assess its convexity.In order to keep the focus on the objective function, assume that the constraint set is convex.The algorithm is presumed to generate a locally optimal solution for each distinct starting point (initial solution).Using a large number of different initial solutions, two situations may emerge: (1) the same local optimum is obtained in all cases, which indicates that the function F is convex and the global optimum has been found; or (2) distinct local optima are obtained, which indicates that the function is not convex; in this case, a probabilistic method is proposed in Sect.2.2 to assess the closeness of the best local optimum to the global optimum.
The complexity and computational requirements of the proposed methodology beg the following question: why is it important to obtain the global optimal solution, and why would a good solution be insufficient?There are two fundamental reasons for searching for the global optimum.First, using simulation only would necessarily lead to a limited number of solutions, which may all be significantly inferior.In any case, it would be difficult to assess the quality of a solution, unless the global optimum is known.Second, maybe more importantly, the global optimum is the benchmark to be used when assessing heuristic procedures that yield good, but not necessarily optimal solutions.Heuristics are much less computationally demanding, but have no value if they cannot be evaluated.The operations research literature has offered many heuristics for difficult-to-solve optimization problems (Pearl, 1984).
A second issue is related to the uncertainty in the inputs P and E. This uncertainty has been alleged to make the search for an optimum meaningless.For instance, Beven and Freer (2001) discuss the concept of equifinality, whereby the same output Y can be obtained with different sets of the inputs P and E, which are characterized by measurement or other uncertainties.However, uncertainty is a very general modeling issue, characterizing both natural and socioeconomic systems models.It can be tackled, in part, through sensitivity analyses of the solution over ranges of parameter values, or with stochastic programming techniques.The focus of the proposed methodology is not on the vectors P /E but on the vector X -the management/planning decision variables.

Statistical assessment of local optima
It is possible to assess how close a local optimum is to the global one by using a statistical method developed by Golden (1978) and Los and Lardinois (1982).The idea is to generate independent local optima and to obtain a point estimate for the global optimum by applying statistical extremevalue theory.Golden (1978) investigates how far a heuristic solution is from the global optimum in the case of the traveling salesman problem.Los and Lardinois (1982) apply a hill-climbing technique to solve the optimal network design problem.
The local optima generated from different initial states are analyzed with the Weibull Distribution (WD), which has been used to analyze survival data, to assess stability, and to measure risk (Aitkin and Clayton, 1980).In hydrology, the WD has been used to fit hydrographs (Bhunya et al., 2006) and analyze trends in extreme events, including annual peak discharges, annual minimum flows, and annual maximum rainfall intensities (Clarke, 2002).One of its advantages is its independence from the parent distribution.Parent distribution assumptions are often critical in constructing a confidence interval (CI), because they help derive the statistical parameters that determine the lower and upper bounds of a CI.However, the optimization algorithm only provides extreme values (maximum or minimum), and the underlying probability distribution is unknown.The WD does not require any assumptions to derive the probability of the extreme values, as long as there are enough available data (Roberts, 1971).It is derived as the asymptotic distribution of the smallest order statistics, known as Type III (Fisher and Tippett, 1928;Gumbel, 1958).Its cumulative distribution function (x) and density function δ(x) are: x ≥ a > 0;c > 0,b > 0.
The parameters a, b, and c are defined as the location, scale, and shape parameters, with (a +b) = 1−e −1 (≈ 0.63).The parameter a, the lower bound of the WD, is also the lower bound of the parent-distribution, and is considered the global optimum (minimum).The best local optimum is used as an estimate of the global optimum, and its reliability is statistically evaluated by analyzing the empirically fitted WD of local optima.If R observations (i.e., sample size) are drawn from a WD, and if x h (l) , the best local optimum, is the first element (i.e., smallest number) in an ordered set, it follows that: www.hydrol-earth-syst-sci.net/14/325/2010/ Hydrol.Earth Syst.Sci., 14, 325-338, 2010 I.-Y.Yeo and J.-M.Guldmann: Global spatial optimization The estimator a for the location parameter a is used as point estimator for the global optimum ( x * ).Its confidence interval, with a significance level of 100(1-e −R ) %, is (Golden and Alt, 1979;Dergis, 1985): However, the interval calculated by this formula is too large.It was tightened by Los and Lardinois (1982), using a real number V , with: Then, the confidence interval is given as: The confidence level (1-α) is expressed as and the real number V is Estimating the global optimum value and its confidence interval can then be used to assess convergence toward the global optimum.

Data
The Old Woman Creek (OWC) watershed, part of the National Estuarine Research Reserve (NERR) system, is selected for a pilot study because of the availability of the detailed physical and socio-economic data needed to develop, parameterize, and validate the model.Data on watershed characteristics was collected from remote sensing images, historical land use maps, soil maps from the soil survey geographic database (SSURGO), technical reports from the NERR center, and farmer surveys from the local USDA-NRCS (US Department of Agriculture, Natural Resources Conservation Service) office.These data, which provide a comprehensive description of surface characteristics, are used to derive proper values for the model parameters.
Due to the large computing requirements and the need to generate many independent local optima, the numerical application is performed on a small catchment of the OWC watershed, with a few land-use categories, a simple drainage network, and simple spatial distributions of land uses and soil types.The catchment is overlaid by a grid of 1732 30-m cells (approximately 1.6 km 2 ), with land-use/cover classified into three categories: agriculture, conservation, and urban.The catchment is predominantly agricultural (78%), conservation land uses (grass/woods) represent 12.6 % of the area, urban land use makes up 1.5%, and the remainder represents water (8%).Roads make up most of the urban land (25 cells), except for one cell of built-up structures.The land-use, soil, and topography structures of the catchment are illustrated in Fig. 1.See Yeo et al. (2004) for further descriptions of the OWC and data sources and processing.

Hydrological simulation model
The relationship F (X) between a land-use pattern X and the resulting peak discharge rate at the watershed outlet is analyzed with a spatially explicit hydrological model, by modifying the SCS curve number (CN) method.This method has been chosen, because (1) the relationship between landuse and peak runoff is expressed in terms of hydrologic soil groups and land use/cover conditions (McCuen, 1982;USDA, 1986;Bingner and Theurer, 2001), (2) it is implementable under available computing resources, and (3) it is simple and accurate.The CN method has been embedded into various watershed models for hydrology, flood analysis, and water quality modeling (Garen and Moore, 2005), including the Soil and Water Assessment Tool (SWAT) (Arnold et al., 1998), the AGricultural Non-Point Source Pollution Model (AGNPS) (Bingner and Theurer, 2001;Young et al., 1989), and the Erosion Productivity Impact Calculator or the Environmental Policy Integrated Climate (EPIC) model (Williams et al., 1984;Williams and Meinardus, 2004).There have been continuous efforts to modify the CN values under different physiographic and climatic conditions (Ponce and Hawkins, 1996;Arnold and Fohrer, 2005;Grunwald and Frede, 1999), and to merge the CN method with distributed, variable source area concepts (Walter and Shaw, 2005).
The conventional CN method yields lumped effects by using weighted averages of the parameters.To better account for the impacts of spatial variability in land use, a 30-m cell is selected as the modeling unit, consistent with the smallest spatial resolution for a number of input data, including soil, land use, and DEM.Therefore, the spatial heterogeneity and variability of the input data are fully considered, and lumping is minimized by not using average input values at the cell level.The volume of runoff (Q) is computed as: where P is the precipitation and S the moisture retention, estimated from the runoff curve number (CN): The quantities P , Q, and S are measured in millimeters [mm].Groundwater flows are not modeled, and the antecedent soil moisture condition is considered by using the default estimation of the SCS method (USDA, 1986).Details are provided in McCuen (1982), USDA (1986), and Bingner and Theurer (2001).Runoff flows are accumulated by following the flow paths determined by topography.The flow routing direction is determined by the D-8 method, which assigns the runoff on    , 1984).The onsite cell infiltration capacity (i.e., the initial abstraction), which depends only on cell characteristics (soil, land use, antecedent soil moisture), is compared with the precipitation depth at the cell, and the excess precipitation is transformed into cell runoff while accounting for the upstream runoff routed through the flow path, in a way similar to the routing method in SWAT (Gassman et al., 2007).The excess runoff over a flow path is then obtained by summing up the storm runoffs occurring at all the cells along the flow path, and the total runoff volume at the watershed outlet is obtained by summing up the runoffs occurring along all flow paths in the watershed (Olivera 1996).This process is illustrated in Fig. 2.
A similar approach is applied to estimate the time of concentration, Tc.Instead of computing Tc from a predefined longest distance to the watershed outlet, the model calculates it by keeping track of the flow time of every pathway, to better account for land spatial variability.The travel time of a flow path is calculated by summing up the travel times for all the cells along the path.The maximum travel time across all paths is selected as the time of concentration.The travel time for each cell is determined according to its flow type -overland flow, shallow concentrated flow, or channel flow (USDA, 1986), which accounts for routing and decay.See the extended TR-55 for details (USDA, 1986;Bingner and Theurer, 2001).
After calculating the time of concentration and the total amount of runoff, the peak runoff rate is determined using the extended TR-55 procedure (Bingner and Theurer 2001), with: See Bingner and Theurer (2001) for the values of these coefficients.
The hydrological model was calibrated and validated by comparing model output with historic precipitation and stream flow data.First, the daily precipitation data available for the site were fitted using the Extreme Value Type I probability distribution function (Chow et al., 1989) to determine the design storms.Since the data are only available in daily steps, it is assumed that the precipitation pattern follows a SCS II rainfall time distribution (USDA, 1986).Then, the estimated design storms were used as inputs to the hydrological model, and the peak stream runoffs were simulated and compared with the observed stream flows.As the simulation was event-based, a flood frequency analysis was carried out with observed stream data and the Bulletin 17B method (IACWD, 1982).After determined the frequency curve, the stream runoff rates corresponding to the probabilities of 1-, 2-, 5-, and 10-year storms were determined.These stream runoff rates were comparable with the simulation outputs at a 95% confidence level.The flood analysis uses daily stream data over the period 1987-2002.Daily precipitation data were obtained from the National Weather Service Center from the period 1985-2002.See Yeo et al. (2004) for the values used for parameterization and further information on model validation and calibration.

Land-use optimization model
A simple land use optimization model is integrated with the hydrological model to delineate the land-use pattern that minimizes the peak storm runoff at the watershed outlet: Figure 2: Runoff process and flow path Path k Fig. 2. Runoff process and flow path.
Subject to: where x il is the amount of land use l in cell i, T l is the total land target for land use l in the watershed, and A i is the area of cell i.F (X) is the objective function that evaluates the peak discharge at the watershed outlet resulting from the land-use pattern (X).It is numerically evaluated with the runoff simulation model.Constraint ( 19) guarantees the achievement of watershed land-use targets, and constraint (20) guarantees the full occupation of cell i by land uses.

Results
Five hundred land-use maps have been generated by randomly assigning land uses to the 1567 catchment cells that are neither road nor water.The total land-use areas are kept constant across these maps: 22 urban cells, 1307 agricultural cells, and 237 conservation cells.These totals correspond to the optimal land allocation in Yeo et al. (2007).The combined simulation-optimization model is then applied to these 500 land-use allocations at the 30-m cell level, and the resulting optimal allocations that minimize peak stormwater runoff at the catchment outlet are further analyzed statistically.Nine identical local solutions obtained from clearly different initial maps were eliminated in order to satisfy the assumptions of the Fisher-Tippett theorem, which requires independence of the observations in the sample (Los and Lardinois, 1982;Dergis, 1985).
In order to illustrate the wide range of the 491 initial solutions, the catchment is divided into three sub-regions, as illustrated in Fig. 3, and statistics for the numbers of agricultural, conservation, and urban cells allocated to each subregion are reported in Table 1, confirming that these allocations vary significantly within each sub-region.This range is mirrored by the range of the corresponding peak discharge rates, which vary from 0.25 m 3 /s to 0.5 m 3 /s (Fig. 4a).In contrast, the optimal allocations generated by the model display little variability, with much smaller standard deviations and coefficients of variations (Table 1).The corresponding peak discharges vary within the very narrow range of [0.254073-0.254298]m 3 /s (Fig. 4b).Figure 5 displays five maps corresponding to the optimal allocations for the 1-, 25 -, 50-, 75-, and 100 percentiles of the peak runoff flow, and the five initial land allocations leading to these optimal allocations.The optimal maps are very close to each other, but significantly different from the initial allocations used to generate them.At the optimum, most of the urban land is allocated to upland areas, near the upland boundary of the catchment, away from the waterways and roads, and at low density.Urban land is buffered by conservation land to offset its impacts on runoff volume and travel time to the stream.Denser conservation land is allocated near the catchment outlet along the waterways, and at the edge of the catchment, where the slope is steep, but is avoided in areas with low infiltration capacity (i.e., soil type C or D), increasing the travel time of runoff flows but reducing the runoff volume.These optimal peak runoff values are obtained at convergence, that is, when the sum of the squared differences between the land allocations of two consecutive iterations in the optimization procedure X k T X k is less than ε=10 −8 .It is very likely that with a much smaller convergence criterion ε, the algorithm would converge to the same optimum for all the initial solutions.However, because the model is gradient-based, the convergence becomes very slow after about 10 iterations, and reaching the exact global optimum might take a huge amount of computing time.In addition, the optimized peak runoffs are extremely close to each other, with differences only at the 6th decimal point.These differences are meaningless in a physical sense, and make a strong case for global optimality and the convexity of the peak runoff function.
The statistical methodology presented in Sect.2.2 can be applied to the 491 "optimal" peak discharge rates, which can be viewed as a random sample for ε-level convergence.A three-parameter Weibull distribution is estimated and the results are presented in Table 2.The distribution of the sampled data is presented in Fig. 6.The global optimum and its confidence interval (CI) are estimated from the εlevel optimal values, and the results are summarized in Table 3.The best value from the numerical experiment (i.e., x h (l) or the upper bound of the CI) is 0.254073, approximately 0.01% above the point estimate of the global optimum ( x * =0.254047).The lower bound of the CI is 0.254023, about 0.02% below the best local optimum (x h (l) ).

Model extensions
Different optimal land patterns would be obtained with different-size storms, as demonstrated in Yeo et al. (2004), who show that land management as a BMP is most effective with a small size storm.As the focus is here on the effectiveness of land use as a BMP to reduce the peak runoff, it was reasonable to choose a small design storm, such as 1year storm.However, optimizing under 1-year design storm is clearly different from optimizing for the annual load, or multiple storms.The proposed methodology can be easily extended to deal with multiple storms and to delineate the corresponding optimal land-use pattern, by employing a continuous watershed model, instead of an event-based model, to simulate the annual load.Consider a representative year subdivided into T (t=1→ T ) precipitation periods.For a given, Hydrol.Earth Syst.Sci., 14, 325-338, 2010 www.hydrol-earth-syst-sci.net/14/325/2010/  Table 3. Estimation for the global optimum and confidence interval.
time-independent, land-use pattern subsumed by vector X, the peak runoff for period t would be F t (X), as computed by the simulation model under the conditions of period t.Minimizing, for instance, the aggregate annual runoff, t F t (X), could be implemented with the same procedure.It simply would be lengthier and more computationally demanding because gradients would have to be calculated for each period.
The land allocation model is very simplified.It only considers total land use targets and land availability per cell, and only one objective -peak runoff minimization.It does not consider other ecological (e.g., carbon fixation, animal and vegetal species preservation, etc.) and socio-economic factors in the watershed.This was done purposefully, to allow for a focus on the simulation of the peak runoff, and to generate, with the available computer resources, the largest possible set of feasible solutions (land-use allocations) over which to search for the global optimum.Therefore, the obtained minimum peak runoff can be viewed as the lower bound for the minimum peak runoff that would be obtained if more constraints were added to the model.The methodology could be integrated into a much more comprehensive land allocation model, with not only more constraints, but also multiple objectives.There is a literature on multiobjective optimization models applied to watershed issues, but these models are of an aggregate nature and do not deal with detailed allocations at the cell (or HRU) level.For instance, Sadeghi et al. (2009) allocate land to five agricultural land uses while minimizing erosion and maximizing economic benefits.Chang et al. (1995) allocate land to forest conservation, agriculture, recreation, and residential development, while minimizing the discharges of five distinct pollutants and maximizing employment and income.Gabriel et al. (2006) develop a mixed-integer quadratic program to select parcels for development while (1) maximizing the compactness of the development area, (2) minimizing its imperviousness, (3) minimizing the development of environmentally-sensitive parcels, and (4) maximizing the total value of the development parcels.
If the local minimum generated by a hill-climbing algorithm always turns to be the global one, the objective function is necessarily convex.The previous results suggest that the peak runoff function is convex in terms of the land-use variables.However, as this function cannot be expressed analytically in a closed form, its convexity cannot be proven by analyzing its Hessian matrix.As discussed in Sect.3, the total runoff volume and travel time are involved in estimating the peak discharge.The effects of the land-use variables on these two components and the whole hydrological system are further examined, to provide additional support (though no formal proof) for the convexity of the objective function.

Estimation of runoff volume
As described in Eq. ( 15), the CN method is used to estimate the volume of runoff (Q) as a function of precipitation (P ) and moisture retention (S).While precipitation is exogenous to the simulation model, S is solely a function of the curve number (Eq.16), which is endogenous, as it depends upon land cover and soil type.Let x il be land use l in cell i, and c il the curve number for land use l in cell i.Since soil types do not vary across the watershed, the curve number (cn i ) for cell i is: Therefore, the parameter S i and the runoff volume Q i of cell i are functions of the vector of the land-use variables X i =(x i1 ,. . .x il ,..x il ) , with: The runoff volume Q p a along the flow path p a to the watershed outlet is estimated by summing the runoff volumes in all cells i in the path, with: P i is the amount of precipitation in cell i.The routing path is determined by the D-8 method.Equation ( 24) is identical to Eqs. ( 15)-( 16), but applies to the runoff volume along the flow path, instead of to a single cell.Although the total runoff is expressed analytically as a function of the land-use variables, it is difficult to characterize the convexity of the  function presented in Eq. ( 24), because of the term f (X i ).A numerical analysis has been conducted to better understand this function.With the given land allocation and soil distribution (Fig. 1), the CN values ( l c il x il ) only vary over [77][78][79][80][81][82] in the catchment, because most of it is used for agriculture and conservation.The runoff volume (Q p a ) generated for these CN values was computed.The results, presented in Fig. 7, show a monotonously increasing and slightly convex relationship between Q p a and CN.As CN is a linear function of the X i 's, Q p a is then a convex function of the X i ' s.

Estimation of runoff travel time
As discussed in Sect.3.2, the most influential parameters for flow times are land uses and topography.The overland flow is a function of Manning's roughness coefficient, flow length, and slope; the shallow concentrated flow is determined by slope; the channel flow is computed with Manning's roughness coefficient, channel length and area, and slope.Manning's roughness coefficient, a parameter for surface friction and resistance, is a function of land-cover, and topography determines flow directions, slopes, and drainage patterns.
The hydrological model keeps track of all flow paths to the watershed outlet, and assigns a specific flow type to each cell on each path.This is necessary to account for the impacts of site-specific land-use changes, as surface cover affects the Manning's roughness coefficient used in flow time estimation.Then, the flow time is explicitly calculated for each cell i, and the total flow time over path p a is estimated as the sum of the travel times over all the consecutive flow segments along the flow path.(Bingner and Theurer, 2002).
variables vector X:

Estimation of peak discharge
The peak discharge rate at the watershed outlet is estimated using the extended TR-55 method (Bingner and Theurer, 2002), which requires the following inputs: the runoff volume, the time of concentration, and the unit peak regression coefficients a p − f p .These coefficients are determined by the rainfall distribution and the ratio I a /P 24 .The initial abstraction (I a,i ) of cell i is estimated as 20% of the moisture retention S i (Eq.16), which is itself dependent upon the land uses in cell i (USDA, 1986), with: The initial abstraction for the watershed is then estimated as the average value of I a,i : The constants a p − f p are derived from a look-up table, and Fig. 8 presents their values as functions of the ratio I a /P 24 , for Type II rainfall.Except for coefficient a p , these curves are strongly nonlinear.Once the values of the parameters a p −f p are determined, the peak discharge rate is computed as: where Y (X) represents the ratio I a /P 24 that determines the regression coefficients a p −f p (Eq. 27), H (X) represents the time of concentration Tc (Eq.25), and Np a is the number of all possible paths.The right-hand side of Eq. ( 28) is essentially identical to Eq. ( 17), as the total runoff at the watershed outlet ( ) is equal to the product of the total drainage area by the effective rainfall (P 24 D a ), which is the amount of precipitation that is neither retained by the land surface nor infiltrated into the soils.Equation ( 28) relates two components, the total runoff volume and the time of concentration, Ia/P24 =0.00 Ia/P24 =0.25 Ia/P24 =0.50 Ia/P24 =0.75
The function L(H (X),Y (X)) is computed with selected values for Tc and the I a /P 24 ratio for the study area.Numerical results are presented in Fig. 9 for I a /P 24 =(0.00,0.25, 0.50, 0.75) and Tc in the range [0-2 h], which covers all possible Tc values in the 500 initial land-use patterns.The relationships presented in Fig. 8 are either convex or linear.However, the convexity of Eq. ( 28) cannot be guaranteed, as the multiplication of two convex functions, L(H (X),Y (X)) and N P a 1 Q P a , cannot be mathematically proven to be convex.

Conclusions
This paper has presented a general methodology for integrating complex simulation models of natural systems into optimization models that account for various socio-economic and environmental objectives and constraints.In the specific case of hydrological watershed models, the decision variables are related to land-use allocations, sitting and sizing of structural BMPs, agricultural practices, and non-structural BMPs, and the environmental outputs include peak runoff and sediment, phosphorous, and nitrogen loads.The gradient of the objective function is estimated numerically with the simulation model, and a hill climbing algorithm is implemented to reach either a local optimum or the global optimum.A statistical procedure based on the Weibull distribution is next used to estimate the global optimum out of a large number of model-generated local optima.
The methodology has been applied with a peak runoff simulation model to the OWC watershed in Ohio.The decision variables are land-use allocations, and the objective is to minimize peak runoff at the watershed outlet.A large number of solutions has been generated from distinct initial solutions, and these solutions turned out to be very close, strongly supporting the case for a convex relationship between peak dis-charge and land-use variables.The convexity of the objective function has been further investigated by examining the underlying mechanics of the hydrological model (i.e., the SCS-CN method) in terms of land-use variables, and by performing numerical evaluations of its main components.The numerical results also support, though do not fully prove, the case for convexity.
The methodology can be adapted to deal with other optimization and simulation models, and to design watershed structural BMPs, alternative agricultural practices, and non-structural BMPs (e.g., use of pervious material in urban areas).The modeling scope can be extended to (1) include multiple storms, and (2) account for socio-economic and other environmental factors.To further assess the methodology, numerical experimentations should be undertaken with other simulation models, objectives, constraints, and sites/regions.The search for and derivation of the global optimum should provide the basis for designing heuristic procedures that yield very good, though not necessarily optimal, managerial and planning decisions.

Figure 1 :
Figure 1: Characteristics of the Study Area

Fig. 1 .
Fig. 1.Characteristics of the Study Area.Note: the hydrologic soil distribution (soil type A, B ,C, D) is the soil grouping used in the SCS-CN number method, related to the soil infiltration capacity.There is no soil type D in the study site.
)where Q p is the peak discharge [m 3 /s], D a the area of the spatial unit [ha], P 24 the 24-h effective rainfall over the total drainage area [mm], T c the time of concentration [hr], and the coefficients a p ,b p ,c p ,d p ,e p , and f p , are determined by the ratio of initial abstraction (I a ) to 24-h precipitation (P 24 ).

Fig. 3 .
Fig. 3. Sub-regions in the OWC Catchment.* Denotes cells that are neither water nor road.

Figure 4 :
Figure 4: Distributions of the Peak Discharge Rates

Fig. 7 .
Fig. 7. Numerical assessment of peak discharge rate with varying curve number.

Figure
Figure 9: Numerical Assessment of ( ( ), ( )) L H Y X X Under Type II Rainfall Distribution

Table 1 .
Summary Statistics for the initial and optimal allocations.
Note: the numbers in the table indicate the number of cells (30-m) assigned to the different land uses.

Table 2 .
Data fitting with a weibull distribution.
The time of concentration T c is determined by the flow path with the maximum travel time.Since flow path and type are determined by watershed topography and geography, the time component of the hydrological model is necessarily a function of the land-use Hydrol.Earth Syst.Sci., 14, 325-338, 2010www.hydrol-earth-syst-sci.net/14/325/2010/ Fig. 8. Regression Coefficients for Peak Discharge