Articles | Volume 26, issue 16
Research article
22 Aug 2022
Research article |  | 22 Aug 2022

An algorithm for deriving the topology of belowground urban stormwater networks

Taher Chegini and Hong-Yi Li

Belowground urban stormwater networks (BUSNs) are critical for removing excess rainfall from impervious urban areas and preventing or mitigating urban flooding. However, available BUSN data are sparse, preventing the modeling and analysis of urban hydrologic processes at regional and larger scales. We propose a novel algorithm for estimating BUSNs by drawing on concepts from graph theory and existing, extensively available land surface data, such as street network, topography, and land use/land cover. First, we derive the causal relationships between the topology of BUSNs and urban surface features based on graph theory concepts. We then apply the causal relationships and estimate BUSNs using web-service data retrieval, spatial analysis, and high-performance computing techniques. Finally, we validate the derived BUSNs in the metropolitan areas of Los Angeles, Seattle, Houston, and Baltimore in the US, where real BUSN data are partly available to the public. Results show that our algorithm can effectively capture 59 %–76 % of the topology of real BUSN data, depending on the supporting data quality. This algorithm has promising potential to support large-scale urban hydrologic modeling and future urban drainage system planning.

1 Introduction

Urban flooding events pose an escalating threat to urban areas at regional and larger scales. The worsening of the urban flooding issue can first be attributed to urbanization and associated regional population migration. The United Nations estimates that, globally, the urban population will grow to more than two-thirds of the total population by 2050 (United Nations Department of Economic and Social Affairs2019). This worsening can also be due to climate change, particularly accelerated extreme precipitation and sea-level rise, which have posed an increasing threat to urban populations (Schreider et al.2000; Yang et al.2013; Hettiarachchi et al.2018; Rosenberger et al.2021). Both urbanization and climate change are widely considered large-scale phenomena and have been studied mostly at regional and larger scales (e.g., Ajaaj et al.2017; Ntelekos et al.2010; Teuling et al.2019; Pang et al.2022; Qian et al.2022). As such, it is necessary to understand, predict, and mitigate urban flooding at regional and larger scales.

Generally, there are two types of systems for transporting stormwater from urban areas to local water bodies: combined sewer systems (CSSs) and separate sewer systems (SSSs). CSSs collect domestic sewage and/or industrial wastewater in addition to stormwater, whereas SSSs have two separate systems for collecting stormwater and sewage/wastewater. During heavy rainfalls, the overflows from CSSs are a major source of pollution; therefore, at the end of the 20th century, many major countries around the globe started adopting SSSs in their urban development plans and partially or fully transforming their existing CSSs into SSSs (Mannina and Viviani2009). Although there are still many cities around the world with CSSs, SSSs are more common in many major countries. For example, there are over 700 communities in the US that use CSSs, whereas over 80 % of the US population resides in areas with SSSs (EPA2018). In China, the percentage of SSS and CSS usage varies by region, but SSSs are predominant overall, accounting for 58 %–87 % of the total sewer line length in different regions of China (Huang et al.2018).

The US Environmental Protection Agency (EPA) uses the term “municipal separate storm sewer system” (MS4) to refer to the stormwater collection part of SSSs. In this study, we focus on MS4s because they are the most dominant type of stormwater transport systems in the US. The EPA defines an MS4 as a publicly owned urban stormwater conveyance system that directly receives excess surface runoff from urban areas during storm events and delivers it to lakes, rivers, or oceans. Thus, MS4s play an irreplaceable role in preventing or mitigating urban floods. However, due to a lack of good-quality MS4 data, most urban modules in existing hydrological models focus on surface hydrological processes (e.g., those associated with impervious areas and compacted soils) and do not explicitly account for MS4s (e.g., Rafee et al.2019; Qian et al.2022; Cuo et al.2008; Yang et al.2011). One of the few exceptions is the Storm Water Management Model (SWMM) (Rossman and Simon2022). However, SWMM can only apply to a local, small-city level due to its intensive computational demand and requirement for detailed MS4 data, which are typically not available to the public (e.g., Nanía et al.2015; Meyers et al.2021; Fraga et al.2016; Naves et al.2019; Yang et al.2011). Moreover, some studies, instead of using the actual urban drainage network, generate synthetic networks based on probabilistic methods that can capture the hydrologic responses of urban watersheds (e.g., Seo and Schmidt2014; Kim et al.2021). Generating such synthetic networks requires parameter calibration based on hydrological observations such as streamflow. Therefore, MS4 data sparsity remains a grand challenge in urban hydrology, preventing us from understanding and modeling belowground urban hydrologic processes at the regional or larger scales that are compatible with the impacts of urbanization and climate change.

In this study, we focus on the belowground urban stormwater network (BUSN) elements of MS4s, i.e., aboveground elements such as street inlets, manholes, and ditches are not the subject of this study. We attempt to address the data scarcity challenge based on two premises. The first premise is the topological relationship between street/road networks and BUSNs. Generally, BUSNs are required to protect streets/roads from flooding; thus, they are often constructed parallel to street/road networks, particularly for important streets, as illustrated in Fig. 1. The Urban Drainage Design Manual by the US federal government (Brown et al.2013) states that the main design objective of BUSNs is to collect the stormwater runoff and convey it along and through streets toward suitable water bodies without adversely impacting the streets' intended functions. The second premise is that aboveground urban geospatial data are now extensively available. For example, OpenStreetMap (OpenStreetMap2021) provides a vector-format, global map of street/road networks freely accessible to the public. By capturing and utilizing this topological relationship, we can derive the topological properties of BUSNs (e.g., geographic locations of stormwater pipes and their spatial connections) based on the existing street/road network and other aboveground data.

Figure 1A schematic of an urban drainage system adapted from Town of Gilbert, AZ (2022).

Table 1The shortest paths in the two graphs shown in Fig. 2.

Paths in bold indicate the paths that are in one graph and not in the other.

Download Print Version | Download XLSX

Figure 2A comparison of unweighted (a, b) and weighted (c, d) betweenness centrality (BC) for two directed graphs. Panels (a) and (c) show the edge weights. Panels (b) and (d) illustrate the importance of edge weights in determining the shortest paths. The line colors in panels (b) and (d) represent the BC values.


Thus, our primary objective is to propose, develop, and validate a novel algorithm for deriving BUSN topological properties from ubiquitous existing aboveground data. The rest of the article is structured as follows: Sect. 2 describes the conceptual basis and technical details of the new algorithm, Sect. 3 lists four metropolitan areas in the US as the case studies, Sect. 4 illustrates this algorithm in these case studies, and Sect. 5 closes with a summary and discussions on the limitations and future implications.

2 Methodology

In this section, we first explain the conceptual framework of our algorithm for deriving the topology of BUSNs, including the following:

  1. complex network analysis concepts (which are transferable and not specific to any location) from graph theory that we adapt to capture the topological relationships between BUSNs and street/road networks,

  2. a generic procedure for BUSN derivation,

  3. a simple yet effective way of validating the BUSNs derived by the algorithm,

  4. the technical details of implementing the algorithm in the US based on the US the federal and local urban drainage design criteria as well as publicly available land surface data.

Figure 3A schematic diagram of a generic procedure for estimating BUSNs using publicly available aboveground datasets.


2.1 Generic and conceptual framework

2.1.1 Complex network analysis

BUSNs and street networks are both complex, hierarchical networks. The function of such networks largely depends on their nontrivial topological structure (Strogatz2001). Therefore, analyzing complex networks is usually carried out by measuring their structural properties. Graph theory introduces different types of structural properties for quantifying the function of complex networks from different perspectives such as clustering (grouping elements based on their attributes), connectivity (resilience to removing elements), and centrality (relative importance of elements). As our proposed algorithm is based on the relative importance of elements of a complex network, centrality-based metrics are suitable for our algorithm. Moreover, as we are interested in analyzing flow paths in street and storm drain networks in this study, we opt for using betweenness centrality (BC; Freeman1977) as a metric for measuring the relative importance of edges in a complex network such as a BUSN.

The mathematical definition of BC is as follows:

(1) BC ( i ) = v i w σ vw ( i ) σ vw ,

where σvw is the total number of the shortest paths from Node v to Node w, and σvw(i) is the subset of paths that pass through Node i (Brandes2001). In other words, the BC value of an edge is the ratio of the total number of the shortest paths that pass through the edge to the total number of the shortest paths in the entire network. In general, the shortest path between two nodes in a network is the path with the minimum number of edges that connect the two nodes. The specific definition of the shortest path closely depends on the network type. A network can be either undirected or directed. In an undirected network, there is a symmetric relationship between a pair of connected nodes, whereas, in a directed network, there could be a one-way relationship between a pair of connected nodes. For example, pedestrian pathways in a street network can be considered as an undirected graph because pedestrians can cross any path in the network in both directions. Car pathways, on the other hand, cannot be considered undirected because there are one-way and two-way streets. Therefore, in this case, the street network can be considered as a directed graph. A BUSN, nonetheless, is a directed network because the pipes in a BUSN are always one-way by design (i.e., stormwater moves in the pipes under gravity towards surface water bodies). In addition to direction, network edges can have numeric properties, such as length and width. These numeric properties, which are called edge weights, can be used to differentiate the importance or capacity of the edges. Within a directed network, BC can be calculated with or without the edge weights and denoted as weighted or unweighted BC, respectively.

We demonstrate the difference between weighted and unweighted BC in a directed graph using a simple example. Figure 2a shows an unweighted directed network in which, for simplicity, we assume that the network is uniformly weighted (i.e., all edge weights are 1). Figure 2c is the same graph, but two edges have different weights (i.e., edges d→f and e→f have weights of 9 and 3, respectively).

Table 1 shows all of the paths in the network and the sum of edge weights along the paths. In the table, Wiju and Wijw are the sums of edge weights in a path for the uniformly weighted and weighted graphs, respectively. We note that, as all edge weights are 1 in the uniformly weighted graph, Wiju becomes the number of edges in a path.

Figure 4A simple case demonstrating the validation method. Real BUSN elements with more than 60 % coverage percentage are considered as “covered”.


The paths highlighted using bold font in Table 1 are the shortest paths unique to each graph. For example, there are two paths between nodes a and f, namely adf and adef. In the uniformly weighted graph, the sum of weights for these two paths are 2 and 3, respectively; thus, adf is the shortest path. On the other hand, in the weighted graph, the sum of weights for these two paths becomes 10 and 5, respectively; thus, adef becomes the shortest path from a to f. By repeating this procedure for all of the edges in the two networks, we can identify all of the shortest paths passing through each edge and compute their BC values (Fig. 2b, d). From this example, we conclude that edges with lower weights are more likely to have higher BC values because the shortest paths are more likely to pass through them.

Furthermore, the weights of edges can be calculated in various ways depending on their network function. Kirkley et al. (2018) adapted the BC concept in street network analysis to study traffic congestion (i.e., slower movement of vehicles due to the imbalance of street capacity and traffic volume). Considering a street network as a directed graph, street segments are edges and street junctions are nodes, and a higher weight associated with a street segment implies the lower importance of this street segment. As travel time is central to quantifying congestion, Kirkley et al. (2018) assigned street length as the edge weight for computing BC. As such, for any street, a lower weight implies a shorter travel time, a higher chance of this street being on an optimum traffic route, and thus higher importance. They showed that, in a street network, streets with high BC values are guaranteed to be a part of the backbone structure of the network and, thus, more important to traffic analysis. Nevertheless, they pointed out that using street length as the only edge weight was a limitation in their analysis, as the length is not the only factor affecting traffic congestion.

For adapting the BC concept to BUSNs, we rely on two facts: (1) a BUSN is also a complex network and a directed graph, and (2) BUSNs are well connected to street networks through some connecting elements such as street inlets and catch basins, as shown in Fig. 1, as required by federal and state regulations (Brown et al.2013). As these connecting elements are primarily aligned with streets, we consider that belowground stormwater pipes, for the most part, are laid beneath streets. Therefore, we assume that a BUSN's topology is analogous to a street network's topology in an urban area, and we can infer the former from the latter. We also note that, in a BUSN, pipes are edges and pipe junctions are nodes.

Moreover, it is neither necessary nor feasible to have a belowground stormwater pipe below each street. For example, a country road or a street in a very sparsely populated area may not need a belowground stormwater pipe because the corresponding surface infrastructures, such as flood buffering zones, may be sufficient to protect it from floods. Those streets in residential and commercial areas are relatively more important and will need BUSNs for flood protection due to two possible reasons: (1) the streets are so important that extra flood protection is needed in addition to existing surface infrastructure (e.g., buffering zone and retention ponds), or (2) there is not enough space for surface infrastructure in heavily populated urban areas, where BUSNs are the most economical and feasible option. Indeed, within a street/road network, some streets are more important than others depending on several factors, such as road type, urban form (e.g., street circulation system and buildings' arrangement, distribution, and spatial accessibility), land use (e.g., residential, commercial, and industrial), and land cover (e.g., open spaces, parks, and impervious surfaces). Thus, streets with more importance are more likely to have BUSNs underlying them. In this study, we quantify the relative importance of streets by incorporating the aforementioned factors into the BC concept.

To address the single-weight-factor limitation pointed out in the Kirkley et al. (2018) study, for BUSN derivation, we propose incorporating multiple stormwater-relevant factors, such as street topography and land use/land cover, into edge weights. In other words, instead of using only one attribute as an edge weight, we compute an integrated weight from multiple attributes. The resulting weighted BC reflects the integrated impacts of various urban factors on the required stormwater transport capacity of the BUSN pipe, and it is denoted as the integrated weighted BC (IWBC) hereinafter. Intuitively, a street with a higher weight will have a lower requirement for stormwater transport due its lower importance and, thus, a lower IWBC value. Correspondingly, this street will be less likely to have a BUSN pipe underlying it. In this study, we assign street weights based on the following rules:

  • As an edge should have a single integrated weight for computing the BC, different street/pipe attributes should be summarized into a single value.

  • We transform the values of street/pipe attributes into edge weights such that streets/pipes with more significance have lower weights. The reason for this is that we measure the street/pipe significance based on their BC values, and edges with lower weights are more likely to have higher BC values.

  • Considering that street/pipe attributes can have different ranges or even data types, we normalize their values before assigning them as edge weights.

In this study, we consider four street attributes, namely, land cover type, road type, the discharge capacity of its associated storm drain pipe, and the building footprint, using integer, string, float, and float data types, respectively. First, we normalize each attribute by data binning (i.e., dividing the values into five categories and assigning each category an integer number from 1 through 5). These integer values correspond to different levels of relative importance starting from very high (1) to very low (5). For example, we normalize building footprints such that streets with higher building footprints have lower weights. The reason for this is that a higher building footprint value indicates that the street is located in a high-density residential area or a business center; therefore, the stormwater should be drained quicker. We note that the only requirement for a normalized weight is that it should be greater than zero, as zero edge weights may lead to infinity paths with equal lengths, meaning that the shortest path cannot be determined. After transforming all of the attributes into edge weights by normalizing them to the range [1, 5], we compute the integrated weight of edges by taking the average of four weights. The same relative importance logic applies to the integrated weight (i.e., a lower integrated weight value for a street increases the probability of the street having a higher BC value). Consequently, the street has a higher relative significance and requires more stormwater transport capacity. We provide more details on the implementation of these rules in our algorithm in the following section.

Figure 5Comparing the Fisher–Jenks (a, c, e, g) and quantile (b, d, f, h) classification methods for integrated weighted betweenness centrality (IWBC) values of baseline networks. Panels (a) and (b), (c) and (d), (e) and (f), and (g) and (h) correspond to the Los Angeles, Houston, Baltimore, and Seattle cases, respectively. In each plot, IWBC values are on the left y axis (red), and the number of elements in each class is on the right y axis (purple).


Figure 6A flowchart demonstrating details of the proposed framework.


2.1.2 Generic procedure for deriving a BUSN

In this subsection, we outline a generic procedure to derive a BUSN based on IWBC that can be conceptually applicable to any urban area. Figure 6 illustrates the major steps in this generic procedure, which are explained in the following:

  1. Step 1 entails the calculation of the surface slopes of all streets in the street network of interest (Fig. 3a) based on the available digital elevation model (DEM) (Fig. 3b) data. The DEM data should ideally have the highest spatial resolution and a small error in vertical elevation values when possible, to minimize the errors in estimating street slopes. When necessary, lidar data can be used instead for better accuracy.

  2. Step 2 involves setting the flow directions between streets based on the surface slopes obtained in Step 1. At this stage, we assume street length is the only weighting factor. Now the street network becomes a directed, weighted network, as shown in Fig. 3c.

  3. Step 3 entails the initial estimation of relative street importance by calculating the weighted betweenness, as shown in Fig. 3d. In this study, we use two Python packages for performing these operations: “Networkit” (Staudt et al.2015) for running computationally expensive network operations, such as computing BC in parallel, and “Networkx” (Hagberg et al.2008) for other network operations, such as community detection and measuring network connectivity.

  4. Step 4 requires the estimation of streets' right-of-way (ROW) based on local/federal regulations. ROW is a part of the land that is reserved by local/federal authorities for construction, maintenance, and future expansion of transportation elements, such as highways and public utilities (Brown et al.2013).

  5. Step 5 entails the estimation of the hydraulic properties of the potential BUSN pipes (e.g., pipe size and slope) by accounting for both the weighted betweenness from Step 3 and the recommendations from the street-/road-relevant regulations at the federal, state, or local levels (see Fig. 3e). Steps 4 and 5 are where the site-specific factors come into play, as different cities under different jurisdictions may have site-specific requirements with respect to ROW and the hydraulic properties of BUSNs.

  6. Step 6 involves the calculation of IWBC by assigning different weights to different streets and integrating several weighting factors, such as road type, land use/land cover (LULC), surface topography, and building density.

  7. Step 7 requires the derivation of the BUSN by removing those relatively unimportant streets based on IWBC. We assume that BUSN pipes are only needed for those remaining streets. Thus, the topology of the BUSN is the same as that of the remaining, relatively important streets.

  8. Finally, Step 8 involves checking the connectivity of the remaining network based on the concept of weakly connected components from graph theory. In graph theory, network connectivity is an important measure of a network's resilience to losing edges or nodes (i.e., the impact that removing edges and nodes has on the overall network flow). For this purpose, after removing the unimportant streets, we first detect the isolated subnetworks by determining the weakly connected components (i.e., those components that are unreachable after converting the network to an undirected graph by ignoring edge directions). We then find the number of streets for each subnetwork and remove those subnetworks whose number of streets is less than the average street count of the subnetworks.

In Step 5, we proposed a weight integration strategy for combining continuous and discrete weighting factors into a unified discrete weight system. Some urban features, such as road type and land use/land cover, are only quantified with discrete values and cannot be represented by continuous values. The integrated weight ranges from 1 to 5. For any edge, a smaller weight indicates higher relative importance because the edge will have a higher chance of being on the shortest path. First, we transform all continuous weighting factors into discrete values in two possible ways: (1) the quantile method, which is based on the equal number of features in each class, and (2) the Fisher–Jenks (Jenks1977) method, which minimizes the total within-class variance of features. Upon transforming all weighting factors to discrete values, we integrate them via arithmetic mean. Note that, by using arithmetic mean, we equally account for different weighting factors because no data are available for objectively quantifying the relative importance of each weighting factor.

2.1.3 Validation

Due to the scarcity of publicly available real BUSN data and their low quality, we can only validate the topology of the derived BUSNs. Therefore, although our proposed algorithm provides hydraulically feasible approximations for the size and slope of the BUSN pipes, we cannot validate them. Our topology validation strategy is based on the principle of spatial proximity for the places where some real BUSN data are available. As shown in Fig. 4, on each side of a derived BUSN edge, we set a buffer zone as wide as the corresponding ROW. We then perform spatial analysis to judge how much of a pipe from the real BUSN is located within the buffer zones. If 60 % of a pipe length from the real BUSN is within this buffer zone, the pipe is considered “covered”. We determined this 60 % threshold based on a sensitivity analysis which showed that the total coverage percentage value does not change by more than 2 % for threshold values from 50 % to 80 %. For any area, we denote Lcovered as the total length of the “covered” BUSN pipes and Lall as the total length of all of the BUSN pipes. To measure the algorithm's performance over an urban area, we define a successful covering percentage as

(2) ω = L covered L all 100 .

Obviously, the higher the ω value, the higher the percentage of the real BUSN pipes successfully covered by the algorithm and, thus, the better the algorithm's performance.

Figure 7A map of urban areas that are the subject of this study as well as their area.

There are often non-negligible uncertainties in both street network and real BUSN data. For instance, over any urban area, one may estimate both the total length values of the real BUSN pipes and the streets, respectively, from the available data. In principle, the real BUSN's total length should not exceed that of the corresponding street network. In reality, however, this may not be the case if there are notably fewer missing data from the real BUSN dataset compared with the corresponding street network dataset. To account for this situation in our validation, we first discretize the targeted urban domain into 1 km resolution grid cells. For each grid cell, we then calculate the ratio of the total length of the real BUSN pipes to the total length of the street network elements. Theoretically, this ratio should be no more than 1.0. We discard those cells with a ratio larger than the theoretical maximum (1.0) and only calculate ω in the remaining grid cells. Finally, the spatial average of ω values over the grid cells within an urban area (ω) is the metric we use to measure the algorithm's performance for this area.

Figure 8Input data for the Los Angeles case: (a) digital elevation model, (b) land use/land cover, (c) street network, (d) existing BUSN, and (e) building footprints.

Figure 9Examples demonstrating the quality of publicly available datasets. (a) Existing BUSN data for Houston are published by the Houston Public Works (City of Houston2021), and (b) street data for a part of Snohomish, Seattle, WA, are from OpenStreetMap (OpenStreetMap2021) and Google Maps (commercial) (Gorelick et al.2017) (© OpenStreetMap contributors YEAR. Distributed under the Open Data Commons Open Database License (ODbL) v1.0. and © Google Maps).

2.2 Technical implementation of the US

This subsection describes the detailed implementation of the previously outlined algorithm over the US.

2.2.1 Data gathering and processing

We retrieve the raw input data that are available at least in the US from the following sources:

  • Street network data are retrieved from OpenStreetMap (OpenStreetMap2021) using an open-source Python package called “OSMnx” (Boeing2017). OSMnx retrieves street network data and their attributes, such as road type and street length, for any region of interest and can perform some post-processing operations for cleaning up the raw street network data

  • Digital elevation model (DEM) data at a 10 m resolution are retrieved from the 3D Elevation Program (3DEP) of the US Geologic Survey (USGS) (U.S. Geological Survey2017) using an open-source Python package called “Py3DEP” (Chegini et al.2021). Py3DEP provides access to the highest-quality and highest-resolution DEM data that are publicly available in the US.

  • Land use/land cover (LULC) data at a 30 m resolution are sourced from the Multi-Resolution Land Characteristics (MRLC) consortium (Dewitz and U.S. Geological Survey2021) using an open-source Python package called “PyGeoHydro” (Chegini et al.2021). This package provides access to many hydroclimate datasets, such as streamflow observations from USGS stream gauges and the National Inventory of Dam, in addition to LULC data.

  • Building footprints are retrieved from the Microsoft Building Footprints (MSBF) dataset (Microsoft2018) using an open-source Python package called “PyGeoOGC” (Chegini et al.2021). Building footprints are the perimeter of buildings that outlines their exterior walls. In total, this dataset includes about 130 million building footprints over the US. Moreover, PyGeoOGC is a low-level interface to many geospatial web services.

  • BUSN design criteria and recommended parameters are sourced from the Urban Drainage Design Manual by the Federal Highway Administration, US Department of Transportation (Brown et al.2013), and other state or local governments (e.g., UDFC2018).

We perform the following post-processing operations on the raw input data:

  • We retrieve the “Road type” and “length” attributes for each street directly from OpenStreetMap (see Table 2).

  • We calculate the surface slope and flow direction of each street in four steps. First, within any street network, we remove intersection points that are closer than the DEM resolution and, therefore, cannot be effectively used. Second, we hydrologically condition the DEM data to more accurately represent the flow direction of surface runoff. Third, we compute the street slope using the conditioned DEM data and set the slope value to 0.4 % if the computed slope is less than 0.4 %, as streets must have a minimum longitudinal slope of 0.4 % (UDFC2018). Finally, we set the flow directions between streets (assuming that the flow directions in the underlying BUSN pipes are the same as the streets).

  • We also estimate the streets' ROW in four steps. First, we assign the number of lanes to each street based on its road type defined in OpenStreetMap (OpenStreetMap, 2021) (see Table 2). As the remaining road types that are not listed in Table 2 are mostly local access roads, we assign them two lanes. Second, we set the width of an individual lane to 5.0 m, which is based on the minimum recommended street lane width of 3.6 m and the minimum sidewalk width of 1.5 m (UDFC2018). Third, we calculate the total width of each street by multiplying the number of lanes by the lane width. Finally, we assign one buffer zone on each side of the street that is as wide as the street itself.

  • We determine the dominant land cover type in the buffer zone of each street by computing the dominant cover type within the buffer zone from the high-resolution LULC data.

  • We estimate the total area of building footprints within the buffer zone of each street by summing up the footprints of the buildings with more than 30 % of their areas within the buffer zone.

Table 2Road type definitions (OpenStreetMap2021) and their corresponding number of lanes.

* Definitions are direct quotes from the OpenStreetMap Wiki (© OpenStreetMap contributors2022).

Download Print Version | Download XLSX

Upon performing these post-processing operations, each street has seven attributes: road type, length, ROW, surface slope, flow direction, land cover type, and building footprints' area. Once all of the input data are ready, we consider four weighting factors for IWBC: road type, LULC, building footprint area, and stormwater pipe flow capacity, as shown in Table 3. We note that, in this study, we transform these street attributes into edge weights in such a way that higher IWBC values correspond to higher significance for transporting stormwater. Here, stormwater pipe flow capacity (Qf) is the maximum discharge for a pipe full of water (more details are provided later). These four weighting factors are chosen for two reasons: (1) they all impact urban stormwater transport, and (2) they can be estimated in the US from the existing data. Other weighting factors may also impact urban stormwater, but insufficient supporting data exist to translate them into quantitative weight.

Table 3Ranges and scores of the weighting factors.

* Ji corresponds to bin values obtained from the Fisher–Jenks natural breaks' classification algorithm.

Download Print Version | Download XLSX

We group each weighting factor into five classes and assign a provisional weight to each class. We set a higher provisional weight to a class with less importance and vice versa. Recall that, for any edge in a weighted network, a higher weight implies a lower probability to be included in the shortest path and, thus, a smaller IWBC value.

The weighting factors in Table 3 include both discrete (road type and LULC) and continuous (building footprints and pipe flow capacity) variables. For consistent weight calculation, we group each weighting factor into five classes. Road type and LULC are already discrete, so the grouping is straightforward. For building footprints and pipe flow capacity, we use the Fisher–Jenks natural breaks method to group them into five classes. We then assign provisional weights to these classes (i.e., a higher provisional weight to a class with less importance and vice versa). This way, each street will have four provisional weights, and its final weight is taken as the arithmetic mean of these provisional weights.

We derive Qf from the existing information using Eq. (3) (Thomason2019). This equation is for a circular concrete pipe with a diameter of less than 24 in. (0.6 m) which is one of the most commonly used stormwater pipes in the US. (Heilman2022).

(3) Q f = 0.3116 n D 8 / 3 S 1 / 2 ,

where n, D, and S are the pipe's Manning's roughness coefficient, diameter (m), and slope (mm−1), respectively. Although the maximum discharge capacity of a circular pipe occurs at 94 % of a pipe diameter (Chow1959), it does not make a difference in our proposed algorithm because we are only concerned with the relative discharge capacity of pipes in the network. The recommended n value for stormwater pipes with a diameter less than or equal to 24 in. (0.6 m) is 0.013 (Heilman2022). D and S are determined during a two-stage, predictor–corrector procedure to compute the IWBC values as follows:

  • Compute the provisional BC values (BC^) for the entire network using Eq. (3) by only considering length as the street weight.

  • Assign a suitable pipe size to each street in the network based on the BC^ values. We achieve this by statistical binning of these values into 10 intervals (the number of permissible pipe sizes from Table 4). Considering that streets with higher BC^ values are likely to receive more stormwater, the largest pipe size is assigned to streets in the class with the highest BC^ values. Similarly, the smallest pipe size is assigned to streets in the class with the lowest BC^ values.

  • Set the pipe slopes based on the obtained street slopes and the permissible slope ranges corresponding to each pipe size given in Table 4 adopted from Brown et al. (2013). A pipe slope usually follows the street slope unless the slope is outside of the permissible range, as it can lead to pipe flow velocities below or above the design criterion (0.9–3 m s−1). Therefore, if a street slope is less than the minimum permissible pipe slope, we set the pipe slope to its minimum permissible value. Similarly, if a street slope exceeds the maximum permissible pipe slope, we set the pipe slope to its maximum permissible value. Subsequently, we compute the pipe flow capacity using Eq. (3) and the obtained pipe sizes and slopes.

  • Compute the arithmetic mean of the four weighting factors and determine the corrected BC values for the network (i.e., the final IWBC values).

Table 4Slope range based on storm drain pipe size.

Download Print Version | Download XLSX

Figure 10Model performance with drainage adequacy classifier index (DACI) values varying from 0.0 to 1.0 with a 0.05 interval for the four case studies.


Figure 11Three different BUSNs estimated for the San Fernando Valley case with three different drainage adequacy classifier index (DACI) values, showing (a) the full street network and DACI values of (b) 0.9, (c) 0.7, and (c) 0.5.

Note that this predictor–corrector approach does not yet change the baseline street network topology. The obtained IWBC values are our basis to obtain the derived BUSN by removing those relatively unimportant streets from the baseline network. One may begin by removing those edges with IWBC values of less than a threshold, as these edges represent less important streets that are less likely to require belowground stormwater pipes. Intuitively, by increasing the IWBC threshold, we remove more elements from the baseline street network; thus, the drainage capacity of the BUSN corresponding to the remaining part of the street network decreases. There is, nevertheless, a nonlinear relationship between increasing the IWBC threshold and decreasing the derived BUSN. This is due to two reasons:

  • In most street networks, the numbers of edges associated with lower IWBC values are nonlinearly larger than those with higher IWBC values. Our analysis shows that the lowest IWBC values have the highest frequency in the network. For example, Fig. 5 compares the distribution of IWBC values using two classification methods, namely Fisher–Jenks and quantile, for the four cities that are subject of this study. As is evident from the figure, the first class based on the Fisher–Jenks method has a significantly higher edge count, whereas the last class based on the quantile method has significantly higher within-class variance in IWBC values.

  • The edges with lower IWBC values correspond to the pipes with smaller diameters, and their removal has a smaller impact on the total BUSN's drainage capacity compared with removing those edges corresponding to pipes with larger diameters.

Figure 12Results for the Los Angeles case. Panel (a) shows the coverage percentage, and panel (b) presents the length ratio maps at a 1 km resolution, where the red pixels indicate the areas with poor street data quality. The white background pixels are grid cells that do not contain street and/or existing BUSN elements. Panel (c) compares the estimated BUSN with the existing BUSN using a 0.9 DACI.

Table 5Summary of real BUSN and street network data.

“No. of types” means the number of element type categories in a database (e.g., trunk and culvert), and “comm.” stands for network communities.

Download Print Version | Download XLSX

Therefore, IWBC cannot be directly used to guide this removal operation. We carry out this operation in two steps:

  • We use the first out of 10 classes based on the Fisher–Jenks method (FJ1) to identify the group of streets with the lowest IWBC values. FJ1 has the highest edge count and the lowest within-class variance in IWBC values.

  • Considering the small variance of IWBC values of the streets in FJ1, the quantile method is suitable for categorizing the streets based on their IWBC values. Thus, we use the quantile method as an indicator for removing edges from the baseline network.

Our empirical analysis of the real BUSN data from the US cities suggests that, for a baseline street network, those edges with IWBC values belonging to the FJ2–FJ10 classes are important enough to have underlying BUSN pipes, whilst only a fraction of the edges with IWBC values belonging to the FJ1 class do not have underlying BUSN pipes and should be removed. Hence, we define the Drainage Adequacy Classifier Index (DACI) based on the IWBC quantiles within the FJ1 class. Mathematically, DACI =1- IWBC quantile. We use the DACI as a direct indicator in guiding the removal of relatively unimportant edges from a baseline street network and deriving the final BUSN. For example, if the DACI value is 0.8, we drop those edges with IWBC values less than the 20th quantile of the IWBC values in the FJ1 class. A DACI value of 1.0 suggests retaining all of the edges in a baseline network.

In this study, we use the DACI as an empirical parameter. For a case study where real BUSN data are partially available, we increase the DACI value until there is no significant increase in the average ω value. Figure 6 depicts a flowchart of our proposed framework.

Figure 13Results for the Houston case. Panel (a) shows the coverage percentage, and panel (b) presents length ratio maps at a 1 km resolution, where the red pixels indicate the areas with poor street data quality. The white background pixels are grid cells that do not contain street and/or existing BUSN elements. Panel (c) compares the estimated BUSN with the existing BUSN using a 0.8 DACI.

Figure 14Results for the Seattle case. Panel (a) shows the coverage percentage, and panel (b) presents length ratio maps at a 1 km resolution, where the red pixels indicate the areas with poor street data quality. The white background pixels are grid cells that do not contain street and/or existing BUSN elements. Panel (c) compares the estimated BUSN with the existing BUSN using a 0.9 DACI.

3 Results

3.1 Case studies and data

We choose the following four major cities in the US as the case studies to demonstrate the algorithm (Fig. 7): Houston, TX; Los Angeles, CA; Baltimore, MD; and Seattle, WA. The primary reason for choosing them is that there are real BUSN data available to the public over a fraction of the areas within these cities with relatively decent data quality. For instance, over the Los Angeles metropolitan area, we only obtain the real BUSN data for a small town, San Fernando Valley.

Interestingly, the real BUSN and street network data show different characteristics among these case studies. The street network structure in the Baltimore and Seattle cases is quite different from that in the Houston and Los Angeles cases. In graph theory, a community refers to a group of nodes (street intersections) in a network where the density of the connections among them is higher than the rest of the network. In a street network, a community can be analogous to an urban cluster. Table 5 summarizes some real BUSN and street network data for our four case studies. As is evident from the real BUSN columns of the table, the quality, availability, and types of real BUSN data vary on a case-by-case basis, as local authorities produce the data based on their own rules, regulations, and available resources. The table shows that the number of element type categories in each case is different. For example, in the Houston case, BUSN element types are divided into seven categories (Trunk, Lead, Outfall, Culvert, Trench Drain, Siphon, and Overflow), whereas there are four categories (Detention Pipe, Driveway Culvert, Cross Culvert, and Pipe) in the Seattle cases. Thus, not only the data quality but also the level of detail that each real BUSN dataset provides is different. Nevertheless, as is evident from Table 5, the main storm drain pipes (sometimes called trunks), which are the focus of this study, are the most dominant elements in BUSNs, both in terms of quantity and their role in the system's transport capacity. Moreover, according to Table 5, the Baltimore and Seattle cases have smaller average community sizes and, thus, smaller urban clusters.

We retrieve the input data for the four case studies following the procedure described in Sect. 2.2.1. Figure 8 shows an example of the input data that we collected for the Los Angeles case. Figure 8a and b plot the digital elevation model and land use/land cover data that we retrieved from 3DEP (U.S. Geological Survey2017) and MRLC (Dewitz and U.S. Geological Survey2021), respectively. Additionally, Fig. 8c, d, and e represent the street network, existing BUSN, and building footprints that we obtained from OSM (OpenStreetMap2021), Los Angeles GeoHub (Los Angeles GeoHub2022), and MSBF (Microsoft2018), respectively.

Furthermore, although the input data for the proposed BUSN algorithm are generally available in the US, the data quality might vary among different categories of input data and different locations. For example, in the available real BUSN data, there are often some missing edges. As is evident from Fig. 9a, the BUSN data are only available in some urban areas in Houston, as indicated by the red color. However, there should be BUSNs in the other areas, but the real data are not available. Our observation suggests that different case studies may be subject to different levels of real BUSN data quality issues, which lead to some uncertainty in the validation of the derived BUSN. Moreover, it appears that the quality of street network data is not consistent across locations either. Figure 9b shows the street network data from OpenStreetMap overlaid with that from Google Maps for a tiny portion of Seattle, WA (for better clarity); this aforementioned figure suggests that OpenStreetMap misses a considerable number of streets and, thus, has certain data quality issues as well. As the topology of our derived BUSNs is primarily based on that of the underlying street networks, the accuracy of the derived BUSN data is also subject to the quality of the underlying street network data.

3.2 BUSN derivation and validation

For each of the case studies, we run the algorithm with DACI values varying from 0.0 to 1.0 at a 0.05 interval (Fig. 10). Intuitively, one would expect that ω increases with the DACI. Interestingly, there is a plateau behavior for the Baltimore, Seattle, and Houston cases, where ω stops increasing once the DACI passes a threshold value (i.e., ω reaches its maximum and stabilizes). Recall that a higher DACI value implies that more streets in a street network require underlying stormwater pipes. This plateau behavior confirms our earlier statement that relatively unimportant streets will not require underlying stormwater pipes. The DACI thresholds (for this plateau behavior) are 0.9, 0.9, and 0.8 for the Baltimore, Seattle, and Houston cases, respectively, suggesting that there are more unimportant streets in Houston that do not require underlying stormwater pipes compared with the Baltimore and Seattle cases. The Los Angeles case, however, does not seem to have such a plateau behavior (i.e., ω does not stop increasing with the DACI even when the DACI =1.0). The possible reason for this is that there are stormwater pipes under most streets in the Los Angeles case. Therefore, as we retain more edges from the baseline street network (as the derived BUSN pipes), ω keeps increasing. We note that, as the street network data quality is sufficiently good for the Los Angeles case, data quality is not a reason for not reaching a plateau. We further confirm the street network data quality by retrieving higher-quality data from a local source, Los Angeles GeoHub (2022), and comparing it with the data obtained from the OSM.

Figure 11 illustrates the changing spatial patterns of the baseline street network (Fig. 11a) and the derived BUSNs when using DACI values of 0.9, 0.7, and 0.5, respectively (Fig. 11b–d), for the Los Angeles case. As expected, by decreasing the DACI value, the derived BUSN becomes sparser, while there are no small isolated subnetworks. We recall that, in Step 8 of our general procedure, the algorithm removes the small isolated subnetworks if there are any. This step is particularly important because the connectivity of a network dictates its overall transport capacity.

Figures 1215 show results of the four cases. Each of these figures includes three panels. Panel a in Figs. 1215 shows the ω values at the grid cell level. Panel b in Figs. 1215 shows the ratio of the total length of the real BUSN pipes to the total length of the street network elements within each grid cell. Note that, in the first panel, those grid cells with a ratio exceeding 1.0 are highlighted with a red color and excluded from the validation due to the poor quality of street network data. Thus, a spatial average of the ω values from the remaining grid cells is used as the algorithm's overall performance for each case study (see Table 6). Panel c in Figs. 1215 shows the derived BUSN overlaid by the corresponding real BUSN.

Figure 15Results for the Baltimore case. Panel (a) shows the coverage percentage, and panel (b) presents length ratio maps at a 1 km resolution, where the red pixels indicate the areas with poor street data quality. The white background pixels are grid cells that do not contain street and/or existing BUSN elements. Panel (c) compares the estimated BUSN with the existing BUSN using a 0.9 DACI.

Figure 16Comparison of the overland slope of urbanized areas for the four case studies based on (a) the cumulative percentage graph of slopes and (b) the variability percentage of the slopes from steep slopes with a 0 %–5 % exceedance probability to moderate slopes with a 5 %–33 % exceedance probability, from moderate to mild slopes with a 33 %–66 % exceedance probability, and from mild to very mild slopes with a 95 %–100 % exceedance probability. The dashed lines in panel (b) represent the slope variability for the four cases and among the four slope categories.


The algorithm performs very well in the Los Angeles and Houston cases, with ω values of 75.5 % and 72.9 %, respectively. It only performs reasonably well in the Seattle and Baltimore cases, with ω values of 65.6 % and 59.0 %, respectively. There are two possible reasons for this difference in the algorithm's performance: (1) the street network and real BUSN data quality is better in the Los Angeles and Houston cases than in the other two cases (there are far fewer grid cells marked with the red color in Figs. 12b–13b compared with Figs. 14b and 15b); (2) topographic features can affect the model performance because they can have a significant impact on the design and construction cost of belowground drainage elements. Notably, the slope of streets can pose significant limitations on the construction of BUSNs. We recall that urban design manuals provide permissible slope ranges for BUSNs to maintain the pipes' flow velocity within a certain range. Therefore, urban areas in hilly terrain that can have higher street slope variability may require BUSN pipes to be placed deeper into the ground as well as nonuniform cover depths (the distance between the top of a BUSN pipe and the street surface). These requirements can lead to significantly higher construction costs because more excavation needs to be carried out.

In this study, we do not account for the construction limitations and difficulties that arise in BUSNs in hilly terrain; therefore, we expect poorer performance in such areas. We quantify the slope variability of urban areas based on the cumulative percent (CP) graph for slope, as shown in Fig. 16a. For this purpose, as demonstrated in Fig. 16b, we divide the slopes in the CP graph into four categories, namely steep slopes with a 0 %–5 % exceedance probability, moderate slopes with a 5 %–33 % exceedance probability, mild slopes with a 33 %–66 % exceedance probability, and very mild slopes with a 95 %–100 % exceedance probability. Subsequently, we determine the slope variability by computing the gradient of the average slope values in each category (i.e., average slopes of the dashed lines in Fig. 16b) as follows:

(4) γ j i = 1 Δ E j i log S j S i 100 ,

where γji is the gradient from category j to i, and ΔEji is the difference between the exceedance probability percentages of consecutive categories (e.g., ΔE33,5=33-5=28). Moreover, Si is the average of all of the slopes within category i. The reason that we use a logarithm for computing the difference between consecutive average slopes is that the y axis of the cumulative percent graph (Fig. 16a) is on a log-scale. Higher gradients correspond to less slope variability.

Table 6 summarizes the algorithm's performance for the four cases, in terms of ω values. Although the four selected urban areas have different characteristics, the derived BUSNs generally match the real BUSNs with acceptable accuracy (i.e., with ω values above 60 % in all of them). Note that the satisfactory validation of the derived BUSN from the algorithm verifies the analogy assumption we make earlier (in Section 2.2.1), i.e., the analogy between the BUSN and the street network's topology. As is evident from Table 6, the average coverage percentage and slope variability follow the same trend (i.e., the highest coverage percentage corresponds to the lowest slope variability).

Table 6Summary of the model performance for the four cases.

Download Print Version | Download XLSX

4 Summary and discussions

This study presents a novel algorithm for estimating belowground urban stormwater networks based on graph theory concepts and publicly available information. Most of the procedure is automatic, except for one empirical parameter that is specified by the user. Inputs of the algorithm are mostly land surface data, such as street network, topography, land use/land cover, and building footprints, that are readily available to the public and cover at least the whole US. We successfully validated the topology of the derived BUSNs for four US cities on both the west and east coasts, with the average coverage percentage varying from 59 % to 76 %.

Although we developed our proposed framework based on publicly available datasets and design manuals in the US, it is flexible and can be adapted to other regions with different design criteria and data availability. Moreover, despite relying only on publicly available datasets, which are not the most accurate available datasets, the model showed satisfactory performance.

There are a few directions to further improve the algorithm, including but not limited to the following:

  1. The quality and availability of input data for the algorithm can be further enhanced at the regional or larger scales (e.g., the street network data).

  2. The DACI threshold for deriving BUSNs is an empirical, user-specified parameter in this study. Estimating it a priori based on the hydroclimatic conditions for any urban watershed can be achieved via a rigorous hydraulic analysis involving estimating peak runoff and adequately detailed BUSN hydraulic modeling.

  3. We may generalize this DACI threshold parameter at the regional or larger scales based on the regional hydroclimate conditions (e.g., intensity, duration, and frequency of extreme rainfall and peak runoff). However, these improvements are beyond the scope of this study and are left for future work.

  4. We may further expand our algorithm to account for drainage catchments in urban areas and break down the derived BUSNs into several subnetworks that follow the catchments.

  5. The BUSN algorithm in this study is designed for separate sewer systems. Considering that combined sewer systems have different design criteria, the applicability of the algorithm for such systems requires further research.

Ultimately, our proposed algorithm for estimating BUSNs is a valuable tool to support the parameterization of large-scale urban hydrologic modeling, particularly in the areas where BUSN data are not available. It may also provide decision support in regional-scale urban planning from the angle of stormwater and flood management.

Code and data availability

The source code and the data generated from this study are available from the corresponding author upon reasonable request.

Author contributions

TC and HYL conceived the idea. TC designed and implemented the algorithm, and performed the analyses with inputs from HYL. Both authors contributed to writing the paper.

Competing interests

Hong-Yi Li acknowledges his financial interest in Pythias Analytics regarding the support from the Alfred P. Sloan Foundation.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Taher Chegini was supported by the University of Houston's internal funds and the Alfred P. Sloan Foundation via the Houston Advanced Research Center (grant no. UH0421). Hong-Yi Li was also supported by the US Department of Energy Office of Science Biological and Environmental Research as part of the Earth System Model Development program area through the collaborative, multi-program Integrated Coastal Modeling (ICoM) project.

Financial support

This research has been supported by the Alfred P. Sloan Foundation (grant no. UH0421).

Review statement

This paper was edited by Fuqiang Tian and reviewed by three anonymous referees.


Ajaaj, A. A., Mishra, A. K., and Khan, A. A.: Urban and peri-urban precipitation and air temperature trends in mega cities of the world using multiple trend analysis methods, Theor. Appl. Climatol., 132, 403–418,, 2017. a

Boeing, G.: OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks, Comput. Environ. Urban, 65, 126–139,, 2017. a

Brandes, U.: A faster algorithm for betweenness centrality, J. Math. Sociol., 25, 163–177,, 2001. a

Brown, S. A., Schall, J. D., Morris, J. L., Doherty, C. L., Stein, S. M., and Warner, J. C.: Urban Drainage Design Manual, Hydraulic Engineering Circular 22, Third Edition, Federal Highway Administration Press, Publication No. FHWA-NHI-10-009, 2013. a, b, c, d, e

Chegini, T., Li, H.-Y., and Leung, L. R.: HyRiver: Hydroclimate Data Retriever, Journal of Open Source Software, 6, 3175,, 2021. a, b, c

Chow, V. T.: Open-channel Hydraulics, Civil Engineering Series, McGraw-Hill, (last access: 19 August 2022), 1959. a

City of Houston: The Public Works GIS (GIMS), Houston Public Works [data set], (last access: 17 February 2022), 2021. a

Cuo, L., Lettenmaier, D. P., Mattheussen, B. V., Storck, P., and Wiley, M.: Hydrologic prediction for urban watersheds with the Distributed Hydrology-Soil-Vegetation Model, Hydrol. Process., 22, 4205–4213,, 2008. a

Dewitz, J. and U.S. Geological Survey: National Land Cover Database (NLCD) 2019 Products (ver. 2.0, June 2021),, 2021. a, b

EPA: Office of Wastewater Management, Water Permits Division: Municipal Separate Storm Sewer System Permits: Compendium of Clean Specific and Measurable Permitting Examples, National Service Center for Environmental Publications (NSCEP), Publication No. 830S16002, 2018. a

Fraga, I., Cea, L., Puertas, J., Suárez, J., Jiménez, V., and Jácome, A.: Global Sensitivity and GLUE-Based Uncertainty Analysis of a 2D-1D Dual Urban Drainage Model, J. Hydrol. Eng., 21, 04016004,, 2016. a

Freeman, L. C.: A Set of Measures of Centrality Based on Betweenness, Sociometry, 40, 35,, 1977. a

Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., and Moore, R.: Google Earth Engine: Planetary-scale geospatial analysis for everyone, Remote Sens. Environ., 202, 18–27,, 2017. a

Hagberg, A. A., Schult, D. A., and Swart, P. J.: Exploring Network Structure, Dynamics, and Function using NetworkX, in: Proceedings of the 7th Python in Science Conference, edited by: Varoquaux, G., Vaught, T., and Millman, J., 11–15, Pasadena, CA USA, 19–24 August 2008. a

Heilman, J.: Hydraulics Manual, Design Office, Engineering and Regional Operations Division, Washington State Department of Transportation,, last access: 19 August 2022. a, b

Hettiarachchi, S., Wasko, C., and Sharma, A.: Increase in flood risk resulting from climate change in a developed urban watershed – the role of storm temporal patterns, Hydrol. Earth Syst. Sci., 22, 2041–2056,, 2018. a

Huang, D., Liu, X., Jiang, S., Wang, H., Wang, J., and Zhang, Y.: Current state and future perspectives of sewer networks in urban China, Front. Env. Sci. Eng., 12, 2,, 2018. a

Jenks, G.: Optimal Data Classification For Choropleth Maps, Occasional paper, University of Kansas, (last access: 19 August 2022), 1977. a

Kim, S. E., Seo, Y., Hwang, J., Yoon, H., and Lee, J.: Connectivity-informed drainage network generation using deep convolution generative adversarial networks, Sci. Rep., 11, 1519,, 2021. a

Kirkley, A., Barbosa, H., Barthelemy, M., and Ghoshal, G.: From the betweenness centrality in street networks to structural invariants in random planar graphs, Nat. Commun., 9, 2501,, 2018. a, b, c

Los Angeles GeoHub:, GeoHub [data set], last access: 13 March 2022. a, b

Mannina, G. and Viviani, G.: Separate and combined sewer systems: a long-term modelling approach, Water Sci. Technol., 60, 555–565,, 2009. a

Meyers, S. D., Landry, S., Beck, M. W., and Luther, M. E.: Using logistic regression to model the risk of sewer overflows triggered by compound flooding with application to sea level rise, Urban Climate, 35, 100752,, 2021. a

Microsoft: Computer generated building footprints for the United States, GitHub, (last access: 17 February 2022), 2018. a, b

Nanía, L. S., León, A. S., and García, M. H.: Hydrologic-Hydraulic Model for Simulating Dual Drainage and Flooding in Urban Areas: Application to a Catchment in the Metropolitan Area of Chicago, J. Hydrol. Eng., 20, 04014071,, 2015. a

Naves, J., Anta, J., Puertas, J., Regueiro-Picallo, M., and Suárez, J.: Using a 2D shallow water model to assess Large-Scale Particle Image Velocimetry (LSPIV) and Structure from Motion (SfM) techniques in a street-scale urban drainage physical model, J. Hydrol., 575, 54–65,, 2019. a

Ntelekos, A. A., Oppenheimer, M., Smith, J. A., and Miller, A. J.: Urbanization, climate change and flood policy in the United States, Climatic Change, 103, 597–616,, 2010. a

OpenStreetMap: OpenStreetMap Wiki, Key:highway, (last access: 20 April 2022), 2021. a, b, c, d, e

Pang, X., Gu, Y., Launiainen, S., and Guan, M.: Urban hydrological responses to climate change and urbanization in cold climates, Sci. Total Environ., 817, 153066,, 2022. a

Qian, Y., Chakraborty, T. C., Li, J., Li, D., He, C., Sarangi, C., Chen, F., Yang, X., and Leung, L. R.: Urbanization Impact on Regional Climate and Extreme Weather: Current Understanding, Uncertainties, and Future Research Directions, Adv. Atmos. Sci., 39, 819–860,, 2022. a, b

Rafee, S. A. A., Uvo, C. B., Martins, J. A., Domingues, L. M., Rudke, A. P., Fujita, T., and Freitas, E. D.: Large-Scale Hydrological Modelling of the Upper Paraná River Basin, Water, 11, 882,, 2019. a

Rosenberger, L., Leandro, J., Pauleit, S., and Erlwein, S.: Sustainable stormwater management under the impact of climate change and urban densification, J. Hydrol., 596, 126137,, 2021. a

Rossman, L. A. and Simon, M.: Storm Water Management Model User's Manual Version 5.2, National Risk Management Research Laboratory, Office of Research and Development, 2022. a

Schreider, S. Y., Smith, D. I., and Jakeman, A. J.: Climate Change Impacts on Urban Flooding, Clim. Change, 47, 91–115,, 2000. a

Seo, Y. and Schmidt, A. R.: Application of Gibbs' model to urban drainage networks: a case study in southwestern Chicago, USA, Hydrol. Process., 28, 1148–1158,, 2014. a

Staudt, C. L., Sazonovs, A., and Meyerhenke, H.: NetworKit: A Tool Suite for Large-scale Complex Network Analysis, arXiv [preprint],, 2015. a

Strogatz, S. H.: Exploring complex networks, Nature, 410, 268–276,, 2001. a

Teuling, A. J., de Badts, E. A. G., Jansen, F. A., Fuchs, R., Buitink, J., Hoek van Dijke, A. J., and Sterling, S. M.: Climate change, reforestation/afforestation, and urbanization impacts on evapotranspiration and streamflow in Europe, Hydrol. Earth Syst. Sci., 23, 3631–3652,, 2019. a

Thomason, C. P.: Hydraulic Design Manual, Texas Department of Transportation, 2019. a

Town of Gilbert, AZ:, last access: 12 June 2022. a

UDFC: Urban Storm Drainage Criteria Manual: Volume 1, Management, Hydrology, and Hydraulics, (last access: 19 August 2022), Urban Drainage and Flood Control District (UDFCD), 2018. a, b, c

United Nations Department of Economic and Social Affairs: World Urbanization Prospects 2018: Highlights, United Nations, ISBN 9789210043137, 2019. a

U.S. Geological Survey: 1/3rd arc-second Digital Elevation Models (DEMs) – USGS National Map 3DEP Downloadable Data Collection, USGS [data set], (last access: 17 February 2022), 2017. a, b

Yang, G., Bowling, L. C., Cherkauer, K. A., and Pijanowski, B. C.: The impact of urban development on hydrologic regime from catchment to basin scales, Landscape Urban Plan., 103, 237–247,, 2011.  a, b

Yang, L., Smith, J. A., Wright, D. B., Baeck, M. L., Villarini, G., Tian, F., and Hu, H.: Urbanization and Climate Change: An Examination of Nonstationarities in Urban Flooding, J. Hydrometeorol., 14, 1791–1809,, 2013. a

© OpenStreetMap contributors:, last access: 3 April 2022. a

Short summary
Belowground urban stormwater networks (BUSNs) play a critical and irreplaceable role in preventing or mitigating urban floods. However, they are often not available for urban flood modeling at regional or larger scales. We develop a novel algorithm to estimate existing BUSNs using ubiquitously available aboveground data at large scales based on graph theory. The algorithm has been validated in different urban areas; thus, it is well transferable.