Runoff sensitivity to spatial rainfall variability: A hydrological modeling study with dense rain gauge observations

Precipitation is a key input to hydrological models. While rain gauges provide the most direct precipitation measurements, their accuracy in capturing rain patterns depends strongly on the spatial variability of rainfall events and the gauge network density. In this study, we employ a high-resolution meteorological station network (mean station distance of 1.4 km), the WegenerNet in southeastern Austria, to investigate the impact of station density and interpolation schemes on runoff simulations. We first simulate runoff during heavy precipitation (three short-duration and three long-duration events) using a physically based hydrological model with precipitation input obtained from a full network of 158 stations. The same simulations are then repeated with precipitation inputs from subnetworks of 5, 8, 16, 32, and 64 stations, using three different interpolation schemes: Inverse Distance Weighting with a weighting power of 2 and of 3, respectively, and Thiessen polygon interpolation. We find that the performance of runoff simulations is greatly influenced by the spatial variability of precipitation input, especially for short-duration rainfall events and in small catchments. For long-duration events, reliable runoff simulations in the study area can be obtained with a subnetwork of 16 or more well-distributed gauges (mean station distance of about 6 km). We find a clear effect of interpolation schemes on runoff modeling as well, but only for low-density gauge networks. The sensitivity to the precipitation input is smaller for long-duration heavy precipitation events and bigger catchments. As a next step, we suggest studying an ensemble of precipitation datasets in combination with runoff modeling in order to decompose the effects of precipitation measurement uncertainties and spatial variability.

they flow into the Raab river. The Haselbach (12 km²) is taken as a representative small subcatchment and the Grazbach (54 km²) as a larger one for our analysis. Both can be seen as typical subcatchments in our study area.
The total study area is moderately hilly, with elevations from 230 m to 530 m, and is located in the southern Alpine foreland.
Land use is dominated by agricultural areas and patchy forests. The dominant soil type is sandy loam. The mean annual precipitation is around 850 mm and the mean annual temperature about 9.5 °C. The study area was chosen because of its vulnerability to heavy/convective precipitation events and climate change (Hohmann et al., 2018). The region is equipped with a very dense climate station network, the WEGN, which was built up by the Wegener Center for Climate and Global Change, University of Graz, Austria (Kirchengast et al., 2014). The WEGN has measured precipitation, temperature, humidity, and other variables since the beginning of 2007 with 150 stations (about one per 2 km², 5 min sampling) in an area of 22 km x 16 km. All data are quality controlled by the WEGN QC system (Kirchengast et al., 2014), and an additional bias correction is applied to the precipitation data, using the approach by O et al. (2018).

the Austrian Hydrographic Service (AHYD) with 1 to 15 minutes time resolution to properly simulate runoff (Table 2). To run the model only for the focus area, gauging station Takern II/Raab is used as an inflow. Runoff data from gauging station

Table 2. Catchment attributes and hydrometeorological data used for the hydrological modeling with WaSiM, with the following sources: HYDROBOD: homogeneous soil and land use grids by Klebinder et al. (2017); LStmk/LBgld: state government offices of the States of Styria/Burgenland; TANALYS: preprocessing tool of the hydrological model WaSiM; WEGN: highly dense station network data version 7.1 (Fuchsberger et al., 2019); ZAMG: data from the Austrian Weather Service; AHYD: data from the Austrian Hydrographic Service.

sensitivity study with a low flow focus.

Catchment attributes
In this study, we used WaSiM version Richards-10.02.03. All WaSiM modules used are shown in Fig. 2. For more information about the modules, see Schulla (1997) or the WaSiM user guide by Schulla (2019). The model is set up with a spatial resolution of 100 m x 100 m and a temporal resolution of 30 min. WaSiM internally interpolates the meteorological station data to grids. The evapotranspiration is calculated after Penman-Monteith (Monteith, 1965).

The final groundwater parameters of the 2D groundwater module were fitted to represent the baseflow well during the calibration period. Accordingly, the saturated horizontal conductivity is split into areas around the river with 5 · 10⁻⁵ m s⁻¹ and surrounding hilly areas with 1 · 10⁻⁶ m s⁻¹. The colmation factor is set to 1 · 10⁻⁵ and the storage coefficient to 0.2 m³ m⁻³.
Besides the gridded groundwater parameters, WaSiM is calibrated with four parameters of the soil module that influence the shape and volume of the simulated runoff hydrograph and for which no measured or literature data are available (Schulla, 2019): the storage coefficient of surface runoff kd (shape of the surface runoff hydrograph), the storage coefficient of interflow ki (shape of the interflow hydrograph), the drainage density for interflow dr, and a recession constant of the soil krec in the soil

(2019)). Focus of this study is to evaluate runoff output data (box marked in dark blue) simulated by WaSiM with various precipitation data resolutions at input (blue box) and using different interpolation schemes (violet box).

Experimental design
Our study design is visualized in Fig. 3. We are analyzing simulated runoff in different catchments and subcatchments (Sect.

Selection of precipitation station network densities
To obtain precipitation input data at various spatial resolutions, we define six precipitation subnetworks consisting of different numbers of rain gauges ranging from 5 to 158 (Table 3). For instance, the lowest-density network (5-Stations) is defined using ZAMG stations only, with a mean station distance of 11 km. This could be a normal setup for operational use of hydrological

Selection of precipitation events
We selected heavy precipitation events among the top 10 % heaviest rainfall days during summer (May to September) within the 10-year period from 2007 to 2016 (O and Foelsche, 2019). Three small-scale short-duration and three large-scale long-duration events were selected through visual inspection of the WEGN and INCA data over the study area (Fig. 4). Table 4 lists the three heaviest short-duration precipitation events as well as the three heaviest long-duration events. 2009 was the year with the heaviest events in our study period. The heaviest short-duration event (short-1) occurred on 10-Aug-2009, with 34 mm precipitation and a peak runoff at station Neumarkt/Raab of 107 m³ s⁻¹, an HQ1 event. The biggest event, the long-1 event, measured from 22-Jun-2009 until 24-Jun-2009 with 121 mm precipitation, led to a peak runoff of 244 m³ s⁻¹ at Neumarkt/Raab. This "long-1

Table 3. Precipitation station subnetwork cases with the total number of stations per subnetwork, together with the specific station data source (Z: ZAMG, A: AHYD, W: WEGN) and the estimated mean station distance, the latter calculated with an ArcGIS tool.

Gauge Subnetwork
percentile range among the stations (gray shaded), while "ZAMG" shows mean precipitation (red line) obtained from the 3 ZAMG stations with a min-max range across the stations (yellow shaded). The map sequence (four right panels) shows the evolution of the precipitation event as captured by the gridded INCA analysis over the WEGN network (red box) and the larger Raab catchment region (black box).

Spatial interpolation schemes
Several interpolation methods are implemented in WaSiM, e.g. TP, IDW, Elevation Dependent Regression, as well as different combinations (Schulla, 1997). In this paper, we test two different IDW setups and the TP, which are widely used interpolation methods in hydrological studies (Goovaerts, 2000; Ly et al., 2013; Szcześniak and Piniewski, 2015). We decided not to include height information for precipitation map creation, because the elevation differences in the area are fairly small (no more than about 300 m).

IDW is the sum of all contributing station data with specific weights (Schulla, 1997). It is calculated with the following equations (1) and (2):

\hat{z}(u) = \sum_{j} w_j \, z(u_j) \quad (1)

w_j = \frac{1}{d(u, u_j)^{p}} \cdot \frac{1}{C}, \qquad C = \sum_{j} \frac{1}{d(u, u_j)^{p}} \quad (2)

where \hat{z}(u) is the interpolated value at location u, z(u_j) the value measured at station j, w_j the weight of station j, d(u, u_j) the distance between u and station j, and p the weighting power. In our study we use the standard weighting power p of 2 (IDW2) and, for comparison, also a weighting power p of 3 (IDW3). In WaSiM, all stations within a specified radius are used for the interpolation. Only one search radius can be selected, which is then applied to all stations. In our study, we formally set the search radius to 50 km so that the surrounding weather stations are also included in subregions with larger station distances, which is necessary to get robust coverage of the total catchment area.

With the TP interpolation scheme, the precipitation data of the nearest station are always taken. Each grid cell of the model thus receives the data of its nearest station, and the boundaries of the resulting polygons (Thiessen polygons) are lines of equal distance between two stations (Schulla, 1997). Hence, TP is a simpler method than IDW, but it is still widely used in hydrological modeling (Zeng et al., 2018; Meselhe et al., 2009; Kobold and Brilly, 2006).
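To make the difference between the two schemes concrete, the following minimal Python sketch interpolates a single grid cell with IDW and with TP. It is an illustration under stated assumptions, not the WaSiM implementation: the function names `idw` and `thiessen`, the planar coordinates in metres, and the handling of a cell that coincides with a station are all hypothetical choices made here.

```python
import math


def idw(cell, stations, values, power=2.0, radius=50_000.0):
    """Inverse distance weighting estimate at one grid cell.

    cell:     (x, y) of the grid cell in metres (assumed planar coordinates)
    stations: list of (x, y) station coordinates in metres
    values:   precipitation measured at each station
    power:    weighting power p (2 for IDW2, 3 for IDW3)
    radius:   search radius in metres; stations farther away are ignored
    """
    numerator = 0.0
    denominator = 0.0
    for (sx, sy), value in zip(stations, values):
        distance = math.hypot(cell[0] - sx, cell[1] - sy)
        if distance == 0.0:
            # Assumption: a cell located exactly at a station takes its value.
            return value
        if distance > radius:
            continue
        weight = distance ** -power
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no station within the search radius")
    # Dividing by the sum of weights normalizes them so that they sum to 1.
    return numerator / denominator


def thiessen(cell, stations, values):
    """Thiessen polygon estimate: the value of the nearest station."""
    nearest = min(
        range(len(stations)),
        key=lambda i: math.hypot(cell[0] - stations[i][0], cell[1] - stations[i][1]),
    )
    return values[nearest]
```

With two stations 3 km apart and a cell 1 km from the first, a higher weighting power pulls the IDW estimate closer to the nearer station, while TP simply returns the nearer station's value.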

Runoff analysis approach
In our study, we analyze the event-specific time series of runoff and peak flow deviation. Time series are visualized for all events individually, but combined with different station network densities and interpolation schemes. For each catchment, interpolation method, and event, the peak flow deviation in percent is calculated individually. For this purpose, the maximum runoff value is calculated for the simulation results of every subnetwork case (MAX value) and compared to the maximum runoff value of the full-network reference case (MAX_Ref158Stations), which best captures the "true" spatial variability of precipitation in the study area. This deviation metric is hence computed as follows:

\text{Peak flow deviation} = \frac{\mathrm{MAX} - \mathrm{MAX}_{\mathrm{Ref158Stations}}}{\mathrm{MAX}_{\mathrm{Ref158Stations}}} \cdot 100\,\%

4 Results

Results for individual example events
In this section we focus on individual precipitation events. Figure 5 shows

These are examples of one short- and one long-duration event, for three catchments, but they do not cover the total range of setups and results. Therefore, combined figures are shown in Sect. 4.2.

In contrast, for the short-duration events and the comparatively small subcatchments, the station density is evidently much

https://doi.org/10.5194/hess-2020-453 Preprint. Discussion started: 29 September 2020. © Author(s) 2020. CC BY 4.0 License.
In Fig. 8 we visualize the summarized results for the peak flow deviations as a function of all station subnetworks. This summary view clearly highlights that the uncertainty in runoff simulations due to interpolation schemes and gauge network density is much greater for short-duration convective precipitation events. We further find that the direction of biases (overestimation vs. underestimation) is affected primarily by the gauge network density rather than the interpolation scheme. For long-duration heavy precipitation events, we find faster decreases in biases with an increasing number of gauges in the network. Sixteen stations in our study area (around 500 km²) yield satisfactory performance, with biases lower than 10 % for all subnetworks of at least this station number and all interpolation cases. Note that our subnetworks represent a quite regularly distributed gauge configuration; uncertainty in the runoff simulations can therefore be somewhat greater for more irregular gauge configurations.

Polygons, respectively, for the three short events (left), the three long events (middle), and the mean each over the short and long events (right).
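The peak flow deviation summarized in these results compares the maximum simulated runoff of a subnetwork case with that of the 158-station reference case. A minimal Python sketch is given below. The function name `peak_flow_deviation` is hypothetical, and the sign convention (positive values for overestimation relative to the reference) is an assumption made for illustration.

```python
def peak_flow_deviation(subnetwork_runoff, reference_runoff):
    """Peak flow deviation in percent for one catchment, scheme, and event.

    subnetwork_runoff: simulated runoff time series of a subnetwork case
    reference_runoff:  simulated runoff time series of the full-network case
    Assumed sign convention: positive means the subnetwork case overestimates
    the reference peak, negative means it underestimates it.
    """
    max_value = max(subnetwork_runoff)   # MAX value of the subnetwork case
    max_reference = max(reference_runoff)  # MAX of the reference case
    if max_reference == 0:
        raise ValueError("reference peak flow is zero; deviation is undefined")
    return 100.0 * (max_value - max_reference) / max_reference
```

Because only the two maxima enter the metric, a shift in the timing of the peak between the two simulations does not affect the result.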

Discussion
Here we discuss the diversity of results across station densities, interpolation schemes, (sub)catchments, and individual events in more detail and in synthesis. The mean over all catchments of the long-duration events shows a "sufficiency threshold" at the 16-Stations subnetwork, with only little runoff change (< 6 %) for more stations. This equals a mean station distance of around 6 km, or around 16 stations per 1000 km². Beyond this station density, no strong further improvement of the simulated runoff can be observed on average over all catchments for long-duration events. In contrast, the mean over all catchments of the short-duration events only shows a "sufficiency threshold" at the 64-Stations subnetwork, with only little runoff change (< 6 %) for more stations. Here a mean station distance of around 2.5 km, or around 64 stations per 1000 km², is needed.
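One way to read the "sufficiency threshold" described above is as the smallest subnetwork from which the runoff change stays below a tolerance for that subnetwork and all denser ones. The Python sketch below implements this reading. It is an interpretation offered for illustration only: the function name `sufficiency_threshold` and the input layout (a mapping from station count to mean runoff change in percent relative to the full network) are hypothetical, and the 6 % tolerance is simply the value quoted in the text.

```python
def sufficiency_threshold(mean_change_by_count, tolerance=6.0):
    """Smallest station count from which runoff change stays below tolerance.

    mean_change_by_count: dict mapping the number of stations in a subnetwork
        to the mean runoff change in percent relative to the full network
        (hypothetical input layout)
    tolerance: threshold in percent (6 % as quoted in the text)
    Returns None if no subnetwork satisfies the criterion.
    """
    counts = sorted(mean_change_by_count)
    for index, count in enumerate(counts):
        # Require this subnetwork and every denser one to stay below tolerance.
        denser = counts[index:]
        if all(abs(mean_change_by_count[n]) < tolerance for n in denser):
            return count
    return None
```

Requiring all denser subnetworks to pass as well guards against a single sparse network that happens to match the reference peak by chance.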
Such thresholds are in line with the literature, where no further performance gains are seen beyond specific station densities (e.g. Bárdossy and Das, 2008; Dong et al., 2005; Lopez et al., 2015; Xu et al., 2013). For example, Lopez et al. (2015) mention an increase of performance with a denser station network up to 24 gauges per 1000 km², but no improvement after that for up to 40 stations in the Thur basin (basin area around 1700 km²). Xu et al. (2013) found, for a large-scale catchment, that the performance leveled off after 93 stations (1 rain gauge per 1000 km²), which was about 50 % of the 181 available stations in the catchment of the Xiangjiang River (94 660 km²). They also noted that below 38 stations (0.4 rain gauges per 1000 km²) the model performance was poor. Dong et al. (2005) found a critical number of 5 out of 24 precipitation stations in their catchment, the Qingjiang River (12 209 km²).
In contrast, in our high-resolution case, the individual, fairly small catchments do not show such an expected threshold, especially for short-duration events. However, the long-duration events also do not show a salient threshold in all catchments or events. Given its density of 1 station per 2 km², the station network, in particular in the WEGN area, is much denser than any of the other networks studied. These studies used station densities such as about 2 stations per 1000 km² (Xu et al., 2013) or up to 12 stations per 1000 km² (Lopez et al., 2015). Our study hence detects and highlights the strong catchment and event dependence of the precipitation densities, especially for short-duration events at 1-km-scale spatial resolutions. Therefore, for proper modeling of the runoff from heavy convective precipitation events, a highly dense station network is very important.
This was also seen to some degree by St-Hilaire et al. (2003), again a study that addressed larger scales, where areas with high precipitation were better defined by denser networks for the long term (total annual precipitation) and short term (summer convective events).
For our three short-duration events, the total precipitation amount of 31 mm to 34 mm is very similar, but it leads to different simulated runoff curves. The short-1 event has a maximum peak flow of 107 m³ s⁻¹, while the short-2 and short-3 events are similar to each other but smaller, with 27 m³ s⁻¹ and 26 m³ s⁻¹. Overall, a given precipitation amount might lead to different runoff curves, depending on the location of gauges and storm core (O and Foelsche, 2019). From the runoff modeling point of view, it also depends on the specific station locations and the measured precipitation amount at specific stations. This becomes clear from the large over- and underestimations of peak flow, depending on the station density (Fig. 7). The three long-duration events show different runoff peaks of 244/55/18 m³ s⁻¹ and total precipitation amounts of 121/62/50 mm, respectively. The peak flow deviations are very similar for the long-1 and long-2 events. The most stratiform event with the smallest hourly peak flows (long-3 event) shows a different picture, even though its total precipitation is similar to that of the long-2 event. In summary, we can learn from this that in small catchments, for short- and long-duration heavy precipitation events, the amount of peak runoff and of total precipitation are not directly related to the level of observed peak flow. The latter are driven by the specific event characteristics in a more complex manner.
We emphasize that the explicit study of the hydrological response to different precipitation events is crucial. Many earlier studies have evaluated the "accuracy" of (remote-sensing) gridded rainfall event data through direct comparison with ground gauge measurements (e.g. O et al., 2017; Kirstetter et al., 2012; Lamptey, 2008). This study highlights that it is also important to evaluate the performance of precipitation datasets with various resolutions in terms of hydrological runoff response.

et al. (2017) and are available on request from these authors. Geoinformation data are from state government offices of the States of Styria and Burgenland and are available from the respective GIS services (www.gis.steiermark.at; https://geodaten.bgld.gv.at).
Author contributions. All authors designed the study, with primary contributions by CH and GK. CH collected the data, performed the modeling and most of the analysis, created the figures, and wrote the first draft of the manuscript. GK provided guidance and advice on all aspects of the study, and significantly contributed to the figure design and the text. SO contributed to the data collection, precipitation data analysis, and figure creation. WR supported the model calibration and validation. All authors helped to shape the research and analysis and provided critical feedback and contributions to the text until submission and during review.
Competing interests. The authors declare that they have no conflict of interest.

ment, and V. Hess (DK Climate Change, Univ. of Graz) for several fruitful discussions and support during the study work. Furthermore, we acknowledge the data providers at the Austrian Weather Service (ZAMG), the Austrian Hydrographic Service (AHYD), the state government offices of Styria and Burgenland, and the Wegener Center regarding the WegenerNet data. WegenerNet funding is provided by the Austrian Ministry for Science and Research, the University of Graz, the state of Styria (which also included European Union regional development funds), and the city of Graz; detailed information can be found online (www.wegcenter.at/wegenernet, last access: 29 August 2020).