Application of logistic regression to simulate the influence of rainfall genesis on storm overflow operations: a probabilistic approach

One of the key parameters constituting the basis for the operational assessment of stormwater systems is the annual number of storm overflows. Since uncontrolled overflows are a source of pollution washed away from the surface of the catchment area, which leads to imbalanced receiving waters, there is a need for their prognosis and potential reduction. The paper presents a probabilistic model for simulating the annual number of storm overflows. In this model, an innovative solution is to use the logistic regression method to analyze the impact of rainfall genesis on the functioning of a storm overflow (OV) in the example of a catchment located in the city of Kielce (central Poland). The developed model consists of two independent elements. The first element of the model is a synthetic precipitation generator, in which the simulation of rainfall takes into account its genesis resulting from various processes and phenomena occurring in the troposphere. This approach makes it possible to account for the stochastic nature of rainfall in relation to the annual number of events. The second element is the model of logistic regression, which can be used to model the storm overflow resulting from the occurrence of a single rainfall event. The paper confirmed that storm overflow can be modeled based on data on the total rainfall and its duration. An alternative approach was also proposed, providing the possibility of predicting storm overflow only based on the average rainfall intensity. Substantial simplification in the simulation of the phenomenon under study was achieved compared with the works published in this area to date. It is worth noting that the coefficients determined in the logit models have a physical interpretation, and the universal character of these models facilitates their easy adaptation to other examined catchment areas. 
The calculations made in the paper using the example of the examined catchment allowed for an assessment of the influence of rainfall characteristics (depth, intensity, and duration) of different genesis on the probability of storm overflow. Based on the obtained results, the range of the variability of the average rainfall intensity, which determines the storm overflow, and the annual number of overflows resulting from the occurrence of rain of different genesis were defined. The results are suited for the implementation in the assessment of storm overflows only based on the genetic type of rainfall. The results may be used to develop warning systems in which information about the predicted rainfall genesis is an element of the assessment of the rainwater system and its facilities. This approach is an original solution that has not yet been considered by other researchers. On the other hand, it represents an important simplification and an opportunity to reduce the amount of data to be measured.


Introduction
One of the important criteria for assessing the operation of stormwater systems is the annual number of storm overflow discharges, as confirmed by national and foreign guidelines (US EPA, 1995; Zabel et al., 2001; ÖWAV, 2003). The physics of the phenomenon is complex and depends on the dynamics of rainfall, its changes over time and the characteristics of urban catchment areas with storm overflows. Currently, the annual number of discharges in a catchment can be assessed on the basis of long-term observations of its operation (Price, 2008; Andrés-Doménech, 2010; Gemerith et al., 2011), but this is a costly solution because it requires continuous flow monitoring. An alternative approach is to build a hydrodynamic model of the catchment, which requires detailed data on the basin, a long-term precipitation record (30 years according to DWA A-118) and flow measurements covering at least two years for model calibration (Szeląg et al., 2016). A catchment model built on long-term rainfall measurements makes it possible to perform a so-called continuous simulation, from which the annual number of storm overflow discharges can be estimated.
Such a solution may provide a reliable estimate of the number of discharges, although its technical implementation is complex and the results obtained (numerical simulations of the catchment model) are not always satisfactory (Romanowicz and Beven, 2006; Beven and Binley, 2014).
Considering the above, the article uses probabilistic models to forecast the annual number of discharges; these models take into account the stochastic nature of rainfall and the complex nature of runoff in urban catchment areas. This problem was discussed by Thorndahl and Willems (2008), who used the FORM (First Order Reliability Method) to simulate a storm overflow discharge event; its application in engineering practice is limited by the complexity of its implementation.
In subsequent works dealing with this problem, rainfall generators and models simulating a discharge episode (Grum and Aalderink, 1999) were used to forecast the annual number of discharges. Szeląg et al. (2018) presented a model for simulating a storm overflow discharge in a single rainfall episode, determined with the logistic regression method. A significant disadvantage of the above-mentioned solutions is that the probabilistic models relied on simplified rainfall generators and that the variability of precipitation characteristics was taken into account for only one discharge episode. At the same time, the issue of precipitation genesis was not addressed at all. Therefore, the question arises whether information on the nature of precipitation (e.g. season of the year, precipitation genesis) could find practical use in its modelling. It is puzzling that rainfall generators built to simulate storm overflows have not taken into account the fact that the time course and dynamics of rainfall result from complex movements of air masses (Serrano et al., 2009; Alhammoud et al., 2014; Dayan et al., 2015). The modelling of complex atmospheric phenomena is the subject of many works (Madsen et al., 1995; Paquet et al., 2006; Vincente-Serrano et al., 2009; Garavaglia et al., 2010; Abushandi and Merkel, 2011). These models simulate meteorological conditions that change over time and determine the distribution of temperature, pressure and humidity, which affects the dynamics of air movement and, consequently, the course of precipitation.
According to the literature, information on the genesis of precipitation allows for a preliminary (quantitative and qualitative) assessment of its course and an estimate of the average rainfall intensity (Suligowski, 2004). This information may form the basis for forecasting the operation of a stormwater system and for developing an early warning system against the risk of flash floods. This problem has not been considered so far; at the same time, it seems possible to model the functioning of a stormwater system and its facilities solely on the basis of forecasts and identification of the rainfall genesis.
Taking into account the above considerations, an innovative probabilistic model for forecasting the number of storm overflow discharges is proposed in the paper. The model is composed of two independent elements: a synthetic rainfall generator and a model for forecasting storm overflow discharges. Discharges are identified solely on the basis of the average rainfall intensity. The rainfall generator takes the genesis of rainfall into account, which made it possible to determine curves showing the influence of rainfall genesis on the occurrence of an overflow discharge in a single rainfall episode. On the basis of the analyses performed, the ranges of average rainfall intensity for which a storm overflow discharge may occur in the examined urban catchment were determined for rainfall of different genesis. The calculation experiments carried out in the study made it possible to determine how the distribution of the annual number of rainfall episodes of different genesis influences the variability of the number of storm overflow discharges.

Object of study
The object of analysis was a 62 ha urban catchment located in the south-eastern part of Kielce (Figure 1). The city covers an area of 109 km² and is located in the Świętokrzyskie Voivodeship. The average population density of the catchment is 21.4 people ha⁻¹. Impervious areas, including pavements, roads, parking lots, roofs and school playgrounds, constitute 47.2 %. Pervious areas, i.e. lawns and other green areas, occupy the remaining 52.8 %. On this basis it was established that the weighted average value of the catchment retention is dav = 3.81 mm (Szeląg et al., 2016). The length of the main canal is 1.6 km and its diameter varies between 0.60 and 1.25 m. The maximum elevation difference in the catchment is 12.0 m and the average slope of the catchment is 7.1 %.
Detailed information about the catchment can be found in Dąbkowski et al. (2010). Stormwater from the catchment is drained via channel S1 into the distribution chamber (DC). If the filling of the chamber is less than h = 0.42 m, the stormwater is directed to the sewage treatment plant (STP). If the filling exceeds h, the stormwater is discharged by the storm overflow (OV) into the S2 canal, from which it flows into the Silnica river.

Rainfall genesis, number of rain events
Precipitation series covering all events, regardless of their duration, cannot be considered statistically homogeneous.
Precipitation reflects the different processes taking place in the troposphere. Therefore, it seems justified to divide the whole set of precipitation events and to compare data in groups of identical genesis. Precipitation is shaped by two different mechanisms: convective and stratiform (Houze, 2014). A third mechanism, which may include both of the above components, is related to the orographic lifting of air masses over mountains or hills (Smith, 1993). Thus, rainfall is universally classified into three types (Sumner, 1988): convective, cyclonic and orographic. The main features distinguishing convective air-mass precipitation from frontal precipitation in mid-latitudes are spatial extent and duration. The range of convective precipitation associated with local air circulation is much smaller than that of travelling extratropical cyclones with weather fronts. Convective precipitation induced by single thunderstorm cells, their complexes or squall lines is short-lived but is characterized by high average intensity (Kane et al., 1987) and causes flash floods in many areas (Gaume et al., 2009; Marchi et al., 2010; Bryndal, 2015). On the other hand, the lifespan of the mechanisms generating cyclonic precipitation is much longer than that of convective precipitation, on the order of days rather than hours. The result is long-term rainfall with a high total depth (Frame et al., 2017), often causing regional floods (Barredo, 2007).
The classification of precipitation types by origin proposed by Sumner (1988), developed for the British Isles and Western Europe, cannot be directly applied in practical hydrology in other regions of the continent, especially in its eastern and central parts. This is the result of the exceptional variability of meteorological conditions in the temperate zone with a warm transitional climate, on the borderline between air masses coming from the Atlantic and continental masses from the east (Twardosz and Niedźwiedź, 2001; Niedźwiedź et al., 2009; Twardosz et al., 2011; Łupikasza, 2016). The transformation of air masses over the western part of the continent, the lower velocities of atmospheric fronts caused by increased surface roughness and a deeper friction layer, and the weakening of the dynamics of processes in the frontal zone mean that the associated precipitation in central Europe differs in frequency, intensity and duration from the precipitation classified by Sumner (1988) as cyclonic. In addition, in summer, the low-pressure systems that bring high precipitation depths to central Europe are also genetically associated with the Mediterranean and Black Seas (Łupikasza, 2016). An analysis of maximum rainfall of different durations in Poland carried out at the end of the 1990s (Kupczyk and Suligowski, 1997, 2011), supplemented by an analysis of the synoptic situation (on the basis of surface synoptic charts of Europe published in the Daily Meteorological Bulletin of the Institute of Meteorology and Water Management, IMGW, in Warsaw) and a calendar describing the types of atmospheric circulation together with air masses and fronts (Niedźwiedź, 2019), led to the separation of three genetic types of precipitation: convective in air mass, frontal, and convergence-zone precipitation.
The source material for the study was data from May to October of the years 1961-2000, obtained from the records of a traditional float pluviograph (precipitation depth, duration, mean intensity) installed at the IMGW meteorological station in Kielce. The analyses were restricted to these data because the tipping-bucket SEBA electronic rain gauge, introduced into the state measuring network a few years later, records significantly lower depths (by several percent) for short, high-intensity precipitation than a traditional pluviograph (Kotowski et al., 2011). In the period 1961-2000 there were 1312 precipitation events in Kielce with a depth above 3 mm, which gives an average of 32.8 episodes per year. The highest number of rainfall events (54) was observed in 1974 (Figure 2).
In Kielce, precipitation classified as the first genetic type (convective in air mass) lasts up to 150 minutes. It is caused by single convective cells with intensive ascending and descending currents, or by complexes of cells forming band-shaped systems (squall lines). The average annual number of these precipitation events in Kielce is 14.3 (1961-2000), although at the end of the 1990s their frequency increased significantly (26 episodes in 1996, 30 in 2000) (Figure 2). These events are characterized by a low depth (Figure 3a), which nevertheless increases rapidly with duration (max. 40.5 mm in 137 min).
Because all of these episodes are short (from 5 min to 2.5 h), they have a high intensity (median 0.073 mm min⁻¹; max. 0.587 mm min⁻¹) (Figure 3b). The second type (frontal rainfall) forms a group of precipitation events in Kielce whose duration is highly variable and ranges from 2.5 h to 10.5 h. These are the most frequent precipitation events in Kielce (on average 16 events per year, Figure 2). They are associated with the movement of atmospheric fronts: a fast-moving cold front, together with the dynamic processes in its zone, leads to high-intensity precipitation lasting 2.5-5.5 h (max. 54.8 mm h⁻¹). The processes in the zone affected by a warm front, on the other hand, usually generate higher precipitation depths but, because of the longer duration (5.5-10.5 h), precipitation intensities that are up to two times lower (max. up to 25.8 mm h⁻¹). The transformation of air masses over the western part of the continent, the lower speeds of frontal zones and the weakening of the dynamics of processes in the frontal zone mean that precipitation in Kielce differs in intensity and duration from the precipitation classified by Sumner (1988) as cyclonic. Precipitation associated with convergence zones occurs in Kielce on average 2.3 times a year (Figure 3). It results from the passage of deep centers of low atmospheric pressure or of a series of lows with two clearly marked frontal areas. The high dynamics and magnitude of the processes operating within them produce long-lasting continuous precipitation (> 10.5 h) near the ground surface, with a clearly increasing depth (Figure 3a) and a variable intensity (weakening after the passage of a warm front, increasing on a cold front) that is nevertheless low on average (Figure 3b).

Methodology
Within the conducted analyses, an innovative probabilistic model was proposed for forecasting the number of storm overflow discharges (Figure 4). The model allows the annual number of discharges to be forecast and the number of rainfall events per year to be simulated, taking into account the genesis of rainfall (convective in air mass, frontal and convergence-zone precipitation), which is typical for countries located in central Europe and other regions of the world. Although the paper focuses on the genetic classification of rainfall developed by Kupczyk and Suligowski (1997, 2011), the proposed approach is universal.
The division of rainfall data may be based on the local conditions that determine the movement of air masses, which has a key impact on the dynamics of rainfall events. The duration ranges of particular rainfall groups can then be determined on the basis of meteorological, synoptic and statistical analyses of periods with high precipitation totals or intensities in a given area (Llasat, 2001; Rigo and Llasat, 2004; Millán et al., 2005; Langer and Reimer, 2007; Federico et al., 2008; Lazri et al., 2012; Berg and Haerter, 2013). The classification of precipitation proposed by Sumner (1988) for Western Europe can also be used in such analyses. A literature review (Vendenberge et al., 2008) shows that the seasons of the year have been taken into account in models for forecasting rainfall distributions based on copula functions. Aspects related to the genesis of rainfall have so far not been included in probabilistic models for the analysis of stormwater system operation. The model proposed in the present study consists of three components. The first component is a logit model, which is used to simulate the occurrence of a storm overflow discharge. The second component is a synthetic precipitation generator, implemented in two variants. In the first variant, the genesis of rainfall is the basis for the simulation of rainfall series. In the second variant, precipitation is simulated over the annual cycle regardless of its origin. The third component of the model is a calculation block, in which the number of discharges is simulated for the generated rainfall series. On this basis, cumulative distribution functions (CDF) are determined that describe the probability (Z) of exceeding a given number of storm overflow discharges. The elements of the proposed probabilistic model (separation of rainfall events, creation of a logistic regression model, development of a rainfall generator) are discussed in detail below (Figure 4).

Separation of rain events
One of the basic conditions for building a synthetic precipitation generator is the separation of single independent rain events in the rainfall time series. For this purpose, the DWA A-118 (2006) guidelines were used, in which the basis for precipitation separation is a minimum antecedent dry period of 4 h. A precipitation event is defined in the paper as an episode in which the total rainfall depth is not less than 3.0 mm (Fu and Kapelan, 2013; Fu et al., 2014).
On the basis of the precipitation observed in the period 1961-2000, independent precipitation events were separated and statistical distributions were determined for them. In order to obtain the best possible match between the theoretical and empirical data (precipitation characteristics including Ptot and tr for precipitation of a given genesis), the following statistical distributions were considered (Adams and Papa, 2000; Bacchi et al., 2008; Domenech et al., 2010): Weibull, chi-square, exponential, GEV, Gumbel, gamma, Johnson, log-normal, Pareto and beta. Kolmogorov-Smirnov and chi-square tests were used to assess the conformity of the empirical and theoretical distributions. Empirical distributions were also determined, and theoretical distributions fitted, for the total number of precipitation events in the year and for the numbers of episodes resulting from frontal, convergence-zone (low-pressure) and convective precipitation. For these discrete variables the Poisson, geometric, Bernoulli and binomial distributions were considered, and the Kolmogorov-Smirnov test was used to assess the conformity of the empirical and theoretical distributions.
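The separation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the record format (a time-sorted list of `(time_min, depth_mm)` increments), the function name `separate_events` and the convention of measuring the dry gap between consecutive wet time stamps are all hypothetical; only the 4 h dry period and the 3.0 mm threshold come from the text.

```python
def separate_events(record, min_dry_min=240, min_depth_mm=3.0):
    """Split a rainfall record into independent events.

    record: time-sorted list of (time_min, depth_mm) increments (assumed format).
    Returns a list of (start_min, end_min, total_mm) tuples.
    """
    events = []
    current = None  # [start, end, total] of the event being accumulated
    for t, depth in record:
        if depth <= 0:
            continue  # dry time steps carry no rainfall
        # A dry gap of at least min_dry_min closes the running event.
        if current is not None and t - current[1] >= min_dry_min:
            events.append(tuple(current))
            current = None
        if current is None:
            current = [t, t, depth]
        else:
            current[1] = t
            current[2] += depth
    if current is not None:
        events.append(tuple(current))
    # Keep only episodes with a total depth of at least min_depth_mm.
    return [e for e in events if e[2] >= min_depth_mm]


# Hypothetical record: three wet spells separated by gaps longer than 4 h.
record = [(0, 1.0), (10, 2.5), (300, 0.5), (310, 1.0), (700, 4.0)]
events = separate_events(record)
```

In this made-up record the middle spell (1.5 mm in total) is discarded because it stays below the 3.0 mm threshold, while the other two are kept as separate events.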
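As an illustration of the conformity check, the Kolmogorov-Smirnov statistic can be computed directly from its definition. The sketch below is a simplified stand-in, not the software used in the study: the Weibull parameters and the sample of rainfall depths are hypothetical, and the critical value 1.36/sqrt(n) is the usual large-sample approximation at the 5 % level, which is only indicative when the distribution parameters were estimated from the same data.

```python
import math


def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def ks_critical_5pct(n):
    """Approximate large-sample critical value at the 5 % level."""
    return 1.36 / math.sqrt(n)


# Hypothetical rainfall depths [mm] and hypothetical Weibull parameters.
depths = [3.2, 3.9, 4.4, 5.1, 6.0, 7.3, 8.8, 10.5, 13.9, 21.0]
d = ks_statistic(depths, lambda x: weibull_cdf(x, 1.3, 9.0))
fits = d < ks_critical_5pct(len(depths))
```

The same `ks_statistic` function can be reused for any of the candidate distributions by swapping the CDF passed to it.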

Logistic regression
The logistic regression model, also called the binomial logit model, is typically used to describe binary data and is therefore commonly applied to probability modelling. The logit model is often used to forecast phenomena and processes in medicine, social sciences and psychology (Bagley et al., 2001). It has also been used successfully to model processes in ecology, water engineering, geotechnics (Hayer et al., 2013; Inglemo et al., 2011) and wastewater treatment (Bayo et al., 2006; Szeląg et al., 2019), as well as to simulate objects located in rainwater drainage networks (storm overflows) (Szeląg et al., 2018). The logit model takes the following form:
$$p = \frac{\exp\left(\alpha_0 + \sum_{i=1}^{n} \alpha_i x_i\right)}{1 + \exp\left(\alpha_0 + \sum_{i=1}^{n} \alpha_i x_i\right)} \quad (1)$$
where α0 and αi are empirical coefficients estimated with the Maximum Likelihood Estimation (MLE) method, xi are independent variables, which in this paper include the total rainfall depth (Ptot [mm]), the rainfall duration (tr [min]) and the average intensity of the rainfall event (q = 166.7 Ptot tr⁻¹ [L s⁻¹ ha⁻¹]), and p [-] is the probability of occurrence of the event: storm overflow discharge. The calculations assume that a storm overflow discharge takes place when p is not less than 0.50, which corresponds to the following condition:
$$\alpha_0 + \sum_{i=1}^{n} \alpha_i x_i \geq 0 \quad (2)$$
To evaluate the predictive capacity of the logit model, the following measures of agreement between calculation results and measurements were used: sensitivity, SENS (the share of correctly classified events in the set of events in which a storm overflow discharge occurred), specificity, SPEC (the share of correctly classified events in the set of events in which no storm overflow discharge occurred) and the counting error, Rz² (the share of all events, with and without a discharge, that were identified correctly). These measures are discussed in detail by McFadden (1963).
Two variants of the logit model were considered in the analyses. In the first, the rainfall depth and duration were taken as independent variables, as described in Szeląg et al. (2018). The second variant is a simplification with a single independent variable, the average rainfall intensity. The logit model was determined from measurements of the operation of the investigated storm overflow from the years 2009-2011, when 69 discharges occurred in 188 precipitation events, and from the years 2012-2014, when 42 discharges occurred in 93 precipitation events.
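The single-variable variant and the three fit measures can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the coefficients `ALPHA0` and `ALPHA_Q` and the four events are hypothetical and are not the values estimated in the paper; only the conversion factor 166.7, the 0.50 threshold and the definitions of SENS, SPEC and Rz² follow the text.

```python
import math


def mean_intensity(p_tot_mm, t_r_min):
    """Average intensity q [L s^-1 ha^-1] from depth [mm] and duration [min]."""
    return 166.7 * p_tot_mm / t_r_min


def overflow_probability(x, alpha0, alphas):
    """Logit model: p = exp(z) / (1 + exp(z)), z = alpha0 + sum(alpha_i * x_i)."""
    z = alpha0 + sum(a * xi for a, xi in zip(alphas, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # numerically safe branch for negative z
    return ez / (1.0 + ez)


def classification_measures(observed, predicted):
    """Sensitivity, specificity and share of correctly identified events."""
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    tn = sum(1 for o, p in zip(observed, predicted) if not o and not p)
    n_pos = sum(1 for o in observed if o)
    n_neg = len(observed) - n_pos
    return {
        "SENS": tp / n_pos,
        "SPEC": tn / n_neg,
        "Rz2": (tp + tn) / len(observed),
    }


# HYPOTHETICAL coefficients and events (Ptot [mm], tr [min], overflow observed).
ALPHA0, ALPHA_Q = -4.0, 0.5
events = [(12.0, 60.0, True), (4.0, 300.0, False),
          (20.0, 90.0, True), (5.0, 45.0, False)]

observed = [obs for _, _, obs in events]
predicted = [
    overflow_probability([mean_intensity(p, t)], ALPHA0, [ALPHA_Q]) >= 0.5
    for p, t, _ in events
]
measures = classification_measures(observed, predicted)
```

With these made-up numbers the last event is a false alarm, so the sketch returns a sensitivity of 1.0, a specificity of 0.5 and three of four events identified correctly.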

Synthetic precipitation generator
Simulating continuous series of rainfall is a complex task that requires the implementation of complex numerical algorithms.
For this purpose, multi-dimensional scaling and fractal geometry methods (Rupp et al., 2009; Licznar et al., 2015; Müller-Thomy and Haberlandt, 2015) are currently used in many cases. Alternative approaches are models based on the construction of multidimensional distributions using copula functions (Vandenberghe et al., 2010; Vernieuwe et al., 2015). Despite numerous applications, as indicated by the number of works in this field (Zhang and Singh, 2019), this approach is not simple and in some cases requires searching through many combinations of theoretical distributions and copula families (Clayton, Frank, Gumbel, etc.) in order to obtain a generator with satisfactory predictive capabilities. Much less complex is the Monte Carlo (MC) method with the Iman-Conover (IC) modification (1982), which makes it possible to simulate dependent variables. In this method the variability of the considered variables is described by marginal (theoretical) distributions, and their correlation is evaluated with the Spearman correlation coefficient. The results obtained can be considered correct if the following conditions are met: (i) the mean values (μ1, μ2,...,μi)s and the standard deviations (σ1, σ2,...,σi)s of the variables (xi) considered in j samples do not differ by more than 5 % between the simulated and the measured data; (ii) the theoretical distributions of the xi variables obtained from simulation are consistent with those obtained from measurements, for which the Kolmogorov-Smirnov test is recommended; (iii) the value of the correlation coefficient (R) between individual dependent variables (xi) obtained for data from the MC simulation does not differ by more than 5 % from the value of R obtained for empirical data.
If the above-mentioned conditions are met, the results of the simulation performed with the IC method can be considered correct. If this is not the case, the MC sample size needs to be increased (Wu and Tsang, 2004). In order to limit the sample size and improve the efficiency of the Iman-Conover algorithm, a modification based on the Latin Hypercube algorithm has been developed; this algorithm belongs to the stratified sampling methods, which aim to improve the "uniformity" of the numbers generated from the marginal distributions.
On the basis of the determined marginal distributions of rainfall characteristics, synthetic rainfall series were simulated with the Monte Carlo method with the Iman-Conover modification and the Latin Hypercube algorithm.
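The combination of Latin Hypercube sampling with the Iman-Conover reordering can be sketched for two dependent variables (for example Ptot and tr). The code below is a simplified two-variable illustration, not the implementation used in the study: the Weibull parameters, the target correlation of 0.7 and all function names are hypothetical, and the general matrix form of the method (Cholesky factors of full correlation matrices) is reduced here to its closed-form 2 x 2 case. The marginal (boundary) distributions are preserved exactly, while the rank correlation is only approximately equal to the target.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(n, rng):
    """One uniform draw from each of n equal-probability strata, shuffled."""
    u = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(u)
    return u


def weibull_ppf(u, shape, scale):
    """Inverse CDF of the two-parameter Weibull distribution."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def ranks(values):
    """0-based rank of every element (no ties expected for continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def iman_conover_pair(x1, x2, target_corr, rng):
    """Reorder two samples so that their rank correlation approaches target_corr."""
    n = len(x1)
    scores = [NormalDist().inv_cdf((i + 1) / (n + 1)) for i in range(n)]
    s1, s2 = scores[:], scores[:]
    rng.shuffle(s1)
    rng.shuffle(s2)
    # Chance correlation of the two shuffled score columns (their mean is zero).
    r = sum(a * b for a, b in zip(s1, s2)) / sum(s * s for s in scores)
    # 2 x 2 Cholesky step: remove r, then impose the target correlation.
    t2 = [
        target_corr * a
        + math.sqrt(1.0 - target_corr ** 2) * (b - r * a) / math.sqrt(1.0 - r * r)
        for a, b in zip(s1, s2)
    ]
    x1_sorted, x2_sorted = sorted(x1), sorted(x2)
    y1 = [x1_sorted[k] for k in ranks(s1)]
    y2 = [x2_sorted[k] for k in ranks(t2)]
    return y1, y2


def spearman(a, b):
    """Spearman rank correlation for samples without ties."""
    n = len(a)
    d2 = sum((i - j) ** 2 for i, j in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


rng = random.Random(42)
n = 1000
p_tot = [weibull_ppf(u, 1.2, 8.0) for u in latin_hypercube(n, rng)]    # hypothetical
t_r = [weibull_ppf(u, 0.9, 200.0) for u in latin_hypercube(n, rng)]    # hypothetical
p_sim, t_sim = iman_conover_pair(p_tot, t_r, 0.7, rng)
rho = spearman(p_sim, t_sim)
```

Because the method only permutes the sampled values, checks (i) and (ii) listed above hold by construction for the marginals; the quantity that still has to be compared with the empirical data is the correlation `rho`.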

Simulator of annual synthetic rainfall series
Current research on rainfall simulators based on multidimensional marginal distributions combined with so-called copula functions takes into account the distribution of rainfall within the rainfall episode (Vernieuwe et al., 2015), the spatial diversity of rainfall (Dai et al., 2014), seasons (Khedun et al., 2014) and the period separating subsequent rainfall events (Balistrocchi and Bacchi, 2011). Such studies are very important for the accurate representation of stormwater outflow from urban and agricultural catchments in rainwater management and facility design. At present, however, the simulation of continuous precipitation values remains a significant problem, and a simpler task in practice seems to be the identification of rainfall genesis, i.e. whether rainfall is the result of convective processes taking place in the air mass, of processes in the zone of moving atmospheric fronts, or of extensive convergence zones. In many cases this information is a valuable source of knowledge that enables the identification of rainfall characteristics (Suligowski, 2004) and may be useful in the operation of underground infrastructure systems and in river basin management. Taking into account the above considerations, an innovative precipitation simulator was developed (Figure 5), in which the genesis of rainfall plays a key role.
The paper presents two approaches to the simulation of rainfall series. In the first approach, the annual number of precipitation events of each genetic type (convective in air mass, frontal and convergence-zone) was set to its average value.
In the second approach, the annual number of rainfall events of each genetic type was assumed to be stochastic.
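The two approaches can be contrasted with a small Monte Carlo sketch. It is only an illustration under stated assumptions: the mean annual numbers of events per genetic type (14.3, 16 and 2.3) are taken from the text and the Poisson model for event counts follows the distribution named in the paper, whereas the log-normal parameters of the mean intensity, the logit coefficients and all names are hypothetical placeholders rather than the fitted values.

```python
import math
import random

# Mean annual numbers of events per genetic type reported for Kielce (1961-2000).
MEAN_EVENTS = {"convective": 14.3, "frontal": 16.0, "convergence": 2.3}

# HYPOTHETICAL log-normal parameters (mu, sigma) of ln(q), q in L s^-1 ha^-1.
Q_LOGNORMAL = {
    "convective": (2.5, 0.8),
    "frontal": (1.2, 0.7),
    "convergence": (0.3, 0.5),
}

# HYPOTHETICAL logit coefficients of the intensity-only model.
ALPHA0, ALPHA_Q = -4.0, 0.5


def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def overflows_in_year(rng, stochastic_counts):
    """Number of overflow discharges in one synthetic year."""
    total = 0
    for genesis, mean_n in MEAN_EVENTS.items():
        # Approach 1: fixed (rounded mean) count; approach 2: Poisson count.
        n = poisson(mean_n, rng) if stochastic_counts else round(mean_n)
        mu, sigma = Q_LOGNORMAL[genesis]
        for _ in range(n):
            q = rng.lognormvariate(mu, sigma)
            p = 1.0 / (1.0 + math.exp(-(ALPHA0 + ALPHA_Q * q)))
            if p >= 0.5:  # discharge criterion used in the paper
                total += 1
    return total


def percentile(values, fraction):
    """Empirical percentile (nearest-rank style) of a list of counts."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(fraction * len(ordered)))]


rng = random.Random(1)
fixed = [overflows_in_year(rng, False) for _ in range(2000)]
varying = [overflows_in_year(rng, True) for _ in range(2000)]
for frac in (0.05, 0.50, 0.95):
    print(frac, percentile(fixed, frac), percentile(varying, frac))
```

Comparing the printed percentiles of `fixed` and `varying` shows, for the assumed parameters, how much of the spread in the annual number of discharges comes from the randomness of the event counts rather than from the randomness of the rainfall intensity alone.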

Results
Following the above-mentioned methodology concerning the structure of the individual elements of the probabilistic model (Figure 4), calculations were made. They consisted of the determination of the logit model, the identification of empirical and theoretical distributions of rainfall characteristics, the simulation of synthetic rainfall series with rainfall genesis taken into account, and the forecast of the number of discharges.

Logit model
Based on the results of measurements of storm overflow (OV) operation and rainfall, described in detail in the sections above and in Szeląg et al. (2013, 2018), independent rain events were separated and used to determine the logit model. For the independent variables Ptot and tr, the logit model takes the form of eq. 3 (Szeląg et al., 2018). It is characterized by satisfactory predictive abilities, because the value of SPEC = 96.90 % (out of 106 overflow discharges, in 103 episodes the model correctly identified the event), SENS = 98.20 % (out of 165 events, 162 overflow discharges were correctly classified using the model) and Rz² = 97.78 % (out of 271 observed episodes, in 265 events the calculation results were consistent with the measurements).
If only the average rainfall intensity (q) was included in the analyses, a logit model of the form of eq. 4 was obtained. This model is also characterized by satisfactory predictive abilities, because the value of SPEC = 84.90 % (out of 106 overflow discharges, in 90 episodes the model correctly identified the event), SENS = 90.20 % (out of 165 events, 148 overflows were correctly classified using the model) and Rz² = 87.82 % (out of 271 observed episodes, in 238 events the calculation results were consistent with the measurements). An interesting result of the research is that discharges can be simulated with satisfactory accuracy solely on the basis of rainfall intensity (q). This is important for the construction of rainfall models, as it indicates that the probabilistic model for simulating the number of discharges can be simplified significantly.
The above relationships (eq. 3 and eq. 4) are of a local character and it seems that they can be applied only to the investigated catchment. Considering this and the need to build universal models, the models formulated above have been transformed. Based on the theoretical considerations of Thorndahl and Willems (2008), who provided a generalised model for forecasting the volume of wastewater discharged via a storm overflow, it can be concluded that in this case the following relations hold:
$$P_{tot} - 0.007 \cdot t_r = 3.802 \approx d_{av} \quad (5)$$
On the basis of these relationships (eq. 5 and eq. 6) it can be concluded that the constant terms (intercepts) obtained in them are similar to the weighted average value of the catchment retention (dav). The relative difference between the constant terms and the retention of the catchment does not exceed 5 %. This result may be of significant practical importance, as it opens the possibility of transferring the relationships obtained above to other drainage basins. Nevertheless, in order to confirm this, further detailed analyses of measurements and calculation experiments on other objects are necessary.
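One way to read the transformation is as a rescaling of the decision boundary of the logit model. The short derivation below is a sketch under the assumption that eq. 5 was obtained by dividing the boundary condition p = 0.5 by the coefficient of Ptot; the symbols α0, α1 and α2 are generic logit coefficients introduced here for illustration and do not reproduce the fitted values of eq. 3.

```latex
% Boundary of the logit model with P_tot and t_r (p = 0.5):
\alpha_0 + \alpha_1 P_{tot} + \alpha_2 t_r = 0
% Dividing by \alpha_1 (assumed positive):
P_{tot} + \frac{\alpha_2}{\alpha_1}\, t_r = -\frac{\alpha_0}{\alpha_1}
% Comparison with eq. 5:
\frac{\alpha_2}{\alpha_1} = -0.007, \qquad -\frac{\alpha_0}{\alpha_1} = 3.802\ \text{mm}
% Relative difference from the catchment retention d_{av} = 3.81 mm:
\frac{|3.802 - 3.81|}{3.81} \approx 0.2\,\%
```

Under this reading, an overflow is predicted once the rainfall depth, reduced by 0.007 mm for every minute of rainfall duration, exceeds a threshold that is practically equal to the retention of the catchment, which is what gives the constant term its physical interpretation.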

Identification of empirical and theoretical distributions of selected rainfall characteristics
On the basis of determined empirical distributions, theoretical distributions were adjusted to them (Figures 6 and 7).Table 1 presents the results of Kolmogorov-Smirnov (KS) and chi-square (Chi) tests of matching empirical distributions to theoretical distributions for rainfall characteristics (Ptot, tr, q, M) depending on the genesis of precipitation and shows determined coefficients in equations describing theoretical distributions.where ξ, β, γ, λ, σ, μempirical coefficients determined by the method of maximum likelihood.
The calculations (Table 1) showed that in most cases the Weibull distribution (eq. 7) fits best the empirical data describing the variability of the total rainfall depth (Ptot) in a rainfall episode. The variation in rainfall duration in episodes of different genesis is also described in most cases by the Weibull distribution; only for data measured over the annual cycle is it expressed by the GEV distribution (eq. 9). Rainfall intensities of different genesis are in most cases described by log-normal distributions (eq. 8), whereas the intensity of frontal precipitation is described by the generalised extreme value (GEV) distribution.
The satisfactory fit of the calculation results to the measurements of the continuous variables is confirmed by the curves shown in Figures 6-7. For the discrete variables, the number of rainfall events per year of each genetic type is described with satisfactory accuracy by the Poisson distribution (eq. 11). Significant differences between measurements and calculations were found only for the variability of episodes recorded over the annual cycle.
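To illustrate this kind of goodness-of-fit check, the sketch below computes the Kolmogorov-Smirnov statistic between a sample and a two-parameter Weibull distribution using only the Python standard library. The parameterisation (shape `k`, scale `lam`), the parameter values and the synthetic sample are assumptions made for illustration; they need not match eq. 7 or the coefficients in Table 1.

```python
import math
import random


def weibull_cdf(x, k, lam):
    """CDF of a two-parameter Weibull distribution (shape k, scale lam)."""
    return 1.0 - math.exp(-((x / lam) ** k)) if x > 0 else 0.0


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic "rainfall depth" sample drawn from a Weibull distribution
# with hypothetical parameters (shape 1.2, scale 8.0).
rng = random.Random(42)
K_SHAPE, LAM_SCALE = 1.2, 8.0
# random.weibullvariate takes the scale first, then the shape.
sample = [rng.weibullvariate(LAM_SCALE, K_SHAPE) for _ in range(500)]

# Distance to the generating distribution and to a deliberately wrong one.
d_good = ks_statistic(sample, lambda x: weibull_cdf(x, K_SHAPE, LAM_SCALE))
d_bad = ks_statistic(sample, lambda x: weibull_cdf(x, K_SHAPE, 3.0 * LAM_SCALE))

# Asymptotic 5 % critical value, valid when the parameters are not
# estimated from the same sample: 1.36 / sqrt(n).
d_crit = 1.36 / math.sqrt(len(sample))
```

A statistic below `d_crit` gives no reason to reject the candidate distribution at the 5 % level, while the mis-specified scale in `d_bad` produces a much larger distance.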
Therefore, in the simulation of the annual number of overflow discharges, the variant that includes the variability of the number of precipitation episodes in the year was abandoned.
Based on the models determined for forecasting storm overflow discharge and on the fitted theoretical rainfall distributions, the impact of rainfall genesis on the occurrence of a single storm overflow discharge was determined first. Given the different predictive abilities of the logit models obtained, the analyses were limited to calculations with the model that best represented the existing state. The results of the simulations are shown in Figure 8. The calculations show that, for discharges resulting from convective precipitation, the stochastic nature of the precipitation has a significant impact on the values of the lower and upper percentiles (Figure 9a). For example, for p = 0.05, the difference in the number of overflow discharges obtained under the assumption N = var is 5 times greater than in the solution that assumes the average number of precipitation events per year. For the percentile p = 0.95, the difference in the number of discharges between the two solutions is 4.
For discharges resulting from frontal rainfall (Figure 9b), the difference in the number of discharges between the variants N = const and N = var is much smaller than for convective rainfall (Figure 9a). This may be due to the considerable variation in the number of convective and frontal rainfall events over an annual cycle and to the variation in rainfall intensity in both cases. The simulation results show that when the number of precipitation episodes (convective, frontal, convergence zones) is described by the Poisson distribution, the calculated number of discharges for p < 0.50 is smaller than when N = const (Figure 9). For p > 0.50 the relation is reversed. The influence of the theoretical distribution of the number of rainfall events per year on the values for 0.99 > p > 0.50 is confirmed by Szeląg et al. (2018). On the basis of the simulations and the resulting curves, the average number of overflow discharges resulting from convective rainfall is 15 (Figure 9a), while for frontal and convergence zone precipitation it is much smaller, 7 and 1 respectively, assuming N = const (Figures 9b, 9c). For storm overflow discharges caused by frontal rainfall, the determined curves show that as many as 5 out of 7 overflow discharges are caused by rainfall connected with a cold front, for which the rainfall duration does not exceed tr = 4.5 h (Figure 9b).
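The effect of treating the annual number of events as fixed (N = const) or as Poisson-distributed (N = var) can be illustrated with a minimal Monte Carlo sketch. It assumes a single genetic type, a log-normal average intensity q and a single-predictor logit model; all numerical values (`MEAN_EVENTS`, `MU_LOG_Q`, `SIGMA_LOG_Q`, `A0`, `A1`) are hypothetical and are not the values fitted in the paper.

```python
import math
import random
import statistics

# Hypothetical inputs for illustration only.
MEAN_EVENTS = 40                   # average number of rainfall events per year
MU_LOG_Q, SIGMA_LOG_Q = 1.2, 0.6   # log-normal parameters of intensity q
A0, A1 = -4.0, 1.0                 # logit coefficients


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def annual_overflows(rng, variable_n):
    """Simulate one year and return the number of overflow discharges."""
    n_events = poisson(rng, MEAN_EVENTS) if variable_n else MEAN_EVENTS
    count = 0
    for _ in range(n_events):
        q = rng.lognormvariate(MU_LOG_Q, SIGMA_LOG_Q)
        p_overflow = 1.0 / (1.0 + math.exp(-(A0 + A1 * q)))
        if rng.random() < p_overflow:   # Bernoulli draw for this event
            count += 1
    return count


def percentile(values, p):
    """Empirical percentile (nearest-rank method) of a list of numbers."""
    xs = sorted(values)
    rank = max(1, math.ceil(p * len(xs)))
    return xs[rank - 1]


rng = random.Random(2019)
YEARS = 5000
const_n = [annual_overflows(rng, variable_n=False) for _ in range(YEARS)]
var_n = [annual_overflows(rng, variable_n=True) for _ in range(YEARS)]

# (5th percentile, mean, 95th percentile) of the annual number of discharges.
summary = {
    name: (percentile(v, 0.05), statistics.mean(v), percentile(v, 0.95))
    for name, v in (("N = const", const_n), ("N = var", var_n))
}
```

In such a setup the mean number of discharges is similar in both variants, while the Poisson variant gives a wider spread: lower values at low percentiles and higher values at high percentiles.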
The synthetic precipitation generator proposed in the paper makes it possible to quantify the impact of rainfall genesis on the annual number of storm overflow discharges (Figure 9d), which until now has not been included in the models developed by other authors. This approach can be transferred to other facilities located in stormwater systems and used to assess the effectiveness of stormwater drainage systems. Ultimately, the results obtained in this way may form the basis for an expert system for early warning of torrential phenomena caused by heavy rainfall. A further advantage of the developed probabilistic model is that it allows long-term (multiannual) simulations and an assessment of the impact of the distribution of individual precipitation types on the annual number of storm overflow discharges and its variability.

Conclusions
The calculations showed that measurements of average rainfall intensity can be used to simulate, with a logit model, the occurrence of a discharge in a single rain episode. The simulation results do not differ significantly from the calculations based on rainfall depth and duration. For both models it was shown that the numerical value of the constant term in the model equations does not differ by more than 10 % from the weighted average retention depth of the catchment. This is of considerable practical importance, because it makes it possible to transfer the results to other municipal catchments. However, confirming this requires further analyses of catchments with different physical and geographical characteristics.
The computational experiments carried out in the study made it possible to assess the influence of rainfall genesis (convective in air mass; frontal, cold and warm; convergence zone precipitation) on the occurrence of storm overflow discharge. Moreover, the ranges of average rainfall intensity for which storm overflow discharges occurred were determined. This information may be used in engineering practice, because it makes it possible to determine whether a storm overflow discharge will take place.
Identifying the operating state of the stormwater system (in this case, storm overflow discharge) on the basis of the identification of the rain front may be of considerable practical significance. It provides an opportunity to develop an early warning system against emergencies in stormwater systems (spill of stormwater onto the surface, hydraulic overload of pipes, overfilling of tanks).

Figure 1: Diagram of the analysed urban catchment.
In the years 2009-2011 the amount of wastewater leaving the catchment area was measured with the MES1 flow meter located in the channel (S1) at a distance of 3.0 m from the inlet to the DC chamber. Since 2015, MES1 and MES2 flow meters measuring filling and flow have been installed in the inlet (S1) and discharge (S2) channels. A detailed description of the installed measuring equipment can be found in Szeląg (2016).

Figure 2: Number of rainfall episodes with depth above 3 mm in Kielce (against the background of multi-year averages for particular genetic types).
following steps:
a) separation of precipitation events in rainfall measurement series,
b) identification of independent variables (xi) in a logit model at the accepted confidence level and estimation of empirical coefficients,
c) determination of empirical and theoretical distributions of rainfall characteristics for the different genetic types of precipitation (convective in air mass, frontal and convergence zones),
d) simulation by means of the Monte Carlo method of the number of precipitation events; as an alternative, an approach based on a fixed average number of rain episodes can be used,
e) Monte Carlo simulation of rainfall characteristics for the number of rain events generated,
f) determination of the number of discharges for the generated rainfall series: per year, in different genetic types.
https://doi.org/10.5194/hess-2019-271 Preprint. Discussion started: 29 July 2019. © Author(s) 2019. CC BY 4.0 License.

Figure 4: Calculation diagram of the algorithm for building a probabilistic model for forecasting the number of discharges.
The paper presents the following stages of construction of a probabilistic model on the example of an urban catchment located in the city of Kielce. In the following sections the individual steps of the above-mentioned calculation algorithm of the

Figure 5: Algorithm of simulation of annual rainfall series, taking into account the generation of rainfall. N: number of modelled samples for M rainfall episodes (where c = convective, f = frontal, cz = convergence zone); N = 500 was assumed in the paper. K: number of samples of rainfall characteristics modelled with the IC + LH method; K = 1000 was assumed in the paper. F(ζ)c,f,cz: theoretical distributions for simulation of the number of rainfall events of the appropriate genetic type (c = convective, f = frontal, cz = convergence zone). F(x)c,f,cz: theoretical distributions describing rainfall characteristics of the appropriate genetic type.
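The stratified sampling idea behind the K samples can be sketched as follows, assuming that LH in the caption denotes Latin hypercube sampling; the IC step (presumably the Iman-Conover procedure for imposing correlation between variables) is not shown. The Weibull parameters and the helper names are hypothetical and serve only as an illustration.

```python
import math
import random


def latin_hypercube_uniform(k, rng):
    """Return k values in [0, 1), one from each of k equal-width strata."""
    points = [(i + rng.random()) / k for i in range(k)]
    rng.shuffle(points)  # random order, so strata are not visited in sequence
    return points


def weibull_inverse_cdf(u, shape, scale):
    """Quantile function of a two-parameter Weibull distribution."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


rng = random.Random(7)
K = 1000                  # number of samples, as assumed in the paper
SHAPE, SCALE = 1.2, 8.0   # hypothetical Weibull parameters

# Stratified uniforms mapped through the inverse CDF give a sample that
# covers the whole range of the distribution evenly.
u = latin_hypercube_uniform(K, rng)
depths = [weibull_inverse_cdf(ui, SHAPE, SCALE) for ui in u]
```

Compared with plain random sampling, each probability stratum contributes exactly one value, which reduces the sampling noise in the tails for a given K.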

Figure 6: Comparison of quantiles of the empirical and theoretical distributions of the number of precipitation episodes per year for the rainfall type: (a) all events, (b) convective, (c) frontal, (d) convergence zone.

Figure 8: (a) Impact of rainfall genesis on the probability of overflow discharge, (b) Impact of rainfall genesis on rainfall intensity distribution determining overflow discharge.

Figure 9: (a) Cumulative distribution function (CDF) of the annual number of discharges due to convective rainfall; (b) CDF of the annual number of discharges due to frontal rainfall; (c) CDF of the annual number of discharges due to convergence zone precipitation; (d) Curve showing the probability of exceeding the annual number of discharges.