Temporal downscaling of precipitation time-series projections to forecast green roofs future detention performance

A strategy to simulate rainfall by means of different Multiplicative Random Cascades (MRC) was developed to evaluate their applicability for producing inputs to green roof infrastructure models while taking climate change into account. The MRC reproduce a (multi)fractal distribution of precipitation through an iterative and multiplicative random process. The initial model was improved with a temperature dependency and an additional function to improve its capability to reproduce the temporal structure of rainfall. The structure of the models with depth and temperature dependency was found to be applicable in the eight locations studied across Norway (N) and France (F). The resulting time-series from both the reference period and the projections based on RCP 8.5 were applied to two green roofs (GR) with different properties. The different models led to a slight change in the performance of the GR, but this was not significant compared to the range of outcomes due to ensemble uncertainty in climate modelling and the stochastic uncertainty due to the nature of the process. The moderating effect of the green infrastructure was found to decrease in most of the Norwegian cities, especially Bergen (N), while increasing in Lyon (F).


Introduction
Hydrologic performance of stormwater Green Infrastructure (GI) is usually divided between Retention and Detention. Retention refers to water stored, infiltrated, or evapotranspired. Actual evapotranspiration can be estimated from a water balance including Potential Evapotranspiration, accumulated precipitation, a soil moisture evaluation function, and a crop factor (Johannessen et al., 2017; Oudin et al., 2005). The time-scale of the evapotranspiration process is typically 24 hours or less. Detention refers [...] time-series; iv) the evaluation of the capability to reproduce the performance of GI based on observed data; and finally v) the analysis of the possible shift in performance of GI at the end of the century.

Meteorological data
Time-series of precipitation and temperature from six locations in Norway and two in France, representing four different climates (Table 1) according to the Köppen-Geiger classification (Peel et al., 2007), were used to apply the downscaling method. In Norway, the precipitation was measured by 0.2 mm Plumatic Kongsberg tipping rain gauges. The rain gauges were not heated and thus did not operate at cold temperatures. They were successively replaced by Lambrecht 1518H3 gauges (measuring range of 0.1 mm) in the 1990s and 2000s. The stations were operated by the Norwegian Water Resources and Energy Directorate (NVE) and the Norwegian Meteorological Institute (MET). The data were quality checked by MET (Lutz et al., 2020). In Lyon, precipitation was measured by 0.2 mm Précis-Mécanique tipping bucket rain gauges. Ten climate projections (temperature and precipitation) at daily resolution under RCP 8.5 for the period from 2071 to 2099 were available online for the Norwegian cities at https://nedlasting.nve.no/klimadata/kss (Dyrrdal et al., 2018). For Lyon and Marseille (France), twelve climate projections were available from http://www.drias-climat.fr/.

Data aggregation and processing
The data were aggregated two by two from 1-minute resolution (resp. 6-minute) to more than 1-day resolution in order to capture a part of the uncertainty linked to the estimation of the parameters of the models. The aggregation was done for each possible time-step: all multiples of 2 smaller than 1500 min. During the process of aggregation, both the weights (1) and the temporal coherence indicator S (2), which measures the proportion of high weights on the side of the highest neighbouring depth, were computed. Given i a time-step, j a temporal resolution in minutes, and d a rainfall depth, the weight w and the indicator S of the side of the neighbour were calculated according to equations (1) and (2).
The MRC models were developed to ensure a parsimonious number of parameters. Homogeneity of the resolution in the input datasets was not required for calibration and data processing (i.e. the model can be calibrated using multiple datasets with different resolutions between 1 min and 1 day). The modelled properties were time-scale continuous to allow the model to be used with any initial resolution smaller than 1500 min. Based on the observed data, the functions were chosen to represent equations 3, 4 and 5, depending on the variable. Figure 1 describes the downscaling process. The models MCS, MCDS and MCDTS (Table 2) [...]
Figure 1. Workflow for downscaling to transfer a depth from time-step $T$ to time-step $T/2$. The red boxes involve the generation of a random number. The process starts with a 1440-minute time-step and reaches 5.625 min; an interpolation is then done to reach a 6-min time-step.
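The pairwise aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weight is taken as the share of a pair's depth that falls in its first half, which is the usual MRC convention and an assumption here, and the function names are hypothetical. The indicator S is not sketched. The last helper checks that halving a 1440-minute step eight times gives the 5.625-minute step mentioned in the workflow.

```python
def aggregate_pairs(depths):
    """Sum consecutive pairs of depths (a trailing unpaired value is dropped)."""
    n = len(depths) - len(depths) % 2
    return [depths[k] + depths[k + 1] for k in range(0, n, 2)]


def branching_weights(depths):
    """Share of each pair's depth found in its first half (assumed definition).

    Returns None for dry pairs, where the weight is undefined.
    """
    n = len(depths) - len(depths) % 2
    weights = []
    for k in range(0, n, 2):
        total = depths[k] + depths[k + 1]
        weights.append(depths[k] / total if total > 0 else None)
    return weights


def cascade_levels(start_minutes, target_minutes):
    """Count the halvings needed to reach a time-step not larger than the target."""
    levels = 0
    step = float(start_minutes)
    while step > target_minutes:
        step /= 2
        levels += 1
    return levels, step


# Hypothetical fine-resolution depths (mm), for illustration only.
fine = [0.0, 0.2, 0.4, 0.0, 0.0, 0.0, 1.0, 3.0]
coarse = aggregate_pairs(fine)
weights = branching_weights(fine)
levels, final_step = cascade_levels(1440, 6)
```

Weights equal to 0 or 1 correspond to pairs where all the rain falls in one half, which is the quantity tracked by the zero-weight proportion.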

Green Infrastructure modelling
In order to quantify the influence of rainfall input on green roof performance estimation, two green roofs located in Trondheim were modelled: i) a typical extensive green roof (E-green roof) with sedum vegetation, 30 mm of substrate, and 10 mm of "eggbox" drainage layer (Hamouz and Muthanna, 2019), and ii) a detention-based extensive green roof (D-green roof) with sedum vegetation, 30 mm of substrate, and 100 mm of lightweight clay aggregates (Hamouz et al., 2020). The model is a simple reservoir model with a differentiable linear function (8) for the outflow, Oudin's model for Potential Evapotranspiration (PET), and a Soil Moisture Evaluation Function (SMEF) to estimate Actual Evapotranspiration (AET) (Johannessen et al., 2017).
$WC_i$ is the water content (mm) at time $t_i$. $P_i$ is the precipitation (mm·min$^{-1}$). The discharge $Q_i$ (mm·min$^{-1}$) is based on the empirical curve (8). The temperature $T_{mean}$ is in degrees Celsius, and the extra-terrestrial radiation $R_a$ is derived from the latitude and the Julian day. The constant $1/(\lambda\rho) \approx 0.408$ depends on the latent heat and the volumetric mass of water. The factor $C$ is a calibrated factor depending on the maximum storage and the crop factor. The smoothed linear curve (8) is parametrized by $K$, the conductivity slope, $S_K$, the smoothing factor, and $WC_K$, the starting delay. The model was developed based on data from extreme tests with artificial precipitation (Hamouz et al., 2020) by establishing a relationship between water content and runoff. One day of data was collected during the extreme tests, including a nearly dry roof, successive artificial rainfall events leading to saturation, and drainage of the roof during twelve hours. The relative water content was computed based on inflow and outflow and shifted to ensure positive water content. The outflow depending on water content was used as input for calibration of the discharge function using Bayesian calibration with a DREAM setup (Laloy and Vrugt, 2012). The D-green roof's model was validated with a rainfall series of two and a half months from July 2018 to the 25th of September, and a one-month series from the 5th of September 2019 to the 5th of October. The E-green roof's model was validated with a rainfall series from April 2017 to September 2017. Snow periods were mostly excluded from the evaluation.
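As an illustration of the PET term, the sketch below computes the extra-terrestrial radiation from latitude and day of year and applies Oudin's formula with the constant 0.408 quoted above. It is a sketch under stated assumptions rather than the authors' code: the radiation uses the standard FAO-56 expressions (an assumption, since the text does not say which formulation was used), the function names are hypothetical, and the calibrated factor C and the SMEF are not included.

```python
import math


def extraterrestrial_radiation(latitude_deg, day_of_year):
    """Daily extra-terrestrial radiation Ra (MJ m-2 day-1), FAO-56 expressions."""
    phi = math.radians(latitude_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    dr = 1.0 + 0.033 * math.cos(angle)       # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)   # solar declination (rad)
    # Clamp to [-1, 1] so polar day and polar night do not break acos.
    x = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    omega_s = math.acos(x)                   # sunset hour angle (rad)
    gsc = 0.0820                             # solar constant (MJ m-2 min-1)
    return (24.0 * 60.0 / math.pi) * gsc * dr * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )


def oudin_pet(t_mean_c, ra, inv_lambda_rho=0.408):
    """Oudin PET (mm/day); zero when the mean temperature is at or below -5 C."""
    if t_mean_c + 5.0 <= 0.0:
        return 0.0
    return inv_lambda_rho * ra * (t_mean_c + 5.0) / 100.0


# Illustrative call: a latitude of about 63.4 N (roughly Trondheim), day 180,
# and a mean temperature of 15 C. These inputs are assumptions for the example.
ra_example = extraterrestrial_radiation(63.4, 180)
pet_example = oudin_pet(15.0, ra_example)
```

The clamp on the sunset hour angle matters for the northern Norwegian sites, where the sun does not set (or rise) on some days of the year.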

Evaluating the downscaled time-series
For each location, the observed precipitation was aggregated to daily resolution and downscaled to obtain 200 time-series with a 6-min time-step. They were used to model both the extensive and the detention-based extensive green roofs in parallel. It should be noted that irrigation needs and snow periods were neglected since the primary objective of the study was to evaluate the [...]
To evaluate the performance of the downscaling model and the projected performance of green roofs, different indicators were used:
- The lag-1 autocorrelation depending on time-step. It was chosen to assess the temporal structure of the produced time-series.
- The survival distribution of precipitation and of discharge from both roofs at 6-min time-step. This approach was similar to the use of flow duration curves recently applied to green roofs by Johannessen et al. (2018). The exceedance probabilities were presented with a log axis to account for extreme probabilities. The median, 5th and 95th percentiles of the downscaled time-series were represented. The survival distribution of discharge from the roofs with downscaled time-series, compared to the distribution based on observed data, indicates the applicability of the downscaled time-series as an input for green infrastructure modelling.
- Three different discharge thresholds were used to report exceedance frequency in different operating modes: 1 L/s/ha for small events, 10 L/s/ha for major events and 100 L/s/ha for extreme events.
- The distribution of dry periods and the retention fraction. They are not expected to be affected by the downscaling process since the dry periods affecting the roofs can be observed at daily resolution, and the retention fraction can be estimated with conceptual models using daily time-step data. However, they provide additional information to analyse the behaviour of the roofs.
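Two of the indicators listed above can be sketched in a few lines. The example is illustrative only: the function names are hypothetical, the autocorrelation uses the common biased estimator (an assumption, as the text does not specify one), and the unit conversion follows from 1 L/s/ha = 10^-4 mm/s = 0.006 mm/min.

```python
def lps_per_ha_to_mm_per_min(q_lps_ha):
    """Convert a discharge in L/s/ha to mm/min (1 L/s/ha = 0.006 mm/min)."""
    return q_lps_ha * 0.006


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a series (biased estimator, assumed here)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return float("nan")  # undefined for a constant series
    cov = sum((series[k] - mean) * (series[k + 1] - mean) for k in range(n - 1))
    return cov / var


def exceedance_minutes(discharge_mm_per_min, threshold_lps_ha, step_minutes=6):
    """Total time (minutes) during which discharge exceeds the threshold."""
    threshold = lps_per_ha_to_mm_per_min(threshold_lps_ha)
    count = sum(1 for q in discharge_mm_per_min if q > threshold)
    return count * step_minutes


# Hypothetical 6-min discharge values (mm/min), for illustration only.
discharge = [0.0, 0.01, 0.07, 0.7]
small = exceedance_minutes(discharge, 1)
major = exceedance_minutes(discharge, 10)
extreme = exceedance_minutes(discharge, 100)
```

To study the dependence on time-step, the autocorrelation would be recomputed after aggregating the series to each resolution of interest.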

Green infrastructure model
The parametrized empirical reservoir model was applied to the extensive green roof and the detention-based extensive green roof. The performance was evaluated both on the time-series and on individual events extracted from the time-series. The criteria were: i) the Nash-Sutcliffe Efficiency (NSE) indicator on the time-series for both discharge and water content, ii) the NSE for rainfall events defined with a minimum inter-event time of 6 hours, to analyse the behaviour of the model further, and iii) the volumetric error on the time-series, to account for the model's retention evaluation. The observed water content was estimated directly from discharge measurement using the empirical curve. The performance was as follows:
- NSE > 0.8 for both discharge and water content for the extensive green roof. On the 3 most intense events the NSE ranged from 0.75 to 0.9. The water balance error was found to be 2.1%.
- NSE > 0.94 for both discharge and water content for the detention-based extensive roof. On the 3 most intense events the NSE ranged from 0.85 to 0.96. The water balance error was found to be 5%.
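The evaluation criteria above can be sketched as follows. This is an illustrative sketch with hypothetical function names: the NSE follows its standard definition, the volumetric error is taken as the relative difference in total volume (an assumed definition), and events are separated whenever the dry gap between two wet time-steps reaches the minimum inter-event time.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def volumetric_error(observed, simulated):
    """Relative difference in total volume (assumed definition)."""
    return (sum(simulated) - sum(observed)) / sum(observed)


def split_events(precip, step_minutes, min_gap_minutes=360):
    """Return (start, end) index pairs, end exclusive, of rainfall events.

    Two wet time-steps belong to different events when the dry gap between
    them is at least min_gap_minutes.
    """
    gap_steps = min_gap_minutes / step_minutes
    events = []
    start = None
    last_wet = None
    for k, p in enumerate(precip):
        if p > 0:
            if start is None:
                start = k
            elif k - last_wet - 1 >= gap_steps:
                events.append((start, last_wet + 1))
                start = k
            last_wet = k
    if start is not None:
        events.append((start, last_wet + 1))
    return events


# Hypothetical hourly precipitation (mm), for illustration only.
hourly = [0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 3, 0]
events = split_events(hourly, step_minutes=60)
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model does no better than the mean of the observations.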
The model is limited as it lumps processes and neglects dynamic effects: the wetting of the aggregates and substrate and the spatial distribution of water content within the roof (Hamouz et al., 2020). This can be seen in Figure 2 at the beginning of the events. It suggests that short events with low intensity are not reproduced well by the model, as it cannot represent the delay induced by the wetting of the different layers of the roofs. Since the objectives of this study involve the use of a simple model to reproduce the behaviour of two roofs, the model was not further improved.
[Figure 2: time-series plot; x-axis: Date, 2019-09-13 to 2019-09-23.]
Therefore, the monotonic or non-monotonic behaviour of the proportion of weights equal to zero as a function of time-scale can be explained by the different distributions of depth in the observed data. The proportion depended on depth, which is consistent with previous work (Rupp et al., 2009).
In Figure 3b, the zero-weight proportion decreases with increasing depth in the case of Bodø. In the case of Hamar, it increases for depths higher than 2 mm. The two plots on the right show that a temperature dependency may explain this behaviour. In Bodø, the proportion depending on depth gives similar results for different ranges of temperature at 48-min resolution. In Hamar, on the contrary, the subsets with lower temperature lead to a lower proportion of weights equal to zero than the subsets with higher temperature. Moreover, the higher depths were observed in subsets with higher temperature. The increase observed in Hamar can therefore be explained by the distribution of observed values. It is consistent with the observation of different temporal distributions of rainfall for different temperature ranges, such as convective rain (Berg et al., 2013; Zhang et al., 2013). If, given a depth of 10 mm at a resolution of 48 minutes, the probability of a weight equal to zero is higher, then the probability of intense rainfall is higher. The non-homogeneity of observed datasets and the shift in temperature with climate change might lead to inconsistencies in datasets produced by downscaling methods that exclude depth and/or temperature dependency. Developing a simple model is easy but might prevent comparability of parameters between locations and does not necessarily lead to parameter-parsimonious models. Moreover, a model such as MCD can result in overfitting when used with datasets like Hamar.
The functions necessary to represent the behaviour without considering the temperature dependency are more complex and less explanatory. Adding the temperature dependency could result in a more explanatory model with more robust results for the influence of climate change.

Evaluation of the downscaling methods
An overview of the performance of the downscaling and green roof models in Bergen is presented in Figure 4. All the downscaling models performed similarly in terms of dry period distribution and slightly underestimated the dry periods in the observed data (Figure 4b). The dry periods were directly linked to the zero-weight probability. In green infrastructure modelling, the length of the dry periods influences the retention performance, as it can lead to water stress hindering evapotranspiration. However, dry periods leading to water stress can also be evaluated with daily time-step series (there is no need for minute time-step series). Therefore, dry periods longer than the initial daily resolution are not significantly affected by downscaling.
The distribution of precipitation (Figure 4a [...]
To evaluate the produced time-series, it is necessary to compare the discharge obtained with observed time-series to the discharge obtained with downscaled time-series. For most of the locations, the predicted range of precipitation or discharge deviated, for the lowest probabilities, from the values obtained with observed time-series: i) when the precipitation range matched the observed distribution, the discharge tended to be overestimated; ii) when the precipitation was underestimated, the discharge with observed data tended to lie in the range obtained from downscaled time-series. The performance based on downscaled time-series might lead to biased results if it is used as if it were the discharge from observed time-series. Moreover, the raw discharge time-series might not be suitable for robust decision making in green infrastructure implementation, as it does not represent the natural variation of performance of green infrastructure.
In order to evaluate the potential of discharge from downscaled time-series to approach the range of performance linked to natural variability, a 3-year moving window was used on the precipitation time-series and on the discharge time-series resulting from observed precipitation. The resulting 5th and 95th percentiles of the annual time exceeding 1 L/s/ha, 10 L/s/ha and 100 L/s/ha are presented in Figure 5 to evaluate the time-series in different operating modes of the roofs. They are compared to the stochastic variability (5th and 95th percentiles) from the 6 models. Each horizontal line in Figure 5 represents the range between the 5th and 95th percentiles for the threshold and model considered. The different thresholds represent, respectively, discharge for small events, major events and extreme events. In Figure 4, the thresholds correspond to 0.006 mm/min, 0.06 mm/min and 0.6 mm/min. A good estimate is defined by a complete or partial overlap between the observed natural variability and the stochastic variability range, conserving the order of magnitude of the range. For instance, in Bergen, the observed range for the D-green roof above 10 L/s/ha is 9 to 16 hours, so less than a day; the MC model provided a range from 24 to 28 hours, more than [...] year. However, those models kept the order of magnitude, while models MC and MCS estimated it higher than $10^2$ minutes. The same behaviour was observed with the Hamar (Figure 5) and Lyon datasets (appendix, Figure A1). This suggests that the models performed worse in drier locations, possibly due to the calibration procedure, since fewer wet days are available for calibration.
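The comparison of ranges described above can be sketched as follows. This is only one possible reading of the procedure, with several assumptions: the window moves in one-year steps over per-year exceedance times, the value of each window is the mean annual exceedance, percentiles use linear interpolation, and all function names are hypothetical. The overlap check reuses the Bergen figures quoted in the text.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def moving_window_range(annual_values, window=3):
    """5th and 95th percentiles of the window-averaged annual values."""
    means = [
        sum(annual_values[k:k + window]) / window
        for k in range(len(annual_values) - window + 1)
    ]
    return percentile(means, 5), percentile(means, 95)


def ranges_overlap(a, b):
    """True when the closed ranges a = (low, high) and b = (low, high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical annual exceedance times (hours/year), for illustration only.
annual_hours = [10, 12, 14, 16, 18]
low, high = moving_window_range(annual_hours)

# Bergen, D-green roof above 10 L/s/ha: observed 9-16 h versus MC 24-28 h.
mc_overlaps_observed = ranges_overlap((9, 16), (24, 28))
```

The order-of-magnitude condition mentioned in the text is not encoded here; only the overlap part of the criterion is sketched.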
The models MCD and MCDT performed similarly, but due to its structure, MCD risks overfitting to the calibration data. This could result in inaccurate predictions in case of a significant temperature shift between the calibration and prediction datasets. To [...]

Assessment of green roof future performance
All six models were used to assess the performance of green roofs under future climate, as illustrated for Bergen in Figure 6. [...] to the models and the variability linked to the different projections available under RCP8.5 (Figure 6). In Bergen, according to the projections, the performance of the two solutions is likely to worsen: under current climate, the 100 L/s/ha exceedance was lower than 1 minute for the D-green roof; according to the MCDTS model, it might reach between 5 and 19 minutes in future climate. This suggests a shift in the order of magnitude from $10^0$ to more than $10^1$ minutes. Similarly, the E-green roof might see its 100 L/s/ha exceedance shift from $10^1$ to $10^2$ minutes, which means that the threshold would regularly be reached.
As illustrated by Figure 7 and Figure A2, the performance shift depends highly on the location. The 100 L/s/ha exceedance of the green roofs was likely to get worse in Bergen, to stay stable despite a small increase in Bodø, and to improve in Hamar and Marseille. The increase in exceedance frequency in the Norwegian cities was due to an increase in precipitation. However, the increase in temperature led to an increase in potential evapotranspiration and therefore might have attenuated or even counterbalanced the effect of the rainfall increase by lowering the initial water content in the roofs at the beginning of a rainfall event.
Table 3 shows that the retention fraction was likely to decrease in Bergen, Bodø, Hamar, Kristiansand and Kristiansund. It was found to increase in Lyon and Marseille, and slightly in Trondheim. The models with temperature dependency performed similarly to the model with only depth dependency in most of the locations. However, in Lyon and Marseille, the predicted 100 L/s/ha exceedance of precipitation differed, from 16-27 min to 21-50 min in Lyon (resp. 14-30 min to 14-43 min in Marseille). This suggests that some locations are more sensitive than others to temperature-dependent patterns. The models MCD, MCDS, MCDT and MCDTS make it possible to evaluate the shift in performance for the different roofs using the exceedance range.
The exceedance frequency is in day/year for small events, hour/year for major events and minute/year for extreme events. The model 0-Obs is the result of the observed precipitation time-series with a 3-year moving window used to estimate the 5th and 95th percentiles.

Design perspectives
In order to conclude on the applicability of downscaled time-series to predict the future performance of green infrastructure, the methods were compared to the current recommended practice in Norway: the use of the variational method (Alfieri et al., 2008) with a climate factor (CF) (Kristvik et al., 2019; Trondheim Kommune, 2015). The results presented, for the city of Trondheim and for 2, 5 and 10-year return period rainfall and runoff events, include: i) the peak runoff of runoff events based on an observed precipitation time-series, ii) the peak runoff of rainfall events based on the variational method with and without climate factor, and iii) a hybrid approach based on downscaling $10^5$ rainfall events with a daily depth based on the return period curves, with and without climate factors (Figure 8). This last approach used the MCDTS model. According to the current recommendation in Norway for Trondheim municipality, a climate factor of 1.4 was applied (Dyrrdal and Førland, 2019). The figure shows that the variational method underestimated the peak runoff obtained with observed data, whereas the distribution from the hybrid approach covered it. This suggests that the variational method might not be conservative enough when compared to peak runoff from runoff events instead of rainfall events. Even if the results from the hybrid event-based downscaling lead to a realistic distribution based on probable rainfall events, the downscaling models might need a different calibration or conceptualization to be optimized specifically for extreme events. The observed peaks show a range of possible outcomes, which highlights the limitations of the variational method with its single estimate, whereas the hybrid downscaling event-based method, leading to a range of probable outcomes, gave promising results that can lead to more robust design and decision making.
The MC and MCS models were not sufficient to predict the future performance of green infrastructure, as they lead to overestimation of runoff. The MCD, MCDS, MCDT and MCDTS models lead to better performance: it was possible to predict runoff exceedance frequency with a similar order of magnitude to an estimate of the natural variability of performance based on observed time-series. The structure of the MCD and MCDS models makes them more vulnerable to overfitting than MCDT and MCDTS, which makes them less reliable for future performance estimates. However, the differences between them were negligible compared to the variability linked to the different outcomes of climate models, the variability inherent to the model, and its accuracy. The MCS, MCDS and MCDTS models add an equation to improve the temporal structure of downscaled rainfall. The models predicted higher runoff from the detention-based extensive green roof, which is consistent with their properties; however, the change in performance was not significant compared to the stochastic uncertainty.
Using RCP8.5, the different downscaling and green roof models suggest that the shift in performance due to climate change depends highly on the location. The runoff exceedance is likely to increase in Bergen, to decrease in Lyon and Marseille, and to keep the same order of magnitude in the other locations. The results were compared to one of the current practices: the use of the variational method with a climate factor. The comparison highlighted the limitation of this practice, which provides a single estimate and underestimates the observed peaks. A hybrid method using downscaling on extreme events led to promising results by estimating a distribution of performance for peak runoff.
The models performed well in the 8 locations and 4 different climates. The use of a more advanced calibration procedure with Bayesian methods should improve the results. Similarly, a sensitivity analysis could improve the parametrization, especially for the models with depth and temperature dependency, in order to fix non-behavioural parameters. The current study does not include irrigation and snow modelling; a study centred on green infrastructure modelling is therefore needed to extend the results. In order to be applied in practice in event-based simulation for design purposes, the downscaling models need to be improved with a calibration procedure developed for extreme events rather than for the complete spectrum of observations as in the current study.