Intensity–duration–frequency (IDF) statistics describing extreme rainfall intensities in Norway were analysed to investigate how the shape of the curves is influenced by geographical conditions and local climate characteristics. To this end, principal component analysis (PCA) was used to quantify salient information about the IDF curves, and Bayesian linear regression was used to study the dependency of the shapes on climatological and geographical information. Our analysis indicated that the shapes of IDF curves in Norway are influenced by both geographical conditions and 24 h precipitation statistics. Based on this analysis, an empirical model was constructed to predict IDF curves in locations with insufficient sub-hourly rain gauge data. Our new method was also compared with a recently proposed formula for estimating sub-daily rainfall intensity based on 24 h rain gauge data. We found that Bayesian inference applied to a PCA representation of IDF curves provides a promising strategy for estimating sub-daily return levels for rainfall.

Climate change caused by an increased greenhouse effect is expected to bring changes in the hydrological cycle, including increased precipitation and more extreme rainfall

One problem associated with IDF curves is that rain gauge data have limited geographical coverage, making it difficult to analyse the risk of extreme rainfall everywhere. The Intergovernmental Panel on Climate Change's Sixth Assessment Report also noted that the exact levels of regional IDF characteristics may depend on the method as well as the resolution of downscaling when derived from climate model simulations

Another problem is that IDF curves may be inconsistent across durations.

In this study, we test an empirical statistical modelling approach to estimate the shape of the IDF curves, rather than the return values for each duration and return period. This approach is based on principal component analysis (PCA) and Bayesian inference. PCA was used to reduce the return values of all stations and return periods to a set of principal components (PCs) in the form of spatial patterns, with accompanying IDF shapes and eigenvalues. The leading PC spatial patterns were subjected to Bayesian linear regression and subsequently expanded to new stations based on climatological and geographical information. The analysis is included in its entirety in the R Markdown document provided in the Supplement. Predicting the shape of the IDF curves of all return periods simultaneously through PCA is a strategy that, to our knowledge, has not been applied before in this context. The motivation for this approach was the observation and expectation that the curves have simple and smooth shapes with regional similarities. In other words, the return values for rainfall intensities over different durations are related to each other, and the IDF curves contain a substantial degree of redundant information that can be exploited through PCA. The estimated return values were compared with the simple formula for estimating approximate values of return levels for sub-daily rainfall based on 24 h rain gauge data presented in

We used new IDF statistics from 74 Norwegian stations, consisting of return values for a range of durations (1, 2, 3, 5, 10, 15, 30, and 45 min and 1, 1.5, 2, 3, 6, 12, and 24 h) and return intervals (2, 5, 10, 20, 25, 50, 100, and 200 years), depicted in Fig.

Return values for 74 Norwegian stations for the

IDF curves for

Daily mean precipitation (pr) and air temperature at 2 m (

Principal component analysis (PCA) was applied to the IDF curves through singular value decomposition (SVD)

The purpose of the PCA was to reduce the complexity of the IDF data while preserving as much variability as possible. The procedure finds new variables that are linear functions of the original data, where the new variables (PCs) are uncorrelated with each other, and the variance is successively maximised, meaning that the leading PC describes the dominant pattern, and each successive mode represents a smaller and smaller portion of the variance. The original IDF curves can thus be reproduced by combining a few of the leading PCs, eigenvectors, and eigenvalues with little loss of information.
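As an illustration of the decomposition and truncated reconstruction described above, the following Python sketch applies SVD-based PCA to a synthetic matrix. It is not the analysis code of this study, which is provided as an R Markdown document in the Supplement. The matrix layout (stations in rows, durations in columns, a single return period), the synthetic power-law curves, and the function names `pca_svd` and `reconstruct` are assumptions made for the example.

```python
import numpy as np


def pca_svd(X):
    """PCA of a (stations x durations) matrix via SVD of the column-centred data."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per mode
    return mean, U, s, Vt, explained


def reconstruct(mean, U, s, Vt, k):
    """Rebuild the data from the k leading modes only."""
    return mean + (U[:, :k] * s[:k]) @ Vt[:k]


# Synthetic stand-in for IDF data (assumption): intensity decays with duration
# as a power law whose level and slope vary from station to station.
rng = np.random.default_rng(0)
durations = np.array(
    [1, 2, 3, 5, 10, 15, 30, 45, 60, 90, 120, 180, 360, 720, 1440], dtype=float
)  # minutes
n_stations = 74
level = rng.uniform(0.5, 2.0, size=(n_stations, 1))
slope = rng.uniform(0.5, 0.8, size=(n_stations, 1))
X = level * durations ** (-slope)
X += rng.normal(scale=0.01, size=X.shape)  # small noise

mean, U, s, Vt, explained = pca_svd(X)

# Root-mean-square reconstruction error using one and two leading modes.
err1 = np.sqrt(np.mean((X - reconstruct(mean, U, s, Vt, 1)) ** 2))
err2 = np.sqrt(np.mean((X - reconstruct(mean, U, s, Vt, 2)) ** 2))

print("variance fraction of the leading modes:", explained[:3])
print("reconstruction RMSE with 1 and 2 modes:", err1, err2)
```

In this layout, the columns of `U` hold the station weights (the spatial patterns), the rows of `Vt` hold the corresponding curve shapes across durations, and `s` holds the singular values, whose squares are proportional to the eigenvalues.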

We consider several criteria for assessing the number

There are other more objective but also more computationally demanding methods of selecting the number of PCs to retain, for example, based on cross-validation or bootstrapping, but these were not deemed necessary for our purposes. Based on the criteria above, we focused primarily on the two leading PCs in most of the analysis and statistical modelling of this study (see discussion in Sect.

Statistical relationships were established between the two leading PCs of the IDF curves, which represent the dominant spatial patterns of the data, and a set of geographical and meteorological predictors: the wet-day mean precipitation in the warm season (April–September) and cold season (October–March),

The statistical model can be described as follows:

The estimated principal components were then combined with the corresponding eigenvectors and eigenvalues:

Model fitting was performed by Bayesian linear regression, using the R package “BAS”

To evaluate model skill, a leave-one-out cross-validation was performed in which the predictand and predictor data of one station were excluded from model calibration. The statistical model was subsequently applied to the climatological and geographical data of the excluded station to estimate return values. This procedure was repeated so that independent estimates of the return values were obtained for all stations. The root-mean-square error (RMSE) and relative RMSE between the original return values and independent estimates obtained through cross-validation were then calculated as described in Appendix
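The leave-one-out procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the model used in this study: an ordinary least-squares fit with a single predictor stands in for the Bayesian linear regression, the station values are hypothetical, and the relative RMSE is taken here as the RMSE divided by the mean of the original values, which may differ from the definition given in the Appendix.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def leave_one_out(x, y):
    """Predict each station from a model calibrated on all other stations."""
    preds = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        preds.append(a + b * x[i])
    return preds


def rmse(obs, est):
    """Root-mean-square error between original and estimated values."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


# Hypothetical predictor (e.g. a wet-day mean precipitation) and predictand
# (e.g. a return value) for six stations; the numbers are made up.
x = [4.0, 5.5, 6.1, 7.3, 8.0, 9.2]
y = [10.0, 13.5, 14.0, 17.5, 18.0, 22.0]

preds = leave_one_out(x, y)
error = rmse(y, preds)
relative_error = error / (sum(y) / len(y))  # assumed definition of relative RMSE

print("independent estimates:", preds)
print("RMSE:", error, "relative RMSE:", relative_error)
```

Because each estimate comes from a model that never saw the station being predicted, the resulting RMSE measures skill at unseen locations rather than goodness of fit.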

Confidence intervals of the estimated return values,

The IDF curves estimated with Bayesian inference of principal components, as described above, were compared with the simple approximate formula for estimating return values derived by

As mentioned earlier, principal component analysis was applied to the IDF statistics as described in Sect.

Summary figure describing different aspects of the PCA of the IDF curves. Panel

The spatial pattern associated with the first principal component,

A reconstruction of the IDF statistics (Fig. S8) from only the first PC showed that it determined the basic slope and level of the IDF curves. The second PC altered the curvature: in the stations where

Statistical models were fitted for the first two principal components of the IDF statistics,

Marginal posterior inclusion probability (pip) of the coefficients for different predictor variables in the statistical models for the three leading principal components of the IDF curves,

Figures

Demonstration of the effect of the model parameters

A new set of IDF curves was constructed by combining the predicted principal components

Estimated 200-year return values for eight example stations:

A comparison between the original return values and values estimated from the two leading original PCs (Fig.

Original return values plotted against the error of the estimated return values (original–estimated) calculated

New IDF curves were generated by applying the statistical models to meteorological and geographical data for 240 stations in Norway, including Svalbard and Jan Mayen (Fig.

Estimated return values for 240 Norwegian stations for the

Comparing the original return values (Figs.

The regression results presented in Table

An advantage of the proposed method, applying PCA to the IDF data, is that all durations and return periods are considered together. This is not only computationally efficient but also reduces the influence of uncertain or erroneous individual return levels. There are also some potential pitfalls with this approach. First of all, only the first two PCs could be modelled, which could have too much of a smoothing effect on the IDF curves. There could also be nonlinear effects that were not captured by the linear models, which could result in underestimated variations in the estimated PCs. The quality of the estimated return levels was limited by the quality of the IDF data that they were based on. A different set of IDF statistics would likely result in statistical models with similar predictors and coefficients, but the PCA and ultimately the estimated IDF curves would be defined by the shape of the IDF data.

IDF statistics tend to have large uncertainties attached to them, as illustrated by the confidence intervals in Fig.

A more direct evaluation of model skill might be between the original return values (that were also obtained as median values of a distribution; see

Expanding IDF curves from gauged to ungauged locations or from current to future periods relies on the assumption that the transfer functions are stationary and that the climate data used in the analyses are representative of the location or period with which they are associated. This is not always true. If the reference period is shorter than the relevant cycles of natural climate variability or if there is a trend or large interannual variability, the outcome will likely be sensitive to the precise period of the input data.

Given that the IDF statistics in Norway are estimated for each duration separately and often based on short time series

As an alternative to PCA, we tried using a weighted set of polynomials to represent the shapes of the IDF curves, fitting return values against time intervals (Fig. S6). This method was not successful. First- and second-order polynomials were not a good fit, and higher-order polynomials, while fitting the data reasonably well for the time intervals for which return value data were available, had wiggly shapes with local minima and maxima that did not make sense as IDF curves.

One interesting question is how to use the IDF modelling approach presented here in the context of climate change projections. For the case of Norway, the key variables expected to change are

We obtained predictions of the shape of IDF curves in Norway with Bayesian inference applied to a PCA representation of the IDF data and conclude that this approach provides a useful strategy that can be utilised for regionalisation and downscaling of future climate projections.

Statistical measures of the discrepancy between original return values

The code used to carry out this analysis is available in an R Markdown document provided in the Supplement. The daily precipitation and temperature data used in this paper are publicly available at

The supplement related to this article is available online at:

KMP and REB contributed to the data analysis. JL supplied the IDF data. All authors participated in writing the manuscript.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the Norwegian Meteorological Institute and the KlimaDigital project.

This research has been supported by the Norges Forskningsråd (grant no. 281059).

This paper was edited by Dominic Mazvimavi and reviewed by two anonymous referees.