Statistical models of the relationship between precipitation and topography
are key elements for the spatial interpolation of rain-gauge measurements in
high-mountain regions. This study investigates several extensions of the
classical precipitation–height model in a direct comparison and within two
popular interpolation frameworks, namely linear regression and kriging with
external drift. The models studied include predictors of topographic height
and slope at several spatial scales, a stratification by types
of a circulation classification, and a predictor for wind-aligned
topographic gradients. The benefit of the modeling components is
investigated for the interpolation of seasonal mean and daily precipitation
using leave-one-out cross-validation. The study domain is a north–south
cross section of the European Alps (154 km

The significance of the topographic predictors was found to strongly depend on the interpolation framework. In linear regression, predictors of slope and predictors at multiple scales reduce interpolation errors substantially. Yet even with as many as nine predictors, the resulting interpolation still replicates the across-ridge variation of climatological mean precipitation poorly. Kriging with external drift (KED) leads to much smaller interpolation errors than linear regression, but this is achieved with a single predictor (local topographic height), whereas the incorporation of more extended predictor sets brings only marginal further improvement. Furthermore, the stratification by circulation types and the wind-aligned gradient predictor do not improve over the single-predictor KED model. As for daily precipitation, interpolation accuracy improves considerably with KED and the use of a single predictor field (the distribution of seasonal mean precipitation) as compared to ordinary kriging (i.e., without any predictor). Nonetheless, information from circulation types did not improve interpolation accuracy.

Our results confirm that the consideration of topography effects is important for spatial interpolation of precipitation in high-mountain regions. But a single predictor may be sufficient, and taking appropriate account of the spatial autocorrelation (by kriging) can be more effective than the development of elaborate predictor sets within a regression model. Our results also question a popular practice of using linear regression for predictor selection in spatial interpolation; however, they support the common practice of using a climatological mean field as a background in the interpolation of daily precipitation.

High-mountain ranges contribute to the supply and storage of freshwater and river flow in many regions of the world (e.g., Viviroli et al., 2007). The role of mountains in extracting moisture from the atmosphere manifests in numerous regional anomalies and gradients in the distribution of the global precipitation climate (e.g., Basist et al., 1994; Schneider et al., 2013). Accurate knowledge of the distribution and variation of rain and snowfall is crucial for numerous planning tasks concerned, for example, with water resources, water power, agriculture, glaciology and natural hazards (e.g., Greminger, 2003; Holzkämper et al., 2012; Machguth et al., 2009; Yates et al., 2009). A convenient source of information is spatial analyses of observed precipitation, obtained by interpolation onto a regular grid, comprehensively over large areas. Such grid data sets have become of interest also for monitoring climate variations and for evaluating model-based reanalyses and climate models (e.g., Alexander et al., 2006; Bukovsky and Karoly, 2007; Frei et al., 2003; Schmidli et al., 2002).

The construction of accurate precipitation grid data sets for high-mountain regions is confronted with the challenge of complex spatial variations. Even with idealized topographic settings and flow configurations (e.g., isolated hill or ridge, constant flow), situations can be distinguished where precipitation maxima occur over the windward slope, the crest, or the downwind slope of a topographic obstacle (e.g., Sinclair et al., 1997; Smith, 1979). Distributions depend on the height and scale of the obstacle, and the strength, static stability and moisture profile of the impinging flow. More complex topographic shapes, transient weather systems, convection, and the drift of hydrometeors quickly complicate the picture (e.g., Cosma et al., 2002; Fuhrer and Schär, 2005; Houze et al., 2001; Roe, 2005; Sinclair et al., 1997; Steiner et al., 2003). Therefore, the distribution of long-term mean precipitation is, in many regions, a superposition of several distinct responses to topography, which act at different space scales, involve several characteristics of the topography (not just height), and pertain to different flow situations.

A further complication for spatial analysis in mountain regions is posed by the limited spatial density of rain gauges, the standard device for climatological inference on precipitation. Even in comparatively densely instrumented areas, such as the European Alps, the networks do not resolve contrasts between individual valleys and hills explicitly, and they miss out episodic fine-scale patterns familiar from radar observations and numerical models (e.g., Bergeron, 1961; Frei and Schär, 1998; Germann and Joss, 2001; Zangl et al., 2008). Moreover, the distribution of rain gauges in complex terrain is often biased, with a majority of measurements taken at valley floors, while steep slopes and high elevations are underrepresented (e.g., Frei and Schär, 1998; Sevruk, 1997). The sampling bias entails a risk of systematic errors in spatial interpolation that can impinge upon estimates on a larger scale, such as for averages over river catchments (e.g., Daly et al., 1994; Sinclair et al., 1997).

In this context, models of the relationship between precipitation and
topography constitute an essential element of spatial interpolation methods.
Their purpose is to enhance the methods' capabilities in describing
variations not explicitly resolved by the observations and to reduce the
risk of systematic errors related to the non-representativity of the
measurement network. Approaches for considering precipitation–topography
relationships in interpolation methods can roughly be grouped into

In this study we explore and compare several ideas for the modeling of precipitation–topography relationships in the framework of empirical statistical models. Our specific focus is on models that (a) take account of the multi-scale nature of the relationship, (b) consider responses both to slope and elevation of the topography, (c) involve a dependency on the direction of the large-scale flow, and (d) examine the potential of a stratification by circulation types. The value of the different modeling components is assessed in terms of the skill of a geostatistical interpolation method that has these models incorporated and is applied for the estimation of fields of seasonal mean and daily precipitation in a sub-region of the European Alps.

Systematic topography effects on precipitation are usually difficult to discern in observations at short timescales (e.g., for daily totals). Precipitation–topography relationships are therefore mostly estimated from long-term averages, which are then used, via a climatological background field, for the interpolation of shorter duration totals (Haylock et al., 2008; Rauthe et al., 2013; Widmann and Bretherton, 2000).

A common model of topography effects is that of a linear relationship between climatological (seasonal or monthly) mean precipitation and in situ topographic elevation. Precipitation–height gradients have been considered using various interpolation methodologies, such as: linear regression by using height as a predictor (e.g., Gottardi et al., 2012; Rauthe et al., 2013; Sokol and Bližnák, 2009), several variants of kriging by using a digital elevation model as a secondary variable (Allamano et al., 2009; Goovaerts, 2000; Hevesi et al., 1992; Phillips et al., 1992; Tobin et al., 2011), thin-plate spline interpolation by using height as a third regionalization variable (Haylock et al., 2008; Hutchinson, 1998), and triangular interpolation by adopting height corrections (Tveito et al., 2005). The assumption of these procedures is that local height is a key explanatory variable of the distribution of precipitation and that the relationship, commonly estimated over larger domains, is representative at the scale relevant for the interpolation (i.e., at and below the spacing of stations).

Map of long-term mean winter precipitation (mm day

Three types of extensions of the aforementioned methodologies have been proposed: the first introduced a range of physiographic predictors (not just height) and/or predictors representing smoothed versions of the actual topography (e.g., Basist et al., 1994; Benichou and Le Breton, 1986; Gyalistras, 2003; Perry and Hollis, 2005; Prudhomme and Reed, 1998; Sharples et al., 2005). Additional predictors (e.g., slope, exposure) were found to significantly increase the explained variance compared to only height (e.g., Gyalistras, 2003; Prudhomme and Reed, 1998), and digital elevation models smoothed to resolutions of 5–50 km (depending on the region) were found to be more powerful predictors compared to high-resolution topography (e.g., Prudhomme and Reed, 1998; Sharples et al., 2005). Conversely, the second extension remains with univariate height dependencies, but considers the relationship to be spatially variable (Brunetti et al., 2012; Daly et al., 1994; Gottardi et al., 2012). The aim is to focus on dependencies at scales that are not explicitly resolved by the station network and therefore are particularly relevant for interpolation. There are different emphases in the two extensions between robustness and local representativity of the precipitation–topography model used for interpolation. The third extension of traditional precipitation–height models is to incorporate information on atmospheric flow conditions into the interpolation; Kyriakidis et al. (2001) constructed new rainfall predictors by combination of lower-atmosphere flow and moisture with local terrain height and slope. When used in kriging these dynamical predictors yielded more accurate interpolations of the seasonal mean precipitation compared to using only elevation. 
Hewitson and Crane (2005) modified the weighting scheme of a daily interpolation method to depend on synoptic state (discrete types of daily low-level circulation) in order to account for the varying short-range representativity of station measurements. Gottardi et al. (2012) used the circulation regime of the day under consideration to estimate orographic effects specifically for different weather conditions. All these ideas built on empirical evidence that the mesoscale precipitation distribution in complex terrain varies considerably between days with different large-scale flow conditions (Cortesi et al., 2013; Schiemann and Frei, 2010).

In this study we build on, extend, and test ideas of all three extensions in a subregion of the European Alps. We compare several sets of physiographic predictors with regard to their relevance for high-resolution precipitation interpolation. Apart from including height and directional gradients, our set encompasses predictors at several spatial scales simultaneously in order to explicitly distinguish between patterns resolved and unresolved by the station network. We also compare the role of predictor setting between multivariate linear regression and kriging with external drift to assess how a model of spatial autocorrelation (kriging) can compensate for extensive predictor sets. We further examine the possibility of stratifying seasonal means by independent analyses for composites of a circulation-type classification and by including predictors of the pertinent circulation terrain effect. Most of our analyses focus on interpolations of seasonal mean precipitation, but we also assess the relevance of circulation-type dependent background fields for the interpolation of daily precipitation. Essential for all our comparisons is that interpolation errors will be examined as a function of topographic height and for both systematic and random error components. The main purpose of our study is to gain insight into the role of different approaches to precipitation–topography modeling, but some of our analyses also explore possibilities to improve an interpolation method previously developed for the generation of a precipitation grid data set for the entire Alpine region (Isotta et al., 2013).

The region of the European Alps is an interesting example for studying interpolation procedures and pertinent models of the precipitation–topography relationship. There is an exceptional density of long-term rain-gauge observations (see Fig. 1), which allows modeling approaches of larger complexity than in sparsely gauged mountain regions. Moreover, there is a broad range of topographic scales (from hundreds of kilometers for the main ridge down to a few kilometers for individual massifs) and variations in ridge height (2000–3000 m for the main ridge down to a few hundred meters for adjacent hill ranges). Accordingly, the distribution of mean precipitation reveals several nested patterns of the precipitation response that are indicative of its multi-scale nature (see Fig. 1).

This study is part of the project European Reanalysis and Observations for Monitoring (EURO4M). The study is organized as follows: in Sect. 2 we introduce the study domain and the data. The methods of spatial analysis and the procedure of evaluation are described in Sect. 3. The results of the evaluation are then presented and discussed in Sect. 4, and the conclusions of this study are drawn in Sect. 5.

In this study we consider a subdomain of the Alps (11

The rain-gauge data for this study (Fig. 2a) were obtained from the German
Weather Service (DWD, for Germany), from the Austrian Federal Ministry of
Agriculture, Forestry, Environment and Water (for Austria) and from Servizio
Meteorologico and Ufficio Idrografico Bolzano Alto Adige (for Italy). The
data set is a subset of 440 stations out of a pan-Alpine compilation of
high-resolution daily rain-gauge time series extending over the period
1971–2008 (Isotta et al., 2013). On average the station density is 1 station
per 70 km

As in other mountainous regions, the distribution of the stations in our study domain has a limited representativity with respect to terrain height (Fig. 2b). High-elevation areas (> 1500 m m.s.l.) are significantly underrepresented. For example, elevations above 1500 m m.s.l. contribute about 25 % of the total area but are represented by only 6 % of the stations. This setting involves a risk of precipitation estimates for high-elevation areas being biased due to inaccurate interpolation between valley stations. This will be given particular attention in the assessment of interpolation methods later.

The rain-gauge time series underwent different quality control procedures at the original data providers. In addition, they were jointly and rigorously checked for raw errors after compilation using criteria of temporal and spatial consistency and physical plausibility (for details see Isotta et al., 2013). One problem with the quality of the data is, however, posed by the systematic measurement error emanating from wind-induced undercatch, wetting, and evaporation losses (Groisman and Legates, 1994; Neff, 1977; Sevruk, 2005). Sevruk (1985) and Richter (1995) estimate the systematic measurement error in the Alps to range from about 7 % (5 %) over the flatland regions in winter (summer) to 30 % (10 %) above 1500 m m.s.l. The data used in this study are not corrected for these systematic errors. Indeed, water balance considerations in the Alps have challenged existing correction procedures (Schädler and Weingartner, 2002; Weingartner et al., 2007). The systematic errors may affect the strength and estimation of empirical precipitation–topography relationships. However, given that the spatial variability of mean precipitation across the domain (see the example in Fig. 2a) is much larger than the range of expected systematic errors, we assume that these errors do not significantly affect the conclusions of the present study.

Our statistical analyses are conducted with estimates of mean precipitation at the above stations, that is, with seasonal means over a multi-year period or with means over all days belonging to the same class of a daily circulation-type classification. Because many rain-gauge series cover only part of the full 38-year period, care is required in establishing robust and comparable mean values. For this purpose quantitative tests have been carried out to determine the minimum number of days required to build a mean value of a given accuracy. The tests were conducted with bootstrap experiments (sampling across days) over the time series of the 20 most complete station records. The error metric is based on the relative mean root-transformed error presented in Sect. 3.4. Our accuracy requirement was that the probability of a sampling error larger than 10 % of the “full” mean (i.e., mean over the complete time series) should be smaller than 5 %. The error thresholds are somewhat arbitrary but are chosen to guarantee reliable climatic estimates (compared to the spatial variations) while retaining enough data. The resulting minimum requirement for the available length of the time series varies between seasons and circulation classes. Stations not fulfilling this minimum requirement are discarded from the analysis. As a result, the station sample varies between analyses with different seasons and between seasonal and circulation-type stratifications. Typically, the selection procedure eliminates 5–15 % of the total number of stations, leaving between 317 and 420 time series depending on stratification.
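The bootstrap test described above can be illustrated with a minimal sketch. It is not the procedure used in the study: for simplicity it measures the sampling error as the relative deviation of the plain sample mean (the study uses a metric based on the relative mean root-transformed error of Sect. 3.4), and the function names, the number of bootstrap replicates and the candidate sample sizes are assumptions made for illustration only.

```python
import random


def exceedance_probability(daily_values, n_days, rel_tol=0.10, n_boot=2000, seed=0):
    """Bootstrap estimate of P(|mean of n_days resampled days - full mean| > rel_tol * full mean).

    Assumption: plain relative error of the mean, a stand-in for the
    root-transformed metric used in the study.
    """
    rng = random.Random(seed)
    full_mean = sum(daily_values) / len(daily_values)
    exceed = 0
    for _ in range(n_boot):
        # Sample days with replacement ("sampling across days").
        sample = [rng.choice(daily_values) for _ in range(n_days)]
        sample_mean = sum(sample) / n_days
        if abs(sample_mean - full_mean) > rel_tol * abs(full_mean):
            exceed += 1
    return exceed / n_boot


def minimum_days(daily_values, candidates, max_prob=0.05, **kwargs):
    """Smallest candidate sample size whose exceedance probability is below max_prob.

    Returns None if no candidate meets the requirement.
    """
    for n_days in sorted(candidates):
        if exceedance_probability(daily_values, n_days, **kwargs) < max_prob:
            return n_days
    return None
```

With a long daily series from a near-complete station, `minimum_days(series, candidates=[500, 1000, 2000])` would return the smallest of the hypothetical candidate lengths that satisfies the 10 %/5 % requirement stated above, or `None` if none does.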

The circulation-type classification chosen in this study is the PCACA classification (Philipp et al., 2010; Yarnal, 1993). It uses daily mean sea level pressure distributions as input for a hierarchical cluster analysis of principal components. The classification catalog used here was taken from an application of PCACA in the framework of COST Action 733 over an extended Alpine domain, using sea level pressure fields from ERA40 and ERA-Interim (Dee et al., 2011; Uppala et al., 2005) and with a target number of 9 clusters (Weusthoff, 2011). The choice of the 9-types classification (PCACA9) is a compromise between differentiation of daily circulation patterns and robustness of mean values (i.e., enough days within a weather class). In a comprehensive intercomparison, PCACA9 was found to be particularly skillful in explaining the distribution of mesoscale daily precipitation in the Alpine region (Schiemann and Frei, 2010). The geostrophic wind fields for each of the clusters were calculated from sea level pressure composites based on ERA40 (Uppala et al., 2005).

Our study on the significance and utility of physiographic predictors for
spatial interpolation firstly deals with seasonal mean
precipitation, wherein topographic effects on the distribution stand
out more clearly from spatial variations of episodic nature. The
methodological framework employed is that of kriging with external drift
(KED; Schabenberger and Gotway, 2005), an interpolation model with a
component for multilinear dependence on pre-defined variables (external
drift or trend, here a set of topographic predictors) and a component of
spatial autocorrelation. Two limiting cases of KED will also be considered
for comparison: multilinear regression models (LM), which comprise the
linear dependence on topographic predictors only (i.e., no spatial
autocorrelation), and ordinary kriging (OK) with only the spatial
autocorrelation component included (i.e., omitting dependence on predictors).
As topographic predictors, a set of candidates will be considered, including
elevation (

Secondly, we compare the quality of daily precipitation interpolations when using various climatologies (with different predictor sets: seasonal or circulation-type stratification) as a background reference (Widmann and Bretherton, 2000). As in the seasonal experiments, KED will provide the methodological framework for the daily interpolation, but using the previously determined background reference fields as trend variables.

Interpolation experiments conducted for long-term seasonal mean precipitation. Interpolation method, predictors used and the total number of predictors included.

Interpolation experiments conducted for daily precipitation. The name of a scheme is a combination of the name of the daily scheme and the background field used.

The following subsections describe in detail the methodological setup (Sect. 3.1), the derivation and usage of the topographic predictor sets (Sect. 3.2), the method for daily interpolation (Sect. 3.3), and the cross-validation procedure (Sect. 3.4). Table 1 lists the experiments conducted for seasonal precipitation with the different methods and predictor sets, using the acronyms just introduced. The experiments conducted for daily interpolation are listed in Table 2.

For the interpolation concept, the present study builds on kriging with
external drift (Schabenberger and Gotway, 2005) and two simplified
limiting cases of it. KED belongs to a broad class of geostatistical
interpolation methods that estimate values at target locations as the best
linear, unbiased combination of sample observations, assuming
that the field of interest is a realization of a second-order stationary
Gaussian process (see e.g., Cressie, 1993; Diggle and Ribeiro, 2007). KED
considers the observations

In our application of KED for seasonal mean precipitation, the trend variables

In all our applications, the semivariogram is assumed to be exponential with a nugget, sill, and range as parameters. Despite the two-dimensional character of our study domain (i.e., ridge aligned in the east–west direction), we have chosen an isotropic variogram model in all our experiments. The reason for this is that the deterministic model component in KED comprises the angular asymmetry of the variations in precipitation implicitly via predictor fields that represent the orientation of the ridge. Predictors of height and slope, especially at larger space scales, vary in the north–south direction more than in the west–east direction. Introducing an anisotropy in the stochastic model part (variogram) would likely compete with the significance of these predictors for interpolation. As a consequence, the results would become very specific to our study domain with its simple geography, where the absence of predictors can be compensated for by variogram anisotropy. In a more complex domain (e.g., with a topography orientation changing across the region), such a compensation is far less effective and the incorporation of informative predictors more decisive. In this study, we are interested in predictor dependence in this more general setting, which is why we deliberately refrain from the added flexibility of anisotropic variograms. The choice of the exponential variogram was motivated by simplicity. Preliminary sensitivity experiments with a spherical variogram (again allowing for a nugget) showed only very minor differences in results compared to the exponential model.
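For illustration, an isotropic exponential semivariogram with nugget can be written as a short function. The sketch below uses one common parameterization, in which the total sill is the sum of the nugget and a partial sill and the range parameter is the e-folding distance; whether the "sill" and "range" of the study follow exactly this convention is an assumption, and the function and argument names are hypothetical.

```python
import math


def exponential_semivariogram(h, nugget, partial_sill, range_param):
    """Isotropic exponential semivariogram with nugget.

    gamma(0) = 0 and, for h > 0,
    gamma(h) = nugget + partial_sill * (1 - exp(-h / range_param)).
    Assumed parameterization; other conventions rescale the range parameter.
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))
```

In this parameterization the semivariance approaches `nugget + partial_sill` asymptotically; about 95 % of the partial sill is reached at a lag of three times `range_param`, which is often quoted as the practical range.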

All model parameters (trend coefficients and variogram parameters) are estimated jointly using the method of restricted maximum likelihood (Schabenberger and Gotway, 2005), which accounts for biases from limited sample size/large predictor sets. The utilization of a likelihood-based estimation procedure is central in our application. Estimating trend coefficients and variogram parameters jointly means that the procedure implicitly distinguishes between variations in the observations that are better explained by the predictors and variations that are better explained by spatial covariance (spatial continuity). This procedure ensures optimality of the parameter estimates and consistency of assumptions with the stochastic model of Eq. (1) (see also Diggle and Ribeiro, 2007). Prior estimation of predictor coefficients by linear regression followed by ordinary kriging of residuals, an estimation procedure frequently applied, has a risk of disturbing spatial autocorrelation when the relationship to predictors is the sole source for explaining variance in the regression step.

A complication of adopting KED in the present study is posed by the
assumption of a multivariate Gaussian with stationary variance in space for
the stochastic component (the residuals of the trend). This condition is
rarely met with precipitation data, which have a distribution bounded by zero, are positively skewed, and show larger variance in areas of high
versus low precipitation. A partial remedy is a prior monotonic
transformation of the data, the application of KED in transformed space, and
subsequent back-transformation of the estimated kriging distribution. The
procedure, commonly known as trans-Gaussian kriging (Schabenberger and
Gotway, 2005), has been adopted in all KED experiments of the present study,
using the Box–Cox power transformation (Box and Cox, 1964):

Here we prescribe the transformation parameter at

It is worth noting here that the Box–Cox transformation improves compliance
with model assumptions only with respect to non-stationarity related to the
skewness of precipitation amounts. Precipitation intermittency (the
existence of contiguous dry/wet areas) is responsible for non-stationarities
that the transformation does not eliminate. Note that, with

The KED model of Eq. (1) comprises two simplifying special cases that
will be considered in this study as alternative methods of spatial
interpolation. The first is to assume that

The second special case of the KED model (1) is that in which topographic
predictors are omitted, i.e., presuming

All computations are done in R (R Core Team, 2012) using the geostatistics package geoR (Diggle and Ribeiro, 2007).
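As an illustration of the Box–Cox power transformation used for trans-Gaussian kriging, the following sketch implements the standard form of the transformation and its inverse. It is written in Python for illustration rather than in the R/geoR environment used for the study. The value of the transformation parameter in the usage note is a placeholder and not the value prescribed in this study, and the sketch inverts individual values only; it does not implement the back-transformation of the full kriging distribution.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transformation (Box and Cox, 1964) of a non-negative value."""
    if y < 0:
        raise ValueError("Box-Cox requires non-negative data")
    if lam == 0:
        if y == 0:
            raise ValueError("log transform (lam = 0) is undefined at y = 0")
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Inverse of box_cox for a single transformed value."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base < 0:
        raise ValueError("transformed value lies outside the range of the transformation")
    return base ** (1.0 / lam)
```

For example, with the placeholder `lam = 0.5`, `box_cox(4.0, 0.5)` gives 2.0 and `inverse_box_cox` maps it back to 4.0. Applying the inverse to a kriging mean in transformed space generally yields a biased estimate of the mean in original space, which is why the back-transformation should operate on the estimated kriging distribution.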

The topographic predictors used in this study are based on the DEM of the Shuttle Radar Topography Mission (SRTM; Farr et al., 2007). SRTM was obtained using both C- and X-band microwave radars and has originally a resolution of about 90 m. In this study we use the SRTM elevation model on a 1 km grid of the Lambert Azimuthal Equal Area Coordinate Reference System (ETRS89-LAEA; Annoni et al., 2001).

The three main topographic predictors considered are fields of elevation and gradients in the two cardinal directions across the ridge (north–south) and along the ridge (east–west). Several predictors for each of these quantities will be considered, describing variations in elevation and gradients at different space scales. These were derived from smoothed versions of the original DEM, after applying a Gaussian kernel with window widths of 1, 5, 10, 25 and 75 km, respectively. A predictor set that involves, for example, elevation and gradients at three space scales comprises a total of nine different predictor fields: three for elevation, three for the north–south gradient and three for the east–west gradient. Values of the predictors at the station locations were always taken from the nearest grid cell of the predictor fields.

Care was required to avoid co-linearity between predictors when combining several of them for the various space scales. To this end, predictors for a scale were defined as the difference between the variable at that scale and the same variable at the next larger scale. For example, the 25 km elevation predictor in a set involving the scales 1, 25 and 75 km is obtained by calculating the difference between the 25 km and the 75 km smoothed versions of the DEM.
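The construction of scale-separated predictors can be sketched for a one-dimensional elevation profile. The example is a simplified illustration under several assumptions: it works on a 1-D transect rather than on the 2-D DEM, it interprets the window width as the standard deviation of the Gaussian kernel (in grid cells), it truncates and renormalizes the kernel at the profile boundaries, and it keeps the largest scale as the smoothed field itself. The function names are hypothetical.

```python
import math


def gaussian_smooth(profile, sigma_cells):
    """Smooth a 1-D profile with a Gaussian kernel truncated at 3 sigma.

    Weights are renormalized near the boundaries (assumed edge treatment).
    """
    n = len(profile)
    radius = max(1, int(math.ceil(3 * sigma_cells)))
    smoothed = []
    for i in range(n):
        weight_sum = 0.0
        value_sum = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = math.exp(-0.5 * ((j - i) / sigma_cells) ** 2)
            weight_sum += w
            value_sum += w * profile[j]
        smoothed.append(value_sum / weight_sum)
    return smoothed


def scale_separated_predictors(profile, scales):
    """Predictor at each scale = field at that scale minus field at the next larger scale.

    The largest scale has no larger neighbor and is kept as the smoothed
    field itself (an assumption of this sketch).
    """
    ordered = sorted(scales)
    smoothed = {s: gaussian_smooth(profile, s) for s in ordered}
    predictors = {}
    for k, s in enumerate(ordered):
        if k + 1 < len(ordered):
            larger = smoothed[ordered[k + 1]]
            predictors[s] = [a - b for a, b in zip(smoothed[s], larger)]
        else:
            predictors[s] = list(smoothed[s])
    return predictors
```

By construction the predictors of all scales sum to the field at the finest scale, so each predictor carries only the variation that is added when moving from the next larger scale to the scale in question. This is what limits the co-linearity between predictors of different scales.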

Apart from analyzing fields of seasonal mean precipitation directly from seasonal mean station observations, we also investigate the potential for recombining a seasonal mean field from several separate spatial analyses for average precipitation within the classes of a circulation-type classification. Precipitation–topography relationships may be more clearly established under conditions of similar large-scale circulation, and this could assist the derivation of a seasonal mean field through further stratification.

The consideration of circulation types permits the introduction of an
additional circulation-guided topographic predictor. It is defined as

Figure 3 illustrates examples of the wind-aligned gradient

Consideration of

Apart from

Illustration of

Our experiments on the interpolation of daily precipitation also make use of the concepts of kriging with external drift and ordinary kriging (Sect. 3.1) as used for the interpolation of seasonal mean precipitation. However, rather than using the topographic predictors directly as trend variables, the daily interpolation adopts fields of seasonal mean or circulation-type mean precipitation as trend variables. Precipitation measurements at short timescales usually exhibit large spatial variations from which systematic topographic effects are difficult to estimate. The solution followed here is to inject this information via pre-calculated long-term averages. The approach is somewhat related to the common use of climatological mean fields as reference (e.g., New et al., 2000; Widmann and Bretherton, 2000), but instead of adopting the reference as scaling factor, the approach uses it as the trend variable in KED.

Following the main focus of our study on precipitation–topography
relationships, we conduct experiments with daily interpolations and shed
light on the role of the climatological reference fields. To this end the
interpolation errors are compared between different specifications of the
trend variable (see Table 2 for a list of experiments). The trend settings
include (a) a long-term seasonal mean built with topographic predictors–
experiment KED(KED1e); (b) the long-term mean of the day's pertinent
circulation type–experiment KED(KED1e

Our comparison and discussion of the various interpolation experiments is based on systematic leave-one-out cross-validation: each station of the domain is withheld in turn, and the interpolation is estimated at its location from the remaining stations, using the predictors for that station.
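The leave-one-out loop can be sketched generically as follows. The interpolator used here is inverse-distance weighting, chosen only because it is short and self-contained; it is a stand-in and not one of the methods evaluated in this study (LM, OK, KED). The station data structure and the function names are assumptions made for illustration.

```python
import math


def idw_predict(train, target, power=2.0):
    """Inverse-distance-weighted estimate at the target location (stand-in interpolator)."""
    num = 0.0
    den = 0.0
    for s in train:
        d = math.hypot(s["x"] - target["x"], s["y"] - target["y"])
        if d == 0.0:
            return s["value"]  # co-located station: return its value directly
        w = d ** -power
        num += w * s["value"]
        den += w
    return num / den


def leave_one_out(stations, predict):
    """Withhold each station in turn and predict it from all remaining stations.

    Returns a list of (observed, predicted) pairs, one per station.
    """
    results = []
    for i, target in enumerate(stations):
        train = stations[:i] + stations[i + 1:]
        results.append((target["value"], predict(train, target)))
    return results
```

Any interpolation method can be passed as `predict`, provided that all of its parameters (for KED, the trend coefficients and the variogram parameters) are estimated without using the withheld observation. The resulting pairs of observed and predicted values are the input to the error scores.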

Two error scores will be used to summarize the performance of the methods.
The first is a measure of the relative bias (

Depending on the data stratification and interpolation method, between 317 and 420 stations are available for estimation and interpolation. To ensure maximum comparability of the evaluation results, however, we use a fixed set of 317 stations to calculate the above error scores.

Linear regression is often considered an exploratory framework with which potential predictors for a trend model of KED can be compared. We therefore develop our discussion starting with results from the special case when spatial autocorrelation is neglected and then pursue the changes when introducing autocorrelation in combination with topographic predictors.

The number of possible regression models with three variables (elevation,
north–south gradient, east–west gradient) and five different spatial scales
is very large. We have selected three of them to illustrate our discussion.
The simplest (LM1e; see Table 1) only has
elevation at the finest spatial scale (1 km) as predictor. It is a
traditional and widespread model of topography effects on precipitation
(see Sect. 1). The second (LM3e; see Table 1) also involves elevation
only, but at three different space scales (75, 25, 1 km). The third
model (LM9eg; see Table 1) involves elevation and gradients (in both
cardinal directions) at the three space scales (75, 25, 1 km).
Experiments with all five space scales (including 5 and 10 km)
showed that the three selected scales led to the largest values in adjusted

Distribution of DJF long-term mean precipitation (mm per day) as
estimated by

Note that a formal and automated model selection procedure (using step-wise linear regression) was not feasible in our application, because the predictors for one scale depend on those retained for other scales (elimination of co-linearity; see Sect. 3b).
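For orientation, the comparison of regression models by adjusted explained variance can be sketched as below. This is an illustration under stated assumptions: the predictor columns are random synthetic stand-ins for elevation at the 75, 25 and 1 km scales, the multi-scale decomposition of Sect. 3b is not reproduced, and `adjusted_r2` is a hypothetical helper using the standard definition 1 - (1 - R^2)(n - 1)/(n - p - 1).

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R^2 of an OLS fit with intercept; X has shape (n, p)."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Synthetic stand-ins for elevation at three scales (75, 25, 1 km columns).
rng = np.random.default_rng(0)
n = 50
elev = rng.normal(size=(n, 3))
y = 3.0 + elev @ np.array([1.0, 0.5, 0.8]) + rng.normal(scale=0.3, size=n)

r2_lm1e = adjusted_r2(elev[:, [2]], y)   # analogue of LM1e: 1 km elevation only
r2_lm3e = adjusted_r2(elev, y)           # analogue of LM3e: three scales
```

Adjusted R^2 penalizes additional predictors, so it only rises when a new predictor explains more variance than would be expected by chance.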

Table 3 lists values of adjusted

Adjusted

Despite its decent values in explained variance, the nine-predictor model LM9eg shows elementary deficiencies in reproducing the distribution of rain-gauge measurements in the domain. These are illustrated for the example of DJF mean precipitation in Fig. 4a. Precipitation is systematically overestimated over a wide flatland belt adjacent to the ridge (see full red square), underestimated along the foothills and, again, overestimated in interior parts of the ridge (see dashed red square). Apparently the larger-scale topographic predictors provide, in linear combination, only a partial match to the observed north–south profile, and the resulting prediction tends to smooth out some of the variations. Similar types of deficiencies (differing in exact location) were evident with other combinations, with the full set of space scales, and during other seasons. There was always clear spatial clustering in the prediction errors (regression residuals). It seems that, even with quite comprehensive predictor sets, it is difficult to capture in a regression model all aspects of the precipitation field resolved by the station network. Surprisingly, this is even the case with the comparatively simple north–south profile of this study, for which the construction of a suitable predictor set may have first looked easy.

Ordinary kriging seeks to represent the precipitation distribution entirely without topographic predictors. The corresponding estimation (Fig. 4b) has a smooth appearance but reproduces the characteristic north–south contrasts between flatland, foothills and inner Alps. Hence, OK amends some of the regional deficiencies of the linear regression model of Fig. 4a (see red squares). However, in the inner Alpine region, several rain gauges with anomalously wet conditions (mostly at mountain peak stations) are represented as isolated spots. It appears as if some elevation dependency that is not explicitly resolved by the station network is missed because of the absence of predictors in OK.

Figure 4c depicts the result obtained with KED, i.e., integrating predictors and spatial autocorrelation, using the comprehensive three-scale elevation and gradients model as trend (KED9eg). The distribution shows the superposition of a spatially smooth pattern (similar to OK, Fig. 4b) and a small-scale pattern with topographic features that are not explicitly resolved by the station network (similar to LM9eg). The consideration of spatial autocorrelation has remedied the deficiencies of LM9eg in representing the larger-scale north–south profile (red squares). Moreover, the strong contrasts between mountain stations (moist) and valley stations (dry) in the interior Alps are now integrated via an elevation (and gradient) dependence at small scales.
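The combination of a trend and spatially correlated residuals can be illustrated with a minimal KED predictor for a single drift variable. This is a sketch under simplifying assumptions, not the estimation procedure of the study: the covariance is exponential with fixed, made-up parameters and no nugget, whereas the study estimates trend coefficients and variogram parameters jointly by maximum likelihood. All names and numbers are hypothetical.

```python
import numpy as np

def exp_cov(d, sill=1.0, range_km=30.0):
    """Exponential covariance as a function of distance d (km)."""
    return sill * np.exp(-d / range_km)

def ked_predict(xy, y, drift, xy0, drift0, sill=1.0, range_km=30.0):
    """Kriging with external drift (intercept plus one predictor).

    xy: (n, 2) station coordinates, y: (n,) observations,
    drift: (n,) predictor at stations, xy0: (2,) target, drift0: predictor there.
    """
    n = len(y)
    X = np.column_stack([np.ones(n), drift])
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    C = exp_cov(d, sill, range_km)                        # station covariances
    c0 = exp_cov(np.linalg.norm(xy - xy0, axis=1), sill, range_km)
    Ci_X = np.linalg.solve(C, X)
    Ci_y = np.linalg.solve(C, y)
    beta = np.linalg.solve(X.T @ Ci_X, X.T @ Ci_y)        # GLS trend coefficients
    resid = y - X @ beta
    x0 = np.array([1.0, drift0])
    # Trend at the target plus kriged residual
    return float(x0 @ beta + c0 @ np.linalg.solve(C, resid))

# Made-up stations (coordinates in km, elevation in km, precipitation in mm/day).
xy = np.array([[0.0, 0.0], [10.0, 5.0], [20.0, 0.0], [30.0, 10.0], [40.0, 5.0]])
elev = np.array([0.4, 0.9, 1.5, 2.2, 1.1])
obs = np.array([2.1, 3.0, 4.2, 5.0, 3.3])
pred = ked_predict(xy, obs, elev, np.array([25.0, 5.0]), 1.8)
```

Without a nugget the predictor is exact at station locations, so leave-one-out cross-validation is needed to obtain meaningful error estimates.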

North–south precipitation profile as estimated by the three
interpolation methods LM9eg, OK, and KED9eg (see Table 1). DJF long-term mean
precipitation (lower

Error statistics for the interpolation of mean DJF precipitation
using different interpolation models (see Table 1 for model acronyms).

Notably, the three interpolation methods discussed yield markedly different estimates not just regionally, but also when aggregated over larger scales. This is further illustrated in Fig. 5, which depicts the results of Fig. 4 when averaged over latitude bands (along the ridge). OK and KED9eg both represent a moist anomaly at the foothills, centered at an elevation of about 1200 m m.s.l. This anomaly is much less pronounced and more widespread in LM9eg. Towards the inner Alpine region the three methods yield markedly different areal estimates, with OK being much drier than the regression model and KED. OK and KED differ by between 5 and 25 % in this area. In the inner Alpine region, it is not entirely clear, at this point, which of the methods is more realistic. Clearly there is a risk of general underestimates by OK due to the absence of topography dependence in conjunction with poor sampling of high-elevation areas. But there is also a risk that KED suffers from overestimates if, for example, the elevation dependence estimated over the full domain is not representative of the inner Alps.
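The aggregation over latitude bands can be sketched as below, assuming the interpolated field is stored as a 2-D array with rows running north–south and columns east–west. The band count, the equal-row band definition and the toy field are illustrative assumptions; the band width used for Fig. 5 is not specified here.

```python
import numpy as np

def latitude_band_means(field, n_bands):
    """Average a gridded field over bands of rows (latitude bands).

    field: 2-D array [north-south rows, east-west columns]; NaN marks missing cells.
    """
    row_groups = np.array_split(np.arange(field.shape[0]), n_bands)
    return np.array([np.nanmean(field[rows, :]) for rows in row_groups])

# Toy field with 4 rows and 3 columns (values are arbitrary).
field = np.arange(12.0).reshape(4, 3)
profile = latitude_band_means(field, n_bands=2)
```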

In the following we assess the relative performance of a range of
interpolation models from the above three categories by means of a
systematic leave-one-out cross-validation. Results are depicted for DJF mean
precipitation in Fig. 6. The two panels are for

Relative bias

Relative mean root-transformed error

When averaged over all stations the values of bias are small, varying between 0.97 and 0.995 depending on the method (Fig. 6a, dashed lines). The largest underestimate (3 %) is obtained for LM1e (the linear model with local elevation as single predictor). More substantial biases are, however, found in individual elevation ranges. This is particularly so for the linear regression model LM1e and for ordinary kriging. The lack of topographic predictors in OK impairs the interpolation at high elevation, where OK systematically underestimates by about 30 %. This deficiency is mostly corrected with interpolation models that incorporate topographic predictors (LM9eg and KED9eg). The explicit modeling of topography allows for a compensation of the effects of the non-representative vertical distribution of the station sample. In the framework of KED, this remedy is almost as good with only one predictor (KED1e) as with many predictors (KED9eg). In the linear model framework, however, in situ elevation alone provides a poor model of the spatial distribution (see also Table 3), and this is reflected in large and alternating biases between the elevation ranges. One explanation for this difference is that the estimated coefficient for the 1 km elevation predictor is quite different between LM1e and KED1e. It seems that the consideration of spatial autocorrelation in KED1e permits a much more realistic separation between small-scale elevation dependence (modeled by the predictor) and larger-scale precipitation variations (modeled by the autocorrelation part). In contrast, LM1e by construction attempts to capture larger-scale and small-scale variations with a single linear dependence. It is then likely that larger-scale variations (such as the north–south profile) distort the estimate of the small-scale elevation dependence.
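The dependence of the elevation coefficient on the treatment of autocorrelation can be explored with a small sketch that contrasts ordinary and generalized least squares. The transect, the sinusoidal large-scale profile, the covariance range and the nugget are all made-up assumptions; the sketch only shows that the two estimators can return different coefficients for the same data and makes no claim about which is closer to the truth in the study region.

```python
import numpy as np

def trend_coefficients(X, y, C=None):
    """GLS estimate (X' C^-1 X)^-1 X' C^-1 y; equals OLS when C is None."""
    if C is None:
        C = np.eye(len(y))
    Ci_X = np.linalg.solve(C, X)
    # C is symmetric, so Ci_X.T @ y equals X' C^-1 y
    return np.linalg.solve(X.T @ Ci_X, Ci_X.T @ y)

# Synthetic north-south transect (all numbers are made up).
rng = np.random.default_rng(1)
n = 40
north = np.sort(rng.uniform(0.0, 150.0, n))          # km along the transect
elev = rng.uniform(0.2, 2.5, n)                      # km
large_scale = 2.0 * np.sin(north / 150.0 * np.pi)    # smooth regional profile
y = 1.0 + 0.8 * elev + large_scale

X = np.column_stack([np.ones(n), elev])
d = np.abs(north[:, None] - north[None, :])
C = np.exp(-d / 40.0) + 0.01 * np.eye(n)             # exponential covariance + nugget

b_ols = trend_coefficients(X, y)                     # autocorrelation ignored
b_gls = trend_coefficients(X, y, C)                  # autocorrelation considered
```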

The limited accuracy of linear regression models in predicting the spatial
variations of seasonal mean precipitation is most evident in the relative
error score

The OK model (no topographic predictors) has much smaller errors than the
regression models, except for the highest elevation range (Fig. 6b). OK
profits from its explicit account for spatial autocorrelation, which permits
the reproduction of larger-scale variations (e.g., the north–south profile)
from the information at neighboring stations (see also Fig. 4b). In our
application this methodological feature yields considerably smaller errors
than a comprehensive predictor set in a regression model, at least for low
and intermediate elevation ranges. At large elevations, however, the OK
model suffers large

The KED models, which include both topographic predictors and
spatial autocorrelation, yield the smallest interpolation errors of all
models (

Between the different KED models (with different predictor sets) there are
only marginal differences in the scores (Fig. 6b, Table 5). Values of

Error statistics for the interpolation of mean DJF precipitation
using interpolation models that utilize information from a circulation
classification (see Table 1 for model acronyms).

Experiment KED9eg (10, 5, 1 km) involves predictors at spatial scales all smaller than the station spacing. Still there seems to be little added value compared to the model with the 1 km elevation predictor only (KED1e, see Fig. 6b and Table 5). It is unclear whether this result implies that the additional predictors (5 and 10 km elevations and gradients) are indeed not very relevant (on top of the 1 km elevation) for describing small-scale precipitation variations in the Alps. There may be insufficient sampling of these predictors in the station sample considering that most of the inner-Alpine stations are in valleys or on mountain tops.

Note that

In this section we examine the potential of considering circulation types
for the derivation of interpolated mean seasonal precipitation fields. Two
extensions will be considered. The first deals with a substratification of
the season. For this purpose several KED interpolation models are adopted separately
for each class of the circulation classification. The resulting
fields of mean precipitation for each class are subsequently recombined
into a seasonal mean field by weighting according to the classes' frequency.
Experiments adopting this substratification are labeled with a “
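The recombination of class-wise mean fields into a seasonal mean can be sketched as a frequency-weighted average. The function name, the two-class example and the day counts are hypothetical; the sketch assumes that class frequency is measured by the number of days assigned to each class.

```python
import numpy as np

def recombine_class_means(class_fields, class_day_counts):
    """Seasonal mean field as the frequency-weighted mean of class-wise fields.

    class_fields: array-like of shape (n_classes, ...) with one mean field per class.
    class_day_counts: number of days falling into each class.
    """
    fields = np.asarray(class_fields, dtype=float)
    weights = np.asarray(class_day_counts, dtype=float)
    weights = weights / weights.sum()            # relative class frequencies
    return np.tensordot(weights, fields, axes=1)

# Two hypothetical classes on a two-cell grid (mm per day).
fields = [[2.0, 4.0], [6.0, 8.0]]
days = [30, 10]
seasonal = recombine_class_means(fields, days)
```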

Relative bias

Relative mean root-transformed error

Cross-validation results from these experiments are depicted in Fig. 7,
again for

With all tested interpolation methods, the biases are smaller than 2 %
(5 %) below (above) 1000 m m.s.l. (Fig. 7a). The interpolation with
circulation classes (KED1e

Comparison of the different methods in terms of

We have tested several alternative definitions of a circulation dependent
predictor deviating from that in Eq. (3). These included the introduction of
an asymmetry between upslope and downslope gradients, truncating the

There are several possible reasons why circulation class information did not
improve interpolation accuracy in our application. The region may be
geographically too simple or too small to reveal the benefits of a predictor
that builds on spatially variable wind directions. The large-scale wind
field (derived from a coarse resolution sea level pressure field) may be of
limited representativity for the true air flow in such a complex topography.
The variability of airflows within a circulation class may be large, so that
systematic topographic effects do not necessarily manifest at the small
space scales addressed by the

In this section we compare and evaluate several options for extending the KED interpolation framework for daily precipitation. The main purpose of this comparison is to investigate how sensitive the accuracy of a daily interpolation scheme is to various options of integrating small-scale topography-related information. Simultaneously we compare the KED-based daily models with results from a previously implemented deterministic daily interpolation scheme that was calibrated over a much larger area (the entire Alpine region) and was used for a popular data set of trans-Alpine daily precipitation (Isotta et al., 2013).

Table 2 lists the interpolation models compared here and Fig. 8 depicts
results from some of these models for a day with widespread and intense
precipitation in the study domain. All KED models considered adopt the
stochastic concept of Eq. (1) but with one of the previously determined
climatological mean fields as trend rather than with the topographic
predictors themselves. The trend field for KED(KED1e) is the mean seasonal
field KED1e that was derived with the 1 km elevation predictor. Recall that
this version of the mean seasonal distribution showed cross-validation
skills comparable to other versions with comprehensive predictor sets (Fig. 6).
The precipitation for the example day (Fig. 8a) shows small-scale
patterns along the foothills and in the interior of the ridge that reflect
patterns of the trend field. For KED(KED1e

Daily precipitation total (mm) for 13 February 1990 as derived by
the daily interpolation methods investigated in this study.

Figure 8d depicts daily precipitation for the example day derived by the Alpine-wide SYMAP(PRISM) interpolation. This procedure uses a seasonal climatology derived from a local regression approach as background (PRISM, Daly et al., 1994, 2002; Schwarb, 2001). The result depicted comes from a 5 km grid interpolation (Isotta et al., 2013) and is coarser than results for the other models (1 km grid). It shows more variable and larger peak values than the other models. In contrast to the KED models with elevation as predictor, PRISM estimates precipitation–height gradients locally (considering the representativity of surrounding stations) and this results in more pronounced small-scale variations.

The daily interpolation methods have been quantitatively evaluated using cross-validation over all winter days of 1971–2008 (3400 days). For computational reasons, the cross-validation of the models was only calculated for the daily interpolation step, i.e., with the seasonal background field estimated from all the data including the test station. Clearly the daily interpolation step contributes the largest error component, but the errors calculated in this simplified way should be considered as a lower bound of the true errors.

Figure 9 depicts the bias

Error statistics for the interpolation of daily precipitation in
winter (DJF, 1971–2008) using the interpolation models of Table 2 (see also
Sect. 3).

The bias of the daily interpolation (Fig. 9a) reveals similar features to those in the climatic case. Methods without consideration of topographic
predictors in the climatological background field (OK(

The relative ranking of methods in terms of

The KED(KED1e) and KED(KED1e

Modeling the relationship between precipitation and topography is essential for the construction of accurate precipitation grid data sets by statistical interpolation. Here we have investigated several extensions of the classical precipitation–height model, including predictors of slope in addition to elevation, a multi-scale decomposition of the predictors, a circulation-type dependence of the relationship, and the inclusion of a wind-aligned gradient predictor. Variants of these extensions have been proposed previously, but their effect on interpolation accuracy has not been systematically evaluated and mutually compared so far. Station measurements in our study region (a cross section of the European Alps) show imprints of slope effects and coarser-scale topography in the distribution of mean seasonal precipitation. Intuitively one would therefore expect that the considered extensions could improve interpolation accuracy.

Our experiments illustrate that the benefit from complex predictor sets (elevation and slope, multiple scales) in the interpolation of seasonal mean precipitation depends strongly on the statistical modeling framework. In a linear regression framework there is a clear benefit in the sense that cross-validation errors (random and systematic) are reduced with more predictors included. However, even with nine predictors, the resulting interpolation is unsatisfactory. It poorly replicates the characteristic changes from the flatland over the foothills to the inner section of the ridge as revealed by the station measurements. Linear regression would require many more predictors for a decent reproduction of this pattern because all spatial variations need to be modeled with predictors.

For kriging with external drift (predictors with spatially correlated residuals), however, the role of a complex predictor set was found to be much smaller. Local elevation (a 1 km digital elevation model) was found to be essential for reducing the systematic underestimates and large random errors observed at high elevations with ordinary kriging (no predictors). In fact, the simple one-predictor KED model was substantially better than the linear regression model with nine predictors. But the inclusion of more complex physiographic predictor sets in KED brought only marginal additional improvement. Neither topographic slopes nor a wind-aligned gradient could effectively reduce the cross-validation errors. Interpolation results with comprehensive multiscale predictor sets in KED were very similar to those of the one-predictor model and the inclusion of circulation-type dependence had only small effects. It seems that a large portion of the spatial precipitation variation in our study region is captured by a model of spatial autocorrelation directly from the measurements (kriging) and that a simple digital elevation model was sufficient (but essential) to correct for interpolation errors emanating from the non-representative vertical distribution of stations.

Linear regression is often considered an exploratory framework in spatial interpolation to identify potential predictors for a trend model of KED. This practice is somewhat questioned by the results of our study. We find a strong contrast of sensitivity to predictor choice between the two methods. Linear regression tends to suggest larger predictor sets than are actually necessary in KED. Our results with KED were not measurably degraded by the inclusion of non-informative predictors. However, this resistance is dependent on the estimation procedure. Our approach of estimating the trend coefficients and variogram parameters jointly by maximum likelihood (see Sect. 3.1) permits the estimation process to distinguish between predictor dependence and spatial autocorrelation implicitly (Diggle and Ribeiro, 2007). This distinction is more restricted in an alternative estimation procedure, often referred to as residual kriging or detrended kriging (Martínez-Cob, 1996; Phillips et al., 1992; Prudhomme and Reed, 1999), where predictor coefficients and variogram parameters are estimated in disjoint steps (regression followed by simple kriging of residuals). This makes the method more sensitive to errors in predictor choice. Regression kriging, yet another estimation procedure (Hengl et al., 2007; Pebesma, 2004; Tadić Perčec, 2010), uses an iterative procedure and should be as robust to predictor choice as the likelihood-based estimation used in our study.

Our experiments for daily precipitation illustrate that the utilization of a climatological background field (seasonal climatology) reduces interpolation errors significantly, particularly systematic errors at high elevations in comparison to direct interpolation. The large spatial variability of daily precipitation complicates robust estimation of systematic topographic responses directly from the daily data, but a climatological background field can pick up some of these patterns, which translates into smaller interpolation errors. This result supports a practice widely used in the construction of short-term precipitation grid data sets but rarely verified so far (Harris et al., 2013; Haylock et al., 2008; Isotta et al., 2013; Rauthe et al., 2013). Clearly the topographic effects evident in mean precipitation are not necessarily representative of all weather conditions. Our results, however, suggest that estimating these effects separately for typical circulation types does not significantly improve the performance compared to a seasonal background. This result may depend on the region considered and the circulation-type classification chosen. In any case, the classification we have experimented with here was previously shown to explain precipitation variations in the Alps better than other common classification schemes (Schiemann and Frei, 2010).

The daily KED interpolation method using a seasonal mean climatology as background performed better in the Alpine cross section than the method used for a grid data set over the entire Alpine region (Isotta et al., 2013). This may hint at ways of methodological improvement, but it is premature to judge the two methods with regard to their suitability over the entire Alpine region. On the one hand, the existing method makes compromises in order to meet very diverse conditions in climate and station density. On the other hand, extending the KED approach over the entire region raises questions about the representativity of “globally” estimated trend coefficients and variogram parameters. Moreover, on a practical side, the KED approach may become computationally very demanding with several thousands of stations.

The results of our study are likely dependent on the setting of our study region, such as the density of the station network, the complexity of the topography, and the diversity of weather patterns. In other regions where the station network is coarser and hence the nearest observations are less informative, extended predictor sets may become more relevant. Nevertheless, our results call for prudence in expectations of seemingly versatile topographic predictors for filling the information gaps between in situ measurements. Clearly, sensitivity experiments like those conducted here can help to make a parsimonious choice and to ensure robustness of the final interpolation method.

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement no. 242093.

Edited by: F. Tian