Stream Temperature Prediction in Ungauged Basins: Review of Recent Approaches and Description of a New Physics-derived Statistical Model

The development of stream temperature regression models at regional scales has regained some popularity over the past years. These models are used to predict stream temperature in ungauged catchments to assess the impact of human activities or climate change on riverine fauna over large spatial areas. A comprehensive literature review presented in this study shows that the temperature metrics predicted by the majority of models correspond to yearly aggregates, such as the popular annual maximum weekly mean temperature (MWMT). As a consequence, current models are often unable to predict the annual cycle of stream temperature , nor can the majority of them forecast the inter-annual variation of stream temperature. This study presents a new statistical model to estimate the monthly mean stream temperature of ungauged rivers over multiple years in an Alpine country (Switzerland). Contrary to similar models developed to date, which are mostly based on standard regression approaches , this one attempts to incorporate physical aspects into its structure. It is based on the analytical solution to a simplified version of the energy-balance equation over an entire stream network. Some terms of this solution cannot be readily evaluated at the regional scale due to the lack of appropriate data, and are therefore approximated using classical statistical techniques. This physics-inspired approach presents some advantages: (1) the main model structure is directly obtained from first principles, (2) the spatial extent over which the predictor variables are averaged naturally arises during model development, and (3) most of the regression coefficients can be interpreted from a physical point of view – their values can therefore be constrained to remain within plausible bounds. The evaluation of the model over a new freely available data set shows that the monthly mean stream temperature curve can be reproduced with a root-mean-square error (RMSE) of ±1.3 • C, which is similar in precision to the predictions obtained with a multi-linear regression model. We illustrate through a simple example how the physical aspects contained in the model structure can be used to gain more insight into the stream temperature dynamics at regional scales.


Introduction
Among the parameters affecting the ecological processes in streams, temperature occupies a predominant role.It influences the concentration of chemicals, such as dissolved oxygen, and may increase the toxicity of dissolved substances (Langford, 1990).It also affects the life cycle of many fish species, particularly the salmonids whose rate of spawning, timing of birth and rate of death are directly influenced by stream temperature (Caissie, 2006;Benyahya et al., 2007).Water temperature is also a relevant factor for many thermal power plants which rely on cooling by river water, and whose electricity production decreases when water temperature exceeds a certain limit (Haag and Luce, 2008).
As a result of the rising concern about climate change and water management impacts on aquatic life, stream tempera-Published by Copernicus Publications on behalf of the European Geosciences Union.

A. Gallice et al.: Stream temperature prediction in ungauged basins
ture modelling has regained some interest over the past 10-15 years.This fostered the development of many stochastic and deterministic models (e.g.Mohseni et al., 1998;Segura et al., 2014;Chang and Psaris, 2013;DeWeber and Wagner, 2014;Meier et al., 2003;Westhoff et al., 2007).The former type relies on a statistical analysis to empirically relate stream temperature to climatic and physiographic variables, such as air temperature, discharge, altitude or channel width (see Benyahya et al., 2007, for a complete review of this subject).Deterministic models, on the other hand, rely on a physically based formulation of the stream energy conservation to compute water temperature (Caissie, 2006).Both model types have usually been applied to a single stream reach or a limited number of catchments (e.g.Sinokrot and Stefan, 1993;Roth et al., 2010;Caissie et al., 2001;Caldwell et al., 2013;Grbić et al., 2013).As a response to the lack of stream temperature data, some studies have recently attempted to develop regionalized models.This effort was certainly encouraged by the incentive of the International Association of Hydrological Sciences (IAHS), which set the focus of the last decade on hydrological prediction in ungauged basins (Sivapalan et al., 2003;Hrachowitz et al., 2013).In the case of stream temperature, the difficulty in meeting the data requirements of the physically based models led the authors to mostly rely on statistical approaches to make predictions in ungauged catchments.However, the validity of these models for studying climate change impacts or water management techniques has not been assessed yet.
In this paper, more than 30 studies describing regionalized statistical models for stream temperature estimation were reviewed to put our work in a larger context (see Table 1).The extensive introduction below discusses several aspects of the reviewed literature which motivated the development of the novel stream temperature model described in the next section.

Predictions with limited precision
One recurring issue described in the reviewed literature is the difficulty in predicting stream temperature with a high level of precision.A typical example is the statistical model of Isaak et al. (2010) for the estimation of mean summer stream temperature (15 July-15 September) in the Boise River basin, Idaho.Despite considering a significant number of predictor variables and two different modelling approaches a priori, Isaak et al. (2010) could not reduce the root-mean-square error (RMSE) of their model below 1.5 • C. Prediction uncertainties of the same order of magnitude are reported e.g. by Wehrly et al. (2009), Ruesch et al. (2012), Moore et al. (2013) or Hill et al. (2013).
In general, it seems that the model error originates partly from the lack of appropriate field data, such as measures of riparian shading, groundwater infiltration or irrigation withdrawals (Moore et al., 2013).As noted by Hill et al. (2013), "these types of data are not readily available everywhere and will take time to develop".In the meantime, they can in some circumstances be accounted for through indirect measures.For example, Tague et al. (2007) used the geological aquifer type as a proxy for the presence or absence of groundwater infiltration.Similarly, Hrachowitz et al. (2010) and Scott et al. (2002) estimated riparian shading based on riparian forest coverage, computed over buffer areas of various widths and lengths around the streams.In the absence of such proxies, the model cannot represent some known processes and must concede some increase in its prediction error (Moore et al., 2013).The size of the areas over which stream temperature is modelled -and hereby the diversity of encountered climatic and geomorphologic conditions -constitutes another factor potentially explaining the model uncertainties for some studies.
Regarding the impact of the modelling approach, Wehrly et al. (2009) investigated four different statistical model types and showed that their difference in prediction accuracy was relatively small.The same conclusion was reached by Daigle et al. (2010), who compared four other modelling techniques.Isaak et al. (2010) found that networked kriging regression performed better than multi-linear regression over the calibration data set, but this assertion became much less evident over the validation set.Similarly, Pratt and Chang (2012) and Chang and Psaris (2013) concluded that geographically weighted regression is slightly more accurate than multilinear regression, but they did not validate their results on an independent data set.These studies tend to suggest that no significant decrease in the prediction errors should be awaited from a change in the statistical modelling technique.
Further comparisons between the different models reported in the literature are unfortunately hindered by the diversity of temperature metrics and error measures used by the authors.As mentioned in several studies already, we advocate here the systematic use of the different error measures that are RMSE, bias and coefficient of determination R 2 for the evaluation of the model precision, possibly combined with a benchmark model (Schaefli and Gupta, 2007).It should be noted that R 2 is also referred to as Nash-Sutcliffe efficiency by the hydrological community (Nash and Sutcliffe, 1970), and is defined as 1 minus the ratio of the model error variance over the variance of the observed data.

Few models can predict the stream temperature annual cycle
Inspecting Table 1, it can be seen that most regionalization efforts have concentrated on some particular periods of the year.For example, Jones et al. (2006), Isaak et al. (2010) and Chang and Psaris (2013) focused on the annual maximum of the 7-day moving average of the daily maximum temperature (MWMT).Similarly, both Pratt and Chang (2012) and Hill et al. (2013) Nelitz et al. (2007) Western Canada CRT 104 1/site Year n/a Nelson and Palmer (2007) Western USA MLR 16 3 Season R 2 = 0.36-0.88Ozaki et al. (2003) Japan ULR 5 8 Day n/a Pratt and Chang (2012) Western USA MLR, GWR 51 1/site Season R 2 = 0.48-078 Risley et al. (2003) Western USA ANN 148 0.25 Hour, season σ e = 1.6-1.8• C Rivers- Moore et al. (2012) South Africa MLR 90 1/site Month, year R 2 = 0.14-0.50Ruesch et al. (2012) Western USA NKM 165 15 Year R 2 = 0.84, σ e = 1.5 • C Segura et al. (2014) Conterminous USA MLR 171 ≥ 1.5 Week, month R 2 = 0.79 Sponseller et al. (2001) Eastern USA MLR 9 1 Year R 2 = 0.81-0.93Scott et al. (2002) Eastern USA MLR 36 1/site Season R 2 = 0.82 Stefan and Preud'homme (1993)  to derive regional models to compute the complete annual cycle of stream temperature over several years.Miyake and Takeuchi (1951) and Stefan and Preud'homme (1993) were probably the first authors to address this issue; they relied on linear regression against air temperature to simultaneously estimate stream temperature at multiple sites.However, their respective works are restricted to a limited number of rivers (20 and 11, respectively) and could probably not be applied to larger areas.In an attempt at generalizing these models, Ozaki et al. (2003) and Kelleher et al. (2012) separately regressed stream temperature against air temperature in each one of the catchments they considered, and subsequently regionalized the slopes of the regression lines.However, both studies were only partly successful in completing the regionalization step, since the modelled regression slopes had large prediction errors.They would additionally have had to model the intercepts of the regression lines to completely regionalize the stream-air temperature relationship.In a similar fashion, Johnson et al. (2014) relied on the logistic equation introduced by Mohseni et al. (1998) to relate stream temperature to air temperature in each catchment.Also, they faced difficulties in regressing the equation parameters against geomorphological properties of the catchments.The two most complete works on the regionalization of the linear stream-air temperature relationship were recently conducted by Ducharne (2008) and Segura et al. (2014).These two studies attempted to regionalize both the slopes and intercepts of the regression lines between stream and air temperature.To this end, Ducharne (2008) grouped the streams according to their Strahler order and fitted a single line in each group.Segura et al. (2014), on the other end, expressed the slopes and intercepts as linear combinations of climatic and physiographic variables.The model of Ducharne (2008) had nominally a higher explanatory power (R 2 = 0.88 − 0.96, depending on the Strahler order) than Segura et al. (2014)'s model (R 2 = 0.79), but was effectively based on about 10 times fewer rivers.
Instead of using air temperature as an independent variable, Bogan et al. (2003) relied on equilibrium temperature.This variable corresponds to the stream temperature at which the net energy flux at the air-water interface vanishes (e.g.Edinger et al., 1968).It was shown by Bogan et al. (2003) to be a fairly good estimator of stream temperature for almost 600 rivers in the eastern and central United States, with a prediction error of about 3 • C.
As an alternative to the above-mentioned studies, the annual cycle of stream temperature has been modelled by some authors as a function of time directly, rather than air or equilibrium temperature.Hrachowitz et al. (2010), Imholt et al. (2013) and Rivers- Moore et al. (2012) expressed water temperature as a linear combination of climatic and physiographic variables for each month of the year separately.Their models were derived for a particular year, but can be transferred to other years by estimating stream temperature at a few measurement points using Mohseni's logistic equation and fitting the multi-linear regression model to the resulting values (Hrachowitz et al., 2010).Based on a similar approach, Macedo et al. (2013) succeeded in deriving one single regression model to estimate daily mean stream temperature at 12 different sites in Brazil over 1.5 years.The performance of their model was not tested using data from subsequent years, though.Johnson (1971) relied on another different technique to estimate the thermal regime of six rivers in New Zealand.He first fitted the stream temperature annual cycles with sine curves.In a second step, he identified the physiographic properties of the catchments which best correlated with the fit coefficients.The focus of his study being on the investigation of these physiographic properties, he did not evaluate the prediction error of his model.Although not intended for this purpose, the work of Garner et al. (2014) is based on a somewhat similar approach and may be used to get a first estimate of the annual cycle of temperature in UK streams.The authors classified rivers into several groups according to the shape and magnitude of their respective thermal regimes.Then, they investigated the similarities and dissimilarities of some geomorphological properties among and between the groups.This processing could be inverted to infer the thermal regime from the physiographic properties of the catchments.
Finally, some studies have evaluated the possibility of modelling the time evolution of stream temperature using machine learning techniques.For example, DeWeber and Wagner (2014) trained an artificial neural network to reproduce daily mean temperature values from May to October over more than 30 years for 1080 streams in the eastern United States.Their approach could be easily extended so as to model the complete annual cycle of stream temperature each year.

Space-averaging of the predictor variables
Some of the reviewed publications on regional stream temperature modelling addressed the question of the spatial scale over which the predictor variables should be averaged.It is common knowledge that stream temperature is not only affected by local environmental conditions, but also by the conditions prevailing upstream.However, the exact extent of the area controlling the stream energy balance at a given point is not clear (Moore et al., 2005).
Due to this uncertainty, different approaches have been used in the literature to average the predictor variables.Based on studies of the effect of forest harvesting on stream temperature (e.g.Moore et al., 2005), some authors considered riparian buffer zones of various widths and lengths as averaging areas.This approach was usually applied to average the land cover characteristics only, particularly forest coverage (e.g.Sponseller et al., 2001;Scott et al., 2002;Macedo et al., 2013;Segura et al., 2014), but also in some cases to average most of the predictor variables, including elevation or slope (Hrachowitz et al., 2010;Imholt et al., 2013).Other authors considered larger portions of the catchments as averaging areas, sometimes extending far beyond the riparian zone.For example, Wehrly et al. (2009) used the whole area drained by the stream segment located directly upstream of the temperature measurement point.Whereas most studies relied on simple spatial averaging, a few of them applied a weighting scheme to give more emphasis to the conditions prevailing near the gauging point.As such, Isaak et al. (2010) and Hill et al. (2013) applied a weight w decreasing exponentially with the distance d to the catchment outlet, w = exp(−d/L c ), where the e-folding distance L c controls the spatial extent of the averaging area.
In response to this diversity of methods, we could not find a general consensus in the reviewed literature concerning the extent of the spatial area which is relevant for stream temperature prediction.While some studies conclude that this area should have a length of about 1-4 km (Isaak et al., 2010;Hrachowitz et al., 2010;Chang and Psaris, 2013;Macedo et al., 2013), others tend to indicate that the catchment scale is the most appropriate one (Sponseller et al., 2001;Scott et al., 2002).Similarly, values between 30 and 200 m are assumed for the width of the riparian buffer affecting stream temperature at a given point (e.g.Jones et al., 2006;Scott et al., 2002;DeWeber and Wagner, 2014 Of all the regional models reported in Table 1, less than a third were developed for stream temperature prediction outside of North America, and only one -the model developed by Arscott et al. (2001) -is applied over a European Alpine region.An unpublished attempt at developing another model for an Alpine country (Switzerland) was conducted by Rubin et al. (2012).They relied on the regionalization of the stream-air temperature relationship, but unfortunately did not evaluate the precision of their model.Other studies have sought to classify the thermal regimes of Alpine rivers (Jakob, 2010;Müller, 2011), sometimes with minimal success (see Schädler, 2008, for a review of the classification efforts before 2008).These authors grouped the streams according to the physiographic characteristics of their associated watershed, such as mean basin altitude, water origin (lake, artificial reservoir, deep aquifer or shallow subsurface groundwater), channel width or slope.They computed the characteristics of the typical thermal regime of each group.However, inter-annual variations of the thermal regime cannot be accounted for by this method.

Investigation of a new modelling approach
All the reviewed models rely on standard statistical techniques to estimate stream temperature.The range of methods encompasses traditional approaches such as multi-linear regression (e.g.Arscott et al., 2001;Mayer, 2012;Imholt et al., 2013) or linear mixed modelling (Macedo et al., 2013), but also more advanced techniques such as geographically weighted regression (Pratt and Chang, 2012;Chang and Psaris, 2013), networked kriging models (Gardner and Sullivan, 2004;Isaak et al., 2010;Ruesch et al., 2012) or machine learning techniques (e.g.Westenbroek et al., 2010;Hill et al., 2013;DeWeber and Wagner, 2014).All these methods are general, in the sense that they can be used to model almost any possible relationship between given input and output variable(s).As a consequence of this generality, the user has to specify the set of predictor variables to be considered by the model.Although some objective methods can help to perform this selection (e.g.Burnham and Anderson, 2002), the original set of variables on which these methods act must initially be indicated by the user.In the end, the choice of predictor variables is necessarily affected to some extent by the training and experience of the authors, hereby introducing some diversity in the sets of predictor variables.Thus, in the case of stream temperature modelling in ungauged catchments, some studies consider only physiographic characteristics as predictor variables (e.g.Scott et al., 2002;Jones et al., 2006;Nelson and Palmer, 2007;Hrachowitz et al., 2010), while others also include climatic variables (e.g.Isaak et al., 2010;Ruesch et al., 2012;Moore et al., 2013), stream morphological factors such as channel width or bed gravel size (e.g.Hawkins et al., 1997;Arscott et al., 2001;Daigle et al., 2010), or even markers of anthropogenic activities (e.g.Pratt and Chang, 2012;Hill et al., 2013;Macedo et al., 2013).It should be mentioned that this diversity also largely results from the varying availability and reliability of data among different geographic areas.This is particularly true for riparian shading, which is never directly measured and can only be estimated based on the data at one's disposal.For example, Isaak et al. (2010) approximated riparian shading using a sophisticated combination of satellite orthoimages and ground hemispherical canopy pictures, whereas DeWeber and Wagner (2014) could only rely on country-wide land-use data.
Although the generality of the standard statistical methods allows them to be applied to many problems, it prevents them from incorporating prior knowledge about the system dynamics into their structure.For example, a multi-linear model expresses the predictand as a linear combination of the predictors regardless of the problem at hand.This fact is also true for non-parametric methods such as artificial neural networks, which implicitly impose some (flexible) functional form onto the model.As advocated by Burnham and Anderson (2002), our idea is therefore to attempt to derive a statistical model whose structure includes known dynamics of the predictand variable of interest, namely stream temperature in the case at hand.
Our approach is strongly inspired by the physically based models which have been used for decades to predict water temperature along stream reaches (e.g.Brown, 1969;Sinokrot and Stefan, 1993;Westhoff et al., 2007).However, it differs from these models in the sense that we seek a much simpler expression for stream temperature, expressed as a function of variables which are readily available at the regional scale.To this end, we analytically solve a simplified version of the energy-balance equation over an entire stream network (see Sect. 3.1).The resulting expression involves variables whose value cannot be estimated based on the available spatial data sets.Due to our lack of knowledge regarding the nature of the relationships between the unknown variables and the available data, we choose to rely on multi-linear regression to estimate the former as a function of the latter.Although this step involves the subjective selection of predictor variables and assumes a linear relationship, we do not think that it entirely questions our incentive to incorporate physical considerations into the model structure.As a matter of fact, only the unknown variables are replaced in the analytical formula, letting the global form of the relationship be unaffected.Assuming that the major nonlinearities are already captured by the global structure of the model, the specific form of the expressions used to approximate the unknown terms may be considered to have a minor effect.Moreover, our approach attributes a physical meaning to some of the terms appearing in the formula.These terms can be constrained to remain within physical bounds, hereby restricting the range of values that the calibration parameters can adopt.
The objectives of the present work are three-fold: (1) describe a new physics-inspired statistical model for the prediction of stream temperature in ungauged basins, allowing for the computation of the monthly resolved annual cycle and capturing inter-annual variability; (2) through proper calibration of the model, determine the length of the upstream area which controls stream temperature at a given point; and (3) compare the physics-inspired model with a more standard statistical approach over a set of various Swiss catchments, so as to evaluate the potential benefits of the incorporation of physical considerations into the model structure.The data set used to evaluate the performances of the models is presented in Sect. 2. The models are described in Sect.3. Results are detailed in Sect. 4 and discussed in Sect.5, followed by the conclusion.

Selected catchments for model evaluation
In order to test the two stream temperature models, catchments are selected in Switzerland such that (a) the natural regime of the river is as little affected by anthropogenic activities as possible, and (b) measurements of discharge and stream temperature are available for more than 1 year.This results in a set of 29 catchments, whose locations are depicted in Fig. 1 and physiographic properties are summarized in Table 3.
About half of the selected catchments are situated on the Swiss Plateau -a large area with little altitude variations between Lake Geneva in the south-west and Lake Constance in the north-east.The climate in this region is relatively mild, with precipitation mostly falling as rain in winter and mean daily maximum air temperature hardly exceeding 30 • C in summer.The hydrological regimes of the catchments in the plateau depend on the precipitation patterns and are therefore strongly variable from year to year (Aschwanden and Weingartner, 1985).Discharge does not vary by more than a factor 2 over the year; it usually reaches its maximum during winter, when evapotranspiration is the lowest.As catchments gain in altitude, the discharge control mechanism changes from evapotranspiration to snowmelt: higher-altitude catchments present a discharge peak during the melt season, in April-May.
Only two catchments are found in the Jura mountains, a relatively low-altitude (< 1700 m) mountainous range with rigorous winters.This region is characterized by its karstic aquifers with preferential flow paths, generating fast and complex responses to precipitation events.Although more marked, the hydrological regimes of the Jura catchments are relatively similar to those of the watersheds in the plateau.A clear peak in discharge is noticeable in April for the highest catchments (Aschwanden and Weingartner, 1985).3.
The Alpine region of Switzerland is typically subdivided into its northern and southern parts, based on their difference in climate.The Southern Alps are influenced by Mediterranean weather, implying warmer winters and more precipitation in autumn than in the Northern Alps.The hydrological regimes of the catchments in the Northern Alps are strongly related to altitude.The month in which the peak of discharge is observed ranges from May for low-altitude watersheds to July-August for catchments partially covered by glaciers.Moreover, the ratio of annual maximum to annual minimum discharge increases with altitude.Similar hydrological regimes are observed in the Southern Alps, except for a second discharge peak in autumn due to rainfall (Aschwanden and Weingartner, 1985).As seen in Fig. 1, only three unperturbed catchments could be found in the Northern Alps, while five are located in the Southern Alps.
All in all, 10 of the 16 hydrological regimes identified by Aschwanden and Weingartner (1985) in Switzerland are present among the 29 selected catchments (see Table 3).The surface area distribution is quite large, with catchments ranging from 3.31 km 2 (Rietholzbach at Mosnang) to 392 km 2 (Broye at Payerne).The mean altitudes of the watersheds also span a wide range of values.Few catchments are partially covered by a glacier, with only two of them having a glacier cover fraction over 10 %.

Stream temperature data
The stream temperature data which are used in the present study were provided by the Swiss Federal Office for the Environment (FOEN).Advantage is taken of the present publication to describe this new data set, which is freely accessible for research purposes at the following address: http://www.bafu.admin.ch/wasser/13462/13494/15076/index.html?lang=en.A map displaying the position of all available hydrological stations which measure stream temperature can also be found on the website of the FOEN (http://www.hydrodaten.admin.ch/en/messstationen_temperatur.html).
The FOEN operates an automatic network of stream gauging stations, continuously measuring water level and discharge at more than 180 locations in Switzerland.Water level is recorded using an ultrasonic distance sensor and converted into discharge values through a rating curve adapted each year.The water level values are validated against the measurements of a second instrument -a pressure probeand rejected in case the difference between the two values is greater than 2 cm.A limited number of gauging stations has been equipped with a thermometer, the earliest starting in 1968.This number has increased greatly since 2002, with now more than 70 stations automatically probing water temperature every 10 min (Jakob, 2010).The measurement values are automatically uploaded and displayed in real time on the webpage of the FOEN (same page as for the map displaying the positions of the stations).
Among the watersheds in which temperature is monitored, 25 have been identified in the present study as being little affected by anthropogenic activities.In order to complete this data set, the temperature and discharge measurements of four additional gauging stations were obtained from the Department for Construction, Transport and Environment of Canton Aargau (see Table 3).The period in which water temperature was measured by each station is also indicated in Table 3.
The temperature data are usually not quality-proofed by the FOEN or Canton Aargau.As a validation procedure, we performed two different tests on the data at the hourly time step, on top of visual inspection.a.All temperature measurements lower than 0 • C or greater than 30 • C were removed, except for the values between −0.5 and 0 • C, which were set to 0 • C, and the values between 30 and 30.5 • C which were set to 30 • C.Although the limit of 30 • C might be naturally reached in shallow areas, some temperature series showed clear evidence that such temperature was recorded as a result of the sensor being out of water.As a consequence, it was decided to remove all data points above 30 • C, potentially discarding correct data.
b.The temperature variation between consecutive time steps was checked to remain within physical bounds.
In particular, it was verified that temperature varied by more than 0.01 • C over 5 h, but less than 3 • C within 1 h.Constant temperature values could result from a defect in the sensor, but also from the fact that the hourly values had been replaced with their daily mean in some cases.In order to distinguish between the two, the present quality control procedure was performed semimanually.
After quality control, the hourly data were aggregated into monthly mean values.

Meteorological data
The two statistical models described in Sect. 3 use monthly mean air temperature and incoming solar radiation as predictor variables.Data for these variables were obtained from the Swiss Meteorological Office (MeteoSwiss), which provides free access to them for research purposes.For each one of the selected catchments described in Sect.2.1, the air temperature and incoming solar radiation values measured by all the meteorological stations located at less than 20 km from the catchment outlet were collected.In case fewer than three stations could be found within a 20 km radius, data from the three closest meteorological stations were retained.The value of 20 km was chosen so as to ensure that data interpolations would remain representative of the climatic conditions at the catchment outlet, while being based on three stations at least.In fact, 27 of the 29 selected catchments are entirely contained within the disk of radius 20 km centered on their respective outlet point (not shown).As such, the collected meteorological data can actually be considered as representative for the entire catchments, and not just for their outlet point.We were provided with hourly mean data, which we aggregated into monthly mean values.We did not perform any quality checks on the data, since Me-teoSwiss already follows strict quality control procedures (see http://www.meteosuisse.admin.ch/home/systemes-de-mesure-et-de-prevision/gestion-des-donnees/ preparation-des-donnees.html; webpage only available in German, French or Italian).
Among its network of operated meteorological stations, MeteoSwiss selected a subset of 14 stations which are considered to be representative of the climate diversity in Switzerland (see http://www.meteoswiss.admin.ch/home/climate/past/homogenous-monthly-data.html;description only available in German, French or Italian).These stations, referred to as "reference stations" in the following, are used by the standard statistical model to estimate the monthly mean air temperature over the entire Swiss territory (see Sect. 3.2).

Thermal regime classification
A preliminary study of the selected catchments was performed, with the aim of classifying the rivers according to their thermal behaviour.This classification was intended to be used later in order to investigate whether the performance of the models was affected by the river thermal regime.
As a first attempt, we examined whether the catchments could be classified based on the shape of their stream tem- perature curve.To this end, we z scored (i.e.standardized) the monthly mean stream temperature values in each watershed similarly to Garner et al. (2014).However, as observed by these authors, we could identify only one single thermal regime (Fig. 2a).Only two catchments among the 29 did not to present the same thermal regime as the others, namely those labelled as 5 and 14 in Table 3.
As an alternative approach, we tested whether the characteristics of the stream-air temperature curve could be used to characterize the thermal regime of the catchments.For this purpose, monthly mean stream temperature was linearly regressed against monthly mean air temperature, excluding the points with negative air temperature values (e.g.Kelleher et al., 2012).Based on the values of the slope and intercept of this relationship, three groups of catchments could be clearly identified (Fig. 2b).The first group contains the watersheds in which a significant portion of discharge originates from deep aquifer infiltration (watersheds 9 and 14 in Table 3, labelled as "groundwater-fed streams" in Fig. 2b).This group is characterized by low slope and high intercept values, as reported by many studies (e.g.Caissie, 2006;Webb et al., 2008).The second group of watersheds corresponds to the high-altitude basins with more than 50 % glacier cover.Both the slope and intercept of the stream-air temperature relationship are small for the members of this group, which is actually composed of only one catchment (watershed 5 in Table 3, denoted as "proglacial" in Fig. 2b).The vast majority of the watersheds do not fall into any of the two aforementioned groups.These catchments, denoted as "thermally climate-driven", are characterized by relatively low intercept and high slope values; i.e. their stream temperature is strongly correlated with air temperature.
Because of the predominance of the thermally climatedriven streams, only the latter will be considered for the testing of the physics-inspired and standard regression models.The inclusion of the groundwater-dominated streams in the test set would require the amount of groundwater discharging into the stream to be estimated.We tested several methods, including the derivation of the baseflow index from discharge measurements (e.g.Eckhardt, 2005;van Dijk, 2010) or from the TOPMODEL topographic index (e.g.Ducharne, 2009).However, none of the investigated techniques succeeded in predicting a larger baseflow index for the catchments labelled as "groundwater-fed" as compared to the others (not shown).Similarly, the consideration of the proglacial streams would imply the glacier cover fraction being included in the models.This addition of one calibration parameter was not considered justified given that this group contains only one catchment.In total, 26 catchments were used for the calibration and validation of the models, namely all those listed in Table 3 except watersheds 5, 9 and 14.

Formulations of the stream temperature models
The new physics-inspired statistical model for stream temperature prediction is derived in the following subsection.The standard statistical model used for comparison is presented in Sect.3.2.

Physics-inspired statistical model
As mentioned above, the physics-inspired stream temperature model presented in this paper is based on the analytical solution to the stream energy-balance equation.This topic has been investigated extensively in the literature (e.g.Edinger et al., 1968;Theurer et al., 1984;Gosink, 1986;Polehn and Kinsel, 2000;Toffolon et al., 2010), although used only once for stream temperature prediction in ungauged basins (Bogan et al., 2003).In order to analytically solve the energy-balance equation, all studies relied on the linearization of the heat flux φ a at the air-water interface as a function of stream temperature T : φ a = −k(T − T e ).Some Hydrol.Earth Syst.Sci., 19, 3727-3753, 2015 www.hydrol-earth-syst-sci.net/19/3727/2015/ of them assumed the heat transfer coefficient k to be constant and used prescribed functions of time, space or both to express the equilibrium temperature T e (e.g.Gosink, 1986;Polehn and Kinsel, 1997;Daly, 2005).Other studies derived analytical formulations for k and T e based on the physical expressions of the heat fluxes occurring at the stream-air interface (e.g.Edinger et al., 1968;Bogan et al., 2003;Caissie et al., 2005;Bustillo et al., 2014).While a minority of authors considered the temperature distribution to be spatially homogeneous (Edinger et al., 1968;Caissie et al., 2005;Bustillo et al., 2014), most of them assumed the stream to be in a steady state or, equivalently, the stream celerity to be constant.In addition, they all assumed the river width to remain constant along the stream so as to analytically solve the energy-balance equation.Very few studies accounted for the heat exchange with the stream bed or the heat advected by lateral inflow of water (Bogan et al., 2004;Herb and Stefan, 2011).Bogan et al. (2003) were the only authors to evaluate their analytical expression over ungauged basins.They tested their model in the central and eastern United States, since this region has a topography flat enough for a meteorological station located even at more than 100 km from a given point to be still representative of the climate at that point.Their work is therefore hardly transferable to Switzerland, where the mountainous landscape prevents the proper interpolation of variables such as air humidity or wind speed, which are required as input by the model.

Derivation of the analytical solution to the energy-balance equation
Assuming a well-mixed water column and a negligible longitudinal heat dispersion, the mass and energy-balance equations along a stream reach read (adapted from Westhoff et al., 2007) where w (m), p (m), A (m 2 ), Q (m 3 s −1 ) and T ( • C) denote the width, wetted perimeter, cross-sectional area, discharge and temperature of the stream, respectively; t (s) refers to time, x (m) to the downstream distance, z (m) to altitude, and g (m s −2 ) to the gravitational acceleration.The water mass density ρ (kg m −3 ) and the specific heat capacity of water c p (J • C −1 kg −1 ) are both assumed constant.The quantities φ a (W m −2 ) and φ b (W m −2 ) refer to the energy fluxes at the stream-air and stream-bed interfaces, respectively.The lateral heat fluxes due to the inflow of surface, fast subsurface and slow subsurface runoffs into the stream are merged into a single term, q T , where q (m 2 s −1 ) denotes the sum of these three runoffs per unit stream length and T ( • C) stands for their mean temperature.The last term on the right-hand side of Eq. (2) corresponds to friction, which is usually ne-glected in stream temperature models (e.g.Sinokrot and Stefan, 1993;Westhoff et al., 2007), but has been shown by Hannah et al. (2004) and Leach and Moore (2014) to be an important term in the energy balance of small streams during winter.
The present study builds mainly upon the work of Theurer et al. (1984), which is one of those considering the less restrictive approximations for the derivation of the solution to Eqs. ( 1)-(2).Our own assumptions are the following.i.At the timescale of the month, the stream temperature is assumed to be in a steady state.
ii.The energy flux at the stream-air interface is expressed as where φ r (W m −2 ) denotes the net radiative heat flux, incorporating both the short-wave and long-wave components.The second term on the right-hand side accounts for both the latent and sensible heat fluxes (e.g.Polehn and Kinsel, 1997;Toffolon et al., 2010), where the bulk heat transfer coefficient k (W m −2 • C −1 ) between water and air is assumed to be constant, and T a ( • C) refers to the air temperature.
iv.The lateral inflow of water q is assumed to be spatially constant (e.g.Biswal and Marani, 2010;Mutzner et al., 2013).
v. The ratio of stream width to discharge w/Q is assumed to be spatially constant, as opposed to Theurer et al. (1984) and Polehn and Kinsel (2000), who both assumed a constant stream width.This approximation also differs from the typical relationship used in fluvial geomorphology, which expresses stream width as a powerlaw function of discharge with exponent ∼ 0.5 (see e.g.Knighton, 1998).It allows for the definition of a characteristic stream length L c (m), vi.All sources in the network are supposed to have the same discharge, denoted as Q s in the following.This approximation is discussed in more detail in Sect.3.1.2.
Using the above assumptions, the mass and energybalance equations simplify to Eqs. ( 5)-( 6), where γ = 1/k.The reader is referred to Appendix A for the complete derivation of the analytical solution to these equations.Only the final expressions for discharge Q out and stream temperature T out at the outlet of a catchment are reported here, with ω 2 = ηδ , ( 10) In the above equations, L tot and n s correspond to the total length of the river network and the number of sources in the catchment, respectively.The operator • L refers to the distance-weighted average; it computes the average of its operand over the entire stream network using a weight equal to exp(−d/L c ), where d denotes the distance to the catchment outlet.This operator gives much more emphasis to the points located near to the catchment exit.It should be noted that the spatial extent of the area over which the average is computed is controlled by the characteristic length L c : the smaller L c , the smaller the contributing area.The quantity T s appearing in Eq. ( 8) denotes the weighted average of water temperature at the network sources.The latter are weighted by a factor exp(−d s,i /L c ), where d s,i is the distance along the stream between the ith source point and the catchment outlet.The weights ω 1 , ω 2 and ω 3 are all in the interval [0, 1].In Eqs. ( 9)-( 10), the factor η refers to the fraction of discharge at the catchment outlet originating from lateral inflow of water along the stream network -i.e.excluding the fraction coming from the sources, The two factors δ s and δ are defined as where n r denotes the number of reaches in the stream network, d k the streamwise distance between the downstream point of stream reach k and the catchment outlet, and L k the length of stream reach k.The factor δ s corresponds to the average of the weight exp(−d(x)/L c ) over all the network sources, and the factor δ refers to the average of the same weight over the set of all stream reaches in the catchment.It follows that both δ s and δ decrease roughly exponentially as a function of the network length.Equation ( 8) expresses stream temperature as a linear function of air temperature, the slope of the regression line between the two being equal to ω 3 = 1 − ω 1 − ω 2 .Assuming η to vary only slightly along the network, it can be seen in Eqs. ( 9) and ( 10) that ω 1 and ω 2 decrease roughly exponentially with the stream network length.As a consequence, the present model predicts ω 3 to tend towards 1 as the catchment size increases, a fact which has been observed at many locations (e.g.Ozaki et al., 2003;Ducharne, 2008;Kelleher et al., 2012;Chang and Psaris, 2013;Segura et al., 2014).
The present expression for T out differs from those reported previously in the literature in at least two aspects (see Sect. 1 for a review of the analytical solutions to the energy-balance equation published to date).First, the terms on the right-hand side of Eq. ( 6) were not assumed to be spatially homogeneous when integrating them.This explains the presence of the spatial averaging operator • L in Eq. ( 8), which in turn translates the fact that stream temperature is not impacted by local conditions only.This operator has already been used for the computation of predictor variables in regression-based stream temperature models (Isaak et al., 2010;Hill et al., 2013), but never in association with analytical solutions to the energy-balance equation.Second, the source and lateral inflow terms have not been neglected.These two terms are weighted by the factors ω 1 and ω 2 in Eq. ( 8), respectively, and tend to decrease exponentially with the stream length (see discussion above).Although negligible in large catchments, they might be of the same order of magnitude as the heat exchange term in small watersheds.Only a few studies relying on an analytical expression for stream temperature modelling have considered the lateral inflow term to date (Bogan et al., 2004;Herb and Stefan, 2011), and none has retained the source term.
As noted above, the extent of the zone over which • L averages its operand is controlled by the characteristic length L c .Given that this length is a function of the river dischargeto-width ratio Q/w (see Eq. 4) and that the stream celerity is assumed here to be constant, L c is approximately proportional to the water height.Its value should therefore be expected to change over the course of the year.Based on a formula similar to Eq. ( 4), Herb and Stefan (2011) have estimated L c to vary between 3 and 45 km for discharge values between 0.4 and 5.8 m 3 s −1 in the case of the Vermillion River in Minnesota.As most of the catchments considered in the present study have discharges contained within this range, we should expect a marked variation in the values of L c both during the course of the year and across catchments.However, since the characteristic length will be treated as a calibration parameter here (see Sect. 3.1.2),only its seasonal variability will be investigated.A single value will be assumed in each season for all the catchments (see Sect. 4), for otherwise L c would have to be calibrated independently for each catchment, which would prevent prediction in ungauged basins.We acknowledge this as a limitation of our model.

Parametrization of the unknown terms
Equation ( 8) contains several unknown quantities.The procedure used to calculate their respective values is detailed below.
The channel slope dz/dx is computed along the centre line of each stream.A vector representation of the centre lines was extracted from a land cover map at scale 1 : 25 000 (for more information on this map, see http://www.swisstopo.admin.ch/internet/swisstopo/en/home/products/maps/national/25.html).This map was overlaid with a digital elevation model of Switzerland with 2 m horizontal resolution produced by the Swiss Federal Office of Topography (see http://www.swisstopo.admin.ch/internet/swisstopo/en/home/products/height/swissALTI3D.html) in order to extract the altitude of each point.As an alternative approach, a geomorphological analysis of the stream watersheds could have been performed so as to automatically extract the stream networks.However, it was observed that the results of this analysis did not match with the land cover map in some basins (not shown).
The monthly mean air temperature T a along the streams is computed based on the values measured by the neighbouring meteorological stations (see Sect. 2.3).Within each catchment i, air temperature T a,i is assumed to be a linear function of altitude only, where z i (m) refers to the altitude of the gauging station.The lapse rate a T ,i ( • C m −1 ) is computed each month separately by regressing the air temperature measurements of the neighbouring meteorological stations against the station altitudes.
In case the coefficient of determination R 2 of the regression line is lower than 0.6, a T ,i is set equal to 0. The intercept b T ,i ( • C) is computed each month as the inverse-distanceweighted average of the same air temperature measurements, which are first corrected for the altitude effect by virtually transferring them to altitude z i using the lapse rate a T ,i .
The quantity γ φ r , which accounts for the effect of the net radiation heat flux at the air-water interface, cannot be readily computed based on the available data.As a matter of fact, long-wave radiation and reflected short-wave radiation measurements are performed by MeteoSwiss at a few locations only.Incoming short-wave radiation φ isw (W m −2 ), on the other hand, is a commonly measured variable which can be interpolated along the stream networks.To this end, it is assumed that the incoming short-wave radiation φ isw,i in each catchment i is a function of altitude only, where the lapse rate a φ,i (W m −3 ) and the intercept b φ,i (W m −2 ) are computed similarly to a T ,i and b T ,i in Eq. ( 15).An attempt is made to correct the values computed using Eq.16 in order to account for riparian shading.As discussed in Sect.1.5 above, very few spatial data sets exist for riparian shading, which in practice often has to be estimated using proxy variables.In the present case, riparian shading at a given stream point is approximated based on the stream orientation θ and riparian forest cover f f at that point.Using the land cover map at scale 1 : 25 000 mentioned above, θ is computed as the cosine of the angle between north and the stream flow direction; it is a measure of northing, i.e. values close to 1 indicate a catchment that is oriented towards north and values close to −1 a catchment that is south-oriented.
The riparian forest cover f f is defined here as the fraction of the riparian zone which is covered with forests according to the land cover map.As the extent of the riparian zone affecting stream temperature is unclear (Moore et al., 2005), the forest cover fraction is computed over riparian buffers with different widths: 25, 50 and 100 m on each side of the centre line of the streams (total buffer widths are 50, 100 and 200 m, respectively).The map does unfortunately not allow for the distinction between coniferous and deciduous forests.In addition to θ and f f , topographical shading f s is also computed in order to correct the incoming solar radiation values estimated from Eq. ( 16).f s is expressed at each point along the streams as a value between 0 and 1, 1 indicating complete shading.It is derived from the above-mentioned 2 m digital elevation map of Switzerland at nine different hours of day time -corresponding to the fractions 0.1-0.9 of the day-time period -on the 15th day of each month of the year.These values are then averaged at each grid cell and in each season to obtain the spatial distribution of f s .Since shading by topography and by the riparian forest does not only affect incoming solar radiation, but also incoming long-wave radiation, it was decided not to use the variables θ , f f and f s to directly modify the values of φ isw .Instead, it is the unknown term γ φ r which is approximated as a linear combination of φ isw , θ , f f and f s : As discussed in Sect.1.5, the choice of a linear relationship is motivated by our wish to keep the model simple and by our ignorance of the actual form of the function linking γ φ r to the above-mentioned predictor variables.A linear relationship also significantly simplifies the computation of the distance average of γ φ r using the operator • L .Equation ( 17) requires the calibration of five unknown coefficients, namely {a φ,x } x=isw,s,θ,f and b φ .In order to limit the number of model parameters, this expression is not directly used as is, but more parsimonious formulations are evaluated instead.All possible sub-expressions involving any combination of either one or two of the predictor variables {φ isw , θ, f f , f s } is considered for approximating γ φ r .It should be mentioned that the choice to consider expressions with at most two terms (plus the intercept) is arbitrary and only introduced to avoid equifinality issues (Beven, 1993).In total, 11 different models are tested for γ φ r -including the constant expression with only b φ as calibration parameter.
The two weights ω 1 and ω 2 cannot be readily estimated from Eqs. ( 9) and (10).While the values of the factors δ s and δ can be easily derived from the vector representation of the stream network described above, the parameter η requires additional assumptions.It should be remembered that this parameter corresponds to the fraction of the outlet discharge which originates from lateral inflow.Assuming a typical power-law relationship between drainage area and discharge (e.g.Mutzner et al., 2013), η could in principle be approximated as the ratio between the area A net drained by the network (excluding the area drained by the sources) and the total catchment area A tot , raised to some power α: η ∼ (A net /A tot ) α .However, the computation of A net would require a geomorphological analysis, which was discarded based on the discrepancy between the stream network predicted by this analysis and the observed one (see above).As alternative methods, we consider two different techniques for estimating η.The simplest approach assumes a constant single value for η, calibrated over all catchments.The second approach relies on the analytical expression for η presented in Eq. ( 12), in which the ratio Q s /Q tot is replaced with (A s /A tot ) α : The calibration parameters of this second method correspond to the area A s drained by a single source and the exponent α.
In order to compute T s and T L in Eq. ( 8), two different methods for the estimation of the source and lateral inflow temperatures are considered.In a first approximation, these two temperatures are assumed to be both constant and equal.The second method considers them to be linearly related to air temperature as measured at their respective altitudes.In other words, it expresses the temperature T s,i of each source i = 1. ..n s and the lateral inflow temperature T (z) at any point with altitude z along the network as where z s,i (m) denotes the altitude of source i, and a w ( • C • C −1 ) and b w ( • C) are two parameters to be calibrated over the set of all catchments.Notice that the same slope a w and intercept b w are used to derive both T s,i and T from air temperature, hereby assuming that the source and lateral inflows originate from the same hydrological processes.Moreover, since these two parameters are the same for all catchments, it is implicitly supposed that the ratio of surface runoff to subsurface runoff is the same in all watersheds.As discussed in Sect.2.4, this requires catchments to be classified by hydrological regime before a w and b w can be calibrated separately for each regime.In Eqs. ( 19)-( 20), the monthly mean air temperature is computed in each catchment using Eq. ( 15).The distance average of variables T , γ φ r , T a and dz/dx are computed by discretizing the operator • L over the stream segments, where f k denotes the unweighted mean value of variable f along stream segment k; the other quantities have been defined previously in Sect.3.1.1.Except for the riparian forest cover f f , which is derived over buffers of widths 25, 50 and 100 m, the unweighted means of all other quantities (namely φ isw , f s , T a and dz/dx) along each stream segment are computed over a 20 m wide buffer centered around the centre line of the segment, as extracted from the vector representation of the stream network at scale 1 : 25 000 (see above).The value of 20 m is considered to be typical for the width of the streams investigated in the present study; although only this value has been tested, it is expected to have little impact on the computed averages.It should be noted that the expressions for T a L and φ isw L both reduce to linear functions of the distance-weighted average of altitude along the stream network z L as per Eqs.( 15) and ( 17).The length L k of stream reach k and the distance d k between the downstream end of reach k and the catchment outlet are derived from the vector map of the stream network.
Replacing the terms in Eq. ( 8) with their above expressions, the stream temperature model reads where The calibration parameters of the model are listed in Table 2.When testing a constant parametrization for the source and lateral inflow temperatures, a w should be set to 0. Similarly, at least two of the coefficients {a φ,x } x=isw,s,θ,f are assumed equal to 0, as per the parametrization of the radiation term discussed above.Thus, between three and eight parameters must be calibrated, depending on the methods used to approximate the respective unknown variables in Eq. ( 8).Advantage is taken of the fact that each parameter can be interpreted from a physical point of view to restrict its associated calibration range (see Chosen so as to constrain T s,i and T to the range 0-25 Eq. ( 17) ( • C m 2 W −1 ) Chosen so as to constrain γ φ r to the range −20-20 • C Must be positive a φ,s Eqs. ( 9)-( 10) (-) 0-1 None A s Eq. ( 18) (m 2 ) Chosen so as to constrain η in the range 0-1 Must be positive α (-) 0-3 None to adopt a value between 0 and 1 as per Eq. ( 12), and only positive values are considered for a φ,isw based on the fact that solar radiation is contributing positively to the net radiation heat flux.Moreover, six different values are tested for the characteristic length L c used in the definition of • L : 1, 2, 4, 8, 16 and 32 km (see Sect. 4.2).All possible combinations of the different parametrizations of the model terms are tested for each one of these values of L c .The model associated with the lowest value of the modified Akaike information criterion (AICc) is considered to be the best one among the tested set (e.g.Burnham and Anderson, 2002).As mentioned in Sect.3.1.1,the model is calibrated in each season separately to account for the fact that the value of the parameter L c varies over the year.

Standard regression model
In order to assess its performances, the physics-inspired statistical model described by Eq. ( 22) is compared with a more classical regression model which we developed based on a combination of some of the standard statistical approaches reviewed in Sect. 1.The regression model takes advantage of the fact that most stream temperature curves have a similar shape (see Sect. 2.4).This shape is first estimated by the model based on air temperature, before being mapped to the respective stream temperature curves of the catchments using a linear transformation.
The model assumes all streams to have the same z scored (i.e.standardized) temperature T (-).The latter is related to the monthly mean temperature T i of each individual catchment i through (see e.g.Garner et al., 2014) where T i ( • C) and σ i ( • C) correspond to the annual mean and standard deviation of monthly mean stream temperature in catchment i, respectively.These two quantities are estimated each year independently using multi-linear regression (MLR) models.Although more sophisticated techniques could have been used, Wehrly et al. ( 2009) and Daigle et al. (2010) showed that MLR performs at least as well as several more complicated statistical methods for stream temperature prediction.The MLR models were constructed using similar predictor variables as in the physics-inspired statistical model, namely the annual mean and standard deviation of both air temperature and incoming short-wave radiation, the riparian forest cover fraction, stream channel slope, stream orientation, the difference in topographical shading between summer and winter, the number of sources in the network and the watershed area.All multi-linear models based on any possible subset of these variables were tested, with a maximum number of terms per model arbitrarily fixed to six.This limitation was introduced in order to avoid overparametrization, but also to ensure that the number of parameters in the final standard regression model was about the same as in the physics-inspired model, hereby guaranteeing a more even comparison between the two.Multicollinearity issues were avoided by discarding MLR models whose variance inflation factor (VIF) exceeded 5.Each predictor variable was distance-averaged over the stream networks using the operator • L , as in the case of the physics-inspired statistical model.Different values of L c were considered when applying this operator as per Eq. ( 21): 1, 2, 4, 8, 16 and 32 km.The best predicting MLR models for T i and σ i were selected based on AICc.In Eq. ( 24), the z scored stream temperature is computed each month based on a non-linear relationship with air temperature, where µ ) and κ ( • C −1 ) are coefficients obtained through ordinary least squares regression, and T a (-) denotes the mean z scored air temperature over Switzerland.T a is obtained by averaging the z scored measurements of the 14 MeteoSwiss reference meteorological stations (see Sect. 2.3), In the above equation, T a,k ( • C) denotes the monthly mean air temperature measured at reference station k, and T a,k ( • C) and σ a,k ( • C) refer to the annual average and standard deviation of T a,k computed each year independently, respectively.In summary, the standard regression model proceeds as follows to estimate stream temperature in an ungauged basin: (a) it first computes the mean z scored air temperature over Switzerland according to Eq. ( 26), based on the measurements of 14 meteorological stations, (b) it then uses T a to estimate the z scored stream temperature in any catchment as per Eq. ( 25), and finally (c) it converts T to the actual stream temperature using Eq. ( 24), where the scaling coefficients T i and σ i are estimated for the catchment of interest using MLR models.These different steps will be illustrated in more detail in Sect.4.4.

Model evaluation
In order to rigorously evaluate the performance of the two models described in the previous section, 5 of the 26 selected catchments were removed from the data set to create an independent validation set (watersheds 3, 6, 11, 13 and 27, displayed in orange in Fig. 1).Caution was given to single out basins with different size, mean elevation and geographic location.Among the four climatic regions of Switzerland, only the Jura could not be represented in the validation set, given that only 1 station (number 26) among the 26 available was located in this area.A bootstrap on the validation stations was not possible because of too high computational requirements.Indeed, Burnham and Anderson (2002) recommend using at least 10 000 bootstrap samples, which led to a prohibitively high number of model evaluations in our case.
The measurement time period is also split into a calibration (2007)(2008)(2009)(2010)(2011)(2012) and validation (all dates before and including 2006) period.Only the measurements performed by the calibration stations -whose drainage area is marked in green in Fig. 1 -during the period 2007-2012 are used to calibrate the models.Four different validation sets can be formed with the remaining station months.
1.The data set containing the measurements of the validation stations during the calibration period.This set can be used to evaluate the ability of the models to make predictions in ungauged basins.
2. The data set containing the measurements of the calibration stations during the validation period.This set will be used to evaluate the precision of the models when predicting stream temperature in past or future years.
3. The data set formed by the measurements of the validation stations during the validation period.This set serves to evaluate the performance of the models when predicting stream temperature both in ungauged basins and in ungauged years.
4. The data set corresponding to the union of all three previous validation sets, which may be used to obtain a synthetic evaluation of the precision of the models.
The complete data set is almost equally subdivided into its calibration and validation parts, with the former containing 1223 station months and the validation sets 1-4 regrouping 360, 705, 204 and 1269 station months, respectively.
As mentioned in Sect.3.1, the value of the characteristic stream length L c is expected to change over the course of the year.In order to ease capturing of this variability, the physics-inspired statistical model is calibrated over each season separately.As such, the calibration and validation data sets are each subdivided into four groups, corresponding to winter (January-March), spring (April-June), summer (July-September) and autumn (October-December), respectively.Each one of these subgroups contains approximately one-fourth of the station months originally belonging to the parent group.The standard regression model is calibrated over all seasons at once, but is evaluated in each season separately so as to investigate a potential effect of the period of the year on its precision.
Since the physics-inspired model expresses stream temperature as a linear function of air temperature, it cannot reproduce the asymptotic behaviour of the former as the latter drops below 0 • C. Consequently, data points associated with negative air temperature values are removed from the data set before calibration (Kelleher et al., 2012).When evaluating the model over the validation sets, all stream temperatures predicted to be negative are replaced with 0 • C values.
In the following, the best seasonal formulations of the physics-inspired model are presented first.The precision of this model is then evaluated, and the influence of the stream network resolution on the model results investigated.Finally, comparison is made with the standard regression model.All the results presented in this section will be discussed and analysed in Sect. 5.

Model formulations
As mentioned in Sect.3.1.2,the different possible formulations of the physics-inspired statistical model are ranked in each season according to their respective AICc value.AICc is preferred here over the classical definition of the Akaike information criterion (AIC) since it includes a correction term for finite-sized data sets (Burnham and Anderson, 2002).It should be mentioned that, following Burnham and Anderson (2002), AICc is computed by calibrating the models not over the calibration set only, but rather over the entire data set (i.e. both the calibration set and validation set 4).Only in a second time is each model calibrated over just the calibration set, so as to evaluate its performances in terms of RMSE, R 2 and bias.
Table 4 presents the best model formulations selected in each season, ranked according to their respective Akaike weights w i .The latter corresponds to the probability of each model being a better descriptor of the observed data (according to information theory) as compared to the model with the minimum AICc value (Burnham and Anderson, 2002;Wagenmakers and Farrell, 2004).Considering models with w i ≤ 0.1 to be statistically insignificant, it can be observed that only a few formulations were identified in each season as being relevant for stream temperature prediction.The characteristic stream length L c is found to be consistent among these formulations, with a value of 4 km in spring, summer and autumn, and 8 km in winter, regardless of the formulation.
The model selection reveals the radiation term γ φ r to be preferentially expressed as a function of topographical shading f s and riparian forest cover fraction f f , or as a function of f s alone.Among the tested buffer widths used to compute f f , none of the three values 50, 100 or 200 m prevails significantly over the others.The order in which they appear in the ranked models varies depending on the season; for example in winter, forest cover computed over a 100 m wide buffer is expected to be a better predictor of γ φ r than forest cover over a 50 m wide buffer, whereas the opposite is true in spring.Focusing on each season separately, the linear coefficient associated with any given term is observed to have a fairly constant value among the different expressions tested for γ φ r .For example, the coefficient multiplying f s remains within a narrow range (at most 1 • C large) in each season.
This behaviour is even more pronounced in the case of the term associated with the source and lateral inflow temperatures (T s and T ).This term is expressed as a linear function of air temperature, whose slope a w and intercept b w are constant among the various model formulations in a given season (see Table 4).The values of a w are observed to be rather low independently of the period of the year, which indicates a weak coupling between the stream source (or lateral inflow) temperature and air temperature.Moreover, a w and b w differ among the seasons in such a way that T s and T are the least coupled to air temperature in winter and the most in summer.
The model ranking based on AICc also identified a single expression for η in each season.This parameter is found equal to 1 in summer and autumn, and 0 in winter.Its expression is slightly more complicated in spring, where the selected formulation is the one based on the source drainage area (see Sect. 3.1.2).

Model performance
The RMSE, R 2 and bias of the best selected model formulation in each season -i.e. the one with w i = 1 -are reported in Table 5.Based on the results of the evaluation over validation set 4, the model precision is observed to be rather satisfactory.Its RMSE and R 2 are relatively constant over the year (about 1.3 • C and 0.87, respectively), except in winter where the value of the coefficient of determination is much lower (0.55).Similarly, the bias is small in all seasons (−0.11 to 0.14 • C) apart from winter (−0.47 • C).
Table 4. Formulations of the physics-inspired statistical model selected in each season based on their corresponding AICc value.The Akaike weights are denoted as w i .Only the model formulations with w i ≥ 0.1 are presented here.The widths of the buffers used to compute the riparian vegetation cover are indicated as subscripts of the variable f f (the indicated values correspond to the total buffer widths, i.e. accounting for both sides of the stream centre line).

Season
w i L c (km) Formulation of γ φ r Formulation of T s and T Formulation of η Winter Regarding the different validation sets, it can be observed in Table 5 that the model performs better when predicting in ungauged catchments as compared to simulating past or future years.Indeed, the RMSE values computed using validation set 1 are smaller than those based on set 2, particularly in winter, autumn and summer.Similarly, the values of R 2 are higher over set 1 than over set 2, despite the fact that the model bias is larger over the former set as compared to the latter.As expected, the weakest model performances are generally associated with validation set 3, which contains the measurements performed by the validation stations during the validation period.The only noticeable exception is in summer, where the model evaluation over set 3 provides satisfactory results (RMSE = 1.13 • C, R 2 = 0.90, bias = −0.01• C).

Influence of the stream network resolution
The results reported above are based on the stream network geometries extracted from the land cover map at scale 1 : 25 000 (see Sect. 3.1.2).These geometries directly affect the values of the distance-averaged predictor variables, since the operator • L averages over the entire stream network.As a consequence, modifying the network resolution is expected to impact the model performance.
To test this hypothesis, two additional stream networks with a coarser resolution than the original one were investigated.These networks were obtained by removing stream segments with Strahler order 1, and those with Strahler order 1 and 2, respectively.Through this procedure, the mean drainage density of the 26 selected catchments decreased from 2.1 km km −2 for the original network to 0.5 km km −2 for the coarsest one, passing through 1.0 km km −2 for the intermediate-resolution network.The different model formulations were evaluated over the two additional networks using the same procedure as described in the previous section.Although the results are not reported here, it was essentially observed that the network resolution had little influence on the ranking of the model formulations based on AICc in each season.Almost all selected models were associated with a characteristic stream length L c = 4 or 8 km, as in the case of the original stream network.The parametrization of the net radiation heat flux γ φ r was also similar to the one reported in Table 4. Topographical shading and riparian forest cover remained the two most statistically significant predictors for this term, except during winter, where stream orientation appeared as a relevant variable.The values of the coefficients a w and b w were noted to vary little among the selected model formulations in a given season.Finally, the parameter η was preferentially expressed as a constant term.Its value was identified as being 0 in all seasons except summer in the case of the intermediate-resolution stream network, whereas its parametrization was close to the one described in Table 4 in the case of the coarser network.
As a consequence of the little influence of the network resolution on the model parametrization, few variations in the model precision were observed between the three stream networks.As seen in Fig. 3, no clear tendency can be identified among the residuals.At most, a small increase in the model prediction error can be detected for the coarsest network as compared to the first two, especially in autumn.The largest absolute residuals are also observed to be generated by this network.On the other hand, the strong bias previously noted in winter is present in the case of the intermediate resolution network, but less so in the case of the coarsest resolution one.

Comparison with the standard regression model
This section describes the characteristics of the calibrated standard regression model first, before presenting the results of its evaluation in a second step.Figure 4 pictures the observed monthly mean stream temperature, z scored and averaged over the 21 calibration stations, as a function of T a over the period 2007-2012.As can be observed, the relationship between these two quantities displays a small hysteresis effect, which can be explained by stream cooling due to snowmelt in spring (Mohseni et al., 1998).The logistic equation introduced by Mohseni et al. (1998) is fitted to each one of the hysteresis branches separately, T = −1.87+ 4.88 1 + e −0.99•( T a −0.66) (in January-June), ( 27) T = −1.86+ 3.96 1 + e −1.16•( T a −0.06) (in July-December). (28) It should be noted that the parameters corresponding to the lower and upper asymptotic values of the logistic curve are particularly sensitive to the data points located at both ends of the hysteresis.To limit inaccuracy errors, the temperatures measured in January and July were used to fit both branches of the hysteresis, as they usually correspond to the annual extreme values.Equations ( 27)-( 28) are those used in the model to determine the z scored stream temperature T at any location based on T a (see Sect. 3.2).The multi-linear regression models which were selected to estimate the annual mean T i and the standard deviation σ i of stream temperature in a given year are presented in Table 6.They correspond to the models associated with the lowest AICc values among the tested formulations (see Sect. 3.2).As observed in the table, the characteristic stream length used by the operator • L to average the predictor variables over the stream networks is significantly different in the two cases: L c = 4 km for the T i model, whereas L c = 32 km for the σ i one.
Table 7 summarizes the prediction errors of the standard stream temperature regression model when evaluated over validation set 4 using the original stream network.Comparison with Table 5 reveals that its precision is greater than the one of the physics-inspired model.Its RMSE is about 0.2 • C lower, its R 2 about 0.03 to 0.12 larger, and its absolute bias 0.05 to 0.20 • C smaller, depending on the season.However, denotes the difference in topographical shading between summer and winter, T a,i and σ a,i refer to the annual mean value and standard deviation of air temperature in the year of interest, respectively, A tot denotes the watershed area and |dz/dx| the channel slope.The other variables have been defined in the text.its performance worsens when using the two stream networks with coarser resolution: its yearly average RMSE increases to 1.26 • C in the case of the intermediate resolution network, and even 1.29 • C for the coarsest network, which is close to the value obtained with the physics-inspired model.

Discussion
The formulations of the physics-inspired model selected by AICc ranking are consistent among the different seasons.In particular, topographical shading systematically appears to be the strongest predictor of the net radiation heat flux γ φ r .This observation is not particularly surprising in a mountainous country like Switzerland, where some valleys are steep enough for their bottom not to be illuminated by direct sun-light for some period of the year.The basins referred to as numbers 14 and 24 in Table 3 are examples of such watersheds, both having a mean catchment slope larger than 35 • .Riparian forest cover fraction corresponds to the second most important predictor for the net radiation heat flux term.It was rather unexpected to identify this parameter as relevant during autumn and winter, especially since more than half of the selected catchments are mainly covered with deciduous forests due to their relatively low mean elevation (< 1000 m).This result has to be balanced with the fact that a given fraction increase in riparian forest cover is predicted by the model to have an effect on γ φ r about 4 to 6 times smaller in magnitude than the same fraction increase in topographical shading.It should also be remembered that the precision of the model is rather low in winter, hereby questioning the validity of its parametrization in this season.Certainly more unexpected is the absence of solar radiation among the predictors of γ φ r , which will be explained below.Regarding the parametrization of the discharge fraction due to lateral water inflow η, the model predicts the water in the stream channel to originate principally from surface and subsurface runoff during summer and autumn (η = 1).This partly matches our expectations, since the fraction of discharge originating from the sources is expected to decrease when moving downstream along a given network.The characteristic catchment size defining the transition from source-water-dominated to lateral-inflow-dominated discharge is controlled here by L c , which is equal to 4 km in summer and autumn.This value is smaller than the main channel length in more than 90 % of the selected watersheds (not shown), hereby strengthening our confidence in the parametrization of η during these two seasons.On the other hand, a value of 1 for η in all catchments may appear as a too simplified approach (see below).
The questioning of the parametrization of η is all the more true in winter, where its value is predicted to be 0.Only in spring did the model ranking select the more physically based formulation for η, expressed as a function of the area drained by each source.Concerning the parametrization of the source and lateral inflow temperatures, it should be mentioned that the linear expression as a function of air temperature was systematically preferred over the constant term.This certainly results from the large altitudinal range covered by the selected catchments, which does not allow for a constant inflow temperature to reflect the diversity of encountered climatological conditions, and mainly air temperature.
As defined in Eq. ( 22), the physics-inspired model linearly relates air temperature to stream temperature through the proportionality coefficient ω 3 .The latter is compared in Fig. 5 with its actual observed value, namely the slope of the regression line between the monthly mean temperature measurements of the stream and air.As seen in the figure, the model systematically overestimates the value of ω 3 , particularly in winter and summer, where the mean bias equals 0.2.Referring to Eq. ( 22), this implies that ω 1 and ω 2 are globally underestimated by the model, hereby indicating that the parametrization of the factor η could possibly be improved.As noted in Sect.3.1.2,a more physically based expression could be used to compute η, as long as a geomorphological analysis of the river watersheds can be performed.This approach was not investigated here for the reasons mentioned earlier.
The overestimation of ω 3 is probably at the origin of the fact that solar radiation is unexpectedly missing from the selected expressions for the net radiation heat flux γ φ r (see Table 4).Contrary to the standard regression model, the physics-inspired model presents the advantage that the calibration range of most of its parameters can be restricted based on physical considerations (see Table 2).An attempt was made to remove these constraints, which resulted in incoming short-wave radiation being present in almost all models for γ φ r , but associated with a negative coefficient.It was concluded that the unconstrained model takes advantage of the fact that the annual cycles of air temperature and solar radiation have a similar shape to artificially reduce the value of air temperature by subtracting a fraction of solar radiation, hereby compensating for the too large value of ω 3 .This observation argues once more in favour of a better parametrization of the factor η.
As mentioned in Sect.4, the characteristic stream length L c is found to be of the order of 4-8 km regardless of the season or the stream network resolution in the case of the physics-inspired model.This range is in agreement with the findings of Isaak et al. (2010) and Macedo et al. (2013).Hra- chowitz et al. ( 2010) and Chang and Psaris (2013) rather concluded that L c was around 1 km; however, they did not investigate values for L c larger than 1 km.Contrary to our expectations, we do not observe a marked variation of L c across seasons, probably due to the fact that we assumed a single value for all the catchments.The annual cycle of L c may have been better captured by separately calibrating this parameter in each catchment individually, but this would have contradicted our aim to derive a regional model.It should be emphasized that the absence of an observed annual cycle for L c does not question the decision to calibrate the physicsinspired model on a seasonal basis, since the source temperature parametrization is observed to vary significantly over the year (see Table 4).
Our model is rather equivocal regarding the width of the riparian buffer which is relevant for the determination of stream temperature at a given point.As a matter of fact, none of the tested buffer widths appears to prevail over the other ones in the retained parametrizations of γ φ r .This ambiguity reflects the range of buffer widths used in the literature, which extends from 30 m (e.g.Jones et al., 2006;Macedo et al., 2013) to 200 m (e.g.Scott et al., 2002;Segura et al., 2014).This also points to the difficulty in adequately accounting for the effect of riparian vegetation using the available spatial data sets, which often lack important details such as the distinction between deciduous and coniferous forest.
The precision of the physics-inspired model was reported in the previous section to be rather low in January-March.This can be explained by the fact that the non-linearity of the stream-air temperature relationship at low air tempera-

A. Gallice et al.: Stream temperature prediction in ungauged basins
ture values is not captured by the model.The latter rather simulates a sharp transition from the linear regime to a constant one, since the stream temperature values predicted to be negative are systematically replaced with 0 • C.This implies a faster decrease towards 0 • C, which is at the origin of the strong negative model bias in winter.
As is noticeable in Table 5, the model RMSE is larger in spring as compared to the other seasons.This is attributable to the fact that many of the selected watersheds are impacted by snowmelt in spring.Since the snow cover conditions are strongly variable both spatially and temporally, a large dispersion of the stream temperature values is typically observed in spring.The model performs nonetheless relatively well in this season, for its R 2 value is of the same magnitude in spring as during the rest of the year.
Advantage can be taken of the physics integrated into the model structure to investigate some aspects of the stream temperature dynamics.For example, Fig. 6 displays the respective values of the factors ω 1 , ω 2 and ω 3 appearing in Eq. ( 22) as a function of the season.These factors correspond to the weights associated with the mean source temperature T s , average lateral inflow temperature along the network T L and average equilibrium temperature along the network T e L , respectively.As seen in the figure, ω 3 is the largest factor of the three in all seasons, with a value of about 0.6-0.8.This results from the fact that stream temperature is primarily impacted by the atmospheric conditions.The other two factors are nonetheless non-negligible, with ω 1 being of the order of 0.4 in winter and ω 2 being approximately equal to 0.2 from April to December.The value of ω 1 has to be put into perspective with respect to the fact that our confidence in the model parametrization is relatively low in winter.Moreover, following the above discussion about the computation of the factor η, the values reported here for ω 2 should be considered as a lower limit.It therefore appears that not only the net total heat flux at the air-water interface is important in determining stream temperature, but also the heat flux associated with the lateral inflow of water.This conclusion is in agreement with the findings of Bogan et al. (2004), who found that the precision of their stream temperature model was improved by including a term accounting for the lateral water inflow.Similarly, Herb and Stefan (2011) mention that the heat input associated with groundwater infiltration may be of the same order of magnitude as the heat flux due to atmospheric forcing in some cases.This effect seems to be largely underestimated in the literature, since the lateral inflow of water has often been neglected in previous stream temperature models (e.g.Edinger et al., 1968;Bogan et al., 2003;Caissie et al., 2005;Bustillo et al., 2014).
The simplifying assumptions (i)-(vi) reported in Sect.3.1.1are likely to have limited the performance of the physics-inspired model.In particular, the assumption of a spatially homogeneous lateral inflow rate q is expected to fail in most catchments.For example, only the highest regions of low-altitude catchments experience snowmelt in spring.In higher-altitude catchments, snowmelt leads to an increase in q only at low altitudes at the beginning of spring, and only at higher altitudes later in the season.These mechanisms introduce an altitude dependence in q which contradicts our assumption and may partly explain the higher RMSE of the model in spring.Similarly, assumption (v) expresses stream width as a linear function of discharge.As compared to the typical power-law relationship used in fluvial geomorphology (Knighton, 1998), this simplification may lead to an overestimation of stream width at low discharge rates -i.e. in small catchments -and to an underestimation of stream width at high discharge ratesi.e. in large catchments.This may in turn decrease the ability of the model to simulate catchments of various sizes, hereby increasing its prediction error.Assumption (vi), stating that all sources in a given catchment have the same discharge rate, is also disputable.This is particularly true for small catchments, where the short distance to the outlet and the low number of sources do not allow the averaging effect to be significant enough to compensate for the introduced error.
In addition to the simplifying assumptions discussed above, the parametrizations of the unknown terms in the analytical solution might also have impacted the model precision.Indeed, the estimation of the source and lateral inflow temperatures using only air temperature has recently been questioned, particularly for the catchments impacted by snowmelt or glacier melt (Leach and Moore, 2015).This simplification may notably have contributed to increase the model RMSE in spring.Regarding the parametrization of the term accounting for the net radiation heat flux at the airwater interface, the use of a linear expression may appear as limiting.We actually tested a power-law function as well, but did not succeed in calibrating the model due to convergence issues.We also considered an alternative expression based on an estimation of the incoming atmospheric radiation and a first-order approximation of the long-wave radiation emitted by the stream (not shown).Rather than an improvement, this parametrization actually led to a decrease in the model precision, as a result of its inability to compensate for the overestimation of ω 3 (see above).
As opposed to the physics-inspired model, the parameter values of the standard regression model could not be constrained using physical considerations.As a result, the sign of some of the linear coefficients relating the predictor variables to T i and σ i are in contradiction with our understanding of stream temperature dynamics (see Table 6).For example, the stream orientation θ , measured as the cosine of the angle between north and the channel direction, is positively related to the annual mean stream temperature T i .It was rather expected that north-oriented catchments receive less radiation from the sun, hereby implying lower stream temperatures.The same observation is true for the riparian forest cover fraction, which is positively associated in the model with the annual standard deviation σ i of stream temperature.However, experimental observations tend to conclude that riparian shading has a buffering effect on stream temperature, therefore dampening the amplitude of the variations of the latter (see e.g.Moore et al., 2005).Despite these inconsistencies, the standard regression model performs better than the physics-inspired one in terms of RMSE, R 2 and bias.This fact questions further the validity of the parametrization of the physics-inspired model, which could certainly be improved (see Sect. 6).On the other hand, the standard regression model appears to be much more sensitive to the stream network resolution as compared to its counterpart, possibly as a consequence of its lack of physical elements in its structure.This lack also does not allow for the investigation of the physics governing stream temperature, as can be done with the physics-inspired model (see Fig. 6).

Conclusions
This study aimed to present a new statistical model for the prediction of monthly mean stream temperature in ungauged basins.As opposed to the standard statistical methods, this model is devised so as to incorporate physical considerations into its structure.To this end, it is built upon the analytical solution to a simplified version of the one-dimensional heat advection equation.Contrary to previously reported analytical solutions, the present one is obtained by solving the equation over an entire stream network instead of a single stream each.Moreover, the various terms of the equation are not supposed to be spatially homogeneous, which leads to the apparition of a space averaging operator • L applied to most terms of the solution.This operator uses a weight which decreases exponentially with the distance to the catchment outlet, hereby giving more emphasis to the points located near the gauging station.Both the source and the lateral inflow terms -which are usually neglected -are retained in the final solution to the heat advection equation.This notably enables the model to be applied in small watersheds, where the influence of source temperature on the value of stream temperature measured at the catchment outlet cannot be discarded.
While most terms of the analytical expression can be evaluated using meteorological observations or topographic maps, some require data which are not available.These terms are replaced with approximations based on the spatial data sets at hand.In particular, the net radiation heat flux at the air-water interface is expressed as a linear combination of several physiographic variables.Similarly, the source and lateral inflow temperatures are approximated as a linear function of air temperature measured at the source location and along the stream, respectively.Finally, the fraction η of discharge at the catchment outlet originating from lateral water inflow along the network is estimated based on the number of sources in the watershed.As a consequence of these approximations, the resulting model is statistical in nature, but nevertheless retains physical aspects due to its global structure being derived from the heat-balance equation.
The performance of the model is quite satisfactory, with a root-mean-square error of about 1.3 • C and a coefficient of determination R 2 of 0.87 when used for stream temperature prediction in "thermally climate-driven" catchments.These catchments, which are by far the most abundant ones in Switzerland, correspond to those with a glacier cover lower than 50 % and whose stream is not impacted by groundwater infiltration from a deep aquifer.Model precision is the lowest in winter, due to the inability of the model to reproduce the fact that stream temperature asymptotically tends towards 0 • C for negative air temperature values.
The precision of the model was also assessed by comparing it with a more standard regression model.The latter was observed to perform slightly better, with a RMSE about 0.2 • C lower.However, its parameters could not be interpreted from a physical point of view, hereby hindering the restriction of their respective calibration ranges based on physical considerations.This led the regression model to simulate some aspects of the stream temperature dynamics wrongly.For example, some physiographic variables known to have a cooling effect on water temperature were modelled as warming up the stream.The standard regression model was also observed to be much more sensitive than its physicsinspired counterpart with respect to the stream network resolution.When discarding all stream segments with a Strahler order equal to 1, the RMSE of the regression model increased from 1.12 to 1.26 • C, whereas the one of the physics-inspired model remained constant up to 0.01 • C.
Despite a few deficiencies, the physics-inspired statistical model can be used to analyse some aspects of the physics governing stream temperature.As an example, the relative importance of each one of the stream heat sources could be determined from the model.Climatic forcing was found to be the major driver of water temperature, as expected (e.g.Caissie, 2006).More interestingly, the lateral inflow of water was identified as a non-negligible secondary heat flux.This fact is confirmed by other studies (e.g.Bogan et al., 2004;Herb and Stefan, 2011), but nonetheless fails to be accounted for in many stream temperature models (e.g.Caissie et al., 2005;Bustillo et al., 2014).We therefore wish to emphasize the role of lateral water inflow in stream temperature, even in catchments -such as those used in this study -which are not impacted by groundwater infiltration originating from a deep aquifer.
Among the improvements that can be brought to the physics-inspired model, a more accurate parametrization of the discharge fraction originating from lateral water inflow η appears as a promising enhancement.In particular, η could be estimated from a geomorphological analysis of the catchments.This approach was not retained here due to the discrepancy between the stream networks predicted by the geomorphological analysis and the observed ones.In case it could be implemented, such a revision is expected to improve the predicted slope of the stream-air temperature curve.A geomorphological analysis could also positively influence the modelling of the source and lateral inflow temperatures.The parametrization of these two terms could be improved by including predictor variables accounting for e.g. the glacier cover fraction or the mean altitude of the area drained by each source (or stream reach).The model could also be substantially improved in case the characteristic stream length L c , which controls the extent of the spatial area over which the operator • L acts, could be computed instead of calibrated.Indeed, L c does not only present a seasonal variation, but also differs across the individual catchments, a fact which was neglected in the present work.Finally, one might expect the model precision to improve by using a more physically based parametrization for the net radiation heat flux -instead of the multi-linear model used here.
We expect the physics-inspired model to be easily transferable to other regions of the globe.The parametrization of the net radiation heat flux at the air-water interface might need some adaptation in order to correctly reflect the dominant physiographic controls on local stream climate.For example, topographic shading is certainly not a relevant predictor variable over flat regions.Similarly to the approach presented in this work, the most appropriate set of predictor variables for the net radiation heat flux over a particular region can be obtained through AICc ranking.Once set, the stream temperature model can be used to investigate e.g. the extent of the stream network which is thermally suitable for sensitive fish species at the regional scale (e.g.Isaak et al., 2010).This investigation can in turn serve as a basis for the introduction of regulation policies or protection measures.be expressed as where k and k denote the set of source points and reaches draining into reach k, respectively, as illustrated in Fig. A1c.
Inserting the above equation into Eq.(A12) and re-arranging the terms, A 2 may be written as where i refers to the set of reaches linking the ith source point to the network outlet, and k denotes the set of reaches linking reach k to the network outlet, not including reach k itself (see Fig. A1c).Assuming that all source points have the same discharge Q s , and replacing the integrals along the reaches with their respective values, Eq. (A14) can be written more simply as where Eq. (A6) has been used in the second step to replace n s Q s with Q tot − q L tot , and the factors δ s and δ are defined as (A19) The factor η appearing in Eq. (A18) denotes the ratio between the discharge originating from lateral inflow and the total discharge at the network outlet, )

Figure 1 .
Figure 1.Locations of the gauging stations selected for the evaluation of the physics-inspired and standard statistical models.The stations are displayed as red points and their associated catchments as green or orange areas, depending on whether they are used to calibrate or validate the model.The four main climatic regions of Switzerland -the Jura mountains, Plateau, Northern Alps and Southern Alps -are displayed in different colours.The numbering corresponds to Table3.

Figure 2 .
Figure 2. Classification of the thermal regimes of the selected catchments.Streams impacted by groundwater infiltration are shown in green, the proglacial stream in blue and the thermally climate-driven streams in orange.(a) Normalized monthly mean stream temperature curves over 3 consecutive years (2010-2012); all curves are z scored independently each year.(b) Slopes and intercepts of the regression lines fitted to the stream-air temperature points of the respective catchments.All points with negative air temperature values have been discarded prior to fitting.The bars indicate the standard error estimates.

Figure 3 .
Figure 3. Prediction error of the physics-inspired statistical model for different resolutions of the stream network.The boxes extend from the first to the third quartile of the error distribution.Outliers are displayed as red dots.In each season, the network resolution decreases from left to right: the left box corresponds to the network with all stream reaches, whereas the central and right boxes contain only the stream segments whose Strahler order is greater than or equal to 2 and 3, respectively.The error values 0, −1 and +1 • C are displayed as a solid grey line and two dashed grey lines.

Figure 4 .
Figure 4. Non-linear relationship between the z scored stream temperature T and the z scored air temperature T a averaged over 14 reference meteorological stations.The values of T are obtained by averaging in each month the z scored stream temperatures measured at the 21 calibration stations.Each point corresponds to a single month of the calibration period 2007-2012.Months from January to July are displayed as green crosses, and those from July to December as blue dots.The two solid lines correspond to the respective fits of the data points in the two year halves (see Eqs. 27-28).

Figure 5 .
Figure 5.Comparison of modelled against measured slopes of the regression line between stream and air temperatures.The panels correspond to the different seasons: (a) January-March, (b) April-June, (c) July-September, and (d) October-December.The bias b corresponds to the average, in each season, of the difference between the modelled and measured regression slopes over all the selected stations and years (i.e.belonging to both the calibration set and validation set 4).The 1 : 1 line is indicated as a dashed grey curve.

Figure 6 .
Figure 6.Seasonal values of the factors ω 1 , ω 2 and ω 3 weighting the different terms in Eq. (8).The values of these weights are evaluated over the entire data set, i.e. both the calibration set and validation set 3. The error bars indicate the confidence interval centered around the mean and extending over 1 standard deviation on each side.

Figure A1 .
Figure A1.Schematic representations of (a) a stream reach and (b-c) a stream network, illustrating the notation used in Appendix A to derive the analytical solution to the stream energy balance.
k /L c 1 − e −L k /L c = A 1 L tot .(A17)Combining Eqs.(A7), (A8), (A11) and (A15), the expression for stream temperature at the network outlet can eventually be written in a more convenient form,T out = (1 − η)δ s T s + ηδ T L + 1 − (1 − η)δ s − ηδ γ φ r + T a − L c operator • Q has been approximated by• L , and T s corresponds to the distance-weighted source temperature, averaged over all sources and weighted by a factor decreasing exponentially with the respective distance of each source to the network outlet, T s = 1 n s δ s n s i=1 e −d s,i /L c T s,i .

Table 1 .
List of reviewed publications about statistical stream temperature prediction in ungauged basins.

Table 2 .
Calibration parameters of the physics-inspired statistical model.

Table 3 .
Physiographic properties of the 29 selected hydrological catchments in Switzerland.The three watersheds indicated in bold are not used for the model evaluation.

Table 5 .
Performance of the best physics-inspired statistical model in each season (w i = 1) in terms of RMSE, R 2 and bias, depending on the validation set.The numbers between brackets refer to the different validation sets (see beginning of Sect.4).

Table 6 .
Best multi-linear regression models for the prediction of annual mean T i and standard deviation σ i of the monthly mean stream temperature in a given year.All predictor variables are averaged over the stream networks using the operator • L .

Table 7 .
Performance of the standard regression model in terms of RMSE, R 2 and bias computed over the validation set 4 in each season.The stream network used to evaluate the model corresponds to the original one derived from the map at scale 1 : 25 000.