The development of stream temperature regression models at regional scales
has regained popularity in recent years. These models are used to
predict stream temperature in ungauged catchments to assess the impact of
human activities or climate change on riverine fauna over large spatial
areas. A comprehensive literature review presented in this study shows that
the temperature metrics predicted by the majority of models correspond to
yearly aggregates, such as the popular annual maximum weekly mean temperature
(MWMT). As a consequence, current models are often unable to predict the
annual cycle of stream temperature, nor can the majority of them forecast the
inter-annual variation of stream temperature. This study presents a new
statistical model to estimate the monthly mean stream temperature of ungauged
rivers over multiple years in an Alpine country (Switzerland). Unlike
similar models developed to date, which are mostly based on standard
regression approaches, this one attempts to incorporate physical aspects into
its structure. It is based on the analytical solution to a simplified version
of the energy-balance equation over an entire stream network. Some terms of
this solution cannot be readily evaluated at the regional scale due to the
lack of appropriate data, and are therefore approximated using classical
statistical techniques. This physics-inspired approach presents some
advantages: (1) the main model structure is directly obtained from first
principles, (2) the spatial extent over which the predictor variables are
averaged naturally arises during model development, and (3) most of the
regression coefficients can be interpreted from a physical point of view –
their values can therefore be constrained to remain within plausible bounds.
The evaluation of the model over a new freely available data set shows that
the monthly mean stream temperature curve can be reproduced with a
root-mean-square error (RMSE) of

Among the parameters affecting the ecological processes in streams, temperature
plays a predominant role. It influences the concentration of chemicals, such
as dissolved oxygen, and may increase the toxicity of dissolved substances

As a result of the rising concern about climate change and water management
impacts on aquatic life, stream temperature modelling has regained some
interest over the past 10–15 years. This has fostered the development of many
stochastic and deterministic models

In this paper, more than 30 studies describing regionalized statistical models
for stream temperature estimation were reviewed to put our work in a larger
context (see Table

One recurring issue described in the reviewed literature is the difficulty in
predicting stream temperature with a high level of precision. A typical
example is the statistical model of

In general, it seems that the model error originates partly from the lack of
appropriate field data, such as measures of riparian shading, groundwater
infiltration or irrigation withdrawals

Regarding the impact of the modelling approach,

Further comparisons between the different models reported in the literature
are unfortunately hindered by the diversity of temperature metrics and error
measures used by the authors. In line with several earlier studies, we
advocate the systematic reporting of three error measures:
RMSE, bias and the coefficient of determination
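As an illustration of the three error measures named above, the following sketch computes them for paired observed and predicted temperatures. It rests on two assumptions that are not taken from the text: bias is defined as the mean of predicted minus observed values, and the coefficient of determination is computed as one minus the ratio of residual to total sum of squares (other definitions exist). The function name is hypothetical.

```python
import math


def error_measures(observed, predicted):
    """Return (rmse, bias, r2) for paired observed and predicted values.

    Assumed conventions: bias = mean(predicted - observed);
    r2 = 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    bias = sum(residuals) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, bias, r2
```

Reporting all three values together makes it possible to tell a systematic offset (large bias, high coefficient of determination) apart from scattered errors (small bias, low coefficient of determination), which a single RMSE value cannot do.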

List of reviewed publications about statistical stream temperature prediction in ungauged basins.

Inspecting Table

Instead of using air temperature as an independent variable,

As an alternative to the above-mentioned studies, the annual cycle of stream
temperature has been modelled by some authors as a function of time directly,
rather than air or equilibrium temperature.

Finally, some studies have evaluated the possibility of modelling the time
evolution of stream temperature using machine learning techniques. For
example,

Some of the reviewed publications on regional stream temperature modelling
addressed the question of the spatial scale over which the predictor
variables should be averaged. It is well known that stream temperature
is affected not only by local environmental conditions, but also by the
conditions prevailing upstream. However, the exact extent of the area
controlling the stream energy balance at a given point is not clear

Due to this uncertainty, different approaches have been used in the
literature to average the predictor variables. Based on studies of the effect
of forest harvesting on stream temperature

Given this diversity of methods, we could not find a general consensus
in the reviewed literature concerning the extent of the spatial area which is
relevant for stream temperature prediction. While some studies conclude that
this area should have a length of about

Of all the regional models reported in Table

All the reviewed models rely on standard statistical techniques to estimate
stream temperature. The range of methods encompasses traditional approaches
such as multi-linear regression

All these methods are general, in the sense that they can be used to model
almost any possible relationship between given input and output variable(s).
As a consequence of this generality, the user has to specify the set of
predictor variables to be considered by the model. Although some objective
methods can help to perform this selection

Although the generality of the standard statistical methods allows them to be
applied to many problems, it prevents them from incorporating prior knowledge
about the system dynamics into their structure. For example, a multi-linear
model expresses the predictand as a linear combination of the predictors
regardless of the problem at hand. This fact is also true for non-parametric
methods such as artificial neural networks, which implicitly impose some
(flexible) functional form onto the model. As advocated by

Our approach is strongly inspired by the physically based models which have
been used for decades to predict water temperature along stream reaches

The objectives of the present work are three-fold: (1) describe a new
physics-inspired statistical model for the prediction of stream temperature
in ungauged basins, allowing for the computation of the monthly resolved
annual cycle and capturing inter-annual variability; (2) through proper
calibration of the model, determine the length of the upstream area which
controls stream temperature at a given point; and (3) compare the
physics-inspired model with a more standard statistical approach over a set
of various Swiss catchments, so as to evaluate the potential benefits of the
incorporation of physical considerations into the model structure. The data
set used to evaluate the performance of the models is presented in
Sect.

In order to test the two stream temperature models, catchments are selected
in Switzerland such that (a) the natural regime of the river is as little
affected by anthropogenic activities as possible, and (b) measurements of
discharge and stream temperature are available for more than 1 year. This
results in a set of 29 catchments, whose locations are depicted in
Fig.

About half of the selected catchments are situated on the Swiss Plateau – a
large area with little variation in altitude between Lake Geneva in the
south-west and Lake Constance in the north-east. The climate in this region
is relatively mild, with precipitation mostly falling as rain in winter and
mean daily maximum air temperature hardly exceeding 30

Only two catchments are found in the Jura mountains, a relatively
low-altitude (

Locations of the gauging stations selected for the evaluation of the
physics-inspired and standard statistical models. The stations are displayed
as red points and their associated catchments as green or orange areas,
depending on whether they are used to calibrate or validate the model. The
four main climatic regions of Switzerland – the Jura mountains, Plateau,
Northern Alps and Southern Alps – are displayed in different colours. The
numbering corresponds to Table

The Alpine region of Switzerland is typically subdivided into its northern
and southern parts, based on their difference in climate. The Southern Alps
are influenced by Mediterranean weather, implying warmer winters and more
precipitation in autumn than in the Northern Alps. The hydrological regimes
of the catchments in the Northern Alps are strongly related to altitude. The
month in which the peak of discharge is observed ranges from May for
low-altitude watersheds to July–August for catchments partially covered by
glaciers. Moreover, the ratio of annual maximum to annual minimum discharge
increases with altitude. Similar hydrological regimes are observed in the
Southern Alps, except for a second discharge peak in autumn due to rainfall

All in all, 10 of the 16 hydrological regimes identified by

The stream temperature data which are used in the present study were provided
by the Swiss Federal Office for the Environment (FOEN). We take the opportunity
of this publication to describe this new data set, which is freely
accessible for research purposes at the following address:

The FOEN operates an automatic network of stream gauging stations,
continuously measuring water level and discharge at more than 180 locations
in Switzerland. Water level is recorded using an ultrasonic distance sensor
and converted into discharge values through a rating curve adapted each year.
The water level values are validated against the measurements of a second
instrument – a pressure probe – and rejected in case the difference between
the two values is greater than 2 cm. A limited number of gauging stations
have been equipped with a thermometer, the earliest in 1968. This
number has increased greatly since 2002, and more than 70 stations now
automatically record water temperature every 10 min

Among the watersheds in which temperature is monitored, 25 have been
identified in the present study as being little affected by anthropogenic
activities. In order to complete this data set, the temperature and discharge
measurements of four additional gauging stations were obtained from the
Department for Construction, Transport and Environment of Canton Aargau (see
Table

The temperature data are usually not quality-checked by the FOEN or Canton
Aargau. As a validation procedure, we performed two different tests on the
data at the hourly time step, in addition to visual inspection.

All temperature measurements lower than 0

The temperature variation between consecutive time steps was
checked to remain within physical bounds. In particular, it was verified that
temperature varied by more than 0.01
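Checks of this kind could be automated along the following lines. This is only a sketch under stated assumptions: the function name, the three threshold arguments (`max_step`, `min_range`, `window`) and the choice to merely flag suspect values rather than correct them are hypothetical, and the thresholds are deliberately left without default values.

```python
def flag_suspect_values(temps, *, max_step, min_range, window):
    """Return the sorted indices of hourly values failing any check.

    temps     -- hourly stream temperatures (floats)
    max_step  -- largest plausible change between consecutive values
    min_range -- smallest plausible max-min range within `window` values
    window    -- number of consecutive values used for the range check
    All three thresholds are caller-supplied assumptions.
    """
    suspect = set()
    for i, t in enumerate(temps):
        # Negative water temperatures are flagged.
        if t < 0.0:
            suspect.add(i)
        # Implausibly large jump with respect to the previous value.
        if i > 0 and abs(t - temps[i - 1]) > max_step:
            suspect.add(i)
    # Implausibly flat record, e.g. a stuck sensor.
    for start in range(len(temps) - window + 1):
        chunk = temps[start:start + window]
        if max(chunk) - min(chunk) < min_range:
            suspect.update(range(start, start + window))
    return sorted(suspect)
```

Flagged indices would still need to be reviewed visually, since a large jump also marks the first valid value following an isolated spike.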

The two statistical models described in Sect.

We were provided with hourly mean data, which we aggregated into monthly mean
values. We did not perform any quality checks on the data, since MeteoSwiss
already follows strict quality control procedures (see
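The aggregation of hourly means into monthly means can be sketched as follows. The function name and the handling of missing values (entries equal to `None` are skipped) are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime


def monthly_means(records):
    """Aggregate (timestamp, value) pairs into {(year, month): mean}.

    Missing values, encoded here as None, are ignored (an assumption).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for timestamp, value in records:
        if value is None:
            continue
        acc = sums[(timestamp.year, timestamp.month)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Usage example with three hourly values spread over two months.
example = [
    (datetime(2010, 1, 1, 0), 1.0),
    (datetime(2010, 1, 1, 1), 3.0),
    (datetime(2010, 2, 1, 0), 10.0),
]
print(monthly_means(example))
```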

From the network of meteorological stations it operates, MeteoSwiss selected a
subset of 14 stations which are considered to be representative of the
climate diversity in Switzerland (see

A preliminary study of the selected catchments was performed, with the aim of classifying the rivers according to their thermal behaviour. This classification was intended to be used later in order to investigate whether the performance of the models was affected by the river thermal regime.

Classification of the thermal
regimes of the selected catchments. Streams impacted by groundwater
infiltration are shown in green, the proglacial stream in blue and the thermally
climate-driven streams in orange.

As a first attempt, we examined whether the catchments could be classified
based on the shape of their stream temperature curve. To this end, we

As an alternative approach, we tested whether the characteristics of the
stream–air temperature curve could be used to characterize the thermal
regime of the catchments. For this purpose, monthly mean stream temperature
was linearly regressed against monthly mean air temperature, excluding the
points with negative air temperature values
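The regression described above can be sketched with an ordinary least-squares fit in which the months with negative air temperature are discarded beforehand. The function name is hypothetical, and months with an air temperature of exactly zero are kept, which is an assumption.

```python
def stream_air_regression(air, stream):
    """Least-squares line of monthly stream vs. air temperature.

    Pairs with negative air temperature are excluded before fitting.
    Returns (slope, intercept).
    """
    pairs = [(a, s) for a, s in zip(air, stream) if a >= 0.0]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two points with non-negative air temperature are needed")
    mean_a = sum(a for a, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    sxx = sum((a - mean_a) ** 2 for a, _ in pairs)
    sxy = sum((a - mean_a) * (s - mean_s) for a, s in pairs)
    if sxx == 0.0:
        raise ValueError("air temperature values are all identical")
    slope = sxy / sxx
    intercept = mean_s - slope * mean_a
    return slope, intercept
```

The slope and intercept returned for each catchment are the characteristics of the stream–air temperature curve on which such a classification would rely.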

Because of the predominance of the thermally climate-driven streams, only the
latter will be considered for the testing of the physics-inspired and
standard regression models. The inclusion of the groundwater-dominated
streams in the test set would require the amount of groundwater discharging
into the stream to be estimated. We tested several methods, including the
derivation of the baseflow index from discharge measurements

The new physics-inspired statistical model for stream temperature
prediction is derived in the following subsection. The standard statistical
model used for comparison is presented in Sect.

As mentioned above, the physics-inspired stream temperature model presented
in this paper is based on the analytical solution to the stream
energy-balance equation. This topic has been investigated extensively in the
literature

Assuming a well-mixed water column and a negligible longitudinal heat
dispersion, the mass and energy-balance equations along a stream reach read

The present study builds mainly upon the work of

At the timescale of the month, the stream temperature is assumed to be in a steady state.

The energy flux at the stream–air interface is expressed as

where

The energy flux at the stream–bed interface
is neglected; i.e.

The lateral inflow of water

The ratio of stream width to discharge

All sources in the network are assumed to
have the same discharge, denoted as

Using the above assumptions, the mass and energy-balance equations simplify
to Eqs. (

Equation (

The present expression for

As noted above, the extent of the zone over which

Equation (

The channel slope

The monthly mean air temperature

The quantity

The two weights

In order to compute

The distance average of variables

Calibration parameters of the physics-inspired statistical model.

Physiographic properties of the 29 selected hydrological catchments in Switzerland. The three watersheds indicated in bold are not used for the model evaluation.

Replacing the terms in Eq. (

To assess its performance, the physics-inspired statistical model
described by Eq. (

The model assumes all streams to have the same

In Eq. (

In summary, the standard regression model proceeds as follows to estimate
stream temperature in an ungauged basin: (a) it first computes the mean

In order to rigorously evaluate the performance of the two models described
in the previous section, 5 of the 26 selected catchments were removed from
the data set to create an independent validation set (watersheds 3, 6, 11, 13
and 27, displayed in orange in Fig.

The measurement time period is also split into a calibration (2007–2012) and
validation (all dates before and including 2006) period. Only the
measurements performed by the calibration stations – whose drainage area is
marked in green in Fig.

The data set containing the measurements of the validation stations during the calibration period. This set can be used to evaluate the ability of the models to make predictions in ungauged basins.

The data set containing the measurements of the calibration stations during the validation period. This set will be used to evaluate the precision of the models when predicting stream temperature in past or future years.

The data set formed by the measurements of the validation stations during the validation period. This set serves to evaluate the performance of the models when predicting stream temperature both in ungauged basins and in ungauged years.

The data set corresponding to the union of all three previous validation sets, which may be used to obtain a synthetic evaluation of the precision of the models.
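The partition of the measurements into one calibration set and three elementary validation sets can be summarized by the following sketch. The station numbers and the periods are those stated above; the function name and the set labels are hypothetical, and years after 2012 are rejected because they belong to neither period.

```python
VALIDATION_STATIONS = {3, 6, 11, 13, 27}   # watersheds withheld from calibration
CALIBRATION_YEARS = range(2007, 2013)      # 2007-2012 inclusive


def assign_set(station_id, year):
    """Return the (hypothetical) label of the data set a measurement belongs to."""
    if year > CALIBRATION_YEARS[-1]:
        raise ValueError("year lies outside both the calibration and validation periods")
    calibration_station = station_id not in VALIDATION_STATIONS
    calibration_year = year in CALIBRATION_YEARS
    if calibration_station and calibration_year:
        return "calibration"
    if calibration_year:
        return "validation: ungauged basins"
    if calibration_station:
        return "validation: ungauged years"
    return "validation: ungauged basins and years"
```

The fourth validation set is the union of the three labels starting with "validation".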

As mentioned in Sect.

Since the physics-inspired model expresses stream temperature as a linear
function of air temperature, it cannot reproduce the asymptotic behaviour of
the former as the latter drops below 0

In the following, the best seasonal formulations of the physics-inspired
model are presented first. The precision of this model is then evaluated, and
the influence of the stream network resolution on the model results
investigated. Finally, comparison is made with the standard regression model.
All the results presented in this section will be discussed and analysed in
Sect.

As mentioned in Sect.

Table

The model selection reveals the radiation term

This behaviour is even more pronounced in the case of the term associated
with the source and lateral inflow temperatures (

The model ranking based on AICc also identified a single expression for

The RMSE,

Formulations of the physics-inspired statistical model selected in
each season based on their corresponding AICc value. The Akaike weights are
denoted as
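For reference, the AICc value of a candidate model and the Akaike weights of a set of candidates can be computed as sketched below. The sketch assumes a least-squares fit with Gaussian errors, for which AIC reduces to n ln(RSS/n) + 2k up to an additive constant; the function names are hypothetical, and whether k includes the error variance is a matter of convention.

```python
import math


def aicc(rss, n, k):
    """Small-sample corrected AIC of a least-squares fit.

    rss -- residual sum of squares
    n   -- number of observations
    k   -- number of estimated parameters
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def akaike_weights(aicc_values):
    """Relative likelihood of each model, normalized to sum to one."""
    best = min(aicc_values)
    relative = [math.exp(-0.5 * (value - best)) for value in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]
```

A weight close to one indicates that a single formulation clearly dominates the candidate set, whereas similar weights indicate that several formulations are nearly equally supported by the data.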

Regarding the different validation sets, it can be observed in
Table

The results reported above are based on the stream network geometries extracted
from the land cover map at scale 1 : 25 000 (see Sect.

To test this hypothesis, two additional stream networks with a coarser
resolution than the original one were investigated. These networks were
obtained by removing stream segments with Strahler order 1, and those with
Strahler order 1 and 2, respectively. Through this procedure, the mean
drainage density of the 26 selected catchments decreased from
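The coarsening procedure can be illustrated with the following sketch, which computes the Strahler order of every segment of a tree-shaped network and then discards the segments below a given order. The data structure (a dictionary mapping each segment to the list of segments flowing directly into it) and the function names are assumptions made for illustration. The recursion is adequate for small networks only.

```python
def strahler_orders(upstream):
    """Return {segment: Strahler order}.

    upstream maps every segment id to the list of segments flowing
    directly into it (an empty list for source segments).
    """
    orders = {}

    def order(segment):
        if segment not in orders:
            tributaries = [order(u) for u in upstream.get(segment, [])]
            if not tributaries:
                orders[segment] = 1
            else:
                top = max(tributaries)
                # The order increases only where two or more tributaries
                # of the highest order meet.
                orders[segment] = top + 1 if tributaries.count(top) >= 2 else top
        return orders[segment]

    for segment in upstream:
        order(segment)
    return orders


def prune_network(upstream, min_order):
    """Keep only the segments whose Strahler order is >= min_order."""
    orders = strahler_orders(upstream)
    return {
        segment: [u for u in tributaries if orders[u] >= min_order]
        for segment, tributaries in upstream.items()
        if orders[segment] >= min_order
    }
```

Calling `prune_network` with `min_order=2` and `min_order=3` yields the two coarser networks described above.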

Prediction error of the physics-inspired statistical model for
different resolutions of the stream network. The boxes extend from the first
to the third quartile of the error distribution. Outliers are displayed as
red dots. In each season, the network resolution decreases from left to
right: the left box corresponds to the network with all stream reaches,
whereas the central and right boxes contain only the stream segments whose
Strahler order is greater than or equal to 2 and 3, respectively. The error
values

Because the network resolution has little influence on the
model parametrization, the model precision varied little
between the three stream networks. As seen in Fig.

This section describes the characteristics of the calibrated standard
regression model first, before presenting the results of its evaluation in a
second step. Figure

Non-linear relationship between the

The multi-linear regression models which were selected to estimate the annual
mean

Performance of the best physics-inspired
statistical model in each season (

Best multi-linear
regression models for the prediction of annual mean

Performance of the standard regression model in
terms of RMSE,

Table

The formulations of the physics-inspired model selected by AICc ranking are
consistent among the different seasons. In particular, topographical shading
systematically appears to be the strongest predictor of the net radiation
heat flux

As defined in Eq. (

The overestimation of

Comparison of modelled against measured slopes of the regression
line between stream and air temperatures. The panels correspond to the
different seasons:

As mentioned in Sect.

Our model is rather equivocal regarding the width of the riparian buffer
which is relevant for the determination of stream temperature at a given
point. As a matter of fact, none of the tested buffer widths appears to
prevail over the others in the retained parametrizations of

The precision of the physics-inspired model was reported in the previous
section to be rather low in January–March. This can be explained by the fact
that the non-linearity of the stream–air temperature relationship at low air
temperature values is not captured by the model. The latter rather simulates
a sharp transition from the linear regime to a constant one, since the stream
temperature values predicted to be negative are systematically replaced with
0
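The sharp transition mentioned above can be made explicit with a short sketch. Here `slope` and `intercept` are hypothetical stand-ins for the terms of the physics-inspired model that multiply, and add to, air temperature; they are not the model's actual parametrization.

```python
def predict_stream_temperature(air_temperature, slope, intercept):
    """Linear stream-air relationship with negative predictions set to zero."""
    linear_prediction = slope * air_temperature + intercept
    # Negative predictions are replaced with zero, which produces a kink
    # instead of the smooth flattening observed at low air temperature.
    return max(0.0, linear_prediction)
```

With `slope=0.5` and `intercept=2.0`, for instance, every air temperature below -4 maps to a stream temperature of exactly zero, whereas measured curves level off gradually.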

As is noticeable in Table

The physics integrated into the model structure can be exploited to
investigate some aspects of the stream temperature dynamics. For example,
Fig.

Seasonal values of the factors

The simplifying assumptions (i)–(vi)
reported in Sect.

In addition to the simplifying assumptions discussed above, the
parametrizations of the unknown terms in the analytical solution might also
have impacted the model precision. Indeed, the estimation of the source and
lateral inflow temperatures using only air temperature has recently been
questioned, particularly for the catchments impacted by snowmelt or glacier
melt

As opposed to the physics-inspired model, the parameter values of the
standard regression model could not be constrained using physical
considerations. As a result, the sign of some of the linear coefficients
relating the predictor variables to

This study aimed to present a new statistical model for the prediction of
monthly mean stream temperature in ungauged basins. As opposed to the
standard statistical methods, this model is devised so as to incorporate
physical considerations into its structure. To this end, it is built upon the
analytical solution to a simplified version of the one-dimensional heat
advection equation. Contrary to previously reported analytical solutions, the
present one is obtained by solving the equation over an entire stream network
instead of a single stream reach. Moreover, the various terms of the equation
are not assumed to be spatially homogeneous, which leads to the appearance
of a space averaging operator

While most terms of the analytical expression can be evaluated using
meteorological observations or topographic maps, some require data which are
not available. These terms are replaced with approximations based on the
spatial data sets at hand. In particular, the net radiation heat flux at the
air–water interface is expressed as a linear combination of several
physiographic variables. Similarly, the source and lateral inflow
temperatures are approximated as a linear function of air temperature
measured at the source location and along the stream, respectively. Finally,
the fraction

The performance of the model is quite satisfactory, with a root-mean-square
error of about 1.3

The precision of the model was also assessed by comparing it with a more
standard regression model. The latter was observed to perform slightly
better, with an RMSE about 0.2

Despite a few deficiencies, the physics-inspired statistical model can be
used to analyse some aspects of the physics governing stream temperature. As
an example, the relative importance of each one of the stream heat sources
could be determined from the model. Climatic forcing was found to be the
major driver of water temperature, as expected

Among the improvements that can be brought to the physics-inspired model, a
more accurate parametrization of the discharge fraction originating from
lateral water inflow

Indeed,

We expect the physics-inspired model to be easily transferable to other regions
of the globe. The parametrization of the net radiation heat flux at the
air–water interface might need some adaptation in order to correctly reflect
the dominant physiographic controls on local stream climate. For example,
topographic shading is certainly not a relevant predictor variable over flat
regions. Similarly to the approach presented in this work, the most appropriate
set of predictor variables for the net radiation heat flux over a particular
region can be obtained through AICc ranking. Once set, the stream temperature
model can be used to investigate e.g. the extent of the stream network which is
thermally suitable for sensitive fish species at the regional scale

The analytical solution to
Eqs. (

The above equations require the values of discharge and temperature at the
upstream end of the reach to be known. By applying them iteratively to all the
reaches of a network, starting from the most downstream one, the expressions for
discharge

Schematic representations of

Equations (

where Eq. (

A. Gallice performed the analysis, produced the figures and wrote the manuscript. B. Schaefli gave much appreciated guidance and impetus to the present work. M. P. Parlange commented on the manuscript. H. Huwald, M. Lehning and B. Schaefli helped write the manuscript and co-supervised the work.

This work was financially supported by the Swiss Federal Office for the
Environment (FOEN). The Department for Construction, Transport and
Environment of Canton Aargau and the FOEN are gratefully acknowledged for the free
access to their hydrological data. All plots have been produced with the
Matplotlib Python library