Articles | Volume 26, issue 13
Hydrol. Earth Syst. Sci., 26, 3477–3495, 2022
Hydrol. Earth Syst. Sci., 26, 3477–3495, 2022
Research article
07 Jul 2022
Research article | 07 Jul 2022

Spatial extrapolation of stream thermal peaks using heterogeneous time series at a national scale

Spatial extrapolation of stream thermal peaks using heterogeneous time series at a national scale
Aurélien Beaufort1,2, Jacob S. Diamond1,2, Eric Sauquet1, and Florentina Moatar1 Aurélien Beaufort et al.
  • 1RiverLy, INRAE, Centre de Lyon-Grenoble Auvergne-Rhône-Alpes, 69100, Villeurbanne, France
  • 2GéoHydrosystèmes COntinentaux, Université de Tours, Tours, 38000, France

Correspondence: Florentina Moatar (


Spatial reconstruction of stream temperature is relevant to water quality standards and fisheries management, yet large, regional scale datasets are rare because data are decentralized and inharmonious. This data discordance is a major limitation for understanding thermal regimes of riverine ecosystems. To overcome this barrier, we first aggregated one of the largest stream temperature databases on record with data from 1700 individual stations over 9 years from 2009–2017 (n= 45 000 000 hourly measurements) across France (area = 552 000 km2). For each station, we calculated a simple, ecologically relevant metric – the thermal peak – that captures the magnitude of summer thermal maximums. We then used three statistical models to extrapolate the thermal peak to nearly every stream reach in France and Corsica (n= 105 800) and compared relative model performances with an air temperature metric. In general, the hottest thermal peaks were found along major rivers, whereas the coldest thermal peaks were found along small rivers with forested riparian zones and strong groundwater inputs and were located in mountainous regions. Several key predictors of the thermal peak emerged, including drainage area, mean summer air temperature, minimum monthly specific discharge, and vegetation cover in the riparian zone. Despite differing predictor importance across model structures, we observed strong concordance among models in their spatial distributions of the thermal peak, suggesting its robustness as a useful metric at the regional scale. Finally, air temperature was found to be a poor proxy for the stream temperature thermal peak across nearly all stations and reaches, highlighting the growing need to measure and account for stream temperature in regional ecological studies.

1 Introduction

Stream water temperature controls the rates of biogeochemical processes (Nimick et al., 2011; Song et al., 2018) and the distribution and phenology of aquatic communities (Hawkins et al. 1997; Wolter 2007). Water temperature extremes, like summer maxima, are particularly strong drivers of aquatic biota behavior, habitat selection (Carlson et al., 2007; Xu et al., 2010), and water quality (van Vliet and Zwolsman, 2008). Despite their importance, we lack an understanding of the spatial distributions of thermal maximums, largely due to a paucity of consistent stream temperature data. There is thus a need to develop and improve spatially distributed stream temperature datasets with a focus on identifying areas most affected by thermal maximums.

National- and even regional-scale stream temperature data are a relatively recent phenomenon (Jackson et al., 2018), in part due to advancements in in situ sensors (Isaak et al., 2017), but these data tend to be spatiotemporally uncoordinated across regions and river networks (Arismendi et al., 2014). This results in snapshot datasets, necessitating the use of air temperature proxies to supplement or even replace stream temperature time series (Conti et al., 2015; Logez et al., 2012; Tisseuil et al., 2012). Indeed, ecologists regularly use air temperature as a proxy or predictor of stream temperature (Buisson and Grenouillet, 2009; Durance and Ormerod, 2009; Mayer, 2012) despite the fact that water temperature better explains the spatial organization of aquatic biota (Kirk and Rahel, 2022). While air temperature can be used to fill temporal data gaps (Morrill et al., 2005), air temperature is a poor surrogate for stream temperature in several cases. In particular, air temperature correlates poorly with stream temperature in headwater reaches (Caissie, 2006) and in reaches with strong local controls (e.g., riparian vegetation, groundwater inflows, bed form, impoundements) (Moatar and Gailhard, 2006; Hill and Hawkins, 2014; Loicq et al., 2018; Chandesris et al., 2019; Seyedhashemi et al., 2021).

The spatial variation of summer stream temperature is controlled by several factors, each with varying spatiotemporal influence (Fullerton et al., 2015). The most commonly observed factor is stream size – often evaluated with a drainage area or distance-from-source proxy – which tends to have a positive relationship with stream thermal maxima (Hrachowitz et al., 2010; Imholt et al., 2013; Isaak et al., 2017). As the stream's surface area increases, it leads to rapid heat transfer and subsequent equilibration with air temperatures. Intuitively, discharge is also a strong control on stream temperature distributions, with increasing groundwater influence (e.g., measured with monthly flow minima or the base flow index (BFI)) often cited as a mitigating effect on thermal maxima (Chang and Psaris, 2013; Hare et al., 2021). An obvious candidate factor is summer air temperature, which exhibits clear positive relationships with stream thermal maxima, albeit at regional scales (Moore et al., 2013; Isaak et al., 2017). In addition to these three dominant factors, an array of additional controls like local geology, slope, altitude, and riparian cover can modulate these relationships, leading to unexpected spatial distributions of stream water temperature (Fullerton et al., 2015).

To account for the variety of controls on stream temperature spatial distributions in a data scarce landscape, researchers turn to predictive models. Still, despite a recent growth in modeling techniques, including statistical (Segura et al., 2015, Benyahya et al., 2007) and physical (Beaufort et al., 2016a, b) approaches, models that can capture regional or national-scale stream temperature distributions remain scarce (but see Hare et al., 2021). At these larger spatial scales, statistical models, which relate stream temperature to climatic and environmental variables, are likely the most pertinent and parsimonious for management. Indeed, geostatistical river network models that can incorporate network covariance structure are likely the most appropriate and accurate models to estimate spatial distributions of stream temperature (Isaak et al., 2017), but their use is limited to regions with high data density. As most countries lack the required data for such approaches, researchers are more prone to employ empirical regressions or machine learning approaches, but the comparative accuracy and utility of these approaches are still unknown.

In light of these gaps in data and understanding, our objectives were twofold: (1) to aggregate and develop a homogeneous national-scale stream temperature database for France and (2) to develop empirical statistical models to predict a simple, ecologically relevant stream thermal metric that captured the magnitude of the stream temperature maxima at the regional scale. To do so, we first defined an interannual thermal metric, which we termed the “thermal peak”, using heterogeneous and non-concomitant time series of stream temperature and estimated it at 1700 stations for 2009–2017. We then tested three statistical models and one multi-model approach to predict this metric at the regional scale and compare their predictive capacity with that of air temperature. Specifically, we used these models to address the following question: what are the spatial patterns of stream temperature maxima in France and their drivers, and are these patterns consistent across modeling approaches? We hypothesized that spatial patterns would be consistent, whereas the drivers would depend on the modeling approach used. We also hypothesized that stream size, air temperature, and groundwater contributions would emerge as important, regardless of approach.

2 Methods

2.1 Study area and monitoring network

The study area is continental France and Corsica (550 000 km2). France is located in a temperate zone characterized by a variety of climates due to the influences of the Atlantic Ocean, the Mediterranean Sea, and mountain areas. We assembled a large stream temperature dataset for this area by combining data from public (national, regional; approximately 600 stations available from, last access: 25 March 2019), fishing associations, and other private agencies. Due to the diversity of data origins, the assembled time series did not have a consistent spatial and temporal structure. Some regions are densely monitored, while others had few instrumented streams, and many of the data do not exhibit temporal overlap, challenging our ability to define comparable metrics among streams. Hence, our main challenge was to coalesce and homogenize all the disparate data sources.

Table 1Outlier detection and data filtering process of stream temperature dataset.

Download Print Version | Download XLSX

The stream temperature data used here comprise approximately 45 000 000 hourly measurements from 1700 unique measurement stations (n= 2 107 623 site days) collected between 2009 and 2017, primarily during summer. All the stations under strong human influence (i.e., dam releases and nuclear power thermal effluent) were excluded from this dataset. All data were recorded by automatic data loggers managed by professional biologists, hydrologists, and fishermen. Outliers from each station's time series were removed with automatic outlier detection filters, and the resulting hourly data were screened visually before being averaged into daily mean stream temperature data (hereafter referred to as Tw). The automatic outlier detection consists of three steps with eight unique filters that remove, in order, (1) stream temperature anomalies based on hourly data, (2) anomalies between Tw and daily mean air temperature (Tair) at monthly scales, and (3) anomalies between Tw and Tair at daily scales (Table 1; Moatar et al., 2001; Beaufort et al., 2020b). Tair was provided by the 8 km gridded SAFRAN (Système d'Analyse Fournissant des Renseignements Atmosphériques à la Neige) atmospheric reanalysis data released by Météo-France over the 2009–2017 period (Vidal et al., 2010). It was extracted from SAFRAN meshes overlapping the station location.

Figure 1Data availability for each temperature station used in this study. (a) Map of stream temperature stations in France with RHT network shown for all reaches of Strahler order > 4. (b) Heatmap of data availability by year (x axis) and station (y axis) with the total stations per year listed at the top of each column. Sites are colored by the number of years with observations. (c) Distributions of drainage area for RHT reaches (blue) and of thermal stations (black).

To understand spatial patterns in ecologically meaningful temperature metrics, we focused on the two hottest stream temperature months, July and August (hereafter referred to as summer), a time period essential for the growth and survival of many aquatic species. This focus also had the benefit of maximizing the number of observation stations for analysis. Still, out of the 1700 stations, 490 stations had just 1 year of data, 88 stations had observations covering summer over all 9 years, and only 30 had year-round observations for all 9 years (Fig. 1a and b). To obtain hydraulic and hydrologic characteristics for each station, stations were projected onto the Theoretical Hydrographic Network for France (RHT; Pella et al., 2012), an oriented hydrographic network with defined flow directions (i.e., built-in upstream–downstream dependencies) that comprises 114 600 reaches of median length 1961 m (2475 ± 1512 m, mean ± SD). The majority of stations were located on RHT river reaches with drainage areas 20–500 km2, whereas most reaches are small streams with a drainage area of less than 20 km2 (Fig. 1c).

2.2 Defining the thermal peak metric

Due to the limited concordance among stream temperature time series (Fig. 1b) and to focus our analysis towards ecologically relevant ends, we summarized stream temperature data with a simple metric, the thermal peak. This metric has precedent in regional species distribution models that instead used air temperature (hereafter referred to as Tair) as a proxy (Buisson and Grenouillet, 2009). We refer to this metric as the thermal peak (Tp) and define it as the interannual average of the mean temperature of the 30 hottest consecutive days of each year (Tw,30)i:

(1) T p = i = 1 N T w , 30 i N ,

where i is the year index, and N is the number of years of observation available from 2009–2017 (N= 1–9).

We did not know a priori the 30 hottest consecutive days of each year, but a sensitivity analysis on the sites with annual data suggests that July and August were regularly the hottest months (Fig. B1). Indeed, across sites with annual data, the hottest day of the year always occurred within the approximate 30 d period between 28 July and 30 August (mean ± SD: 12 August ± 16 d), and only 3 % of site years had their hottest 30 d outside of this period. This fact further allowed us to take advantage of many of the sub-annual time series, particularly those generated by fishing agencies, which only contain July and August data.

Table 2List of thermal peak terminology with the count of sites (n) used in their calculation.

Download Print Version | Download XLSX

2.3 Climate correction of the thermal peak

The thermal peak can be biased depending on the climatic variability of the years of observation for each station. Indeed, only 30 of our stations had Tp calculated using all 9 years of data and therefore had the highest level of confidence in their estimate. We refer to the Tp from these 30 stations as Tp,ref, indicating that these are reference, or true, estimates of Tp (Table 2). To account for the bias associated with missing data at the remaining 1670 stations, we gap-filled missing data at these stations using site-specific stream–air temperature regressions. This method accounts for interannual variation in climatic forcing on stream temperature, and we therefore refer to it as a climate correction.

The climate correction is achieved by first calculating station-specific regressions during summer between daily Tw and a right-aligned moving average of Tair at lags ranging from 2–10 d. The moving average lag whose regression produced the highest coefficient of determination was then used to fill gaps in the time series of Tw for each station. Stations with watershed areas greater than 1000 km2 tended to have the best R2 at the longest lags, but there were no clear trends at smaller watershed areas (Fig. B2). Next, we reconstructed summer stream temperature using this regression and subsequently recalculated Tp based on these reconstructed data. We refer to Tp from these climate-corrected data as Tp,clim to indicate that missing data were gap-filled with the climate correction procedure (Table 2).

To validate this approach, we conducted a permutation test on the 30 stations with full annual monitoring from 2009–2017 (i.e., sites where a Tp,ref is known). At each site, we introduced randomly placed, artificial annual gaps into observed data ranging from 1–9 years to simulate missing data. We then backfilled these introduced gaps according to the climate correction method. Following gap-filling, we calculated two metrics: (1) Tp using gap-induced data without gap-filling (Tp,gap) and (2) the thermal peak using the gap-filled data (Tp,fill; Table 2). We then compare these two metrics to the reference thermal peak (Tp,ref) using absolute biases at each tested permutation (i.e., the number of introduced gap years). This approach allowed us to assess whether the climate-corrected reconstruction of the gaps in time series is (1) a useful approach and (2) lower in bias and uncertainty compared to using observed data alone.

2.4 Extrapolating the thermal peak to national scale with statistical modeling

We estimated Tp throughout the entire RHT network using four distinct statistical models. For modeling (Tp,m) it was not clear how to choose a priori a particular model structure due to the complexity of the processes involved in determining local stream temperature. Therefore, we tested four different structures: (1) a multiple linear regression model (REG), (2) an artificial neural network model (ANN) that is potentially nonlinear but encompasses a linear model as a special case, (3) a random forest model (RF) that can also be nonlinear but with a different strategy than ANN, and (4) a multi-model combination (MM), which combines the three prior structures ANN, RF, and REG. All these models are based on a function to estimate Tp at all stations (i):

(2) T p , m , i = f ( g 1 , i , , g n , i ) ,

where gn,i is the nth explanatory variable defined for each ith station.

To measure how stream thermal regime estimations might be different if using a Tair proxy, we also calculated Tp,air using Tair. The Tp,air is calculated identically to Tp but using daily Tair instead of daily Tw, as would be done in ecological studies using Tair as a proxy for Tw. Finally, to determine the effect of climate corrections on the full RHT extrapolation, we compared the distribution of Tp,m values when models were fitted using either Tp,obs or Tp,clim.

Table 3List of explanatory variables used in models and hypothesized effects on thermal peaks.

a All climate variables are calculated on data from 2009–2017. b Summer refers to July–August. c Determined by geostatistical interpolation on the RHT network (Sauquet et al., 2000) with return period of 5 years. d Ratio of flow duration quantiles Q1-Q10/Q10-Q99, as defined by Catalogne (2012). e Classes from 1–12 with pluvial regimes from 1–6, transition regimes from 7–8, and glacial and snow melting regimes from 9–12 (Sauquet et al., 2008). f Detailed description of these variables in Valette et al. (2012).

Download Print Version | Download XLSX

2.4.1 Explanatory variables

We selected 16 variables (Table 3) to explain the spatial distribution of Tp,m based on results from a prior analysis (Beaufort et al., 2020a), a review of the literature, and an effort to minimize variable collinearity (Fig. B3). We further considered the ability to calculate or estimate each variable at the scale of the entire RHT network. The variables fall into three categories: climatic, hydrologic, and watershed characteristics.

The four climatic variables were determined from SAFRAN reanalysis data for the years 2009–2017: (1) mean annual precipitation, (2) mean summer precipitation, (3) mean annual snowfall, and (4) mean summer air temperature.

The four hydrological variables were determined by extrapolation based on prior datasets. In general, we wanted to quantify the influence of groundwater on Tp but lacked the local hydrometric data necessary to calculate the typically used BFI (Hare et al., 2021) at a majority of sites. Hence, we turned to the following variables that could potentially quantify groundwater influence across all of our sites. The first two variables, monthly minimum discharge (Sauquet et al., 2008) and the annual minimum monthly discharge with a return period of 5 years (Catalogne, 2012), describe the low-flow regime of each site. The remaining two variables, the hydrologic regime (HR; Sauquet et al., 2008) and the concavity index (CI; Catalogne, 2012), are dimensionless and characterize the general hydrology of each site. More specifically, the HR groups sites into one of 12 classes ranging from rainfall-dominated, to transitional, to glacial- and snowmelt-dominated (see Appendix A). These classes generally fall into buffered (i.e., high baseflow) or highly variable (i.e., low baseflow) hydrologic regimes. The CI describes the concavity of the flow duration curve, where values close to 1 indicate low flow variability (e.g., large high storage capacity in aquifer or snow), and values close to 0 indicate high flow variability (e.g., low storage capacity exemplary of Mediterranean systems). The concavity index is computed as follows:

(3) CI = Q 1 - Q 10 Q 10 - Q 99 ,

where Qp is the daily flow that is exceeded p % of the time derived from the flow duration curve.

The eight variables relating to watershed characteristics were extracted from either the SYRAH-CE (SYstème Relationnel d'Audit de l'Hydromorphologie des Cours d'Eau) database (Valette et al., 2012) or the RHT database (Pella et al., 2012). The three variables from RHT comprise (1)  mean altitude, (2) watershed drainage area, and (3) mean slope. The five variables from SYRAH-CE comprise (1) riparian vegetation cover in a 10 m buffer, (2) linear upstream weir density along the stream, (3) areal upstream weir density for the watershed, (4) upstream pond cover as a fraction of stream area, and (5) incision class describing the rate of incision of the valley.

2.4.2 Multiple regression

We first fit a multiple linear regression model between Tp and explanatory variables using all possible variables characterized in Table 3. Prior to fitting, we scaled the explanatory variables so that their fitted coefficients could be compared in terms of relative importance. We did not use any variable selection techniques in the multiple regression approach because our goal was to compare across the four modeling approaches (regression, ANN, random forest, and multi-model) that use the same independent variables. In other words, we did not seek to have the most parsimonious multiple regression model but instead used all 16 variables (Table 3) to create the model. Each of these variables was previously assessed for multicollinearity (Fig. B3), and variance inflation factors were all less than 3.

2.4.3 Artificial neural network

We then used an ANN — specifically a feed-forward neural network with one hidden layer (R package “nnet”; Venables and Ripley 2002) – to estimate Tp as a potentially nonlinear function of covariates. An ANN is a flexible mathematical structure with an interconnected set of simple processing elements, called nodes or neurons, that emulates the functioning of neurons in the human brain. They are used as an alternative tool to identify and simulate complex and highly nonlinear relationships between input and output data that the ANN identifies during the “learning process”. We included a direct connection between covariate inputs and outputs so that the case with zero hidden units corresponded to a linear relationship. We used weight decay regularization, also known as ridge regression, to control overfitting by decreasing less relevant coefficients. Both the number of hidden units and the amount of weight decay were selected with a first cross-validation procedure (Bishop, 2006). To quantify the importance of the different covariates, we used a connection weight approach (Olden and Jackson, 2002; Olden et al., 2004):

(4) W V = h = 1 nhu A V , h B h ,

where WV (–) is the relevance of covariate V, AV,h (–) is the ANN coefficients connecting hidden unit h to covariate V, BTh (–) is the ANN coefficients connecting hidden unit h to the output, and nhu is the number of hidden units.

2.4.4 Random forest

We used a random forest (RF) for the third statistical model structure. RF is a machine-learning method that draws bootstrap samples from original training data and fits a tree to each sample, and the ensemble of trees are used for regression (Breiman, 2001). Each tree recursively partitions the individuals from each sample into nonoverlapping groups of each predictor, which are then used to predict the target variable, in this case Tp. The average across the ensemble of trees is used as the final regression. The importance of each predictor variable was provided as standard output by the “randomForest” package, which determines how much the mean square errors in prediction increases when that covariate is randomly permuted within the tree. We used the R package “randomForest” (Liaw and Wiener, 2002), following Breiman's algorithm (Breiman, 2001) allowing for 500 trees.

2.4.5 Multi-model combination

Finally, we used a multi-model combination approach to obtain a consensus estimation map of Tp,m. The temperature predictions from each previously described model were linearly combined to reduce the associated uncertainties through multiple linear regression.

(5) T p , mm , i = a T p , ANN , i + b T p , RF , i + c T p , REG , i + d ,

where Tp,mm,i is the multi-model Tp based on observed and reconstructed Tw,i for each observation station, i; Tp,ANN,i is Tp estimated by ANN for each observation station i; Tp,RF,i is Tp estimated by RF at each observation station i; Tp,REG,i is Tp estimated by multiple regression at each observation station i; and a, b, c, and d are fitted regression coefficients.

Temperature predictions made by the three models are used as multi-model independent variables, and each model prediction is weighted with a coefficient to match the observations as closely as possible. Hence, the multi-model coefficients are calculated only in relation to observations at the 1700 stations. Then, using Eq. (5) with calculated coefficients, we extrapolate Tp along the river network.

2.4.6 Model cross-validation and comparison with air temperature

To assess each model's performance prior to spatial extrapolation to the entire RHT network, we conducted a cross-validation on observed data from our 1700 stations. For each model, including Tp,air, we used 80 % of stations (n= 1360) as training data to estimate model parameters. Using those model parameters, we estimated validation data Tp at the remaining 20 % of stations (n= 340) and cross-validated those estimates with Tp,obs. We conducted this cross-validation 100 times, allowing for random selection of stations used in the training and validation datasets. We evaluated the results of the cross-validation with the Nash–Sutcliffe efficiency criterion (NSE; Nash and Sutcliffe, 1970), the RMSE (root mean square error), and biases between observed Tp,obs and Tp,m. We also conducted regressions of Tp,obs and Tp,m.

We also compared all Tp model results with a thermal peak calculated using only Tair from SAFRAN reanalysis data, Tp,air. The goal was to evaluate model performance with the simplest and most widely available stream temperature proxy.

2.4.7 Hypothesis testing

We evaluated the hypothesis that spatial patterns across models would be consistent but that drivers would depend on the modeling approach in two ways. First, we compared relative importance of the most important variables across models to see if the same variables emerged. Second, we visually compared maps of Tp,m and compared their distributions across watershed area and stream order. We evaluated our second hypothesis that stream size, air temperature, and groundwater contributions would emerge as important, regardless of approach, by comparing their relationships with Tp,m across modeling approaches using Kendall's tau correlation.

3 Results

3.1 Validation of the climate correction approach

Gap-filling Tw with our climate correction approach resulted in consistently lower absolute biases for Tp (i.e., Tp,fill) compared to only using available data (i.e., Tp,gap), regardless of the length of introduced annual gaps (Fig. 2). This discrepancy in bias between uncorrected and corrected data increased exponentially with the number of introduced gaps, growing from a median of 0.03 C with 1 year of gaps to 0.25 C with 8 years of gaps. Climate-corrected biases remained below 0.25 C for 75 % of the stations with up to 6 years of introduced gaps. There was no systematic bias, regardless of the years taken into account in the regressions.

Figure 2Improvement in absolute bias of thermal peaks across all reference sites (Tp,ref; n= 30) when introduced annual gaps were filled with the climate correction procedure (|Tp,fill-Tp,ref|, red boxplots) compared to when gaps were unfilled (|Tp,gap-Tp,ref|, blue boxplots), regardless of the number of introduced gaps.


Figure 3Cross-validation performance metrics for modeled thermal peaks (Tp,m), where 80 % of stations (n= 1360) were training data, and 20 % of stations (n= 340) were validation data. Violin plots show distributions of 100 replicates of training-validation data for each model's performance metrics: (a) bias relative to Tp,ref, (b) NSE, and (c) RMSE of Tp,m relative to observed Tp. Horizontal lines indicate sample medians, and red points indicate sample means. (d–f) Tp,m vs. Tp,obs for each model colored by watershed area at the site. Simple linear regression results (n= 1700) are shown in blue with comparison to a 1:1 line (dashed).


3.2 Model cross-validation

In cross-validation, all four statistical models performed substantially better than air temperature at accurately predicting Tp (Fig. 3). Indeed, Tp,air overestimates the observed Tp,ref by 2.5 C on average (Fig. 3a), and the negative NSE for Tp,air indicates that using a simple mean of Tw observations is a better predictor of Tp than using Tair (Fig. 3b). Overall, models tend to slightly underestimate Tp (Fig. 3a), but mean biases are close to 0 C. The REG and ANN models had similar performance (median RMSE > 1.5 C; median NSE = 0.6), whereas the RF and MM models obtained the best performances (median RMSE < 1.5 C; median NSE = 0.7), with the MM slightly superior for NSE (Fig. 3b and c). The models explain in cross-validation between 70 % and 78 % of the variation in Tp,obs, with the best performances for RF and MM (Fig. 3d–f). Sites with larger watershed areas had consistent positive bias (lighter colors in Fig. 3d–g), but bias was more evenly distributed for sites with smaller watershed areas, though the smallest watersheds tended to exhibit negative bias. We did not observe any strong spatial patterning in performance metrics, although sites with larger watershed areas tended to have positive bias (Fig. 3d–h).

Figure 4Relative importance of top four explanatory variables calculated in cross-validation for the estimation of Tp with the models: (a) ANN, (b) REG, and (c) RF. To compare the relative importance of each variable for each model, all variables were centered and scaled. Text in the upper right of each panel refers to the sum of the relative importance of the first four explanatory variables in each model; colors indicate the explanatory variable.


3.3 Explanatory variables' importance and effects in models

Variable importance differed among modeling approaches, and the maximum importance value for any one variable was between 25 %–30 % (Fig. 4). The two most important variables were watershed area (area) and mean summer air temperature (Tsummer) for the RF and multiple regression models (Fig. 4b and c). For the ANN model, minimum monthly specific discharge (qmin) and riparian vegetation cover (veg) were most important (Fig. 4a; note that qmin is not correlated with area, Fig. B3). Surprisingly, area and Tsummer obtained relative importance of less than 5 % in ANN, whereas they were the most important variables in the RF and the REG models. Across models, none of the three precipitation variables (Table 3) emerged as important, nor did the stream incision class.

The cumulative importance of the four most relevant variables of the REG and ANN models is, respectively, 92 % and 81 %, which means that the other variables had very little weight in the estimates. This sum is only 69 % for RF, which indicates that the relative importance of the explanatory variables is more distributed, and the other explanatory variables had a significant weight in the estimates, which could explain the best performances in cross-validation of RF. In particular, linear and areal weir density upstream of stations occupied a cumulative 7 % relative importance. These variables also occupied the two next greatest importance rankings (at 1.7 % and 0.9 % relative importance, respectively) for the multiple regression model after those shown in Fig. 4.

Figure 5Modeled Tp at each of the 1700 observation stations as a function of the variables hypothesized to be equally important across approaches. Rows indicate the model used (points also colored accordingly), and columns are separated by area, qmin, and Tsummer; area and qmin are natural-log-transformed. Lines indicate best-fit smoothers using a generalized additive model, and text indicates Kendall's tau correlation values (all p< 0.001). There was no systematic difference in variable effects on Tp,m.


The direction and magnitude of effects from the hypothesized variables (area, groundwater contributions, and summer air temperature) did not systematically differ across modeling approaches (Fig. 5). However, there were minor differences in that the RF model tended towards higher Tp,m than ANN and REG models at small watershed areas and tended towards lower Tp,m than ANN and REG models at larger watershed areas (compare Fig. 5g with Fig. 5a and d). The RF model was also less sensitive to Tsummer, with RF producing lower Tp,m than ANN and REG models at sites with cooler summer temperatures and lower Tp,m than ANN and REG models at sites with warmer summer temperatures (compare Fig. 5i with Fig. 5c and f).

Figure 6Thermal peaks (Tp) of stream water extrapolated to all reaches of the French hydrographical network RHT with the different predictive model structures: (a) air temperature (Tair), (b) multiple regression (REG), (c) artificial neural network (ANN), (d) random forest (RF), and (e) multi-model combination of all previous models (MM). All reaches are colored by their modeled range of Tp, and colors are chosen to improve visualization.

3.4 Spatial extrapolation of thermal peaks and comparison with air temperature

The statistical models could extrapolate Tp to 92 % (105 800 reaches) of the RHT network, and the resulting spatial structure of the extrapolations was consistent across models (Fig. 6). On average, Tp was 18.2 C and ranged between 6.3 and 27.0 C. The highest Tp,m (i.e., Tp> 22 C) values were generally found on the largest rivers located in the southeast and in the sedimentary plains. The lowest Tp,m values are found in the mountain streams of the Alps, Pyrenees and Massif Central, and in the northwest. Although the distribution Tp,m is consistent among models (Fig. 6), there are some clear disparities, particularly at the extremes. The ANN model predicts lower Tp,m (20 % of reaches Tp,m< 15 C) compared to the RF and multi-model approach (< 10 %; Fig. B4a). Application of the climate correction prior to model fitting and subsequent extrapolation showed that differences at the RHT are less than 0.5 C at 89 % of the reaches (Fig. B4b). Still, 10 % of the reaches exhibited Tp,m differences greater than 0.5 C, and the vast majority of these differences are negative (9.4 % vs. 1.6 %), suggesting that climate correction more often than not reduces overestimates in Tp.

Figure 7Differences between stream thermal peaks estimated by air temperature and models depending on (a) watershed area and (b) minimum monthly specific discharge. To simplify visualization across the 114 600 reaches, points were summarized across 5 % bins in the x axis; points indicate mean values at each bin, with vertical error bars indicating standard error. (a) As watershed area increases, Tair shifts from overestimation to underestimation relative to Tp,m, with a shift in sign between 2000 and 5000 km2. (b) As minimum monthly specific discharge increases, Tair tends to increase its overestimation relative to Tp,m, especially for watershed size greater than 100 km2 (colors). Only results for RF model are shown for clarity of presentation and because it had the best performance (each model had similar patterns).


Estimating Tp with air temperature (i.e., Tp,air) led to consistently higher values than were obtained with statistical models, with Tp,air greater than 20 C for more than 70 % of the reaches (Figs. 6a and B4a). Indeed, depending on the model, 94 %–97 % of reaches had Tp,m lower than Tp,air (Fig. 6), and more than 95 % of these were in reaches with watershed areas less than 1000 km2. Tp,air exhibited its greatest overestimations in the smallest watersheds (Fig. 7a) and switched from overestimation to underestimation relative to Tp,m at reaches with watershed areas between 2000–5000 km2. While this behavior was similar across models, the REG and ANN models tended to produce lower Tp,m for small areas and higher Tp,m for large areas compared to RF and MM models, suggesting their increased sensitivity to this variable. Tp,air also increased its overestimations relative to Tp,m in reaches with the most groundwater contributions (Fig. 7b). Similar to watershed area, the REG and ANN models tended to produce higher Tp,m for low qmin and lower Tp,m for high qmin than the RF and MM models, suggesting higher sensitivity to this variable. In other words, in small watersheds with low baseflow, Tp,m values from REG and ANN models were 2–3 C less than from RF and MM models, but in large watersheds with high baseflow, Tp,m values from REG and ANN models were 3–5 C greater than from RF and MM models.

4 Discussion

We compiled one of the largest regional summertime stream temperature datasets to address the growing need to understand the spatial distributions of stream thermal maxima. This database is available for use at the following website: (last access: 25 May 2022). Using this database, we demonstrated that a simple, ecologically meaningful metric, which we term the thermal peak (Tp), can be reliably estimated at the regional scale using a few easily accessible explanatory variables.

4.1 Horizons and limitations in estimating large-scale stream thermal metrics

Spatiotemporally comprehensive stream temperature datasets are rare because interest in these data is relatively recent, and there is little money to support instrumentation at regional or national scales. This lack of centralized data has been recognized as a major limitation for understanding thermal regimes of riverine ecosystems (Arismendi et al., 2012; Ouellet et al., 2020). Here, we alleviated these limitations by homogenizing existing disparate datasets and demonstrated the aggregated utility of such datasets in improving our understanding of stream temperature distributions. Indeed, our homogenized dataset allowed us to identify, at a national scale, a map of summer stream temperature maxima with important implications for aquatic species distributions under climate change. We note here that the more detailed computation of Tp (i.e., rolling window of 30 d) could be simplified with a mean of August stream temperature data, as we observed this to regularly be the hottest month across all observable data.

We observed the hottest Tp along major rivers and the coldest Tp along small rivers and in mountainous regions (Fig. 6). The downside of the current approach is that it is based on interannual metrics. Indeed, the non-concomitance of the time series precluded us from comparing extreme years (hot vs. cold). However, the stream temperature dataset used here contains enough interannual information that it could be used to calibrate models and assess the impact of climate change on thermal regimes (see Isaak et al., 2017, 2020). This map can presently be used to predict and manage cold-water stream habitats, with potential for regular multi-annual updates.

To enable unbiased comparison among stations from time series with gaps, we used a climate correction of Tw based on regression with Tair. The efficacy of regressions between Tw and Tair is well understood (Ducharne, 2008; Segura et al., 2015, Moatar and Gailhard, 2006), and the approach can be easily transferred to other stations and regions. On the other hand, a cautious approach is required for such regressions because they assume seasonal correlation between Tair and Tw and can therefore only apply to rivers having natural seasonal dynamics, without dam release, thermal peaking, or major weirs regulating the flow of streams (Bruno et al., 2013; Chandesris et al., 2019; Seyedhashemi et al., 2021; Casado et al., 2013). Moreover, our rolling window approach to TwTair regression suggests that the watershed size may be an important factor in developing these relationships, with regressions for larger rivers benefitting from longer lags in Tair (Fig. B2). However, where possible, climate correction makes it possible to significantly reduce the biases in Tp estimates when compared to those made using observed data without climate corrections (Figs. 2 and 3). The biases are reduced when the number of years of observation available is less than 4. Beyond this limit, the meteorological variability specific to each year is sufficient to estimate Tp, which confirms results obtained by Jones and Schmidt (2018). Moreover, climate correction reduces overestimation of Tp, reinforcing the importance of taking into account these climatic corrections of temperature metrics, even if it only slightly affects the majority of rivers (Isaak et al., 2017)

4.2 Spatial extrapolation of Tp is consistent and best predicted with a random forest model

All four statistical models achieved similar predictive performance, but the RF model exhibited marginal improvements over the others (Fig. 3). The MM approach only slightly improves performance in cross-validation in comparison with RF (vis-à-vis the NSE criterion), so it is unlikely to be a useful approach in future applications due to its complexity. Moreover, whereas the multi-model has the best performance, it lacks the explanatory power and relative simplicity of the other approaches. In contrast, a potential benefit of the multi-model approach is that by leveraging multiple approaches, it can compensate for errors particular to individual models.

Overall, our model performances (NSE = 0.75, RMSE < 2) were of the same order as other large-scale stream temperature studies found in the literature (see Segura et al., 2015; Daigle et al., 2010; Wehrly et al., 2009), suggesting that our approach is reasonable and broadly applicable. Indeed, literature ranges of RMSE for monthly stream thermal maxima are consistent with values we obtained here (e.g., 0.9–2.1 C for 16 sites across Canada (Daigle et al., 2010) and 2.0–2.3 C for 1131 sites across USA (Wehrly et al., 2009)), suggesting a general accuracy limit to current modeling approaches. This is likely because RF models can be prone to overfitting such that they can accurately predict a set of observations, but their performance may decline when predictions are made at unsampled locations. They also have less robust means of model selection and significance testing than the much simpler multiple linear regression approach.

In spatial extrapolation, the Tp estimates are globally consistent between the models, and the same spatial structures are found, regardless of the approach used (Fig. 6). We observed divergences between the models in particular for Tp less than 14 C, where the ANN tends to predict the coldest Tp for more reaches compared to other models (Fig. B4). Still, because our analysis is limited to 2009–2017, these model differences may diminish as Tw measurements grow in time and space. We further note that the performance of the modeling techniques used here was less than that of spatial stream-network (SSN) models applied to similar temperature datasets, which typically have R2 0.90 and RMSE  1.0 C (Isaak et al., 2017). We tested a SSN model on a small region well covered by data (9000 km2, 92 stations) for a robust estimation of parameters with the R package “SSN” (see Fig. B5 and Table B1), and, indeed, SSN had reduced bias relative to that of the random forest model (absolute bias decreased by 0.2 C). By comparing the observed and estimated values, we can see that compared to the SSN, the RF model tends to underestimate the high values (on major rivers) and to overestimate the low values (on smaller reaches). Still, the spatial patterns are very consistent among the two approaches, though there are important differences between the SSN and RF model estimates which can be ± 2 C. However, SSNs are labor-intensive to apply in comparison to non-geospatial techniques and require specialized geospatial algorithms for fitting. Hence, despite lower accuracy, the approaches used here may be useful in more generic use cases when geospatial data and computing time are limiting.

4.3 Drivers of thermal peak depend on model structure

We observed clear divergence of variable importance for the estimation of Tp among the ANN, RF, and REG models. The two most relevant variables in RF and REG were watershed area and mean summer air temperature, consistent with other studies (Laanaya et al., 2017; McGarvey et al., 2018; Moore et al., 2005, 2013). Drainage area also emerged as one of the most important variables driving thermal peaks, behind watershed elevation and slope, in a recent regional study (Johnson et al., 2020). Likewise, distance from source and Strahler order, which directly correlate with drainage area, are known to be important drivers of thermal peaks (Hrachowitz et al., 2010; Imholt et al., 2013; Ducharne, 2008, Mohseni et Stephan, 1999). The clear rationale for this common effect is that longer residence times in large rivers allow for more time for temperature equilibration with the atmosphere compared to small rivers (Beaufort et al., 2020a; Mohseni et al., 1998, Fullerton et al., 2015). Large rivers are also less shaded by riparian vegetation and are less influenced by groundwater inflows, which compounds residence time effects.

Importantly, RF had a much more even distribution of variable importance relative to the other models structures, which is likely due to its nonlinear structure. In contrast, for ANN, the most important variables were minimum monthly specific discharge (43 %) followed by riparian vegetation cover (23 %). Higher minimum flows imply a consistent groundwater supply, leading to cool surface waters in summer (Hannah et al., 2004; Kelleher et al., 2012; Lalot et al., 2015). Similarly, greater riparian vegetation implies more shading and reduced temperature increases from solar radiation (Dugdale et al., 2018; Loicq et al., 2018; Moore et al., 2013). These differences in relevance between the variables for each model underline the importance of using several approaches and show that the multi-model approach makes it possible to take into account all these across-model divergences. We caution, however, that while predictability may increase, interpretation of variable importance in multi-models is complex, and their utility may therefore decrease when project goals are focused on critical variables for stream temperature habitat restoration.

4.4 Air temperature is not an appropriate proxy for stream temperature

Estimates of Tp produced by stream temperature were clearly more accurate than those produced by air temperature (Figs. 3 and 5). In cross-validation, Tp,air overestimated observed Tp by more than 2 C, could not differentiate stream temperature among regions (Figs. 6 and 7), and could not differentiate large rivers from small rivers. These results highlight the need to account for stream-specific temperature estimates, especially when such Tp,air overestimates inflate potential ecological change (see Kirk and Rahel 2022). Our results further demonstrated that this overestimation of Tp,air was greatest in smaller rivers and that these biases were amplified in streams with significant groundwater contribution (Fig. 7), implying that groundwater buffers effects of increasing Tair. This aligns with recent results that found sites with deep groundwater contributions are much less likely to exhibit increasing summer stream temperatures compared to sites with shallow groundwater contributions (Hare et al., 2021). However, we note that the influence of qmin on Tp,air-Tp,w is weak up to values of approximately 5 Ls-1km-2 (Fig. 7b), which is in accordance with recent work in the same region (Beaufort et al., 2020a). This small effect may in part be because qmin is not an effective proxy for groundwater contributions; the base flow index is likely more appropriate (Kelleher et al., 2012; O'Driscoll and DeWalle, 2006; Hare et al., 2021; Johnson et al., 2020), but we were unable to obtain this parameter in this work at a national scale.

Overall, this work clearly demonstrates that Tair is an inappropriate proxy for Tw, with important implications for ecological studies, especially those that consider temperature tolerance thresholds of aquatic species. Species distribution models may need to use Tair instead of Tw because data from Tw are not sufficient in the regions studied (McGarvey et al., 2018). It is therefore important to introduce Tw in input of these models rather than Tair in order to limit the biases linked to the poor spatial representation of Tair.

This research further emphasizes the importance of spatial scale and heterogeneity for water temperature studies. As streams increase in size, they become more coupled to air temperature dynamics. Hence, smaller reaches more influenced by groundwater inputs and vegetation may serve as “climate refugia” for ectotherms species especially in the context of climate change. Therefore, choosing the best predictors at a spatial scale for thermal peaks and other thermal regime metrics is essential to accurately predict and manage future stream cold-water habitat.

5 Conclusion

Stream thermal maxima are essential controls on aquatic ecosystems, but our understanding of their spatial distribution and thus our ability to adequately manage them accordingly are limited by data availability and simple metrics applicable at large scales. To address these gaps, we created a publicly available, harmonized dataset of stream temperature that can used by ecologists in France and scientists more broadly. We then developed a simple, ecologically relevant metric – the thermal peak, Tp – that can be extrapolated at large scales, even when data are sparse. We developed an innovative climate correction method to reduce biases related to such data sparseness, when applied to summer data. The Tp provides an important perspective on the magnitude of thermal maximums during summer, but development of additional metrics such as threshold exceedance frequency, duration, and timing will continue to grow our understanding of stream temperature behavior under climate change. However, development of these metrics will require longer and more spatially explicit time series. Hence, we argue that to improve our capacity to manage and benefit from aquatic ecosystems, it is critical to continue and expand our stream temperature measurement networks.

Appendix A: Classification of river flow regime for France

Sauquet et al. (2008) defined a classification system for river discharge based on the mean monthly runoff pattern (Fig. A1). Using this classification system, they published a map with each river reach assigned to one class along the main river network.

Figure A1Reference dimensionless hydrographs representative of the classification of river flow regime for France (after Sauquet et al., 2008). The y axis (CM) represents the percent of each month contributing to total annual runoff, equal to monthly runoff (mm) divided by the mean annual runoff (mm).


Generally, the uppermost basins located in mountainous areas display glacial- and snowmelt-dominated regimes (Group 1–3). The lower the outlet, the lower the contributions of snowmelt to runoff. Group 4 to 6 were defined to be in the “transition” regime, where the seasonal variation in streamflow is affected as much by precipitation timing as by air temperature and topographic influences on snowpack formation and snowmelt timing. In this regime, high flows are typically observed in spring. Group 7 to 12 are rainfall-dominated regimes. These six rainfall-dominated groups differ along an axis of maximum and minimum monthly discharges. For instance, group 7 is characterized by very low flow in summer, reflecting the lack of deep groundwater storages in the catchment. In contrast, Group 12 exhibits nearly uniform flows through most of the year, and these river reaches are typically found where large aquifers moderate flows.

Appendix B

Figure B1Histogram of the median date for the hottest 30 d periods across all sites that had annual data. The 30 d rolling windows of mean daily stream temperature were calculated across the entire time series and for each site, and the 30 d windows with the maximum values were selected here. The median date here refers to day 15 in the 30 d period.


Figure B2Daily lags for Tair that produce the highest R2 (colors) for a regression between Tw and a right-aligned moving average of Tair at each site as a function of watershed area. The moving average lag whose regression produced the highest R2 was used to fill gaps in the time series of Tw for each station. Each point represents a station, with the x axis being the watershed area at that station and the y axis being the lag in Tair that produced the best regression. Points are jittered for visual clarity; only integer lag values were possible. Black points indicate the mean and standard error (horizontal bars) of watershed areas within each possible lag (2–10 d).


Figure B3Correlation matrix of all 16 considered environmental variables prior to selection in the thermal peak analysis. All variables had collinearity values less than 0.6 or greater than 0.6.


Figure B4Distributions of Tp,m extrapolated to all RHT network reaches. (a) Histogram of models (10 bins equally spaced between 0 and 30 C) according to their extrapolated Tp value, colored by model, and (b) differences between Tp,m calculated with and without climate corrections on the 1630 stations with gaps in their data.


Figure B5Figure comparing modeled thermal peak (Tp,m) estimates from (a) SSN, (b) RF, and (c) the difference between RF and SSN. The SSN model here is from a benchmark test on a small region well covered by data (9000 km2, 92 stations) for a robust estimation of parameters with the R package “SSN”. SSN had reduced bias relative to that of the random forest model (absolute bias decreased by 0.2 C). Additionally, by comparing the observed and estimated values, we can see that RF tends to underestimate the high values and to overestimate the low values. Unfortunately, due to the lack of an RHT with upstream–downstream information, we could not apply at the scale of the whole watershed. Still, the spatial patterns are very consistent among the two approaches, though there are important differences between the SSN and RF model estimates which can be ± 2 C. The estimates of the SSN model are generally warmer than those of RF on the main major river axes and colder on the small tributaries. This is also consistent with observations. So, while the presented models may not be optimal, we are confident the spatial patterns are correct.

Table B1Comparison of model performance metrics for the region tested in Fig. B5.

Download Print Version | Download XLSX

Code and data availability

Water temperature data provided by OFB can be downloaded from (Naiades, 2019). Data generated by this work can be found at (INRAE, 2022). Additional data and code are from the corresponding author upon reasonable request.

Author contributions

The paper was authored by AB with contributions from all the co-authors. AB, FM, JSD, and ES contributed to the conceptualization and methodology. AB and JSD performed the formal analysis. ES contributed hydrology metrics. AB, JSD, ES, and FM contributed to the writing and revision of the manuscript.

Competing interests

The contact author has declared that none of the authors has any competing interests.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


The authors thank Jan Seibert, editor of Hydrology and Earth System Sciences, and the three reviewers for their valuable comments, suggestions, and positive feedback on the manuscript. The authors also thank the OFB, the many regional fishing organizations, DREAL, and EDF for the temperature data. The SAFRAN database was provided by the French national meteorological service (Météo-France).

Financial support

This research has been supported by the Agence française pour la biodiversité (grant no. AFB/2018/130).

Review statement

This paper was edited by Jan Seibert and reviewed by three anonymous referees.


Arismendi, I., Johnson, S. L., Dunham, J. B., Haggerty, R., and Hockman-Wert, D.: The paradox of cooling streams in a warming world: regional climate trends do not parallel variable local trends in stream temperature in the Pacific continental United States, Geophys. Res. Lett., 39, L10401,, 2012. 

Arismendi, I., Safeeq, M., Dunham, J. B., and Johnson, S. L.: Can air temperature be used to project influences of climate change on stream temperature?, Environ. Res. Lett., 9, 084015,, 2014. 

Beaufort, A., Moatar, F., Curie, F., Ducharne, A., Bustillo, V., and Thiéry, D.: River Temperature Modelling by Strahler Order at the Regional Scale in the Loire River Basin, France, River Res. Appl., 32, 597–609,, 2016a.  

Beaufort, A., Curie, F., Moatar, F., Ducharne, A., Melin, E., and Thiery, D.: T-NET, a dynamic model for simulating daily stream temperature at the regional scale based on a network topology, Hydrol. Process., 30, 2196–2210, 2016b. 

Beaufort, A., Moatar, F., Sauquet, E., Loicq, P., and Hannah, D. M.: Influence of landscape and hydrological factors on stream–air temperature relationships at regional scale, Hydrol. Process., 34, 583–597,, 2020a. 

Beaufort, A., Moatar, F., Sauquet, E., and Magand, C.: Thermie en rivière: Analyse géostatistique et description de régime: Application à l'échelle de la France, INRAE, 106 pp., (last access: May 2022), 2020b. 

Benyahya, L., Caissie, D., St-Hilaire, A., Ouarda, T. B., and Bobée, B.: A review of statistical water temperature models, Can. Water Resour. J., 32, 179–192, 2007. 

Bishop, C. M.: Pattern recognition and machine learning, Springer, ISBN 978-1-4939-3843-8, 2006. 

Breiman, L.: Random forests, Mach. Learn., 45, 5–32, 2001. 

Bruno, M. C., Siviglia, A., Carolli, M., and Maiolini, B.: Multiple drift responses of benthic invertebrates to interacting hydropeaking and thermopeaking waves, Ecohydrology, 6, 511–522, 2013. 

Buisson, L. and Grenouillet, G.: Contrasted impacts of climate change on stream fish assemblages along an environmental gradient, Divers. Distrib., 15, 613–626, 2009. 

Caissie, D.: The thermal regime of rivers: a review, Freshwater Biol., 51, 1389–1406, 2006. 

Carlson, S. M., Hendry, A. P., and Letcher, B. H.: Growth rate differences between resident native brook trout and non-native brown trout, J. Fish Biol., 71, 1430–1447, 2007. 

Casado, A., Hannah, D. M., Peiry, J. L., and Campo, A. M.: Influence of dam-induced hydrological regulation on summer water temperature: Sauce Grande River, Argentina, Ecohydrology, 6, 523–535, 2013. 

Catalogne, C.: Amélioration des méthodes de prédétermination des débits de référence d'étiage en sites peu ou pas jaugés, Doctorat Ocean Atmosphere Hydrologie, Université Joseph Fourier, Grenoble, 2012. 

Chandesris, A., Van Looy, K., Diamond, J. S., and Souchon, Y.: Small dams alter thermal regimes of downstream water, Hydrol. Earth Syst. Sci., 23, 4509–4525,, 2019. 

Chang, H. and Psaris, M.: Local landscape predictors of maximum stream temperature and thermal sensitivity in the Columbia River Basin, USA, Sci. Total Environ., 461, 587–600, 2013. 

Conti, L., Comte, L., Hugueny, B., and Grenouillet, G.: Drivers of freshwater fish colonisations and extirpations under climate change, Ecography, 38, 510–519, 2015. 

Daigle, A., St-Hilaire, A., Peters, D., and Baird, D.: Multivariate modelling of water temperature in the Okanagan watershed, Can. Water Resour. J., 35, 237–258, 2010. 

Ducharne, A.: Importance of stream temperature to climate change impact on water quality, Hydrol. Earth Syst. Sci., 12, 797–810,, 2008. 

Dugdale, S. J., Malcolm, I. A., Kantola, K., and Hannah, D. M.: Stream temperature under contrasting riparian forest cover: Understanding thermal dynamics and heat exchange processes, Sci. Total Environ., 610, 1375–1389, 2018. 

Durance, I. and Ormerod, S. J.: Trends in water quality and discharge confound long-term warming effects on river macroinvertebrates, Freshwater Biol., 54, 388–405, 2009. 

Fullerton, A. H., Torgersen, C. E., Lawler, J. J., Faux, R. N., Steel, E. A., Beechie, T. J., Ebersole, J. L., and Leibowitz, S. G.: Rethinking the longitudinal stream temperature paradigm: region-wide comparison of thermal infrared imagery reveals unexpected complexity of river temperatures, Hydrol. Process., 29, 4719–4737, 2015. 

Hannah, D. M., Malcolm, I. A., Soulsby, C., and Youngson, A. F.: Heat exchanges and temperatures within a salmon spawning stream in the Cairngorms, Scotland: seasonal and sub-seasonal dynamics, River Res. Appl., 20, 635–652, 2004. 

Hare, D. K., Helton, A. M., Johnson, Z. C., Lane, J. W., and Briggs, M. A.: Continental-scale analysis of shallow and deep groundwater contributions to streams, Nat. Commun., 12, 1–10, 2021. 

Hawkins, C. P., Hogue, J. N., Decker, L. M., and Feminella, J. W.: Channel morphology, water temperature, and assemblage structure of stream insects, J. N Am. Benthol. Soc., 16, pp.728-749, 1997. 

Hill, R. A. and Hawkins, C. P.: Using modelled stream temperatures to predict macro-spatial patterns of stream invertebrate biodiversity, Freshwater Biol., 59, 2632–2644, 2014. 

Hrachowitz, M., Soulsby, C., Imholt, C., Malcolm, I., and Tetzlaff, D.: Thermal regimes in a large upland salmon river: a simple model to identify the influence of landscape controls and climate change on maximum temperatures, Hydrol. Process., 24, 3374–3391, 2010. 

Imholt, C., Soulsby, C., Malcolm, I. A., Hrachowitz, M., Gibbins, C. N., Langan, S., and Tetzlaff, D.: Influence of scale on thermal characteristics in a large montane river basin, River Res. Appl., 29, 403–419, 2013. 

INRAE: Présentation du projet TIGRE, INRAE [data set],, last access: 25 May 2022. 

Isaak, D. J. and Hubert, W. A.: A hypothesis about factors that affect maximum summer stream temperatures across montane landscapes, J. Am. Water Resour. As., 37, 351–366, 2001. 

Isaak, D. J., Wenger, S. J., Peterson, E. E., Ver Hoef, J. M., Nagel, D. E., Luce, C. H., Hostetler, S. W., Dunham, J. B., Roper, B. B., and Wollrab, S. P.: The NorWeST summer stream temperature model and scenarios for the western US: A crowd-sourced database and new geospatial tools foster a user community and predict broad climate warming of rivers and streams, Water Resour. Res., 53, 9181–9205, 2017. 

Isaak, D. J., Luce, C. H., Horan, D. L., Chandler, G. L., Wollrab, S. P., Dubois, W. B., and Nagel, D. E.: Thermal regimes of perennial rivers and streams in the Western United States, J. Am. Water Resour. As., 56, 842–867, 2020. 

Jackson, F. L., Fryer, R. J., Hannah, D. M., Millar, C. P., and Malcolm, I. A.: A spatio-temporal statistical model of maximum daily river temperatures to inform the management of Scotland's Atlantic salmon rivers under climate change, Sci. Total Environ., 612, 1543–1558, 2018. 

Johnson, Z. C., Johnson, B. G., Briggs, M. A., Devine, W. D., Snyder, C. D., Hitt, N. P., Hare, D. K., and Minkova, T. V.: Paired air-water annual temperature patterns reveal hydrogeological controls on stream thermal regimes at watershed to continental scales, J. Hydrol., 587, 124929,, 2020. 

Jones, N. and Schmidt, B.: Thermal regime metrics and quantifying their uncertainty for North American streams, River Res. Appl., 34, 382–393, 2018. 

Kelleher, C., Wagener, T., Gooseff, M., McGlynn, B., McGuire, K., and Marshall, L.: Investigating controls on the thermal sensitivity of Pennsylvania streams, Hydrol. Process., 26, 771–785, 2012. 

Kirk, M. A. and Rahel, F. J.: Air temperatures over-predict changes to stream fish assemblages with climate warming compared with water temperatures, Ecol. Appl., 32, e02465,, 2022. 

Laanaya, F., St-Hilaire, A., and Gloaguen, E.: Water temperature modelling: comparison between the generalized additive model, logistic, residuals regression and linear regression models, Hydrolog. Sci. J., 62, 1078–1093, 2017. 

Lalot, E., Curie, F., Wawrzyniak, V., Baratelli, F., Schomburgk, S., Flipo, N., Piegay, H., and Moatar, F.: Quantification of the contribution of the Beauce groundwater aquifer to the discharge of the Loire River using thermal infrared satellite imaging, Hydrol. Earth Syst. Sci., 19, 4479–4492,, 2015. 

Liaw, A. and Wiener, M.: Classification and regression by randomForest, R news, 2, 18–22, 2002. 

Logez, M., Bady, P., and Pont, D.: Modelling the habitat requirement of riverine fish species at the European scale: sensitivity to temperature and precipitation and associated uncertainty, Ecol. Freshw. Fish, 21, 266–282, 2012. 

Loicq, P., Moatar, F., Jullian, Y., Dugdale, S. J., and Hannah, D. M.: Improving representation of riparian vegetation shading in a regional stream temperature model using LiDAR data, Sci. Total Environ., 624, 480–490, 2018. 

Mayer, T. D.: Controls of summer stream temperature in the Pacific Northwest, J. Hydrol., 475, 323–335, 2012. 

McGarvey, D. J., Menon, M., Woods, T., Tassone, S., Reese, J., Vergamini, M., and Kellogg, E.: On the use of climate covariates in aquatic species distribution models: are we at risk of throwing out the baby with the bath water?, Ecography, 41, 695–712, 2018. 

Moatar, F. and Gailhard, J.: Water temperature behaviour in the River Loire since 1976 and 1881, C. R. Geosci., 338, 319–328,, 2006. 

Moatar, F., Miquel, J., and Poirel, A.: A quality-control method for physical and chemical monitoring data. Application to dissolved oxygen levels in the river Loire (France), J. Hydrol., 252, 25–36,, 2001. 

Mohseni, O. and Stefan, H. G.: Stream temperature/air temperature relationship: a physical interpretation, J. Hydrol., 218, 128–141, 1999. 

Mohseni, O., Stefan, H. G., and Erickson, T. R.: A nonlinear regression model for weekly stream temperatures, Water Resour. Res., 34, 2685–2692, 1998. 

Moore, R., Nelitz, M., and Parkinson, E.: Empirical modelling of maximum weekly average stream temperature in British Columbia, Canada, to support assessment of fish habitat suitability, Can. Water Resour. J., 38, 135–147, 2013. 

Moore, R. D., Spittlehouse, D. L., and Story, A.: Riparian microclimate and stream temperature response to forest harvesting: a review, J. Am. Water Resour. As., 41, 813–834, 2005. 

Morrill, J. C., Bales, R. C., and Conklin, M. H.: Estimating stream temperature from air temperature: implications for future water quality, J. Environ. Eng., 131, 139–146, 2005. 

Naiades: Données sur la qualité des eaux de surface,, last access: 25 March 2019. 

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I—A discussion of principles, J. Hydrol., 10, 282–290, 1970. 

Nelson, K. C. and Palmer, M. A.: Stream temperature surges under urbanization and climate change: data, models, and responses, J. Am. Water Resour. As., 43, 440–452, 2007. 

Nimick, D. A., Gammons, C. H., and Parker, S. R.: Diel biogeochemical processes and their effect on the aqueous chemistry of streams: A review, Chem. Geol., 283, 3–17, 2011. 

O'Driscoll, M. A. and DeWalle, D. R.: Stream–air temperature relations to classify stream–ground water interactions in a karst setting, central Pennsylvania, USA, J. Hydrol., 329, 140–153,, 2006. 

Olden, J. D. and Jackson, D. A.: Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks, Ecol. Model., 154, 135–150, 2002. 

Olden, J. D., Joy, M. K., and Death, R. G.: An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data, Ecol. Model., 178, 389–397, 2004. 

Ouellet, V., St-Hilaire, A., Dugdale, S. J., Hannah, D. M., Krause, S., and Proulx-Ouellet, S.: River temperature research and practice: Recent challenges and emerging opportunities for managing thermal habitat conditions in stream ecosystems, Sci. Total Environ., 736, 139679,, 2020. 

Pella, H., Lejot, J., Lamouroux, N., and Snelder, T.: Le réseau hydrographique théorique (RHT) français et ses attributs environnementaux, Géomorphologie, 18, 317–336, 2012. 

Sauquet, E., Gottschalk, L., and Leblois, E.: Mapping average annual runoff: a hierarchical approach applying a stochastic interpolation scheme, Hydrolog. Sci. J., 45, 799–815, 2000. 

Sauquet E., Gottschalk L., and Krasovskaia I. Estimating mean monthly runoff at ungauged locations: an application to France, Hydrol. Res., 39, 403–423, 2008. 

Segura, C., Caldwell, P., Sun, G., McNulty, S., and Zhang, Y.: A model to predict stream water temperature across the conterminous USA, Hydrol. Process., 29, 2178–2195, 2015. 

Seyedhashemi, H., Moatar, F., Vidal, J.-P., Diamond, J. S., Beaufort, A., Chandesris, A., and Valette, L.: Thermal signatures identify the influence of dams and ponds on stream temperature at the regional scale, Sci. Total Environ., 766, 142667,, 2021.  

Song, C., Dodds, W. K., Rüegg, J., Argerich, A., Baker, C. L., Bowden, W. B., Douglas, M. M., Farrell, K. J., Flinn, M. B., Garcia, E. A., and Helton, A. M.: Continental-scale decrease in net primary productivity in streams due to climate warming, Nat. Geosci., 11, 415–420, 2018. 

Strauch, A. M., MacKenzie, R. A., and Tingley III, R. W.: Base flow-driven shifts in tropical stream temperature regimes across a mean annual rainfall gradient, Hydrol. Process., 31, 1678–1689, 2017. 

Tisseuil, C., Vrac, M., Grenouillet, G., Wade, A. J., Gevrey, M., Oberdorff, T., Grodwohl, J.-B., and Lek, S.: Strengthening the link between climate, hydrological and species distribution modeling to assess the impacts of climate change on freshwater biodiversity, Sci. Total Environ., 424, 193–201, 2012. 

Valette, L., Piffady, J., Chandesris, A., and Souchon, Y.: SYRAH-CE: description des données et modélisation du risque d'altération hydromorphologique des cours d'eau pour l'état des lieux DCE, Rapport Technique Onema-Irstea, (last access: May 2022), 2012. 

van Vliet, M. T. H. and Zwolsman, J. J. G.: Impact of summer droughts on the water quality of the Meuse river, J. Hydrol., 353, 1–17,, 2008. 

Venables, W. N. and Ripley, B. D.: Modern Applied Statistics with S, fourth edn., Springer, New York, ISBN 0-387-95457-0, 2002. 

Vidal, J. P., Martin, E., Franchistéguy, L., Baillon, M., and Soubeyroux, J. M.: A 50-year high-resolution atmospheric reanalysis over France with the Safran system, Int. J. Climatol., 30, 1627–1644, 2010. 

Webb, B. W., Hannah, D. M., Moore, R. D., Brown, L. E., and Nobilis, F.: Recent advances in stream and river temperature research, Hydrol. Process., 22, 902–918, 2008. 

Wehrly, K. E., Brenden, T. O., and Wang, L.: A comparison of statistical approaches for predicting stream temperatures across heterogeneous landscapes, J. Am. Water Resour. As., 45, 986–997, 2009. 

Wolter, C.: Temperature influence on the fish assemblage structure in a large lowland river, the lower Oder River, Germany, Ecol. Freshw. Fish, 16, 493–503, 2007. 

Xu, C. L., Letcher, B. H., and Nislow, K. H.: Size-dependent survival of brook trout Salvelinus fontinalis in summer: effects of water temperature and stream flow, J. Fish Biol., 76, 2342–2369, 2010. 

Short summary
We developed one of the largest stream temperature databases to calculate a simple, ecologically relevant metric – the thermal peak – that captures the magnitude of summer thermal extremes. Using statistical models, we extrapolated the thermal peak to nearly every stream in France, finding the hottest thermal peaks along large rivers without forested riparian zones and groundwater inputs. Air temperature was a poor proxy for the thermal peak, highlighting the need to grow monitoring networks.