Technical note: Mapping surface saturation dynamics with thermal infrared imagery

In this study we assess the practicability of applying thermal infrared (TIR) imagery for mapping surface saturation dynamics. The advantage of TIR imagery compared to other surface saturation mapping methods is its large spatial and temporal flexibility combined with a non-invasive and intuitive character. Based on an 18-month field campaign, we review and discuss the methodological principles, the conditions under which the method works best, and the problems that may occur. These considerations make it possible to plan efficient TIR imagery mapping campaigns and to benefit from the full potential offered by TIR imagery, which we demonstrate with several application examples. In addition, we elaborate on image post-processing and test different methods for the generation of binary saturation maps from the TIR images. The method testing is performed on various images with different image characteristics. Results show that, apart from manual image classification, the best method is a statistics-based approach that combines distribution fitting of two pixel classes, adaptive thresholding and region growing.


Introduction
Patterns and dynamics of surface saturation areas have remained on hydrologic research agendas ever since the formulation of the variable source area (VSA) concept by Hewlett and Hibbert (1967). Surface saturation is relevant for runoff generation and for water quality, due to variable active and contributing areas (Ambroise, 2004) as well as critical source areas (e.g. Doppler et al., 2014; Frey et al., 2009; Heathwaite et al., 2005). Likewise, surface saturation patterns and dynamics are closely linked to groundwater-surface water interactions (e.g. Frei et al., 2010; Latron and Gallart, 2007) and catchment storage characteristics and dynamics (e.g. Soulsby et al., 2016; Whiting and Godsey, 2016).
Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2018-334 Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 2 July 2018 © Author(s) 2018. CC BY 4.0 License.
Despite the prominent role of saturated areas in hydrological process research, their mapping remains a challenging exercise.
The most straightforward mapping method consists in locating saturated areas by walking through the catchment. However, this simple but labour-intensive 'squishy-boot' method (e.g. Blazkova et al., 2002; Creed et al., 2003; Latron and Gallart, 2007; Rinderer et al., 2012) is neither applicable to large areas nor suited for fine-scale spatial resolutions. Dunne et al. (1975) introduced topography, soil morphology, hydrometric measurements (soil moisture, water table level, baseflow), and vegetation as useful indicators for delineating saturated areas. Even if it is still a matter of research how to best make use of these catchment characteristics to delineate saturated areas (e.g. Ali et al., 2014; Doppler et al., 2014; Grabs et al., 2009; Kulasova et al., 2014a, 2014b), only hydrometric measurements have the potential to monitor the temporal evolution (from minutes to months) of dynamic surface saturation. Their major disadvantage is that area-representative hydrometric measurements are difficult to carry out for larger areas.
Remote sensing methods have proven to be well suited for mapping temporally dynamic patterns of surface saturation over large areas. It is possible to extract flooded areas in the order of kilometres to metres from data acquired with satellite and airborne platforms, such as synthetic aperture radar (SAR) images (e.g. Matgen et al., 2006; Verhoest et al., 1998), or the normalised difference water index (NDWI) and the normalised difference vegetation index (NDVI) (de Alwis et al., 2007; Mengistu and Spence, 2016). Observations at higher spatial resolution (order of centimetres) require unmanned aerial vehicles (UAVs) or ground-based instruments. Due to various technical constraints, SAR image acquisitions have so far scarcely been used for UAV-based applications or for ground-based applications that are not restricted to a fixed location (e.g. Li and Ling, 2015; Luzi, 2010).
NDWI and NDVI are theoretically applicable at these scales; however, to the best of our knowledge the necessary simultaneous acquisition of short wave infrared and visible light (VIS) images has not yet been performed via UAVs or on the ground.
Ishaq and Huff (1974) and Dunne et al. (1975) suggested the use of VIS or infrared photographs for mapping surface saturation.
However, even though VIS cameras have been deployed on the ground and mounted on UAVs, airborne or satellite platforms for a long time, this suggestion was rarely followed in the last 40 years (for an exception see e.g. Portmann, 1997). Recently, Chabot and Bird (2013) and Spence and Mengistu (2016) successfully used VIS cameras mounted on UAVs for mapping surface water (a wetland of 128 ha and an intermittent stream surveyed via three transects of 2 km each). Silasari et al. (2017) mapped surface saturated areas on an agricultural field (100 m x 15 m) by using a VIS camera mounted on a weather station for high-frequency image acquisition.
Today, thermal infrared (TIR) imagery features the same temporal and spatial flexibility as VIS imagery. Pfister et al. (2010) and Glaser et al. (2016) demonstrated the potential of TIR imagery for mapping surface saturation by carrying out repeated infrared image acquisitions at small spatial scales (centimetres to metres) with handheld cameras. So far, these two studies represent rare examples of using TIR imagery for mapping surface saturation, while the use of TIR imagery for analysing groundwater-surface water interactions (e.g. Ala-aho et al., 2015; Briggs et al., 2016; Pfister et al., 2010; Schuetz and Weiler, 2011) or water flow paths, velocities, and mixing (e.g. Antonelli et al., 2017; Deitchman and Loheide, 2009; Schuetz et al., 2012) became rather common with the advent of affordable, handheld TIR cameras.
One reason why studies using TIR imagery for mapping surface saturation are still scarce is certainly that existing descriptions of the methodological advantages and challenges are sparse. Several general guidelines and methodological descriptions for TIR imagery applications exist. For example, they focus on one specific aspect of TIR imagery, such as co-registration (Turner et al., 2014; Weber et al., 2015), or on how to acquire correct surface water temperatures, which is the main application of TIR imagery in hydrology (e.g. Dugdale, 2016; Handcock et al., 2006, 2012; Torgersen et al., 2001). Many of these recommendations can be directly applied for mapping surface saturation via TIR imagery (e.g. choice of sensor type).
However, some recommendations are redundant (e.g. temperature corrections) or different (e.g. optimal time scheduling) for the application of TIR imagery for surface saturation mapping.
Here, we go beyond the mere demonstration of the potential for TIR imagery to map saturated surface areas and address the related application-specific technical and methodological challenges. We (1) review relevant technical and methodological aspects from existing TIR imagery literature and (2) complement them with our expertise and results from an 18-month field campaign. The field campaign was focused on the recurrent acquisition of panoramic images of seven distinct riparian areas with a portable TIR camera. Yet, the precautions and considerations that we describe in this technical note are also valid for surface saturation mapping campaigns with permanently installed ground-based TIR cameras and TIR cameras mounted on UAVs, airborne or satellite platforms.

2 Acquisition of TIR images for mapping surface saturation patterns

2.1 Fundamental principles
TIR cameras provide an areal picture of surface temperatures (e.g. 100 µm penetration depth for water columns). The cameras sense the intensity of thermal infrared radiation emitted by the objects the camera is pointed at. The surface temperature of the objects is then calculated from the sensed radiation intensity, based on the Stefan-Boltzmann law and considering some radiometric corrections, such as material-specific emissivity, reflected radiation, and atmospherically induced radiation. Details on principles of TIR imagery, TIR sensor types (i.e. wavelength, sensitivity), and considerations for choosing the most appropriate camera and remote sensing platform for the desired acquisition (i.e. accuracy, resolution) are provided in the literature (cf. Dugdale, 2016; Handcock et al., 2012). For this study we used two different handheld TIR camera models: FLIR B425 with a resolution of 320 x 240 pixels and an angle of view of 25° and FLIR T640 with a resolution of 640 x 480 pixels and an angle of view of 45° (FLIR Systems, Wilsonville, USA).
The wider angle of view of the FLIR T640 clearly eased the image acquisition in this study, while a pixel resolution lower than the resolutions of the two cameras would still have been sufficient for the identification of surface saturation patterns.
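The conversion from sensed radiation to surface temperature mentioned above can be illustrated with a minimal sketch. It assumes a simplified broadband grey-body model that accounts for emissivity and reflected radiation only (no atmospheric term); the function names, default emissivity and reflected temperature are hypothetical, and real cameras apply band-limited, manufacturer-specific calibrations instead.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def sensed_exitance(temp_k, emissivity=0.97, reflected_temp_k=283.15):
    """Forward model (assumption): emitted plus reflected broadband radiation."""
    emitted = emissivity * SIGMA * temp_k ** 4
    reflected = (1.0 - emissivity) * SIGMA * reflected_temp_k ** 4
    return emitted + reflected


def surface_temperature(exitance, emissivity=0.97, reflected_temp_k=283.15):
    """Invert the forward model above for the surface temperature in kelvin."""
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must be in (0, 1]")
    reflected = (1.0 - emissivity) * SIGMA * reflected_temp_k ** 4
    emitted = exitance - reflected
    if emitted <= 0.0:
        raise ValueError("sensed radiation is smaller than the reflected part")
    return (emitted / (emissivity * SIGMA)) ** 0.25
```

In this simplified model a wrong emissivity or reflected temperature shifts the absolute temperatures of all pixels of one material in the same direction, which is consistent with the point made in this note that temperature contrasts matter more than absolute values for saturation mapping.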
Determining surface saturation with TIR imagery implies that surface saturation is defined as water ponding or flowing on the ground surface (even if only present as a very thin layer). Mapping surface saturation with TIR imagery requires (1) a sufficient temperature contrast between surface water and the surrounding environment (e.g. dry soil, rock, vegetation) and (2) at least one pixel of the TIR image being known to correspond to surface water. When these two requirements are met, it is possible to visually identify the surface saturation patterns in a TIR image. This is exemplified with a TIR image of a riparian-stream zone (Fig. 1). The substantial temperature contrast (requirement 1) makes it possible to differentiate between two TIR pixel groups, i.e. surface water pixels and surrounding environment pixels. With ground truth data at hand (here: VIS image; alternatives are e.g. water temperature or the stream course) for point 1 of Fig. 1 (requirement 2), the group of pixels with higher temperatures can be identified as surface water, whereas the group of pixels with lower temperatures can be assigned to the surrounding environment (cf. Fig. 1, point 2). With this classification in mind, the TIR image considerably enhances the visibility of surface saturated areas compared to the VIS image. Moreover, the TIR image reveals additional surface saturated areas that are not clearly identifiable (cf. points 3 in Fig. 1) or not visible (cf. area above point 6, Fig. 1) within a VIS image.
The example shows that the identification of surface saturation relies on temperature contrasts between surface water and the surrounding environment. Radiometric corrections of TIR images for obtaining correct temperature values are thus not necessary. However, interferences that affect temperature, such as shadow casts or reflections (cf. Dugdale, 2016; Handcock et al., 2012), cannot be disregarded as they can influence the temperature contrast (see section 2.2). In cases where the water temperature is too similar to surrounding materials, saturated areas might be falsely identified as dry, whereas surrounding materials might be falsely identified as wet. In cases where non-uniform water temperatures occur, different water sources may be distinguished (cf. Fig. 1, where point 4 likely represents stream water, and points 5 and 7 likely represent exfiltration of warmer groundwater). However, a bimodal distribution of water temperatures (e.g. cold stream and warm exfiltrating groundwater or warm ponding water) can also lead to a misinterpretation of temperature contrasts to the surrounding environment (e.g. surrounding material with a temperature between the water temperatures might be identified as water).

Image acquisition
Weather conditions can interfere with TIR image acquisition (e.g. Dugdale, 2016; Handcock et al., 2012). We identified similarity between air and water temperatures as the main reason for insufficient temperature contrasts between water and the surrounding environment, compromising an identification of surface saturation with TIR images (Fig. 2a). Water has a higher thermal capacity than most environmental materials and therefore the water surface temperature generally aligns more slowly with air temperature than the surface temperatures of surrounding materials do. During our field campaign it became clear that particularly during day-night-day or seasonal transitions, this difference in thermal capacities induced a convergence of the surrounding environment's temperatures (which are aligning to air temperature) to the water temperature. Furthermore, direct exposure of the study site to sunlight combined with shadow casts commonly distorted the temperature contrasts. Shaded surrounding materials had different temperatures than the same materials in sunlight, which reduced the temperature contrasts between these materials and surface water (Fig. 2b). Once the direct sun exposure ceased, different thermal capacities of different materials could still cause sun memory effects with patches of warmer and colder temperatures. Rain and fog may also influence image quality due to water droplets between the TIR sensor and the ground, eventually blurring the images and causing uniform temperature signatures (Fig. 2c).
To avoid the acquisition of unusable TIR images we advise adapting field campaigns to the weather forecasts. The ideal situation is to work during dry and cloudy weather with warm or cold air temperatures in order to ensure a clear difference between the temperature of surrounding materials and the more temperate water surface temperatures. Dugdale (2016) reported the time period from mid-afternoon to night time as optimal TIR image acquisition time for monitoring water surface temperatures. Based on our 18-month field campaign we suggest that the optimal TIR image acquisition time for identifying surface saturation patterns is early daytime. At this time undesirable effects due to sunlight (shadows, memory effect) are non-existent and temperature contrasts between water surfaces and the surrounding environment are commonly large. A site-specific analysis of the sun exposure over the day can help to understand at what other times images can be taken under favourable conditions for a specific study site.
View obstructions in the TIR camera's field of view are obviously problematic. Yet, permanent view obstructions on the ground (e.g. tree trunks, Fig. 2d, point 6) proved to be useful ground reference points in our field campaign. Temporary view obstructions, such as growing vegetation (Fig. 2d), recent litter, and snow cover proved to be problematic for repeated imaging campaigns. Cutting the vegetation during the growing season can be an option for small study sites. Our experience is that grasses and herbaceous plants with small leaves still permit the recording of ground surface temperature, while ferns or tree leaves are completely opaque.
Ideally, images are taken from above and at nadir to the study site. Oblique angles of view (>30° from nadir) reduce the object's emissivity and thus distort the detected temperatures in the TIR images (Dugdale, 2016). The incorrect temperature values as such are not critical for mapping surface saturation patterns, but we observed that wide ranges of angles can result in distinct temperature distortions and thus reduced temperature contrasts within the images. In a similar way, varying distances between camera and ground surface for different positions within one image (e.g. top / bottom, left / right) not only produce pixels with varying area equivalents, but can also distort the temperature detection and thus the temperature contrast. Therefore, ground-based cameras should be positioned at locations that minimize the range of angles of view and the distances between camera and ground surface. In case of repeated image acquisitions of a given area of interest, we took the pictures each time from the same position in order to facilitate subsequent image comparisons. For repeated image campaigns it could be useful to install a structure that makes it possible to acquire several images by moving the camera to specific positions with fixed heights above ground and angles of view.
This could simplify the post-processing and assemblage of the images into panoramic images (cf. section 2.3).
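The geometric part of the effect described above, i.e. pixels covering different ground areas depending on distance and viewing angle, can be approximated with a short sketch. It assumes flat ground, a small-angle approximation, and an instantaneous field of view equal to the angle of view divided by the number of pixels; the function name and example values are hypothetical and only meant to show the order of magnitude.

```python
import math


def pixel_footprint(height_m, off_nadir_deg, fov_deg, n_pixels):
    """Approximate ground footprint (across, along) of one pixel in metres.

    Assumptions: flat ground, small angles, uniform angular pixel size.
    """
    ifov = math.radians(fov_deg) / n_pixels        # angle covered by one pixel
    theta = math.radians(off_nadir_deg)
    slant_range = height_m / math.cos(theta)       # camera-to-ground distance
    across = slant_range * ifov                    # perpendicular to the view direction
    along = slant_range * ifov / math.cos(theta)   # stretched by the oblique incidence
    return across, along


# Example: a 45 degree, 640 pixel wide sensor held 2 m above ground
nadir = pixel_footprint(2.0, 0.0, 45.0, 640)
oblique = pixel_footprint(2.0, 60.0, 45.0, 640)
```

Under these assumptions a pixel viewed 60° off nadir covers twice the width and four times the length of a nadir pixel, which illustrates why a narrow range of viewing angles keeps the pixel area equivalents within one image comparable.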
For determining surface saturation, the TIR images should cover a part of the stream or another area known to be surface saturated in order to have a reference for water temperature (cf. section 2.1). In addition, a VIS image should be acquired simultaneously with the TIR image for comparison. The TIR imagery parameters that are necessary for correcting and converting the radiation signal to temperature values (e.g. air temperature, humidity) do not need to correspond to the actual conditions, since only the temperature contrast and not the correct temperature value is required for defining saturated areas.
However, setting realistic values for the distance between camera and ground surface helped the auto-focus process of the camera. Working with proper values for the emissivity (0.96-0.98 for water) and for the actual reflection temperature helped to prevent the observation of unrealistic surface temperatures. Nonetheless, in case of clear sky or cold winter days we occasionally observed negative temperatures for flowing water. The explanations for these observations remain purely speculative (Antonelli et al., 2017). However, for the identification of surface saturation patterns such unrealistic temperatures do not pose a problem since the temperature range stays correct (Antonelli et al., 2017).
Reflections of surrounding objects on the water surface (Fig. 2e) and image vignetting (cf. aura effect in Antonelli et al., 2017) can occur during the image acquisition and can compromise a further use of the TIR images. In this study, the image vignetting was unproblematic, especially when a panorama was built from several images (cf. section 2.3). This is due to the fact that image vignetting only occurs at the edges of the pictures and that it is of minor relevance in images with high temperature contrasts. Reflections of surrounding objects on the water surface limit the value of the images for saturation identification in a similar way as shadows (cf. Fig. 2d and 2e). The difference to shadows is that reflections also occur with diffuse light, which makes it difficult to predict their occurrence and thus to avoid them.

Generation of TIR panorama images
We acquired the images that were needed for the assemblage of a panoramic view in two different ways: (1) by taking single, overlapping images, or (2) by taking a video of the area of interest. While both approaches deliver similar final results, videos are taken faster than sequences of individual images. Independently of the chosen data format, we kept the camera's parameter settings (cf. section 2.2) constant during image / video acquisition for the area of interest and ensured that the saving format retained the temperature information as radiometric data. These two precautions were necessary for further image processing (see below and Fig. 3). Sun (dis)appearance and automatic noise corrections by the camera (non-uniformity corrections, cf. Dugdale, 2016) can lead to clear shifts in recorded temperatures from one image / video frame to another. In order to check whether such a temperature shift occurred, we fixed the temperature-colour scale while taking the video / single images of an observation area. In case the colour (and thus temperature) of overlapping image parts shifted, we restarted the image acquisition, since a correction of such temperature shifts is difficult (cf. Dugdale, 2016).
We acquired the images / video frames in such a way that the area of interest constituted the central part of a panorama. This avoided image gaps and distortion effects at the borders of the area of interest. When possible, we ensured that the single pictures / video frames included overlapping parts with identifiable structures such as the stream bank, tree stems, or stones as natural reference points. For videos it was essential to move the camera slowly enough to obtain sharp images and to use a low frame rate (e.g. 2 Hz) to keep the number of video frames reasonable (enough frames for obtaining area overlaps, but not too many frames showing the same area).
The generation of a panorama from overlapping TIR images / video frames acquired with a ground-based camera involves some challenges that specifically relate to TIR and / or ground-based images. These need to be addressed in TIR-specific panorama generation and image processing steps, as presented in a nutshell by Cardenas et al. (2014). Before blending the images / video frames into a panorama image, one needs to ensure that all single images / video frames have the identical temperature-colour scale.
Ideally, the temperature-colour scale is a linear greyscale and ranges from the global minimum to the global maximum temperature value of the images / video frames. This prevents artefactual colour mixing effects and makes it possible to embed the temperature information in the generated panoramas. The camera types that we used for this study did not allow such a linear, grey temperature-colour scale to be fixed prior to image acquisition. Therefore, we transformed the acquired images / video in a colour conversion step to grey-scaled images (Fig. 3, step 1) based on the temperature information retained in the acquired images / video (see above). In cases where extreme temperature values of an image were not relevant for the identification of saturated areas, we truncated the temperature range retained for the colour conversion in favour of a better colour contrast and a finer temperature class width retained in the grey values (in case of a temperature range of 25.5 °C and an image with 255 grey values, the retained temperature class width is 0.1 °C).
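The grey-scale conversion and the effect of truncating the temperature range can be sketched as follows. This is a minimal illustration with NumPy, not the conversion routine used in the study; the function name and the `levels` parameter are hypothetical, and the radiometric temperatures are assumed to be available as a 2-D array in °C.

```python
import numpy as np


def temperature_to_grey(temp, t_min=None, t_max=None, levels=255):
    """Map temperatures linearly to integer grey values 0..levels.

    Temperatures outside [t_min, t_max] are clipped (truncated range).
    Returns the grey image and the temperature class width per grey step.
    """
    if not 1 <= levels <= 255:
        raise ValueError("levels must be between 1 and 255 for 8-bit images")
    temp = np.asarray(temp, dtype=float)
    t_min = float(temp.min()) if t_min is None else float(t_min)
    t_max = float(temp.max()) if t_max is None else float(t_max)
    if t_max <= t_min:
        raise ValueError("t_max must be larger than t_min")
    class_width = (t_max - t_min) / levels
    clipped = np.clip(temp, t_min, t_max)
    grey = np.rint((clipped - t_min) / class_width).astype(np.uint8)
    return grey, class_width


# A truncated range of 25.5 degrees C gives a class width of 0.1 degrees C
example = np.array([[0.0, 25.5], [12.7, 30.0]])
grey, width = temperature_to_grey(example, t_min=0.0, t_max=25.5)
```

Passing the same `t_min` and `t_max` to all images or video frames of one panorama keeps their grey scales identical, which is the precondition for blending stated above.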
We employed Microsoft's Image Composite Editor (ICE) and the PTGui panorama software (New House Internet Services) for creating panorama images (Fig. 3, step 2). ICE and PTGui create panoramas from single images with an automatic mosaicking function (i.e. a function that geometrically transforms, aligns and overlaps the single images). Unlike PTGui, the ICE software can additionally mosaic video frames automatically. However, in cases where the automatic mosaicking fails, only PTGui offers adequate possibilities to manually interact with the image alignment (i.e. defining control points for matching distinct points in overlapping images). TIR images generally show fewer identifiable features and lower contrasts than VIS images (cf. Weber et al., 2015). Therefore, a (partial) failure of automatic mosaicking is not uncommon and manual interactions were frequently necessary for the images of our 18-month field campaign. Nevertheless, we always first attempted to create a panorama from a video with ICE, since the video acquisition proved to be more efficient for the image acquisition and for the grey-scale conversion. In cases where the image mosaicking of a video (partly) failed, we extracted single images from the video and processed them as individual images, eventually resorting to PTGui.
In order to compare several panorama images of the same area, the panoramas need to be co-registered (Fig. 3, step 3). In principle, it is possible to georectify the TIR images by assigning geographical coordinates that are derived from ground control points (Keys et al., 2016; Silasari et al., 2017) or from a virtually projected elevation model (cf. DSM-derived virtual perspectives in Cardenas et al., 2014). However, this can result in strong interpolations or distortions, due to view obstructions in the picture. Instead, we co-registered TIR panoramas of the same area against each other (cf. Cardenas et al., 2014; Glaser et al., 2016).
More specifically, we registered and cut them to the dimensions of a reference TIR panorama of the area of interest (Fig. 3, step 3).

Application examples
In this section we present three examples from our 18-month field campaign that demonstrate the capability of TIR imagery for analysing surface saturation patterns and their dynamics. All images were taken in the Weierbach catchment, a forested 42 ha headwater research catchment in western Luxembourg (Glaser et al., 2016; Klaus et al., 2015; Martínez-Carreras et al., 2016; Schwab et al., 2018). We avoided unfavourable environmental conditions for the image acquisitions (cf. section 2.2, Fig. 2), allowing a tolerance of a few days around the targeted (bi-)weekly recurrence frequency. Additionally, we cut ferns that obstructed the camera view during the summer months. The 364 acquired panoramas were divided into three groups classified as usable without restrictions (32.4 %), usable with some restrictions (small negative effects of low temperature contrasts or covering vegetation visible, 31.1 %) and unusable (36.5 %).
The usable panoramas captured the temporal evolution of surface saturation over the 18-month field campaign. This demonstrates the robustness of TIR imagery through the complete range of seasonal conditions (Fig. 4), including snow and growing vegetation, as well as warm and cold water. The full extent of added value provided by TIR imagery compared to VIS imagery was documented for cases with different seasonal conditions (Fig 4 a/e  show similar saturation patterns for the two dates. In addition to surface saturation dynamics, the TIR images can also reveal distinct types of saturation patterns. For example, the orientation of saturated areas may change within a few metres from perpendicular (Fig. 5 top) to parallel (Fig. 5 bottom) to the adjacent stream. The parallel direction on the left bank (Fig. 5 bottom) appears to be created by a parallel flow of the stream in a flat riparian zone that becomes an extended stream bed, while the perpendicular direction appears to be generated by exfiltrating groundwater that flows downhill to the stream at the soil surface. Thus, the different directional extents of the saturated areas can indicate different processes underlying the surface saturation formation.
Finally, the images make it possible to identify the spatial heterogeneity of temporal saturation dynamics across different study sites. Figure 6 shows TIR images of the riparian zone of two different source areas with different degrees and dynamics of surface saturation. In area 1 (Fig. 6, left panels) the pattern of saturated areas had barely changed from February to April, while in area 2 (Fig. 6, right panels) some locations had dried out (red circles). In December 2016, the riparian zones of both source areas were completely dry and the stream started further downstream in comparison to the other observation dates (red arrows).
This suggests that both source areas evolve from very wet to very dry conditions (during which surface saturation is mainly represented by spots with stable groundwater exfiltration) with distinctly different transition dynamics.

Building saturation maps
The application examples described in section 3 demonstrate the potential of TIR images for a rapid and intuitive visualisation of surface saturated areas. The 'raw data' images can be used without any additional processing to study surface saturated areas, their evolution over time, and how and where they occur, ultimately contributing to a better mechanistic understanding of the hydrological processes prevailing in the studied area. This is even more valid, since the images do not only provide surface saturation extents, but potentially also information on the different mechanisms that generate saturated patches (cf. saturation patterns, Fig. 6). However, the images need to be transformed into binary saturation maps for further analyses based on quantitative values (e.g. saturation percentages) or for applying map comparison methods (e.g. confusion matrices, kappa coefficients).
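The quantitative measures named above can be computed from two co-registered binary maps as in the following sketch. It is only an illustration with NumPy; the function name is hypothetical, and the kappa coefficient is taken to be Cohen's kappa, which is an assumption since the text does not specify the variant.

```python
import numpy as np


def compare_saturation_maps(map_a, map_b):
    """Saturation percentages, confusion matrix and Cohen's kappa for two
    binary maps of equal size (True / 1 = saturated)."""
    a = np.asarray(map_a, dtype=bool).ravel()
    b = np.asarray(map_b, dtype=bool).ravel()
    if a.size != b.size:
        raise ValueError("maps must have the same number of pixels")
    n = a.size
    # Rows: map_a (dry, wet); columns: map_b (dry, wet)
    confusion = np.array([
        [np.sum(~a & ~b), np.sum(~a & b)],
        [np.sum(a & ~b), np.sum(a & b)],
    ])
    observed = np.trace(confusion) / n
    expected = np.sum(confusion.sum(axis=0) * confusion.sum(axis=1)) / n ** 2
    kappa = (observed - expected) / (1.0 - expected) if expected < 1.0 else 1.0
    return 100.0 * a.mean(), 100.0 * b.mean(), confusion, kappa


pct_a, pct_b, confusion, kappa = compare_saturation_maps(
    [[1, 1], [0, 0]], [[1, 0], [0, 0]]
)
```

For the toy maps above the saturation percentages are 50 % and 25 %, and the agreement beyond chance is a kappa of 0.5.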
One possibility to transform a TIR image into a binary saturation map is to take the temperature range of pixels that are known to be saturated (i.e. stream pixels) and to define all pixels of the image that fall into that temperature range as saturated (cf. Glaser et al., 2016; Pfister et al., 2010). This approach requires a sufficient temperature contrast. Furthermore, artefacts (such as pixels corresponding to vegetation covering the stream) may induce some uncertainty in pixel classification, eventually leading to discrepancies with visual saturation pattern identifications. The selection of the temperature range for surface saturation can be done manually by adapting the chosen range until the resulting saturation map best matches the visual assessment of the original TIR and, if possible, VIS image. Up to a certain level it is straightforward to visually reject a temperature range by qualifying it as being too wide or too narrow (cf. Fig. 7). However, finding an unequivocal temperature range is not feasible and the selection of the most plausible temperature range (Fig. 7, dark-green asterisk) remains somewhat subjective.
Consequently, a pixel classification based on this procedure remains affected by uncertainties, and the definition of an uncertainty range within which the temperature range is considered plausible (Fig. 7, dark-green, dashed lines) is a subjective exercise as well. In our experience from the 18-month field campaign, the uncertainty range was generally small for images with low saturation and gradually increased with higher saturation (compare Fig. 7 d-b). Accordingly, the uncertainty ranges of images with a large difference in percentages of saturated pixels (e.g. Fig. 7b vs 7d) did not overlap. For some images the uncertainty range was very large (Fig. 7a) and a comparison with other images with percentages of saturated pixels in the same range can thus be problematic. In such cases it is advisable that only one person defines the optimal temperature ranges and thus saturation patterns for all images that are intended to be compared in order to ensure consistency in the image interpretation (e.g. we realised that some persons consistently favoured higher and others consistently favoured lower saturations within the uncertainty range of a set of images).
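The temperature-range classification discussed above reduces to a simple comparison per pixel, as the following NumPy sketch shows. The function names are hypothetical, and the second function is only an assumed way to explore how strongly the saturation percentage reacts to the width of the chosen range; it is not the procedure behind Fig. 7.

```python
import numpy as np


def saturation_map_from_range(temp, t_low, t_high):
    """Classify pixels with t_low <= T <= t_high as saturated.

    Returns the binary map and the percentage of saturated pixels.
    """
    temp = np.asarray(temp, dtype=float)
    saturated = (temp >= t_low) & (temp <= t_high)
    return saturated, 100.0 * saturated.mean()


def saturation_sensitivity(temp, t_reference, half_widths):
    """Saturation percentage for increasingly wide temperature ranges
    centred on a reference water temperature (e.g. a stream pixel)."""
    return [
        saturation_map_from_range(temp, t_reference - w, t_reference + w)[1]
        for w in half_widths
    ]


image = np.array([[8.0, 8.2, 3.0], [7.9, 2.5, 2.0]])
sat_map, sat_pct = saturation_map_from_range(image, 7.5, 8.5)
curve = saturation_sensitivity(image, 8.0, [0.05, 0.5, 6.0])
```

A range that is too narrow misses part of the water pixels and a range that is too wide eventually classifies the whole image as saturated, which mirrors the visual rejection of too narrow or too wide ranges described above.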
A more objective option, which is also faster for time-lapse images, consists in constraining the temperature range of saturated pixels with preselected pixels or a predefined mask for saturated and unsaturated parts in all images. Such pixels / masks can be selected based on visual interpretation of the images or on information obtained from reference sensors in the field, indicating whether a location was wet or dry at the surface at the time of the image acquisition. As an example we chose a mask of 2000 pixels falling into an area that always stayed dry and 2000 pixels falling into an area where the stream was flowing all year (red rectangles, Fig. 7). Based on this mask we defined a minimum and maximum temperature range for each image in such a manner that 90% of the pixels falling under the mask are defined as saturated and dry, respectively. The resulting uncertainty ranges of saturation percentages (Fig. 7, blue points) are larger than for the manual temperature range selection (Fig. 7, dark-green, dashed lines). The saturation identifications based on the dry part of the mask were clearly not constrained enough. The saturation identifications based on the wet part of the mask sometimes approached the manual saturation identifications (Fig. 7a, c) but in other cases they even exceeded them (Fig. 7d). The mask-defined uncertainty range of saturation could be narrowed by selecting a value higher than 90% of the pixels for the temperature range definition. However, this increased the risk of obtaining a clearly wrong value (cf. Fig. 7d), since the wet / dry mask can cover pixels of the 'wrong' category (due to artefacts like vegetation covering the stream or due to distorted co-registered images, resulting in a shifted mask).
A reduced mask size avoids such 'wrong' pixels but reduces the captured variability in temperature (in an extreme case down to a single temperature value), which increases the risk of missing the warmest or coldest temperature of the water or dry areas.
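The mask-based definition of the temperature range can be sketched as follows. This is a simplified illustration under stated assumptions rather than the exact procedure applied to our images: the temperatures below the wet and dry parts of the mask are assumed to be given as plain lists, the threshold is taken as an order statistic (nearest-rank rule), and all function names and values are hypothetical.

```python
import math


def mask_threshold(mask_temps, fraction, side):
    """Threshold t such that at least `fraction` of the mask pixels lie on `side` of t.

    side="below" means pixels <= t, side="above" means pixels >= t.
    """
    ordered = sorted(mask_temps)
    n = len(ordered)
    # Number of mask pixels that must fall on the requested side (nearest-rank rule).
    k = max(1, math.ceil(round(fraction * n, 9)))
    return ordered[k - 1] if side == "below" else ordered[n - k]


def mask_based_range(wet_temps, dry_temps, water_is_warmer, fraction=0.9):
    """Temperature thresholds derived from the wet and the dry part of the mask."""
    if water_is_warmer:
        t_wet = mask_threshold(wet_temps, fraction, "above")  # wet pixels are warm
        t_dry = mask_threshold(dry_temps, fraction, "below")  # dry pixels are cold
    else:
        t_wet = mask_threshold(wet_temps, fraction, "below")  # wet pixels are cold
        t_dry = mask_threshold(dry_temps, fraction, "above")  # dry pixels are warm
    return t_wet, t_dry


# Toy mask values (degrees Celsius), water colder than its surroundings.
# The last wet value (6.0) and the first dry value (5.0) mimic 'wrong' pixels.
wet = [3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 6.0]
dry = [5.0, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8]
print(mask_based_range(wet, dry, water_is_warmer=False))
print(mask_based_range(wet, dry, water_is_warmer=False, fraction=1.0))
```

In this toy case the 90% rule ignores the two artefact pixels, whereas a fraction of 100% lets them define the thresholds, so that the wet threshold ends up above the dry threshold. This mirrors the risk described above of obtaining a clearly wrong value when the fraction is increased.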
Another approach is to derive a threshold value for the temperature range from image statistics such as the probability density function (Pfister et al., 2010). However, this is only straightforward in cases where the temperature distribution is clearly bimodal between water and the surrounding environment. Silasari et al. (2017) applied an automatic image classification for unimodal distributions based on a threshold parameter that needs to be calibrated to specific image conditions (in their case brightness of VIS images). In our case, the image conditions vary to such an extent (e.g. very wet/dry conditions, water warmest/coldest material, cf. location of dark-green asterisk on cumulative saturation curves, Fig. 7) that each image would require its own calibration to manually selected temperature ranges / saturation patterns. Chini et al. (2017) [...] We tested the usability of this approach for our TIR images by constraining the region growing algorithm to a) a bimodal distribution derived from the HSBA applied to the entire image, b) a bimodal distribution derived from the HSBA where the selection of bimodal image subsections was constrained to image-specific manual predefinitions of temperature ranges of saturation, and c) a bimodal distribution derived from pre-selected parts of the image that include clearly wet and dry areas.
While in some cases the fully automatic image classification (a) worked very well in comparison with the manual selection of a temperature range (cf. Fig. 8, 04/12/15, 30/08/16), in the other cases the saturation was mostly underestimated (cf. Fig. 8, 25/02/16, 03/06/16). The additional constraint with image-specific temperature ranges (b) overall improved the match to the manually defined saturation patterns, but the result was strongly influenced by how well the given constraint range matched the range that was defined as the optimum for the image. A constraint with a roughly estimated temperature for saturation performed worse than a constraint with the temperature range as selected in the detailed manual assessment described earlier in this section (cf. Fig. 7, green asterisks and lines). The classification based on pre-selected parts of the image (c) tended to result in higher saturation amounts. This improved the match for the cases that were underestimated with the fully automatic classification (a) (cf. Fig. 8, 25/02/16, 03/06/16), but it overestimated saturation for the cases where the fully automatic classification (a) showed good results (cf. Fig. 8, 04/12/15, 30/08/16).
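The principle of a region growing classification constrained by two pixel class distributions can be illustrated with the following sketch. It is not the HSBA and not the implementation used for Fig. 8. As a simplifying assumption, loosely comparable to variant (c), the two classes are described by the mean and standard deviation of hand-picked wet and dry temperature samples. The seed rule (pixels on the wet side of the wet class mean) and the stopping rule (the temperature with equal standardised distance to both class means) are likewise assumptions made for illustration, and all names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def region_grow_saturation(image, wet_samples, dry_samples):
    """Binary saturation map (nested list of booleans) from seeded region growing."""
    mu_w, sd_w = mean(wet_samples), pstdev(wet_samples)
    mu_d, sd_d = mean(dry_samples), pstdev(dry_samples)
    # Assumed stopping threshold: equal standardised distance to both class means.
    if sd_w + sd_d > 0:
        stop = (mu_w * sd_d + mu_d * sd_w) / (sd_w + sd_d)
    else:
        stop = 0.5 * (mu_w + mu_d)
    # sign = +1 if water is the colder class, -1 if it is the warmer class.
    sign = 1.0 if mu_w < mu_d else -1.0

    rows, cols = len(image), len(image[0])
    saturated = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seeds: pixels on the wet side of the wet class mean.
    for r in range(rows):
        for c in range(cols):
            if sign * image[r][c] <= sign * mu_w:
                saturated[r][c] = True
                queue.append((r, c))
    # Growth: add 4-connected neighbours that are on the wet side of `stop`.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols and not saturated[nr][nc]
                    and sign * image[nr][nc] <= sign * stop):
                saturated[nr][nc] = True
                queue.append((nr, nc))
    return saturated


# Toy image (degrees Celsius), water colder than the surroundings.
image = [[3.0, 3.6, 7.0, 7.2],
         [3.1, 4.4, 7.1, 4.4],
         [7.0, 7.3, 7.2, 7.1]]
sat_map = region_grow_saturation(image, [3.0, 3.2, 3.4], [6.8, 7.0, 7.2, 7.4])
for row in sat_map:
    print(row)
```

In the toy image the pixel of 4.4 °C in the last column lies on the wet side of the stopping threshold but is not connected to any seed, so it stays unsaturated. A purely threshold-based classification would have included it, which is the main difference that the region growing step introduces.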

Discussion
The main advantages of TIR imagery in comparison to other surface saturation mapping methods are its non-invasive character and its large temporal and spatial flexibility (centimetres to kilometres, minutes to months). Another advantage is that TIR images allow a rapid and intuitive identification and analysis of the dynamics of surface saturation patterns. VIS imagery offers similar advantages (Silasari et al., 2017), but commonly the saturated areas are not as clearly visible as with TIR imagery (cf. Fig. 1, Fig. 4). Moreover, VIS imagery is not usable during night-time and it cannot provide additional information about water sources and processes underlying the surface saturation formation (cf. Fig. 1, Fig. 5). Nevertheless, VIS imagery provides good complementary information to the TIR imagery and should always be considered as a source of ground truth information.
In our study, unfavourable image acquisition conditions (cf. section 2.2) rendered 36.5 % of the acquired images unusable for further processing. High proportions of unusable images are a common problem in environmental imagery (e.g. cloud cover for satellite images, night-time for VIS images (DeAlwis et al., 2007; Silasari et al., 2017)). Flexibility in the scheduling of a field campaign is thus necessary to reduce the number of acquisitions during unfavourable conditions. A concern for the use of TIR imagery for mapping saturation patterns is that some saturated areas (e.g. warmed-up ponding water) might not be identified as saturated because their temperature is very different from the stream temperature. This relates back to the fact that temperature is only used as an indicator for saturation. Compared to other saturation indicators such as vegetation mapping or hydrometric measurements (cf. Dunne et al., 1975), we deem TIR imagery, with the above-mentioned advantages, the better indirect mapping method. However, the only way to directly map surface saturation consists in walking through the area of interest (e.g. squishy boot method), which remains restricted to small areas and / or low mapping frequencies.
The amount of field work for imagery mapping is generally reduced compared to other methods for mapping surface saturation (e.g. vegetation/soil mapping), allowing more frequent campaigns with higher spatial precision. Yet, consistent with other imagery mapping studies (e.g. Spence and Mengistu, 2016), the image post-processing in this study was time-consuming.
Mosaicking and co-registering of images is often considered particularly difficult for TIR images, since ground control points with a thermal signature are needed (Dugdale, 2016; Weber et al., 2015). Our experience showed that the images normally offered enough natural thermal ground control points (e.g. the stream bank) in cases where the temperature contrast between water and ambient materials was good enough for image usability. In combination with the presented post-processing workflow, the post-processing effort was reasonable. More automated workflows like the one proposed by Turner et al. (2014) for mosaicking UAV-acquired TIR images could also be adapted and applied.
More challenging than TIR image mosaicking and co-registering was the generation of saturation maps from the TIR images.
The different tested processing methods all yield somewhat different results compared to pixel classification based on manual, visual assessment. Nevertheless, realising an objective, automatic classification of saturated areas is not more challenging than for other surface saturation mapping methods. Saturation maps created from the squishy boot method or vegetation/soil mapping depend on subjective decisions taken during field work. The (un)supervised classification methods that are commonly used for creating saturation maps from remote sensing data (e.g. VIS images / NDVI/NDWI) also contain some uncertainty (Chabot and Bird, 2013; DeAlwis et al., 2007; Mengistu and Spence, 2016; Spence and Mengistu, 2016). Moreover, the main problem for all of the tested saturation map generation methods (cf. section 4) is that they are not applicable without adapting them to individual image conditions (very wet, very dry, water warmest / coldest material, slightly different fields of view).
Other image processing methods for deriving saturation maps do not fulfil this requirement either, and it is necessary to adapt the parameters (e.g. Silasari et al., 2017) or to perform a new supervision (with new classification pixels/masks) for the classification of images with different conditions (e.g. Chabot and Bird, 2013; Keys et al., 2016). For now, we consider a manual choice of the temperature range for saturated pixels the best approach for time-lapsed images with very variable conditions and slight perspective shifts, even though it is labour-intensive and somewhat subjective. For time-lapsed images with a fixed vantage point and for time spans with similar conditions (e.g. storm events), the presented automatable methods represent valuable options. In particular, the combination of an automatic decomposition of two pixel class distributions with a region growing algorithm yielded objective saturation maps close to the manual saturation classification and visual assessment of the TIR images (Fig. 8). Small adaptations of the constraint for the decomposition of two pixel class distributions were sufficient to obtain good results for the different image conditions (cf. Fig. 8a-c), and further developments of the method might even allow such adaptations to be performed in a (semi-)automatic way.

Summary and conclusions
This technical note presents recent work in the Weierbach catchment, where we tested the capabilities of TIR imagery for mapping surface saturation dynamics. We reviewed and summarized the methodological principles and the required precautions and considerations for a successful application of TIR imagery. The main requirement is a clear temperature contrast between water and surrounding environments. Image acquisition during an 18-month campaign showed that the method works best during dry night-time or dry early daytime and that images should be taken from well-chosen positions without (non-)permanent view obstructions to the ground. The presented workflow for acquiring panoramic images is particularly suitable for small areas of interest (centimetres to metres) with intended intermediate to low mapping frequencies (days to months). Moreover, the information contained in this technical note is also beneficial for applications at different temporal and spatial scales (fixed cameras for high frequency images, drone/satellite images for larger spatial scales), considering that some adaptations and further developments of the methodology might be necessary.
We demonstrated with three examples that TIR imagery is applicable throughout the year and can reveal spatially heterogeneous surface saturation dynamics and distinct types of saturation patterns. The saturation patterns can also be used to identify different processes underlying the surface saturation formation, such as groundwater exfiltration or stream expansion. The surface saturation information visualised in the images can be directly used as soft data for characterising field conditions, for analysing ongoing hydrologic processes, and for model validation.
The presented methods for obtaining binary, objective saturation maps from TIR images contain some uncertainties and are not automatable for datasets containing many images with varying characteristics (e.g. very wet / dry, water warmest / coldest material, slightly different fields of view). In such cases, a manual choice of the temperature range for saturated pixels is the most reliable approach. Yet, for image subsets with similar conditions the tested pixel classifications work well, and we think that the combination of an automatic decomposition of the image distribution in two pixel classes and a region growing algorithm is a very promising option for obtaining objective, comparable saturation maps. In conclusion, we consider TIR imagery a very powerful method for mapping surface saturation in terms of practicability and spatial and temporal flexibility, and we believe it can provide new insights into the role of saturated areas and subsequent spatial and temporal dynamics in rainfall-runoff transformation.

Figure 7: [...] higher (a, b) or lower (c, d) temperature than the temperature range threshold and are thus defined as saturated (marked as yellow pixels in the inset TIR images). The green asterisks mark the temperature ranges that were manually chosen as optimum from visual assessment of the images. Green dashed lines define the uncertainty of the optimum temperature ranges. Red rectangles in the TIR images are the mask used for the identification of temperature ranges from a constantly wet (left) and constantly dry (right) area. The respective temperature ranges and saturation percentages are marked in blue.
Figure 8: Comparison of saturation maps (yellow = saturation) generated with a region growing process whose seeds and stopping criteria were automatically constrained to a) bimodal distributions derived from the HSBA applied to the entire image, b) bimodal distributions derived from the HSBA where the selection of bimodal image subsections was constrained to image-specific manual predefinitions of temperature ranges of saturation, c) bimodal distributions derived from pre-selected parts of the image (shown in d) that include clearly wet and dry areas. The saturation maps generated with manually selected temperature ranges based on visual assessment (cf. Fig. 7, green asterisk) are shown for comparison (e).