Impact of improved Sea Surface Temperature representation on the forecast of small Mediterranean catchments hydrological response to heavy precipitation

Operational meteo-hydrological forecasting chains are affected by many sources of uncertainty. In coastal areas characterized by complex topography, with several medium-to-small size catchments, quantitative precipitation forecast becomes even more challenging due to the interaction of intense air-sea exchanges with coastal orography. For such areas, quite common in the Mediterranean basin, improved representation of Sea Surface Temperature (SST) space-time patterns can be particularly important. The paper focuses on the relative impact of different accuracy levels of SST representation on regional operational forecasting chains (up to river discharge estimates) over coastal Mediterranean catchments, with respect to two other fundamental options in setting up the system, i.e., the choice of the forcing General Circulation Model (GCM) and the possible use of a three-dimensional variational assimilation (3DVAR) scheme. Two different kinds of severe hydro-meteorological events affecting the Calabria Region (Southern Italy) in 2015 are analysed using the atmosphere-hydrology modelling system WRF-Hydro in its uncoupled version. Both events are modelled using initial and lateral atmospheric boundary conditions from the 0.25° resolution Global Forecasting System (GFS) and from the ECMWF’s 16 km resolution Integrated Forecasting System (IFS). For the IFS-driven forecasts, the effects of the 3DVAR scheme are also analysed. Finally, native initial and lower boundary SST data are replaced with data from the Medspiration Project by IFREMER/CERSAT, having a 24-hour time resolution and 2.2 km spatial resolution. Precipitation estimates are compared with both ground-based and radar data, and discharge estimates with stream gauging station data. Overall, the experiments highlight that the added value of improved SST representation can be hidden by other more relevant sources of uncertainty, especially the choice of the General Circulation Model providing boundary conditions.
Nevertheless, high-resolution SST fields show in most cases a non-negligible impact on the simulation of the atmospheric boundary layer processes, modifying flow dynamics and/or the amount of precipitated water, therefore emphasizing that uncertainty in SST representation should be duly taken into account in operational forecasting for coastal areas.

accessibility of the results and administrative and/or institutional factors can be as important as monitoring and modelling activities (Pagano et al., 2014). Nevertheless, the cornerstone of such systems, and undoubtedly the most demanding part from a scientific point of view, is still the meteorological-hydrological modelling chain, supported by in-situ or remotely sensed measurements.
Increasingly refined modelling chains have been developed in recent years (e.g., UK Environmental Prediction Research, Canadian Great Lakes, U.S. Navy's Coupled Ocean/Atmosphere Mesoscale Prediction System (COAMPS®)). Despite their complexity, these systems all have to deal with some inherent limitations of the meteorological and hydrological models.
The main sources of errors in weather forecast are connected to both inaccuracy in defining the initial state, due to the lack of available measures or observation/assimilation errors, and approximations of the models, whose structure cannot properly represent the phenomena of interest (Allen et al., 2002; Buizza, 2018). These problems are exacerbated by the chaotic nature of the atmosphere. Even though hydrological models are much simpler than meteorological models in their structure (Liu et al., 2012; Pagano et al., 2014), they also have to cope with different sources of uncertainty that, according to Renard et al. (2010), can be grouped in four categories: 1) input uncertainty; 2) output uncertainty (e.g., runoff estimates are not straightforward); 3) structural model uncertainty; and 4) parametric uncertainty. Furthermore, since catchments are very seldom perfect natural systems, some effects of human disturbances virtually cannot be modelled. The main link between atmospheric and hydrological compartments in a forecasting chain is precipitation forecast, which is an output variable for weather models and constitutes the main input for hydrological models. Quantitative Precipitation Forecast (QPF) is a major challenge for operational meteorology, because the reliability of precipitation forecasts crucially affects streamflow forecast skill (for a review see Cuo et al., 2011; for recent applications, e.g., Davolio 2018), examining several events in the Eastern Adriatic, also found that more realistic SST fields did not substantially improve precipitation estimates; furthermore, they showed that the impact of improved SST varied in different cases. Conversely, Katsafados et al. (2011) found noticeable deviations among the forecast skills of simulations with SST boundary conditions at different resolutions in a test-case in the Eastern Mediterranean, while Cassola et al.
(2016) verified in a study in north-western Italy that high-resolution SST fields can positively impact QPF in the forecasting range 36-48 h. Finally, Berthou et al. (2016) in southern France and Stocchi and Davolio (2017) in the Adriatic Sea highlighted that SST-atmosphere interactions affect precipitation patterns and intensity mainly through complex (and varying event-by-event) modifications of the stability of the upstream atmospheric boundary layer.
The main objective of this paper is to contribute to the current discussion on the impact of SST representation by extending the analysis over the whole meteo-hydrological forecasting chain, i.e. going beyond precipitation forecasts and evaluating sensitivity on streamflow forecasts. Furthermore, SST sensitivity is assessed in the context of the overall uncertainty linked to initial and boundary conditions in regional modelling, using different forcing GCMs, with and without data assimilation.
To this aim, different accuracy levels of SST representation are used in an operational meteorological-hydrological forecasting chain over a coastal Mediterranean area including, in addition to the native SST fields of the General Circulation Models (GCMs), also higher resolution fields (namely, the Medspiration level 4 Ultra-High Resolution foundation SST - Weather Forecasts -ECMWF) and a three-dimensional variational assimilation (3DVAR) scheme.
The study area, corresponding to the Calabrian peninsula (southern Italy), due to its particular position in the middle of the Mediterranean Sea and its complex and steep orography, quite regularly experiences severe precipitation events and is particularly prone to significant ground effects (Federico et al., 2003a, 2003b). According to Avolio and Federico (2018), severe precipitation events over Calabria can be classified into short-lived events, lasting less than 24 hours, and long-lived events. Following this classification, two case studies that occurred in 2015 are considered in this paper, the former characterized by convective, very localized precipitation (August 11-12) and the latter by more persistent and widespread stratiform precipitation (October 30-November 2).
The meteorological-hydrological forecasting chain is based on the WRF-Hydro modelling system.
The paper is organized as follows. Section 2 describes the study area, the two events analysed, the numerical model and its setup with details on space and time resolutions of the boundary conditions. In Section 3 the results of the meteorological and hydrological outputs are analysed separately for the two events. Finally, Section 4 discusses and summarizes the main findings and outlines future research lines.

Study area and events description
Location of Calabrian peninsula in the centre of the Mediterranean as well as its complex orography entails a very irregular
The first high impact event (case study 1) was very localized in space and time and hit the north-eastern part of the region on the morning of 12 August 2015. The analysis at the synoptic scale (Figs. 1a-f) shows that in the early hours of 12 August 2015 a main low pressure system coming from the Atlantic moved over the French and Spanish coasts, while over the central Mediterranean a cut-off low occurred, giving rise to a new, smaller low pressure vortex that caused intense local rainfall. The observed precipitation patterns (Fig. 1g) involved only small areas in the mainland, specifically the territory of the Corigliano and Rossano municipalities. The data provided by the Italian National Radar Network (integrated in the same map of Fig. 1g), though underestimating ground observations, show that most of the precipitation occurred over the Ionian Sea. The Corigliano rain gauge measured high rainfall values (Fig. 1h)
Fig. 2i). Such catchments are chosen because they are two of the biggest with available observations of water levels (unfortunately no discharge data are available) and are located in the north and the south, respectively, of the rainiest area. Specifically, Chiaravalle Centrale station is located at the Ancinale River outlet.

WRF
The Advanced Research WRF (ARW) Model, version 3.7.1, is used in two one-way nested domains (Fig. 3). A summary of all simulations carried out is reported in Table 2.

WRF-Hydro
In this work, WRF-Hydro version 3.0 is used in one-way mode. Therefore, the atmospheric model outputs are used as input of the hydrological model using an hourly time step. According to the WRF parameterization, the Land Surface Model
No observed discharge or flow depth data are available for case study 1, hence model calibration is not performed. In case study 2, model calibration is performed manually with respect to the available water level data for the two selected catchments (Ancinale and Bonamico), with the aim of reproducing the timing of the hydrological responses to heavy precipitation and, mainly, of correctly simulating the peak flow time, which is a paramount variable for civil protection activities. The humidity and temperature conditions in the 4 soil layers at the beginning of the analysed event (
precipitation fields are achieved merging hourly ground-based rainfall observations with hourly radar data estimates provided by the Italian weather radar network managed by the National Department of Civil Protection. The merging procedure follows Sinclair and Pegram (2005) with the difference that, instead of a double kriging interpolation, a simpler double IDW interpolation method is used. The merging technique yields an increase of the total "observed" rainfall volume, with respect to a simple IDW interpolation, of +4.6% over the Ancinale River and +10.6% over the Bonamico Creek.
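The gauge-radar merging described above can be illustrated with a minimal sketch of conditional merging in which both interpolation steps use inverse distance weighting (IDW). This is an illustration under stated assumptions and not the authors' implementation: the function names (`idw`, `merge_gauge_radar`), the IDW exponent of 2 and the clipping of negative values to zero are all hypothetical choices that the text does not specify.

```python
import numpy as np

def idw(xy_known, values, xy_target, power=2.0, eps=1e-12):
    """Inverse-distance-weighted interpolation of point values onto target points.

    The exponent `power` is an assumed value; the paper does not state it.
    """
    # Pairwise distances between targets and known points: (n_target, n_known)
    d = np.linalg.norm(xy_target[:, None, :] - xy_known[None, :, :], axis=2)
    # eps avoids division by zero when a target coincides with a known point
    w = 1.0 / np.maximum(d, eps) ** power
    return (w @ values) / w.sum(axis=1)

def merge_gauge_radar(gauge_xy, gauge_rain, grid_xy, radar_grid, radar_at_gauges):
    """Conditional merging with two IDW steps (hypothetical sketch).

    1) interpolate gauge rainfall onto the grid;
    2) interpolate the radar values sampled at the gauge sites onto the grid;
    3) add the radar deviation (radar minus step 2) to the gauge field.
    """
    gauge_field = idw(gauge_xy, gauge_rain, grid_xy)
    radar_from_gauge_sites = idw(gauge_xy, radar_at_gauges, grid_xy)
    merged = gauge_field + (radar_grid - radar_from_gauge_sites)
    # Rainfall cannot be negative; clipping is an assumption of this sketch
    return np.maximum(merged, 0.0)
```

By construction the merged field reproduces the gauge values at the gauge locations, while the radar contributes the spatial structure between gauges.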
The parameters involved in the calibration procedure are broadly the same used in previous studies with WRF-Hydro (e.g.,
The calibrated parameters are shown in Table 3, while resulting hydrographs are shown in Fig. 4. The more impulsive behaviour of the Bonamico Creek, typical of Calabrian "fiumare", is simulated through lower values of the infiltration factor and lower soil layer thickness. Nevertheless, in order to allow timely simulation of peak flows, a small delay of the initial response is necessary through an increase in the RETDEPRTFAC value, which is compatible with noteworthy initial ponding in the wide alluvial bed and infiltration in the gravelly soil. On the other hand, the abundance of organic matter in the soils of the dense forests within the Ancinale River catchment, which especially in autumn can store considerable quantities of water, most probably contributes substantially to the smoother response of the Ancinale River.
As for the hydrographs (
further validated by on-field sample measurements. One-dimensional steady flow simulations reaching observed peak heights provide peak discharges broadly comparable to the results achieved with the model. For the sake of brevity, hereafter the WRF-Hydro hydrographs calibrated using observed precipitation fields shown in Fig. 4 will be referred to as 'observed hydrographs' or simply 'observations'. Figure 5 shows the skin SST evolution for two specific points 1 and 2 in the Ionian Sea (whose exact location is given in Fig. 3b) for the whole 48-hour simulation period (from 11 August 00:00 UTC to 13 August 00:00 UTC). Furthermore, panels in Figure S1 show, for all simulations carried out in this case study, the skin SST fields in the Domain D02 from 11 August 18:00 UTC to 12 August 18:00 UTC with a time step of six hours. Consistent with the generally small differences identified in the SST fields, Figure 6 clearly shows that ingesting high-resolution SST information provides, in terms of spatial distribution of accumulated precipitation, much less relevant (and partially chaotic) effects than changing initial and boundary conditions or using data assimilation schemes, and a minor or possibly opposite impact on the accuracy of the simulations. Given the peculiar features of the analysed event, it makes sense to focus on the area surrounding the Corigliano gauge station. For each simulation, the graph in Figure 7a

Case Study 2
Case study 2 covers a longer period than case study 1. In this Section, forecasting skills are first assessed considering the whole 4-day length of the event. Then, in order to reduce the uncertainties due to the longer lead time, we focus on a 3-day forecast, starting on 31 October 2015.

4-day forecast (30 October - 2 November 2015)
As in the previous case study, the first analysis is devoted to skin SST fields. Fig. S2 highlights (besides the already mentioned IFS-related problem along coastlines) that in this case Medspiration values are, for the whole period, higher than both GFS and IFS native SST fields. Specifically, average differences with respect to GFS SST vary from about 0.6 to 0.8 K, while differences with respect to IFS SST fields are higher than 0.8 K (the average difference increases to about 1.5 K, if the values along coastlines are also considered). It is noteworthy that GFS also underestimates skin SST, particularly near coastlines, while, as in the previous test case, there is an overestimation off the Tyrrhenian Sea. Focusing on points 1 and 2 (Fig. 11) it is shown that: 1) both points replicate the general behaviour, with Medspiration values higher than GFS, in turn higher than IFS; 2) differences are more marked in point 1 (average values of +1.0 and +0.6 K, respectively for IFS and GFS) than in point 2 (+0.9 and +0.3 K); 3) as in case study 1, in the graph related to point 1 a sudden reduction of about 0.5 K can be observed for Medspiration, moving from 1 November to 2 November. Nevertheless, a similar abrupt change, even though less marked (about 0.2 K), is also observed for GFS on 31 October 06:00 UTC. Summarizing, this case study shows an evident skin SST increase from IFS to GFS to Medspiration. Concerning the precipitation patterns, for the aims of this study it is interesting to focus on the biggest cluster in the south-east corner of the domain (i.e., the direction from which the humid air mass comes). Moving from GFS to IFS to IFS-DA, quite independently of the change in SST fields, a shift of this cluster can be observed from north-east to south-west.
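With the aim of objectively assessing the performance of each WRF configuration, a detailed analysis using categorical scores is carried out considering ground-based observations in the Civil Protection warning areas more affected by the event (grey areas in Figure 13a), namely Cala4, Cala7 and Cala8. Among the numerous scores available in literature (for a review see, e.g., Wilks, 2006), for each zone Figure 13 shows results concerning the Frequency Bias Index (FBI) and the Equitable Threat Score (ETS):

$$\mathrm{FBI} = \frac{\mathrm{hits} + \mathrm{false\ alarms}}{\mathrm{hits} + \mathrm{misses}}$$

$$\mathrm{ETS} = \frac{\mathrm{hits} - \mathrm{hits}_{\mathrm{random}}}{\mathrm{hits} + \mathrm{misses} + \mathrm{false\ alarms} - \mathrm{hits}_{\mathrm{random}}}, \qquad \mathrm{hits}_{\mathrm{random}} = \frac{(\mathrm{hits} + \mathrm{misses})(\mathrm{hits} + \mathrm{false\ alarms})}{\mathrm{hits} + \mathrm{misses} + \mathrm{false\ alarms} + \mathrm{correct\ negatives}}$$

In the previous equations, the terms hits, misses, false alarms and correct negatives refer to a typical 2 × 2 contingency table.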
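As a minimal sketch of how such categorical scores are typically computed, the snippet below builds the 2 × 2 contingency table for one precipitation threshold and derives the FBI and the Equitable Threat Score (ETS) in their standard textbook form. The function names and the use of a "greater than or equal to" exceedance test are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

def contingency(forecast, observed, threshold):
    """Count hits, misses, false alarms and correct negatives for one threshold."""
    f = np.asarray(forecast) >= threshold   # event forecast
    o = np.asarray(observed) >= threshold   # event observed
    hits = int(np.sum(f & o))
    misses = int(np.sum(~f & o))
    false_alarms = int(np.sum(f & ~o))
    correct_negatives = int(np.sum(~f & ~o))
    return hits, misses, false_alarms, correct_negatives

def fbi(hits, misses, false_alarms):
    """Frequency Bias Index: forecast event count over observed event count."""
    denom = hits + misses
    return (hits + false_alarms) / denom if denom else float("nan")

def ets(hits, misses, false_alarms, correct_negatives):
    """Equitable Threat Score: threat score corrected for random hits."""
    total = hits + misses + false_alarms + correct_negatives
    if total == 0:
        return float("nan")
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else float("nan")
```

In practice the two functions would be evaluated for every threshold and every 6-hour interval, pooling the rain gauges of each warning area into the `forecast` and `observed` arrays.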
The FBI indicates whether the forecast system tends to underestimate (FBI<1) or overestimate (FBI>1) event frequency, while the ETS measures the fraction of correctly predicted events, adjusted for hits associated with random forecasts, and ranges from -1/3 to 1 (perfect score). Both scores are used for consecutive 6-hour time intervals for the period 31 October - 2 November (which is the actual rainy period for the analysed warning areas), using precipitation thresholds with a step of 0.
All simulations performed for this case study show that the greater energy supplied to the system by the higher skin SST Medspiration fields affects the flow dynamics of the lower layers, allowing more transport but not accelerating it. This behaviour can be attributed to the long-lasting characteristics of the event that, developing at a wider scale than case study 1 and providing humid air continuously, smooths potential differences in terms of timing.
Assessing the hydrological impact in the two selected catchments is more interesting in this case study, because all simulations forecast heavy rain over the catchment areas of the Ancinale River and Bonamico Creek, yet it is still challenging, because reliable hydrological forecasts require accurate QPFs at the catchment scale. A QPF performance analysis is carried out for the catchment areas, considering the average values of the interpolated precipitation fields. The simulated average precipitation over the Ancinale River catchment is strongly overestimated by all the IFS-based simulations (from +53% to +72%), while GFS-based simulations provide much more reasonable biases (+12% and -1%,
Concerning the Bonamico Creek, IFS-based hydrographs are not well correlated and forecast peak flows more than 12 hours in advance with respect to observations, while GFS-based hydrographs substantially underestimate.
The analyses performed show the great uncertainty of QPF at the catchment scale, due to many sources of errors and uncertainties that can be amplified by the 4-day forecast window. In order to reduce the sources of uncertainty and highlight the possible emergence of positive effects due to the more detailed representation of the SST, a further analysis is carried out with a forecast window reduced to 3 days, from 31 October to 2 November.

3-day forecast (31 October - 2 November 2015)
The main change produced by the 3-day forecast with respect to the 4-day forecast is the closer agreement between the GFS-based and the IFS-based simulations. Specifically, Fig. 18 highlights that the GFS-based rainfall footprints located in the south-east of the domain D02 reach the Calabrian Ionian coast further south than in the 4-day simulation (Fig. 12
Performance evaluation with categorical scores against ground-based observations is also repeated for the 3-day simulations with the same warning areas Cala4, Cala7 and Cala8 (Fig. 19). Despite the improvements with respect to the 4-day
the observed of about 20%, but the correlation between simulated and observed hydrographs is high (0.89) and the observed peak flow time (1 November 16 UTC) is delayed by only 2 hours. Generally, all the IFS-based simulations are well correlated (r values always higher than 0.6) even though peak flow time is always delayed (up to 12 hours). GFS-based simulations are poorly correlated and show significant overestimation and early forecast of the peak flow.
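The hydrograph comparisons above rest on three simple quantities: the correlation between simulated and reference hydrographs, the shift in peak flow time and the relative error of the peak. A minimal sketch of how they could be computed is given below, assuming two equally spaced series covering the same time window; the function name `hydrograph_scores` and the use of the Pearson coefficient for r are assumptions of this example, since the text does not specify them.

```python
import numpy as np

def hydrograph_scores(sim, obs, dt_hours=1.0):
    """Compare a simulated hydrograph with a reference one (hypothetical helper).

    Returns the Pearson correlation, the peak time shift in hours
    (positive when the simulated peak is late) and the peak error in percent
    (positive when the simulated peak is too high).
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    peak_delay_h = (int(np.argmax(sim)) - int(np.argmax(obs))) * dt_hours
    peak_error_pct = 100.0 * (sim.max() - obs.max()) / obs.max()
    return r, peak_delay_h, peak_error_pct
```

With hourly WRF-Hydro output, `dt_hours=1.0` would make the second returned value directly comparable to the delays quoted in the text.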

DISCUSSIONS AND CONCLUSIONS
The results achieved in this study do not provide univocal indications and need to be carefully analysed. Table 4 aims at supporting the discussion by summarizing the main outcomes concerning: 1) representation of the skin SST fields; 2) accumulated precipitation values in the internal domain and the related spatial distribution; 3) time distribution of precipitation; and 4) hydrological impact (hydrograph shape, total discharge, peak flow times), depending on: 1) GCM choice for determining the boundary conditions; 2) use of the 3DVAR scheme; 3) use of the high-resolution Medspiration fields.
The most evident outcome across the case studies, yet far from surprising, is that the choice of the GCM providing boundary conditions is, comparatively, the most relevant factor affecting the simulations. Specifically, for the case studies analysed, GFS-based simulations generally perform worse than IFS-based ones (this difference is emphasized if the forecast time window is increased, as case study 2 demonstrates). Of course, this result cannot be generalized, given the small number of events involved and the lack of further analyses (e.g., evaluation of different parameterizations). For example, for case study 2, through detailed sensitivity tests Avolio et al. (2018) found that simulations forced by GFS have better performance than those forced by ECMWF. Nevertheless, for the purpose of this study it is shown that the different features differentiating the two GCMs (among them, the spatial resolution, which is improved with IFS) can considerably affect precipitation fields calculated through dynamical downscaling, comparatively more than using three-dimensional variational data assimilation methods or imposing specific (high-resolution) skin SST boundary conditions. The use of the 3DVAR scheme in this study has to be considered mainly as a strategy for improving initial conditions. Several studies adopted data assimilation approaches for achieving improvements for shorter forecast periods than the 48 to 96 hours used in this study (e.g., Sun
with previous studies, demonstrating that the effects of data assimilation do not lead to an effective improvement in the case of highly convective events (Liu et al., 2013). In case study 2 it is noteworthy that IFS-O is able to provide in some warning areas, especially in the 4-day forecasts, better ETS values than IFS-DA-O, meaning that sources of uncertainty other than initial conditions can strongly affect forecasting skills.
Among those uncertainties, representation of SST conditions can be important, given that, in general, IFS-DA-M (i.e., the simulation including both data assimilation and improved SST representation) provides better performances.
Unlike the 3DVAR scheme, the effects of improved SST representation on forecasts are emphasized to the maximum in this study, given that observed rather than forecasted SST fields are used as lower boundary conditions in the simulations, thereby providing a kind of "upper limit" to the effects provided by well forecasted SST fields. The foundation SST fields used (defined as the temperature of the water column free of diurnal temperature variability, Donlon et al., 2007) are produced by the Medspiration project once every 24 hours, but the diurnal cycles are ensured by the sst_skin option. They especially improve the SST fields provided by IFS boundary conditions that, even allowing better forecasts than GFS, show very evident problems along the coastlines. The forecast periods analysed in this study largely overcome the problem highlighted by Cassola et al. (2016), who found that for forecasting ranges shorter than 36-48 hours the forced ingestion of high-resolution SST fields can be counterproductive, due to the relatively slow adjustment of initial atmospheric fields.
Improved SST fields often, but not always (and not always significantly), provide enhanced forecast performances with respect to the corresponding simulations with native SST fields. Especially in case study 1, the effects close to the Corigliano rain gauge seem to be somehow linked to generally chaotic behaviour. As discussed in the case of improved initial conditions (i.e., the 3DVAR scheme), these outcomes are related to the fact that sources of uncertainty other than SST representation hinder enhanced forecast skills. Nevertheless, using more realistic SST fields leads to sufficiently clear changes in the simulation of the atmospheric boundary layer dynamics in both case studies, especially with respect to the configurations with clearly unrealistic fields (i.e., IFS lower boundary conditions). Specifically, in the summer (shorter, convective, highly localized) event higher SST values along the coastlines accelerate flow dynamics, moving humid air faster towards the coast and bringing precipitation forward (thus agreeing with the results achieved by Stocchi and Davolio, 2017). On the other hand, in the autumn (longer, caused by a frontal system, widespread) event the higher energy supplied to the system by a continuously warmer sea surface leads to a generalized increase of precipitation amount that, however, does not substantially change either the spatial pattern or the timing of the event. The lack of change in timing is most probably due to the fact that the stability of the atmospheric boundary layer and the related flow dynamics in case study 2 depend more on large-scale (synoptic) conditions than on local factors (which is possibly the same reason why, on the other hand, the 3DVAR scheme is able to influence case study 2 more than case study 1). Such large-scale conditions can lead to much stronger winds than in case study 1 (as is evident comparing Figs. 8 and 16).
can be 'doubly' reversed.
For example, both bias analyses and Taylor diagrams related to the Ancinale River Basin (Figs. 17 and 20) highlight better QPF performances for GFS-based simulations, not so obvious (or even not found) in the analysis of simulation skills on a larger scale. Nevertheless, IFS-based hydrographs are better correlated with those calculated with observed rainfall and peak flow times are closer to the observed ones (it is worth recalling that a quantitative discharge analysis is less significant in this case, given that only water level observations are available). Contrary to what was found by Yucel et al. (2015), streamflow simulations are not particularly improved by initial data assimilation. Most probably, this result is due to the relatively long forecast periods (from 72 to 96 hours). Indeed, in the 3-day forecast the benefits of the improved initial conditions partially come to light in both catchments even though, interestingly, the best simulation (even if only slightly) is IFS-DA-O, i.e. the one using the 3DVAR scheme but not the Medspiration SST fields, meaning that at the high-resolution scale of the catchments analysed there are so many sources of uncertainty that the added value provided by improved SST fields is hidden.
Summarizing, the results achieved in this study show that none of the different versions of the forecasting chain adopted is able to achieve accurate quantitative precipitation and (consequently) streamflow forecasts in all the analysed cases, yet several interesting clues are provided. Specifically, similar to past studies it is shown that the improved representation of SST fields can significantly change the simulation of the atmospheric boundary layer processes, modifying flow dynamics and/or the amount of precipitated water. Nevertheless, the potentially positive impact of improved SST fields can be easily hidden by several other sources of uncertainty (mainly, the choice of the GCM providing boundary conditions).
Further improvements in both GCMs (e.g., the higher-resolution IFS cycle since March 2016) and RCMs will reduce uncertainties, highlighting more clearly the need for improved SST representation in regional modelling. Emerging approaches like regional-scale fully-coupled ocean-atmospheric (e.g., within the Baltic Sea Experiment - BALTEX, Gustafsson