Results from a full coupling of the HIRHAM regional climate model and the MIKE SHE hydrological model for a Danish catchment

. A major challenge in the emerging research ﬁeld of coupling of existing regional climate models (RCMs) and hydrology/land-surface models is the computational interaction between the models. Here we present results from a full two-way coupling of the HIRHAM RCM over a 4000 km × 2800 km domain at 11 km resolution and the combined MIKE SHE-SWET hydrology and land-surface models over the 2500 km 2 Skjern River catchment. A total of 26 one-year runs were performed to assess the in-ﬂuence of


Introduction
Combined modelling of atmospheric, surface and subsurface processes has been performed in a broad range of studies over the years utilizing increasingly complex model codes.For example, by adding more complex process descriptions in the hydrological component of the Lund-Potsdam-Jena vegetation model (LPJ GUESS), more realistic global reproductions of evapotranspiration and run-off is achieved as compared to an offline hydrological model (Gerten et al., 2004).It is further argued that the combination of hydrology and vegetation processes may account for rising CO 2 levels not simulated using hydrological models alone.Similarly, Yan et al. (2012) successfully simulate global Published by Copernicus Publications on behalf of the European Geosciences Union.evapotranspiration using the energy based vegetation and water balance land-surface model ARTS E, while Anyah et al. (2008) show a direct connection between soil moisture and simulations of evapotranspiration over western North America, where soil water is a limiting factor, using the coupled RAMS-Hydro model.Several studies deal with the influence of surface hydrology, vegetation and land use change on atmospheric processes.Seneviratne et al. (2006) show that land-atmosphere coupling processes are significant in representing the variability of temperature projections for 2070 to 2099 using an ensemble of climate models.Zeng et al. (2003) highlight the considerable influence of landsurface temperature and moisture heterogeneities on simulations of sensible (H ) and latent heat (LE) fluxes as well as the precipitation pattern, using the RegCM2 regional climate model (RCM).Cui et al. (2006) show a substantial change in ECHAM5 general circulation model predictions as a consequence of projected changes in vegetation.Kunstmann and Stadler (2005), Smiatek et al. (2012) and York et al. (2002) study the influence of the atmosphere on landsurface and subsurface state.Of these, York et al. (2002) use the CLASP II model with coupled aquifer-atmosphere processes for a single grid box to study the response of groundwater levels to climate forcing.
Current climate models include only a simplistic surface and subsurface description of hydrology processes.Similarly, hydrological models generally include atmospheric processes in a surface-near layer in the scale of metres.More recent studies have therefore focused on combining model codes that each represents a component in the total simulation of atmospheric, land-surface and subsurface processes as well as ocean processes.Of these, a few studies have focused on coupling a mesoscale atmospheric model with a combined land-surface and hydrological model.Maxwell et al. (2007), for example, study the coupling of the ARPS mesoscale atmospheric model (Xue et al., 2000(Xue et al., , 2001) ) and the ParFlow hydrological model (Kollet and Maxwell, 2008) for a 36 hour period over the Little Washita catchment in Oklahoma, USA, showing a high degree of soil moisture influence on the boundary layer development.In Maxwell et al. (2011) the ParFlow hydrological model, also including subsurface flow, is coupled with the WRF atmospheric model (Skamarock et al., 2008) and the NOAH land-surface model (Ek et al., 2003) for 48 h idealized and semi-idealized runs, emphasizing the applicability of the model set-up in integrated water resource studies.Also using the WRF and NOAH models, Jiang et al. (2009) couple these with the SIMGM groundwater model highlighting the importance in proper energy flux and soil moisture signal from land-surface for reproduction precipitation over central USA.A recent study utilizes a fully dynamic coupling of the COSMO atmospheric model, the CLM3.5 land-surface model and the ParFlow hydrology model for a 1 week summer period (Shresta et al., 2014) indicating slight improvements for surface energy fluxes for the distributed model system as compared to 1-D columns.
COSMO further has the advantage of being non-hydrostatic and therefore able to resolve convective processes.Klüpfel et al. (2011) use COSMO in 2.8 km resolution over western Africa and demonstrate a high degree of soil moisture influence on simulated precipitation for a convective event.Furthermore, a few recent studies couple atmospheric models in climate mode, i.e. performing longer-term simulations at larger spatial scales.Rasmussen (2012) studied the HIRHAM RCM (Christensen et al., 2006) and the MIKE SHE hydrological model (Graham and Butts, 2005) with the SWET land-surface scheme (Overgaard, 2005) in one-way coupled mode, where output from the RCM is transferred to the hydrological model over the FIFE test domain in Kansas, USA, for the period May to October 1987.In that study, data are exchanged over an area represented by a single 0.125 • HIRHAM grid cell.In two more recent studies, the MM5 regional climate model and the PROMET land-surface model (Zabel and Mauser, 2013), and the CAM atmosphere model and the SWAT hydrology model (Goodall et al., 2013) have been coupled.
A comprehensive two-way coupling between the HIRHAM RCM and the MIKE SHE hydrological model combined with the SWET land-surface model for the 2500 km 2 Skjern River catchment in Denmark has recently been established by Butts et al. (2014) and used for a 1-year simulation.To our knowledge, no previous studies have been reported on annual simulations employing couplings between a distributed RCM and a full 3-D groundwater-surface water hydrological model for catchments larger than a single RCM grid point.A limitation of the study of Butts et al. (2014) is the need to understand the influence of the data transfer interval (DTI) between the two models, an issue which has also not been reported in previous studies.Also, in Butts et al. (2014) only a limited part of the full RCM domain is replaced by the local hydrology model land-surface scheme which could lead to local physical discontinuities.Another crucial issue, when systematically evaluating climate model results, is the inherent model variability where minor changes to the model set-up, induced either by artificially perturbing initial conditions (Giorgi and Bi, 2000) or by altering the domain location (Larsen et al., 2013) result in significant variations in the simulated atmospheric variables.Giorgi and Bi (2000) show for regions in China that especially during the summer, and for high precipitation events, precipitation is highly sensitive to perturbations in the initial and boundary conditions.Similarly, Alexandru et al. (2007) used the Canadian RCM (CRCM; Caya and Laprise, 1999) over five domains with twenty perturbed runs for each domain to assess model variability in precipitation.They found at least 10 ensemble members were needed to reproduce the correct seasonal means, although this number is dependent on the domain size.
In this paper we study the interaction and feedback mechanisms between the atmosphere and the land surface by Hydrol.Earth Syst.Sci., 18, 4733-4749, 2014 www.hydrol-earth-syst-sci.net/18/4733/2014/ two-way coupling of proven climate and hydrology models each operating in an environment where the other model components deliver high quality boundary conditions using the same set-up as Butts et al. (2014).Our hypothesis is that the inclusion of feedback will provide a significantly changed signal when compared to uncoupled simulations.In addition, the current study aims to evaluate the influence of the DTI between the two models since this strongly influences computation time, and to evaluate the importance of the internal HIRHAM model variability by assessing the sensitivity of the simulation results to perturbations of boundary and initial conditions.

Study area
The climate and hydrological models used in this study each cover areas typical of their application range.The HIRHAM regional climate domain model covers an area of approximately 2800 km × 4000 km from north-west Iceland to southern Ukraine (Fig. 1).Approximately 60 % of the latitudinal stretch is located west of the Skjern catchment where most local weather systems originate.The MIKE SHE model set-up covers the Skjern catchment area of 2500 km 2 (Fig. 1) located in the western part of the Jutland peninsula.
The data exchange between the models occurs at the overlapping grid cells with the hydrological catchment nested within the climate model domain (Fig. 1).Skjern River emerges in the central Jutland ridge at approximately 125 m a.s.l. and has its outlet into the Ringkøbing fjord.The Jutland ridge has a maximum elevation of approximately 130 m.Two general soil classes can be distinguished within the catchment: sandy soils generated by the Weichsel ice age glacial outwash and till soils from the previous Saalian ice age.The catchment land use is divided between 61 % agriculture, 24 % meadow/grass/heath, 13 % forest and 2 % other.For the period 2000-2009 the average annual measured precipitation was 940 mm, which when corrected for turbulence related gauge undercatch (Allerup et al., 1998) amounts to 1130 mm yr −1 .The mean annual air temperature for the same period was 9.3 • C.

Observed input and validation data
Measurements from three sites having flux towers, placed over agricultural, meadow and forest surfaces, respectively, are used for calibration of the hydrological model (Fig. 1) as described in Larsen et al. (2014).At these locations we have measurements of LE, H and soil heat fluxes (G), radiation components, soil/air temperature, precipitation, wind speed, soil moisture and groundwater table depth.LE and H are measured above the vegetation using eddycovariance sonic anemometers and G is measured using Hukseflux plates at 5 cm depths.LE and H are gap-filled and corrected according to data quality using Alteddy software 3.5 (Alterra, University of Wageningen, the Netherlands), as described in Ringgaard (2012).Up to 45 % of the data is replaced.For the periods 21 July-16 August and 24 August-28 October 2009, no data were recorded at the agricultural site and were therefore replaced by data from the forest site (Ringgaard et al., 2011).Discharge measurements (Q) from the three discharge stations Ahlergaarde (1055 km 2 ), Soenderskov (500 km 2 ) and Gjaldbaek (1550 km 2 ) were also used for calibrating the hydrological model (Larsen et al., 2014), and in the present study for point validation (Fig. 1).
To drive the MIKE SWET module six climatic variables are needed.Daily precipitation (PRECIP) data are derived from gauge stations and interpolated by kriging to a 500 m grid as described in Stisen et al. (2011a).The precipitation data are dynamically corrected for gauge undercatch (Allerup et al., 1998;Stisen et al., 2011b).The remaining five variables: air temperature (T a ), wind speed (V ), relative humidity (RH), surface pressure (P s ) and global radiation (R g ) are based on measurements from climatic stations.The data have been interpolated in space and time to produce hourly data sets at a 2 km resolution (Stisen et al., 2011b).For the assessments made here, these six distributed variables have been bi-linearly interpolated to match the exact grid of the HIRHAM set-up allowing for grid-by-grid calculations.

MIKE SHE
In the present study we use the MIKE SHE hydrological model that represents all key hydrological processes in the land-surface part of the hydrological cycle such as evapotranspiration, snow melt, channel flow (the MIKE 11 www.hydrol-earth-syst-sci.net/18/4733/2014/ Hydrol.Earth Syst.Sci., 18, 4733-4749, 2014 component), overland flow, unsaturated flow, saturated flow, as well as irrigation and drainage (Graham and Butts, 2005).
The SWET component is included to handle the vegetation and energy balance processes occurring in the landsurface interface from the root zone and into the lower atmospheric boundary layer (Overgaard, 2005).The SWET model is based on a two-layer system with resistances for both soil and canopy, as presented in Shuttleworth and Wallace (1985), but modified to include energy fluxes from ponded water and vegetation interception storage (Overgaard, 2005).A limitation to the current SWET model is that snow accumulation/melt is not yet included, which may be important under Danish conditions.
In the current set-up, the MIKE SHE model is derived from the Danish national water resources model (DK-model) (Stisen et al., 2011a(Stisen et al., , 2012;;Højberg et al., 2013) at 500 m resolution.The model set-up includes 11 computational layers in the groundwater system and an extensive river network, and is implemented with a basic (maximum) time step of 1 h, which is reduced dynamically during precipitation events.

HIRHAM
The climate model used in the present coupling study is the HIRHAM regional climate model version 5 (Christensen et al., 1996(Christensen et al., , 2006)).HIRHAM is based on the atmospheric dynamics from the HIRLAM model used for operational weather forecasting (Undén et al., 2002) and physical parameterization schemes from the ECHAM5 general circulation model (Roeckner et al., 2003).HIRHAM is a hydrostatic model and typically implemented in resolutions of 5-50 km, here applied at a resolution of 11 km on a rectangular grid.The HIRHAM model is here driven by ERA-Interim reanalysis data as lateral boundary conditions (Uppala et al., 2008), and the internal model time step is 120 s.The derivation of the domain is described in Larsen et al. (2013).

Coupling code
A challenge in developing the coupling code used for this work is that the MIKE SHE and HIRHAM models operate on different computing platforms, i.e. a Windows workstation and a highly parallelized Linux supercomputing facility, respectively.To facilitate communication across these very different platforms, an Open Modelling Interface (OpenMI, www.openmi.org)code have been developed and used on the Windows workstation side, and MIKE SHE modified to exploit OpenMI.On the Linux side, modifications to the HIRHAM code were made and additional code controlling the data exchange developed.An OpenMI interface was installed in order to facilitate the communication between existing time-dependent model codes running simultaneously and to handle differences in time step, model domain, resolution and discretization (Gregersen et al., 2005(Gregersen et al., , 2007)).
The OpenMI and Linux/HIRHAM coupling code served four general functions: (1) to control the timing between models so that data are stored from one model waiting for the other to reach the point in time of specified data exchange; (2) to define which variables to be exchanged in both directions and to handle potential unit conversion factors, offsets and aggregation types; (3) to handle the spatial grid structure of each model and transfer the data based on a selected spatial interpolation mapping; (4) to collect and interpolate data for each separate model time step to be exchanged between models at each data exchange time step, based on the differing time steps in the two model codes, including MIKE SHE's dynamically varying time steps during precipitation events.
The exchange of data between the models are selected within the modelling scope of using the HIRHAM climate forcing as input to MIKE SHE/SWET as well as transferring energy and water fluxes in the opposite direction.The exchange of data between the models is as follows: (1) MIKE SHE receives the driving variables PRECIP, RH, V , R g , T a and P s from HIRHAM, and (2) HIRHAM receives the variables LE and surface temperature (T s ) from MIKE SHE.T s is then used to calculate H within the HIRHAM code.The spatial mapping in this study was based on a weighted mean method where each grid cell contributes relatively according to the land share fraction.
In the current version of the coupling LE and T s (and therefore H ) calculated by MIKE SHE directly replaces the corresponding variables within HIRHAM one-to-one over the shared domain, whereas outside of the domain the simple land-surface scheme embedded in the RCM is preserved.Atmospheric fields are then updated based on the modified surface energy balance from MIKE SHE.In this study no means are implemented to assure ensuing internal physical consistency of fields within HIRHAM.Therefore, effects directly related to differences in spatial and temporal scales and in the physical formulation of the land-surface scheme may be found along the boundary of the hydrological catchment.The boundary effects seen here are, however, relatively small, which again to a large degree is due to differences in spatial and temporal scales, i.e. to cell averaging and cancellation of errors when feeding the MIKE SHE surface back to HIRHAM.In this work we address primarily the effect of the temporal scale differences on the coupled system, i.e. by varying DTI.
The standard OpenMI method for data exchange is memory-based.However, due to local safety regulations for network data exchange at the location of model execution, the current set-up is constrained to the exchange of data files on a shared drive visible to both the Windows and Linux model set-ups.Naturally, this network file transfer generates a significant overhead with respect to execution time when data exchange is frequent, which by far exceeds that of the added overhead on each of the individual models.
-Coupled variability (CV): eight two-way fully coupled simulations using a 60 min DTI were performed using starting dates from 1 to 8 March 2009 as above.
-MIKE SHE data source (MSDS): to assess the influence of data sources on MIKE SHE performance two MIKE SHE simulations were performed.(1) Uncoupled mode using observed values of PRECIP, RH, V , R g , T a and P s and (2) one-way coupled mode using simulated values as driving variables based on HIRHAM model simulations with 30 min DTI and without feedback to HIRHAM.
The eight uncoupled HIRHAM runs all show varying geographical and temporal patterns of, in particular, precipitation.With these changes in precipitation, the water available for evapotranspiration and the energy balance is altered and, therefore, attention should be given to which simulations are compared.For all model runs, simulation output from HIRHAM were assessed for the six climatic variables PRE-CIP, RH, V , R g , T a and P s , since observations were available.
The same observational data were also used as input to MIKE SHE SWET for the uncoupled runs.Likewise, the output from the MIKE SHE simulations was assessed by comparing measurements of LE, H and G at the agricultural, forest and meadow sites (Fig. 1) as well as discharge measurements from three gauging stations.
Figure 2 outlines the data flow and simulation categories.As the Skjern catchment has an irregular shape, different degrees of overlap are found between the HIRHAM grid cells and the hydrological catchment (Fig. 1).Analyses of PRE-CIP, RH, V , R g , T a and P s were therefore performed for five domains that reflect these different degrees of overlap: For HIRHAM output, the evaluation was performed on all five test domains by calculating a single root-mean-square error (RMSE) value for each full model simulation.For MIKE SHE output, the RMSE was performed on the point data only.is the total number of data points.To assess the output variability from each of the three simulation groups involving HIRHAM (TI, CV and HUV), simulation box plots with the 25th and 75th percentiles including whiskers for the most extreme data were created (Figs. 5 and 8).
Similarly, the mean absolute errors (MAE) were assessed to gain more information on the expected improvements for simulations with a more frequent DTI: where the terms correspond to the RMSE calculations.The MAE calculations, for the TI simulations, were performed for each of the six HIRHAM variables over each of the five test domains and the four MIKE SHE variables at point scale.Linear trend lines, using least squares, were then fitted to the 12-120 min DTI MAE values for each of the test domains and point scale output and for each variable.The mean absolute and percentage change in MAE, based on the trend lines from the 120 min to the 12 min data points, were then calculated.Also, correlation coefficients on the basis of the trend lines were calculated to detect statistical significance at a 95 % two-tailed level.
The HUV and CV simulation groups apply the same changes in initial conditions by using different start dates to perturb these initial conditions but differ by having different land-surface schemes over the Skjern catchment.These simulations were therefore used to test for statistical significance of the coupling.A simple two-sample t test was performed for each of the test domains and variables for the HUV and CV simulations to test the hypothesis of these simulation groups having unequal means.

Data transfer interval (DTI)
Of the six HIRHAM output variables, the four variables of PRECIP, RH, V and T a show a significant decrease in RMSE with decreasing DTI in the fully two-way coupled mode simulations, whereas P s is less affected and R g is unaffected (Fig. 3).Based on the linear trend line averages between the domains, RMSE improvements of 1.1 mm day −1 , 1.1 %, 0.2 m s −1 and 0.3 • C are seen for PRECIP, RH, V and T a , respectively (Table 2) of 0.3 mm day −1 , 0.8 %, −0.1 m s −1 and 0.2 corresponding to a change from the 120 to the 12 min simulations of 7.2 % averaged for the four significant variables (Table 2).For the variables with statistically significant trends, PRECIP, RH, V and T a , there is a specific order in the resulting RMSE trend line locations with the largest RMSE values for Dom1, Dom2 etc., decreasing down to Dom5.The execution time for the coupled set-up, as a function of DTI, is shown in Fig. 4.Only a moderate increase in execution time is seen in the range of 60-120 min DTI values whereas a sharp increase is seen from DTI values of around 15-30 min.

HIRHAM model variability
Figure 5 shows the output variability for each of the TI, CV and HUV group runs for each of the five test domains, Dom1-Dom5.For PRECIP, RH, V and to some extent T a , the largest variability is seen for the two-way coupled runs (TI).The RH and V , using a 60 min DTI, for both the CV and HUV runs show almost negligible variability.For PRECIP, the CV variability is greater than for HUV whereas the opposite is the case for T a with a larger variability in the HUV simulations.For the variables, PRECIP, RH, V and T a , a general decrease in RMSE is seen for the coupled TI and CV simulations with increasing test domain number from Dom1 to Dom5.For the HUV simulations, this pattern is seen, to some extent, for PRECIP only.The R g and P s variables show comparable levels of variability between the TI, CV and HUV simulation groups.For R g , the RMSE values increase with test domain number, whereas the opposite is the case for P s .When comparing the influence of variability with the influence of DTI it is seen that the range in RMSE values from the perturbation induced HUV variability corresponds to 47 % of the RMSE improvement for the TI simulations when going from 120 to 12 min (based on the linear trend lines).The corresponding number when comparing TI with CV is 46 %.Two-sample t tests confirmed the hypotheses that the results from the HUV and CV simulations belong to two separate populations for the variables PRECIP, RH, V and T a with significance levels of 98.2 % or above.For these four variables, there was a clear pattern of decreasing significance with increasing test domain number corresponding to a lesser degree of coupling.1.9 6.9 0.9 -1.9 4.5 Figure 6 shows the simulated PRECIP for each run of the TI, HUV and CV simulation groups and for each test domain.PRECIP is seen to decrease with increasing domain number for all three simulation groups, as well as for observations.This decrease is strongest for the two-way coupled TI and CV simulation groups, which also show the highest PRECIP levels compared to the uncoupled HUV simulations.Compared to the observed PRECIP mean over the five test domains of 892 mm over the simulation period, both the TI and CV simulations consistently overestimate PRE-CIP with accumulated values of 1004 and 1027 mm, respectively.In contrast, the HUV underestimates the PRECIP for this period, with an accumulated value of 868 mm.Despite generally overestimating the rainfall, the coupled TI runs, with high frequency DTIs and a high degree of coupling (Dom1-Dom3), show better estimates of accumulated rainfall compared to uncoupled run (CV).With regard to timing, there is a tendency for the main part of the TI simulation variability to arise from events in the fall months of 2009, whereas most of the HUV and CV variability occurs in early 2010 events.
In addition to comparing simulation statistics and precipitation accumulation plots, the HIRHAM output variables for all 24 TI, HUV and CV simulations are plotted in Fig. 7.This figure shows hourly values for the period 10-17 July 2009, with the exception of precipitation data which are given as daily values for all of August 2009.The 1-week period was chosen to reflect high dynamics in the peak summer period, whereas the 1-month period of august showed more precipitation compared to July.A large spread is seen for precipitation amounts on individual days that appears to increase with the mean intensity, most pronounced on 10 and 20 August.Reasonable agreement is seen between these simulations in terms of capturing the dry days.For the remaining five variables, RH, T a , P s and especially V and R g , the period with low pressure and precipitation, 10-12 July, exhibits a fair amount of spread between the individual simulations, whereas the remaining period, 13-17 July, shows a higher degree of consistency within each simulation group (TI, HUV and CV) especially in terms of dynamics.For the PRECIP, RH, V and T a variables the coupled simulations groups of TI and CV clearly deviate from the HUV simulations in terms of the timing, dynamics and absolute levels.Of these, the most noticeable difference is the daytime RH and night-time T a , which are notably higher and lower, respectively, for the HUV simulations.

MIKE SHE output
As for the HIRHAM simulations, the MIKE SHE RMSE results are plotted as a function of DTI (Fig. 8).LE shows a general improvement in RMSE with a higher frequency of exchange (smaller DTI), which is strongest for the agriculture and forest sites.Correlation coefficients between RMSE and DTI of 0.83, 0.55 and 0.13 are found for the agriculture, forest and meadow sites respectively.Conversely, H shows general decreases in RMSE with increased DTI and with correlation coefficients of −0.80 to −0.83.The changes in LE and H thereby represent opposing signals which could be expected, to some degree, from the conservation of the energy balance.No clear trend between DTI and RMSE results is seen for both G and Q and the corresponding correlation coefficients are generally low.
For LE, an absolute improvement of 1.9 W m −2 in both MAE and RMSE is seen from the 120 to 12 min trend line average data points corresponding to 6.9 and 4.5 for MAE and RMSE respectively (Table 2).Overall the one-way coupled and uncoupled MSDS simulations are superior to the TI simulations with the exception of agricultural LE and G and meadow G. observation data, especially for night-time LE and G, than the TI and CV runs.It should be pointed out that for this analysis (Fig. 9), although results are extracted from three single MIKE SHE cells (for meadow, forest and agriculture), the forcing data are based on either 11 km resolution HIRHAM data input (TI and CV) or 10 km observation gridded data (station interpolated -MSDS), which can be expected to smooth out local features.

Discussion
The motivation for performing this coupling study is to include the land-surface-atmosphere interactions between the RCM and the hydrological model.Our hypothesis is that the RCM will benefit from the more detailed representation of the surface and subsurface processes provided by the dedicated hydrological model as compared to the much simpler land-surface schemes that climate models usually rely on.Similarly, we expect that the hydrological model would benefit from the better representation of the horizontal redistribution processes in the atmosphere offered by dynamic coupling with the climate model.

Performance of coupled versus uncoupled model
It is not surprising that the performance of the coupled model simulations (TI and CV), when compared to hourly values of RH, V and T a and daily PRECIP, is generally poorer than the uncoupled model simulations (HUV).Even though it is based on basic physical principles the HIRHAM RCM has been refined over the years, e.g. in terms of convective parameterization and land-surface albedo, to better reproduce observations.Moreover, the model configuration (domain extent and grid size) used here was the best performing in terms of simulating precipitation and air temperature, as well as representing the atmospheric circulation patterns (Larsen et al., 2013).Likewise, MIKE SHE SWET has been subject to rigorous inverse modelling to assess parameter values (Larsen et al., 2014).By coupling, the existing Hydrol.Earth Syst.Sci., 18, 4733-4749, 2014 www.hydrol-earth-syst-sci.net/18/4733/2014/ Figure 7.The six HIRHAM output variables assessed in the present study in the 10-17 July period (precipitation is 1-31 August to match the period in Fig. 9 with a higher dynamic in discharge) for all 24 TI, HUV and CV runs and for Dom1 (nine cell mean).The legend colouring reflects the overall simulation group (TI, HUV or CV) whereas each simulation is in the colour shade as in Fig. 6.
land-surface scheme in HIRHAM is replaced by MIKE SHE SWET over the Skjern catchment.Calibration or parameter tuning of complex models comprising several processes often introduces compensational errors (i.e.providing the right answer for the wrong reason) in the different model components in order to ensure the model fits observational data as well as possible (Graham and Jacob, 2000).When the existing land-surface scheme in HIRHAM is replaced by MIKE SHE SWET, it will inevitably provide different results likely to be poorer in terms of a hindcast assessment.We should, however, highlight that the coupled system shows benefits over the uncoupled when assessing longer-term periods such as cumulative precipitation where high frequency DTI's produce better results (Fig. 6).Also, greater accuracy in the representation of soil moisture and water available for evapotranspiration, in the coupled system, could explain these findings.In terms of future climate projections, which are typically in the range of 10-30-year integrations, this is very promising and suggests there could be potential added value in using the coupled model system.Similar results, where the added complexity when joining two existing model systems does not lead to obvious direct improvements in simulations, has also been seen in studies of coupling ocean models and atmosphere models (Covey et al., 2004).From a different perspective the fact that the hourly to daily coupled model performance in many respects is poorer, when replacing the existing land-surface scheme with a more elaborate and well-calibrated one (MIKE SHE SWET), suggests that some of the HIRHAM components could be improved.So far very few attempts have been made in formalized calibration of RCMs, and we are not aware of any study that aims at calibrating coupled hydrology-RCM models.While there is a very interesting perspective here in a formal calibration of HIRHAM, such as presented by Bellprat et al. (2012), and in learning from the coupled model to improve the HIRHAM parameterizations, this is outside the scope of the current study.
To some degree the atmospheric variables are likely to be affected by the discontinuity in model physics between HIRHAM uncoupled cells and MIKE SHE coupled cells for the present version of the modelling set-up.With the current experimental set-up it was, however, not possible to distinguish between this effect and the change in landsurface signal from MIKE SHE as opposed to the inherent HIRHAM land-surface scheme signal.Large differences in surface fluxes between neighbouring grid cells both inside and outside the coupled area are nonetheless seen, as induced by differences in vegetation, soil, topography etc., and discontinuities at the uncoupled-coupled interface are therefore not considered important.

DTI
As four out of six of the assessed climatic variables exhibit improved performance statistics with a lower DTI, the relation between computation time and DTI (Fig. 4 is highly relevant for studies over longer periods.This improved performance of the coupled set-up is constrained, however, by a corresponding increase in computation time. The general decrease in RMSE levels with lower DTI is not surprising as a more frequent update of the surface forcing from MIKE SHE will include more dynamic features in the land-surface exchange and better align with variations in the surface energy balance affecting the land-atmosphere interaction.To fully capture the higher degree of dynamics in the land-surface-atmosphere interaction and dependence during unstable atmospheric conditions, a high frequency DTI closer to the RCM time step is likely to be important.One might suspect the effect of DTI to level off when approaching the internal HIRHAM model time step of 120 s and to obtain results affected by coupling features alone.Along these lines, a more dynamic pattern is seen for most variables for days with a higher degree of cloud cover and lower R g levels (10 and 17 July; Fig. 7).Similar to this study Maxwell et al. (2011) have tested the timing of data transfer between the ParFlow hydrological model and the WRF atmospheric model in a 48 h idealized constructed set-up.The simulations were performed by using four transfer intervals of 5, 10, 60 and 360 s, where WRF used a constant time step of 5 s (non-hydrostatic model) and the time step in ParFlow varied with the TI.Good water balance results were obtained for transfer rates up to 12 times that of WRF (60 s), whereas the results for TI of 360 s deteriorated.Even though a smaller time step was used in WRF than in HIRHAM in the present study (5 s compared to 120 s), the results of Maxwell et al. (2011) correspond reasonably well to our results, where a transfer rate of 12 times that of HIRHAM would correspond to a 24 min DTI.

Impact of coupling evaluated against climate model variability
Climate models as proxies for real atmospheric conditions show considerable internal variability and the effects of introducing a full coupling therefore need to be evaluated on the basis of several simulations, e.g. the initial boundary conditions are perturbed.In some cases, the internal variability could be as large as effects introduced by the coupling of a RCM and a hydrology model.Hence, it is critically recommendable to explore variations caused by the physical changes (i.e. the coupling) as opposed to the internal climate model variation when developing coupled climate-hydrology modelling systems.
In our study, the precipitation amounts spanning 75-99 and 52-134 mm for the HUV and CV simulations respectively, exhibit a significant variability in simulated PRE-CIP simply as a result of changes in the initial conditions.This has also been shown in several other studies (Casati et al., 2004;van de Beek et al., 2011;Larsen et al., 2013), which have highlighted the importance of considering climate model variability when assessing model performance.In the present case the coupling is seen to inflate the variability of local precipitation as compared to the uncoupled climate model simulations even considering internal climate model variability.Since many climate models generally tend to underestimate the variability of local precipitation, thus providing unrealistic projections of events, such as extreme precipitation, this is again a potentially promising feature of a coupled model system, e.g. with respect to the representation of long-term trends in precipitation for longer periods (multiple years) and in future climate projections, and will be investigated in future studies.

Test domains
There is a clear tendency for increased RMSE levels from the TI simulations with a higher degree of coupling from Dom1 to Dom5 with the exception of R g results (Fig. 3).An important consideration in this regard is, however, the specific location of each of the domains within Denmark (Fig. 1).For the uncoupled HUV simulations, a similar pattern of increased RMSE values is seen in PRECIP for the same test domains as for the TI simulations.Therefore, it is not possible to directly relate the share of MIKE SHE influence on the HIRHAM simulations to the results.An additional cause of the pattern of higher RMSE levels for test domains located in central Jutland (Dom1-Dom4), as compared to the eastern Dom5, could be related to certain geographical biases in the precipitation as often seen in RCMs, including HIRHAM (Jacob et al., 2007;Polanski et al., 2010).Corresponding biases for temperature have also been found (Kjellström et al., 2007;Plavcová and Kyselý, 2011) coastline has also been shown to affect precipitation results from HIRHAM (Larsen et al., 2013) and thereby the available water affecting the energy balance budget.In this regard, the test domains Dom2 and specifically Dom3-Dom4 are located close to Ringkøbing Fjord, which might contribute to the higher RMSE levels of these compared to Dom5.

Scale of variables
An essential consideration is to assess at which spatial scale the atmospheric variables are affected by the land-surface model.The Skjern River catchment covers an area of approximately 70 km × 50 km, and our hypothesis is that areas in the proximity of the catchment and up to 25 km downstream of the catchment (in relation to the dominant wind direction) may be affected by the model coupling.This corresponds to atmospheric scales from smaller mesoscale to microscale.It could be argued, however, that the effect of the coupling, although tested on regional scales below 100 km, could likely be imposed regionally on top of larger scale atmospheric phenomena such as larger mesoscale and synoptic scale features.
In this regard it should be noted that global incoming solar radiation (R g ) which is, by and large, affected by cloud cover and therefore by upstream larger meso-and synoptic scale conditions, shows no effect of the coupling scenario, as the RMSE pattern resembles a somewhat random pattern as a function of DTI, test domain and model variability (Fig. 3).Similarly, P s would be connected with larger scale weather systems and sea surface temperatures (Køltzow et al., 2011) and is seen to be constrained, to some degree, by lateral boundary conditions (Seth and Georgi, 1998;Diaconescu et al., 2007;Leduc and Laprise, 2009) but is highly influenced by domain characteristics (Larsen et al., 2013).The variables RH, V and T a all vary on spatial scales far below the resolution of HIRHAM and even MIKE SHE and the improved results with a more frequent DTI could therefore be anticipated to some extent.Also PRECIP, in particular convective rainfall, can be seen at grid scales below the HIRHAM resolution (Casati et al., 2004).
Another potential contribution to the coupled model performance comes from the fact that HIRHAM is a hydrostatic RCM with a convective scheme close to, or at, the threshold of its minimum resolution as also suggested in Larsen et al. (2013).Although, HIRHAM has been tested at similar spatial scales previously and was found to provide reasonable results, at very fine temporal scales the hydrostatic nature of HIRHAM could arguably contribute to the degree of variability seen for precipitation, and the 11 km resolution naturally has its limits compared to newer studies utilizing atmospheric model resolutions of a few kilometres, such as Kendon et al. (2014).For hydrological studies forcing data having finer resolutions are highly beneficial (Xue et al., 2014) and must be expected even more important for regions with a complex topography and a high degree of convective precipitation.One approach to reach fine resolutions appropriate for hydrological studies is seen in Berg et al. (2012) using a range of downscaling methods to achieve a resolution of 1 km over a northern European region thereby demonstrating significant improvements for both temperature and precipitation.Conversely, the uncertainty related to events such as the location and timing of precipitation, are in general much larger than the model resolution even for very high resolution non-hydrostatic models, particularly at the time scales of climate projections (Rasmussen et al., 2012).Hence, in practical terms, the HIRHAM-MIKE SHE set-up explored in this paper represents a reasonable compromise in terms of delivering results of sufficient spatial representation for a number of problems in climate projection studies.

Perspectives for further use
Computationally, we show that it is feasible to run simulations using coupled models dedicated to different types of computing systems, in this case a high performance computer and a personal computer.Moreover, we have demonstrated that transient coupled climate-hydrology simulations at the decadal scale or longer is well within reach.The present prototype implements a number of technical decisions inherent to the computing environment available for this study and more work is needed in order to reduce computation times, e.g.implementation of a more efficient memory-based data transmission scheme as prescribed in the OpenMI standard.In its current form, the coupling approach, however, may easily be generalized to other computing environments.In terms of further model development this work suggests that several steps may be undertaken to improve the coupled model performance.While we directly link model variables in the present study using an OpenMI interface, the present framework could easily be extended by imposing empirical downscaling and bias correction methods to further improve model compatibility across time and spatial scales.

Conclusions
This study presents the performance of the fully twoway coupled set-up between the HIRHAM RCM and the combined MIKE SHE/SWET hydrological and land-surface models.In particular, the influence of the DTI between the models, the domain of coupling influence and the HIRHAM model variability, was assessed.
Of the six HIRHAM output climate variables PRECIP, RH, V and T a showed significant differences between simulations from perturbed runs of HIRHAM and perturbed runs of two-way coupled MIKE SHE-HIRHAM, as well as significant improvements in RMSE with a reduced DTI in the evaluated range of 12 to 120 min DTIs.The improvement for precipitation is highlighted with regard to the potential in the coupled set-up as this is considered one of the most Hydrol.Earth Syst.Sci., 18, 4733-4749, 2014 www.hydrol-earth-syst-sci.net/18/4733/2014/ difficult variables to simulate.The R g and P s variables were shown to have little to no impact from the coupling.Little to no improvement in the MIKE SHE output variables is seen for decreased DTI values as the improvement in LE is in the same range as the H decline.
The uncoupled and coupled HIRHAM model variability, induced by perturbing the HIRHAM runs with varying starting dates, was shown to correspond to 47 and 46 %, respectively, of the average improvements in RMSE and MAE for the four significant variables when going from a 120 min to a 12 min DTI.Similarly significant variations were seen in the simulated precipitation where the eight two-way fully coupled simulations with 12 to 120 min DTI values (TI) produced spans in precipitation during the 1 year period of 108-170 mm for the five test domains.Similarly, the uncoupled (HUV) and coupled (CV) simulations where model variability was induced by changing initial conditions showed precipitation spans of 75-99 and 52-134 mm, respectively.For all of these, the resulting span increased with a higher degree of coupling.Part of this pattern may be attributed to well-known geographical HIRHAM bias over the central Jutland ridge.The HIRHAM model variability as transferred to the MIKE SHE model in the 60 min DTI CV simulations were substantially higher for discharge than for the LE, H or G heat fluxes.
In general, the coupled modelling results (TI and CV) are poorer than the uncoupled results (HUV) when assessed on a sub-daily to daily basis, whereas longer-term precipitation is better reproduced by more frequent DTI coupled simulations.The poorer short-term coupled performance is not surprising as each of the models over the years, also prior to this study, have been separately refined (convective scheme and land-surface energy balance) or calibrated to accurately reproduce observations.These calibrations are likely to have compensated for errors in the separate and complex model components to ensure a proper data fit.We suggest that the replacement of the land-surface scheme in HIRHAM, as introduced by MIKE SHE, and the change in data input in MIKE SHE, as introduced by HIRHAM, causes this deterioration.A potential calibration of the coupled set-up is outside the time frame and scope of the present paper, however we see a great potential for further improvements.

Figure 1 .
Figure 1.Location of HIRHAM regional climate domain within Europe, MIKE SHE catchment within Denmark, three point measurement sites and location of five evaluation domains.

Figure 2 .
Figure 2. Flow chart of the data flow and analyses performed in the present study, and a legend of the variables mentioned in the study.

Figure 3 .
Figure 3. HIRHAM output RMSE statistics for each of the test domains for the coupled TI simulations.Linear trend lines are shown with RMSE as a function of DTI as well as the average trend line correlation coefficients where the significant correlations on a twosided 95 % confidence level are underlined.

Figure 4 .
Figure 4. Model execution time in hours of wall time as a function of DTI.DTI steps of 6, 9, 12, 15, 24, 30, 48, 60, 90 (eight  CV runs), and 120 min were used, whereas 6 and 9 min DTI values were extrapolated from unfinished runs.For comparison, the dashed line is the execution time for the uncoupled HIRHAM runs (HUV).Reprinted fromButts et al. (2014).

Figure 5 .Figure 6 .
Figure 5. RMSE variability for the TI, HUV and CV simulations for each of the five test domains.The dots represent the median value, the box plots represent the 25-75th percentiles and the whiskers represent the entire data range.

Figure 8 .
Figure 8. MIKE SHE output RMSE statistics for each of the three flux tower measurement sites and the three discharge stations for the TI, MSDS and CV simulations.For the TI simulations linear trend lines are shown with RMSE as a function of DTI as well as the average trend line correlation coefficients where significant correlations on a two-sided 95 % confidence level are underlined.Also, the variability of the perturbed CV simulations is shown.

Figure 9 .
Figure 9. Four MIKE SHE output variables for the period 10-17 July (discharge is 1-31 August) for the TI, CV and MSDS runs and for Dom1 (nine cell mean).The legend colouring reflects the overall simulation group (TI, CV and MSDS) and each simulation has the same colour shade as in Fig. 6.The individual flux sites are shown for LE only.Notice the y axis shifts to accommodate more sites.

Table 1 .
Simulation outline showing simulation groups, number of runs in each group and short description of simulation group characteristics.The two latter columns show from which of the two model components the simulation output derives.May 2009 to 30 April 2010 with a spin-up period from the beginning of March to 30 April 2009.A total of 26 model runs were used; in the present study they are divided into four main categories (see also Table 1): -Transfer interval (TI): eight two-way fully coupled simulations were performed by varying the DTI, between the HIRHAM and MIKE SHE models, between 12 and 120 min.These DTI values were chosen to conform to time step restrictions imposed by MIKE SHE (given in fractions of an hour) to ensure accurate process modelling and to allow for executing model runs within the time slots allocated by the supercomputing facility at the Danish Meteorological Institute.The TI runs used 1 March 2009 as the starting day.
The RMSE was calculated on the basis of hourly values of RH, V , R g , T a , P s , LE, H and G and daily values of PRE-CIP and Q against the corresponding observations for the six HIRHAM and four MIKE SHE variables: i,t SIMi, t − OBS i,t 2 n (1) where SIM and OBS are simulated and observed values respectively; i and t are location and time respectively and n www.hydrol-earth-syst-sci.net/18/4733/2014/ Hydrol.Earth Syst.Sci., 18, 4733-4749, 2014

Table 2 .
Absolute and percentage change in MAE and RMSE between the largest (120 min) and smallest (12 min) DTI based on the average value of the linear trend lines of either the five test domains (HIRHAM output) or the measurement sites (MIKE SHE output).Also shown is the absolute variability from the CV and HUV runs defined as the minimum value subtracted from the maximum for the 60 min DTI averaged between test domains (HIRHAM output) or measurement sites (MIKE SHE output) for each tested variable.