There is an urgent demand for assessments of climate change impacts on the hydrological cycle at high spatial resolutions. In particular, the impacts on shallow groundwater levels, which can lead to both flooding and drought, have major implications for agriculture, adaptation, and urban planning. Predicting such hydrological impacts is typically performed using physically based hydrological models (HMs). However, such models are computationally expensive, especially at high spatial resolutions.
This study is based on the Danish national groundwater model, set up as a distributed, integrated surface–subsurface model at a 500 m horizontal resolution. Recently, a version at a higher resolution of 100 m was created, among other reasons to better represent the uppermost groundwater table and to meet end-user demands for water management and climate adaptation. However, the increase in resolution of the hydrological model also increases the computational burden. To evaluate climate change impacts, a large ensemble of climate models was run with the 500 m hydrological model, while performing the same ensemble run with the 100 m nationwide model was deemed infeasible. The desired outputs at the 100 m resolution were instead produced by developing a novel, hybrid downscaling method based on machine learning (ML).
Hydrological models for five subcatchments, covering around 9 % of Denmark and selected to represent a range of hydrogeological settings, were run at 100 m resolution with forcings from a reduced ensemble of climate models. Random forest (RF) algorithms were trained using the simulated climate change impacts (future minus present) on water table depth at 100 m resolution from those submodels as training data.
The trained downscaling algorithms were then applied to create nationwide maps of climate-change-induced impacts on the shallow groundwater table at 100 m resolution. These downscaled maps were successfully validated against results from a 100 m validation submodel that was excluded from training, and compared to the impact signals from the 500 m HM across Denmark.
The suggested downscaling algorithm also opens up the possibility of spatially downscaling other model outputs. It has the potential for further applications where, for example, computational limitations inhibit running distributed HMs at fine resolutions.
Groundwater accounts for a substantial part of the active hydrological cycle, and exhibits a wide range of interactions with ecosystems, food and energy security, as well as climate and flood regulation (Gleeson et al., 2016). Interactions with the land surface and other components of the hydrological cycle are particularly pronounced where the uppermost groundwater table is found within a few metres of the surface. Such high groundwater tables affect up to a third of the Earth's land surface (Fan et al., 2013; Soylu and Bras, 2022). Moreover, across many parts of the world, groundwater resources are not only affected by human interactions, but also changing due to climate change (Rodell et al., 2018).
Denmark, with its gentle topography and temperate climate, has particularly high groundwater tables. The uppermost groundwater table (which we hereafter refer to as shallow groundwater) is located only a few metres to decimetres below the surface across large parts of Denmark (Koch et al., 2021; Henriksen et al., 2020a); see Fig. 1. This is also reflected in roughly 50 % of agricultural land in Denmark being artificially drained (Olesen, 2009; Møller et al., 2018). Along with rising precipitation in a future climate (Pasten-Zapata et al., 2019), the shallow groundwater levels are generally expected to rise (Refsgaard et al., 2016; Henriksen et al., 2020b; van Roosmalen et al., 2007). The projected rise in groundwater level, driven by increased precipitation, is most pronounced in winter, when groundwater levels are highest to begin with. During summer, the picture is more complex for shallow groundwater levels, with some areas showing falling groundwater tables due to increased temperature and evapotranspiration. More extreme and higher groundwater levels in the future pose significant challenges for infrastructure, agriculture, and ecosystems (Halsnæs et al., 2022). Due to the considerable small-scale variability of shallow groundwater levels (Koch et al., 2021), which are mainly controlled by topographic variability and hydrogeology, high-resolution information is required for purposeful groundwater management and climate adaptation.
Denmark with the mean depth of the shallow groundwater table, as simulated by the DK-model HIP at a 100 m resolution for the period 1990–2019. The submodels used for training and validation of the downscaling algorithm are outlined in blue and red. The extents of the detail maps shown in Figs. 5, 6, and 8 are indicated as black rectangles.
This requirement for high-resolution data is particularly relevant when evaluating climate-change-induced impacts on the shallow groundwater table. The national water resource model of Denmark (the so-called DK-model) is based on a distributed, coupled surface–subsurface model at a 500 m horizontal resolution, which was recently complemented by a higher-resolution 100 m version with a specific focus on near-surface processes. As forcing for climate change impact studies, a large ensemble of locally bias-corrected climate models is available (Pasten-Zapata et al., 2019). Ideally, national hydrological impact assessments would be based on the high-resolution version of the hydrological model (HM). However, the 25-fold increase in computational nodes of the 100 m model compared to the 500 m model makes such a modelling task infeasible due to the computational burden.
In order to obtain national impact projections at high resolution based on a large climate ensemble, but with a minimized computational cost, it is proposed to develop a downscaling method based on machine learning (ML) to refine impact simulations from the computationally feasible 500 m HM to a resolution of 100 m.
Within other fields such as remote sensing, ML-based spatial downscaling algorithms have been explored and used for several years, for example in the DisALEXI modelling system (Anderson et al., 2004, 2021). Here, the aim is often to bridge gaps between coarse-resolution imagery from satellites with frequent revisit times and high-resolution imagery from other satellites with lower temporal resolution (Yang et al., 2017; Im et al., 2016). Moreover, image-sharpening techniques that build scale-independent models are utilized to increase the spatial resolution of satellite data (Guzinski and Nieto, 2019). Similarly, spatial downscaling is widely applied for weather predictions and climate models (e.g. Cheng et al., 2020; Sun and Tang, 2020). We hypothesize that applying similar techniques to outputs from complex HMs, combining the information of physically based models with various high-resolution spatial data and ML methods, will increase the relevance of HM outputs. This is a relatively new field with few applications so far. For example, Koch et al. (2021) used HM outputs as covariates in ML algorithms predicting groundwater table depth at high spatial resolution, and Zhang et al. (2021) downscaled GRACE total water storage data with the help of HM output from the Global Land Data Assimilation System (GLDAS). The research presented here contributes to the further integration of physically based HMs with ML techniques. It can also contribute to the discussion on further development of knowledge-guided or theory-informed ML techniques in hydrological science (Nearing et al., 2021).
Hence, the objective of this work is to develop an ML-based algorithm for the spatial downscaling of physically based HM outputs. It is applied to downscale climate change impact predictions for the simulated depth of the shallow groundwater table in Denmark from a resolution of 500 m to 100 m. We favour an ML-based approach over simple interpolation methods or topography-driven downscaling because ML has the capability to effectively learn multivariate relationships, which we expect to be highly relevant for a complex variable like the shallow groundwater table. Even though running the DK-model nationwide at the fine 100 m resolution with a full ensemble of climate models is computationally too expensive, it is feasible to run some selected submodels at a 100 m resolution and utilize their high-resolution outputs as training data for the ML-based downscaling. In addition, a single national-scale deterministic run with historic climate is possible at 100 m, which can serve as valuable information for the downscaling algorithm. Also, all the relevant model input data are available at a resolution of at least 100 m.
The national water resource model for Denmark, the DK-model, covers most of the Danish land surface area of approximately 43 000 km².
The importance of the location of the shallow groundwater table, and it being controlled by small-scale variations in topography and geology, is also one of the main motivations for the creation of the finer 100 m resolution DK-model HIP (see also the 10 m resolution map of average groundwater tables developed in the same project by Koch et al., 2021).
The DK-model HIP is an integrated, distributed surface water–groundwater model set up in the MIKE SHE model code (Abbott et al., 1986; DHI, 2020). MIKE SHE is used to couple 3D subsurface flow, 2D overland flow, a simple two-layer description of the unsaturated zone, and 1D kinematic routing of flow in the stream network. It is run as a transient model with daily climate forcing and a maximum time step of 24 h.
The description of the DK-model is kept short here, as the model itself and its development is not the focus of this paper. For more details on model setup, input, and parameterization, the readers are referred to the provided literature in the following two sections.
The DK-model HIP exists in two distinct horizontal resolutions of 500 and 100 m, with all relevant input data available at 100 m resolution. The saturated zone is divided into 9–11 computational layers of varying thickness, depending on the region. The 3D unit-based parameterization of the subsurface is based on a nationwide hydrogeological model, with the exception of the uppermost computational layer with a constant thickness of 2 m which is parameterized based on the Danish soil map (Jakobsen et al., 2015).
Human water use is represented through groundwater extractions for drinking water and irrigation; sewage plant outflows are also added to the model. Moreover, the model includes a representation of (subsurface) drainage, which is widespread in Denmark (as described in Schneider et al., 2022).
As historic climate forcing, gridded daily data for precipitation, temperature, and potential evapotranspiration from the Danish Meteorological Institute are used (Scharling, 1999a, b). Temperature and potential evapotranspiration are provided at 20 km resolution and precipitation at 10 km resolution, and both are interpolated to the respective HM resolution of 100 or 500 m. Daily precipitation is corrected as described by Stisen et al. (2011).
Overview of the 17 bias-corrected RCMs used. The training models are marked. Changes to projected precipitation (…)
The national DK-model HIP was calibrated in its 500 m version against observations of groundwater heads and streamflow for the period 2000 to 2010, and validated for 1990 to 1999 and 2011 to 2019. For calibration, the model optimization tool PEST (Doherty, 2015) was used with its implementation of the Gauss–Marquardt–Levenberg algorithm. Due to the excessive computational effort, it was again not possible to conduct such a calibration, requiring a large number of model runs, with the full 100 m national model. However, through a validation comparison, it was deemed feasible to directly transfer the calibrated parameters from the 500 m version of the model to its 100 m version without loss of performance. In fact, the performance for shallow groundwater heads in particular improved when moving from the 500 m to the 100 m version. A recalibration of a limited set of parameters for 10 subcatchments at 100 m yielded no substantial additional improvement in model performance. Details regarding the model calibration are beyond the scope of this paper and can be found in Henriksen et al. (2020a).
An ensemble of regional climate models (RCMs) from the Euro-CORDEX initiative (Jacob et al., 2014) was used in this work. These RCMs were locally bias-corrected for Denmark and remapped to the same 10 and 20 km resolutions as the historical climate data, as described in Pasten-Zapata et al. (2019). All RCMs cover the years 1971 to 2100 with daily time steps. For the work presented here, we used the 17 RCMs representing the RCP8.5 greenhouse gas concentration scenario (van Vuuren et al., 2011) shown in Table 1. Out of that ensemble of 17 RCMs, we selected a subset of 5. These were chosen based on their ranges of projected precipitation to cover a wide span of future climates: the median precipitation prediction of the 5 selected RCMs is close to that of the entire ensemble of 17, and the subset covers the variation in changes between summer and winter precipitation. This subset of 5 RCMs was used with the 100 m submodels (Fig. 1), producing the training data for the downscaling.
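The idea of such a subset selection can be illustrated with a small combinatorial search. The scoring rule and the precipitation-change values below are hypothetical stand-ins, not the study's actual selection procedure, which also balanced summer against winter changes:

```python
import itertools
import numpy as np

def select_subset(deltas, k=5):
    """Illustrative subset selection: keep the k members whose median
    change matches the full-ensemble median while spanning a wide range
    of projected changes."""
    full_median = np.median(deltas)
    best_combo, best_score = None, np.inf
    for combo in itertools.combinations(range(len(deltas)), k):
        vals = deltas[list(combo)]
        # penalize deviation from the full-ensemble median,
        # reward a wide spread of projected changes
        score = abs(np.median(vals) - full_median) - 0.1 * np.ptp(vals)
        if score < best_score:
            best_combo, best_score = combo, score
    return best_combo

# hypothetical projected precipitation changes (%) for 17 RCMs
rng = np.random.default_rng(0)
deltas = rng.normal(8.0, 4.0, size=17)
subset = select_subset(deltas)
print(sorted(subset))
```

With 17 members and k = 5, the exhaustive search covers only 6188 combinations, so brute force is cheap; for larger ensembles a greedy criterion would be preferable.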
Besides climate forcing, some further input data were adapted to account for future conditions: sea level is included via fixed-head boundaries along the marine boundaries of the HM. For the historic model runs, the sea level is kept constant at an elevation of 0 m. For the future scenarios, sea-level rise was accounted for as projected by the Danish Meteorological Institute (Thejll et al., 2021).
Changes in groundwater abstraction rates, sewage plant outflows, and land use were not projected; instead, current historic values and maps were assumed for the future period. Irrigation is simulated as demand-driven in the model and hence automatically accounts for climatic changes.
An overview of the different climate change impact runs performed with the HMs is given in Table 2. Due to computational limitations, we ran the full ensemble of 17 RCMs with the 500 m national HM only (C; B is a subset of C). Besides that, we selected five training subcatchments, covering about 9 % of Denmark's surface area in total, and ran those with five selected RCMs at 100 m resolution (D). To allow for a fully independent validation, a sixth validation submodel was run at 100 m resolution, with the full RCM ensemble (E). Finally, the DK-model HIP was also run for all of Denmark with observed climate forcing, for the period 1990 to 2019. These historic runs were performed both at 500 and 100 m resolutions (A and *). Relevant HM outputs were initially stored at daily time steps and then aggregated as described below in Sect. 2.3.1.
Overview of the performed HM runs with different climate inputs for Denmark and different submodels. A – used as covariate. B – to be downscaled, in training, and as additional training data (auxiliary points). C – to be downscaled, in validation. D – training data. E – validation data.
For the climate change impact evaluation, the HMs were run using each individual member of the ensemble of RCMs as forcing, for three different periods: the reference period, serving as baseline, as well as the near future period (nf), and the far future period (ff). For the reference period, model outputs were extracted from 1 January 1991 to 31 December 2020, for the near future from 1 January 2041 to 31 December 2070, and for the far future from 1 January 2071 to 31 December 2100. The HMs were started with initial conditions taken from a continuous run of the corresponding HM across the entire period of 1991 to 2100, using conditions from the same date as the respective simulation start. A warmup period of 4 years prior to the start dates for result extraction was allowed for each of the HM runs.
Finally, the climate change impacts of interest were defined as the average conditions across each of the future 30-year periods minus those of the reference 30-year period. These climate change impacts on the shallow groundwater table are hereafter referred to as to-be-downscaled variables (TBDVs).
That means we focus on changes rather than absolute values: we downscale projected climate-change-induced changes to the shallow groundwater table. Different aggregated statistics were chosen, and changes were calculated for both the nf and ff periods. The chosen aggregated statistics are the changes to the mean, 1st percentile, and 99th percentile of the simulated groundwater table depth (high and low groundwater tables, respectively), as well as changes to the 1 m exceedance probability. The latter represents the fraction of time the groundwater table is closer than 1 m to terrain during the given period. The 1 m threshold was chosen based on user feedback (Stisen et al., 2018), and is relevant because many of the infrastructure and agriculture issues caused by high groundwater levels are expected to start once the groundwater is closer than 1 m to the surface (e.g. the typical depth of tile drains is around 1 m). This results in a total of eight statistics to be downscaled (TBDVs; see also Table 3), for example the change to the 1st percentile depth of the groundwater table for the nf period.
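The eight statistics are four aggregations applied to two future periods. A minimal sketch with synthetic daily depth series (all numbers hypothetical, not model output):

```python
import numpy as np

def gw_statistics(depth):
    """Aggregated statistics of a daily water-table depth series
    (in m below terrain) over one 30-year period."""
    d = np.asarray(depth)
    return {
        "mean": d.mean(),
        "p01": np.percentile(d, 1),    # high groundwater table
        "p99": np.percentile(d, 99),   # low groundwater table
        "exc1m": (d < 1.0).mean(),     # time fraction within 1 m of terrain
    }

def tbdv(ref_series, fut_series):
    """Climate change impact: future-period statistic minus reference."""
    ref, fut = gw_statistics(ref_series), gw_statistics(fut_series)
    return {k: fut[k] - ref[k] for k in ref}

# hypothetical series where the future table sits ~0.2 m closer to terrain
rng = np.random.default_rng(1)
ref = rng.gamma(2.0, 1.0, size=30 * 365)
fut = np.clip(ref - 0.2, 0.0, None)
impacts = tbdv(ref, fut)
print({k: round(v, 3) for k, v in impacts.items()})
```

A rising water table appears here as negative depth changes and a positive change in the 1 m exceedance probability.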
The full national DK-model HIP was run at 500 m resolution with the full ensemble of 17 RCMs (which includes the 5 selected training RCMs). This 500 m model output represents the data that are to be downscaled to the finer 100 m resolution with the presented method.
Six submodels distributed across Denmark (outlined and marked in Fig. 1), each representing a hydrologic subcatchment and setup at 100 m resolution, were chosen to produce training and validation data for the downscaling algorithm:
I. Suså
II. Odense Å
III. Kongeå/Kolding Å
IV. Storå
V. Simested Å/Mariager Fjord
VI. Aarhus Å/Aarhus (validation only)
The five submodels I to V were used as training data, whereas submodel VI was used as a fully independent validation dataset. To reduce computational effort spent on HMs, the five training models were run with the five selected RCMs, i.e. the training data are based on the reduced RCM ensemble. For the validation, to address the extrapolation capability of the downscaling algorithm, the validation model was also run with the full ensemble of 17 RCMs. The submodels were run with the same general model setup, periods, and warmup as the national model described at the start of Sect. 2.3. Along the land boundaries of the submodels, dynamic boundary conditions were applied based on the simulated groundwater heads in the corresponding 500 m national HM.
The five training submodels, along with the sixth validation model, were deliberately selected to cover a representative range of the geologic, topographic, and hydrologic variability occurring across Denmark. ML methods are known to perform poorly when used to extrapolate beyond the range of the data they are trained against (Meyer and Pebesma, 2021). Figure 2 displays the histograms of one of the variables to be downscaled (TBDV), together with some important covariates. The values covered by the five training submodels closely resemble the distribution and ranges seen across all of Denmark; i.e. the submodels are considered representative of all of Denmark. Similarly, the validation submodel also covers comparable covariate ranges.
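Such a representativeness check can be sketched, for a single covariate, as the fraction of the national distribution that falls in value ranges also populated by the training data (synthetic values below; not the histograms of Fig. 2):

```python
import numpy as np

def range_coverage(train_values, national_values, n_bins=50):
    """Fraction of the national covariate distribution falling in bins
    that are also populated by the training submodels - a quick check
    that the training areas span the national value ranges."""
    lo = min(train_values.min(), national_values.min())
    hi = max(train_values.max(), national_values.max())
    bins = np.linspace(lo, hi, n_bins + 1)
    train_hist, _ = np.histogram(train_values, bins=bins)
    nat_hist, _ = np.histogram(national_values, bins=bins)
    covered = train_hist > 0
    return nat_hist[covered].sum() / nat_hist.sum()

rng = np.random.default_rng(7)
national = rng.normal(0.0, 1.0, size=100_000)
training = rng.normal(0.0, 1.0, size=9_000)  # ~9 % sample, same regime
cov = range_coverage(training, national)
print(round(cov, 3))
```

A coverage close to 1 indicates the training areas span the national value range; the study additionally inspected likely combinations of covariates, which a per-covariate check like this cannot capture.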
Overview of the covariates.
Histograms of covariate ranges covered by the five training and the validation submodels, compared to all of Denmark. Shown for an example TBDV.
Table 3 presents an overview of the covariates used in the downscaling algorithm. The first group of covariates is topography-related and contains six different covariates. Secondly, 14 covariates derived from the DK-model HIP parameters were used: this group contains the horizontal transmissivities for the uppermost 7 computational layers of the model's saturated zone and the mean horizontal hydraulic conductivities across 5 depth intervals, as well as the spatially distributed drain depths and drain-time constants used in the model. The last group contains the actual 500 m change to the groundwater table that is to be downscaled (the TBDV). Besides that, the respective historic values simulated with the 100 m HM under observed climate are included as covariates.
All covariates are available at the target resolution of 100 m for all of Denmark, allowing the application of the downscaling algorithm to the entire country. The only exception is obviously the TBDV itself, which is only available at 500 m nationwide. It was resampled from its native resolution to 100 m using bilinear interpolation: initial tests showed performance improvements of the downscaling algorithm when interpolated TBDVs were used, whereas resampling the TBDVs without interpolation led to visible artefacts at the 500 m grid boundaries in the 100 m outputs.
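The resampling step can be sketched with SciPy on a toy 2 × 2 grid (values hypothetical):

```python
import numpy as np
from scipy.ndimage import zoom

# hypothetical 500 m TBDV grid (change in water-table depth, m)
coarse = np.array([[0.10, 0.30],
                   [0.20, 0.60]])

# factor-5 resampling to the 100 m target grid; order=1 gives bilinear
# interpolation, whereas order=0 (nearest neighbour) would keep the
# blocky 500 m structure that produced grid-boundary artefacts
fine = zoom(coarse, 5, order=1)
print(fine.shape)
```

Bilinear interpolation cannot overshoot the coarse values, so the resampled field stays within the original value range while removing the 500 m block edges.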
More covariates were used in preliminary tests. Analyses of covariate importance and collinearity, excluding non-informative and strongly correlated covariates, led to the final set of 23 covariates.
For the downscaling task at hand, we decided to use a random forest (RF) regressor. The RF goes back to Breiman (2001) and over time has proven to be a powerful and versatile data-driven modelling tool for a range of applications in environmental sciences. For example, RF was used in the context of groundwater pollution (Rodriguez-Galiano et al., 2014; Tesoriero et al., 2017), data analysis and predictions in large-sample hydrology (Addor et al., 2018; Ghiggi et al., 2019), as well as some of the remote-sensing downscaling models mentioned in Sect. 1. Examples from related Danish contexts include the prediction of groundwater level changes based on well observations (Gonzalez and Arsanjani, 2021), modelling of the historic depth of shallow groundwater and redox boundary (Koch et al., 2019b, a), or the modelling of artificial drainage properties (Motarjemi et al., 2021; Møller et al., 2018).
RF is a supervised ML method, requiring labelled training data. Based on the training dataset, an RF regressor model learns about relationships between a set of covariates and the target (training) data values. In the next step, predictions at unsampled locations beyond the training data can be made. Tyralis et al. (2019) provide a concise overview of the theory behind and the use of RF in hydrological contexts.
We used the implementation of RF regressors in Python 3.8 from the scikit-learn package, version 1.0.2 (…).
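A minimal sketch of this API with synthetic data (the covariate columns merely stand in for covariates like those of Table 3; all values are hypothetical):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

# hypothetical covariate matrix: columns could stand in for e.g. the
# interpolated 500 m TBDV, elevation, slope, transmissivity, drain depth
X = rng.normal(size=(2000, 6))
# synthetic target: non-linear function of the covariates plus noise
y = X[:, 0] + 0.5 * np.tanh(X[:, 1]) + 0.1 * rng.normal(size=2000)

rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=5,
                           random_state=0, n_jobs=-1)
rf.fit(X[:1500], y[:1500])
r2 = rf.score(X[1500:], y[1500:])  # coefficient of determination
print(round(r2, 2))
```
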
In Fig. 3, a workflow diagram of the downscaling algorithm is shown: all covariates exist at 100 m resolution for all of Denmark, with the exception of the actual TBDV, which originates from the 500 m HM. The TBDV exists at the finer target resolution of 100 m for the five training submodels – these data are used as training data for the RF downscaling regressors, alongside some auxiliary points from the 500 m HM (see below). After training, the RF regressors are used to produce Denmark-wide maps of the TBDV at 100 m. The downscaled outputs are validated against the national HM output, the training data, and a sixth, independent validation submodel at 100 m. Details are presented in the following sections.
Diagram of the downscaling algorithm.
Initial tests with spatial cross validation of the downscaling algorithm showed issues with the predictions for the spatial hold-out, i.e. with the spatial transferability of the algorithm to areas not covered by the training data. To improve spatial transferability, it was decided to include additional training data sampled from the entirety of Denmark outside the training submodels, enhancing the spatially limited training dataset. These points were sampled randomly in space. The actual training target – the TBDV at 100 m – is lacking for those points, and instead the TBDV at 500 m was used as the training target. Hence, we refer to these points as auxiliary points. The inclusion of such auxiliary points increases the robustness of the RF downscaling outside the training data areas; for example, without auxiliary points we experienced some areas where the downscaling incorrectly reversed the direction of change to the groundwater table, which was alleviated once auxiliary points were included. Even though the covariate ranges are adequately covered by the training catchments (Fig. 2), the auxiliary points still inform the algorithm by adding covariate values and likely combinations of covariates (the latter being a focus of the work on ML algorithm transferability by Meyer and Pebesma, 2021). However, the more auxiliary points are included, the closer the downscaled output resembles the original TBDV at 500 m, which is undesired. Considering this trade-off, we tested different amounts of auxiliary points and settled on 20 000 auxiliary points covering all of Denmark. This adds 5 % to the training dataset comprising all 100 m HM grid cells of the 5 training submodels (approximately 400 000 points).
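The assembly of such a training dataset can be sketched as follows (array sizes scaled down by a factor of 10, contents synthetic; the study used ~400 000 submodel cells plus 20 000 auxiliary points):

```python
import numpy as np

rng = np.random.default_rng(3)

def build_training_set(X_sub, y_sub_100m, X_nat, y_nat_500m, n_aux):
    """Append n_aux randomly sampled auxiliary points (with the coarse
    500 m TBDV as target) to the 100 m submodel training data."""
    idx = rng.choice(len(X_nat), size=n_aux, replace=False)
    X = np.vstack([X_sub, X_nat[idx]])
    y = np.concatenate([y_sub_100m, y_nat_500m[idx]])
    return X, y

# hypothetical, scaled-down sizes
X_sub = rng.normal(size=(40_000, 5))   # submodel cells with 100 m target
y_sub = rng.normal(size=40_000)
X_nat = rng.normal(size=(300_000, 5))  # national pool outside submodels
y_nat = rng.normal(size=300_000)       # only the 500 m TBDV is known here

X, y = build_training_set(X_sub, y_sub, X_nat, y_nat, n_aux=2_000)
print(X.shape, y.shape)  # auxiliary points add 5 % to the training set
```
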
The parameters used for the RF regressors. The first section presents the tested hyperparameters, where the selected values for each parameter are marked in bold. Further parameters are reported in the second section.
The four different climate change impact statistics regarded in this work, i.e. changes to the depth of the groundwater table on average (mean), for high (1st percentile) and low (99th percentile) groundwater tables, and for the 1 m exceedance probability, were each downscaled with a separately trained RF regressor.
The training was performed using the 100 m target values from the five fine-scale submodels (D in Table 2), based on the reduced RCM ensemble, as well as the auxiliary points from the 500 m HM. The respective TBDV at 500 m was taken from the national HM (B in Table 2). Beyond the training data at 100 m and the TBDV at 500 m, the other covariates presented in Table 3 were used.
Based on the training data, the importance of each of the covariates was evaluated: the information content (importance) that each covariate delivers to the RF regressors is determined by randomly permuting each of the individual covariates one by one and observing the decrease in model fit. Moreover, as covariate collinearity can lead to misleading results of such an analysis, the importance analysis was also performed by permuting whole groups of related covariates at a time (e.g. Koch et al., 2019a).
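The grouped analysis can be sketched with a common permutation-importance variant that permutes held-out covariates without retraining (synthetic data; the groups and weights below are hypothetical):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(4)

# synthetic data: covariates 0 and 1 form one influential "group",
# covariate 2 a weaker one
X = rng.normal(size=(3000, 3))
y = X[:, 0] + X[:, 1] + 0.1 * X[:, 2] + 0.05 * rng.normal(size=3000)

rf = RandomForestRegressor(n_estimators=100, random_state=0, n_jobs=-1)
rf.fit(X[:2000], y[:2000])
X_val, y_val = X[2000:], y[2000:]
base = r2_score(y_val, rf.predict(X_val))

def group_importance(cols):
    """Drop in model fit when a whole group of covariates is permuted
    jointly, avoiding the collinearity pitfalls of purely
    column-by-column permutation."""
    X_perm = X_val.copy()
    shuffle = rng.permutation(len(X_val))
    X_perm[:, cols] = X_val[shuffle][:, cols]  # permute rows of the group
    return base - r2_score(y_val, rf.predict(X_perm))

print(round(group_importance([0, 1]), 2), round(group_importance([2]), 3))
```

Permuting the strong group destroys most of the fit, while permuting the weak covariate costs little; the difference in the drop is the (group) importance.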
Random forest regressors have built-in parameters controlling model behaviour and complexity, the so-called hyperparameters. To fine-tune and optimize the RF regressor performance, a hyperparameter search was performed using the standard grid-search procedure. The tested parameter values, resulting in 144 unique combinations, are listed in Table 4. For each of the parameter combinations, and each of the TBDVs, the following was performed:
(i) An evaluation of the ability of the RF regressor to reproduce the 100 m HM results, by determining Pearson's correlation coefficient between predictions and the 100 m HM results.
(ii) An evaluation of the Denmark-wide bias introduced by the downscaling. Here, the RF regressor was trained against all five training submodels and then used for predictions for all of Denmark. Those predictions were compared against the national 500 m HM results.
In this setup, (i) presents an indication of the RF regressor's ability to reproduce the fine-scale 100 m results, whereas (ii) represents an indication of the transferability and robustness of the RF regressor, with decreasing performance in the case of overfitting. Based on averaging these results across all TBDVs, a hyperparameter set representing an optimal trade-off was determined. These hyperparameters, marked in bold in Table 4 alongside the other reported parameters, were then used in all subsequent tasks.
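A scaled-down sketch of such a grid search (synthetic data; the grid values shown are hypothetical, not those of Table 4, and only criterion (i) is scored here):

```python
from itertools import product

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(1200, 4))
y = X[:, 0] - X[:, 1] ** 2 + 0.1 * rng.normal(size=1200)

# small hypothetical grid; the study tested 144 combinations
grid = {"max_depth": [5, 10, None], "min_samples_leaf": [1, 5, 20]}

best_params, best_score = None, -np.inf
for depth, leaf in product(grid["max_depth"], grid["min_samples_leaf"]):
    rf = RandomForestRegressor(n_estimators=50, max_depth=depth,
                               min_samples_leaf=leaf, random_state=0)
    rf.fit(X[:800], y[:800])
    # criterion (i): fit on held-out data; the study averaged this with
    # the nationwide-bias criterion (ii) across all TBDVs
    score = rf.score(X[800:], y[800:])
    if score > best_score:
        best_params, best_score = (depth, leaf), score
print(best_params, round(best_score, 2))
```
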
Permutation covariate importance for the four RF downscaling regressors, for each covariate (see Table 3 for an overview). The higher the reduction in model fit upon permutation, the more important the covariate.
Subsequent to training the RF regressors with the optimal hyperparameters, they were applied to produce predictions for the entirety of Denmark. That means that the four RF downscaler models were used to downscale the national 500 m HM outputs to 100 m, separately for both the nf and ff periods, and for both the ensemble of 5 RCMs used in the training as well as for the full ensemble of 17 RCMs. The only difference in downscaling for the different periods and RCM ensembles is in the used TBDV at 500 m, which always represents the corresponding period and RCM ensemble.
Validation of the downscaled results at 100 m was then performed in two distinct ways. First, against the national 500 m HM output, verifying that the downscaling does not introduce an overall bias. Second, against the 100 m HM output from the validation submodel VI, which was not used in the training process.
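The two validation steps can be sketched as follows (synthetic maps; the sizes, tolerances, and values are hypothetical):

```python
import numpy as np

def validate(downscaled, coarse_tbdv, submodel_vi):
    """Two-way check of a downscaled TBDV map: (1) nationwide mean bias
    relative to the 500 m HM signal, (2) cell-wise Pearson correlation
    against the independent 100 m validation submodel."""
    bias = downscaled.mean() - coarse_tbdv.mean()
    r = np.corrcoef(downscaled.ravel(), submodel_vi.ravel())[0, 1]
    return {"bias_m": bias, "pearson_r": r}

# hypothetical maps: a "true" 100 m impact signal and a noisy estimate
rng = np.random.default_rng(6)
truth = rng.normal(0.2, 0.1, size=(50, 50))
estimate = truth + 0.02 * rng.normal(size=(50, 50))
coarse = np.full((50, 50), truth.mean())  # stand-in for the 500 m signal
metrics = validate(estimate, coarse, truth)
print({k: round(v, 3) for k, v in metrics.items()})
```

A near-zero bias confirms the downscaling preserves the coarse-scale signal, while a high correlation against the held-out submodel confirms the added fine-scale pattern is realistic.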
Example of downscaling results for changes to the mean depth of the groundwater table for the far future period (ff).
Figure 4 summarizes the results of the covariate importance analysis for each of the final trained RF downscalers. Results are shown separately for individual covariate importance as well as for covariate group importance. The groups in order of decreasing importance were as follows: (i) the TBDV itself, (ii) the historic TBDV from the 100 m model, (iii) the geology-related covariates, (iv) the topography-related covariates, and (v) the drain parameterization-related covariates. It is noteworthy that, on average, the TBDV turns out to be the most important covariate, followed by the respective historic absolute value of the TBDV and the difference between historic dry and wet periods. That means that the downscaling is largely guided by information on the TBDV itself, which is desired since our aim was to develop a downscaling of the 500 m impact signal rather than an independent prediction.
Downscaling outputs for validation submodel VI, shown for an example TBDV.
Figure 5 presents different input data and results for the same area located at the north-eastern edge of submodel III Kongeå/Kolding Å (extent indicated in Fig. 1), shown for an example TBDV.
Overview of the validation results of the RF downscaling algorithm.
Figure 6 illustrates downscaling results for the fully independent validation submodel VI Aarhus Å/Aarhus (extent indicated in Fig. 1), shown for an example TBDV.
Scatterplots for an example TBDV.
These visual evaluations of the downscaling outputs can be confirmed quantitatively; Fig. 7 presents an overview. The top row presents a Denmark-wide evaluation of the climate-change-induced changes to the shallow groundwater table. The bars present the mean changes across all of Denmark for each of the eight TBDVs (i.e. mean, 1st percentile, 99th percentile, and 1 m exceedance probability, for both the nf and ff periods).
For the example of the TBDV
Finally, Fig. 9 displays Denmark-wide results of the downscaling algorithm, for all four TBDVs of the far future period (ff).
Results of the downscaling, for the far future of all four TBDVs. Top row: Downscaling results in 100 m for all of Denmark. Middle row: Detail of the 100 m downscaling results (extent indicated in top row). Bottom row: Same detail for the corresponding TBDV in 500 m.
Moreover, Fig. 9 also displays the initially mentioned tendency towards more extreme groundwater levels: the low groundwater tables (99th percentile) fall in parts of the country, while the high groundwater tables (1st percentile) generally rise.
Despite the discussed capabilities of ML techniques in exploiting large datasets at high spatial resolution, physically based HMs are still needed to predict climate-change-induced changes to the hydrological cycle. Purely data-driven methods (often based on ML algorithms) struggle with predicting previously unobserved states, which holds true for both spatial and temporal extrapolation (Meyer and Pebesma, 2021; Mai et al., 2021; Koch and Schneider, 2022). As an attempt to overcome the computational burden of high-resolution, complex HMs, Tran et al. (2021) have developed an ML-based emulator of an HM. Read et al. (2019) and Koch and Schneider (2022) showed that guiding ML models with simulation data from a physically based model enhanced the capability of the ML model to extrapolate. The downscaling algorithm presented in this work increases the resolution of HM outputs, guided by fine-resolution HM outputs.
As our method is based on output from distributed, physically based HMs, we (i) are able to obtain fully distributed climate change impact evaluations, and (ii) have higher confidence in extrapolating the models to values outside observed ranges. This is unique and, for example, goes beyond recent efforts with purely data-driven extrapolations of climate change impacts on groundwater levels for selected wells across Germany (Wunsch et al., 2022). Our downscaling algorithms were shown to be robust, transferable in space, and transferable to a different climate signal, i.e. to a different RCM ensemble.
Acknowledging computational limitations, we based our 100 m training data on five HM subcatchments, representing around 9 % of Denmark, which were run with forcings from a reduced ensemble of five RCMs. This reduced the computational time for hydrological model runs to less than 3 % of what would be required for a nationwide full ensemble run at 100 m. The size of the training dataset was chosen to be sufficiently large to allow for extrapolation beyond the area of the training data. We also assume that the careful choice of the training submodels, covering a variety of hydrogeological settings across Denmark, facilitated the spatial transferability of the downscaling algorithm. For other applications of similar downscaling algorithms, the necessary size of the training dataset depends on the specific use case: an algorithm used for predictions within a limited area or within the training data requires less training data than one used for extrapolation beyond areas with training data.
The downscaling was facilitated by the consistency of parameter sets between the 500 and 100 m HM setups: we do not intend to compensate for potential discrepancies in simulated groundwater heads due to differences in parameter values, but merely want to downscale the 500 m national HM results to the finer scale as simulated by the corresponding higher-resolution model. The same applies to all other input data: both the 100 and 500 m HMs used the same inputs, differing only in scale.
We successfully designed and tested an RF-based spatial downscaling algorithm for outputs from distributed HMs. The use of ML techniques in spatial downscaling is widespread in fields such as remote sensing but is still limited in hydrological modelling. The developed downscaling algorithm requires a coarse-scale HM for the full domain of interest, together with equivalent fine-scale HMs for limited parts of the domain. Furthermore, fine-scale covariates are required for the full domain. We trained the RF regressor on the selected, representative fine-scale HMs covering only around 9 % of the full domain, significantly reducing the computational effort spent on the complex HMs. Furthermore, using the downscaling algorithm in the context of an ensemble of RCMs, we could show that it is possible to train the downscaling algorithm on a smaller ensemble of RCMs and apply it to the full ensemble, further reducing the computational effort on HMs. The downscaling results were successfully validated against the output from a fully independent fine-scale validation HM, as well as against the full-domain results of the coarse-scale HM.
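The train-on-subdomains, predict-for-the-full-domain pattern described above can be sketched with scikit-learn's RandomForestRegressor. All arrays, feature choices, and hyperparameters below are illustrative assumptions for a minimal example, not the study's actual configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Per 100 m cell, illustrative features: the co-located coarse (500 m)
# HM change signal plus fine-scale covariates (e.g. terrain, geology).
n_train, n_feat = 5000, 6
X_train = rng.normal(size=(n_train, n_feat))        # cells inside the training submodels
y_train = (0.8 * X_train[:, 0] + 0.2 * X_train[:, 1]
           + rng.normal(scale=0.05, size=n_train))  # fine-scale HM change (training target)

rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=5,
                           n_jobs=-1, random_state=0)
rf.fit(X_train, y_train)

# Apply to every 100 m cell of the full domain, where only the coarse
# HM signal and the fine-scale covariates are available.
X_full_domain = rng.normal(size=(20000, n_feat))
downscaled = rf.predict(X_full_domain)              # 100 m change values
```

Because the target is the change signal (future minus present) rather than absolute groundwater heads, the regressor only has to learn how the fine-scale covariates modulate the coarse signal, which is what makes training on a small, representative fraction of the domain feasible.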
The presented framework is envisioned to be transferable to the downscaling of other spatial output from hydrological and environmental models in general, also beyond aggregated statistics (i.e. also in a transient manner). It is generally acknowledged that there is a disparity between the computationally feasible spatial resolution of HM output and what is desired or required by users of the data, such as authorities, consultants, or citizens (Samaniego et al., 2019). From a scientific point of view, too, HM resolution limitations and spatial aggregation beyond what is appropriate commonly hinder process understanding (Wood et al., 2011; Nijzink et al., 2016).
This also applies to the greater context of this paper, the Danish HIP4Plus
project (Henriksen et al., 2020b). Within this project, a
larger dataset of climate-change-induced changes to the shallow groundwater
table was generated based on the presented framework. This includes
predictions for different greenhouse gas concentration scenarios (RCP4.5 and
RCP8.5), further statistics, and monthly as well as seasonally aggregated
values; a total of 236 variables (as compared to the 8 presented here).
All downscaled maps and further data can be accessed via the HIP data portal.
Denmark-wide data produced with the presented downscaling algorithm, for a
total of 236 different variables, are freely available from the HIP data
portal:
RS designed the downscaling algorithm in close cooperation with SiS and JK, with further input from HJH and LT. The code for the downscaling algorithm, as well as for validation and visualization, was written by RS. Setting up the hydrological models was a joint effort of the entire project team. Processing of the hydrological model output was handled by LT. RS prepared the manuscript with contributions from all co-authors.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The presented work was originally developed within the HIP4Plus project, headed by the Danish Agency for Data Supply and Efficiency (Styrelsen for Dataforsyning og Effektivisering, SDFE), and we want to thank them for good collaboration and support throughout the project. Furthermore, we want to acknowledge the great team effort of the entire HIP4Plus project team at the Department of Hydrology, GEUS. Besides the authors of this paper, the team encompassed Jane Gotfredsen, Annesofie Jakobsen, Søren Julsgaard Kragh, Maria Ondracek, Ernesto Pasten-Zapata, and Per Rasmussen.
This research has been supported by the HIP4Plus project, funded as part of the Danish Digital Strategy 2016–2020, initiative 6.1 common data on topography, climate and water (Den Fællesoffentlige Digitaliseringsstrategi Initiativ 6.1, Fælles data om terræn, klima og vand, FODS6.1).
This paper was edited by Harrie-Jan Hendricks Franssen and reviewed by two anonymous referees.