On the Appropriate Definition of Soil Profile Configuration and Initial Conditions for Land Surface-Hydrology Models in Cold Regions

Arctic and sub-arctic regions are amongst the most susceptible regions on Earth to global warming and climate change. Understanding and predicting the impact of climate change in these regions require a proper process representation of the interactions between climate, carbon cycle, and hydrology in Earth system models. This study focuses on Land Surface Models (LSMs) that represent the lower boundary condition of General Circulation Models (GCMs) and Regional Climate Models (RCMs), which simulate climate change evolution at the global and regional scales, respectively. LSMs typically 15 utilize a standard soil configuration with a depth of no more than 4 meters, whereas for cold, permafrost regions, field experiments show that attention to deep soil profiles is needed to understand and close the water and energy balances, which are tightly coupled through the phase change. To address this gap, we design and run a series of model experiments with a one-dimensional LSM, called CLASS (Canadian Land Surface Scheme), as embedded in the MESH (Modélisation Environmentale Communautaire – Surface and Hydrology) modelling system, to (1) characterize the effect of soil profile depth 20 under different climate conditions and in the presence of parameter uncertainty, (2) assess the effect of including or excluding the geothermal flux in the LSM at the bottom of the soil column, and (3) develop a methodology for temperature profile initialization in permafrost regions, where the system has an extended memory, by the use of paleo-records and bootstrapping. Our study area is in Norman Wells, Northwest Territories of Canada, where measurements of soil temperature profiles and historical reconstructed climate data are available. Our results demonstrate a dominant role for parameter uncertainty, that is 25 often neglected in LSMs. Considering such high sensitivity to parameter values and dependency on the climate condition, we show that a minimum depth of 20 meters is essential to adequately represent the temperature dynamics. We further show that our proposed initialization procedure is effective and robust to uncertainty in paleo-climate reconstructions and that more than 300 years of reconstructed climate time series are needed for proper model initialization. 30 .


Introduction
Arctic and subarctic regions are amongst the most susceptible on Earth to climate change (IPCC 2013;Hinzman et al., 2005).
For example, shrub expansion into the tundra regions (Sturm et al., 2001), permafrost thaw (Connon et al., 2014;Rowland et al., 2010), and glacier retreat (Marshall 2014) are some of the current manifestations of climate change. All these changes are 35 triggered by the interaction of climate, the carbon cycle and hydrology in response to global warming (Schuur et al., 2015).
These effects are expected to be exacerbated due to global warming trends in the coming years (IPCC 2013;Slater and Lawrence 2013;Lawrence and Slater 2005). Therefore, being able to evaluate and assess the impact of climate change in cold regions is a primary concern for the scientific community, stakeholders and First Nations communities in northern regions.
The significance of this problem in Canada has led to the creation of the Changing Cold Regions Network (DeBeer et al., 40 2015;www.ccrnetwork.ca), which aims to provide improved science and modelling to address these concerns.
Earth system models are essential tools for evaluating the impacts of climate change. At global and regional scales, General Circulation Models (GCMs) and Regional Climate Models (RCMs) are used to simulate climate change evolution. Land Surface Models (LSMs) are used with GCMs and RCMs (coupled or offline) to represent the hydrological processes associated with the lower boundary condition of the atmosphere. These models typically represent the coupled energy and water balance 45 in the soil, based on numerical solution of the Richards' equation and using a relatively coarse vertical discretization.
In general, a standard soil configuration with a depth of no more than 4 meters is used in all LSMs that are commonly implemented in GCMs and RCMs (see for example the comparison made by Slater and Lawrence (2013) for the soil configuration depth in LSMs implemented in some GCMs). The typical boundary conditions to solve the energy and water balance in the soil column are: (1) the exchanges with atmosphere at the top, (2) no lateral exchange of water or energy with 50 the surrounding grids (only vertical fluxes), and (3) no heat flux at the bottom of the soil.
For moderate climate conditions and at the spatial scales on which these models are commonly applied, the above depth and boundary conditions are commonly deemed to be sufficient to capture the intra-annual variability in the energy and water balance. However, for cold regions, where the energy balance is closely related to the water balance through the phase change (Woo 2012), deeper soil configurations and more representative boundary conditions are needed. A deeper soil profile in a 55 model can result in a more accurate process representation as it allows the heat signal to propagate to deeper soil layers and hence avoids erroneous near-surface states and fluxes, such as overheating or over-freezing during summer and winter, respectively (e.g, Lawrence et al., 2008;Stevens et al., 2007). An alternative to modelling a deeper soil profile is the incorporation of a rigorous lower boundary condition that adaptively changes with time and includes a geothermal heat flux (Hayashi et al., 2007). Developing and incorporating a dynamic lower boundary condition is, however, impractical in most 60 case due to lack of adequate data; in addition, the geothermal heat flux is usually ignored in LSMs, as its effects on temperature dynamics within the upper 20-30 meters of soil are considered negligible on century time scales .
The aforementioned challenges and shortcomings have been recognized by the climate, permafrost, and hydrology community.
For climate models, Slater and Lawrence (2013), Alexeev et al. (2007 and Stevens et al. (2007), have 65 disputed the validity of GCM future projections due to the shallow soil profile depth in LSMs for the reasons stated above.
There have been studies of how the spatial distribution of permafrost is improved by including deeper soil configurations in an LSM. For example, Paquin and Sushama (2015) considered a 65-meter deep soil configuration for the arctic region with a spin-up period of 200 years through recycling the 1970-1999 period in the Canadian RCM, which uses Canadian Land Surface Scheme (CLASS) (Verseghy, 1991) as the LSM, and showed an improved spatial distribution of permafrost. Zhang et al. 70 (2003Zhang et al. 70 ( , 2006Zhang et al. 70 ( , and 2008 used a thermal soil model that includes soil water balance and showed the importance of considering deep soil configurations. In the context of LSMs, Troy et al. (2012) simulated river basins in northern Eurasia using a 50-meter soil configuration with a spin-up of 500 years by recycling the 1901-2001 period 5 times. Decharme et al. (2013), who applied the ISBA model to the whole of France, concluded that an 18-meter depth was needed to properly simulate the energy and water balance. 75 In addition, deeper soil/rock configurations possess extended system memories, and as such, particular care should be taken to properly define the initial conditions for the subsurface system. The presence of significant non-stationarity in climate and hydrology further complicates the process of model initialization, as it leads to significant changes to the statistical properties and envelope of variability of forcings . Due to such non-stationarity, it may be inadvisable to initialize a model by recycling the (typically short) historical records (i.e., repeating the simulation over the same period multiple times 80 and using the final model state of one run as the initial state of the next run), as implemented in Troy et al., (2012) or Paquin and Sushama, (2015); such practice, in particular, may result in serious misrepresentation of soil processes, because the significant warming trend in the historical records of cold regions leads to unrealistically warmer soil states after each cycle.
Together, these reasons highlight the pressing need for multi-century-long hydroclimatic records to include past nonstationarity that may affect the present state and flux variables. Proxy records such as tree rings can provide a vehicle to 85 reconstruct long hydroclimatic time series, typically at annual to multi-year time scales (Razavi et al. 2016).
The sensitivity of LSMs to initial conditions and the initialization methods has been the focus of several studies (e.g., Yang et al. 1995;Rodell et al. 2005;Shrestha and Houser 2010). However, most of these works have focused on relatively shallow soil profiles located in areas other than cold regions. An exception is the work of Ednie et al. (2008) that illustrated the need for a suitable model initialization procedure to properly simulate soil thermal profiles in permafrost regions and applied a 90 simplified thermal model of soil by using reconstructed past climate variables. Despite significant advances, as briefly outlined above, the appropriate soil configuration depth (SCD) in land surface modelling of cold regions remains an open question. This question is further complicated by the fact that parameter uncertainty is typically ignored in LSMs, and parameter values are usually collected from look-up tables based on land cover and soil maps (Mendoza et al., 2015). Related to this, there have been some previous efforts for "sensitivity analysis" of model outputs 95 to parameters (Razavi and Gupta, 2015) but these have been mainly limited to comparisons of different cover types (e.g., Paquin and Sushama, 2015;Yang et al., 1995) with some few exceptions (e.g., Bastidas et al., 2006).
In this paper, we focus on the three inter-related aspects of LSMs, namely soil depth, parameter uncertainty, and initializations, together to address the above question. Unlike the previous studies that focus on each aspect in isolation, this study looks at their joint and individual effects. We set up a series of systematic modelling experiments with the following three objectives 100 to (1) identify the appropriate SCD for a given LSM and location in the presence of uncertainty in model parameter values and climate conditions, (2) assess the significance of including/excluding geothermal flux as the lower boundary condition in an LSM, (3) develop an initialization procedure for LSMs in cold regions based on paleo-reconstructions of climate variables and statistical bootstrapping.

Methods 105
To advance our understanding and modelling capability of soil moisture and energy dynamics in permafrost regions, we developed two series of numerical experiments for a study area located in the Northwest Territories, Canada, where observations of soil temperature at several depths and historical reconstructed climate data are available.

Study Area and Data
The experimental test case is located at Norman Wells, in the Mackenzie Valley, Northwest Territories, Canada ( Figure 1). 110 Based on the Permafrost Map of Canada (Geological Survey of Canada, 2000), the area is located in a zone of extensive discontinuous permafrost. The land cover is characterized by moss lichen groundcover, ericaceous shrubs, and black spruce and tamarack trees (Smith et al., 2004). The subsurface is formed by ice-rich silt clays. The climate of the region is subarctic, according to the Köppen climate classification (Pell et al., 2007), with an average annual mean daily temperature of -5 ºC and average annual precipitation of 295 mm/year. 115 This area is selected due to the availability of both soil temperature at several depths down to 20 meters (Smith et al., 2004) 120 and dendroclimatic reconstructions of summer air temperature (Szeicz and MacDonald, 1995). These data will be used to test the proposed methodology to define the SCD and the initialization approach.

Soil Temperature Profiles
Annual soil temperature profiles are available based on the maximum and minimum daily average of soil temperature at several 125 borehole locations in the Mackenzie Valley, administrated by the Geological Survey of Canada (Smith et al., 2004). Figure 2 shows the temperature profiles for the borehole 84-1-T5 selected for our analysis. The soil temperatures were measured at the

Reconstructed Summer Air Temperature 135
Szeicz and MacDonald (1995) generated proxy climate records of average summer (June-July) air temperature based on tree rings for period 1638-1988 in north-western Canada near to Norman Wells ( Figure 3). These proxy data have been previously used by other authors (Edine et al. 2008;Esper et al., 2002). For example, Edine et al. (2008) showed that the linear trend of proxy summer air temperature can be used as an approximation of the linear trend of the mean annual air temperature for the region. Following this approach, we generate a stochastic climate time series (Section 2.5.1) that follows the historical 140 reconstructions of mean annual air temperature based on the proxy data of Szeicz and MacDonald (1995).

Design of Experiments 145
The methodology and experiments were designed to be carried out in two stages. In the first stage, we focus on the characterization of the adequate soil profile depth for land surface-hydrologic modelling in the permafrost regions, in relation to climate condition and model parameterization. For this purpose, we run a 1D model under a variety of soil profile, parameter, climate configurations, and lower boundary conditions. This stage is referred to as "Experiment 1" in this paper.
In the second stage, "Experiment 2" we propose a method to handle the presence of non-stationarity in climate and hydrology, 150 in order to include effects of past non-stationarity on the present state and flux variables. This method utilizes paleo-climate reconstructions to generate long, synthetic time series of climate variables for model initialization.

The 1D-Model
The core of the experiments is a 1D model implemented in MESH, Environment and Climate Change Canada's community model (Pietronero et al., 2007). This integrates the CLASS LSM (Verseghy et al., 1993;Verseghy 1991), which solves coupled 155 energy and water balance equations for vegetation, snow and soil and their exchange of heat and moisture with the atmosphere, and WATROF (Soulis et al., 2000) or PDMROF (Mekonnen et al., 2014) to solve the horizontal flow processes for basin-scale integration. MESH discretizes the spatial domain based on regular grid cells and each individual cell is then subdivided in Grouped Response Units (GRUs) based on land cover and/or soil types. MESH has been commonly used to simulate land surface-hydrology processes in many cold regions (e.g., Yassin et al. 2017;Haghnegahdar et al. 2017). The 1D CLASS model 160 is implemented here at one grid cell, and a unique GRU was used. The upper boundary condition of the model is formed by atmospheric forcings. At the lower boundary condition, in terms of heat, we include two cases: no heat flux and geothermal flux (only in Experiment 1), and in terms of mass, we assume the water flux that reaches the bottom of the soil profile drains to generate base flow. The climate forcings needed are temperature, precipitation, shortwave radiation, longwave radiation, specific humidity, wind velocity and atmospheric pressure. 165

Experiment 1
A schematic representation of the modelling experiment is illustrated in Figure 4. Several 1D model set-ups were implemented by a combination of (1) various SCDs, (2) several climate conditions selected to spin-up the model, (3) different values for the parameters that control hydrological processes (water and energy balance), and (4) the inclusion or exclusion of the geothermal flux as the lower boundary condition. 170

Variable Soil Depth Configuration
For this experiment, a series of 1D models with incremental numbers of soil layers (corresponding to different total soil depths) are defined. The soil configurations of the 1D models are illustrated in Figure 5, and range from the standard CLASS configuration of 3 layers with a 4.1-meter depth, to 20 layers corresponding to a depth of 71.59 meters. The thickness of each 180 layer is increased exponentially for deeper soil layers. A total of 17 different soil configurations are tested.

Parameter Uncertainty
Three groups of parameters representing canopy, soil and drainage processes are perturbed within their ranges of uncertainty to analyze their influence on SCD. Table 2

Lower boundary conditions: The Geothermal Flux 210
To assess the effect of the lower boundary condition on the energy balance and soil temperature profile, an analysis was made to compare two scenarios: (1) no heat flow at the bottom of the lowest soil layer, and (2) a constant geothermal flow (called ggeo flux in CLASS). The comparative analysis was carried out for the average climatic condition (year 1945). All the 17 different soil configurations and 50 sets of parameter values were tested, resulting in a total of 850 model configurations to be run for scenario 2 above. For this scenario, the geothermal heat flow was set to be 0.083 W/m 2 , based on measurements made 215 in a borehole in Norman Wells (Garland and Lennox, 1962).

Non-Oscillation Depth
In Experiment 1, we ran a total of {(17 SCD)*(5 climates)*(50 parameters) + 850 (with geothermal flux)}= 5100 model combinations. In each of these model set-ups, a 2000-year model run was performed. All the models were set with the same initial conditions and constant temperature and liquid/ice saturation soil profiles. The soil thermal profile was defined at -3.0 220 ºC and all the soil water was defined as ice content. We assume that after the spin-up a quasi-equilibrium between the climate conditions and the ground thermal state was reached. The last cycle, a complete one year simulation, was used to compute the annual soil temperature profiles based on the maximum (maxTsp) and minimum (minTsp) daily average of soil temperature ( Figure 4). Next, we computed the difference between maxTsp and minTsp and defined a depth (h) at which this difference was less than 0.1 ºC. We named this depth h as the "non-oscillation depth" of annual soil temperature. Therefore, h, which is 225 a function of climate condition, parameter values, and simulated soil depth, represents the depth at which the soil thermal response remains invariant over seasons. In other words, the non-oscillation depth indicates the depth at which the SCD has not longer a significant effect on the energy balance computed by the model.

Experiment 2
To be able to simulate the hydrology using LSMs in cold regions in the last century (period of records) and in the future, it is 230 necessary to correctly set the initial conditions of the models. When the SCD of the model is considered to be shallow (no more than 4 meters), the initialization can be easily carried out with a relatively short spin-up period (Yang et al., 1995). However, with deeper SCDs, the memory of the system is longer, and it remembers the past climate regimes and trends. Therefore, it is necessary to run the model over an extended period of time to diminish the effect of uncertainty in initial conditions on model predictions. This is a major challenge, however, as the typical length of periods of records (say ~100 235 years) is not sufficient.

Methodology of Reconstruction
To overcome the above challenge, we stochastically generated past climate variables, back to year 1678 based on proxy data of reconstructed summer air temperature described in section 2.1.2. To this end, we applied a block bootstrapping technique Politis and Romano 1994). 240 The stochastic time series of climate variables were generated as follows: (1) First, we assumed that the reconstructed summer air temperature by Szeicz and MacDonald (1995) can be used as proxy data to derive the past trends in air temperature. The historical temperature trend back to 1678 (T Htrend ) was estimated by first computing the moving average with a window of 15 years and then subtracting the moving average from the annual time series. Figure 6 compares both temperature trends (15-year moving average) 245 obtained from WCH-FD data and tree rings for the same period, showing a reasonable agreement, with a Pearson correlation coefficient of 0.66. The existing discrepancy may be in part due to a lack of consideration of longerterm variability (longer than annual) in the reconstruction of the time series, an issue explained in Razavi et al. (2016). (2) Then, we decomposed the WCH-FD temperature time series (6-hourly time resolution) for the period 1901-2001 into its trend (based on the 15-year moving average) and its seasonality component (T seas ).
(3) Next, we applied the block bootstrapping technique with a block size of 5 years to T seas . We sampled 45 blocks 255 of 5 years so as to generate a time series long enough to cover the 1678-1901 period. (4) To finish the reconstruction of the 6-hourly temperature data, we added T seas to the T Htrend from step (1).
(5) The other six climate variables needed by MESH to run were precipitation, shortwave and longwave radiation, specific humidity, wind, and atmospheric pressure. They were generated by applying the block bootstrapping with the same time indexes of the temperature blocks (step 3). In this way, we maintained the interdependence 260 between all the climate variables.  Figure 7 shows the mean annual temperature of these 6-hourly time series generated with the methodology presented. 265

Evaluation procedure
We used the 100 realizations of the climate variables of section 2.5 to run the models with the 50 parameter sets and 17 SCDs used before. For the initial conditions, we used the stabilized model outputs obtained from the 2000 cycles for year 1945 (average with respect to temperature and precipitation). Finally, the simulated soil temperature profiles obtained were 275 compared with the observed data (see section 2.1.1) by computing the root mean squared error (RMSE). To evaluate the model performance in reproducing the observations (1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000), individual maximum and minimum soil temperature profile of simulated and observed data were used to compute RMSE for each individual year. Then all the values of RMSE obtained, one for each year, were averaged to obtain the overall RMSE of corresponding simulation.

Soil Configuration Depth
Using the experiments proposed in Experiment 1, we explored the combined and individual effects of climate, parameters and SCD on the non-oscillation depth of in the soil temperature profile. Figures 8, 9 and 10 summarize these analyses as 2D histograms: (SCD, hT-non-oscillation) ( Figure 8); (years, hT-non-oscillation) ( Figure 9); and (parameter sample group, hTnon-oscillation) ( Figure 10). Notably, Figure 8 shows that for SCDs less than 15 m, there is a high probability that the hT-greater than 20 m, the hT-non-oscillation condition is always reached, with a higher frequency that this condition occurs at a depth between 13 and 16 m. The variability observed in hT-non-oscillation depth for each SCD is, in general, mainly explained by the variation in parameter values rather than the year selected (i.e., climate condition) for spinning up the model (Figure 9 and 10).  From the previous results, it seems clear that we need at least an SCD of greater than 20 meters to adequately represent the temperature dynamics of permafrost. This conclusion is supported by the fact that the soil temperature at which hT-non-305 oscillation condition is reached remains invariant throughout the annual cycle. The distribution of this "non-oscillating temperature" is shown using 2d-histograms in Figure 11 and 12 with respect to the SCD and the climate conditions (years), respectively. Figure 11 shows that for shallow SCDs, from 3.1 to 16 meters, there is a tendency to obtain a warmer soil temperature such that the permafrost is thawed. In the SCDs with the depth of 16 meters and deeper, there is much more variability in the soil 310 temperature (between -6 ºC to 0 ºC), but with a high probability that the soil temperature at hT-non-oscillation condition is between -3 ºC to -2.5 ºC. In Figure 12 the effect of the climate condition can be appreciated. The main behavioural difference is for the warmest year (1998) when, as expected, the warmest soil temperatures at the hT-non-oscillation condition occur. As for the other climate conditions, the behaviours are quite similar and in general have a range of variation between -7ºC to 0.5 ºC. As before (Figure 11), the probability distribution for each climate condition is quite symmetrical with a peak value around 315 -2.5ºC. A slightly cooler soil temperature is obtained for the coldest year (1974).   counterparts. These differences are small, and the results confirm that more than 20 meters of soil depth are needed to adequately represent the temperature dynamics. To further compare the two scenarios, Figure 14 shows the cumulative 330 distribution function of the differences in soil temperature at the non-oscillation depth of the two simulation scenarios (with and without geothermal flux at the bottom). As shown, the temperature difference of the two scenarios is small in most simulations, and is within +/-0.15•C in approximately 60% of simulations.

Initialization by Paleo-Reconstructions 345
The previous sections have shown evidence that regardless of the climate conditions, parameter uncertainty, and lower boundary conditions, we need to have an SCD that is deeper than 20 meters. However, such depths make the model initialization problem challenging. Here, we show the results of our 1D models with different SCDs and parameter values when driven by a set of 100 tree-ring, bootstrap-based reconstructed climate forcing realizations for period 1678-2001. Figure 15a shows a general overview of the model's ability to reproduce the observed soil thermal behaviour between years 350 1985 to 2000, by plotting the 2d histogram of SCD and RMSE. The colours for a specific SCD represent the frequency distribution of RMSE values; the variability in this distribution includes the effects of different parameter values and climate forcing realizations. The RMSE was calculated as described in section 2.5.2. In general, for the shallower SCDs (say less than 15 meters), the RMSE tends to be larger with a higher variability (1.5 ºC to 9.0 ºC). The frequency distributions for deeper SCDs become, however, quite similar regardless of the depth, with an RMSE range between 1 ºC to 5 ºC with a high density 355 around 1.5 ºC to 3.0 ºC. In the histogram of Figure 15a, we included all the simulations, even if in a simulation from Experiment 1, the non-oscillation conditions had not been reached. Figure 15b, however, presents only the simulations that have reached the non-oscillation condition. As can be seen, the histograms of Figures 15a and 15 b become the same for SCDs deeper than 16 meters similar. 365 This is explained by the fact that almost all the simulations with SCDs that are sufficiently deep (>16.0 m) reach the hT-nonoscillation condition. Figure 16 shows a series of histograms of RMSE values generated by the different reconstructed climate series, each of which for a different set of parameter values. This figure is designed to assess the relative effects of the variation in the different 370 reconstructed climate time series (a manifestation of data uncertainty) and variation in model parameters (a manifestation of parameter uncertainty) on the variability of RMSE. Here, we only take into account the simulations that have reached the hTnon-oscillation condition. As can be seen, the range of variation in RMSE for each set of parameter values is quite narrow compared to the union of all the ranges across the different sets of parameter values. Therefore, two points can be made here: (1) the variability observed in RMSE can be attributed mainly to the parameter variations, indicating the significant role of 375 parameter uncertainty, and (2) the effect of stochasticity in the reconstructed time series for the period preceding the period of records is minimal on the model performance in the evaluation period. This study concludes that for permafrost regions, deeper soil configurations in LSMs are needed than commonly adopted, to be able to correctly simulate the coupled energy and water balance in the subsurface. This conclusion can be extended to all earth system models that incorporate an LSM with permafrost representation. While this conclusion has also been pointed out by other authors, this work investigated the individual and joint effects of parameter uncertainty, total soil depth, lower 385 boundary conditions (geothermal flux), and climate conditions. Further, this work addresses the uncertainty in the reconstructions of past climate for model initialization and also the question of how the initialization should be carried out.
Our analysis shows that the minimum total soil depth should be around 20 m. This value is the reliable depth considering uncertainty and variability in model parameters and climate conditions used to initialize the model, and whether or not the geothermal flux is included as lower boundary condition. The metric defined to assess this depth was based on a depth at which 390 the annual maximum and minimum of daily soil temperature are equal, referred to as hT-non-oscillation condition in this paper. This depth represents a thermally stable condition and ensures that the lower boundary condition is deep enough to accommodate a no-heat-flux or constant-heat-flux boundary condition at the bottom of the soil configuration.
The variability observed in the value of hT-non-oscillation across the many simulations we conducted was mainly explained by parameter perturbations rather than climate conditions. This assessment was the case for the both sets of analyses in 395 Experiments 1 and 2. This emphasizes the importance of recognizing and addressing parameter uncertainty and raises serious issues with the common practice in using LSMs with GCMs, where model capabilities are constrained by using hard coded parameters determined based on look-up tables (Mendoza et al., 2015). We argued that model spin-ups that are based on recycling the 20 th century data should be avoided, as simulations on back-toback repetitions of any sequence of years with a warming trend will result in an unrealistically warm soil temperature profile. 400 Instead, we recommend a two-stage procedure to set the initial conditions of the model: in stage 1 (as conducted in Experiment 1), we spin-up the model on an "average" year, and then in stage 2, we further run the model on a multi-century long bootstrapbased paleo-reconstructions to the beginning of the period of record. The first phase has a stabilizing effect and assures that coherent state variables and fluxes are set before subsequent initialization of the model. This is an important step, as the majority of the LSMs have a large number of state variables to initialize (e.g., CLASS has 17). For the first step, we recommend 405 selecting an average year in term of air temperature and precipitation, and recycle that year in simulation until the soil temperature profile is stabilized. Then, in the second step, we recommend using multi-century long time series of climate variables generated based on the procedure proposed in this study. The proposed procedure reconstructs the time series of temperature using proxy records of summer temperatures derived from tree rings and generates the concurrent time series of other climate variables such as precipitation by applying block-bootstrapping on historical records. We were able to reproduce 410 quite well the past trends of summer temperature and we included the effect of uncertainty in the climate time series by generating 100 realizations. An important remark here is that the effect of short-time scale (e.g., annual) fluctuations in the reconstructed time series used for initialization was minimal, while low frequency trends were important. The length of reconstructions required for proper initialization is longer for deeper SCDs.
Finally, we envision our future work being directed to generalize the results obtained here by extending the analyses to other 415 locations where observations of soil profile temperature and past climate are available. Furthermore, implementing a variable SCD in regional and global models may be investigated, as also proposed by Brunke et al. (2016) for the Community Land Model version 4.5. However, the computational burden is a bottleneck for large-scale simulations. To address the computational issues, surrogate modelling strategies that develop cheaper-to-run statistical or mechanistic surrogates of the original models may be explored (Razavi et al. 2012), and also an endeavour may be made by the cryosphere community to 420 generate a unified gridded data set for the last millennium or so (Jungclaus et al., 2016;Landrum et al., 2013;Schmidt et al., 2011) that approximates soil temperature profiles with adequate soil depth, considering the effect of parameter uncertainty via generating ensembles of approximations.