Multimodel simulation of vertical gas transfer in a temperate lake

Abstract. In recent decades, several lake models of varying complexity have been developed and incorporated into numerical weather prediction systems and climate models. To foster enhanced forecasting ability and verification, improvement of these lake models remains essential. This especially applies to the limited simulation capabilities of biogeochemical processes in lakes and greenhouse gas exchanges with the atmosphere. Here we present multi-model simulations of physical variables and dissolved gas dynamics in a temperate lake (Harp Lake, Canada). The five models (ALBM, FLake, LAKE, LAKEoneD, MTCR-1) considered within this most recent round of the Lake Model Intercomparison Project (LakeMIP) all captured the seasonal temperature variability well. In contrast, none of the models is able to reproduce the exact dates of ice cover and ice off, leading to considerable errors in the simulation of eddy diffusivity around those dates. We then conducted an additional modeling experiment with a diffusing passive tracer to isolate the effect of the eddy diffusivity on gas concentration. Remarkably, sophisticated k−ε models do not demonstrate a significant difference in the vertical diffusion of a passive tracer compared to models with much simpler turbulence closures. All the models simulate less intensive spring overturn compared to autumn. Reduced mixing in the models consequently leads to the accumulation of the passive tracer distribution in the water column. The lake models with a comprehensive biogeochemical module, such as the ALBM and LAKE, predict dissolved oxygen dynamics adequate to the observed data. However, for the surface carbon dioxide concentration the correlation between modeled (ALBM, LAKE) and observed data is weak (∼0.3). Overall our results indicate the need to improve the representation of physical and biogeochemical processes in lake models, thereby contributing to enhanced weather prediction and climate projection capabilities.


lakes became resolvable on the model grid. Lakes are responsible for changing local atmospheric conditions by modifying turbulent fluxes of heat, moisture and momentum relative to adjacent land (Forbes and Meritt, 1984;Mahrt, 2000;Long et al., 2007;Thiery et al., 2015). This effect can be substantial over regions where the total coverage of inland water bodies is high, e.g., in Finland or Sweden (≈ 10 %, 9 % lakes of the total area, respectively (Lehner and Döll, 2004)), or where large lakes dominate (Docquier et al., 2016;Thiery et al., 2015Thiery et al., , 2016Thiery et al., , 2017. Additionally, lakes are increasingly recognized as a considerable source of greenhouse gases, such as carbon dioxide (CO 2 ) and methane (CH 4 ), to the atmosphere (Tranvik et al., 2009;Bastviken et al., 2011;Raymond et al., 2013;Tan and Zhuang, 2015;Wik et al., 2016).
With the development of multiple lake models of varying complexity, it becomes necessary to compare them and examine their merits and drawbacks in the context of atmospheric and limnological applications. The scientific community has addressed this need through the Lake Model Intercomparison Project (LakeMIP) launched in 2008 (http:// netfam.fmi.fi/Lake08/, last access: 16 February 2020). Since then, a methodology for lake model comparison has been developed , and lake model intercomparison experiments have been conducted for a range of limnological and climatic conditions. In each experiment, input parameters were unified as much as possible, including identical meteorological forcing, optical properties of water, ice, and snow, initial temperature profiles, and lake depth and/or morphometry. This allowed for a detailed inter-model comparison in respect of limnophysical process representation.
Previous LakeMIP studies primarily investigated the ability of the lake models to simulate the thermal regime of the water bodies that cover a wide range of size, depth and mixing regimes at different latitudes. In particular, the major effort has been spent on the simulation of the evolution of the vertical temperature profile along with the surface temperature and energy fluxes to the atmosphere. At the same time, biogeochemical processes were not considered in the model intercomparison studies.
This paper presents a logical continuation of LakeMIP by advancing intercomparisons to the study of biogeochemical processes in lakes. It primarily focuses on the simulation of key physical factors affecting the vertical transport of greenhouse gases, such as thermal stratification, vertical diffusion of gases, and ice cover. The ice cover is of special interest for greenhouse gas dynamics, as it can cause the depletion of oxygen, which favors CH 4 accumulation until spring ice breakup (Phelps et al., 1998;Karlsson et al., 2013;Jammet et al., 2015).
Our study aims at identifying the major merits and shortcomings of different lake model formulations, as well as uncertainties in simulating the limnophysical controls on greenhouse gas distribution and emissions to the atmosphere. Simulations were run for Harp Lake, a lake in Ontario, Canada, with long-term high-frequency monitoring data for meteorology, water temperature, CO 2 and O 2 . Five lake models were used in this study to simulate thermal dynamics and turbulent diffusivity in the lake: (1) Arctic Lake Biogeochemistry Model, ALBM (Tan et al., , 2017(Tan et al., , 2018Tan and Zhuang, 2015), (2) FLake (Mironov, 2003), (3) LAKE (Stepanenko et al., 2011(Stepanenko et al., , 2016, (4) LAKEoneD (Jöhnk and Umlauf, 2001;Jöhnk et al., 2008), and (5) Modelagem do Transporte de Calor no Reservatório, MTCR-1 Bleninger, 2015, 2018). The ALBM and LAKE models include comprehensive biogeochemical modules for calculation of dissolved gas concentrations, which were then tested in their ability to reproduce CO 2 and oxygen O 2 dynamics. The study is structured as follows: Sect. 2 includes the description of the study site, observational data, lake models, and experimental setup. Section 3 is devoted to the simulation of the lake's thermal regime, i.e., the temperature patterns and eddy diffusivity (ED). To assess the pure effect of the ED on the gas distribution and dynamics in the lake, additional experiments have been conducted, where the fate of passive tracers emitted from the bottom of the lake was analyzed. In a final section, the ALBM and LAKE models are compared in terms of the CO 2 and O 2 concentration in Harp Lake.  (Cox, 1978;Dillon et al., 1987) (Fig. 1). It has a 0.71 km 2 surface area, a volume of 0.0095 km 3 , a maximum depth of 37.5 m and a mean depth of 13.32 m. The lake is oligotrophic, characterized by average total phosphorus and chlorophyll-a concentrations of 6.7 and 3.5 µg L −1 (Molot and Dillon, 1991). The water is slightly acidic (pH 6.3) (Dillon et al., 1987). Harp Lake has six inflows and one outflow, with a longtime average discharge of 0.067 m 3 s −1 for the inflows and 0.091 m 3 s −1 for the outflow. The difference is due to small ungauged inflows and overland flow. Its average residence time is 3.1 years. Due to this relatively long residence time, the effect of the inflows on temperature can be neglected. The lake has only small water level fluctuations (maximum ∼ 0.2 m).

Observational data
The lake has been monitored by the Dorset Environmental Science Centre (DESC) for more than 40 years, thus providing a comprehensive database supporting the model experiments. The high-frequency and continuous observations used in this study have been collected during 5 years (14 July 2010-19 October 2015) with 10 min resolution on a monitoring buoy in the lake center (Fig. 1). The meteoro-logical variables collected are air temperature, humidity at a height of 1.75 m above the water surface, wind speed at 1.75 m, and downwelling shortwave and longwave radiation. Pressure and precipitation are available from a land-based meteorological station 500 m away from the western lake shore. The high-frequency dataset further includes a vertical temperature profile (up to a depth of 27.1 m prior to the summer of 2012 and to a depth of 9.85 m thereafter, aggregated to a time step of 1 h), O 2 concentrations (at two depths: 1, 18 m, aggregated to a time step -1 h) and CO 2 concentration (at a depth of 0.39 m, for the period 12 March 2012-19 October 2015, aggregated to a time step of 1 h). In addition, traditional temperature and O 2 profiles were collected to a depth of 35 m, except during the wintertime period, at a time step of approximately 2 weeks, and Secchi depth was also measured fortnightly or monthly during these 5 years (for more details, see Tables S6 and S7). The meteorological conditions of the region during the period under study are typical for continental climate of midlatitudes (Table 1). This region is subject to the influence of the Azores High during summer and is located in between the Icelandic Low and the Canadian High forming in the central part of the continent during winter. Cold arctic air usually invades this area during winter, leading to the formation of stable freezing weather. However, it should be considered that winters are usually characterized by several abrupt rises in air temperature above 0 • C (see Fig. S1 in the Supplement). This phenomenon causes the 2012 winter to rank as the warmest winter during the 5 measurement years. Average annual wind speed is relatively low (≈ 2 m s −1 ), likely caused by windshading effects imposed by the surrounding forest. Predominant wind directions are WSW (230-280 • ) and SE (130 • ).

Lake models
In this study, a suite of 1-D lake models was applied to Harp Lake. Two classes of finite difference models can be identified based on different approaches of calculating the vertical ED: Henderson-Sellers-based models (Henderson-Sellers, 1985;Hostetler and Bartlein, 1990), which are computationally inexpensive, and the more sophisticated models based on the k − ε turbulence closure scheme. The Henderson-Sellers approach computes the ED coefficient as a function of the Richardson number and considers winddriven turbulence during stable or neutral stratification. A separate procedure is considered for convective mixing and assumes instantaneous mixing of unstably stratified depth layers. The k − ε formulation involves prognostic equations for turbulent kinetic energy and its dissipation. Both windinduced and convective turbulence are taken into account in this approach. Within the first class there are the recently developed models ALBM and MTCR-1, while k − ε models include LAKE and LAKEoneD. The FLake model stands aside from the other 1-D models due to the two-layer bulk structure which employs the concept of self-similarity to estimate the temperature profile in the mixed layer and thermocline, respectively (Mironov, 2008). In the mixed layer the temperature is assumed to be constant, whereas below it is parameterized as a function of non-dimensional depth. ED is not explicitly specified in this model. Of all models, FLake has the lowest computational cost (Thiery et al., 2014a).
Ice formation is one of the main controlling factors for the dissolved gas dynamics in a lake during winter. Models like LAKEoneD and MTCR-1 involve a simple ice parameterization, in which ice cover is governed only by air temperature and lake temperature is not considered (Ashton, 1980(Ashton, , 2011. In the FLake model, the temperature profile within the ice cover is parameterized via a time-varying shape function with a linear asymptotic. The ALBM incorporates a more comprehensive ice parameterization developed by Fang and Stefan (1994), taking into account the physical processes of freezing of the surface water, radiation and heat penetration through ice and snow. The LAKE model takes into account the same physical processes and uses a multilayer ice parameterization (Stepanenko and Lykossov, 2005;Stepanenko et al., 2011) based on an unsteady heat transfer equation. Light extinction in all models follows the Beer-Lambert law with a constant light attenuation coefficient.
The ALBM and LAKE incorporate the most comprehensive biogeochemical modules, including the sources, sinks and transport of CO 2 , O 2 , and CH 4 (and nitrogen in the ALBM) (see Table S1 in the Supplement). LAKEoneD model calculates O 2 concentration only. Parameters and characteristics of all models are summarized in Table 2. All the models use finite-difference numerical schemes along vertical coordinates, except for FLake, which is based on the two-layer representation of the lake temperature profile (see above). The vertical grid of these models is fixed in time, while the number of numerical layers varies from 51 to 105 between models. The time step varies within the range of 20-3600 s between the models. The ALBM uses a fourth-order adaptive Runge-Kutta-Fehlberg solver and a variable time step. If there are abrupt changes in boundary conditions (such as air temperature, solar radiation and wind), the time step in this model will be reduced to avoid numerical instability.

model simulations testing the sensitivity of models to variations in
lake depth, setting it to mean value h ave = 13.32 m (LDSim), the light attenuation coefficient, prescribing µ min = 0.28 m −1 and µ max = 0.68 m −1 , which are minimal and maximal values measured, respectively (ExtMinSim, ExtMaxSim), 3. a model simulation testing the effect of the vertical diffusion on the distribution of the passive tracer (PassTr-Sim), and 4. a model simulation with biogeochemical modules activated including the calculation of the O 2 and CO 2 concentration (GasSim). Measured mean CO 2 concentration and O 2 observations are used as initial profiles. The time step of model output in all experiments is 1 h.
The concept of the self-similarity in the FLake model (Sect. 2.2) includes the "shape factor" coefficient C θ , which determines the vertical temperature profile below the mixed layer. Evolution of this parameter is controlled by a relaxation timescale t rc (Mironov, 2008). This timescale includes the dimensionless relaxation constant C rc with a default value of 0.003 in the model. This constant is a calibration parameter which is generally individual for each lake (Shatwell et al., 2016;Kirillin et al., 2017). We tested the sensitivity of the model to the variation of C rc using values 0.03, 0.3, 0.5, 1, 2, 3, and 30 and achieved the best agreement with the measured temperature profile using C rc = 0.3 (for details, see Sect. 3.1). In the following sections, this C rc setting is used in RefSim and other experiments with the FLake model.
The meteorological forcing is identical for all models and experiments. This, however, does not imply that heat fluxes to the atmosphere are the same in models, as models use different surface flux schemes. Some of the models do not include bathymetry (e.g., FLake); thus, for a more consistent comparison, in the RefSim experiment all models are integrated assuming homogeneous depth. In the GasSim experiment, lake morphometry is considered because the lake's carbon budget cannot be accurately calculated without including sediment oxygen demand along a sloping bottom.
The effect of the tributaries is neglected due to the large residence time of Harp Lake. Likewise, water-level variations are not considered in the models given the small temporal fluctuations.
The choice of the lake depth in the 1-D model without bathymetry may be crucial for the modeling results, not only for the temperature regime (Balsamo et al., 2010), but also for the gas dynamics. The gas dynamics is affected by lake depth as the latter controls the amount of bottom-originated gas to be biochemically transformed in the water column (e.g., oxidation of CH 4 ). Thus, it is important to examine the sensitivity of models to this parameter. In all experiments except LDSim, the maximal depth is used to compare the output with all available measurements. To test the effect of the chosen depth on model performance, the depth is set to mean depth (13.32 m) in the LDSim experiment; as such, the modeled lake volume is equal to the real lake volume.
The light attenuation coefficient is an important parameter, regulating the thermal regime of a lake (Heiskanen et al., 2015) and biogeochemical processes including photosynthesis. Secchi depth measurements during the 5-year study period on Harp Lake are irregular but enable calculation of mean, maximum and minimum light attenuation coefficients using the Poole and Atkins formula (Poole and Atkins, 1929). The motivation for conducting experiments with perturbed input parameters (depth and light attenuation coefficient) and initial conditions (dissolved gases) is that these properties, especially the light attenuation coefficient, become considerably uncertain in global applications. While a first global lake depth database recently became available (Choulga et al., 2014;Kourzeneva, 2010), such a dataset currently does not exist for water transparency.
To understand the role of the ED coefficient in gas distribution and dynamics, an additional experiment solving the pure vertical diffusive transport equation for a passive tracer governed by eddy diffusivities resulting from different lake models was conducted. Details on this experiment are presented in Sect. 3.3.
Harp Lake is a deep lake suitable for studying the transport of O 2 and CO 2 as well as their biogeochemical transformations due to long-term regular measurements of these gases. In this respect, the additional numerical experiment (Gas-Sim) was conducted with the ALBM and LAKE, the lake models with the most sophisticated biogeochemical modules. To harmonize the experimental setup, the contribution of CO 2 and O 2 fluxes through a sloping bottom to the lake carbon budget was switched on in the LAKE model. It should be noted that the ALBM includes a CO 2 sink/source due to outlets and inlets, whereas LAKE does not take this effect into account. In the ALBM, a set of biogeochemical parameters was calibrated, specifically, maximum chlorophyllspecific photosynthetic rate, maximum metabolic loss potential, aquatic dissolved organic matter (DOM) microbial degradation rate, terrestrial DOM microbial degradation rate, and groundwater dissolved organic carbon (DOC) concen-tration (Tan et al., 2017). No parameter calibration was performed for the LAKE model (a complete set of default values of model parameters is given in Stepanenko et al., 2016).
The model skill scores used in this study are listed in Table 4.
3 Results and discussion

Temperature and ice
All the lake models simulate seasonal water temperature dynamics adequately (Fig. 2): the epilimnion is warming throughout the first half of summer, accompanied by thermocline deepening, eventually leading to overturn in autumn. During winter, the lake temperature is at the freezing point immediately below the ice cover, causing stable stratification. During spring, the lake starts to mix due to the penetration of shortwave radiation. The ice formation and ice melt dates play a key role in various processes: the interaction of the lake surface with the atmosphere, the overturning during spring, biogeochemical processes, and release of gases that have accumulated during wintertime, e.g., CO 2 or CH 4 . Yet none of the models is able to precisely predict these dates. The difference between the model results and observations become even larger towards the end of the simulation, with differences exceeding 2 weeks in winter 2014-2015 (see Table S3).

FLake model
In RefSim we test the model with different values of C rc (see Sect. 2.3 above). The best agreement with the observed temperature profile we get using C rc = 0.3. In particular, the error (RMSE c ) for surface temperature reduces from 3.4 • C (using a standard value of c relax = 0.003) to 2.5 • and becomes close to other models. At the depth of ∼ 6-Hydrol. Earth Syst. Sci., 24, 697-715, 2020 www.hydrol-earth-syst-sci.net/24/697/2020/ 10 m RMSE c reduces from 5.7 • C (using a standard value of c relax = 0.003) to 5.2 • . In addition, RMSE c reduces from 3.3-0.7 to 1.8-0.2 • C at the depths 20-27 m. FLake demonstrates the largest error among other models for the depth of the thermocline (Fig. 3a) (RMSE c up to 5.2 • C). In contrast, the model performance in simulation of the lake surface temperature is comparable with other models, and the error is only 2.5 • C (see Fig. S2a), which points to the fact that it includes similar formulations for the surface heat balance to the other models.
As highlighted in former studies (e.g., Stepanenko et al., 2010;Thiery et al., 2014b), this model overestimates the surface mixed-layer depth during summer, even under ice (Fig. 2b). The mixed-layer depth in the FLake model is 1.5 times larger than in observed data (5.6 m versus 3.6 m). Predictions of ice cover and ice off are characterized by large Table 4. Statistical measures of model performance.
Symbol Property Definition f n , r n Model and observed data σ f , σ r Standard deviations of "f" and "r", respectively DM Difference between the mean values DM = f − r of the model and the observed data errors (see Table S3), with modeled dates more than 2 weeks off from the observations. Remarkably, FLake demonstrates the largest sensitivity among all models to the lake depth -RMSE c is reduced approximately 2-fold (from 2.5 to 1.28 • C) (see Table S4, Figs. S3 and S4) when using a mean depth of 13.32 m instead of the maximal depth 37.5 m. The thickness of mixed-layer depth reduces up to 4.5 m and it has a better agreement with the observations in comparison with the RefSim experiment.
FLake showed a similar response in the sensitivity model test to variation of the light attenuation coefficient to other models. However, the difference in temperature is smaller than in other models: in both RefSim and ExtMinSim, ExtMaxSim is less than 4 • C in comparison with 8 • C (Fig. 4). It can be related to the weaker temperature gradient in the thermocline in comparison with other models (Fig. 2).

ALBM
The warmest model in terms of the epilimnion temperature was the ALBM (Fig. 2a). The maximum water tempera-ture reached in the model is 32.4 • C (28.4 • C -in observations). However, the ALBM demonstrates the lowest RMSE c (1.53 • C) and the highest correlation (0.98) with the observed data for the temperature of the water column (except the surface) during the entire simulation period among all models (Figs. 3a and S2, Table S4). It can be noted visually that the shape of the thermocline is better reproduced by this model than by other model participants (Fig. 3a). In addition, the correlation coefficient of ALBM temperature decreases more slowly with depth than in all other models (Fig. 3b). In contrast, RMSE c for the surface temperature in the ALBM is almost 1.5 times larger than in k − ε models (see Fig. S2a, Table S4).
The ALBM successfully reproduces autumnal overturn, although the homogeneous temperature distribution in this model occurs approximately 7-10 d later compared to the measurements. The likely reason for this offset is the excessive accumulation of heat in the mixed layer during summer, which delays the cooling to the temperature of maximum density and subsequent vertical mixing.
Hydrol. Earth Syst. Sci., 24, 697-715, 2020 www.hydrol-earth-syst-sci.net/24/697/2020/ The ice-cover and ice-off dates are quite close to observations (e.g., 24 December versus 28 December 2011; see Table S3). Differences between model and observation icecover dates of more than 2 weeks happen three times. The large errors mainly occur for ice cover, likely due to the excessive heat accumulated in summer.

MTCR-1 model
In terms of temperature, the Henderson-Sellers-based MTCR-1 model shows contrasting performance compared to ALBM (Fig. 2e). It is generally the coldest model (the temperature reaches a maximum of 24.2 • C in the epilimnion) and results in the deepest mixed layer (5.9 m versus 3.6 m observed). This significant difference in mixed-layer depth (see Table S2) is possibly associated with different schemes of convective mixing.
Similarly to the FLake model, the surface temperature in MTCR-1 is sensitive to changes in lake depth. The surface temperature DM reduced 4 times compared to the reference model simulation when decreasing the lake depth from 37.5 to 13.32 m (see Table S4).
While MTCR-1 and LAKEoneD use similar parameterization for ice formation and melt, their results are similar only for the ice-off dates. The ice-cover dates greatly vary between these models, with differences reaching up to 1 month  Table S3), when MTCR-1 does not show it at all (there is no observed ice melt during these winters).

LAKEoneD and LAKE models
Even though LAKE and LAKEoneD use the same turbulence closure, they represent the temperature dynamics in the lake with considerable differences (Fig. 2c and d). In particular, the mean depth of the mixed layer as simulated by LAKE is 1 m deeper compared to LAKEoneD results (4.8 and 3.9 m, respectively). LAKE demonstrates greater heating of the epilimnion (up to 30 • C) than LAKEoneD (maximum value of 27.3 • C). The homogeneous distribution of temperature during autumn mixing predicted by LAKEoneD reaches the depth of 15 m in most years, as opposed to LAKE (up to 35 m) and the observational data (at least up to 25 m, where the deepest sensor is located). Remarkably, only in these two models do periods of unstable ice cover occur. For example, the ice in LAKE is thin and often comes off during winter (more than five times in winter 2011-2012), leading to winddriven surface layer mixing (see Table S3).
The sensitivity experiments with varying light attenuation coefficients (see Table S4) generally result in a similar response in all models. Increasing the light attenuation co- efficient (i.e., reducing water transparency) leads to an upward shift of the thermocline position (Fig. 4); conversely, decreasing the attenuation coefficient causes a downward move of the thermocline. This leads to a temperature change at respective depths of up to 4 • C in the FLake model and 8 • C in the ALBM and LAKEoneD. Figure 4 contains only three models with different types of turbulent closure because the effect of varying the light attenuation coefficient is very similar in models of the same type (e.g., k − models LAKEoneD and LAKE).
It is important to note that all models have a maximum error, RMSE c for temperature (compared to observations) in the thermocline (Fig. 3a), which agrees with a previous study (Stepanenko et al., 2014). Conversely, the correlation between simulated and observed temperature decreases with depth in all models (Fig. 3b), starting from 0.9-1.0 near the surface to 0.1-0.7 close to the bottom.

Eddy diffusivity
Eddy diffusivity (ED), K, is the variable controlling the vertical turbulent transport in lake models. This section examines the results of simulated ED values, thereby focusing on discrepancies between models caused by different turbulent closure schemes.
The simulated time-depth distribution of ED is shown in Fig. 5. EDs in ALBM and MTCR-1 do not reproduce spring and autumn whole-water-column mixing as the Henderson-Sellers diffusivity is not valid for unstable stratification. In contrast, k − ε models correctly predict the deepening of the mixed layer in the ED pattern during summertime towards a complete mixing during autumn.
Even models employing the same type of turbulence parameterization differ in simulated eddy diffusivities in the thermocline by orders of magnitude. This is caused by the different background values of ED added to those estimated by turbulence closures and mimicking mixing mechanisms Hydrol. Earth Syst. Sci., 24, 697-715, 2020 www.hydrol-earth-syst-sci.net/24/697/2020/ not directly resolved in 1-D models such as internal wave breaking (e.g., Hondzo and Stefan, 1993). Apparently, the background diffusivity is switched off in the MTCR-1 model during ice cover. The increase in ED near the bottom during the stratified period in the ALBM can be explained by the change from a stable stratification in the thermocline to a quasi-neutral stratification beneath.
A peculiar feature of LAKE is that it features highintensity turbulence during much of the ice-cover period in the lower part of the water column that is apparently related to residual flow after the autumn overturn. In contrast, in LAKEoneD, turbulence dissipates shortly after the ice cover. Both of these models demonstrate a subsurface region of high ED values in winter which is not reproduced by other models. Production of turbulent kinetic energy (TKE) below the ice may be caused by both momentum transfer through partial ice cover (implemented in LAKE) and shortwave radiation penetration causing under-ice convection. The conspicuous difference between LAKE and LAKEoneD results is that springtime deep mixing does not develop in LAKEoneD. This may be linked to different constants and stability functions used in its k − ε closures or different buoyancy and momentum fluxes at the lake-atmosphere interface during this season.
It is worth noting that the significant difference between the modeled and observed dates of ice cover and ice off (see Table S3) leads to errors in the simulation of ED as water opens and interacts with the atmosphere.
The main features of ED distribution over seasons described in this section hold in experiment LDSim with a different lake depth value (13.32 m).

Effects of eddy diffusivity on the vertical distribution of a passive tracer
The vertical distribution of ED in lake models is one of the major drivers of vertical transport of dissolved gases and their emission to the atmosphere. To isolate the effect of the ED parameterization in the different models, we solve the diffusion equation for a passive tracer concentration C: The emission rate of the passive tracer at the bottom is set constant: while at the surface the flux is assumed to be large enough to make the surface concentration negligible compared to deepwater concentration, so that the boundary condition here becomes Setting zero concentration at the surface is justified by the fact that surface concentration of gases of interest (CO 2 , CH 4 ) is typically orders of magnitude less than bottom concentration. The total (molecular plus turbulent) diffusivity K is taken from different models, so that the solution C of Eqs. (1-3) is individual for each lake model. Calculation of tracer diffusion was carried out using simulated values K(z, t) from the two Henderson-Sellers-based models -ALBM, MTCR-1, and two k − ε models -LAKE and LAKEoneD, covering the period of 5 years. In order to take into account vertical mixing under unstable stratification performed in ALBM and MTCR-1 and not expressed by Henderson-Sellers diffusivity, a value K = 10 5 m 2 s −1 was applied for the convective layer in such cases.
The results of the numerical tracer experiment are presented in Fig. 6. All models demonstrate seasonal variability of the passive tracer concentration, including the nearbottom accumulation during stratified periods. Three models (ALBM, MTCR-1, and LAKE) perform the full overturn of the water column during autumn and spring, leading to the emission of almost the whole tracer storage to the atmosphere and forming a homogeneous concentration profile. In LAKEoneD, the vertical mixing does not reach the lake bottom during spring and autumn in certain years (e.g., 2011 and 2014). Hence, not all tracer amount is removed from the water column during overturn periods (see Fig. S5b) in this model. As a result, not only the concentration of the tracer near the bottom, but also the total tracer amount in the water column increases in LAKEoneD throughout the simulation period. However, this effect is a result of the assumption of a bathtub-shaped lake bathymetry and would be less when including the depth dependence of lake volume.
The MTCR-1 model demonstrates short overturn periods similar to those of LAKEoneD, but the mixing reaches deeper. Hence, the passive tracer does not accumulate over time. Furthermore, ALBM exhibits a higher near-bottom concentration than MTCR-1 because the ED simulated by ALBM is 1 order of magnitude lower compared to MTCR-1 at 30-35 m depth (see Fig. 6).
To assess the effect of different eddy diffusivities on the surface tracer flux, we use the ratio of the surface flux, F surf , to the bottom flux, F bot , where the former is given as with C eq = 0 the surface concentration, C w the concentration at the first model output level below the surface (0.3 m), both in mol m −3 , z = 0.3 m, and K surf the surface diffusivity coefficient. The Henderson-Sellers-based models produce short periods with high fluxes, primarily during autumnal mixing (see Fig. S5a), and the drops of the integral concentration in the water column at the same time (see Fig. S5b). The flux spikes are several orders of magnitude higher compared to k − ε models. A reasonable explanation is that instantaneous convective mixing implemented in the ALBM and MTCR-1 models leads to almost the whole tracer content inside the convective layer being released to the atmosphere during a single convective event. All the models demonstrate much weaker tracer fluxes during spring overturn compared to those resulting from autumnal convection.
Lake models with k − ε closure produce substantial tracer fluxes during the stratified period in particular years (e.g., 2013 and 2014). The likely reason for this effect is rapid changes in mixed-layer depth, due to sudden wind forcing, entraining higher concentrations from the hypolimnion into the mixed layer.
The tracer diffusion experiment was also conducted at the reduced depth of 13.32 m (see Fig. S5c and d). Due to the lower depth, seasonal vertical mixing now extends over the whole water column in all models, providing more effective release of the substance. ALBM and MTCR-1 demonstrate episodic high fluxes during summer along with k −ε models. The spring mixing in LAKEoneD now reaches the bottom in some years; hence, the integral tracer amount evolves here in a similar way compared to all other models. The timeaveraged tracer emission in LAKEoneD is thus the most sensitive to depth reduction. The mean surface flux in this model increases up to approximately 54 %, while for the ALBM, MTCR-1, and LAKE this increase is 4 %, 19 %, and 8 %, respectively. This can be primarily explained by the contribution of the increased fluxes during summer stratification and of the autumn mixing now reaching the bottom.

Oxygen
The evolution of the observed vertical distribution of O 2 concentration in Harp Lake is shown in Fig. 7c excluding winter periods and data with higher sampling frequency (1 h) collected at 1 and 18 m depth (Fig. 8). During spring (end of April-beginning of May) after ice off, convective mixing occurs, leading to a homogenized O 2 profile. The sim-Hydrol. Earth Syst. Sci., 24, 697-715, 2020 www.hydrol-earth-syst-sci.net/24/697/2020/  ilar process happens in autumn. During summer, O 2 depletes in the hypolimnion due to its consumption by organic matter degradation in sediments and deep water layers. The main source of O 2 is located in the photic layer, where it forms as a result of photosynthesis. Below the mixed layer (at 5-tion rate from organic matter degradation in the lake. In summer, surface oxygen content decreases following reduction of temperature-controlled solubility. The measurements thus indicate common trends observed in most deep oligotrophic lakes at mid-latitudes (Mitchell and Prepas, 1990). Both models depict a reasonable representation of O 2 dynamics at different depths ( Fig. 7a and b). The ALBM includes the O 2 model described in Tan et al. (2017), whereas LAKE employs the representation of O 2 sinks and sources from Stefan and Fang (1994). Even though they use different oxygen and turbulent parameterizations, both models show a similar spatiotemporal pattern. In particular, they reproduce the maximum of O 2 concentration below the mixed layer, its quasi-linear in time decrease in the hypolimnion (18 m) during stratified periods, and the effect of mixing events during spring and autumn, marked by rapid changes in O 2 concentration at 18 m depth. However, the O 2 concentration as computed by LAKE is on average 1-2 mg L −1 higher than the observed values. A plausible explanation for this positive bias is that photosynthesis in LAKE yields excessive production of O 2 (parameters of biogeochemical parameterizations in this model have not been calibrated in the study). Another deficiency of LAKE simulations is that during ice cover nearsurface O 2 decays linearly and then exhibits significant peaks at thinning of ice cover and ice off, whereas in the ALBM and measurements, oxygen content remains almost constant during these periods, which is more consistent with measurements.
The time series of O 2 concentration from the ALBM and LAKE correlate reasonably with observed data, with the linear correlation coefficient ranging between 0.5 and 0.6 at 1 m and between 0.6 and 0.8 at 18 m depth, and with RMSE c 1.27-1.56 mg L −1 at 1 m and 1.19-1.52 at 18 m depth (see Fig. S6). For both depths, the ALBM demonstrates higher correlation with observations. Both models correlate better with observations at large depth. Decrease in the temperature in the ExtMaxSim simulation (Fig. 4) due to the upward shift of the thermocline leads to the reduction of O 2 concentrations at the respective depths due to temperature dependence of photosynthesis.
Overall, the oxygen content is highly dependent on physical controls, which are temperature, radiation, ED and ice cover. Given that these factors are reasonably reproduced by the ALBM and LAKE, the O 2 spatiotemporal pattern is captured as well. Successful representation of maximal O 2 content below the mixed layer during summer may be an important modeling skill for simulating correctly CH 4 in a lake because the former acts as a sink region for the latter. Furthermore, realistic oxygen concentration reduction during periods of stable stratification means that the respective CO 2 production from aerobic organic matter decomposition is reproduced reasonably as well.

Carbon dioxide
The vertical distributions of CO 2 concentrations simulated by the ALBM and LAKE are shown in Fig. 9. The ALBM simulates higher concentrations of CO 2 compared to LAKE. This is likely because, in contrast to LAKE, the ALBM considers dissolved inorganic carbon (DIC) inflow from the catchment and explicitly treats DOC and particulate organic carbon (POC) dynamics (Tan et al., 2017). Parameter calibration applied in the ALBM brought CO 2 content closer to measurements in this model. Correlation with the observational data at a depth of 0.39 m is relatively small for both models (0.3), and RMSE c is 4.5 and 2.7 mg L −1 for LAKE and the ALBM, respectively.
Both models reproduce the seasonality of the CO 2 concentrations: during summer CO 2 accumulates in the hypolimnion, while CO 2 is consumed in the mixed layer due to photosynthesis. For the same reason, the depth of the O 2 maximum (Fig. 7c) is consistent with the metalimnetic CO 2 minimum.
Vertical turbulent diffusion (Sect. 3.3) greatly affects CO 2 patterns in models. In particular, the mixing periods clearly seen in Fig. 9 coincide with those in Fig. 6. Accumulation of CO 2 below the mixed layer during stratified periods is clearly demonstrated by both models. However, the vertical CO 2 distribution in the thermocline is very different; i.e., in LAKE, concentration attains its maximum near the bottom, whereas in the ALBM the maximum is located at 15-20 m depth. This can be related to the fact that in the ALBM, DOC and DIC from surface water and groundwater can be injected in the middle of the water column.
In contrast to oxygen, measured CO 2 content is much less correlated with simulated series. The wintertime increase in surface concentration seen in calculated series is hardly observable in measured data. In summer, simulated CO 2 series are smooth, whereas there are significant fluctuations in empirical data. There is some correspondence between seasonal-mixing-caused peaks in modeled concentration and the measured one, though with time lags. All this leads to the conclusion that, compared to oxygen, carbon dioxide is more controlled by biogeochemical processes misrepresented in lake models than by physical factors. As stated in the previous section, a realistic decay of oxygen content during stratified periods in the ALBM and LAKE models suggests that CO 2 amount produced by aerobic decomposition of organic matter in both the water column and in the top part of sediments is reasonably simulated as well. Satisfactory agreement of computed oxygen in the mixed layer and below with observations implies that photosynthesis minus respiration rate is fairly captured as well, and so do the models for CO 2 gain or loss due to these processes. Hence, the primary drawback of the models used in respect to DIC simulation is likely to be not explicitly simulating transport of carbon species from catchment to a water body. Thus, modeling approaches coupling the catchment and a lake presented recently (Fut-Hydrol. Earth Syst. Sci., 24, 697-715, 2020 www.hydrol-earth-syst-sci.net/24/697/2020/ The important role of catchment processes in building up the DIC levels in lakes introduces an extra difficulty for implementing lacustrine CO 2 emissions in the Earth system models. This is caused by the necessity to provide regional or global data on a lake catchment's geometric, physical and biogeochemical properties, which are not currently available. In relation to this, we can also note a faster progress on a roadmap to introduce lake CH 4 dynamics in climate models, where simulations passed from site-level studies to regional estimates , whereas CO 2 modeling is currently confined to individual lakes (Stepanenko et al., 2016;Tan et al., 2017;Kiuru et al., 2018).

Conclusions
A lake model intercomparison study focusing on physical controls of biogeochemical species (i.e., dissolved gas concentration) is performed, marking a step forward in the LakeMIP exercises, which have been previously focused on thermodynamic modeling only. To evaluate model performance with respect to biogeochemical processes, we had to first evaluate their performance in hydrody-namic/thermodynamic simulations and variations due to uncertainty in major driving parameters, such as temperature and turbulent diffusivity.
The participating lake models (ALBM, FLake, LAKE, LAKEoneD, MTCR-1) were evaluated regarding their ability to reproduce the thermodynamic regime of Harp Lake. All the models capture the seasonal variability of the temperature profile in the deep boreal dimictic lake. A substantial discrepancy was found between the models in the representation of thermocline evolution -consistent with previous studies -and the correlation coefficient for calculated and observed temperature was found to decrease from surface to bottom (from 0.9-1 to 0.1-0.6). The sensitivity of the simulation results to the light attenuation coefficient and the lake depth was assessed, given that in global applications, such as numerical weather prediction, they are external parameters known with high uncertainty.
All considered lake models do not accurately reproduce the dates of ice cover and ice off, similar to the results from a previous simulation study for the same lake which used other lake models and an earlier time period (Yao et al., 2014). The difference between the model results and the observations exceeds 2 weeks in particular years. The unrealistic representation of ice formation and ice melt dates leads to untimely changes in eddy diffusivity (ED) within the water column, which in turn strongly affects the modeled distribution of gases and their emission to the atmosphere. This implies more efforts of a community should be spent on elaborating ice-snow schemes in lake models, where currently vertical homogeneity and temporal invariance of optical and thermodynamic properties are typically assumed, which does not correspond to a bulk of existing knowledge (Leppäranta, 2015). In a recent study (Tan et al., 2018), one possible direction to improve the simulation of ice cover is shown. They demonstrated that including the conversion of snow to white or slush ice when the weight of ice and snow exceeds the buoyancy of the ice cover can significantly improve the ice simulation results. In the special case of saline lakes, effects of salt trapping in ice cover become important and should be adequately parameterized as well (Stepanenko et al., 2019).
To study the effect of turbulent transport on the vertical distribution of gases, we conducted passive tracer simulations using simulated ED from different models.
This experiment reveals that all lake models demonstrate more intense mixed-layer deepening during autumn than during spring, with spring overturn in some cases not even reaching the lake bottom (e.g., in MTCR-1 and LAKEoneD).
Moreover, less intensive mixing during spring and autumn leads to an accumulation of passive tracer concentrations in deep water at a multiyear timescale.
In addition, the representation of an equivalent lake depth in 1-D models is found to be a crucial parameter regulating both passive tracer concentration and its flux to the atmosphere: when reducing the depth of a lake approximately by 50 %, the average tracer flux increased by 4 %-54 %.
The lake models with the most comprehensive biogeochemical modules, such as ALBM and LAKE, demonstrated reasonable reproduction of the O 2 concentration profile and its seasonal evolution. For instance, both models predict an oxygen concentration maximum and corresponding CO 2 minimum below the mixed layer in summer, elevated surface concentration and linear decay of deep-water content under ice cover, and vertical homogenization during spring and autumn overturns.
The main difference between the models in the CO 2 simulation is that the ALBM involves an additional source of DIC from the catchment. Due to this, the average CO 2 concentration in the ALBM is higher than in LAKE and closer to measurements. Results from both models are weakly correlated with measurements of surface CO 2 concentration. This leads to the conclusion that, compared to oxygen, carbon dioxide is much more controlled by biogeochemical processes misrepresented in lake models than by physical factors.
As a result of this study, we emphasize the following issues to be addressed by the lake modeling community in respect to dissolved gas simulation.
-Simplified lake modeling approaches especially employing a parameterized temperature profile and representing reasonably the surface temperature may fail in calculating vertical temperature distribution, with a potentially significant effect on biogeochemical processes if the respective module is coupled to the thermodynamic compartment.
-For deep lakes, models differ in the depth of autumn and especially spring overturn; if this depth reaches bottom, the gas content originating from sediments and accumulating in the hypolimnion during the preceding stratified period is almost completely released to the atmosphere; otherwise, these gases may continuously accumulate in deep water at a multi-year timescale; however, more research is needed to establish the effect of lake morphometry on these processes.
-Oxygen content in both surface and deep water is efficiently controlled by physical and biogeochemical factors, successfully simulated by lake models, whereas measured surface CO 2 exhibits significant temporal variability not captured by the models, calling for more research in biogeochemical mechanisms and parameterizations of carbon dioxide dynamics.
-1-D models have certain limitations when applied to such a 3-D system as a lake. They do not capture all the lake mixing mechanisms such as density-driven currents, which can be important under certain conditions (Samolyubov, 1999). It was found that for deep lakes with an extended littoral zone such as Lake Geneva (Fer et al., 2001(Fer et al., , 2002 or Lake Van (Kaden et al., 2010), these flows can be significant in terms of vertical heat and gas transfer. In particular, for Harp Lake it may be not important due to smaller depth and not a large shallow area. In winter, a large-scale convective circulation (up to 3-5 cm s −1 ) due to heat exchange at the sediment-water interface may develop under ice (Kirillin et al., 2015). However, the importance of this circulation in terms of gas transfer has not been studied yet. So far, to the best of our knowledge, these currents are not parameterized in 1-D models.
Code and data availability. Author contributions. VS, WT, KJ, ZT, and HY conceptually designed the experiments. SG, BAP, TB, WT, KJ, ZT, and QZ carried them out and provided the results of the simulations. JAR and HY provided observational data. SG made a formal analysis and combined all the data and visualized them under the supervision of VS. SG prepared the manuscript under the supervision of AL and VS with significant contributions from all the co-authors.