<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">HESS</journal-id><journal-title-group>
    <journal-title>Hydrology and Earth System Sciences</journal-title>
    <abbrev-journal-title abbrev-type="publisher">HESS</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Hydrol. Earth Syst. Sci.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1607-7938</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/hess-30-2973-2026</article-id><title-group><article-title>Interpretable soil moisture prediction with a knowledge-guided deep learning approach</article-title><alt-title>Interpretable soil moisture prediction with a knowledge-guided deep learning approach</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Wang</surname><given-names>Yanling</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Hu</surname><given-names>Xiaolong</given-names></name>
          <email>xlhu@whu.edu.cn</email>
        <ext-link>https://orcid.org/0009-0007-2015-4946</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Hu</surname><given-names>Yaan</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>He</surname><given-names>Leilei</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Wang</surname><given-names>Lijun</given-names></name>
          
        <ext-link>https://orcid.org/0009-0008-9298-1059</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Song</surname><given-names>Wenxiang</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-9834-7197</ext-link></contrib>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Shi</surname><given-names>Liangsheng</given-names></name>
          <email>liangshs@whu.edu.cn</email>
        </contrib>
        <aff id="aff1"><label>1</label><institution>State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan, China</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Nanjing Hydraulic Research Institute, Nanjing, China</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Xiaolong Hu (xlhu@whu.edu.cn) and Liangsheng Shi (liangshs@whu.edu.cn)</corresp></author-notes><pub-date><day>19</day><month>May</month><year>2026</year></pub-date>
      
      <volume>30</volume>
      <issue>10</issue>
      <fpage>2973</fpage><lpage>2994</lpage>
      <history>
        <date date-type="received"><day>10</day><month>September</month><year>2025</year></date>
           <date date-type="rev-request"><day>18</day><month>September</month><year>2025</year></date>
           <date date-type="rev-recd"><day>21</day><month>April</month><year>2026</year></date>
           <date date-type="accepted"><day>27</day><month>April</month><year>2026</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2026 Yanling Wang et al.</copyright-statement>
        <copyright-year>2026</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026.html">This article is available from https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026.html</self-uri><self-uri xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026.pdf">The full text article is available as a PDF file from https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e147">Soil moisture (SM) is a critical component of the hydrological cycle, but accurately predicting it remains challenging due to the nonlinearity of soil water transport, variability in boundary conditions, and the complexity of soil properties. Recently, deep learning has shown promise in this domain, typically by modeling temporal dependencies for soil moisture predictions. In this study, we propose non-local neural networks (NLNNs) that recast this problem as a single-time-step, simultaneous multi-depth soil moisture forecasting task. The non-local operation design includes embedded Gaussian operations and disentangled knowledge-guided operations, resulting in two variants: the self-attention non-local neural network (SA-NLNN) and the knowledge-guided non-local neural network (KG-NLNN). The knowledge-guided non-local operation is designed to capture vertical soil moisture relationships by decomposing the influences on soil moisture at a given depth into four components, each governed by distinct physical processes. The models offer visual interpretability through learned non-local weights, which reveal interactions among soil moisture across different depths, thereby enabling a qualitative representation of inter-layer connectivity. Notably, the model guided by soil moisture transport knowledge yields more stable and reasonable interpretations. Using in-situ observations, we demonstrate that the proposed models perform satisfactorily and that the knowledge-guided non-local operations significantly enhance accuracy and reliability. Additionally, our models adapt to diverse time scales while maintaining high computational efficiency. Both models are robust to noise, and knowledge guidance further improves the noise resistance of KG-NLNN. In summary, our work addresses the soil moisture prediction challenge in a novel way, highlighting the potential of NLNNs and the importance of incorporating physical guidance in data-driven models.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>National Key Research and Development Program of China</funding-source>
<award-id>2021YFC3201203</award-id>
</award-group>
<award-group id="gs2">
<funding-source>National Natural Science Foundation of China</funding-source>
<award-id>52425901</award-id>
<award-id>U2243235</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e159">Soil moisture plays an important role in hydrological processes, governing the exchange of water and energy fluxes between the atmosphere and the land (Vereecken et al., 2008). Accurate simulations of soil moisture dynamics are important in various domains, including water resources planning and management, agricultural production, and flood disaster monitoring (Entekhabi et al., 1996; Koster et al., 2004; Zhang et al., 2018). However, precisely forecasting soil moisture dynamics is challenging due to the nonlinearity of soil water transport (Richards, 1931), randomness in boundary conditions (Guswa et al., 2002), and the complexity of soil properties, including soil structure and hydraulic parameters (Vereecken et al., 2022). These factors contribute to strong spatio-temporal variability in soil moisture dynamics (Heathman et al., 2012). Traditionally, the simulation of soil moisture dynamics has relied primarily on physically based models, such as the soil-plant-atmosphere-water model (Saxton et al., 1974) and HYDRUS (Simunek et al., 2005). However, their application is hampered by the difficulty of accurately estimating the required parameters (Bandai and Ghezzehei, 2021; Gill et al., 2006). Moreover, current methods struggle to accurately characterize soil structure at spatially relevant scales (Romero-Ruiz et al., 2018). This limitation complicates the handling of scenarios involving cracks, root water uptake, and other complexities, as illustrated in Fig. 1. With advances in technology and big data analysis, data-driven models have attracted increasing attention and appear to be more practical for forecasting soil moisture dynamics. For instance, both support vector regression and random forest have shown satisfactory results in soil moisture prediction at low computational cost (Gill et al., 2006; Prasad et al., 2019). Furthermore, the extreme learning machine (Huang et al., 2006) has demonstrated its capability to precisely predict soil moisture trends (Liu et al., 2014).</p>

      <fig id="F1"><label>Figure 1</label><caption><p id="d2e164">Examples of complex soil conditions related to soil texture and soil structure at the soil profile scale. SM3 is more closely related to SM1 than to SM2 or SM4, owing to the presence of wormholes. The proposed non-local neural network is designed to learn that SM3 is highly correlated with SM1 (because of fast water migration through wormholes) and less correlated with SM2 (because of slow seepage under gravity).</p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f01.png"/>

      </fig>

      <p id="d2e173">In recent years, deep learning (Lecun et al., 2015) has gained considerable attention for its remarkable capability to fit complex data patterns. When predicting soil moisture, deep learning primarily relies on modeling temporal dependencies. The fundamental models for sequential data fall into three categories: Recurrent Neural Networks (RNNs) (Elman, 1990), Convolutional Neural Networks (CNNs) (LeCun, 1989), and Transformers (Vaswani et al., 2017). RNNs exploit temporal dependencies through recurrent operations, with Long Short-Term Memory (LSTM) networks demonstrating accurate soil moisture predictions (Fang et al., 2019). CNNs capture dependencies with repeated convolutional operations and also yield satisfactory results in soil moisture dynamics modeling (Severyn and Moschitti, 2015; Shi et al., 2015). Both recurrent and convolutional operations process local neighborhoods of the input data. Consequently, long-range dependencies can only be captured through repeated local operations, which is inefficient (Zhu et al., 2021). In contrast, Transformers process data more efficiently owing to their core component, the self-attention mechanism, which extracts long-range non-local information directly. For instance, Temporal Fusion Transformers with interpretable self-attention layers have shown significant improvements over existing benchmarks in multi-horizon time series forecasting (Lim et al., 2021). Furthermore, Transformers have shown potential for effective soil moisture dynamics prediction with straightforward model structures (Wang et al., 2024). Researchers are increasingly recognizing the potential of Transformers.</p>
      <p id="d2e177">However, current deep learning models often lack physical constraints and interpretability. To bridge the gap between data-driven approaches and physics, physical principles can be embedded into loss functions or model architectures. Some researchers have added the residuals of governing physical equations to the loss function, giving rise to Physics-informed Neural Networks (PINNs) (Raissi et al., 2017, 2019). In terms of model architectures, Jiang et al. (2020) integrated the physical processes from a conceptual hydrological model into an RNN for runoff modeling. De Bézenac et al. (2018) incorporated advection-diffusion principles into the kernel design of a CNN to predict sea surface temperature. To date, most previous works have relied on traditional model structures, leaving a critical gap in reliable data-driven methods for soil moisture prediction. This underscores the need to move toward soil science-informed machine learning models that harness the power of data-driven techniques while integrating soil science knowledge during training to enhance reliability and generalizability (Minasny et al., 2024).</p>
      <p id="d2e180">Because physical models compute soil moisture content by iteratively using the current soil profile state for stepwise predictions, we incorporate the spatial interactions of soil moisture within the profile into our machine learning model. We update soil moisture at each depth based on the states of all depths, with predictions computed as a weighted aggregation of the previous states. When dealing with relationships between multiple variables, geometric deep learning (Bronstein et al., 2017) defines model invariances to enhance robustness and generalization. For example, graph neural networks (GNNs) (Scarselli et al., 2008) use the adjacency matrix to aggregate node features and achieve local invariance. Wang et al. (2025) proposed a spatiotemporal graph convolutional network that models inter-station relationships to predict soil moisture effectively. While GNNs aggregate information through graph-structured neighborhood relationships, Non-local Neural Networks (NLNNs) directly model pairwise dependencies among all positions (Wang et al., 2018). This fully connected interaction pattern allows each position to interact directly with all other positions, thereby enabling the model to capture long-range global dependencies. The interaction weights are adaptively determined by the real-time soil moisture state in a fully data-driven manner. This fundamental difference reflects distinct inductive biases: GNNs rely on graph-structured message passing, whereas NLNNs explicitly model global interactions without neighborhood restrictions. For soil moisture dynamics, where relevant dependencies may exist between distant soil layers and vary over time, such global modeling capability is particularly beneficial. The non-local operation in NLNNs calculates the response at a given location by aggregating features from all positions in the input feature map (Wang et al., 2018). This design allows NLNNs to flexibly model global relationships in a data-driven manner, making them suitable as a general modeling module for various tasks. Considering the complexity of interactions between soil moisture at multiple depths, we introduce NLNNs to capture spatially invariant soil moisture relationships across soil layers. Our objective is to model vertical heterogeneity and inter-layer connectivity without physical assumptions. Moreover, the weights computed through non-local operations provide a qualitative interpretation of what the model has learned. NLNNs are widely applied in image segmentation and time series forecasting (Liu et al., 2019; Zhu et al., 2019). As a representative NLNN, the Transformer is adept at processing various types of data, including images and video (Guo et al., 2022; Khan et al., 2022; Lim et al., 2021; Liu et al., 2021; Xie et al., 2021). Furthermore, NLNNs can serve as auxiliary blocks to enhance context modeling (Wang et al., 2018; Yin et al., 2020). Given the flexibility with which non-local operations can be modified, NLNNs can be adapted to simulate the spatial characteristics of soil water dynamics while ensuring interpretability.</p>
      <p id="d2e183">In this study, we use NLNNs to simulate in-profile soil moisture interactions and predict multi-depth soil moisture content without physical assumptions. Our aim is to achieve accurate and effective forecasts under diverse real-world scenarios, as depicted in Fig. 1, while also providing a qualitative description of intricate soil moisture dynamics, such as vertical heterogeneity and inter-layer connectivity. Specifically, we discard all assumptions about soil, root, or boundary conditions and instead attempt to learn the soil water dynamics directly from the data. Unlike traditional one-dimensional soil water flow models that often focus on adjacent-layer fluxes, our model captures complex vertical dependencies and non-uniform moisture redistribution across depths, enhancing predictions in complex scenarios. We introduce the Self-Attention Non-local Neural Network (SA-NLNN) to explore the potential of NLNN structures in soil moisture forecasting. In addition, we propose the Knowledge-Guided Non-local Neural Network (KG-NLNN), which incorporates soil water transport guidance into the non-local operation. We examine the models' interpretability using synthetic data, while in-situ data are used to assess the practicality and accuracy of the models. The key innovations of our study are as follows. First, unlike previous machine learning models that rely on time-series processing to capture temporal patterns, our approach is based on a physically motivated assumption: the soil moisture profile on the current day, together with meteorological forcing, contains sufficient information to predict the soil moisture state of the following day. Therefore, the prediction task is formulated as a single-time-step problem involving multi-depth variables. This allows mutual compensation within the soil profile, enabling effective and precise soil moisture forecasts. The adaptability of NLNNs across various temporal and spatial scales is also demonstrated. Second, the learned non-local weights of the NLNN model can be visualized to provide qualitative information on soil properties inferred from soil moisture data. Each weight represents the relative influence of soil moisture at one depth on the moisture state at another depth in the subsequent time step, thereby reflecting vertical soil water interactions. The model interpretability is investigated using synthetic soil moisture data, including virtual examples of homogeneous soil, heterogeneous soil, two-layered soil, and soil with root water uptake. Third, incorporating knowledge-inspired concepts enhances model accuracy and reliability. To evaluate practical performance, we use in-situ soil moisture data from the International Soil Moisture Network (ISMN) and compare our models with the benchmark LSTM model (Datta and Faroughi, 2023; Semwal et al., 2021; Wang et al., 2024). To the best of our knowledge, this is the first application of NLNNs to interpretable soil moisture dynamics forecasting.</p>
      <p id="d2e186">The remainder of this paper is organized as follows: Sect. 2 presents the NLNNs for soil moisture forecasting, including the SA-NLNN and KG-NLNN; Sect. 3 describes the synthetically generated soil moisture data and the in-situ data; Sect. 4 provides the model results and the interpretability analysis. Finally, conclusions are drawn in Sect. 5.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Methodologies</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Physical background</title>
      <p id="d2e204">The dynamics of soil moisture transport are fundamentally described by the Richards equation, a governing relation derived from the mass conservation law and the Buckingham-Darcy law (Buckingham, 1907). For one-dimensional uniform flow in homogeneous soil, and assuming the absence of preferential flow, this equation takes the following form:

            <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M1" display="block"><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="italic">θ</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi>z</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced open="[" close="]"><mml:mrow><mml:mi>K</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="italic">ψ</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>z</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:mfenced></mml:mrow></mml:mfenced></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M2" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> [cm<sup>3</sup> cm<sup>−3</sup>] is the volumetric water content, <inline-formula><mml:math id="M5" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> [d] is the time, <inline-formula><mml:math id="M6" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> [cm] is the vertical coordinate (positive upward), <inline-formula><mml:math id="M7" display="inline"><mml:mi>K</mml:mi></mml:math></inline-formula> [cm d<sup>−1</sup>] is the unsaturated hydraulic conductivity, and <inline-formula><mml:math id="M9" display="inline"><mml:mi mathvariant="italic">ψ</mml:mi></mml:math></inline-formula> [cm] is the soil matric potential.</p>
      <p id="d2e327">Based on this equation, the soil moisture profile at a subsequent time step evolves from the preceding profile. Infiltration and evaporation, driven by meteorological factors, directly influence surface soil moisture, which triggers a redistribution of moisture through the soil profile. Therefore, the multi-depth soil moisture at the next time step can be determined by both the current meteorological conditions and the soil moisture profile from the previous time step.</p>
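      <p>The stepwise evolution described above can be illustrated with a minimal numerical sketch. It assumes an explicit finite-difference discretization of Eq. (1) on a uniform grid and van Genuchten-Mualem hydraulic functions with illustrative loam-like parameters. The function names, grid spacing, time step, and parameter values are hypothetical choices for illustration and are not taken from this study.</p>
      <preformat preformat-type="code">
```python
import numpy as np

# Hypothetical van Genuchten-Mualem parameters for a loam-like soil
# (illustrative values only, not taken from the article).
THETA_R, THETA_S = 0.078, 0.43            # residual / saturated content [cm3 cm-3]
ALPHA, N_VG, K_SAT = 0.036, 1.56, 24.96   # [1/cm], [-], [cm/d]
M_VG = 1.0 - 1.0 / N_VG


def effective_saturation(theta):
    """Effective saturation, clipped to keep the hydraulic functions finite."""
    se = (theta - THETA_R) / (THETA_S - THETA_R)
    return np.clip(se, 1e-6, 1.0 - 1e-6)


def matric_potential(theta):
    """psi(theta) [cm]; negative in unsaturated soil."""
    se = effective_saturation(theta)
    return -((se ** (-1.0 / M_VG) - 1.0) ** (1.0 / N_VG)) / ALPHA


def conductivity(theta):
    """Unsaturated hydraulic conductivity K(theta) [cm/d]."""
    se = effective_saturation(theta)
    return K_SAT * np.sqrt(se) * (1.0 - (1.0 - se ** (1.0 / M_VG)) ** M_VG) ** 2


def richards_step(theta, dz, dt, top_flux=0.0, bottom_flux=0.0):
    """One explicit Euler step of Eq. (1); theta[0] is the top node.

    z is positive upward and fluxes are upward-positive,
    q = -K (dpsi/dz + 1). top_flux and bottom_flux are prescribed
    boundary fluxes [cm/d]; infiltration is a negative top_flux.
    """
    psi = matric_potential(theta)
    k_face = 0.5 * (conductivity(theta[:-1]) + conductivity(theta[1:]))
    # Node k lies above node k+1, so dpsi/dz = (psi[k] - psi[k+1]) / dz.
    q_inner = -k_face * ((psi[:-1] - psi[1:]) / dz + 1.0)
    q = np.concatenate(([top_flux], q_inner, [bottom_flux]))
    # d(theta)/dt = -dq/dz, with q[k] the upper face of cell k
    # and q[k+1] its lower face.
    return theta - dt * (q[:-1] - q[1:]) / dz


# The profile at day t+1 follows from the profile at day t and the boundary
# fluxes: here a wet surface layer redistributes with closed boundaries.
theta = np.array([0.30, 0.22, 0.20, 0.20, 0.20])
total_before = theta.sum()
for _ in range(1000):  # 1000 sub-steps of 0.001 d, i.e. one day
    theta = richards_step(theta, dz=10.0, dt=1e-3)
```
      </preformat>
      <p>With both boundary fluxes set to zero, the scheme only redistributes water between nodes, so the profile total is preserved while the wet surface layer drains into the layers below. An explicit scheme of this kind needs small time steps to remain stable, which is one reason practical solvers use implicit formulations.</p>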
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Model structures</title>
      <p id="d2e338">Following Sect. 2.1, we assume that the multi-depth soil moisture at the next time step depends on both the current meteorological conditions and the soil moisture profile from the previous time step. The NLNN models are designed to capture the potential interactions of soil moisture at different depths within the vertical profile (Fig. 1), so that the predictions better reflect actual soil conditions. Figure 2 illustrates the NLNN structure proposed for soil moisture dynamics prediction. The input to the NLNN model, denoted as <inline-formula><mml:math id="M10" display="inline"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:mo>[</mml:mo></mml:mrow></mml:math></inline-formula><inline-formula><mml:math id="M11" display="inline"><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M12" display="inline"><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M13" display="inline"><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, …, <inline-formula><mml:math id="M14" display="inline"><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M15" display="inline"><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi></mml:msubsup><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, is the concatenation of the true soil moisture at <inline-formula><mml:math id="M16" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> depths from the previous time step, <inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mi>t</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:mi>t</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi></mml:msubsup><mml:msup><mml:mo>]</mml:mo><mml:mi mathvariant="normal">T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, and the upper boundary factor <inline-formula><mml:math id="M18" display="inline"><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, which is obtained by processing the meteorological conditions with an LSTM. Here, <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> denotes the soil moisture at the <inline-formula><mml:math id="M20" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>th depth and time <inline-formula><mml:math id="M21" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>. The initial soil moisture content for the prediction is set to the truth from the preceding day. Specifically, this value is obtained from the physical model's output for the virtual scenario and from field observations for the real-world scenario.</p>
      <p id="d2e524">Within our framework, we employ two types of non-local operations. The first, SA-NLNN, uses embedded Gaussian functions; it represents a novel application of the self-attention mechanism to capture vertical dependencies in soil moisture. The second, KG-NLNN, is a newly proposed architecture in which the non-local operation is decoupled according to soil water transport mechanisms. In the NLNN structure, the non-local operation and a residual connection are followed by a fully connected neural network that generates the soil moisture prediction at each corresponding depth. This yields the prediction <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="normal">sm</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msubsup><mml:msup><mml:mo>]</mml:mo><mml:mi mathvariant="normal">T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. The ground truth is represented as <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="normal">sm</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msup><mml:mo>]</mml:mo><mml:mi mathvariant="normal">T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. The model is trained by minimizing the error between the predictions and the ground truth.</p>
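      <p>The structure described above can be sketched as a single forward pass. This is a minimal NumPy illustration that assumes an embedded-Gaussian (softmax) weighting in the style of Wang et al. (2018) and a single linear layer standing in for the fully connected network; the upper boundary factor is passed in as a plain number rather than produced by an LSTM. All names, shapes, and random weights are hypothetical and do not reproduce the authors' implementation.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def softmax(scores):
    """Row-wise softmax; each row of weights sums to one."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)


def nlnn_forward(sm_t, x0_t, params):
    """Single-time-step forward pass for one soil profile (illustrative).

    sm_t   : (n,) soil moisture at n depths at time t
    x0_t   : scalar upper boundary factor (given directly here; in the
             article it comes from an LSTM over meteorological inputs)
    params : dict of hypothetical weight arrays
    """
    x = np.concatenate(([x0_t], sm_t))[:, None]    # X^t as (n + 1, 1) tokens
    query = x @ params["w_theta"]                  # (n + 1, d)
    key = x @ params["w_phi"]                      # (n + 1, d)
    value = x @ params["w_g"]                      # (n + 1, d)
    weights = softmax(query @ key.T)               # (n + 1, n + 1) non-local weights
    z = x + (weights @ value) @ params["w_out"]    # residual connection, (n + 1, 1)
    # A single linear layer stands in for the fully connected network that
    # maps the updated tokens to the n predicted depths.
    sm_next = z[:, 0] @ params["w_fc"] + params["b_fc"]
    return sm_next, weights


def mse_loss(prediction, truth):
    """Mean squared error between predicted and true profiles."""
    return float(np.mean((prediction - truth) ** 2))


rng = np.random.default_rng(0)
n, d = 5, 4  # number of depths and embedding width (arbitrary choices)
params = {
    "w_theta": rng.normal(size=(1, d)),
    "w_phi": rng.normal(size=(1, d)),
    "w_g": rng.normal(size=(1, d)),
    "w_out": rng.normal(size=(d, 1)),
    "w_fc": rng.normal(size=(n + 1, n)),
    "b_fc": np.zeros(n),
}
sm_t = np.array([0.30, 0.27, 0.25, 0.24, 0.22])          # made-up profile at t
sm_next_true = np.array([0.29, 0.27, 0.25, 0.24, 0.22])  # made-up target at t+1
sm_pred, weights = nlnn_forward(sm_t, x0_t=0.1, params=params)
loss = mse_loss(sm_pred, sm_next_true)
```
      </preformat>
      <p>In this sketch the returned weight matrix is the quantity that would be visualized for interpretation: row <italic>i</italic> holds the relative contribution of every input position to the updated state at position <italic>i</italic>. Training would adjust the weight arrays to reduce the loss, which is omitted here for brevity.</p>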
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Non-local operations</title>
      <p id="d2e724">The general form of a non-local operation in NLNNs can be defined as follows (Wang et al., 2018):

            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M24" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mo>∀</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e790">Here <inline-formula><mml:math id="M25" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> denotes the index of the position in the output <inline-formula><mml:math id="M26" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> whose value is being calculated, while <inline-formula><mml:math id="M27" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> is the index that enumerates all possible positions in the input <inline-formula><mml:math id="M28" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula>. The term <inline-formula><mml:math id="M29" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the <inline-formula><mml:math id="M30" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th component of the output <inline-formula><mml:math id="M31" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula>. In this context, <inline-formula><mml:math id="M32" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> represents the input data and <inline-formula><mml:math id="M33" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> denotes the corresponding output, both sharing the same dimensionality. In this work, <inline-formula><mml:math id="M34" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> represents the concatenation of the input soil moisture data and the upper boundary condition data, denoted as <inline-formula><mml:math id="M35" display="inline"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. 
Accordingly, <inline-formula><mml:math id="M36" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denote the soil moisture at the <inline-formula><mml:math id="M38" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th and <inline-formula><mml:math id="M39" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>th depths at time step <inline-formula><mml:math id="M40" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>, that is, <inline-formula><mml:math id="M41" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>j</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>. 
The output <inline-formula><mml:math id="M43" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> corresponds to the predicted soil moisture at the next time step, denoted as <inline-formula><mml:math id="M44" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="normal">sm</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M45" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represents the predicted soil moisture content at the <inline-formula><mml:math id="M46" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th depth, <inline-formula><mml:math id="M47" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula>. The computation of a generic non-local operation involves three components: the pairwise function <inline-formula><mml:math id="M48" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula>, the unary function <inline-formula><mml:math id="M49" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula>, and the normalization factor <inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
The function <inline-formula><mml:math id="M51" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> calculates a scalar (representing relationship such as affinity) between <inline-formula><mml:math id="M52" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and all <inline-formula><mml:math id="M53" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>, while the unary function <inline-formula><mml:math id="M54" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula> generates a representation of the input at position <inline-formula><mml:math id="M55" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>. The aggregated response is then normalized by <inline-formula><mml:math id="M56" display="inline"><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. In this study, the form of <inline-formula><mml:math id="M57" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula> is restricted to a linear embedding: <inline-formula><mml:math id="M58" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>g</mml:mi></mml:msub><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>g</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is a learnable weight matrix. The primary modification focuses on the pairwise function <inline-formula><mml:math id="M60" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula>. 
The normalization factor <inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> depends on the design of <inline-formula><mml:math id="M62" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula>. Following the definition of attention heads in previous work on self-attention mechanisms (Vaswani et al., 2017), our NLNN models employ several operation heads to enhance the model's feature extraction and representation capabilities. The number of operation heads is denoted as <inline-formula><mml:math id="M63" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">head</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. Similar non-local operations are performed in each head, with some parameter matrices being specific to each head. To form the output, the results from all heads are concatenated and a parameterized linear transformation is applied.</p>
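      <p>A minimal Python sketch of the generic non-local operation in Eq. (2) may help to fix ideas. It is illustrative only and is not the implementation used in this study: the function name <monospace>non_local</monospace>, the scalar inputs, and the example choices of the pairwise and unary functions are assumptions, and the normalization is assumed to be the sum of the pairwise weights for each position.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def non_local(x, f, g):
    """Generic non-local operation: y_i = (1 / C_i) * sum_j f(x_i, x_j) * g(x_j).

    x is a list of scalar inputs, f a pairwise weight function and g a unary
    function. C_i is assumed to be the sum of the pairwise weights for i.
    """
    y = []
    for x_i in x:
        weights = [f(x_i, x_j) for x_j in x]
        c = sum(weights)  # normalization (assumption: sum of pairwise weights)
        y.append(sum(w * g(x_j) for w, x_j in zip(weights, x)) / c)
    return y


# Hypothetical example: three soil moisture values, a distance-based affinity
# and a linear unary function. None of these choices come from the paper.
sm_t = [0.30, 0.25, 0.20]
y = non_local(
    sm_t,
    f=lambda x_i, x_j: math.exp(-abs(x_i - x_j)),
    g=lambda x_j: 2.0 * x_j,
)
```
]]></preformat>
      <p>With positive weights and this normalization, each output is a weighted average of the values returned by the unary function, so the design of the pairwise function alone decides how strongly each position contributes.</p>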
      <p id="d2e1179">Non-local operations are flexible: they can take various forms and be adapted to specific problem designs. This flexibility stems from their ability to model global dependencies through data-dependent pairwise interactions. Among these formulations, the Transformer represents the most typical and widely used architectural instantiation, which models global dependencies through the query–key–value self-attention mechanism, multi-head attention, positional encoding, and feed-forward layers. From a more general perspective, the non-local neural network can be viewed as a broader formulation of non-local dependency modeling, which computes interactions based on pairwise affinity functions without requiring the full Transformer architecture. In the following sections, we introduce the classical embedded Gaussian operation, along with our knowledge-guided non-local operation designed for soil moisture dynamics.</p>
<sec id="Ch1.S2.SS3.SSS1">
  <label>2.3.1</label><title>Embedded Gaussian operation</title>
      <p id="d2e1189">Self-attention, which is a special case of the non-local operation in its embedded Gaussian form, is a key component of the Transformer architecture. It captures intricate relationships in data and has been widely applied in various research areas (Devlin et al., 2019; Lim et al., 2021; Liu et al., 2021). However, it is insensitive to the ordering of the input, so position information must be incorporated into the calculations to ensure accurate processing.</p>
      <p id="d2e1192">Common position encoding methods include absolute position encoding (Devlin et al., 2019; Gehring et al., 2017; Vaswani et al., 2017) and relative position encoding (Shaw et al., 2018). Absolute position encoding takes the absolute position information pertaining to <inline-formula><mml:math id="M64" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> or <inline-formula><mml:math id="M65" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> and integrates it directly into the input. In contrast, relative position encoding focuses on the relative relationship between positions <inline-formula><mml:math id="M66" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M67" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>. Given the complexity of soil properties and the nature of soil moisture interactions, prioritizing the relative influence of soil moisture at each depth may prove more effective than relying on absolute position information in soil moisture analysis. We therefore use a relative position encoding similar to the method proposed by Shaw et al. (2018). The function <inline-formula><mml:math id="M68" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> combines a Gaussian function of two embeddings with the relative position representation associated with <inline-formula><mml:math id="M69" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M70" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>. A self-attention mechanism with relative position encodings in each head can be defined as follows:

                  <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M71" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E3"><mml:mtd><mml:mtext>3</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>q</mml:mi></mml:msub><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="normal">r</mml:mi><mml:mi mathvariant="normal">_</mml:mi><mml:msub><mml:mi mathvariant="normal">score</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:msqrt><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E4"><mml:mtd><mml:mtext>4</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mo>∀</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi 
mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e1384">Here, <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>q</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the weight matrices to be learned for the embeddings. <inline-formula><mml:math id="M74" display="inline"><mml:msqrt><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:math></inline-formula> denotes the scale factor, where <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represents the dimension of the embeddings. <inline-formula><mml:math id="M76" display="inline"><mml:mrow><mml:mi mathvariant="normal">r</mml:mi><mml:mi mathvariant="normal">_</mml:mi><mml:msub><mml:mi mathvariant="normal">score</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the relative position score computed using relative position encoding. <inline-formula><mml:math id="M77" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> can then be calculated through Eq. (2). The embedded Gaussian operation for soil moisture forecasts is illustrated in Fig. 2.</p>

      <fig id="F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e1465">Left: non-local neural network structure for soil moisture forecasting. Right: embedded Gaussian operation and knowledge-guided non-local operation. RPE: relative position encoding. SA/KG score: non-local weights computed through the embedded Gaussian operation and the knowledge-guided operation, respectively. <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>q</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M79" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M80" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>g</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the weight matrices to be learned for the embeddings.</p></caption>
            <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f02.png"/>

          </fig>

      <p id="d2e1507">In the relative position encoding, each relationship between two arbitrary positions <inline-formula><mml:math id="M81" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M82" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> is represented by a learnable vector. Here, r_score<sub><italic>i</italic><italic>j</italic></sub> denotes an internal relative position score used in the non-local operation, rather than a model evaluation metric. Then, the r_score<sub><italic>i</italic><italic>j</italic></sub> is calculated as follows:

              <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M85" display="block"><mml:mrow><mml:mi mathvariant="normal">r</mml:mi><mml:mi mathvariant="normal">_</mml:mi><mml:msub><mml:mi mathvariant="normal">score</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mi mathvariant="normal">T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>q</mml:mi></mml:msub><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M86" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> represents the relative position encoding used to compute r_score<sub><italic>i</italic><italic>j</italic></sub>. <inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is a parameter vector that needs to be trained. In the proposed SA-NLNN model, the relative position encoding matrix <inline-formula><mml:math id="M89" display="inline"><mml:mi mathvariant="bold">A</mml:mi></mml:math></inline-formula> consists of <inline-formula><mml:math id="M90" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>×</mml:mo><mml:mo>(</mml:mo><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> distinct elements. The matrix <inline-formula><mml:math id="M91" display="inline"><mml:mi mathvariant="bold">A</mml:mi></mml:math></inline-formula> is learned through training:

              <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M92" display="block"><mml:mrow><mml:mi mathvariant="bold">A</mml:mi><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mtable class="array" columnalign="left left left"><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋯</mml:mi></mml:mtd><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi mathvariant="normal">⋮</mml:mi></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋱</mml:mi></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋮</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋯</mml:mi></mml:mtd><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mfenced></mml:mrow></mml:math></disp-formula></p>
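      <p>As a rough illustration of how Eqs. (3)–(5) fit together, the following Python sketch computes the normalized weights of a single head for a small input. It is a sketch under stated assumptions rather than the code used in this study: the helper names, the pure-Python list representation, and all numerical values are hypothetical, and the maximum score is subtracted before exponentiation purely for numerical stability, which leaves the normalized weights unchanged.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    """Inner product of two vectors."""
    return sum(p * q for p, q in zip(a, b))


def attention_weights(X, W_q, W_k, A):
    """Normalized weights f(x_i, x_j) / C(x) following Eqs. (3)-(5).

    X is a list of input vectors, W_q and W_k are embedding matrices, and
    A[i][j] is the relative position vector a_{i,j}.
    """
    d_k = len(W_k)  # embedding dimension = number of rows of W_k
    rows = []
    for i, x_i in enumerate(X):
        q_i = matvec(W_q, x_i)
        scores = []
        for j, x_j in enumerate(X):
            k_j = matvec(W_k, x_j)
            r_score = dot(A[i][j], q_i)  # Eq. (5)
            scores.append((dot(k_j, q_i) + r_score) / math.sqrt(d_k))
        top = max(scores)  # stability shift; cancels after normalization
        f = [math.exp(s - top) for s in scores]  # Eq. (3) up to a common factor
        c = sum(f)  # Eq. (4)
        rows.append([v / c for v in f])
    return rows


# Hypothetical values: three positions with one feature each, d_k = 2.
X = [[0.10], [0.30], [0.25]]
W_q = [[1.0], [0.5]]
W_k = [[0.2], [-0.4]]
A = [[[0.01 * (i - j), 0.0] for j in range(3)] for i in range(3)]
weights = attention_weights(X, W_q, W_k, A)
```
]]></preformat>
      <p>Each row of the returned list sums to one, so the weights can be read as the relative contribution of every position to a given output position.</p>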
      <p id="d2e1764">In this model, all operation heads perform similar operations. <inline-formula><mml:math id="M93" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>q</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M94" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M95" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>g</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are unique in each head. However, the relative position encoding can be shared across non-local operation heads.</p>
</sec>
<sec id="Ch1.S2.SS3.SSS2">
  <label>2.3.2</label><title>Disentangled knowledge-guided operation</title>
      <p id="d2e1809">In this work, we propose KG-NLNN, a model specifically designed for forecasting soil moisture at multiple depths in the soil profile, as depicted in Fig. 2. The vertical movement of soil moisture exhibits a directional divergence: downward flow is driven primarily by gravity and constitutes a dissipation of potential energy, while upward movement is governed by capillary forces and other mechanisms acting against gravity. In this specific context, we employ a set of masks to decouple soil moisture interactions from different directions. The four masks in Fig. 2 correspond to four key components: meteorological forcing, upper soil water influence, same-depth soil moisture effects, and lower soil water interactions, respectively. Meteorological forcing, upper soil water influence, and lower soil water interactions are modeled by fully connected networks with soil moisture content and depth differences as inputs, whereas same-depth soil moisture effects are represented via relative position encoding. This knowledge-guided architecture separates different moisture movement processes for independent learning, thereby enhancing the model's ability to capture complex relationships among soil moisture variables across the soil profile.</p>
      <p id="d2e1812">The soil moisture at the <inline-formula><mml:math id="M96" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th depth, denoted as <inline-formula><mml:math id="M97" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, is influenced by several factors: the upper boundary conditions, represented by <inline-formula><mml:math id="M98" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>; the soil moisture in the upper layers at the previous time step, <inline-formula><mml:math id="M99" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (where <inline-formula><mml:math id="M100" display="inline"><mml:mrow><mml:mi>u</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:math></inline-formula>, primarily dominated by gravity); the soil moisture in the lower layers, <inline-formula><mml:math id="M101" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (where <inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:mi>l</mml:mi><mml:mo>&gt;</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:math></inline-formula>, mainly affected by capillary forces); and the soil moisture at the same depth from the previous time step, <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. Since these four components are driven by different physical mechanisms, they are defined in distinct forms within the non-local operation.</p>
      <p id="d2e1902">Before proceeding to the subsections, we provide a brief introduction to fully-connected neural networks (FNNs) that are utilized in the following sections. A two-layer fully-connected neural network can be defined as follows:

              <disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M104" display="block"><mml:mrow><mml:mi mathvariant="normal">FNN</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="normal">input</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">t</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">t</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="normal">input</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M105" display="inline"><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the hyperbolic tangent (tanh) activation function, and <inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>L</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi>L</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represent the weight matrix and bias parameters to be learned in the <inline-formula><mml:math id="M108" display="inline"><mml:mi>L</mml:mi></mml:math></inline-formula>th layer, with <inline-formula><mml:math id="M109" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula>. <inline-formula><mml:math id="M110" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="normal">input</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the input vector of an FNN. According to the universal approximation theorem (Cybenko, 1989), a feedforward neural network with a single hidden layer is theoretically sufficient to approximate a wide range of nonlinear functions. In this study, a two-layer FNN is adopted to balance model expressiveness and computational efficiency.</p>
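      <p>The two-layer network of Eq. (7) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this study: the function name <monospace>fnn</monospace>, the list-based representation of weights, the layer sizes, and the numerical values are all assumptions chosen for readability.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def fnn(x_input, W1, b1, W2, b2):
    """Two-layer fully connected network with tanh after each layer (Eq. 7).

    W1 and W2 are lists of rows; b1 and b2 are lists of biases.
    """
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x_input)) + b)
        for row, b in zip(W1, b1)
    ]
    return [
        math.tanh(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(W2, b2)
    ]


# Hypothetical parameters: 3 inputs, 2 hidden units, 1 output.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.05]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0]]
b2 = [0.0]
out = fnn([0.05, 0.25, 10.0], W1, b1, W2, b2)
```
]]></preformat>
      <p>Because the outer activation in Eq. (7) is also tanh, each output of this sketch lies strictly between -1 and 1.</p>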
      <p id="d2e2058">The effect of upper boundary conditions on soil moisture at depth <inline-formula><mml:math id="M112" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is described by the function <inline-formula><mml:math id="M113" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which depends on three factors: <inline-formula><mml:math id="M114" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, the meteorological factor; <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, the soil moisture at depth <inline-formula><mml:math id="M116" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> from the previous time step; and <inline-formula><mml:math id="M117" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, the depth of the soil layer concerned. 
<inline-formula><mml:math id="M118" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the <inline-formula><mml:math id="M119" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th depth in the depth vector <inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msub><mml:mi>z</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi>z</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>z</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:msup><mml:mo>]</mml:mo><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, which corresponds to the input soil moisture data <inline-formula><mml:math id="M121" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. We utilize a two-layer FNN to describe this relationship:

              <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M122" display="block"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="normal">FNN</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e2296">In considering the impacts of soil moisture in the upper layers and lower layers on soil moisture at depth <inline-formula><mml:math id="M123" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, we propose <inline-formula><mml:math id="M124" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M125" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> to calculate the effects. 
Both functions are determined by the disparity in soil moisture content <inline-formula><mml:math id="M126" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, the intrinsic soil moisture <inline-formula><mml:math id="M127" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, and the distance between two positions <inline-formula><mml:math id="M128" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. As previously stated, two two-layer FNNs are employed in this section:

                  <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M129" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E9"><mml:mtd><mml:mtext>9</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mi>f</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="normal">FNN</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>&gt;</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E10"><mml:mtd><mml:mtext>10</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi>f</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi 
mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="normal">FNN</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e2645">Additionally, we utilize relative position encodings to describe the soil water retention effect:

              <disp-formula id="Ch1.E11" content-type="numbered"><label>11</label><mml:math id="M130" display="block"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>r</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="normal">r</mml:mi><mml:mi mathvariant="normal">_</mml:mi><mml:msub><mml:mi mathvariant="normal">score</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:math></disp-formula>

            where the relative position score r_score<sub><italic>i</italic><italic>j</italic></sub> represents the water retention effect of soil moisture at a specific depth across two adjacent time steps and is calculated using Eq. (4). Consequently, our position encoding matrix <inline-formula><mml:math id="M132" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">A</mml:mi><mml:mi mathvariant="normal">PG</mml:mi><mml:mi>K</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> is a diagonal matrix comprising <inline-formula><mml:math id="M133" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> distinct elements, which are learned through training:

              <disp-formula id="Ch1.E12" content-type="numbered"><label>12</label><mml:math id="M134" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">A</mml:mi><mml:mi mathvariant="normal">PG</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mtable class="array" columnalign="left left left"><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋯</mml:mi></mml:mtd><mml:mtd><mml:mn mathvariant="normal">0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi mathvariant="normal">⋮</mml:mi></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋱</mml:mi></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋮</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn mathvariant="normal">0</mml:mn></mml:mtd><mml:mtd><mml:mi mathvariant="normal">⋯</mml:mi></mml:mtd><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">a</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mfenced></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e2801">The influence on soil moisture at a fixed depth is thus integrated through the four components described above, as illustrated in Fig. 2. The knowledge-guided non-local operation for simulating soil moisture dynamics is therefore defined as follows:

                  <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M135" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E13"><mml:mtd><mml:mtext>13</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi 
mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>r</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:msqrt><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E14"><mml:mtd><mml:mtext>14</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mo>∀</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi 
mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

            where <inline-formula><mml:math id="M136" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> is the number of positions in <inline-formula><mml:math id="M137" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M138" display="inline"><mml:msqrt><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:math></inline-formula> denotes the scale factor. Then, <inline-formula><mml:math id="M139" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is calculated using Eq. (1). All operation heads in this model perform the same type of operation. <inline-formula><mml:math id="M140" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>q</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, which is used to compute r_score, and <inline-formula><mml:math id="M141" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mi>g</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> in <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> remain unique to each head, whereas the parameters of the FNNs are shared across non-local operation heads.</p>
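      <p>The way Eqs. (13) and (14) combine the four components can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name <monospace>kg_affinity</monospace> and the precomputed score matrices passed to it are hypothetical, and the sketch assumes that position indices increase with depth and that each of the terms of Eqs. (9) to (11) contributes only for the index relation under which it is defined.</p>
      <preformat>
```python
import math


def kg_affinity(f0, fu, fl, r_score, d_k):
    """Evaluate Eq. (13) for every pair (i, j) and Eq. (14) for every i.

    f0, fu, fl and r_score are N x N nested lists of precomputed scores.
    They are hypothetical inputs; in the paper these values come from the
    trained networks. Indices are assumed to increase with depth.
    """
    n = len(f0)
    f = []
    for i in range(n):
        row = []
        for j in range(n):
            exponent = f0[i][j] / n
            if i == j:
                # Retention term, Eq. (11), scaled by sqrt(d_k)
                exponent += r_score[i][j] / math.sqrt(d_k)
            elif max(i, j) == i:
                # i greater than j: position j lies above i, Eq. (9)
                exponent += fu[i][j] / n
            else:
                # i smaller than j: position j lies below i, Eq. (10)
                exponent += fl[i][j] / n
            row.append(math.exp(exponent))
        f.append(row)
    # Eq. (14): C(x) is the sum of f over all positions j, one value per i
    c = [sum(row) for row in f]
    return f, c
```
      </preformat>
      <p>With all scores set to zero, every entry of the returned matrix equals 1 and every normalization factor equals N, which offers a quick sanity check of the masking logic.</p>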
</sec>
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Boundary processing</title>
      <p id="d2e3161">In our soil moisture prediction task, the impact of the upper boundary conditions on soil moisture is partially simulated by an LSTM module (Hochreiter and Schmidhuber, 1997), as illustrated in Fig. 2. We have selected six meteorological variables to characterize the influence of these upper boundary conditions: precipitation (<inline-formula><mml:math id="M143" display="inline"><mml:mi>P</mml:mi></mml:math></inline-formula>), air temperature (AT), long-wave radiation (LR), short-wave radiation (SR), relative humidity (RH), and wind speed (WS). These variables, denoted as <inline-formula><mml:math id="M144" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">AT</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">LR</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">SR</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">RH</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">WS</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:msup><mml:mo>]</mml:mo><mml:mi mathvariant="normal">T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, are closely associated with the infiltration and evapotranspiration processes. Hydrologically, meteorological conditions from the previous time step (<inline-formula><mml:math id="M145" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>) do not cease their influence immediately; rather, processes such as infiltration, lateral flow, and redistribution allow these conditions to continue affecting soil moisture at the subsequent time step t. 
Incorporating both time steps thus enables the model to capture cross-day causal relationships. An input sequence length of two time steps is used to keep the meteorological inputs concise while retaining sufficient information. The task of learning these meteorological temporal dependencies is assigned to the LSTM network, which motivates its use for processing the boundary conditions. Following LSTM processing, the impact of the upper boundary conditions is represented by <inline-formula><mml:math id="M146" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, which is subsequently used in the non-local operations together with the input soil moisture data <inline-formula><mml:math id="M147" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">1</mml:mn><mml:mi>t</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:mi>t</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi></mml:msubsup><mml:msup><mml:mo>]</mml:mo><mml:mi mathvariant="normal">T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> within the soil profile. The operation of an LSTM can be summarized as follows:

                <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M148" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E15"><mml:mtd><mml:mtext>15</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msup><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">i</mml:mi></mml:msub><mml:mo>⋅</mml:mo><mml:mo>[</mml:mo><mml:msup><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E16"><mml:mtd><mml:mtext>16</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msup><mml:mi>f</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub><mml:mo>⋅</mml:mo><mml:mo>[</mml:mo><mml:msup><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E17"><mml:mtd><mml:mtext>17</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" 
displaystyle="true"/><mml:msup><mml:mi>o</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub><mml:mo>⋅</mml:mo><mml:mo>[</mml:mo><mml:msup><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E18"><mml:mtd><mml:mtext>18</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msup><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">t</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">c</mml:mi></mml:msub><mml:mo>⋅</mml:mo><mml:mo>[</mml:mo><mml:msup><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">c</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E19"><mml:mtd><mml:mtext>19</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" 
class="stylechange"/><mml:msup><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mi>f</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>⋅</mml:mo><mml:msup><mml:mi>c</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msup><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>⋅</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E20"><mml:mtd><mml:mtext>20</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msup><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mi>o</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>⋅</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">t</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msup><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

          where <inline-formula><mml:math id="M149" display="inline"><mml:mrow><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M150" display="inline"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M152" display="inline"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M153" display="inline"><mml:mrow><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denote the deep learning parameters for the input gate, forget gate, and the output gate, respectively; <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M156" display="inline"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi mathvariant="normal">c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the parameters for cell state updating; in addition, <inline-formula><mml:math id="M157" display="inline"><mml:mrow><mml:msup><mml:mi>i</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M158" display="inline"><mml:mrow><mml:msup><mml:mi>f</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math 
id="M159" display="inline"><mml:mrow><mml:msup><mml:mi>o</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> are the input gate, forget gate, and output gate at time <inline-formula><mml:math id="M160" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>, respectively, and <inline-formula><mml:math id="M161" display="inline"><mml:mrow><mml:msup><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is the memory cell state; <inline-formula><mml:math id="M162" display="inline"><mml:mrow><mml:msup><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> represents the hidden state; <inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the sigmoid activation function, and <inline-formula><mml:math id="M164" display="inline"><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi mathvariant="normal">t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the tanh activation function.</p>
      <p id="d2e3817">Through sequential processing, the last hidden state <inline-formula><mml:math id="M165" display="inline"><mml:mrow><mml:msup><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> in the output <inline-formula><mml:math id="M166" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:msup><mml:mi>h</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi>h</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> derived from input <inline-formula><mml:math id="M167" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">ub</mml:mi><mml:mi>t</mml:mi></mml:msup><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, which encodes the upper boundary effect over two time steps, is adopted as <inline-formula><mml:math id="M168" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>. In this study, the lower boundary conditions are disregarded because they are difficult to observe.</p>
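      <p>As an illustration of Eqs. (15) to (20) and of how the last hidden state is taken from a two-step input, the following Python sketch implements a single LSTM layer with plain lists. It is only a sketch under stated assumptions: the helper names, the <monospace>params</monospace> dictionary layout, and the zero initial states are hypothetical choices, and the model described in this study stacks two such blocks rather than one.</p>
      <preformat>
```python
import math


def sigmoid(v):
    """Sigmoid activation a_s."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return W . v + b for a nested-list matrix W and list vectors b, v."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(params, h_prev, c_prev, ub):
    """One LSTM step, Eqs. (15)-(20); params maps a gate name to (W, b)."""
    z = h_prev + ub  # list concatenation: [h^{t-1}, ub^t]
    i = [sigmoid(v) for v in affine(*params["i"], z)]          # Eq. (15)
    f = [sigmoid(v) for v in affine(*params["f"], z)]          # Eq. (16)
    o = [sigmoid(v) for v in affine(*params["o"], z)]          # Eq. (17)
    c_tilde = [math.tanh(v) for v in affine(*params["c"], z)]  # Eq. (18)
    c = [fk * ck + ik * gk
         for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]     # Eq. (19)
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]           # Eq. (20)
    return h, c


def upper_boundary_state(params, ub_sequence, hidden_size):
    """Run the cell over [ub^{t-1}, ub^t] and return the last hidden state.

    Zero initial hidden and cell states are an assumption of this sketch.
    """
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for ub in ub_sequence:
        h, c = lstm_step(params, h, c, ub)
    return h
```
      </preformat>
      <p>Each weight matrix in this sketch has one row per hidden unit and one column per entry of the concatenated vector, that is, the hidden size plus the six meteorological variables.</p>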
</sec>
<sec id="Ch1.S2.SS5">
  <label>2.5</label><title>Training Strategies</title>
      <p id="d2e3908">The objective of our model is to simultaneously predict soil moisture at multiple depths for the next time step. To achieve this, we define the loss function as the sum of squared errors between the model predictions and the corresponding ground truth of soil moisture content at different depths. The model is trained by minimizing this loss function:

            <disp-formula id="Ch1.E21" content-type="numbered"><label>21</label><mml:math id="M169" display="block"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow><mml:mi>B</mml:mi></mml:munderover><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>′</mml:mo></mml:mrow></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mi mathvariant="normal">sm</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M170" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> denotes the number of soil moisture depths considered, and <inline-formula><mml:math id="M171" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula> is the training batch size, which is set to 100 in this study.</p>
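      <p>For concreteness, the loss of Eq. (21) can be written as a short Python function. This is a minimal sketch rather than the training code used in this study; the function name and the nested-list layout (one list of depth values per sample in the batch) are assumptions made for illustration.</p>
      <preformat>
```python
def sum_squared_error(predictions, targets):
    """Eq. (21): squared errors summed over batch samples and depths.

    predictions and targets are nested lists with one inner list of
    soil moisture values (one per depth) for each sample in the batch.
    """
    total = 0.0
    for pred_profile, true_profile in zip(predictions, targets):
        for pred, true in zip(pred_profile, true_profile):
            total += (pred - true) ** 2
    return total
```
      </preformat>
      <p>Because the errors are summed rather than averaged, the magnitude of the loss grows with both the batch size and the number of depths.</p>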
      <p id="d2e3994">In this work, the collected data are divided chronologically into training, validation, and test sets at a ratio of 6 : 2 : 2. For training, we employ the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 0.001. The models are trained for a minimum of 2500 epochs, with 20 batches in each epoch. The validation set is used to select the best model and mitigate overfitting. The test set is then used to evaluate the performance of the models. Each result is computed based on 10 replicates with different initializations. Regarding the model hyperparameter settings, in the non-local neural network, we set <inline-formula><mml:math id="M172" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>q</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">10</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>g</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">16</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M173" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">head</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M174" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M175" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>q</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M176" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>g</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represent the dimensions of the key, query, and value (function <inline-formula><mml:math id="M177" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula>) components within the non-local block, respectively, and <inline-formula><mml:math id="M178" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">head</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the number of non-local heads. The LSTM consists of two stacked blocks, each configured with a hidden layer of 20 neurons. In each FNN adopted for KG-NLNN, we use 10 neurons in each hidden layer.</p>
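      <p>A time-ordered 6 : 2 : 2 split of the kind described above could be sketched in Python as follows. The function name is hypothetical, and rounding the split points down with integer division is an assumption of this sketch; the study does not state how non-integer boundaries are handled.</p>
      <preformat>
```python
def chronological_split(samples, ratios=(6, 2, 2)):
    """Split time-ordered samples into training, validation and test sets.

    The samples are not shuffled, so the validation and test periods
    always follow the training period in time.
    """
    total = sum(ratios)
    n = len(samples)
    # Split points are rounded down (an assumption of this sketch)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]
```
      </preformat>
      <p>Keeping the chronological order means that the model is validated and tested only on periods that come after the training period.</p>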
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Data Descriptions</title>
      <p id="d2e4108">In this study, synthetic soil moisture data are generated to investigate the interpretability of the NLNN models. In addition, selected in-situ soil moisture data are used to assess the accuracy and practicality of the models.</p>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Synthetic Data Description</title>
      <p id="d2e4118">The synthetic data are generated using the Ross method (Ross, 2003, 2006), a rapid, non-iterative numerical scheme for forward modeling of soil moisture. In our simulation, we generate soil moisture content data for a 100 cm soil column at 1 cm intervals. For the boundary conditions, the daily reference evapotranspiration (ET0) is calculated with the FAO Penman-Monteith method (Allen et al., 1998) at the coordinates of Wuhan. Following the FAO guidelines (Allen et al., 1998), actual evapotranspiration is the product of <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>C</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and ET0, where <inline-formula><mml:math id="M180" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>C</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is an empirical coefficient. We apply this coefficient method to obtain a preliminary evapotranspiration estimate, adopting a coefficient value of 1.0. The daily time series of precipitation and calculated evapotranspiration are shown in Fig. 3. The lower boundary condition is set to free drainage, and the initial moisture content of the soil column is set to a uniform value of 0.10. We generate three years of soil moisture time series for this research.</p>

      <fig id="F3"><label>Figure 3</label><caption><p id="d2e4145">Daily time series of precipitation and reference evapotranspiration calculated at the coordinates of Wuhan and used to generate the synthetic data.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f03.png"/>

        </fig>

      <p id="d2e4154">We design four virtual cases with different configurations to investigate model interpretability: homogeneous soil, heterogeneous soil, two-layered soil, and soil with root water uptake, as illustrated in Fig. 4. In the case with root water uptake, the root depth is set to 50 cm and the root density is uniformly distributed with depth. Detailed soil property settings are given in Appendix A. In addition, we assess the adaptability of the models across different time scales and observation locations using the available data.</p>

      <fig id="F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e4160">Design of the virtual cases: homogeneous soil <bold>(a)</bold>, heterogeneous soil <bold>(b)</bold>, two-layered soil <bold>(c)</bold>, and homogeneous soil with root water uptake <bold>(d)</bold>.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f04.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>In-situ Data Description</title>
      <p id="d2e4189">To evaluate the proposed NLNN models comprehensively, we select soil moisture content observations from twenty sites within the International Soil Moisture Network (ISMN) (<uri>https://ismn.earth/en/</uri>, last access: 12 May 2026). The sites are chosen based on geographical location, soil texture, and land cover type. Detailed information for the selected sites is presented in Table 1, and their spatial locations are illustrated in Fig. 5. Together, the sites cover 16 soil types and 6 land cover types, providing a diverse basis for assessing the model's performance and its ability to adapt to complex soil conditions. Each selected site is required to provide in-situ soil moisture observations at 5 standard depths (0.05, 0.10, 0.20, 0.50, and 1.00 m).</p>

<table-wrap id="T1" specific-use="star"><label>Table 1</label><caption><p id="d2e4198">Summary of main characteristics of twenty selected sites.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="9">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="left"/>
     <oasis:colspec colnum="7" colname="col7" align="left"/>
     <oasis:colspec colnum="8" colname="col8" align="left"/>
     <oasis:colspec colnum="9" colname="col9" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Number</oasis:entry>
         <oasis:entry colname="col2">Site</oasis:entry>
         <oasis:entry colname="col3">Sand (%)</oasis:entry>
         <oasis:entry colname="col4">Silt (%)</oasis:entry>
         <oasis:entry colname="col5">Clay (%)</oasis:entry>
         <oasis:entry colname="col6">Land cover</oasis:entry>
         <oasis:entry colname="col7">Period</oasis:entry>
         <oasis:entry colname="col8">Lat. (° N)</oasis:entry>
         <oasis:entry colname="col9">Lon. (° E)</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">1</oasis:entry>
         <oasis:entry colname="col2">Kingston-1-W</oasis:entry>
         <oasis:entry colname="col3">85</oasis:entry>
         <oasis:entry colname="col4">10</oasis:entry>
         <oasis:entry colname="col5">5</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2012–2023</oasis:entry>
         <oasis:entry colname="col8">41.48</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M181" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>71.54</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">2</oasis:entry>
         <oasis:entry colname="col2">Monahans-6-ENE</oasis:entry>
         <oasis:entry colname="col3">83</oasis:entry>
         <oasis:entry colname="col4">6</oasis:entry>
         <oasis:entry colname="col5">11</oasis:entry>
         <oasis:entry colname="col6">Shrub cover</oasis:entry>
         <oasis:entry colname="col7">2010–2022</oasis:entry>
         <oasis:entry colname="col8">31.62</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>102.81</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">3</oasis:entry>
         <oasis:entry colname="col2">Necedah-5-WNW</oasis:entry>
         <oasis:entry colname="col3">83</oasis:entry>
         <oasis:entry colname="col4">11</oasis:entry>
         <oasis:entry colname="col5">6</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2009–2022</oasis:entry>
         <oasis:entry colname="col8">44.06</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M182" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>90.17</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">4</oasis:entry>
         <oasis:entry colname="col2">Shadow Mtns</oasis:entry>
         <oasis:entry colname="col3">79</oasis:entry>
         <oasis:entry colname="col4">10</oasis:entry>
         <oasis:entry colname="col5">11</oasis:entry>
         <oasis:entry colname="col6">Shrub cover</oasis:entry>
         <oasis:entry colname="col7">2013–2017</oasis:entry>
         <oasis:entry colname="col8">35.47</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M183" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>115.72</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">5</oasis:entry>
         <oasis:entry colname="col2">Falkenberg</oasis:entry>
         <oasis:entry colname="col3">73</oasis:entry>
         <oasis:entry colname="col4">21</oasis:entry>
         <oasis:entry colname="col5">6</oasis:entry>
         <oasis:entry colname="col6">Cropland, rainfed</oasis:entry>
         <oasis:entry colname="col7">2003–2020</oasis:entry>
         <oasis:entry colname="col8">52.17</oasis:entry>
         <oasis:entry colname="col9">14.12</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">6</oasis:entry>
         <oasis:entry colname="col2">Kenai-29-ENE</oasis:entry>
         <oasis:entry colname="col3">54</oasis:entry>
         <oasis:entry colname="col4">38</oasis:entry>
         <oasis:entry colname="col5">8</oasis:entry>
         <oasis:entry colname="col6">Shrub cover</oasis:entry>
         <oasis:entry colname="col7">2012–2023</oasis:entry>
         <oasis:entry colname="col8">60.72</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M184" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>150.45</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">7</oasis:entry>
         <oasis:entry colname="col2">AAMU-jtg</oasis:entry>
         <oasis:entry colname="col3">53</oasis:entry>
         <oasis:entry colname="col4">22</oasis:entry>
         <oasis:entry colname="col5">25</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2010–2022</oasis:entry>
         <oasis:entry colname="col8">34.78</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M185" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>86.55</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">8</oasis:entry>
         <oasis:entry colname="col2">Darrington-21-NNE</oasis:entry>
         <oasis:entry colname="col3">53</oasis:entry>
         <oasis:entry colname="col4">22</oasis:entry>
         <oasis:entry colname="col5">25</oasis:entry>
         <oasis:entry colname="col6">Tree cover</oasis:entry>
         <oasis:entry colname="col7">2013–2019</oasis:entry>
         <oasis:entry colname="col8">48.54</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M186" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>121.45</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">9</oasis:entry>
         <oasis:entry colname="col2">Palestine-6-WNW</oasis:entry>
         <oasis:entry colname="col3">49</oasis:entry>
         <oasis:entry colname="col4">27</oasis:entry>
         <oasis:entry colname="col5">24</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2009–2013</oasis:entry>
         <oasis:entry colname="col8">31.78</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M187" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>95.72</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">10</oasis:entry>
         <oasis:entry colname="col2">Cullman</oasis:entry>
         <oasis:entry colname="col3">49</oasis:entry>
         <oasis:entry colname="col4">27</oasis:entry>
         <oasis:entry colname="col5">24</oasis:entry>
         <oasis:entry colname="col6">Mosaic Cropland</oasis:entry>
         <oasis:entry colname="col7">2006–2022</oasis:entry>
         <oasis:entry colname="col8">34.20</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M188" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>86.80</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">11</oasis:entry>
         <oasis:entry colname="col2">Cape-Charles</oasis:entry>
         <oasis:entry colname="col3">49</oasis:entry>
         <oasis:entry colname="col4">27</oasis:entry>
         <oasis:entry colname="col5">24</oasis:entry>
         <oasis:entry colname="col6">Herbaceous cover</oasis:entry>
         <oasis:entry colname="col7">2011–2022</oasis:entry>
         <oasis:entry colname="col8">37.29</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M189" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>75.93</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">12</oasis:entry>
         <oasis:entry colname="col2">LittleRiver</oasis:entry>
         <oasis:entry colname="col3">47</oasis:entry>
         <oasis:entry colname="col4">30</oasis:entry>
         <oasis:entry colname="col5">23</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2005–2020</oasis:entry>
         <oasis:entry colname="col8">31.50</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M190" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>83.55</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">13</oasis:entry>
         <oasis:entry colname="col2">Montrose-11-ENE</oasis:entry>
         <oasis:entry colname="col3">43</oasis:entry>
         <oasis:entry colname="col4">35</oasis:entry>
         <oasis:entry colname="col5">22</oasis:entry>
         <oasis:entry colname="col6">Tree cover</oasis:entry>
         <oasis:entry colname="col7">2010–2023</oasis:entry>
         <oasis:entry colname="col8">38.54</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M191" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>107.69</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">14</oasis:entry>
         <oasis:entry colname="col2">Coshocton-8-NNE</oasis:entry>
         <oasis:entry colname="col3">41</oasis:entry>
         <oasis:entry colname="col4">39</oasis:entry>
         <oasis:entry colname="col5">20</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2009–2016</oasis:entry>
         <oasis:entry colname="col8">40.37</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M192" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>81.78</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">15</oasis:entry>
         <oasis:entry colname="col2">Bodega-6-WSW</oasis:entry>
         <oasis:entry colname="col3">39</oasis:entry>
         <oasis:entry colname="col4">38</oasis:entry>
         <oasis:entry colname="col5">23</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2011–2023</oasis:entry>
         <oasis:entry colname="col8">38.32</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M193" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>123.08</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">16</oasis:entry>
         <oasis:entry colname="col2">Goodwell-2-SE</oasis:entry>
         <oasis:entry colname="col3">36</oasis:entry>
         <oasis:entry colname="col4">41</oasis:entry>
         <oasis:entry colname="col5">23</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2010–2022</oasis:entry>
         <oasis:entry colname="col8">36.57</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M194" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>101.61</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">17</oasis:entry>
         <oasis:entry colname="col2">Riley-10-WSW</oasis:entry>
         <oasis:entry colname="col3">36</oasis:entry>
         <oasis:entry colname="col4">41</oasis:entry>
         <oasis:entry colname="col5">23</oasis:entry>
         <oasis:entry colname="col6">Shrub cover</oasis:entry>
         <oasis:entry colname="col7">2011–2021</oasis:entry>
         <oasis:entry colname="col8">43.47</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M195" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>119.69</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">18</oasis:entry>
         <oasis:entry colname="col2">Joplin-24-N</oasis:entry>
         <oasis:entry colname="col3">35</oasis:entry>
         <oasis:entry colname="col4">41</oasis:entry>
         <oasis:entry colname="col5">24</oasis:entry>
         <oasis:entry colname="col6">Grassland</oasis:entry>
         <oasis:entry colname="col7">2010–2020</oasis:entry>
         <oasis:entry colname="col8">37.43</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M196" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>94.58</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">19</oasis:entry>
         <oasis:entry colname="col2">Weslaco</oasis:entry>
         <oasis:entry colname="col3">34</oasis:entry>
         <oasis:entry colname="col4">45</oasis:entry>
         <oasis:entry colname="col5">21</oasis:entry>
         <oasis:entry colname="col6">Cropland, rainfed</oasis:entry>
         <oasis:entry colname="col7">2017–2021</oasis:entry>
         <oasis:entry colname="col8">26.16</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M197" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>97.96</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">20</oasis:entry>
         <oasis:entry colname="col2">UpperBethlehem</oasis:entry>
         <oasis:entry colname="col3">32</oasis:entry>
         <oasis:entry colname="col4">38</oasis:entry>
         <oasis:entry colname="col5">30</oasis:entry>
         <oasis:entry colname="col6">Herbaceous cover</oasis:entry>
         <oasis:entry colname="col7">2008–2010</oasis:entry>
         <oasis:entry colname="col8">17.72</oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M198" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>64.80</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <fig id="F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e4996">Spatial locations of the twenty selected sites. The site numbers correspond to the numbers in Table 1.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f05.png"/>

        </fig>

      <p id="d2e5006">The meteorological inputs for our models include precipitation, air temperature, long-wave radiation, short-wave radiation, wind speed, and relative humidity, as mentioned above. These data are sourced from the NASA Prediction of Worldwide Energy Resources (POWER) project (<uri>https://power.larc.nasa.gov/</uri>, last access: 12 May 2026). Based on the latitude and longitude of each station, we downloaded the corresponding point-scale, daily-resolution meteorological datasets. Details of these datasets are available at <uri>https://power.larc.nasa.gov/docs/methodology/meteorology/</uri> (last access: 12 May 2026). Because groundwater level observations are difficult to obtain, changes in the lower boundary condition are not considered in this study.</p>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Results and discussion</title>
      <p id="d2e5024">In this study, we systematically examine our models from three perspectives. First, we assess the basic capabilities of the models, namely accuracy and uncertainty, using both synthetic data and in-situ observations. Second, we use simulated soil moisture data from several virtual scenarios to evaluate the models' interpretability, that is, their ability to provide qualitative interpretations of how soil moisture interacts across depths within the profile. Finally, we investigate the impacts of varying temporal scales, noise levels, and observation locations on our non-local neural networks.</p>
      <p id="d2e5027">To explore the forecasting ability of our models, we examine predictions 1, 3, and 7 d ahead at the selected sites, and 1, 3, 7, and 15 d ahead for the synthetic data. Predictions are generated iteratively. The evaluation metrics in this work are the mean absolute error (MAE) and the root mean square error (RMSE). Both quantify the deviation between the predictions and the ground truth. However, RMSE is more sensitive to outliers because squaring the deviations amplifies the impact of extreme values, whereas MAE weights all deviations equally. These metrics are calculated as follows:

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M199" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E22"><mml:mtd><mml:mtext>22</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="normal">MAE</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub></mml:mrow></mml:msubsup><mml:mo>|</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>T</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E23"><mml:mtd><mml:mtext>23</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="normal">RMSE</mml:mi><mml:mo>=</mml:mo><mml:msqrt><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub></mml:mrow></mml:msubsup><mml:mo>(</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>T</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mi 
mathvariant="normal">s</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:msqrt></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

        where <inline-formula><mml:math id="M200" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>T</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M201" display="inline"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represent the predictions and the ground truth, respectively; <inline-formula><mml:math id="M202" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>T</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the average of the ground truth; and <inline-formula><mml:math id="M203" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the test sample size. Here, <inline-formula><mml:math id="M204" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> denotes the soil moisture content [%] to be predicted. All compared models are trained and evaluated using the same datasets, input variables, and evaluation metrics to ensure a consistent and fair comparison.</p>
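      <p>As an illustrative sketch of Eqs. (22) and (23), and not the code used in this study, the two metrics can be computed as follows. The function names and the sample values are hypothetical and chosen only for demonstration.</p>
      <preformat>
```python
import math


def mae(truth, pred):
    """Mean absolute error, following Eq. (22)."""
    n = len(truth)
    return sum(abs(t - p) for t, p in zip(truth, pred)) / n


def rmse(truth, pred):
    """Root mean square error, following Eq. (23)."""
    n = len(truth)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / n)


# Hypothetical soil moisture values [%], for illustration only.
truth = [20.0, 22.0, 25.0, 30.0]
pred = [21.0, 22.0, 24.0, 34.0]

print("MAE:", mae(truth, pred))
print("RMSE:", rmse(truth, pred))
```
</preformat>
      <p>In this toy example, a single large deviation of 4 % raises the RMSE to about 2.12 %, whereas the MAE is 1.5 %, which illustrates the greater sensitivity of RMSE to outliers.</p>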
      <p id="d2e5208">Evaluating confidence bounds in uncertainty analysis is challenging because most deep neural networks are essentially deterministic models. To address this, many researchers use the bootstrap aggregating (bagging) method (Breiman, 1996) to analyze model predictive uncertainty (Kornelsen and Coulibaly, 2014). Bagging trains multiple neural network models of identical architecture, each on a different bootstrap resample of the training set. To create each resample, we randomly draw individual input vectors from the entire training set with replacement, so that each resample contains the same number of elements as the entire training set. After training, we obtain an ensemble of models, each trained on its own resample. The final output and the uncertainty estimate are then derived from the mean and standard deviation of this ensemble, respectively.</p>
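      <p>The bagging procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: a least-squares line stands in for the neural network, and the function names, the toy data, and the ensemble size of 20 are hypothetical.</p>
      <preformat>
```python
import numpy as np


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one bootstrap resample)."""
    return rng.integers(0, n_samples, size=n_samples)


def fit_stand_in_model(x, y):
    """Stand-in for training a neural network: a least-squares line."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def bagging_predict(x_train, y_train, x_new, n_models=20, seed=0):
    """Train an ensemble on bootstrap resamples; return mean and std."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = bootstrap_indices(len(x_train), rng)
        slope, intercept = fit_stand_in_model(x_train[idx], y_train[idx])
        preds.append(slope * x_new + intercept)
    preds = np.stack(preds)  # shape: (n_models, n_new)
    return preds.mean(axis=0), preds.std(axis=0)


# Hypothetical toy data, for illustration only.
data_rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 1.0, 50)
y_train = 2.0 * x_train + 0.1 * data_rng.standard_normal(50)

mean, std = bagging_predict(x_train, y_train, np.array([0.25, 0.75]))
print("ensemble mean:", mean)
print("ensemble std:", std)
```
</preformat>
      <p>Each resample has the same size as the training set but omits some samples and repeats others, so the ensemble members differ and their spread provides an uncertainty estimate.</p>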
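      <p id="d2e5211">To explore the impact of noise on our models using the synthetic data, we add zero-mean Gaussian noise with unit variance, scaled by a noise level <inline-formula><mml:math display="inline"><mml:mi mathvariant="italic">η</mml:mi></mml:math></inline-formula>: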

          <disp-formula id="Ch1.E24" content-type="numbered"><label>24</label><mml:math id="M205" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="italic">θ</mml:mi><mml:mo mathvariant="normal">˙</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="italic">η</mml:mi><mml:mo>⋅</mml:mo><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M206" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">θ</mml:mi><mml:mo mathvariant="normal">˙</mml:mo></mml:mover></mml:math></inline-formula> is the noisy volumetric soil moisture content [%], and <inline-formula><mml:math id="M207" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> is the synthetic volumetric soil moisture content [%]. Three noise levels are tested in this work (<inline-formula><mml:math id="M208" display="inline"><mml:mrow><mml:mi mathvariant="italic">η</mml:mi><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 0.5, 1.0, and 2.0).</p>
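      <p>A minimal sketch of Eq. (24) is given below for illustration. The function name, the random seed, and the constant input series are hypothetical and are not taken from this study.</p>
      <preformat>
```python
import numpy as np


def add_noise(theta, eta, rng):
    """Eq. (24): theta plus eta times a standard normal draw."""
    return theta + eta * rng.standard_normal(theta.shape)


rng = np.random.default_rng(0)
theta = np.full(10000, 25.0)  # hypothetical constant series [%]

for eta in (0.5, 1.0, 2.0):
    noisy = add_noise(theta, eta, rng)
    # The sample standard deviation of the noise should be close to eta.
    print(eta, float(noisy.std()))
```
</preformat>
      <p>Because the standard normal draw is multiplied by the noise level, the standard deviation of the added noise equals the noise level and its variance equals the square of the noise level.</p>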
      <p id="d2e5276">In our investigation of model interpretability, the visualized non-local weight maps generated from the output serve as the main evaluation criterion. According to Eq. (2), the normalized weights <inline-formula><mml:math id="M209" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> quantify the relative influence of soil moisture at depth <inline-formula><mml:math id="M210" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> on the prediction at depth <inline-formula><mml:math id="M211" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>. These normalized interaction weights reflect how strongly soil moisture information from different depths on the previous day contributes to the predicted soil moisture at a given depth on the following day. The weight maps can therefore provide qualitative interpretations of the mechanisms of soil water dynamics. The color brightness in a weight map indicates the strength of the interaction among the upper boundary conditions and soil moisture at different depths. Analyzing the weight maps is thus essential for gaining insight into the learning mechanisms of our NLNN models.</p>
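      <p>A minimal sketch of how such a normalized weight map could be computed is given below. It assumes, purely for illustration, that the pairwise function f is the exponential of a dot product and that the normalization factor C sums f over j, which amounts to a softmax over j; the definitions of f and C in Eq. (2) may differ. The array layout, feature size, and names are hypothetical.</p>
      <preformat>
```python
import numpy as np


def normalized_weights(x):
    """Row-normalized pairwise weights f(x_i, x_j) / C.

    Assumption: f(x_i, x_j) = exp(x_i . x_j) and C = sum over j of f,
    one common choice for non-local operations. This is an illustration,
    not necessarily the form used in Eq. (2).
    """
    scores = x @ x.T
    # Subtract the row maximum for numerical stability (result unchanged).
    scores = scores - scores.max(axis=1, keepdims=True)
    f = np.exp(scores)
    return f / f.sum(axis=1, keepdims=True)


# Hypothetical features: row 0 stands for the upper boundary condition,
# rows 1 to 5 for the five observation depths.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))

w = normalized_weights(x)
print(w.shape)
print(w.sum(axis=1))  # each row sums to one
```
</preformat>
      <p>Under this assumption, each row of the map sums to one, so an entry can be read as the relative contribution of position j to the prediction at position i.</p>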
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Interpretability analysis</title>
      <p id="d2e5337">Before the models can be applied to real-world scenarios, their stability and interpretability must first be analyzed. In this section, we explore the interpretability of the NLNN models using several scenarios for which synthetic data are generated. These simulated cases primarily involve variations in soil properties and comprise homogeneous soil, heterogeneous soil, two-layered soil, and soil with root water uptake. We benchmark the soil moisture prediction tasks against the LSTM model, which is widely used in time series forecasting (Datta and Faroughi, 2023; Ding et al., 2019; Siami-Namini et al., 2019). Specifically, we consider two LSTM variants tailored to different data processing approaches. LSTM_T uses input data from the previous four time steps to predict the soil moisture content at the next time step, following a configuration similar to that in previous work (Wang et al., 2024); its predictions rely on modeling temporal dependencies. In contrast, LSTM_I replaces the non-local operations in the architecture shown in Fig. 2 with LSTM modules, thereby modeling interactions among soil layers, and represents the predictive capability achievable by a single-time-step LSTM. With the synthetic data, we investigate model performance and interpretability through the weight matrix maps and examine the learning mechanisms across the scenarios.</p>
      <p id="d2e5340">Figure 6 displays the RMSE results for the 1, 3, 7, and 15 d forecasts of the four models, and the MAE values for the four simulated scenarios are summarized in Appendix C. As shown in Fig. 6, the LSTM_T model achieves very high accuracy in 1 d predictions, but its performance deteriorates rapidly at longer lead times. The NLNNs and LSTM_I exhibit comparable performance. The knowledge-guided model KG-NLNN exhibits lower variance and a more stable RMSE, especially in the 15 d prediction task, which indicates that knowledge guidance is important for model stability.</p>

      <fig id="F6" specific-use="star"><label>Figure 6</label><caption><p id="d2e5345">RMSE of the 1, 3, 7, and 15 d forecasts for heterogeneous soil <bold>(a–e)</bold> and two-layered soil <bold>(f–j)</bold>. Error bars indicate the standard deviation of the RMSE computed over ten training replicates.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f06.png"/>

        </fig>

      <fig id="F7"><label>Figure 7</label><caption><p id="d2e5363">Non-local weight maps for the homogeneous soil scenarios, generated by KG-NLNN with <bold>(a)</bold> <inline-formula><mml:math id="M212" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 0.25 and <bold>(b)</bold> <inline-formula><mml:math id="M213" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 10.49, and by SA-NLNN with <bold>(c)</bold> <inline-formula><mml:math id="M214" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 0.25 and <bold>(d)</bold> <inline-formula><mml:math id="M215" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 10.49.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f07.png"/>

        </fig>

      <p id="d2e5437">Figure 7 depicts the weight matrix maps generated by the KG-NLNN and SA-NLNN models for homogeneous soil scenarios with different saturated hydraulic conductivity (<inline-formula><mml:math id="M216" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>) values. These maps represent the term <inline-formula><mml:math id="M217" display="inline"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:math></inline-formula> calculated through the non-local operations. Each element at position <inline-formula><mml:math id="M218" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the influence of soil moisture at depth <inline-formula><mml:math id="M219" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> at the previous time step on the soil moisture content at depth <inline-formula><mml:math id="M220" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. Notably, elements with <inline-formula><mml:math id="M221" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> represent the influence of the upper boundary conditions on soil moisture at the various depths. 
The brightness level corresponds to the strength of this influence, with higher brightness indicating a stronger influence. Homogeneous soil scenarios with different <inline-formula><mml:math id="M222" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> values are used to examine how the non-local weight matrices vary. The weight maps produced by the KG-NLNN model exhibit clear and stable spatial patterns across the different <inline-formula><mml:math id="M223" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> values, whereas the SA-NLNN results appear comparatively irregular, suggesting that the knowledge-guided structural design is a valuable enhancement.</p>
      <p id="d2e5559">Differences in hydraulic conductivity govern soil water flow velocity, leading to variations in the time required for water to reach different depths. These differences shape the structure of the weight maps and give rise to the distinct patterns observed in Fig. 7a and b. For instance, loam (<inline-formula><mml:math id="M224" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 0.25) exhibits slow infiltration, so its moisture content at each depth is mainly influenced by adjacent layers, as shown in Fig. 7a. In contrast, sand (<inline-formula><mml:math id="M225" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> 10.49) allows rapid infiltration, so that deeper soil moisture is affected directly by meteorological factors. Although the proposed model involves no parameterization and provides no quantitative description of the soil hydraulic parameters, it nevertheless offers some insight into these hydraulic properties.</p>
      <p id="d2e5589">Additionally, two-layer soil scenarios are employed in which the soil properties of the upper and lower layers are exchanged to further investigate changes in the non-local weight matrices. Figure 8 depicts the weight matrix maps generated by the KG-NLNN and SA-NLNN models for these two-layered soil scenarios. The saturated hydraulic conductivity of the two soil types differs significantly, giving them distinct water transport and drainage characteristics, as recorded in Appendix A. Some soil structural information, such as stratification, is reflected in the soil moisture interactions in Fig. 8a and b. In the scenario where sand lies beneath loam, water gradually released from the loam layer can quickly reach various depths of the sand below. Consequently, soil moisture in the lower layers is primarily influenced by the upper loam. As shown in Fig. 8a, the moisture at the lower depths (0.10, 0.20, 0.50, and 1.00 m) is notably influenced by the moisture at 0.05 m. Conversely, with sand above loam, the upper sand drains rapidly, and the water it releases is absorbed and held by the lower loam. Therefore, soil moisture in the lower layers is mainly affected by the adjacent upper layer, as shown in Fig. 8b. This layered pattern in the weight map serves as a qualitative indicator of soil texture. Although the weights do not have a direct quantitative relationship with the soil hydraulic parameters, they can reflect the difference in hydraulic conductivity between the layers and reveal which layer is more permeable.</p>

      <fig id="F8"><label>Figure 8</label><caption><p id="d2e5594">The non-local weight maps in two-layered simulated soil scenarios for KG-NLNN, <bold>(a)</bold> loam above sand and <bold>(b)</bold> sand above loam, and for SA-NLNN, <bold>(c)</bold> loam above sand and <bold>(d)</bold> sand above loam.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f08.png"/>

        </fig>

      <p id="d2e5616">Overall, both NLNN models achieve satisfactory soil moisture forecasts in the simulated scenarios. Furthermore, the non-local weight matrix maps improve the interpretability of the machine learning models. Notably, KG-NLNN offers more reliable qualitative descriptions of soil properties via weight visualizations, highlighting the importance of knowledge guidance.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Performance evaluation</title>
      <p id="d2e5627">In this section, we evaluate the performance of the SA-NLNN and KG-NLNN models using in-situ observations from twenty ISMN sites. The performance of LSTM_T, LSTM_I, SA-NLNN, and KG-NLNN is evaluated at five different depths (0.05, 0.1, 0.2, 0.5, 1.0 m). Notably, our NLNN models predict soil moisture for all five depths simultaneously, whereas LSTM_T models each depth separately. The inherent methodological differences between machine learning and physical models make a fair and direct comparison with standard knowledge-based modeling particularly challenging. We therefore limit our comparison with physical models to a preliminary assessment in Appendix B.</p>

      <fig id="F9"><label>Figure 9</label><caption><p id="d2e5632">Comparison of mean RMSE for LSTM_T, LSTM_I, SA-NLNN, and KG-NLNN. The values are averaged across twenty research sites and presented separately for each of the five soil depths: 0.05 m <bold>(a)</bold>, 0.10 m <bold>(b)</bold>, 0.20 m <bold>(c)</bold>, 0.50 m <bold>(d)</bold>, 1.00 m <bold>(e)</bold>.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f09.png"/>

        </fig>

<table-wrap id="T2" specific-use="star"><label>Table 2</label><caption><p id="d2e5659">The MAE [%] values of the 1, 3, and 7 d forecasts from the four models across twenty research sites at five distinct depths, based on ten repeated trainings. The bold values indicate the best performance among the models for each depth and forecast horizon.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="13">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right" colsep="1"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right" colsep="1"/>
     <oasis:colspec colnum="8" colname="col8" align="right"/>
     <oasis:colspec colnum="9" colname="col9" align="right"/>
     <oasis:colspec colnum="10" colname="col10" align="right" colsep="1"/>
     <oasis:colspec colnum="11" colname="col11" align="right"/>
     <oasis:colspec colnum="12" colname="col12" align="right"/>
     <oasis:colspec colnum="13" colname="col13" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Depth/m</oasis:entry>
         <oasis:entry rowsep="1" namest="col2" nameend="col13" align="center">MAE </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry rowsep="1" namest="col2" nameend="col4" align="center" colsep="1">KG-NLNN </oasis:entry>
         <oasis:entry rowsep="1" namest="col5" nameend="col7" align="center" colsep="1">SA-NLNN </oasis:entry>
         <oasis:entry rowsep="1" namest="col8" nameend="col10" align="center" colsep="1">LSTM_T </oasis:entry>
         <oasis:entry rowsep="1" namest="col11" nameend="col13" align="center">LSTM_I </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">1d</oasis:entry>
         <oasis:entry colname="col3">3d</oasis:entry>
         <oasis:entry colname="col4">7d</oasis:entry>
         <oasis:entry colname="col5">1d</oasis:entry>
         <oasis:entry colname="col6">3d</oasis:entry>
         <oasis:entry colname="col7">7d</oasis:entry>
         <oasis:entry colname="col8">1d</oasis:entry>
         <oasis:entry colname="col9">3d</oasis:entry>
         <oasis:entry colname="col10">7d</oasis:entry>
         <oasis:entry colname="col11">1d</oasis:entry>
         <oasis:entry colname="col12">3d</oasis:entry>
         <oasis:entry colname="col13">7d</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">0.05</oasis:entry>
         <oasis:entry colname="col2"><bold>0.391</bold></oasis:entry>
         <oasis:entry colname="col3"><bold>0.600</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.893</bold></oasis:entry>
         <oasis:entry colname="col5">0.440</oasis:entry>
         <oasis:entry colname="col6">0.666</oasis:entry>
         <oasis:entry colname="col7">0.979</oasis:entry>
         <oasis:entry colname="col8">0.737</oasis:entry>
         <oasis:entry colname="col9">1.074</oasis:entry>
         <oasis:entry colname="col10">1.515</oasis:entry>
         <oasis:entry colname="col11">0.808</oasis:entry>
         <oasis:entry colname="col12">1.203</oasis:entry>
         <oasis:entry colname="col13">1.713</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.10</oasis:entry>
         <oasis:entry colname="col2"><bold>0.392</bold></oasis:entry>
         <oasis:entry colname="col3"><bold>0.603</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.900</bold></oasis:entry>
         <oasis:entry colname="col5">0.431</oasis:entry>
         <oasis:entry colname="col6">0.659</oasis:entry>
         <oasis:entry colname="col7">0.972</oasis:entry>
         <oasis:entry colname="col8">0.498</oasis:entry>
         <oasis:entry colname="col9">0.726</oasis:entry>
         <oasis:entry colname="col10">1.027</oasis:entry>
         <oasis:entry colname="col11">0.506</oasis:entry>
         <oasis:entry colname="col12">0.771</oasis:entry>
         <oasis:entry colname="col13">1.113</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.20</oasis:entry>
         <oasis:entry colname="col2">0.397</oasis:entry>
         <oasis:entry colname="col3">0.607</oasis:entry>
         <oasis:entry colname="col4">0.900</oasis:entry>
         <oasis:entry colname="col5">0.431</oasis:entry>
         <oasis:entry colname="col6">0.648</oasis:entry>
         <oasis:entry colname="col7">0.947</oasis:entry>
         <oasis:entry colname="col8"><bold>0.356</bold></oasis:entry>
         <oasis:entry colname="col9">0.558</oasis:entry>
         <oasis:entry colname="col10">0.844</oasis:entry>
         <oasis:entry colname="col11">0.357</oasis:entry>
         <oasis:entry colname="col12"><bold>0.547</bold></oasis:entry>
         <oasis:entry colname="col13"><bold>0.787</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.50</oasis:entry>
         <oasis:entry colname="col2"><bold>0.392</bold></oasis:entry>
         <oasis:entry colname="col3"><bold>0.601</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.896</bold></oasis:entry>
         <oasis:entry colname="col5">0.432</oasis:entry>
         <oasis:entry colname="col6">0.648</oasis:entry>
         <oasis:entry colname="col7">0.962</oasis:entry>
         <oasis:entry colname="col8">0.405</oasis:entry>
         <oasis:entry colname="col9">0.632</oasis:entry>
         <oasis:entry colname="col10">0.955</oasis:entry>
         <oasis:entry colname="col11">0.403</oasis:entry>
         <oasis:entry colname="col12">0.620</oasis:entry>
         <oasis:entry colname="col13">0.909</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">1.00</oasis:entry>
         <oasis:entry colname="col2">0.394</oasis:entry>
         <oasis:entry colname="col3">0.602</oasis:entry>
         <oasis:entry colname="col4">0.885</oasis:entry>
         <oasis:entry colname="col5">0.422</oasis:entry>
         <oasis:entry colname="col6">0.641</oasis:entry>
         <oasis:entry colname="col7">0.943</oasis:entry>
         <oasis:entry colname="col8">0.245</oasis:entry>
         <oasis:entry colname="col9">0.386</oasis:entry>
         <oasis:entry colname="col10">0.597</oasis:entry>
         <oasis:entry colname="col11"><bold>0.243</bold></oasis:entry>
         <oasis:entry colname="col12"><bold>0.385</bold></oasis:entry>
         <oasis:entry colname="col13"><bold>0.592</bold></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e6012">Table 2 lists the MAE values of the 1, 3, and 7 d forecasts from the four models at five distinct depths across the twenty selected sites. These results are derived from ten repeated trainings, and the corresponding RMSE results are presented in Fig. 9. The MAE results show that both LSTM_I and LSTM_T perform well in deep soil moisture predictions, achieving the lowest errors at 0.20 and 1.00 m. Meanwhile, the proposed KG-NLNN model achieves the lowest MAE at 0.05, 0.10, and 0.50 m, and SA-NLNN also outperforms both LSTM models at 0.05 and 0.10 m. Regarding RMSE, the KG-NLNN model stands out as the best model in most situations. Figure 10 depicts the correlation between the 7 d soil moisture predictions and observations of the test set for LSTM_T, LSTM_I, SA-NLNN, and KG-NLNN. The density of scatter plots serves as an indicator of model reliability (Datta and Faroughi, 2023). The KG-NLNN model exhibits superior performance in soil moisture prediction compared to the other models, suggesting that it remains stable over longer prediction periods. The comparison between KG-NLNN and SA-NLNN underscores the value of incorporating soil water transport mechanisms into the decoupled non-local operations. Nevertheless, a limitation of the proposed NLNN models lies in their forecasts of moisture content at 1.0 m. This limitation could be attributed to the absence of lower boundary conditions in our study.</p>
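      <p>For reference, the two error metrics reported in Table 2 and Fig. 9 follow the standard definitions, which can be sketched as below. This is only an illustration: the function names mae and rmse and the four sample values are hypothetical, and the interpretation of the inputs as soil moisture in percent is an assumption based on the units given in Table 2.</p>
<preformat>
```python
import numpy as np


def mae(obs, pred):
    """Mean absolute error between observations and predictions."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.mean(np.abs(pred - obs)))


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


# Hypothetical soil moisture values in percent, for illustration only.
obs = np.array([20.0, 22.0, 25.0, 24.0])
pred = np.array([21.0, 22.0, 23.0, 24.0])

print("MAE :", mae(obs, pred))
print("RMSE:", rmse(obs, pred))
```
</preformat>
      <p>Because RMSE squares the residuals before averaging, it penalizes occasional large errors more strongly than MAE, which is one reason the two metrics can rank models differently.</p>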

      <fig id="F10" specific-use="star"><label>Figure 10</label><caption><p id="d2e6018">Scatter plots of the soil moisture observations and 7 d predictions generated from <bold>(a)</bold> LSTM_T, <bold>(b)</bold> LSTM_I, <bold>(c)</bold> SA-NLNN, and <bold>(d)</bold> KG-NLNN at UpperBethlehem.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f10.png"/>

        </fig>

      <fig id="F11" specific-use="star"><label>Figure 11</label><caption><p id="d2e6041">The autoregressive 24 d predicted soil moisture time series of 5 depths with LSTM_I, LSTM_T, KG-NLNN and SA-NLNN at Falkenberg <bold>(a–e)</bold>, Cape-Charles <bold>(f–j)</bold>, and Goodwell <bold>(k–o)</bold>. The shaded region represents the confidence interval of the models, spanning 1 standard deviation.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f11.png"/>

        </fig>

      <p id="d2e6059">To show how the model predictions evolve over time, Fig. 11 displays the autoregressive 24 d predicted soil moisture time series of the four models at three sites: Falkenberg, Cape-Charles, and Goodwell. The shaded region represents the confidence interval of the models, spanning 1 standard deviation. The LSTM-based models exhibit relatively greater uncertainty in their predictions. In contrast, both NLNN models perform satisfactorily and stably, with the proposed KG-NLNN model being closer to the observations. Because autoregressive errors accumulate over time in extended soil moisture forecasting, we provide additional long-term prediction results in Appendix B for a comprehensive evaluation.</p>
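      <p>The autoregressive procedure behind such multi-day forecasts can be sketched as follows: a one-step model is applied repeatedly, and each prediction is fed back as the input state for the next step. This is a minimal sketch under stated assumptions, not the authors' code. The names autoregressive_forecast and toy_step are hypothetical, and toy_step is a simple relaxation rule standing in for a trained one-step network.</p>
<preformat>
```python
import numpy as np


def autoregressive_forecast(step_fn, state0, forcings):
    """Roll a one-step model forward, feeding each prediction back in."""
    state = np.asarray(state0, dtype=float)
    trajectory = []
    for forcing in forcings:
        state = step_fn(state, forcing)  # prediction becomes the next input
        trajectory.append(state)
    return np.stack(trajectory)          # shape (n_steps, n_depths)


def toy_step(state, forcing):
    # Hypothetical stand-in for a trained one-step model:
    # the profile relaxes toward the forcing value.
    return 0.9 * state + 0.1 * forcing


state0 = np.full(5, 30.0)      # assumed initial moisture at five depths
forcings = np.full(24, 20.0)   # assumed daily forcing for 24 steps
series = autoregressive_forecast(toy_step, state0, forcings)
```
</preformat>
      <p>Since every step consumes the previous prediction rather than an observation, any one-step error is carried forward, which is why the accumulated error over long horizons is assessed separately.</p>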
      <p id="d2e6062">According to Sect. 4.1, the non-local weight maps can be qualitatively related to the soil properties, demonstrating the interpretability of the model. In real-world cases, even with the limited site soil information in Table 1, we can combine the weight maps with the measured soil texture data for our analysis. Figure 12 illustrates the non-local weight matrix maps for the Falkenberg, Cape-Charles, and UpperBethlehem sites, generated by the KG-NLNN model. These maps remain stable during repeated training, with discernible variations among the three sites, and they offer qualitative interpretations related to soil properties. Figure 12a shows that at the Falkenberg site, soil moisture at different depths is primarily influenced by the upper boundary conditions and the upper-layer soil moisture. Figure 12b shows that at the Cape-Charles site, soil moisture is mainly affected by the upper boundary conditions and by soil moisture at the same depth at the previous time step. Figure 12c indicates a strong soil water retention effect at the UpperBethlehem site, where soil moisture is mainly related to its own state at the previous time step. Comparison with Table 1 shows that the non-local weight maps are consistent with the soil texture information: from Falkenberg to UpperBethlehem, as the soil texture changes from sandy to clayey, the learnt water retention capacity in Fig. 12 increases. Consequently, the non-local weight maps are able to capture the different physical mechanisms at different sites from the measurement data.</p>

      <fig id="F12" specific-use="star"><label>Figure 12</label><caption><p id="d2e6068">The non-local weight maps through the KG-NLNN at three typical sites, <bold>(a)</bold> Falkenberg, <bold>(b)</bold> Cape-Charles, and <bold>(c)</bold> UpperBethlehem.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f12.png"/>

        </fig>

      <p id="d2e6086">In summary, our NLNN models achieve precise and efficient soil moisture predictions across diverse scenarios, as validated by comparisons with LSTMs using in-situ observations. Their multi-depth modeling strategy enhances overall accuracy through complementary interactions. The proposed KG-NLNN model delivers accurate predictions with low uncertainty, while also providing qualitative descriptions of the intricate soil properties. This performance underscores the value of guiding the design of the non-local operation with soil water transport knowledge.</p>
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Effects of the noise levels, time scales, and observation positions</title>
      <p id="d2e6097">In addition to model accuracy and interpretability, our non-local neural network exhibits adaptability in prediction tasks across different time scales. In this section, we conduct tests involving different noise levels, time intervals, and observation positions. To investigate the impact of noise on our NLNN models, we employ five different noise levels (0.5, 1.0, 2.0, 5.0, 10.0) and compare the performance of the NLNN models with that of the LSTM models. The RMSE results for soil moisture prediction at 0.05, 0.10, 0.20, 0.50, and 1.00 m are presented in Fig. 13. The LSTM_T model demonstrates poor noise resistance and poor long-term forecasting capability. The other three models perform similarly under low-noise conditions, with LSTM_I even exhibiting some advantage. However, as the noise level increases, the NLNN models demonstrate better robustness. Notably, the knowledge-guided NLNN is particularly stable, consistent with its performance on in-situ soil moisture data.</p>

      <fig id="F13" specific-use="star"><label>Figure 13</label><caption><p id="d2e6102">The RMSE results for 1, 3, 7, and 15 d at 0.05 m <bold>(a–d)</bold>, 0.10 m <bold>(e–h)</bold>, 0.20 m <bold>(i–l)</bold>, 0.50 m <bold>(m–p)</bold> and 1.0 m <bold>(q–t)</bold> in the homogeneous soil under increasing noise levels. The error bars indicate the standard deviation of the RMSE, computed over ten training replicates. Note: portions of the red curves are truncated where the error significantly exceeds the plotted range, reflecting the relatively lower predictive accuracy of the corresponding model.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f13.png"/>

        </fig>

      <p id="d2e6126">Figure 14 shows the weight maps generated by the KG-NLNN model at time intervals of 0.2, 0.5, and 1 d in homogeneous soil; the maps differ only subtly. Although accuracy decreases with longer time intervals, the model consistently achieves satisfactory results, which reflects its adaptability to diverse time scales.</p>
      <p id="d2e6131">When the number of observation locations increases to 10 (at depths of 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 m), the MAE values of the 1, 3, 7, and 15 d soil moisture forecasts from the NLNN models at five depths are summarized in Table 3. The uniform augmentation of measurements significantly enhances the prediction accuracy of SA-NLNN, while having minimal impact on the performance of KG-NLNN. This suggests that knowledge guidance lowers the requirements on soil moisture measurements. In scenarios with uniformly augmented observations, SA-NLNN may prove more efficient.</p>

<table-wrap id="T3" specific-use="star"><label>Table 3</label><caption><p id="d2e6137">The MAE [%] values of the 1, 3, 7, and 15 d forecasts from the proposed KG-NLNN and SA-NLNN models at five depths, using measurements at 10 depths in the homogeneous soil scenario. The bold values indicate the better performance of the two models for each depth and forecast horizon.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="9">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right" colsep="1"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:colspec colnum="8" colname="col8" align="right"/>
     <oasis:colspec colnum="9" colname="col9" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Depth/m</oasis:entry>
         <oasis:entry rowsep="1" namest="col2" nameend="col9" align="center">Homogeneous soil </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry rowsep="1" namest="col2" nameend="col5" align="center" colsep="1">KG-NLNN </oasis:entry>
         <oasis:entry rowsep="1" namest="col6" nameend="col9" align="center">SA-NLNN </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">1d</oasis:entry>
         <oasis:entry colname="col3">3d</oasis:entry>
         <oasis:entry colname="col4">7d</oasis:entry>
         <oasis:entry colname="col5">15d</oasis:entry>
         <oasis:entry colname="col6">1d</oasis:entry>
         <oasis:entry colname="col7">3d</oasis:entry>
         <oasis:entry colname="col8">7d</oasis:entry>
         <oasis:entry colname="col9">15d</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">0.05</oasis:entry>
         <oasis:entry colname="col2"><bold>0.327</bold></oasis:entry>
         <oasis:entry colname="col3"><bold>0.470</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.645</bold></oasis:entry>
         <oasis:entry colname="col5"><bold>0.817</bold></oasis:entry>
         <oasis:entry colname="col6">0.394</oasis:entry>
         <oasis:entry colname="col7">0.588</oasis:entry>
         <oasis:entry colname="col8">0.906</oasis:entry>
         <oasis:entry colname="col9">1.657</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.10</oasis:entry>
         <oasis:entry colname="col2">0.280</oasis:entry>
         <oasis:entry colname="col3">0.407</oasis:entry>
         <oasis:entry colname="col4">0.602</oasis:entry>
         <oasis:entry colname="col5"><bold>0.825</bold></oasis:entry>
         <oasis:entry colname="col6"><bold>0.250</bold></oasis:entry>
         <oasis:entry colname="col7"><bold>0.350</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.535</bold></oasis:entry>
         <oasis:entry colname="col9">0.892</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.20</oasis:entry>
         <oasis:entry colname="col2">0.331</oasis:entry>
         <oasis:entry colname="col3">0.564</oasis:entry>
         <oasis:entry colname="col4">0.979</oasis:entry>
         <oasis:entry colname="col5">1.419</oasis:entry>
         <oasis:entry colname="col6"><bold>0.221</bold></oasis:entry>
         <oasis:entry colname="col7"><bold>0.300</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.418</bold></oasis:entry>
         <oasis:entry colname="col9"><bold>0.604</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.50</oasis:entry>
         <oasis:entry colname="col2">0.174</oasis:entry>
         <oasis:entry colname="col3">0.258</oasis:entry>
         <oasis:entry colname="col4">0.380</oasis:entry>
         <oasis:entry colname="col5">0.581</oasis:entry>
         <oasis:entry colname="col6"><bold>0.148</bold></oasis:entry>
         <oasis:entry colname="col7"><bold>0.204</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.302</bold></oasis:entry>
         <oasis:entry colname="col9"><bold>0.502</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">1.00</oasis:entry>
         <oasis:entry colname="col2"><bold>0.108</bold></oasis:entry>
         <oasis:entry colname="col3">0.180</oasis:entry>
         <oasis:entry colname="col4">0.300</oasis:entry>
         <oasis:entry colname="col5">0.493</oasis:entry>
         <oasis:entry colname="col6">0.118</oasis:entry>
         <oasis:entry colname="col7"><bold>0.174</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.259</bold></oasis:entry>
         <oasis:entry colname="col9"><bold>0.460</bold></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e6396">In conclusion, both NLNN models achieve accurate and reliable soil moisture predictions under diverse scenarios and can adapt to tasks across different time scales. The SA-NLNN performs better under uniformly distributed observations, while the KG-NLNN demonstrates stronger noise resistance.</p>

      <fig id="F14" specific-use="star"><label>Figure 14</label><caption><p id="d2e6401">The non-local weight maps of the KG-NLNN model at different time scales, 0.2 d <bold>(a)</bold>, 0.5 d <bold>(b)</bold>, and 1.0 d <bold>(c)</bold>, in the homogeneous soil.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f14.png"/>

        </fig>

</sec>
</sec>
<sec id="Ch1.S5" sec-type="conclusions">
  <label>5</label><title>Conclusions</title>
      <p id="d2e6426">In this study, we employ NLNN deep learning models to achieve precise and efficient soil moisture predictions under diverse scenarios without relying on physical assumptions, while providing qualitative interpretations of complex soil moisture dynamics, such as vertical heterogeneity and inter-layer connectivity. In light of the accuracy and parameter estimation challenges in physical models, and the credibility concerns in machine learning models, we have introduced a framework that integrates both accuracy and mechanistic insight. Our method leverages in-profile soil moisture interactions across various depths. Consequently, the soil moisture prediction task is reformulated as a single-time-step prediction task that involves multi-depth soil moisture variables. Within this formulation, we apply the self-attention-based model SA-NLNN to explore the potential of the NLNN structure. Expanding on this framework, we disentangle the non-local operation into four components according to soil water transport knowledge to create the KG-NLNN model. By comparing our NLNNs with the LSTM models using synthetic data and in-situ observations, we demonstrate that both NLNN models achieve precise and effective forecasts, providing an alternative approach to soil moisture simulation. The knowledge-guided model KG-NLNN exhibits the best performance and remains stable with low uncertainty. The physical knowledge guidance in non-local operations significantly enhances the model's accuracy and reliability.</p>
      <p id="d2e6429">Additionally, our proposed models offer qualitative interpretations related to the soil properties. Through the investigation of various virtual scenarios (homogeneous soil, heterogeneous soil, two-layered soil, and soil with root water uptake), we observe that both the KG-NLNN and SA-NLNN models perform well in different soil conditions. The qualitative interpretations that KG-NLNN derives from soil moisture data facilitate descriptions of soil textures. When testing with in-situ data, we find that the KG-NLNN model also provides interpretations consistent with real soil vertical heterogeneity without physical assumptions. This highlights the importance of integrating knowledge guidance into model design. Moreover, we have assessed the models' performance under different noise conditions, observation positions, and time scales. Both NLNN models exhibit robustness to noise, and the knowledge guidance enhances noise resistance. In addition, the NLNN models demonstrate adaptability to diverse time scales. When observations are evenly distributed, the SA-NLNN shows significant improvements compared to KG-NLNN, while maintaining high computational efficiency.</p>
      <p id="d2e6432">Nevertheless, the model faces challenges that necessitate future improvements. Its training and application are site-specific, which limits its transferability, and further research is required to enhance its applicability across different sites. In particular, difficulties arise in estimating soil moisture content in deep layers, possibly because the groundwater boundary is not considered. Incorporating lower boundary conditions into the model could address this limitation. Additionally, multi-objective network training may benefit from more effective strategies and more precise loss function designs. Introducing constraints at multiple time steps holds promise for achieving more stable results. Finally, further refinement of the non-local operation may enhance the model's performance. Moreover, the proposed network framework is architecturally flexible and modular, making it customizable for diverse research requirements. Beyond soil moisture, the NLNN-based strategy could be extended to other systems, such as solute transport in groundwater. We encourage the exploration of such specialized structures to address various coupled physical or hydrological problems across different scales.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title/>
      <p id="d2e6446">The parameters used to generate the synthetic data are recorded in Tables A1 and A2.</p>

<table-wrap id="TA1"><label>Table A1</label><caption><p id="d2e6453">The van Genuchten soil hydraulic parameters (van Genuchten, 1980) used for synthetic data generation.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Case Design</oasis:entry>
         <oasis:entry colname="col2">Homogeneous</oasis:entry>
         <oasis:entry colname="col3">Heterogeneous</oasis:entry>
         <oasis:entry colname="col4">Two-layered</oasis:entry>
         <oasis:entry colname="col5">Soil with root</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">soil</oasis:entry>
         <oasis:entry colname="col3">soil</oasis:entry>
         <oasis:entry colname="col4">soil</oasis:entry>
         <oasis:entry colname="col5">water uptake</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M226" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>r</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> [–]</oasis:entry>
         <oasis:entry colname="col2">0.078</oasis:entry>
         <oasis:entry colname="col3">0.078</oasis:entry>
         <oasis:entry colname="col4">0.078</oasis:entry>
         <oasis:entry colname="col5">0.078</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M227" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> [–]</oasis:entry>
         <oasis:entry colname="col2">0.43</oasis:entry>
         <oasis:entry colname="col3">0.43</oasis:entry>
         <oasis:entry colname="col4">0.43</oasis:entry>
         <oasis:entry colname="col5">0.43</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M228" display="inline"><mml:mi mathvariant="italic">α</mml:mi></mml:math></inline-formula> [cm<sup>−1</sup>]</oasis:entry>
         <oasis:entry colname="col2">3.6</oasis:entry>
         <oasis:entry colname="col3">3.6</oasis:entry>
         <oasis:entry colname="col4">3.6</oasis:entry>
         <oasis:entry colname="col5">3.6</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M230" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>  [–]</oasis:entry>
         <oasis:entry colname="col2">1.56</oasis:entry>
         <oasis:entry colname="col3">1.56</oasis:entry>
         <oasis:entry colname="col4">1.56</oasis:entry>
         <oasis:entry colname="col5">1.56</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M231" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> [cm d<sup>−1</sup>] (0–10 cm)</oasis:entry>
         <oasis:entry colname="col2">0.250</oasis:entry>
         <oasis:entry colname="col3">Table A2</oasis:entry>
         <oasis:entry colname="col4">0.250</oasis:entry>
         <oasis:entry colname="col5">0.250</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M233" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> [cm d<sup>−1</sup>] (10–100 cm)</oasis:entry>
         <oasis:entry colname="col2">0.250</oasis:entry>
         <oasis:entry colname="col3">Table A2</oasis:entry>
         <oasis:entry colname="col4">10.49</oasis:entry>
         <oasis:entry colname="col5">0.250</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M235" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula> [–]</oasis:entry>
         <oasis:entry colname="col2">0.5</oasis:entry>
         <oasis:entry colname="col3">0.5</oasis:entry>
         <oasis:entry colname="col4">0.5</oasis:entry>
         <oasis:entry colname="col5">0.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Presence of plant</oasis:entry>
         <oasis:entry colname="col2">False</oasis:entry>
         <oasis:entry colname="col3">False</oasis:entry>
         <oasis:entry colname="col4">False</oasis:entry>
         <oasis:entry colname="col5">True</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<table-wrap id="TA2"><label>Table A2</label><caption><p id="d2e6756">The soil hydraulic conductivity of the heterogeneous scenario.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="11">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:colspec colnum="8" colname="col8" align="right"/>
     <oasis:colspec colnum="9" colname="col9" align="right"/>
     <oasis:colspec colnum="10" colname="col10" align="right"/>
     <oasis:colspec colnum="11" colname="col11" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Depth [cm]</oasis:entry>
         <oasis:entry namest="col2" nameend="col11" align="center"><inline-formula><mml:math id="M236" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> [cm d<sup>−1</sup>] </oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">0–10 cm</oasis:entry>
         <oasis:entry colname="col2">0.226</oasis:entry>
         <oasis:entry colname="col3">0.270</oasis:entry>
         <oasis:entry colname="col4">0.241</oasis:entry>
         <oasis:entry colname="col5">0.263</oasis:entry>
         <oasis:entry colname="col6">0.222</oasis:entry>
         <oasis:entry colname="col7">0.226</oasis:entry>
         <oasis:entry colname="col8">0.263</oasis:entry>
         <oasis:entry colname="col9">0.221</oasis:entry>
         <oasis:entry colname="col10">0.262</oasis:entry>
         <oasis:entry colname="col11">0.276</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">10–20 cm</oasis:entry>
         <oasis:entry colname="col2">0.230</oasis:entry>
         <oasis:entry colname="col3">0.226</oasis:entry>
         <oasis:entry colname="col4">0.217</oasis:entry>
         <oasis:entry colname="col5">0.226</oasis:entry>
         <oasis:entry colname="col6">0.249</oasis:entry>
         <oasis:entry colname="col7">0.203</oasis:entry>
         <oasis:entry colname="col8">0.229</oasis:entry>
         <oasis:entry colname="col9">0.196</oasis:entry>
         <oasis:entry colname="col10">0.207</oasis:entry>
         <oasis:entry colname="col11">0.202</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">20–30 cm</oasis:entry>
         <oasis:entry colname="col2">0.200</oasis:entry>
         <oasis:entry colname="col3">0.239</oasis:entry>
         <oasis:entry colname="col4">0.244</oasis:entry>
         <oasis:entry colname="col5">0.253</oasis:entry>
         <oasis:entry colname="col6">0.251</oasis:entry>
         <oasis:entry colname="col7">0.248</oasis:entry>
         <oasis:entry colname="col8">0.203</oasis:entry>
         <oasis:entry colname="col9">0.225</oasis:entry>
         <oasis:entry colname="col10">0.206</oasis:entry>
         <oasis:entry colname="col11">0.205</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">30–40 cm</oasis:entry>
         <oasis:entry colname="col2">0.241</oasis:entry>
         <oasis:entry colname="col3">0.223</oasis:entry>
         <oasis:entry colname="col4">0.197</oasis:entry>
         <oasis:entry colname="col5">0.227</oasis:entry>
         <oasis:entry colname="col6">0.218</oasis:entry>
         <oasis:entry colname="col7">0.256</oasis:entry>
         <oasis:entry colname="col8">0.258</oasis:entry>
         <oasis:entry colname="col9">0.294</oasis:entry>
         <oasis:entry colname="col10">0.308</oasis:entry>
         <oasis:entry colname="col11">0.242</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">40–50 cm</oasis:entry>
         <oasis:entry colname="col2">0.242</oasis:entry>
         <oasis:entry colname="col3">0.155</oasis:entry>
         <oasis:entry colname="col4">0.177</oasis:entry>
         <oasis:entry colname="col5">0.184</oasis:entry>
         <oasis:entry colname="col6">0.218</oasis:entry>
         <oasis:entry colname="col7">0.230</oasis:entry>
         <oasis:entry colname="col8">0.225</oasis:entry>
         <oasis:entry colname="col9">0.211</oasis:entry>
         <oasis:entry colname="col10">0.207</oasis:entry>
         <oasis:entry colname="col11">0.252</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">50–60 cm</oasis:entry>
         <oasis:entry colname="col2">0.285</oasis:entry>
         <oasis:entry colname="col3">0.338</oasis:entry>
         <oasis:entry colname="col4">0.351</oasis:entry>
         <oasis:entry colname="col5">0.345</oasis:entry>
         <oasis:entry colname="col6">0.317</oasis:entry>
         <oasis:entry colname="col7">0.355</oasis:entry>
         <oasis:entry colname="col8">0.333</oasis:entry>
         <oasis:entry colname="col9">0.343</oasis:entry>
         <oasis:entry colname="col10">0.322</oasis:entry>
         <oasis:entry colname="col11">0.320</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">60–70 cm</oasis:entry>
         <oasis:entry colname="col2">0.261</oasis:entry>
         <oasis:entry colname="col3">0.272</oasis:entry>
         <oasis:entry colname="col4">0.306</oasis:entry>
         <oasis:entry colname="col5">0.279</oasis:entry>
         <oasis:entry colname="col6">0.319</oasis:entry>
         <oasis:entry colname="col7">0.250</oasis:entry>
         <oasis:entry colname="col8">0.262</oasis:entry>
         <oasis:entry colname="col9">0.224</oasis:entry>
         <oasis:entry colname="col10">0.240</oasis:entry>
         <oasis:entry colname="col11">0.269</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">70–80 cm</oasis:entry>
         <oasis:entry colname="col2">0.269</oasis:entry>
         <oasis:entry colname="col3">0.300</oasis:entry>
         <oasis:entry colname="col4">0.276</oasis:entry>
         <oasis:entry colname="col5">0.250</oasis:entry>
         <oasis:entry colname="col6">0.267</oasis:entry>
         <oasis:entry colname="col7">0.233</oasis:entry>
         <oasis:entry colname="col8">0.240</oasis:entry>
         <oasis:entry colname="col9">0.249</oasis:entry>
         <oasis:entry colname="col10">0.207</oasis:entry>
         <oasis:entry colname="col11">0.233</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">80–90 cm</oasis:entry>
         <oasis:entry colname="col2">0.202</oasis:entry>
         <oasis:entry colname="col3">0.209</oasis:entry>
         <oasis:entry colname="col4">0.208</oasis:entry>
         <oasis:entry colname="col5">0.248</oasis:entry>
         <oasis:entry colname="col6">0.231</oasis:entry>
         <oasis:entry colname="col7">0.232</oasis:entry>
         <oasis:entry colname="col8">0.245</oasis:entry>
         <oasis:entry colname="col9">0.258</oasis:entry>
         <oasis:entry colname="col10">0.250</oasis:entry>
         <oasis:entry colname="col11">0.222</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">90–100 cm</oasis:entry>
         <oasis:entry colname="col2">0.254</oasis:entry>
         <oasis:entry colname="col3">0.211</oasis:entry>
         <oasis:entry colname="col4">0.201</oasis:entry>
         <oasis:entry colname="col5">0.203</oasis:entry>
         <oasis:entry colname="col6">0.186</oasis:entry>
         <oasis:entry colname="col7">0.213</oasis:entry>
         <oasis:entry colname="col8">0.233</oasis:entry>
         <oasis:entry colname="col9">0.196</oasis:entry>
         <oasis:entry colname="col10">0.247</oasis:entry>
         <oasis:entry colname="col11">0.213</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
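      <p>For reference, the parameters in Tables A1 and A2 are those of the van Genuchten (1980) retention curve combined with the Mualem conductivity function. The following Python sketch is not taken from the code used in this study. It only illustrates how the tabulated values enter these standard closed-form relations. All function and variable names are hypothetical, the numbers are copied from the homogeneous column of Table A1, and the pressure head h is assumed to be given in the length unit that matches the tabulated α.</p>
<preformat>
```python
# Values copied from the homogeneous column of Table A1 (units as listed there).
THETA_R = 0.078   # residual water content [-]
THETA_S = 0.43    # saturated water content [-]
ALPHA = 3.6       # van Genuchten alpha (inverse length)
N = 1.56          # van Genuchten n [-]
K_S = 0.250       # saturated hydraulic conductivity (length per day)
L = 0.5           # pore-connectivity parameter l [-]


def effective_saturation(h, alpha=ALPHA, n=N):
    """Effective saturation Se for pressure head h (negative when unsaturated)."""
    suction = max(-h, 0.0)          # zero for saturated or ponded conditions
    m = 1.0 - 1.0 / n
    return (1.0 + (alpha * suction) ** n) ** (-m)


def water_content(h, theta_r=THETA_R, theta_s=THETA_S, alpha=ALPHA, n=N):
    """Volumetric water content from the van Genuchten retention curve."""
    se = effective_saturation(h, alpha, n)
    return theta_r + (theta_s - theta_r) * se


def conductivity(h, k_s=K_S, alpha=ALPHA, n=N, l=L):
    """Unsaturated hydraulic conductivity from the Mualem-van Genuchten model."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(h, alpha, n)
    return k_s * se ** l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2
```
</preformat>
      <p>At h = 0 the functions return the saturated water content and the saturated conductivity, respectively. For the heterogeneous case, the argument k_s would be replaced by the depth-specific values listed in Table A2.</p>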

</app>

<app id="App1.Ch1.S2">
  <label>Appendix B</label><title/>
      <p id="d2e7203">This section presents a preliminary comparison between the NLNN model and the physics-based soil moisture model derived from Richards' equation.</p>
      <p id="d2e7206">The Ross method (Ross, 2003, 2006) is a rapid, non-iterative numerical scheme for forward modeling of soil moisture based on Richards' equation. For the boundary conditions, the daily reference evapotranspiration (ET0) is calculated with the FAO Penman-Monteith method (Allen et al., 1998). Following the FAO guidelines (Allen et al., 1998), actual evapotranspiration is the product of <inline-formula><mml:math id="M238" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>C</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and ET0, where <inline-formula><mml:math id="M239" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>C</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is an empirical coefficient. When generating the synthetic data, we applied this coefficient method to obtain a preliminary evapotranspiration estimate, with a coefficient value of 1.0. We first use 10 d of historical site data to invert the site-specific soil hydraulic parameters (<inline-formula><mml:math id="M240" display="inline"><mml:mi mathvariant="italic">α</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M241" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M242" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>) through data assimilation with the ensemble Kalman filter (EnKF) method (Evensen, 2003) within the Ross framework. These parameters are then applied in the Ross method to obtain a fast solution of the one-dimensional Richards' equation, which is used to forecast soil moisture dynamics.</p>
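      <p>The two ingredients of this baseline, the coefficient method for evapotranspiration and the EnKF parameter update, can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the implementation used in this study: the update is written for a single scalar parameter with perturbed observations, toy_forward is a hypothetical linear stand-in for the Ross solver, and all names and numerical values are chosen for illustration only.</p>
<preformat>
```python
import random


def actual_et(et0, kc=1.0):
    """Actual evapotranspiration as Kc * ET0 (Kc = 1.0 for the synthetic data)."""
    return kc * et0


def enkf_update(params, predictions, observation, obs_std, rng):
    """One stochastic EnKF analysis step for a scalar parameter.

    params:      forecast ensemble of parameter values
    predictions: simulated observation for each member, same order as params
    observation: measured value that the predictions are compared with
    obs_std:     standard deviation of the observation error
    """
    n = len(params)
    mean_p = sum(params) / n
    mean_y = sum(predictions) / n
    cov_py = sum((p - mean_p) * (y - mean_y)
                 for p, y in zip(params, predictions)) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in predictions) / (n - 1)
    gain = cov_py / (var_y + obs_std ** 2)      # scalar Kalman gain
    # Each member is nudged toward its own perturbed copy of the observation.
    return [p + gain * (observation + rng.gauss(0.0, obs_std) - y)
            for p, y in zip(params, predictions)]


def toy_forward(ks):
    """Hypothetical stand-in for the Ross solver: output proportional to Ks."""
    return 2.0 * ks


rng = random.Random(0)
prior = [rng.gauss(0.40, 0.10) for _ in range(50)]      # biased first guess
predicted = [toy_forward(ks) for ks in prior]
posterior = enkf_update(prior, predicted,
                        observation=toy_forward(0.25), obs_std=0.01, rng=rng)
prior_mean = sum(prior) / len(prior)
posterior_mean = sum(posterior) / len(posterior)
```
</preformat>
      <p>In this toy setting the ensemble mean moves from the biased first guess toward the value that generated the observation. In the workflow described above, the forward model would be the Ross solver and the three hydraulic parameters would be updated jointly, so the gain would become a matrix.</p>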
      <p id="d2e7262">For the real-world experiments, we selected three sites, Falkenberg, Cape-Charles, and Goodwell, which have distinctly different soil textures and land covers (see Table 1 of the main text). Figure B1 shows the 24 d autoregressive soil moisture forecasts of the KG-NLNN model and Ross-EnKF at these three sites, and the corresponding MAE values are listed in Table B1. At all three sites, the KG-NLNN forecasts are closer to the observations than those of the traditional Ross-EnKF method, with MAE values between 0.681 % and 1.998 % compared with 3.840 % to 5.484 %.</p><table-wrap id="TB1"><label>Table B1</label><caption><p id="d2e7269">The MAE [%] values for 24 d forecasts of the proposed KG-NLNN model and the Ross-EnKF model.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Falkenberg</oasis:entry>
         <oasis:entry colname="col3">Cape-Charles</oasis:entry>
         <oasis:entry colname="col4">Goodwell</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">KG-NLNN</oasis:entry>
         <oasis:entry colname="col2">0.681</oasis:entry>
         <oasis:entry colname="col3">1.766</oasis:entry>
         <oasis:entry colname="col4">1.998</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Ross-EnKF</oasis:entry>
         <oasis:entry colname="col2">4.395</oasis:entry>
         <oasis:entry colname="col3">5.484</oasis:entry>
         <oasis:entry colname="col4">3.840</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e7336">However, it should be noted that the data assimilation process in Ross-EnKF did not update soil infiltration parameters, which potentially disadvantages the physical model. Moreover, unlike physical models, the proposed approach cannot predict soil moisture at arbitrary depths and times. These fundamental differences between machine learning and physical modeling make fair, direct comparisons with standard methods both important and difficult.</p>

      <fig id="FB1"><label>Figure B1</label><caption><p id="d2e7341">The 24 d predicted soil moisture time series at five depths with KG-NLNN and Ross-EnKF at Falkenberg <bold>(a–e)</bold>, Cape-Charles <bold>(f–j)</bold>, and Goodwell <bold>(k–o)</bold>.</p></caption>
        
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f15.png"/>

      </fig>

      <fig id="FB2"><label>Figure B2</label><caption><p id="d2e7363">The 120 d predicted soil moisture time series at five depths with KG-NLNN and SA-NLNN at Falkenberg <bold>(a–e)</bold>, Cape-Charles <bold>(f–j)</bold>, and Goodwell <bold>(k–o)</bold>.</p></caption>
        
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2973/2026/hess-30-2973-2026-f16.png"/>

      </fig>

      <p id="d2e7383">Moreover, our machine learning approach exhibits autoregressive error accumulation in long-term soil moisture predictions, a limitation not observed in knowledge-based modeling. As shown by the 120 d autoregressive forecasts (Fig. B2), model uncertainty accumulates gradually with prediction time but remains within acceptable bounds. Importantly, the knowledge-guided KG-NLNN model remains considerably more stable than SA-NLNN across the entire prediction horizon.</p>
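      <p>The error accumulation described above can be illustrated with a toy rollout. The sketch below is not the KG-NLNN model: true_step and learned_step are hypothetical one-step dynamics, soil moisture is assumed to be a volumetric fraction so that the MAE in percent is 100 times the mean absolute difference, and the MAE for a lead time of d days is assumed to be averaged over the first d forecast days.</p>
<preformat>
```python
def rollout(step, state, horizon):
    """Autoregressive forecast: each prediction becomes the next input."""
    path = []
    for _ in range(horizon):
        state = step(state)
        path.append(state)
    return path


def mae_percent(predicted, observed):
    """Mean absolute error of volumetric fractions, expressed in percent."""
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    return 100.0 * sum(errors) / len(errors)


def true_step(theta):
    """Hypothetical reference dynamics: slow drying toward 0.10."""
    return 0.10 + 0.97 * (theta - 0.10)


def learned_step(theta):
    """Hypothetical one-step model with a small systematic error."""
    return 0.10 + 0.975 * (theta - 0.10)


observed = rollout(true_step, 0.35, 24)        # 24 daily reference values
forecast = rollout(learned_step, 0.35, 24)     # 24 d autoregressive forecast
mae_by_lead = {d: mae_percent(forecast[:d], observed[:d]) for d in (1, 3, 7, 15)}
```
</preformat>
      <p>Because every prediction is fed back as input, the small one-step error compounds and the MAE grows with lead time, which mirrors the qualitative behaviour seen in the tables of Appendix C.</p>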
</app>

<app id="App1.Ch1.S3">
  <label>Appendix C</label><title/>

<table-wrap id="TC1"><label>Table C1</label><caption><p id="d2e7400">The MAE [%] values for 1, 3, 7, and 15 d forecasts of LSTM_T, LSTM_I, SA-NLNN, and the proposed KG-NLNN model at five depths under the four designed scenarios. Bold values indicate the best performance for each metric across the models.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="17">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right" colsep="1"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="right"/>
     <oasis:colspec colnum="8" colname="col8" align="right"/>
     <oasis:colspec colnum="9" colname="col9" align="right" colsep="1"/>
     <oasis:colspec colnum="10" colname="col10" align="right"/>
     <oasis:colspec colnum="11" colname="col11" align="right"/>
     <oasis:colspec colnum="12" colname="col12" align="right"/>
     <oasis:colspec colnum="13" colname="col13" align="right" colsep="1"/>
     <oasis:colspec colnum="14" colname="col14" align="right"/>
     <oasis:colspec colnum="15" colname="col15" align="right"/>
     <oasis:colspec colnum="16" colname="col16" align="right"/>
     <oasis:colspec colnum="17" colname="col17" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Depth/m</oasis:entry>
         <oasis:entry rowsep="1" namest="col2" nameend="col17" align="center">KG-NLNN </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry rowsep="1" namest="col2" nameend="col5" align="center" colsep="1">homogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col6" nameend="col9" align="center" colsep="1">heterogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col10" nameend="col13" align="center" colsep="1">two-layered </oasis:entry>
         <oasis:entry rowsep="1" namest="col14" nameend="col17" align="center">root water uptake </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">1d</oasis:entry>
         <oasis:entry colname="col3">3d</oasis:entry>
         <oasis:entry colname="col4">7d</oasis:entry>
         <oasis:entry colname="col5">15d</oasis:entry>
         <oasis:entry colname="col6">1d</oasis:entry>
         <oasis:entry colname="col7">3d</oasis:entry>
         <oasis:entry colname="col8">7d</oasis:entry>
         <oasis:entry colname="col9">15d</oasis:entry>
         <oasis:entry colname="col10">1d</oasis:entry>
         <oasis:entry colname="col11">3d</oasis:entry>
         <oasis:entry colname="col12">7d</oasis:entry>
         <oasis:entry colname="col13">15d</oasis:entry>
         <oasis:entry colname="col14">1d</oasis:entry>
         <oasis:entry colname="col15">3d</oasis:entry>
         <oasis:entry colname="col16">7d</oasis:entry>
         <oasis:entry colname="col17">15d</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">0.05</oasis:entry>
         <oasis:entry colname="col2">0.235</oasis:entry>
         <oasis:entry colname="col3"><bold>0.327</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.433</bold></oasis:entry>
         <oasis:entry colname="col5"><bold>0.539</bold></oasis:entry>
         <oasis:entry colname="col6"><bold>0.259</bold></oasis:entry>
         <oasis:entry colname="col7"><bold>0.372</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.510</bold></oasis:entry>
         <oasis:entry colname="col9"><bold>0.652</bold></oasis:entry>
         <oasis:entry colname="col10">0.449</oasis:entry>
         <oasis:entry colname="col11">0.680</oasis:entry>
         <oasis:entry colname="col12">0.945</oasis:entry>
         <oasis:entry colname="col13"><bold>1.170</bold></oasis:entry>
         <oasis:entry colname="col14">0.528</oasis:entry>
         <oasis:entry colname="col15">0.747</oasis:entry>
         <oasis:entry colname="col16">0.996</oasis:entry>
         <oasis:entry colname="col17">1.224</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.10</oasis:entry>
         <oasis:entry colname="col2">0.313</oasis:entry>
         <oasis:entry colname="col3">0.451</oasis:entry>
         <oasis:entry colname="col4">0.627</oasis:entry>
         <oasis:entry colname="col5">0.788</oasis:entry>
         <oasis:entry colname="col6">0.306</oasis:entry>
         <oasis:entry colname="col7">0.431</oasis:entry>
         <oasis:entry colname="col8">0.593</oasis:entry>
         <oasis:entry colname="col9">0.749</oasis:entry>
         <oasis:entry colname="col10">0.521</oasis:entry>
         <oasis:entry colname="col11">0.745</oasis:entry>
         <oasis:entry colname="col12">0.995</oasis:entry>
         <oasis:entry colname="col13"><bold>1.191</bold></oasis:entry>
         <oasis:entry colname="col14">0.409</oasis:entry>
         <oasis:entry colname="col15">0.544</oasis:entry>
         <oasis:entry colname="col16">0.685</oasis:entry>
         <oasis:entry colname="col17">0.825</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.20</oasis:entry>
         <oasis:entry colname="col2">0.342</oasis:entry>
         <oasis:entry colname="col3">0.533</oasis:entry>
         <oasis:entry colname="col4">0.776</oasis:entry>
         <oasis:entry colname="col5">1.016</oasis:entry>
         <oasis:entry colname="col6">0.305</oasis:entry>
         <oasis:entry colname="col7">0.488</oasis:entry>
         <oasis:entry colname="col8">0.736</oasis:entry>
         <oasis:entry colname="col9">0.971</oasis:entry>
         <oasis:entry colname="col10">0.433</oasis:entry>
         <oasis:entry colname="col11">0.649</oasis:entry>
         <oasis:entry colname="col12">0.901</oasis:entry>
         <oasis:entry colname="col13"><bold>1.179</bold></oasis:entry>
         <oasis:entry colname="col14">0.623</oasis:entry>
         <oasis:entry colname="col15">0.852</oasis:entry>
         <oasis:entry colname="col16">1.056</oasis:entry>
         <oasis:entry colname="col17">1.212</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.50</oasis:entry>
         <oasis:entry colname="col2">0.235</oasis:entry>
         <oasis:entry colname="col3">0.357</oasis:entry>
         <oasis:entry colname="col4">0.545</oasis:entry>
         <oasis:entry colname="col5">0.782</oasis:entry>
         <oasis:entry colname="col6">0.253</oasis:entry>
         <oasis:entry colname="col7">0.399</oasis:entry>
         <oasis:entry colname="col8">0.630</oasis:entry>
         <oasis:entry colname="col9">0.952</oasis:entry>
         <oasis:entry colname="col10">0.334</oasis:entry>
         <oasis:entry colname="col11">0.518</oasis:entry>
         <oasis:entry colname="col12">0.774</oasis:entry>
         <oasis:entry colname="col13">1.098</oasis:entry>
         <oasis:entry colname="col14">0.375</oasis:entry>
         <oasis:entry colname="col15">0.598</oasis:entry>
         <oasis:entry colname="col16">0.870</oasis:entry>
         <oasis:entry colname="col17">1.182</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">1.00</oasis:entry>
         <oasis:entry colname="col2">0.203</oasis:entry>
         <oasis:entry colname="col3"><bold>0.312</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.445</bold></oasis:entry>
         <oasis:entry colname="col5"><bold>0.647</bold></oasis:entry>
         <oasis:entry colname="col6">0.244</oasis:entry>
         <oasis:entry colname="col7">0.397</oasis:entry>
         <oasis:entry colname="col8">0.618</oasis:entry>
         <oasis:entry colname="col9">0.934</oasis:entry>
         <oasis:entry colname="col10">0.368</oasis:entry>
         <oasis:entry colname="col11">0.625</oasis:entry>
         <oasis:entry colname="col12">0.969</oasis:entry>
         <oasis:entry colname="col13">1.329</oasis:entry>
         <oasis:entry colname="col14">0.278</oasis:entry>
         <oasis:entry colname="col15">0.455</oasis:entry>
         <oasis:entry colname="col16">0.749</oasis:entry>
         <oasis:entry colname="col17"><bold>1.120</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Depth/m</oasis:entry>
         <oasis:entry rowsep="1" namest="col2" nameend="col17" align="center">SA-NLNN </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry rowsep="1" namest="col2" nameend="col5" align="center" colsep="1">homogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col6" nameend="col9" align="center" colsep="1">heterogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col10" nameend="col13" align="center" colsep="1">two-layered </oasis:entry>
         <oasis:entry rowsep="1" namest="col14" nameend="col17" align="center">root water uptake </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">1d</oasis:entry>
         <oasis:entry colname="col3">3d</oasis:entry>
         <oasis:entry colname="col4">7d</oasis:entry>
         <oasis:entry colname="col5">15d</oasis:entry>
         <oasis:entry colname="col6">1d</oasis:entry>
         <oasis:entry colname="col7">3d</oasis:entry>
         <oasis:entry colname="col8">7d</oasis:entry>
         <oasis:entry colname="col9">15d</oasis:entry>
         <oasis:entry colname="col10">1d</oasis:entry>
         <oasis:entry colname="col11">3d</oasis:entry>
         <oasis:entry colname="col12">7d</oasis:entry>
         <oasis:entry colname="col13">15d</oasis:entry>
         <oasis:entry colname="col14">1d</oasis:entry>
         <oasis:entry colname="col15">3d</oasis:entry>
         <oasis:entry colname="col16">7d</oasis:entry>
         <oasis:entry colname="col17">15d</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.05</oasis:entry>
         <oasis:entry colname="col2">0.328</oasis:entry>
         <oasis:entry colname="col3">0.470</oasis:entry>
         <oasis:entry colname="col4">0.686</oasis:entry>
         <oasis:entry colname="col5">1.039</oasis:entry>
         <oasis:entry colname="col6">0.363</oasis:entry>
         <oasis:entry colname="col7">0.524</oasis:entry>
         <oasis:entry colname="col8">0.750</oasis:entry>
         <oasis:entry colname="col9">1.840</oasis:entry>
         <oasis:entry colname="col10">0.327</oasis:entry>
         <oasis:entry colname="col11">0.505</oasis:entry>
         <oasis:entry colname="col12">0.836</oasis:entry>
         <oasis:entry colname="col13">2.210</oasis:entry>
         <oasis:entry colname="col14">0.536</oasis:entry>
         <oasis:entry colname="col15">0.918</oasis:entry>
         <oasis:entry colname="col16">2.150</oasis:entry>
         <oasis:entry colname="col17">6.702</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.10</oasis:entry>
         <oasis:entry colname="col2">0.249</oasis:entry>
         <oasis:entry colname="col3">0.375</oasis:entry>
         <oasis:entry colname="col4">0.580</oasis:entry>
         <oasis:entry colname="col5">0.957</oasis:entry>
         <oasis:entry colname="col6">0.220</oasis:entry>
         <oasis:entry colname="col7">0.314</oasis:entry>
         <oasis:entry colname="col8">0.477</oasis:entry>
         <oasis:entry colname="col9">0.851</oasis:entry>
         <oasis:entry colname="col10">0.390</oasis:entry>
         <oasis:entry colname="col11">0.569</oasis:entry>
         <oasis:entry colname="col12"><bold>0.808</bold></oasis:entry>
         <oasis:entry colname="col13">1.465</oasis:entry>
         <oasis:entry colname="col14">0.322</oasis:entry>
         <oasis:entry colname="col15">0.447</oasis:entry>
         <oasis:entry colname="col16">0.675</oasis:entry>
         <oasis:entry colname="col17">1.480</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.20</oasis:entry>
         <oasis:entry colname="col2">0.262</oasis:entry>
         <oasis:entry colname="col3">0.366</oasis:entry>
         <oasis:entry colname="col4">0.519</oasis:entry>
         <oasis:entry colname="col5">0.820</oasis:entry>
         <oasis:entry colname="col6">0.292</oasis:entry>
         <oasis:entry colname="col7">0.389</oasis:entry>
         <oasis:entry colname="col8">0.482</oasis:entry>
         <oasis:entry colname="col9"><bold>0.648</bold></oasis:entry>
         <oasis:entry colname="col10">0.487</oasis:entry>
         <oasis:entry colname="col11">0.696</oasis:entry>
         <oasis:entry colname="col12">0.945</oasis:entry>
         <oasis:entry colname="col13">1.350</oasis:entry>
         <oasis:entry colname="col14">0.379</oasis:entry>
         <oasis:entry colname="col15">0.546</oasis:entry>
         <oasis:entry colname="col16">0.775</oasis:entry>
         <oasis:entry colname="col17">1.861</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.50</oasis:entry>
         <oasis:entry colname="col2">0.209</oasis:entry>
         <oasis:entry colname="col3">0.291</oasis:entry>
         <oasis:entry colname="col4">0.414</oasis:entry>
         <oasis:entry colname="col5">0.566</oasis:entry>
         <oasis:entry colname="col6">0.265</oasis:entry>
         <oasis:entry colname="col7">0.337</oasis:entry>
         <oasis:entry colname="col8">0.431</oasis:entry>
         <oasis:entry colname="col9"><bold>0.623</bold></oasis:entry>
         <oasis:entry colname="col10">0.327</oasis:entry>
         <oasis:entry colname="col11">0.483</oasis:entry>
         <oasis:entry colname="col12">0.708</oasis:entry>
         <oasis:entry colname="col13"><bold>1.018</bold></oasis:entry>
         <oasis:entry colname="col14">0.344</oasis:entry>
         <oasis:entry colname="col15">0.485</oasis:entry>
         <oasis:entry colname="col16">0.687</oasis:entry>
         <oasis:entry colname="col17">1.502</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">1.00</oasis:entry>
         <oasis:entry colname="col2">0.245</oasis:entry>
         <oasis:entry colname="col3">0.376</oasis:entry>
         <oasis:entry colname="col4">0.575</oasis:entry>
         <oasis:entry colname="col5">0.807</oasis:entry>
         <oasis:entry colname="col6">0.282</oasis:entry>
         <oasis:entry colname="col7">0.430</oasis:entry>
         <oasis:entry colname="col8">0.640</oasis:entry>
         <oasis:entry colname="col9">0.941</oasis:entry>
         <oasis:entry colname="col10">0.336</oasis:entry>
         <oasis:entry colname="col11">0.530</oasis:entry>
         <oasis:entry colname="col12">0.810</oasis:entry>
         <oasis:entry colname="col13"><bold>1.250</bold></oasis:entry>
         <oasis:entry colname="col14">0.297</oasis:entry>
         <oasis:entry colname="col15">0.482</oasis:entry>
         <oasis:entry colname="col16">0.820</oasis:entry>
         <oasis:entry colname="col17">1.748</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Depth/m</oasis:entry>
         <oasis:entry rowsep="1" namest="col2" nameend="col17" align="center">LSTM_T </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry rowsep="1" namest="col2" nameend="col5" align="center" colsep="1">homogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col6" nameend="col9" align="center" colsep="1">heterogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col10" nameend="col13" align="center" colsep="1">two-layered  </oasis:entry>
         <oasis:entry rowsep="1" namest="col14" nameend="col17" align="center">root water uptake </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">1d</oasis:entry>
         <oasis:entry colname="col3">3d</oasis:entry>
         <oasis:entry colname="col4">7d</oasis:entry>
         <oasis:entry colname="col5">15d</oasis:entry>
         <oasis:entry colname="col6">1d</oasis:entry>
         <oasis:entry colname="col7">3d</oasis:entry>
         <oasis:entry colname="col8">7d</oasis:entry>
         <oasis:entry colname="col9">15d</oasis:entry>
         <oasis:entry colname="col10">1d</oasis:entry>
         <oasis:entry colname="col11">3d</oasis:entry>
         <oasis:entry colname="col12">7d</oasis:entry>
         <oasis:entry colname="col13">15d</oasis:entry>
         <oasis:entry colname="col14">1d</oasis:entry>
         <oasis:entry colname="col15">3d</oasis:entry>
         <oasis:entry colname="col16">7d</oasis:entry>
         <oasis:entry colname="col17">15d</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.05</oasis:entry>
         <oasis:entry colname="col2"><bold>0.009</bold></oasis:entry>
         <oasis:entry colname="col3">1.503</oasis:entry>
         <oasis:entry colname="col4">2.791</oasis:entry>
         <oasis:entry colname="col5">3.874</oasis:entry>
         <oasis:entry colname="col6"><bold>0.010</bold></oasis:entry>
         <oasis:entry colname="col7">1.520</oasis:entry>
         <oasis:entry colname="col8">2.829</oasis:entry>
         <oasis:entry colname="col9">3.936</oasis:entry>
         <oasis:entry colname="col10"><bold>0.015</bold></oasis:entry>
         <oasis:entry colname="col11">1.422</oasis:entry>
         <oasis:entry colname="col12">2.795</oasis:entry>
         <oasis:entry colname="col13">4.106</oasis:entry>
         <oasis:entry colname="col14"><bold>0.020</bold></oasis:entry>
         <oasis:entry colname="col15">1.649</oasis:entry>
         <oasis:entry colname="col16">3.187</oasis:entry>
         <oasis:entry colname="col17">4.641</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.10</oasis:entry>
         <oasis:entry colname="col2"><bold>0.007</bold></oasis:entry>
         <oasis:entry colname="col3">1.176</oasis:entry>
         <oasis:entry colname="col4">2.237</oasis:entry>
         <oasis:entry colname="col5">3.184</oasis:entry>
         <oasis:entry colname="col6"><bold>0.007</bold></oasis:entry>
         <oasis:entry colname="col7">1.202</oasis:entry>
         <oasis:entry colname="col8">2.282</oasis:entry>
         <oasis:entry colname="col9">3.240</oasis:entry>
         <oasis:entry colname="col10"><bold>0.015</bold></oasis:entry>
         <oasis:entry colname="col11">1.479</oasis:entry>
         <oasis:entry colname="col12">2.879</oasis:entry>
         <oasis:entry colname="col13">4.179</oasis:entry>
         <oasis:entry colname="col14"><bold>0.014</bold></oasis:entry>
         <oasis:entry colname="col15">1.255</oasis:entry>
         <oasis:entry colname="col16">2.449</oasis:entry>
         <oasis:entry colname="col17">3.584</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.20</oasis:entry>
         <oasis:entry colname="col2"><bold>0.008</bold></oasis:entry>
         <oasis:entry colname="col3">0.786</oasis:entry>
         <oasis:entry colname="col4">1.630</oasis:entry>
         <oasis:entry colname="col5">2.380</oasis:entry>
         <oasis:entry colname="col6"><bold>0.010</bold></oasis:entry>
         <oasis:entry colname="col7">0.782</oasis:entry>
         <oasis:entry colname="col8">1.628</oasis:entry>
         <oasis:entry colname="col9">2.384</oasis:entry>
         <oasis:entry colname="col10"><bold>0.012</bold></oasis:entry>
         <oasis:entry colname="col11">0.836</oasis:entry>
         <oasis:entry colname="col12">1.735</oasis:entry>
         <oasis:entry colname="col13">2.561</oasis:entry>
         <oasis:entry colname="col14"><bold>0.013</bold></oasis:entry>
         <oasis:entry colname="col15">0.801</oasis:entry>
         <oasis:entry colname="col16">1.671</oasis:entry>
         <oasis:entry colname="col17">2.479</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.50</oasis:entry>
         <oasis:entry colname="col2"><bold>0.006</bold></oasis:entry>
         <oasis:entry colname="col3">0.406</oasis:entry>
         <oasis:entry colname="col4">0.942</oasis:entry>
         <oasis:entry colname="col5">1.483</oasis:entry>
         <oasis:entry colname="col6"><bold>0.008</bold></oasis:entry>
         <oasis:entry colname="col7">0.373</oasis:entry>
         <oasis:entry colname="col8">0.872</oasis:entry>
         <oasis:entry colname="col9">1.375</oasis:entry>
         <oasis:entry colname="col10"><bold>0.008</bold></oasis:entry>
         <oasis:entry colname="col11">0.400</oasis:entry>
         <oasis:entry colname="col12">0.933</oasis:entry>
         <oasis:entry colname="col13">1.476</oasis:entry>
         <oasis:entry colname="col14"><bold>0.009</bold></oasis:entry>
         <oasis:entry colname="col15">0.403</oasis:entry>
         <oasis:entry colname="col16">0.939</oasis:entry>
         <oasis:entry colname="col17">1.482</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">1.00</oasis:entry>
         <oasis:entry colname="col2"><bold>0.008</bold></oasis:entry>
         <oasis:entry colname="col3">0.266</oasis:entry>
         <oasis:entry colname="col4">0.662</oasis:entry>
         <oasis:entry colname="col5">1.116</oasis:entry>
         <oasis:entry colname="col6"><bold>0.007</bold></oasis:entry>
         <oasis:entry colname="col7">0.266</oasis:entry>
         <oasis:entry colname="col8">0.664</oasis:entry>
         <oasis:entry colname="col9">1.121</oasis:entry>
         <oasis:entry colname="col10"><bold>0.006</bold></oasis:entry>
         <oasis:entry colname="col11">0.258</oasis:entry>
         <oasis:entry colname="col12">0.644</oasis:entry>
         <oasis:entry colname="col13">1.103</oasis:entry>
         <oasis:entry colname="col14"><bold>0.006</bold></oasis:entry>
         <oasis:entry colname="col15">0.267</oasis:entry>
         <oasis:entry colname="col16">0.667</oasis:entry>
         <oasis:entry colname="col17">1.136</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Depth/m</oasis:entry>
         <oasis:entry rowsep="1" namest="col2" nameend="col17" align="center">LSTM_I </oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry rowsep="1" namest="col2" nameend="col5" align="center" colsep="1">homogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col6" nameend="col9" align="center" colsep="1">heterogeneous </oasis:entry>
         <oasis:entry rowsep="1" namest="col10" nameend="col13" align="center" colsep="1">two-layered  </oasis:entry>
         <oasis:entry rowsep="1" namest="col14" nameend="col17" align="center">root water uptake </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">1d</oasis:entry>
         <oasis:entry colname="col3">3d</oasis:entry>
         <oasis:entry colname="col4">7d</oasis:entry>
         <oasis:entry colname="col5">15d</oasis:entry>
         <oasis:entry colname="col6">1d</oasis:entry>
         <oasis:entry colname="col7">3d</oasis:entry>
         <oasis:entry colname="col8">7d</oasis:entry>
         <oasis:entry colname="col9">15d</oasis:entry>
         <oasis:entry colname="col10">1d</oasis:entry>
         <oasis:entry colname="col11">3d</oasis:entry>
         <oasis:entry colname="col12">7d</oasis:entry>
         <oasis:entry colname="col13">15d</oasis:entry>
         <oasis:entry colname="col14">1d</oasis:entry>
         <oasis:entry colname="col15">3d</oasis:entry>
         <oasis:entry colname="col16">7d</oasis:entry>
         <oasis:entry colname="col17">15d</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.05</oasis:entry>
         <oasis:entry colname="col2">0.318</oasis:entry>
         <oasis:entry colname="col3">0.440</oasis:entry>
         <oasis:entry colname="col4">0.590</oasis:entry>
         <oasis:entry colname="col5">0.845</oasis:entry>
         <oasis:entry colname="col6">0.346</oasis:entry>
         <oasis:entry colname="col7">0.484</oasis:entry>
         <oasis:entry colname="col8">0.656</oasis:entry>
         <oasis:entry colname="col9">0.948</oasis:entry>
         <oasis:entry colname="col10">0.264</oasis:entry>
         <oasis:entry colname="col11"><bold>0.451</bold></oasis:entry>
         <oasis:entry colname="col12"><bold>0.771</bold></oasis:entry>
         <oasis:entry colname="col13">1.313</oasis:entry>
         <oasis:entry colname="col14">0.343</oasis:entry>
         <oasis:entry colname="col15"><bold>0.462</bold></oasis:entry>
         <oasis:entry colname="col16"><bold>0.600</bold></oasis:entry>
         <oasis:entry colname="col17"><bold>0.804</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.10</oasis:entry>
         <oasis:entry colname="col2">0.135</oasis:entry>
         <oasis:entry colname="col3"><bold>0.202</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.319</bold></oasis:entry>
         <oasis:entry colname="col5"><bold>0.528</bold></oasis:entry>
         <oasis:entry colname="col6">0.149</oasis:entry>
         <oasis:entry colname="col7"><bold>0.249</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.408</bold></oasis:entry>
         <oasis:entry colname="col9"><bold>0.699</bold></oasis:entry>
         <oasis:entry colname="col10">0.359</oasis:entry>
         <oasis:entry colname="col11"><bold>0.542</bold></oasis:entry>
         <oasis:entry colname="col12">0.863</oasis:entry>
         <oasis:entry colname="col13">1.436</oasis:entry>
         <oasis:entry colname="col14">0.274</oasis:entry>
         <oasis:entry colname="col15"><bold>0.365</bold></oasis:entry>
         <oasis:entry colname="col16"><bold>0.491</bold></oasis:entry>
         <oasis:entry colname="col17"><bold>0.707</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.20</oasis:entry>
         <oasis:entry colname="col2">0.120</oasis:entry>
         <oasis:entry colname="col3"><bold>0.174</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.262</bold></oasis:entry>
         <oasis:entry colname="col5"><bold>0.444</bold></oasis:entry>
         <oasis:entry colname="col6">0.138</oasis:entry>
         <oasis:entry colname="col7"><bold>0.217</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.359</bold></oasis:entry>
         <oasis:entry colname="col9">0.669</oasis:entry>
         <oasis:entry colname="col10">0.218</oasis:entry>
         <oasis:entry colname="col11"><bold>0.320</bold></oasis:entry>
         <oasis:entry colname="col12"><bold>0.545</bold></oasis:entry>
         <oasis:entry colname="col13">1.072</oasis:entry>
         <oasis:entry colname="col14">0.159</oasis:entry>
         <oasis:entry colname="col15"><bold>0.238</bold></oasis:entry>
         <oasis:entry colname="col16"><bold>0.366</bold></oasis:entry>
         <oasis:entry colname="col17"><bold>0.594</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">0.50</oasis:entry>
         <oasis:entry colname="col2">0.128</oasis:entry>
         <oasis:entry colname="col3"><bold>0.177</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>0.274</bold></oasis:entry>
         <oasis:entry colname="col5"><bold>0.494</bold></oasis:entry>
         <oasis:entry colname="col6">0.142</oasis:entry>
         <oasis:entry colname="col7"><bold>0.207</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.341</bold></oasis:entry>
         <oasis:entry colname="col9">0.640</oasis:entry>
         <oasis:entry colname="col10">0.179</oasis:entry>
         <oasis:entry colname="col11"><bold>0.293</bold></oasis:entry>
         <oasis:entry colname="col12"><bold>0.542</bold></oasis:entry>
         <oasis:entry colname="col13">1.090</oasis:entry>
         <oasis:entry colname="col14">0.173</oasis:entry>
         <oasis:entry colname="col15"><bold>0.276</bold></oasis:entry>
         <oasis:entry colname="col16"><bold>0.443</bold></oasis:entry>
         <oasis:entry colname="col17"><bold>0.742</bold></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">1.00</oasis:entry>
         <oasis:entry colname="col2">0.214</oasis:entry>
         <oasis:entry colname="col3">0.350</oasis:entry>
         <oasis:entry colname="col4">0.594</oasis:entry>
         <oasis:entry colname="col5">1.075</oasis:entry>
         <oasis:entry colname="col6">0.188</oasis:entry>
         <oasis:entry colname="col7"><bold>0.288</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.465</bold></oasis:entry>
         <oasis:entry colname="col9"><bold>0.865</bold></oasis:entry>
         <oasis:entry colname="col10">0.180</oasis:entry>
         <oasis:entry colname="col11"><bold>0.293</bold></oasis:entry>
         <oasis:entry colname="col12"><bold>0.578</bold></oasis:entry>
         <oasis:entry colname="col13">1.343</oasis:entry>
         <oasis:entry colname="col14">0.242</oasis:entry>
         <oasis:entry colname="col15"><bold>0.393</bold></oasis:entry>
         <oasis:entry colname="col16"><bold>0.672</bold></oasis:entry>
         <oasis:entry colname="col17">1.203</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</app>
  </app-group><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d2e8984">The data and code used in this paper are available at <ext-link xlink:href="https://doi.org/10.5281/zenodo.10408929" ext-link-type="DOI">10.5281/zenodo.10408929</ext-link> (Wang, 2023).</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e8993">YW: Conceptualization, Methodology, Software, Writing – original draft. XH: Writing – review &amp; editing, Supervision. YH: Supervision. LH: Writing – review &amp; editing. LS: Writing – review &amp; editing, Supervision. LW: Writing – review &amp; editing. WS: Methodology, Writing – review &amp; editing.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e8999">The contact author has declared that none of the authors has any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e9007">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e9013">This work was supported by the National Key Research and Development Program of China (grant 2021YFC3201203) and the National Natural Science Foundation of China (grants 52425901 and U2243235).</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e9018">This research has been supported by the National Key Research and Development Program of China, Chinese Polar Environment Comprehensive Investigation and Assessment Programmes (grant no. 2021YFC3201203) and by the National Natural Science Foundation of China (grant nos. 52425901 and U2243235).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e9025">This paper was edited by Bo Guo and reviewed by four anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bib1"><label>1</label><mixed-citation> Allen, R. G., Pereira, L. S., Raes, D., and Smith, M.: Crop evapotranspiration – Guidelines for computing crop water requirements, FAO Irrigation and Drainage Paper 56, FAO, Rome, 300, D05109, ISBN 92-5-104219-5, 1998.</mixed-citation></ref>
      <ref id="bib1.bib2"><label>2</label><mixed-citation>Bandai, T. and Ghezzehei, T. A.: Physics-Informed Neural Networks With Monotonicity Constraints for Richardson-Richards Equation: Estimation of Constitutive Relationships and Soil Water Flux Density From Volumetric Water Content Measurements, Water Resour. Res., 57, <ext-link xlink:href="https://doi.org/10.1029/2020WR027642" ext-link-type="DOI">10.1029/2020WR027642</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib3"><label>3</label><mixed-citation>De Bézenac, E., Pajot, A., and Gallinari, P.: Deep learning for physical processes: Incorporating prior scientific knowledge, 6th Int. Conf. Learn. Represent. ICLR 2018 – Conf. Track Proc., <ext-link xlink:href="https://doi.org/10.1088/1742-5468/ab3195" ext-link-type="DOI">10.1088/1742-5468/ab3195</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib4"><label>4</label><mixed-citation> Breiman, L.: Bagging predictors, Mach. Learn., 24, 123–140, 1996.</mixed-citation></ref>
      <ref id="bib1.bib5"><label>5</label><mixed-citation>Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., and Vandergheynst, P.: Geometric Deep Learning: Going beyond Euclidean data, IEEE Signal Process. Mag., 34, 18–42, <ext-link xlink:href="https://doi.org/10.1109/MSP.2017.2693418" ext-link-type="DOI">10.1109/MSP.2017.2693418</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib6"><label>6</label><mixed-citation>Buckingham, E.: Studies on the movement of soil moisture, U.S. Department of Agriculture, Bureau of Soils, <uri>https://archive.org/details/studiesonmovemen38buck</uri> (last access: 13 May 2026), 1907.</mixed-citation></ref>
      <ref id="bib1.bib7"><label>7</label><mixed-citation> Cybenko, G.: Approximation by superpositions of a sigmoidal function, Math. Control. Signal., 2, 303–314, 1989.</mixed-citation></ref>
      <ref id="bib1.bib8"><label>8</label><mixed-citation>Datta, P. and Faroughi, S. A.: A multihead LSTM technique for prognostic prediction of soil moisture, Geoderma, 433, 116452, <ext-link xlink:href="https://doi.org/10.1016/j.geoderma.2023.116452" ext-link-type="DOI">10.1016/j.geoderma.2023.116452</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bib9"><label>9</label><mixed-citation>Devlin, J., Chang, M. W., Lee, K., and Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding, NAACL HLT 2019 – 2019 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. – Proc. Conf., 1, 4171–4186, <ext-link xlink:href="https://doi.org/10.18653/v1/N19-1423" ext-link-type="DOI">10.18653/v1/N19-1423</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib10"><label>10</label><mixed-citation>Ding, Y., Zhu, Y., Wu, Y., Jun, F., and Cheng, Z.: Spatio-Temporal attention lstm model for flood forecasting, Proc. – 2019 IEEE Int. Congr. Cybermatics 12th IEEE Int. Conf. Internet Things, 15th IEEE Int. Conf. Green Comput. Commun. 12th IEEE Int. Conf. Cyber, Phys. So, 458–465, <ext-link xlink:href="https://doi.org/10.1109/iThings/GreenCom/CPSCom/SmartData.2019.00095">https://doi.org/10.1109/iThings/GreenCom/CPSCom/SmartData.2019.00095</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib11"><label>11</label><mixed-citation> Elman, J. L.: Finding structure in time, Cogn. Sci., 14, 179–211, 1990.</mixed-citation></ref>
      <ref id="bib1.bib12"><label>12</label><mixed-citation>Entekhabi, D., Rodriguez-Iturbe, I., and Castelli, F.: Mutual interaction of soil moisture state and atmospheric processes, J. Hydrol., 184, 3–17, <ext-link xlink:href="https://doi.org/10.1016/0022-1694(95)02965-6" ext-link-type="DOI">10.1016/0022-1694(95)02965-6</ext-link>, 1996.</mixed-citation></ref>
      <ref id="bib1.bib13"><label>13</label><mixed-citation> Evensen, G.: The ensemble Kalman filter: Theoretical formulation and practical implementation, Ocean Dynam., 53, 343–367, 2003.</mixed-citation></ref>
      <ref id="bib1.bib14"><label>14</label><mixed-citation>Fang, K., Pan, M., and Shen, C.: The Value of SMAP for Long-Term Soil Moisture Estimation with the Help of Deep Learning, IEEE T. Geosci. Remote, 57, 2221–2233, <ext-link xlink:href="https://doi.org/10.1109/TGRS.2018.2872131" ext-link-type="DOI">10.1109/TGRS.2018.2872131</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib15"><label>15</label><mixed-citation>Gehring, J., Auli, M., Grangier, D., Yarats, D., and Dauphin, Y. N.: Convolutional sequence to sequence learning, in: International conference on machine learning, 1243–1252, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1705.03122" ext-link-type="DOI">10.48550/arXiv.1705.03122</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib16"><label>16</label><mixed-citation>van Genuchten, M. T.: A Closed-form Equation for Predicting the Hydraulic Conductivity of Unsaturated Soils, Soil Sci. Soc. Am. J., 44, 892–898, <ext-link xlink:href="https://doi.org/10.2136/sssaj1980.03615995004400050002x" ext-link-type="DOI">10.2136/sssaj1980.03615995004400050002x</ext-link>, 1980.</mixed-citation></ref>
      <ref id="bib1.bib17"><label>17</label><mixed-citation>Gill, M. K., Asefa, T., Kemblowski, M. W., and McKee, M.: Soil moisture prediction using support vector machines, J. Am. Water Resour. Assoc., 42, 1033–1046, <ext-link xlink:href="https://doi.org/10.1111/j.1752-1688.2006.tb04512.x" ext-link-type="DOI">10.1111/j.1752-1688.2006.tb04512.x</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bib18"><label>18</label><mixed-citation> Guo, M.-H., Xu, T.-X., Liu, J.-J., Liu, Z.-N., Jiang, P.-T., Mu, T.-J., Zhang, S.-H., Martin, R. R., Cheng, M.-M., and Hu, S.-M.: Attention mechanisms in computer vision: A survey, Comput. Vis. Media, 8, 331–368, 2022.</mixed-citation></ref>
      <ref id="bib1.bib19"><label>19</label><mixed-citation>Guswa, A. J., Celia, M. A., and Rodriguez-Iturbe, I.: Models of soil moisture dynamics in ecohydrology: A comparative study, Water Resour. Res., 38, 5-1–5-15, <ext-link xlink:href="https://doi.org/10.1029/2001wr000826" ext-link-type="DOI">10.1029/2001wr000826</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bib20"><label>20</label><mixed-citation>Heathman, G. C., Cosh, M. H., Merwade, V., and Han, E.: Multi-scale temporal stability analysis of surface and subsurface soil moisture within the Upper Cedar Creek Watershed, Indiana, Catena, 95, 91–103, <ext-link xlink:href="https://doi.org/10.1016/j.catena.2012.03.008" ext-link-type="DOI">10.1016/j.catena.2012.03.008</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bib21"><label>21</label><mixed-citation> Hochreiter, S. and Schmidhuber, J.: Long short-term memory, Neural Comput., 9, 1735–1780, 1997.</mixed-citation></ref>
      <ref id="bib1.bib22"><label>22</label><mixed-citation>Huang, G. Bin, Zhu, Q. Y., and Siew, C. K.: Extreme learning machine: Theory and applications, Neurocomputing, 70, 489–501, <ext-link xlink:href="https://doi.org/10.1016/j.neucom.2005.12.126" ext-link-type="DOI">10.1016/j.neucom.2005.12.126</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bib23"><label>23</label><mixed-citation>Jiang, S., Zheng, Y., and Solomatine, D.: Improving AI System Awareness of Geoscience Knowledge: Symbiotic Integration of Physical Approaches and Deep Learning, Geophys. Res. Lett., 47, <ext-link xlink:href="https://doi.org/10.1029/2020GL088229" ext-link-type="DOI">10.1029/2020GL088229</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib24"><label>24</label><mixed-citation> Khan, S., Naseer, M., Hayat, M., Zamir, S. W., Khan, F. S., and Shah, M.: Transformers in vision: A survey, ACM Comput. Surv., 54, 1–41, 2022.</mixed-citation></ref>
      <ref id="bib1.bib25"><label>25</label><mixed-citation>Kingma, D. P. and Ba, J. L.: Adam: A method for stochastic optimization, 3rd Int. Conf. Learn. Represent. ICLR 2015 – Conf. Track Proc., 1–15, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1412.6980" ext-link-type="DOI">10.48550/arXiv.1412.6980</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib26"><label>26</label><mixed-citation> Kornelsen, K. C. and Coulibaly, P.: Root-zone soil moisture estimation using data-driven methods, Water Resour. Res., 50, 2946–2962, 2014.</mixed-citation></ref>
      <ref id="bib1.bib27"><label>27</label><mixed-citation> Koster, R. D., Dirmeyer, P. A., Guo, Z., Bonan, G., Chan, E., Cox, P., Gordon, C. T., Kanae, S., Kowalczyk, E., and Lawrence, D.: Regions of strong coupling between soil moisture and precipitation, Science, 305, 1138–1140, 2004.</mixed-citation></ref>
      <ref id="bib1.bib28"><label>28</label><mixed-citation>LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521, 436–444, <ext-link xlink:href="https://doi.org/10.1038/nature14539" ext-link-type="DOI">10.1038/nature14539</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib29"><label>29</label><mixed-citation> LeCun, Y.: Generalization and network design strategies, Connect. Perspect., Elsevier (North-Holland), 19, 18, ISBN-10: 0444880615, 1989.</mixed-citation></ref>
      <ref id="bib1.bib30"><label>30</label><mixed-citation>Lim, B., Arık, S., Loeff, N., and Pfister, T.: Temporal Fusion Transformers for interpretable multi-horizon time series forecasting, Int. J. Forecast., 37, 1748–1764, <ext-link xlink:href="https://doi.org/10.1016/j.ijforecast.2021.03.012" ext-link-type="DOI">10.1016/j.ijforecast.2021.03.012</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib31"><label>31</label><mixed-citation>Liu, P., Chang, S., Huang, X., Tang, J., and Cheung, J. C. K.: Contextualized non-local neural networks for sequence learning, in: Proceedings of the AAAI Conference on Artificial Intelligence, 6762–6769, <ext-link xlink:href="https://doi.org/10.1609/aaai.v33i01.33016762" ext-link-type="DOI">10.1609/aaai.v33i01.33016762</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib32"><label>32</label><mixed-citation>Liu, Y., Mei, L., and Ki, S. O.: Prediction of soil moisture based on Extreme Learning Machine for an apple orchard, CCIS 2014 – Proc. 2014 IEEE 3rd Int. Conf. Cloud Comput. Intell. Syst., 400–404, <ext-link xlink:href="https://doi.org/10.1109/CCIS.2014.7175768" ext-link-type="DOI">10.1109/CCIS.2014.7175768</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bib33"><label>33</label><mixed-citation>Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B.: Swin Transformer, 2021 IEEE/CVF Int. Conf. Comput. Vis., 9992–10002, <ext-link xlink:href="https://doi.org/10.1109/ICCV48922.2021.00986" ext-link-type="DOI">10.1109/ICCV48922.2021.00986</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib34"><label>34</label><mixed-citation>Minasny, B., Bandai, T., Ghezzehei, T. A., Huang, Y. C., Ma, Y., McBratney, A. B., Ng, W., Norouzi, S., Padarian, J., Rudiyanto, Sharififar, A., Styc, Q., and Widyastuti, M.: Soil Science-Informed Machine Learning, Geoderma, 452, 117094, <ext-link xlink:href="https://doi.org/10.1016/j.geoderma.2024.117094" ext-link-type="DOI">10.1016/j.geoderma.2024.117094</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib35"><label>35</label><mixed-citation>Prasad, R., Deo, R. C., Li, Y., and Maraseni, T.: Weekly soil moisture forecasting with multivariate sequential, ensemble empirical mode decomposition and Boruta-random forest hybridizer algorithm approach, Catena, 177, 149–166, <ext-link xlink:href="https://doi.org/10.1016/j.catena.2019.02.012" ext-link-type="DOI">10.1016/j.catena.2019.02.012</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib36"><label>36</label><mixed-citation>Raissi, M., Perdikaris, P., and Karniadakis, G. E.: Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations, 1–22, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1711.10561" ext-link-type="DOI">10.48550/arXiv.1711.10561</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bib37"><label>37</label><mixed-citation>Raissi, M., Perdikaris, P., and Karniadakis, G. E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 378, 686–707, <ext-link xlink:href="https://doi.org/10.1016/j.jcp.2018.10.045" ext-link-type="DOI">10.1016/j.jcp.2018.10.045</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib38"><label>38</label><mixed-citation>Richards, L. A.: Capillary conduction of liquids through porous mediums, J. Appl. Phys., 1, 318–333, <ext-link xlink:href="https://doi.org/10.1063/1.1745010" ext-link-type="DOI">10.1063/1.1745010</ext-link>, 1931.</mixed-citation></ref>
      <ref id="bib1.bib39"><label>39</label><mixed-citation> Romero-Ruiz, A., Linde, N., Keller, T., and Or, D.: A review of geophysical methods for soil structure characterization, Rev. Geophys., 56, 672–697, 2018.</mixed-citation></ref>
      <ref id="bib1.bib40"><label>40</label><mixed-citation> Ross, P. J.: Modeling soil water and solute transport – Fast, simplified numerical solutions, Agron. J., 95, 1352–1361, 2003.</mixed-citation></ref>
      <ref id="bib1.bib41"><label>41</label><mixed-citation>Ross, P. J.: Fast solution of Richards' equation for flexible soil hydraulic property descriptions, L. Water Tech. Report, CSIRO, 39, <ext-link xlink:href="https://doi.org/10.4225/08/5859741868a90" ext-link-type="DOI">10.4225/08/5859741868a90</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bib42"><label>42</label><mixed-citation>Saxton, K. E., Johnson, H. P., and Shaw, R. H.: Modeling Evapotranspiration and Soil Moisture, Trans. Am. Soc. Agric. Eng., 17, 673–677, <ext-link xlink:href="https://doi.org/10.13031/2013.36935" ext-link-type="DOI">10.13031/2013.36935</ext-link>, 1974.</mixed-citation></ref>
      <ref id="bib1.bib43"><label>43</label><mixed-citation> Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G.: The graph neural network model, IEEE Trans. Neural Networ., 20, 61–80, 2008.</mixed-citation></ref>
      <ref id="bib1.bib44"><label>44</label><mixed-citation>Semwal, V. B., Gupta, A., and Lalwani, P.: An optimized hybrid deep learning model using ensemble learning approach for human walking activities recognition, J. Supercomput., 77, 12256–12279, <ext-link xlink:href="https://doi.org/10.1007/s11227-021-03768-7" ext-link-type="DOI">10.1007/s11227-021-03768-7</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib45"><label>45</label><mixed-citation>Severyn, A. and Moschitti, A.: UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification, SemEval 2015 – 9th Int. Work. Semant. Eval. co-located with 2015 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. NAACL-HLT 2015 – Proc., 464–469, <ext-link xlink:href="https://doi.org/10.18653/v1/s15-2079" ext-link-type="DOI">10.18653/v1/s15-2079</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bib46"><label>46</label><mixed-citation>Shaw, P., Uszkoreit, J., and Vaswani, A.: Self-attention with relative position representations, NAACL HLT 2018 – 2018 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. – Proc. Conf., 2, 464–468, <ext-link xlink:href="https://doi.org/10.18653/v1/n18-2074" ext-link-type="DOI">10.18653/v1/n18-2074</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bib47"><label>47</label><mixed-citation> Shi, X., Chen, Z., Wang, H., Yeung, D. Y., Wong, W. K., and Woo, W. C.: Convolutional LSTM network: A machine learning approach for precipitation nowcasting, Adv. Neur. In., 2015, 802–810, 2015.</mixed-citation></ref>
      <ref id="bib1.bib48"><label>48</label><mixed-citation>Siami-Namini, S., Tavakoli, N., and Namin, A. S.: The performance of LSTM and BiLSTM in forecasting time series, in: 2019 IEEE International conference on big data (Big Data), 3285–3292, <ext-link xlink:href="https://doi.org/10.1109/BigData47090.2019.9005997" ext-link-type="DOI">10.1109/BigData47090.2019.9005997</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bib49"><label>49</label><mixed-citation> Simunek, J., Van Genuchten, M. T., and Sejna, M.: The HYDRUS-1D software package for simulating the one-dimensional movement of water, heat, and multiple solutes in variably-saturated media, Univ. California-Riverside Res. Reports, 3, 1–240, 2005.</mixed-citation></ref>
      <ref id="bib1.bib50"><label>50</label><mixed-citation> Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.: Attention is all you need, Adv. Neural Inf. Process. Syst., 30, 5998–6008, 2017.</mixed-citation></ref>
      <ref id="bib1.bib51"><label>51</label><mixed-citation>Vereecken, H., Huisman, J. A., Bogena, H., Vanderborght, J., Vrugt, J. A., and Hopmans, J. W.: On the value of soil moisture measurements in vadose zone hydrology: A review, Water Resour. Res., 44, 1–21, <ext-link xlink:href="https://doi.org/10.1029/2008WR006829" ext-link-type="DOI">10.1029/2008WR006829</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bib52"><label>52</label><mixed-citation>Vereecken, H., Amelung, W., Bauke, S. L., Bogena, H., Brüggemann, N., Montzka, C., Vanderborght, J., Bechtold, M., Blöschl, G., Carminati, A., Javaux, M., Konings, A. G., Kusche, J., Neuweiler, I., Or, D., Steele-Dunne, S., Verhoef, A., Young, M., and Zhang, Y.: Soil hydrology in the Earth system, Nat. Rev. Earth Environ., 3, 573–587, <ext-link xlink:href="https://doi.org/10.1038/s43017-022-00324-6" ext-link-type="DOI">10.1038/s43017-022-00324-6</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bib53"><label>53</label><mixed-citation>Wang: soil_moisture_NLNN, Zenodo [code] and [data set], <ext-link xlink:href="https://doi.org/10.5281/zenodo.10408929" ext-link-type="DOI">10.5281/zenodo.10408929</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bib54"><label>54</label><mixed-citation>Wang, W., Wei, Y., Hao, L., Wei, Z., and Zhao, T.: Soil moisture forecasting in wireless sensor networks via spatiotemporal graph convolutional networks, Vadose Zone J., 1–17, <ext-link xlink:href="https://doi.org/10.1002/vzj2.70000" ext-link-type="DOI">10.1002/vzj2.70000</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bib55"><label>55</label><mixed-citation> Wang, X., Girshick, R., Gupta, A., and He, K.: Non-local neural networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, 7794–7803, 2018.</mixed-citation></ref>
      <ref id="bib1.bib56"><label>56</label><mixed-citation>Wang, Y., Shi, L., Hu, Y., Hu, X., Song, W., and Wang, L.: A comprehensive study of deep learning for soil moisture prediction, Hydrol. Earth Syst. Sci., 28, 917–943, <ext-link xlink:href="https://doi.org/10.5194/hess-28-917-2024" ext-link-type="DOI">10.5194/hess-28-917-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bib57"><label>57</label><mixed-citation>Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J. M., and Luo, P.: SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers, Adv. Neur. In., 34, 12077–12090, 2021.</mixed-citation></ref>
      <ref id="bib1.bib58"><label>58</label><mixed-citation>Yin, M., Yao, Z., Cao, Y., Li, X., Zhang, Z., Lin, S., and Hu, H.: Disentangled non-local neural networks, in: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16, 191–207, <ext-link xlink:href="https://doi.org/10.1007/978-3-030-58555-6_12" ext-link-type="DOI">10.1007/978-3-030-58555-6_12</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bib59"><label>59</label><mixed-citation>Zhang, C., Liu, J., Shang, J., and Cai, H.: Capability of crop water content for revealing variability of winter wheat grain yield and soil moisture under limited irrigation, Sci. Total Environ., 631, 677–687, 2018. </mixed-citation></ref>
      <ref id="bib1.bib60"><label>60</label><mixed-citation>Zhu, L., She, Q., Li, D., Lu, Y., Kang, X., Hu, J., and Wang, C.: Unifying Nonlocal Blocks for Neural Networks, Proc. IEEE Int. Conf. Comput. Vis., 12272–12281, <ext-link xlink:href="https://doi.org/10.1109/ICCV48922.2021.01207" ext-link-type="DOI">10.1109/ICCV48922.2021.01207</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bib61"><label>61</label><mixed-citation>Zhu, Z., Xu, M., Bai, S., Huang, T., and Bai, X.: Asymmetric non-local neural networks for semantic segmentation, Proc. IEEE Int. Conf. Comput. Vis., 593–602, <ext-link xlink:href="https://doi.org/10.1109/ICCV.2019.00068" ext-link-type="DOI">10.1109/ICCV.2019.00068</ext-link>, 2019.</mixed-citation></ref>

  </ref-list></back>
</article>
