Covariance-based selection of parameters for particle filter data assimilation in soil hydrology

Jamal, Alaa; Linker, Raphael

doi:10.5194/hess-2021-295

Preprints

https://doi.org/10.5194/hess-2021-295

28 Jun 2021

Status: this discussion paper is a preprint. It has been under review for the journal Hydrology and Earth System Sciences (HESS). The manuscript was not accepted for further review after discussion.

Abstract. Real-time in-situ measurements are increasingly used to improve the estimates of simulation models via data assimilation techniques such as the particle filter. However, models that describe complex processes such as water flow contain a large number of parameters, while the available data are typically very limited. In such situations, applying the particle filter to a large, fixed set of parameters chosen a priori can lead to unstable behavior, i.e. inconsistent adjustment of parameters that have only limited impact on the measured states. To prevent this, in this study correlation-based variable selection is embedded in the particle filter, so that at each data assimilation step only a subset of the parameters is adjusted. More specifically, whenever measurements become available, the most influential (i.e., the most highly correlated) parameters are determined by correlation analysis, and only these are updated by the particle filter. The proposed method was applied to a water flow model (Hydrus-1D) in which states (soil water contents) and parameters (soil hydraulic parameters) were updated via data assimilation. Two simulation case studies were conducted to demonstrate the performance of the proposed method. Overall, the proposed method yielded parameter and state estimates that were more accurate and more consistent than those obtained when adjusting all the parameters.
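
The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `select_influential`, the fixed cut-off of 0.5, and the toy ensemble are all hypothetical, and this page does not say how the correlation threshold is actually chosen.

```python
import numpy as np

def select_influential(params, predicted, threshold=0.5):
    """Pick the parameters that are correlated with the predicted measurements.

    params:    (n_particles, n_params) parameter values of the ensemble
    predicted: (n_particles, n_obs) model predictions at the measured states
    Returns the indices of the selected parameters and each parameter's
    score (largest absolute Pearson correlation over all observations).
    """
    p = params - params.mean(axis=0)
    y = predicted - predicted.mean(axis=0)
    p_std = p.std(axis=0)
    y_std = y.std(axis=0)
    # A constant column carries no information: force its correlation to zero.
    p_std[p_std == 0] = np.inf
    y_std[y_std == 0] = np.inf
    corr = (p.T @ y) / len(params) / np.outer(p_std, y_std)
    score = np.abs(corr).max(axis=1)
    return np.flatnonzero(score > threshold), score

# Toy ensemble (hypothetical): only parameter 0 drives the measured state.
rng = np.random.default_rng(0)
toy_params = rng.normal(size=(500, 3))
toy_predicted = 2.0 * toy_params[:, [0]] + 0.1 * rng.normal(size=(500, 1))
selected, score = select_influential(toy_params, toy_predicted)
```

In this toy case only the first parameter passes the cut-off, so only that parameter would be handed to the particle filter update at this assimilation step; the other two would keep their current values.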

Received: 31 May 2021 – Discussion started: 28 Jun 2021

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on hess-2021-295', Anonymous Referee #1, 30 Jun 2021

A one-dimensional exercise of data assimilation with homogeneous parameters and no benchmarking is of no interest to the readers.

Please, reconsider submitting a paper with higher complexity in the parameter description (two- or three-dimensional, heterogeneous parameters...) and benchmark it against other more commonly used approaches describing why the new method proposed should be adopted.

Citation: https://doi.org/10.5194/hess-2021-295-RC1
- AC1: 'Reply on RC1', Alaa Jamal, 14 Sep 2021
  
  We regret that you didn’t find much interest in our paper. We agree that a one-dimensional model has severe limitations, but we restricted ourselves to such a model in the current study for the following reason: the primary purpose of the study was to introduce the concept of real-time dynamic selection of parameters for data assimilation, which to the best of our knowledge has not been reported in previous works. We considered that a one-dimensional model, which is somewhat tractable, was sufficient to illustrate the approach.
  With regard to comparing the proposed method with existing ones, as mentioned above, to the best of our knowledge dynamically selecting which parameters should be estimated has not been suggested in the past, and hence such a comparison is not relevant. However, we compared the proposed method to a straightforward application of data assimilation involving all parameters in order to show that dynamically selecting a subset of parameters for estimation leads to improved results.
  
  Citation: https://doi.org/10.5194/hess-2021-295-AC1
RC2:
'Comment on hess-2021-295', Anonymous Referee #2, 28 Jul 2021

Paper # hess-2021-295

Covariance-based selection of parameters for particle filter data assimilation in soil hydrology

by

Alaa Jamal and Raphael Linker

The paper presents a new particle filter data assimilation method in which only the parameters that are highly correlated with the measurements are updated by the particle filter. A combination of the Markov chain Monte Carlo method and a particle filter with a genetic algorithm (Jamal and Linker, 2020) is used for the estimation of the state variables and parameters. The method is then applied to a water flow model where soil water contents and soil hydraulic parameters are updated using data assimilation. The results are compared to the traditional particle filter method where all parameters are updated during data assimilation.

Comments:

1/The investigated test cases are synthetic. The authors should consider simulation of real field or laboratory cases.

2/The constitutive law describing the relationships between pressure head, conductivity and water content should be specified when describing the test studies. Is it the Mualem–van Genuchten model?

3/The parameter alpha should have a dimension [L^-1].

4/The value of the parameter n should be greater than 1.

5/Why is the parameter n fixed and not included in the inversion procedure? Neither of the Mualem–van Genuchten parameters alpha and n can be measured, and both should be included in the estimation.

6/The interval of variation of +-40% is acceptable for the saturated water content, but not for alpha or KS, which can vary by at least one order of magnitude. Their intervals should be significantly enlarged because of the usual lack of prior knowledge of their values.

7/Why is the estimation of thetaS in the second layer better with C-GPFM than with GPFM? Note that the opposite is observed for KS. How can the parameters of this layer be better estimated with C-GPFM even though they are not highly correlated with the measurements? I believe that the parameters in the second zone cannot be accurately estimated with either method.

8/Could you provide detailed explanation why C-GPFM is more consistent than GPFM?

9/All presented results (for all figures) should include uncertainty ranges for the estimated parameters and variables and the results should be discussed based on these uncertainties.

In sum, the conclusion is not well supported by the analysis. I can understand that C-GPFM can be more efficient than GPFM, but I really don’t see why it could be more consistent or more accurate than GPFM. Further, the authors should consider real and not synthetic experiments.

I suggest major revisions.

Citation: https://doi.org/10.5194/hess-2021-295-RC2
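
For reference, comments 2 to 5 concern the Mualem–van Genuchten constitutive relations, which can be sketched as below. The sketch assumes the common form with m = 1 - 1/n and a pore-connectivity exponent of 0.5; the function names and the example parameter values are illustrative placeholders, not values from the manuscript.

```python
import numpy as np

def vg_water_content(h, theta_r, theta_s, alpha, n):
    """van Genuchten retention curve theta(h).

    h is the pressure head (negative in unsaturated soil), alpha has
    dimension 1/length in the same length unit as h, and n must exceed 1.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    m = 1.0 - 1.0 / n
    h = np.asarray(h, dtype=float)
    # Effective saturation: 1 at or above saturation, below 1 otherwise.
    se = np.where(h < 0, (1.0 + (alpha * np.abs(h)) ** n) ** (-m), 1.0)
    return theta_r + (theta_s - theta_r) * se

def mualem_conductivity(se, k_s, n, l=0.5):
    """Mualem unsaturated conductivity K(Se) for effective saturation Se."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    m = 1.0 - 1.0 / n
    se = np.asarray(se, dtype=float)
    return k_s * se ** l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2

# Placeholder values: h in cm, alpha in 1/cm, Ks in cm/day.
theta = vg_water_content([0.0, -100.0], theta_r=0.05, theta_s=0.40,
                         alpha=0.02, n=1.5)
k_sat = mualem_conductivity(1.0, k_s=10.0, n=1.5)
```

The product alpha * |h| must be dimensionless, which is why alpha carries the dimension [L^-1], and m = 1 - 1/n is positive only when n is greater than 1.
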
- AC2: 'Reply on RC2', Alaa Jamal, 14 Sep 2021
  
  1/ While testing the proposed procedure with a real field test should certainly be the next step, we believe such a test should be conducted only after the soundness of the approach has been established, which was the purpose of the present study. For this first step, working only with synthetic data has the main advantage that ground-truth values are available for both parameters and state variables (at all depths), so that convergence of the ensemble can be fully analyzed. It would not be possible to quantify the performance of the procedure objectively using real measurements, since (1) measurements are not perfect, (2) measurements would be available only at specific depths and (3) the “true” characteristics of the soil would of course not be known.
  2/ The constitutive law was indeed the Mualem–van Genuchten model. This information will be added.
  3/ This regrettable mistake will be corrected.
  4/ The parameters n and alpha were regrettably swapped in Table 1. In addition, the parameters in Figure 3 should be n instead of alpha. This mistake will be corrected throughout the manuscript.
  5/ None of the parameters was fixed a priori. All parameters were allowed to change, if the correlation analysis showed that such a change was “justified”. In the specific case studies reported in the manuscript, the parameter n was never selected for adjustment.
  6/ In specific situations the uncertainty in Ksat could indeed be very large. However, particle filters such as the one used in the study tend to perform poorly when the uncertainty is very large, since it causes particle weights to become extremely small. This is most certainly a limitation of the approach (related to the particle filter and not to parameter selection) that should have been emphasized in the manuscript, and we will correct this omission.
  7/ The measurements were located close to the interfaces between the top and middle layers and between the middle and bottom layers, which explains why these measurements conveyed some information about the properties of the middle layer. The fact that some of the parameters of the middle layer were adjusted is a direct result of the fact that some correlation between the parameters of this layer and the measured states existed during specific periods. We will add a figure showing this correlation and expand the explanation.
  8/ C-GPFM is more consistent, or provides more robust estimations, than GPFM because it prevents changes in parameters that are currently not influential but will become influential under other circumstances. To illustrate this, consider a period during which the water content remains close to field capacity. Clearly, during this period the value of the residual water content is not important, but when applying GPFM (or any other particle filter) there is nothing that prevents the value of this (currently non-influential) parameter from being changed to values that are in fact worse than the current ones. If the adjusted model is then used on a dry period (where of course the value of the residual water content influences the results), the performance of the adjusted model will be worse than the performance of the initial model. The proposed C-GPFM approach prevents this by applying a “don’t touch if not necessary” approach.
  9/ We will add the uncertainties of the current state errors and current states to Figures 4 and 5, respectively. As mentioned in the paper (L162), parameter estimation is directly related to state estimation, so that state uncertainties provide information about parameter uncertainties as well.
  
  Citation: https://doi.org/10.5194/hess-2021-295-AC2
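
The weight collapse mentioned in reply 6 can be illustrated with a toy example. This sketch assumes a single scalar parameter that is observed directly with Gaussian noise, a uniform prior, and the effective sample size 1 / sum(w^2) as the diagnostic; the function names, intervals and noise level are hypothetical and do not come from the manuscript.

```python
import numpy as np

def effective_sample_size(log_w):
    """Effective sample size 1 / sum(w^2) of normalized importance weights."""
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def ess_for_prior(lo, hi, n_particles=1000, truth=1.0, obs_std=0.05, seed=0):
    """Draw particles from a uniform prior on [lo, hi], weight them with a
    Gaussian likelihood centred on the true value, and return the ESS."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(lo, hi, n_particles)
    log_w = -0.5 * ((particles - truth) / obs_std) ** 2
    return effective_sample_size(log_w)

# Prior of +-40 % around the true value versus roughly one order of magnitude.
ess_narrow = ess_for_prior(0.6, 1.4)
ess_wide = ess_for_prior(0.1, 10.0)
```

With the wider prior far fewer particles fall near the observation, so `ess_wide` comes out well below `ess_narrow`: most weights are negligible and the update rests on a handful of particles.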

Viewed

Total article views: 2,109 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,657	391	61	2,109	74	86

Views and downloads (calculated since 28 Jun 2021)

Month	HTML	PDF	XML	Total
Jun 2021	86	15	2	103
Jul 2021	93	21	5	119
Aug 2021	23	7	1	31
Sep 2021	43	17	4	64
Oct 2021	22	42	0	64
Nov 2021	18	21	1	40
Dec 2021	17	9	1	27
Jan 2022	17	4	0	21
Feb 2022	18	6	0	24
Mar 2022	9	3	1	13
Apr 2022	11	3	0	14
May 2022	7	3	1	11
Jun 2022	4	3	1	8
Jul 2022	22	4	0	26
Aug 2022	10	7	0	17
Sep 2022	5	7	0	12
Oct 2022	9	5	0	14
Nov 2022	23	4	3	30
Dec 2022	9	6	0	15
Jan 2023	34	10	0	44
Feb 2023	11	1	0	12
Mar 2023	5	2	1	8
Apr 2023	3	6	2	11
May 2023	3	4	0	7
Jun 2023	5	4	1	10
Jul 2023	33	5	1	39
Aug 2023	25	4	2	31
Sep 2023	13	8	1	22
Oct 2023	9	2	1	12
Nov 2023	8	0	0	8
Dec 2023	8	0	0	8
Jan 2024	9	2	1	12
Feb 2024	10	9	0	19
Mar 2024	12	14	4	30
Apr 2024	34	2	7	43
May 2024	29	2	2	33
Jun 2024	34	4	3	41
Jul 2024	31	2	1	34
Aug 2024	29	2	1	32
Sep 2024	30	8	0	38
Oct 2024	32	10	1	43
Nov 2024	26	2	1	29
Dec 2024	25	1	0	26
Jan 2025	33	6	1	40
Feb 2025	28	1	0	29
Mar 2025	31	11	2	44
Apr 2025	24	6	3	33
May 2025	25	3	0	28
Jun 2025	46	10	0	56
Jul 2025	40	12	2	54
Aug 2025	92	4	1	97
Sep 2025	319	6	1	326
Oct 2025	49	13	0	62
Nov 2025	32	6	0	38
Dec 2025	34	22	1	57

Viewed (geographical distribution)

Total article views: 2,005 (including HTML, PDF, and XML) Thereof 2,005 with geography defined and 0 with unknown origin.

Latest update: 28 Dec 2025

Short summary

Data assimilation uses field measurements to improve the state estimates and parameters of simulation models. However, in dynamic problems, the influence of each parameter on the states that correspond to the field measurements changes over time. Therefore, when the influence of a parameter is low, its estimate might be inaccurate. In this study, dynamic selection of high-influence parameters is presented in order to improve the data assimilation estimates.

