This is a re-review of the paper. I will focus on the theory, not the results. The theory and implementation of the EnKF for joint parameter and state estimation are questionable - even though they are based on previous work by others, the methodology is incorrect and statistically invalid. Naturally, the results of this paper (and of other papers that use this methodology) are affected by this flawed implementation.
1) Equation 2: You use Q_t to denote the covariance matrix of the model error. This may be confusing because "Q" is commonly used to denote streamflow, whereas here you use it for the states. I would suggest using another symbol.
2) With equation (2), an immediate question that pops up is: how did you decide which covariance matrix Q to use? The problem is that you perturb different states, each of which has a different magnitude and time behavior. How do you know that your state perturbations are realistic when measured in the output (streamflow) space? This constitutes a serious problem. That is why I strongly recommend performing the perturbation in the streamflow space and then computing the analysis streamflow, which you then use to compute the corresponding analysis states. It is much easier to define the model error in the streamflow space than in the state space! For instance, is your model error heteroscedastic? Does it increase with simulated flow level? When done directly in the state space, it is very difficult to tune and derive an appropriate covariance matrix that achieves this. You can circumvent this problem by doing the Kalman analysis in the streamflow space and then deriving the corresponding analysis states, as sketched below.
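To make this suggestion concrete, here is a minimal Python sketch (variable names and the linear error-flow relation are illustrative assumptions, not the authors' implementation) of a Kalman analysis performed in streamflow space, with the streamflow increment mapped back to the states through the ensemble cross-covariance:

```python
import numpy as np

def enkf_streamflow_analysis(X_f, q_f, q_obs, alpha=0.1, sigma_min=0.01, rng=None):
    """Sketch: EnKF analysis with the error defined in streamflow space.

    X_f   : (n_states, n_ens) forecast ensemble of model states
    q_f   : (n_ens,)          forecast (simulated) streamflow per member
    q_obs : float             observed streamflow at this time step
    alpha : assumed heteroscedastic error factor (error grows with flow level)
    """
    rng = np.random.default_rng() if rng is None else rng
    n_ens = q_f.size

    # Heteroscedastic error specified directly in streamflow space
    sigma = np.maximum(sigma_min, alpha * q_f.mean())

    # Perturbed observations, one per ensemble member
    q_obs_pert = q_obs + rng.normal(0.0, sigma, size=n_ens)

    # Ensemble variance of streamflow and state-streamflow cross-covariance
    q_anom = q_f - q_f.mean()
    X_anom = X_f - X_f.mean(axis=1, keepdims=True)
    var_q  = (q_anom @ q_anom) / (n_ens - 1)
    cov_Xq = (X_anom @ q_anom) / (n_ens - 1)

    # Kalman update of the streamflow itself
    k_q = var_q / (var_q + sigma**2)
    q_a = q_f + k_q * (q_obs_pert - q_f)

    # Map the streamflow increment back to analysis states
    K_x = cov_Xq / (var_q + sigma**2)
    X_a = X_f + np.outer(K_x, q_obs_pert - q_f)
    return X_a, q_a
```

The point is that the error model (here a simple flow-proportional standard deviation) is specified where it can actually be judged, in the discharge space, rather than tuned blindly for each state variable.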
3) The kernel smoothing technique and state augmentation method for parameter estimation: does this converge to the "correct" posterior target distribution? Vrugt et al. (2013) demonstrate that it does not and discuss why this is the case. I consider this a serious flaw in the current work (and in the previous work by others on which it is based). Also, the settings for the smoothing and shrinkage factors, etc. strongly affect the results, and different settings are required for different problems. This is not desirable. A universal method is available that does not rely on subjective algorithmic parameter values and incorrect theory. For clarity, the step I am referring to is sketched below.
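A schematic Python sketch of the Liu-West style kernel smoothing step I am referring to (names and the value of the shrinkage factor are illustrative, not taken from the paper):

```python
import numpy as np

def kernel_smooth_parameters(theta, a=0.98, rng=None):
    """Sketch of kernel smoothing of an augmented parameter ensemble.

    theta : (n_par, n_ens) parameter ensemble
    a     : shrinkage factor; h**2 = 1 - a**2 sets the kernel (smoothing) variance
    """
    rng = np.random.default_rng() if rng is None else rng
    h2 = 1.0 - a**2
    theta_mean = theta.mean(axis=1, keepdims=True)
    theta_var  = theta.var(axis=1, keepdims=True)

    # Shrink each member toward the ensemble mean, then add kernel noise so
    # that the total ensemble variance is approximately preserved
    shrunk = a * theta + (1.0 - a) * theta_mean
    noise  = rng.normal(0.0, np.sqrt(h2 * theta_var), size=theta.shape)
    return shrunk + noise
```

The behavior of the filter depends directly on the choice of `a` (and related tuning factors), which is exactly the subjectivity I object to.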
4) Equation (7): the model states are updated using the standard Kalman analysis equation. It would be much more productive, however, to use an alternative scheme that takes into consideration how much water originates from which tank. For instance, during high flows it does not make sense to update the groundwater (baseflow) reservoir, and conversely, during low flows it would not be productive to update the quick-flow reservoirs. Instead, if one first computes the contribution of each constituent component of the discharge, one can use this information to update the respective reservoirs where these fluxes originate in proportion to their percentage contribution. For instance, during low flow the large majority of the streamflow will be baseflow, say 90%, with the other 10% coming from the other tanks. Then 90% of the difference between the analysis discharge and the forecasted discharge should be attributed to the slow-flow tank, rather than being distributed equally among the tanks (see the sketch below). This will significantly enhance the implementation of the filter and the quality of the results, as water is entered into the right tanks.
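A minimal Python sketch of the suggested redistribution (names are hypothetical; a unit/time-step conversion between the discharge increment and storage may be needed in practice):

```python
import numpy as np

def distribute_increment(storages, contributions, dq):
    """Attribute the streamflow increment dq = q_analysis - q_forecast to each
    tank in proportion to that tank's contribution to the forecast discharge.

    storages      : (n_tanks,) current storage of each reservoir/tank
    contributions : (n_tanks,) flux each tank contributed to the forecast discharge
    dq            : scalar streamflow increment from the Kalman analysis
    """
    total = contributions.sum()
    if total > 0:
        weights = contributions / total
    else:
        weights = np.full(contributions.shape, 1.0 / contributions.size)
    # e.g. during low flow, if baseflow supplies 90% of the discharge, 90% of dq
    # goes to the slow (groundwater) tank instead of an equal split over all tanks
    return storages + weights * dq
```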
5) Equation (8) --> sentence just prior: "....these variables..." - which variables are referred to? It seems unrealistic to describe rainfall errors with a Gaussian distribution. Dry days are not corrected and will remain without precipitation even after perturbation (illustrated below).
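To illustrate the dry-day point, a small Python sketch with assumed error magnitudes (not the paper's values): a multiplicative perturbation leaves zero-rain days exactly zero, so a missed rainfall event can never be recovered by perturbation alone, while an additive Gaussian perturbation produces physically questionable (possibly negative) rainfall on dry days.

```python
import numpy as np

rng = np.random.default_rng(0)
rain = np.array([0.0, 5.0, 0.0, 12.0])   # mm/day; two dry days

# Additive Gaussian perturbation: dry days become non-zero (and can go negative)
additive = rain + rng.normal(0.0, 1.0, size=rain.size)

# Multiplicative (e.g. lognormal) perturbation: dry days stay exactly dry
multiplicative = rain * rng.lognormal(mean=0.0, sigma=0.3, size=rain.size)
```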
Soil moisture errors are homoscedastic, not heteroscedastic. An error of about 0.01 to 0.02 m3/m3 seems realistic, independent of the measured/simulated value.
6) parameter estimations --> parameter estimates.
7) The paper needs editing. Grammar and syntax still need to be further improved.
8) Conclusion section: Accumulative?
9) "big picture" --> not very scientific.
10) The parameter estimates might stabilize after relatively few assimilation steps, but do they converge to their "appropriate" values? The results you obtain are strongly controlled by the settings of your filter. If you change some of the settings for the smoother and the parameter estimation part, the filter will not converge as rapidly. Moreover, is the posterior parameter uncertainty reasonable? My experience suggests that it is not. This is an engineering solution, but it violates statistical principles. A simple test can show this. Create a synthetic streamflow data set of 5 years in which you use completely different parameter values for the first and second halves of the record. Then assimilate this data set with your filter. You will see that your method does not converge appropriately and certainly cannot diagnose the sudden change in parameter values. Why? The filter settings are chosen such that they promote very quick convergence of the parameter values. Once the filter has converged, the parameter uncertainty is too small (and significantly underestimates the "actual" uncertainty - the parameters will also be wrong!); because of this small uncertainty, the filter cannot travel to the new parameter values that created the second part of the time series. I consider this a serious flaw in the methodology: the results are enforced by the user and are statistically incorrect. Yes, it might improve the discharge estimates, but should one not use a statistically correct methodology? A sketch of such a synthetic test is given below.
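Such a test is straightforward to set up; a minimal Python sketch, using a toy linear-reservoir model with illustrative parameter values as a stand-in for the authors' actual rainfall-runoff model:

```python
import numpy as np

def linear_reservoir(rain, k, s0=10.0):
    """Toy single linear reservoir used only to generate synthetic flow."""
    s, q = s0, []
    for p in rain:
        s += p
        out = k * s
        s -= out
        q.append(out)
    return np.array(q)

rng = np.random.default_rng(42)
n_days = 5 * 365
rain = rng.gamma(shape=0.5, scale=8.0, size=n_days)      # synthetic forcing

# First half generated with k = 0.05, second half with k = 0.30
q_true = np.concatenate([
    linear_reservoir(rain[: n_days // 2], k=0.05),
    linear_reservoir(rain[n_days // 2 :], k=0.30),
])
q_obs = q_true + rng.normal(0.0, 0.1 * q_true)           # heteroscedastic obs noise

# Assimilate q_obs with the proposed filter and check whether the parameter
# ensemble (i) tracks the abrupt change at day n_days // 2 and (ii) retains a
# posterior spread that still covers the true parameter values.
```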