Introduction
Many studies have demonstrated that assimilation of remotely sensed
near-surface soil moisture observations can improve modeled soil moisture,
with improvement typically measured by temporal agreement with in situ
observations.
Typically, the remotely sensed soil moisture observations are assimilated
using a bias-blind assimilation of observations that have been rescaled to
have the same mean and variance as the model forecast soil moisture.
This approach is designed to avoid forcing
the model into a regime that is incompatible with its assumed (likely
erroneous) structure and parameters, while also avoiding the inadvertent
introduction of any observation biases into the model.
The assimilation can then correct for random errors in the model forecasts,
where random errors are defined as errors that persist for less than the timescale
used to – subjectively – define the bias in the mean. Traditionally,
observation rescaling is based on the maximum available coincident observed
and forecast data record,
effectively defining the bias over the same period. The rescaled observations
will then retain the signal of all observation–forecast differences occurring
at timescales shorter than the data record, which for a multi-year data
record would include differences spanning the subseasonal, seasonal, and
inter-annual timescales. Assimilating these rescaled observations then has
the potential to improve the model soil moisture at each of the
aforementioned timescales, and yet bias-blind soil moisture assimilation is
often implicitly assumed to target only the random errors occurring at the
relatively short subseasonal timescales.
At subseasonal, seasonal, and inter-annual timescales, different physical
processes control the true soil moisture and errors in soil moisture
estimates. Most notably, in many locations seasonal scale variability is
dominated by the mean seasonal cycle (the annually repeating variability),
and any errors in the mean seasonal cycle will be systematic, with causes
such as incorrect separation of the soil and vegetation moisture signals retrieved from
remotely sensed brightness temperatures, or errors in the land surface model
vegetation dynamics. In contrast, variability at subseasonal and inter-annual
timescales is rarely dominated by repeating cycles, and is more typically
associated with transient atmospheric forcing events. Specifically, rapid
timescale (daily) soil moisture dynamics are driven by factors such as
individual precipitation events and changes in cloud cover, while longer timescale (seasonal-plus) dynamics are driven by changes in the atmospheric
supply and demand for moisture. Soil moisture errors at
subseasonal scales could then be caused by factors such as atmospheric noise
in remotely sensed data, or errors in the daily meteorology of the model
atmospheric forcing, while inter-annual-scale errors could be caused by
factors such as drift in the remote sensor calibration, or incorrect
representation of atmospheric drought conditions in the atmospheric forcing.
The differing nature of soil moisture errors across timescales has
unexplored consequences for data assimilation. Most notably, the systematic
nature of errors in the mean seasonal cycle is problematic. Theoretically,
bias-blind
data assimilation is not designed, nor optimized, to correct for
systematic errors. More practically, if the systematic differences are not
due to model errors (i.e., are caused by observation errors, including
representativity errors), then assimilating such information can seriously
degrade model performance. Additionally, the timescale dependence of soil
moisture errors may be problematic for observation rescaling with bulk
parameters, which is intended to correct systematic differences across all
timescales. Even within relatively short timescales (up to about
1 month), the multiplicative (differences in
standard deviation) and additive (differences in mean) components of the
systematic differences between modeled and remotely sensed soil moisture
have been shown to differ across timescales. That work highlights that this
lack of stationarity cannot be adequately addressed by using bulk statistics
to estimate observation rescaling parameters.
Consequently, in this study we have decomposed modeled, remotely
sensed, and in situ soil moisture into separate time series
representing soil moisture dynamics at subseasonal, mean seasonal, and
inter-annual timescales. We have then used this decomposition to examine
the differences between remotely sensed and modeled soil moisture at
each timescale, and how assimilating bulk-rescaled soil moisture observations
impacts the model soil moisture at each timescale.
The decomposition is achieved by fitting each soil moisture (SM)
time series with harmonic functions specified to target the mean
seasonal cycle (SMseas), and the subseasonal
(SMshort) and inter-annual
(SMlong) dynamics. By fitting the appropriate harmonic functions to each time series, we
can separate the total mean square error of each soil moisture time
series into contributions from each timescale. This is a much more
targeted evaluation of soil moisture dynamics at physically relevant
timescales than is usually undertaken. Standard evaluation methods focus on
bias-blind metrics, such as the correlation (R) or unbiased root mean square
error (ubRMSE, which is calculated after removing the long-term mean
difference). Both R and ubRMSE are sensitive to soil moisture
time series variability at all timescales. While anomaly correlations
(Ranom) are also used to exclude the seasonal cycle, this
is not done consistently, and does not allow for the total error to be
broken into contributing timescales. Depending on how the anomalies
are calculated, Ranom measures subseasonal scale errors
(anomalies defined relative to a simple moving average), or a combination
of inter-annual and subseasonal scale errors (anomalies defined relative to
the mean seasonal cycle over multiple years).
In the second part of this study, we also explore the impact on the
assimilation of using short time periods for observation bias
correction. When cumulative distribution function (CDF) matching was first
introduced to rescale remotely sensed soil moisture prior to assimilation,
it was shown that for Scanning Multi-channel Microwave Radiometer (SMMR)
soil moisture observations (1979–1987), reasonable rescaling
parameters could be estimated using a single year of data. We repeat
that investigation using the more modern Advanced Microwave Scanning
Radiometer – Earth Observing System (AMSR-E) data set, and also
extend it by providing a more statistically robust
analysis of the impact of using single-year scaling parameters in the
assimilation. This part of the study is motivated by the recent launch
of NASA's Soil Moisture Active Passive (SMAP) mission, and it will address
the consequences of using short records to rescale the observations during
the early phases of the SMAP mission.
Location, time period, and number of in situ network sites at each watershed.
Name (abbreviation) | Approx. area and center location | No. sites | Time period
Reynolds Creek, surface (RC-sfc) | 150 km2 (116.7° W, 43.2° N) | 10 | Oct 2002–Sep 2010, excluding 1 Dec–10 Mar
Walnut Gulch, surface (WG-sfc) | 600 km2 (110.0° W, 31.7° N) | 14 | Oct 2002–Sep 2011
Little Washita, surface (LW-sfc) | 350 km2 (98.0° W, 34.8° N) | 15 | Oct 2002–Sep 2011
Little River, surface (LR-sfc) | 250 km2 (83.5° W, 31.5° N) | 8 | Oct 2002–Sep 2011
Little River, root-zone (LR-rz) | as above | 4 | Oct 2004–Sep 2011
Data and methods
A 9-year record of surface soil moisture retrievals from AMSR-E X-band data
has been assimilated into the
Catchment land surface model at four locations in
the US. The impact of the assimilation on the model skill is measured
by comparison to watershed scale in situ soil moisture observations
collected by the Agricultural Research Service (ARS) of the United
States Department of Agriculture. Each of these data sets is first described
below (Sect. ), followed by a discussion of the
assimilation approach (Sect. ) and the method used to
decompose soil moisture time series into subseasonal, seasonal, and
inter-annual timescales (Sect. ).
The soil moisture data sets
For over a decade the ARS has been collecting near-surface (5 cm) soil
moisture observations, at least hourly, using dense networks of in situ
sensors at four watershed scale sites in the US: Reynolds Creek (RC), Walnut
Gulch (WG), Little Washita (LW), and Little River (LR). See
Table for the locations and network details for each
watershed. At each watershed, observations collected from between 8 and
15 sites are averaged using the Thiessen polygon method to produce a coarse-scale near-surface soil moisture observation with spatial support similar to
the AMSR-E observations. Intensive field experiments have shown these coarse-scale estimates to be very accurate, with errors on the order of
0.01 m3 m-3.
The soil moisture has also been observed below the near-surface layer at
Little Washita since 2007, and at Little River since 2004, with observations
potentially made at every 5 cm from 5 to 60 cm. Here, the
root-zone soil moisture at Little River is estimated using the average of the
5–60 cm observations (due to the relatively short time period the
root-zone Little Washita data were not used). At Little River, the root-zone
soil moisture estimate is calculated from fewer sensors than the near-surface
estimate, due to the greater number of sub-surface sensor drop outs. However,
the lesser number of root-zone sensors is not expected to be overly
problematic, since soil moisture is less variable (temporally and spatially)
in the root zone than in the near-surface layer.
Given that we will focus on evaluating variance,
we have not supplemented the ARS in situ observations with observations from
single sensor networks, such as SCAN. Unlike the
locally dense in situ measurements from the ARS networks, the variance
(and mean) of observations from single sensors cannot be assumed
representative of the coarse-scale soil moisture from Catchment and AMSR-E.
Level 3 Land Parameter Retrieval Model (LPRM) X-band AMSR-E near-surface
soil moisture retrievals at 0.25° resolution were obtained for
the grid cells encompassing the center of each watershed site in Table . At
X-band the observations relate to a surface layer depth slightly less
than 1 cm. Only the descending (01:30 LT) overpass has
been used to avoid possible differences in the climatological
statistics of day- and nighttime observations. The sites were
explicitly selected by ARS to avoid possible radio frequency
interference and proximity to permanent open water, and the AMSR-E
soil moisture retrievals were screened to remove observations with
X-band vegetation optical depth above 0.8.
NASA's Catchment land surface model was run over the 9 km EASE
grid cells encompassing the center of each watershed site, using atmospheric forcing
fields from the Modern-Era Retrospective Analysis for Research and
Applications (MERRA) and recently improved soil parameters.
The model initial conditions were first spun-up
from January 1993 to January 2002 using a single member without
perturbations. The ensemble (including perturbations) was then spun-up
from January to October 2002 (see Sect. for details of
the ensemble). For both the model open loop and data assimilation
model output, the ensemble average near-surface (0–5 cm) and
root-zone (0–100 cm) soil moisture is then reported.
Daily ARS and Catchment time series were generated by sampling each at
the approximate time of the descending AMSR-E overpass
(01:30 LT). Initially each time series spanned the AMSR-E
data record, rounded down to nine full years from October 2002 to
September 2011; however the Little River root-zone soil moisture
observations are not available before January 2004, and were truncated
to the 7 years from October 2004 to September 2011. Also, there
were just 21 ARS observations at Reynolds Creek in the last year of
this period, and therefore the Reynolds Creek time series were truncated to
the 8 years from October 2002 to September 2010. The ARS and
AMSR-E sensors can only measure liquid soil moisture, and all data
have been screened out when the Catchment model indicates frozen
near-surface conditions. Since the Reynolds Creek site is frozen for
an extended period each winter, liquid soil moisture is not well
defined there during winter, and the Reynolds Creek time series have
then been truncated to remove winter, defined as from 1 December to 10 March
(the period during which the Catchment surface is continuously
frozen for at least 3 of the 8 years of the Reynolds Creek record).
The assimilation experiments
The assimilation experiments were performed using a one-dimensional
bias-blind ensemble Kalman filter, with the same set-up and ensemble
generation as in . Prior to assimilation, the AMSR-E
observations were rescaled using CDF matching. For each
experiment a single set of bulk CDF-matching parameters was used (i.e., the
rescaling is applied only to the original AMSR-E time series, and not to the
decomposed time series). In the baseline assimilation experiments, the
CDF-matching parameters were calculated using the maximum available 9-year
AMSR-E data record, following standard practice.
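The CDF-matching step described above can be sketched as an empirical quantile mapping. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: the function name `cdf_match`, the choice of 101 quantiles, the piecewise-linear transfer function, and the synthetic data are all hypothetical.

```python
import numpy as np

def cdf_match(obs, model, n_quantiles=101):
    """Map obs onto the model climatology by matching empirical quantiles.

    Assumption: NaN marks missing data in either series.
    """
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    q = np.linspace(0.0, 100.0, n_quantiles)
    obs_q = np.nanpercentile(obs, q)      # observation quantiles
    mod_q = np.nanpercentile(model, q)    # model quantiles at the same probabilities
    out = np.full(obs.shape, np.nan)
    ok = ~np.isnan(obs)
    # Piecewise-linear transfer function between the paired quantiles.
    out[ok] = np.interp(obs[ok], obs_q, mod_q)
    return out

# Synthetic example (values are illustrative only).
rng = np.random.default_rng(42)
obs = 0.30 + 0.10 * rng.standard_normal(3000)     # wetter, more variable "retrievals"
model = 0.19 + 0.05 * rng.standard_normal(3000)   # drier, less variable "forecasts"
obs[::7] = np.nan                                 # some missing retrievals
rescaled = cdf_match(obs, model)
print(np.nanmean(rescaled), np.nanstd(rescaled))
```

In this sketch the quantiles are estimated from whatever record is passed in, so the same function could be given a full record or a shorter subset when estimating the transfer function.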
This 9-year AMSR-E record is the longest remotely sensed soil moisture
record available from a single satellite sensor, and soil moisture
assimilation experiments using newer satellites, or a modeling system with
limited archives, are limited to shorter time periods for observation
rescaling. To establish the potential consequences of using a shorter data
record, a second set of experiments was conducted, in which the rescaling
parameters were estimated using the 12 month periods starting in consecutive
Octobers (but assimilating the full 8- or 9-year near-surface soil
moisture data record listed in Table ).
The use of 1-year periods for rescaling soil moisture from SMMR has also
been tested previously. In contrast to that approach, we do not use ergodic
substitution (of spatial sampling for temporal sampling) when estimating the
rescaling parameters with a single year of observations, since with more
modern remote sensors, this is no longer necessary to obtain a sufficient
sample size. Additionally, for the assimilation of Soil Moisture and Ocean
Salinity (SMOS) retrievals, ergodic substitution has been found to degrade
the estimated CDFs by introducing conflicting information from neighboring
grid cells, possibly due to the higher spatial resolution compared to SMMR.
The benefit of each assimilation experiment is compared to that of the
Catchment model open-loop ensemble mean, in which the same ensemble
generation parameters were used, and no observations were assimilated. The
improvement from the open loop is measured using the unbiased mean square
error (ubMSE) of the resulting model soil moisture, with respect to the ARS
in situ observations. For data set X compared to in situ data I, both of
length n, the ubMSE is calculated as
\[ \mathrm{ubMSE} = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - I_i - \langle X - I\rangle\right)^2, \]
where 〈.〉 indicates the temporal mean. The ubMSE is also
referred to as the variance of the errors in X; however we use the ubMSE
terminology for consistency with the commonly used ubRMSE in the soil moisture literature.
We do not apply the square root here to take advantage of the additive
property of the variance of independent time series. However, to aid
interpretation the ubMSE equivalent to the common ubRMSE target accuracy of
0.04 m3 m-3 is indicated in the relevant plots.
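As a minimal sketch of the ubMSE calculation defined above: the function name `ubmse`, the use of NaN to mark missing data, and the sample values are assumptions made for illustration and are not taken from the study.

```python
import numpy as np

def ubmse(x, ref):
    """Unbiased mean square error of x against ref.

    Times at which either series is NaN are ignored (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    ref = np.asarray(ref, dtype=float)
    valid = ~(np.isnan(x) | np.isnan(ref))
    diff = x[valid] - ref[valid]
    # Remove the mean difference (the bias) before squaring.
    return np.mean((diff - diff.mean()) ** 2)

# Illustrative values only (m3 m-3).
model = np.array([0.20, 0.25, 0.22, np.nan, 0.30])
insitu = np.array([0.10, 0.16, 0.11, 0.12, 0.19])
print(ubmse(model, insitu))

# The common ubRMSE target of 0.04 m3 m-3 expressed as a ubMSE.
target_ubmse = 0.04 ** 2
print(target_ubmse)
```

Because the mean difference is removed first, adding a constant offset to either series leaves the result unchanged, which is the sense in which the metric is bias-blind.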
Decomposition of soil moisture time series
We wish to decompose each SM time series into separate
components representing soil moisture dynamics at the
SMshort, SMseas,
and SMlong timescales. Variability
in a time series at specific timescales can be isolated by fitting
a function made up of the sum of sinusoidal functions. Formally, for
some observed time series, y, the ak and bk coefficients in the
decomposed form y^ are fit for some selection of integers ki:
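\[ \hat{y}(t) = a_0 + \sum_{k=k_1,k_2,\ldots}\left[a_k \sin\left(\frac{2\pi k t}{n}\right) + b_k \cos\left(\frac{2\pi k t}{n}\right)\right], \]
where t is the time step and n is the length of the time series.
2πk/n is the (angular) frequency for a sinusoid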
completing k cycles over n time steps (i.e., that has frequency
k/n per time unit), and y^ for k = ki is referred
to as the kith harmonic. a0 is the mean of y. If the
time series is sampled at regular intervals and has no missing data,
the sinusoids for individual harmonics are orthogonal and independent
of each other. This is the basis for the discrete Fourier transform,
which exactly fits Eq. () to y using the first n/2
harmonics (i.e., ki = 1, 2, 3, … n/2). In this study, we
use multiple linear least-squares regression to fit
Eq. () to the soil moisture time series for a sum of
harmonic frequencies selected to isolate the variability at each
target timescale, as described below.
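The least-squares harmonic fit can be sketched as follows. This is an illustrative sketch rather than the study's code: the function name `fit_harmonics` and the synthetic series are hypothetical, and the sketch assumes that the mean of the available data is removed before the sine and cosine terms are fitted, rather than fitting a0 jointly.

```python
import numpy as np

def fit_harmonics(t, y, freqs):
    """Least-squares fit of sum_k [a_k sin(2*pi*f_k*t) + b_k cos(2*pi*f_k*t)].

    t     : time of each sample (e.g. days since the start of the record)
    y     : series values, with NaN marking missing data
    freqs : frequencies f_k = k/n, in cycles per unit of t
    Returns (a0, fitted): a0 is the mean of the available data, and fitted
    is the zero-offset harmonic sum evaluated at every t.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = ~np.isnan(y)
    a0 = y[ok].mean()
    # Design matrix: one sine and one cosine column per frequency.
    cols = []
    for f in freqs:
        cols.append(np.sin(2 * np.pi * f * t))
        cols.append(np.cos(2 * np.pi * f * t))
    A = np.column_stack(cols)
    # Fit only where data exist; evaluate the fit everywhere.
    coef, *_ = np.linalg.lstsq(A[ok], y[ok] - a0, rcond=None)
    return a0, A @ coef

# Synthetic daily series: annual cycle plus a weaker semi-annual term, with gaps.
rng = np.random.default_rng(0)
t = np.arange(4 * 365)
truth = 0.05 * np.sin(2 * np.pi * t / 365) + 0.02 * np.cos(4 * np.pi * t / 365)
y = 0.20 + truth + 0.01 * rng.standard_normal(t.size)
y[rng.random(t.size) < 0.3] = np.nan   # drop roughly 30 % of the samples

a0, seas = fit_harmonics(t, y, freqs=[1 / 365, 2 / 365])
print(a0, np.max(np.abs(seas - truth)))
```

Using a regression rather than a discrete Fourier transform means the fit does not require a complete, regularly sampled series, at the cost of the harmonics no longer being exactly orthogonal.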
We define SMseas by fitting Eq. ()
to the soil moisture time series for some combination of the annual
harmonic frequencies (i.e., for k/n an integer multiple of
1 yr-1). The frequencies higher than 1 yr-1
moderate the shape of y^ to account for differences in the
shape of the seasonal cycle from the single sinusoid described by the
first harmonic. Typically, only a few annual harmonics are necessary
to fit the seasonal cycle of geophysical variables.
Here we define
SMseas to be the sum of the first two harmonics,
since fitting additional harmonics did not improve the ability to
predict withheld data, following the method of
. Note that since the same annual harmonics are
repeated each year, we are restricting SMseas to
represent only the mean seasonal cycle, and any inter-annual
variability at seasonal timescales, such as anomalous vegetation
growth in a given year, will be assigned to the subseasonal or
inter-annual variability, depending on its temporal characteristics.
We define SMlong by fitting Eq. ()
to the soil moisture time series using the harmonic frequencies lower
than 1 yr-1 that divide into the number of years in the
data record (i.e., for k/n = 1/m, 2/m, 3/m … (m - 1)/m, where m is
the time series length in years). Finally, we define SMshort as the residual:
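\[ \mathrm{SM}_{\mathrm{short}} = \mathrm{SM} - \langle \mathrm{SM}\rangle - \mathrm{SM}_{\mathrm{long}} - \mathrm{SM}_{\mathrm{seas}}. \]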
Note that, as defined here, SMlong, SMseas, and SMshort are all
zero mean, since the time series mean was assigned to a0 in
Eq. (). Both the AMSR-E and ARS observed
time series are incomplete (Table ). When applied to incomplete time series,
the sinusoids fitted by Eq. () are not necessarily independent; hence, the fitted SMseas
and SMlong may not be independent. We opted not to
use gap-filling prior to fitting Eq. (), to keep the
method simple, and because gap-filling would directly affect the
SMshort dynamics. In Sect. ,
before using the decomposed time series we check for signs of strong
dependence between the fitted SMlong,
SMseas, and SMshort, by testing
whether the sum of the variances of the three timescale components
differs from the variance of the original soil moisture time
series. We assume that if there is little
difference then any dependence between SMlong,
SMseas, and SMshort has only
a minimal impact on our results. Following initial investigation with
this test, the number of observations used at each location is
maximized by comparing only model (or assimilation) estimates to ARS
in situ measurements, avoiding direct comparison of the incomplete ARS
and AMSR-E time series (which would require cross-screening for the
availability of both). Finally, we do not use the harmonic fit to
interpolate missing data, and instead screen out the fitted
SMlong and SMseas at times when
the original soil moisture was not available. Also, at Reynolds
Creek, where the time series has been truncated to remove frozen
winters, the length of the year used to fit the harmonics was
similarly truncated.
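The full decomposition and the variance check described above can be sketched together. This is a hedged illustration only: the names `decompose` and `variance_closure`, the 365-day year, and the synthetic series are assumptions, and the sketch fits the seasonal and inter-annual harmonics separately to the mean-removed series, which is one reading of the text rather than a confirmed detail.

```python
import numpy as np

DAYS_PER_YEAR = 365.0  # assumption: leap days are ignored in this sketch

def _harmonic_sum(t, anom, freqs):
    """Least-squares harmonic fit to the available (non-NaN) anomalies."""
    ok = ~np.isnan(anom)
    A = np.column_stack([fn(2 * np.pi * f * t)
                         for f in freqs for fn in (np.sin, np.cos)])
    coef, *_ = np.linalg.lstsq(A[ok], anom[ok], rcond=None)
    return A @ coef

def decompose(t, sm, n_years, n_seasonal=2):
    """Split sm into (mean, long, seas, short); requires n_years >= 2."""
    t = np.asarray(t, dtype=float)
    sm = np.asarray(sm, dtype=float)
    mean = np.nanmean(sm)
    anom = sm - mean
    # Annual harmonics: 1 and 2 cycles per year.
    seas_f = [k / DAYS_PER_YEAR for k in range(1, n_seasonal + 1)]
    # Inter-annual harmonics: 1/m, 2/m, ..., (m-1)/m cycles per year.
    long_f = [k / (n_years * DAYS_PER_YEAR) for k in range(1, n_years)]
    seas = _harmonic_sum(t, anom, seas_f)
    long_ = _harmonic_sum(t, anom, long_f)
    # Screen the fitted components where the original data are missing.
    missing = np.isnan(sm)
    seas[missing] = np.nan
    long_[missing] = np.nan
    short = anom - long_ - seas
    return mean, long_, seas, short

def variance_closure(sm, long_, seas, short):
    """Summed component variances divided by the total variance."""
    parts = np.nanvar(long_) + np.nanvar(seas) + np.nanvar(short)
    return parts / np.nanvar(sm)

# Synthetic 4-year daily series with gaps (values are illustrative only).
rng = np.random.default_rng(3)
n_years = 4
t = np.arange(n_years * 365)
sm = (0.20
      + 0.03 * np.sin(2 * np.pi * t / (n_years * 365))   # inter-annual
      + 0.05 * np.sin(2 * np.pi * t / 365)               # mean seasonal cycle
      + 0.02 * rng.standard_normal(t.size))              # subseasonal noise
sm[rng.random(t.size) < 0.2] = np.nan

mean, sm_long, sm_seas, sm_short = decompose(t, sm, n_years)
ratio = variance_closure(sm, sm_long, sm_seas, sm_short)
print(ratio)
```

A ratio close to one suggests that any dependence between the fitted components is weak; a ratio well below one would point to positively correlated components.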
Descriptive statistics for the data sets at each watershed.
Data source | Number of daily data | Mean (m3 m-3) | Standard deviation (m3 m-3)
Reynolds Creek, surface
AMSR-E | 1209 | 0.17 | 0.097
ARS | 1944 | 0.10 | 0.068
Catchment | 2111 | 0.16 | 0.039
Walnut Gulch, surface
AMSR-E | 1960 | 0.15 | 0.067
ARS | 3282 | 0.05 | 0.023
Catchment | 3287 | 0.14 | 0.039
Little Washita, surface
AMSR-E | 1748 | 0.27 | 0.097
ARS | 2690 | 0.13 | 0.054
Catchment | 3287 | 0.14 | 0.039
Little River, surface
AMSR-E | 1989 | 0.31 | 0.100
ARS | 3155 | 0.10 | 0.044
Catchment | 3287 | 0.19 | 0.049
Little River, root-zone
AMSR-E | – | – | –
ARS | 2808 | 0.09 | 0.036
Catchment | 2830 | 0.15 | 0.038
For demonstration purposes, in Sect. we decompose each soil moisture
time series into similarly defined timescale components using moving
averages, since moving averages are often used for calculating anomaly
correlations. The lengths of the
averaging windows were chosen to give close agreement with the results
of the harmonic decomposition described above. For the moving average
decomposition, the inter-annual soil moisture time series,
SMlongMA, is defined as the 181-day
moving average, and the seasonal cycle,
SMseasMA, is defined for each day of the
year by averaging the data from all years that fall within a 45-day
window surrounding that day of year. As with the harmonic approach,
the subseasonal time series, SMshortMA,
is calculated as the residual, analogous to Eq. (). The
same data processing and quality control as for the harmonic
decomposition is used (also without gap filling), plus the moving averages are only calculated
when at least 60 % of the data within the averaging window are available.
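The moving average decomposition can be sketched along the same lines. This is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, a 365-day year is assumed, and the 181-day moving average is applied here to the series after the mean and the seasonal climatology have been removed, which is an assumption about the order of operations.

```python
import numpy as np

def centered_mean(x, window, min_frac=0.6):
    """Centered moving average; NaN where < min_frac of the window is available."""
    half = window // 2
    out = np.full(x.shape, np.nan)
    for i in range(x.size):
        seg = x[max(0, i - half): i + half + 1]
        ok = ~np.isnan(seg)
        if ok.sum() >= min_frac * window:
            out[i] = seg[ok].mean()
    return out

def seasonal_climatology(x, doy, window=45, min_frac=0.6, year_len=365):
    """Mean over all years of the data within `window` days of each day of year."""
    half = window // 2
    clim = np.full(year_len, np.nan)
    for d in range(year_len):
        dist = np.abs(doy - d)
        dist = np.minimum(dist, year_len - dist)   # circular day-of-year distance
        seg = x[dist <= half]
        ok = ~np.isnan(seg)
        if seg.size and ok.sum() >= min_frac * seg.size:
            clim[d] = seg[ok].mean()
    return clim[doy]

def ma_decompose(sm, doy):
    """Split sm into (mean, long, seas, short) using moving averages."""
    sm = np.asarray(sm, dtype=float)
    mean = np.nanmean(sm)
    anom = sm - mean
    seas = seasonal_climatology(anom, doy)
    long_ = centered_mean(anom - seas, 181)
    # Screen the components where the original data are missing (no gap filling).
    missing = np.isnan(sm)
    seas = np.where(missing, np.nan, seas)
    long_ = np.where(missing, np.nan, long_)
    short = anom - seas - long_
    return mean, long_, seas, short

# Synthetic 3-year daily series with gaps (values are illustrative only).
rng = np.random.default_rng(1)
t = np.arange(3 * 365)
doy = t % 365
sm = 0.20 + 0.05 * np.sin(2 * np.pi * t / 365) + 0.02 * rng.standard_normal(t.size)
sm[rng.random(t.size) < 0.1] = np.nan
mean, long_ma, seas_ma, short_ma = ma_decompose(sm, doy)
```

With the 60 % availability rule, the inter-annual component is undefined near the ends of the record, where fewer than 109 of the 181 days in the window exist.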
Results
Below, the original AMSR-E, Catchment, and ARS soil moisture time
series are examined (Sect. ), before being split into
SMseas, SMlong, and SMshort (Sect. ). The distribution
of variance across the different timescales for each soil moisture
estimate is then compared (Sect. ), before the
observations are rescaled (Sect. ), and the
benefit of assimilating the AMSR-E data into Catchment is assessed at
each timescale (Sect. ). Finally, the
consequences of using a relatively short record to rescale the AMSR-E
data are examined (Sect. ).
The ARS in situ, Catchment model, and AMSR-E remotely sensed surface
soil moisture, with near-surface soil moisture at (a) Reynolds
Creek, (b) Walnut Gulch, (c) Little Washita,
(d) Little River, and (e) root-zone soil moisture at Little
River.
Decomposition of the Catchment near-surface soil moisture time
series at Little River, using the harmonic (HA; black) and moving average
(MA; cyan) methods, for (a) the original time series (red dots) and
the sum of SMlong + SMseas + the long-term mean
soil moisture (solid lines), and the individual components
(b) SMlong, (c) SMseas, and
(d) SMshort.
Time series variance at each timescale, with the Catchment
Model (M), original AMSR-E Observed (O), and CDF-matched AMSR-E Observed (Oc)
soil moisture variances plotted for the (a) harmonic, and
(b) moving average decomposition methods. The circles and error bars
give the variance of the original soil moisture time series, with 95 %
confidence intervals (some very small confidence intervals are obscured by
the plotted circles).
The ARS, AMSR-E, and Catchment time series
Figure shows the original time series at each
site. In general, soil moisture from in situ, modeled, and remotely
sensed estimates have systematic differences in their behavior, due to
representativity or structural differences between the estimates.
The most obvious difference in Fig. is that the
mean and variance of each estimate differ (see also
Table ). Both AMSR-E and Catchment are
consistently biased high compared to the ARS soil moisture. Bias
values for the model range from 0.01 m3 m-3 for Little
Washita to 0.09 m3 m-3 for Little River, and bias
values for the AMSR-E retrievals range from 0.07 m3 m-3
for Reynolds Creek to 0.21 m3 m-3 for Little River.
Additionally, the standard deviation of AMSR-E is 2 to 3 times
larger than the other two estimates. Figure
demonstrates that this is due to greater noise, and also a prominent
seasonal cycle at Little Washita and Little River that is not evident
in the other time series.
In addition to the systematic differences in their mean and standard
deviation reported above, there are more subtle differences between
the soil moisture dynamics described by each estimate. For example,
for both the surface and root-zone soil moisture, the ARS time series
tend to show a sharper response to individual rain events than does
Catchment, with (relatively) larger peaks followed by more rapid dry
down after each event. At Walnut Gulch this is particularly obvious,
with ARS rapidly drying to a well-defined lower limit after each
precipitation event, while Catchment has a lesser response to
individual events, and a stronger seasonal signal.
Soil moisture time series at each timescale
Figure shows an example of the timescale
decomposition, for the Catchment surface soil moisture at Little
River, for both the harmonic and moving average approaches. The time
series described by each method are similar in terms of the magnitude
and timing of their dynamics, except that the moving average inter-annual
soil moisture includes more high-frequency variability
than does the harmonic version. Evaluation of soil moisture at
specific timescales should ideally be based on time series separated
into independent timescale components. For the harmonic method,
independence between the time series at each timescale is not
guaranteed since the original time series were not complete, while for
the moving average method, independence is not expected.
Fraction of variance at each timescale, obtained by normalizing the
time series variance before decomposition. The Catchment model (M), original
AMSR-E observed (O), and the CDF-matched AMSR-E-observed (Oc) soil moisture
time series, cross-screened for AMSR-E availability, are plotted
in (a), and the ARS in situ observations (I), Catchment model (M),
and baseline assimilation (Ac) soil moisture time series, cross-screened for
ARS availability, are plotted in (b). The circles give the variance
of the original (normalized) soil moisture time series.
Figure shows an example of the variance bar plots
used to check for signs of dependence between the time series at each
timescale, in this case for the Catchment model and the AMSR-E
observations. In Fig. a, for the harmonic method,
the sum of the variances at each timescale (the stacked bars) is very
close (within 2 %) to the total variance of the original soil
moisture time series (the white circles), falling within the 95 %
confidence interval of the total variance in each case. In contrast,
for the moving average method in Fig. b the sum of
the variances of each timescale falls outside the 95 % confidence
interval for the total time series variance at three of four sites,
with a mean difference of 8 % of the total variance (with
differences ranging between 1 and 16 %), indicating strong
dependence between the three components. In each case the sum of the variances of
the timescale components is less than the total variance,
indicating positively correlated features between the moving
average timescale components (since $\sigma^2_{X+Y} = \sigma^2_X + 2\sigma_{XY} + \sigma^2_Y$).
This positive correlation is intuitively expected, since an anomaly in the original soil moisture
time series has the same direction of influence on both the moving
averages and the residual from that moving average (e.g., in
Fig. note the signal of the large positive anomaly
in early 2004 in both SMlongMA and
SMshortMA). Finally, the distribution of
variance across the timescales is similar for each method, largely
because the moving average window lengths for
SMseasMA and SMlongMA were selected to generate time
series closely matching those from the harmonic method.
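The variance argument used in this check can be written out for three components. The expression below is a sketch; the covariance symbols such as \sigma_{\mathrm{long,seas}} are notation introduced here for illustration and do not appear in the original. It assumes the three components sum exactly to the mean-removed series over the same samples.

```latex
\begin{align}
\sigma^2_{\mathrm{SM}} &= \sigma^2_{\mathrm{long}} + \sigma^2_{\mathrm{seas}} + \sigma^2_{\mathrm{short}}
  + 2\left(\sigma_{\mathrm{long,seas}} + \sigma_{\mathrm{long,short}} + \sigma_{\mathrm{seas,short}}\right), \\
\sigma^2_{\mathrm{SM}} - \left(\sigma^2_{\mathrm{long}} + \sigma^2_{\mathrm{seas}} + \sigma^2_{\mathrm{short}}\right)
  &= 2\left(\sigma_{\mathrm{long,seas}} + \sigma_{\mathrm{long,short}} + \sigma_{\mathrm{seas,short}}\right).
\end{align}
```

A summed component variance that falls below the total variance therefore implies that the covariances are positive on net. For example, a shortfall of 8 % of the total variance corresponds to net covariances equal to 4 % of the total variance.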
Variance distribution across timescales
In Fig. the AMSR-E variance is much larger than
that for Catchment (as was discussed in Sect. ), making
it difficult to compare the relative distribution of variance across
each timescale. Figure a then shows the AMSR-E and
Catchment variance bar plots with the total variance normalized to
one, to allow direct comparison to the fraction of variance at each
timescale. The same plots are also presented for the Catchment and
ARS soil moisture in Fig. b (recall we do not
directly compare the ARS and AMSR-E time series, so as to avoid
cross-screening their availability).
In Fig. , the distribution of variance across timescales for each data set can be very different, and there is not
a consistent pattern across the four sites. As was previously noted
from Fig. , AMSR-E has a very prominent
seasonal cycle at Little River and Little Washita (40–70 % of
the total variance) that is not present for Catchment or ARS, for
which the SMseas fraction of variance is around
10–20 % in Fig. . In contrast, at Reynolds
Creek and Walnut Gulch, Catchment has a larger fraction of its
variance in the seasonal cycle (55–70 %) than does AMSR-E
(20–40 %), with ARS agreeing with Catchment at Reynolds Creek
only. At Walnut Gulch the greater variance fraction in the Catchment
SMseas is mostly balanced by less variability in
SMshort (30 % compared to 60 % for
ARS). This is associated with the differing responses to
precipitation events already noted in Fig. .
One might expect AMSR-E to have a larger fraction of
variance at SMshort, due to measurement noise from the remote sensor. However, this is only the case at
Reynolds Creek, where AMSR-E has 50 % of its variance in
SMshort, compared to 20–30 % for Catchment and
ARS. At Walnut Gulch, the AMSR-E and ARS SMshort
variance fractions are similar (50–60 %), while the fraction for
Catchment is much lower (25 %). At Little Washita and Little River
the variance fraction in the AMSR-E SMshort is
similar to Catchment (at around 50 and 30 %, respectively) and
both are much smaller than for ARS (around 70 %). At these two
sites the AMSR-E SMshort variance fraction may well be
less than expected due to the large amount of variance in its
exaggerated seasonal cycle.
For the SMlong variance, the patterns at Little
Washita and Little River are again similar to each other. Catchment
has much more variance in SMlong (40–50 %)
than ARS (20 %) or AMSR-E (10 % or less). At the other two
sites, the SMlong variance fraction is similar for
all data sets, except for the lower value for AMSR-E at Walnut Gulch
(< 10 %, compared to around 20 % for ARS and Catchment).
Error variances (ubMSE) compared to ARS in situ observations at each
timescale, for the near-surface soil moisture at (a) Reynolds
Creek, (b) Walnut Gulch, (c) Little Washita,
(d) Little River, and (e) for the root-zone soil moisture
at Little River. Bars show the Catchment model open loop (M), baseline
assimilation (Ac), individual Ay assimilation experiments, and the mean
across the Ay experiments (〈Ay〉). Label AyYY indicates
the Ay experiment with bias correction parameters estimated from the
12 months from 1 October of 20YY. Circles and error bars give the ubMSE and
its 95 % confidence interval for the original soil moisture time series
(some very small confidence intervals are obscured by the plotted circles).
The dashed line at ubMSE of 1.6 × 10-3 (m3 m-3)2
is equivalent to the common ubRMSE target of 0.04 m3 m-3.
Harmonic decomposition of the ARS in situ (I), Catchment model (M),
and baseline assimilation output (Ac) near-surface soil moisture time series
at Little River, showing the (a) original time series,
(b) SMlong, (c) SMseas, and
(d) SMshort.
Baseline observation rescaling
For the baseline experiment, the AMSR-E observations were rescaled using bulk
CDF-matching parameters estimated over the full data record. By design, the
CDF-matched AMSR-E observations, labeled Oc, have the same mean (not shown)
and variance (Fig. a) as the Catchment soil moisture.
Figure shows that the CDF matching had little impact on
the variance distributions across each timescale. This suggests that for the
particular examples in this study, the CDF-matching operator could be
approximated by a linear rescaling, in which only the mean and variance of
the model are matched, as in . To confirm this, the
assimilation experiments were repeated using linear rescaling of the AMSR-E
observations in place of CDF matching. The results (not shown) were indeed
very similar to the CDF-matching experiments, in terms of the rescaled
observations and the assimilation output (for both the Oc rescaling presented
in this Section, and the Oy rescaling presented in Sect. ).
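The two rescaling operators compared above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the code used in this study. The function names `linear_rescale` and `cdf_match` are hypothetical, and the CDF matching shown is a simple empirical quantile mapping that assumes coincident observation and model samples with no missing values.

```python
import numpy as np

def linear_rescale(obs, model):
    """Rescale obs so that its mean and standard deviation match those of model."""
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    return model.mean() + (obs - obs.mean()) * (model.std() / obs.std())

def cdf_match(obs, model):
    """Map each observation to the model value at the same empirical quantile.

    Assumption: a plain rank-based quantile mapping; operational CDF matching
    may instead fit or smooth the two distributions.
    """
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    ranks = np.argsort(np.argsort(obs))       # rank of each observation, 0..n-1
    quantiles = (ranks + 0.5) / obs.size      # mid-point plotting positions
    return np.quantile(model, quantiles)
```

When the observed and modeled distributions differ mainly in location and spread, the two functions return very similar series, which is the situation described above for the sites in this study.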
Recall that the distribution of the variance across each timescale was quite different for the AMSR-E and Catchment soil moisture in
Fig. . Note that large errors in the variance at one
timescale (in either AMSR-E or Catchment) will affect the rescaling
of the variance at other timescales. In particular, if the
unrealistically large AMSR-E seasonal cycle at Little Washita were
replaced with something more realistic, for example representing
8 % of the total variance (as in the ARS time series), then the
fraction of variance in SMshort would increase from
the current 48 to 72 %, increasing the SMshort
variance in the CDF-matched AMSR-E from 0.0036 to 0.0054 (m³ m⁻³)².
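The coupling between timescales under bulk rescaling can be illustrated with a synthetic series. The sketch below is an assumption-laden toy, not the study's data: the observation is a sinusoidal seasonal cycle plus white noise standing in for the subseasonal signal, the rescaling is linear (variance matching only), and the function name and all amplitudes are hypothetical.

```python
import numpy as np

def short_variance_after_rescaling(seasonal_amplitude, model_variance,
                                   n_years=9, seed=0):
    """Variance of the subseasonal part of a synthetic observation series
    after bulk rescaling of the full series to a target model variance."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_years * 365)
    seasonal = seasonal_amplitude * np.sin(2 * np.pi * t / 365.0)
    short = 0.03 * rng.standard_normal(t.size)   # hypothetical subseasonal signal
    obs = seasonal + short
    # A single bulk scale factor is applied to every timescale alike.
    scale = np.sqrt(model_variance / obs.var())
    return float((scale * short).var())
```

With the same subseasonal signal and the same target variance, a larger seasonal amplitude in the observations yields a smaller bulk scale factor, and therefore a more strongly damped subseasonal component after rescaling.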
Evaluation of the baseline assimilation experiment at each timescale
Figure shows the ubMSE for each assimilation
experiment, separated into each timescale. Prior to assimilation, the average ubMSE in the near-surface
soil moisture across the four sites was 1.8 × 10⁻³ (m³ m⁻³)² (giving a ubRMSE just above the
0.04 m³ m⁻³ target). Just over half (55 %) of the
ubMSE was in SMshort, with the rest split between
SMseas (26 %) and SMlong
(19 %). The baseline assimilation experiment, labeled Ac, significantly reduced the total ubMSE
at each site, reducing the average near-surface ubMSE across the four
sites by 33 % to 1.2 × 10⁻³ (m³ m⁻³)², with
average reductions in the near-surface layer of 52 % for
SMlong, 25 % for SMseas, and
22 % for SMshort. The Ac assimilation also reduced the ubMSE
at each site for every timescale component, except for
SMseas at Little Washita (where the model ubMSE was
already relatively small).
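For reference, the ubMSE and the relative reductions quoted above can be computed as in the following sketch. It assumes the standard definition of the unbiased MSE (the mean square difference after removing each series' own mean); the function names are hypothetical.

```python
import numpy as np

def ubmse(estimate, reference):
    """Unbiased mean square error: MSE after removing the mean of each series."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = (estimate - estimate.mean()) - (reference - reference.mean())
    return float(np.mean(diff ** 2))

def relative_reduction(before, after):
    """Percentage reduction of an error metric relative to its initial value."""
    return 100.0 * (before - after) / before
```

Applied to the site-averaged values above, `relative_reduction(1.8e-3, 1.2e-3)` gives approximately 33 %, and the square root of a ubMSE of 1.6 × 10⁻³ (m³ m⁻³)² is the 0.04 m³ m⁻³ ubRMSE target.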
Root-zone soil moisture observations were available for the study
period only at Little River. Both the distribution of the ubMSE across
each timescale, and the relative reductions achieved from
assimilation, are similar for the near-surface and root-zone layers at
Little River in Fig. d and e, adding confidence
that the model improvements reported above for the near-surface soil
moisture are indicative of the performance throughout the soil profile.
To illustrate the impact of the assimilation at each timescale,
Fig. compares the decomposed time series for
the Catchment model and Ac assimilation experiments to that from the
ARS in situ observations at Little River. The difference between the
three SMshort time series is difficult to visually
judge in Fig. d; however, the impact of the
assimilation on the SMseas and
SMlong time series is
clear. Figure b suggests that the large
SMlong ubMSE reduction (by over 80 %) from the
assimilation is due to the reduced amplitude in the
SMlong dynamics, although there is perhaps also an
improvement in event timing. In Fig. c, the
model seasonal cycle has an overestimated amplitude, and also includes
two maxima per year, where the ARS seasonal cycle has only one. The
assimilation exacerbates the overestimated amplitude, but also removes
the second annual maximum, resulting in an overall SMseas ubMSE reduction (by 46 %).
Systematic differences between AMSR-E observations and Catchment
model near-surface soil moisture, with (a) the mean difference
(〈model〉 - 〈observation〉), and
(b) the ratio of the standard deviations
(σ(model)/σ(observations)). The parameters are estimated using
all years (All), and each year separately (with label YY indicating the
parameters estimated from the 12 months from 1 October of 20YY), and the
dashed lines give the mean of the YY parameters.
Observation rescaling with a short data record
The 9-year time period used in the baseline experiment to estimate the
CDF-matching parameters is longer than is often available for soil moisture
assimilation experiments. Obviously, assimilating a shorter time period will
limit the potential improvements to the model SMlong (of
similar magnitude to the SMshort improvement in this
study). The potential benefit of an assimilation over a shorter period may
also be limited by the increased sampling uncertainty in the estimated
observation rescaling parameters. This increased uncertainty could arise from
systematic errors due to inadequate sampling of SMseas and
SMlong, or from increased random errors associated with
the smaller sample size. This is tested here with nine additional
experiments, labeled AyYY, in which the CDF-matching parameters are each
based on a 12-month period starting on 1 October. For example, experiment
Ay03 uses CDF-matching parameters based on the 12 months of data from
1 October 2003 to 30 September 2004. Each of the nine experiments assimilates
the full 8- or 9-year record of AMSR-E near-surface soil moisture
retrievals (including the data for the year from which the CDF-matching
parameters were determined).
The potential uncertainty introduced by using a single year to estimate the
rescaling parameters depends on the inter-annual variability in the
systematic differences between the observed and forecast soil moisture. The
main systematic differences that are addressed by the CDF matching are the
differences in the observed and forecast mean and standard deviation. For
demonstrative purposes, Fig. illustrates the
difference between the means, and the ratio of the standard deviations,
estimated using the full data record, and using each single year. Note,
however, that in the presented Ay experiments the AMSR-E observations are
rescaled based on the full CDF, and not just the mean and variance. In
Fig. a there is considerable inter-annual
scatter in the yearly mean differences, although by linearity the
average is unbiased. The standard deviation ratio in
Fig. b also shows inter-annual variability;
however, the single year ratios are also biased low compared to the
all-years ratio, since the single year estimates did not sample the
SMlong variance (which was consistently a greater
fraction of the total variance for Catchment than for AMSR-E in
Fig. a). This is particularly marked at Little
River, where the average of the single year standard deviation ratios
was 30 % less than when estimated using all years (since
SMlong makes up close to 50 % of the total
variance in Catchment, compared to less than 5 % for AMSR-E in Fig. a).
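The single-year statistics described above (the mean difference and the standard deviation ratio for each 12-month window starting on 1 October) could be computed along the following lines. This is a hedged sketch: the function name is hypothetical, the inputs are assumed to be coincident daily series without gaps, and partial windows at the ends of the record are returned as they are rather than filtered out.

```python
import numpy as np

def yearly_rescaling_parameters(dates, model, obs):
    """Return {start_year: (mean difference, std ratio)} for each 12-month
    window beginning on 1 October of start_year."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    years = dates.astype("datetime64[Y]").astype(int) + 1970
    months = dates.astype("datetime64[M]").astype(int) % 12 + 1
    # October to December belong to the window starting that calendar year;
    # January to September belong to the window that started the year before.
    window_start = np.where(months >= 10, years, years - 1)
    params = {}
    for year in np.unique(window_start):
        sel = window_start == year
        mean_difference = model[sel].mean() - obs[sel].mean()
        std_ratio = model[sel].std() / obs[sel].std()
        params[int(year)] = (mean_difference, std_ratio)
    return params
```

Because each window spans only one year, the standard deviations computed this way exclude inter-annual variability, which is why the single-year ratios can be biased relative to the all-years ratio.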
Figure includes the ubMSE for the nine Ay
assimilation experiments, as well as the mean ubMSE (〈Ay〉)
across all nine. On average, assimilating the AMSR-E observations
that have been rescaled using parameters estimated from a single year is
beneficial. As with the Ac experiment, the
〈Ay〉 ubMSE is consistently less than that of
the model at each timescale, except for SMseas at
Little Washita. However, for individual realizations there is an
increased risk when using the single year parameters that the
assimilation will not significantly improve the model, or will even
significantly degrade the model. For example, at Little Washita,
where the Ac experiment reduced the ubMSE by a small but significant
amount, none of the Ay experiments significantly decreased the ubMSE,
and the Ay10 experiment significantly increased it.
Additionally, comparing the Ay experiments in Fig. to the baseline
Ac experiment shows that at Reynolds Creek, Walnut Gulch, and
Little Washita most of the Ay experiments resulted in larger total
ubMSE than for the Ac experiment, while at Little River the opposite occurred. Overall
there were eight Ay experiments for which the total ubMSE was
significantly different (at the 5 % level) and higher than for the
Ac experiment, seven for which it was significantly different and
lower, and 20 where the ubMSE was not significantly changed. The
differences between the Ac and Ay ubMSE are skewed, in that when the
Ay ubMSE is higher, the difference tends to be greater than when it is
lower. Consequently, the average reduction in the model ubMSE for the
near-surface soil moisture, compared to the model with no
assimilation, is slightly less for 〈Ay〉
(30 %) than for Ac (33 %).
Each instance of relatively poor ubMSE for an Ay experiment can be
traced to the more extreme (i.e., unrepresentative) single year
systematic differences in Fig. . Going through
the experiments with the largest relative increase in
ubMSE, experiment Ay07 at Reynolds Creek, and experiments Ay05, Ay06,
and Ay07 at Walnut Gulch all have extreme standard deviation ratios,
while Ay06 at Reynolds Creek and Ay10 at Little Washita have extreme
mean differences. In each case, most of the increase in the ubMSE is
due to increased errors in the SMseas and
SMlong components, suggesting that the
SMshort corrections are more robust to uncertainty
in the scaling parameters. Note that unrepresentative scaling parameters do not necessarily
degrade the assimilation output, and in some instances are even
advantageous. Most obviously, at Little River, where the single year
standard deviation ratios were biased low (by 30 %), the Ay
assimilation experiments all produced slightly lower ubMSE than the Ac experiment.
In general, the impact of errors in the
rescaling of the mean value is likely under-reported here, since any
introduction of biases into the model will not be directly detected by
the ubMSE. Despite this, the examples cited above in which unrepresentative mean
difference corrections degraded the bias-robust ubMSE highlight
the potential for a bias-blind assimilation of
biased observations to degrade model soil moisture dynamics.
Conclusions
Many studies have demonstrated that near-surface soil moisture
assimilation can improve modeled soil moisture, in terms of the
anomaly time series used to represent random errors, often
implicitly assumed to represent subseasonal scale variability
associated with individual precipitation events .
Here, 9 years of LPRM AMSR-E observations
were assimilated into the Catchment model, and the resulting model
output evaluated separately at the subseasonal
(SMshort), seasonal (SMseas),
and inter-annual (SMlong) timescales against
watershed scale in situ observations at four ARS sites in the US. The
results show that, in addition to reducing the near-surface
SMshort ubMSE averaged across the four sites, the
assimilation also reduced the near-surface SMlong
ubMSE. The magnitudes of the reductions in SMshort
and SMlong were similar (2.1 × 10⁻⁴ (m³ m⁻³)², and
2.5 × 10⁻⁴ (m³ m⁻³)², respectively), although
this represented a much larger relative reduction in the
SMlong ubMSE (52 % of the model
SMlong ubMSE, compared to 22 % for the
SMshort ubMSE). In situ observations of the
root-zone layer were available for only one site; however, the
similarity between the near-surface and root-zone results at this site
(Fig. ) is encouraging in terms of our
near-surface results being representative of the deeper soil moisture profile.
The reduced SMlong ubMSE suggests that assimilating
a sufficiently long data record of near-surface soil moisture observations
can improve the model soil moisture dynamics at inter-annual timescales,
enhancing the model's ability to simulate important events such as droughts.
There is then a clear potential for reanalyses, or other long-term
simulations, to benefit from the assimilation of long-term remotely sensed
soil moisture records. Such long records are available from the AMSR-E
satellite used here (May 2002–October 2011), and increasingly from the
active microwave ASCAT series (ongoing from October 2006). Carefully merged
multi-satellite records, such as the 30-year record being produced by the
Water Cycle Multi-mission Observation Strategy (WACMOS) project
are also now providing data records of unprecedented length. As with
SMshort, an important caveat on this finding is that it is
possible that the reduced SMlong ubMSE was
associated with reduced representativity differences compared to the
in situ observations, rather than a true model improvement. For
example, at Little River in Fig. the
substantial improvements to the SMlong near-surface
and root-zone soil moisture gained by assimilating the AMSR-E
observations were largely due to reduced SMlong
variance. If the model's exaggerated SMlong was
a representativity or structural error (e.g., too strong a signal of
underlying water table), then it is not clear that the model would
benefit from correcting this error, in terms of improvements to forecast skill.
Assimilating the AMSR-E observations also reduced the near-surface
SMseas ubMSE by 25 %, averaged across the four
sites, suggesting the possibility that the assimilation was beneficial
to the modeled mean seasonal cycle, despite not being designed to
address systematic errors. However, even more so than for
SMlong, the reduced SMseas ubMSE
could be due to reduced representativity differences, rather than
a genuine improvement to the model's ability to represent the desired
physical processes. To confirm that the SMlong and
SMseas ubMSE reductions do indicate improved model
soil moisture would require evaluating the dependent moisture and
energy flux forecast, and unfortunately verifying observations are not
available at the study locations.
In comparing the AMSR-E and Catchment soil moisture at each timescale
in this study, it became apparent that the distribution of variance
across each timescale was very different between the remotely sensed
and modeled soil moisture time series (Fig. ).
Traditionally, observation rescaling strategies used in land data
assimilation do not distinguish between variability at different timescales, and apply a single set of bulk rescaling parameters to the
full time series. Consequently, the large discrepancies in the
variance at one timescale (due to errors in one of or both estimates)
can have follow-on effects for the rescaling of other timescales. For
example, the unrealistically large AMSR-E seasonal cycle at Little
Washita caused the variability at SMlong and
SMshort to be overly dampened by the bulk rescaling. This
could perhaps be avoided by rescaling the observations separately at each
timescale using the decomposed time series produced in this study, or using
other methods that distinguish scaling characteristics at different timescales (e.g., ).
In addition to observation bias removal strategies that respect the
timescale-dependent nature of observation–forecast systematic
differences, it may be advantageous to target only certain timescales, for example by retaining the model seasonal cycle while
rescaling other timescales (e.g., ).
Ultimately, whether these approaches will be beneficial will
depend on whether the model–observation differences at each timescale
are caused by model or observation errors. This study is a first
effort to investigate soil moisture assimilation at specific timescales associated with different soil moisture physical
processes. Looking forward, further evaluation of soil moisture at
these timescales will help to identify the physical processes
responsible for errors in modeled and remotely sensed soil moisture
(including representativity errors in the latter), which will in turn
help to refine observation bias removal strategies.
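The per-timescale rescaling suggested above could take a form like the following sketch. It is only one possible reading of that suggestion, not a method tested in this study: the function name is hypothetical, each timescale component is rescaled linearly (variance matching only), and the components are assumed to be already decomposed, coincident, and non-constant.

```python
import numpy as np

def rescale_by_timescale(obs_components, model_components, model_mean):
    """Rescale each observed timescale component to the model's standard
    deviation at that timescale, then recombine around the model mean.

    obs_components and model_components are dicts keyed by timescale name
    (e.g. "long", "seas", "short") holding arrays of equal length.
    """
    rescaled = 0.0
    for name, obs_part in obs_components.items():
        obs_part = np.asarray(obs_part, dtype=float)
        model_part = np.asarray(model_components[name], dtype=float)
        anomalies = obs_part - obs_part.mean()
        # One scale factor per timescale, instead of one bulk factor.
        rescaled = rescaled + anomalies * (model_part.std() / obs_part.std())
    return model_mean + rescaled
```

With separate scale factors, an unrealistically large observed seasonal cycle no longer forces the other timescales to be damped, at the cost of relying on the decomposition itself being meaningful for both data sets.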
Finally, we have updated the investigation of
into the use of short data records for estimating observation
rescaling (CDF-matching) parameters. Nine additional assimilation
experiments were performed, each with the AMSR-E observations rescaled
using parameters estimated from a single year of data. Compared to the
scaling parameters estimated using the full data record, using only
1 year of data introduced sampling errors in the parameters, due to inter-annual
variability in SMseas and SMshort and to the unsampled
SMlong variability.
For hindcasting/reanalysis applications, when the same short time
period is used for bias parameter estimation and data assimilation,
such unrepresentative parameters should not be problematic, since the
rescaled observations will still be unbiased relative to the model
over the length of the assimilation experiment, allowing shorter timescale errors to be corrected. However, in a forecasting/analysis
application in which the bias correction parameters must be estimated
with the available (short) data record, and then applied to future
observations, unrepresentative parameters can be more problematic. Our
results suggest that, when necessary, for example early in the SMAP
mission, assimilating near-surface soil moisture over an extended
period using single year parameters will introduce some additional
uncertainty into the assimilation output; however, over a large domain
the overall impact will be minor. Of the total of 35 individual
assimilation realizations that we performed with single year
parameters at the four locations, nine resulted in no significant
change in the near-surface ubMSE compared to the open loop, and one resulted in significantly
increased ubMSE (recall that the baseline assimilation significantly
reduced the ubMSE at all four sites). However, averaged across all
realizations, which should translate to an average across a large
spatial domain, the net impact of the single year parameters was
small, and the benefit gained from the assimilation was not substantially reduced,
compared to the baseline assimilation experiment.