The impact of near-surface soil moisture assimilation at subseasonal, seasonal, and inter-annual timescales

A 9-year record of Advanced Microwave Scanning Radiometer – Earth Observing System (AMSR-E) soil moisture retrievals is assimilated into the Catchment land surface model at four locations in the US. The assimilation is evaluated using the unbiased mean square error (ubMSE) relative to watershed-scale in situ observations, with the ubMSE separated into contributions from the subseasonal (SMshort), mean seasonal (SMseas), and inter-annual (SMlong) soil moisture dynamics. For near-surface soil moisture, the average ubMSE for Catchment without assimilation was 1.8 × 10⁻³ (m³ m⁻³)², of which 19 % was in SMlong, 26 % in SMseas, and 55 % in SMshort. The AMSR-E assimilation significantly reduced the total ubMSE at every site, with an average reduction of 33 %. Of this ubMSE reduction, 37 % occurred in SMlong, 24 % in SMseas, and 38 % in SMshort. For root-zone soil moisture, in situ observations were available at one site only, and the near-surface and root-zone results were very similar at this site. These results suggest that, in addition to the well-reported improvements in SMshort, assimilating a sufficiently long soil moisture data record can also improve the model representation of important long-term events, such as droughts. The improved agreement between the modeled and in situ SMseas is harder to interpret, given that mean seasonal cycle errors are systematic, and systematic errors are not typically targeted by (bias-blind) data assimilation. Finally, the use of 1-year subsets of the AMSR-E and Catchment soil moisture for estimating the observation-bias correction (rescaling) parameters is investigated. It is concluded that when only 1 year of data is available, the associated uncertainty in the rescaling parameters should not greatly reduce the average benefit gained from data assimilation, although locally and in extreme years there is a risk of increased errors.


Introduction
Many studies have demonstrated that assimilation of remotely sensed near-surface soil moisture observations can improve modeled soil moisture, with improvement typically measured by temporal agreement with in situ observations (Reichle et al., 2007; Scipal et al., 2008; Bolten et al., 2010; Draper et al., 2012). In most cases, the remotely sensed soil moisture observations are assimilated using a bias-blind assimilation of observations that have been rescaled to have the same mean and variance as the model forecast soil moisture (Reichle and Koster, 2004; Scipal et al., 2008). This approach is designed to avoid forcing the model into a regime that is incompatible with its assumed (likely erroneous) structure and parameters, while also avoiding the inadvertent introduction of any observation biases into the model (Reichle and Koster, 2004). The assimilation can then correct for random errors in the model forecasts, where random errors are defined as errors that persist for less than the timescale used to (subjectively) define the bias in the mean. Traditionally, observation rescaling is based on the maximum available coincident observed and forecast data record (Reichle et al., 2007; Scipal et al., 2008; Draper et al., 2012), effectively defining the bias over the same period. The rescaled observations will then retain the signal of all observation-forecast differences occurring at timescales shorter than the data record, which for a multi-year data record would include differences spanning the subseasonal, seasonal, and inter-annual timescales. Assimilating these rescaled observations then has the potential to improve the model soil moisture at each of the aforementioned timescales, and yet bias-blind soil moisture assimilation is often implicitly assumed to target only the random errors occurring at the relatively short subseasonal timescales.
Published by Copernicus Publications on behalf of the European Geosciences Union.
At subseasonal, seasonal, and inter-annual timescales, different physical processes control the true soil moisture and errors in soil moisture estimates. Most notably, in many locations seasonal-scale variability is dominated by the mean seasonal cycle (the annually repeating variability), and any errors in the mean seasonal cycle will be systematic, with causes such as incorrect separation of the soil and vegetation moisture signals retrieved from remotely sensed brightness temperatures, or errors in the land surface model vegetation dynamics. In contrast, variability at subseasonal and inter-annual timescales is rarely dominated by repeating cycles, and is more typically associated with transient atmospheric forcing events. Specifically, rapid timescale (daily) soil moisture dynamics are driven by factors such as individual precipitation events and changes in cloud cover, while longer timescale (seasonal-plus) dynamics are driven by changes in the atmospheric supply and demand for moisture (Entin et al., 2000). Soil moisture errors at subseasonal scales could then be caused by factors such as atmospheric noise in remotely sensed data, or errors in the daily meteorology of the model atmospheric forcing, while inter-annual-scale errors could be caused by factors such as drift in the remote sensor calibration, or incorrect representation of atmospheric drought conditions in the atmospheric forcing.
The differing nature of soil moisture errors across timescales has unexplored consequences for data assimilation. Most notably, the systematic nature of errors in the mean seasonal cycle is problematic. Theoretically, bias-blind data assimilation is neither designed nor optimized to correct for systematic errors. More practically, if the systematic differences are not due to model errors (i.e., are caused by observation errors, including representativity errors), then assimilating such information can seriously degrade model performance. Additionally, the timescale dependence of soil moisture errors may also be problematic for observation rescaling using bulk parameters, intended to correct systematic differences across all timescales. Even within relatively short timescales (up to about 1 month), Su and Ryu (2015) showed that the multiplicative (differences in standard deviation) and additive (differences in mean) components of the systematic differences between modeled and remotely sensed soil moisture differ across timescales. They highlight that this lack of stationarity cannot be adequately addressed by using bulk statistics to estimate observation rescaling parameters.
Consequently, in this study we have decomposed modeled, remotely sensed, and in situ soil moisture into separate time series representing soil moisture dynamics at subseasonal, mean seasonal, and inter-annual timescales. We have then used this decomposition to examine the differences between remotely sensed and modeled soil moisture at each timescale, and how assimilating bulk-rescaled soil moisture observations impacts the model soil moisture at each timescale.
The decomposition is achieved by fitting each soil moisture (SM) time series with harmonic functions specified to target the mean seasonal cycle (SMseas), and the subseasonal (SMshort) and inter-annual (SMlong) dynamics. By fitting the appropriate harmonic functions to each time series, we can separate the total mean square error of each soil moisture time series into contributions from each timescale. This is a much more targeted evaluation of soil moisture dynamics at physically relevant timescales than is usually undertaken. Standard evaluation methods focus on bias-blind metrics, such as the correlation (R) or the unbiased root mean square error (ubRMSE, which is calculated after removing the long-term mean difference; Entekhabi et al., 2010b). Both R and ubRMSE are sensitive to soil moisture time series variability at all timescales. While anomaly correlations (Ranom) are also used to exclude the seasonal cycle, this is not done consistently, and does not allow for the total error to be broken into contributing timescales. Depending on how the anomalies are calculated, Ranom measures subseasonal-scale errors (anomalies defined relative to a simple moving average, as in Dorigo et al., 2015), or a combination of inter-annual and subseasonal-scale errors (anomalies defined relative to the mean seasonal cycle over multiple years, as in Draper et al., 2012).
In the second part of this study, we also explore the impact on the assimilation of using short time periods for observation bias correction. When first introducing cumulative distribution function (CDF) matching to rescale remotely sensed soil moisture prior to assimilation, Reichle and Koster (2004) showed that for Scanning Multi-channel Microwave Radiometer (SMMR) soil moisture observations (1979-1987), reasonable rescaling parameters could be estimated using a single year of data. We repeat their investigation using the more modern Advanced Microwave Scanning Radiometer – Earth Observing System (AMSR-E) data set, and also extend their investigation by providing a more statistically robust analysis of the impact of using single-year scaling parameters in the assimilation. This part of the study is motivated by the recent launch of NASA's Soil Moisture Active Passive (SMAP) mission (Entekhabi et al., 2010a), and it will address the consequences of using short records to rescale the observations during the early phases of the SMAP mission.

Data and methods
A 9-year record of surface soil moisture retrievals from AMSR-E X-band data (Owe et al., 2008) was assimilated into the Catchment land surface model, and the results were evaluated against in situ observations from four ARS watershed networks (Jackson et al., 2010). Each of these data sets is first described below (Sect. 2.1), followed by a discussion of the assimilation approach (Sect. 2.2) and the method used to decompose soil moisture time series into subseasonal, seasonal, and inter-annual timescales (Sect. 2.3).

The soil moisture data sets
For over a decade the ARS has been collecting near-surface (5 cm) soil moisture observations, at least hourly, using dense networks of in situ sensors at four watershed-scale sites in the US: Reynolds Creek (RC), Walnut Gulch (WG), Little Washita (LW), and Little River (LR). See Table 1 for the locations and network details for each watershed. At each watershed, observations collected from between 8 and 15 sites are averaged using the Thiessen polygon method to produce a coarse-scale near-surface soil moisture observation with spatial support similar to the AMSR-E observations. Intensive field experiments have shown these coarse-scale estimates to be very accurate, with errors on the order of 0.01 m³ m⁻³ (Bosch et al., 2006; Cosh et al., 2006, 2008).
Soil moisture has also been observed below the near-surface layer at Little Washita since 2007, and at Little River since 2004, with observations potentially made at every 5 cm from 5 to 60 cm. Here, the root-zone soil moisture at Little River is estimated using the average of the 5-60 cm observations (the root-zone Little Washita data were not used, owing to their relatively short time period). At Little River, the root-zone soil moisture estimate is calculated from fewer sensors than the near-surface estimate, due to the greater number of sub-surface sensor dropouts. However, the smaller number of root-zone sensors is not expected to be overly problematic, since soil moisture is less variable (temporally and spatially) in the root zone than in the near-surface layer.
Given that we will focus on evaluating variance, we have not supplemented the ARS in situ observations with observations from single-sensor networks, such as SCAN (Schaefer et al., 2007). Unlike the locally dense in situ measurements from the ARS networks, the variance (and mean) of observations from single sensors cannot be assumed representative of the coarse-scale soil moisture from Catchment and AMSR-E.
Level 3 Land Parameter Retrieval Model (LPRM) X-band AMSR-E near-surface soil moisture retrievals at 0.25° resolution were obtained for the grid cells encompassing the center of each watershed site in Table 1. At X-band the observations relate to a surface layer depth slightly less than 1 cm. Only the descending (01:30 LT) overpass has been used to avoid possible differences in the climatological statistics of day- and nighttime observations. The sites were explicitly selected by ARS to avoid possible radio frequency interference and proximity to permanent open water, and the AMSR-E soil moisture retrievals were screened to remove observations with X-band vegetation optical depth above 0.8.
NASA's Catchment land surface model was run over the 9 km EASE grid cells encompassing the center of each watershed site, using atmospheric forcing fields from the Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al., 2011) and recently improved soil parameters (De Lannoy et al., 2014). The model initial conditions were first spun up from January 1993 to January 2002 using a single member without perturbations. The ensemble (including perturbations) was then spun up from January to October 2002 (see Sect. 2.2 for details of the ensemble). For both the model open loop and data assimilation model output, the ensemble average near-surface (0-5 cm) and root-zone (0-100 cm) soil moisture is then reported.
Daily ARS and Catchment time series were generated by sampling each at the approximate time of the descending AMSR-E overpass (01:30 LT). Initially each time series spanned the AMSR-E data record, rounded down to nine full years from October 2002 to September 2011; however, the Little River root-zone soil moisture observations are not available before January 2004, and were truncated to the 7 years from October 2004 to September 2011. Also, there were just 21 ARS observations at Reynolds Creek in the last year of this period, and therefore the Reynolds Creek time series were truncated to the 8 years from October 2002 to September 2010. The ARS and AMSR-E sensors can only measure liquid soil moisture, and all data have been screened out when the Catchment model indicates frozen near-surface conditions. Since the Reynolds Creek site is frozen for an extended period each winter, liquid soil moisture is not well defined there during winter, and the Reynolds Creek time series have then been truncated to remove winter, defined as from 1 December to 10 March (the period during which the Catchment surface is continuously frozen for at least 3 of the 8 years of the Reynolds Creek record).
C. Draper and R. Reichle: Soil moisture assimilation timescales

The assimilation experiments
The assimilation experiments were performed using a one-dimensional bias-blind ensemble Kalman filter, with the same set-up and ensemble generation as in Liu et al. (2011). Prior to assimilation, the AMSR-E observations were rescaled using CDF matching (Reichle and Koster, 2004). For each experiment a single set of bulk CDF-matching parameters was used (i.e., the rescaling is applied only to the original AMSR-E time series, and not to the decomposed time series). In the baseline assimilation experiments, the CDF-matching parameters were calculated using the maximum available 9-year AMSR-E data record, following standard practice.
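As an illustration of the rescaling step, the following is a minimal sketch of empirical CDF matching by quantile mapping. It is not the implementation of Reichle and Koster (2004): the function name `cdf_match`, the (rank + 0.5)/n plotting position, and the linear interpolation between sorted model values are all assumptions made for this example.

```python
import math
from bisect import bisect_left


def cdf_match(obs, model):
    """Map each observation onto the model value with the same empirical quantile.

    obs, model: sequences of soil moisture values (need not be the same length).
    Illustrative only: plotting position and interpolation are assumptions.
    """
    obs_sorted = sorted(obs)
    model_sorted = sorted(model)
    n_obs, n_mod = len(obs_sorted), len(model_sorted)
    matched = []
    for value in obs:
        # Empirical non-exceedance probability of this observation, in (0, 1).
        quantile = (bisect_left(obs_sorted, value) + 0.5) / n_obs
        # Fractional index of that quantile in the sorted model sample.
        pos = quantile * n_mod - 0.5
        lo = min(max(math.floor(pos), 0), n_mod - 1)
        hi = min(lo + 1, n_mod - 1)
        frac = min(max(pos - lo, 0.0), 1.0)
        matched.append(model_sorted[lo] + frac * (model_sorted[hi] - model_sorted[lo]))
    return matched
```

When the two samples have the same length, the matched observations reproduce the sorted model values exactly, so their mean and variance match the model by construction.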
This 9-year AMSR-E record is the longest remotely sensed soil moisture record available from a single satellite sensor, and soil moisture assimilation experiments using newer satellites, or a modeling system with limited archives, are limited to shorter time periods for observation rescaling. To establish the potential consequences of using a shorter data record, a second set of experiments was conducted, in which the rescaling parameters were estimated using the 12-month periods starting in consecutive Octobers (but assimilating the full 8- or 9-year near-surface soil moisture data record listed in Table 1). Reichle and Koster (2004) also tested the use of 1-year periods for rescaling soil moisture from SMMR. In contrast to their approach, we do not use ergodic substitution (of spatial sampling for temporal sampling) when estimating the rescaling parameters with a single year of observations, since with more modern remote sensors, this is no longer necessary to obtain a sufficient sample size. Additionally, for the assimilation of Soil Moisture and Ocean Salinity (SMOS) retrievals, De Lannoy and Reichle (2015) found that ergodic substitution degraded the estimated CDFs by introducing conflicting information from neighboring grid cells, possibly due to the higher spatial resolution compared to SMMR.
The benefit of each assimilation experiment is compared to that of the Catchment model open-loop ensemble mean, in which the same ensemble generation parameters were used, and no observations were assimilated. The improvement from the open loop is measured using the unbiased mean square error (ubMSE) of the resulting model soil moisture, with respect to the ARS in situ observations. For data set X compared to in situ data I, both of length n, the ubMSE is calculated as
ubMSE = \frac{1}{n} \sum_{i=1}^{n} \left[ (X_i - \langle X \rangle) - (I_i - \langle I \rangle) \right]^2, (1)
where \langle \cdot \rangle indicates the temporal mean. The ubMSE is also referred to as the variance of the errors in X; however, we use the ubMSE terminology for consistency with the commonly used ubRMSE in the soil moisture literature (Entekhabi et al., 2010b). We do not apply the square root here, to take advantage of the additive property of the variance of independent time series. However, to aid interpretation, the ubMSE equivalent to the common ubRMSE target accuracy of 0.04 m³ m⁻³ is indicated in the relevant plots.
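The ubMSE described above can be sketched in a few lines. This is a minimal illustration, assuming two equal-length sequences with no missing values; the function name `ubmse` and the constant names are hypothetical.

```python
from statistics import fmean


def ubmse(x, obs):
    """Unbiased mean square error of x relative to obs.

    Each series has its own temporal mean removed before differencing,
    so a constant bias between the two series contributes nothing.
    """
    if len(x) != len(obs):
        raise ValueError("x and obs must have the same length")
    x_mean, obs_mean = fmean(x), fmean(obs)
    squared = [((xi - x_mean) - (oi - obs_mean)) ** 2 for xi, oi in zip(x, obs)]
    return fmean(squared)


# ubMSE equivalent of the 0.04 m3 m-3 ubRMSE target accuracy.
UBRMSE_TARGET = 0.04
UBMSE_TARGET = UBRMSE_TARGET ** 2
```

Because the metric is a variance, the ubMSE values of independent timescale components can be summed, which is not true of their square roots.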

Decomposition of soil moisture time series
We wish to decompose each SM time series into separate components representing soil moisture dynamics at the SMshort, SMseas, and SMlong timescales. Variability in a time series at specific timescales can be isolated by fitting a function made up of the sum of sinusoidal functions. Formally, for some observed time series, y, the a_k and b_k coefficients in the decomposed form ŷ are fit for some selection of integers k_i:
\hat{y}_t = a_0 + \sum_{k = k_i} \left[ a_k \cos\left( \frac{2 \pi k t}{n} \right) + b_k \sin\left( \frac{2 \pi k t}{n} \right) \right], (2)
where t is the time step and n is the length of the time series. 2πk/n is the (angular) frequency for a sinusoid completing k cycles over n time steps (i.e., that has frequency k/n per time unit), and ŷ for k = k_i is referred to as the k_i-th harmonic. a_0 is the mean of y. If the time series is sampled at regular intervals and has no missing data, the sinusoids for individual harmonics are orthogonal and independent of each other. This is the basis for the discrete Fourier transform, which exactly fits Eq. (2) to y using the first n/2 harmonics (i.e., k_i = 1, 2, 3, ..., n/2). In this study, we use multiple linear least-squares regression to fit Eq. (2) to the soil moisture time series for a sum of harmonic frequencies selected to isolate the variability at each target timescale, as described below.
We define SMseas by fitting Eq. (2) to the soil moisture time series for some combination of the annual harmonic frequencies (i.e., for k/n an integer multiple of 1 yr⁻¹). The frequencies higher than 1 yr⁻¹ moderate the shape of ŷ to account for differences in the shape of the seasonal cycle from the single sinusoid described by the first harmonic. Typically, only a few annual harmonics are necessary to fit the seasonal cycle of geophysical variables (Scharlemann et al., 2008; Vinnikov et al., 2008). Here we define SMseas to be the sum of the first two harmonics, since fitting additional harmonics did not improve the ability to predict withheld data, following the method of Narapusetty et al. (2009). Note that since the same annual harmonics are repeated each year, we are restricting SMseas to represent only the mean seasonal cycle, and any inter-annual variability at seasonal timescales, such as anomalous vegetation growth in a given year, will be assigned to the subseasonal or inter-annual variability, depending on its temporal characteristics.
We define SMlong by fitting Eq. (2) to the soil moisture time series using the harmonic frequencies lower than 1 yr⁻¹ that divide into the number of years in the data record (i.e., for k/n = 1/m, 2/m, 3/m, ..., (m − 1)/m yr⁻¹, where m is the time series length in years). Finally, we define SMshort as the residual:
SM_{short} = SM - a_0 - SM_{seas} - SM_{long}. (3)
Note that, as defined here, SMlong, SMseas, and SMshort are all zero mean, since the time series mean was assigned to a_0 in Eq. (2). Both the AMSR-E and ARS observed time series are incomplete (Table 2). When applied to incomplete time series, the sinusoids fitted by Eq. (2) are not necessarily independent; hence, the fitted SMseas and SMlong may not be independent. We opted not to use gap-filling prior to fitting Eq. (2), to keep the method simple, and because gap-filling would directly affect the SMshort dynamics. In Sect. 3, before using the decomposed time series we check for signs of strong dependence between the fitted SMlong, SMseas, and SMshort, by testing whether the sum of the variances of the three timescale components differs from the variance of the original soil moisture time series. We assume that if there is little difference then any dependence between SMlong, SMseas, and SMshort has only a minimal impact on our results. Following initial investigation with this test, the number of observations used at each location is maximized by comparing only model (or assimilation) estimates to ARS in situ measurements, avoiding direct comparison of the incomplete ARS and AMSR-E time series (which would require cross-screening for the availability of both). Finally, we do not use the harmonic fit to interpolate missing data, and instead screen out the fitted SMlong and SMseas at times when the original soil moisture was not available. Also, at Reynolds Creek, where the time series has been truncated to remove frozen winters, the length of the year used to fit the harmonics was similarly truncated.
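The harmonic decomposition described in this section can be sketched as a single least-squares regression. This is an illustrative sketch under stated assumptions, not the authors' code: it fits the SMlong and SMseas harmonics jointly in one regression (the text does not say whether they were fitted jointly or separately), assumes 365 time steps per year, solves the normal equations with a small hand-written Gaussian elimination, and uses hypothetical function names throughout. Missing samples are handled by simply omitting them from `times` and `values`.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def basis_row(t, n, ks):
    """Regressors of Eq. (2) at time step t: 1, then cos and sin for each k."""
    row = [1.0]
    for k in ks:
        angle = 2.0 * math.pi * k * t / n
        row += [math.cos(angle), math.sin(angle)]
    return row


def decompose(times, values, n_years, steps_per_year=365):
    """Return (a0, sm_long, sm_seas, sm_short) evaluated at the available time steps."""
    n = n_years * steps_per_year
    long_ks = list(range(1, n_years))   # 1/m ... (m-1)/m cycles per year
    seas_ks = [n_years, 2 * n_years]    # first two annual harmonics
    ks = long_ks + seas_ks
    rows = [basis_row(t, n, ks) for t in times]
    p = len(rows[0])
    # Normal equations (A^T A) coef = A^T y of the linear least-squares fit.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    coef = solve_linear(ata, aty)

    split = 1 + 2 * len(long_ks)        # index where the seasonal regressors start
    sm_long = [sum(c * v for c, v in zip(coef[1:split], r[1:split])) for r in rows]
    sm_seas = [sum(c * v for c, v in zip(coef[split:], r[split:])) for r in rows]
    # Eq. (3): the subseasonal component is the residual.
    sm_short = [y - coef[0] - lg - se for y, lg, se in zip(values, sm_long, sm_seas)]
    return coef[0], sm_long, sm_seas, sm_short
```

Because the components are only evaluated at the supplied time steps, the fit is never used to interpolate across gaps, in line with the screening described above. For long records a library least-squares solver would be the more usual choice than the hand-written elimination used here.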
For demonstration purposes, in Sect. 3.3 we decompose each soil moisture time series into similarly defined timescale components using moving averages, since moving averages are often used for calculating anomaly correlations (Draper et al., 2012; Dorigo et al., 2015). The lengths of the averaging windows were chosen to give close agreement with the results of the harmonic decomposition described above. For the moving average decomposition, the inter-annual soil moisture time series, SM^MA_long, is defined as the 181-day moving average, and the seasonal cycle, SM^MA_seas, is defined for each day of the year by averaging the data from all years that fall within a 45-day window surrounding that day of year. As with the harmonic approach, the subseasonal time series, SM^MA_short, is calculated as the residual, analogous to Eq. (3). The same data processing and quality control as for the harmonic decomposition is used (also without gap filling), plus the moving averages are only calculated when at least 60 % of the data within the averaging window are available.
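The availability rule for the moving averages can be sketched as follows. This is a minimal illustration, assuming a centered window, a daily series in which missing values are stored as `None`, and an availability threshold measured against the full window length (so incomplete windows at the ends of the record count as partly missing). The function name `moving_average` is hypothetical.

```python
def moving_average(values, window, min_fraction=0.6):
    """Centered moving average that tolerates gaps.

    values       : list of floats, with None marking missing samples
    window       : window length in time steps (e.g., 181 for the inter-annual series)
    min_fraction : minimum fraction of the full window that must be available
    """
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        valid = [v for v in values[lo:hi] if v is not None]
        if len(valid) >= min_fraction * window:
            averaged.append(sum(valid) / len(valid))
        else:
            averaged.append(None)  # too few samples: leave the average undefined
    return averaged
```

For example, `moving_average(series, 181)` would give a candidate inter-annual series for a daily record held in a list named `series`.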

Results
Below, the original AMSR-E, Catchment, and ARS soil moisture time series are examined (Sect. 3.1), before being split into SMseas, SMlong, and SMshort (Sect. 3.2). The distribution of variance across the different timescales for each soil moisture estimate is then compared (Sect. 3.3), before the observations are rescaled (Sect. 3.4), and the benefit of assimilating the AMSR-E data into Catchment is assessed at each timescale (Sect. 3.5). Finally, the consequences of using a relatively short record to rescale the AMSR-E data are examined (Sect. 3.6).

The ARS, AMSR-E, and Catchment time series
Figure 1 shows the original time series at each site. In general, in situ, modeled, and remotely sensed soil moisture estimates have systematic differences in their behavior, due to representativity or structural differences between each estimate (Reichle et al., 2004). The most obvious differences are in the mean and standard deviation of each time series (see also Table 2). Both AMSR-E and Catchment are consistently biased high compared to the ARS soil moisture. Bias values for the model range from 0.01 m³ m⁻³ for Little Washita to 0.09 m³ m⁻³ for Little River, and bias values for the AMSR-E retrievals range from 0.07 m³ m⁻³ for Reynolds Creek to 0.21 m³ m⁻³ for Little River. Additionally, the standard deviation of AMSR-E is 2 to 3 times larger than that of the other two estimates. Figure 1 demonstrates that this is due to greater noise, and also a prominent seasonal cycle at Little Washita and Little River that is not evident in the other time series.
In addition to the systematic differences in their mean and standard deviation reported above, there are more subtle differences between the soil moisture dynamics described by each estimate. For example, for both the surface and root-zone soil moisture, the ARS time series tend to show a sharper response to individual rain events than does Catchment, with (relatively) larger peaks followed by more rapid dry down after each event. At Walnut Gulch this is particularly obvious, with ARS rapidly drying to a well-defined lower limit after each precipitation event, while Catchment has a lesser response to individual events, and a stronger seasonal signal.

Soil moisture time series at each timescale
Figure 2 shows an example of the timescale decomposition, for the Catchment surface soil moisture at Little River, for both the harmonic and moving average approaches. The time series described by each method are similar in terms of the magnitude and timing of their dynamics, except that the moving average inter-annual soil moisture includes more high-frequency variability than does the harmonic version. Evaluation of soil moisture at specific timescales should ideally be based on time series separated into independent timescale components. For the harmonic method, independence between the time series at each timescale is not guaranteed since the original time series were not complete, while for the moving average method, independence is not expected. Figure 3 shows an example of the variance bar plots used to check for signs of dependence between the time series at each timescale, in this case for the Catchment model and the AMSR-E observations. In Fig. 3a, for the harmonic method, the sum of the variances at each timescale (the stacked bars) is very close (within 2 %) to the total variance of the original soil moisture time series (the white circles), falling within the 95 % confidence interval of the total variance in each case. In contrast, for the moving average method in Fig. 3b the sum of the variances of each timescale falls outside the 95 % confidence interval for the total time series variance at three of four sites, with a mean difference of 8 % of the total variance (with differences ranging between 1 and 16 %), indicating strong dependence between the three components. In each case the sum of the variances of the timescale components is less than the total variance, indicating positively correlated features between the moving average timescale components (since Var(A + B) = Var(A) + Var(B) + 2 Cov(A, B)).
This positive correlation is intuitively expected, since an anomaly in the original soil moisture time series has the same direction of influence on both the moving averages and the residual from that moving average (e.g., in Fig. 2 note the signal of the large positive anomaly in early 2004 in both SM^MA_long and SM^MA_short). Finally, the distribution of variance across the timescales is similar for each method, largely because the moving average window lengths for SM^MA_seas and SM^MA_long were selected to generate time series closely matching those from the harmonic method.
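The dependence check used in this section compares the variance of the summed components with the sum of the component variances. The sketch below is illustrative only; the function name `variance_gap` is hypothetical, and population variances are assumed.

```python
from statistics import pvariance


def variance_gap(components):
    """Fraction of the total variance not accounted for by the sum of component variances.

    components: list of equal-length sequences (e.g., SMlong, SMseas, SMshort).
    The gap equals twice the sum of the pairwise covariances divided by the total
    variance, so it is zero for uncorrelated components and positive when the
    components are positively correlated.
    """
    total = [sum(step) for step in zip(*components)]
    total_variance = pvariance(total)
    summed_variance = sum(pvariance(c) for c in components)
    return (total_variance - summed_variance) / total_variance
```

A gap near zero corresponds to the harmonic case in Fig. 3a, while a clearly positive gap corresponds to the behavior reported for the moving average components in Fig. 3b.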

Variance distribution across timescales
In Fig. 3 the AMSR-E variance is much larger than that for Catchment (as was discussed in Sect. 3.1), making it difficult to compare the relative distribution of variance across each timescale. Figure 4a therefore shows the AMSR-E and Catchment variance bar plots with the total variance normalized to one, to allow direct comparison of the fraction of variance at each timescale. The same plots are also presented for the Catchment and ARS soil moisture in Fig. 4b (recall we do not directly compare the ARS and AMSR-E time series, so as to avoid cross-screening their availability).
In Fig. 4, the distribution of variance across timescales for each data set can be very different, and there is not a consistent pattern across the four sites. As was previously noted from Fig. 1, AMSR-E has a very prominent seasonal cycle at Little River and Little Washita (40-70 % of the total variance) that is not present for Catchment or ARS, for which the SMseas fraction of variance is around 10-20 % in Fig. 4. In contrast, at Reynolds Creek and Walnut Gulch, Catchment has a larger fraction of its variance in the seasonal cycle (55-70 %) than does AMSR-E (20-40 %), with ARS agreeing with Catchment at Reynolds Creek only. At Walnut Gulch the greater variance fraction in the Catchment SMseas is mostly balanced by less variability in SMshort (30 % compared to 60 % for ARS). This is associated with the differing responses to precipitation events already noted in Fig. 2.
One might expect AMSR-E to have a larger fraction of variance at SMshort, due to measurement noise from the remote sensor. However, this is only the case at Reynolds Creek, where AMSR-E has 50 % of its variance in SMshort, compared to 20-30 % for Catchment and ARS. At Walnut Gulch, the AMSR-E and ARS SMshort variance fractions are similar (50-60 %), while the fraction for Catchment is much lower (25 %). At Little Washita and Little River the variance fraction in the AMSR-E SMshort is similar to Catchment (at around 50 and 30 %, respectively) and both are much smaller than for ARS (around 70 %). At these two sites the AMSR-E SMshort variance fraction may well be less than expected due to the large amount of variance in its exaggerated seasonal cycle.
For the SMlong variance, the patterns at Little Washita and Little River are again similar to each other. Catchment has much more variance in SMlong (40-50 %) than ARS (20 %) or AMSR-E (10 % or less). At the other two sites, the SMlong variance fraction is similar for all data sets, except for the lower value for AMSR-E at Walnut Gulch (< 10 %, compared to around 20 % for ARS and Catchment).

Baseline observation rescaling
For the baseline experiment, the AMSR-E observations were rescaled using bulk CDF-matching parameters estimated over the full data record. By design, the CDF-matched AMSR-E observations, labeled Oc, have the same mean (not shown) and variance (Fig. 3a) as the Catchment soil moisture. Figure 4 shows that the CDF matching had little impact on the variance distributions across each timescale. This suggests that for the particular examples in this study, the CDF-matching operator could be approximated by a linear rescaling, in which only the mean and variance of the model are matched, as in Scipal et al. (2008). To confirm this, the assimilation experiments were repeated using linear rescaling of the AMSR-E observations in place of CDF matching. The results (not shown) were indeed very similar to the CDF-matching experiments, in terms of the rescaled observations and the assimilation output (for both the Oc rescaling presented in this section, and the Oy rescaling presented in Sect. 3.6).
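The linear rescaling mentioned above, in which only the mean and variance are matched, can be sketched as follows. This is a minimal illustration with a hypothetical function name, assuming complete sequences and population standard deviations.

```python
from statistics import fmean, pstdev


def linear_rescale(obs, model):
    """Rescale obs so that its mean and standard deviation match those of model."""
    obs_mean, model_mean = fmean(obs), fmean(model)
    scale = pstdev(model) / pstdev(obs)
    return [model_mean + scale * (value - obs_mean) for value in obs]
```

Unlike CDF matching, this transformation leaves the shape of the observation distribution unchanged. Because it applies one multiplicative factor to all anomalies, it cannot alter the fraction of variance at each timescale.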
Recall that the distribution of the variance across each timescale was quite different for the AMSR-E and Catchment soil moisture in Fig. 4. Note that large errors in the variance at one timescale (in either AMSR-E or Catchment) will affect the rescaling of the variance at other timescales. In particular, if the unrealistically large AMSR-E seasonal cycle at Little Washita were replaced with something more realistic, for example representing 8 % of the total variance (as in the ARS time series), then the fraction of variance in SMshort would increase from the current 48 to 75 %, increasing the SMshort variance in the CDF-matched AMSR-E from 0.0036 to 0.0054 (m³ m⁻³)².

Evaluation of the baseline assimilation experiment at each timescale
Figure 5 shows the ubMSE for each assimilation experiment, separated into each timescale. Prior to assimilation, the average ubMSE in the near-surface soil moisture across the four sites was 1.8 × 10⁻³ (m³ m⁻³)² (giving a ubRMSE just above the 0.04 m³ m⁻³ target). More than half (55 %) of the ubMSE is in SMshort, with the rest split between SMseas (26 %) and SMlong (19 %). The baseline assimilation experiment, labeled Ac, significantly reduced the total ubMSE at each site, reducing the average near-surface ubMSE across the four sites by 33 % to 1.2 × 10⁻³ (m³ m⁻³)², with average reductions in the near-surface layer of 52 % for SMlong, 25 % for SMseas, and 22 % for SMshort. Ac reduced the ubMSE at each site for all timescale components, except for SMseas at Little Washita (where the model ubMSE was already relatively small). Root-zone soil moisture observations were available for the study period only at Little River. Both the distribution of the ubMSE across each timescale, and the relative reductions achieved from assimilation, are similar for the near-surface and root-zone layers at Little River in Fig. 5d and e, adding confidence that the model improvements reported above for the near-surface soil moisture are indicative of the performance throughout the soil profile.
To illustrate the impact of the assimilation at each timescale, Fig. 6 compares the decomposed time series for the Catchment model and Ac assimilation experiments to those from the ARS in situ observations at Little River. The difference between the three SMshort time series is difficult to judge visually in Fig. 6d; however, the impact of the assimilation on the SMseas and SMlong time series is clear. Figure 6b suggests that the large SMlong ubMSE reduction (by over 80 %) from the assimilation is due to the reduced amplitude in the SMlong dynamics, although there is perhaps also an improvement in event timing. In Fig. 6c, the model seasonal cycle has an overestimated amplitude, and also includes two maxima per year, whereas the ARS seasonal cycle has only one. The assimilation exacerbates the overestimated amplitude, but also removes the second annual maximum, resulting in an overall SMseas ubMSE reduction (by 46 %).

Observation rescaling with a short data record
The 9-year time period used in the baseline experiment to estimate the CDF-matching parameters is longer than is often available for soil moisture assimilation experiments. Obviously, assimilating over a shorter time period will limit the potential improvements to the model SMlong (of similar magnitude to the SMshort improvement in this study). The potential benefit of an assimilation over a shorter period may also be limited by the increased sampling uncertainty in the estimated observation rescaling parameters. This increased uncertainty could arise from systematic errors due to inadequate sampling of SMseas and SMlong, or from increased random errors associated with the smaller sample size. This is tested here with nine additional experiments, labeled AyYY, in which the CDF-matching parameters are each based on a 12-month period starting on 1 October. For example, experiment Ay03 uses CDF-matching parameters based on the 12 months of data from 1 October 2003 to 30 September 2004. Each of the nine experiments assimilates the full 8- or 9-year record of AMSR-E near-surface soil moisture retrievals (including the data for the year from which the CDF-matching parameters were determined).
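The AyYY setup described above, in which rescaling parameters are estimated from one 12-month window and then applied to the whole record, can be sketched as follows. This is an illustrative quantile-mapping form of CDF matching under stated assumptions: the function names, the choice of 101 quantiles, and the synthetic daily series are all hypothetical and are not taken from the experiments reported here.

```python
import numpy as np

def fit_cdf_match(obs, model, n_quantiles=101):
    """Return matching quantiles of the observations and the model."""
    probs = np.linspace(0.0, 100.0, n_quantiles)
    return np.percentile(obs, probs), np.percentile(model, probs)

def apply_cdf_match(obs, obs_q, model_q):
    """Map observations onto the model climatology by quantile interpolation."""
    return np.interp(obs, obs_q, model_q)

def october_year_mask(dates, start_year):
    """Select the 12 months from 1 October of start_year."""
    start = np.datetime64(f"{start_year}-10-01")
    end = np.datetime64(f"{start_year + 1}-10-01")
    return (dates >= start) & (dates < end)

# Synthetic daily series standing in for the model and the retrievals.
rng = np.random.default_rng(0)
dates = np.arange(np.datetime64("2002-10-01"), np.datetime64("2011-10-01"))
model = 0.25 + 0.05 * rng.standard_normal(dates.size)
obs = 0.35 + 0.10 * rng.standard_normal(dates.size)

# Analogue of experiment Ay03: fit on 1 Oct 2003 to 30 Sep 2004 only,
# then rescale the full record with those single-year parameters.
train = october_year_mask(dates, 2003)
obs_q, model_q = fit_cdf_match(obs[train], model[train])
obs_rescaled = apply_cdf_match(obs, obs_q, model_q)
```

Note that `np.interp` holds values outside the training range at the end-point quantiles, so observations more extreme than anything in the training year are clipped in this sketch; other treatments of the tails are possible.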
The potential uncertainty introduced by using a single year to estimate the rescaling parameters depends on the inter-annual variability in the systematic differences between the observed and forecast soil moisture. The main systematic differences that are addressed by the CDF matching are the differences in the observed and forecast mean and standard deviation. For demonstrative purposes, Fig. 7 illustrates the difference between the means, and the ratio of the standard deviations, estimated using the full data record, and using each single year. Note, however, that in the presented Ay experiments the AMSR-E observations are rescaled based on the full CDF, and not just the mean and variance. In Fig. 7a there is considerable inter-annual scatter in the yearly mean differences, although by linearity the average is unbiased. The standard deviation ratio in Fig. 7b also shows inter-annual variability; however, the single year ratios are also biased low compared to the all-years ratio, since the single year estimates did not sample the SMlong variance (which was consistently a greater fraction of the total variance for Catchment than for AMSR-E in Fig. 4a). This is particularly marked at Little River, where the average of the single year standard deviation ratios was 30 % less than when estimated using all years (since SMlong makes up close to 50 % of the total variance in Catchment, compared to less than 5 % for AMSR-E in Fig. 4a).
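The two behaviors noted for Fig. 7 (yearly mean differences that average to the all-years value, and yearly standard deviation ratios that are biased low) can be reproduced with a small synthetic example. This is a sketch under simplifying assumptions: the series are invented, every year has the same number of samples, and the inter-annual signal is placed in the model only, which exaggerates the contrast seen in Fig. 4a.

```python
import numpy as np

rng = np.random.default_rng(1)
n_years, n_days = 9, 365

# The model carries an inter-annual (SMlong-like) offset; the observations do not.
year_offsets = 0.04 * rng.standard_normal(n_years)
model = 0.25 + year_offsets[:, None] + 0.03 * rng.standard_normal((n_years, n_days))
obs = 0.35 + 0.03 * rng.standard_normal((n_years, n_days))

# Parameters estimated from the full record (as in "All" in Fig. 7).
mean_diff_all = model.mean() - obs.mean()
std_ratio_all = model.std() / obs.std()

# Parameters estimated from each year separately (as in "YY" in Fig. 7).
mean_diff_yearly = model.mean(axis=1) - obs.mean(axis=1)
std_ratio_yearly = model.std(axis=1) / obs.std(axis=1)

print(mean_diff_all, mean_diff_yearly.mean())
print(std_ratio_all, std_ratio_yearly.mean())
```

The yearly mean differences scatter around the all-years value and average back to it, while each single-year standard deviation omits the between-year variance in the model, so the single-year ratios sit below the all-years ratio.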
Figure 5 includes the ubMSE for the nine Ay assimilation experiments, as well as the mean ubMSE (Ay) across all nine. On average, assimilating the AMSR-E observations that have been rescaled using parameters estimated from a single year is beneficial. As with the Ac experiment, the Ay ubMSE is consistently less than that of the model at each timescale, except for SMseas at Little Washita. However, for individual realizations there is an increased risk when using the single year parameters that the assimilation will not significantly improve the model, or will even significantly degrade the model. For example, at Little Washita, where the Ac experiment reduced the ubMSE by a small but significant amount, none of the Ay experiments significantly decreased the ubMSE, and the Ay10 experiment significantly increased it.
Additionally, comparing the Ay experiments in Fig. 5 to the baseline Ac experiment shows that at Reynolds Creek, Walnut Gulch, and Little Washita most of the Ay experiments resulted in larger total ubMSE than for the Ac experiment, while at Little River the opposite occurred. Overall there were eight Ay experiments for which the total ubMSE was significantly different (at the 5 % level) and higher than for the Ac experiment, seven for which it was significantly different and lower, and 20 where the ubMSE was not significantly changed. The differences between the Ac and Ay ubMSE are skewed, in that when the Ay ubMSE is higher, the difference tends to be greater than when it is lower. Consequently, the average reduction in the model ubMSE for the near-surface soil moisture, compared to the model with no assimilation, is slightly less for Ay (30 %) than for Ac (33 %).
Each instance of relatively poor ubMSE for an Ay experiment can be traced to the more extreme (i.e., unrepresentative) single year systematic differences in Fig. 7. Going through the experiments with the largest relative increase in ubMSE, experiment Ay07 at Reynolds Creek, and experiments Ay05, Ay06, and Ay07 at Walnut Gulch all have extreme standard deviation ratios, while Ay06 at Reynolds Creek and Ay10 at Little Washita have extreme mean differences. In each case, most of the increase in the ubMSE is due to increased errors in the SMseas and SMlong components, suggesting that the SMshort corrections are more robust to uncertainty in the scaling parameters. Note that unrepresentative scaling parameters do not necessarily degrade the assimilation output, and in some instances are even advantageous. Most obviously, at Little River, where the single year standard deviation ratios were biased low (by 30 %), the Ay assimilation experiments all produced slightly lower ubMSE than the Ac experiment.
In general, the impact of errors in the rescaling of the mean value is likely under-reported here, since any introduction of biases into the model will not be directly detected by the ubMSE. Despite this, the examples cited above in which unrepresentative mean difference corrections degraded the bias-robust ubMSE highlight the potential for a bias-blind assimilation of biased observations to degrade model soil moisture dynamics.

Conclusions
Many studies have demonstrated that near-surface soil moisture assimilation can improve modeled soil moisture, in terms of the anomaly time series used to represent random errors, often implicitly assumed to represent subseasonal scale variability associated with individual precipitation events (Reichle et al., 2007; Scipal et al., 2008; Draper et al., 2012). Here, 9 years of LPRM AMSR-E observations were assimilated into the Catchment model, and the resulting model output evaluated separately at the subseasonal (SMshort), seasonal (SMseas), and inter-annual (SMlong) timescales against watershed-scale in situ observations at four ARS sites in the US. The results show that, in addition to reducing the near-surface SMshort ubMSE averaged across the four sites, the assimilation also reduced the near-surface SMlong ubMSE. The magnitudes of the reductions in SMshort and SMlong were similar (2.1 × 10⁻⁴ (m³ m⁻³)² and 2.5 × 10⁻⁴ (m³ m⁻³)², respectively), although this represented a much larger relative reduction in the SMlong ubMSE (52 % of the model SMlong ubMSE, compared to 22 % for the SMshort ubMSE). In situ observations of the root-zone layer were available for only one site; however, the similarity between the near-surface and root-zone results at this site (Fig. 5) is encouraging in terms of our near-surface results being representative of the deeper soil moisture profile.
The reduced SMlong ubMSE suggests that assimilating a sufficiently long data record of near-surface soil moisture observations can improve the model soil moisture dynamics at inter-annual timescales, enhancing the model's ability to simulate important events such as droughts. There is then a clear potential for reanalyses, or other long-term simulations, to benefit from the assimilation of long-term remotely sensed soil moisture records. Such long records are available from the AMSR-E instrument used here (May 2002 to October 2011), and increasingly from the active microwave ASCAT series (ongoing from October 2006). Carefully merged multi-satellite records, such as the 30-year record being produced by the Water Cycle Multi-mission Observation Strategy (WACMOS) project (Su et al., 2014; Liu et al., 2012), are also now providing data records of unprecedented length. As with SMshort, an important caveat on this finding is that it is possible that the reduced SMlong ubMSE was associated with reduced representativity differences compared to the in situ observations, rather than a true model improvement. For example, at Little River in Fig. 5 the substantial improvements to the SMlong near-surface and root-zone soil moisture gained by assimilating the AMSR-E observations were largely due to reduced SMlong variance. If the model's exaggerated SMlong was a representativity or structural error (e.g., too strong a signal of the underlying water table), then it is not clear that the model would benefit from correcting this error, in terms of improvements to forecast skill.
Assimilating the AMSR-E observations also reduced the near-surface SMseas ubMSE by 25 %, averaged across the four sites, suggesting the possibility that the assimilation was beneficial to the modeled mean seasonal cycle, despite not being designed to address systematic errors. However, even more so than for SMlong, the reduced SMseas ubMSE could be due to reduced representativity differences, rather than a genuine improvement to the model's ability to represent the desired physical processes. To confirm that the SMlong and SMseas ubMSE reductions do indicate improved model soil moisture would require evaluating the dependent moisture and energy flux forecasts, and unfortunately verifying observations are not available at the study locations.
In comparing the AMSR-E and Catchment soil moisture at each timescale in this study, it became apparent that the distribution of variance across each timescale was very different between the remotely sensed and modeled soil moisture time series (Fig. 4). Traditionally, observation rescaling strategies used in land data assimilation do not distinguish between variability at different timescales, and apply a single set of bulk rescaling parameters to the full time series. Consequently, the large discrepancies in the variance at one timescale (due to errors in one or both estimates) can have follow-on effects for the rescaling of other timescales. For example, the unrealistically large AMSR-E seasonal cycle at Little Washita caused the variability at SMlong and SMshort to be overly dampened by the bulk rescaling. This could perhaps be avoided by rescaling the observations separately at each timescale using the decomposed time series produced in this study, or using other methods that distinguish scaling characteristics at different timescales (e.g., Su and Ryu, 2015).
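The follow-on effect of bulk rescaling can be illustrated with a synthetic example in which the observations have an exaggerated seasonal cycle but the same subseasonal variability as the model. This is a sketch only: the amplitudes are invented, the components are assumed to be known exactly rather than estimated by a decomposition, and a simple standard deviation ratio stands in for full CDF matching.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(9 * 365)
seasonal = np.sin(2 * np.pi * t / 365.0)

# Anomaly components (zero mean): same short-term variability in both series,
# but a much larger seasonal cycle in the observations.
model_seas, model_short = 0.02 * seasonal, 0.03 * rng.standard_normal(t.size)
obs_seas, obs_short = 0.08 * seasonal, 0.03 * rng.standard_normal(t.size)
model = model_seas + model_short
obs = obs_seas + obs_short

# Bulk rescaling: one factor for the whole series.
bulk_factor = model.std() / obs.std()
obs_short_bulk = bulk_factor * obs_short  # short-term part after bulk rescaling

# Timescale-specific rescaling: one factor per component.
seas_factor = model_seas.std() / obs_seas.std()
short_factor = model_short.std() / obs_short.std()
obs_by_scale = seas_factor * obs_seas + short_factor * obs_short

print(obs_short_bulk.var(), model_short.var(), (short_factor * obs_short).var())
```

In this example the single bulk factor is well below one because of the oversized seasonal cycle, so the rescaled subseasonal variance falls far short of the model's, whereas the per-component factors recover it.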
In addition to observation bias removal strategies that respect the timescale-dependent nature of observation-forecast systematic differences, it may be advantageous to target only certain timescales, for example by retaining the model seasonal cycle while rescaling other timescales (e.g., Drusch et al., 2005; Bolten et al., 2010). Ultimately, whether these approaches will be beneficial will depend on whether the model-observation differences at each timescale are caused by model or observation errors. This study is a first effort to investigate soil moisture assimilation at specific timescales associated with different soil moisture physical processes. Looking forward, further evaluation of soil moisture at these timescales will help to identify the physical processes responsible for errors in modeled and remotely sensed soil moisture (including representativity errors in the latter), which will in turn help to refine observation bias removal strategies.
Finally, we have updated the investigation of Reichle and Koster (2004) into the use of short data records for estimating observation rescaling (CDF-matching) parameters. Nine additional assimilation experiments were performed, each with the AMSR-E observations rescaled using parameters estimated from a single year of data. Compared to the scaling parameters estimated using the full data record, using only 1 year of data introduced sampling errors into the parameters, due to inter-annual variability in SMseas and SMshort and to the unsampled SMlong variability.
For hindcasting/reanalysis applications, when the same short time period is used for bias parameter estimation and data assimilation, such unrepresentative parameters should not be problematic, since the rescaled observations will still be unbiased relative to the model over the length of the assimilation experiment, allowing shorter timescale errors to be corrected. However, in a forecasting/analysis application in which the bias correction parameters must be estimated with the available (short) data record, and then applied to future observations, unrepresentative parameters can be more problematic. Our results suggest that, when necessary, for example early in the SMAP mission, assimilating near-surface soil moisture over an extended period using single year parameters will introduce some additional uncertainty into the assimilation output; however, over a large domain the overall impact will be minor. Of the total of 35 individual assimilation realizations that we performed with single year parameters at the four locations, nine resulted in no significant change in the near-surface ubMSE compared to the open loop, and one resulted in significantly increased ubMSE (recall that the baseline assimilation significantly reduced the ubMSE at all four sites). However, averaged across all realizations, which should translate to an average across a large spatial domain, the net impact of the single year parameters was small, and the benefit gained from the assimilation was not practically reduced, compared to the baseline assimilation experiment.

Figure 1.
Figure 1 shows the original time series at each site. In general, soil moisture from in situ, modeled, and remotely sensed estimates have systematic differences in their behavior, due to representativity or structural differences between each estimate (Reichle et al., 2004). The most obvious difference in Fig. 1 is that the mean and variance of each estimate differ

Figure 2. Decomposition of the Catchment near-surface soil moisture time series at Little River, using the harmonic (HA; black) and moving average (MA; cyan) methods, for (a) the original time series (red dots) and the sum of SMlong + SMseas + the long-term mean soil moisture (solid lines), and the individual components (b) SMlong, (c) SMseas, and (d) SMshort.

Figure 3.

Figure 4. Fraction of variance at each timescale, obtained by normalizing the time series variance before decomposition. The Catchment model (M), original AMSR-E observed (O), and the CDF-matched AMSR-E-observed (Oc) soil moisture time series, cross-screened for AMSR-E availability, are plotted in (a), and the ARS in situ observations (I), Catchment model (M), and baseline assimilation (Ac) soil moisture time series, cross-screened for ARS availability, are plotted in (b). The circles give the variance of the original (normalized) soil moisture time series.

Figure 5. Error variances (ubMSE) compared to ARS in situ observations at each timescale, for the near-surface soil moisture at (a) Reynolds Creek, (b) Walnut Gulch, (c) Little Washita, (d) Little River, and (e) for the root-zone soil moisture at Little River. Bars show the Catchment model open loop (M), baseline assimilation (Ac), individual Ay assimilation experiments, and the mean across the Ay experiments (Ay). Label AyYY indicates the Ay experiment with bias correction parameters estimated from the 12 months from 1 October of 20YY. Circles and error bars give the ubMSE and its 95 % confidence interval for the original soil moisture time series (some very small confidence intervals are obscured by the plotted circles). The dashed line at ubMSE of 1.6 × 10⁻³ (m³ m⁻³)² is equivalent to the common ubRMSE target of 0.04 m³ m⁻³.

Figure 6. Harmonic decomposition of the ARS in situ (I), Catchment model (M), and baseline assimilation output (Ac) near-surface soil moisture time series at Little River, showing the (a) original time series, (b) SMlong, (c) SMseas, and (d) SMshort.

Figure 7. Systematic differences between AMSR-E observations and Catchment model near-surface soil moisture, with (a) the mean difference (model − observation), and (b) the ratio of the standard deviations (σ(model)/σ(observations)). The parameters are estimated using all years (All), and each year separately (with label YY indicating the parameters estimated from the 12 months from 1 October of 20YY), and the dashed lines give the mean of the YY parameters.

Table 1. Location, time period, and number of in situ network sites at each watershed.

Table 2. Descriptive statistics for the data sets at each watershed.