Interactive comment on “Investigating temporal field sampling strategies for site-specific calibration of three soil moisture–neutron intensity parameterisation methods” by J. Iwema

The Cosmic-Ray Neutron Sensor (CRNS) can provide soil moisture information at scales relevant to hydrometeorological modelling applications. Site-specific calibration is needed to translate CRNS neutron intensities into sensor footprint average soil moisture contents. We investigated temporal sampling strategies for calibration of three CRNS parameterisations (modified N₀, HMF, and COSMIC) by assessing the effects of the number of sampling days and soil wetness conditions on the performance of the calibration results, using actual neutron intensity measurements, for three sites with distinct climate and land use: a semi-arid site, a temperate grassland, and a temperate forest. When calibrated with 1 year of data, both COSMIC and the modified N₀ method performed better than HMF. The performance of COSMIC was remarkably good at the semi-arid site in the USA, while N₀mod performed best at the two temperate sites in Germany. The successful performance of COSMIC at all three sites can be attributed to the benefits of explicitly resolving individual soil layers (which is not accounted for in the other two parameterisations). To better calibrate these parameterisations, we recommend that in situ soil samples be collected on more than a single day. However, little improvement is observed for sampling on more than 6 days. At the semi-arid site, the N₀mod method was calibrated better under site-specific average wetness conditions, whereas HMF and COSMIC were calibrated better under drier conditions. Average soil wetness conditions gave better calibration results at the two humid sites. The calibration results for the HMF method were better when calibrated with combinations of days with similar soil wetness conditions, as opposed to N₀mod and COSMIC, which profited from using days with distinct wetness conditions. Errors in actual neutron intensities were translated to average soil moisture errors specific to each site. At the semi-arid site, these errors were below the typical measurement uncertainties from in situ point-scale sensors and satellite remote sensing products. At the two humid sites, however, the reduction in uncertainty with increasing sampling days only reached typical errors associated with satellite remote sensing products. The outcomes of this study can be used by researchers as a CRNS calibration strategy guideline.

A recent technology that may help fill this scale gap is the Cosmic-Ray Neutron Sensor (CRNS) (Zreda et al., 2008, 2012). The CRNS detects fast neutrons, which are produced from high-energy neutrons of cosmic origin and are further attenuated as they travel through the soil (Hess et al., 1961; Zreda et al., 2008). Because of the high attenuation power of hydrogen for these cosmic-ray neutrons, fast neutron intensity decreases with increasing hydrogen content within the sensor footprint (Zreda et al., 2008). Through this inverse relationship with hydrogen content, fast neutron intensity is non-linearly related to soil moisture content (Zreda et al., 2008). The sensor footprint has a horizontal effective area of about 600 m diameter at sea level for dry air but changes slightly with elevation and atmospheric moisture content (Desilets and Zreda, 2013). The measurement depth varies between about 12 cm (wet conditions) and 76 cm (dry conditions) (Zreda et al., 2008).
Site-specific neutron intensity-soil moisture relationships should be determined to derive soil moisture values, i.e. the CRNS needs site-specific calibration. The fully empirical N₀ formula (Desilets et al., 2010) is usually deployed for this calibration (Zreda et al., 2012). However, soil moisture content is not the only factor that affects the fast neutron intensity (Franz et al., 2013c). All other hydrogen pools (e.g. biomass, snow) also affect the signal, which complicates finding a unique relationship between neutron intensity and soil moisture content for a variety of sites and conditions (Zreda et al., 2012). Therefore a universal calibration function, the hydrogen molar fraction method (HMF), was developed, which assumes a relationship between hydrogen prevalence and neutron intensity (Franz et al., 2013b). While N₀ and HMF both calculate an integrated, depth-weighted profile average soil moisture content, the COsmic-ray Soil Moisture Interaction Code (COSMIC) computes neutron intensities from soil moisture profiles (Shuttleworth et al., 2013) and can be directly applied in the context of hydrometeorological data assimilation (Rosolem et al., 2014).
Typically only a single parameter (N₀) of the N₀ method needs to be calibrated with a single point from average soil moisture, representative of the CRNS footprint (Desilets et al., 2010; Zreda et al., 2012). A similar approach is typically used for HMF, although estimates of additional hydrogen pools are also needed (Franz et al., 2013b). Originally, COSMIC was site-calibrated against the neutron particle transport model Monte Carlo N-Particle eXtended (MCNPX; Pelowitz, 2005), using 22 hypothetical profiles covering a range of possible soil moisture profiles, but weighted towards the more probable profiles at each considered site (Shuttleworth et al., 2013).
Although a number of investigations have used single calibration points from measured soil moisture profiles for each of the three methods (Rivera Villarreyes et al., 2011; Zreda et al., 2012; Franz et al., 2013c; Bogena et al., 2013; Baatz et al., 2014), to our knowledge there has been no previous study on whether this is feasible for each of these three methods at distinct sites. The fact that hydrogen pools (e.g. biomass, litter layer water) vary differently over time than soil moisture content profiles, and that not all these hydrogen pools can always be monitored completely and accurately (Bogena et al., 2013; Rivera Villarreyes et al., 2011), could be a complicating factor. Therefore we posed the following research questions:
- What are the benefits and limitations of the three different soil moisture-neutron intensity parameterisation methods (N₀, HMF, and COSMIC) across sites with distinct climates and land cover types?
- How often should soil moisture profiles be sampled in order to reliably calibrate the three soil moisture-neutron intensity parameterisation methods?
- Under what type of wetness conditions or combinations of wetness conditions should soil moisture profiles be sampled in order to reliably calibrate the three soil moisture-neutron intensity parameterisation methods?
In order to answer these questions, we calibrated the three parameterisation methods for three sites with distinct types of land cover and climate: a semi-arid, sparsely vegetated site, a humid grassland, and a humid spruce forest. We used data from 2012 to evaluate whether different days or different combinations of days would lead to different calibration results and to investigate our hypothesis that a single calibration day is not always sufficient. We used depth-weighted average soil moisture content to see whether wetness conditions affect calibration results, and whether combinations of days with different wetness conditions would yield different calibration results.

Methodology
We calibrated the three methods with different numbers of sampling days: 1, 2, 4, 6, 10, and 16. The 1 day sampling strategy (1DAY) corresponds to using a single, randomly selected day within the time series to calibrate each method; the 2 days strategy (2DAY) is based on a pair of days randomly selected from the time series, and so on. We compared these temporal sampling strategies (see Table 1) [...]
[Table 1 residue: 16DAY strategy, any combination of 16 days, 3.0 × 10²⁷ possible combinations.]
[...] combined use of CRNS and sensor networks has been essential for understanding and improving this technology (Franz et al., 2013b; Rosolem et al., 2014; Bogena et al., 2013; Baatz et al., 2014). Despite having slightly different uncertainties compared to gravimetric/volumetric soil samples obtained in the field, in situ sensors have already been used successfully in previous studies for comparison of CRNS and smaller-scale soil moisture sensors (Franz et al., 2012; Bogena et al., 2013; Baatz et al., 2014).

Santa Rita Creosote (SR)
Santa Rita Creosote (Table 2 and Fig. 1), hereinafter referred to as SR, is a semi-arid site in Arizona, USA (Scott et al., 1990), which is sparsely vegetated (∼ 24 % of surface area) with creosote bush (∼ 14 % of surface area) and other species of bushes, grasses, and cacti (Cavanaugh et al., 2011). Daytime temperatures above 35 °C in summer and above 15 °C in winter are common, and precipitation falls mostly in summer and winter (Scott et al., 1990; Franz et al., 2012). The soil texture can be characterised as sandy loam with 5 to 15 % gravel (Cavanaugh et al., 2011). At SR, 18 paired in situ sensor profiles, with sensors (ACC-SEN-TDT, Acclima Inc., Meridian, ID, USA) at 5, 10, 20, 30, 50, and 70 cm depth, were installed with the spatial distribution described by Franz et al. (2012), all with equal horizontal weights (less than 1 % of missing data). We computed a simple mean horizontal soil moisture content for each sensor layer on every day.
While the sensor profiles at SR were positioned such that all had equal weights, this was not the case at RB, where we calculated horizontal average daily soil moisture contents by assigning weights to the sensor profiles representing their distance to the CRNS, as described in Bogena et al. (2013).

Wüstebach (WB)
Wüstebach (Table 2 and Fig. 1), hereinafter referred to as WB, is a humid Norway spruce (90 % of surface area) forest test site in Germany, with little undergrowth (Etmann, 2009; Baatz et al., 2014). The seasonality in precipitation is small, with on average 550 mm in winter and 650 mm in summer (DWD, 2014). Average temperatures are 4.5 [...] (Zreda et al., 2012), days with snow cover were omitted for both German sites, RB and WB (Baatz et al., 2014), while at SR no snow cover was recorded.

CRNS and in situ soil moisture data preprocessing
The same CRNS model (CRS-1000, Hydroinnova LLC, Albuquerque, NM, USA) was used at all sites. We corrected the CRNS observed neutron intensities at each site for variation in high-energy neutron intensity, atmospheric pressure, and atmospheric water vapour content (Rosolem et al., 2013), following the suggestions of Zreda et al. (2012) and Baatz et al. (2014). To simulate a single-day soil sampling campaign, we used daily average soil moisture contents from each in situ soil moisture sensor layer and daily average neutron intensities (Fig. 2).

Modified N₀ method
The N₀ method was originally developed by Desilets et al. (2010), using MCNPX. Bogena et al. (2013) introduced some changes to the N₀ method by taking into consideration dry soil bulk density to calculate the volumetric water content, and adding lattice water and soil organic matter water equivalent (Eq. 1): [...] where the parameter values are a₀ = 0.0808 (cm³ g⁻¹), a₁ = 0.372 (−), a₂ = 0.115 (cm³ g⁻¹), and N₀ (cph) is a site-dependent normalisation parameter. Parameters lw and w_SOM are the CRNS-footprint average volumetric lattice water content and soil organic matter equivalent water content (cm³ cm⁻³) respectively, and ρ_s (g cm⁻³) is the dry soil bulk density, usually determined from soil samples. N_pih is the corrected fast neutron intensity and θ is the CRNS footprint average volumetric soil moisture content (cm³ water cm⁻³ soil).
In order to better compare the results with the HMF and COSMIC methods, we have rearranged terms in the N₀ formulation, so that neutron intensities are calculated based on given soil moisture. However, our preliminary results indicated that the N₀ method failed to reproduce the soil moisture measurements at the sites (results not shown). The likely reason was the fixed coefficients defined in the equation, as was also found by Rivera Villarreyes et al. (2011). We therefore modified Eq. (1), giving Eq. (2).
This equation contains parameters b₀ (cph cm³ g⁻¹), b₁ (cph), and b₂ (cm³ g⁻¹), which all need site-specific calibration. We hereinafter refer to this equation as the modified N₀ method (N₀mod). We calculated depth-weighted profile average soil moisture contents with the methods proposed by Bogena et al. (2013) (Eqs. 3 and 4): [...] y = −5.8 ln(0.14) [...] where z represents the measurement depth (cm) and H_p represents the total below-ground hydrogen pool in the respective soil layer in g H₂O cm⁻³. For a more detailed description we refer to Bogena et al. (2013).
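Equations (1) and (2) are not reproduced in this text, so the following is only a minimal sketch under stated assumptions. It assumes the standard shape function of Desilets et al. (2010) with bulk density, lattice water, and soil organic matter water equivalent entering as the text describes, and a modified form inferred from the stated units of b₀, b₁, and b₂ (so that b₀ = N₀·a₀, b₁ = N₀·a₁, b₂ = a₂ recovers the original method). The function names and the example values are hypothetical.

```python
import math

# Fixed shape coefficients quoted in the text (Desilets et al., 2010).
A0, A1, A2 = 0.0808, 0.372, 0.115


def n0_theta(n_pih, n0, rho_s, lw=0.0, w_som=0.0):
    """Volumetric soil moisture from corrected neutron intensity.

    ASSUMED form of Eq. (1): theta = rho_s * (a0 / (N/N0 - a1) - a2) - lw - w_SOM.
    """
    return rho_s * (A0 / (n_pih / n0 - A1) - A2) - lw - w_som


def n0_neutrons(theta, n0, rho_s, lw=0.0, w_som=0.0):
    """The same relation rearranged so neutron intensity follows from soil moisture."""
    return n0 * (A0 * rho_s / (theta + lw + w_som + A2 * rho_s) + A1)


def n0mod_neutrons(theta, b0, b1, b2, rho_s, lw=0.0, w_som=0.0):
    """ASSUMED form of Eq. (2): all three coefficients are free parameters."""
    return b0 * rho_s / (theta + lw + w_som + b2 * rho_s) + b1


# Illustrative values only; they are not taken from the study sites.
n0, rho_s, lw, w_som = 1500.0, 1.4, 0.02, 0.01
n = n0_neutrons(0.20, n0, rho_s, lw, w_som)
theta_back = n0_theta(n, n0, rho_s, lw, w_som)
# With b0 = N0*a0, b1 = N0*a1, b2 = a2 the modified form equals the original.
n_mod = n0mod_neutrons(0.20, n0 * A0, n0 * A1, A2, rho_s, lw, w_som)
```

Freeing b₀, b₁, and b₂ lets the curve change shape as well as scale, which is the motivation the text gives for the modification.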

Hydrogen Molar Fraction (HMF) method
The HMF method was first developed to avoid site-specific calibration of the CRNS where soil sampling is difficult and also to facilitate the application of mobile cosmic-ray soil moisture sensors (i.e. rover applications) (Franz et al., 2013b). In such cases soil moisture could be calculated provided neutron intensity and other hydrogen sources are known. However, for sites for which reliable soil moisture samples can be obtained, the HMF method can also be used for site-specific calibration of the CRNS. In the HMF method, the fast neutron intensity is calculated with Eq. (5): [...] where the values of the coefficients were revised according to McJannet et al. (2014). hmf = Σ(H)/Σ(E_all) is the total hydrogen molar fraction (mol H/total mol). Σ(H) is the sum of all hydrogen (mol), including hydrogen in above-ground biomass, lattice water hydrogen, hydrogen in and bound to soil organic matter, and soil water hydrogen; and Σ(E_all) (mol) is the sum of all elements: atmospheric N and O, soil solids (quartz), lattice water, soil organic matter water equivalent, soil water, above-ground biomass (cellulose), and above-ground biomass water. N_s (cph) is a normalisation parameter which needs to be site-calibrated.
We employed HMF following the approach recommended by Franz et al. (2013b), and calculated average profile soil moisture contents with the same depth-weighting method used for the N₀mod method. We neglected root biomass and litter layers. To calculate total amounts of chemical elements, we used a horizontal footprint radius of 335 m for all three sites (Franz et al., 2013b). We calculated measurement depths with the method from Bogena et al. (2013).

COSMIC
COSMIC was developed as a data assimilation forward operator, and is a simpler, computationally less expensive fast neutron transport model than MCNPX (Shuttleworth et al., 2013; Rosolem et al., 2014). COSMIC considers three processes: (1) exponential decay of high-energy neutron intensity with depth, (2) creation of fast neutrons as a consequence of collisions with soil and water particles, and (3) exponential decay of fast neutrons while they travel upward from the place where they were created. COSMIC can be written as follows (Eq. 6): [...] where β (−), L₁ = 162.0 (g cm⁻²), L₂ = 129.1 (g cm⁻²), and L₄ = 3.16 (g cm⁻²) are universal parameter values, and L₃ (g cm⁻²), N (cph), and α (−) are site-dependent parameters. The parameters m_s and m_w are the integrated mass per unit area (g cm⁻²) of dry soil and water respectively, and ρ_s and ρ_w are the dry soil bulk density and soil water density (g cm⁻³). In the original model, the soil water included soil moisture and lattice water (Shuttleworth et al., 2013), while Baatz et al. (2014) added soil organic matter water equivalent to this. We used an empirical relation with a high correlation (r² = 0.995 (−)) between parameter L₃ and soil bulk density (ρ_s) (see Fig. 3) to derive values for L₃ at the three sites. Hence, we calibrated only parameters N and α in this study.
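Equation (6) is not reproduced in this text. The sketch below illustrates the three processes listed above with a layered numerical integration, assuming the integral form commonly given for COSMIC (Shuttleworth et al., 2013): high-energy attenuation with L₁ and L₂, fast neutron creation proportional to αρ_s + ρ_w, and an angle-averaged attenuation on the way up with L₃ and L₄. The midpoint discretisation, the function name `cosmic_neutrons`, and all example parameter values are assumptions made for illustration only.

```python
import math

# Universal attenuation lengths quoted in the text (g cm^-2).
L1, L2, L4 = 162.0, 129.1, 3.16


def cosmic_neutrons(theta_layers, dz, rho_s, n_scale, alpha, l3,
                    rho_w=1.0, n_angles=90):
    """Fast neutron intensity (cph) for a layered soil moisture profile.

    theta_layers: volumetric water content per layer, top to bottom
    dz: layer thickness (cm); rho_s: dry bulk density (g cm^-3)
    n_scale, alpha, l3: site-dependent parameters N, alpha, and L3
    """
    d_phi = (math.pi / 2.0) / n_angles
    m_s = m_w = 0.0  # integrated mass of dry soil and water above the layer
    total = 0.0
    for theta in theta_layers:
        # Integrated masses at the layer midpoint (g cm^-2).
        ms_mid = m_s + 0.5 * rho_s * dz
        mw_mid = m_w + 0.5 * theta * rho_w * dz
        # (1) exponential decay of high-energy neutrons with depth
        high_energy = math.exp(-(ms_mid / L1 + mw_mid / L2))
        # (2) creation of fast neutrons by soil and water
        created = alpha * rho_s + theta * rho_w
        # (3) decay of fast neutrons travelling upward, averaged over angle
        path = ms_mid / l3 + mw_mid / L4
        escape = sum(
            math.exp(-path / math.cos((k + 0.5) * d_phi)) * d_phi
            for k in range(n_angles)
        ) * 2.0 / math.pi
        total += high_energy * created * escape * dz
        m_s += rho_s * dz
        m_w += theta * rho_w * dz
    return n_scale * total


# Homogeneous 3 m profiles with hypothetical parameter values.
params = dict(dz=1.0, rho_s=1.4, n_scale=400.0, alpha=0.25, l3=110.0)
n_dry = cosmic_neutrons([0.05] * 300, **params)
n_wet = cosmic_neutrons([0.35] * 300, **params)
```

Because each layer enters separately, a non-uniform profile can be passed in directly, which is the layer-resolving property the text contrasts with the depth-weighted averages used by the other two methods.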

Calibration methodology
To investigate our first research question ("What are the benefits and limitations of the three parameterisations across distinct climates and land cover types?"), we introduced a reference strategy: for each site, we calibrated each parameterisation using all available days of the year 2012. This yielded one best solution for each site/parameterisation combination, against which we compared the results from the six other sampling strategies (Table 1). For the 1DAY strategy, we calibrated the parameterisations for each site and for each day of the year, resulting in as many calibration solutions as there were days with data. While we could calibrate the parameterisations for the 2DAY temporal strategy for all possible combinations of different days (65 000, 47 895, and 39 060 for SR, RB, and WB, respectively), for the higher-order strategies this would have resulted in an impractical number of combinations and consequently been computationally very expensive (Table 1). Therefore, we drew random samples of day combinations, equal in size to the total number of combinations of the 2DAY strategy, from the populations of possible combinations. To investigate whether the chosen sample sizes were sufficiently large, we drew for each parameterisation and each site, for the 4DAY and 16DAY strategies, four extra random samples of the same size. Additionally, we drew samples with different numbers of day combinations (500, 5000, 50 000, 200 000, 1 000 000) for each parameterisation at each site. The results of both tests (not shown) indicated that using sample sizes of 65 000, 47 895, and 39 060 for SR, RB, and WB respectively was sufficient for our analyses.
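The growth in the number of day combinations, and the random subsampling used for the higher-order strategies, can be sketched as follows. The day counts of 310 (RB) and 280 (WB) are not stated in the text; they are inferred here from the reported 2DAY pair counts, since C(310, 2) = 47 895 and C(280, 2) = 39 060. Drawing combinations without duplicates and the function names are assumptions of this sketch.

```python
import math
import random


def n_combinations(n_days, k):
    """Number of distinct ways to choose k sampling days out of n_days."""
    return math.comb(n_days, k)


def sample_day_combinations(n_days, k, n_samples, seed=0):
    """Draw n_samples distinct k-day combinations (day indices 0..n_days-1)."""
    rng = random.Random(seed)
    combos = set()
    while len(combos) < n_samples:
        # rng.sample returns k distinct days; sorting makes the order irrelevant.
        combos.add(tuple(sorted(rng.sample(range(n_days), k))))
    return sorted(combos)


pairs_rb = n_combinations(310, 2)    # inferred number of days with data at RB
pairs_wb = n_combinations(280, 2)    # inferred number of days with data at WB
four_day_total = n_combinations(310, 4)
four_day_sample = sample_day_combinations(310, 4, 1000)
```

Even for 4 days the full population is several orders of magnitude larger than the 2DAY population, which is why only a random subset is evaluated.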
To determine parameter calibration ranges for the N₀mod method, we first applied relatively wide ranges (b₀: 25-1000, b₁: 10-3000, and b₂: 0.01-1.0) based on the original values of parameters a₀, a₁, and a₂ and values of N₀ from the COsmic-ray Soil Moisture Observing System (COSMOS) (Zreda et al., 2012; data available at http://cosmos.hwr.arizona.edu/) and Baatz et al. (2014). Using the initial ranges, we calibrated the N₀mod method against soil moisture content-neutron intensity combinations obtained from COSMIC simulations for each of these sites. We used a range of homogeneous soil moisture profiles (θ varying from zero to 0.50 cm³ cm⁻³ in increments) as input for COSMIC to calculate the neutron intensities for the COSMOS sites from Shuttleworth et al. (2013) (except two volcanic Hawaiian sites) and the two German sites used in this study. We used COSMIC parameter values from calibration against MCNPX (Shuttleworth et al., 2013) for this purpose, and added the two German sites with parameter values from Baatz et al. (2014) because these showed, in contrast with the COSMOS sites, neutron intensities below 750 cph. The resulting parameter ranges were smaller than the initial ranges and were used in our analyses (Table 3). We constructed a calibration range for HMF parameter N_s (Table 3) using the values reported by Franz et al. (2013b) and Baatz et al. (2014), and we based the parameter calibration ranges for COSMIC (Table 3) on the values found by Shuttleworth et al. (2013) and Baatz et al. (2014).
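Parameter sets can be drawn from calibration ranges such as these with Latin hypercube sampling, which places exactly one draw in each of n equal-width strata per parameter. The following is a minimal standard-library sketch, not the implementation used in the study; the function name, the seed, and the choice of 1000 sets are assumptions, and the ranges shown are the initial N₀mod ranges quoted above rather than the final ones in Table 3.

```python
import random


def latin_hypercube(ranges, n_sets, seed=0):
    """Return n_sets parameter dictionaries sampled by Latin hypercube.

    ranges: mapping of parameter name -> (low, high)
    """
    rng = random.Random(seed)
    names = list(ranges)
    columns = []
    for name in names:
        low, high = ranges[name]
        # One uniform draw inside each of the n_sets strata of this parameter.
        column = [low + (high - low) * (i + rng.random()) / n_sets
                  for i in range(n_sets)]
        # Shuffle so strata of different parameters are paired at random.
        rng.shuffle(column)
        columns.append(column)
    return [dict(zip(names, row)) for row in zip(*columns)]


initial_ranges = {"b0": (25.0, 1000.0), "b1": (10.0, 3000.0), "b2": (0.01, 1.0)}
param_sets = latin_hypercube(initial_ranges, 1000)
b0_sorted = sorted(p["b0"] for p in param_sets)
```

Compared with plain random sampling, the stratification guarantees that each parameter's full range is covered even with a modest number of sets.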
A total of 100 000 parameter sets were sampled from the parameter space of the N₀mod method, 5000 for the HMF method, and 200 000 for COSMIC, using Latin hypercube sampling (LHS). We ran the parameterisations with these generated parameter sets for each day, and simulated the neutron intensity. We calculated the absolute error (AE) for the [...]

Results and discussion

Identification of strengths and weaknesses of the three parameterisations when calibrated against all available data
Figure 4 shows calibration results using all three parameterisations, at the three sites. The simulated neutron intensities closely matched the observed neutron intensities, with relative errors (MAE_val divided by mean neutron intensity) between 1 and 2 %. However, at SR, observed neutron intensities were systematically overestimated (by 1 to 8 %) by all three parameterisations during the monsoon (mid-July to mid-September) and underestimated between mid-November and mid-December (by 2 to 6 %). Additionally, from early January until mid-March, HMF and COSMIC matched the observed fast neutron intensities well, while N₀mod underestimated fast neutron intensity by up to 3 %. At RB, N₀mod seemed to have yielded the best calibration result, while HMF and COSMIC showed some periods of both over- and underestimation, with absolute errors up to 28 cph. Finally, at WB the calibration solution for HMF seemed to have had slightly more difficulty simulating the observed neutron intensities, although its neutron intensity variation (standard deviation (SD) of 9.4 cph) was more similar to the observed variation (SD of 9.2 cph) than that of the other two parameterisations (SD of 7.2 cph for N₀mod, and 7.7 cph for COSMIC).
Although at SR more daily neutron intensity estimations were outside the observed uncertainty bounds (e.g. 63 % for HMF at SR and 6 % at WB), we note that this is due to the relatively lower uncertainty caused by the higher observed neutron intensities (Zreda et al., 2008). Overall, N₀mod performed best at the two temperate sites, HMF showed slightly poorer results at all three sites, and COSMIC performed best at the semi-arid site and average at the two temperate sites.
The periods of over/underestimation for all parameterisations at SR could indicate either limitations with the parameterisations used or with the quality of measurements used.
The differences between the best solutions of the three parameterisations for certain periods, found at all three sites, might be related to differences in parameterisation complexity. Where COSMIC performed better compared to the two other methods, this could indicate the benefits of explicitly resolving individual soil layers, as opposed to using depth-weighted soil moisture as employed by the other two methods. Explicitly taking into consideration the depth-varying SOM and lattice water content could potentially improve measurement depth and neutron intensity estimates.
To get a better idea of how good the best solutions from the reference strategy actually were, we compared them with calibration results obtained from previous research; see Fig. 5 and Table 4. The original N₀ solution (only parameter N₀ calibrated) for SR was taken from the COSMOS website (Zreda et al., 2012), for HMF from Franz et al. (2013a), and for COSMIC from the MCNPX calibrations from Shuttleworth et al. (2013). We took all original solutions for RB and WB from Baatz et al. (2014). Only parameter N was calibrated for COSMIC at RB and WB (Baatz et al., 2014), while parameters L₃ and α were computed with relationships from Shuttleworth et al. (2013). The original solutions matched the observed neutron intensities less satisfactorily than the best solutions from the reference strategy employed in this study. The most striking difference is that N₀ at SR was not able to match the observed neutron intensities because of the shape of the neutron intensity-soil moisture relationship defined by parameters a₀, a₁, and a₂ (note that this was one of the main motivations for introducing the modified N₀ method, as discussed in Sect. 2.3.1). As mentioned in Sect. 2.3.1 for our preliminary results, this suggests that the use of fixed parameter values for a₀, a₁, and a₂ should be investigated locally. At RB the original COSMIC solution was clearly worse than our reference strategy solution, and at WB this occurred for both HMF and COSMIC.
To identify the reasons for the relatively worse performance of the original solutions of HMF and COSMIC at RB and WB, we compared these with calibration solutions for which we used the same single days, but with our model and calibration settings (in situ soil moisture data, COSMIC with both parameters N and α calibrated). The differences between the original and reference solution of HMF seemed to have been caused by the different values for the HMF coefficients and the chosen sampling days. The main cause for the systematic underestimations by COSMIC was that Baatz et al. (2014) calibrated only parameter N, since our solutions using the same days performed clearly better (MAE_val = 7.6 cph at RB; 5.0 cph at WB; compare to 12.2 cph and 9.8 cph, respectively, from the original calibration).

Assessing a suitable soil sampling frequency for the three methods
[Table 4 caption residue: [...] and the original (Orig.) solutions. Parameters a₀, a₁, and a₂ are constants in the original N₀ (Desilets et al., 2010) and are hence not shown. For the original HMF solutions, the coefficients used were defined by Franz et al. (2013b).]
In Fig. 6, the 25th, 50th, and 75th percentiles of the MAE_val populations of best solutions are represented by dots, for each temporal strategy. The MAE_val values of the best solutions of the reference temporal strategy are indicated with coloured horizontal lines. This figure can be interpreted as follows: 25 % of the best solutions of a population had an MAE_val equal to or smaller than the MAE_val of the 25th percentile, 50 % had values equal to or smaller than the 50th percentile MAE_val, and so on. The MAE_val value of the 25th percentile hence tells us how good the better solutions were; a low value means the chance of obtaining a good solution was high. An MAE_val value for the 75th percentile closer to the 50th and 25th percentiles means the overall range of solutions was reduced, and hence the chance of obtaining a relatively poor performance due to calibration was relatively small. We see that for the 1DAY strategy at SR, for all three percentiles, the MAE_val values of N₀mod were higher than those of HMF and COSMIC (by approximately 1.5 to 2 times). However, each subsequent increase in the number of days used made the results of N₀mod approach those of HMF. At 6DAY the MAE_val of N₀mod was less than 1.2 times higher than that of HMF. As expected, with increasing number of sampling days, the population range was reduced for all three parameterisations, and hence the chance of obtaining poor solutions also decreased. The differences between the temporal strategies were smallest for HMF at all three sites: [...] From the 75th percentiles we see that the MAE_val values for all three parameterisations flattened out between the 6DAY and the 10DAY strategy, after improvements of between 1.3 and 2.2 times. After these sharp decreases, only small improvements (up to 1.2 times) were made by increasing the number of days to those of the reference solutions. From a fieldwork perspective, this means that despite the strong increase in work effort, only a small improvement in parameterisation quality will be gained. The quicker improvement (to relatively poor reference strategy solutions) and smaller differences between the temporal strategies of HMF could be due to the fact that HMF contains only one free parameter.
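The percentile summary described above can be sketched as follows. The MAE values are hypothetical and only illustrate the interpretation: a smaller gap between the 25th and 75th percentiles means a narrower range of calibration outcomes. The "inclusive" quantile method is an assumption; the text does not say how the percentiles were computed.

```python
import statistics


def mae_percentiles(mae_values):
    """25th, 50th, and 75th percentiles of a population of MAE values (cph)."""
    q25, q50, q75 = statistics.quantiles(mae_values, n=4, method="inclusive")
    return {"p25": q25, "p50": q50, "p75": q75}


# Hypothetical populations of best-solution errors for two sampling strategies.
mae_1day = [12.0, 9.5, 20.1, 7.8, 15.3, 8.9, 30.2, 10.4]
mae_6day = [8.1, 7.9, 9.0, 7.7, 8.6, 8.0, 9.8, 8.3]

p1 = mae_percentiles(mae_1day)
p6 = mae_percentiles(mae_6day)
# Interquartile spread: how far a relatively poor outcome is from a good one.
spread_1day = p1["p75"] - p1["p25"]
spread_6day = p6["p75"] - p6["p25"]
```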
Researchers have traditionally interpreted soil moisture error values rather than neutron intensity errors. We have therefore translated the neutron intensity errors into soil moisture content errors for clarity. We took the mean observed neutron intensity at each site and used the reference solutions to compute soil moisture content error estimates (Fig. 7). For that purpose we subtracted or added the MAE neutron values from the mean observed neutron intensity and then projected onto the vertical axis of Fig. 7 to obtain soil moisture differences using the reference solution curves. We did this for the 75th percentiles only. We compared them with typical errors of time domain transmissivity (TDT) sensors (0.02 cm³ cm⁻³, Topp et al., 2001) and with those from satellite remote sensing products such as SMOS and SMAP (0.04 cm³ cm⁻³, Kerr et al., 2001). We hence assumed that a 75 % chance of obtaining a calibration result equal to or better than these thresholds sufficiently reduces the uncertainty. For simplicity, in order to obtain curves representing the COSMIC reference solutions, we assumed homogeneous soil moisture profiles. However, the different curves for each site had different slopes (e.g. HMF flatter at RB and WB), which would introduce mixed results not necessarily relevant to the overall behaviour analysed for each site. We hence had to choose one curve per site to estimate soil moisture errors, shown in Fig. 8. We chose N₀mod for two reasons. Firstly, it yielded the best reference solutions at RB and WB. Secondly, while COSMIC was best at SR, the need to use homogeneous profiles for this model to obtain soil moisture error estimates makes representing it with a single curve an approximation only. Choosing HMF or COSMIC would have yielded only slightly different error magnitudes, because the curves differ only slightly within the range at which observations are available for each individual site. This is indicated by relatively similar correlation coefficients calculated between observed and individual curves (not shown). On average, all computed errors were below the two imposed thresholds at SR. At RB and WB the magnitude of the errors was always higher than the TDT threshold. At RB, about 4 days would be needed for N₀mod and HMF, while for COSMIC 4 to 6 days would suffice to pass the SMOS threshold. All three parameterisations needed about 10 days to reach the SMOS threshold at WB. At SR, relatively low soil moisture content error estimates were obtained because the observations were limited to the dry range, where the curve is relatively flat and a large neutron intensity error translates into a small soil moisture content error. At RB and WB, by contrast, observed soil moisture contents were limited to the wet range and the curves are steeper than those at SR.
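The projection of a neutron intensity error onto the soil moisture axis can be sketched with a single calibration curve. This example assumes an N₀mod-type curve of the form N = b₀ρ_s/(θ + lw + w_SOM + b₂ρ_s) + b₁ (an inferred form, since the equation is not reproduced in this text), takes half the difference between the two projected values as the error estimate (also an assumption), and uses hypothetical parameter values and mean neutron intensities.

```python
def n0mod_theta(n, b0, b1, b2, rho_s, lw=0.0, w_som=0.0):
    """Soil moisture (cm3 cm-3) from neutron intensity for the ASSUMED curve."""
    return b0 * rho_s / (n - b1) - b2 * rho_s - lw - w_som


def soil_moisture_error(n_mean, mae, **curve):
    """Half the soil moisture range spanned by n_mean - mae and n_mean + mae."""
    wetter = n0mod_theta(n_mean - mae, **curve)  # fewer neutrons -> wetter soil
    drier = n0mod_theta(n_mean + mae, **curve)
    return 0.5 * (wetter - drier)


# Hypothetical curve and sites: one observed in the dry range, one in the wet range.
curve = dict(b0=120.0, b1=550.0, b2=0.115, rho_s=1.4)
err_dry_site = soil_moisture_error(n_mean=1346.0, mae=10.0, **curve)
err_wet_site = soil_moisture_error(n_mean=880.0, mae=10.0, **curve)
```

With these illustrative numbers the same 10 cph error maps to a several times larger soil moisture error at the wet site, mirroring the contrast between SR and the two humid sites described above.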
The distributions of the parameter values are shown in Figs. 9 and 10. The 75th percentile ranges of b₀ and b₁ were reduced in size by 2 to 4 times for all parameterisation/site combinations with increased number of sampling days. The parameter values satisfactorily approached the solutions from the reference strategy (Table 4), with the exception of b₂ at RB and WB. Parameter b₂ probably specifies a soil moisture content offset at the dry end of the soil moisture content/neutron intensity curve. In Eq. (2) it is added (after multiplication with ρ_s) to the soil moisture and lattice water terms. While at SR the observations were in the dry range, at RB and WB only the wet range was observed (Fig. 7). Hence, the role of parameter b₂ was probably less relevant for fitting to the data. The 75th percentile parameter ranges of HMF and COSMIC converged towards the parameter values from the reference temporal strategy for all three sites.
In addition to the MAE_val, we evaluated the coefficient of determination (r²; results not shown) and the mean bias (results not shown) with respect to the observed and simulated neutron intensities of all days of 2012. While the mean bias clearly improved (decreased) with increasing numbers of sampling days, for all sites and methods (up to 20 times smaller for reference solutions compared to 1DAY), r² remained nearly constant. These findings indicate that parameterisation dynamics, which are reflected in r², are more strongly conditioned by the input data, whereas systematic biases can be caused by poor parameter selection. The observed improvement of the MAE_val with increasing number of sampling days was hence due to reduced systematic biases. This is important, because systematic biases in soil moisture may hinder modelling applications (e.g. data assimilation, Dee, 2005; Reichle and Koster, 2004).
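The distinction between bias and r² can be illustrated with a small sketch. It assumes r² is the squared Pearson correlation between simulated and observed intensities (the text does not define it), and uses hypothetical neutron intensities: a constant offset changes the mean bias and the MAE but leaves r² unchanged.

```python
import math


def mean_bias(sim, obs):
    """Mean of simulated minus observed values."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def mean_absolute_error(sim, obs):
    """Mean of absolute differences between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def r_squared(sim, obs):
    """Squared Pearson correlation (ASSUMED definition of r2 here)."""
    ms, mo = sum(sim) / len(sim), sum(obs) / len(obs)
    cov = sum((s - ms) * (o - mo) for s, o in zip(sim, obs))
    var_s = sum((s - ms) ** 2 for s in sim)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov * cov / (var_s * var_o)


# Hypothetical daily neutron intensities (cph).
observed = [900.0, 880.0, 910.0, 870.0, 895.0, 905.0]
# A simulation with perfect dynamics but a systematic offset of +10 cph.
biased = [o + 10.0 for o in observed]

bias = mean_bias(biased, observed)
mae = mean_absolute_error(biased, observed)
r2 = r_squared(biased, observed)
```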
Calibrating with a single day appears to be insufficient to guarantee acceptable performance for all three parameterisations at sites with predominantly wet soil conditions and relatively steep soil moisture content/neutron intensity curves. The results for the reference strategy and the other sampling strategies indicate that N₀mod is more easily calibrated for sites with relatively low seasonality in temperature and precipitation. HMF showed the smallest differences between few and many sampling days, probably because it has only one parameter that needs calibration. Moreover, the reference strategy yielded relatively poor calibration results for HMF in any case. COSMIC performed relatively similarly for sites with different vegetation cover, and precipitation and temperature variability. A model with fewer parameters but similar or slightly worse performance may be preferred to a more complex model.
For applications of mobile CRNS rovers (Chrisman and Zreda, 2013; Dong et al., 2014), multiple calibration instances are more difficult to realise. However, in regions where stationary CRNS are available, information from mobile surveys can be better translated/constrained by such sensors, and hence multiple-day calibration becomes even more important for stationary sensors. Alternatively, one may adopt a space-for-time approach such as those proposed for satellite remote sensing soil moisture applications (e.g. Reichle and Koster, 2004).

Evaluating preferred wetness conditions for calibration
The required numbers of sampling days found in the previous section could possibly be reduced if wetness states that yield relatively poor calibration solutions are avoided and preferred wetness states are chosen for sampling. To identify such preferred wetness states, we used depth-weighted average soil moisture content (Bogena et al., 2013) as an indicator of wetness conditions. We used the cumulative density function (CDF) approach as employed for parameter sensitivity analysis (Demaria et al., 2007), but applied it to soil moisture content states instead. We split the MAE_val populations into groups of 25 % increments, ranked from best (0-25 %) to worst (75-100 %). For the 1DAY temporal strategy, we calculated a CDF describing the distribution of weighted average soil moisture contents for each of the MAE_val groups (Fig. 11). For the 2DAY strategy, we computed CDFs describing the absolute difference between the soil moisture contents of the paired days (Fig. 12), while for the 4-16DAY strategies we used the SDs of the soil moisture contents of the combined days. Note that all of these metrics are related to a measure of dispersion around the mean value (or the mean value itself for 1DAY), and are hence related to each other. The figures can be understood by realising that relatively more solutions are obtained at soil moisture contents where the CDF of a certain group is steep. The CDFs of the 1DAY strategy showed differences between the 75-100 % solutions and the other groups for all site-parameterisation combinations except N0mod and HMF at WB. At SR, relatively dry conditions seemed to yield a better chance of relatively good calibration solutions for HMF and COSMIC; for instance, 50 % (CDF = 0.5 (−)) of the solutions of the best 25 % group of both parameterisations had θ < 0.035 cm³ cm⁻³, while 50 % of the solutions of the worst 25 % group had θ > 0.05 cm³ cm⁻³. Relatively dry to average wetness conditions (0.03 < θ < 0.04 cm³ cm⁻³) yielded relatively good calibration solutions for N0mod at SR. The worst solutions (75-100 % groups) mostly originated from relatively dry conditions (θ < 0.35 cm³ cm⁻³) for all three parameterisations at RB, while the better solutions (0-75 % groups) were mostly obtained under average wetness conditions (0.37 < θ < 0.41 cm³ cm⁻³). At WB this was only the case for COSMIC. If only a single day is used, we therefore recommend avoiding relatively dry conditions at RB and WB and instead sampling under conditions closer to the average conditions of those sites. It is unlikely that the worse calibration solutions obtained under drier conditions at RB and WB were caused by changes in aboveground hydrogen pools (e.g. litter layer), because Bogena et al. (2013) found that such hydrogen pools become less dominant under drier conditions.
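The grouping and CDF construction for the 1DAY strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the soil moisture values and MAE_val values are synthetic, and the helper names quartile_groups and empirical_cdf are hypothetical.

```python
import numpy as np

def quartile_groups(mae_val):
    """Assign each solution to a group 0..3 by MAE rank (0 = best 25 %)."""
    mae_val = np.asarray(mae_val, dtype=float)
    ranks = np.argsort(np.argsort(mae_val))  # rank 0 = lowest MAE
    return (4 * ranks) // len(mae_val)

def empirical_cdf(values):
    """Return sorted values and their cumulative probabilities."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

# Synthetic example: one calibration solution per sampling day.
rng = np.random.default_rng(1)
theta = rng.uniform(0.02, 0.08, 200)  # weighted average soil moisture (cm3 cm-3)
# Assumed toy relationship: solutions get worse away from theta = 0.03.
mae_val = 50.0 + 2000.0 * np.abs(theta - 0.03) + rng.normal(0.0, 5.0, 200)

groups = quartile_groups(mae_val)
labels = ["0-25 %", "25-50 %", "50-75 %", "75-100 %"]
for g, label in enumerate(labels):
    x, p = empirical_cdf(theta[groups == g])
    theta_median = x[np.searchsorted(p, 0.5)]  # theta at CDF = 0.5
    print(f"group {label}: theta at CDF=0.5 is {theta_median:.3f}")
```

Plotting x against p for each group gives curves of the kind described above; a steep segment indicates a soil moisture range from which many solutions of that group originate.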
The calibration for N0mod and COSMIC at all three sites was improved when paired days with distinct soil moisture contents were used, because the CDFs of the groups of worst (50-75 and 75-100 %) calibration solutions showed relatively sharp increases for similar soil moisture contents (SR: θ < 0.01 cm³ cm⁻³; RB and WB: θ < 0.05 cm³ cm⁻³), whereas better solutions were obtained under relatively drier conditions (Fig. 12). This might be expected because different soil moisture profiles are taken into account, as well as variations in other hydrogen pools. HMF showed no differences at SR and somewhat opposite results at RB and WB, where better solutions were relatively often obtained from combinations of days with similar wetness conditions (θ < 0.05 cm³ cm⁻³). Figures 13 (4DAY) and 14 (16DAY) show that increasing the number of days decreased the effects of different wetness conditions of the constituting days. Similar to the 2DAY strategy, for the 4DAY strategy different wetness conditions were more likely to yield a relatively good calibration solution for N0mod and COSMIC, while for HMF different wetness conditions seemed to affect the results least.
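The wetness metric used for each strategy (the value itself for 1DAY, the absolute difference for 2DAY, the SD for 4-16DAY) can be written as a single helper. The sketch below is illustrative only; the function name dispersion and the soil moisture values are hypothetical, and the use of the sample SD (ddof=1) is an assumption, since the text does not state which SD estimator was used.

```python
from itertools import combinations

import numpy as np

def dispersion(theta_days):
    """Wetness metric for a combination of sampling days.

    1 day  -> the weighted average soil moisture itself
    2 days -> absolute difference between the two days
    >2 days -> standard deviation over the combined days (sample SD assumed)
    """
    theta_days = np.asarray(theta_days, dtype=float)
    if theta_days.size == 1:
        return float(theta_days[0])
    if theta_days.size == 2:
        return float(abs(theta_days[0] - theta_days[1]))
    return float(np.std(theta_days, ddof=1))

# Hypothetical weighted average soil moisture contents (cm3 cm-3) on four days.
theta = np.array([0.30, 0.35, 0.41, 0.38])

# 2DAY strategy: one metric per pair of days.
for i, j in combinations(range(len(theta)), 2):
    print(f"days ({i}, {j}): |difference| = {dispersion(theta[[i, j]]):.2f}")

# 4DAY strategy: one metric for the combined days.
print(f"4 days: SD = {dispersion(theta):.3f}")
```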
A possible explanation for the opposite effects of wetness variability on HMF compared to the other two parameterisations at RB and WB is the fixed shape of the HMF curves, as shown in Fig. 7. While the shapes of the N0mod and COSMIC curves can change (through different parameter values) when a wider range of wetness conditions is covered, the shape of the HMF curves cannot be adjusted by sampling a wider range of wetness conditions, and hence such practice may not always improve results. Figure 7 also indicates that the data were limited to certain parts of the curves only, and hence increasing the differences between wetness conditions outside these ranges could potentially reduce the needed number of sampling days and/or increase the confidence in the calibration results obtained.
Based on our results, we can conclude that the required number of days could be limited by choosing appropriate wetness conditions or wetness variability. However, this is mainly limited to the worst 25 % (i.e. 75-100 %) of the analysed results. The preferred choice depends on the site chosen and the parameterisation used, and hence no general recommendation can be given.

Conclusions
We investigated the performance of three currently available CRNS parameterisation methods (modified N0, HMF, and COSMIC) at three sites characterised by distinct climate and land use. When calibrated with data from all days available from 1 year, the COSMIC and N0mod methods performed slightly better than HMF at the two more temperate and humid sites, while at the semi-arid site, COSMIC performed better than both other methods. The soil profile approach of COSMIC gave an advantage at this site.
We found that it is advisable to collect soil moisture samples on more than a single day regardless of which parameterisation is used. However, sampling on more than 6 days would, despite the strong increase in work effort, improve parameterisation quality only a little. On average, observed errors in soil moisture (translated from errors in neutron intensities) showed that at the semi-arid site, the soil moisture error is systematically below typical uncertainties observed for point-scale and satellite remote sensing products regardless of the number of sampling days. At both humid sites in Germany, the increase in sampling days reduced the uncertainty in translated soil moisture data to values similar to or slightly below those assumed for satellite remote sensing, but failed to reach the level of accuracy found in point-scale sensors.
Sampling on days or combinations of days with appropriate soil wetness conditions can reduce the required number of sampling days. The preferred choice depends on the site and the parameterisation used. At the semi-arid site, the N0mod method was calibrated better under average wetness conditions, whereas HMF and COSMIC were calibrated better under drier conditions. Average soil wetness conditions gave higher chances of better calibration results for all three parameterisations at the humid grassland site, and for COSMIC at the humid forest site. In addition, the calibration results for the N0mod and COSMIC methods were better when calibrated with combinations of days with distinct soil wetness conditions. On the other hand, HMF was less affected by distinct wetness conditions at the semi-arid site, while performing slightly better when using days with more similar wetness conditions at both humid sites. These differences decreased with an increasing number of days and were almost absent for the 16-day sampling strategy.
It is important to note that varying the density and/or spatial (vertical and horizontal) sampling of soil moisture measurements may influence the calibration performance. The analysis of the actual impact on performance is beyond the scope of this study, which focuses on understanding the temporal sampling using typical spatial soil sampling approaches previously published in the literature (Zreda et al., 2012; Desilets and Zreda, 2013; Bogena et al., 2013).
By providing a first general guideline on how often and under which wetness conditions soil moisture should be sampled, the outcomes of this study will help researchers to validate old calibration results and to reliably calibrate new CRNS sites, such as those in the UK that are part of the AMUSED project (http://www.bris.ac.uk/news/2014/august/soil-moisture-and-cosmic-rays.html). Our discussion of the differences between the three CRNS parameterisation methods can be used to identify which parameterisation is best suited to relate neutron intensities to footprint average soil moisture contents.

Figure 2. Precipitation (P) and in situ sensor soil moisture content (θ) time series from the three research sites.

Figure 3. Relationship (red line) between soil bulk density ρs (g cm⁻³) and COSMIC parameter L3 (g cm⁻²), adapted from Shuttleworth et al. (2013). Two volcanic Hawaiian sites from Shuttleworth et al. (2013) were discarded in this case because of their aberrant physical characteristics.

Figure 4. Neutron intensity time series for the calibration solutions from the reference strategy plotted with observed neutron intensities with uncertainty bounds. The uncertainty boundaries represent 95 % confidence intervals around the mean daily fluxes. MAE_val values of each parameterisation are shown in the same colour used for the neutron intensity time series.

Figure 5. Neutron intensity time series for the calibration solutions from the reference strategy (Ref.) and from original (Orig.) calibration solutions plotted together with observed neutron intensities and associated uncertainty bounds. MAE_val,orig (cph) values for original solutions are included.

Figure 6. The 25th, 50th, and 75th percentiles of the MAE_val best solution populations. The coloured horizontal lines represent the MAE_val values for the calibrated solutions from the reference strategy.

Figure 7. Soil moisture-neutron intensity relationship derived from reference calibration for all studied sites using three distinct parameterisations. Extrapolated curves are shown as dashed lines.

Figure 8. Estimated errors in soil moisture representing the 75th percentiles obtained by calibrating against observed neutron intensities. The coloured horizontal lines represent the estimated errors from the reference strategy. The grey solid and dashed lines represent the typical errors found in point-scale sensors (TDT) and satellite remote sensing products (e.g. SMOS), respectively.

Figure 9. Parameter range distributions obtained for the best solution populations for the N0mod parameters (b0, b1, and b2). The parameter values of the reference strategy solutions are represented by black horizontal lines.

Figure 10. Parameter range distributions obtained for the best solution populations for HMF parameter Ns and COSMIC parameters N and α. The parameter values of the reference strategy solutions are represented by black horizontal lines.

Figure 12. Cumulative density functions (CDF) of sub-groups from the 2DAY best solution MAE_val populations, plotted against the difference (Δ) between the weighted average soil moisture contents (θ) of the paired days.

Figure 13. Cumulative density functions (CDF) of sub-groups from the 4DAY best solution MAE_val populations, plotted against the SD (σ) of the weighted average soil moisture contents (θ) of the combined days (Bogena et al., 2013).

Figure 14. Cumulative density functions (CDF) of sub-groups from the 16DAY best solution MAE_val populations, plotted against the SD (σ) of the weighted average soil moisture contents (θ) of the combined days (Bogena et al., 2013).
Table 1 for all abbreviations) with a reference strategy in which all available days from the year 2012 were used to calibrate the three methods.
As a proxy for soil moisture samples, we used data from in situ soil moisture sensor networks, because continuous soil moisture sampling over a full year is usually not available. It is, however, important to emphasise that distributed sensor networks do not necessarily need to be co-located with the CRNS for operational purposes, including calibration.

Hydrol. Earth Syst. Sci., 19, 3203-3216, 2015, www.hydrol-earth-syst-sci.net/19/3203/2015/

Table 1. Temporal sampling strategies and their theoretical numbers of combinations.

Table 3. Parameter ranges for the three parameterisations used in this study.

1DAY strategy, and the mean absolute error (MAE) for the multiple day strategies. The best solution for each day was found by selecting the parameter set which gave the lowest AE or MAE. To compare the overall performance throughout the whole year of a given calibrated parameterisation, we computed the MAE over all available days (with respect to simulated and observed neutron intensities) of 2012 for each best solution, hereinafter referred to as MAE_val.
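The selection of a best solution and the subsequent computation of MAE_val can be sketched as follows. This is a minimal illustration, assuming that simulated neutron intensities have already been computed for a set of candidate parameter sets on every day of the year; the array layout, the function name select_best, and the numbers are hypothetical and are not taken from this study.

```python
import numpy as np

def select_best(sim, obs, cal_days):
    """Pick the candidate parameter set with the lowest calibration error.

    sim      : array (n_sets, n_days), simulated neutron intensity per set and day
    obs      : array (n_days,), observed neutron intensity
    cal_days : indices of the sampling day(s) used for calibration
    Returns (index of best set, its calibration AE/MAE, its MAE_val over all days).
    """
    abs_err = np.abs(sim[:, cal_days] - obs[cal_days])
    score = abs_err.mean(axis=1)  # AE for one day, MAE for several days
    best = int(np.argmin(score))
    mae_val = float(np.mean(np.abs(sim[best] - obs)))  # evaluated on all days
    return best, float(score[best]), mae_val

# Toy example: three candidate parameter sets, four days (intensities in cph).
obs = np.array([100.0, 110.0, 120.0, 130.0])
sim = np.array([
    [100.0, 120.0, 140.0, 160.0],  # fits day 0 exactly, drifts afterwards
    [105.0, 115.0, 125.0, 135.0],  # small constant offset on all days
    [90.0, 110.0, 130.0, 150.0],
])

print(select_best(sim, obs, [0]))     # single sampling day
print(select_best(sim, obs, [0, 3]))  # two sampling days
```

In this toy case the single-day calibration selects a set that matches the sampling day perfectly but has a larger MAE_val than the set selected with two days, which mirrors the motivation for evaluating every best solution against all days of the year.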

Table 4. Parameter values for the best solutions of the reference strategy (Ref.