River water-quality time series often exhibit fractal scaling, which here
refers to autocorrelation that decays as a power law over some range of
scales. Fractal scaling presents challenges to the identification of
deterministic trends because (1) fractal scaling has the potential to lead to
false inference about the statistical significance of trends and (2) the
abundance of irregularly spaced data in water-quality monitoring networks
complicates efforts to quantify fractal scaling. Traditional methods for
estimating fractal scaling – in the form of spectral slope (

It is well known that time series from natural systems often exhibit
autocorrelation; that is, observations at each time step are correlated with
observations one or more time steps in the past. This property is usually
characterized by the autocorrelation function (ACF), which is defined as
follows for a process

Although the short-term memory assumption sometimes holds, it cannot adequately describe many time series whose ACFs decay as a power law (i.e., much more slowly than exponentially) and may not reach zero even at large lags, implying that the ACF is non-summable. This property is commonly referred to as long-term memory or fractal scaling, as opposed to short-term memory (Beran, 2010).
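This distinction can be made concrete numerically: the partial sums of an exponentially decaying ACF converge, while those of a power-law ACF keep growing. The sketch below is only illustrative; the parameter values (an AR(1) coefficient and a Hurst exponent) are our choices, not values from the studies cited here.

```python
import numpy as np

# Theoretical ACF of a short-memory AR(1) process: rho(k) = phi**k (summable),
# versus a long-memory ACF decaying as a power law: rho(k) ~ k**(2H - 2),
# which is non-summable for 0.5 < H < 1. phi and H are illustrative values.
phi, H = 0.7, 0.9
lags = np.arange(1, 10001)
acf_short = phi ** lags
acf_long = lags.astype(float) ** (2 * H - 2)

partial_sum_short = np.cumsum(acf_short)  # converges to phi / (1 - phi)
partial_sum_long = np.cumsum(acf_long)    # grows without bound
```

At lag 10 000 the exponential ACF is numerically zero while the power-law ACF is still appreciable, which is exactly the non-summability that defines long-term memory.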

Fractal scaling has been increasingly recognized in studies of hydrological
time series, particularly for the common task of trend identification. Such
hydrological series include river flows (Montanari et al., 2000; Khaliq et
al., 2008, 2009; Ehsanzadeh and Adamowski, 2010), air and
sea temperatures (Fatichi et al., 2009; Lennartz and Bunde, 2009; Franzke,
2012a, b), conservative tracers (Kirchner et al., 2000,
2001; Godsey et al., 2010), and non-conservative chemical constituents
(Kirchner and Neal, 2013; Aubert et al., 2014). Because for fractal scaling
processes the variance of the sample mean converges to zero much more slowly than
the rate of

Several equivalent metrics can be used to quantify fractal scaling. Here we provide a review of the definitions of such processes and several typical modeling approaches, including both time-domain and frequency-domain techniques, with special attention to their reconciliation. For a more comprehensive review, readers are referred to Beran et al. (2013), Boutahar et al. (2007), and Witt and Malamud (2013).

Strictly speaking,

One popular model for describing long-memory processes is the so-called
fractional autoregressive integrated moving-average model, or ARFIMA (
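One tractable special case, ARFIMA(0, d, 0) (fractionally integrated noise), can be simulated by truncating its infinite moving-average representation, whose coefficients obey the recursion psi_0 = 1, psi_k = psi_(k-1) (k - 1 + d) / k. The sketch below is a minimal illustration under that truncation; the function name, burn-in length, and d value are our choices.

```python
import numpy as np

def arfima_0d0(n, d, burn=500, seed=0):
    """Simulate ARFIMA(0, d, 0) by truncating the MA(inf) representation
    of (1 - B)**(-d) applied to white noise. Truncation at n + burn terms
    is an approximation; longer burn-in improves the low-frequency behavior."""
    rng = np.random.default_rng(seed)
    m = n + burn
    psi = np.empty(m)
    psi[0] = 1.0
    for k in range(1, m):
        psi[k] = psi[k - 1] * (k - 1 + d) / k   # MA coefficients of (1-B)^(-d)
    eps = rng.standard_normal(m)
    x = np.convolve(eps, psi)[:m]               # causal filtering of the noise
    return x[burn:]                             # discard the burn-in segment

x = arfima_0d0(200, d=0.4)   # long memory for 0 < d < 0.5
```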

Synthetic time series with 200 time steps for three representative
fractal scaling processes that correspond to white noise (

In addition to a slowly decaying ACF, a long-memory process manifests itself
in two other equivalent fashions. One is the so-called Hurst effect,
which states that, on a log–log scale, the range of variability of a process
changes linearly with the length of the time period under consideration. This
power-law slope is often referred to as the Hurst exponent or Hurst
coefficient
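The Hurst effect can be illustrated with a bare-bones rescaled-range (R/S) computation: the mean R/S statistic is computed over non-overlapping windows of several sizes, and the Hurst exponent is the log-log slope against window size. This is only a sketch of the classic estimator (window sizes and names are illustrative), not the exact procedure of the studies cited above.

```python
import numpy as np

def rescaled_range(x):
    """R/S statistic for one window: range of the cumulative deviations
    from the window mean, divided by the window standard deviation."""
    x = np.asarray(x, dtype=float)
    z = np.cumsum(x - x.mean())
    return (z.max() - z.min()) / x.std()

def hurst_rs(x, window_sizes):
    """Estimate the Hurst exponent as the log-log slope of mean R/S
    versus window size, using non-overlapping windows."""
    rs = []
    for w in window_sizes:
        chunks = [x[i:i + w] for i in range(0, len(x) - w + 1, w)]
        rs.append(np.mean([rescaled_range(c) for c in chunks]))
    H, _ = np.polyfit(np.log(window_sizes), np.log(rs), 1)
    return H

rng = np.random.default_rng(1)
H_hat = hurst_rs(rng.standard_normal(10000), [16, 32, 64, 128, 256])
```

For white noise the estimate should fall near 0.5 (the plain R/S estimator is known to be somewhat biased upward at small window sizes).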

The second equivalent
description of long-memory processes, this time from a frequency-domain
perspective, is fractal scaling, which describes a power-law decrease in
spectral power with increasing frequency, yielding power spectra that are
linear on log–log axes (Lomb, 1976; Scargle, 1982; Kirchner, 2005).
Mathematically, this inverse proportionality can be expressed as

In addition, it can be shown that the spectral density function for ARFIMA
(
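As a numerical companion to this frequency-domain description, the spectral slope of a regularly sampled series can be estimated by ordinary least squares on the log-log periodogram. The sketch below is a minimal version of that idea; the function name and the sign convention (slope reported as a positive exponent) are ours.

```python
import numpy as np

def spectral_slope(x):
    """Estimate beta in S(f) ~ f**(-beta) by ordinary least squares on the
    log-log periodogram of a regularly sampled series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    freqs = np.fft.rfftfreq(n)[1:]               # drop the zero frequency
    power = np.abs(np.fft.rfft(x))[1:] ** 2 / n  # raw periodogram ordinates
    slope, _ = np.polyfit(np.log(freqs), np.log(power), 1)
    return -slope                                # beta is minus the log-log slope

# White noise has a flat spectrum, so the estimate should be near zero.
rng = np.random.default_rng(42)
beta_hat = spectral_slope(rng.standard_normal(4096))
```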

To account for fractal scaling in trend analysis, one must first be able to quantify the strength of fractal scaling for a given time series. Numerous estimation methods have been developed for this purpose, including Hurst's rescaled range analysis, Higuchi's method, Geweke and Porter-Hudak's method, Whittle's maximum likelihood estimator, detrended fluctuation analysis, and others (Taqqu et al., 1995; Montanari et al., 1997, 1999; Rea et al., 2009; Stroe-Kunold et al., 2009). For brevity, these methods are not elaborated here; readers are referred to Beran (2010) and Witt and Malamud (2013) for details. While these estimation methods have been extensively adopted, they are unfortunately applicable only to regular (i.e., evenly spaced) data, e.g., daily streamflow discharge or monthly temperature. In practice, many types of hydrological data, including river water-quality data, are often sampled irregularly or have missing values, and hence the strength of their fractal scaling cannot be readily estimated with the traditional methods above.

Thus, estimation of fractal scaling in irregularly sampled data is an important challenge for hydrologists and practitioners. Many data analysts may be tempted to interpolate the time series to make it regular and hence analyzable (Graham, 2009). Although technically convenient, interpolation can be problematic if it distorts the series' autocorrelation structure (Kirchner and Weil, 1998). In this regard, it is important to evaluate various types of interpolation methods using carefully designed benchmark tests and to identify the scenarios under which the interpolated data can yield reliable (or, alternatively, biased) estimates of spectral slope.

Moreover, quantification of fractal scaling in real-world water-quality data
is subject to several common complexities. First, water-quality data are
rarely normally distributed; instead, they are typically characterized by
log-normal or other skewed distributions (Hirsch et al., 1991; Helsel and
Hirsch, 2002), with potential consequences for

In the above context, the main objective of this work was to use
Monte Carlo simulation to
systematically evaluate and compare two broad types of approaches for
estimating the strength of fractal scaling (i.e., spectral slope

to examine the sampling irregularity of typical river water-quality monitoring data and to simulate time series that contain such irregularity, and

to evaluate two broad types of approaches for estimating

This work was designed to make several specific contributions. First, it uses
benchmark tests to quantify the performance of a wide range of methods for
estimating fractal scaling in irregularly sampled water-quality data. Second,
it proposes an innovative and general approach for modeling sampling
irregularity in water-quality records. Third, while this work was not
intended to compare all published estimation methods for fractal scaling, it
does provide and demonstrate a generalizable framework for data simulation
(with gaps) and

The rest of the paper is organized as follows. We propose a general approach for modeling sampling irregularity in typical river water-quality data and discuss our approach for simulating irregularly sampled data (Sect. 2). We then introduce various methods for estimating fractal scaling in irregular time series and compare their estimation performance (Sect. 3). We close with a discussion of the results and implications (Sect. 4).

River water-quality data are often sampled irregularly. In some cases, samples are taken more frequently during particular periods of interest, such as high flows or drought periods; here we will address the implications of the irregularity, but not the (intentional) bias, inherent in such a sampling strategy. In other cases, the sampling is planned with a fixed sampling interval (e.g., 1 day) but samples are missed (or lost, or fail quality-control checks) at some time steps during implementation. In still other cases, the sampling is intrinsically irregular because, for example, one cannot measure the chemistry of rainfall on rainless days or the chemistry of a stream that has dried up. Theoretically, any deviation from fixed-interval sampling can affect the subsequent analysis of the time series.

Examples of gap-interval simulation using negative binomial
distributions, NB (shape

To quantify sampling irregularity, we propose a simple and general approach that can be applied to
any time series of monitoring data. Specifically, for a given time series
with
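Assuming the gap intervals are summarized with a negative binomial distribution NB(r, p), its parameters can be recovered from a sample of observed gaps. The sketch below uses a method-of-moments fit for brevity (the analysis here used maximum likelihood), and the gap data are entirely hypothetical.

```python
import numpy as np

def nb_method_of_moments(gaps):
    """Method-of-moments fit of a negative binomial NB(r, p) to gap intervals.
    Using mean = r(1-p)/p and var = r(1-p)/p**2:
        p = mean / var,   r = mean**2 / (var - mean).
    Requires overdispersion (var > mean), as NB implies."""
    gaps = np.asarray(gaps, dtype=float)
    m, v = gaps.mean(), gaps.var()
    if v <= m:
        raise ValueError("sample is not overdispersed; NB moments fit undefined")
    p = m / v
    r = m * m / (v - m)
    return r, p

# Hypothetical gap intervals (in multiples of the base time step):
gaps = np.array([0, 2, 5, 0, 1, 9, 3, 0, 7, 2, 4, 13, 1, 0, 6])
r, p = nb_method_of_moments(gaps)
```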

The dimensionless set

Quantification of sampling irregularity for selected water-quality
constituents at nine sites of the Chesapeake Bay River Input Monitoring
Program and six sites of the Lake Erie and Ohio Tributary Monitoring Program.
(

The parameters

Examples of quantified sampling irregularity with negative binomial
(NB) distributions: total nitrogen in Choptank River

To visually illustrate these gap distributions, representative samples of
irregular time series are presented in Fig. 1 for the three special
processes described above (Sect. 1.2), i.e., white noise, pink noise, and
Brown noise. Specifically, three
different gap distributions, namely, NB(

The above modeling approach was applied to real water-quality data from two
large river monitoring networks in the United States to examine sampling
irregularity. One such network is the Chesapeake Bay River Input Monitoring
Program, which typically samples streams roughly once or twice monthly,
accompanied by additional sampling during storm flows (Langland et al.,
2012; Zhang et al., 2015). These data were obtained from the US Geological
Survey National Water Information System
(

For the Chesapeake Bay River Input Monitoring Program (nine sites), total
nitrogen (TN) and total phosphorus (TP) are taken as representative
water-quality constituents. According to the maximum likelihood approach, the
shape parameter

For the Lake Erie and Ohio Tributary Monitoring Program (six sites), records of
nitrate plus nitrite (NO

To evaluate the various

A total of 100 replicates of regular (gap-free) time series were produced for nine
prescribed spectral slopes, which vary from

The simulated regular time series were converted to irregular time series
using gap intervals that were simulated with NB distributions. To make these
gap intervals mimic those in typical river water-quality time series,
representative NB parameters were chosen based on results from Sect. 2.2.
Specifically,
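This two-step simulation (a regular series with a prescribed spectral slope, then negative-binomial gap intervals) can be sketched as follows. The spectral-synthesis generator and all parameter values below are illustrative choices, not the exact configuration of the benchmark runs.

```python
import numpy as np

def power_law_noise(n, beta, seed=0):
    """Generate a regular series whose power spectrum follows S(f) ~ f**(-beta)
    by shaping white noise in the frequency domain (spectral synthesis)."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n)
    amp = np.zeros(len(freqs))
    amp[1:] = freqs[1:] ** (-beta / 2.0)          # amplitude ~ f^(-beta/2)
    phases = rng.uniform(0, 2 * np.pi, len(freqs))
    spectrum = amp * np.exp(1j * phases)          # random phases, shaped amplitudes
    x = np.fft.irfft(spectrum, n)
    return (x - x.mean()) / x.std()

def drop_to_irregular(x, nb_shape, nb_prob, seed=1):
    """Convert a regular series to an irregular one by keeping only the samples
    reached after negative-binomial gap intervals (a gap of g skips g steps)."""
    rng = np.random.default_rng(seed)
    idx = [0]
    while idx[-1] < len(x) - 1:
        idx.append(idx[-1] + 1 + rng.negative_binomial(nb_shape, nb_prob))
    idx = np.array([i for i in idx if i < len(x)])
    return idx, x[idx]

x = power_law_noise(1000, beta=1.0)      # pink noise
idx, xs = drop_to_irregular(x, 1.0, 0.5)
```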

For the simulated irregular time series,

Global mean: all missing values replaced with the mean of all observations.

Global median: all missing values replaced with the median of all observations.

Random replacement: all missing values replaced with observations randomly drawn (with replacement) from the time series.

Next observation carried backward (NOCB): each missing value replaced with the next available observation.

Last observation carried forward (LOCF): each missing value replaced with the preceding available observation.

Average of the two nearest samples: each missing value replaced with the mean of its next and preceding available observations.

LOWESS (locally weighted scatterplot smoothing) with a smoothing span of 1: missing values replaced using fitted values from a LOWESS model determined using all available observations (Cleveland, 1981).

LOWESS with a smoothing span of 0.75: same as B7 except that the smoothing span is 75 % of the available data (similar distinction follows for B9–B11).

LOWESS with a smoothing span of 50 %.

LOWESS with a smoothing span of 30 %.

LOWESS with a smoothing span of 10 %.
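Several of the simpler gap-filling methods above (B1 and B4-B6) can be expressed compactly. The sketch below assumes a regular grid with missing values coded as NaN and observations present at both ends of the series; it is an illustration, not the benchmark implementation.

```python
import numpy as np

def fill_gaps(x, method):
    """Fill NaN gaps in a regular-grid series: a sketch of methods
    B1 (global mean), B4 (NOCB), B5 (LOCF), and B6 (nearest average).
    Assumes the first and last values are observed."""
    x = np.asarray(x, dtype=float).copy()
    obs = ~np.isnan(x)
    if method == "global_mean":             # B1
        x[~obs] = np.nanmean(x)
    elif method == "locf":                  # B5: last observation carried forward
        for i in range(1, len(x)):
            if np.isnan(x[i]):
                x[i] = x[i - 1]
    elif method == "nocb":                  # B4: next observation carried backward
        for i in range(len(x) - 2, -1, -1):
            if np.isnan(x[i]):
                x[i] = x[i + 1]
    elif method == "nearest_average":       # B6: mean of the bracketing observations
        prev = fill_gaps(x, "locf")
        nxt = fill_gaps(x, "nocb")
        x[~obs] = 0.5 * (prev[~obs] + nxt[~obs])
    return x

x = np.array([1.0, np.nan, np.nan, 4.0, 2.0, np.nan, 6.0])
filled = fill_gaps(x, "nearest_average")   # gaps become 2.5, 2.5, 4.0
```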

Illustration of the interpolation methods for gap filling. The
gap-free series (A1) was simulated with a length of 500; the first
30 points are shown. (

Comparison of bias in estimated spectral slope in irregular data
that are simulated with prescribed

The second type of approach estimates

Lomb–Scargle periodogram: the spectral density of the time series (with gaps) is estimated and the spectral slope is fitted using all frequencies (Lomb, 1976; Scargle, 1982). This classic method for examining periodicity in irregularly sampled data is analogous to the more familiar fast Fourier transform method often used for regularly sampled data.

Lomb–Scargle periodogram with 5 % data: same as C1a except that the fitting of the spectral slope considers only the lowest 5 % of the frequencies (Montanari et al., 1999).

Lomb–Scargle periodogram with “binned” data: same as C1a except that
the fitting of the spectral slope is performed on binned data in three steps as
follows.

The entire range of frequency is divided into 100 equal-interval bins on logarithmic scale.

The respective medians of frequency and power spectral density are calculated for each of the 100 bins.

The 100 pairs of median frequency and median spectral density are used to estimate the spectral slope on a log–log scale.

Kirchner and Neal (2013)'s wavelet method: uses a modified version of Foster's weighted wavelet spectrum (Foster, 1996) to suppress spectral leakage from low frequencies and applies an aliasing filter (Kirchner, 2005) to remove spectral aliasing artifacts at high frequencies.
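Methods C1a and C1c can be sketched from first principles: the classical Lomb-Scargle periodogram followed by a log-log fit on bin medians (the three binning steps above). The frequency grid, bin count, and the white-noise test data below are illustrative choices, not those of the benchmark.

```python
import numpy as np

def lomb_scargle(t, y, omegas):
    """Classical Lomb-Scargle periodogram (Lomb, 1976; Scargle, 1982)
    evaluated at angular frequencies omegas for irregular times t."""
    p = np.empty(len(omegas))
    for i, w in enumerate(omegas):
        tau = np.arctan2(np.sum(np.sin(2 * w * t)),
                         np.sum(np.cos(2 * w * t))) / (2 * w)
        c, s = np.cos(w * (t - tau)), np.sin(w * (t - tau))
        p[i] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return p

def binned_spectral_slope(t, y, n_bins=100, n_freq=2000):
    """Fit the spectral slope on medians within log-spaced frequency bins,
    following the three binning steps of method C1c."""
    y = y - y.mean()
    f_lo = 1.0 / (t.max() - t.min())
    f_hi = 0.5 / np.median(np.diff(np.sort(t)))   # pseudo-Nyquist frequency
    f = np.linspace(f_lo, f_hi, n_freq)
    power = lomb_scargle(t, y, 2 * np.pi * f)
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bins + 1)
    fm, pm = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):     # step 1: log-spaced bins
        sel = (f >= lo) & (f < hi)
        if sel.any() and np.median(power[sel]) > 0:
            fm.append(np.median(f[sel]))          # step 2: bin medians
            pm.append(np.median(power[sel]))
    slope, _ = np.polyfit(np.log(fm), np.log(pm), 1)  # step 3: log-log fit
    return -slope

# Irregularly sampled white noise should give a slope near zero.
rng = np.random.default_rng(7)
t = np.cumsum(1 + rng.negative_binomial(1.0, 0.5, size=500)).astype(float)
y = rng.standard_normal(500)
beta_hat = binned_spectral_slope(t, y)
```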

Each estimation method listed above was applied to the simulated data
(Sect. 2.3) to estimate

Comparison of bias in estimated spectral slope in irregular data
that are simulated with varying prescribed

For the simulated irregular data, the estimation methods differ widely in
their performance. Specifically, three interpolation methods (i.e., B4–B6)
consistently overestimate

Among the direct methods (i.e., C1a, C1b, C1c, and C2), the Lomb–Scargle
method, with original data (C1a) or binned data (C1c), tends to underestimate

The shape parameter

Comparison of root-mean-squared error (RMSE) in estimated spectral
slope in irregular data that are simulated with varying prescribed

Next, the method evaluation is extended to all the simulated spectral slopes,
that is,

Comparison of bias in estimated spectral slope in irregular data
that are simulated with varying prescribed

Comparison of root-mean-squared error (RMSE) in estimated spectral
slope in irregular data that are simulated with varying prescribed

For simulations with

For simulations with

Quantification of spectral slope in real water-quality data from the
two regional monitoring networks, as estimated using the set of examined
methods. All estimations were performed on concentration residuals (in
natural log concentration units) after accounting for effects of time,
discharge, and season. The two dashed lines in each panel indicate white
noise (

In this section, the proposed estimation approaches were applied to quantify

The estimated

For TN and TP concentration data at the Chesapeake River input monitoring
sites (Table 1),

For NO

Overall, the above analysis of real water-quality data has illustrated the
wide variability in

River water-quality time series often exhibit fractal scaling behavior, which
presents challenges to the identification of deterministic trends. Because
traditional spectral estimation methods are generally not applicable to
irregularly sampled time series, we have examined two broad types of
estimation approaches and evaluated their performances against synthetic data
with a wide range of prescribed

The results of this work suggest several important messages. First, the
results remind us of the risks of using interpolation for gap filling when
examining autocorrelation, as the interpolation methods consistently
underestimate or overestimate

Overall, these results contribute a better understanding and quantification
of the examined methods' performance for
estimating the strength of fractal scaling in irregularly sampled
water-quality data. In addition, the work has provided an innovative and
general approach for modeling sampling irregularity in water-quality records.
Moreover, this work has proposed and demonstrated a generalizable framework
for data simulation (with gaps) and

River monitoring data used in this study are available
through the US Geological Survey National Water Information System
(

The authors declare that they have no conflict of interest.

Zhang was supported by the Maryland Sea Grant through awards NA10OAR4170072 and NA14OAR1470090 and by the Maryland Water Resources Research Center through a graduate fellowship while he was a doctoral student at Johns Hopkins University. Subsequent support to Zhang was provided by the US EPA under grant "EPA/CBP Technical Support 2017" (no. 07-5-230480). Harman's contribution to this work was supported by the National Science Foundation through grants CBET-1360415 and EAR-1344664. We thank Bill Ball (Johns Hopkins University) and Bob Hirsch (US Geological Survey) for many useful discussions. We are very grateful to the Editor and two anonymous reviewers for their comments and suggestions. This is contribution no. 5449 of the University of Maryland Center for Environmental Science.

Edited by: Erwin Zehe

Reviewed by: two anonymous referees