Key challenges facing the application of the conductivity mass balance method: a case study of the Mississippi River basin

Abstract. The conductivity mass balance (CMB) method has a long history of application to baseflow separation studies. The CMB method uses site-specific and widely available discharge and specific conductance data. However, certain aspects of the method remain unstandardized, including how to determine whether the method is applicable to a specific area, the minimum data requirements for baseflow separation and the most accurate parameter calculation method. This study collected and analyzed stream discharge and water conductivity data for over 200 stream sites at large spatial (2.77 to 2 915 834 km² watersheds) and temporal (up to 56 years) scales in the Mississippi River basin. The suitability criteria and key factors influencing the applicability of the CMB method were identified based on an analysis of the spatial distribution of the inverse correlation coefficient between stream discharge and conductivity and the rationality of baseflow separation results. Sensitivity analysis, uncertainty assessment and t tests were used to identify the parameter to which the method was most sensitive, and the uncertainties of baseflow separation results obtained from different parameter determination methods and various sampling durations were compared. The results indicated that the inverse correlation coefficient between discharge and conductivity can be used to quantitatively determine the applicability of the CMB method, and that the CMB method is more applicable in tributaries, headwater reaches, high altitudes and regions with little influence from anthropogenic activities. A minimum of 6 months of discharge and conductivity data was found to provide reliable parameters for the CMB method with acceptable errors, and it is recommended that the parameters SC_RO and SC_BF be determined by the 1st percentile and dynamic 99th percentile methods, respectively. The results of this study can provide an important basis for the standardized treatment of key problems in the application of the CMB method.

Introduction
Baseflow is the groundwater contribution to total streamflow (Hewlett and Hibbert, 1967), which plays a critical role in sustaining streamflow during dry periods (Rosenberry and Winter, 1997). Quantitative estimates of stream baseflow can be used to determine baseflow response to environmental conditions, thereby improving understanding of the water budget of a watershed and facilitating the estimation of groundwater discharge and recharge (Tan et al., 2009; Dhakal et al., 2012; Ran et al., 2012).
Given the importance of baseflow, many methods have been proposed for baseflow separation. Although these methods can be categorized according to various conditions (Stewart et al., 2007; Zhang et al., 2013; Miller et al., 2014; Lott and Stewart, 2016), they can generally be divided into two groups, namely non-tracer-based and tracer-based separation methods (Li et al., 2014). Non-tracer methods mainly include graphical and low-pass filter methods, which only require stream discharge data (Nathan and McMahon, 1990; Eckhardt, 2008). Given the wide availability of stream discharge records, these approaches can readily be applied to a large number of sites (Miller et al., 2014). However, since these methods are typically applied without reference to any hydrological basin variables, the objective assessment of their accuracy remains a challenge (Nathan and McMahon, 1990; Arnold et al., 2000; Furey and Gupta, 2001; Huyck et al., 2005; Eckhardt, 2008). In contrast, tracer-based baseflow separation methods adhere to the principle of mass balance (MB). Tracers such as stable isotopes, major ions and specific conductance (SC) have been used to quantify surface runoff and groundwater discharge to streamflow (Miller et al., 2014). The advantage of these methods relates to their use of site-specific variables, such as concentrations of chemical constituents, which are a function of the actual physical processes and flow paths in the basin responsible for the generation of different flow components. Therefore, chemical mass balance estimates of baseflow are often considered to be more reliable than those from graphical hydrograph separation (Stewart et al., 2007). The principal disadvantage of mass balance methods is that they require both observed discharge and chemical concentration data, and the latter are not widely available, especially over long periods. This makes long-term application of the MB method in large basins impractical. For example, while stable isotopes are generally considered to be the most accurate chemical tracers for hydrograph separation (Kendall and McDonnell, 2012), the analytical costs associated with these constituents often limit their use in large studies (Miller et al., 2014).
In an analysis of hydrograph separation conducted using different geochemical tracers, Caissie et al. (1996) demonstrated that SC was the most effective single variable for quantifying the runoff and groundwater components of total streamflow, since SC is a natural environmental tracer that can be inexpensively measured concurrently with streamflow (Kunkle, 1965; Matsubayashi et al., 1993; Arnold et al., 1995; Caissie et al., 1996; Cey et al., 1998; Heppell and Chapman, 2006; Stewart et al., 2007; Pellerin et al., 2008).
The conductivity mass balance (CMB) method converts specific conductance to a baseflow value using a two-component mass balance calculation (Pinder and Jones, 1969; Nakamura, 1971; Stewart et al., 2007):

$$ Q_{BF} = Q \, \frac{SC - SC_{RO}}{SC_{BF} - SC_{RO}} \qquad (1) $$

In Eq. (1), Q_BF is the baseflow discharge (L³ T⁻¹), Q is the measured streamflow discharge (L³ T⁻¹), SC is the measured specific conductance (µS cm⁻¹) of streamflow, SC_RO is the specific conductance of the runoff end-member, and SC_BF is the specific conductance of the baseflow end-member. Certain questions need to be addressed before the CMB method can be considered for separating baseflow in a watershed. These include whether the CMB method is applicable to a watershed, how to more accurately determine the key parameters SC_RO and SC_BF when a long series of monitoring data is available, and the length of the monitoring period required to ensure the accuracy of the results when adopting the CMB method for a new conductivity monitoring network. These questions have been partially answered by past studies. Miller et al. (2014) concluded that the CMB method was successful in quantifying baseflow in a variety of stream ecosystems, including snowmelt-dominated watersheds (Covino and McGlynn, 2007), urban watersheds (Pellerin et al., 2008) and a range of other settings (Stewart et al., 2007; Sanford et al., 2011; Lott and Stewart, 2016). However, most chemical hydrograph separation studies have been conducted in small watersheds and for short durations (Miller et al., 2014). In addition, the CMB method is often not appropriate for systems in which there is no consistent inverse correlation between discharge and SC, particularly for sites heavily influenced by anthropogenic activities. However, there appears to be no systematic summary of the watershed characteristics that indicate the suitability of the CMB method.
Questions therefore remain as to how to determine whether the CMB method is appropriate for a particular watershed and which factors have the greatest impact on the outcome of its application. Further uncertainties in the CMB method relate to appropriate methods for determining its parameters. Stewart et al. (2007) determined through a field test that the maximum and minimum conductivity can be used as SC_BF and SC_RO, respectively. Miller et al. (2014) found that the maximum conductivity of streamflow may exceed the real SC_BF; therefore, they suggested using the 99th percentile of conductivity of each year as SC_BF to avoid the impact of high SC_BF estimates on the separation results, and they assumed that baseflow conductivity varies linearly between years. However, questions remain as to which parameter determination method is more reasonable and accurate for the calculation of baseflow. In a study of the shortest monitoring period for the CMB method, Li et al. (2014) evaluated data requirements and potential bias in the estimated baseflow index (BFI) using conductivity data for different seasons and/or resampled data segments at various sampling durations, and they found that a minimum of 6 months of discharge and conductivity data is required to obtain reliable parameters with acceptable errors. However, their study conceded that further studies of watersheds at large temporal and spatial scales are needed to verify the conclusions.
The present study conducted a comprehensive qualitative and quantitative analysis of data from more than 200 hydrological sites widely distributed in the Mississippi River basin, United States of America. Based on the results of statistical analysis, the present study had the following objectives: (1) determine the criteria and main factors influencing the applicability of the CMB method; (2) identify the best method for determining the parameters of the CMB method; (3) determine data requirements for the CMB method. The conclusions of the present study can help to determine whether the CMB method is applicable to a particular river reach and can provide a reference standard for use of the method.

Data sources and site description
The Mississippi River basin lies east of the continental divide. The basin encompasses all or part of 31 states and has a drainage area of approximately 3.2 million km². A total of 201 sites were selected in watersheds of the Missouri, Illinois, Minnesota, Iowa, Ohio, Arkansas, Red, White and Des Moines rivers to represent the variability of sub-basin areas and physiographic and climatic regions, with the areas of sub-basins ranging from 2.77 to 2 915 834 km² (Fig. 1). Each selected site had at least 2 years of continuous discharge data paired with specific conductance data. All discharge and specific conductance data used in the present study were mean daily values retrieved from the United States Geological Survey's (USGS) National Water Information System (NWIS) website (http://waterdata.usgs.gov/nwis, last access: 10 March 2019).

Determination of the applicability of the CMB method and the identification of the major factors influencing the applicability of the CMB method
The CMB method assumes that the two main recharge sources in any particular river section, streamflow runoff and baseflow, have relatively stable conductivity values (Stewart et al., 2007; Lott and Stewart, 2012). Under natural conditions, streamflow conductivity reaches a maximum value at the dry season minimum discharge, indicating the dominant contribution of baseflow to streamflow (Miller et al., 2014). In contrast, streamflow conductivity will decrease during the high-flow period when the contribution of direct runoff from rainfall or snowmelt to discharge increases. This relationship between stream conductivity and discharge persists through intermediate-state streamflows, with an inverse power function between streamflow discharge and conductivity identified (Miller et al., 2014). Conditions under which the above general relationship does not apply indicate the influence of other external factors on the river which the CMB method would be unable to represent. Therefore, during the process of baseflow separation, the applicability of the CMB method to a particular river section can be determined by identifying the relationship between stream discharge and conductivity.
In the present study, to identify the applicability of the CMB method to the 201 different site locations in the Mississippi River basin, the relationships between conductivity and streamflow discharge at the sites were quantitatively evaluated by correlation analysis. Stream sites were grouped into four categories according to the strength of the relationship, as indicated by the inverse correlation coefficient (r): (1) high degree of inverse correlation (r ≤ −0.8); (2) medium degree of inverse correlation (−0.8 < r ≤ −0.5); (3) low degree of inverse correlation (−0.5 < r ≤ −0.3); (4) no inverse correlation (r > −0.3). The present study analyzed the spatial distribution of stream site correlation coefficients in the basin combined with statistical data on topography, stream discharge and anthropogenic activities. The influences of these factors on the inverse correlation were studied, following which the key factors affecting the applicability of the CMB method to sub-basins of different spatial scales were identified. Thus, a set of judgement criteria for the applicability of the CMB method for baseflow separation in a given area was established.
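The four-category grouping can be sketched as follows. This is a minimal illustration, assuming that r is the Pearson correlation coefficient computed on the untransformed paired daily values; the source only states that "correlation analysis" was used, so that choice and the function names `pearson_r` and `classify_inverse_correlation` are hypothetical.

```python
import math


def pearson_r(discharge, conductivity):
    """Pearson correlation between paired discharge and conductivity values.

    Assumption: the study's r is a Pearson coefficient on raw daily values.
    """
    n = len(discharge)
    if n != len(conductivity) or n < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mean_q = sum(discharge) / n
    mean_sc = sum(conductivity) / n
    cov = sum((q - mean_q) * (sc - mean_sc)
              for q, sc in zip(discharge, conductivity))
    var_q = sum((q - mean_q) ** 2 for q in discharge)
    var_sc = sum((sc - mean_sc) ** 2 for sc in conductivity)
    if var_q == 0 or var_sc == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_q * var_sc)


def classify_inverse_correlation(r):
    """Map r onto the four categories used in the text."""
    if r <= -0.8:
        return "high"
    if r <= -0.5:
        return "medium"
    if r <= -0.3:
        return "low"
    return "none"


# Illustrative values only, not data from the study.
r = pearson_r([1, 2, 3, 4, 5], [500, 400, 300, 200, 100])
category = classify_inverse_correlation(r)
```

In this sketch, sites in the "high" and "medium" categories (r ≤ −0.5) correspond to those treated in the text as meeting the requirements of the CMB method.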

Determination of SC_BF and SC_RO
According to the CMB equation (Eq. 1), the key parameters needed to calculate the baseflow index of total flow are the conductivities of baseflow (SC_BF) and surface runoff (SC_RO). It is generally believed that runoff dominates streamflow during the extreme high-flow and minimum stream conductivity periods of each year, and the stream conductivity during these periods is assigned as SC_RO. In contrast, baseflow dominates streamflow during the extreme low-flow and maximum stream conductivity periods of each year, and the stream conductivity during these periods is assigned as SC_BF (Stewart et al., 2007; Lott and Stewart, 2012).
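A minimal sketch of how the two-component mass balance turns these two end-member conductivities into a baseflow series and a baseflow index is given below. The function names are hypothetical, and clamping the baseflow fraction to the range 0 to 1 is an assumption added for robustness rather than a step described in the text.

```python
def cmb_baseflow(discharge, conductivity, sc_bf, sc_ro):
    """Baseflow per time step from the two-component mass balance.

    discharge and conductivity are paired daily values; sc_bf and sc_ro
    are the baseflow and runoff end-member conductivities.
    """
    if sc_bf <= sc_ro:
        raise ValueError("SC_BF must exceed SC_RO")
    baseflow = []
    for q, sc in zip(discharge, conductivity):
        fraction = (sc - sc_ro) / (sc_bf - sc_ro)
        # Assumption: keep the baseflow fraction physically plausible.
        fraction = min(1.0, max(0.0, fraction))
        baseflow.append(q * fraction)
    return baseflow


def baseflow_index(discharge, baseflow):
    """Long-term ratio of baseflow to total streamflow (BFI)."""
    return sum(baseflow) / sum(discharge)


# Illustrative values only, not data from the study.
q = [10.0, 20.0, 40.0]
sc = [500.0, 300.0, 100.0]
bf = cmb_baseflow(q, sc, sc_bf=500.0, sc_ro=100.0)
bfi = baseflow_index(q, bf)
```

With these illustrative numbers, the high-conductivity low-flow day is attributed entirely to baseflow and the low-conductivity high-flow day entirely to runoff.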
Several approaches are currently used to determine SC_BF: (1) directly assigning the maximum stream conductivity of the stream monitoring record as SC_BF (Stewart et al., 2007); (2) assigning the 99th percentile (ordered by increasing conductivity) of the stream conductivity monitoring record as SC_BF, to avoid the impact on the separation results of extremely high SC_BF estimates that may arise when river conductivity has been affected by factors such as evaporation, irrigation, mining activity and the use of salts as road de-icing agents; (3) identifying yearly dynamic maximum or 99th percentile conductivity measurements within a monitoring record as SC_BF (Miller et al., 2014).
Since Stewart et al. (2007) pointed out that longer conductivity records are more likely to contain low conductivity values associated with high discharge, the present study used the minimum or 1st percentile (ordered by increasing conductivity) method to estimate SC_RO.
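The percentile-based choices above can be sketched as follows. This is an illustration only: the nearest-rank percentile definition is an assumption (the source does not state which percentile definition was used), the function names are hypothetical, and the linear interpolation of SC_BF between years attributed to Miller et al. (2014) is not reproduced here.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of values ordered by increasing size.

    Assumption: nearest-rank definition; other definitions interpolate.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty record")
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def sc_ro_first_percentile(conductivity):
    """SC_RO taken as the 1st percentile of the whole record."""
    return percentile(conductivity, 1)


def sc_bf_yearly_99th(years, conductivity):
    """Yearly dynamic SC_BF: the 99th percentile within each year."""
    by_year = defaultdict(list)
    for year, sc in zip(years, conductivity):
        by_year[year].append(sc)
    return {year: percentile(vals, 99) for year, vals in sorted(by_year.items())}


# Illustrative values only, not data from the study.
record_years = [2000] * 100 + [2001] * 100
record_sc = list(range(1, 101)) + list(range(101, 201))
sc_ro = sc_ro_first_percentile(record_sc)
sc_bf_by_year = sc_bf_yearly_99th(record_years, record_sc)
```

Using the 99th rather than the 100th percentile discards the single largest value in each 100-value year of this example, which is how the approach limits the influence of isolated conductivity spikes.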
The sensitivities of BFI to SC_BF and SC_RO, expressed as the indices S(BFI|SC_BF) and S(BFI|SC_RO), respectively, and the uncertainties of SC_BF, SC_RO and BFI, expressed as W_SCBF, W_SCRO and W_BFI, respectively, were calculated using the monitoring data of 26 stream sites with records of stream discharge and conductivity spanning at least 5 years. The present study then proposed an optimal method of determining SC_BF and SC_RO according to an analysis of different methods for calculating baseflow hydrographs.

Data requirements for SC_BF and SC_RO
Monitoring data of 26 stream sites with long-term records of stream discharge and water conductivity were analyzed to study the influence of different monitoring durations on the accuracy of parameter determination and baseflow separation results. Among the 26 sites, 5 had monitoring periods longer than 14 years, whereas the remainder had monitoring periods longer than 5 years. Continuous sampling periods within the five longer stream monitoring records were 3, 6, 9, 12, 15, 18, 21 and 24 months, whereas those in the remaining stream monitoring records were 3, 6, 9 and 12 months. To reduce the sampling error caused by the small number of samples, segments were allowed to overlap. In addition, each segment for a specific sampling duration was randomly chosen because of the variability in water quality measurements (Li et al., 2014). SC_BF, SC_RO and BFI were calculated for each segment, following which it was determined whether the BFI values of all segments for a specific sampling duration followed a normal distribution. Where they did, the BFI values obtained using 3, 6, 9, 12, 15, 18 and 21 months of conductivity measurements were compared with the BFI values obtained with 24 months of data for the five sites with longer records. For the remaining sites, the BFI values obtained with 3, 6 and 9 months of conductivity measurements were compared with the BFI values obtained with 12 months of data. A Student's t test at a statistical significance level of 0.05 was used to examine the differences between BFI determined from data of each sampling duration and BFI determined from the 24 or 12 months of data. If the BFI values estimated with a shorter duration of conductivity records did not differ significantly from those obtained with 24 or 12 months of data (P > 0.05), the shorter duration was considered acceptable.
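The resampling procedure can be sketched as below. This is a simplified illustration under several assumptions: segment start days are drawn uniformly at random with overlap allowed, each segment uses its own maximum and minimum conductivity as SC_BF and SC_RO (a simplification; the study also considers percentile-based choices), and the comparison uses a pooled-variance Student's t statistic. All function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def sample_segment_starts(n_days, segment_days, n_segments, seed=0):
    """Random start indices of continuous segments; overlap is allowed."""
    if segment_days > n_days:
        raise ValueError("segment is longer than the record")
    rng = random.Random(seed)
    last_start = n_days - segment_days
    return [rng.randint(0, last_start) for _ in range(n_segments)]


def segment_bfi(discharge, conductivity, start, segment_days):
    """BFI of one segment, using the segment max/min SC as end-members."""
    q = discharge[start:start + segment_days]
    sc = conductivity[start:start + segment_days]
    sc_bf, sc_ro = max(sc), min(sc)  # simplifying assumption
    if sc_bf == sc_ro:
        raise ValueError("conductivity is constant within the segment")
    baseflow = sum(qk * (sck - sc_ro) / (sc_bf - sc_ro)
                   for qk, sck in zip(q, sc))
    return baseflow / sum(q)


def student_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Illustrative values only, not data from the study.
starts = sample_segment_starts(n_days=1000, segment_days=90, n_segments=5)
example_bfi = segment_bfi([10, 20, 40, 10], [500, 300, 100, 500], 0, 4)
t_value = student_t_statistic([1, 2, 3], [2, 3, 4])
```

A P value for the statistic would come from a t distribution with na + nb − 2 degrees of freedom; `scipy.stats.ttest_ind` provides this directly and is left out here only to keep the sketch free of third-party dependencies.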

Quantitative estimates of the sensitivity and uncertainty in baseflow
As mentioned above, the sensitivities of BFI to SC_BF and SC_RO were calculated, and the uncertainties of CMB results obtained using different parameter determination methods and monitoring durations were evaluated, to identify the most accurate parameter calculation method and the shortest appropriate monitoring period. The dimensionless sensitivity indices of BFI (output) with respect to the uncertain inputs SC_BF and SC_RO, S(BFI|SC_BF) and S(BFI|SC_RO), which reflect the proportional relationship between the relative error in BFI and the relative error in each parameter, were calculated using the following equations (Yang et al., 2019):

$$ S(BFI|SC_{BF}) = -\frac{SC_{BF}}{SC_{BF} - SC_{RO}} \qquad (2) $$

$$ S(BFI|SC_{RO}) = \frac{SC_{RO} \sum_{k} y_k \, (SC_k - SC_{BF})}{(SC_{BF} - SC_{RO}) \sum_{k} y_k \, (SC_k - SC_{RO})} \qquad (3) $$

In Eqs. (2) and (3), y_k is streamflow (L³ T⁻¹) and SC_k is the measured specific conductance at time step k.
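A minimal sketch of these sensitivity indices is given below. It assumes that each index is the elasticity of BFI with respect to the parameter, S = (∂BFI/∂p)(p/BFI), with BFI computed as the flow-weighted sum of the two-component mass balance; the closed forms in the code follow from differentiating that expression. The function name is hypothetical.

```python
def sensitivity_indices(discharge, conductivity, sc_bf, sc_ro):
    """Dimensionless sensitivity of BFI to SC_BF and SC_RO.

    Assumption: BFI = sum(y_k (SC_k - SC_RO)) / ((SC_BF - SC_RO) sum(y_k)),
    and each index is (dBFI/dp) * (p / BFI).
    """
    span = sc_bf - sc_ro
    if span <= 0:
        raise ValueError("SC_BF must exceed SC_RO")
    # dBFI/dSC_BF = -BFI / span, so the index reduces to -SC_BF / span.
    s_bf = -sc_bf / span
    above_bf = sum(y * (sc - sc_bf) for y, sc in zip(discharge, conductivity))
    above_ro = sum(y * (sc - sc_ro) for y, sc in zip(discharge, conductivity))
    s_ro = sc_ro * above_bf / (span * above_ro)
    return s_bf, s_ro


# Illustrative values only, not data from the study.
s_bf, s_ro = sensitivity_indices(
    [10.0, 20.0, 40.0], [500.0, 300.0, 100.0], sc_bf=500.0, sc_ro=100.0
)
```

Both indices are negative whenever the measured conductivities lie between SC_RO and SC_BF, so raising either end-member lowers the estimated BFI.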
There is uncertainty associated with the estimation of true means from finite samples, which is regarded as a type of error in statistical inference (Lo, 2005). This uncertainty in the CMB method was estimated based on the uncertainties in SC_BF, SC_RO and SC_k. Under the approach used in the present study, the errors in the input variables are propagated to the output variables following the uncertainty transfer equation derived from Genereux and Hooper (1998):

$$ W_{f_{bf}} = \sqrt{ \left( \frac{f_{bf}}{SC_{BF} - SC_{RO}} W_{SC_{BF}} \right)^2 + \left( \frac{1 - f_{bf}}{SC_{BF} - SC_{RO}} W_{SC_{RO}} \right)^2 + \left( \frac{1}{SC_{BF} - SC_{RO}} W_{SC_k} \right)^2 } \qquad (4) $$

In Eq. (4), f_bf is the ratio of baseflow to streamflow in a single calculation process, W_fbf is the uncertainty in f_bf at the 95 % confidence interval, W_SCBF is the standard deviation of SC_BF multiplied by the t value (α = 0.05; two-tail) from the Student's distribution, W_SCRO is the standard deviation of the lowest 1 % of measured SC concentrations multiplied by the t value (α = 0.05; two-tail), and W_SCk is the analytical error in the SC measurement multiplied by the t value (α = 0.05; two-tail). The average uncertainty over multiple calculation processes is then used to estimate the uncertainty in the baseflow index (BFI, the long-term ratio of baseflow to total streamflow), which can be expressed as W_BFI-Genereux (Genereux and Hooper, 1998; Miller et al., 2014). On the other hand, Yang et al. (2019) found that random measurement errors in y_k or SC_k for time series exceeding 365 d will cancel each other out, allowing their influence on BFI to be ignored. An additional uncertainty estimation method for BFI can then be derived on the basis of the sensitivity analysis (Yang et al., 2019):

$$ W_{BFI\text{-}Yang} = BFI \sqrt{ \left( S(BFI|SC_{BF}) \frac{W_{SC_{BF}}}{SC_{BF}} \right)^2 + \left( S(BFI|SC_{RO}) \frac{W_{SC_{RO}}}{SC_{RO}} \right)^2 } \qquad (5) $$

In Eq. (5), W_SCBF and W_SCRO represent the same type of uncertainty values for SC_BF and SC_RO, respectively, as described above (Yang et al., 2019). Given that the determination of the parameters involves sensitivity analysis and that the sampling period of the shortest time series might not exceed 1 year, both the uncertainty estimation methods of BFI proposed by Yang et al. (2019) and Genereux and Hooper (1998) were used to determine the parameters and the shortest time series in the present study.
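The error propagation for a single time step can be sketched as follows. This is a minimal illustration assuming standard first-order propagation of independent errors through the two-component mass balance, with the partial derivatives of f_bf with respect to SC_BF, SC_RO and SC_k as weights; the function names are hypothetical.

```python
import math


def fraction_uncertainty(sc_k, sc_bf, sc_ro, w_sc_bf, w_sc_ro, w_sc_k):
    """Uncertainty in the baseflow fraction f_bf for one time step.

    Assumption: first-order propagation of independent input errors.
    """
    span = sc_bf - sc_ro
    if span <= 0:
        raise ValueError("SC_BF must exceed SC_RO")
    f_bf = (sc_k - sc_ro) / span
    term_bf = f_bf / span * w_sc_bf          # from d f_bf / d SC_BF
    term_ro = (1.0 - f_bf) / span * w_sc_ro  # from d f_bf / d SC_RO
    term_k = w_sc_k / span                   # from d f_bf / d SC_k
    return math.sqrt(term_bf ** 2 + term_ro ** 2 + term_k ** 2)


def mean_fraction_uncertainty(conductivity, sc_bf, sc_ro, w_sc_bf, w_sc_ro, w_sc_k):
    """Average of the per-step uncertainties over a record."""
    values = [fraction_uncertainty(sc, sc_bf, sc_ro, w_sc_bf, w_sc_ro, w_sc_k)
              for sc in conductivity]
    return sum(values) / len(values)


# Illustrative values only, not data from the study.
w_step = fraction_uncertainty(300.0, 500.0, 100.0, 40.0, 40.0, 0.0)
w_mean = mean_fraction_uncertainty([300.0, 300.0], 500.0, 100.0, 40.0, 40.0, 0.0)
```

Because every term is divided by SC_BF − SC_RO, sites where the two end-member conductivities are close together will show large uncertainties regardless of how well each end-member is known.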

Assessment of sub-basin criteria for suitability of the CMB method
The analysis of the 201 stations across the Mississippi River basin showed high variation in the response of conductivity to stream discharge. Most sites (157) showed an inverse correlation between streamflow discharge and conductivity, with the numbers of sites with high, medium and low inverse correlations being 47, 72 and 38, respectively. The goodness of fit (R²) of each site identified by regression analysis ranged from 0.00002 to 0.9655 (Fig. 2). An analysis of the spatial distribution of inverse correlations between stream discharge and conductivity in the basin showed that the correlations were related to various factors, including topography, altitude, stream discharge and location. In general, most stations located in stream headwater areas with steep terrain and high elevation showed inverse correlations between flow and conductivity, with 18 of the 19 sites at elevations above 1500 m showing r ≤ −0.5. A smaller proportion of the sites in the middle and lower reaches, where the terrain is lower, showed r ≤ −0.5 (101/182; Fig. 3). Sites with an inverse correlation between conductivity and streamflow were also more likely to be located on tributaries than on mainstems. The proportions of sites with r ≤ −0.5 on mainstems and tributaries, respectively, were 36.4 % (4/11) and 51.6 % (33/64) for the Missouri River basin, 50 % (3/6) and 54.5 % (6/11) for the upper Mississippi River basin, 0 % and 77.8 % (14/18) for the lower Mississippi River basin, and 50 % (5/10) and 70.5 % (31/44) for the Ohio River basin. On the other hand, the quantitative relationship between streamflow discharge and the correlation coefficient was not significant, and there were significant differences among the stream discharges of sub-basins.

Comparison of different SC_BF and SC_RO determination methods
The sensitivity analysis results (Table 1) showed that the sensitivity indices of BFI for SC_BF and SC_RO were all negative, indicating that BFI decreases as either SC_BF or SC_RO increases. The absolute value of the sensitivity index for SC_BF was generally greater than that for SC_RO, indicating that BFI was affected by SC_BF to a greater degree. Taking site 07097000 as an example, an uncertainty of 10 % in both SC_BF and SC_RO resulted in a contribution of SC_BF to the uncertainty in BFI of −1.34 times 10 % (−13.4 %), whereas that of SC_RO was −0.56 times 10 % (−5.6 %). Therefore, more attention should be focused on SC_BF to reduce uncertainty in BFI. Furthermore, underestimation and overestimation of SC_BF affect BFI differently: they result in overestimation and underestimation of BFI, respectively (Zhang et al., 2013), and it can be shown from Eq. (1) that, for errors of the same magnitude, underestimation of SC_BF has the greater impact on BFI. On this basis, the uncertainty values W_SCBF and W_BFI-Yang obtained from different determinations of SC_BF were compared, with the yearly dynamic maximum and yearly dynamic 99th percentile determination methods mainly considered. This approach was adopted because anthropogenic activities over long periods of time or year-to-year changes in the water table level may result in temporal changes in SC_BF (Miller et al., 2014). By adopting yearly dynamic maximum and 99th percentile values, the effects of temporal fluctuations in SC_BF can be avoided. The results showed that nearly all the uncertainty values W_SCBF and W_BFI-Yang obtained using the yearly dynamic 99th percentile were less than the corresponding values obtained using the yearly dynamic maximum. In addition, the values of W_SCRO were much less than those of W_SCBF, which can be explained by considering that W_SCRO is the standard deviation of the lowest 1 % of measured SC concentrations multiplied by the t value (α = 0.05; two-tail). This definition excluded the possibility of calculating alternative standard deviations; therefore, different W_SCRO values were not compared in the present study.
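The asymmetry between underestimating and overestimating SC_BF can be made explicit with a short derivation. This is a sketch under the assumption that SC_RO and the measured series are held fixed, so that BFI is inversely proportional to D = SC_BF − SC_RO; the symbols D and δ (the size of the error in SC_BF, with 0 < δ < D) are introduced here for illustration only.

```latex
\begin{align*}
D &= SC_{BF} - SC_{RO}, \qquad BFI \propto \frac{1}{D} \\
\frac{BFI(SC_{BF} - \delta)}{BFI(SC_{BF})} - 1
  &= \frac{D}{D - \delta} - 1 = \frac{\delta}{D - \delta} \\
\frac{BFI(SC_{BF} + \delta)}{BFI(SC_{BF})} - 1
  &= \frac{D}{D + \delta} - 1 = -\frac{\delta}{D + \delta} \\
\frac{\delta}{D - \delta} &> \frac{\delta}{D + \delta}
  \quad \text{for } 0 < \delta < D
\end{align*}
```

As an illustrative example with D = 400 µS cm⁻¹ and δ = 40 µS cm⁻¹, underestimating SC_BF raises BFI by about 11.1 %, whereas overestimating it by the same amount lowers BFI by about 9.1 %.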

Data requirements for determining SC_BF and SC_RO
The SC_BF, SC_RO and BFI values tended to stabilize with increasing sampling duration. In general, as sampling duration increased, SC_BF gradually increased and SC_RO decreased, whereas BFI fluctuated with no significant upward or downward trend (e.g., stream site 07086000 shown in Fig. 4 and other sites shown in Supplement 1). The P values of BFI determined by the t test did not change significantly with sampling duration and were greater than 0.05 for durations longer than 3 months. The uncertainty of BFI (i.e., W_BFI-Genereux) was similarly variable, reaching as high as 0.31 at a conductivity sampling duration of 3 months, but stabilized in the range of 0.14 to 0.27 for sampling durations greater than 3 months (Fig. 5). Therefore, a BFI obtained from continuous data with a sampling duration of 3 months or less may differ markedly from that obtained from data with a 2-year continuous sampling duration, and at least 6 months of conductivity records are suggested to obtain reliable estimates of SC_BF, SC_RO and BFI. Stream sites in which the BFI followed a normal distribution (∼ 20 stream sites) were assessed, and it was found that 10 sites had a minimum sampling duration of 3 months and 10 sites had a minimum sampling duration of 6 months (see Supplements 1 and 2 for details). Therefore, a minimum sampling duration of 6 months is recommended for application of the CMB method to separate the hydrograph for sites in the Mississippi River basin.

Sub-basin characteristics as indicators of the applicability of the CMB method
The results of the present study suggested that the applicability of the CMB method to a particular site can be determined by the presence of an inverse correlation between streamflow discharge and conductivity within the monitoring data. Baseflow separation showed unreasonable results for sites in which there was no significant inverse correlation between stream conductivity and discharge. Taking site 01636315 as an example (Fig. 6), an increase in river flow from 28 August to 16 December 2006 was accompanied by a consistently high level of conductivity over the entire period. The calculated baseflow for this site using Eq. (1) was too large, with a significantly higher ratio during the flood process, which clearly did not conform with the mechanism of the baseflow recharge process. During periods of recession (for example, 23 July-6 November 2007, 9 June-24 August 2008, 30 June-21 October 2009, and 23 May-11 August 2010), a gradual decrease in discharge was accompanied by a gradual decrease in conductivity, which is the opposite of the expected trend and resulted in the calculated baseflow hydrograph being significantly lower than the runoff hydrograph. During the dry season, the only source of water in the river was baseflow, and therefore the separation results were clearly incorrect. In fact, sites with no significant inverse correlation between stream discharge and conductivity tended to show a positive relationship. Under these conditions, baseflow separation will generate inaccurate baseflow estimates. Therefore, the present study confirmed the value of an inverse correlation between conductivity and discharge as an indicator of the suitability of the CMB method.
The presence of an inverse correlation between stream conductivity and discharge depends on a strong hydraulic connection between groundwater and surface water in a reach and on the major direction of surface water-groundwater interaction being from groundwater to surface water. The CMB method should not be applied to sites in which this relationship is disturbed by anthropogenic activities and other external factors. Where the relationship holds, conductivity and streamflow data can accurately reflect the natural spatial and temporal variation in baseflow and in the baseflow index. The present study further analyzed the factors influencing the inverse correlation between stream conductivity and discharge, including location, topography, surrounding environmental conditions and anthropogenic interferences. By combining the inverse correlation and baseflow separation results, the present study provides a discussion of the key factors influencing the applicability of the CMB method.

Impacts of topography and altitude
More than 90 % (18/19) of the sites located in the upstream area of the basin characterized by a steep terrain and high altitude (particularly those above 1500 m) showed an inverse correlation (i.e., r ≤ −0.5) between streamflow conductivity and discharge, thereby indicating the good applicability of the CMB method for these sites (Fig. 7). In these areas, high flow velocity and a significant downcutting effect of the river contribute to V-shaped river valleys. There is a strong hydraulic connection between groundwater and surface water in these cases. The middle and lower river reaches are in contrast characterized by lower flow velocity and a weakened downcutting effect, and as the river water level rises, the river may cross a threshold in which it becomes a source of groundwater recharge. This change in relationship between surface water and groundwater results in a breakdown in the inverse correlation between conductivity and discharge, thereby violating the mechanistic understanding the CMB method is based on. In particular, the lower reaches of the basin downstream of Cairo are characterized by a reduced riverbed gradient, wider river valleys and circuitous river channels in which groundwater is recharged by surface water, and the ratio of sites with a medium to high degree of inverse correlation (i.e., r ≤ −0.5) is reduced to 55 % (101/182), suggesting that the applicability of the CMB method for these sites is significantly reduced. As shown in Fig. 8, the proportion of sites with a correlation coefficient less than −0.5 increased significantly with increasing site  elevation. 
However, the relationship between the correlation coefficient and site elevation was not a strictly linear inverse relationship, and some sites below 1500 m (especially below 500 m) also met the correlation coefficient requirement (less than −0.5). These sites were mainly located in the Ohio River basin, where the terrain is relatively flat and the elevation is low. Since the elevations of many sites located in stream headwater areas were less than 500 m, the impact of site location (such as on a tributary or mainstem) may be more significant than that of elevation.
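The screening criterion used above (r ≤ −0.5 between discharge and conductivity) can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: it assumes a plain Pearson coefficient on untransformed paired observations, and the function names and sample values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant series: correlation undefined")
    return sxy / math.sqrt(sxx * syy)

def cmb_applicable(discharge, conductivity, threshold=-0.5):
    """Return (r, flag); flag is True when r <= threshold (the r <= -0.5 criterion)."""
    r = pearson_r(discharge, conductivity)
    return r, r <= threshold

# Hypothetical paired observations: conductivity falls as discharge rises.
q = [5.0, 8.0, 20.0, 50.0, 30.0, 10.0, 6.0]
sc = [620.0, 580.0, 410.0, 250.0, 330.0, 540.0, 600.0]
r, ok = cmb_applicable(q, sc)
print(f"r = {r:.2f}, CMB applicable: {ok}")
```

The threshold is kept as a parameter so that a stricter or looser cut-off can be tried for a given sub-basin.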

Impacts of site location and streamflow discharge
The present study analyzed and compared site data for the mainstem and tributaries of the Missouri River basin, Arkansas River basin, upper Mississippi River basin and other sub-basins. The results showed that a higher proportion of sites in the tributaries met the requirements of the CMB method. For example, the proportions of tributary and mainstem sites which met the requirements of the CMB method in the Missouri River, Ohio River and upper Mississippi River were 51.6 % and 36.4 %, 70.5 % and 50 %, and 54.5 % and 50 %, respectively. Tributary sites were generally characterized by a high altitude and steep terrain, whereas the mainstem sites fell within plain and low-altitude areas. Therefore, in general, the CMB method is more likely to be applicable to tributary sites.
In theory, streamflow discharge should be a strong determinant of the feasibility of the CMB method. Within a specific watershed, sites with high discharge are mostly located along the mainstems and downstream area, and as discussed above, few are suitable for application of the CMB method. On the other hand, sub-basins with lower flow are likely to be more susceptible to temporal variations in water quantity and the influences of external factors, resulting in distorted results of baseflow separation. However, the results of the present study showed no consistent mathematical relationship between streamflow discharge and correlation coefficient r. Considering the existence of a strong linear relationship between discharge and catchment area for certain sub-basins, for example, for the Missouri River basin in which the R² of the relationship is 0.94, further analysis of the relationship between catchment area and the applicability of the CMB method was justified. The present study found that the proportion of monitoring sites with a strong inverse correlation coefficient for the stream conductivity-discharge relationship (i.e., r ≤ −0.5) was relatively low under a very large catchment area. For example, within the Arkansas River basin, only ∼ 11 % of sites with an area > 34 000 km² showed a strong inverse correlation coefficient (Fig. 9a). In addition, the proportion of monitoring sites with catchment areas < 800 km² in which there was a strong inverse correlation coefficient (i.e., r ≤ −0.5) was relatively low, with approximately 20 % in the Missouri River basin (Fig. 9b). However, it is difficult to simultaneously determine the high-flow and low-flow thresholds for applicability of the CMB method within a particular sub-basin.
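The catchment-area comparison described above amounts to binning sites by area and computing, per bin, the share of sites with r ≤ −0.5. The sketch below illustrates that bookkeeping only. The site values are invented, and using 800 km² and 34 000 km² together as class edges is an assumption made for illustration (the text reports them for different sub-basins).

```python
def area_class(area_km2, edges):
    """Index of the area class: 0 for area < edges[0], ..., len(edges) for area >= edges[-1]."""
    for i, edge in enumerate(edges):
        if area_km2 < edge:
            return i
    return len(edges)

def share_by_area_class(sites, edges, threshold=-0.5):
    """sites: iterable of (area_km2, r). Returns one (n_meeting, n_total, share) per class.

    share is None for an empty class.
    """
    counts = [[0, 0] for _ in range(len(edges) + 1)]
    for area_km2, r in sites:
        k = area_class(area_km2, edges)
        counts[k][1] += 1
        if r <= threshold:
            counts[k][0] += 1
    return [(m, n, (m / n if n else None)) for m, n in counts]

# Hypothetical sites: (catchment area in km2, correlation coefficient r).
sites = [(120, -0.2), (450, -0.6), (2500, -0.7), (9000, -0.55),
         (15000, -0.3), (60000, -0.1), (90000, -0.52)]
for label, row in zip(["< 800", "800 to 34 000", ">= 34 000"],
                      share_by_area_class(sites, [800, 34000])):
    print(label, row)
```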

Impacts of anthropogenic factors
Human activities can significantly affect stream discharge and water quality, thereby disrupting their natural relationship and invalidating the application of the CMB method. Human activities can result in dramatic changes to river conductivity, and the major impact processes include agricultural irrigation, mining activity, the use of salts as road deicing agents and groundwater pumping (Kaushal et al., 2005; Crosa et al., 2006; Zume and Tarhule, 2008; Dikio, 2010; Palmer et al., 2010; Bäthe and Coring, 2011; Miguel et al., 2013). Other anthropogenic factors can also result in artificial variations in conductivity, such as industrial wastewater discharge (Piscart et al., 2005; Dikio, 2010), discharge of sewage wastewater (Silva et al., 2000; Williams et al., 2003; Lerotholi et al., 2004) or reduced river discharge due to river impoundment (Mirza, 1998).
Irrigation and the resulting rise in groundwater tables have been reported as one of the main factors leading to significant changes in the electrical conductivity of river water, particularly in arid and semi-arid regions in which crop production consumes large quantities of water. Since crops absorb only a fraction of the salt introduced through irrigation water, the remaining salt concentrates in the soil, leading to saline soil (Lerotholi et al., 2004). These salts may be leached out through run-off, ultimately ending up in rivers. Therefore, agricultural practices such as irrigation and fertilizer application can influence stream conductivity and hence affect the accuracy of the CMB method. In contrast, Li et al. (2018) showed that the conductivity of baseflow and surface runoff did not change over time in forest watersheds.
Mining activity is another major source of salts in rivers. Large quantities of potash salts are extracted each year for the manufacture of agricultural fertilizers. During the process of manufacturing of crude salt, which contains not only potash, but also NaCl and other salts, huge amounts of solid residues are stockpiled. The salts are dissolved during precipitation events and may enter surface waters. Mountaintop mining is a mining technique which involves removing 150 or more meters of a mountain to gain access to coal seams and has been blamed for large-scale stream salinization (Pond et al., 2008). The exposure of coal seams to weathering and percolation during coal mining provides many opportunities for the leaching of sulfate from coal wastes into surface waters (Fritz et al., 2010; Bernhardt and Palmer, 2011).
Significant changes in electrical conductivity in cold regions have often been reported to result from the use of salts as road de-icing agents (Löfgren, 2001; Ruth, 2003; Williams et al., 2003). The amount of salts used to de-ice roads in North America increased from 909 000 to 1 347 000 t per winter from 1961 to 1966 (Hanes et al., 1970). During the 1980s, the amount of salts applied to roads increased to 10 million t yr−1 in the United States alone (Salt Institute, 1992). Around 14 million t of salt per year is currently applied to roads in North America (Environment Canada, 2001). The majority of salts used on roads are transported to adjacent streams during rainfall events and snowmelt periods (Williams et al., 2003). Consequently, concentrations of salts downstream from major roads have been recorded to be up to 31 times higher than comparative upstream concentrations (Demers and Sage, 1990), and some rural streams have registered chloride concentrations exceeding 0.1 g L−1 (≈ 0.16 g NaCl L−1), similar to those found in the salt front of the Hudson River estuary (Kaushal et al., 2005).
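The NaCl equivalent quoted for the chloride concentration follows from the ratio of molar masses. The worked conversion below assumes that all chloride is balanced by sodium and uses standard molar masses (35.45 g mol−1 for Cl, 58.44 g mol−1 for NaCl); neither assumption is stated in the text.

```latex
\[
c_{\mathrm{NaCl}}
  = c_{\mathrm{Cl}} \times \frac{M_{\mathrm{NaCl}}}{M_{\mathrm{Cl}}}
  = 0.1\ \mathrm{g\,L^{-1}} \times \frac{58.44}{35.45}
  \approx 0.16\ \mathrm{g\,L^{-1}}
\]
```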
Groundwater pumping can reduce groundwater discharge to streams and alter the hydraulic connection between groundwater and surface water, which in turn invalidates the application of the CMB method. When a well is pumped at a constant rate, most of the pumped groundwater initially comes from storage. The drawdown eventually reaches the river, inducing leakage of stream water to the adjacent aquifer and depleting streamflow significantly (Bredehoeft and Kendy, 2008; Gleeson and Richter, 2018). This change in the relationship between groundwater and surface water renders the CMB method less applicable.
Monitoring sites are often located adjacent to a reservoir or other water conservancy infrastructure, which may contribute to significantly increased evaporation and higher conductivity. On the other hand, the reservoir or dam can also provide a substantial source of water in low-flow periods. This may decrease conductivity in streams, thereby obscuring the groundwater contribution to streams and leading to an underestimation of baseflow conductivity. In the present study, such affected stream sites included 07130500, 05116000, 06058502, 03400800 and 05370000 located in the upstream part of the Mississippi River basin, and these sites showed relatively poor inverse correlations between stream conductivity and discharge, with correlation coefficients of −0.42, −0.29, 0.06, −0.44 and −0.495, respectively.
Since the Mississippi River basin encompasses almost two-thirds of the entire area of the United States and streamflow occurs through large areas of plain in the Midwest and densely populated areas in the east, the impacts of anthropogenic factors in these areas are great, resulting in limited applicability of the CMB method.
The present study found that, in general, for the entire Mississippi River basin, the CMB method was more applicable for headwater sites, tributaries and high-altitude regions of > 1500 m a.s.l. (above sea level), with relatively few impacts by anthropogenic factors. In contrast, the application of the CMB method to downstream flat and low-altitude areas or to areas affected by anthropogenic activities should be carefully considered.
A related study in the upper Colorado River basin suggests that higher-elevation watersheds typically have greater baseflow yield (Rumsey et al., 2015), and Dyer (2008) found that high flows in upper streams are mainly stimulated by the snowmelt process. This raises the question of whether the impacts of altitude and site location are mainly due to differences in hydrological regimes, i.e., snow-dominated in upper streams and rain-dominated in lower watersheds. From these findings, which are based on the major river basins in North America, we still cannot establish a relationship between hydrological regimes and the applicability of the CMB method. On the other hand, as a large watershed, the Mississippi River basin has sizeable spatial heterogeneity of climate. The role of climate in hydrology, particularly for low flows, is more pronounced in larger watersheds. The influence of hydrological processes on baseflow is complex, particularly when taking climate change into consideration. Therefore, specialized research will be required in the future.

Optimal method to determine SCBF and SCRO
The comparison of sensitivity analysis results indicated that the influence of parameter SCRO on the separation results was significantly lower than that of parameter SCBF. This result is supported by previous relevant research (Stewart et al., 2007; Zhang et al., 2013; Li et al., 2014; Yang et al., 2019). Moreover, since SCRO represents the minimum conductivity during the wet season, whereas SCBF represents the maximum conductivity during the dry season, the SCRO is less likely to be reduced to an unreasonable extremely low value by the effects of natural or anthropogenic activities. The present study conservatively recommends the 1st percentile of conductivity of the entire monitoring period as indicative of the SCRO to avoid extreme values.
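One way to see why the separation is more sensitive to SCBF than to SCRO is to differentiate the separation equation. The derivation below assumes the standard two-component form of the CMB equation (as in Stewart et al., 2007), which is not restated in this section. Here Q is stream discharge, SC is measured stream conductivity and Q_BF is baseflow.

```latex
\[
Q_{BF} = Q\,\frac{SC - SC_{RO}}{SC_{BF} - SC_{RO}}
\]
\[
\frac{\partial Q_{BF}}{\partial SC_{BF}}
  = -\,Q\,\frac{SC - SC_{RO}}{\left(SC_{BF} - SC_{RO}\right)^{2}},
\qquad
\frac{\partial Q_{BF}}{\partial SC_{RO}}
  = -\,Q\,\frac{SC_{BF} - SC}{\left(SC_{BF} - SC_{RO}\right)^{2}}
\]
\[
\left|\frac{\partial Q_{BF}/\partial SC_{BF}}{\partial Q_{BF}/\partial SC_{RO}}\right|
  = \frac{SC - SC_{RO}}{SC_{BF} - SC}
\]
```

Under this form, an error in SCBF outweighs an equal error in SCRO whenever the measured conductivity lies closer to SCBF than to SCRO, which is the case during baseflow-dominated periods.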
Over a long-term monitoring period, river water quality is often influenced by anthropogenic processes such as release of water from upstream reservoirs and sewage discharge, which can result in extremely high conductivity and underestimated baseflow. The use of the 99th percentile of conductivity as SCBF can effectively avoid these extreme situations. Considering that the climate, human activities and corresponding hydrological processes occurring in a basin will change greatly over the full extent of a monitoring period, it is recommended that the SCBF be determined dynamically to further improve the accuracy of baseflow separation. From the calculated uncertainty results of each method (Table 1), it can be concluded that the uncertainty associated with the use of the dynamic 99th percentile approach was lower than that of the dynamic maximum conductivity approach. Taking site 07097000 as an example for comparing the four approaches of assigning SCRO and SCBF (Fig. 10), during the recession process, the baseflow calculated by the recommended approach appeared rational, whereas the other three approaches generated relatively low baseflow. Therefore, it is suggested that the 1st percentile of conductivity of the entire monitoring period and yearly dynamic 99th percentile approach should be used to determine SCRO and SCBF, respectively.
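The recommended parameter choice can be sketched in a few lines. The example below is an illustration under stated assumptions, not the implementation used in the study: it assumes the standard two-component CMB equation, a linear-interpolation percentile, and clipping of the baseflow fraction to the range 0 to 1. All function names and sample records are hypothetical.

```python
def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty sequence."""
    s = sorted(values)
    if not s:
        raise ValueError("empty input")
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def cmb_baseflow(records):
    """records: list of (year, discharge, conductivity). Returns baseflow per record.

    SCRO: 1st percentile of conductivity over the entire record.
    SCBF: 99th percentile of conductivity computed separately for each year.
    """
    sc_ro = percentile([sc for _, _, sc in records], 1)
    by_year = {}
    for year, _, sc in records:
        by_year.setdefault(year, []).append(sc)
    sc_bf = {year: percentile(vals, 99) for year, vals in by_year.items()}
    baseflow = []
    for year, q, sc in records:
        denom = sc_bf[year] - sc_ro
        frac = (sc - sc_ro) / denom if denom > 0 else 0.0
        frac = min(max(frac, 0.0), 1.0)  # assumption: keep 0 <= baseflow <= discharge
        baseflow.append(q * frac)
    return baseflow

def bfi(records):
    """Baseflow index: total baseflow divided by total discharge."""
    return sum(cmb_baseflow(records)) / sum(q for _, q, _ in records)

# Hypothetical records: (year, discharge, specific conductance).
records = [(2001, 5.0, 600.0), (2001, 40.0, 250.0), (2001, 10.0, 500.0),
           (2002, 6.0, 650.0), (2002, 60.0, 200.0), (2002, 12.0, 480.0)]
print(f"BFI = {bfi(records):.2f}")
```

Grouping by calendar year is one possible reading of "yearly dynamic"; a water-year grouping would only change the key used in `by_year`.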
However, it must be stressed that verifying the applicability of the CMB method for a site before determining the parameters does not guarantee that the parameters are free of anthropogenic disturbance or that they correspond well to the lowest flows. For example, leakage from an underground storage tank may last for a long time, which may result in many observations of extremely high conductivity that cannot be excluded by the 99th percentile method. There is therefore a possibility that the 99th percentile conductivity does not correspond to the lowest flows. Parameters should thus be assessed after calculation by the 99th percentile method to further avoid abnormal phenomena and errors within separation results.
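A post-calculation check of this kind could take many forms. The sketch below shows one hypothetical heuristic, which is not taken from the study: it asks whether the observations with conductivity at or above SCBF have a median discharge no higher than a chosen low-flow percentile of the record. The 25th percentile default and the function names are assumptions.

```python
import statistics

def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty sequence."""
    s = sorted(values)
    if not s:
        raise ValueError("empty input")
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def sc_bf_matches_low_flow(discharge, conductivity, sc_bf, low_flow_pct=25):
    """True when observations with conductivity >= sc_bf occur at low discharge.

    Hypothetical heuristic: the median discharge of those observations must not
    exceed the low_flow_pct percentile of all discharge values.
    """
    high_sc_q = [q for q, sc in zip(discharge, conductivity) if sc >= sc_bf]
    if not high_sc_q:
        return False
    return statistics.median(high_sc_q) <= percentile(discharge, low_flow_pct)

# Hypothetical record in which the highest conductivities coincide with the lowest flows.
q = [2, 3, 4, 10, 30, 50, 20, 8]
sc_natural = [640, 610, 600, 480, 300, 240, 350, 520]
print(sc_bf_matches_low_flow(q, sc_natural, sc_bf=610))

# Same flows, but a conductivity spike (e.g., a contamination event) occurs at high flow.
sc_disturbed = [640, 610, 600, 480, 900, 240, 350, 520]
print(sc_bf_matches_low_flow(q, sc_disturbed, sc_bf=900))
```

A site that fails such a check would be a candidate for manual inspection of its conductivity record rather than automatic rejection.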

Data requirements for SCBF and SCRO
Determining the shortest monitoring period appropriate for calculating SCRO and SCBF first requires determining the monitoring period needed to obtain a reference standard for the separation results. Generally, the length of the monitoring period is positively related to how accurately the monitoring data reflect the hydrological characteristics of the station, and the BFI obtained from a longer monitoring record will be more reasonable than that obtained from a relatively shorter record. As an example in the present study, using the BFI calculated from 24 months of data as a standard, the random selection of 20 segments in which no more than half of the data are reused requires monitoring periods of greater than 21 years. For this reason, only 26 of 201 sites were selected for analysis in the present study, of which 5 sites allowed the standard BFI to be calculated from 24 months of data, whereas the remaining 21 sites allowed the BFI to be calculated from 12 months of data. Therefore, further comparison and discussion of the data requirements under different standard sampling durations is needed. For the four stream sites in which the standard sampling duration was 24 months and in which the monitored data followed a normal distribution, the BFI calculated from 24-month data and that calculated from yearly data were each treated as a standard. The Student's t test was used to compare differences between the BFI obtained from 3, 6 or 9 months of data and the BFI obtained from standard sampling durations (Table 2). The results showed that minimum sampling durations were all less than or equal to 6 months, which indicated that the results obtained using a 12-month sampling duration as a standard were also reasonable. Li et al. (2014) similarly questioned their assumption of requiring a dataset of 12-month duration to provide the best representativeness for a watershed and stressed that the uncertainties associated with variations in SCRO and SCBF over years require further study. The results of the present study support their hypothesis that variations in SCRO and SCBF over years will not have a substantial impact on the determination of standard sampling duration.
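The comparison of short-record BFI values against a standard can be illustrated with a one-sample Student's t statistic. The exact test configuration is not given in this section, so the sketch below assumes a one-sample, two-sided test at the 5 % level on 20 segment BFIs (19 degrees of freedom, critical value 2.093). The BFI values and function names are hypothetical.

```python
import math
import statistics

# Two-sided 5 % critical value of Student's t for 19 degrees of freedom (20 segments).
T_CRIT_DF19 = 2.093

def one_sample_t(sample, reference):
    """t statistic for the mean of `sample` against a fixed reference value."""
    n = len(sample)
    sd = statistics.stdev(sample)
    if sd == 0:
        raise ValueError("zero variance: t statistic undefined")
    return (statistics.fmean(sample) - reference) / (sd / math.sqrt(n))

def differs_from_standard(sample, reference, t_crit=T_CRIT_DF19):
    """True when the segment BFIs differ significantly from the standard BFI."""
    return abs(one_sample_t(sample, reference)) > t_crit

standard_bfi = 0.45                 # hypothetical BFI from the standard sampling duration
bfi_6_month = [0.44, 0.46] * 10     # 20 hypothetical segment BFIs centred on the standard
bfi_3_month = [0.30, 0.32] * 10     # 20 hypothetical segment BFIs biased low

print(differs_from_standard(bfi_6_month, standard_bfi))
print(differs_from_standard(bfi_3_month, standard_bfi))
```

With a different number of segments, the critical value would need to be replaced by the one for the corresponding degrees of freedom.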

Conclusions
Through comprehensive qualitative and quantitative analysis of stream discharge and conductivity data for more than 200 hydrological stations in the Mississippi River basin, the present study systematically addressed key questions related to the application of the CMB method to particular sites for baseflow separation. In general, the CMB method was found to be more applicable to tributaries, headwater sites, sites at high altitude and sites with little influence from anthropogenic activities. The applicability of the CMB method can be determined by analyzing the inverse correlation between stream discharge and conductivity. Continuous monitoring of flow and conductivity for at least 6 months is required to ensure the reliability of baseflow separation results obtained with the CMB method. Within a long series of monitoring data, the 1st percentile method and dynamic 99th percentile method are recommended to determine the parameters SCRO and SCBF, respectively. Further study is required to determine which 6 months should be selected for continuous monitoring after the shortest sampling period is determined, as this could be closely related to the geographical location and meteorological conditions of each station. In addition, future research should address whether monitoring should occur during the wet season, dry season, or both. Future research should also consider large watersheds in other latitudes and climates so as to compare and verify the conclusions of the present study and to establish more generalized methods. The present study can act as a reference for the identification of parameters of baseflow separation methods so as to improve the accuracy of these methods.
Data availability. All streamflow and conductivity data can be retrieved from the US Geological Survey's (USGS) National Water Information System (NWIS) website using the special site number: http://waterdata.usgs.gov/nwis (last access: 10 March 2019) (NWIS, 2019).
Author contributions. HL developed the research train of thought. CX completed the data requirement analysis. JZ carried out the CMB method suitability assessment. BL compared different parameter determination methods. HL prepared the manuscript with contributions from all the coauthors.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. This work is supported by the project funded by the National Key R&D Program of China (2018YFC0406503), the National Natural Science Foundation of China (U19A20107, 41702252) and special funds for basic scientific research operating expenses of central universities. We would like to express our sincere thanks to the editor and the anonymous reviewers for the constructive and positive advice and comments which helped improve the manuscript.
Review statement. This paper was edited by Stacey Archfield and reviewed by two anonymous referees.