Analysing inter-relationships among water, governance, human development variables in developing countries

The “Integrated Water Resources Management” principle was formally laid down at the International Conference on Water and Sustainable Development in Dublin in 1992. One of the main results of this conference is that improving Water and Sanitation Services (WSS), being a complex and interdisciplinary issue, requires collaboration and coordination among different sectors (environment, health, economic activities, governance, and international cooperation). These sectors influence or are influenced by access to WSS. Understanding these interrelations is crucial for decision makers in the water sector. In this framework, the Joint Research Centre (JRC) of the European Commission (EC) has developed a new database (the WatSan4Dev database) containing 42 indicators (called variables in this paper) drawn from environmental, socio-economic, governance and financial aid flow data in developing countries. This paper describes the development of the WatSan4Dev dataset, the statistical processes needed to improve data quality and, finally, the analysis carried out to verify the coherence of the database. Based on 25 relevant variables, the relationships between variables are described and organised into five factors (HDP – Human Development against Poverty, AP – Human Activity Pressure on water resources, WR – Water Resources, ODA – Official Development Aid, CEC – Country Environmental Concern). Linear regression methods are used to identify key variables influencing water supply and sanitation. A first analysis indicates that informal urbanisation is an important factor negatively influencing the percentage of the population with access to WSS. Health, and in particular children’s health, benefits from the improvement of WSS. Irrigation also enhances water supply services thanks to multi-purpose infrastructure. Five country profiles are also created to better understand and synthesise the information gathered.
This new classification of countries is useful in identifying countries with a less advanced position and the weaknesses to be tackled. The relevance of the indicators gathered to represent the state of the environment and of water resources is questioned in the discussion section. The paper concludes that the reliability of current indicators needs to be increased and calls for further research on specific indicators, in particular on water quality at national scale, in order to better include the environmental state in analyses of WSS.


Introduction
The experience of development cooperation in the water sector shows that building water supply and sanitation (WSS) infrastructure alone is insufficient to bring sustainable water supply and sanitation services to the population. Water Supply (WS) and Sanitation (S) are complex issues that affect other sectors across society: first and foremost health and the environment, but also institutional capacities and economic sectors such as agriculture and industry.
A multi-dimensional approach is needed and therefore, in 1992, principles for sustainable management in the water sector were formalised with the adoption of the integrated water resources management (IWRM) approach 1 by the international community. "IWRM is defined as a process aimed at ensuring that water is used more efficiently (economic dimension), promoting equitable access to water (social dimension) and guaranteeing sustainability (environmental dimension)" (Europe Aid, 2009). Concretely, this has led to a redefinition of strategies and practices in the water sector: improving the efficiency and sustainable development of water and sanitation services should involve all stakeholders (e.g. institutions, civil society, suppliers, funders) and sectors (e.g. education, health, economic activities) concerned with building appropriate facilities.
The Millennium Development Goals (MDGs) initiative also calls for increased efforts and new solutions to extend WSS coverage. Relevant for developing countries, the millennium goals were set by the international community in 2000 to foster efforts towards poverty reduction. Water and sanitation are addressed under Target 7C: Halve, by 2015, the proportion of the population without sustainable access to safe drinking water and basic sanitation.
In terms of research, this cross-cutting approach translates into crossed analyses of the different dimensions of a question. For instance, Adler et al. (2010) built a framework for the analysis of human development index data, financial resources, and the MDG targets. It allows the evaluation of a country's progress towards the MDGs, considering the development measures together with financial flow indicators. Botting et al. (2010) evaluated the impact of Official Development Aid (ODA) on water and sanitation coverage and on infant and child mortality.
Applying the same approach, this research aims to identify the key indicators (variables) explaining the various levels of access to WSS at national scale. Such a comprehensive view could support decision making at country scale and give international donors a general picture that helps them better orient their investments. For instance, in the case of Africa, regarding the prioritisation of fund allocation, "it is unclear if the EU donors are focusing their commitments and disbursements on those countries that most lack water supply and sanitation" (Fonsceca et al., 2008). Beyond this mismatch between the level of WSS and the funds allocated, it is useful to identify the main reasons explaining a low level of access. In fact, a similar level of access may result from different factors, which fall roughly into two groups: those related to the infrastructure itself ("hardware") and those related to its sustainable management ("software"). This software component is related to the context in which the infrastructure should function and can include governance conditions, constraints on water resources, political commitment towards the population and the environment, and the availability of the necessary skills. The identification of country weaknesses (within hard or soft factors) should follow the same framework for all developing countries.
In practice, the data on the variables influencing WSS are currently scattered among international and/or national providers, which makes it difficult to use them directly to support decision makers. For this reason, one of the objectives of this work is to build a common framework around a coherent database (WatSan4Dev) based on the available variables collected by official international providers.
Once the various indicators on economic and social status, governance, and the environment related to WSS services at national scale have been gathered, their reliability and coherence, as well as their relevance to WSS issues, must be verified before the database is made available. To this end, the work proposed here presents several statistical methodologies to pre-process the indicators and to analyse the different variables and observations. The MDG indicators, namely the percentage of the population having access to improved water supply and sanitation, are the references representing WSS.
The analysis of the different variables is performed at two levels (Fig. 1): (a) the analysis of the variables themselves, which gives an insight into the relationships among the different indicators and their relevance in the water sector; and (b) the analysis of the observations (countries), which offers a classification of the countries with respect to the water sector. This geographical analysis is also used to observe country status regarding the selected variables, and therefore to highlight countries where efforts should be prioritised. Several country profiles are proposed as a tool to identify weaknesses restricting country development and population well-being.
Within this framework, this paper presents the data sources and variable selection criteria used for the WatSan4Dev database (Sect. 2) and the methods and analyses performed to establish the coherence of the dataset (Sect. 3). It finally presents the results of the coherency verification process of WatSan4Dev for 2004 (Sect. 4). This year is taken as reference because it corresponds to the latest release of the MDG variables regarding water and sanitation available when this work started. Key variables influencing or being influenced by the level of WSS are presented in Sect. 4. An in-depth analysis of country behaviours is also described through the definition of five profiles.

Database description
International institutions collect and provide data, with detailed methodologies, as needed to support and orient their actions. The indicators that monitor progress towards the MDGs are under the responsibility of the United Nations. These data are freely available, accessible through online databases and considered by the international community as reference data (AQUASTAT, CPI, Earth trends, GEO portal, JMP, OECD and World Bank databases). However, data collection and processing methodologies are heterogeneous, which calls for caution and requires additional analyses before use. Better harmonisation of data and methods between international organisations is the main means of reducing errors in analyses.
The indicators used in this work are collected from the World Bank (WB), Organisation for Economic Cooperation and Development (OECD), Food and Agriculture Organization (FAO), World Health Organization (WHO), United Nations Department of Economic and Social Affairs (UN DESA), United Nations Development Programme (UNDP), United Nations Statistics Division (UNSD), United Nations Human Settlements Programme (UN-HABITAT) and the Joint Monitoring Programme (JMP). Some indicators come from research institutions such as universities (Yale, Columbia, and Harvard Universities), NGOs (Transparency International) or institutes (Wallingford, Centre for Ecology and Hydrology) that benefit from international recognition in the domain.
The compatibility and consistency of these indicators, in terms of geographical and temporal scales, are major constraints in the analysis process. The national scale is chosen as most of the data are supplied at this level. Data sets for 2004 are used because this was the year covered by the latest Joint Monitoring Programme (JMP) report on WSS access levels available at the beginning of this research. Moreover, the majority of the other indicators have more complete datasets for 2002 to 2004.
The indicators are chosen by considering all the variables that can result from and/or influence the WSS levels (two-way relationships). These variables are thematically clustered under four main areas or pillars: Water Resources (water resources and environmental variables), Human activity pressure on water resources (human activity pressure variables), Governance (variables measuring various aspects of governance, including environmental aspects), and Human development (social and health variables).
Complementary to these four thematic pillars, and as developing countries are the main focus of this work, Official Development Aid delivery is included in the database. This last pillar represents the global official aid disbursed to developing countries by donors.
Initially, the data collection covered countries worldwide. A total of 132 variables were analysed based on the following main criteria: (i) relevance: the variable plays a potential role regarding the WSS level; (ii) data availability: the variable collected has enough observations (less than 50 % of the data in the indicator are missing); (iii) reliability: the variable is produced by trusted official providers and the method is fully described; and (iv) inter-correlation: variables with high correlations with other variables (above 0.9) are directly removed. After applying these criteria to the whole set of variables, 42 variables were pre-selected for this work (Table 1) and included in the WatSan4Dev database. The description of all variables is detailed in Appendix A.
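Criteria (ii) and (iv) are mechanical enough to be illustrated in code. The following is a minimal sketch, not the procedure actually used to build WatSan4Dev: the function name `screen_variables`, the pairwise-complete correlation and the greedy rule (a variable is dropped when it is too strongly correlated with one already kept) are assumptions made for illustration. Criteria (i) and (iii) require expert judgement and are not covered.

```python
import numpy as np

def screen_variables(data, names, max_missing=0.5, max_corr=0.9):
    """Return the names of variables passing the availability and correlation screens.

    data: 2-D array, rows = countries, columns = variables, NaN = missing value.
    Hypothetical helper: the greedy order-dependent rule below is an assumption.
    """
    data = np.asarray(data, dtype=float)
    # Criterion (ii): keep variables with less than `max_missing` share of missing data.
    missing_share = np.isnan(data).mean(axis=0)
    candidates = [j for j in range(data.shape[1]) if missing_share[j] < max_missing]

    kept = []
    for j in candidates:
        redundant = False
        for k in kept:
            # Pearson correlation computed on countries where both variables are observed.
            both = ~np.isnan(data[:, j]) & ~np.isnan(data[:, k])
            if both.sum() < 3:
                continue
            r = np.corrcoef(data[both, j], data[both, k])[0, 1]
            # Criterion (iv): drop variables too strongly correlated with a kept one.
            if abs(r) > max_corr:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return [names[j] for j in kept]
```

Because the rule is greedy, the result depends on the column order; in practice the choice of which variable to keep within a correlated group is a thematic decision.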
The errors and incoherence of the variables collected (the relationships between the variables and magnitudes of the values) are tracked through Principal Component Analysis (PCA).
Concerning uncertainty, standard errors and confidence bounds are rarely available, in particular for socio-economic indicators. However, the variables are collected from official international providers that carry out substantial verification and consolidation of their datasets. For instance, the JMP holds "data reconciliation processes" to reduce discrepancies among national data sources and between these sources and the international estimates generated by the JMP. These processes aim at ensuring the quality of the data being collected in a country. In addition to validation through human interaction, FAO performs cross-comparisons with similar countries as well as with historic data for the country in question, and mathematically checks for consistency and correctness.
Because of the strong heterogeneity of the sources and the computation methods, the WatSan4Dev dataset should be considered by the community for qualitative analysis and not for quantitative interpretation purposes.
Although the WatSan4Dev database includes 192 countries, only 101 countries are considered in this study: small states, islands and countries having more than 35 % of missing values are removed to limit perturbations of the analysis and, in particular, to avoid bias in the imputation process. country belongs to low income country category 2. Bahrain, Kuwait, Qatar and the United Arab Emirates are included here because they are considered emerging and developing countries by the IMF. Countries from Eastern Europe and Central Asia are excluded from this analysis because, first, they are outside our target countries and, second, they form a group with particular behaviour that would require a specific analysis.

Methods
Taking into account the strong heterogeneity of the indicators, in terms of both data collection methods and the geographical origin of the data, several data pre-processing methods are used. First, normalisation and multi-variate data imputation are applied to fill in missing values. Then principal component analysis and factor analysis are used in two different ways: (a) to analyse the relationships between variables and track errors or inconsistencies within the dataset; (b) to reduce the number of variables while maximising the information retained from them.
This reduction of variables leads to the construction of relevant factors that allow: (a) clustering the variables to identify the relationships among them and detect collinearity; (b) clustering observations-countries to build country profiles with homogeneous behaviours. From the first group of factors (variable analysis), linear regression is chosen as a robust method to identify the key indicators being influenced by or influencing WSS variables. From the second group of factors (observations-countries), the construction of these profiles implies the application of k-means clustering methods to identify similar behaviour among countries.
2 World Bank method for country classification: http://data.worldbank.org/about/country-classifications

Normalization and data imputation
One of the major difficulties in implementing and analysing the WatSan4Dev dataset is missing data. The solution adopted is to process the missing data using multiple imputation methods (Horton and Lipsitz, 2001). Indeed, standard imputation methods such as mean, mode or nearest neighbour methods introduce important biases in the data distribution and therefore have an impact on the analysis and interpretations.
The imputation process can only be applied to data following a normal distribution. Therefore, prior to this imputation step, the distributions of these 42 variables are normalised. Standard normalisation processes are used according to the data distribution (square root, logarithmic and Ordinary Least Squares (OLS) regression normalisation). Complementary normalisation tests were performed to verify the statistical stability of the variables, that is, to check that the data distribution was not affected by the normalisation process. The multiple imputation methods compare country observations based on several indicators and impute missing data without modifying the statistical nature of the variables. Missing data (m) computations are performed to obtain realistic values rather than accurate quantitative values, because of the qualitative nature of the indicators collected. This ensures the coherency and significance of the analysis performed, but quantitative interpretations are to be avoided.
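As an illustration of the square root and logarithmic transformations mentioned above, the sketch below picks, for one strictly positive variable, the transformation whose result is least skewed. Selecting by absolute skewness is an assumption made here for illustration; the paper only states that the transformation was chosen according to the data distribution. The function names are hypothetical.

```python
import numpy as np

def skewness(x):
    """Sample skewness (third standardised moment), ignoring missing values."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]
    centred = x - x.mean()
    return (centred ** 3).mean() / (centred ** 2).mean() ** 1.5

def least_skewed_transform(x):
    """Among identity, square root and logarithm, return the least skewed result.

    Assumes strictly positive values; NaNs are propagated unchanged.
    """
    x = np.asarray(x, dtype=float)
    candidates = {"identity": x, "sqrt": np.sqrt(x), "log": np.log(x)}
    name = min(candidates, key=lambda key: abs(skewness(candidates[key])))
    return name, candidates[name]
```

A formal normality test would be a natural complement to this skewness check before the imputation step.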
In this work, the Expectation-Maximization (EM) algorithm (Honaker and King, 2010) was chosen as a multiple imputation method. It involves imputing m values for each missing cell in the data matrix and creating m completed data sets. Across these completed data sets, the observed values are the same, but the missing values are filled in with a distribution of imputations that reflects the uncertainty about the missing data.
The EMB algorithm (Expectation Maximization Bootstrap) 3 is used in this paper for imputation. This algorithm combines the classic EM algorithm with a bootstrap approach: for each draw, the algorithm bootstraps the data to simulate estimation uncertainty and then runs the EM algorithm to find the mode of the posterior for the bootstrapped data. Several completed sets are created and then combined under the analyst's control. The assumptions are: (1) the complete data (that is, both observed and unobserved) are multi-variate normal; and (2) the data are missing at random (MAR). This algorithm is fully described in Honaker et al. (2011).
The dataset is completed incrementally, by imputing missing data for the variables with fewer missing values before processing the more incomplete ones. This process is applied to 156 observations-countries. The imputation process includes manual verifications and specific case-by-case corrections.
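The bootstrap-then-EM idea can be sketched as follows. This is a simplified illustration under the two assumptions stated above (multi-variate normality and MAR), not the implementation described in Honaker et al. (2011): the function names, the fixed number of EM iterations, the small `ridge` term added for numerical stability and the requirement that every country has at least one observed value are all assumptions of this sketch.

```python
import numpy as np

def conditional(mu, sigma, row, miss):
    """Mean and covariance of the missing entries of `row` given its observed entries."""
    obs = ~miss
    # Regression coefficients of the missing variables on the observed ones.
    coef = np.linalg.solve(sigma[np.ix_(obs, obs)], sigma[np.ix_(obs, miss)])
    mean = mu[miss] + (row[obs] - mu[obs]) @ coef
    cov = sigma[np.ix_(miss, miss)] - sigma[np.ix_(miss, obs)] @ coef
    return mean, (cov + cov.T) / 2  # symmetrise against rounding error

def em_mvn(X, n_iter=50, ridge=1e-6):
    """EM estimate of the mean and covariance of a multivariate normal with NaNs."""
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0)) + ridge * np.eye(p)
    for _ in range(n_iter):
        # E-step: expected values of missing cells plus their conditional covariance.
        filled = np.where(miss, mu, X)
        extra = np.zeros((p, p))
        for i in np.flatnonzero(miss.any(axis=1)):
            mean, cov = conditional(mu, sigma, X[i], miss[i])
            filled[i, miss[i]] = mean
            extra[np.ix_(miss[i], miss[i])] += cov
        # M-step: update the parameters from the expected sufficient statistics.
        mu = filled.mean(axis=0)
        centred = filled - mu
        sigma = (centred.T @ centred + extra) / n + ridge * np.eye(p)
    return mu, sigma

def emb_impute(X, m=5, seed=0):
    """Return m completed copies of X (bootstrap the rows, run EM, draw imputations)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    miss = np.isnan(X)
    completed = []
    for _ in range(m):
        boot = X[rng.integers(0, len(X), size=len(X))]  # resample countries
        mu, sigma = em_mvn(boot)
        filled = X.copy()
        for i in np.flatnonzero(miss.any(axis=1)):
            mean, cov = conditional(mu, sigma, X[i], miss[i])
            filled[i, miss[i]] = rng.multivariate_normal(mean, cov)
        completed.append(filled)
    return completed
```

Observed cells are identical across the m completed sets; only the imputed cells vary, and their spread reflects the uncertainty about the missing data.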

Analysis methods
Both PCA and FA are variable reduction methods. Indeed, by analysing the correlations between variables in a dataset, the variables can be reduced to a smaller number of Factors (FA) or Principal Components (PCA). Both methods provide a set of loadings (correlations between the original variables and the extracted factors or components) and a set of scores (the values each data item takes on the extracted factors/components after the variable reduction).
The most important differences between PCA and FA are:
1. PCA analyses all the variance present in the dataset. PCA is used to find optimal ways of combining variables by establishing potential relationships among the dataset variables, in order to obtain an empirical summary of the data.
2. FA analyses only the common variance, that is, the variance uncontaminated by unique and error variability. Therefore, FA is less sensitive to noise in the dataset, as it is based on assumed underlying factors. Thematic knowledge of the dataset (a set of hypotheses that form the conceptual basis of the study) is required for applying FA.
FA is usually combined with PCA in order to confirm, through the thematic knowledge of the dataset, the variable reduction proposed by the PCA.

Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a descriptive statistical technique allowing an optimised graphic representation of multi-dimensional data. The representation allows a simultaneous description of the relationships (correlation matrix) between the variables (N) and the similarity (coordinates of the observations in the space of the Principal Components) among the observations (M). One of the main advantages of the method is the reduction of the initial N-dimensional space to a low-dimensional map (the optimal view for a variability criterion) and the construction of a set of P uncorrelated factors (called Principal Components). The PCA technique is widely used in multi-variate analysis to derive dominant patterns of variability.
From a formal point of view, PCA uses an orthogonal transformation converting a set of observations (M) of possibly correlated variables (N) into a set of values of uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components. This essential feature allows only the three main PCA factors to be represented in Fig. 2, because they gather a significant level of variability. The interpretation is therefore facilitated, as the reduced number of dimensions makes the representation directly readable. The independence of the principal components is ensured, as the dataset is jointly normally distributed after the normalisation phase. PCA computation details can be found in Pearson (1902) and Hotelling (1935).
PCA is used in this work to verify the coherency and robustness of the WatSan4Dev dataset by tracking data errors and incoherencies among the variables. This analysis is suitable for understanding multi-dimensional observations: the behaviour of 25 variables-indicators for 101 observations-countries. PCA provides a global view of the correlations among variables and/or countries.
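The orthogonal transformation described above can be written compactly as an eigen-decomposition of the correlation matrix. The following is a minimal sketch with NumPy, assuming a complete (already imputed) data matrix; the function name and the returned quantities are illustrative choices and do not describe the software used for this work.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (rows = observations, columns = variables)."""
    X = np.asarray(X, dtype=float)
    # Standardise each variable so that PCA operates on correlations.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]              # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings: correlations between original variables and components.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    scores = Z @ eigvecs                           # coordinates of the observations
    explained = eigvals / eigvals.sum()            # share of variability per component
    return eigvals, loadings, scores, explained
```

With these outputs, the number of components with an eigenvalue above 1 is `int((eigvals > 1).sum())`, and `explained.cumsum()` gives the cumulative share of variability they account for.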

Factor Analysis (FA)
Factor analysis (FA) is a statistical method used to describe variability among observed and correlated variables in terms of a potentially lower number of unobserved and uncorrelated variables called factors. While PCA analyses all the observed variance in order to find optimal ways of combining variables into a small number of subsets, FA uses regression modelling techniques to test hypotheses and produce error estimates by analysing only the shared variance (Bartholomew et al., 2008). The latter method allows the structure underlying such variables to be identified and scores measuring the latent factors themselves to be estimated. FA computation details can be found in Thurstone (1947). The significant factors highlighted by the FA are the basis for the country analyses in this paper.

K-means clustering
The K-means clustering algorithm was developed by MacQueen (1967). It classifies or groups objects, based on their attributes/features, into K groups. The grouping is done by minimising the sum of squared distances between the data and the corresponding cluster centroid. This clustering method is used to analyse countries' characteristics in order to define different country profiles.
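The minimisation described above is commonly carried out with an iterative assign-and-update scheme (Lloyd's algorithm). The sketch below is one such implementation, given for illustration only: the random initialisation from data points, the fixed seed and the function name are assumptions, and it is not claimed to be the variant used to build the country profiles.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Group `points` (rows = observations) into k clusters with Lloyd's algorithm."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initial centroids: k distinct observations drawn at random (an assumption).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: squared distance of every point to every centroid.
        dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment and within-cluster sum of squares for the returned centroids.
    dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    inertia = dist[np.arange(len(points)), labels].sum()
    return labels, centroids, inertia
```

In the setting of this paper, `points` would hold the factor scores of the countries and `k` the number of profiles; since the result depends on the initial centroids, several seeds are usually compared and the solution with the lowest sum of squares retained.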

Ordinary Least Squares (OLS) linear regression analysis
Although the modelling phase is not part of this paper, the classical Ordinary Least Squares (OLS) regression analysis is used here to provide an exploratory identification of the key explanatory variables influencing the water supply and sanitation access variables.
The OLS regression analysis is a method for predicting the value of dependent variables $Y_i$, based on the values of independent variables $X_{j,i}$. The equation system can be written as follows:
$$Y_i = \beta_{0,i} + \sum_{j=1}^{p} \beta_{j,i} X_{j,i} + \varepsilon_i$$
where $Y_i$ are the dependent variables, $\beta_{0,i}$ are the intercepts of the model, $X_{j,i}$ corresponds to the $j$-th explanatory variable of the model ($j = 1$ to $p$), $\beta_{j,i}$ is its coefficient, and $\varepsilon_i$ is the random error with expectation 0 and variance $\sigma^2$. The $\beta_{0,i}$ and $\beta_{j,i}$ parameters and the $\varepsilon_i$ errors are estimated from the observations. The results of the OLS analysis are validated by the goodness-of-fit coefficients of the model (the coefficient of determination, $R^2$), the variability explained by the model and the analysis of variance. Fisher's F-test is also used to assess the overall significance of the model, that is, the risk of wrongly rejecting the null hypothesis that the explanatory variables have no effect. Finally, the relative influence of the explanatory variables is considered significant if it agrees with field experience, classical case studies or the scientific literature in the domain.
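For a single dependent variable, the estimation and the two validation quantities mentioned above (R² and the F statistic) can be sketched as follows. This is a generic illustration with NumPy rather than the code used for the paper; the function name is hypothetical, and converting the F statistic into a p-value would additionally require the F distribution (not shown).

```python
import numpy as np

def ols_fit(X, y):
    """Fit y = b0 + sum_j b_j * X[:, j] + error by ordinary least squares.

    Returns the coefficients (intercept first), R squared and the F statistic.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])      # column of ones for the intercept
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)          # unexplained variability
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variability
    r2 = 1.0 - ss_res / ss_tot
    # F statistic with p and n - p - 1 degrees of freedom.
    f_stat = ((ss_tot - ss_res) / p) / (ss_res / (n - p - 1))
    return beta, r2, f_stat
```

Here `y` would be, for example, the water supply access variable and the columns of `X` the candidate explanatory variables.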

Results
As described in the following paragraph, only 25 variables are selected using PCA and FA; these are found to be sufficient and relevant to perform robust analyses in accordance with the approach described in Fig. 1. These variables are considered coherent if, in addition to showing statistical coherency and robustness, they also agree with field experience, case studies and the scientific literature. The following analysis of variables aims at identifying their relationships. PCA and FA are also used to synthesise the information through a smaller, representative number of variables-components. The components are then used to analyse the observations and build country profiles. Two other main results are obtained: (i) an understanding of the relationships between variables and (ii) the proposal of five profiles to support a geographical analysis of the data, aimed at identifying the least favourable country situations in order to help orient efforts.

Selection of variables
In this work, only 25 representative variables out of the 42 WatSan4Dev variables are used for the statistical analyses; excluded and selected variables are listed hereafter. The Pearson correlation coefficient is used for the multi-variate analyses. Governance variables, namely the Worldwide Governance Indicators (WGI), the Corruption Perception Index (CPI) and Environmental governance (Env Gov), are highly correlated with one another (correlation values range from 0.645 to 0.852), which shows coherency despite different gathering methods and data sources. Their number is reduced to avoid multi-collinearity. Governance Effectiveness (WGI-GE) is taken as representative of several other governance indicators because of its high correlation with the Corruption Perception Index (0.881 correlation, CPI), Rule of Law (0.906 correlation, WGI-RofL), Regulatory Quality (0.876 correlation, WGI-RQ) and environmental governance (0.807, Env gov). In the same way, the child mortality rate (Child Mort-5) represents life expectancy at birth (life expect) and the fertility rate (fertility): a high child mortality rate means a low life expectancy at birth and, in developing countries, is correlated with a high fertility rate. Health expenditure (Health expect) has a high correlation (> 0.9) with the income per capita (GDP per capita).
Water bodies surface (WB) and National Biologic Index (NBI) are correlated with few variables, mainly the amount of precipitation (0.439 and 0.732 correlation with Precipit, respectively) and the amount of water resources available per capita (0.569 correlation between NBI and TWRR), but are not correlated with WSS. In the same way, Dam capacity (DAM) shows no significant correlation with any other variable, indicating that the provision of dams and reservoirs has neither a direct nor an indirect impact on the level of access to WSS.
Therefore, these variables are excluded from the analysis for not contributing to the explanation of water supply (WS) or sanitation (S) variables.
The rural and urban population growth variables (Growth Urb pop and Growth Rural Pop) represent complex processes that are not targeted in this paper. Indeed, the evolution of the rural/urban population requires considering both the distribution of the population (Urban pop) and its own dynamics (Growth Urb pop and Growth Rural Pop). The interaction between these variables involves demographic, economic and urban transition processes 4. Developing countries are at the beginning of this multi-dimensional transition. As this is a long and complex process, countries can be at different stages of the transition, but they should move towards an increase in urban population, with a lower growth of their total population due to the drop in the fertility rate and the increase in income (UNFPA, 2007).
The Environmental Sustainability Index (ESI), the Water Poverty Index (WPI) and the Human Development Index (HDI) are also excluded because they are composite indicators that overlap with simple variables. They were only used as reference indicators projected into the space of the PCA to support the interpretation of the factors.
The PCA shows that the proportion of country land dedicated to agriculture (% agri area) is not significant in this study; it is therefore removed.

Understanding relations between variables
Multi-variate analyses, successively PCA and FA, are used with a twofold objective: (1) verifying the consistency of the selected variables, both statistically and against theory and field experience; and (2) analysing their relationships.
The first five components, each with an eigenvalue above 1, together explain more than 73 % of the variability (Table 3 and Fig. 2). These components are the basis for the following analysis of relationships between variables.

Factor 1 (HDP)
Factor 1 (HDP) represents the human development of the country, taking into account health (Child Mortal-5, Malaria), education (School enrol, School G/B), living conditions (Urban Pop, % Slums, WS, S), income (GDP per cap, femal eco, poverty) and governance (WGI-GE). Following the synthesis effort of the HDI, the HDP factor represents the level of human development in a more multi-dimensional way by including health, gender, urbanisation and governance aspects. The HDI is, indeed, based on four indicators: life expectancy at birth, mean years of schooling, expected years of schooling and Gross National Income per capita. As expected, Fig. 2 shows that variables related to poverty (female economic activity rate, "femal eco"; the percentage of slums, "% slums"; child mortality rate, "child Mortal-5"; poverty rate, "poverty") are negatively correlated with development indicators (the income per capita, "GDP per cap"; water supply and sanitation levels, "WS" and "S").
The participation of women in economic activities is correlated with malaria (0.589 correlation), the proportion of urban slums (0.511 correlation), water supply access (−0.498 correlation), the mortality of children under 5 years old (0.477 correlation, Child Mortal-5), and other variables expressing the poverty of the country. This correlation agrees with the well-known behaviour of female economic participation described in Boserup (1989), Mammen and Paxson (2000), and Beguy (2009). The percentage of women being economically active is high when the income per capita is low and drops as the GDP of the country increases, up to a threshold of 2550 $ per capita (valid for the 1970-1985 period). Beyond this threshold, the trend reverses and female economic activity increases with GDP (see Appendix B). Developing countries in the WatSan4Dev dataset are clearly in line with the first part of this behaviour: a high female economic activity rate is therefore an expression of poverty.
Governance (WGI-GE) is correlated mainly with the development of the country: 0.724 correlation with Gross Domestic Product per capita (GDP per cap), 0.647 correlation with Water Supply access (WS), 0.559 with basic Sanitation access (S), and 0.518 correlation with gross school enrolment (School enrol).

Factor 2 (AP)
Factor 2 (AP) shows high factor loadings for variables related to water consumption according to use (Total Withdrawals, Withdrawal municipal and Withdrawal industrial), together with indicators related to irrigation (% irrigation) and the intensity of agriculture (water use int agri). It can be considered a measure of how economic activities put pressure on water resources in a country. As is well known, water demand in developing countries is mostly for agricultural purposes (Hinrichsen et al., 1997). Almost 82 % of the developing countries considered (83 out of 101) allocate more than 50 % of their water to agriculture. Therefore, total water withdrawal is negatively correlated with municipal (−0.553 correlation) and industrial (−0.251 correlation) withdrawals. The more positive the factor loading, the more predominant agriculture is. Conversely, the closer the factor loading is to 0 or the more negative it is, the more the share of municipal and industrial activities in water consumption grows.
In addition to main high correlations with withdrawals and agriculture intensity (Water Use intensity in agri), the level of agricultural land equipped in irrigation system (% irrigation) is more correlated with to the level of development of the country than to the water resources variables (TWRR and Precipit).We advance the hypothesis that irrigated perimeters captured by this variable are medium or large scale schemes; very small irrigation parcels are not included because of the data collection scale and data gathering methodologies.Irrigation development underlies investment, technological capacities and strong political will for agricultural development.Water resources availability accounts for little in the choice of irrigation development in this case in developing countries.

Factor 3 (WR)
Factor 3 (WR) gathers high factor loadings for variables expressing the amount of water resources available: the amount of precipitation (precipit), the estimated amount of renewable water resources available per capita (TWRR) and the proportion of land under desertification risk (desert risk). The more negative the value, the more arid the country, either directly desert as in the Middle East or receiving low water resources that can be under threat, as in the Sahel countries.

Factor 4 (ODA)
Factor 4 (ODA) represents Official Development Aid, both the global flow and the specific flow dedicated to Water Supply and Sanitation. The governance indicator related to political stability and absence of violence (WGI PS AV) is included in this component. ODA consists of funds dedicated to development, not to emergency purposes. This component suggests that donors require a minimum of political stability from beneficiary countries before committing and transferring funds. Besides stability and absence of violence (WGI PS AV), global Official Development Aid (ODA) is also correlated with health (0.460 correlation, Child mortal-5), access to sanitation and water facilities (−0.450 and −0.412 correlation, respectively) and poverty variables (0.408 correlation, Poverty; 0.405 correlation, % slums). These therefore appear to be the main indicators considered by donors for fund commitment.

Factor 5 (CEC)
Factor 5 (CEC) expresses the degree of concern of a country for the environment, empowered by government accountability towards its population (WGI-VA). WGI Voice and Accountability (WGI-VA) measures the ability of citizens to participate in selecting their government and examining its actions. The indicator also includes the level of freedom of expression and of the media and freedom of association, which constitute the necessary conditions for civil society to act as an opposition or controlling force to government. From the perspective of this analysis, this expresses the essential role of civil society in pushing a global environmental policy in the country, for instance regarding climate change mitigation, protection of biodiversity or hazardous waste management. Economic and industrial interests are often an obstacle to environmental protection and related policy making. However, citizens are key actors in pushing such protection processes because they are directly affected by environmental degradation (health, household budget, food security, future generations, ...). More directly, many studies (e.g. Barrett and Graddy, 2000) have found that an increase in civil and political freedoms significantly reduces some measures of pollution and hence improves the quality of the environment.
FA confirms the PCA results by showing the same factor loading matrix (Table 4); its results should therefore be interpreted in the same way.
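As an illustration of how a factor loading matrix of this kind can be obtained, the following minimal sketch computes unrotated PCA loadings from a correlation matrix. It is not the procedure used to produce Table 4: the extraction and rotation settings of the study are not restated here, and the synthetic data and the names `pca_loadings`, `data`, `z1` and `z2` are assumptions made only for this example.

```python
import numpy as np

def pca_loadings(data):
    """Unrotated PCA loadings from the correlation matrix of `data`
    (rows = observations, columns = variables)."""
    R = np.corrcoef(data, rowvar=False)
    eigval, eigvec = np.linalg.eigh(R)          # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]            # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    # Loading of variable i on component j = eigenvector entry * sqrt(eigenvalue)
    loadings = eigvec * np.sqrt(np.clip(eigval, 0.0, None))
    return eigval, loadings

# Synthetic example (hypothetical data): 101 observations, 4 variables,
# two variables driven by one latent factor and two by another.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=101), rng.normal(size=101)
data = np.column_stack([
    z1 + 0.3 * rng.normal(size=101),
    z1 + 0.3 * rng.normal(size=101),
    z2 + 0.3 * rng.normal(size=101),
    z2 + 0.3 * rng.normal(size=101),
])
R = np.corrcoef(data, rowvar=False)
eigval, loadings = pca_loadings(data)
explained = eigval / eigval.sum()               # share of variance per component
```

With all components retained, the loadings reproduce the correlation matrix exactly, and the sum of squared loadings in each column equals the corresponding eigenvalue; keeping only the first few columns gives the reduced loading matrix that is interpreted as factors.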

Identifying key variables
This section describes the analyses performed to identify the key independent variables associated with the WSS variables when considering all developing countries. The relations between the independent variables that are common to developing countries are analysed. For each dependent variable (water supply, WS, and sanitation, S), two linear regression analyses are performed. The objective is to identify the main explanatory (independent) indicators, not the associated models.
The first linear regressions for both dependent variables allow the identification of the proportion of the urban slums, [...] and WHO, 2009). However, the existence of such serious health issues is also an important incentive for policy and commitments towards WSS. Since the variables children under five years' mortality rate and malaria are consequences rather than causes of WSS, we proceed to a second linear regression with those variables removed from the analysis. This allows the identification of complementary key variables masked by the first results. This time the adjusted R² values are 0.686 and 0.635 for WS and S, respectively. As the objective of this study is the identification of the key variables impacting WSS rather than the consequences of good WSS, this section describes the results of the second analysis and not those of the first. The first analyses nevertheless confirm the impact of access to improved WSS on health, represented here by child mortality and the prevalence of malaria. Several variable introduction methods, such as stepwise and backward, have been tested; they provide similar results, which shows once again the stability of the variables and their relationships.
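The idea of selecting explanatory variables and reporting an adjusted R² can be sketched as follows. This is a simplified forward selection that adds a variable only if it raises the adjusted R² by a minimum amount; it is an assumption-laden stand-in, not the stepwise procedure of the statistical package used in the study (which typically relies on F-tests for entry and removal). The data are synthetic, and the names `adjusted_r2`, `forward_select`, `min_gain` and `x0`..`x3` are hypothetical.

```python
import numpy as np

def adjusted_r2(X, y):
    """Fit OLS with an intercept and return the adjusted R^2."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def forward_select(X, y, names, min_gain=0.01):
    """Greedy forward selection: at each step add the variable giving the
    highest adjusted R^2; stop when the gain falls below `min_gain`."""
    selected, best = [], 0.0        # intercept-only model has adjusted R^2 = 0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(adjusted_r2(X[:, selected + [j]], y), j) for j in remaining]
        score, j = max(scores)
        if score - best < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return [names[j] for j in selected], best

# Synthetic example (hypothetical data): only x0 and x2 drive the response.
rng = np.random.default_rng(0)
n = 101
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=n)
selected, score = forward_select(X, y, ["x0", "x1", "x2", "x3"])
```

The adjusted R² penalises each additional regressor through the factor (n − 1)/(n − p − 1), which is why it is preferred to the raw R² when comparing models with different numbers of explanatory variables.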

Water Supply (WS) level
Table 5 presents the parameters of the second linear regression with the stepwise method (adjusted R² = 0.686). Income per capita (Gdp per cap), the proportion of irrigation (% irrigation), urban population (Urban pop) and the percentage of informal settlement (% slums) are found to be the most significant in explaining the level of access to water supply (WS).
Developing countries, undergoing a substantial and often informal urbanisation process, encounter difficulties in providing basic services in slums, which explains the negative coefficient β (−0.333). More generally, urbanisation, and more concretely the kind of urbanisation (urban versus slums), plays an important role in the access to water supply and sanitation facilities. Informal urbanisation developments (slums) negatively impact water supply and sanitation conditions because local authorities in the cities have difficulty facing and structuring population flows from rural areas. Slum development is expected to rise and, by 2050, urban dwellers will likely account for 66 % of the population in the less developed regions (UN DESA, 2010). Extending access to sanitation and water supply in these conditions can therefore be hindered or slowed down. However, cities are an opportunity for investment in water supply and sanitation because the critical mass needed to create new infrastructure and management capacities is easily reachable. Income per capita (GDP per cap), understood as an indicator of living standard, is an important condition for households to access improved infrastructure. Finally, the development of irrigation appears as a positive factor. The reason may lie in the multi-purpose nature of irrigation infrastructure: the water source used for crops is also used to deliver water to livestock and the population.

Sanitation (S) level
Table 6 presents the parameters of the second linear regression with the stepwise method (adjusted R² = 0.635). The total water withdrawal per capita (total withdrawal), the proportion of urban population (Urban pop) and, within this population, the percentage of informal settlement (% slums) are found to be the most significant in explaining the level of access to sanitation (S).
The type of urban development is crucial in explaining the observed access to sanitation, for the same reasons mentioned in Sect. 4.2.1. Total water withdrawal represents the capacity of the society to mobilise available water resources without distinction of uses. It is generally accepted that total water consumption per inhabitant (total water withdrawal) increases with the development of the country because of the extension of household consumption (Gleick, 1996) and/or [...] The development of water supply capacities (e.g. technological aspects, operator and financial capacities) may also spill over to sanitation services. Stakeholders involved in water supply production start investing at some point in the end of the water cycle, meaning wastewater treatment and sanitation. In that case, the link with the amount of water withdrawals is direct.

Clustering countries with similar profile
The K-means clustering method is used to identify groups of countries with a similar profile (Fig. 3). The interpretation is relative to the 101 country observations. Hierarchical Agglomerative Clustering was also tested, but the K-means method provided more relevant clusters. The analysis was performed with different numbers of clusters; five clusters were found to be the best choice and coherent with the five factors obtained from the previous analysis (HDP - Human Development against Poverty, AP - Human Activity Pressure, WR - Water Resources, ODA - Official Development Aid, CEC - Country Environmental Concern).
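The clustering step, and the notion of a "most representative" country as the one closest to the cluster centroid, can be sketched as follows. This is a minimal K-means (Lloyd's algorithm) with a deterministic farthest-point initialisation chosen only to keep the example reproducible; it is not the implementation used in the study. The synthetic five-dimensional scores and the names `kmeans`, `representatives` and `blob_centres` are assumptions for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Minimal K-means (Lloyd's algorithm) with farthest-point initialisation."""
    centroids = [X[0]]
    for _ in range(1, k):
        # squared distance of every point to its nearest chosen centroid
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[int(np.argmax(d))])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def representatives(X, labels, centroids):
    """Index of the observation closest to each centroid (the 'reference' case)."""
    reps = {}
    for j, c in enumerate(centroids):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        d = np.linalg.norm(X[members] - c, axis=1)
        reps[j] = int(members[np.argmin(d)])
    return reps

# Synthetic example (hypothetical data): three well-separated groups of
# observations described by five scores each.
rng = np.random.default_rng(1)
blob_centres = np.array([[0, 0, 0, 0, 0], [6, 6, 6, 6, 6], [-6, 6, -6, 6, -6]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 5)) for c in blob_centres])
labels, centroids = kmeans(X, 3)
reps = representatives(X, labels, centroids)
```

In practice the number of clusters would be varied and the resulting partitions compared, as described above; the distances computed in `representatives` correspond to the kind of distance-to-centroid listing reported per profile.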

Description of the five profiles
Country profiles (Fig. 3) are ordered from the most favourable (profile 1) to the least favourable country situation (profile 5).
Profile 1 presents high values on WR, HDP and CEC, implying little need for external support (ODA). The water demand reflects a relative balance between municipal/industrial activities and agriculture, leaving the latter as the dominant activity. Profile 2 shows weaknesses in terms of accountability and population's freedom, leading to a low commitment towards the environment. Profile 3 indicates that the economy is mainly driven by agricultural activities facilitated by a context of abundant natural resources. Profiles 4 and 5 are the least favourable profiles when considering human development.
Profile 4 benefits from a higher level of freedom (CEC), and its political stability allows a high commitment of ODA and a more balanced economy, which corresponds to a value around 0 in the AP component. Profile 5 shows an economy mainly based on primary resources exploitation (negative value in AP) with little consideration for environmental aspects.
In Fig. 4, the African continent shows the highest situation diversity, having countries belonging to all five profiles.Latin America concentrates a majority of "best profiles", except Bolivia, while South East Asia presents a mixed distribution with countries mainly ranging between advanced development (profile 1) and agricultural-oriented countries (profile 3).

Profile 1: towards well-being (Fig. 5, Table 7)
Profile 1 is characterised by good development (HDP) and a high level of freedom, together with environmental engagement (CEC). Although water resources are still mainly dedicated to agriculture, they are also devoted to domestic, commercial and/or industrial uses, as shown by the average value of around 0 on the AP axis. These countries benefit from a positive context of water resources availability (WR) and receive the lowest level of external aid (ODA). The profile is therefore considered the most advantageous, showing high human development, environmental concern, a degree of freedom and a certain balance between water uses (agriculture versus industry/municipal uses), indicating a diversified economy.
The Philippines is the most representative country in this cluster (the closest to the centroid, Table 7). Belize shows a similar shape to that of profile 5, but the amplitude of its values, in particular regarding ODA, CEC and HDP, justifies its inclusion in profile 1.
In Botswana, municipal and industrial activities account for the major share of water withdrawals (59 %); agriculture is not dominant. Its economy is based on natural resources exploitation, but it has succeeded in maintaining other sectors such as tourism, financial services, subsistence farming, and cattle breeding. This approaches the ideal of economic diversification of this profile.
Jamaica, Malaysia and Mauritius show similar ODA values, which are above the average. Thailand and India show a maximum in AP, indicating the dominance of agricultural activities (consuming 96 and 87 % of water resources, respectively). Thailand has a profile close to the average, except for its AP value. India also shows a significantly low human development index. Being on the periphery of this cluster, India could be an intermediate country between profiles 1 and 3. The clustering algorithm has included it in this cluster because three of its five components (CEC, ODA, WR) match the average values of this profile.

Profile 2: freedom/democracy black spot (Fig. 6, Table 8)
This profile presents a high HDP value close to that of profile 1, but with a low-medium level of ODA. Agriculture consumes the majority of the water withdrawn, but enough resources remain to satisfy municipal needs, industries being marginal. The main differences with profile 1 lie in water scarcity and the poor level of CEC. A majority of these countries leave little space for citizen expression, being monarchies or authoritarian regimes such as those in the Arabian Peninsula. In addition, they are scarcely involved in global environmental matters (e.g. climate change, protected species trade, hazardous waste transport), which is coherent with the low level of CEC. Regarding WSS, these countries present the best levels of access (close to 100 %).
Lebanon is the reference country of this cluster (Table 8). Cuba and the Dominican Republic diverge from the average because they benefit from better climate conditions and more water resource availability. Saudi Arabia and Libya have the poorest environmental involvement at international level together with little accountability to citizens, implying a high divergence on CEC.

Profile 3: agricultural economy (Fig. 7, Table 9)
These countries show a medium human development level sustained by rather substantial external aid. The economy of these countries is mainly oriented, in some cases exclusively, towards agricultural activities. This predominance of agriculture is often accompanied by widespread irrigated agriculture. This intensity of agriculture is facilitated by the abundance of water resources. Environmental concerns and government accountability come last in importance. These countries are also defined by a large gap between access to basic sanitation (S) and access to an improved water supply (WS). Sanitation access is clearly neglected, lagging at least 25-30 % behind WS access. Nepal is a good example of this cluster, with WS access at around 90 % and a sanitation access level of 35 %. The fact that 81 % of the population lives in rural areas pushes away the efforts to improve sanitation by keeping cheaper [...]
Several countries diverge from the average profile: (i) Honduras, Nicaragua and Guatemala have developed little irrigated/intensive agriculture compared to South East Asian countries, with less than 3 % irrigated areas; however, in these countries, agriculture accounts for around 80 % of water consumption; (ii) Yemen's water resources are low, yet agriculture represents almost all of its water consumption (95 %); (iii) Bhutan receives more ODA than average (> 125 $ per cap, 5th position); (iv) Bangladesh shows a higher concern for environmental issues than average. One hypothesis is that Bangladesh is concerned by [...]

Profile 4: [...] ([...], Table 10)
This profile is mainly defined by a high level of ODA justified by the low level of human development (HDP). Despite this, these countries show an important commitment toward environmental and international issues and have significantly developed freedom for citizens (high values in CEC). Agriculture takes the majority of the water withdrawn, but enough remains for the development of other activities (domestic, commercial and/or industrial) in a context of water resources stress. Indeed, most of the countries belonging to this profile face stress on water resources and are mainly located in Sub-Sahel and South East Africa. The consequence of this situation is poor access to water services and even worse access to sanitation services. This includes some very low levels, such as those observed in Ethiopia, with 22 % for water supply and 13 % for sanitation.
Table 10 shows the distances of the different countries to the centroid. Zambia is the most representative country, having a dominant agricultural activity (76 %) but with 86 % of its land under desertification risk. Zambia also benefits from almost 100 $ per cap in ODA.
Bolivia is the only country from Latin America in this profile. Bolivia is the second target country for external aid (ODA) in Latin America (after Honduras), having one of the lowest human development indices (HDI) of the region. Its economy is dominated by agriculture but also exploits natural resources, as indicated by the average value in AP. This country is also concerned by environmental issues, facing, for instance, desertification processes. Together with the existence of a democratic regime since 1982, this leads to a high value in CEC, which is characteristic of this profile.
Madagascar and Eritrea dedicate the water they withdraw almost exclusively to agriculture (95 %, as in profile 3). Although this is above the average value, the other four components still correspond to this profile.
Cape Verde has a higher level of development than the other countries but keeps a similar shape for the other factors in profile 4. Djibouti distinguishes itself by having scarce water resources. Being already a desert area, this country is by definition no longer considered "under desertification risk".

Profile 5: primary material consumption (Fig. 9, Table 11)
This profile is clearly characterised by the abundance of water resources. However, these resources are mainly dedicated to industrial and municipal uses, in contrast with an underdeveloped agriculture. In fact, countries in this profile have an economy based on mining or primary material exportation, which is the main consumer of water resources. This is typically the case of Gabon, Congo, Angola, Liberia, Papua New Guinea, Togo and Nigeria, which exploit subsoil resources such as oil, gold, copper, phosphates and diamonds, or other natural resources such as timber.
This profile combines unfavourable conditions for sustainable development with an economy focused on primary resources exploitation that consumes water resources in a non-sustainable way. These countries also have poor human development in a context of political instability. As a consequence, they benefit from limited external financial commitments from international donors: although these countries have a low level of human development, they receive a rather low level of ODA. One hypothesis is that this results from the political instability and context of violence predominant in these countries. Indeed, as already mentioned, ODA is dedicated to development and not to humanitarian emergencies. International donors seem to consider that a minimum of security and stability is necessary to engage in and implement this medium- to long-term vision. From the point of view of WSS, the situation is similar to profile 4, with a low level of access for the population to both water supply and sanitation.
Comoros stands as the reference country of profile 5. Although Gabon has a high value of HDP, its other components still fit well in the profile. Zimbabwe and Burundi diverge from the average profile by keeping agriculture as their central economic activity. These two countries can be seen as intermediate cases between profiles 3 and 5. They belong to cluster 5 because three components out of five (HDP, ODA, WR) fit the average behaviour of this profile. Equatorial Guinea has had an economy based on oil exploitation since 1996. It is in the group of least developed countries (low HDP). Its main divergence lies in the very negative value in CEC, due to an authoritarian regime less oriented to accountability and the freedom of its citizens and showing low concern for the environment (considered as extra costs and economic constraints). However, its "political stability" since 1979 allows the delivery of ODA at a significant level. Lesotho enjoys political stability and benefits from important external financial flows (48 $ per capita). It also has a high level of CEC, indicating a rather high level of freedom (similar to profile 4).

Discussion
The environmental variables, namely the amount of water resources available per capita (TWRR), land under desertification threat (Desert Risk), precipitation, National Biological Index (NBI), water bodies surface (WB), and Environmental Sustainability Index (ESI), are coherent and correlated among themselves, but they show little correlation with the percentages of the population having access to water supply and sanitation. The Environmental Sustainability Index (ESI) is a cross-cutting and very complex index (the variables that make up this index are detailed in Appendix A) showing the capacity of a country to manage its environment in a sustainable way. It is highly correlated with water resources availability (0.528 correlation, TWRR) and governance (0.420 correlation, WGI-VA) but little correlated with WSS. In addition, the water quality sub-indicators included in the ESI computation often have 86 % missing values. This undermines the reliability of these sub-indicators and can contribute to explaining this absence of correlation. Water bodies surface (WB) and National Biological Index (NBI) are also correlated with the amount of precipitation (respectively 0.439 and 0.732 correlation, Precipit) but are not linked to WSS. Nevertheless, although this paper shows no significant correlations between environmental variables and WSS, it is evident that WSS has an important impact on the environment. For example, wastewater or sanitation excreta, when not adequately treated, have a direct impact on surface or groundwater quality and vice versa.
This suggests that considering only the quantity of water resources is not sufficient for establishing links with WSS; it should be combined with qualitative water indicators. The main issue is that, at national scale, indicators on the quality, accessibility, and management capacities of water resources are not available in most developing countries. Downscaled analyses, for instance at basin level, may make such indicators available. Garriga et al. (2009), when modelling data on the Turkana district in Kenya, obtained information on water quality (water point protection, pollutant agents from livestock or waste, water salinity) through qualitative questionnaires.
The representativeness of water resources at national scale still requires reinforcing data collection and the reliability of current indicators, finding proxies or designing appropriate new indicators at national scale. The FAO and UNEP/GEMS-Water, the main actors engaged in these efforts, provide data on biochemical and nitrogen concentrations at national scale and on surface water quality at watershed scale. The JMP performed pilot country projects (Ethiopia, Jordan, Nicaragua, Nigeria, Tajikistan) in 2004-2005 to investigate new methods to monitor the quality of drinking water supplies. The conclusions of these pilots have been available since 2010, but the generalisation of data collection and methods worldwide has not been initiated yet. To summarise, further research on country indicators defining quality (accuracy and uncertainty), accessibility, management capacities of resources, and beyond environment state is needed to refine analyses of the relationship between environment, human development and WSS.
Despite this constraint on data availability and quality, this paper proposes an analytical framework common to developing countries, using multivariate analyses in combination with OLS methods. It provides a general view of the water sector in developing countries, including other sectors on which efforts should be focussed as a priority to facilitate the sustainable development of WSS. This can be used either by international donors or by national authorities that have to face and handle these issues. Because they use robust and simple methodologies, these analyses can be replicated easily.
Commitment towards the reduction of waterborne diseases and formalisation of the ongoing urbanisation process emerged as the main levers to increase sustainable access to WSS. The rapid urban growth throughout the developing world observed over the last 20 yr is seriously "outstripping the capacity of most cities to provide adequate services for their citizens" (Cohen, 2006). Cities are an opportunity for investment in water supply and sanitation because the critical consumer mass needed to create new infrastructure and management capacities is easily reachable. Statistics broken down between rural and urban areas illustrate this difference by often showing a gap between levels of WSS access. This concentration of population is therefore, a priori, a chance to foster WSS access. However, as highlighted by UN DESA (2010), urbanisation should be organised to avoid the development of slums (66 to 86 % of the developing countries' population will live in slums by 2050). Urbanisation is expected to continue in the next 30 years, mainly in medium and small cities. Cohen (2006) mentioned that these city categories face greater health issues and a greater lack of capacities than large cities. Finding solutions to organise urbanisation processes, in particular in medium and small cities, should therefore be a priority.
The new classification of countries around country profiles proposed in this paper helps in targeting countries in difficulty and also in identifying weaknesses restraining country development and well-being. Evaluating country performance on several dimensions appears necessary to best adapt state and international donor actions to the various situations observable among developing countries. The UN institutions in charge of monitoring progress toward the MDG targets, including those for WSS, analyse data with a geographical approach: at global level or by subregions. This geographical approach masks disparities among countries as well as the countries' specific needs within the analysed area. The approach proposed in this paper takes a different angle, analysing groups of countries which have similar profiles to avoid this "bias". International donors and policy makers can use the different clusters of countries to establish joint strategies and also to monitor and assess the impact of their development and cooperation policies in a more strategic way.
The WatSan4Dev dataset proposed in this paper is a new development database (socio-economic, environmental and governance indicators) collecting indices widely dispersed among the different international institutions in charge of their management. Because of their different origins and gathering methods, raw data have been cleaned, and robust missing data processing methods have been applied to fill the gaps. The statistical processing has increased the coherence and robustness among the different indicators and shown that they can be used for further and deeper studies. However, as a result of this processing, the WatSan4Dev dataset should only be used for qualitative estimations and analysis.
The main objective of this paper was to validate the coherency of relationships among the different indicators and establish cross-sector links with WSS.From a first analysis of the variables in WatSan4Dev (Table 1), it has been proved that 25 variables out of the 42 variables considered are actually independent and coherent with the literature and experience in the field.
The variables organise themselves around five factors:
- Factor 1 (HDP) represents the human development of a country not only as a function of income, life expectancy and education (as is the case of the HDI index) but also takes into account population living conditions (Urban Pop, % Slums, WS, S), and governance aspects (WGI-GE).
- Factor 2 (AP) expresses how economic activities pressure water resources in the country.
- Factor 3 (WR) gathers high factor loadings for variables expressing the amount of water resources available.
- Factor 4 (ODA) represents the external financial flows ODA and ODA WSS conditioned by the political stability and absence of violence in the country.
- Factor 5 (CEC) links the involvement of the country in International Environmental Agreements with governance (WGI-VA). WGI Voice and Accountability (WGI-VA) measures the ability of citizens to participate in electing their government (democracy) and to be a government controlling force in the country.
The identification of the key variables in improved access to WSS showed that urbanisation, together with the kind of urbanisation (urban versus slums), plays an essential role in the access to improved water supply and sanitation facilities. Informal urbanisation developments (slums) negatively impact water supply and sanitation conditions because local authorities in the cities have difficulty facing and structuring massive population flows from rural areas. Slum development is expected to rise and, by 2050, urban dwellers will likely account for 66 % of the population in the less developed regions (UN, 2010). Extending access to sanitation and water supply in these conditions can therefore be hindered or slowed down. However, cities are an opportunity for investment in water supply and sanitation because the critical mass needed to create new infrastructure and maintain management capacities is easily reachable. The relationship between child mortality under 5 yr (Child Mortality-5) and access to water supply and sanitation (WSS) is clearly depicted, with an important impact on children's health. More generally, WSS is clearly correlated with health, which is an essential lever for improving living conditions.
Irrigation also pushes towards improvement of water supply thanks to multi-purpose infrastructure. Environmental conditions referring to water resources (total water renewable resources, precipitation, desert risk) are secondary factors in explaining the WSS level.
Built around the five factors, five different country profiles have also been established (Fig. 3). Profile 1 represents the equilibrium among the different development indicators, with countries benefiting from access to water supply of around 80 %. These countries are close to reaching the MDGs; however, access to basic sanitation still falls behind (around 70 %). Profile 2 can also be considered a good profile with high WSS access, but authoritarian political systems and an accountability deficit are the major black spot of this profile. Profile 3 illustrates the fact that sanitation is clearly neglected, lagging at least 25-30 % behind WS. Profile 4 has the lowest human development, expressed in terms of poor access to water services and even worse access to sanitation services. Profile 5 combines unfavourable conditions for sustainable development with an economy focused on primary resources exploitation that consumes water resources in a non-sustainable way. These countries also have poor human development in a context of political instability.
This new classification based on development indicators is a new way to identify countries in difficulty but also indicates weaknesses restraining countries' development and well-being.International donors can use these profiles for establishing more adapted development strategies and better monitoring to assess the impact of their policies.
Countries in profiles 4 and 5 are less advanced but efforts should be put on different domains: (a) profile 4 countries are facing water stress from scarcity in a context of low development.Therefore, efficient resources management, mitigation measures and adapted solutions appear crucial to improve WSS; (b) in the case of profile 5, efforts should be concentrated to ensure political stability and reduction of violence to get out of the "dog eat dog organisation" implied by such a context.This appears as the a priori condition to ensure the existence of individual/collective initiatives towards sustainable development.
For countries in profile 3, sanitation should be a priority. Within this agricultural irrigation context, water supply services are often ensured, while basic sanitation development is neglected. Indeed, rural areas retain feasible/acceptable non-improved practices such as open defecation, while investments in basic sanitation facilities may be seen as unnecessary, in particular by poor households.
It should be noted that countries in profile 2 benefit from a significant level of ODA, while their WSS level is almost at 100 %. A detailed analysis of the external aid provided to these countries should be considered to better understand in what kinds of activities the aid is engaged.
As shown in this paper, the biophysical variables included to characterize the environmental state are not sufficient or appropriate for exploring the links among water resources, WSS and human development, as they describe only the quantity of water resources. Further research on country indicators defining the quality, accessibility and management capacities of resources, beyond the state of the environment, is needed to refine the analysis of the relationships among the environment, human development and WSS at national scale. Finally, this paper proposes that the international organisations in charge of monitoring the MDG targets should collect qualitative water indicators at national level to establish links between the environment and WSS.
On 6 March 2012, the UN announced that the Millennium target for safe drinking water had been reached, although the one for sanitation is still out of reach. Beyond this positive quantitative improvement, access to safe water and basic sanitation is not yet ensured. The quality of water resources remains a challenge in a majority of developing countries, for instance, in Latin America. Disparities also appear across regions and between urban and rural areas. Progress towards WSS may require even more effort in the future because of potential rising constraints such as population increase, urbanisation, and/or the consequences of climate change and variability.
The next part of this research will introduce different data models to produce a map of relationships between the variables and countries that reflects water sector knowledge. In fact, the various causal and consequential relationships have already been identified by the multiple actors of the water sector. The models will measure and rank the variables of the WatSan4Dev dataset according to their impact on WSS, and their results could thus be interpreted to improve decision mechanisms in the policy-making processes of the sector.

C. Dondeynaz et al.: Analysing inter-relationships among water and human development variables
range of issues that fall into the following five broad categories: (1) Environmental Systems, (2) Reducing Environmental Stresses, (3) Reducing Human Vulnerability to Environmental Stresses, (4) Societal and Institutional Capacity to Respond to Environmental Challenges and (5) Global Stewardship. Esty et al. (2005) describe the ESI and the methods used.

Femal eco, female economic activity rate
This rate concerns women aged 15 and above and is calculated on the basis of data on the economically active population (persons looking for or having an occupation) and male population from the ILO (International Labour Organization). Generally, students, retired people and persons not looking for an occupation are excluded.

Fertility, fertility rates
Total fertility rate is an estimate of the number of children an average woman would have if current age-specific fertility rates remained constant during her reproductive years.

GDP per cap, gross domestic product -purchasing power parity
Gross Domestic Product (Purchasing Power Parity) is gross domestic product per capita based on purchasing power parity (PPP). PPP GDP is gross domestic product converted to international dollars using purchasing power parity rates. An international dollar has the same purchasing power over GDP as the US dollar has in the US. GDP is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products. Data are in current international dollars.

Health exp, health expenditure (PPP-Capita)
This is the sum of public and private health expenditure (in PPP, International $) divided by population. Health expenditure includes the provision of health services, family planning activities, nutrition activities and emergency aid designated for health, but excludes the provision of water and sanitation.

Malaria, Malaria cases
Number of reported cases per 1000 persons in country.

ODA, Official Development Aid
Aid as a percent of government expenditure is the amount of official development assistance (ODA) received by a country as a percentage of its central government expenditure.

ODA-WSS, official development aid to the water sector
Total of all donors' disbursements of ODA towards all recipients related to Water supply and sanitation.

Particip, participation to international environmental agreements
It is calculated taking into account participation in the Framework Convention on Climate Change (UNFCCC), Vienna Convention on the Protection of the Ozone Layer, Convention on the Trade in Endangered Species (CITES), Basel Convention on the Transboundary Movement of Hazardous Waste and United Nations.

Poverty, poverty rate
National poverty rate is the percentage of a country's population living below the country's established national poverty line.

School G/B, Girls to Boys Ratio in primary education enrolment
This is a measure of the attendance of girls at primary school. The core at this level consists of education provided for children, the customary or legal age of entrance being not younger than five years or older than seven years. This level covers in principle six years of full-time schooling.

School enrol, gross enrolment ratio at school
It is calculated by expressing the number of students enrolled in primary, secondary and tertiary levels of education, regardless of age, as a percentage of the population of official school age for the three levels.
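As an illustration of the definition above, the following minimal sketch computes a gross enrolment ratio. The function name and the example figures are hypothetical and are not taken from the WatSan4Dev dataset. Because enrolment is counted regardless of age, the ratio can exceed 100 %.

```python
def gross_enrolment_ratio(enrolled_all_ages: float,
                          official_age_population: float) -> float:
    """Enrolment (any age) as a percentage of the official school-age population."""
    if official_age_population <= 0:
        raise ValueError("official school-age population must be positive")
    return 100.0 * enrolled_all_ages / official_age_population


# Hypothetical figures: over-age and under-age pupils push the ratio above 100 %.
ratio = gross_enrolment_ratio(1_050_000, 1_000_000)
print(f"Gross enrolment ratio: {ratio:.1f} %")
```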

% irrigation, total surface in irrigation
Area equipped to provide water (via irrigation) to the crops. It includes areas equipped for full and partial control irrigation, equipped lowland areas, pastures, and areas equipped for spate irrigation.

TWRR, total water renewable resources
This is an estimate of the surface water resources available for use in a country corresponding to the sum of the internal renewable surface water resources and the total external actual renewable surface water resources.

Urban Pop, urban population -rural population
Total population residing in urban areas. Because of national differences in the characteristics that distinguish urban from rural areas, the distinction between urban and rural population is not amenable to a single definition that would

Urban-rural growth, population growth
Variation of the population in urban and rural areas, respectively, between 2000 and 2005.

% slums, urban slum population
Proportion of the urban population living in slums. (A slum is a contiguous settlement where the inhabitants are characterized as having inadequate housing and basic services.)

Water use Int Agri, water use intensity for agriculture
This is the amount of water used in the agricultural sector per hectare of temporary and permanent cropland in the year specified. This indicator shows a country's dependence on irrigation for agricultural production.

WB, water bodies surface
This is the ratio of the surface of water bodies to the total country surface. Water bodies are oceans, seas, lakes, reservoirs, and rivers. They can be either fresh or salt water bodies.

WGI-V & A, worldwide governance index voice and accountability
This index captures perceptions of the extent to which a country's citizens are able to participate in selecting their government, as well as freedom of expression, freedom of association and free media.

WGI PS & AV, worldwide governance index political stability and absence of violence
This index captures perceptions of the likelihood that the government will be destabilized or overthrown by unconstitutional or violent means including politically-motivated violence and terrorism.

WGI-GE, worldwide governance index government effectiveness
This index captures perceptions of the quality of the public services, the quality of the civil services, and the degree of its independence from political pressure, the quality of policy formulation and implementation and the credibility of the government's commitments to such policies.

WGI-RQ, worldwide governance index regulatory quality
This index captures perceptions of the ability of the government to formulate and implement sound policies and regulations to permit and promote private sector development.

WGI-RofL, worldwide governance index rule of law
This index captures perceptions of the extent to which agents have confidence in and abide by the rules of society, and in particular the quality of contract enforcement, property rights, the police and the courts, as well as the likelihood of crime and violence.

WPI, water poverty index
WPI expresses an interdisciplinary measure which links household welfare with water availability and indicates the degree to which water scarcity impacts on the population. WPI has five component indices: Resources, Access, Capacity, Use, and Environment. The higher this index is, the lower the water constraint is.

TOT-withdrawals, water withdrawal total
Annual gross quantity of water produced and used for agricultural, industrial and domestic purposes. It does not include other in situ uses: energy, mining, recreation, navigation, fisheries and the environment, which are typically nonconsumptive uses of water. The typology of water use is independent from the source of water.
Total Water Use = Agricultural Water Use + Domestic Water Use + Industrial Water Use.
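The sum above can be sketched as a small helper. The function name is hypothetical, and returning no value when any component is missing is an assumption made here for illustration, not a rule stated in this paper.

```python
from typing import Optional


def total_water_use(agricultural: Optional[float],
                    domestic: Optional[float],
                    industrial: Optional[float]) -> Optional[float]:
    """Total Water Use = Agricultural + Domestic + Industrial water use.

    Assumption for this sketch: if any component is missing (None),
    the total is left undefined rather than silently underestimated.
    """
    parts = (agricultural, domestic, industrial)
    if any(p is None for p in parts):
        return None
    return sum(parts)


# Hypothetical withdrawals, all in the same unit (e.g. km3 per year).
print(total_water_use(70.0, 20.0, 10.0))
print(total_water_use(70.0, None, 10.0))
```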

Withdrawal-municipal, water withdrawal for municipal purpose
Annual quantity of water used for domestic purposes. It is usually computed as the total amount of water supplied by public distribution networks, and usually includes the withdrawal by those industries connected to public networks.

Withdrawal-Industrial, water withdrawal for industrial purpose
Annual quantity of water used by self-supplied industries not connected to any distribution network.

crossed with the income level per capita for the 1970-1985 period (Fig. B1). Female economic participation shows high rates for low income countries and decreases until income reaches around 2550 $ per capita for the period considered. We can assume that the threshold (2550 $ per capita) has risen in absolute terms since 1985, at least because of inflation, but the U-shape is still valid. Therefore, the decreasing trend observed in our data corresponds to the first part of the U-shape (Fig. B2). This provides additional evidence of the coherence of the dataset.
The economic participation of women depends not only on income but also on several other factors, such as the rural-urban context and fertility, including social and cultural parameters that make the explanation of this phenomenon more complex (Ahn and Mira, 1999; Boserup, 1989; Beguy, 2009).

Fig. 1. Approach followed that provides two main results: (1) the identification of key variables impacting the level of WSS services and (2) geographical analysis identifying several country profiles.

Fig. 2. Graphical representation of the WatSan4Dev variables in the first three components of the PCA (57 %). The first component (HDP) is represented by the green axis; in red, the second component (AP); and the size of the point represents the factor loading in the third component (WR).

Fig. 3. The five country profiles. Each color shows the centroid of the class for each profile. HDP (D1) stands for Human Development against Poverty in the country, AP (D2) for Human Activity Pressure, WR (D3) for Water Resources, ODA (D4) for Official Development Aid, and CEC for Country Environmental Concern.

Fig. 4. Geographical distribution of the five country profiles.

Fig. 5. Profile 1 - this profile corresponds to the most advanced countries having a relatively good human development (HDP) based on a diversified economy (AP) and political/democratic organization (CEC). WSS level is, therefore, relatively high compared to other developing countries; consequently, these countries benefit from low ODA from donors.

Fig. 6. Profile 2 - besides good human development (HDP), this profile presents very low values in terms of level of accountability/democracy, together with low interest in environmental matters (CEC). The relatively wealthy situation allows stepping over the scarce water resources context (WR) by showing a high level of WSS services.

Fig. 7. Profile 3 - mainly organised around agriculture (AP), often irrigation, the countries in this profile show limited human development (HDP) with low accountability and concern for the environment (CEC). An important gap is observed in basic sanitation access, which generally lags far behind water supply access.

Fig. 8. Profile 4 - the main feature of this profile is the important level of External Aid (ODA) facilitated by relatively good political accountability and stability (CEC) despite low human development (HDP). In line with this general context, access to WSS is still below the MDG targets.

Fig. 9. Profile 5 - this profile shows low values in both human development (HDP) and accountability toward population and environment (CEC). Because of the political instability or violence within the country, external aid remains low. The environmental concern may be undermined by the type of economy, mainly based on raw material or natural resources exploitation.

Fig. B1. Distribution shape of female participation in labour force (for females aged 45-59, but still true for other ages).
Table 2 lists the countries considered in this paper. The World Bank set the threshold at 12 276 current $ per capita, below which a

Table 1. Indicators-Variables integrated in the WatSan4Dev dataset.
* OECD flow disbursement for the water sector, breakdown by subsectors (ODA-WSS)
** Variables selected for multivariate analyses in this paper (total variables selected = 25).

Table 2. Countries included in the WatSan4Dev database.

Table 3. Rotated PCA factors matrix after varimax rotation and Kaiser normalization. The five components explain 73 % of the variability. In bold, the highest factor loading.

Table 4. Rotated FA factors matrix with principal component as extraction method, varimax rotation and Kaiser normalization. The five components explain more than 73 % of the variability. In bold, the highest factor loading.

Table 5. Model parameters with water supply as dependent variable - method stepwise (adjusted R² = 0.686).
Columns: Variables, coefficients β, Std error, t, Sig., and 95 % confidence interval for β (lower bound, upper bound).
* Where β is the constant, Std error is the standard error of β, t is β divided by the std error and Sig. is the p-value.
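
The quantities named in the table note (t as β divided by its standard error, Sig. as a p-value, and a 95 % confidence interval for β) can be illustrated with a minimal sketch. It assumes a large-sample normal approximation in place of the exact t distribution, so values will differ slightly from those reported by statistical packages for small samples. The function name and the example numbers are hypothetical and are not the paper's regression results.

```python
from statistics import NormalDist


def coefficient_summary(beta: float, std_error: float, level: float = 0.95) -> dict:
    """Return t, an approximate two-sided p-value and a confidence interval.

    Assumption: the standard normal distribution approximates the t
    distribution (reasonable only for large samples).
    """
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    normal = NormalDist()
    t = beta / std_error                       # t is beta divided by its std error
    z = normal.inv_cdf(0.5 + level / 2.0)      # critical value, about 1.96 for 95 %
    sig = 2.0 * (1.0 - normal.cdf(abs(t)))     # two-sided p-value
    return {
        "t": t,
        "sig": sig,
        "lower": beta - z * std_error,
        "upper": beta + z * std_error,
    }


# Hypothetical coefficient and standard error.
print(coefficient_summary(0.5, 0.1))
```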

Table 7. Factor loadings and distances to the centroid of profile 1: the last 7 countries of the list are considered outliers, with behaviour diverging from the average value.
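
A distance-to-centroid ranking of this kind can be sketched as follows. The sketch assumes Euclidean distance over the five factor scores, which the captions do not specify; the country names and scores are hypothetical. No outlier threshold is applied because none is stated here; sorting by distance simply places the most atypical members last.

```python
import math
from typing import Dict, List


def centroid(points: List[List[float]]) -> List[float]:
    """Component-wise mean of a list of equally sized score vectors."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dims)]


def distances_to_centroid(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Euclidean distance of each country to the centroid of its profile."""
    c = centroid(list(scores.values()))
    return {name: math.dist(vec, c) for name, vec in scores.items()}


# Hypothetical scores on the five factors (HDP, AP, WR, ODA, CEC).
profile_scores = {
    "Country A": [0.9, 0.2, -0.1, -0.5, 0.4],
    "Country B": [1.0, 0.3, 0.0, -0.4, 0.5],
    "Country C": [0.2, 1.5, -1.0, 0.6, -0.8],
}
ranked = sorted(distances_to_centroid(profile_scores).items(), key=lambda kv: kv[1])
for name, dist in ranked:
    print(f"{name}: {dist:.3f}")
```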

Table 8. Factor loadings and distances to the centroid of profile 2.

Table 9. Factor loadings and distances to average for profile 3. The last 5 countries in the list are considered outliers of the cluster, diverging from the average value of the profile.

Table 10. Factor loadings and distances to average for profile 4. The last 5 countries of the list are considered outliers of the cluster, diverging from the average value of the profile.

Table 11. Factor loadings and distances to the centroid of profile 5. Equatorial Guinea is considered an outlier of the cluster, diverging from the average value of the profile.