the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Developing a Bayesian network model for understanding river catchment resilience under future change scenarios
Christopher A. J. Macleod
Marc J. Metzger
Nicola Melville
Rachel C. Helliwell
Jim Pritchard
Miriam Glendell
Download
- Final revised paper (published on 14 Jun 2023)
- Supplement to the final revised paper
- Preprint (discussion started on 21 Jul 2022)
- Supplement to the preprint
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2022-617', Laura Uusitalo, 27 Jul 2022
This paper reports an ambitious work of a Bayesian network model created together with stakeholders to evaluate future scenarios on a watershed. I appreciate the effort put to the stakehodler interaction, which is also reported very well.
I have some things / questions that I would like to see addressed in the paper.
1) I think a picture of the model should be presented. I understand it can be complex, but I also understand it was presented for the stakeholders in the workshops, so it should be possible to present it also in the paper, or at the minimum in the supplement. It would make it easier for the reader to undertand the model.
2) It seems from the supplement that the model was parameterized using deterministic equations. Usually Bayesian Networks are use specifically to model also the uncertainty that is related to the model parameters. Please discuss this and explain your modelling choice.
3) The use of simulations to evaluate the results is a bit unclear. We don't usually use simulations as such to evaluate the outputs of a BN, but we aim to compute the total probability disctibution over the modelled domain, given the conditional probability distributions and the model structure. This way, we can then reason "backwards" (what is the most brobable cause given the consequences), compute the probabilities ofoutcomes given a number of causes or observations, etc. In the case of disctrete models, this can be done analytically, and in the case of continuous models, the distributions are often approximated using simulations, but BNs are not usually simulated as such. When continuous BNs are run/solved, often using Monte Carlo Markov chain computation, the early part of the Markov chain is usually thrown out to make sure that the chain has converged to the true distribution (burn-in). This wasn't mentioned in this paper, and I was left uncertain about the modelling technique. Please explain it more clearly. Also, BNs are supposed to give the best available assessment of the *probabilities* of the events (given the scenarios etc.), so it should not be necessary to refer to "x out of y simulations" whan discussing the results.
4) Maybe go further back to the roots (such as Perl 1986) when explaining what BNs are in the introduction.
Citation: https://doi.org/10.5194/egusphere-2022-617-RC1 -
AC1: 'Reply on RC1', Kerr Adams, 09 Sep 2022
Dear Dr Laura Uusitalo,
Many thanks for taking the time to review our pre-print and provide constructive feedback.
We have made our best efforts to respond to your questions and aspects you would like addressed in the paper in the attached response file. We will updated a revised manuscript and supplementary material, where we have made changes and edits.
With best wishes,
Kerr Adams, on behalf of authors
-
AC3: 'Reply on RC1', Kerr Adams, 27 Mar 2023
Dear Dr Laura Uusitalo,
Having become acquainted with the EGU Sphere commenting process, we have observed that our previous response to your helpful comments may not be as accessible as we had originally hoped. As part of the final response to comments, I wanted to clearly set out our response to your comments, building on our previous response.
1) I think a picture of the model should be presented. I understand it can be complex, but I also understand it was presented for the stakeholders in the workshops, so it should be possible to present it also in the paper, or at the minimum in the supplement. It would make it easier for the reader to understand the model.
Thank you for this comment, we accept that a visualisation of the model should be presented. We have included a simplified visualisation in the attached EGU_2022_617_Review#1_Supplement file.
Our model contains 417 nodes, 623 arcs and 23 sub-models. Despite not being a spatial model, there are some geographical considerations included to represent the sub-catchment scale and individual wastewater assets, which results in the repetition of nodes, arcs and sub-models. These geographical considerations do make it complex to represent the full model visually, which is why we decided to include a simplified version in the supplementary material. We’ve highlighted where there is repetition in the supporting text box.
2) It seems from the supplement that the model was parameterized using deterministic equations. Usually Bayesian Networks are use specifically to model also the uncertainty that is related to the model parameters. Please discuss this and explain your modelling choice.
Thank you for this comment. Where we do have the data available to measure uncertainty, we use prior distributions, which are represented in equations using the β function. The distributions are determined by analysing the available data to identify mean, standard deviations and truncations to create the best possible prior. We also take advantage of a built-in GeNIe function which analyses the data to create a custom prior distribution where there are larger data records to support variables, such as surface water flows. When generating prior distributions isn’t possible, we make use of the diverse coupled future pathways as the best available method for representing uncertainty.
3) The use of simulations to evaluate the results is a bit unclear. We don't usually use simulations as such to evaluate the outputs of a BN, but we aim to compute the total probability distribution over the modelled domain, given the conditional probability distributions and the model structure. This way, we can then reason "backwards" (what is the most probable cause given the consequences), compute the probabilities of outcomes given a number of causes or observations, etc. In the case of discrete models, this can be done analytically, and in the case of continuous models, the distributions are often approximated using simulations, but BNs are not usually simulated as such. When continuous BNs are run/solved, often using Monte Carlo Markov chain computation, the early part of the Markov chain is usually thrown out to make sure that the chain has converged to the true distribution (burn-in). This wasn't mentioned in this paper, and I was left uncertain about the modelling technique. Please explain it more clearly. Also, BNs are supposed to give the best available assessment of the *probabilities* of the events (given the scenarios etc.), so it should not be necessary to refer to "x out of y simulations" whan discussing the results.
Many thanks for highlighting our confusing use of simulations, we will update the manuscript to replace simulation with scenario and samples where appropriate throughout the manuscript as our results are describing and comparing outputs from executing both the current and future (coupled RCPs and SSPs) scenarios.
The modelling technique we use is a hybrid forward sampling algorithm, which is the best available algorithm for hybrid models using the GeNIe software. The algorithm generates 10,000 samples and when stakeholders enquired what was meant by, for example, a 51% probability of a variable being resilient.
We have added details of the forward sampling algorithm in the methods section lines 247-254 of the manuscript explaining the following:
The hybrid forward sampling algorithm generates samples from the probability distributions of parentless nodes, which it then uses to generate samples in child nodes of the parent nodes that have been sampled, generating conditional probability distributions. The algorithm is hybrid because the algorithm can generate samples from both discrete and continuous distributions.4) Maybe go further back to the roots (such as Perl 1986) when explaining what BNs are in the introduction.
Reference is now made to the work of Pearl (1986) describing BNs as directed acyclic graphs and conditional probability quantification, line 42.
References:
1) Pearl, J.: Fusion, propagation, and structuring in belief networks, Artificial Intelligence, 29, 241-288, https://doi.org/10.1016/0004-3702(86)90072-X, 1986.
-
AC1: 'Reply on RC1', Kerr Adams, 09 Sep 2022
-
RC2: 'Comment on egusphere-2022-617', Ibrahim Alameddine, 12 Feb 2023
The paper entitled "Developing a Bayesian network model for understanding river catchment resilience under future change scenarios" presents an application of how a Bayesian Network model can be constructed using a participatory modelling approach. It also provides a novel approach of using BN to assess teh impacts of climate change. The authors need to address the follwoing comments:
1) The authors use the term reslience but do not provide a clear definition on what that term is supposed to capture in their work.
2) The authors need to provide refernces to the One PLanet Choices work given its tight coupling with their work.
3) How did you ensure that when you were eliciting the model from teh stakeholders that you did not end up with cyclical pathways?
4) The paper needs to discuss the choice of the spatio-temporal scales in more details. The data is reported using daily units; yet the model is predicting a yearly avearge in 2050. The authors work with subwatersheds, some of which are nested. How was the nesting accounting for?
5) Many of the variables in the model are statsic in time and space. For example all subwatersheds had the same projected change in precipitation. All subwatersheds had the same P application, etc.. The authors need to have a table that highlights what were the variables that varied by time and by watershed and what was assumed to not vary over space and/or time. How did you define the spatial and temporal scales? Did your stakeholders agree on the adopted spatio-temporal resolutionl? The authors use the Precipitation rate anomaly as the major CC metric. This might be good for point sources; for no-point sources the annual % change is not teh best option since P application is seasonal. Please explain the decision to move to annual rather than a seaosnal scale.
6) The authors need to explain more their adopted methodology that they used to model LULC change over time. Was the change over time assumed to be linear? Did they track the change as a function of what the original LULC class was?
7) The authors adopr a 10,999 monte carlo simulations, Is that enough given the complicated model? How did they know that this value is good enough?
8) Page 11 line 254: Thee authors adopted the equal interbval discritization. Why that and not equal frequency? Did you run a sensitivity analysis to determine the impact of the the type of discretization on model results?
9) Line 266: "sum of the ‘IF’ statement was used to determine their overall state." This assumes all nodes are all equal. How can you defend that? For example, if you have 3 nodes and each haa a score of one is that the same as having one node with a score of 3 and 2 nodes with a score of zero. Is that accurate? How did your stakeholders feel about these assumptions?
10) Line 269: There is no Appendix E
11) Lines 278-281: Expand on how the model valaution was done. Did you compared the mean? Did you averaged concentration values over time? If so over which period? Did you see the sd? Was this done for all subwatersheds? WHat metric did you use to evaluate?
12) The authors are encouraged to doa sensitivity analysis (sensitivity to findings and sensitivity to parameters)
13) Figure 6: Discuss why the change in 6202, 6205, 6206 is so high and diffrent from teh rest under scenario (d).
14) Bar charts 7 and 8 need to show the uncertainty bounds. Also provide similar charts for the rest of the subwatersheds in the SM.
15) Line 447: Explain why the evaluation was done in sub-catchment 6200. Also state the period over which the data was collected and used for that sub-catchment.
16) Figure 9: Did you check and see if these differences are coming from the point or non-point sources. You have the point source data for all WWTPs so you can estimate the relative contribution of each.
17) It is not clear why the authors adopted a continous BN to only discretize its output. An explanation on the reasons for that is needed.
18) Discuss why theGeNIe platform was used as compared to others?
19) Table S2 in the SM is very important; yet it hard to follow. It also needs English editing.
20) Are the gammas in Table S2 fixed or do they have distributions around them? How were these determined?
21) The authors limit their continous distributions to normal and truncated normal. Why?
22) Table S3: Make sure you repeat header on all pages.
Citation: https://doi.org/10.5194/egusphere-2022-617-RC2 -
AC4: 'Comment on egusphere-2022-617', Kerr Adams, 05 Apr 2023
Dear Dr Ibrahim Alameddine,
Thank you for your constructive comments on our manuscript. We believe in responding to your comments the manuscript has improved.
We have detailed our response and how we plan to address each of your comments and questions where appropriate in a future version of the manuscript. Where appropriate we have included and referenced material to the attached supplementary material document named ‘EGU_2022_617_Review#2_Supplement_Apr23'.Please note, we had issues finalising our response. We apologise for the confusion and duplicating responses and kindly ask you to review the responses provided on the 5th of April 2023.
1) The authors use the term resilience but do not provide a clear definition on what that term is supposed to capture in their work.
Thank you for this comment. We agree that a definition of resilience is lacking in the manuscript.
We will include the following text to define resilience and make it clear that we are capturing socio-ecological resilience in the manuscript from line 13 onwards:“Resilience was first introduced by (Holling, 1973) as the ability of ecological systems to absorb disturbances and retain their functions in the face of change. Adger (2000) later defined social resilience as the ability of groups and communities to cope with social, political and environmental change. The crossover between social and ecological theories led to the theory of socio-ecological system resilience (Cretney, 2014, Folke, 2006). Decision-makers must be able to understand how a system shifts from one state to another in the face of change (Renaud et al., 2010) to inform resilient water management and allow freshwater systems to bounce back and adapt to variability, uncertainty and transformation (Brown, 2015).”
2) The authors need to provide references to the One Planet Choices work given its tight coupling with their work.
Thank you for this comment. We have provided the following reference to line 96 of the manuscript:
“SEPA: Sustainable Growth Agreement Scottish Water and Scottish Environment Protection Agency Progress Update February. 2020. Available Online: https://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdfhttps://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdf.
We have also included a footnote next to the first reference of One Planet Choices in line 96 with the following link to a video describing the method: https://vimeo.com/804313679/1139d31b45”
3) How did you ensure that when you were eliciting the model from the stakeholders that you did not end up with cyclical pathways?
Thank you for highlighting these questions. We agree that methods used to prevent cyclical pathways need to be made clearer in the manuscript.The model section headings presented in lines 156-162 were used to ensure that the purpose of the BN model was clear to the different stakeholder groups. In addition, they were also used to ensure that cause-and-effect relationships were linear, which is evident in Figure 3 (line 352) and our response to Referee #1 comment #1.
The model section headings were used during stage two of the five-stage participatory approach described in Figure 2 line 163 of the manuscript to keep the model linear. We will make this clearer in a future version of the manuscript by providing a visual representation of the headings (Figure 1 of EGU_2022_617_Review#2_Supplement_Apr23) and by updating lines 156-157 of the original manuscript to the following:
“Model section headings (Figure 2 - in manuscript) were agreed with the project team at the outset to clarify the modelling purpose with different stakeholder groups and ensure that the elicited cause-and-effect relationships were linear.”
4) The paper needs to discuss the choice of the spatio-temporal scales in more details. The data is reported using daily units; yet the model is predicting a yearly average in 2050. The authors work with subwatersheds, some of which are nested. How was the nesting accounting for?
Thank you for these comments and questions. Model outputs provide the average daily unit loads/concentration values for 2050, rather than the average yearly load/concentration values. We use the equation-based BN model to investigate how different climate, population and land cover scenarios to 2050 would change daily unit loads/concentrations and present the outputs as the average daily load/concentration to 2050. For example, the ‘annual’ scenario takes account of the predicted annual precipitation rate anomaly, therefore in the ‘annual’ scenario we investigate how the average daily rate might change given the average annual predicted change in precipitation.
Sub-watersheds were modelled using sub-models to create a semi-distributed model. The sub-models were nested to account for the routing of water and contaminants from upstream sub-watersheds to downstream sub-watersheds.
The spatial and temporal scales are limited in the model meaning many of the variables are static in time and space. In this research, we aimed to address gaps in the application of equation-based hybrid BN structures in environmental risk assessment. Given the findings from the research, we would like to understand ways of developing a spatial and dynamic BN model in the future, however, this was not the focus of the current research described in the manuscript.
We specifically address the need to discuss the choice of spatio-temporal scale in more detail in the paper in our response to comment #5 of the review.
5) Many of the variables in the model are static in time and space. For example all subwatersheds had the same projected change in precipitation.
Thank you for the comments and questions. It is correct that many of the variables in the model are static in time and space. Using the UKCP18 precipitation change anomaly data for a 25 km grid square meant that the entire catchment was within the same spatial grid and therefore had the same projected change in precipitation. If we are to investigate the use of a spatial BN model in the future, we could utilise the UKCP18 2.2 km grid square product. We believe this should be considered in the manuscript and have included the following text from line 584 of the manuscript:
“For example, in this study, we considered future precipitation change anomalies using the UKCP18 25km grid square data which is limited compared to the possible use of UKCP18 2.2km grid square precipitation change anomaly data.”
All subwatersheds had the same P application, etc.. The authors need to have a table that highlights what were the variables that varied by time and by watershed and what was assumed to not vary over space and/or time.
The P application does differ in the different watersheds based on diffuse Phosphorus loads in kg/day, derived from the ADAS PSYCHIC (Phosphorus and Sediment Yield Characterisation In Catchments) model by Davison et al. (2008) that uses land use data (AgCensus) to estimate loads derived from Arable and Livestock land cover in 1km grids used in a Source Apportionment GIS model. Given we knew the land cover of arable land in each sub-catchment we could calculate the kg/ha/day load, which is presented as γ value in supplementary table S3.
We accept there needs to be greater clarity on the variable spatial and temporal scales. To achieve this Table S3 of the manuscript has been updated to describe if a variable is static in space and time.How did you define the spatial and temporal scales? Did your stakeholders agree on the adopted spatio-temporal resolution?
Stakeholders co-developed the conceptual model structure, providing details of the different sub-catchments and assets in the catchment. Stakeholders then provided the data and metric information to support each of the variables included in the model structure to inform the spatio-temporal resolution.
We have added the following detail in line 185 of the manuscript to make this point clearer:
“Data, metric and catchment specific information provided by stakeholders for each model variable informed the spatio-temporal resolution of the model.”
The authors use the Precipitation rate anomaly as the major CC metric. This might be good for point sources; for no-point sources the annual % change is not the best option since P application is seasonal. Please explain the decision to move to annual rather than a seasonal scale.
We agree that exploring using annual changes is limiting and that seasonal considerations would be best for no-point sources, however, to consider a seasonal scale would require data on the seasonal differences in Phosphorus applications in the catchment, which were not available to us. We only had access to static diffuse P loads in kg/day, from the ADAS PSYCHIC model. If seasonal diffuse P data was available, it would be possible to consider seasonal differences, as the UK Climate Projection data does provide seasonal precipitation rate anomalies, which are considered in extreme scenarios.
Despite the data limitation, we would like to consider changes in seasonal Phosphorus applications while also responding to referee questions provided in comment #12 regarding the consideration of the sensitivity of the model to input parameters. We have included model outputs for a 20% increase/decrease in P application rates in each waterbody sub-catchment. The outputs of the sensitivity analysis are presented in (Table 3 of EGU_2022_617_Review#2_Supplement_Apr23), which we plan to present in the supplementary material of the manuscript.
We have combined a full response in comment #12.7) The authors need to explain more their adopted methodology that they used to model LULC change over time. Was the change over time assumed to be linear? Did they track the change as a function of what the original LULC class was?
Thank you for this important comment. LULC change was assumed to be linear until 2050, tracking the change as a function of the baseline LULC class cover. We used a story and simulation approach to consider how the land cover would change across different regional shared socioeconomic pathways (SSPs) for the UK compared to the current land cover as the baseline. Having reviewed the information in the manuscript, we believe greater information on how SSP narrative storylines were used to support the story and simulation process should be provided with land cover change values.
To better represent the land cover change in each waterbody sub-catchment under the different scenarios we have removed Table S6 and Figure S4 of the original manuscript and replaced them with Figures 2-6 in EGU_2022_617_Review#2_Supplement_Apr23.
We include the following text in Appendix S4 of the Supplementary Material:
“We used the Shared Socioeconomic pathway scenarios developed by Pedde et al (2021) to use the trends of how land cover may change in the future using UK-SSP1 as the basis for the Green Road Scenario, UK-SSP2 for the Business as Usual Scenario and UK-SSP5 as the basis for the Fossil Fuelled Development Scenario.
In the Green Road scenario, there is a greater emphasis on protecting environmental areas, therefore an increase in woodland and wild grasslands is evident in the scenario. The Green Road Scenario considers a switch to a more vegetarian-based diet, resulting in the reduction of pasture land for meat production. In contrast, the Fossil-Fuelled Development scenario includes less consideration for environmental protection and maximises the amount of land in traditional economic-based land covers, such as pasture, to support a meat-based diet and conifer plantations. Using local interpretations from stakeholders, it was clear that arable farming was a key source of income in the catchment and an area of Scotland highly desirable for arable farming, therefore, the arable land cover was increased across all scenarios. For urban land cover, all scenarios considered an increase in population in Cupar, which is the largest town in the catchment, and the neighbouring Foodieash and Dairsie. Less urbanisation is considered in the Green Road scenario as living in rural areas is considered more desirable in comparison to the Fossil Fuelled Development scenario, which sees the greatest increase in urbanisation as the UK-SSP5 trends predict an increase in movement to eastern Scotland.
For the Business as Usual scenario, land cover trends from 1990 were used to determine the changes in land cover. Arable cover has been the predominant type since the 1990s and has been gradually increasing. Pasture land has the second largest coverage, but has been gradually declining. Trends since 2010 were used to inform the Business as Usual trajectory to 2050 and historic values from the 1990s were used to consider the upper limits of the different land cover types through time.
Understanding the historic land cover type helped inform the story and simulation approach to inform the boundaries of how the land cover could change in the future under each scenario. The total hectares in each sub-catchment per land cover type were calculated for current conditions (2019). The percentage of each land cover type in each sub-catchment was then calculated and altered based on the different scenario narratives, local stakeholder knowledge and historical boundaries, before being converted back to hectares. The different land cover hectares across the different scenarios are presented in Figures S4-7. There are only subtle differences, particularly in arable land cover in the different scenarios, mainly due to the arable land cover nearly being maximised in the catchment currently.”
7) The authors adopt a 10,999 monte carlo simulations, Is that enough given the complicated model? How did they know that this value is good enough?
Thank you for this question. We tested running the model using 1,000, 10,000 and 100,000 simulations and check for differences in model output values for reactive phosphorus concentrations (µg/l) in sub-catchment 6200 (the catchment outlet). As there was no difference in model output values between these scenarios, we decided to use the standard 10,000 simulations applied in GeNIe software.
8) Page 11 line 254: Thee authors adopted the equal interval discretization. Why that and not equal frequency? Did you run a sensitivity analysis to determine the impact of the type of discretization on model results?
Thanks for this consideration. We use equal intervals to represent the final output nodes on a discrete scale to support stakeholders in intuitively understanding discretised outputs. However, the model itself is run as a hybrid model, not a fully discretised model. The intervals are informed using an indexing method and regulatory standards where available.
9) Line 266: "sum of the ‘IF’ statement was used to determine their overall state." This assumes all nodes are all equal. How can you defend that? For example, if you have 3 nodes and each have a score of one is that the same as having one node with a score of 3 and 2 nodes with a score of zero. Is that accurate? How did your stakeholders feel about these assumptions?
Thank you for this comment and the example provided. The IF statement indexing method is used for variables such as natural capital which don’t have a common metric for measurement. The first point to raise is that finding common ground across multiple sectors to agree on which capital and their resources should hold greater value is complex. An environmental regulator may value water quality to be the most important variable in the system, while a food producer may value soil quality or financial margins as holding greater value. To reduce complexity, we assumed that all nodes, particularly capitals and their capital resources, are equal in value.
We developed the IF statement indexing method and presented it to the stakeholder ‘project team’ who accepted the assumption of the index scoring. The ‘project team’ explained that a ‘one out, all out’ approach, described by Carvalho et al., (2019), is used when recording ecological status as part of the Water Framework Directive reporting. This method does have the disadvantage in that two goods don’t equal one bad, which can lead to regulators being set up to fail, however, it was highlighted that taking a precautionary approach prevents the masking of undesirable outcomes when averaging out scores and provides an easy and transparent way of measuring overall variable states.
We have updated line 265 of the manuscript to reflect on this justification:
“As multiple parent nodes were associated with capital and capital resource variables, the sum of the ‘IF’ statement was used to determine their overall state. The ‘IF’ statement indexing method follows the ‘one out, all out’ approach applied to the evaluation of Good Ecological Status in the EU Water Framework Directive, as described in Carvalho et al., (2019). The ‘one out all out’ approach adopts the precautionary principle to prevent masking of undesirable outcomes when averaging scores and provides an easy and transparent way of measuring overall variable states.”
10) Line 269: There is no Appendix E:
Thank you for highlighting this, our apologies for the error. We have updated the manuscript to reference S5 of the supplementary material instead of Appendix E.
11) Lines 278-281: Expand on how the model valuation was done. Did you compared the mean? Did you averaged concentration values over time? If so over which period? Did you see the sd? Was this done for all subwatersheds? What metric did you use to evaluate?
Thanks for this comment, we accept the information regarding the goodness of fit evaluation is unnecessarily split between sections 2.2.4 (methods) & 3.5 (results) of the manuscript, which relates to points raised in comment #15.
Using 52 reactive phosphorus concentrations (µg/l) observations at the 6200 sub-catchment outlet collected between 2017-2019, we fitted a histogram using the custom function tool in GeNIe to create an ‘observed phosphorus concentration (µg/l) 6200’ variable in the model, which was both parentless and childless. The goodness of fit method was done only in 6200 as this is the catchment outlet, therefore all other sub-catchment will drain to this point, making it a suitable point to evaluate.
When running the ‘current’ model scenario, we compared the median, standard deviation and discretised class probabilities – informed by the WFD classification boundaries for the sub-catchment – for both the modelled reactive phosphorus concentrations and observed reactive phosphorus variables to evaluate the goodness of fit. In a relevant study since the submission of our preprint, Glendell et al., (2022) applied a % Bias measure of model performance where the departure of +/- 50% from observations was considered behavioural, we have therefore updated our evaluation to include this method as an additional metric to evaluate the model.
Addressing comments #11 and #15 we have updated the manuscript from lines 278-283 with the following:
“Model performance was evaluated using a goodness of fit method (Aguilera et al., 2011) using 52 bi-monthly observed reactive phosphorus concentrations in micrograms per litre (µg/l) collected by Scottish Water in sub-catchment 6200. We fitted a histogram using the custom function tool in GeNIe to create an ‘observed phosphorus concentration (µg/l) 6200’ variable, which was both parentless and childless. We evaluate sub-catchment 6200 as this is the catchment outlet for all sub-catchments. Computing the ‘current’ model scenario, we compared the median, standard deviation and discretised class probabilities – informed by the WFD classification boundaries for the sub-catchment – for both the modelled reactive phosphorus concentrations and observed reactive phosphorus variables to evaluate the goodness of fit. A % Bias method applied by Glendell et al., (2022), with a departure of +/−50% from observations considered behavioural, was used to evaluate model performance:
%Bias=Xsim−Xobs/Xobs (see EQ1 of EGU_2022_617_Review#2_Supplement_Apr23 for the equation).
Where Xsim is the modelled reactive phosphorus concentration (µg/l) and Xobs is the observed reactive phosphorus concentration ((µg/l).”
Results reported in the manuscript from lines 446-459 have been updated to the following:
“We evaluated the model performance by comparing the modelled current reactive phosphorus concentrations in micrograms per litre in waterbody sub-catchment 6200 with a simulation of the current observed reactive phosphorus concentrations (μg/l) in waterbody sub-catchment 6200. Based on the median reactive phosphorus concentration (Table 1 - see EGU_2022_617_Review#2_Supplement_Apr23), the model underestimated the median reactive phosphorus concentration (157.63 μg/l) at the catchment outlet compared to the observed simulated median reactive phosphorus concentration (168.82 μg/l). A greater standard deviation was observed in the model simulation (361.7 μg/l) compared to the observed simulation (109.3 μg/l).
Based on the discrete output (Fig.9), the model underestimated the reactive phosphorus concentration compared to the observed simulation. The most probable state for reactive phosphorus concentrations in the observed simulation was moderate risk (44% probability) or poor WFD status, compared to the modelled scenario which estimated low-risk, or moderate ecological status with a probability of 41%. The modelled reactive phosphorus concentrations were more widely distributed, which is evident in a 2% probability of high-risk, or bad ecological status, compared with 0% for the observed data.
When evaluating the goodness of fit using the % bias correction (Table 2 see EGU_2022_617_Review#2_Supplement_Apr23) 43% of observations were within the +/- 50% behavioural threshold, 31% of simulated values were above the 50% acceptable threshold, and 26% were below the 50% acceptable threshold.”
12) The authors are encouraged to do a sensitivity analysis (sensitivity to findings and sensitivity to parameters)
Thank you, we appreciate the comment. We didn't discretise the model and perform a sensitivity to findings analysis to avoid loss of information and imprecision discussed in lines 536-549.
We do agree that a sensitivity to parameters investigation would improve the manuscript. The different Representative Concentration Pathways and Shared Socioeconomic Pathways provide a measure of sensitivity to changes in input variables Climate, Population and Land Cover, however, there is limited sensitivity testing to changes in diffuse and point source pollution concentrations, as highlighted in response to comment #5.
We have updated the manuscript to include details of the parameter sensitivity analysis of diffuse and point source variables in section 2.2.4 from lines 285:
“A one-at-a-time parameter sensitivity analysis was conducted to determine which input variables contribute the greatest variability to model outputs (Wohler et al., 2020, Hamby, 1994). We use the target variable reactive phosphorus concentrations (µg/l) at the 6200 catchment outlet to determine the sensitivity of the model to diffuse pollution phosphorus loads and point source wastewater phosphorus loads. The sensitivity analysis compared the median reactive phosphorus concentration (µg/l) for the current scenario against a +/- 20% difference for diffuse arable, pasture and septic tank P sources, and wastewater P sources while holding other input values constant."
We then present findings from the sensitivity analysis in section 3.5 of the results from lines 463 onwards with the following text and additional Table 3 (EGU_2022_617_Review#2_Supplement_Apr23).
"The results of the parameter sensitivity analysis are presented in Table 3. Changes in point source reactive phosphorus loads have a greater influence on reactive phosphorus concentrations (µg/l) compared to diffuse sources in sub-catchment 6200 under a current scenario. A 20% increase in point source loads resulted in an 8.4% increase in reactive phosphorus concentrations, while a 20% reduction resulted in an 8.1% reduction. Of the diffuse sources, arable sources had the greatest influence on reactive phosphorus concentration with a 20% increase yielding a 4.9% increase in concentration, while a 20% reduction resulted in a 6.5% reduction in concentrations.”
13) Figure 6: Discuss why the change in 6202, 6205, 6206 is so high and different from the rest under scenario (d).
Thank you for this comment. The change in reactive phosphorus concentrations in sub-catchments 6202, 6205 and 6206 in scenario (d) is because scenario (d) assumes an extreme high rainfall scenario which results in both increased diffuse source run-off and increase effluent loads as there is a greater probability of effluent discharge and spills. We believe the change is higher in the three sub-catchments compared to sub-catchments 6200 and 6201 because we can’t consider the influence of river flow volume and its diluting influence on reactive phosphorus concentrations. River flow volume data was not available for sub-catchments 6202, 6205 and 6206, only sub-catchments 6200 and 6201.
Another reason behind the higher change was due to results in the previous version of the manuscript reporting mean concentration values, which are influenced by outliers. The Fossil Fuelled Development Extreme High Precipitation scenario has the greatest standard deviation, as is evident in the graphs in our response to comment #14.
We have since updated the manuscript to present median statistics instead of mean statistics (more information in response to comment #16). The updated Figure 6 of the manuscript is presented in EGU_2022_617_Review#2_Supplement_Apr23 as Figure 7.
In addition, we have added the following text to the manuscript from line 568:
“Investigations of future scenarios highlighted that in the future business-as-usual scenario (Figure 6, Pane ii) (Figure 7 in EGU_2022_617_Review#2_Supplement_Apr23) median reactive phosphorus concentrations (µg/l) increased compared to current annual conditions (Figure 6, Pane i) in sub-catchments 6200, 6201 and 6205 and decreased in sub-catchments 6202 and 6206. Figure 7 (Figure 8 in EGU_2022_617_Review#2_Supplement_Apr23) for sub-catchment 6200 and Figures S8-12 (Figures 9-13 in EGU_2022_617_Review#2_Supplement_Apr23) for remaining sub-catchments) show increases in total phosphorus loads (kg/day) in sub-catchments 6200, 6201 and 6205, while the total phosphorus loads in sub-catchment 6202 and 6206 decreased, particularly for wastewater sources. The changes in total phosphorus can be seen in the source apportionment between wastewater and diffuse sources, as well as the trends in climate, population and land cover change. Wastewater sources increase in sub-catchments where the population is projected to increase, while diffuse sources are expected to increase in all sub-catchments.
In the Green Road and Fossil-Fuelled Development Extreme Precipitation scenarios, the influence of precipitation change and catchment processes are evident. In the Green Road Extreme Low Precipitation scenario, total phosphorus loads (kg/day) are reduced in all sub-catchments, due to reductions in diffuse run-off. The lower likelihood of wastewater spills contributing untreated effluent to wastewater source loads are also reduced.
Reactive phosphorus concentrations (µg/l) are greater in the Green Road Extreme Low Precipitation scenario compared to the current scenarios in sub-catchments 6200 and 6201, despite the reductions in total phosphorus loads in both sub-catchments (Figure 7 and Figures S8-12). We believe these concentration increases are due to the reduction in river flow volumes in the extreme low precipitation rate scenario, meaning regulating diluting functions are absent and reactive phosphorus concentrations increase. We are unable to investigate the influence of flows in the sub-catchments where reactive phosphorus concentrations decrease compared to current conditions (6202, 6205 and 6206) as observed river flow volume data were not available for all sub-catchments, see SM Table 2 (of manuscript supplementary material) for more information on how surface water quality is measured absence of river flow volume data.
In the Fossil Fuelled Development Extreme High Precipitation scenario, increases in reactive phosphorus concentrations (µg/l) compared to current conditions are evident in all sub-catchment waterbodies, which is attributed to increases in total phosphorus loads (kg/day). Increased precipitation rates increase diffuse run-off, wastewater effluent flows and the likelihood of effluent spills. For sub-catchments 6200 and 6201, despite increases in river flow volumes from increased precipitation, P source loads into the waterbodies was greater than the dilution capacity.”
14) Bar charts 7 and 8 need to show the uncertainty bounds. Also provide similar charts for the rest of the subwatersheds in the SM
Thank you for this comment, we have included the Bayesian credible interval (Q5 and Q95) uncertainty bounds in Figures 7 and 8 (Figures 8 and 13 in EGU_2022_617_Review#2_Supplement_Apr23) and added the results for the remaining sub-catchments in Figures 9-12. We will include these as Figures 7 and 8 in the manuscript and Figures S9-12 in S6 of the Supplementary Material.
15) Line 447: Explain why the evaluation was done in sub-catchment 6200. Also state the period over which the data was collected and used for that sub-catchment.
Thank you for this comment, we have acknowledged and addressed this in our response to comment #11 regarding sub-catchment 6200 being the catchment outlet.
16) Figure 9: Did you check and see if these differences are coming from the point or non-point sources. You have the point source data for all WWTPs so you can estimate the relative contribution of each.
Thank you for this consideration. We have now considered this in our sensitivity to input parameters analysis described in response to comment #12 however, we believe that this comparison wouldn’t provide a complete justification as to why the model underestimates the probability of being within a moderate risk class.
We have updated the manuscript to report results as median values, rather than mean values as there are outliers in the modelled outputs, which was influencing the overestimation values in Table 1 compared to the underestimation in the discretised outputs in Figure 9 of the original manuscript.
Further, we accept that a discussion of the underestimation and wider distribution evident for the modelled reactive phosphorus concentrations (µg/l) is required in the manuscript. We have included the following discussion on the potential reasoning behind the underestimation of the modelled reactive phosphorus concentrations and the observed concentrations from line 555 of the manuscript:
“Despite 46% of the % bias observations falling within the +/- 50% acceptable model performance, results from the goodness of fit evaluation demonstrate that the model underestimated current median reactive phosphorus concentrations (µg/l) at the catchment outlet in sub-catchment 6200 and the probable risk class. Despite, underestimation, simulated concentrations were more widely distributed, as compared to the observed data, as is evident in the 2% of observations within a high-risk state for simulated concentrations, compared to 0% for observed concentrations. A wider distribution in simulated reactive phosphorus values using a hybrid BN model was also found by Glendell et al., (2022). We concur with their considerations that both the quality and the low temporal resolutions of observed data may be responsible for this discrepancy.”
17) It is not clear why the authors adopted a continuous BN to only discretize its output. An explanation on the reasons for that is needed.
Thank you for this comment. Firstly, we avoided the discretisation of continuous variables within the model to avoid loss of information. However, from the stakeholder’s perspective, understanding the probability of model outputs falling into agreed risk classes helped the understanding and the communication of the results. Therefore, we discretised terminal continuous nodes to provide a dual representation of outputs to support effective communication of findings, as described in lines 542-549.
To make the reasoning clearer we will update the manuscript as follows from lines 258-259:
“We presented a dual representation of continuous nodes using a discretised child node to support the communication of the results using both summary statistics (median and standard deviation) available in continuous outputs and the probability of model outputs falling into agreed risk classes available in discrete variables.”
Providing a simplified visualisation of the BN model in response to referee 1 comment #1 supports our additional comments (Figure 14 in EGU_2022_617_Review#2_Supplement_Apr23)
18) Discuss why theGeNIe platform was used as compared to others?
Thanks for this consideration. We used the GeNIe software because it was free for academic use and it allows the development of hybrid models without discretising the output. Further, members of the author team have experience in using GeNIe, for example in Stewart et al, 2021 and Glendell et al 2022, meaning an in-depth understanding of the software was already available to the research team.
19) Table S2 in the SM is very important; yet it hard to follow. It also needs English editing.
Thank you for this comment, we accept that there is a lot of detail in Table S2 which makes it difficult to follow. We included as much detail about the nodes within the model as possible to provide transparency, however, the depth of detail may have been counterintuitive.
To address this, we updated Table S2 by combining node name and identifier columns, removing state and discretisation columns as they are presented in Table S3 and simplifying the supporting information column to only include key information. For brevity, we have included a sample of how the updated SM Table 2 is now presented as Table 4 of EGU_2022_617_Review#2_Supplement_Apr23.20) Are the gammas in Table S2 fixed or do they have distributions around them? How were these determined?
Thank you for these questions. The gamma values are fixed values, which are available in Table S3. The values are determined using a mixture of methods which are explained in the supporting information column of Table S2. In our response to comment #5, we have tried to make the justification for gamma values clearer when updating Table S3.
21) The authors limit their continuous distributions to normal and truncated normal. Why?
Thanks for this question. We mainly use normal and truncated normal distributions because there is limited data to use/fit different distribution types. For example, for wastewater phosphorus concentrations we rely on mean and standard deviation values, rather than multiple observations or data points. Where we do have sufficient data points/observations, we use the ‘custom’ GeNIe function, which fits a customised histogram distribution to the data available, which is the preferred method but not always possible with the reliance on summary statistics. Hence, in that sense, our model acts as a meta-model integrating output from proprietary models as well as regulatory data.
22) Table S3: Make sure you repeat header on all pages.
Thank you for highlighting this, we have repeated the header for Tables S3 and S2.
References:
Adger, W. N. 2000. Social and ecological resilience: are they related? 24, 347-364.Brown, K. 2015. Resilience, development and global change, Routledge.Carvalho, Laurence, Eleanor B. Mackay, Ana Cristina Cardoso, Annette Baattrup-Pedersen, Sebastian Birk, Kirsty L. Blackstock, Gábor Borics et al. 2019. "Protecting and restoring Europe's waters: An analysis of the future development needs of the Water Framework Directive." Science of the Total Environment 658. 1228-1238.
Cretney, R. 2014. Resilience for Whom? Emerging Critical Geographies of Socio-ecological Resilience. 8, 627-640.
Davison, P.S., Withers, P.J., Lord, E.I., Betson, M.J. and Strömqvist, J., 2008. PSYCHIC–A process-based model of phosphorus and sediment mobilisation and delivery within agricultural catchments. Part 1: Model description and parameterisation. Journal of Hydrology, 350(3-4), pp.290-302.
Folke, C. 2006. Resilience: The emergence of a perspective for social–ecological systems analyses. Global Environmental Change, 16, 253-267.
Glendell, Miriam, Zisis Gagkas, Marc Stutter, Samia Richards, Allan Lilly, Andy Vinten, and Malcolm Coull. 2022. "A systems approach to modelling phosphorus pollution risk in Scottish rivers using a spatial Bayesian Belief Network helps targeting effective mitigation measures." Frontiers in Environmental Science. 1825.
Hamby, D.M., 1995. A comparison of sensitivity analysis techniques. Health physics, 68(2), pp.195-204.
Holling, C. S. 1973. Resilience and stability of ecological systems. Annual review of ecology and systematics, 1-23.
Renaud, F. G., Birkmann, J., Damm, M. & Gallopín, G. C. 2010. Understanding multiple thresholds of coupled social–ecological systems exposed to natural hazards as external shocks. Natural Hazards, 55, 749-763.
SEPA: Sustainable Growth Agreement Scottish Water and Scottish Environment Protection Agency Progress Update February. 2020. Available Online: https://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdfhttps://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdf
Stewart, G. B., Miriam Glendell, Rob McMorran, Mads Troldborg, Zisis Gagkas, Paola Ovando, Michaela Roberts et al. 2021. "Uplandia: Making better policy in complex upland systems. Final Report."
Wöhler, L., Niebaum, G., Krol, M. and Hoekstra, A.Y., 2020. The grey water footprint of human and veterinary pharmaceuticals. Water research X, 7, p.100044.
-
AC4: 'Comment on egusphere-2022-617', Kerr Adams, 05 Apr 2023
-
AC2: 'Comment on egusphere-2022-617', Kerr Adams, 27 Mar 2023
Please note this response has been updated, please refer to the following for the latest response: https://doi.org/10.5194/egusphere-2022-617-AC4
Dear Dr Ibrahim Alameddine,
Many thanks for taking the time to review our manuscript and provide constructive comments. We believe in responding to your comments the manuscript has improved. We have detailed our response and how we plan to address each of your comments and questions where appropriate in a future version of the manuscript. Where appropriate we have included and referenced material to the attached supplementary material document named ‘EGU_2022_617_Review#2_Supplement'.
1) The authors use the term resilience but do not provide a clear definition on what that term is supposed to capture in their work.
Thank you for this comment. We agree that a definition for resilience is lacking in the manuscript.
We will include the following text to define resilience and make it clear that we are capturing socio-ecological resilience in the manuscript from line 13 onwards:
Resilience was first introduced by (Holling, 1973) as the ability of ecological systems to absorb disturbances and retain their functions in the face of change. Relationships between ecological and societal systems with Adger (2000) defining social resilience as the ability of groups and communities to cope with social, political and environmental change. The crossover between social and ecological theories led to the theory of socio-ecological system resilience (Cretney, 2014, Folke, 2006). Decision-makers must be able to understand how a system shift from one state to another (Renaud et al., 2010) to inform resilient water management and allow freshwater systems to bounce back and adapt to variability, uncertainty and transformation (Brown, 2015).
2) The authors need to provide refernces to the One Planet Choices work given its tight coupling with their work.
Thank you for this comment. We have provided the following reference to line 96 of the manuscript:
SEPA: Sustainable Growth Agreement Scottish Water and Scottish Environment Protection Agency Progress Update February. Available Online: https://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdfhttps://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdf 2020.
We have also included a footnote next to the first reference of One Planet Choices in line 96 with the following link to a video describing the method: https://vimeo.com/804313679/1139d31b45
3) How did you ensure that when you were eliciting the model from the stakeholders that you did not end up with cyclical pathways?
Thank you for highlighting these questions. We agree that methods used to prevent cyclical pathways needs to be made clearer in the manuscript.
The model boundary headings presented in lines 156-162 were not only used to purpose of the BN model was clear to the different stakeholder groups, they were also used to ensure that cause-and-effect relationships were linear, which is evident in Figure 3 (line 352) and in our response to Referee #1.
The model boundary headings were used during stage two of the five-stage participatory approach described in figure 2 line 163 of the manuscript to keep the model linear. We will make this clearer in a future version of the manuscript by providing a visual representation of the headings (Figure 1 of EGU_2022_617_Review#2_Supplement) and by updating lines 156-157 of the original manuscript to the following:
Model boundary headings (Figure 2) - in manuscript) were agreed with the project team to ensure the purpose of the BN model was achieved during model construction with different stakeholder groups and ensure that cause-and-effect relationships were linear.
4) The paper needs to discuss the choice of the spatio-temporal scales in more details. The data is reported using daily units; yet the model is predicting a yearly average in 2050. The authors work with subwatersheds, some of which are nested. How was the nesting accounting for?
Thank you for these comments and questions. Model outputs provide the average daily unit loads/concentration values for 2050, rather than the average yearly load/concentration values. We use the equation-based BN model to investigate how different climate, population and land cover scenarios to 2050 would change daily unit loads/concentrations and present the outputs as the average daily load/concentration to 2050. For example the ‘annual’ scenario takes account of the predicted annual precipitation rate anomaly, therefore in the ‘annual’ scenario we investigate how the average daily rate might change given the average annual predicted change in precipitation.
The nesting of subwatersheds was accounted for by using sub-models which was possible using the GeNIe software to create a semi-distributed model.
Given the learnings from the research, we would like understand ways of developing a spatial and dynamic BN models in the future, however, this was not the focus of the current research described in the manuscript. Instead, we aimed at addressing the gaps in the application of equation-based hybrid BN structures, which is currently limited in its spatial scale and two-time slices (current conditions and conditions in 2050) meaning many of the variables are static and time and space.
We specifically address the need to discuss the choice of spatio-temporal scale in more detail in the paper in our response to comment #5 of the review.
5) Many of the variables in the model are static in time and space. For example all subwatersheds had the same projected change in precipitation.
Thank you for the comments and questions. It is correct that many of the variables in the model are static in time and space. Using the UKCP18 precipitation change anomaly data for a 25 km grid square meant that the entire catchment was within the same spatial grid and therefore had the same projected change in precipitation. If we are to investigate the use of a spatial BN model in the future, we could utilise the UKCP18 2.2 km grid square product.
All subwatersheds had the same P application, etc.. The authors need to have a table that highlights what were the variables that varied by time and by watershed and what was assumed to not vary over space and/or time.
The P application does differ in the different watersheds based on diffuse Phosphorus loads in kg/day, derived from the ADAS PSYCHIC (Phosphorus and Sediment Yield Characterisation In Catchments) model by Davison et al. (2008) that uses land use data (AgCensus) to estimate loads derived from Arable and Livestock land cover in 1km grids used in a Source Apportionment GIS model. Given we knew the land cover of arable land in each sub-catchment we could calculate the kg/ha/day load, which is presented as γ value in supplementary table S3.
We accept there needs to be greater clarity on the variable spatial and temporal scales. To achieve this Table S3 of the manuscript has been updated to describe if a variable is static in space and time.
How did you define the spatial and temporal scales? Did your stakeholders agree on the adopted spatio-temporal resolution?
Stakeholders mapped the model structure, providing details of the different sub-catchments and assets in the catchment. Stakeholders then provided the data and metric information to support each of the variables included in the model structure to inform the spatio-temporal resolution.
The authors use the Precipitation rate anomaly as the major CC metric. This might be good for point sources; for no-point sources the annual % change is not the best option since P application is seasonal. Please explain the decision to move to annual rather than a seasonal scale.
We agree that exploring using annual changes is limiting and that seasonal considerations would be best for point sources, however, to consider a seasonal scale would require data on the seasonal differences in Phosphorus applications in the catchment are not available. We only had access to diffuse Phosphorus loads in kg/day, from the ADAS PSYCHIC. If seasonal Phosphorus application data was available, it would be possible to investigate seasonal as the UK Climate Projection data does provide seasonal precipitation rate anomalies, which are considered in the extreme scenarios.
Despite the data limitation, we would like to consider changes in seasonal Phosphorus applications while also responding to referee questions provided in comment #12 regarding the consideration of the sensitivity of the model to input parameters.
We have included model simulation outputs for a 20% increase/decrease in P application rates in each waterbody sub-catchment. The 20% change in concentration values. The outputs of the sensitivity analysis are presented in (Table 1 of EGU_2022_617_Review#2_Supplement), which we plan to present in the supplementary material of the manuscript.
We have combined a full response in comment #12.
7) The authors need to explain more their adopted methodology that they used to model LULC change over time. Was the change over time assumed to be linear? Did they track the change as a function of what the original LULC class was?
Thank you for this important comment. LULC change was assumed to be linear until 2050, tracking the change as a function of the baseline LULC class cover. A story and simulation approach to consider how land cover would change across different regional shared socioeconomic pathways (SSPs) for the UK. Having reviewed the information in the manuscript, we believe greater information on how SSP narrative storylines were used to support the story and simulation process should be provided for both land cover change values.
To better represent land cover change in each waterbody sub-catchment under the different scenarios we have removed Table S6 and Figure S4 and replaced them with the Figures 2-6 in EGU_2022_617_Review#2_Supplement.
We include the following text in Appendix S4 of the Supplementary Material:
We used the Shared Socioeconomic pathway scenarios developed by Pedde et al (2021) to use the trends of how land cover may change in the future using UK-SSP1 as the basis for the Green Road Scenario, UK-SSP2 for the Business as Usual Scenario and UK-SSP5 as the basis for the Fossil Fuelled Development Scenario.
In the Green Road scenario, there is a greater emphasis on protecting environmental areas, therefore an increase in woodland and wild grasslands is evident in the scenario. The Green Road Scenario considers a switch to a more vegetarian-based diet, resulting in the reduction of pasture land for meat production, In contrast, the Fossil-Fuelled Development scenario includes less consideration for environmental protection and maximises the amount of land in traditional economic-based land cover such as pasture to support a meat-based diet and conifer plantations. Using local interpretations from stakeholders, it was clear that arable farming was a key source of income in the catchment and an area of Scotland highly desirable for arable farming, therefore, the arable land cover was increased across all scenarios. For the Urban land cover, all scenarios considered an increase in population in Cupar which is the largest town in the catchment and the neighbouring Foodieash and Dairsie. Less urbanisation is considered in the Green Road scenario as living in rural areas is considered more desirable in comparison to the Fossil Fuelled Development scenario, which sees the greatest increase in urbanisation as the UK-SSP5 trend predicts an increase in movement to eastern Scotland.
For the Business as Usual scenario, land cover trends from 1990 where available were used to determine the changes in land cover. Arable cover has been the predominant type since the 1990s and has been gradually increasing, with pasture cover also being significant, but gradually declining and replaced with broadleaf woodland. Trends since 2010 were used to inform the Business as Usual trajectory to 2050 and historic values from the 1990s were used to consider the upper limits of the different land cover types through time.
Understanding the historic land cover type helped inform the story and simulation approach to inform the boundaries of how the land cover could change in the future under each scenario. The total hectares in each sub-catchment per land cover type were calculated for current conditions. The percentage of each land cover type in each sub-catchment was then calculated and altered based on the different scenario narratives, local stakeholder knowledge and historical boundaries, before being converted back to hectares. The different land cover hectares across the different scenarios are presented in Figures S4-7. There are only subtle differences, particularly in arable land cover in the different scenarios, mainly due to the arable land cover nearly being maximised in the catchment currently and its importance in the future.
7) The authors adopt a 10,999 monte carlo simulations, Is that enough given the complicated model? How did they know that this value is good enough?
We use 10,000 monte carlo simulations as this is the standard simulation for GeNIe. We did test using 1,000 and 100,000 simulations, but there was no divergence in model output values from using 10.000 simulations, so we, therefore, maintained the standard simulations.
8) Page 11 line 254: Thee authors adopted the equal interbval discritization. Why that and not equal frequency? Did you run a sensitivity analysis to determine the impact of the the type of discretization on model results?
Thanks for this consideration. We use equal frequency as it has the advantage of representing the resilient – high-risk states for model output variables to support stakeholders in intuitively understanding discretised outputs.
We are constrained in the types of distribution functions we can use in the model due to the lack of time-series data, or data or multiple-point data, meaning we rely mainly on fitting truncated normal distributions to represent variable distributions. As we have limited instances of- unevenly distributed histograms where an equal frequency approach may be more appropriate. Due to our limitations in applying an equal frequency discretization method, we did not perform a sensitivity analysis.
9) Line 266: "sum of the ‘IF’ statement was used to determine their overall state." This assumes all nodes are all equal. How can you defend that? For example, if you have 3 nodes and each haa a score of one is that the same as having one node with a score of 3 and 2 nodes with a score of zero. Is that accurate? How did your stakeholders feel about these assumptions?
Thank you for this comment and the example provided.
The IF statement indexing method is used for variables such as natural capital which don’t have a common metric for measurement. The first point to raise is that finding common ground across multiple sectors of which capital and their resources should hold greater value is complex. An environmental regulator may value water quality to be the most important node in the system, while a food producer may value soil quality or financial margins as holding greater value. To reduce complexity, we assumed that all nodes, particularly capitals and their capital resources, are equal in value.
We developed the IF statement indexing method and presented it to the stakeholder ‘project team’ who accepted the assumption of the index scoring. The ‘project team’ explained that a ‘one out, all out’ approach described by Carvalho et al., (2019) is used when recording ecological status as part of the Water Framework Directive reporting. This method does have the disadvantage in that two goods don’t equal one bad which can lead to regulators being set up to fail, however, it was highlighted that taking a precautionary approach prevents the masking of undesirable outcomes when averaging out scores and provides an easy and transparent way of measuring overall variable states.
We have updated line 265 of the manuscript to reflect on this justification:
As multiple parent nodes were associated with capital and capital resource nodes, the sum of the ‘IF’statement was used to determine their overall state. The ‘IF’ statement indexing method best follows the ‘one out, all out’ approach described by Carvalho et al., (2019) is used when recording ecological status as part of the Water Framework Directive reporting method. The ‘one out all out’ approach adopts a precautionary approach preventing the masking of undesirable outcomes when averaging out scores and provides an easy and transparent way of measuring overall variable states.
10) Line 269: There is no Appendix E:
Thank you for highlighting this, our apologies for the error. We have updated the manuscript to reference S5 of the supplementary material instead of Appendix E.
11) Lines 278-281: Expand on how the model valaution was done. Did you compared the mean? Did you averaged concentration values over time? If so over which period? Did you see the sd? Was this done for all subwatersheds? WHat metric did you use to evaluate?
Thanks for this comment, we accept the information regarding the goodness of fit evaluation is unnecessarily split between sections 2.2.4 (methods) & 3.5 (results) of the manuscript, which relates to points raised in comment #15.
Using 52 reactive phosphorus concentrations (µg/l) observations at the 6200 sub-catchment outlet collected between 2017-2019, we fitted a customised histogram using the custom function tool in GeNIe to create an ‘observed phosphorus concentration (µg/l) 6200’ variable in the model, which was both parentless and childless. The goodness of fit method was done only in 6200 as this is the catchment outlet, therefore all other sub-catchment will drain to this point, making it a suitable point to evaluate.
When running the ‘current’ model scenario we originally compared mean values but after further consideration of the higher stand deviation value in the modelled outputs, we have decided to compare the median and standard deviation statistics and discretised class probabilities – informed by the WFD classification boundaries for the sub-catchment – for both the modelled reactive phosphorus concentrations and observed reactive phosphorus variables to evaluate the goodness of fit. In a relevant study since the submission of our preprint, Glendell et al., (2022) applied a % Bias measure of model performance where the departure of +/- 50% from observations is considered behavioural, we have therefore updated our evaluation to include this method as a metric to evaluate the model.
Addressing comments #11 and #15 we have updated the manuscript from line 278-283 with the following:
Model performance was evaluated using a goodness of fit method (Aguilera et al., 2011) using 52 bi-monlthy observed reactive phosphorus concentrations in micrograms per litre (µg/l) collected by Scottish Water at catchment outlet in sub-cathcment 6200. We fitted a customised histogram using the custom function tool in GeNIe to create an ‘observed phosphorus concentration (µg/l) 6200’ (RP 6200 Obs) variable in the model, which was both parentless and childless. When running the ‘current’ model scenario we compared the median, standard deviation and discretised class probabilities – informed by the WFD classification boundaries for the sub-catchment – for both the modelled reactive phosphorus concentrations and observed reactive phosphorus variables to evaluate the goodness of fit. A % Bias method applied by Glendell et al., (2022) was used to measure model performance:
%Bias=Xsim−XobsXobs(1) - see EQ1 of EGU_2022_617_Review#2_Supplement for the equation.
Where Xsim is the modelled reactive phosphorus concentration and Xobs is the observed reactive phosphorus concentration.
Results reported in the manuscript from lines 446-459 have been updated to the following:
When evaluating the model performance by comparing the modelled current reactive phosphorus concentrations in micrograms per litre in waterbody sub-catchment 6200 with a simulation of the current observed reactive phosphorus concentrations in micrograms per litre in waterbody sub-catchment 6200. Based on the median reactive phosphorus concentration (Table 1 see EGU_2022_617_Review#2_Supplement), the model underestimated the median reactive phosphorus concentration (157.63 μg/l) at the catchment outlet compared to the observed simulated median reactive phosphorus concentration (168.82 μg/l). A greater standard deviation was observed in the model simulation (361.7 μg/l) compared to the observed simulation (109.3 μg/l).
Based on the discrete output (Fig.9), the model underestimated the reactive phosphorus concentration compared to the observed simulation. The most probable state for reactive phosphorus concentrations in the observed simulation was moderate-risk (44% probability) or poor WFD status, compared to the modelled scenario which estimated that low-risk, or moderate ecological status. The modelled reactive phosphorus concentrations did have a wider distribution of values, which is evident in a 2% probability of high-risk, or bad ecological status, compared with 0% for the observed data.
When evaluating the goodness of fit using the % bias correction (Table 2 see EGU_2022_617_Review#2_Supplement) 43% of observations were within the optimal +/- 50% behavioural standard, 31% of modelled values were above the 50% acceptance threshold, and 26% were below the 50% acceptance threshold.
12) The authors are encouraged to doa sensitivity analysis (sensitivity to findings and sensitivity to parameters)
Thank you, we appreciated the comment. Due to the size of the model, we did not use a sensitivity analysis as this would lead to a loss of information and imprecision discussed in lines 536-549.
We do agree that a sensitivity to parameters investigation would improve the manuscript. The different Representative Concentration Pathways and Shared Socioeconomic Pathways allow testing the sensitivity of the model to changes in Climate, Population and Land Cover, however, there is limited sensitivity testing to changes in diffuse and point source pollution concentrations, as highlighted in response to comment #5.
We have therefore updated the manuscript to include details of the parameter sensitivity analysis of diffuse and point source variables in section 2.2.4 from lines 285:
A parameter sensitivity analysis was conducted to determine which input variables contribute the greatest variability to model outputs (Hamby, 1994). We use the target variable reactive phosphorus concentrations (µg/l) at the 6200 catchment outlet to determine the sensitivity of the model to diffuse pollution phosphorus loads and point source wastewater phosphorus loads. To measure sensitivity, We consider a +/- 20% change in input values to test the sensitivity of the model to changes in diffuse and point source loads. The sensitivity analysis compared the median reactive phosphorus concentration (µg/l) for the current scenario against the upper and lower limits for each of the different input values while holding other input values constant.
We then present findings from the sensitivity analysis in section 3.5 of the results from lines 463 onwards with the following text and additional Table 3 (EGU_2022_617_Review#2_Supplement).
The results of the parameter sensitivity analysis are presented in Table 3. Changes in point source reactive phosphorus loads have a greater influence on reactive phosphorus concentrations (µg/l) compared to diffuse sources in sub-catchment 6200. A 20% increase in point source loads resulted in an 8.4% increase in reactive phosphorus concentrations, while a 20% reduction resulted in an 8.1% reduction. Of the diffuse sources, arable sources had the greatest influence on reactive phosphorus concentration with a 20% increase yielding a 4.9% increase in concentration, while a 20% reduction resulted in a 6.5% reduction in concentrations.
13) Figure 6: Discuss why the change in 6202, 6205, 6206 is so high and diffrent from teh rest under scenario (d).
Thank you for this comment. The change in reactive phosphorus concentrations in sub-catchments 6202, 6205 and 6206 in scenario (d) is because scenario (d) assumes an extreme high rainfall scenario which results in both increased diffuse source run-off and increase effluent loads as there is a greater probability of effluent discharge and spills. We believe the change is higher in the three sub-catchments compared to sub-catchments 6200 and 6201 because we can’t consider the regulating influence of surface water flows on reactive phosphorus concentrations, as this data was not available for sub-catchments 6202, 6205 and 6206.
Another reason behind the higher change was also due to the results reporting the mean concentration, which is influenced by outliers. The Fossil Fuelled Development Extreme High Precipitation scenario has the greatest standard deviation, as is evident in the graphs in our response to comment #14.
We have since updated the manuscript to present median statistics instead of mean statistics. The updated Figure 6 of the manuscript is presented in EGU_2022_617_Review#2_Supplement as Figure 7.
Despite the switch from mean the median reporting, we believe it is still important to discuss provide increased discussion of the surface water quality findings. We have therefore added the following text to the manuscript from line 568:
Investigations of future scenarios highlighted that annually under a future business-as-usual scenario (Figure 6, Pane ii) (Figure 7 in EGU_2022_617_Review#2_Supplement) median reactive phosphorus concentrations (µg/l) increased compared to current annual conditions in sub-catchments 6200, 6201 and 6205 and decreased in sub-catchments 6202 and 6206. Figure 7 (Figure 8 in EGU_2022_617_Review#2_Supplement) for sub-catchment 6200 and Figures S8-12 (Figures 9-13 in EGU_2022_617_Review#2_Supplement) for reaming sub-catchments, highlight the increases in total phosphorus loads (kg/day) in sub-catchments 6200, 6201 and 6205, while the total phosphorus loads in sub-catchment 6202 and 6206 decreased, particularly for wastewater sources. The changes in total phosphorus can be investigated in the breakdown of wastewater and diffuse sources, as well as the trends in climate, population and land cover change. In sub-catchments where the population is projected to increase, wastewater sources increase, while diffuse sources are expected to increase across all sub-catchments.
In the Green Road and Fossil-Fuelled Development Extreme Precipitation scenarios, the influence of precipitation change and catchment processes are evident. In the Green Road Extreme Low Precipitation scenario, total phosphorus loads (kg/day) reduce in all sub-catchments, due to reductions in diffuse run-off in the absence of precipitation. The lower likelihood of wastewater spills and untreated effluent increasing source wastewater source loads is also reduced.
Interestingly, reactive phosphorus concentrations (µg/l) are greater in the Green Road Extreme Low Precipitation scenario compared to the current scenarios in sub-catchments 6200 and 6201, despite the reductions in total phosphorus loads in both sub-catchments (Figure 7 and Figures S8-12). We believe concentration increases are due to the reduction in surface water flows in the extreme low precipitation rate scenario, meaning regulating diluting functions are absent and reactive phosphorus concentrations increase. We are unable to investigate the influence of flows in the sub-catchments where reactive phosphorus concentrations decrease compared to current conditions (6202, 6205 and 6206) as surface water flow data were not available for all sub-catchments, see SM Table 2 (of manuscript supplementary material) for more information on how surface water quality is measured absence of surface water flow data.
In the Fossil Fuelled Development Extreme High Precipitation scenario, increases in reactive phosphorus concentrations (µg/l) compared to current conditions are evident in all sub-catchment waterbodies, which is attributed to increases in total phosphorus loads (kg/day) in all sub-catchments. Increases in precipitation rates increase diffuse run-off, wastewater effluent flows and the likelihood of effluent spills.
14) Bar charts 7 and 8 need to show the uncertainty bounds. Also provide similar charts for the rest of the subwatersheds in the SM
Thank you for this comment, we have included uncertainty bounds in Figures 7 and 8 (Figure 8 and 13 in EGU_2022_617_Review#2_Supplement) and added the results for the remaining sub-catchments in Figures 9-12. We will include these as Figures 7 and 8 in the manuscript and Figures S8-12 in S6 of the Supplementary Material.
15) Line 447: Explain why the evaluation was done in sub-catchment 6200. Also state the period over which the data was collected and used for that sub-catchment.
Thank you for this comment, we have acknowledged and addressed this in our response to comment #11.
16) Figure 9: Did you check and see if these differences are coming from the point or non-point sources. You have the point source data for all WWTPs so you can estimate the relative contribution of each.
Thank you for this consideration. We could provide the comparison between modelled and observed WWTW source contributions, however, we believe that this comparison wouldn’t provide a complete justification as to why the model underestimates the probability of being within a moderate risk class.
We have updated the manuscript to report results as median values, rather than mean values as there are outliers in the modelled outputs, which was influencing the overestimation values in the Table 1 compared to the underestimation in the discretised outputs in Figure 9 of the original manuscript.
Further, we accept that a discussion of the underestimation, but the wider distribution of the modelled reactive phosphorus concentrations (µg/l), is required in the manuscript. We have included the following discussion on the potential reasoning behind the underestimation of the modelled reactive phosphorus concentrations and the observed concentrations from line 555:
Despite 46% of the % bias observations falling within the optimal +/- 50% acceptable range of model performance, results from the goodness of fit evaluation demonstrate that the model underestimated current median reactive phosphorus concentrations (µg/l) at the catchment outlet in sub-catchment 6200. Modelled concentrations had a greater distribution of values compared to the observed data. A greater distribution in modelled reactive phosphorus values using a hybrid Bayesian network model was also found by Glendell et al., (2022), we concur with their considerations that both the quality of observation data collected and the low temporal resolutions in observed data may influence the predictability of modelling the variability in water quality data.
17) It is not clear why the authors adopted a continous BN to only discretize its output. An explanation on the reasons for that is needed.
Thank you for this comment. We discretise continuous nodes to provide dual representation of outputs to support the effective communication of findings, which is described in lines 542-549. To make the reasoning clearer in the manuscript we will update add the following text from lines 258-259:
To prevent model outputs from being completely continuous, we presented a dual representation of continuous nodes using a discretised child node. Dual representation of nodes support the communication of both summary statistics available in continuous outputs and probabilistic outputs of resilient-high risk states available in discrete variables.
Providing a simplified visualisation of the BN model in response to referee 1 comment #1 supports our additional comments.
18) Discuss why theGeNIe platform was used as compared to others?
Thanks for this consideration. We used the GeNIe software because it was free to use for academic purposes, which is favourable in comparison to similar commercial products such as Hugin and Netica. Members of the author team have experience in using GeNIe, for example in Stewart et al, 2021 and Glendell et al 2022, meaning an in-depth understanding of the software was already available to the research.
19) Table S2 in the SM is very important; yet it hard to follow. It also needs English editing.
Thank you for this comment, we accept that there is a lot of detail in Table S2 which makes it difficult to follow. We included as much detail about the nodes within the model as possible to provide transparency, however, the depth of detail may have led to the opposite of our desired effect.
To address this, we have updated by combining node name and identifier columns, removing state and discretisation columns as they are also presented in Table S3 and simplifying the supporting information column to only include key information. For brevity, we have included a sample of how the updated SM table 2 is now presented as Table 4 of EGU_2022_617_Review#2_Supplement.
20) Are the gammas in Table S2 fixed or do they have distributions around them? How were these determined?
Thank you for these questions. The gamma values are fixed values, which are available in Table S3. The values are determined using a mixture of methods which are explained in the supporting information column.
In our response to comment #5, we have tried to make the justification for gamma values clearer by updating Table S3.
21) The authors limit their continous distributions to normal and truncated normal. Why?
Thanks for this question. We mainly use normal and truncated normal distributions because there is limited data to fit different distributions types. For example for wastewater phosphorus concentrations we rely on mean and standard deviation values, rather than multiple observations or data points. Where we do have sufficient data points/observations we use the ‘custom’ GeNIe function, which fits an appropriate customised distribution to the data available, which is the preferable method.
22) Table S3: Make sure you repeat header on all pages.
Thank you for highlighting this, we have repeated the header for Tables S3 and S2.
**Please note additional figure to EGU_2022_617_Review#2_Supplement - Figure 13**
We have had difficulty uploading the visualisation of the model requested by Referee #1, we have therefore included the representation in the supplementary material in response to this review.
References:
1) Carvalho, Laurence, Eleanor B. Mackay, Ana Cristina Cardoso, Annette Baattrup-Pedersen, Sebastian Birk, Kirsty L. Blackstock, Gábor Borics et al. "Protecting and restoring Europe's waters: An analysis of the future development needs of the Water Framework Directive." Science of the Total Environment 658 (2019): 1228-1238.
2) Glendell, Miriam, Zisis Gagkas, Marc Stutter, Samia Richards, Allan Lilly, Andy Vinten, and Malcolm Coull. "A systems approach to modelling phosphorus pollution risk in Scottish rivers using a spatial Bayesian Belief Network helps targeting effective mitigation measures." Frontiers in Environmental Science (2022): 1825.
3) Stewart, G. B., Miriam Glendell, Rob McMorran, Mads Troldborg, Zisis Gagkas, Paola Ovando, Michaela Roberts et al. "Uplandia: Making better policy in complex upland systems. Final Report." (2021).
-
AC4: 'Comment on egusphere-2022-617', Kerr Adams, 05 Apr 2023
Dear Dr Ibrahim Alameddine,
Thank you for your constructive comments on our manuscript. We believe in responding to your comments the manuscript has improved.
We have detailed our response and how we plan to address each of your comments and questions where appropriate in a future version of the manuscript. Where appropriate we have included and referenced material to the attached supplementary material document named ‘EGU_2022_617_Review#2_Supplement_Apr23'.Please note, we had issues finalising our response. We apologise for the confusion and duplicating responses and kindly ask you to review the responses provided on the 5th of April 2023.
1) The authors use the term resilience but do not provide a clear definition on what that term is supposed to capture in their work.
Thank you for this comment. We agree that a definition of resilience is lacking in the manuscript.
We will include the following text to define resilience and make it clear that we are capturing socio-ecological resilience in the manuscript from line 13 onwards:“Resilience was first introduced by (Holling, 1973) as the ability of ecological systems to absorb disturbances and retain their functions in the face of change. Adger (2000) later defined social resilience as the ability of groups and communities to cope with social, political and environmental change. The crossover between social and ecological theories led to the theory of socio-ecological system resilience (Cretney, 2014, Folke, 2006). Decision-makers must be able to understand how a system shifts from one state to another in the face of change (Renaud et al., 2010) to inform resilient water management and allow freshwater systems to bounce back and adapt to variability, uncertainty and transformation (Brown, 2015).”
2) The authors need to provide references to the One Planet Choices work given its tight coupling with their work.
Thank you for this comment. We have provided the following reference to line 96 of the manuscript:
“SEPA: Sustainable Growth Agreement Scottish Water and Scottish Environment Protection Agency Progress Update February. 2020. Available Online: https://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdfhttps://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdf.
We have also included a footnote next to the first reference of One Planet Choices in line 96 with the following link to a video describing the method: https://vimeo.com/804313679/1139d31b45”
3) How did you ensure that when you were eliciting the model from the stakeholders that you did not end up with cyclical pathways?
Thank you for highlighting these questions. We agree that methods used to prevent cyclical pathways need to be made clearer in the manuscript.The model section headings presented in lines 156-162 were used to ensure that the purpose of the BN model was clear to the different stakeholder groups. In addition, they were also used to ensure that cause-and-effect relationships were linear, which is evident in Figure 3 (line 352) and our response to Referee #1 comment #1.
The model section headings were used during stage two of the five-stage participatory approach described in Figure 2 line 163 of the manuscript to keep the model linear. We will make this clearer in a future version of the manuscript by providing a visual representation of the headings (Figure 1 of EGU_2022_617_Review#2_Supplement_Apr23) and by updating lines 156-157 of the original manuscript to the following:
“Model section headings (Figure 2 - in manuscript) were agreed with the project team at the outset to clarify the modelling purpose with different stakeholder groups and ensure that the elicited cause-and-effect relationships were linear.”
4) The paper needs to discuss the choice of the spatio-temporal scales in more details. The data is reported using daily units; yet the model is predicting a yearly average in 2050. The authors work with subwatersheds, some of which are nested. How was the nesting accounting for?
Thank you for these comments and questions. Model outputs provide the average daily unit loads/concentration values for 2050, rather than the average yearly load/concentration values. We use the equation-based BN model to investigate how different climate, population and land cover scenarios to 2050 would change daily unit loads/concentrations and present the outputs as the average daily load/concentration to 2050. For example, the ‘annual’ scenario takes account of the predicted annual precipitation rate anomaly, therefore in the ‘annual’ scenario we investigate how the average daily rate might change given the average annual predicted change in precipitation.
Sub-watersheds were modelled using sub-models to create a semi-distributed model. The sub-models were nested to account for the routing of water and contaminants from upstream sub-watersheds to downstream sub-watersheds.
The spatial and temporal scales are limited in the model meaning many of the variables are static in time and space. In this research, we aimed to address gaps in the application of equation-based hybrid BN structures in environmental risk assessment. Given the findings from the research, we would like to understand ways of developing a spatial and dynamic BN model in the future, however, this was not the focus of the current research described in the manuscript.
We specifically address the need to discuss the choice of spatio-temporal scale in more detail in the paper in our response to comment #5 of the review.
5) Many of the variables in the model are static in time and space. For example all subwatersheds had the same projected change in precipitation.
Thank you for the comments and questions. It is correct that many of the variables in the model are static in time and space. Using the UKCP18 precipitation change anomaly data for a 25 km grid square meant that the entire catchment was within the same spatial grid and therefore had the same projected change in precipitation. If we are to investigate the use of a spatial BN model in the future, we could utilise the UKCP18 2.2 km grid square product. We believe this should be considered in the manuscript and have included the following text from line 584 of the manuscript:
“For example, in this study, we considered future precipitation change anomalies using the UKCP18 25km grid square data which is limited compared to the possible use of UKCP18 2.2km grid square precipitation change anomaly data.”
All subwatersheds had the same P application, etc.. The authors need to have a table that highlights what were the variables that varied by time and by watershed and what was assumed to not vary over space and/or time.
The P application does differ in the different watersheds based on diffuse Phosphorus loads in kg/day, derived from the ADAS PSYCHIC (Phosphorus and Sediment Yield Characterisation In Catchments) model by Davison et al. (2008) that uses land use data (AgCensus) to estimate loads derived from Arable and Livestock land cover in 1km grids used in a Source Apportionment GIS model. Given we knew the land cover of arable land in each sub-catchment we could calculate the kg/ha/day load, which is presented as γ value in supplementary table S3.
We accept there needs to be greater clarity on the variable spatial and temporal scales. To achieve this Table S3 of the manuscript has been updated to describe if a variable is static in space and time.How did you define the spatial and temporal scales? Did your stakeholders agree on the adopted spatio-temporal resolution?
Stakeholders co-developed the conceptual model structure, providing details of the different sub-catchments and assets in the catchment. Stakeholders then provided the data and metric information to support each of the variables included in the model structure to inform the spatio-temporal resolution.
We have added the following detail in line 185 of the manuscript to make this point clearer:
“Data, metric and catchment specific information provided by stakeholders for each model variable informed the spatio-temporal resolution of the model.”
The authors use the Precipitation rate anomaly as the major CC metric. This might be good for point sources; for no-point sources the annual % change is not the best option since P application is seasonal. Please explain the decision to move to annual rather than a seasonal scale.
We agree that exploring using annual changes is limiting and that seasonal considerations would be best for no-point sources, however, to consider a seasonal scale would require data on the seasonal differences in Phosphorus applications in the catchment, which were not available to us. We only had access to static diffuse P loads in kg/day, from the ADAS PSYCHIC model. If seasonal diffuse P data was available, it would be possible to consider seasonal differences, as the UK Climate Projection data does provide seasonal precipitation rate anomalies, which are considered in extreme scenarios.
Despite the data limitation, we would like to consider changes in seasonal Phosphorus applications while also responding to referee questions provided in comment #12 regarding the consideration of the sensitivity of the model to input parameters. We have included model outputs for a 20% increase/decrease in P application rates in each waterbody sub-catchment. The outputs of the sensitivity analysis are presented in (Table 3 of EGU_2022_617_Review#2_Supplement_Apr23), which we plan to present in the supplementary material of the manuscript.
We have combined a full response in comment #12.7) The authors need to explain more their adopted methodology that they used to model LULC change over time. Was the change over time assumed to be linear? Did they track the change as a function of what the original LULC class was?
Thank you for this important comment. LULC change was assumed to be linear until 2050, tracking the change as a function of the baseline LULC class cover. We used a story and simulation approach to consider how the land cover would change across different regional shared socioeconomic pathways (SSPs) for the UK compared to the current land cover as the baseline. Having reviewed the information in the manuscript, we believe greater information on how SSP narrative storylines were used to support the story and simulation process should be provided with land cover change values.
To better represent the land cover change in each waterbody sub-catchment under the different scenarios we have removed Table S6 and Figure S4 of the original manuscript and replaced them with Figures 2-6 in EGU_2022_617_Review#2_Supplement_Apr23.
We include the following text in Appendix S4 of the Supplementary Material:
“We used the Shared Socioeconomic pathway scenarios developed by Pedde et al (2021) to use the trends of how land cover may change in the future using UK-SSP1 as the basis for the Green Road Scenario, UK-SSP2 for the Business as Usual Scenario and UK-SSP5 as the basis for the Fossil Fuelled Development Scenario.
In the Green Road scenario, there is a greater emphasis on protecting environmental areas, therefore an increase in woodland and wild grasslands is evident in the scenario. The Green Road Scenario considers a switch to a more vegetarian-based diet, resulting in the reduction of pasture land for meat production. In contrast, the Fossil-Fuelled Development scenario includes less consideration for environmental protection and maximises the amount of land in traditional economic-based land covers, such as pasture, to support a meat-based diet and conifer plantations. Using local interpretations from stakeholders, it was clear that arable farming was a key source of income in the catchment and an area of Scotland highly desirable for arable farming, therefore, the arable land cover was increased across all scenarios. For urban land cover, all scenarios considered an increase in population in Cupar, which is the largest town in the catchment, and the neighbouring Foodieash and Dairsie. Less urbanisation is considered in the Green Road scenario as living in rural areas is considered more desirable in comparison to the Fossil Fuelled Development scenario, which sees the greatest increase in urbanisation as the UK-SSP5 trends predict an increase in movement to eastern Scotland.
For the Business as Usual scenario, land cover trends from 1990 were used to determine the changes in land cover. Arable cover has been the predominant type since the 1990s and has been gradually increasing. Pasture land has the second largest coverage, but has been gradually declining. Trends since 2010 were used to inform the Business as Usual trajectory to 2050 and historic values from the 1990s were used to consider the upper limits of the different land cover types through time.
Understanding the historic land cover type helped inform the story and simulation approach to inform the boundaries of how the land cover could change in the future under each scenario. The total hectares in each sub-catchment per land cover type were calculated for current conditions (2019). The percentage of each land cover type in each sub-catchment was then calculated and altered based on the different scenario narratives, local stakeholder knowledge and historical boundaries, before being converted back to hectares. The different land cover hectares across the different scenarios are presented in Figures S4-7. There are only subtle differences, particularly in arable land cover in the different scenarios, mainly due to the arable land cover nearly being maximised in the catchment currently.”
7) The authors adopt a 10,999 monte carlo simulations, Is that enough given the complicated model? How did they know that this value is good enough?
Thank you for this question. We tested running the model using 1,000, 10,000 and 100,000 simulations and check for differences in model output values for reactive phosphorus concentrations (µg/l) in sub-catchment 6200 (the catchment outlet). As there was no difference in model output values between these scenarios, we decided to use the standard 10,000 simulations applied in GeNIe software.
8) Page 11 line 254: Thee authors adopted the equal interval discretization. Why that and not equal frequency? Did you run a sensitivity analysis to determine the impact of the type of discretization on model results?
Thanks for this consideration. We use equal intervals to represent the final output nodes on a discrete scale to support stakeholders in intuitively understanding discretised outputs. However, the model itself is run as a hybrid model, not a fully discretised model. The intervals are informed using an indexing method and regulatory standards where available.
9) Line 266: "sum of the ‘IF’ statement was used to determine their overall state." This assumes all nodes are all equal. How can you defend that? For example, if you have 3 nodes and each have a score of one is that the same as having one node with a score of 3 and 2 nodes with a score of zero. Is that accurate? How did your stakeholders feel about these assumptions?
Thank you for this comment and the example provided. The IF statement indexing method is used for variables such as natural capital which don’t have a common metric for measurement. The first point to raise is that finding common ground across multiple sectors to agree on which capital and their resources should hold greater value is complex. An environmental regulator may value water quality to be the most important variable in the system, while a food producer may value soil quality or financial margins as holding greater value. To reduce complexity, we assumed that all nodes, particularly capitals and their capital resources, are equal in value.
We developed the IF statement indexing method and presented it to the stakeholder ‘project team’ who accepted the assumption of the index scoring. The ‘project team’ explained that a ‘one out, all out’ approach, described by Carvalho et al., (2019), is used when recording ecological status as part of the Water Framework Directive reporting. This method does have the disadvantage in that two goods don’t equal one bad, which can lead to regulators being set up to fail, however, it was highlighted that taking a precautionary approach prevents the masking of undesirable outcomes when averaging out scores and provides an easy and transparent way of measuring overall variable states.
We have updated line 265 of the manuscript to reflect on this justification:
“As multiple parent nodes were associated with capital and capital resource variables, the sum of the ‘IF’ statement was used to determine their overall state. The ‘IF’ statement indexing method follows the ‘one out, all out’ approach applied to the evaluation of Good Ecological Status in the EU Water Framework Directive, as described in Carvalho et al., (2019). The ‘one out all out’ approach adopts the precautionary principle to prevent masking of undesirable outcomes when averaging scores and provides an easy and transparent way of measuring overall variable states.”
10) Line 269: There is no Appendix E:
Thank you for highlighting this, our apologies for the error. We have updated the manuscript to reference S5 of the supplementary material instead of Appendix E.
11) Lines 278-281: Expand on how the model valuation was done. Did you compared the mean? Did you averaged concentration values over time? If so over which period? Did you see the sd? Was this done for all subwatersheds? What metric did you use to evaluate?
Thanks for this comment, we accept the information regarding the goodness of fit evaluation is unnecessarily split between sections 2.2.4 (methods) & 3.5 (results) of the manuscript, which relates to points raised in comment #15.
Using 52 reactive phosphorus concentrations (µg/l) observations at the 6200 sub-catchment outlet collected between 2017-2019, we fitted a histogram using the custom function tool in GeNIe to create an ‘observed phosphorus concentration (µg/l) 6200’ variable in the model, which was both parentless and childless. The goodness of fit method was done only in 6200 as this is the catchment outlet, therefore all other sub-catchment will drain to this point, making it a suitable point to evaluate.
When running the ‘current’ model scenario, we compared the median, standard deviation and discretised class probabilities – informed by the WFD classification boundaries for the sub-catchment – for both the modelled reactive phosphorus concentrations and observed reactive phosphorus variables to evaluate the goodness of fit. In a relevant study since the submission of our preprint, Glendell et al., (2022) applied a % Bias measure of model performance where the departure of +/- 50% from observations was considered behavioural, we have therefore updated our evaluation to include this method as an additional metric to evaluate the model.
Addressing comments #11 and #15 we have updated the manuscript from lines 278-283 with the following:
“Model performance was evaluated using a goodness of fit method (Aguilera et al., 2011) using 52 bi-monthly observed reactive phosphorus concentrations in micrograms per litre (µg/l) collected by Scottish Water in sub-catchment 6200. We fitted a histogram using the custom function tool in GeNIe to create an ‘observed phosphorus concentration (µg/l) 6200’ variable, which was both parentless and childless. We evaluate sub-catchment 6200 as this is the catchment outlet for all sub-catchments. Computing the ‘current’ model scenario, we compared the median, standard deviation and discretised class probabilities – informed by the WFD classification boundaries for the sub-catchment – for both the modelled reactive phosphorus concentrations and observed reactive phosphorus variables to evaluate the goodness of fit. A % Bias method applied by Glendell et al., (2022), with a departure of +/−50% from observations considered behavioural, was used to evaluate model performance:
%Bias=Xsim−Xobs/Xobs (see EQ1 of EGU_2022_617_Review#2_Supplement_Apr23 for the equation).
Where Xsim is the modelled reactive phosphorus concentration (µg/l) and Xobs is the observed reactive phosphorus concentration ((µg/l).”
Results reported in the manuscript from lines 446-459 have been updated to the following:
“We evaluated the model performance by comparing the modelled current reactive phosphorus concentrations in micrograms per litre in waterbody sub-catchment 6200 with a simulation of the current observed reactive phosphorus concentrations (μg/l) in waterbody sub-catchment 6200. Based on the median reactive phosphorus concentration (Table 1 - see EGU_2022_617_Review#2_Supplement_Apr23), the model underestimated the median reactive phosphorus concentration (157.63 μg/l) at the catchment outlet compared to the observed simulated median reactive phosphorus concentration (168.82 μg/l). A greater standard deviation was observed in the model simulation (361.7 μg/l) compared to the observed simulation (109.3 μg/l).
Based on the discrete output (Fig.9), the model underestimated the reactive phosphorus concentration compared to the observed simulation. The most probable state for reactive phosphorus concentrations in the observed simulation was moderate risk (44% probability) or poor WFD status, compared to the modelled scenario which estimated low-risk, or moderate ecological status with a probability of 41%. The modelled reactive phosphorus concentrations were more widely distributed, which is evident in a 2% probability of high-risk, or bad ecological status, compared with 0% for the observed data.
When evaluating the goodness of fit using the % bias correction (Table 2 see EGU_2022_617_Review#2_Supplement_Apr23) 43% of observations were within the +/- 50% behavioural threshold, 31% of simulated values were above the 50% acceptable threshold, and 26% were below the 50% acceptable threshold.”
12) The authors are encouraged to do a sensitivity analysis (sensitivity to findings and sensitivity to parameters)
Thank you, we appreciate the comment. We didn't discretise the model and perform a sensitivity to findings analysis to avoid loss of information and imprecision discussed in lines 536-549.
We do agree that a sensitivity to parameters investigation would improve the manuscript. The different Representative Concentration Pathways and Shared Socioeconomic Pathways provide a measure of sensitivity to changes in input variables Climate, Population and Land Cover, however, there is limited sensitivity testing to changes in diffuse and point source pollution concentrations, as highlighted in response to comment #5.
We have updated the manuscript to include details of the parameter sensitivity analysis of diffuse and point source variables in section 2.2.4 from lines 285:
“A one-at-a-time parameter sensitivity analysis was conducted to determine which input variables contribute the greatest variability to model outputs (Wohler et al., 2020, Hamby, 1994). We use the target variable reactive phosphorus concentrations (µg/l) at the 6200 catchment outlet to determine the sensitivity of the model to diffuse pollution phosphorus loads and point source wastewater phosphorus loads. The sensitivity analysis compared the median reactive phosphorus concentration (µg/l) for the current scenario against a +/- 20% difference for diffuse arable, pasture and septic tank P sources, and wastewater P sources while holding other input values constant."
We then present findings from the sensitivity analysis in section 3.5 of the results from lines 463 onwards with the following text and additional Table 3 (EGU_2022_617_Review#2_Supplement_Apr23).
"The results of the parameter sensitivity analysis are presented in Table 3. Changes in point source reactive phosphorus loads have a greater influence on reactive phosphorus concentrations (µg/l) compared to diffuse sources in sub-catchment 6200 under a current scenario. A 20% increase in point source loads resulted in an 8.4% increase in reactive phosphorus concentrations, while a 20% reduction resulted in an 8.1% reduction. Of the diffuse sources, arable sources had the greatest influence on reactive phosphorus concentration with a 20% increase yielding a 4.9% increase in concentration, while a 20% reduction resulted in a 6.5% reduction in concentrations.”
13) Figure 6: Discuss why the change in 6202, 6205, 6206 is so high and different from the rest under scenario (d).
Thank you for this comment. The change in reactive phosphorus concentrations in sub-catchments 6202, 6205 and 6206 in scenario (d) is because scenario (d) assumes an extreme high rainfall scenario which results in both increased diffuse source run-off and increase effluent loads as there is a greater probability of effluent discharge and spills. We believe the change is higher in the three sub-catchments compared to sub-catchments 6200 and 6201 because we can’t consider the influence of river flow volume and its diluting influence on reactive phosphorus concentrations. River flow volume data was not available for sub-catchments 6202, 6205 and 6206, only sub-catchments 6200 and 6201.
Another reason behind the higher change was due to results in the previous version of the manuscript reporting mean concentration values, which are influenced by outliers. The Fossil Fuelled Development Extreme High Precipitation scenario has the greatest standard deviation, as is evident in the graphs in our response to comment #14.
We have since updated the manuscript to present median statistics instead of mean statistics (more information in response to comment #16). The updated Figure 6 of the manuscript is presented in EGU_2022_617_Review#2_Supplement_Apr23 as Figure 7.
In addition, we have added the following text to the manuscript from line 568:
“Investigations of future scenarios highlighted that in the future business-as-usual scenario (Figure 6, Pane ii) (Figure 7 in EGU_2022_617_Review#2_Supplement_Apr23) median reactive phosphorus concentrations (µg/l) increased compared to current annual conditions (Figure 6, Pane i) in sub-catchments 6200, 6201 and 6205 and decreased in sub-catchments 6202 and 6206. Figure 7 (Figure 8 in EGU_2022_617_Review#2_Supplement_Apr23) for sub-catchment 6200 and Figures S8-12 (Figures 9-13 in EGU_2022_617_Review#2_Supplement_Apr23) for remaining sub-catchments) show increases in total phosphorus loads (kg/day) in sub-catchments 6200, 6201 and 6205, while the total phosphorus loads in sub-catchment 6202 and 6206 decreased, particularly for wastewater sources. The changes in total phosphorus can be seen in the source apportionment between wastewater and diffuse sources, as well as the trends in climate, population and land cover change. Wastewater sources increase in sub-catchments where the population is projected to increase, while diffuse sources are expected to increase in all sub-catchments.
In the Green Road and Fossil-Fuelled Development Extreme Precipitation scenarios, the influence of precipitation change and catchment processes are evident. In the Green Road Extreme Low Precipitation scenario, total phosphorus loads (kg/day) are reduced in all sub-catchments, due to reductions in diffuse run-off. The lower likelihood of wastewater spills contributing untreated effluent to wastewater source loads are also reduced.
Reactive phosphorus concentrations (µg/l) are greater in the Green Road Extreme Low Precipitation scenario compared to the current scenarios in sub-catchments 6200 and 6201, despite the reductions in total phosphorus loads in both sub-catchments (Figure 7 and Figures S8-12). We believe these concentration increases are due to the reduction in river flow volumes in the extreme low precipitation rate scenario, meaning regulating diluting functions are absent and reactive phosphorus concentrations increase. We are unable to investigate the influence of flows in the sub-catchments where reactive phosphorus concentrations decrease compared to current conditions (6202, 6205 and 6206) as observed river flow volume data were not available for all sub-catchments, see SM Table 2 (of manuscript supplementary material) for more information on how surface water quality is measured absence of river flow volume data.
In the Fossil Fuelled Development Extreme High Precipitation scenario, increases in reactive phosphorus concentrations (µg/l) compared to current conditions are evident in all sub-catchment waterbodies, which is attributed to increases in total phosphorus loads (kg/day). Increased precipitation rates increase diffuse run-off, wastewater effluent flows and the likelihood of effluent spills. For sub-catchments 6200 and 6201, despite increases in river flow volumes from increased precipitation, P source loads into the waterbodies was greater than the dilution capacity.”
14) Bar charts 7 and 8 need to show the uncertainty bounds. Also provide similar charts for the rest of the subwatersheds in the SM
Thank you for this comment, we have included the Bayesian credible interval (Q5 and Q95) uncertainty bounds in Figures 7 and 8 (Figures 8 and 13 in EGU_2022_617_Review#2_Supplement_Apr23) and added the results for the remaining sub-catchments in Figures 9-12. We will include these as Figures 7 and 8 in the manuscript and Figures S9-12 in S6 of the Supplementary Material.
15) Line 447: Explain why the evaluation was done in sub-catchment 6200. Also state the period over which the data was collected and used for that sub-catchment.
Thank you for this comment, we have acknowledged and addressed this in our response to comment #11 regarding sub-catchment 6200 being the catchment outlet.
16) Figure 9: Did you check and see if these differences are coming from the point or non-point sources. You have the point source data for all WWTPs so you can estimate the relative contribution of each.
Thank you for this consideration. We have now considered this in our sensitivity to input parameters analysis described in response to comment #12 however, we believe that this comparison wouldn’t provide a complete justification as to why the model underestimates the probability of being within a moderate risk class.
We have updated the manuscript to report results as median values, rather than mean values as there are outliers in the modelled outputs, which was influencing the overestimation values in Table 1 compared to the underestimation in the discretised outputs in Figure 9 of the original manuscript.
Further, we accept that a discussion of the underestimation and wider distribution evident for the modelled reactive phosphorus concentrations (µg/l) is required in the manuscript. We have included the following discussion on the potential reasoning behind the underestimation of the modelled reactive phosphorus concentrations and the observed concentrations from line 555 of the manuscript:
“Despite 46% of the % bias observations falling within the +/- 50% acceptable model performance, results from the goodness of fit evaluation demonstrate that the model underestimated current median reactive phosphorus concentrations (µg/l) at the catchment outlet in sub-catchment 6200 and the probable risk class. Despite, underestimation, simulated concentrations were more widely distributed, as compared to the observed data, as is evident in the 2% of observations within a high-risk state for simulated concentrations, compared to 0% for observed concentrations. A wider distribution in simulated reactive phosphorus values using a hybrid BN model was also found by Glendell et al., (2022). We concur with their considerations that both the quality and the low temporal resolutions of observed data may be responsible for this discrepancy.”
17) It is not clear why the authors adopted a continuous BN to only discretize its output. An explanation on the reasons for that is needed.
Thank you for this comment. Firstly, we avoided the discretisation of continuous variables within the model to avoid loss of information. However, from the stakeholder’s perspective, understanding the probability of model outputs falling into agreed risk classes helped the understanding and the communication of the results. Therefore, we discretised terminal continuous nodes to provide a dual representation of outputs to support effective communication of findings, as described in lines 542-549.
To make the reasoning clearer we will update the manuscript as follows from lines 258-259:
“We presented a dual representation of continuous nodes using a discretised child node to support the communication of the results using both summary statistics (median and standard deviation) available in continuous outputs and the probability of model outputs falling into agreed risk classes available in discrete variables.”
Providing a simplified visualisation of the BN model in response to referee 1 comment #1 supports our additional comments (Figure 14 in EGU_2022_617_Review#2_Supplement_Apr23)
18) Discuss why theGeNIe platform was used as compared to others?
Thanks for this consideration. We used the GeNIe software because it was free for academic use and it allows the development of hybrid models without discretising the output. Further, members of the author team have experience in using GeNIe, for example in Stewart et al, 2021 and Glendell et al 2022, meaning an in-depth understanding of the software was already available to the research team.
19) Table S2 in the SM is very important; yet it hard to follow. It also needs English editing.
Thank you for this comment, we accept that there is a lot of detail in Table S2 which makes it difficult to follow. We included as much detail about the nodes within the model as possible to provide transparency, however, the depth of detail may have been counterintuitive.
To address this, we updated Table S2 by combining node name and identifier columns, removing state and discretisation columns as they are presented in Table S3 and simplifying the supporting information column to only include key information. For brevity, we have included a sample of how the updated SM Table 2 is now presented as Table 4 of EGU_2022_617_Review#2_Supplement_Apr23.20) Are the gammas in Table S2 fixed or do they have distributions around them? How were these determined?
Thank you for these questions. The gamma values are fixed values, which are available in Table S3. The values are determined using a mixture of methods which are explained in the supporting information column of Table S2. In our response to comment #5, we have tried to make the justification for gamma values clearer when updating Table S3.
21) The authors limit their continuous distributions to normal and truncated normal. Why?
Thanks for this question. We mainly use normal and truncated normal distributions because there is limited data to use/fit different distribution types. For example, for wastewater phosphorus concentrations we rely on mean and standard deviation values, rather than multiple observations or data points. Where we do have sufficient data points/observations, we use the ‘custom’ GeNIe function, which fits a customised histogram distribution to the data available, which is the preferred method but not always possible with the reliance on summary statistics. Hence, in that sense, our model acts as a meta-model integrating output from proprietary models as well as regulatory data.
22) Table S3: Make sure you repeat header on all pages.
Thank you for highlighting this, we have repeated the header for Tables S3 and S2.
References:
Adger, W. N. 2000. Social and ecological resilience: are they related? 24, 347-364.Brown, K. 2015. Resilience, development and global change, Routledge.Carvalho, Laurence, Eleanor B. Mackay, Ana Cristina Cardoso, Annette Baattrup-Pedersen, Sebastian Birk, Kirsty L. Blackstock, Gábor Borics et al. 2019. "Protecting and restoring Europe's waters: An analysis of the future development needs of the Water Framework Directive." Science of the Total Environment 658. 1228-1238.
Cretney, R. 2014. Resilience for Whom? Emerging Critical Geographies of Socio-ecological Resilience. 8, 627-640.
Davison, P.S., Withers, P.J., Lord, E.I., Betson, M.J. and Strömqvist, J., 2008. PSYCHIC–A process-based model of phosphorus and sediment mobilisation and delivery within agricultural catchments. Part 1: Model description and parameterisation. Journal of Hydrology, 350(3-4), pp.290-302.
Folke, C. 2006. Resilience: The emergence of a perspective for social–ecological systems analyses. Global Environmental Change, 16, 253-267.
Glendell, Miriam, Zisis Gagkas, Marc Stutter, Samia Richards, Allan Lilly, Andy Vinten, and Malcolm Coull. 2022. "A systems approach to modelling phosphorus pollution risk in Scottish rivers using a spatial Bayesian Belief Network helps targeting effective mitigation measures." Frontiers in Environmental Science. 1825.
Hamby, D.M., 1995. A comparison of sensitivity analysis techniques. Health physics, 68(2), pp.195-204.
Holling, C. S. 1973. Resilience and stability of ecological systems. Annual review of ecology and systematics, 1-23.
Renaud, F. G., Birkmann, J., Damm, M. & Gallopín, G. C. 2010. Understanding multiple thresholds of coupled social–ecological systems exposed to natural hazards as external shocks. Natural Hazards, 55, 749-763.
SEPA: Sustainable Growth Agreement Scottish Water and Scottish Environment Protection Agency Progress Update February. 2020. Available Online: https://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdfhttps://www.sepa.org.uk/media/496202/scottish-water-sga-update.pdf
Stewart, G. B., Miriam Glendell, Rob McMorran, Mads Troldborg, Zisis Gagkas, Paola Ovando, Michaela Roberts et al. 2021. "Uplandia: Making better policy in complex upland systems. Final Report."
Wöhler, L., Niebaum, G., Krol, M. and Hoekstra, A.Y., 2020. The grey water footprint of human and veterinary pharmaceuticals. Water research X, 7, p.100044.