Reply to rebuttal
Dear authors, thank you for your reply.
I enjoyed seeing how this paper has turned into a better version of itself. I think your reply addressed most of the points I initially highlighted. To simplify the process, I will not repeat here the points where I agree with your reply. However, there are still some open points that I would like to discuss further, based on the reply and the restructuring of the paper. I will start with your reply to my comments, then move to general comments about the paper (in its current state), and finish with some specific line-by-line comments.
************************
About the reply
************************
----------------------
In the reply regarding the discussion about optimality you responded:
"In this paper, we do not answer the question ``what is optimal?'' with an optimal network.
Rather, we reflect on the question of how to define optimality in a way that is logically
consistent and useful within the monitoring network optimization context, thereby questioning the widespread use of minimum dependence between stations as part of the objectives."
With this in mind, I think the title of the paper, while catchy, may be misleading. I am not against making this the point of discussion of your paper, but I believe the title could better reflect the discussion inside. I think the main focus of this paper is to avoid using redundancy metrics unless the problem is explicitly defined to require them, and to examine the effect of the greedy-drop algorithm when defining monitoring networks.
-----------------------
"l367-368 Language has to be precise (how to numerically calculate this objective function, or other objective functions used in other approaches)
Agreed, we will modify as follows: Another important question that needs to be addressed in future research is to investigate how the choices and assumptions made (i.e., data quantization which influences probability distribution) in the numerical calculation of objective functions would affect network ranking."
In this comment, I was also trying to highlight that not all of the equations used to calculate the metrics in your study are explicitly shown. At the moment, some of them are there, but I was not able to find (in the body of the document) the complete information needed to replicate your study. In your paper you mention that you are using the same weights as the original authors (l259-260) to identify a single solution in the case of MOO, but it would be best to present this directly in the formulation to help replicate the study.
--------------------------
*************
About document in general
*************
I think the general structure of the abstract can be improved. I suggest being more concise about the methods and conclusions of your study. Also, I think it would be better to end the abstract by highlighting the conclusions rather than by mentioning that there is a case study.
The structure of the document has to be critically revised. I think a literature review section would not hurt, as you are currently including these topics in the methodology. Also, I noticed that you moved the Methodology section, and it now includes topics that should not be presented there. The experimental setup (which experiments are run, and which data are used to obtain which values) is not really clear; also, the first reference to the "synthetic dataset" appears in the results section. In addition, the first part of your results reads as conclusions from the literature rather than from your own numerical experiments. The results section also contains suggestions for further work, which should be moved to their own section. Please reconsider the order of the sections in your document, as it needs a big overhaul.
The language has to be revised. Paragraphs tend to be too long and sometimes drift over more than one idea at a time. I am asking you to be more concrete in each paragraph, as I agree that the overall length of the document and the amount of information presented are adequate. Also, it is important to revise the structure of some sentences to avoid using so many commas. Finally, make sure to stick to only one term for the same element throughout the text (e.g. station).
*************
Specific points
*************
l82 - You can be more specific about which trade-offs are being interpreted.
l87 - You start the section with an additive connector ("In monitoring network design, also ..."). This leaves the reader without context.
l89 - Kriging (with capital K)
l91 - "such as for example" pick one
l93 - "on that topic" This is redundant
l94 - "on spatially distributed observed" In your case study you do not have spatially distributed observations (such as radar or remote sensor products), but rather discrete observations along a stream (even if they are spatially distributed).
l95 - which -> that
l95-98 - (Furthermore ...) In the methodology section this has to be much more concrete.
l99-101 - This can be slightly rephrased to avoid so many commas.
l103 - The discussion is not demonstrated, but rather carried out.
l104 - "as we will argue" This is the methodology section. Please focus the text on what has been done in this particular paper (and therefore write it in the present tense).
l121-123 - This sounds more like a line for the introduction than part of the methodology.
S2.1 should not be part of the methodology. This is part of the literature review.
l125 - I am not sure if Shannon in particular talks about uncertainty. I think his work is more closely related to the compressibility of data. Similar, yet different.
l131 - "For monitoring networks, the information each sensor provides through its observations (outcomes) is therefore linked to the uncertainty of those outcomes before measurement. These are quantified through the probability distributions of the data." This line is quite hard to read; please consider simplifying it.
l135 - "the placement"
l137 - I think this is also a good place to mention joint entropy, as you will be coming back to it later in the paper.
l137 - "a random variable"
l140 - "Objective functions are often composed from these basic expressions. Details of each expression are presented below." I think this can be removed.
l144 - You changed the name here to Discrete marginal entropy. The naming varies across lines (l138, entropy; l142, Shannon entropy; l144, Discrete marginal entropy). Please be consistent.
l149 - You mention that this is only about units, while in l151 you mention that it is used in answering questions. Which one is the case? I think the answers depend on the quantisation rather than on the base of the logarithm. Please clarify this.
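To illustrate the l149 point, here is a minimal sketch in Python. It is my own toy example and not taken from the manuscript: the two series, the floor-based quantisation and the helper name `entropy` are all assumptions made for illustration. Changing the base of the logarithm rescales every entropy by the same constant, so the ranking of stations cannot change, whereas changing the bin width can reverse the ranking.

```python
import math
from collections import Counter


def entropy(values, bin_width, base=2.0):
    """Plug-in Shannon entropy of a series after floor quantisation (assumed scheme)."""
    bins = [math.floor(v / bin_width) for v in values]
    n = len(bins)
    return sum(-(c / n) * math.log(c / n, base) for c in Counter(bins).values())


# Hypothetical series for two stations (illustrative numbers only).
station_a = [0.0, 0.25, 0.5, 0.75, 4.0, 4.25, 4.5, 4.75]  # fine-scale variability
station_b = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]      # coarse-scale variability

for bin_width in (1.0, 0.25):
    for base, unit in ((2.0, "bits"), (math.e, "nats")):
        h_a = entropy(station_a, bin_width, base)
        h_b = entropy(station_b, bin_width, base)
        # The base only changes the unit; the bin width changes which station ranks first.
        print(f"bin width {bin_width}, {unit}: H(A)={h_a:.3f}, H(B)={h_b:.3f}")
```

In this toy case station B ranks above station A with the coarse bins and below it with the fine bins, in both units.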
l152-153 - This does not say much. Can you be more specific? Also, will you please generalise all of the equations in this section (H, H|X, T and C)?
l155 - Marginal distributions are not used in calculating joint entropies. Please move them where necessary.
S2.2 This section contains quite a bit of information that should not be part of the Methodology.
S2.2 - I do not think the word "measure" in the title works in this context (as a noun). Consider using another term, such as "metric".
l184 - Additive properties were never discussed. What are these?
l185 - Please consider clarifying this line as per our previous discussion.
l213 - "However, there is no consensus on how to minimize redundant information". I think the lack of consensus concerns the definition of redundancy rather than the ways to minimise it. Most authors will agree that the only way to find the "optimal" result is to resort to full enumeration, while others have used more efficient methods to get there, so the problem can actually be (realistically) solved. In particular, when reviewing Alfonso et al. (2010), we found that WMP is a criterion for defining redundancy, while the problem itself was solved with NSGA-II. These differences have to be clarified, as they seem to be mixed up in the document.
Eq 7-10 - These are framed as greedy-add, as they read as optimising the given metric given the stations selected in previous iterations and a new candidate. Therefore, this formulation is not suitable for use in greedy-drop. Would it not be better to give just a definition of the objective function, and to present the way to solve it separately?
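One way to make the separation suggested for Eq 7-10 concrete is sketched below in Python. This is only an illustration under my own assumptions: the station names, the toy (already quantised) series and the function names are hypothetical, and joint entropy is used as the objective purely as an example. The objective is defined once, and exhaustive search, greedy-add and greedy-drop are written as interchangeable ways of solving for it.

```python
import math
from collections import Counter
from itertools import combinations


def joint_entropy(data, stations):
    """Plug-in joint entropy (bits) of the quantised series of `stations`."""
    if not stations:
        return 0.0
    rows = list(zip(*(data[s] for s in sorted(stations))))
    n = len(rows)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(rows).values())


def exhaustive(data, k, objective):
    """Evaluate every k-station subset and keep the best one."""
    return list(max(combinations(sorted(data), k), key=lambda s: objective(data, s)))


def greedy_add(data, k, objective):
    """Start empty; repeatedly add the station that maximises the objective."""
    selected = []
    while len(selected) < k:
        candidates = [s for s in sorted(data) if s not in selected]
        selected.append(max(candidates, key=lambda s: objective(data, selected + [s])))
    return selected


def greedy_drop(data, k, objective):
    """Start with all stations; repeatedly drop the one whose removal costs least."""
    selected = sorted(data)
    while len(selected) > k:
        drop = max(selected, key=lambda s: objective(data, [t for t in selected if t != s]))
        selected.remove(drop)
    return selected


# Hypothetical quantised series (station D is fully determined by A and B).
data = {
    "A": [0, 0, 1, 1, 0, 0, 1, 1],
    "B": [0, 1, 0, 1, 0, 1, 0, 1],
    "C": [0, 0, 0, 0, 1, 1, 1, 1],
    "D": [0, 1, 1, 0, 0, 1, 1, 0],
}

for solver in (exhaustive, greedy_add, greedy_drop):
    network = solver(data, 3, joint_entropy)
    print(solver.__name__, network, joint_entropy(data, network))
```

Written this way, the objective can be stated once in the methodology, and each search strategy only needs to describe how candidate sets are generated.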
l277 - "This approach can for example be useful in Alpine terrain, where relocating a sensor requires significant effort (Simoni et al., 2011)" I think this argument is relevant when physically changing the placement of the sensors, but I do not see it as an argument when testing potential locations while in the "design" phase.
l288 - optimal combination of ... Better to be specific here.
S2 - I think at the end of this section there is not a clear presentation of the experimental setup used to test the hypothesis presented in the introduction of the paper. I think it would be a good idea to re-draft this section to clearly point out the methodology that supports your conclusions, rather than mixing it with the literature review. A diagram always helps. Also, at this point it would be best to be (briefly) specific in the formulations, and not leave them in the references or annex (such as in S2.4), as this is the core of your document.
l368-379 - I think this belongs to the literature review, as it does not show or discuss any of the results. If it does, I did not find it, so please help the reader by pointing it out.
l381-395 - Idem
l406 - "efficiency = bits of unique info / bits collected" -> efficiency (i.e. bits of unique information / bits collected)
l406 - bits collected -> bits
l423 - "when increasing network size by one station" I think this should be removed. If we consider increasing the network by one element at a time, then it is greedy-add and not exhaustive search. I think this will be clarified once the methodology shows that you will be testing the results for the optimisation of the network using 3, 4, 5, ..., n sensors.
l428-430 - Don't downplay yourself or your setup :). It is well known that combinatorial problems are not simple to solve as they grow. However, try to report quantitatively what the results or setup are (type of PC, OS, etc.).
l443-445 - I think this may go in the further work section, as it is not something that is tested in this study.
l446 - It is well known that only exhaustive search can guarantee optimality. I think you can show this from well-known (textbook) optimisation material.
l446 - If there is a second dataset used in obtaining the results, this should be mentioned in the case study. Also, it has to be shown in the methodology what kind of experiments were carried out with it.
l458 - by definition, every search algorithm is more efficient than exhaustive search.
T5 - This table reads much better now. Update the caption, as it shows not the monitor order but the selected stations. In addition, you can state that you are using dataset #2 for these results, provided it is introduced earlier in the case study and methodology sections. Finally, make sure to use the same term (station) throughout the paper.
T6 - Same comments as with T5. Please rename "Multivariate dimensions" to number of stations.
l480 - Going back to our original discussion, I would like you to clarify that there is no justification for minimizing redundancy as long as there is no specific objective that supports that requirement. This is to highlight the point that (ideally) objective functions are derived from the requirements of the optimisation problem, but given that the problem is not that well defined in many applications, opting for joint entropy makes sense.
l489 - The dimensionality that is feasible depends not only on the number of stations, but also on the quantisation method, the size of the superset of potential locations, and the available hardware. I can imagine that someone with an HPC cluster and a well-tuned implementation could process these results. I think it would be better to leave this open and state that complexity grows exponentially with the size of the problem, and that this should be considered.
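As a rough illustration of this growth, the following Python sketch counts the candidate networks and the possible joint states. The values used (12 candidate locations, 10 quantisation bins) and the function name are hypothetical and do not come from the manuscript.

```python
import math


def search_space(n_candidates, k, n_bins):
    """Return (number of k-station subsets, upper bound on distinct joint states)."""
    n_subsets = math.comb(n_candidates, k)  # networks an exhaustive search must evaluate
    n_joint_states = n_bins ** k            # cells of the joint distribution to estimate
    return n_subsets, n_joint_states


# Assumed problem size, for illustration only.
for k in (3, 6, 9, 12):
    n_subsets, n_joint_states = search_space(n_candidates=12, k=k, n_bins=10)
    print(f"k={k}: {n_subsets} subsets, up to {n_joint_states} joint states")

# Total number of non-empty networks over all sizes grows as 2^n - 1.
print(2 ** 12 - 1)
```

The number of subsets depends on the superset of candidate locations, while the number of joint states depends on the quantisation, which is why both should be reported alongside the hardware used.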
Wish you all the best!
Reviewer #4