Review of latest revision of Technical Note: An illustrative introduction to the domain dependence of spatial Principal Component patterns by Lehr and Hohenbrink.
I appreciate all of the work put into this manuscript. Given the latest responses to the last revision, I believe that the present manuscript is about as complete as I can hope for, with the exception of cleaning up and further explaining a few issues, mostly in the section on rotation. Given the hesitation to investigate fully the domain effects on rotated solutions, I believe there are three good options. 1. Remove the incomplete rotation section. 2. Leave the very limited experiments using the single algorithm (Varimax) over a limited number of PCs AND issue appropriate caveats, rather than the vague sentences in this last version (see my comments in the annotated manuscript). 3. Complete a comprehensive section of rotation using several algorithms and testing for the optimal number of PCs to retain and test those for DD and for validity. I believe that options 1 and 2 are more viable given the comments in the last response about the manuscript size and your intended scope. That is fine as there are other paths to finalize this manuscript.
The present version could stand as an important contribution to the literature, with appropriate caveats issued. If you wish to proceed to a larger manuscript, it is important to be fully prepared to investigate all the possible permutations arising from each step when rotated solutions are investigated. Those permutations would involve running a much more detailed set of experiments for the number of EOFs/PCs to retain, as that number now affects the pattern morphology (unlike the unrotated solutions where increasing the number retained adds a new pattern but does not change the previous ones), assessing the degree of simple structure in the data sets and then selecting the optimal rotational algorithm from a sizable number of algorithms (again, this step requires validating many sets of patterns to the largest correspondence to the similarity matrix from which those patterns were derived). Given all the domain shapes, such a set of experiments is cumbersome, so unless you are really motivated to go that route, I don't feel comfortable prescribing that as the required bar and deem the present manuscript acceptable after the issuance of some caveats. Specifically, issue caveats that the analyses presented for Varimax have no way to deconvolute the effects of keeping too few PCs or too many PCs from DD and temper comments such as "the Varimax way of displaying DD" (lines 751-752). For all we know, it has nothing to do with Varimax but is a function of keeping too few PCs and forcing multiple correlation modes onto too few PCs. Acknowledge the possibility. Not investigating it is okay but, then, either don't mention it or list it as one of a number of possibilities.
Here are my comments on the last set of replies:
1. “If PCA is used purely for data reduction, DD is of no interest as the patterns are never examined; they serve only as an efficient set of basis vectors. If, however, the subsequent use of the PCs requires an adequate description of multiplet subspaces, for example if PCA is used as preprocessing step for other analyses, care should be taken that no multiplet is split by the selection of retained PCs.”
Response: I think as a general statement that is solid advice. However, there are at least two other considerations:
(a) Small magnitude eigenvalues are thought to be associated with noise. Rarely, if ever, is that assumption tested. Small scale signals that extract little variance would be indistinguishable from noise through investigating eigenvalue magnitude, particularly without resampling (there is no resampling in the North et al. test).
(b) Ignoring point (a), the effects of the infusion of noise into the signal+noise PC patterns may or may not be an issue depending upon the amount of variance extracted by those retained. For example, if the retained PCs account for 95% of the total variance, the remaining 5% would be expected to have a minimal effect on the patterns retained, particularly if rotated. In contrast, if that percent variance were 60%, there exists a larger risk of rotation being affected by the noise introduced.
That said, the conservative advice may be safe as it should prevent problems regardless of the total amount of residual noise or if some signal is being discarded.
2. “If the spatial PC patterns do not differ significantly from DD reference patterns, we recommend to report that and stop any interpretation of individual spatial PC patterns as distinct hydrological features.”
Comment: This is good advice, though providing some thoughts on "significantly" or how to assess that would be helpful.
3. "What we are suggesting is that if spatial PC patterns are used for interpretation, the patterns should be checked for DD before. We state this in Lines 803–806 of the conclusions. We recommend “visual comparison of the spatial PC patterns from subdomains with markedly different shapes and/or sizes … as quick qualitative check” (end of 3rd paragraph of the conclusion). And we recommend DD reference patterns as null hypothesis to “test whether spatial PC patterns differ significantly from DD patterns” (beginning of the 4th paragraph of the conclusion)."
Comment: Yes, this is good advice as the first step. In practice, that null hypothesis would have an alternative (that the patterns are different); a statistical test is then brought to bear on the sample pattern and the probability of a Type I error is produced (perhaps with a yes/no decision). One possibility (of many) to avoid a purely subjective determination would be to relate each EOF/PC pattern to the patterns derived in your manuscript for the best matching domain shape. One could apply the aCC statistic and (because it arises from an unknown distribution) apply a permutation test to determine the p-value.
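A minimal sketch of such a permutation test follows. It assumes that aCC can be computed as the absolute value of a congruence coefficient between two pattern vectors; the definition used in the manuscript should be substituted if it differs. The function names and the synthetic one-dimensional patterns are hypothetical. Shuffling locations ignores spatial autocorrelation, so the resulting p-value is likely too liberal and is meant only to illustrate the mechanics.

```python
import numpy as np

def abs_congruence(a, b):
    """Absolute congruence coefficient between two pattern vectors (assumed form of aCC)."""
    return abs(a @ b) / np.sqrt((a @ a) * (b @ b))

def permutation_pvalue(sample, reference, n_perm=999, seed=0):
    """Share of location-shuffled sample patterns that match the reference
    at least as well as the unshuffled sample pattern does."""
    rng = np.random.default_rng(seed)
    observed = abs_congruence(sample, reference)
    null = np.array([abs_congruence(rng.permutation(sample), reference)
                     for _ in range(n_perm)])
    # The add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

# Hypothetical example: a noisy dipole compared with a smooth dipole reference.
x = np.linspace(0.0, 1.0, 50)
reference = np.sin(2.0 * np.pi * x)
sample = reference + 0.1 * np.random.default_rng(1).standard_normal(x.size)
observed, p_value = permutation_pvalue(sample, reference)
```

A small p-value here indicates that the sample pattern matches the reference pattern more closely than a spatially unstructured pattern with the same values would. A resampling scheme that preserves the spatial correlation would be the more defensible choice in practice.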
4. "AR: We highly appreciate the sincereness and precision of your comments and the amount of work you spent. It considerably helped to improve our manuscript, in particular with respect to completeness, precision and explicit statements. However, some of your suggestions point conceptually to a different direction than it is our intention for our study."
Comment: That is a fair statement. To run a proper set of DD tests on rotated solutions would require being fully prepared to investigate all the possible permutations arising from each step when rotated solutions are investigated. Those permutations would involve running a much more detailed set of experiments for the number of EOFs/PCs to retain (k), as that number now affects the pattern morphology (unlike the unrotated solutions where increasing the number retained adds a new pattern but does not change the previous ones), assessing the degree of simple structure in the data sets and then selecting the optimal rotational algorithm from a sizable number of algorithms (again, this step requires validating many sets of patterns to the largest correspondence to the similarity matrix from which those patterns were derived). Given all your domain shapes, such a set of experiments is cumbersome. Perhaps list this as an area of investigation for others or for a future paper?
Given the above statements it seems there are three possibilities to satisfy Occam's razor in two broad categories:
1. Pure data reduction with no physical interpretation: keep the process simplest and don't bother with DD investigations or with post processing with rotation. That is now discussed in the manuscript.
2. Data reduction with physical interpretation:
a. No longer can the simplest process be assumed to work without further investigation, so test for DD. If DD is present in a "significant" amount, either stop, try step 2b or try a methodology other than EOF/PCA.
b. Investigate if the patterns reflect the patterns of variation in the similarity matrix to ensure physically valid patterns.
(i) If there is not a significant amount of DD present (that is why we are at step 2b) but there is a "significant" correspondence for each unrotated pattern retained to those in the similarity matrix, then interpret each pattern in the set of k patterns or try 2biii to determine if the matches become more significantly matched to the similarity matrix.
(ii) If there is not a significant amount of DD present (again that is why we are at step 2b), but there is not a "significant" correspondence for every unrotated pattern retained to those in the similarity matrix, then (a) reduce the number of EOF/PCs retained and retest each in the set of patterns for correspondence to the similarity matrix or (b) try 2biii to determine if the matches become more significantly matched to the similarity matrix patterns.
(iii) If there is not a "significant" correspondence for every unrotated pattern retained for any set of k EOFs/PCs retained, or one is interested in determining if the matches are improved by post-processing the unrotated EOF/PC patterns, then try rotating the patterns and assessing those rotated patterns for a "significant" correspondence for each pattern retained to those in the similarity matrix, then interpret each of the rotated patterns.
(iv) If either step 2bii or step 2biii fails to find a set of k PCs where each shows an insignificant amount of DD and each is significantly related to the patterns in the similarity matrix, then try a different non-EOF/PC approach or stop.
I believe some comments on this could be added to Lines 774–780 and at the end of the conclusions.
5. "Table 4 provides the comparison of the spatial PC patterns from the precipitation data (Figure 16) and the corresponding DD reference patterns (Figure S7). The reasoning is to show how well the patterns of the precipitation data match with those of the DD reference. The DD reference patterns here and throughout the paper are calculated for unrotated correlation matrix based PCs. An equivalent comparison with varimax rotated PCs would require to calculate DD reference patterns for varimax rotated PCs, depending on the k number of PCs selected for rotation."
Comment: Because the analysis of precipitation or any other field cannot be assumed to be well-rendered by Varimax, such an experiment, if it were made, should examine a range of rotations. Investigation of various Python and R libraries (e.g., the R package GPArotation) suggests dozens of possibilities (other than Varimax).
6. "To our knowledge, calculating DD reference patterns for varimax rotated PCs has not been done before. It could be an interesting objective in a future study focussing on rotated PCA (and physical interpretation of PC patterns)."
Comment: See the comment above where eliminating DD or reducing DD to an "insignificant" amount is a necessary but not sufficient step for physical interpretation. To meet the more stringent level of sufficient for physical interpretation, each of the retained EOF/PC patterns must reflect the correlation (or covariance) patterns well. However, because that sufficiency applies to both unrotated and rotated EOFs/PCs, some mention of that could be made because it is possible a set of patterns may not show DD but may not reflect well the patterns of data similarity (e.g., recall the heavy lift issues of maximum variance and orthogonality that are perhaps related to but not the same as DD).
7. "See the comment above where eliminating DD or reducing DD to an "insignificant" amount is a necessary but not sufficient step for physical interpretation. To meet the more stringent level of sufficient for physical interpretation, each of the retained EOF/PC patterns must reflect the correlation (or covariance) patterns well. However, because that sufficiency applies to both unrotated and rotated EOFs/PCs, some mention of that could be made because it is possible a set of patterns may not show DD but may not reflect well the patterns of data similarity (e.g., recall the heavy lift issues of maximum variance and orthogonality that are perhaps related to but not the same as DD)."
Comment: Thanks for the clarification and note that in some fields unit length eigenvectors (e.g., EOFs) are more often used than the scaled version. This has led to a morass in terminology.
8. "AR: There was no testing to optimize the selection of the k number of PCs for rotation performed. The purpose of the varimax rotation experiment and section in our manuscript is not to identify the one set of k PCs that is best suited for physical interpretation of the rotated precipitation PC patterns or alike. It seems your suggestion is pointing in this direction. We also do not want to perform or include a full-scale rotation study here."
Comment: Understood. I left some comments earlier in case you decide to try such an experiment. It would involve more than testing Varimax over a broader range of PCs retained.
9. "Thus, in our simple experiment here, varimax rotation was not successful in resolving DD"
Comment: Yes, that is true for the limited scope of your design. I would suggest adding the caveat that your design was not comprehensive, so you can't say if Varimax could reduce DD to an insignificant level.
10. "Note however, that for the introductory scope here, the experiment with the three varimax rotation variants was kept deliberately simple."
Comment: Read my comments in the revised manuscript why the deliberately simple experiment cannot deconvolute between the way varimax creates patterns or the way keeping a specific subset of k PCs creates patterns.
11. "Also, we did not investigate which number of rotated PCs resulted in more or less DD, nor did we aim to find an optimum number of rotated PCs with respect to DD."
Comment: I provided comments in the manuscript about this too. Not having DD patterns (unless that DD pattern happens to be the correlation pattern) is necessary for physical interpretation. However, the sufficiency comes from relating the EOFs/PCs to the correlation matrix. If the EOF/PC patterns are not supported by the correlations, then some other factor(s) are creating them. Most likely these are mis-specifying the domain size (or shape) that fails to capture the data variability scale, maximum variance of the PCs, orthogonality of the PCs or combinations thereof.
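One simple way of relating each loading vector to the correlation matrix can be sketched as follows. The matching rule used here (compare each loading vector with the column of the correlation matrix at the location of its largest absolute loading) is an assumption chosen for illustration, as are the function name and the synthetic data; none of it is taken from the manuscript.

```python
import numpy as np

def loading_correlation_match(loadings, corr):
    """Congruence of each loading vector with the correlation pattern
    centred on the location of its largest absolute loading."""
    matches = []
    for j in range(loadings.shape[1]):
        vec = loadings[:, j]
        anchor = np.argmax(np.abs(vec))   # location that dominates this PC
        ref = corr[:, anchor]             # correlation pattern of that location
        matches.append(abs(vec @ ref) / np.sqrt((vec @ vec) * (ref @ ref)))
    return np.array(matches)

# Hypothetical data: 200 time steps at 30 locations on a line,
# built from two localized signals plus white noise.
rng = np.random.default_rng(0)
x = np.arange(30)
modes = np.stack([np.exp(-0.5 * ((x - 8) / 3.0) ** 2),
                  np.exp(-0.5 * ((x - 22) / 3.0) ** 2)])
data = rng.standard_normal((200, 2)) @ modes + 0.3 * rng.standard_normal((200, 30))

corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:2]             # two leading PCs
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
matches = loading_correlation_match(loadings, corr)
```

Values of `matches` near 1 would indicate that a loading pattern is supported by the correlations; low values would suggest that something other than the correlation structure is shaping the pattern. What counts as a "significant" match would still need a test of the kind discussed under point 3.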
12. "AR: We agree that, usually, rotation of PCs is applied for physical interpretation of the PCs and their patterns. However, in our study, this is not the case. We did not perform any physical interpretation of the PCA results in the paper and we never meant to. The focus is to introduce DD to the PCA users in the hydrological community."
Comment: Yes, but my understanding of the conclusions in the manuscript is that if the EOFs/PCs contain significant amounts of DD, they should not be interpreted. I offer up an additional thought on this. If there is not significant DD, there may or may not be validity for other reasons.
Comments in the revised manuscript:
1. Line 508: "4.3. Effects of the domain size and spatial correlation length"
Comment: Yes, and consider the following. For domain size less than or equal to the correlation length, one cannot apply rotation as the goal of rotation is to identify patterns that are subsets of the domain. If the correlations span the domain, there can be no meaningful simplification. Application of rotation in such cases will attempt to simplify patterns that should not be simplified. This is why examining the PC loadings and comparing them to the correlation patterns is so important. Only in cases where the domain size exceeds the correlation length can rotation be examined in a meaningful way for improving the resolving of correlation modes, as finding spatially simplified configurations of the data is supported by the data. [As an aside, small domains, relative to the spatial correlation length, are rarely an issue in weather and climate studies but may be an issue for other fields of study.]
2. Lines 558-559: "In particular, special care has to be taken that the truncation point of a PCA does not split a multiplet (North et al., 1982)."
Comment: This is reasonable general advice but the likelihood of problems arising is a function of the amount of variance explained by those PCs retained. If there is a substantial percentage of variance beyond the truncation point, there is a probability of more noise contaminating those eigenvectors retained. In such cases, the North et al. test is more critical. If the retained eigenvectors explain a large majority of the total variance, the small amount of residual variance (thought to represent noise) is much less and the application of the test becomes less important. Further, because rotation is immune to degenerate multiplet distortion for closely spaced eigenvalues, the problem is lessened in that situation.
3. Line 719: "Analysing a subsampled data set..."
Comment: I assume this refers to spatial subsampling rather than time subsampling. Please clarify.
4. Line 730: "5.2.2. Rotation of PC eigenvectors"
Comment: At this juncture, given the comments in the previous response about a full rotation analysis being beyond the intended scope, it is easier to clean up the details as listed below rather than embark on a full rotation analysis for this particular manuscript. However, should you decide to go there now or later in a separate manuscript, it is important to be fully prepared to investigate all the possible permutations arising from each step when rotated solutions are investigated. Those permutations would involve running a much more detailed set of experiments for the number of EOFs/PCs to retain, as that number now affects the pattern morphology (unlike the unrotated solutions where increasing the number retained adds a new pattern but does not change the previous ones), assessing the degree of simple structure in the data sets and then selecting the optimal rotational algorithm from a sizable number of algorithms (again, this step requires validating many sets of patterns to the largest correspondence to the similarity matrix from which those patterns were derived). Given all your domain shapes, such a set of experiments is cumbersome, so unless you are really motivated to go that route, I don't feel comfortable prescribing that as the acceptable bar and deem that the present manuscript can be made acceptable. What I do request is that you issue caveats that you have no way to deconvolute the effects of keeping too few PCs or too many PCs from DD and that you temper comments such as "the Varimax way of displaying DD" (lines 751-752). For all we know, it has nothing to do with Varimax but is a function of keeping too few PCs and forcing multiple correlation modes onto too few PCs. Please acknowledge the possibility. Not investigating it is okay but then, either don't mention it or list it as one of a number of possibilities.
5. Lines 739-740: "No multiplets were split by the rotations (Figure S7) to ensure that the results of the rotation were not affected by multiplet effects (Section 4.4)."
Comment: Although it is fine to say this, keep two things in mind:
1. Eigenvalues are a property of unrotated EOFs/PCs. The property is destroyed by rotating. After rotation, the variance on individual PC loading vectors can be tallied by summing the squared PC loadings for that vector. The total variance for the k rotated PCs will be identical to the k unrotated total variance, but the variance of individual rotated PCs will differ from the variance defined by the eigenvalues on individual unrotated PCs.
2. Rotated PCs are immune from the effects of closely spaced eigenvalues. It is a good practice not to select the truncation point, k, in the middle of a degenerate multiplet if the total variance explained by those k PCs is not large. I have commented on this in other parts of the paper.
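The variance bookkeeping described in point 1 above can be checked numerically. The following is a minimal sketch under stated assumptions: the data are synthetic, k = 3 is arbitrary, and the `varimax` function is a common SVD-based iteration for raw varimax (no Kaiser normalization), not necessarily the variant used in the manuscript.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonal (raw) varimax rotation via the common SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target of the varimax criterion.
        target = rotated ** 3 - rotated * (np.sum(rotated ** 2, axis=0) / p)
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                 # stays orthogonal by construction
        if s.sum() < criterion * (1.0 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation

# Hypothetical data: 300 time steps at 12 correlated locations.
rng = np.random.default_rng(0)
data = rng.standard_normal((300, 12)) @ rng.standard_normal((12, 12))
corr = np.corrcoef(data, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:3]              # retain k = 3 PCs
unrotated = eigvecs[:, order] * np.sqrt(eigvals[order])
rotated = varimax(unrotated)

var_unrotated = np.sum(unrotated ** 2, axis=0)     # equals the leading eigenvalues
var_rotated = np.sum(rotated ** 2, axis=0)         # redistributed by the rotation
```

The sums `var_unrotated.sum()` and `var_rotated.sum()` agree because the rotation is orthogonal, whereas the individual entries of `var_rotated` generally no longer equal the eigenvalues.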
6. Lines 741-742: "Note, that the newly assigned fractions of variance do not any longer decrease continuously with the PC ranks in all cases."
Comment: That depends on the software being used. Some packages will sort the rotated PCs by their variance explained.
7. Line 746: "simple structure"
Comment: Yes. In fact, this fits into the comments made earlier about domain size and rotation. If the domain is too small to expect near-zero values on some subset of locations, rotation should not be applied.
8. Line 749: "2rPCs variant"
Comment: I think there are two issues here, in the abstract. Only one might apply to your study, but both should be mentioned because they involve domain shape.
1. A. If the data correlation scale is larger than the spatial domain selected, the shape of the domain will affect the EOF/PC patterns because one has sampled a subset of the correlation pattern (think of it as, for example, having a large circular correlation pattern and then applying a triangular cookie cutter to that pattern, distorting the original shape). One cannot determine DD as envisioned in this paper in such a situation. Obviously, the correlation patterns should be examined prior to deciding on a domain shape.
1. B. If the data correlation scale is approximately the same spatial scale as the domain selected, the shape of the domain may affect the EOF/PC patterns because one has sampled a subset of the correlation pattern (think of it as, for example, having a circular correlation pattern and then applying a triangular cookie cutter, of approximately the same size, to that pattern, distorting the original shape). One can test for DD in such cases but there could be a competing effect of truncating the physical correlation patterns. Obviously, the correlation patterns should be examined prior to deciding on a domain shape.
In cases 1.A. and 1.B., rotation cannot work as it requires correlation scales to be smaller than the domain size.
1. C. If the data correlation scale is smaller than the domain selected, DD can be tested as suggested in the manuscript. Obviously, the correlation patterns should be examined prior to deciding on a domain shape.
2. For your experiment with 2rPC varimax, it is possible that 2 PCs were insufficient and you are forcing more than 2 unique signal patterns onto 2 PCs. We know that keeping too few PCs (known as "underfactoring") forces unrelated signals onto a single PC, which could be mistaken for DD. Similarly, keeping too many PCs (known as "overfactoring") splits the correlation patterns (e.g., waves with positive and negative loadings) into two separate PCs, each with one piece of the pattern. Although the overfactored rotated PCs may not show DD, they may be non-physical just the same. That is why optimizing the k in rotated PCs is so important. You need to add a caveat that your experiment did not involve this optimization step, so claiming that DD is present is not testable in your framework. That is why I object to your claiming (for now, until a full rotated test is performed) that it is a "varimax way of displaying DD". For all you know, it is a 2rPC way of portraying DD, where there is an unfortunate choice of k, and not the rotation method. The same might hold for 3 rPC, ...
9. Lines 751-752: "seemed to be the varimax way of displaying DD."
This is both vague and not tested (see earlier comment). What is "the varimax way of displaying DD"? Such a statement would suggest that no matter what the underlying correlations, varimax PCs would give the same set of patterns. Is that the case? From what I can see, it might as well be "the k Varimax way" (whatever that is) for a single example, and not generalizable to other correlation functions as it is for unrotated EOFs/PCs. See earlier comments on the distortions known to occur when underfactoring/overfactoring. Of course, the added step (not examined in this manuscript) of relating the PC patterns to the correlation patterns will instantly confirm if DD is a potential issue for any analysis (unrotated, rotated). If that comparison has a poor match between the PCs and the correlations, then some other factors (perhaps including DD) might play a role. We just can't tell from this experiment.
10. Line 758: "deliberately simple."
Comment: You need to tell the reader what "deliberately simple" means. Hopefully, I have left sufficient comments about the critical need to test each k PCs when rotated to determine if any of those sets gives a valid result.
11. Line 759: "are limited."
Comment: Suggested addition to this sentence: "or misattributed to the rotation method rather than underfactoring/overfactoring."
12. Lines 760-761: "Also, we did not investigate which number of rotated PCs resulted in more or less DD, nor did we aim to find an optimum number of rotated PCs with respect to DD."
Comment: What you have not deconvoluted is the DD effect of selecting a different number of PCs to retain from the rotation method applied. Equally important, because the PCs are not compared to the correlation patterns, the physical validity cannot be established. [I realize you don't want to go there, but the paper concludes that EOF/PC patterns with DD should not be interpreted; therefore, failing to tell the reader when the patterns should be interpreted (and the method to support the interpretation) is less than satisfying.] Once again, the conclusion in this paragraph could be because the data are not well represented by the k PCs retained, or the data are not well represented by a varimax rotation, or by a combination of both.
13. Lines 762-763: "be more robust against spatial"
Comment: It may be the spatial instability is inter-related to DD. Can you comment on that from these experiments?
14. Line 763: "and less sensitive to degeneracy (Richman, 1986)."
Actually that study showed essentially no sensitivity to degeneracy for rotated PCs, as the eigenvalues were the same to many decimal places. It did show some sampling variability at very small sample sizes.
15. Line 765: "drawbacks of rotation"
These are undefined in this manuscript. Jolliffe's comments applied to the loss of uncorrelatedness and orthogonality in the spatial and/or temporal patterns and the extra work involved in running a rotated analysis. You have previously criticized orthogonality in this paper as one factor hindering interpretation, so it's not clear what is meant here. I do mention the situation when the domain size is smaller than or equivalent in size to the correlation scale as factors against rotation. However, in such cases of small domains (relative to the correlation scale), Jolliffe's suggestion of rotating select PCs will not help physical interpretation, regardless of the eigenvalue spacing.
16. Lines 818-820: " If the spatial PC patterns do not differ significantly from DD reference patterns, we recommend to report that and stop any interpretation of individual spatial PC patterns as distinct hydrological features."
Comment: Yes, though I believe you can say more about this. See my previous comments.
---
I hope my comments are useful in finalizing the manuscript.
Review of Technical Note: An illustrative introduction to the domain dependence of spatial Principal Component patterns by Lehr and Hohenbrink.
This manuscript attempts to extend the study of how analyzing data on various shaped spatial domains affects the principal component loading patterns. The extension is both in content, as new material is added to the existing literature and the authors hope to gain the audience of hydrologists who, by and large, have not been exposed to such a concept. The importance of the work lies in several areas (expanded on below) but the key one is that if the PC loading patterns match those that are expected to arise from the shape of the domain, rather than the covariance fields, the recommendation should be a full stop on continuing. Therefore, understanding domain dependence is a necessary, but not sufficient condition, for physical interpretation of PC loadings.
Let me add that I like this paper and believe it can be a useful addition to the literature, helping analysts to interpret their eigenanalyses. Therefore, I hope the authors view my extensive comments with that in mind. If I come across as opinionated it is because of my lengthy work in this area and if it seems direct, that is my nature. Regardless, I like this manuscript and hope it gets published after further revisions.
Now for the general comments. The paper builds upon the pioneering work of C. Eugene Buell. Those papers are cited. Buell (1979) left the reader with a final thought on the subject of domain dependence in the last line of his conclusions, stating that domain dependence must be accounted for when interpreting EOFs: "Otherwise, such interpretations may well be on a scientific level with the observations of children who see castles in the clouds". That is a pretty direct and strong statement. Digging deeper into why that can occur, individual EOFs have been analyzed from the 1970s through the 2020s by inferring physics from visual inspection of the magnitudes and gradients of the EOFs when plotted on maps. There was no external or internal validation of the patterns, only conjecture. With over 50 years of this practice, little attention was paid to whether this was a wise idea and thousands of such EOF studies emerged, with claims of the importance of the magnitudes and shapes of the patterns, many of which looked suspiciously like those patterns Buell generated. However, we should be wiser today and the authors are telling the investigator that if the covariance fields vary across a given domain shape but the same basic Buell patterns emerge, perhaps it is castles in the clouds rather than physics. However, there may be something more than a chimera, a mixture of signal and domain dependence. We come to learn later in the manuscript that a third confounding factor, namely the degeneracy of PC loading patterns with closely spaced eigenvalues, plays a role. It is good to see these factors considered.
Next, let's discuss PCA as a technique. According to those who understand the method, there is general agreement that PCA is useful for data reduction. In other words, in the type of analysis in the manuscript, the time series at n gridpoints or locations can have their covariances explained in k PCs where k<
1. Given the above prologue, the authors on lines 408-409 discuss "heavy constraints" of PCA that inhibit physical interpretation. To that good list, I'll add that it has been shown that the leading PC, by virtue of the constraint of maximal variance, can pull multiple unrelated sources of variation onto itself, confounding physical interpretation. This should be added. The Karl and Koscielny citation (in your reference list already) shows this in their Appendix. Further details are given in the annotated manuscript (attached).
2. There is a general lack of agreement on terminology for eigenmodels, which leads to massive confusion among users of these techniques. At first when reading this manuscript, I thought the authors were applying EOFs, only to change my opinion later in the manuscript that they were applying the PCA model. The original paper where EOFs were named EOFs is generally attributed to Lorenz (1956). However, in that report, Lorenz refers to the displays as EOFs of space and EOFs of time, to define what have now mutated somewhat into what are called "EOFs" and "Principal Components", respectively. Assuming a spatial analysis, those EOFs of space are unit length (sum of the squares of each EOF's coefficients = 1), whereas the EOFs of time are orthogonal vectors, each with a mean of zero and variance equal to the associated eigenvalue. In contrast, the PCA model, generally attributed to both Pearson (1901) and more fully to Hotelling (1933), weights (postmultiplies) the unit length eigenvectors (EOFs) by the square root of the corresponding eigenvalue to give "PC loadings". That seemingly minor change in the spatial patterns (keeping with the definition of space and time given for EOFs) results in the time series calculation and properties being different. Those time series in the PCA model are called "PC scores" and have mean 0 and variance 1. They are also orthogonal. Flip the space and time definitions of these displays if the analysis is temporal. Because the two models result in different space and time patterns, they cannot be compared directly and the precise equations used are necessary to attempt to reproduce the findings of others. I urge the authors to state clearly what model they are using immediately after the introduction and show the equation.
The situation becomes more complicated because users of these techniques tend to grab EOF/PCA code from various statistical packages or Python code libraries that often mislabel the results, and never check the specifics, thereby perpetuating the confusion. For the current paper, one must know whether the analyses are applied to EOFs (unit length eigenvectors) or PC loadings (unit length eigenvectors postmultiplied by the square root of the corresponding eigenvalues). Further, it would be helpful to know whether any of the results for domain dependence change as a function of the specific model invoked. There is considerable confusion about this topic when reading this paper. It is important that the model being used herein is stated unambiguously at the outset of this paper and that the equation is added in the methods section to avoid such confusion. Further, adopt the correct terminology for that model and don't list any alternative terminology that might confuse the reader.
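To make the distinction concrete, here is a minimal sketch of the two models on a synthetic S-mode data set. The data, the array names, and the choice of a correlation matrix are my own assumptions for illustration, not the authors' setup.

```python
import numpy as np

# Synthetic anomalies (time x grid points); made up purely for illustration.
rng = np.random.default_rng(0)
n_time, n_grid = 500, 6
X = (rng.standard_normal((n_time, 3)) @ rng.standard_normal((3, n_grid))
     + 0.3 * rng.standard_normal((n_time, n_grid)))

# Standardize each grid point so that Z'Z / (n - 1) is the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Eigen-decomposition, sorted by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# EOF model: unit-length spatial patterns, time series with variance = eigenvalue.
eofs = eigvecs
eof_time = Z @ eofs

# PCA model: loadings = eigenvectors scaled by sqrt(eigenvalue),
# scores have mean 0 and variance 1.
loadings = eigvecs * np.sqrt(eigvals)
scores = Z @ eigvecs / np.sqrt(eigvals)

print("EOF squared lengths:     ", np.round((eofs**2).sum(axis=0), 3))
print("EOF time-series variance:", np.round(eof_time.var(axis=0, ddof=1), 3))
print("Loading squared lengths: ", np.round((loadings**2).sum(axis=0), 3))
print("PC score variance:       ", np.round(scores.var(axis=0, ddof=1), 3))
```

Both models reproduce the same standardized data (`eof_time @ eofs.T` and `scores @ loadings.T`), but the spatial patterns and the time series are scaled differently, which is why the equation used needs to be stated.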
3. The treatment of eigenvalue degeneracy is generally well addressed, with one exception that potentially plagues nearly every applied eigenanalysis: eigenvalue degeneracy at the truncation point (k). If the eigenvalues k and k+1 are closely spaced, the information on the associated PCs is intermixed, and signal is mixed with noise on the kth retained PC loading vector. Your paper presents 10 PCs; therefore, the spacing between the 10th and 11th eigenvalues should exceed the North et al. criterion. Does it? Let the reader know.
Further, this needs to be mentioned because it can cause the loss of a domain dependence pattern. Sorting the eigenvalues in descending order makes them more likely to be closely spaced as the smallest eigenvalues head toward the tail (presumably noise), which is where the analyst would normally truncate the analysis and discard the k+1,...,nth eigenvalues, perhaps using some other criterion (e.g., percent variance extracted, eigenvalue magnitude).
Related to this, I wonder why eigenvalue degeneracy is not addressed earlier in the paper, as it seems to affect domain dependence. If that is the case, then consider moving it earlier in the paper. The PC loadings arising from degenerate multiplets should not be expected to exhibit the domain dependent patterns, yet the multiplet may be dominated by domain dependent patterns that are intermixed into new patterns that do not appear to be domain dependent.
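A check of the truncation point could be as simple as the following sketch of the North et al. rule of thumb, in which the sampling error of the kth eigenvalue is approximated by the eigenvalue times sqrt(2/N). The function name, the example eigenvalues, and the effective sample size `n_eff` are hypothetical; the authors would substitute their own spectrum and an estimate of the number of independent samples.

```python
import numpy as np

def north_separated(eigvals, n_eff, k):
    """Compare the gap between eigenvalues k and k+1 (1-based k) with the
    approximate sampling error of eigenvalue k, lambda_k * sqrt(2 / n_eff)."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    sampling_error = lam[k - 1] * np.sqrt(2.0 / n_eff)
    gap = lam[k - 1] - lam[k]
    return bool(gap > sampling_error), gap, sampling_error

# Made-up spectrum: the 3rd and 4th eigenvalues are nearly equal.
example_eigvals = [5.0, 3.0, 1.0, 0.95, 0.5]
for k in (2, 3):
    ok, gap, err = north_separated(example_eigvals, n_eff=200, k=k)
    print(f"k={k}: gap={gap:.3f}, sampling error={err:.3f}, separated={ok}")
```

In this made-up spectrum, truncating at k = 2 passes the check, whereas truncating at k = 3 splits a closely spaced pair.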
4. Comparison of PC loading patterns is accomplished with correlations. Interpretation of S-mode PC loadings (and of EOFs) depends on the magnitude of the PC loadings plotted on a map (and in general, the magnitude of the PC loadings/EOFs is important in any mode). However, correlations subtract the mean of each PC loading/EOF vector (the pattern mean), so two patterns with different means can be highly correlated even though their magnitude patterns are much different and the grid boxes (I think what you refer to as cells) with the maximum PC loadings are in different geographical (or topological) locations in your domains. If that is the case, then the correlation is suboptimal for such comparisons. Find a better metric that includes magnitude in the comparison. I suggest the congruence coefficient, though others exist that preserve the vector magnitudes.
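The difference between the two metrics can be seen in a small sketch. The two loading vectors below are invented for illustration; the second is the first with a constant subtracted, so the mean and the location of the largest magnitudes change while the shape does not.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation: pattern means are removed before comparison."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def congruence(a, b):
    """Congruence coefficient: same form, but the means are retained."""
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

pattern_a = np.array([0.9, 0.7, 0.5, 0.3, 0.1])
pattern_b = pattern_a - 0.5   # same shape, different mean and magnitudes

print("correlation:", round(pearson(pattern_a, pattern_b), 3))
print("congruence: ", round(congruence(pattern_a, pattern_b), 3))
```

The correlation is 1 because the shift is removed with the mean, while the congruence coefficient is roughly 0.49, reflecting that the large-magnitude loadings sit in different places.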
5. It seems odd that after the paper establishes the details and importance of domain dependence, it has no results on how rotating those PCs affects such dependence. There is only a scant mention of the possibility of this near the end of the paper, mostly in the context of rotating degenerate multiplets. However, rotation can be applied to PC loadings associated with non-degenerate eigenvalues and it will affect domain dependence patterns. Please consider adding a section on rotation and show those patterns to comment about how domain dependence is addressed by post processing the PC loadings with a rotation.
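For orientation only, a rotation step of the kind I have in mind could look like the sketch below. It uses a commonly circulated SVD-based Varimax iteration applied to the first k PC loading vectors; it is not taken from the manuscript, and the function name, iteration limits, and random test matrix are my own assumptions. Other rotation algorithms exist, and the result depends on the number of PCs retained.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a (grid points x k) loading matrix.
    Returns the rotated loadings and the k x k rotation matrix."""
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    T = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient-like term of the Varimax criterion.
        G = A.T @ (L**3 - L * (L**2).sum(axis=0) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # nearest orthogonal matrix
        crit = s.sum()
        if crit_old != 0 and crit < crit_old * (1 + tol):
            break                       # no meaningful further improvement
        crit_old = crit
    return A @ T, T

# Hypothetical unrotated loadings: 8 grid points, 3 retained PCs.
rng = np.random.default_rng(1)
unrotated = rng.standard_normal((8, 3))
rotated, T = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each grid point keeps its communality (row sum of squared loadings); only the distribution of variance across the retained patterns changes.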
6. The manuscript discusses accounting for domain dependence prior to attempting physical interpretation. Both the abstract and the introduction discuss how ignorance about domain dependence can easily lead to the wrong interpretations of PCA results (e.g., "Ignorance about DD can easily lead to the wrong interpretations of PCA results. DD patterns are distinct, with strong gradients and contrasts, and therefore highly suggestive to indicate physically meaningful drivers or properties of the analyses system"). I agree with this statement and, assuming it is valid, the reader will want to know about the right interpretations of PCA results. The manuscript further states (correctly) that the analyses proceed from data that are formed into a correlation (or covariance) matrix, either explicitly or implicitly, and that this matrix (or the standardized data in the case of SVD) is decomposed into eigenvectors that should be capable of summarizing the correlations/covariances of the data (after ensuring they do not represent domain dependence patterns). Therefore, some additional discussion of how to interpret those eigenvectors (in the case of the present manuscript, PC loadings and PC scores), after passing a domain dependence assessment, must be added. It seems the majority of patterns shown in the paper suffer from domain dependence or from the effects of eigenvalue degeneracy combined with domain dependence. Would that be the null hypothesis for other investigators?
The main recommendation to assess such a hypothesis of domain dependent patterns (according to the manuscript) seems to be to visually assess the similarity, but it leaves the reader asking, "then what do I do?". Presently, there is a suggestion to visually assess the analyzed patterns and compare them to the domain dependent patterns for a similarly shaped domain. There are two issues with visual assessment. (a) It is unreliable across analysts: for the same pattern, one analyst may believe there is a strong resemblance and that the pattern should not be further interpreted, whereas a second analyst may think the resemblance is not strong enough to reject the pattern as domain dependent. (b) The qualitative nature of a visual assessment means any one analyst can see some resemblance to domain dependent patterns and then discount it based on personal bias. A more quantitative approach that avoids (a) and (b) would be a direct numerical comparison using a matching coefficient (e.g., congruence coefficient). In that case, a recommendation could be made, such as: if the congruence coefficient exceeds some value (e.g., > 0.8), the analysis is dominated by domain dependence and the unrotated PC loadings/EOFs should not be analyzed physically. The assessment of the physical interpretation gets even trickier at this point. If the PC loading pattern, based on either visual assessment or congruence coefficient value, is thought not to be sufficiently contaminated by domain dependence, it does not mean it is physically interpretable as a meaningful mode without further investigation. Recall what the PCA does. It summarizes the correlation/covariance structure into a set of k PC loadings and k PC scores. Do we know if any of those structures relate well to the correlation/covariance matrix from which they were drawn?
Without such a step, physical interpretation would seem unwise (we're back to the castles in the clouds but now from the "heavy constraints"). Because the manuscript is motivated by finding physically important modes, a revised manuscript should address or provide some suggestions on how to confirm whether a mode is physically realistic or related to the correlations/covariances (or not). There is some literature on this topic, ranging from never physically analyzing any PC structures (in which case domain dependence is moot, because the domain does not affect the ability of PCA to extract most of the variance from a dense correlation/covariance matrix) to, in many cases, analyzing the PC structures after confirming similarity to the correlations/covariances. I suggest examining the Compagnucci and Richman (2008) and Huth and Beranova (2021) papers for starters. The latter asks the specific question of what a "true mode" is, whereas the former addresses the question of whether certain analysis modes can retrieve the modal patterns. Of course, there are other alternatives, such as using a technique not rooted in eigenvectors. However, if the paper offers a path to identifying domain dependence that undercuts physical interpretation, some remedy should be offered.
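One possible way to make both steps quantitative is sketched below. This is my own illustration under stated assumptions, not necessarily the procedure used in the papers cited above: the function names, the toy correlation matrix, and the reference patterns are hypothetical, and the 0.8 threshold is only the example value mentioned earlier. The first function flags loading vectors that match a reference domain dependent pattern; the second compares each loading vector with the column of the correlation matrix at the grid point where that loading has its largest magnitude.

```python
import numpy as np

def congruence(a, b):
    """Congruence coefficient between two pattern vectors (means retained)."""
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def flag_domain_dependence(loadings, dd_reference, threshold=0.8):
    """Flag each loading vector whose best absolute congruence with any
    reference DD pattern exceeds the threshold (the sign is arbitrary)."""
    g = np.array([[abs(congruence(loadings[:, j], dd_reference[:, m]))
                   for m in range(dd_reference.shape[1])]
                  for j in range(loadings.shape[1])])
    return g.max(axis=1) > threshold, g

def match_to_correlations(loadings, R):
    """Absolute congruence between each loading vector and the column of the
    correlation matrix R at the grid point of maximum absolute loading."""
    matches = []
    for j in range(loadings.shape[1]):
        v = loadings[:, j]
        i = int(np.argmax(np.abs(v)))
        matches.append(abs(congruence(v, R[:, i])))
    return np.array(matches)

# Toy correlation matrix with two independent pairs of grid points.
R = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.9, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.6],
              [0.0, 0.0, 0.6, 1.0]])
a, b = np.sqrt(1.9 / 2), np.sqrt(1.6 / 2)
loadings = np.array([[a, 0.0], [a, 0.0], [0.0, b], [0.0, b]])

# Hypothetical reference DD patterns: a monopole and a dipole.
dd_reference = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, -1.0], [1.0, -1.0]]) / 2

flags, g = flag_domain_dependence(loadings, dd_reference)
print("flagged as DD:", flags)
print("match to correlation matrix:", np.round(match_to_correlations(loadings, R), 3))
```

In this toy case neither pattern is flagged as domain dependent, and both match a column of the correlation matrix closely, which is the situation in which physical interpretation would be easier to defend.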
Specific comments
Numerous specific comments are listed in the annotated manuscript (attached).