HESS Opinions "A random walk on water"

According to the traditional notion of randomness and uncertainty, natural phenomena are separated into two mutually exclusive components, random (or stochastic) and deterministic. Within this dichotomous logic, the deterministic part supposedly represents cause-effect relationships and, thus, is physics and science (the "good"), whereas randomness has little relationship with science and no relationship with understanding (the "evil"). We argue that such views should be reconsidered by admitting that uncertainty is an intrinsic property of nature, that causality implies dependence of natural processes in time, thus suggesting predictability, but even the tiniest uncertainty (e.g., in initial conditions) may result in unpredictability after a certain time horizon. On these premises it is possible to shape a consistent stochastic representation of natural processes, in which predictability (suggested by deterministic laws) and unpredictability (randomness) coexist and are not separable or additive components. Deciding which of the two dominates is simply a matter of specifying the time horizon of the prediction. Long horizons of prediction are inevitably associated with high uncertainty, whose quantification relies on understanding the long-term stochastic properties of the processes.

Αἰών παῖς ἐστι παίζων πεσσεύων. Παιδός ἡ βασιληίη. (Time is a child playing, throwing dice. The ruling power is a child's; Heraclitus; ca. 540–480 BC; Fragment 52)

I am convinced that He does not throw dice. (Albert Einstein, in a letter to Max Born in 1926)

What is randomness?
In his foundation of the modern axiomatic theory of probability, A. N. Kolmogorov (1933) avoided defining randomness. He used the notions of random events and random variables in a mathematical sense but without explaining what randomness is. Later, in about 1965, A. N. Kolmogorov and G. J. Chaitin independently proposed a definition of randomness based on complexity or absence of regularities or patterns (which could be reproduced by an algorithm). Specifically, a series of numbers is random if the smallest algorithm capable of specifying it to a computer has about the same number of bits of information as the series itself (Chaitin, 1975; Kolmogorov, 1963, 1965; Kolmogorov and Uspenskii, 1987; from Shiryaev, 1989). Interestingly, Chaitin proved that, although randomness can be precisely defined in this manner and can even be measured, there cannot be a proof that a given real number (regarded as a series of its digits) is random.

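The complexity-based definition can be made tangible with a rough numerical sketch. The smallest algorithm that reproduces a series cannot be computed in general, so the example below swaps in a general-purpose compressor (zlib) as a crude stand-in: its output length is only an upper bound on the information content, and the helper name `compressed_ratio` is an assumption made for illustration, not something taken from the text.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (zlib, maximum effort)."""
    return len(zlib.compress(data, 9)) / len(data)


# 10 000 bytes generated by a very short rule: highly regular.
patterned = b"0123456789" * 1000
# 10 000 bytes from the operating system's entropy source: no usable pattern.
noisy = os.urandom(10_000)

ratio_patterned = compressed_ratio(patterned)
ratio_noisy = compressed_ratio(noisy)
print(f"patterned: {ratio_patterned:.3f}, noisy: {ratio_noisy:.3f}")
```

The patterned series shrinks to a small fraction of its length, whereas the noisy one does not shrink at all. In line with Chaitin's result, a failure to compress does not prove that a series is random; it only shows that this particular compressor found no regularity.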
The move from this mathematical abstraction of a real number to the realm of real physical phenomena is not straightforward. Here, commonly, randomness is contrasted with determinism. The movement of planets is a typical example of a deterministic phenomenon, whereas that of dice is thought to be random. This reflects a dichotomous logic, according to which there exist two mutually exclusive types of events or processes: deterministic and random (or stochastic). Such dichotomy is perceived either on ontological or on epistemological grounds. In the former perception the natural events are thought to belong, in their essence, to these two different types, whereas in the latter it is regarded as convenient to separate them into these types, where processes that we do not understand or explain are considered random. When a classification of a specific process into one of these two types fails (and it usually does, except in a few cases such as the above examples of planets and dice), a separation of the process into two different, usually additive, parts is typically devised. This perception has been dominant in geosciences, including hydrology. This thinking proceeds so as to form a reductionist hierarchy. Thus, each of the parts may be further subdivided into subparts (e.g., deterministic subparts such as periodic and aperiodic or trends). This dichotomous logic is typically combined with a Manichean perception, in which the deterministic part supposedly represents cause-effect relationships and reason and thus is physics and science (the "good"), whereas randomness has little relationship with science and no relationship with understanding (the "evil"). The random part is also characterized as "noise", in contrast to the deterministic "signal". "Noise" is a contaminant that causes uncertainty, a kind of illness that should be remedied or eliminated.
Probability theory and statistics, which traditionally provided the tools for dealing with randomness and uncertainty, have been regarded by some as the "necessary evil", but not as an essential part of physical sciences. This view has also affected hydrology and geophysics, particularly in the last couple of decades. Some tried to banish probability from hydrology, replacing it with deterministic sensitivity analysis and fuzzy-logic representations. Others attempted to demonstrate that irregular fluctuations observed in natural processes are au fond manifestations of underlying deterministic dynamics with low dimensionality, thus rendering probabilistic descriptions unnecessary. Some of the above views and recent developments are simply flawed because they make erroneous use of probability and statistics, which, remarkably, provide the tools for such analyses. The entire underlying logic is just a false dichotomy. To see this, it suffices to recall that P.-S. Laplace, perhaps the most famous proponent of determinism in the history of philosophy of science (cf. Laplace's demon), is, at the same time, one of the founders of probability theory. According to Laplace (1812), "probability theory is, au fond, nothing but common sense reduced to calculus".

[...] random variable allows probabilization of uncertainty, typical in Bayesian statistics (not to be confused with the lately abused term of "Bayesian beliefs").

Emergence of randomness from determinism
To illustrate that randomness coexists with determinism and that the two do not imply different types of mechanisms, or different parts or components in the time evolution, we will study a toy model of a caricature hydrological system. The system, shown in Fig. 1, and its toy model are intentionally designed to be simple. A large piece of land is considered, on which water infiltrates and is stored in the soil (without distinction from groundwater), from where it can transpire through vegetation. Except for infiltration, transpiration and water storage in the soil, no other hydrological processes are considered. To simplify the system, no change is imposed on its external "forcings". That is, the rates of infiltration φ′ and potential transpiration τ′_p are assumed constant in time. The toy model is constructed assuming discrete time, denoted as i = 1, 2, ..., and that in each time unit Δt (say, "year"), the input is φ := φ′Δt = 250 mm and the potential output is τ_p := τ′_p Δt = 1000 mm. The internal state variables, which are allowed to vary in time, are two, thus shaping a two-dimensional (2-D) dynamical system: the fraction of the land that is covered by vegetation, v_i (0 ≤ v_i ≤ 1), and the soil water storage x_i. The latter is measured above a certain datum, so that it can take positive values up to some upper bound α (assuming that water above α spills as runoff) or negative values without a bound (i.e., −∞ < x_i ≤ α). The constant α is assumed to be 750 mm. If the vegetation at time i is v_i, the actual output through transpiration will be τ_i = v_i τ_p. Thus, the water balance equation is

x_i = min(x_{i−1} + φ − v_i τ_p, α)    (1)

[...] the same symbol X), other texts do not distinguish the two at all, thus creating another type of ambiguity. Here we follow another convention, in which random variables are underscored and their values are not.
We can observe that, if at some time i, v_i = φ/τ_p = 250/1000 = 0.25, then the water balance results in x_i = x_{i−1} + φ − v_i τ_p = x_{i−1}. Assuming that the system dynamics is fully deterministic, continuity demands that there should be some specific value of x_{i−1} for which v_i = v_{i−1}. Without loss of generality, we set this value to x = 0; that is, we define the datum in such a way that the vegetation remains unchanged if water is stored up to the datum. Thus, the state (v = 0.25, x = 0) represents an equilibrium state: if at some time the system happens to be at this state, it will remain there forever. In other words, once the system reaches its equilibrium state, it becomes a "dead" system, exhibiting no change.
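The water balance step and its equilibrium can be checked with a few lines of code. This is a minimal sketch, assuming that the water balance takes the form x_i = min(x_{i−1} + φ − v_i τ_p, α) implied by the description above (input φ, actual transpiration v_i τ_p, spill above α); the function name `water_balance` is hypothetical.

```python
PHI = 250.0     # input (infiltration) per time step, mm, as given in the text
TAU_P = 1000.0  # potential transpiration per time step, mm, as given in the text
ALPHA = 750.0   # upper bound of soil water storage, mm, as given in the text


def water_balance(x_prev: float, v: float) -> float:
    """One step of the assumed water balance.

    Storage gains the input PHI, loses the actual transpiration v * TAU_P,
    and anything above ALPHA is assumed to spill as runoff.
    """
    return min(x_prev + PHI - v * TAU_P, ALPHA)


# With v = PHI / TAU_P = 0.25, input and output cancel: storage is unchanged.
x_equilibrium = water_balance(0.0, PHI / TAU_P)
# With no vegetation at all, storage fills up and is capped at ALPHA.
x_capped = water_balance(700.0, 0.0)
print(x_equilibrium, x_capped)
```

Only the water balance is sketched here; the vegetation dynamics are deliberately left out, so the block does not reproduce the full toy model.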
Apparently, it is more interesting to study our system when it is "alive", that is, out of the equilibrium. To this aim, as the system is 2-D, we need one equation in addition to Eq. (1) to model it, which we seek in conceptualizing the dynamics of vegetation. As x = 0 represents the state where the vegetation does not change, we may assume that soil water in excess, x > 0, will result in an increase of vegetation and soil water in deficit, x < 0, will result in a decrease of the vegetation cover. The graph in Fig. 2 was constructed heuristically, according to this logic, and is described by Eq. (2), where β = 100 mm is a standardizing constant that makes the equation dimensionally consistent.
Summarizing, we have a 2-D toy model in discrete time i whose state is described by the state variables (x_i, v_i) =: x_i (with x_i in bold denoting a vector) and whose dynamics is represented by Eq. (1) (water balance) and Eq. (2) (vegetation cover dynamics).
The system parameters are four and are assumed to be known precisely: φ = 250 mm, τ_p = 1000 mm, α = 750 mm and β = 100 mm. The model is easy to program on a hand calculator or a spreadsheet. The system dynamics is graphically demonstrated in Fig. 3, where interesting geometrical surfaces appear, showing that the transformation x_i = f(x_{i−1}) is not invertible. It should be stressed that no explicit "agent" of randomness (e.g., perturbation by a random number generator) has been introduced into the system. Moreover, there is no external forcing imposing change onto the system. If any change occurs, it is caused by internal reasons, that is, by an "imbalance" of the vegetation cover and water stored. To see this, let us assume that at time i = 0 the system state is somewhat different from the equilibrium state, setting initial conditions x_0 = 100 mm (≠ 0) and v_0 = 0.30 (≠ 0.25). Using Eqs. (1) and (2) we can calculate the system state (x_i, v_i) at times i > 0. The trajectories of x and v for time i = 1 to 100 are shown in Fig. 4. Apparently, the system remains "alive", i.e., it exhibits change all the time, and its state does not converge to the equilibrium. The trajectories, albeit produced by simple deterministic dynamics, are interesting and seem periodic; we will discuss their properties in Sect. 5. As the dynamics is fully deterministic, one may be tempted to cast predictions for arbitrarily long time horizons. For example, for time 100, iterative application of the simple dynamics allows us to calculate the prediction x_100 = −244.55 mm, v_100 = 0.7423 (as plotted at the right end of Fig. 4).

Furthermore, one may be tempted to think that the "primitive science" that this system represents, if any, has come to an end with the above discourse. We have already achieved an understanding of the system, its driving mechanisms and the causative relationships: (a) there is water balance (conservation of mass); (b) excessive soil water causes increase of vegetation; (c) deficient soil water causes decrease of vegetation; (d) excessive vegetation causes decrease of soil water; and (e) deficient vegetation causes increase of soil water. And we have completely and precisely formulated the system dynamics, which is fully consistent with this understanding, very simple, fully deterministic, nonlinear and chaotic. However, science is not identical to understanding. As R. Feynman (1965) stated, "I think I can safely say that nobody understands quantum mechanics", and this does not preclude quantum mechanics from being science. Literally, the name science points to [...] mechanisms and a precise formulation of its dynamics, science may have not come to an end, as far as our toy model is concerned. Let us now focus on predictions, especially of the future, which is a crucial target of science, with even higher importance in engineering. Does deterministic dynamics really allow a reliable prediction at an arbitrarily long time horizon, as in our above example? In the previous section, in constructing our prediction for time 100 (x_100 = −244.55 mm, v_100 = 0.7423), we, explicitly or implicitly, assumed that we know the parameters and initial state with full precision. However, these are real numbers. It is now well known that not only can real numbers not be known with full precision, but (with probability 1) they are not computable (Chaitin, 2004). Therefore, our further investigations will incorporate the premise that a continuous (real) variable cannot ever be described with full (infinite) precision, particularly if it varies in time. This premise, which we will call the premise of incomplete precision, is consistent with mathematics (cf. Chaitin's results), as well as with physics (cf. W. Heisenberg's, 1927, uncertainty principle).
It is reasonable, then, to assume that there is some small uncertainty, at least in the initial conditions (initial values of state variables). Perhaps it would be reasonable to assume that there is uncertainty also in the parameters and in model Eq. (2) (but not in Eq. (1), which represents preservation of mass). However, to keep our study simple, we will restrict our investigation to the uncertainty of initial conditions. Fig. 5 shows the trajectory of the soil water x for the already examined initial conditions (x_0 = 100 mm, v_0 = 0.30) and for five more sets of initial conditions only slightly (<1%) different from the basic set. At short times, the differences in the trajectories are not visible in Fig. 5. At about time 20, the differences become visible and slightly later (time ∼30) they become large. Soon thereafter, the different trajectories become unrelated to each other. This shows that a tiny uncertainty in initial conditions gets amplified after some time, a fact well known in chaotic systems since H. Poincaré's discovery of chaos (e.g., Poincaré, 1908). As a result, the deterministic dynamics can produce good predictions only for short time horizons. For longer time horizons, the deterministic predictions become extremely inaccurate and useless. In other words, at long times the system behaviour is unpredictable, that is, random, whereas at short times it is very well described by the deterministic dynamics.

* Science < Latin Scientia < translation of Greek Episteme (Επιστήμη) < Epistasthai (Επίστασθαι) = to know how to do < [epi (επί) = over] + [histasthai (ίστασθαι) = to stand] = to overstand.
We can easily imagine that if the system dynamics were different, so as to drive it to its "dead" equilibrium state, there would not be uncertainty in the future. The nonlinear type of dynamics we used is the agent that made the system "alive", i.e., changing and not dying. Apparently, what makes the system alive is the same agent that creates the uncertainty. Only dead systems are certain, and this might be useful to recall when thinking of eliminating uncertainty.
The type of uncertainty we observed here can hardly be classified in the categories typically used in hydrology. It is not model uncertainty, i.e., incomplete representation of reality, because our system is artificial. It is not parameter uncertainty, because we assumed that the parameters are completely known. It is not even data uncertainty, as our inputs and outputs are assumed fully known and constant, and in fact we have not assumed any measurement error. The uncertainty in the initial conditions should be thought of as a consequence of the premise of incomplete precision, rather than as a measurement error. One may think that the assumed uncertainty of 1% is too high to represent this premise. But we used this number just for better illustration. One may easily experiment with lower uncertainties in initial conditions to see that the behaviour does not change. Only the time span of predictability changes. For example, reducing uncertainty from 1% to 10^−6 will extend the predictability time span, but not more than double it.
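The weak dependence of the predictability time span on the initial uncertainty can be explored numerically. The sketch below is not the toy model of this paper: since it needs a complete set of model equations, it substitutes the logistic map u → 4u(1 − u), a standard chaotic map, as a stand-in. The function names, the tolerance of 0.1 and the starting point 0.3 are all assumptions chosen for illustration.

```python
def logistic(u: float) -> float:
    """Stand-in chaotic map on [0, 1]; NOT the toy model of the text."""
    return 4.0 * u * (1.0 - u)


def predictability_horizon(u0: float, eps: float, tol: float = 0.1,
                           max_steps: int = 10_000) -> int:
    """Steps until two runs that start eps apart differ by more than tol."""
    a, b = u0, u0 + eps
    for step in range(1, max_steps + 1):
        a, b = logistic(a), logistic(b)
        if abs(a - b) > tol:
            return step
    return max_steps  # tolerance never exceeded within max_steps


# Shrink the initial uncertainty by factors of 100 and record the horizon.
horizons = {eps: predictability_horizon(0.3, eps) for eps in (1e-2, 1e-4, 1e-6)}
for eps, steps in horizons.items():
    print(f"initial uncertainty {eps:g}: horizon {steps} steps")
```

Because the separation between nearby runs grows roughly geometrically in a chaotic system, each large reduction of the initial uncertainty buys only a modest, roughly additive, extension of the horizon. The exact numbers depend on the map, the tolerance and the starting point.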
All alive natural systems behave more or less this way, and only the predictability time span changes. This view unifies phenomena as diverse as the movement of dice and planets, although in the former the time span of predictability is less than a second, whereas in the latter it is several millions of years. As strange as it seems, even the solar system is chaotic and unpredictable in such long horizons (Laskar, 1989). For example, it has been shown that it may never be possible to accurately calculate the location of the Earth in its orbit 100 million years in the past or into the future (Duncan, 1994; Lissauer, 1999).

From determinism to stochastics
As simple and obvious as the premise of incomplete precision may seem, it implies a radically different perception and study of physical phenomena. First of all, the proper visualization of the trajectory of a system's evolution can no longer be a line or a thread. Rather it should be a stream tube (a notion familiar to hydrologists) of nonzero size (distance between its imaginary walls). The path this tube follows is important to know, but the size of the tube is equally important. This size is not constant, but varies in time. When an observation is made at a time i, the size of the tube at this time becomes tiny. In our hypothetical system this small size represents an error related to the premise of incomplete precision, but in real-world systems it represents a (usually much higher) observation error. However, for future times, as well as past times at which no observation had been made, the size becomes much larger. Initially, we may think that the imaginary walls of this stream tube should be the envelope curves of several model runs with perturbed initial conditions within the assumed uncertainty bounds.
The tube visualization of this type for x_i is shown in Fig. 6 in two cases, when only x_0 is observed and when x_0 to x_30 are observed, and with only 5 model runs (as in Fig. 5) as well as with 1000 model runs. At small times the tube has a size too tiny to be seen and a rough shape. The latter is typical in systems with discrete time dynamics (in continuous time it would be smooth, but the dynamics and the calculations would be too complicated to serve our purpose). At long times, the size gets much larger and increases with the number of model runs that are used to construct the envelope curves. It can be expected that, as this number tends to infinity, the zone between the envelopes will tend to cover all available space; in the case of x, this is the interval between −∞ and α = 750 mm. Apparently, the dependence of the size on the number of model runs and the large or infinite size of the tube are deficiencies of the envelope method. Because of these deficiencies, a deterministic approach of this type, i.e., based on lower and upper physical bounds (cf. the probable maximum precipitation concept), cannot help to effectively describe the stream tube size and the uncertainty.
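The dependence of the envelope on the number of runs can be illustrated with a short experiment. As a loudly stated assumption, the sketch uses the logistic map u → 4u(1 − u) as a stand-in for the toy model, a perturbation of ±0.001 around 0.3, and hypothetical helper names; the 5-run ensemble is taken as a subset of the 1000-run ensemble so that the two envelopes are directly comparable.

```python
import random


def logistic(u: float) -> float:
    """Stand-in chaotic map on [0, 1]; NOT the toy model of the text."""
    return 4.0 * u * (1.0 - u)


def envelope_width(starts: list[float], steps: int) -> list[float]:
    """Width (max minus min over all runs) of the envelope at each time."""
    states = list(starts)
    widths = [max(states) - min(states)]
    for _ in range(steps):
        states = [logistic(u) for u in states]
        widths.append(max(states) - min(states))
    return widths


rng = random.Random(0)  # fixed seed so the experiment is repeatable
starts = [0.3 + rng.uniform(-1e-3, 1e-3) for _ in range(1000)]

width_5 = envelope_width(starts[:5], 40)   # envelope of the first 5 runs
width_1000 = envelope_width(starts, 40)    # envelope of all 1000 runs

for t in (0, 10, 20, 40):
    print(f"time {t:2d}: 5 runs {width_5[t]:.4f}, 1000 runs {width_1000[t]:.4f}")
```

Since the larger ensemble contains the smaller one, its envelope can never be narrower, and at long times it tends to fill the whole admissible interval. This is the deficiency noted above: the envelope measures the number of runs as much as it measures the uncertainty.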
Here come probability and stochastics, which will give us a good description of the tube size, as well as a profile of the likelihood that the system state is at any specified position, a profile reminiscent of the profile of longitudinal velocity across a stream tube in real flows. One may wonder: is it permissible to use probability in a system that is purely deterministic, as the system we investigate here is? The answer we propose is a categorical "yes". This answer is consistent with the unified notion of randomness discussed above, as well as with the concept of probabilization of uncertainty, that is, the axiomatic reduction from the notion of an uncertain quantity to the notion of a random variable (Robert, 2007). To the author's perception, nothing in the Kolmogorov (1933) axiomatic system prohibits this reduction. In a probability-theoretic context, an unknown value x_i is a realization of a random variable x̲_i and is associated with a probability density function f(x_i). A family of random variables x̲_i (arbitrarily, usually infinitely, large) is a stochastic process, whereas a realization of the stochastic process, i.e., a series of numbers x_i, is a time series.
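The probabilization of uncertainty can be sketched numerically: the uncertain initial state is represented by many samples, each sample is evolved by the same deterministic rule, and the density at a later time is read from a histogram. This Monte Carlo procedure is a swapped-in approximation chosen for brevity, the logistic map again stands in for the toy model, and all names and numerical settings are assumptions.

```python
import random


def logistic(u: float) -> float:
    """Stand-in chaotic map on [0, 1]; NOT the toy model of the text."""
    return 4.0 * u * (1.0 - u)


def push_forward(samples: list[float], steps: int) -> list[float]:
    """Evolve every sample deterministically for the given number of steps."""
    for _ in range(steps):
        samples = [logistic(u) for u in samples]
    return samples


def histogram_density(samples: list[float], bins: int = 10) -> list[float]:
    """Piecewise-constant density estimate on [0, 1]."""
    counts = [0] * bins
    for u in samples:
        counts[min(int(u * bins), bins - 1)] += 1
    width = 1.0 / bins
    return [c / (len(samples) * width) for c in counts]


rng = random.Random(1)  # fixed seed so the experiment is repeatable
# Initial state known to within about 1%: a narrow density around 0.3.
initial = [0.3 + rng.uniform(-0.003, 0.003) for _ in range(20_000)]

density_0 = histogram_density(initial)
density_30 = histogram_density(push_forward(initial, 30))
print("time  0:", [round(d, 2) for d in density_0])
print("time 30:", [round(d, 2) for d in density_30])
```

No random perturbation enters after time 0; the spreading of the density is produced entirely by the deterministic dynamics acting on an imperfectly known initial state.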
Probability, along with the related fields of statistics and stochastic processes, is currently described by the collective term stochastics (from the Greek Stochazesthai). The current meaning of this scientific term is no different from that first given by Jakob Bernoulli (1713; Ars Conjectandi, written 1684-1689). Specifically, Bernoulli defined stochastics as the science of prediction, or the science of measuring as exactly as possible the probabilities of events. In this respect, stochastics should not be identified with the very common ARMA or similar types of models.
To make up a stochastic formulation of the evolution of the system, first, we fully utilize the known deterministic dynamics $\mathbf{x}_i = S(\mathbf{x}_{i-1})$, where $S$ is the vector function representing the deterministic dynamics. In our system, the state $\mathbf{x}_i$ is the vector $(x_i, v_i)$ and the transformation $S$ is described by Eqs. (1) and (2) and is graphically depicted in Fig. 3. Second, we assume that the density at time 0, $f(\mathbf{x}_0)$, is known. Third, we use the following concept from the theory of dynamical systems: given the probability density function at time $i-1$, $f(\mathbf{x}_{i-1})$, that of the next time $i$, $f(\mathbf{x}_i)$, is given by the Frobenius-Perron operator FP, i.e. $f(\mathbf{x}_i) = \mathrm{FP}\, f(\mathbf{x}_{i-1})$, uniquely defined by an integral equation (e.g., Lasota and Mackey, 1991), which in our case takes the following simplified form,

$$f_i(\mathbf{x}) = \frac{\partial^2}{\partial x\,\partial v} \int_{S^{-1}(A)} f_{i-1}(\mathbf{u})\,\mathrm{d}\mathbf{u} \qquad (3)$$

where $f_i(\mathbf{x})$ denotes the density of $\mathbf{x}_i$ at the point $\mathbf{x} = (x, v)$, $A := \{\mathbf{x};\ \mathbf{x} \le (x, v)\}$ and $S^{-1}(A)$ is the counterimage of $A$, i.e. the set containing all points $\mathbf{x}$ whose mappings $S(\mathbf{x})$ belong to $A$. Iterative application of the equation can determine the density $f(\mathbf{x}_i)$ for any time $i$. This shows that the stochastic representation has an analytical expression, as has the deterministic. However, the stochastic representation refers to the evolution in time of admissible sets and densities (the stream tube visualization), rather than to trajectories of points (the thread visualization). From the deterministic, "exact" but inaccurate, thread-like trajectory $\mathbf{x}_i = S(\mathbf{x}_{i-1})$, we have moved to the tube-like trajectory:

$$f(\mathbf{x}_i) = \mathrm{FP}\, f(\mathbf{x}_{i-1}) \qquad (4)$$

which is a slightly different way of writing Eq. (3). Clearly, the stochastic formulation does not disregard the deterministic dynamics: it is included in the counterimage $S^{-1}(A)$. However, it can be easily extended to describe non-deterministic dynamics by generalizing the FP operator (Lasota and Mackey, 1991).
In the iterative application of the stochastic description of the system evolution we encounter two difficulties. First, despite being simple, the function $S$ is not invertible and the integral over the counterimage $S^{-1}(A)$ needs to be evaluated numerically.
Second, as the deterministic formulation is quite satisfactory for short time horizons, the stochastic formulation becomes more meaningful for long ones. Iterative application of Eq. (4) over time will result in multiple integrations, so that eventually, for long time horizons, we need to perform a high-dimensional numerical integration. This is difficult, unless a stochastic integration method is used. Specifically, it is easily shown (e.g., Metropolis and Ulam, 1949; Niederreiter, 1992) that for a number of dimensions $d > 4$, a stochastic (Monte Carlo) integration method (in which the function evaluation points are taken at random) is more accurate (for the same total number of evaluation points) than classical numerical integration, based on a grid representation of the integration space.
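The claim about dimensionality can be checked on a small example. The sketch below is only an illustration under stated assumptions: the integrand, the dimension $d = 6$, the midpoint rule standing in for "classical" grid integration, and all function names are choices made here, not taken from the paper. Both methods use the same total number of evaluation points.

```python
import itertools
import math
import random

D = 6            # number of dimensions (an assumed value with d > 4)
N_PER_AXIS = 4   # grid nodes per axis, so the grid uses 4**6 = 4096 points


def integrand(point):
    """Product of (pi/2)*sin(pi*x_j); its exact integral over [0,1]^D is 1."""
    return math.prod(0.5 * math.pi * math.sin(math.pi * x) for x in point)


def grid_integral(d=D, n=N_PER_AXIS):
    """Midpoint rule on a regular grid with n nodes per axis."""
    nodes = [(j + 0.5) / n for j in range(n)]
    total = sum(integrand(p) for p in itertools.product(nodes, repeat=d))
    return total / n ** d


def monte_carlo_integral(n_points, d=D, seed=0):
    """Average of the integrand at points drawn uniformly at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_points):
        total += integrand([rng.random() for _ in range(d)])
    return total / n_points


n_total = N_PER_AXIS ** D
grid_estimate = grid_integral()
mc_estimate = monte_carlo_integral(n_total)
print("grid error:       ", abs(grid_estimate - 1.0))
print("Monte Carlo error:", abs(mc_estimate - 1.0))
```

With only four nodes per axis the grid cannot resolve the integrand well in any single direction, and that per-axis error compounds over the six dimensions, whereas the Monte Carlo error depends only on the total number of points.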
The Monte Carlo method is very powerful, yet so easy that we may fail to notice that we are doing numerical integration and that there is some concrete mathematical background (Eq. 3) behind our simulations. In our example, the Monte Carlo method does not involve other calculations than those we did to construct the envelopes above.
It is so very simple that it even bypasses the calculation of $S^{-1}(A)$. Results for the density function $f_i(x)$ of the system state $x$ (soil water) for time $i = 100$, in comparison with that for time $i = 0$, are shown in Fig. 7. The Monte Carlo integration was performed assuming $f(\mathbf{x}_0)$ to be uniform, extending 1% around the value $\mathbf{x}_0 = (100\ \mathrm{mm}, 0.30)$, and using 1000 simulations. It is observed that moving from time $i = 0$ to $i = 100$, the density changes from concentrated to broad and from uniform to Gaussian; the theoretical Gaussian curve is also plotted.

Knowing a priori, for theoretical reasons, that the probability distribution, after a long time, will be Gaussian is very important and substantially simplifies the solution of the problem. But are there theoretical reasons implying a Gaussian distribution? Jaynes (2003) lists a number of them. The most widely known is the Central Limit Theorem. In its most common formulation, which involves sums of random variables, it seems inapplicable here as there are no such sums. However, if interpreted as a statement about the properties of density functions under convolution (multiple integrals of a number of density functions tend to the Gaussian density), it may give the required explanation. However, it is more convenient to express our reasoning in terms of the principle of maximum entropy: for fixed mean and variance, the distribution that maximizes entropy is the normal distribution (or the truncated normal, if the domain of the variable is an interval in the real line). Entropy (from the Greek εντροπία < entrepomai (εντρέπομαι) = to turn into) is a probabilistic concept, which for a continuous random variable $x$ is defined as

$$\phi[x] := E[-\ln f(x)] = -\int_{-\infty}^{\infty} f(x)\,\ln f(x)\,\mathrm{d}x \qquad (5)$$

where $E[g(x)]$ denotes the expectation of any function $g(x)$, i.e.,

$$E[g(x)] := \int_{-\infty}^{\infty} g(x)\,f(x)\,\mathrm{d}x$$

Entropy is a typical measure of uncertainty, so its maximization indicates that the uncertainty spontaneously becomes as high as possible (this is the basis of the Second Law of thermodynamics). Entropy could be used to quantify the notions of randomness (high entropy) and determinism (lowest entropy). Given that information and entropy are more or less the same quantity, this quantification agrees with Kolmogorov's and Chaitin's view of mathematical randomness.
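As a small numerical check of the maximum entropy statement, the sketch below compares the entropies of three densities that share the same standard deviation, using their standard closed-form expressions. The choice of the uniform and Laplace densities as competitors, the value $\sigma = 1$ and the function names are assumptions made for illustration only.

```python
import math


def entropy_normal(sigma):
    """Entropy of a normal density: 0.5 * ln(2 * pi * e * sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def entropy_uniform(sigma):
    """Entropy ln(w) of a uniform density of width w, with variance w^2 / 12."""
    width = sigma * math.sqrt(12.0)
    return math.log(width)


def entropy_laplace(sigma):
    """Entropy 1 + ln(2b) of a Laplace density of scale b, with variance 2 b^2."""
    b = sigma / math.sqrt(2.0)
    return 1.0 + math.log(2.0 * b)


sigma = 1.0  # any common standard deviation gives the same ordering
h_normal = entropy_normal(sigma)
h_laplace = entropy_laplace(sigma)
h_uniform = entropy_uniform(sigma)
print(f"normal:  {h_normal:.4f}")
print(f"Laplace: {h_laplace:.4f}")
print(f"uniform: {h_uniform:.4f}")
```

For equal variance the normal density has the largest entropy of the three, in line with the principle quoted above.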
In the same manner (in fact, using the same simulation runs) we can calculate the densities for all times. The propagation of uncertainty in time is typically visualized through prediction intervals, such as those shown in Fig. 8, for a certain probability, say 95%, of bracketing the true state between the stream tube walls. Apparently, this stream tube visualization is more consistent than the envelope representation of Fig. 6. It is observed that for long time horizons the stream tube becomes less rough and its size, i.e. the uncertainty, tends to stabilize to a maximum value. This defines another type of equilibrium, a statistical thermodynamic equilibrium of maximum entropy. To distinguish it from the static or "dead" equilibrium of Sect. 2, we can call it the "alive" equilibrium.
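The construction of such a stream tube can be sketched in a few lines. Because Eqs. (1) and (2) of the toy model are not reproduced in this part of the text, the sketch below substitutes the logistic map as a stand-in chaotic system; this substitution, the initial value 0.3 and all function names are assumptions. Only the ensemble size (1000), the 1% initial uncertainty, the horizon (100 steps) and the 95% probability follow the text.

```python
import random


def step(x):
    """Stand-in chaotic dynamics (logistic map); NOT the toy model of Eqs. (1)-(2)."""
    return 4.0 * x * (1.0 - x)


def quantile(sorted_values, p):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def prediction_intervals(x0=0.3, rel_uncertainty=0.01, n_runs=1000,
                         n_steps=100, seed=1):
    """95% prediction interval of the ensemble at times 0, 1, ..., n_steps."""
    rng = random.Random(seed)
    # Initial density: uniform, extending 1% around x0
    ensemble = [x0 * (1.0 + rng.uniform(-rel_uncertainty, rel_uncertainty))
                for _ in range(n_runs)]
    intervals = []
    for _ in range(n_steps + 1):
        ordered = sorted(ensemble)
        intervals.append((quantile(ordered, 0.025), quantile(ordered, 0.975)))
        # Each ensemble member follows the deterministic dynamics
        ensemble = [step(x) for x in ensemble]
    return intervals


intervals = prediction_intervals()
widths = [upper - lower for lower, upper in intervals]
print("interval width at time 0:  ", widths[0])
print("interval width at time 100:", widths[-1])
```

The tube starts narrow and widens until it spans almost the whole range of the stand-in system, after which its size no longer grows; the specific shape of the tube is of course specific to the stand-in map.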

The power of data
As our prediction horizon increases and we approach the "alive" equilibrium, we may find it natural to raise this question: do we really need the deterministic dynamics to make a long-term prediction? Intuitively, from Fig. 8, the answer seems to be negative. But, then, what can replace the dynamics? Recalling that the form of the density function may be known a priori, as discussed above, what we need to completely express the density function are the two parameters of the Gaussian curve, namely its mean and standard deviation. But these could be estimated from data. Hence, for long horizons, past data render knowledge of the dynamics unnecessary. For illustration, we show in Fig. 9 a record of 100 past values of $x_i$ corresponding to times $i = -100$ to $-1$. Here, because our caricature system is imaginary, the past data are synthetic, generated by the same model, but in a real system with really unknown dynamics these would be past observations of the system state. To generate the data here we assumed initial conditions $\mathbf{x}_{-100} = (73.99\ \mathrm{mm}, 0.904)$, for which the resulting state at time $i = 0$ is $\mathbf{x}_0 = (99.5034 \approx 100\ \mathrm{mm},\ 0.3019 \approx 0.30)$. This state is compatible (within precision 1%) with the rounded-off initial state $\mathbf{x}_0 = (100, 0.30)$ that we used in earlier investigations. Interpreting past data as a statistical sample, we estimate a sample mean $\mu = -2.52$ mm and a sample standard deviation $\sigma = 209.13$ mm. With these values we can obtain the complete density function for time $i = 100$, which is plotted in Fig. 7 along with the results obtained by the Monte Carlo simulation, in which the deterministic dynamics was explicitly taken into account. It can be seen that the empirical result without considering the dynamics is a good approximation.

Despite being empirical, this result and, more generally, the use of past data in prediction, can find a theoretical justification in the concept of ergodicity (from the Greek [ergon = work] + [odos = path]), an important concept in dynamical systems and stochastics. By definition (e.g., Lasota and Mackey, 1994, p. 59), a transformation is ergodic if all its invariant sets are trivial (have zero probability). In other words, in an ergodic transformation starting from any point, a trajectory will visit all other points, without being trapped in a certain subset. (In contrast, in non-ergodic transformations there are invariant subsets, such that a trajectory starting from within a subset will never depart from it.) An important theorem by Birkhoff (1931) says that for an ergodic transformation $S$ and for any integrable function $g$ the following property holds true:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} g\big(S^i(\mathbf{x})\big) = \int g(\mathbf{x})\,f(\mathbf{x})\,\mathrm{d}\mathbf{x}$$

with the right-hand side representing the expectation $E[g(\mathbf{x})]$. For example, for $g(x) = x$, setting $x_0$ the initial system state, observing that the sequence $x_0$, $x_1 = S(x_0)$, $x_2 = S^2(x_0)$, ..., represents a trajectory of the system, and taking the equality in the limit as an approximation with finite ($n$) terms, we obtain that the time average equals the true (ensemble) average $E[x]$:

$$\frac{1}{n} \sum_{i=0}^{n-1} x_i \approx E[x]$$

Thus, ergodicity allows estimation of the system properties using past data only.

The question then arises: if the dynamics is known, should a long-term prediction be better based on the data or on the dynamics? To explore this question, let us compare two different types of predictions: (a) a typical deterministic prediction, based on applying the dynamics as done initially in Fig. 4; (b) a naïve statistical prediction, according to which the future equals the average of past data. Stochastics provides the tool to compare the two predictions, which is the standard (root mean square, RMS) error,

$$e_i := \sqrt{E\big[(x_i - \hat{x}_i)^2\big]} \qquad (9)$$

where $x_i$ denotes the random variable representing the state and $\hat{x}_i$ denotes the specific prediction for time $i$ provided by either method (a) or (b). It is easily seen that for method (b), in which $\hat{x}_i = \mu$ (mean, estimated at $-2.52$ mm), the standard error equals the standard deviation $\sigma$ (estimated at 209.13 mm), and is constant for all $i$. In method (a), $e_i$ is different for different times $i$ and can be evaluated by Monte Carlo integration of Eq. (9). The results are shown in Fig. 10. Clearly, in short lead times (< ~30) the deterministic forecast is better, but in long lead times (> ~45) the naïve statistical forecast is superior.

However, this is not a surprise. Actually, stochastics can give us an a priori estimate of the deterministic model error, applicable near the "alive" equilibrium, where the uncertainty has been stabilized. Thus, from Eq. (9) we obtain

$$e_i^2 = E\big[\big((x_i - \mu) + (\mu - \hat{x}_i)\big)^2\big]$$

which after typical manipulations results in

$$e_i = \sqrt{\sigma^2 + (\hat{x}_i - \mu)^2}$$

When it happens that $\hat{x}_i = \mu$, then $e_i = \sigma$, as in the statistical prediction; otherwise the error in the deterministic prediction is obviously greater than $\sigma$. The a priori error estimates are also plotted in Fig. 10 and agree well with those obtained by Monte Carlo simulation. By treating $\hat{x}_i$ also as a random variable we easily obtain that the average error of the deterministic forecast over all times (above ~45) will be $e_i = \sigma\sqrt{2}$ (also plotted in Fig. 10), which shows that, on the average, the naïve statistical prediction outperforms the deterministic prediction by a factor of $\sqrt{2}$.
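The factor of $\sqrt{2}$ can be reproduced numerically. The sketch below again uses the logistic map as a stand-in for the toy model, whose equations are not given in this part of the text; the map, the relative initial-state error of $10^{-6}$, the record lengths and all names are assumptions. A "true" trajectory is compared at long lead times with (a) a deterministic forecast started from a slightly wrong present state and (b) the mean of the past record.

```python
import math


def step(x):
    """Stand-in chaotic map (logistic map); NOT the toy model of Eqs. (1)-(2)."""
    return 4.0 * x * (1.0 - x)


def trajectory(x0, n):
    """Return n successive states starting from x0."""
    states = [x0]
    for _ in range(n - 1):
        states.append(step(states[-1]))
    return states


def rms(errors):
    """Root mean square of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


N_PAST, LEAD, N_TEST = 1000, 100, 10000

truth = trajectory(0.3, N_PAST + LEAD + N_TEST)

# (b) naive statistical prediction: the mean of the past record
past = truth[:N_PAST]
mu = sum(past) / N_PAST

# (a) deterministic prediction: exact dynamics, present state known
#     only to a relative precision of 1e-6
present = truth[N_PAST]
forecast = trajectory(present * (1.0 - 1e-6), LEAD + N_TEST)

# Compare only at lead times of at least LEAD steps (long horizons)
true_future = truth[N_PAST + LEAD:]
det_future = forecast[LEAD:]

rms_deterministic = rms([t - f for t, f in zip(true_future, det_future)])
rms_naive = rms([t - mu for t in true_future])
ratio = rms_deterministic / rms_naive
print("RMS error, deterministic:", rms_deterministic)
print("RMS error, naive mean:   ", rms_naive)
print("ratio:                   ", ratio)
```

At these lead times the forecast has lost all memory of the true trajectory, so it behaves like an independent draw from the same distribution, and the ratio of the two errors comes out close to $\sqrt{2}$.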
This example shows that for long horizons the use of deterministic dynamics gives misleading results and a dangerous illusion of exactness. Unless a stochastic framework is used, neglecting deterministic dynamics is preferable. In very complex systems, the same behaviour could emerge also in the smallest prediction horizons. This justifies, for example, the so-called ensemble forecasting in precipitation and flood prediction. In essence, it does not differ from the stochastic framework discussed here, and is much more effective and reliable than a single deterministic forecast.
In seeking a more informative prediction than the naïve prediction, a natural question is: is reduction of uncertainty possible for long time horizons? In our simple example the answer is categorical: no way! For there is no margin for better knowledge of the dynamics (we have assumed full knowledge already). And potentially improved knowledge of initial conditions makes no difference. As mentioned above, reduction of initial uncertainty from 1% to $10^{-6}$ results in no reduction of final uncertainty at $i = 100$. Therefore, a more informative prediction cannot be a prediction with reduced uncertainty. Rather, it must be a point prediction accompanied by quantified uncertainty. This has been already done in Fig. 7.
In summary, for long time horizons, the stochastic inference using (a) past data, (b) ergodicity, and (c) maximum entropy, provides an informative prediction. Knowledge of dynamics does not improve this prediction. For short time horizons, the stochastic framework also incorporates the deterministic dynamics and uses it in a Monte Carlo framework. Thus, the stochastic representation is an all-times solution, good for both short and long horizons, and helps figure out when the deterministic dynamics should be considered or neglected.
In theory, a good data set allows even the recovery of dynamics, if it is unknown, employing the Whitney (1936) and Takens (1981) embedding theorems. The recovery is based on time-delay vectors $\mathbf{x}_i^m := (x_i, x_{i-1}, \ldots, x_{i-m+1})$ of a single observable $x_i$, with the required vector size $m$ being no more than $2d + 1$, where $d$ is the system dimensionality, which can also be estimated from the time series. Forming time-delay vectors $\mathbf{x}_i^m$ with trial size $m$, we are able to calculate the multidimensional entropy

$$\phi_m(\varepsilon) := E\big[-\ln p(\mathbf{x}_i^m)\big]$$

where $\varepsilon$ is a scale length (side length of hypercube) related to a grid covering the $m$-dimensional space, on which the empirical probability $p$ that a data point $\mathbf{x}_i^m$ belongs to each hypercube is calculated (notice the difference from definition (5), i.e. the replacement of the probability density $f$ with the probability mass $p$). The entropy $\phi_m(\varepsilon)$ is a decreasing function of $\varepsilon$ and tends to infinity as $\varepsilon$ tends to zero.
The limit of $\phi_m(\varepsilon)/(-\ln \varepsilon)$ as $\varepsilon$ tends to zero, which (according to l'Hôpital's rule) is also equal to the limit of the slope $d_m(\varepsilon) := -\Delta\phi_m(\varepsilon)/\Delta \ln \varepsilon$, gives the dimension of the subspace of the $m$-dimensional space where the set of $\mathbf{x}_i^m$ lies. For small $\varepsilon$, $d_m(\varepsilon)$ can exceed neither $m$ nor $d$. Application of a standard algorithm that implements this idea for increasing trial values of $m$ (Grassberger and Procaccia, 1983; Koutsoyiannis, 2006) is demonstrated in Fig. 11, where it can be seen that $d_m$ does not exceed $d = 2$, thus capturing the system dimensionality, which is 2. Note, however, that a large data set is required for the application of this technique. Our toy model can easily give us arbitrarily long time series, therefore here we used a time series of 10 000 points (rather than the 100 used before). But shortness of data or insufficient attention to the statistical properties of the data may result in erroneous conclusions (Koutsoyiannis, 2006). Other deterministic controls that can be recovered from the data using stochastic tools are discussed in the next section.
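The entropy-slope idea can be illustrated with a simple box-counting sketch. It is not the Grassberger-Procaccia algorithm cited above, and it is applied to a stand-in system (the logistic map, whose dimensionality is 1, not the two-dimensional toy model); the two scale lengths, the series length and all names are assumptions. Scale lengths that are powers of 1/2 are used so that the coarse grid is exactly nested in the fine one.

```python
import math
from collections import Counter


def logistic_series(n, x0=0.3):
    """Stand-in data: a trajectory of the logistic map (dimensionality 1)."""
    values, x = [], x0
    for _ in range(n):
        values.append(x)
        x = 4.0 * x * (1.0 - x)
    return values


def box_entropy(series, m, eps):
    """phi_m(eps) = -sum p ln p over the occupied hypercubes of side eps."""
    counts = Counter(
        tuple(int(series[i - j] / eps) for j in range(m))  # time-delay vector
        for i in range(m - 1, len(series))
    )
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def entropy_slope(series, m, eps_fine, eps_coarse):
    """d_m = -delta(phi_m) / delta(ln eps) between two scale lengths."""
    d_phi = box_entropy(series, m, eps_fine) - box_entropy(series, m, eps_coarse)
    return -d_phi / (math.log(eps_fine) - math.log(eps_coarse))


series = logistic_series(10000)
slopes = {m: entropy_slope(series, m, 1.0 / 64, 1.0 / 32) for m in (1, 2, 3)}
for m, d_m in slopes.items():
    print(f"m = {m}: d_m = {d_m:.2f}")
```

For this one-dimensional stand-in the estimated slope stays near 1 as the trial size $m$ grows, which mirrors the behaviour described for Fig. 11, where $d_m$ stops growing once it reaches the system dimensionality.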

Exploration of the long term stochastic properties of the system
Arguably, when we are interested in a prediction for a long time horizon, we wish to know not the exact system state at a specified time but an average behaviour around that time and perhaps a measure of the dispersion of the extremes around it. This implies a different perspective of long-term prediction and predictability. In this we can disregard the "instantaneous" system state, which in atmospheric sciences is referred to as the weather, and try to predict the long-term average for a future period, commonly referred to as the climate. According to a common definition, climate is "the long-term average of conditions in the atmosphere, ocean, and ice sheets and sea ice described by statistics, such as means and extremes" (US Global Change Research Program, 2009). The usefulness of the notion of climate as a long-term average extends also to hydrological processes. Actually, in studies of climate change and its impacts, including those in hydrology and water resources, long-term predictions always refer to long-term average conditions.
To study this notion of prediction and predictability, we need long simulations and data series that enable observation of long-term behaviours. In all following illustrations we use time series with lengths of 10 000. The first thousand terms of the time series of the soil water $x_i$, generated with the same initial conditions as before, are shown in Fig. 12 (upper panel). The plot shows high variability at the shortest time scale (i.e., 1), with peculiar variation patterns not visible in the plots of fewer data points presented earlier. Yet it shows a flat time average at scale 30 ("climate"). Despite high variability at scale 1, the trajectory of the system state does not resemble a purely irregular or random pattern. Rather, the trajectory seems cyclical, but a more careful investigation reveals that the time series differs from that of a typical periodic deterministic system. Specifically, as shown in Fig. 13 (upper panel), there is no constant periodicity, but the time $\delta$ between successive peaks of the time series $x_i$ varies between 4 and 10 time steps, with a period of 6 time steps being the most frequent.

The flat average of the soil water $x_i$ makes prediction of the long-term average rather trivial in this caricature system. However, other problems, in which variability plays a role (e.g. long-term behaviour of extremes), are less trivial and more interesting, even in this very simple system. To study the peculiar variability of $x_i$, we introduce the random variable $y_i := |x_i - x_{i+6}|$, where the time lag 6 was chosen to be equal to the most frequent $\delta$. We call $y_i$ the variability index and we will study its long-term behaviour in comparison to that of $x_i$. It can be easily verified that $y_i$ is proportional to the sample standard deviation of the size-two sample $x_i$ and $x_{i+6}$. Apparently, the standard deviation of a number of consecutive $x_i$ (e.g., the seven terms $x_i$ to $x_{i+6}$) would give a more representative variability index, but we chose this simpler definition to avoid artificial dependence between successive time series terms or else to avoid the need to change scale. Besides, the simple definition serves well our exploration purpose. A plot of the first 1000 terms of the time series of $y_i$ is shown in Fig. 12 (lower panel) and reveals a different behaviour than $x_i$ (upper panel). Here the variability is high, not only at a short time scale but also at a long one. The long "excursions" of the moving average of 30 values ("the climate") from the global average (of 10 000 values) are quite characteristic.
To explore the long-term stochastic properties of our system, including periodicity and time dependence or persistence, we use three stochastic tools. The first is the periodogram, i.e., the squared absolute value of the Discrete Fourier Transform of the time series. It is a real function $q(\omega)$, where $\omega$ is frequency. The quantity $q(\omega)\,\mathrm{d}\omega$ is proportional to the fraction of variance explained by $\omega$ and thus excessive values of $q(\omega)$ indicate cyclicity with period $1/\omega$. The periodogram of 10 000 terms $x_i$ is shown in Fig. 13 (lower panel).
The second is the empirical autocorrelation function (autocorrelogram), i.e., the Finite Fourier Transform of the periodogram. It is a sequence of values $\rho_j$, where $j$ is a lag. It is alternatively defined and more easily determined as

$$\rho_j := \frac{\gamma_j}{\gamma_0}$$

where $\gamma_j := E[(x_i - \mu)(x_{i+j} - \mu)]$ and $\mu = E[x]$. The autocorrelograms of both $x_i$ and $y_i$ are shown in Fig. 14.

The third tool aims at a multi-scale stochastic representation. Based on the process $x_i$ at scale 1, we define a process $x_i^{(k)}$ at any scale $k \ge 1$ as:

$$x_i^{(k)} := \frac{1}{k} \sum_{l=(i-1)k+1}^{ik} x_l$$

A key multi-scale characteristic is the standard deviation $\sigma^{(k)}$ of $x_i^{(k)}$. The quantity $\sigma^{(k)}$ is a function of the scale $k \ge 1$, here referred to as the climacogram (from the Greek [climax = scale] + [gramma = written]) and typically depicted on a double logarithmic plot. While the periodogram and the autocorrelogram are related to each other through a Fourier transform, the climacogram is related to the autocorrelogram by a simpler transformation, i.e.,

$$\sigma^{(k)} = \frac{\sigma}{\sqrt{k}} \sqrt{\alpha_k}, \qquad \alpha_k = 1 + 2 \sum_{j=1}^{k-1} \left(1 - \frac{j}{k}\right) \rho_j$$

It is directly verified (actually this is the most classical statistical law) that in a purely random process

$$\sigma^{(k)} = \frac{\sigma}{\sqrt{k}} \qquad (14)$$

However, this law may not be verified in natural systems. The simplest alternative is

$$\sigma^{(k)} = \frac{\sigma}{k^{1-H}} \qquad (15)$$

where $H$ is a constant ($0 < H < 1$) known as the Hurst coefficient, after H. E. Hurst (1951), who first analyzed statistically the long-term behaviour of geophysical time series. Earlier, Kolmogorov (1940), in studying turbulence, had proposed a mathematical model to describe this behaviour. This behaviour has been known by several names, including Hurst phenomenon, long-term persistence, long-range dependence, and Hurst-Kolmogorov (HK) behaviour or HK (stochastic) dynamics. At the same time, Eq. (15) defines a simple stochastic model that reproduces this behaviour, known as a simple scaling stochastic model or fractional Gaussian noise (due to Mandelbrot and van Ness, 1968), or the HK model. Climacograms for the time series $x_i$ and $y_i$ are shown in Fig. 15, where the departure from the classical law (Eq. 14) of a purely random process is evident.
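An empirical climacogram is straightforward to compute. The sketch below averages a series over non-overlapping blocks of length $k$, takes the standard deviation at each scale, and reads $H$ from the log-log slope through Eq. (15). The test series is seeded Gaussian white noise, for which Eq. (14) applies and the slope should be close to $-0.5$; the choice of scales, the least-squares fit and all names are assumptions made here, and the simple estimator shown is not necessarily the one used for Fig. 15.

```python
import math
import random
import statistics


def aggregate(series, k):
    """Averages of the series over non-overlapping blocks of length k."""
    n_blocks = len(series) // k
    return [sum(series[i * k:(i + 1) * k]) / k for i in range(n_blocks)]


def climacogram(series, scales):
    """Pairs (k, standard deviation of the process averaged at scale k)."""
    return [(k, statistics.stdev(aggregate(series, k))) for k in scales]


def hurst_from_climacogram(points):
    """Least-squares slope of ln(sigma_k) against ln(k), plus 1 (Eq. 15)."""
    xs = [math.log(k) for k, _ in points]
    ys = [math.log(s) for _, s in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope + 1.0


rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(10000)]
scales = [1, 2, 4, 8, 16, 32, 64]
points = climacogram(white_noise, scales)
hurst = hurst_from_climacogram(points)
print("estimated H for white noise:", round(hurst, 3))
```

Applied to a persistent series, the same procedure would give a flatter slope and hence $H > 0.5$; for an antipersistent one, a steeper slope and $H < 0.5$.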
A purely random process would have a flat periodogram, but Fig. 13 (lower panel) indicates a different behaviour for $x_i$. Furthermore, fixed periodicities would be manifested in the periodogram as high impulses in the specific periods, but in the periodogram of Fig. 13 no impulses exist. Rather, the figure indicates relatively higher densities $q$ at a broad band of periods $1/\omega$, between 5 and 12 time units (agreeing with the simpler representation in the upper panel of Fig. 13). The most noticeable characteristics in the periodogram are the increasing spectral densities for low frequencies ($1/\omega > 8$) and the decreasing ones for high frequencies ($1/\omega < 8$). The two different behaviours are indicators of antipersistence and persistence, respectively.

The autocorrelogram in Fig. 14 depicts the same behaviours in a different manner. It is observed that the autocorrelation for lag one is positive, which is expected because of physical consistency (states in neighbouring times should be positively correlated because the changes in small time should be small) and indicates short-term persistence. For higher lags the autocorrelation oscillates between negative and positive values. The existence of negative autocorrelations is an indication of antipersistence. In the simplest case, an antipersistent process should have all its autocorrelations negative (and it can be verified that in an HK process defined by Eq. (15) with $H < 0.5$, the autocorrelation function is negative everywhere). But this cannot be observed in nature because short-term persistence demands that some autocorrelations be positive. A strictly periodic behaviour would also result in autocorrelation oscillating between positive and negative values, which creates the risk of misinterpretation of antipersistence as periodicity. However, the intervals between different peaks or troughs in Fig. 14 do not have constant length, so here we have antipersistence. Figure 14 also shows the autocorrelation of the variability index $y_i$. In this case, the autocorrelation is always positive, indicating persistence both in the short and the long term. Processes with consistently positive autocorrelation functions lead to large and long "excursions" from the mean, as shown in Fig. 12 (lower panel), which often tend to be interpreted as nonstationarity. The latter, however, would require that the system's dynamics change in time in a deterministic manner, which does not happen here (and in most cases).
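To see what the "high impulse" of a fixed periodicity looks like in a periodogram, the following sketch computes the periodogram of a synthetic series with an exact period of 8 time steps plus seeded Gaussian noise. The synthetic series, its length, the noise level, the normalization by $n$ and all names are assumptions for illustration; a direct (slow) Discrete Fourier Transform is used so that only the standard library is needed.

```python
import cmath
import math
import random


def periodogram(series):
    """Pairs (frequency k/n, squared absolute DFT value / n), k = 1..n//2."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        result.append((k / n, abs(coeff) ** 2 / n))
    return result


rng = random.Random(0)
n = 256
# Strictly periodic signal (period 8) contaminated by white noise
series = [math.sin(2.0 * math.pi * t / 8.0) + rng.gauss(0.0, 0.5)
          for t in range(n)]

spectrum = periodogram(series)
peak_frequency, peak_power = max(spectrum, key=lambda pair: pair[1])
peak_period = 1.0 / peak_frequency
print("period at the periodogram peak:", peak_period)
```

Here a single ordinate towers over all others at the frequency of the cycle, in contrast with the broad band of elevated densities described for Fig. 13.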
The most characteristic and useful plot of all three is the climacogram in Fig. 15. For the soil water series $x_i$ and for scales 1-3, the slope formed by the empirical points is very low, reflecting short-term persistence. For large scales ($k > 10$) the empirical points are arranged in a straight line with large slope, $-0.98$. From Eq. (15) we can see that this slope equals $H - 1$, so that in this case $H = 0.02$ asymptotically (for large scales). Likewise, the plot for the variability index $y_i$ indicates a slope of $-0.34$ for large scales, which corresponds to $H = 0.66$. The slope for a purely random process, also shown in the figure, is $-0.5$, which corresponds to $H = 0.5$. Generally, an $H$ between 0.5 and 1 characterizes long-term persistence, whereas an $H$ between 0 and 0.5 indicates antipersistence. Thus, this figure verifies the antipersistent and persistent behaviour already detected for $x_i$ and $y_i$, respectively.
Most importantly, this figure provides insights on the predictability issue. It is well known that for a one-step-ahead prediction at scale 1, a purely random process $x_i$ is the most unpredictable. Dependence enhances one-step-ahead predictability. For example, in a process with $\rho_1 = 0.5$ (comparable to that of our series $x_i$ and $y_i$) the standard deviation, conditional on known present state, is a fraction $\sqrt{1-\rho_1^2}$ of the unconditional one, i.e., 13% smaller. However, in climatic-type predictions, in which the average behaviour rather than exact values is studied, the situation is different. In our example, as clearly shown in Fig. 15, at the climatic scale of 30 time steps, predictability is deteriorated by a factor of 3 for the persistent process $y_i$ (thus eliminating and largely exceeding the 13% reduction due to conditioning on the present state). On the other hand, for the antipersistent process $x_i$, the long-term predictability is improved by a factor of about 3. In summary (and perhaps contrary to what is believed), long-term persistence substantially deteriorates predictability over long time scales, but antipersistence improves it.
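As a worked check of the 13% figure, using only the fraction $\sqrt{1-\rho_1^2}$ and the value $\rho_1 = 0.5$ quoted in the text (the symbol $\sigma_{\mathrm{cond}}$ for the conditional standard deviation is introduced here for illustration):

```latex
\frac{\sigma_{\mathrm{cond}}}{\sigma}
  = \sqrt{1-\rho_1^{2}}
  = \sqrt{1-0.5^{2}}
  = \sqrt{0.75}
  \approx 0.866,
\qquad
1 - 0.866 \approx 0.13 .
```

The conditional standard deviation is thus about 87% of the unconditional one, a reduction of roughly 13%.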
Figure 16 provides further demonstration of the unpredictability of persistent processes. The plot shows 1000 items of the time series $y_i$ (variability index) at the annual and the climatic scale and for the two sets of initial conditions discussed above, the exact and rounded off, which differ by less than 1%. After about 30 time steps (one time unit of "climate"), the departures in the two cases are pronounced. Thus, even a completely deterministic system is completely unpredictable at a large (climatic) time scale, when there is persistence.

From the toy model to the real world
In comparison to our toy model, a natural system, such as the atmosphere, a river basin, etc.: (a) is much more complex; (b) has time-varying inputs and outputs; (c) has spatial extent, variability and dependence (in addition to temporal); (d) has greater dimensionality (virtually infinite); (e) has dynamics that to a large extent is unknown and difficult or impossible to express deterministically; and (f) has parameters that are unknown. Hence uncertainty and unpredictability are naturally even more pronounced in a natural system. The role of stochastics is then even more crucial: (a) to infer dynamics (laws) from past data; (b) to formulate the system equations; (c) to estimate the involved parameters; and (d) to test any hypothesis about the dynamics. Data offer the only solid ground for all these tasks, and failure to rely upon, and test against, evidence from data renders inferences about hypothesized dynamics worthless.
Despite the huge difference between the toy model and natural reality, we may hope that the above discourse can help address several questions, from philosophical to technical. Given the current dominant trend in hydrological (and other geophysical) sciences for physically-based modelling, relevant questions are: What is physically-based modelling? Is physics a synonym for determinism? Is physically-based a synonym for mechanistic? Are first principles mechanistic principles? Is not statistical physics part of physics? Is not entropy maximization a first principle? Is not stochastic modelling part of physical modelling? Will it ever be possible to achieve such a physically-based modelling of hydrological systems that will not depend on data or stochastic representations? Can detailed representations and reduction to first principles render hydrologic measurements unnecessary? What level of detail is needed in such reductionist modelling for a catchment of, say, 1000 km²? How far can the current research trend toward detailed models advance hydrology and water resources science and technology?
Hydrological uncertainty and its reduction is currently a core research issue and also very important from a water resources engineering and management perspective. Relevant questions are: To what extent can hydrological uncertainty be reduced? Can uncertainty be eliminated by uncovering the system's deterministic dynamics? Is uncertainty epistemic or inherent in nature? When there is potential for reduction of uncertainty, what is the most effective means for reduction? Is it better understanding, better deterministic modelling, more detailed discretization, or better data? When the limits of uncertainty reduction have been reached, what is the appropriate scientific and engineering attitude? Is it confession of failure and no action, or quantification of uncertainty and risk through stochastics? Are current stochastic methods consistent with observed natural behaviours? Is there potential to improve current stochastic methods in hydrology? Can deterministic methods provide solid scientific grounds for water resources engineering and management? Are there physical upper and lower limits (deterministic envelopes) in extreme hydrological phenomena, such as precipitation and flood, whose determination could constitute the basis of hydrological design? Can there be risk-free hydraulic engineering and water management?
In the last decades, financial support for research in hydrology and water resources engineering and management has been strongly linked to research on climate, a practice that does not favour hydrology as an autonomous scientific discipline. Therefore, questions related to the interface and relationship between hydrology and climate are quite important: Is the current interface satisfactory? Should hydrology and water resources planning rely on climate model outputs? Are climate models properly validated? Is the meaning of model validation and prediction the same in hydrology and in climate modelling? Is the evolution of climate and its impacts on water resources deterministically predictable? With respect to the last questions, we can observe that climate modellers do not hesitate to offer arbitrarily long predictions, with time horizons from 2100 AD (Battisti and Naylor, 2009), to 3000 AD (Solomon et al., 2009), to 100 000 AD (Shaffer et al., 2009), to mention a few of the most recent publications in the most reputable journals. Given the definition of climate (detailed above) as a long-term average state, the behaviours illustrated by the toy model are quite relevant. The high Hurst exponents estimated in several instrumental and proxy climatic time series, especially for temperature (H > 0.90; Cohn and Lins, 2005; Koutsoyiannis and Montanari, 2007; Koutsoyiannis et al., 2009), support the view of most climate processes as persistent ones and, hence, far more unpredictable than a purely random process. This raises the question: Is there any indication that climate is predictable in deterministic terms?

Conclusions
The following summarizing questions can represent the conclusions of this article:
- Can natural processes be divided into deterministic and random components?
- Are probabilistic approaches unnecessary in systems with known deterministic dynamics?
- Is stochastics a collection of mathematical tools, unable to give physical explanations?
- Are deterministic systems deterministically predictable in all time horizons?
where $A := \{x;\ x \le (x, v)\}$ and $S^{-1}(A)$ is the counterimage of $A$, i.e. the set containing all points $x$ whose mappings $S(x)$ belong to $A$. Iterative application of the equation can determine the

* In some texts the two terms are used as synonymous, but distinction helps avoid ambiguity and misunderstanding.

† Stochastics < Greek Stochasticos (Στοχαστικός) < Stochazesthai (Στοχάζεσθαι) < Stochos (Στόχος); Stochos = target; Stochazesthai = (a) to aim, point, or shoot (an arrow) at a target > (b) to guess or conjecture (the target) > (c) to imagine, think deeply, bethink, contemplate, cogitate, meditate.

- 2 .
52 mm and a sample standard deviation σ = 209.13 mm. With these values we can obtain the complete density function for time i = 100, which is plotted in Figure 7 along with the results obtained by the Monte Carlo simulation, in which the deterministic dynamics was explicitly taken into account. It can be seen that the empirical result without considering the dynamics is a good approximation. Despite being empirical, this result and, more generally, the use of past data in prediction, can find a theoretical justification in the concept of ergodicity, * an important concept in dynamical systems and stochastics. By definition (e.g., Lasota and Mackey, 1994,

* Ergodicity < Greek εργοδικός < [ergon (έργον) = work] + [odos (οδός) = path].

is a function of the scale k ≥ 1, here referred to as the climacogram * and typically depicted on a double logarithmic plot. While the periodogram and the autocorrelogram are related to each other through a Fourier transform, the climacogram is related to the autocorrelogram by a simpler transformation, i.e.,

Figures

Figure 1: A caricature hydrological system for which the toy model was constructed.

Figure 2: Conceptual dynamics of vegetation for the caricature hydrological system.

Figure 5: Evolution of soil water x as in Figure 4 but with uncertainty in initial conditions: bold blue line corresponds to initial conditions $s_0 = 100$ mm, $v_0 = 0.30$ and the other five lines represent initial conditions slightly (< 1%) different.

Figure 6: Stream tube representation of the evolution of soil water x (with initial conditions as in Figure 4 and 1% uncertainty) in a deterministic setting using envelope curves: (upper) when only $x_0$ is observed; (lower) when $x_0$ to $x_{30}$ are observed.

Figure 7: Probability density functions $f_i(x)$ of the system state x (soil water) for times i = 0 and 100.

Figure 8: Stream tube representation of the evolution of soil water x (with initial conditions as in Figure 4 and 1% uncertainty) in a stochastic setting using Monte Carlo prediction limits for 95% probability of bracketing the true state between the stream tube "walls" (1000 simulation runs).

Figure 9: A sample of past data of soil water x for times i = −100 to −1.

Figure 10: Comparison of the RMS prediction error of soil water x for the deterministic prediction and the naïve statistical prediction.
