The co-moderator is raising an important topic for the Talkshop: real-world data and dimensionality, H. Such data might look normal as a distribution but is not normal in time, with scaling, self-similarity and perhaps maximum entropy at work.

The Hurst exponent.

And yet figure 2 looks normal enough. [as in statistical bell curve etc, given real data]

If you are familiar with this subject, please check I haven’t misled anyone and help out by adding detail.

I would be stupid to attempt a proper explanation of this subject so I am going to write some snippets and give some pointers for elsewhere.

The story of Harold Edwin Hurst has been expounded many times; a web search will return a vast array of information.

Try this, StatProb on Harold Edwin Hurst — “This is an instance of when a result is truly unexpected, is hard to comprehend, even by those best disposed to listen. New tools are not accepted without struggle.”

The head plot shows one of the most venerable datasets anywhere, the River Nile flood levels. It was work on this in Egypt which led Hurst to throw a spanner into the nice precise world of statistics, a result which to this day is widely disputed or ignored. And yet do you hear a great deal about flood or drought in Egypt? The Aswan High Dam works.

Figure 2 is a quick binning plot of the same data, frequency. I have put that there to demonstrate that looks are misleading. This could be taken as a plot of Gaussian noise… at your peril. Put it this way: tomorrow will not be the opposite of today. We get persistence of state: periods of dry, wet, hot, cold, wind, calm and so on. The “when” is not encoded in plain noise; once again time is an omitted dimension.
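To make the point about looks concrete, here is a minimal Python sketch. It builds a persistent series, shuffles it in time, and shows that a binning plot cannot tell the two apart while the lag-1 autocorrelation can. The AR(1) generator, its `phi` value and the function names are my assumptions for illustration; AR(1) is only a short-memory stand-in for persistence, not a true Hurst process, and this is not the Nile data.

```python
import random

def lag1_autocorr(x):
    """Lag-1 autocorrelation: how strongly each value resembles the previous one."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return cov / var

def persistent_series(n, phi=0.8, seed=1):
    """AR(1) series: each value keeps a fraction phi of the previous one (illustrative only)."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x

series = persistent_series(5000)

# Same values, but with the time ordering destroyed.
shuffled = series[:]
random.Random(2).shuffle(shuffled)

# Any histogram depends only on the sorted values, which are identical here.
same_histogram = sorted(series) == sorted(shuffled)
r_original = lag1_autocorr(series)
r_shuffled = lag1_autocorr(shuffled)

print("identical binning plot:", same_histogram)
print("lag-1 autocorrelation, original:", round(r_original, 2))
print("lag-1 autocorrelation, shuffled:", round(r_shuffled, 2))
```

The two series would give exactly the same figure 2 style plot; only the ordering in time differs, and that is where the persistence lives.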

River flow, the origin of Hurst’s work, and rainfall are tightly related.

Rainfall and weather are tightly related; H spreads all across natural systems.

When Slingo of the Met Office, amongst others, recently declared on UK rainfall, I somehow doubt they were speaking with skill but rather from a viewpoint of classical statistics. Critically, probability is different for quantities which deviate from classical random (H=0.5): the variance is wider. Has it really been so exceptionally wet? Has weather gone mad, with humans getting blamed of course?

I doubt it.

What I think is a good entry point to the work of Koutsoyiannis and the superb ITIA web site is his 2003 paper:

Climate change, the Hurst phenomenon, and hydrological statistics, Hydrological Sciences Journal, 48 (1), 3–24, 2003.

This includes the statistical maths for estimating probability taking H into account, which also produces the usual answer if H=0.5.
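For a rough feel of what taking H into account does, here is a sketch (not a reproduction of the paper). For a simple scaling process the standard error of the mean of n values goes as sigma / n^(1-H), which collapses to the familiar sigma / sqrt(n) when H=0.5. The function names and the example numbers are assumptions for illustration.

```python
def std_error_of_mean(sigma, n, hurst):
    """Standard error of the mean of n values from a simple scaling process.

    Uses sigma / n**(1 - H); with H = 0.5 this is the classical sigma / sqrt(n).
    """
    return sigma / n ** (1.0 - hurst)

def equivalent_sample_size(n, hurst):
    """Number of independent values carrying the same information about the mean.

    Solves sigma / sqrt(n_eq) = sigma / n**(1 - H), giving n_eq = n**(2 - 2H).
    """
    return n ** (2.0 - 2.0 * hurst)

# Example: 100 years of annual values with unit standard deviation.
for h in (0.5, 0.7, 0.9):
    se = std_error_of_mean(1.0, 100, h)
    n_eq = equivalent_sample_size(100, h)
    print(f"H={h}: standard error {se:.3f}, equivalent independent years {n_eq:.1f}")
```

On this reckoning a century of data with H=0.7 pins down the mean about as well as sixteen or so independent years would, which is the sense in which variance is wider.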

Head page for the copy of paper here http://itia.ntua.gr/en/docinfo/537/

There are many papers there and open discussions, some well known names appear. Link to publication list here

Tools

As a quick start here is a tool which computes H, a value which is an estimate. Given 10 different methods you will get 10 different answers.
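To show why different methods disagree, here is a minimal sketch of two classic estimators, aggregated variance and rescaled range (R/S), in plain Python. This is not the code inside the tool; the function names, block sizes and the white-noise test series are my assumptions. Both fit a straight line on a log-log plot, and both are known to be biased on short blocks.

```python
import math
import random

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def blocks(x, m):
    """Split x into consecutive non-overlapping blocks of length m."""
    return [x[i:i + m] for i in range(0, len(x) - m + 1, m)]

def hurst_aggvar(x, sizes):
    """Aggregated variance: variance of block means scales as m**(2H - 2)."""
    log_m, log_v = [], []
    for m in sizes:
        means = [sum(b) / m for b in blocks(x, m)]
        mu = sum(means) / len(means)
        var = sum((v - mu) ** 2 for v in means) / (len(means) - 1)
        log_m.append(math.log(m))
        log_v.append(math.log(var))
    return 1.0 + slope(log_m, log_v) / 2.0

def hurst_rs(x, sizes):
    """Rescaled range: mean R/S over blocks of length m scales as m**H."""
    log_m, log_rs = [], []
    for m in sizes:
        ratios = []
        for b in blocks(x, m):
            mu = sum(b) / m
            sd = math.sqrt(sum((v - mu) ** 2 for v in b) / m)
            cum, lo, hi = 0.0, 0.0, 0.0
            for v in b:
                cum += v - mu  # running sum of departures from the block mean
                lo, hi = min(lo, cum), max(hi, cum)
            if sd > 0:
                ratios.append((hi - lo) / sd)
        log_m.append(math.log(m))
        log_rs.append(math.log(sum(ratios) / len(ratios)))
    return slope(log_m, log_rs)

# White noise, for which the true value is H = 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
sizes = [8, 16, 32, 64, 128]

h_av = hurst_aggvar(noise, sizes)
h_rs = hurst_rs(noise, sizes)
print("aggregated variance H:", round(h_av, 2))
print("rescaled range H:", round(h_rs, 2))
```

Even on plain noise the two numbers will generally not match; R/S in particular tends to read high on short blocks, so treat any single figure for H with caution.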

A **Java** tool for local use (download and execute the jar file; works with the V7 JRE), which can estimate H using a variety of methods, is here

The download asks for details including an email address, but seems straight: “The SELFIS project is supported by the NSF CAREER grant ANIR 9985195, and DARPA award NMS N660001-00-1-8936, and NSF grant IDM IIS-0208950”

How about CET (Central England Temperature)? I use the monthly series, from which I’ve removed the annual cycle and centred the mean on zero.

A good result: all estimators agree. H=0.7 or so is typical for semi-persistent data such as Nile floods and a variety of other things. (If you want the data file, ask.) It looks as follows
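One common way to do that kind of preparation, sketched here as an assumption rather than a record of exactly what I did, is to subtract each calendar month's long-term mean. The series below is synthetic (a seasonal swing plus noise), not CET, and the function name is made up for the example.

```python
import math
import random

def remove_annual_cycle(monthly):
    """Subtract each calendar month's long-term mean.

    monthly: values in time order, assumed to start in January.
    Returns (anomalies, climatology); climatology holds the 12 removed monthly means.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(monthly):
        sums[i % 12] += v
        counts[i % 12] += 1
    climatology = [s / c for s, c in zip(sums, counts)]
    anomalies = [v - climatology[i % 12] for i, v in enumerate(monthly)]
    return anomalies, climatology

# Synthetic stand-in: 20 years, coldest in January, warmest in July, plus noise.
rng = random.Random(0)
monthly = [
    10.0 - 6.0 * math.cos(2.0 * math.pi * (i % 12) / 12.0) + rng.gauss(0.0, 1.0)
    for i in range(240)
]

anomalies, climatology = remove_annual_cycle(monthly)
print("mean of anomalies:", round(sum(anomalies) / len(anomalies), 6))
print("size of removed annual swing:", round(max(climatology) - min(climatology), 1))
```

Because every calendar month is centred separately, the fixed offset goes too and the overall mean ends up at zero without a further step. The `climatology` list is the removed annual cycle.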

CET as used

The result using what was removed from the monthly set, the annual cycle, shows patchy estimator performance, but H is near zero, which indicates a simple cyclic data pattern. (CET data has a fixed offset, so that might mess up the estimation.)

Climatic community

Certain people don’t like this stuff one bit. They wouldn’t, if it says natural variation is much wider than the classic view: it becomes difficult to justify humans doing much to earth systems.

Many blogs have touched on this subject, chiefio, WUWT, etc., providing many interesting takes. A list here would be nice (Koutsoyiannis links to some where he takes part).

I’ve not touched Mandelbrot’s work on this one. (I have some material about this here on paper, long forgotten; the detail didn’t register with me at the time. I think it needs real context before the penny drops.)

Mistakes are mine. Please help improve this article and the understanding hereabouts.

Post by Tim Channon

This may help clarify things:

“The Hurst exponent is referred to as the “index of dependence,” or “index of long-range dependence.” It quantifies the relative tendency of a time series either to regress strongly to the mean or to cluster in a direction. A value H in the range 0.5 < H < 1 indicates a time series with long-term positive autocorrelation, meaning both that a high value in the series will probably be followed by another high value and that the values a long time into the future will also tend to be high. A value in the range 0 < H < 0.5 indicates a time series with long-term switching between high and low values in adjacent pairs, meaning that a single high value will probably be followed by a low value and that the value after that will tend to be high, with this tendency to switch between high and low values lasting a long time into the future. A value of H=0.5 can indicate a completely uncorrelated series, but in fact it is the value applicable to series for which the autocorrelations at small time lags can be positive or negative but where the absolute values of the autocorrelations decay exponentially quickly to zero. This in contrast to the typically power law decay for the 0.5 < H < 1 and 0 < H < 0.5 cases.”
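For anyone who prefers numbers, here is a small sketch of the autocorrelation for one standard model, fractional Gaussian noise, where the correlation at lag k is 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)). Choosing that model, and the function name, are assumptions for illustration; real data need not follow it.

```python
def fgn_autocorr(k, hurst):
    """Autocorrelation at lag k for fractional Gaussian noise with exponent hurst."""
    k = abs(k)
    if k == 0:
        return 1.0
    p = 2.0 * hurst
    return 0.5 * ((k + 1) ** p - 2.0 * k ** p + (k - 1) ** p)

# Compare anti-persistent, uncorrelated and persistent cases at short and long lags.
for h in (0.3, 0.5, 0.7):
    values = ", ".join(f"lag {k}: {fgn_autocorr(k, h):+.4f}" for k in (1, 10, 100))
    print(f"H={h}: {values}")
```

With H=0.5 every lag comes out as zero, with H=0.7 the correlation is positive and still not negligible at lag 100, and with H=0.3 it is negative. In this particular model the negative correlation for H below 0.5 holds at every lag rather than alternating in sign.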

Good idea putting the same thing in different ways.

I have written an entire book on this topic and much more:

Fractal and Diffusion Entropy Analysis of Time Series: Theory, concepts, applications and computer codes for studying fractal noises and Lévy walk signals

http://www.amazon.com/Fractal-Diffusion-Entropy-Analysis-Time/dp/3639257952/ref=sr_1_2?s=books&ie=UTF8&qid=1360286207&sr=1-2

The risk of writing in public: there are always people who know far more.

Not only does Dr Scafetta’s book look very informative, it’s also a sound investment:

Order within 17 mins, and choose One-Day Shipping at checkout.

Details

10 new from $98.64 4 used from $136.22

Old copies are worth more than new ones!

Scafetta’s work is like fine wine, it’s worth more the older it gets. ;)

Tim, the axis labelling is very unclear, what is the frequency of the peak in figure 5?

Also what does the same processing method give for the daily time series?

Tim, your post is great and I thank you. You are absolutely right that the Hurst ideas are omitted or neglected in the (hydro)climatic community. Some hydrologists know about Hurst’s contribution but very few climatologists have heard about it, let alone know about its relevance and importance in climate. It is characteristic that Wikipedia has just three lines about Harold Edwin Hurst (http://en.wikipedia.org/wiki/Harold_Edwin_Hurst ). Trying to correct this, we are organizing an event about the life and legacy of Hurst. It will take place on Kos island, Greece, next October (http://kos2013.org/HarlodEdwinHurst_RoundTable.html ). It will be chaired by the renowned British hydrologist John Sutcliffe, who succeeded Hurst in Nile studies. Hopefully, Stephen Hurst (son) will also participate and talk about his father.

Thank you for dropping in Demetris.

I find it surprising if a professional is unaware of part of their field, yet I can think of other critical things most don’t know, although to be fair I know little. Maybe looking in from outside is easier.

Kos in October :-)

Greg,

The scales are unimportant, apt pictures.

The peak the probability curve, zero. (you did ask ;-))

[ added a note I hope makes the nature of figure 2 clear ]

Tim: “The scales are unimportant, apt pictures. The peak the probability curve, zero. (you did ask ;-))”

When I look at a graph, I do not consider I am looking at a “picture”. The first thing I do is to look at the axes to see what the graph means.

Fig 5 periodogram would look like red noise if that peak is at zero period but it certainly is not zero on that graph. Neither the red axes nor the x-axis label indicate anything like zero. So should we conclude that the scales that are produced by this Java thingy are completely unreliable?

The other panels are also unclear but look more credible. Looks like a bug in the software drawing the graphs.

Perhaps a note to that effect would be useful, if that is the case.

“Figure 2 is a quick binning plot of the same data, frequency. I have put that there to demonstrate that looks are misleading. This could be taken as a plot of Gaussian noise… at your peril. ”

Well, there certainly seems to be a clear underlying Gaussian and that is probably because there is a lot of random noise in the data.

What would be of interest are the deviations from that form. In particular, there are several clear peaks and a notable notch at “1” (one year?), and also the strongest peak at -0.5 (freq of -0.5??).

Again, it’s pretty unclear what the poorly labelled graphs represent.

I deleted the labels etc., maybe a mistake, maybe not, done now.

Taking the matter of Hurst forward could be done using a concrete example. I think that would be good but it needs someone brave enough to produce conventional statistics where they need to be correct.

What I have in mind is choosing a well-known dataset, preferably temperature because that is rarely associated with Hurst.

Conventional stats are produced using that data, including the probabilities. If these results appear to be normal in the sense of nothing unexpected, good.

In addition, the Hurst statistics are produced, including the probabilities as described by Koutsoyiannis, which I assume will be distinctly different.

We then have a good talking point and example.
