Hadcrut 3 was last updated in October 2012, so I assume it is still active.
I was looking at something else but updated Hadcrut 3 at the same time, then took a quick look. The innovation here is a novel presentation of annual data, shown for the first time, on which opinions are wanted.
As you can see this seems to highlight some curiosities about the Hadcrut data.
Firstly, what I am doing that is different.
I try to stay on the right side of Nyquist and Shannon, which is made very difficult by the universal violations before data is provided to the public. This is a bad situation.
In this case I am working from published gridded data, computing my own weighted means.
One solution to the “annual” data problem is to low pass at just over one year and then output with minimal additional sampling to create a Nyquist-meeting dataset, done here with 1/3rd-year sampling. I intend to go into this in more detail later, but I need to say a certain amount now.
Conventional/normal practice in field after field is to treat a time series like a bundle of wood, where any sequence will do if you want to know the mean length. This is wrong, as has been known for many years. It is a good way to make an organisation a laughing stock, such as the UK Met Office declaring 2011 CET was a very hot year when the people (correctly) knew otherwise; the answer is herein. (Neither was 2010 as cold as claimed.) Some concern ought to be expressed over financial data, and many other fields put corruptions into published results, where people follow the latest knee jerk of a data point.
Unfortunately it is not valid to take the mean of time-related items without taking into account when they occurred. This will literally give spurious results, and it does every day. In effect time has to be averaged too, with both history and the future taken into account. (The 2010/11 cold snap occurred in December, so the conventional 2011 mean didn’t see it; this spins off into the meaning of life and practicalities.)
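A toy illustration of the 2010/11 point, with made-up numbers (not real CET data): a cold snap placed in December 2010 is invisible to the conventional 2011 calendar-year mean, but any window centred on the year boundary sees it, because it takes both history and the future into account.

```python
import numpy as np

# 36 months of anomalies, 2010-2012, all zero except a -5 degC December 2010
monthly = np.zeros(36)
monthly[11] = -5.0

cal_2011 = monthly[12:24].mean()   # Jan-Dec 2011: the snap is invisible
centred = monthly[6:18].mean()     # Jul 2010 - Jun 2011, centred on new year

print(cal_2011)  # 0.0
print(centred)   # about -0.42: the snap now registers
```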
This is what competent low pass filters do. It could also be viewed as an extension of [data] sampling theory.
The correct approach is to remove all frequencies from the data above the Nyquist limit for the output sample rate. For annual data this is conventionally one sample a year, therefore the Nyquist limit is at two years, and therefore all frequencies faster than two years must be removed first. In practice this needs a guard band too.
I’ll show what that looks like in a later post; it is an alternative ploy.
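A minimal sketch of why this matters, using scipy rather than anything of the author’s: a fast cycle in monthly data aliases into a spurious slow wobble if you just pick one sample per year, but survives decimation harmlessly once it has been low passed below the two-year Nyquist limit (with a guard band).

```python
import numpy as np
from scipy import signal

t = np.arange(12 * 100) / 12.0            # 100 years of monthly samples
x = np.sin(2 * np.pi * t * (12 / 5))      # a fast cycle: 5-month period

# Naive "annual" series: every 12th sample. The 2.4 cycle/year signal
# aliases to a completely spurious 0.4 cycle/year wobble.
naive = x[::12]

# Proper route: low pass well below the 2-year Nyquist limit (cutoff here
# ~2.5 years, leaving a guard band), then decimate.
b, a = signal.butter(4, 0.4 / (12 / 2))   # fs = 12/yr, cutoff 0.4 cyc/yr
filtered = signal.filtfilt(b, a, x)       # zero-phase: no time shift
annual = filtered[::12]

print(naive.std())   # large: pure aliasing artefact
print(annual.std())  # near zero: the fast cycle was correctly removed
```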
In this case I low pass at just longer than one year, so it is annual as in a year at a time, but output every third of a year. Strictly, if you want to recreate the true analogue original, or even plot it, you should use a reconstruction filter; that is beyond here and not worth the effort with this kind of thing. In the real analogue world you bet you do, at least for pro stuff.
Given that eg. Hadcrut is provided as monthly (not a valid time series either), that is 12 samples a year. A minor problem appears: sampling evenly three times a year doesn’t fit. Enter skulduggery.
Oversample by two times, filter with the new 24 samples a year and decimate by 8: 24/8 = 3, but now there are samples at 1/6th, 3/6 and 5/6 of a year, with the twist that this is zero based whereas months are conventionally labelled one based (1, 2, 3…), modulo math. Now there are 6 months before and after half way.
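The skulduggery above can be sketched as follows. This is my reading of the recipe in scipy, not the author’s actual code; in particular the decimation phase (starting at index 4 of the 24-per-year grid, so output samples land at 1/6, 3/6 and 5/6 of each year) is my interpretation of the offset described.

```python
import numpy as np
from scipy import signal

# Toy monthly series: 20 years of a slow 10-year cycle
monthly = np.sin(2 * np.pi * np.arange(240) / 120)

# Oversample by two: duplicate each sample, giving 24 samples a year
x24 = np.repeat(monthly, 2)

# Low pass at just longer than one year (fs = 24/yr, cutoff ~1.05 yr)
b, a = signal.butter(4, (1 / 1.05) / (24 / 2))
smooth = signal.filtfilt(b, a, x24)

# Decimate by 8: 24/8 = 3 samples per year, phased so samples fall at
# 1/6, 3/6 and 5/6 of each year (my reading of the offset in the text)
thirds = smooth[4::8]

print(len(thirds))      # 60 samples: three per year for 20 years
```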
This proper method retains exact time information as we are about to see.
The data is restricted to one year in how fast it can move (with faster data automatically taken into account, not lost) but the when is retained. If a reconstruction filter is used there is essentially infinite time resolution.
Put another way, a reconstructed analogue signal has exact timing at any resolution and could, for example, be resampled at completely different times. (This is kind of how sample rate conversion is done.)
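For example, band-limited resampling onto a completely different time grid is a one-liner in scipy (a sketch, nothing to do with the author’s software): the polyphase resampler reconstructs the signal implicitly and reads it off at the new sample times.

```python
import numpy as np
from scipy import signal

# 4 years of monthly samples of a 1 cycle/year sine
x = np.sin(2 * np.pi * np.arange(48) / 12)

# Convert 12 samples/year to 5 samples/year: reconstruct and re-time
y = signal.resample_poly(x, up=5, down=12)

print(len(y))   # 20 samples: five per year for four years
```

Away from the edges, y matches the underlying sine evaluated at the new sample times, which is the point: timing is exact at any resolution.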
The Hadcrut 3 plots
Figure 1, the head image, doesn’t show time in detail; Figure 3 is provided for a close look.
I found Figure 1 a surprise, a reason for this post. What is going on 1888…1920 and 1972…2000? Why is hemispheric data pretty much synchronous sometimes and not at others?
Keep in mind the data is my computation from gridded, not the official series, to which statistical adjustments are apparently made.
Plot with annual grid. Now you can see the timing scatter.
Some surprises on what precedes what, or not at all. Why?
Comments on the methodology are welcome since figuring out a good method is important. What I am doing is one way probably of many.
Mistakes, quite possibly.
What do I know anyway.
Computation from gridded data to a mean is common to all gridded datasets here, with exact matches, some quantisation excepted, for eg. rss (some missing) and uah (no missing); it really seems to depend on whether the providers do post processing. The method used for all is a cosine weighted mean, ie. correct on area.
Oversampling can be done in one of two ways. In this case, duplicate the preceding sample into a new sample, then suitably low pass filter; here the final filter does this automatically. If you want to know more, it will be found in Signal Processing basics. The alternative is to zero fill and scale the result. If you are not convinced, you need hands on.
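A cosine weighted mean looks something like this. This is my reconstruction of the idea, not the author’s code; the 36 x 72 shape and 5-degree latitude rows match HadCRUT3’s grid layout, and missing cells are assumed to be NaN.

```python
import numpy as np

def weighted_mean(grid, lats):
    """Area-correct mean of a lat x lon grid.
    grid: 2-D array, one row per latitude band, NaN where a cell is missing.
    lats: latitude of each row in degrees."""
    # Weight each row by the cosine of its latitude (band area shrinks
    # towards the poles), broadcast across all longitudes
    w = np.cos(np.radians(lats))[:, None] * np.ones_like(grid)
    mask = ~np.isnan(grid)            # drop missing cells from the sums
    return np.nansum(grid * w * mask) / np.sum(w * mask)

# Example: a 5-degree grid, uniform 2.0 anomaly, one missing cell
lats = np.arange(-87.5, 90, 5.0)      # 36 band centres
grid = np.full((36, 72), 2.0)
grid[0, 0] = np.nan
print(weighted_mean(grid, lats))      # 2.0
```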
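The two oversampling routes mentioned, sketched on a trivial series (illustration only, not from the original):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Route 1: duplicate the preceding sample (sample-and-hold);
# the subsequent low pass filter smooths out the steps
dup = np.repeat(x, 2)     # [1, 1, 2, 2, 3, 3]

# Route 2: zero fill, then scale by the oversampling ratio so the
# filtered result comes out at the original level
zf = np.zeros(2 * len(x))
zf[::2] = x
zf *= 2                   # [2, 0, 4, 0, 6, 0]

# Either way, a low pass filter at the new rate finishes the job
```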
Data and plot data in OpenOffice .ods inside a .zip here (334k)
Post by Tim Channon