[UPDATE: I have made a mistake. Although I had included a note of caution, something looked wrong. On looking closely at the old file which is the source for this analysis, it is for Earth, not the Sun. The basics will be similar. See the comments; we are going to carry on because what has appeared is intriguing, and in the process a rework on other versions of the data will be much quicker and easier. : Tim]
Recently on the Talkshop a discussion has started about Sol and gravity forces. I stepped up in case I could help. I think this is where the recent discussions commenced.
Story goes that some time ago I went down the route of attempting a computation but the result didn’t seem interesting in relation to whatever I was doing so I dropped the work.
Somewhat boring plot. Devil in the detail?
We don’t know whether the z-axis is a key to solar activity patterns; it has been suggested that the analysis ought to be aligned to the solar rotation axis, which is not the standard reference frame.
Whilst that is being resolved, here is work done aligned to the standard plane. There is not a large difference in alignment so it ought to be similar.
Revisiting the GB of files in the solar archive I came across what looks like the final result, so large it brings my computer to its knees, knobbly they are. (Runs out of RAM; more about inefficient programs than actual need.) Sorry folks, not putting up this data, it is too large.
The file holds 10-day data running, apparently, from a Thursday. Tomorrow is TGIF, yet what did folks do on 1st January 1400?
Here is the start date: Julian day 2232407.5.
Have fun at http://www.fourmilab.ch/documents/calendar/
Dataset ends 2030.
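For anyone who wants to check the date without the calendar converter, a minimal Python sketch (just the textbook Meeus conversion, Julian-calendar branch, since 1400 predates the Gregorian reform) confirms a Thursday, 1st January 1400:

def jd_to_julian_calendar(jd):
    # Meeus, "Astronomical Algorithms" ch. 7, Julian-calendar branch
    z = int(jd + 0.5)                    # day count at the preceding midnight
    f = jd + 0.5 - z                     # fractional day
    b = z + 1524
    c = int((b - 122.1) / 365.25)
    d = int(365.25 * c)
    e = int((b - d) / 30.6001)
    day = b - d - int(30.6001 * e) + f
    month = e - 1 if e < 14 else e - 13
    year = c - 4716 if month > 2 else c - 4715
    return year, month, day

def jd_weekday(jd):
    names = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]
    return names[int(jd + 1.5) % 7]

print(jd_to_julian_calendar(2232407.5))   # (1400, 1, 1.0)
print(jd_weekday(2232407.5))              # Thursday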
I don’t remember the details, but it looks as though I computed the gravitational force on the Sun in XYZ format based on the standard astronomical plane. What the current discussions want is data on the solar plane. Never mind, let’s look at what is there.
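I don’t have the original program to hand, but the underlying calculation is only Newton’s law summed over the planets. A minimal sketch, assuming heliocentric XYZ positions and masses are available from an ephemeris (the function and variable names here are mine for illustration, not from the original work):

import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def net_gravitational_force(body_pos, body_mass, perturbers):
    # Sum of Newtonian forces on one body from a list of (position, mass) pairs.
    # Positions are XYZ vectors in metres in whatever frame the ephemeris uses
    # (here assumed to be the standard ecliptic reference frame).
    total = np.zeros(3)
    for pos, mass in perturbers:
        r = pos - body_pos                                   # vector towards the perturber
        total += G * body_mass * mass * r / np.linalg.norm(r) ** 3   # F = G m1 m2 / r^2, along r
    return total                                             # XYZ force vector, newtons

Repeating this at every 10-day ephemeris step gives a three-column time series of the kind analysed here.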
What follows might well follow cockup theory, so please don’t be too hard on me; I’m doing my best.
Figure 1 has been decimated to one point per year for plotting. (For pedants: low-passed at just over 2 years and end-corrected by my own method.)
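For the curious, the decimation can be reproduced with something like the following (a generic scipy sketch, not my actual code; the end correction is handled differently there, and `data` stands for the raw 10-day series):

import numpy as np
from scipy.signal import firwin, filtfilt

dt_days = 10.0
fs = 1.0 / dt_days                        # samples per day
cutoff = 1.0 / (2.2 * 365.25)             # cycles per day, i.e. just over 2 years

taps = firwin(585, cutoff, fs=fs)         # windowed-sinc low-pass FIR
smoothed = filtfilt(taps, 1.0, data)      # zero-phase filtering of the raw series

step = int(round(365.25 / dt_days))       # roughly one point per year
plot_series = smoothed[::step]            # thinned copy used only for plotting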
Let’s take a look at the periodic structure. What follows uses the full data as input: 23,000 floating-point samples.
Figure 2 is my quick look (default settings), and it looks wrong. It is an octave chirp-z transform from 0.5 y through 500 y, windowed with Kaiser-Bessel beta=3, on a logarithmic Y-axis.
Assuming few readers are particularly familiar with it: windowing is important because it suppresses fictitious results arising from the input data having a finite length; the record has to be tapered off as best we can, which is a compromise.
The width of the lines (peaks) varies because, unavoidably, the time and hence period resolution degrades towards the right-hand side, limited by the dataset length. It becomes vague. I have a way of working around this; more on that later.
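For readers without such a tool, here is a rough Python sketch of the same idea: not the octave chirp-z code used for figure 2, but a plain windowed DFT evaluated at log-spaced periods, which shows the same peaks (`dt_years` would be 10/365.25 for this dataset):

import numpy as np
from scipy.signal.windows import kaiser

def log_period_spectrum(x, dt_years, p_min=0.5, p_max=500.0, n_points=2000, beta=3.0):
    n = len(x)
    xw = (x - x.mean()) * kaiser(n, beta)           # Kaiser-Bessel window, beta = 3
    periods = np.geomspace(p_min, p_max, n_points)  # log-spaced periods in years
    t = np.arange(n) * dt_years
    spectrum = np.array([abs(np.sum(xw * np.exp(-2j * np.pi * t / p))) for p in periods])
    return periods, 20.0 * np.log10(spectrum / spectrum.max())   # dB relative to the peak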
We can see the main planets Jupiter, Saturn, Uranus and Neptune. (Pluto is tiny and omitted from the data.) The problem is the large amount of fast hash, which ought not to be there.
So far as I can tell I have done nothing wrong, everything is done to considerable precision and at good time resolution.
The solar system ought to be a linear system, and therefore without significant interaction between planets, yet this implies there is an enormous amount of something else going on, or, more likely, that something is wrong with the data or my processing.
I recognise clues in the spectra that there is more hidden, and so on to figure 3.
Figure 3 is the same as figure 2 but uses a Blackman-Harris -92 dB window (probably a nice demonstration of the difference), which has good vertical (amplitude) resolution but worse frequency resolution. (The previous window is the better of the two at separating closely spaced frequencies.)
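The trade-off between the two windows is easy to see by looking at the windows themselves: a heavily zero-padded FFT of each shows the main-lobe width (frequency resolution) against the sidelobe floor (how far down small peaks remain visible). A quick sketch:

import numpy as np
from scipy.signal.windows import kaiser, blackmanharris

n, pad = 1024, 1 << 18
for name, w in [("Kaiser beta=3", kaiser(n, 3.0)),
                ("Blackman-Harris", blackmanharris(n))]:
    amp = np.abs(np.fft.rfft(w, pad)) / w.sum()
    spec_db = 20 * np.log10(np.maximum(amp, 1e-12))
    # the first local minimum marks the edge of the main lobe
    null = np.argmax((spec_db[1:-1] < spec_db[:-2]) & (spec_db[1:-1] < spec_db[2:])) + 1
    print(f"{name}: peak sidelobe {spec_db[null:].max():.1f} dB, "
          f"main-lobe half-width {null * n / pad:.2f} bins")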
We can now see the previously missing ~19.8 y dominant gravitational peak, at least as it appears in the normal orbital plane.
I still don’t believe this is real.
What I can do is extract items using other software, a slow, by-hand process. This gives exact phase and amplitude. Of course these might be random terms.
Any suggestions, anyone? Might this be noise, or artefacts originating in the integrated polynomial approximations of the ephemeris data? Am I looking too deeply into a junk layer?
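For what it is worth, the extraction itself is nothing exotic; the core of it is a linear least-squares fit of a cosine and sine at the chosen period, which yields amplitude and phase directly. A minimal sketch (my analyser software is different, but it does this job term by term):

import numpy as np

def extract_term(t, x, period):
    # Fit a*cos(wt) + b*sin(wt) + offset by linear least squares,
    # then convert to amplitude and phase in the R*cos(wt + phi) convention.
    w = 2.0 * np.pi / period
    design = np.column_stack([np.cos(w * t), np.sin(w * t), np.ones_like(t)])
    (a, b, _), *_ = np.linalg.lstsq(design, x, rcond=None)
    return np.hypot(a, b), np.arctan2(-b, a)      # amplitude, phase (radians)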
I’m going to publish now and might update later with some details.
I’ve successfully run the analyser software on the 10-day data, simple enough but needing various tricks to get useful results. This is not about cheating.
A direct attack on the longer-period primary problem would be thwarted by the fast noise, which is not of particular interest at the moment. I could force things by hand, but an easier route is to play tricks: low-pass the data at 2 years, removing the troublesome content, and use that as input data. This is undone later.
I chose 13 terms as reasonable in this case, though that is probably more than is wise. Apart from a few nudges to speed things up (this is a large dataset) it was just a matter of waiting. (A human can see slow progress towards an obvious answer and give it a kick to get on with the next stage; it self-corrects anyway.)
I then took a deep breath and reverted to the original data as input, hoping the software would simply accept it instead of spotting the change and jumping to the faster data. This worked, and a minor refinement took place, probably so tiny it doesn’t matter.
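In outline the two-stage trick looks like this (a sketch only, not the analyser itself; `t`, `data`, `smoothed` and `candidate_periods` stand for the time axis, the raw series, the 2-year low-passed series and the 13 chosen periods):

import numpy as np

def fit_fixed_periods(t, x, periods):
    # Linear least squares for an offset plus amplitude/phase at fixed periods.
    cols = [np.ones_like(t)]
    for p in periods:
        w = 2.0 * np.pi / p
        cols += [np.cos(w * t), np.sin(w * t)]
    design = np.column_stack(cols)
    coeffs, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coeffs, design @ coeffs                 # coefficients, fitted model

# Stage 1: choose the periods and get first estimates against the low-passed data.
coeffs_lp, model_lp = fit_fixed_periods(t, smoothed, candidate_periods)

# Stage 2: keep the same periods but re-fit against the original, unfiltered
# series -- the minor refinement mentioned above.
coeffs_full, model_full = fit_fixed_periods(t, data, candidate_periods)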
A warning on what follows: there is an irresolvable technical problem. I cannot plot the full-bandwidth data without producing ridiculously long plots (it would otherwise alias), therefore I have to use the 2-year low-pass filtered data for display.
Figure 4 visually quantifies the dominance of the first three terms of the Fourier decomposition.
I don’t know what units came out of the original work so I can’t give them.
The plot uses a common y-axis scale and shows the original data, filtered at 2 years for display (in blue), overlaying the 3-term model (in red). In addition, one subtracted from the other, the residual, is plotted (in green).
A detail mentioned here for clarity, but also an aside of interest: at the very ends of the plot you can see the error caused by the 2-year low-pass filter end correction being imperfect. This error is in the display version of the data, not the final result. (I can’t meaningfully plot 22,000 points.) Whilst an aside, this is visual proof of the nonsense widely believed in science, the everyone-knows about transversal filter length, used as an excuse for using a moving average (the shortest possible filter) while at the same time ignoring the major artefacts such a poor filter adds to the data. The filter here was 585 points long, about 16 years of data. It is the filter characteristic which actually matters, and that has no simple relationship to length.
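To make the filter-length point concrete, here is a quick check anyone can run (a sketch, not the filter actually used; it simply compares a windowed-sinc FIR with a moving average of roughly the same cut-off and reports how much stopband content each lets through):

import numpy as np
from scipy.signal import firwin, freqz

dt_days = 10.0
cutoff = 1.0 / (2.0 * 365.25)                    # ~2-year cut-off in cycles/day

fir = firwin(585, cutoff, fs=1.0 / dt_days)      # proper windowed-sinc low-pass
ma = np.ones(73) / 73                            # ~2-year moving average (73 x 10 days)

for name, taps in [("585-tap FIR", fir), ("73-point moving average", ma)]:
    w, h = freqz(taps, worN=4096, fs=1.0 / dt_days)
    stop = np.abs(h[w > 2 * cutoff])             # response well above the cut-off
    print(name, "worst stopband leakage:",
          round(20 * np.log10(stop.max()), 1), "dB")

It is the shape of that response, not the number of taps, which decides how much unwanted content survives the filter.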
Figure 5 is the magnified residual shown in figure 4 with 2 years knocked off each end.
Figure 6 is figure 3 with the slower dominant terms removed. Cancellation is actually much greater than it appears, limited more by the shortcomings of the DFT producing this plot than by the removal itself. Note it is 20 dB (10x) per scale division.
It might be that there are more terms within each discrete spectral line, but I have only cancelled one per line. Proving this would take a lot of detailed work, a waste of effort.
Figure 7 is the residual (same scaling as fig 6) with the entire model subtracted. Keep in mind that fast data below 2 years is excluded.
As a measure of the match, R-squared is not particularly apt; RMSD is more useful but is not meaningful outside a specific context. For the whole model against the 2-year low-passed data, r² = 0.9999; for the full data including the high-frequency content, r² = 0.96.
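For the record, the two measures quoted are just the usual definitions (`data` and `model` here being whichever pair of series is compared in each case):

import numpy as np

def r_squared(data, model):
    ss_res = np.sum((data - model) ** 2)          # residual sum of squares
    ss_tot = np.sum((data - data.mean()) ** 2)    # total sum of squares
    return 1.0 - ss_res / ss_tot

def rmsd(data, model):
    return np.sqrt(np.mean((data - model) ** 2))  # root-mean-square deviation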
earth-grav-2012-a.zip (size 20k, contains two unexpanded spreadsheets, ods and xls, plus instructions as .txt, should be portable)
Anything else needed?