A frequent question in recent years is whether sunspot activity is being exaggerated by including very small marks in the count. The question implies that counting methodology has been inconsistent over time.
Other recent work on solar data puts me in a good position to approach an answer quickly; all the resources are already on disk here.
The quick answer is: Not obviously.
I invite further investigation; the data used here is provided.
A reasonable way to address this problem is to compare total sunspot area with sunspot count, on the basis that counting many nothings adds no area. If you dispute the detail, please check the raw data; I haven't.
Dr David Hathaway now maintains the NASA/Greenwich solar dataset, after the British decided not to bother funding something important that they saw as unimportant (yes, politics; I do not like these people, who are neither technical people nor scientists). The project is a labour of love without proper funding. It is the source used to create the butterfly diagrams.
“The Royal Greenwich Observatory data has been appended with data obtained by the US Air Force Solar Optical Observing Network since 1977. This newer data has been reformatted to conform to the older Greenwich data…”
Web page “The Sunspot Cycle” on NASA servers.
The dataset, which runs from 1874, is not useful as a time series without major post-processing, for which I wrote code some time ago. In this case I used a data extract of sunspot area for north and south, processed to daily values. (In the spreadsheet I didn't update the data to current, so a month or so is missing at the end.)
Daily data over 100+ years cannot be plotted directly, so I decimated it to a more reasonable monthly series via signal processing (trivial to do here; I am not set up to do the pocket calculator method). This is a low-pass filter at 60 days, then picking off monthly values at the right time offset. The filter will ring slightly on this data, which is unimportant. End correction is used. (Ask if you need to know.)
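For anyone wanting to try the decimation themselves, here is a minimal sketch of the idea, not the code I actually used. It assumes a Hann-windowed sinc filter with a 60 day cutoff period and a month of 30.4375 days; the function names, the filter length and the 15 day offset are all made up for illustration. The end handling shown (truncate the filter at the ends and renormalise) is only a simple stand-in and is not the end correction mentioned above.

```python
import math

def lowpass_kernel(cutoff_days=60.0, half_width=90):
    """Hann-windowed sinc FIR kernel. Cutoff is a period in days (one sample per day).
    The kernel length (2 * half_width + 1) is an illustrative choice."""
    fc = 1.0 / cutoff_days  # cutoff frequency in cycles per day
    taps = []
    for k in range(-half_width, half_width + 1):
        x = 2.0 * fc * k
        sinc = 1.0 if k == 0 else math.sin(math.pi * x) / (math.pi * x)
        hann = 0.5 * (1.0 + math.cos(math.pi * k / (half_width + 1)))
        taps.append(2.0 * fc * sinc * hann)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at zero frequency

def filter_at(daily, kernel, centre):
    """Filtered value at index `centre`. Near the ends the kernel is truncated
    and renormalised (an assumed, simple form of end handling)."""
    half = len(kernel) // 2
    acc = 0.0
    weight = 0.0
    for j, w in enumerate(kernel):
        i = centre + j - half
        if 0 <= i < len(daily):
            acc += w * daily[i]
            weight += w
    return acc / weight

def decimate_monthly(daily, step_days=30.4375, offset_days=15.0):
    """Low-pass the daily series and pick off one value per (average) month."""
    kernel = lowpass_kernel()
    out = []
    t = offset_days
    while int(t + 0.5) < len(daily):
        out.append(filter_at(daily, kernel, int(t + 0.5)))
        t += step_days
    return out
```

Calling decimate_monthly on the daily total area list gives one value per month. Only the filter output at the wanted sample points is computed, which keeps it cheap even in plain Python.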
And then we look
I hope those line up. Total is the simple sum north + south.
Putting the SSN plot next to the total area plot makes the comparison easier.
Normalising this kind of data is unlikely to work perfectly but is passable. I used RMS and standard deviation on both datasets as a cross-check, because the intent was to move the area data to SSN scaling. The processed SSN dataset matches the original, so the cross-check was successful.
Figure 3 does show some defects, a zero offset for example, but these are minor. In broad terms the match is good. A better method would be to decimate the daily SSN data so that both datasets were filtered in the same way, but I decided against this to avoid arguments over validity. For illustration, Figure 4 following shows the minor difference if that were done; it is not used here.
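A rough sketch of one way to do this kind of scaling follows. It is my reading of the idea rather than the spreadsheet formulas: compute one scale factor from the RMS ratio and another from the standard deviation ratio, and see whether they agree. The function names are hypothetical, and the population standard deviation is an assumption.

```python
import math
import statistics

def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def scale_factors(area, ssn):
    """Two candidate factors for moving area onto SSN scaling.
    If they differ a lot, a zero offset in one series is a likely cause."""
    by_rms = rms(ssn) / rms(area)
    by_std = statistics.pstdev(ssn) / statistics.pstdev(area)
    return by_rms, by_std

def rescale(area, factor):
    """Apply a single multiplicative factor to the area series."""
    return [a * factor for a in area]
```

The RMS is sensitive to the mean level while the standard deviation is not, so close agreement between the two factors is a cheap sanity check before overlaying the plots.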
The plot overlay uses different trace widths. There are subtle differences in detail, primarily from artefacts in the standard method: leakage because of lack of spatial control. Signal processing has other minor, insuperable problems.
Figure 1, repeated below, is the simple subtraction of the two normalised datasets. There is no plainly obvious difference. This shows that area and count are essentially identical and could proxy each other.
A more detailed examination I leave to any interested readers.
An XLS is here, in portable 97/xp/2000 format, 9M. Plots are left in but won’t survive into other packages; delete and recreate as you wish.
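The subtraction itself is trivial, but for anyone redoing it outside the spreadsheet, a minimal sketch is below. It assumes two monthly lists of equal length, already on the same scale; the function names are hypothetical and the summary numbers are my own suggestion, not something computed for this post.

```python
import math

def residual(ssn, scaled_area):
    """Month by month difference: count minus (rescaled) area."""
    if len(ssn) != len(scaled_area):
        raise ValueError("series must have the same length")
    return [s - a for s, a in zip(ssn, scaled_area)]

def summarise(diff):
    """Mean and RMS of the residual as a crude numeric summary."""
    mean = sum(diff) / len(diff)
    spread = math.sqrt(sum(d * d for d in diff) / len(diff))
    return mean, spread
```

A mean well away from zero would point at a zero offset. If small marks really were inflating the recent count, I would expect the residual to drift positive over the later years rather than scatter around zero.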
Post by Tim Channon, co-moderator