Long-Term Climate Change: What Is A Reasonable Sample Size?

In WUWT Tim Ball asks a simple question:

Long-Term Climate Change: What Is A Reasonable Sample Size?

And the answer is fairly simple.
For a reasonable degree of certainty (~75% – but see end), one needs around 10x as much data as the length of the period whose trend is being assessed – and all the data must be from one homogeneous source. So, e.g. in order to assess whether the last century was abnormal, we need around a millennium of data. In order to assess whether the 1970-2000 warming was abnormal, we must compare the 1970-2000 trend in CET (the Central England Temperature record) with the last 300-350 years in CET. And the longest period in the raw instrumental dataset of ~160 years that can be assessed for abnormality would be around 16 years.
So, if you hear anyone say “the pause is ‘abnormal’” – that might be supportable (but is not). If, however, they say “the last 30 years is abnormal” or even “the last 100 years” – they are either stupid, fraudulent or insane.
However, it all falls apart if we start comparing apples with cheese: tree rings for 1000 years with bogus fraudulent surface data for 30 years. Because, for example, if we look at what kind of change is normal over the last 1000 years in the tree ring data, then we must compare it with the same data for the last 100 years. But in contrast, the whole “hide the decline” scandal was that not only weren’t they comparing tree ring data with tree rings, but they knew that the tree ring data showed entirely the opposite trend from the one they were claiming to be “abnormal”. With fraudulent behaviour there is no way to make their assertions credible by merely adding to the (bogus) data.
However, as 10x the data is problematic when we need quicker indications, I would suggest that we can get a “more likely than not” indication from 3x the period. But now it is critical that those doing the assessment come from the right background (which means tried-and-tested engineering and not woolly, pc, panic-stricken-by-any-change academia).
So, e.g. with 160 years of data, we (engineers) can start saying with a modest certainty that if the last 50 years showed warming that had not been seen before in the last 160 years, then something was odd. But, just to show how ridiculous that assertion would be, even using the bogus upjusted data, the 1970-2000 period shows the same warming as 1910-1940. So, there is no indication of any abnormality with the global temperature (despite the known upjusting which in itself tells us just how normal the present period is – that even fraudulent changes can’t change it enough to make it abnormal).
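
The arithmetic behind those figures is easy to check. The snippet below is only a sketch of the rule of thumb as stated above; the function names are made up for illustration, and the 10x and 3x multiples are the ones suggested in this post, not values taken from any statistical standard.

```python
def longest_assessable_period(record_years, multiple=10.0):
    """Longest trend window (in years) that a record of the given length
    can assess under the 'N times the data' rule of thumb."""
    if record_years <= 0 or multiple <= 0:
        raise ValueError("record_years and multiple must be positive")
    return record_years / multiple


def required_record(period_years, multiple=10.0):
    """Length of homogeneous record (in years) needed to assess a trend
    over the given period under the same rule of thumb."""
    if period_years <= 0 or multiple <= 0:
        raise ValueError("period_years and multiple must be positive")
    return period_years * multiple


# The ~160-year instrumental record, under the 10x and the looser 3x rule.
for label, multiple in (("10x", 10.0), ("3x", 3.0)):
    window = longest_assessable_period(160, multiple)
    print(f"{label}: 160 years of data supports a window of about {window:.0f} years")

# A century needs about a millennium; 1970-2000 needs about 300 years.
print(required_record(100), required_record(30))
```

So 160 years of data gives roughly 16 years at 10x and roughly 50 years at 3x, which is where the numbers in the text come from.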

The rationale for long periods of data:

Until we know what is normal we cannot know what is abnormal

To take a simple example, we have two flight computers on a spacecraft: one says “full throttle”, the other “cease throttle”. How do we decide which is correct? The answer is that unless we have additional information there is no way to even guess which is correct.

If, however, we have three computers, two saying “cease throttle” and one saying “full throttle”, then, all other things being equal and assuming the computers fail independently, if the chance of any one computer being wrong is p, the chance of two being wrong is p².

So, the chance of two computers being wrong as opposed to just one is p²/p = p. So as p<1, it’s more likely that the minority is abnormal. (Strictly, once we include the chance that the remaining computers are right, the ratio becomes p/(1−p), so the conclusion holds for any p below ½.)
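
To put numbers on the three-computer argument, here is a small sketch. It assumes (my assumptions, not anything from a real flight system) that the computers fail independently with the same probability p and that a failed computer issues the opposite command. The function name is hypothetical.

```python
def lone_vs_pair_wrong(p):
    """Return the probabilities of the two ways a 2-against-1 split can arise.

    lone_wrong: the single dissenting computer is wrong, the agreeing pair is right.
    pair_wrong: both agreeing computers are wrong, the dissenter is right.
    Assumes independent failures, each with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    lone_wrong = p * (1.0 - p) ** 2
    pair_wrong = p ** 2 * (1.0 - p)
    return lone_wrong, pair_wrong


for p in (0.01, 0.1, 0.3):
    lone, pair = lone_vs_pair_wrong(p)
    # The ratio lone/pair simplifies to (1 - p) / p.
    print(f"p={p}: lone wrong {lone:.4f}, pair wrong {pair:.4f}, ratio {lone / pair:.1f}")
```

Under these assumptions the minority is the likelier culprit whenever p is below one half, and the advantage grows quickly as the computers get more reliable.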

Likewise, at the very least, we need three centuries/decades of data to even start guessing which century/decade is “abnormal”.

So, why 10x the length of data? The reasons are many:

The rule of 10

In essence, this rule simply means we need an awful lot more data than we think we need by “academic” statistics – because the real world is full of real people who just don’t think in the way needed for “academic” statistics to be valid.
The biggest problem is that we usually start looking at data when something “odd” appears. (Or, to put it another way, we ignore data where nothing odd appears.) And, by pure chance, if we continue to monitor data for long enough or from enough different sources, sooner or later, by pure fluke, we will see an odd “event”.
As such, when we start assessing the risk of something like “climate change” we are not just picking data at random. Instead we have already “cherry picked” a period which appears odd. So, by pure probability, if we monitored 100 metrics, one of those 100 should show a signal that only occurs 1/100 of the time in that metric. In that case, we would need 100x the data before that 1/100 signal would be within a sample where it was likely to occur. (But even then another such event should have occurred, so there is twice the probability of this event than would occur by pure chance, just because we only focussed on something that appeared to be a problem!)

But the untrained human factor gets worse

But, even with simple data, there are so many ways to take the same metric and suggest “abnormality”. Taking temperature, we can for example look for “hottest” and “coldest” (2). Also “fastest warming” and “fastest cooling” (2). Then we have the possibilities of turning points (2) and cycles (>2)** – all of which can be construed as “odd”. So, even with simple data, there are around 10 different ways to see something “odd”.
So, quite contrary to what the statistics supposedly suggest, it is actually “normal” to see a 1-in-100 year event in a decade of temperature. It is also normal to see a 1-in-1000 year event in a century of data. So, if you are just looking for something “odd” in around 10 different metrics, the chances are you will see a 1-in-100 year “event” every decade and a 1-in-millennium “event” every century!!
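
A rough check of the decade and century figures, as a sketch only. It assumes that each of the ~10 ways of looking is an independent metric and that each year is an independent draw. Neither is strictly true of real temperature data (hottest and coldest come from the same series, for a start), so treat the output as an order-of-magnitude guide. The function names are made up for illustration.

```python
def expected_odd_events(return_period_years, record_years, n_metrics=1):
    """Expected number of '1-in-N year' events across all metric-years examined."""
    return record_years * n_metrics / return_period_years


def chance_of_at_least_one(return_period_years, record_years, n_metrics=1):
    """Chance of at least one '1-in-N year' event somewhere in the record,
    assuming every metric-year is an independent look at the data."""
    looks = record_years * n_metrics
    return 1.0 - (1.0 - 1.0 / return_period_years) ** looks


cases = [
    ("1 metric, decade, 1-in-100", 100, 10, 1),
    ("10 metrics, decade, 1-in-100", 100, 10, 10),
    ("10 metrics, century, 1-in-1000", 1000, 100, 10),
]
for label, period, record, metrics in cases:
    expected = expected_odd_events(period, record, metrics)
    chance = chance_of_at_least_one(period, record, metrics)
    print(f"{label}: expected {expected:.1f}, chance of at least one {chance:.2f}")
```

With one metric the chance of a 1-in-100 year event in a decade is a little under 10%. Look in ten ways and the expected count rises to one, with better than even odds of finding at least one “odd” event.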

The human factor

So, if we are intent on finding something “odd” in even one dataset, the chances are quite high, and that is why we need long time series. If, however, we have a host of datasets (floods, droughts, snow, temperature, rain, hurricanes, peak-rain, peak-wind, etc. etc.), then if we are allowed to cherry pick as academics have done, we are guaranteed to find something abnormal.
This is why one needs to be properly trained in engineering practices to do risk assessments. Because the biggest quality failing of risk assessment is the idiot doing the risk assessment – and particularly if they don’t come from a culture used to doing risk assessments and living with the result of either overstating risk or understating it.
So, even with the best of intentions, and even with 10x the data of the period being assessed, the best we might say is that there is “more chance than not” of some data being abnormal, if (as has happened) you have politically motivated groups free to scour the data and worse – free to channel resources – with the intention of finding “something wrong”.
If, however, you have people trained in risk assessment from a suitable culture, who know the temptation to cherry pick data and have the training, experience AND culture to resist, then the certainty with 10x data can rise as high as perhaps 90% confidence. (Note the idiots at the IPCC have stated 95% confidence about a period equivalent to the length of their whole dataset – so whilst they have no idea & no data to say what is normal – they are 95% sure that what they have is abnormal.)

**To explain, if we accept a turning point is simply a variant of a sin-wave (with period twice the sample length), then a trend is a variant of a cos-wave. Then a simple cycle (up-down-up-down) is just twice the frequency. However, if the probability of this is half as high (there being twice as much “info”), then if we sum the total series, the probability of the total is around 1. However, because we are looking for things “happening” … we can often accept cycles that only appear later in the data. So, there is quite a high chance of seeing something “odd” in the form of an apparent cycle.
