I just posted a response to Latimer Alder on Bishop Hill:
Me: ‘The prevailing methodology in science is to create a model that fits the data. As such, scientists tend to assume all “science like” problems should all be tackled by creating a model and using that to predict what will happen.’
Latimer Alder: Wow. We used to do experiments instead. And test any models against them. Do they just leave out this essential step nowadays?
When I realised that my reply was based on a statistical, evolutionary concept of theory building, whereas I suspect Latimer’s question was based on the idea that there is only one theory, which people uncover by enquiry much as an archaeologist uncovers a ruin by trial-and-error digging, I thought I’d follow this line of thought further and wrote this.
But beware … this is a personal note to myself working through this idea. The only reason I let anyone read it is that thinking someone might be daft enough to read this helps me write something I have some chance of understanding when I’ve forgotten why I wrote it. In essence I am suggesting that science consists of an ensemble of possible theories and/or models and that these are selected according to their fit to the data. As such we could imagine an infinite number of monkeys (aka academics) randomly producing theories which are then systematically discarded (or not, if it’s climate “science”) according to whether they fit the data.
At this point I am tempted to compare Latimer’s approach (as I understand it) and my own to evolutionary and Lamarckian evolution. Now, I was taught that Lamarck proposed a giraffe grew a long neck because it kept stretching its neck. Quick check – yes, that is right. Lamarck proposed that some mechanism altered the organism during its life and that these acquired traits were passed on.
In a sense I can see a parallel in the tuning of parameters of an equation. The equation is altered during its lifetime … and I suppose “birth” is when further parameters are added to create a new equation based on the old.
So, what would evolutionary theory development equate to? I suppose the elimination of old equations which do not fit (are not fit for purpose) is evolutionary, but all equations have parameters changed in order to fit new data.
However, there is the fundamental problem that, given a model with N adjustable parameters whose effects are orthogonal (each acts independently), it can be made to fit N data points. Or to put it more generally, the more complex the equation (the more adjustable parameters it has) and the fewer data points it has to fit, the easier it becomes to shoehorn any old equation to fit the data.
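(Just to convince myself, here is a toy illustration in Python with made-up numbers – nothing to do with any real climate model: a polynomial with N coefficients can be forced exactly through N data points even when those points are pure noise.)

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
x = np.arange(N, dtype=float)
y = rng.normal(size=N)                 # pure noise: there is no "signal" to find

coeffs = np.polyfit(x, y, deg=N - 1)   # N adjustable (and independent) parameters
fitted = np.polyval(coeffs, x)

print(np.allclose(fitted, y))          # True: a "perfect" fit to random noise
```

The “fit” tells you nothing, because the equation was flexible enough to match any six numbers I gave it.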
Latimer was talking about an experiment. An experiment is a means whereby a model is tested. In chemistry, that model is how chemicals react, in physics it might be how bodies behave and in climate … it is how the climate behaves. Unfortunately, experiments often don’t test what we expect: e.g. the exact order in which chemicals are added may affect the outcome. So, in ideal conditions the experiment is repeated many times (or at least run with a control). Each of these experiments creates an additional data point by which the model (or hypothesis) can be tested.
But when we come to climate, the time it takes for “climate” to change is of the order of a decade. So, in effect we only get one additional “test” point each decade. And given that half-decent records for world climate started in 1850, and it was a long time after that before they were more than half decent, there are at most 16 data points (one for each decade) and, if we were being strict, perhaps far fewer.
Now in the reply to Latimer I made up a “rule of thumb”, which was that you need n+1 data points to distinguish between n different hypotheses or models. For the moment I won’t worry whether it’s n+1 or n, but afterwards I suggested that a better model would be to say that each new test removes a proportion of the models. E.g. suppose the test is whether the model predicted the sign of the change in global temperature. Given an infinite set of independent academics producing climate models, we would expect half their models to be removed in the first decade, half of the remainder in the next, half again in the next, so that after a century only 1 in 1024 models would still be in consideration.
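(A quick back-of-the-envelope sketch of that halving, assuming – purely for illustration – that each decadal sign-test independently eliminates half the surviving models:)

```python
import math

decades = 10                               # roughly a century of decadal "test points"
surviving_fraction = 0.5 ** decades
print(int(round(1 / surviving_fraction)))  # 1024: only 1 in 1024 models survive

# Turned around, the number of sign-tests needed to whittle M independent
# models down to a single survivor grows like log2(M):
M = 1_000_000
print(math.ceil(math.log2(M)))             # about 20 decadal tests
```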
Obviously in real life (as in evolution), when old models are removed, those still in contention may be adapted (by the infinite monkeys in academia) so that the pool of models available for consideration remains fairly static.
In a “hard” subject like physics or chemistry, where it is relatively easy to conduct tests to verify models, models are weeded out quickly and cheaply, so these subjects tend to be “data rich” and “model poor”. In other words, whilst the process of developing theories can be entirely random, this data-rich environment rapidly weeds out models, leading to an environment with a few well-tested theories.
But in climate “science”, the academics are in a data-poor environment, waiting as much as a decade for each new data point. And … as academics seem to be judged on the papers they produce, we end up with hundreds of papers speculating on different models and theories and possible explanations of the climate. So, this subject is “model rich” and “data poor”.
Learning Curve
Now this is the point I really wanted to examine: how does this concept fit into the learning curve? Before pulling out the equation, I’ll just say what I understand the learning curve to relate to.
In any situation there are a host of things affecting it. Some of these are large “elephants in the room”, some are mice, some fleas, some microbes and some are just ephemeral vapours of whose existence we have no concept.
Anyone involved in the situation will immediately spot the elephant in the room (although climate scientists are the exception that proves the rule). So, even within a fairly short time, whatever it is we are modelling using the “learning curve”, we will be looking for solutions/fixes/models which sort out the elephants. However, only when we have removed the herd of stampeding elephants is the room quiet enough to start hearing/seeing the mice. In climatic terms, what this means is that only once we start to properly model the main contributing factors (like solar activity) can we start to see the residuals which are caused by the smaller factors affecting the equation (like CO2).
However, because we can only “see” the influence of smaller things (mice) when we understand the influence of the “elephants”, it takes all the longer to work out how the mice are affecting things. Likewise, the influence of the fleas can only be discerned when we do not have stampeding mice bouncing up and down under the microscope, etc.
To put that more mathematically: if we are looking for something half the size, we need (as a rough rule of thumb) twice the samples to see it amongst the other elephants/mice/fleas.
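(Again, a toy illustration with entirely made-up numbers, not real climate data: a small trend only becomes visible in the residuals once the dominant factor has been modelled out.)

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 500)
elephant = 10.0 * np.sin(t)                       # the dominant factor
mouse = 0.3 * t                                   # a small trend hiding underneath
signal = elephant + mouse + rng.normal(scale=0.5, size=t.size)

# Before modelling the elephant, the small trend is swamped:
print(round(np.corrcoef(t, signal)[0, 1], 2))     # weak correlation with time

# Fit and remove the dominant sine, then look at what is left over:
a_hat = (signal @ np.sin(t)) / (np.sin(t) @ np.sin(t))   # least-squares amplitude
residuals = signal - a_hat * np.sin(t)
print(round(np.corrcoef(t, residuals)[0, 1], 2))  # the mouse now stands out
```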
So, this is how we get the learning curve which basically says:
Each time cumulative volume doubles, value added costs fall by a constant percentage.
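(For concreteness, this is roughly what that statement looks like as a formula – the standard learning-curve form, with a made-up 80% learning rate purely for illustration:)

```python
import math

def unit_cost(cumulative_volume, first_unit_cost=100.0, learning_rate=0.80):
    """Each doubling of cumulative volume multiplies the unit cost by learning_rate."""
    return first_unit_cost * cumulative_volume ** math.log2(learning_rate)

for n in (1, 2, 4, 8, 16):
    print(n, round(unit_cost(n), 1))   # 100.0, 80.0, 64.0, 51.2, 41.0
```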
There are only two problems with this rule:
- It just states the obvious … we learn quickly at first and it then takes longer to learn as things get more difficult
- Everyone believes they are different, that their improvement in doing something is down to their own achievements which no equation can possibly govern.
And the second is particularly true of anyone who thinks they are “intellectual” … particularly academics. Suggest to an academic that their advance in the subject is purely a function of the time and effort applied to it, and that their own “intellect” has very little impact on the results … and you get the arrogant dismissal of those who know they cannot be governed by something which cannot possibly apply to them.
However, of all the areas where the learning curve should apply, climate science … or, as it should perhaps be more accurately described, a group of monkeys devising numerical models of the climate … is the one where it will certainly apply.
MATHS … OH NO I HATE MATHS
At this point I have three different “concepts”:-
- The number of data-points needed for an n-dimensional (n parameter) model
- The concept that a proportion of models will be removed by each new test point
- The learning curve – that doubling the test points, improves the model by a set amount
The first observation is that point (1) is for an exact fit between the model and the data points. In other words, with too few data points several parameters can be changed whilst still exactly matching the data points. This I suppose is where the “n+1” comes in, because only if the number of data points exceeds the degrees of freedom can we say that there is a match and not just an arbitrary coincidence based on the flexibility of the equation to match any set of data points.
However, if anything point (1) is saying that each additional point will remove all models (that do not fit) which have n−1 degrees of freedom. Now in reality, it is very unusual to get perfectly orthogonal parameters (not every set of values can be matched). So, perhaps there may be some general rule that “randomly” doubling the parameters in the equation tends to introduce one(?) extra degree of freedom. This however assumes that a degree of freedom is equivalent to a unit of “added benefit” from the equation.
Information theory
At which point I desperately seek a “unit” of information which relates the probability of an event to the information its occurrence contains. And the unit of information is the bit (a 50% chance). So e.g. a 1/4 probability event carries two bits of information, a 1/8 probability event three bits.
This seems to be working quite well, because an event with a 1/2^n probability is likely to be found after 2^n samples: so e.g. with 2^7 samples, events with a probability of 1/2^7 will be discovered, whereas with 2^19 samples, events with a probability of 1/2^19 will be found. So, doubling the samples means events with one extra bit of information will start to be uncovered. (Note that I’m being quiet about the possibility that there may be more events to find with lower probability.)
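(Another toy sketch of that bookkeeping: an event of probability 1/2^n carries n bits, and you need on the order of 2^n samples before you can expect to see it even once.)

```python
import math

def bits(p):
    """Self-information of an event of probability p, in bits."""
    return -math.log2(p)

print(bits(1 / 2), bits(1 / 4), bits(1 / 8))   # 1.0  2.0  3.0

# Expected number of occurrences of a 1/2**n event in 2**n samples:
for n in (7, 19):
    print(n, (2 ** n) * 2.0 ** -n)             # about one occurrence each
```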
The cockup human factor of climate monkeys
Which brings me to the last comment I made:
I’m tempted to suggest that each test weeds out a proportion of the models so the number of experiments might be proportional to log(number of models). So e.g. if the test is whether the model could predict the sign of the global temperature. First decade … oops it didn’t warm and they all got thrown out … which just proves that a theory about the number of theories weeded out by the test just can’t account for the human factor that always cocks it up.
Now this is more difficult than it seems. I’ve already said that most people think the learning curve does not apply to them. So, how can I have special pleading for the “cock-up” factor?
Obviously, the fact that all the climate models have proven unable to predict the lack of warming really shows that we didn’t have a group of random monkeys creating random theories, because that way we would have had as many predicting a fall as a rise. Instead it proves we had monkeys … who didn’t even have the gumption to come up with their own ideas about the climate and basically all copied each other’s idea, so that, barring a few decorations, they all had the same theory … which is why they were all wrong.
Diversity
I’m beginning to think that I’ve gone down a blind alleyway. The reality is that climate group-think clones behave very differently from an infinite group of monkeys.
A better analogy would be a single monkey and a group of parrots. The single monkey is randomly dreaming up new models … the parrots are just mindlessly regurgitating their model.
I’m beginning to wonder whether the “learning” for the parrots will be that they would be better off as a monkey with some chance of being right than as a parrot.