Complex reality, simple theories


A number of people have commented that my Cats and Dogs theory is simplistic. I agree. People undoubtedly have more than one dimension to their character. However, I argue this does not mean I should immediately make my theory more complex.

Here's a graphical example. Some data (red dots) and two models of that data (lines):

Suppose we are modelling a phenomenon represented by the red dots. The dots follow the complicated curve "y=x+sin(x)", but we don't know that.

I've given two possible models for this curve that we might come up with from looking at it: "y=x" and "y=x+cos(x)". The "y=x" model is obviously inadequate: reality is very obviously more complex than a straight line. However, that does not mean that making the model more complicated is a good thing. The model "y=x+cos(x)" is more complicated, and even has a similar look to the data, but its bumps are in the wrong places, so the discrepancy between it and the data is greater than that of "y=x". The simple model will be more accurate than the complex model when trying to predict new data points.
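To put rough numbers on that discrepancy, here is a minimal sketch that measures the mean squared error of each model against "y=x+sin(x)". The sample range (two full periods), the number of points, and the function names are my own choices for illustration, not anything taken from the graph.

```python
import math

def truth(x):
    # The curve the data actually follows (unknown to the modeller).
    return x + math.sin(x)

def straight_line(x):
    # The simple model: y = x
    return x

def cosine_model(x):
    # The more complicated model: y = x + cos(x)
    return x + math.cos(x)

def mean_squared_error(model, xs):
    # Average squared gap between the model and the true curve.
    return sum((model(x) - truth(x)) ** 2 for x in xs) / len(xs)

# Assumption: 1000 evenly spaced points covering two full periods.
xs = [4 * math.pi * i / 1000 for i in range(1000)]

mse_line = mean_squared_error(straight_line, xs)
mse_cos = mean_squared_error(cosine_model, xs)

print(f"y=x        mean squared error: {mse_line:.3f}")
print(f"y=x+cos(x) mean squared error: {mse_cos:.3f}")
```

Over whole periods the straight line misses by sin(x), which averages to a squared error of about 0.5, while the cosine model misses by sin(x)-cos(x), which averages to about 1.0. The more complicated model is about twice as wrong by this measure.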

There may be a complex model that fits the data better than a simple model can, but finding such a complex model is really, really, really hard. For every extra bit of model complexity, the search space doubles. Almost every attempt we make to improve a model by making it more complex will fail. We should thus only look at complex models once all possible simpler models have been examined.
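The doubling can be sketched directly, assuming (purely for illustration) that a model is written down as a fixed-length string of bits, so that each distinct string is one candidate model. The helper name below is hypothetical.

```python
from itertools import product

def candidate_models(bits):
    # Every distinct bit string of the given length stands for one candidate model.
    return ["".join(p) for p in product("01", repeat=bits)]

# Count the candidates for descriptions of 1 to 10 bits.
counts = {bits: len(candidate_models(bits)) for bits in range(1, 11)}

for bits, count in counts.items():
    print(f"{bits:2d} bits -> {count} candidate models")
```

Each extra bit doubles the count, so a 10-bit description already allows 1024 candidates, and a 20-bit one over a million. If only a handful of those are any good, a blind step up in complexity will nearly always land on a bad one.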

Thus I prefer simple models, even when it is obvious that reality is complex.