How to persuade muggles


Autistic people model random variables differently. It's a nasty idea, as we shall see, but it's just too good an explanation to ignore. The extreme case for autism is to model random variables as normally distributed (of form exp(-x^2) ). The extreme case for normality is to model random variables as Cauchy distributed (of form 1/(1+x^2) ).

There are many consequences; I describe here one of the most important. Suppose we wish to alter someone's beliefs. They have in the past made a set of observations, and have fitted one of the curves above to those observations to create a probabilistic model. We wish to offer some new observation that will cause them to shift their model (say to the right). What new datum will best achieve this?

Let's look at negative-log-likelihoods (you can follow the same line of reasoning with raw probabilities, but logs are easier). They have chosen the center point of the distribution such that the total negative-log-likelihood of the data points is minimized. Any shift in the center point will increase the negative-log-likelihood of the old data points. The neg-log-likelihood of our new data point is competing with the old ones.

So to create the greatest change, we need to choose a new data point x such that the derivative of log(f(x-center)) with respect to center is maximized.

For a Gaussian curve the answer is simple. The derivative increases monotonically with x. To persuade an autistic person, simply choose the most extreme example you can. And a single data point of sufficient extremality can falsify a theory almost completely.

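For the Cauchy distribution, there is a maximum. With the form 1/(1+x^2) the derivative is 2(x-center)/(1+(x-center)^2), which peaks when x is exactly one unit away from the center and falls off towards zero beyond that.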

If that maximum is exceeded, the new data point will be ignored to a greater and greater extent. So one should not push too hard. Better to give several mildly different examples than one killer example.
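To make the comparison concrete, here is a small sketch of the "pull" a single new data point exerts under each curve. It assumes the unnormalized forms used above, exp(-x^2) and 1/(1+x^2), with unit scale; the function names and the sample offsets are mine, chosen only for illustration.

```python
def gaussian_pull(offset):
    """d/d(center) of log f(x - center) for f(u) = exp(-u^2); offset = x - center."""
    return 2.0 * offset


def cauchy_pull(offset):
    """d/d(center) of log f(x - center) for f(u) = 1/(1+u^2); offset = x - center."""
    return 2.0 * offset / (1.0 + offset * offset)


# Illustrative offsets: how far the new datum sits from the current center.
for offset in (0.5, 1.0, 2.0, 5.0, 20.0):
    print(f"offset {offset:5.1f}   gaussian pull {gaussian_pull(offset):6.2f}"
          f"   cauchy pull {cauchy_pull(offset):6.3f}")
```

The Gaussian pull just keeps growing with the offset, while the Cauchy pull is largest at an offset of 1 and shrinks on either side of it.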

The Cauchy distribution has a certain robustness. Outliers don't throw it. But this also means truly important facts at odds with the person's world view can be ignored. There's no such thing as falsification for someone using this curve. It's even possible for there to be several local minima of the neg-log-likelihood (that is, several local maxima of the likelihood; something that does not happen with Gaussian distributions) -- a recipe for a rude awakening once the tipping point is reached.
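A rough numerical check of the multiple-optima claim, under assumptions of my own: two tight clusters of made-up observations around 0 and 10, the same unit-scale curves as before, and a crude grid scan standing in for a proper optimizer.

```python
import math


def cauchy_nll(center, data):
    """Total neg-log-likelihood for f(u) = 1/(1+u^2), up to a constant."""
    return sum(math.log(1.0 + (x - center) ** 2) for x in data)


def gaussian_nll(center, data):
    """Total neg-log-likelihood for f(u) = exp(-u^2), up to a constant."""
    return sum((x - center) ** 2 for x in data)


def local_minima(nll, data, lo, hi, steps):
    """Scan an evenly spaced grid and return the grid points that are local minima."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    vals = [nll(c, data) for c in grid]
    return [grid[i] for i in range(1, steps)
            if vals[i] <= vals[i - 1] and vals[i] < vals[i + 1]]


# Hypothetical observations: one cluster near 0, another near 10.
data = [-0.2, 0.0, 0.2, 9.8, 10.0, 10.2]

cauchy_minima = local_minima(cauchy_nll, data, -5.0, 15.0, 2000)
gaussian_minima = local_minima(gaussian_nll, data, -5.0, 15.0, 2000)

print("Cauchy minima:  ", [round(c, 2) for c in cauchy_minima])
print("Gaussian minima:", [round(c, 2) for c in gaussian_minima])
```

With this data the Cauchy fit has one resting place near each cluster, while the Gaussian fit has a single one at the overall mean, halfway between the clusters.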

And if reality changes suddenly, such that every new observation exceeds their limits of plausibility, they'll just keep on as they always have while things go straight to hell.
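Here is a toy version of that scenario, with every number invented for illustration: ten old observations at 0, then a growing pile of new observations at 10. The believer starts at their old center and only ever moves downhill in small steps on the Cauchy neg-log-likelihood; the `settle` helper and its step size are my own stand-ins for whatever updating process a person actually uses.

```python
import math


def cauchy_nll(center, data):
    """Total neg-log-likelihood for f(u) = 1/(1+u^2), up to a constant."""
    return sum(math.log(1.0 + (x - center) ** 2) for x in data)


def settle(center, data, step=0.01, max_steps=100000):
    """Walk downhill in small steps from `center` until neither neighbour is lower."""
    for _ in range(max_steps):
        here = cauchy_nll(center, data)
        if cauchy_nll(center + step, data) < here:
            center += step
        elif cauchy_nll(center - step, data) < here:
            center -= step
        else:
            break
    return center


old = [0.0] * 10  # hypothetical past observations, all at the old center

for n_new in (10, 40, 60):
    data = old + [10.0] * n_new  # reality has moved to 10
    mean = sum(data) / len(data)  # where a Gaussian fit would sit
    print(f"{n_new:3d} new observations: Cauchy belief settles at "
          f"{settle(0.0, data):5.2f}, Gaussian mean is {mean:5.2f}")
```

Under these made-up numbers the Cauchy believer barely budges even when the new observations outnumber the old four to one, then jumps almost all the way across somewhere between 40 and 60 new observations. The Gaussian mean slides over steadily the whole time.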

See how nasty an idea this is?

A few further consequences ...

Since there can be multiple local maxima of the likelihood (corresponding to clusters in the data), different people can end up at different maxima. They will then regard people at other local maxima as laughably naive for listening to evidence they themselves regard as implausible. A prime example of this is the left/right political split; racism, nationalism, and religion are others.

But use of a Cauchy distribution also lets people behave sensibly in novel situations -- what we call "common sense", a trait lacking in autistic people. The power of common sense is not to be underestimated. So there is bad and there is good to being normal.

This potential for sudden jumps to a different belief... I wonder, is this the cause of laughter? (Which would imply that while autistic people might "get" humour less, they might be more capable of producing it. Richard Feynman's book "Surely You're Joking, Mr. Feynman!" may be an example of this.)

Stories and memory ...

Remembering some sequence of events (and the related activity of making up a story) I figure is something like following a trail of breadcrumbs. Find a breadcrumb, eat it (so you don't end up backtracking :-) ), look for another breadcrumb nearby, repeat. A somewhat random process, maybe you sometimes skip a breadcrumb. The probability of a particular breadcrumb being chosen will follow a distribution such as those shown above (or a multivariate version thereof). If you're making up a story, you'll use a wide distribution, wander all over the place. If you're trying to remember accurately, you'll use a tight distribution so you stay on the one trail.

    X         X                                  X           X      X
        X            X     X                          X             

Ok, so what if there's a missing breadcrumb in a trail of memory you are following? Tight focussed distribution, so a missing crumb throws us into the tail of the distribution. The Gaussian tails keep dropping off quickly, so if you're autistic chances are you will pick a nearby breadcrumb, probably the one after the missing one. Voila, eidetic memory. If you're using a Cauchy distribution it's a big fat tail, doesn't much matter if you pick a crumb a metre away or two metres away. There's a good chance you'll jump tracks. No eidetic memory for normal people.
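A quick sketch of the missing-breadcrumb case, assuming the chance of picking a crumb is proportional to the curve's height at that crumb's distance. The layout is invented: the next crumb on the trail is 1 metre away, three crumbs on other trails are 2 metres away, and the "tight" distribution has a scale of a quarter of a metre.

```python
import math


def gaussian_weight(distance, scale):
    """Height of exp(-u^2) at u = distance / scale."""
    return math.exp(-(distance / scale) ** 2)


def cauchy_weight(distance, scale):
    """Height of 1/(1+u^2) at u = distance / scale."""
    return 1.0 / (1.0 + (distance / scale) ** 2)


def choice_probabilities(weight, distances, scale):
    """Probability of picking each crumb, proportional to its weight."""
    weights = [weight(d, scale) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical layout: first entry is the next crumb on the trail,
# the rest are crumbs belonging to other trails.
distances = [1.0, 2.0, 2.0, 2.0]
scale = 0.25  # assumed width of a tight, focussed distribution, in metres

gaussian_probs = choice_probabilities(gaussian_weight, distances, scale)
cauchy_probs = choice_probabilities(cauchy_weight, distances, scale)

print("Gaussian: stay on trail", round(gaussian_probs[0], 4),
      " jump tracks", round(sum(gaussian_probs[1:]), 4))
print("Cauchy:   stay on trail", round(cauchy_probs[0], 4),
      " jump tracks", round(sum(cauchy_probs[1:]), 4))
```

In this layout the Gaussian chooser all but always lands on the next crumb of the same trail, while the Cauchy chooser jumps tracks a sizeable fraction of the time.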

A similar effect applies when you're trying to find analogous memories to some unfamiliar situation. The autistic person will tend to pick the closest memory, therefore a memory on the border of the cloud of all memories. A normal person is almost as likely to consider memories deep inside the cloud. So autistic people will disproportionately pick memories at the edge of the cloud, and if there is some outlier from the cloud will pick that very often indeed, resulting in stereotyped reactions to unfamiliar situations. This may even be the explanation of the autistic tendency to specialize and obsess. (This suggests treatment for maladaptive obsessions could be to try to provide a path of experiences from the obsession back to the main cloud. Basically, meet them half-way.)

This pattern of breadcrumb jumping has an analogue in the physical structure of autistic brains, which have a greater ratio of short to long neural connections than the brains of normal people do (though why this is so I am not 100% certain). I am willing to bet the pattern of connection lengths will be approximately Gaussian for autistic people, and Cauchy for normal people. This also leads to greater modular independence in autistic brains.