In the continuous case, MML finds a set of models that are not worth distinguishing between. We should do the same in the discrete case, or MML becomes just ML.

E.g. Snob should return a set of classifications that are not worth distinguishing between. (Snob already allows a data point to be partially in two different classes if it is not worth distinguishing between the classes for that data point. This should be extended up another level.)

(update) Ok... from reading more of the MML book, it looks like in a large model any sub-model ends up getting chosen pseudo-randomly. It would make sense then to consider every model a sub-model of an ongoing communication of some kind, and thus assume it to be randomly chosen.

You can see this happening in the continuous case as you add more and more parameters. In 1D, the set of models not worth distinguishing between is a line segment. In 2D it's a randomly oriented hexagon, so already the edges are a little fuzzy. In higher and higher dimensions it gets more and more fuzzy, presumably tending to a Gaussian blob.
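
To make the 1D case concrete, here is a minimal sketch assuming the usual Fisher-information approximation, where the segment has width sqrt(12 / F). The binomial example, the function names, and the choice to centre the segment on the estimate are illustrative assumptions, not anything taken from Snob or the MML book.

```python
import math

def fisher_binomial(p, n):
    """Expected Fisher information for a binomial proportion p over n trials."""
    return n / (p * (1.0 - p))

def region_width_1d(fisher):
    """Width of the 1D segment of indistinguishable models: sqrt(12 / F).

    The 12 comes from the 1D quantising constant 1/12 (assumed here).
    """
    return math.sqrt(12.0 / fisher)

# Hypothetical data: 30 heads in 100 coin tosses.
n, heads = 100, 30
p_hat = heads / n
w = region_width_1d(fisher_binomial(p_hat, n))

# Centred on p_hat purely for illustration; in the argument above the
# segment's position is effectively pseudo-random.
segment = (p_hat - w / 2, p_hat + w / 2)
print(segment)
```

Any value of p inside the segment would be stated to the same precision, which is the sense in which these models are "not worth distinguishing between".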

This makes the math much simpler. The discrete case just becomes a catalogue of models with associated probabilities. The continuous case (using the usual simplification) becomes a Gaussian blob of models.
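
A rough sketch of what that could look like in code, under stated assumptions: the catalogue is just normalised prior times likelihood, and the blob is a 1D Gaussian with variance 1 / F. The model names, log-likelihood values, and helper functions are all made up for illustration.

```python
import math
import random

def catalogue(log_priors, log_likelihoods):
    """Turn log prior + log likelihood per model into normalised probabilities."""
    scores = [lp + ll for lp, ll in zip(log_priors, log_likelihoods)]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def pick_discrete(names, probs, rng):
    """Choose one model from the catalogue at random, weighted by probability."""
    return rng.choices(names, weights=probs, k=1)[0]

def pick_continuous_1d(theta_hat, fisher, rng):
    """Draw one parameter value from the blob, assumed to be N(theta_hat, 1/F)."""
    return rng.gauss(theta_hat, 1.0 / math.sqrt(fisher))

rng = random.Random(0)
names = ["one class", "two classes", "three classes"]  # hypothetical models
probs = catalogue([math.log(1 / 3)] * 3, [-105.0, -101.0, -101.5])

print(dict(zip(names, probs)))
print(pick_discrete(names, probs, rng))
print(pick_continuous_1d(0.3, 476.0, rng))
```

The point is only that "choosing a model" becomes a weighted random draw in both cases; returning the whole catalogue (or the blob) is the set-valued answer argued for above.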

This makes it all sound a little prosaic, but the Fisher information stuff remains a very useful simplification. Where the Fisher math becomes unworkable, one can switch to Monte Carlo sampling.
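
As one possible reading of "switch to Monte Carlo sampling", here is a small random-walk Metropolis sketch that draws models directly from the posterior instead of using the Fisher approximation. The binomial model, the uniform prior, the step size, and the burn-in length are assumptions chosen to keep the example short.

```python
import math
import random

def log_posterior(p, heads, n):
    """Unnormalised log posterior for a binomial proportion, uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return heads * math.log(p) + (n - heads) * math.log(1.0 - p)

def metropolis(heads, n, steps, step_size, rng):
    """Random-walk Metropolis sampler over p; returns the visited values."""
    p = 0.5
    lp = log_posterior(p, heads, n)
    samples = []
    for _ in range(steps):
        q = p + rng.gauss(0.0, step_size)  # propose a nearby model
        lq = log_posterior(q, heads, n)
        # Accept with probability min(1, posterior(q) / posterior(p)).
        if rng.random() < math.exp(min(0.0, lq - lp)):
            p, lp = q, lq
        samples.append(p)
    return samples

# Hypothetical data: 30 heads in 100 tosses; drop the first 1000 draws as burn-in.
samples = metropolis(30, 100, 20000, 0.05, random.Random(0))[1000:]
mean = sum(samples) / len(samples)
print(mean)
```

Each retained sample is a randomly chosen model, so the collection plays the same role as the Gaussian blob without requiring the Fisher information to be tractable.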

One step closer to a workable sparrowfall system :-)