Python VP-tree implementation

homeblogtwitterthingiverse



VP-trees are a data structure for performing fast nearest-neighbour queries in metric spaces.

Here's a fairly naive implementation in Python:

A Levenshtein-distance based spelling corrector is included in this program as a test.

$ python VP_tree.py /usr/share/dict/words
Load dictionary
98568 words

Construct tree
727925 comparisons

Ready to answer queries

query> mispelt
    76 comparisons later...     1 misspelt
  2348 comparisons later...     2 dispels
     0 comparisons later...     2 misspell
     0 comparisons later...     2 misspent
     0 comparisons later...     2 respelt

query> correct
     8 comparisons later...     0 correct
  1591 comparisons later...     1 corrects
  1623 comparisons later...     2 Forrest
     0 comparisons later...     2 collect
     0 comparisons later...     2 connect

query> fandabulous
 18600 comparisons later...     3 fabulous
  7333 comparisons later...     4 nebulous
     0 comparisons later...     4 scandalous
 14885 comparisons later...     4 pendulous
  2314 comparisons later...     5 Daedalus



[æ]