One very simple form of comparison is the calculation of the “edit distance” (or Levenshtein distance) between two strings . This approach is primarily applicable in systems that use a symbolic coding scheme such as the Parsons Code. The edit distance between two strings typically is defined to be the number of insertion, deletion, and replacement operations required to convert the first string into the second. These operations can be further weighted based on a heuristic estimate of the likelihood of certain kinds of errors. The edit distance can be computed in O(nm) time for two strings of lengths n and m using well-known dynamic programming algorithms [9]. This approach is analogous to sequence alignment methods used in bioinformatics, and it has been shown to be reasonably fast and selective in a database of roughly 105 melodies [8].
Edit distance can be criticized on several points, however. While the minimum edit distance itself is unique, the actual sequence of transformations is not. Also, the edit distance depends on the number and type of edit operations we permit [5].