On Mon, Oct 16, 2006 at 07:15:52PM +0900, Minkoo wrote: > Hi list. > > I read an article posted at Wikipedia about Levenshtein distance (aka edit > distance). > The location of the document is > http://en.wikipedia.org/wiki/Levenshtein_distance > > In the document, a sample Ruby code goes like the following: > > class String > def levenshtein(comparator) > a, b = self.unpack('U*'), comparator.unpack('U*') > n, m = a.length, b.length > a, b, n, m = b, a, m, n if n > m > current = [*0..n] > 1.upto(m) do |i| > previous, current = current, [i]+[0]*n > 1.upto(n) do |j| > add, delete = previous[j]+1, current[j-1]+1 > change = previous[j-1] > change += 1 if a[j-1] != b[i-1] > current[j] = [add, delete, change].min > end > end > current[n] > end > end > > In the code, there's a parameter called comparator which seems to be used to > decode given parameter. But, I can't understand what exactly the comparator > is doing. > > Does anybody know the detail? It's simply the string you're comparing against; unpack('U*') just turns the UTF-8 characters into unsigned integers: class String def levenshtein(comparator) a, b = self.unpack('U*'), comparator.unpack('U*') b # => [102, 111, 111, 98, 97, 114] n, m = a.length, b.length a, b, n, m = b, a, m, n if n > m current = [*0..n] 1.upto(m) do |i| previous, current = current, [i]+[0]*n 1.upto(n) do |j| add, delete = previous[j]+1, current[j-1]+1 change = previous[j-1] change += 1 if a[j-1] != b[i-1] current[j] = [add, delete, change].min end end current[n] end end "foo".levenshtein("foobar") # => 3 -- Mauricio Fernandez - http://eigenclass.org - singular Ruby