On Mar 23, 2011, at 06:45 , Martin Hansen wrote: > Hello all, > > > Here is an attempt at a naive fuzzy string matching algorithm. I would > appreciate comments on the code - which is not as simple as I would > like. > > I have a couple of speed improvements in mind: > 1) replace EQUAL hash with array (possibly NArray). That would generally be slower, depending on use. One thing you might want to do is clean up the code is match: EQUAL[(@seq1[self.seq1_index].upcase + @seq2[self.seq2_index].upcase).to_sym] could be: EQUAL[:"#{@seq1[self.seq1_index]}#{@seq2[self.seq2_index]}"] You'll have to change your initializer to upcase the input: @seq1 = seq1.upcase @seq2 = seq2.upcase Looks like you'd get a better speed up using strings as your hash keys: # of iterations = 1000000 user system total real null_time 0.130000 0.000000 0.130000 ( 0.131828) upcase + to_sym 21.320000 0.020000 21.340000 ( 21.366015) interpolated sym 14.760000 0.010000 14.770000 ( 14.796624) interpolated str 11.290000 0.020000 11.310000 ( 11.449326) > 2) a path can be terminated before fully explored if it can be > calculated that it will never complete (somewhere inside in_bounds?). Dunno... but you have a good test suite, so it is probably pretty easy to write a test and fix it. Minor cleanup: EQUAL = Hash[%w(AA BU TH UY TT CB UH SC CC GB VA SG GG TB VC CS UU UB VG GS NA DA AV WA NT DG CV WT NC DT GV WU NG DU KG AW NU AD KT TW AN GD KU UW TN TD GK RA CN UD TK RG GN HA UK AR UN HC YC GR NN HT YT MA BC HU YU MC BG AH CY AM BT CH TY CM).map { |pair| [pair.to_sym, true] }]