gotoken / math.sci.hokudai.ac.jp (GOTO Kentaro) wrote:
>
>Akemashite omedetou gozaimasu!
>Glueckliches neues Jahr!
>A happy new year!
>
>In message "[ruby-talk:8413] Re: speedup of anagram finder"
>    on 00/12/31, David Alan Black <dblack / candle.superlink.net> writes:
>
> >So.... I think for general anagram-finding, the unpack one is still
> >the one to beat.  (Of course, it is not impossible that one would want
> >to find anagrams of 5-letter words.  My first Ruby program of any size
> >was a Jotto implementation.  Maybe it's time to have another look at
> >that :-)
>
>Thank you for interesting measurements.  Looking that, I'm feeling
>difficulty of the average cost analysis in pragmatic sense.  As Guy
>said [ruby-talk:8414], we should consider about also frequency of GC.

At some point in optimization you always reach the stage where you
have to make trade-offs.  There isn't necessarily a "better in
general", merely a "better for my situation".

>I've counted the distribution of word length in /usr/share/dict/words.
>It looks like a non-sharp version of Poisson.  By your measurement,
>pack may well be faster than Shultz-Goedel-Abel index :-) roughly
>because the average length is 9.585, where pack is faster.

Has anyone tried using the frequency distribution of characters in
English?  Assign the most common letters to the smallest primes.
This should keep the size of the index down, and I think it would
significantly improve performance...

Cheers,
Ben
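
A minimal sketch of the frequency-ordered prime idea, not the code measured earlier in the thread. It assumes the index is the product of one prime per letter, and it assumes the classic "etaoin shrdlu" ordering as the letter frequency table; the real ordering depends on the word list. The names prime_index and group_anagrams are hypothetical, chosen only for illustration.

```ruby
# ASSUMPTION: letters listed from most to least common in English
# ("etaoin shrdlu" style ordering). Measure your own word list for real use.
LETTERS_BY_FREQUENCY = "etaoinshrdlucmfwypvbgkjqxz".chars.freeze

# The first 26 primes, smallest first.
FIRST_26_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                   43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101].freeze

# Frequency-ordered table: the most common letter gets the smallest prime.
FREQ_PRIME = LETTERS_BY_FREQUENCY.zip(FIRST_26_PRIMES).to_h.freeze

# Alphabetical table, kept only for comparison (a => 2, b => 3, ...).
ALPHA_PRIME = ("a".."z").zip(FIRST_26_PRIMES).to_h.freeze

# Hypothetical helper: multiply together the primes of each letter in the word.
# Words that are anagrams of each other get the same product.
# Characters outside a-z are skipped, so "it's" and "its" share an index.
def prime_index(word, table = FREQ_PRIME)
  word.downcase.each_char.inject(1) do |product, ch|
    prime = table[ch]
    prime ? product * prime : product
  end
end

# Hypothetical helper: group words by index and keep groups with 2+ members.
def group_anagrams(words, table = FREQ_PRIME)
  words.group_by { |w| prime_index(w, table) }.values.select { |g| g.size > 1 }
end
```

For example, prime_index("tea") is 3 * 2 * 5 = 30 with the frequency-ordered table, against 71 * 11 * 2 = 1562 with the alphabetical one. Whether the smaller products actually make the finder faster on a given dictionary is something that would have to be measured.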