Hi --

On Wed, 18 Dec 2002, Shannon Fang wrote:

> Hi Algorithmists,
>
> I am writing a spell checker in ruby. The first step
> is to load the dictionary into memory. After some
> experiments, I found that the following simple code
> worked quite nicely:
>
> f=File.new("dict.txt")
> text=f.sysread(File.size("dict.txt"))
> lexicon=text.split(/\n/)
>
> On my 1.6G Pentium 4 running ruby 1.6.7, the total
> time used to load the dictionary is less than 0.3
> second. (Dictionary file is about 1Mb with 90K
> records)
>
> However, since the dictionary lookup operation will
> be quite heavy, I am thinking of using a hash instead
> of array. I tried the following code:
>
> f=File.new("dict.txt")
> text=f.sysread(File.size("dict.txt"))
> words=text.split(/\n/)
> lexicon={}
> words.each do |word|
>   lexicon[word]=0
> end
>
> Disaster! It took me about 15 seconds to load the
> dictionary. Problem is that the #each method took too
> much time.

That's really puzzling.  I ran the same script in .23s real time,
using a dict with about 38K words, on a 1.4GHz Pentium.


David

-- 
David Alan Black
home: dblack / candle.superlink.net
work: blackdav / shu.edu
Web:  http://pirate.shu.edu/~blackdav
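
P.S. For anyone who wants to reproduce the array-versus-hash timing comparison discussed above, here is a minimal sketch. It assumes a current Ruby rather than the 1.6.7 mentioned in the original post, and it generates its own 90K-entry word file instead of reading the poster's dict.txt. The helper names (sample_words, load_array, load_hash) are hypothetical and exist only for this illustration. Timings depend on the machine, so no expected numbers are shown.

```ruby
require "benchmark"
require "tempfile"

# Hypothetical helper: build a synthetic word list so the sketch
# does not depend on a real dict.txt.
def sample_words(count)
  (0...count).map { |i| "word#{i}" }
end

# Load a newline-separated dictionary file into an Array of words.
def load_array(path)
  File.read(path).split("\n")
end

# Load the same file into a Hash keyed by word, for fast membership tests.
def load_hash(path)
  lexicon = {}
  File.foreach(path) { |line| lexicon[line.chomp] = true }
  lexicon
end

Tempfile.create("dict") do |f|
  f.write(sample_words(90_000).join("\n"))
  f.flush

  # Time each loading strategy separately (wall-clock seconds).
  array_time = Benchmark.realtime { load_array(f.path) }
  hash_time  = Benchmark.realtime { load_hash(f.path) }

  puts format("array: %.3fs  hash: %.3fs", array_time, hash_time)
end
```

If the two numbers come out close on your machine, that would suggest the 15-second load is tied to something in the original poster's setup rather than to building a Hash as such.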