"James Edward Gray II" <james / grayproductions.net> wrote: > On Feb 27, 2005, at 11:59 AM, Dave Burt wrote: > >> JEGIII > > Hmm, I'm going down through the generations. ;) Still just a II, not a > III. For the curious, I shorten my name as JEG2. Use whatever you like > though, I answer to anything. Sorry. Fixed in code comments. >> And I added 3 input translators. > > Wow! Nice work. I look forward to digging through these. > Thanks. Do enjoy. >> I generated letter frequency data, which is a big part of the LetterWise >> method, from a smidgen over 65MB of plain-text modern novels and movie >> scripts, using the following two scripts (the latter has 2 versions): >> http://www.dave.burt.id.au/ruby/phonepad/1_wordfreq.zip >> http://www.dave.burt.id.au/ruby/phonepad/2_predict3_rb.zip >> http://www.dave.burt.id.au/ruby/phonepad/2_predict3_yaml.zip > > Just FYI, these links don't seem to work for me. > My fault: these are scripts, and end in .rb, not .zip: http://www.dave.burt.id.au/ruby/phonepad/1_wordfreq.rb http://www.dave.burt.id.au/ruby/phonepad/2_predict3_rb.rb http://www.dave.burt.id.au/ruby/phonepad/2_predict3_yaml.rb An additional note on the serialization in these latter two scripts: I tried two methods. Hand-output YAML, which ended up about 1.8MB, takes 5-6 seconds to load on my machine. Hand-output, because if you don't, it grows to over double the size, and fails on a mapping with nil for a key (it omits the "~"). "p big_hash" ended up around 2.4MB (and the file has only one "\n"), and takes a touch under 3 seconds to load. Cheers, Dave