On Fri, 13 Aug 2004 17:36:11 +0900, Robert Klemme wrote

> Yep.
> 
> > I've written something similar for
> > Regexp::English but the optimization itself is quite slow and takes up
> > lots of resources. (Because it needs to build the trie.)
> 
> Well, it was ok in my case because the word list didn't change and I 
> put the generated regexp into code.  Although I'm not sure that it 
> was really that slow.  Lemmesee...  IMHO Theoretically it should be 
> around O(n*m) with n the number of words and m the average word length.

I would love to see your script, Robert.  I was flirting in my head with the 
thought that if one took all of the regexes, broke them into shared pieces, 
and made a tree out of them, one could walk down the tree of pieces until 
one found the complete regex that made the match.  Neat to see my brain 
wasn't totally off base with the idea.  I want to take the regexp your 
script builds and benchmark it against two things: the current state of 
affairs with simple hash-based fixed string matching, and the basic approach 
of matching the fixed strings first and then iterating through the regexes.  
Then we can see how overall performance shakes out.
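
A minimal sketch of the trie idea, for fixed strings only.  This is not 
Robert's script; the method names build_trie, trie_to_source and 
words_to_regexp are made up for illustration, and the generated pattern is 
anchored to match whole strings, which is an assumption about the use case.

```ruby
# Build a nested-hash trie from a list of fixed strings.
# The :end key marks that a word terminates at this node.
def build_trie(words)
  trie = {}
  words.each do |word|
    node = trie
    word.each_char { |ch| node = (node[ch] ||= {}) }
    node[:end] = true
  end
  trie
end

# Turn a trie node into regexp source so that shared prefixes
# appear only once in the pattern.
def trie_to_source(node)
  branches = node.reject { |key, _| key == :end }.map do |ch, child|
    Regexp.escape(ch) + trie_to_source(child)
  end
  return "" if branches.empty?

  if node[:end]
    # A word ends here, so whatever follows is optional.
    "(?:#{branches.join('|')})?"
  elsif branches.size == 1
    branches.first
  else
    "(?:#{branches.join('|')})"
  end
end

# Hypothetical helper: one anchored regexp for the whole word list.
def words_to_regexp(words)
  Regexp.new("\\A(?:#{trie_to_source(build_trie(words))})\\z")
end

re = words_to_regexp(%w[foo foobar fox bar])
puts re.source
```

Building the trie touches each character of each word once, which fits 
Robert's O(n*m) estimate for the construction step.  Whether the resulting 
regexp actually beats a plain hash lookup is exactly what the benchmark 
would have to show.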


Thanks!

Kirk Haines