craig duncan <duncan / nycap.rr.com> wrote:
>
>In message "[ruby-talk:9078] Re: Regexp for matching Ruby reg exps?"
> > on 01/01/11, "Ben Tilly" <ben_tilly / hotmail.com> writes:
> >
> > >Friedl's solution only handles nesting to a limited depth.
> > >Enough for parsing email addresses but not a general
> > >solution.  Not only is handling arbitrary nesting in a
>
>Which means that if you build a parser to handle nesting of parens up
>to, say, 1000 deep, that you'd be able to defeat this parser by giving
>it an expression with a deeper nesting than that.  Theoretically true
>but not very practical.  Realistically, assuming some reasonable fixed
>limit on nesting, is the problem then (fairly simply) solvable?
>

If you are willing to use basic parsing techniques rather than a
regular expression, then it is trivial to write a solution and easy
to extend it later.

If you insist on a regular expression, then the expression grows very
quickly as you add levels of nesting, and it becomes very tricky to
keep track of.  In addition, the regular expression becomes slower
due to backtracking in the internal recursion.

For the record, Friedl's RE is 4,724 bytes.  The optimized one is
6,598 bytes.  It only matches an Internet email address.  Heaven
forbid you want to find out *why* it doesn't match in some case.
(He doesn't actually hard-code it.  Rather, he writes a program to
write the RE, because maintaining it directly would be too insane!)

The moral is that regular expressions are a tool for seeing simple
patterns in text.  They are not a good tool for analyzing the
structure of a document.  Use the right tool for the job.

Cheers,
Ben
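
To make the contrast between basic parsing and a fixed-depth regular
expression concrete, here is a minimal Ruby sketch.  It assumes the task
is only to check that parentheses balance, which is much simpler than
parsing Ruby regexps or email addresses.  The names balanced? and
nested_paren_source are hypothetical and do not come from the thread,
and the generated pattern is not Friedl's.

```ruby
# Hypothetical helper (not from the thread): true if every "(" in
# +text+ has a matching ")".  A depth counter stands in for a parser
# here, so there is no built-in limit on nesting.
def balanced?(text)
  depth = 0
  text.each_char do |ch|
    case ch
    when "("
      depth += 1
    when ")"
      depth -= 1
      return false if depth < 0 # a ")" with no "(" before it
    end
  end
  depth.zero?
end

# Hypothetical helper: builds the source of a regexp that matches one
# parenthesized group nested at most +levels+ deep.  Each supported
# level wraps the previous pattern, so the source gets longer and
# harder to read with every level added.
def nested_paren_source(levels)
  inner = '[^()]*'
  (levels - 1).times { inner = "(?:[^()]|\\(#{inner}\\))*" }
  "\\(#{inner}\\)"
end

depth3 = Regexp.new("\\A#{nested_paren_source(3)}\\z")

balanced?("(a(b(c(d))))")      # => true, any depth is fine
depth3.match?("(a(b(c)))")     # => true, three levels is within the limit
depth3.match?("(a(b(c(d))))")  # => false, four levels defeats the pattern
```

Note that the pattern is written by a small program rather than by
hand, and it only answers yes or no, while the counter version could
easily be extended to report where the mismatch occurred.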