craig duncan <duncan / nycap.rr.com> wrote:
>
>In message "[ruby-talk:9078] Re: Regexp for matching Ruby reg exps?"
> >     on 01/01/11, "Ben Tilly" <ben_tilly / hotmail.com> writes:
> > >Friedl's solution only handles nesting to a limited depth.
> > >Enough for parsing email addresses but not a general
> > >solution.  Not only is handling arbitrary nesting in a
>
>Which means that if you build a parser to handle nesting of parens up
>to, say, 1000 deep, that you'd be able to defeat this parser by giving
>it an expression with a deeper nesting than that.  Theoretically true
>but not very practical.  Realistically, assuming some reasonable fixed
>limit on nesting, is the problem then (fairly simply) solveable?
>
If you are willing to use basic parsing techniques rather
than a regular expression then it is trivial to write a
solution and easy to extend it later.
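
As a rough illustration of the parsing approach, here is a
minimal Ruby sketch that checks whether parentheses balance to
any depth.  The method name balanced_parens? is made up for this
example, and it only looks at "(" and ")", not at the rest of
the regexp or address syntax.

```ruby
# Hypothetical helper: true if every "(" in str has a matching ")".
# A single counter tracks the current nesting depth, so there is
# no built-in limit on how deep the nesting may go.
def balanced_parens?(str)
  depth = 0
  str.each_char do |ch|
    case ch
    when "("
      depth += 1
    when ")"
      depth -= 1
      return false if depth < 0   # a ")" with nothing open
    end
  end
  depth.zero?                     # anything still open is an error
end
```

Extending this later (escapes, character classes, other bracket
types) means adding branches to the case statement rather than
rewriting a pattern.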

If you insist on a regular expression, then the expression
grows very quickly as you add levels of nesting and becomes
very tricky to keep track of.  In addition, the regular
expression becomes slower due to backtracking in the
internal recursion.
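
To get a feel for the fixed-depth approach, here is a small Ruby
sketch that builds such a pattern by wrapping the previous level
in another layer.  The method name nested_paren_source is made up
for this example, it handles only plain "(" and ")", and it is
not Friedl's program.

```ruby
# Hypothetical helper: returns regexp source text that matches one
# parenthesized group nested at most `depth` levels (depth >= 1).
def nested_paren_source(depth)
  source = "\\([^()]*\\)"          # depth 1: no inner parens allowed
  (depth - 1).times do
    # Each extra level embeds the whole previous pattern once more.
    source = "\\((?:[^()]|#{source})*\\)"
  end
  source
end

# Usage: anchor the pattern and compare sizes at a few depths.
two_deep = Regexp.new("\\A" + nested_paren_source(2) + "\\z")
[1, 2, 5, 10].each do |d|
  puts "depth #{d}: #{nested_paren_source(d).length} characters"
end
```

Even with a single kind of bracket, every level adds another
layer that is hard to read.  A grammar in which the nested
construct can appear in several places embeds the previous
pattern several times per level, so it grows much faster.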

For the record, Friedl's RE is 4,724 bytes.  The optimized
one is 6,598 bytes.  All it does is match an Internet email
address.  Heaven forbid you want to find out *why* it
doesn't match in some case.

(He doesn't actually hard-code it.  Rather he writes a
program to write the RE because maintaining it directly
would be too insane!)

The moral is that regular expressions are a tool for
seeing simple patterns in text.  They are not a good tool
for analyzing the structure of a document.  Use the right
tool for the job.

Cheers,
Ben