Yasuo Saito <y_saito / mx10.freecom.ne.jp> writes:

> But, I cannot express paired brackets(<...>), which
> might be nested, in a single regular expression.

No - I don't think that's possible. But with a bit of munging you can
use regexps to handle paired delimiters. I do it when I convert our
book from LaTeX to XML, for example.

The way I do it is to make two passes. On the first, I tag all the
delimiters. So I'd change

   ab<cd>ef

into

   ab<:0001:cd>:0001:ef

and

   ab<cd<e>f>

into

   ab<:0002:cd<:0001:e>:0001:f>:0002:

The Ruby to do this is pretty simple:

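    count = "0000"
    OPEN  = 1.chr        # control character that doesn't appear in a document
    CLOSE = 2.chr        # and another

    # Tag the innermost pairs first. The placeholders hide already-tagged
    # brackets from the regexp, so each pass reaches the next level out.
    1 while @content.gsub!(/<([^<>]*)>/m) {
      count = count.succ
      "#{OPEN}:#{count}:#$1#{CLOSE}:#{count}:"
    }

    # Turn the placeholders back into real brackets.
    @content.gsub!(/#{OPEN}/,  '<')
    @content.gsub!(/#{CLOSE}/, '>')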

Once I've got that, I can quickly find matching delimiters:

  /<:(\d\d\d\d):.*?>:\1:/
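
To make the two passes concrete, here is a minimal self-contained sketch
of the whole round trip. The method names (tag_brackets, matched_pairs)
and the constants OPEN_MARK and CLOSE_MARK are made up for illustration,
and I'm assuming the text never contains the two control characters used
as placeholders and has fewer than 10000 bracket pairs.

```ruby
# Hypothetical names throughout; a sketch, not the script I actually use.
OPEN_MARK  = 1.chr   # stands in for '<' while tagging
CLOSE_MARK = 2.chr   # stands in for '>' while tagging

# First pass: number each bracket pair, innermost first.
def tag_brackets(text)
  tagged = text.dup
  count  = "0000"
  1 while tagged.gsub!(/<([^<>]*)>/m) {
    count = count.succ
    "#{OPEN_MARK}:#{count}:#{$1}#{CLOSE_MARK}:#{count}:"
  }
  # Swap the placeholders back to real brackets.
  tagged.tr(OPEN_MARK + CLOSE_MARK, "<>")
end

# Second pass: the backreference ties each opener to its own closer.
def matched_pairs(tagged)
  tagged.scan(/<:(\d\d\d\d):(.*?)>:\1:/m)
end

tagged = tag_brackets("ab<cd<e>f>")
puts tagged                      # ab<:0002:cd<:0001:e>:0001:f>:0002:
matched_pairs(tagged).each { |id, body| puts "#{id}: #{body}" }
                                 # 0002: cd<:0001:e>:0001:f
```

Note that scan only reports non-overlapping matches, so this lists the
outermost pairs; to get at the inner ones you would run matched_pairs
again on each body.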


Regards


Dave