> -----Original Message-----
> From: James Edward Gray II [mailto:james / grayproductions.net] 
> Sent: Tuesday, September 13, 2005 12:31 PM
> To: ruby-talk ML
> Subject: Surprising Regexp Behavior
> 
> 
> I keep running into some surprising points with Ruby's Regexp engine  
> today and this first one just looks plain wrong to me:
> 
> irb(main):001:0> html = "<p>one</p>\n\n<p>two</p>"
> => "<p>one</p>\n\n<p>two</p>"
> irb(main):002:0> html.sub!(/<p>(.*?)<\/p>(.*)/) { $1.strip }
> => "one\n\n<p>two</p>"
> irb(main):003:0> $2
> => ""
> 
> Can anyone explain to me how that isn't a bug?

What's the bug to you?  The fact that the second set of <p></p> wasn't
stripped or the fact that $2 is empty?

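If it's the former, sub != gsub: sub replaces only the first match, while
gsub replaces every match.  If it's the latter, "." doesn't match a newline
by default, so (.*) stops at the first "\n" and captures an empty string.
You need multi-line mode (/m) to let "." match across the "\n\n":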

# Without /m
irb(main):026:0> html =~ /<p>(.*?)<\/p>(.*)/
=> 0
irb(main):027:0> $1
=> "one"
irb(main):028:0> $2
=> ""

# With /m
irb(main):023:0> html =~ /<p>(.*?)<\/p>(.*)/m
=> 0
irb(main):024:0> $1
=> "one"
irb(main):025:0> $2
=> "\n\n<p>two</p>"
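
Here's both points side by side as a small script, in case that's easier
to play with than irb. It's just a sketch: it assumes html is reset to the
original string, uses the non-destructive sub/gsub, and the variable names
(first_only, all_paras, no_m, with_m) are mine, not from the original post.

```ruby
html = "<p>one</p>\n\n<p>two</p>"

# sub replaces only the first match; gsub replaces every match.
first_only = html.sub(/<p>(.*?)<\/p>/) { $1.strip }
all_paras  = html.gsub(/<p>(.*?)<\/p>/) { $1.strip }

# Without /m, "." stops at a newline, so the trailing (.*) captures "".
no_m = html.match(/<p>(.*?)<\/p>(.*)/)

# With /m, "." also matches "\n", so (.*) captures the rest of the string.
with_m = html.match(/<p>(.*?)<\/p>(.*)/m)

p first_only   # "one\n\n<p>two</p>"
p all_paras    # "one\n\ntwo"
p no_m[2]      # ""
p with_m[2]    # "\n\n<p>two</p>"
```

Using match and indexing the MatchData avoids depending on $1/$2, which
get overwritten by the next successful match.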

Regards,

Dan