Aidan wrote:

> Are Ruby regular expressions, especially those built using the
> Regexp::MULTILINE option, greedy by default?
> 
> Is there any way to make them non-greedy?
> 
> I need to scan over a large HTML table which consists of a large number of
> table row units:
> 
> <TR>
>   <TD>stuff</TD>
>   <TD>other stuff</TD>
> </TR>
> <TR>
>   <TD>stuff</TD>
>   <TD>other stuff</TD>
> </TR>
> 
> and so on ... The program needs to extract each <TR></TR> unit and process
> it individually. Will a RE like
> 
> <TR>.*</TR>
> 
> need to be recoded with some guards against greediness?

Parsing HTML with regexes is not a good idea (that is, it's more complex 
than you'd think). For specific files you could use something like

<TR>.*?</TR>

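The question mark after the * makes the match non-greedy (quantifiers are 
greedy by default). Since each row spans several lines, the pattern also 
needs the Regexp::MULTILINE option (/m) so that . matches newlines.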
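
As a rough sketch of the difference, here is the greedy pattern next to the non-greedy one on a table like the one in the question. The sample string and the variable names (html, greedy_rows, lazy_rows) are made up for illustration, and it assumes a Ruby recent enough to support the <<~ heredoc; both patterns use /m so that . also matches newlines.

```ruby
# Hypothetical sample input modelled on the table in the question.
html = <<~HTML
  <TR>
    <TD>stuff</TD>
    <TD>other stuff</TD>
  </TR>
  <TR>
    <TD>stuff</TD>
    <TD>other stuff</TD>
  </TR>
HTML

# Greedy: .* runs to the last </TR>, so both rows come back as one match.
greedy_rows = html.scan(/<TR>.*<\/TR>/m)

# Non-greedy: .*? stops at the first </TR>, giving one match per row.
lazy_rows = html.scan(/<TR>.*?<\/TR>/m)

puts greedy_rows.size
puts lazy_rows.size

# Process each row individually.
lazy_rows.each_with_index do |row, i|
  cells = row.scan(/<TD>(.*?)<\/TD>/m).flatten
  puts "row #{i}: #{cells.inspect}"
end
```

This only holds up for simple, regular markup like the sample; nested tables, attributes on the tags, or mixed-case tags would all need more than this pattern.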

But you should also look at the html-parser module.

I've written a module that knows about HTML structure and am writing one 
that builds a tree.