RM> I am trying to extract the href from links in HTML, however the
RM> regular expression matcher doesn't appear to stop in the correct
RM> place.

RM> I intend the regular expression to extract the href that is
RM> enclosed in quotes and return that into $1. However it seems to
RM> 'miss' the first set of quotes and a following one and finally
RM> stop on a third set.

RM> Here is an example

RM> s="<A href=\"l.htm\">xxx <IMG src=\"images/la.gif\" width=4></A>"
RM> s =~ /<.*A.*href *= *"(.*)".*>/   => 0
RM> $1   => "l.htm\">xxx <IMG src=\"images/la.gif"

    s = '<A href="l.htm">xxx <IMG src="images/la.gif" width=4></A>'
    s =~ /<a\s+href\s*=\s*\"*([^>\"\s]+).*/i
    p $1

The above captures everything after the href= up to the first quote,
whitespace, or > (so the closing quote or trailing space, if there is
one, is not included).

Is that what you were looking for?

regards,
-joe
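
For what it's worth, the overshoot in the original pattern comes from `.*` being greedy: `(.*)` takes as much as it can, so the capture ends at the last quote that still has a `>` somewhere to its right. The following is only a minimal sketch comparing three patterns on the sample string from the question. The variable names (`greedy`, `lazy`, `negated`) and the two alternative patterns are illustrative and not taken from the thread.

```ruby
# Sample input from the question.
s = '<A href="l.htm">xxx <IMG src="images/la.gif" width=4></A>'

# Greedy: (.*) grabs as much as possible, so the capture runs on to
# the last quote that is still followed by a ">".
greedy = s[/<.*A.*href *= *"(.*)".*>/, 1]

# Non-greedy: (.*?) stops at the first closing quote it can.
lazy = s[/<a\s+href\s*=\s*"(.*?)"/i, 1]

# Negated character class: the capture cannot cross a quote at all.
negated = s[/<a\s+href\s*=\s*"([^"]*)"/i, 1]

p greedy
p lazy     # => "l.htm"
p negated  # => "l.htm"
```

Unlike the pattern in the reply, both alternatives assume the attribute value is always wrapped in double quotes. For real-world HTML an HTML parser is usually more robust than any regular expression.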