On Jun 15, 2005, at 3:36 PM, Nikolai Weibull wrote:

> Ezra Zygmuntowicz wrote:
>
>> Could someone help me do a little regex conversion? I've got a
>> few Perl-compatible regexes from a PHP script I am trying to port to
>> Ruby but I need a little help. Here are the PHP functions:
>>
>> $buffer = preg_replace("#(?<!\"|http:\/\/)www\.(?:[a-zA-Z0-9\-]+\.)*[a-zA-Z]{2,4}(?:/[^ \n\r\"\'<]+)?#", "http://$0", $buffer);
>> $buffer = preg_replace("#(?<!\"|href=|href\s=\s|href=\s|href\s=)(?:http:\/\/|https:\/\/|ftp:\/\/)(?:[a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,4}(?::[0-9]+)?(?:/[^ \n\r\"\'<]+)?#", "<a href=\"$0\" target=\"_blank\">$0</a>", $buffer);
>> $buffer = preg_replace("#(?<=[\n ])([a-z0-9\-_.]+?)@([^,< \n\r]+)#i", "<a href=\"mailto:$0\">$0</a>", $buffer);
>
> OK, this wins my newly instated prize for _worst regexes ever_.
> Inefficient, inconclusive, inconsistent, and just plain wrong. I really
> hope you don't have to work with a lot of code like this.
>
> Nonetheless, here's my solution:
>
> domain_part = /(?:[[:alnum:]\-]+\.)/
> tld = /[[:alpha:]]{2,4}/
> buffer.gsub!(/(?<!"|http:\/\/)www\.#{domain_part}*#{tld}/, 'http://\0')
> buffer.gsub!(/(?<!\"|href=|href\s=\s|href=\s|href\s=)
>               (?:https?|ftp):\/\/#{domain_part}+#{tld}
>               (?::\d+)?(?:\/[^\s"'<]+)?/x,
>              '<a href="\0" target="_blank">\0</a>')
> buffer.gsub!(/(?<=\s)[[:alnum:]\-_.]+@[^,<\s]+/i,
>              '<a href="mailto:\0">\0</a>')
>
> Totally untested, but at least it's somewhat easier to understand and a
> bit more correct. There are better ways to extract URLs and email
> addresses from an input than this, mind you,
>         nikolai
>
> --
> Nikolai Weibull: now available free of charge at http://bitwi.se/!
> Born in Chicago, IL USA; currently residing in Gothenburg, Sweden.
> main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}

Nikolai-

Thank you. I have inherited a ton of NASTY PHP code like this at the newspaper I work at. I am rewriting it all in Rails and Ruby CGI scripts.
But the guy who wrote this stuff is no longer here, and I think he liked making his code as obfuscated as possible in order to keep his job secure. I am by no means a regex master, so digesting volumes of stuff like this hurts my head. Thank you for the help.

-Ezra Zygmuntowicz
Yakima Herald-Republic WebMaster
509-577-7732
ezra / yakima-herald.com
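
For anyone who wants to try the patterns from Nikolai's reply, here is a minimal sketch that wraps the three substitutions in one method. The method name `autolink` and the constants `DOMAIN_PART` and `TLD` are made up for illustration, the mailto replacement is assumed to mirror the PHP original, and the lookbehinds need Ruby 1.9 or later. It keeps the limitations of the original patterns (for example, an address at the very start of the input is not linked because the pattern requires preceding whitespace).

```ruby
# Hypothetical helper; names are illustrative, patterns follow the reply above.
DOMAIN_PART = /(?:[[:alnum:]\-]+\.)/   # one dot-terminated host label
TLD         = /[[:alpha:]]{2,4}/       # crude top-level-domain match

def autolink(text)
  text
    # 1. Prefix bare "www." hosts with a scheme, unless already quoted
    #    or already preceded by "http://".
    .gsub(/(?<!"|http:\/\/)www\.#{DOMAIN_PART}*#{TLD}/, 'http://\0')
    # 2. Wrap http/https/ftp URLs in an anchor, skipping ones that already
    #    sit inside an href attribute or a quoted string.
    .gsub(/(?<!"|href=|href\s=\s|href=\s|href\s=)
           (?:https?|ftp):\/\/#{DOMAIN_PART}+#{TLD}
           (?::\d+)?(?:\/[^\s"'<]+)?/x,
          '<a href="\0" target="_blank">\0</a>')
    # 3. Turn whitespace-preceded email addresses into mailto links
    #    (replacement assumed from the PHP version).
    .gsub(/(?<=\s)[[:alnum:]\-_.]+@[^,<\s]+/i,
          '<a href="mailto:\0">\0</a>')
end

puts autolink("Visit www.example.com or mail me at ezra@example.com today")
```

The order matters: step 1 has to run before step 2 so that bare `www.` hosts have a scheme by the time the URL pattern looks for one.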