On Wed, Oct 19, 2005 at 08:16:58PM +0900, Eyal Oren wrote:
> thanks. that might work, but the problem is I think in the unions of
> the regexps that I use, see example:
> 
> because of the unions, I don't really want to decide after the match
> what to do with it, but rather state it in the constituent regexp's
> (e.g., I would like to say in the ImplicitWiki regexp what should
> happen if it is encountered)
> 
> 
> 	ExplicitWiki = /\[\[([^\]]+)\]\]/
> 
> 	# CamelCase followed by some non-word character, e.g. 'CamelCase.'
> 	ImplicitWiki = /([A-Z]+[a-z]+[A-Z]+\w*)\W/
> 
> 	# <...>, no space inside brackets
> 	Uri = /<([^<>]+)>/
> 
> 	# dc:title
> 	Prefix = /(\w*):(\w+)/
> 
> 	# "hello"
> 	Literal = /"([^"]*)"/
> 
> 	Wiki = Regexp.union ExplicitWiki, ImplicitWiki
> 	Pred = Regexp.union Wiki, Uri, Prefix
> 	Obj = Regexp.union Pred, Literal
> 	Annotation = /(#{Pred})\s*(#{Obj})\s*\./
> 
> 	Variable = /(\?\w+)/
> 	UriPattern = Regexp.union Variable, Pred
> 	LiteralPattern = Regexp.union Variable, Obj
> 	Query = /\[\?\s+#{UriPattern}\s+#{UriPattern}\s+#{LiteralPattern}\]/

I wrote the following a long time ago when I was new to Ruby.  Maybe you
could use a similar pattern:

----------------------------------------------------------------------
# Perform (possibly) multiple global substitutions on a string.
# The regexps given as keys must not use capturing subexpressions
# '(...)', because the matching key is identified by the position of
# the first non-nil capture group.
class MultiSub
  # hash has regular expression fragments (as strings) as keys, mapped
  # to Procs that will generate replacement text, given the matched value.
  def initialize(hash)
    @mash = []
    alternatives = []
    hash.each do |key, val|
      # wrap each fragment in its own group; the group's position
      # identifies which Proc to call
      alternatives << "(#{key})"
      @mash << val
    end
    @re = Regexp.new(alternatives.join("|"))
  end

  # Perform a global multi-sub on the given text, modifying the passed
  # string in place.
  def gsub!(text)
    text.gsub!(@re) { |match|
      # each key is wrapped in its own group, so the first non-nil
      # capture tells us which alternative matched
      idx = $~.captures.index { |subexp| !subexp.nil? }
      idx ? @mash[idx].call(match) : match
    }
  end
end

# Example:

mailSub = proc { |match| "<a href=\"mailto:#{match}\">#{match}</a>" }
urlSub = proc { |match| "<a href=\"#{match}\">#{match}</a>" }

sub = MultiSub.new({
  '(?:mailto:)?[\w\.\-\+\=]+\@[\w\-]+(?:\.[\w\-]+)+\b' => mailSub,
  '\b(?:http|https|ftp):[^ \t\n<>"]+[\w/]' => urlSub
})

test = "...."
sub.gsub!(test)
puts test
----------------------------------------------------------------------
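
If your fragments need their own capturing groups, as the regexps you
quoted do, one possible variation is to give each alternative a named
group and dispatch on the name instead of the position.  This is only a
rough sketch: RuleSub and the group names r0, r1, ... are made up for
illustration, the handlers are placeholders, and named groups assume
Ruby 1.9 or later.

```ruby
# Hypothetical variant of MultiSub: rules are [Regexp, Proc] pairs.
# Each rule is wrapped in a named group (r0, r1, ...), so capturing
# groups inside a rule do not disturb the dispatch.
class RuleSub
  def initialize(rules)
    @procs = rules.map { |_re, handler| handler }
    # interpolating a Regexp uses Regexp#to_s, which keeps its options
    source = rules.each_with_index.map { |(re, _handler), i|
      "(?<r#{i}>#{re})"
    }.join("|")
    @re = Regexp.new(source)
  end

  # Return a copy of text with every match replaced by the result of
  # the Proc that belongs to the alternative that matched.
  def gsub(text)
    text.gsub(@re) do |match|
      m = Regexp.last_match
      idx = @procs.each_index.find { |i| m["r#{i}"] }
      @procs[idx].call(match)
    end
  end
end

# Two of the regexps from the quoted message, captures left in place.
ExplicitWiki = /\[\[([^\]]+)\]\]/
Literal = /"([^"]*)"/

# Placeholder handlers: strip the delimiters and tag the result.
rules = [
  [ExplicitWiki, proc { |m| "<wiki:#{m[2..-3]}>" }],
  [Literal,      proc { |m| "<lit:#{m[1..-2]}>" }],
]

puts RuleSub.new(rules).gsub('see [[Home Page]] and "hello".')
# prints: see <wiki:Home Page> and <lit:hello>.
```

Each Proc still receives only the whole matched text, as in MultiSub,
so a handler that wants the inner capture has to pick it out itself.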

ta,
dave

-- 
http://david.holroyd.me.uk/