On Sat, Jan 12, 2002 at 03:56:15AM +0900, Jack Dempsey wrote:
> Often I will want to do many regex substitutions: different patterns
> with different replacements. 

Here is a stripped-down version of a proofreader tool I wrote.  Half
of it would be enough to make my point, but somebody might find it
useful beyond this subject, so I'm posting the whole gsub part.

I say ``stripped'' because the ``full'' version also corrects Italian
accents (which I doubt most people here will find any use for) and
wraps paragraphs using a snippet from the cookbook.

HTH
Massimiliano



text = STDIN.read

# rp, lp and cp stand for right-, left- and center-punctuation
rp_re = /[!$%&\)+,\.:;=>?@\]^|}~]/
rp    = rp_re.source

lp_re = /[\(\-<\[{]/
lp    = lp_re.source

cp_re = /[\/'\-\\]/
cp    = cp_re.source

# substitutions are applied *in* *order*
substitutions = [
  # replace cr + lf with lf
  [/\r\n/, "\n"],
  # add double lf at the end
  [/\z/, "\n\n"],
  # replace multiple lf's (paragraph break) with a cr to keep track of them
  [/\n\n+/, "\r"],
  # replace single lf's with white space
  [/\n/, ' '],
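  # delete whitespace at the start of the text and after paragraph breaks
  # (no lf's are left at this point, so ^ would only match at the very start)
  [/\A\s+|(\r)[ \t]+/, '\1'],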
  # correct ellipsis
  [/\.\.\.\.+/, '...'],
  # delete spaces before right side punctuation
  [/\s*(#{rp})/, '\1'],
  # ensure spaces after right side punctuation
  [/(#{rp})(\w)/, '\1 \2'],
  # delete spaces after left side punctuation
  [/(#{lp})\s*/, '\1'],
  # ensure spaces before left side punctuation
  [/(\w)(#{lp})/, '\1 \2'],
  # correct punctuation that should be attached to both sides
  [/\s*(#{cp})\s*/, '\1'],
  # correct acronyms, even if we fucked them up previously
  # (the lookahead leaves the space after the last letter alone)
  [/([A-Z]\.) (?=[A-Z]\.)/, '\1'],
  # finally, re-expand paragraph breaks (cr) to double lf
  [/\r/, "\n\n"]
]

substitutions.each {|from, to| text.gsub!(from, to)}
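
The table above relies on ordering: paragraph breaks are parked in a
cr placeholder while single lf's are flattened, then expanded again at
the end.  A minimal sketch of that idea in isolation follows; the
helper name apply_substitutions, the three-rule table and the sample
string are made up for illustration and are not part of the tool.

```ruby
# Hypothetical helper (not in the original script): applies each
# [pattern, replacement] pair in order and returns a new string,
# leaving the input untouched.
def apply_substitutions(text, rules)
  rules.inject(text) { |acc, (from, to)| acc.gsub(from, to) }
end

# Illustrative three-rule table using the cr placeholder trick.
rules = [
  [/\n\n+/, "\r"],   # park paragraph breaks in a cr
  [/\n/,    ' '],    # flatten the remaining single lf's
  [/\r/,    "\n\n"]  # re-expand paragraph breaks
]

sample = "a\nb\n\nc"

p apply_substitutions(sample, rules)          # => "a b\n\nc"
# Same rules in the opposite order: the paragraph break is lost.
p apply_substitutions(sample, rules.reverse)  # => "a b  c"
```

With the rules reversed, the lf rule runs before the paragraph break
has been protected, so both kinds of newline end up as plain spaces.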