Hello --

On Sat, 12 Jan 2002, Massimiliano Mirra wrote:

> On Sat, Jan 12, 2002 at 03:56:15AM +0900, Jack Dempsey wrote:
> > Often I will want to do many regex substitutions: different patterns
> > with different replacements.
>
> Here is a stripped down version of a proofreader tool I wrote.  Half
> of it would be enough to make my point, but somebody might find it
> useful beyond this subject so I'm posting the whole gsub part.

[...]

>   text = STDIN.read
>
>   # rp, lp and cp stand for right-, left- and center-punctuation
>   rp_re = /[!$%&\)+,\.:;=>?@\]^|}~]/
>   rp = rp_re.source
>
>   lp_re = /[\(\-<\[{]/
>   lp = lp_re.source
>
>   cp_re = /[\/'\-\\]/
>   cp = cp_re.source

[...]

>     # replace single lf's with white space
>     [/\n/, ' '],
>     # delete whitespaces at line beginnings
>     [/^\s*/, ''],
>     # correct ellipsis
>     [/\.\.\.\.+/, '...'],

The three-dot ellipsis is correct in cases where something inside a
sentence is elided.  With four dots, the first dot is really the period
at the end of a sentence.

>     # delete spaces before right side punctuation
>     [/\s*(#{rp})/, '\1'],
>     # ensure spaces after right side punctuation
>     [/(#{rp})(\w)/, '\1 \2'],

What if your text is:

  The product cost $10.00.  And 3 + 4 = 7.  @var is an example
  of a Ruby instance variable.

>     # delete spaces after left side punctuation
>     [/(#{lp})\s*/, '\1'],
>     # ensure spaces before left side punctuation
>     [/(\w)(#{lp})/, '\1 \2'],
>     # correct punctuation that should be attached to both sides
>     [/\s*(#{cp})\s*/, '\1'],
>     # correct acronyms, even if we fucked them up previously
>     [/([A-Z]\.) ([A-Z]\.) /, '\1\2'],

What about:

  I write programs in C. Do you?

>     # finally, re-expand paragraph breaks (cr) to double lf
>     [/\r/, "\n\n"]
>   ]
>
>   substitutions.each {|from, to| text.gsub!(from, to)}

Hmmmm... looks sort of hash-like.  Just being pedantic :-)


David

-- 
David Alan Black
home: dblack / candle.superlink.net
work: blackdav / shu.edu
Web:  http://pirate.shu.edu/~blackdav
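
For anyone who wants to try the "$10.00", "3 + 4 = 7" and "@var" cases from the message, here is a minimal sketch.  It assumes that only the two right-side punctuation rules are applied, in the quoted order, and it leaves out the rest of the substitution list.  RULES and apply_rules are hypothetical names used for illustration; they are not part of the original tool.

```ruby
# Right-side punctuation class, copied from the quoted post.
rp_re = /[!$%&\)+,\.:;=>?@\]^|}~]/
rp = rp_re.source

# Assumption: only these two rules from the quoted list, in the same order.
RULES = [
  # delete spaces before right side punctuation
  [/\s*(#{rp})/, '\1'],
  # ensure spaces after right side punctuation
  [/(#{rp})(\w)/, '\1 \2'],
]

# Hypothetical helper: applies each [pattern, replacement] pair in turn
# and returns a new string, leaving the input untouched.
def apply_rules(text, rules = RULES)
  rules.inject(text) { |acc, (from, to)| acc.gsub(from, to) }
end

samples = [
  'The product cost $10.00.',
  'And 3 + 4 = 7.',
  '@var is an example of a Ruby instance variable.',
]

# Print each sample next to its rewritten form.
samples.each do |sample|
  puts "#{sample.inspect} -> #{apply_rules(sample).inspect}"
end
```

Under these assumptions the three samples come out as "The product cost$ 10. 00.", "And 3+ 4= 7." and "@ var is an example of a Ruby instance variable.", which suggests the right-side class is too broad for prices, arithmetic and sigils.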