Oh well, these things do look good. I'll look at them more closely tomorrow.

Thanks! Many ways to Rome... :)

Christian

"Simon Strandgaard" <neoneye / adslhome.dk> wrote in message
news:pan.2004.04.16.19.45.50.835557 / adslhome.dk...
> On Fri, 16 Apr 2004 21:34:19 +0200, Christian Kaiser wrote:
> > yes, but I have that many expressions that are 50-70 chars long, that I
> > would need hundreds of chars in one line for the expression, which is
> > neither good style nor readable. The text of my mail was just an example to
> > show the ruggedness:)))
> >
>
> I made a few improvements.. what do you think?
>
>
> server> ruby a.rb
> BAD: do DrUgS.. experience new things
> BAD: play on casino and win big money
>  OK: pleasure with Ruby. BTW: its free
> BAD: buy viagra
> BAD: christina agulera caught backstage
> BAD: britney spears
> BAD: get your university diploma here
> server> expand -t2 a.rb
> input = <<WORDS
> # pharmacy
> vicodin viagra sildenafil citrate xanax xnax valium norco levsitra
> cialis phentermine aseptic pharmacy medication pharmaceuticals? medical
> meds drugs
>
> # money
> mortgages? cable_bils? life_insurance busines_ofers? casino
>
> # music
> britney_spears christina_agulera
>
> # misc
> diploma
> WORDS
>
> lines = input.to_a
> # remove comments
> lines.delete_if {|line| line.match(/^\#/)}
> # transform lines into words
> words = lines.inject([]) {|result, line| result + line.split(/\s/) }
> # replace underscores with spaces
> words.map!{|word| word.gsub!(/_/, ' '); word}
> words.delete_if {|word| word == ''}
> #p words
>
> # lets make a big regexp
> regexp_str = words.map{|word| Regexp.escape(word)}.join('|')
> regexp = Regexp.new(regexp_str, Regexp::IGNORECASE)
> #p regexp.inspect
>
> testdata = [
>   "do DrUgS.. experience new things",
>   "play on casino and win big money",
>   "pleasure with Ruby. BTW: its free",
>   "buy viagra",
>   "christina agulera caught backstage",
>   "britney spears",
>   "get your university diploma here"
> ]
> testdata.each do |str|
>   res = regexp.match(str) ? "BAD" : " OK"
>   puts "#{res}: #{str}"
> end
> server>
>
> --
> Simon Strandgaard
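
For anyone trying the quoted script on a current Ruby: String#to_a went away in Ruby 1.9, so `input.to_a` no longer works there. Below is a minimal sketch of the same approach, assuming Ruby 2.4 or later (squiggly heredoc, Regexp#match?). The names SPAM_WORDS, build_spam_regexp and classify, and the shortened word list, are made up for this illustration and are not part of Simon's script.

```ruby
# Hypothetical, shortened word list in the same format as the quoted script:
# a leading "#" marks a comment line, "_" stands for a space inside a phrase.
SPAM_WORDS = <<~WORDS
  # pharmacy
  viagra xanax valium meds drugs

  # money
  life_insurance casino

  # music
  britney_spears christina_agulera

  # misc
  diploma
WORDS

# Build one case-insensitive regexp that matches any entry of the list.
def build_spam_regexp(text)
  words = text.each_line
              .reject { |line| line.start_with?("#") } # drop comment lines
              .flat_map(&:split)                        # split on whitespace; blank lines yield nothing
              .map { |word| word.tr("_", " ") }         # underscores become spaces
  # Each entry is escaped, so it is matched literally (as in the quoted script).
  Regexp.new(words.map { |w| Regexp.escape(w) }.join("|"), Regexp::IGNORECASE)
end

# Label a line the same way the quoted output does.
def classify(regexp, str)
  regexp.match?(str) ? "BAD" : " OK"
end

regexp = build_spam_regexp(SPAM_WORDS)
[
  "do DrUgS.. experience new things",
  "pleasure with Ruby. BTW: its free",
  "britney spears"
].each { |str| puts "#{classify(regexp, str)}: #{str}" }
```

One thing to keep in mind with this design: since every entry goes through Regexp.escape, an entry such as `mortgages?` matches a literal question mark, not an optional "s". If the `?` is meant as regexp syntax, the entries would have to be joined unescaped instead.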