Oh well, this does look good. I'll take a closer look at it tomorrow. Thanks!

Many ways to Rome... :)

Christian

"Simon Strandgaard" <neoneye / adslhome.dk> wrote in message
news:pan.2004.04.16.19.45.50.835557 / adslhome.dk...
> On Fri, 16 Apr 2004 21:34:19 +0200, Christian Kaiser wrote:
> > yes, but I have so many expressions that are 50-70 chars long that I
> > would need hundreds of chars in one line for the expression, which is
> > neither good style nor readable. The text of my mail was just an example
> > to show the ruggedness :)))
> >
>
> I made a few improvements.. what do you think?
>
>
> server> ruby a.rb
> BAD: do DrUgS.. experience new things
> BAD: play on casino and win big money
>  OK: pleasure with Ruby. BTW: its free
> BAD: buy viagra
> BAD: christina agulera caught backstage
> BAD: britney spears
> BAD: get your university diploma here
> server> expand -t2 a.rb
> input = <<WORDS
> # pharmacy
> vicodin viagra sildenafil citrate xanax xnax valium norco levsitra
> cialis phentermine aseptic pharmacy medication pharmaceuticals? medical
> meds drugs
>
> # money
> mortgages? cable_bils? life_insurance busines_ofers?  casino
>
> # music
> britney_spears christina_agulera
>
> # misc
> diploma
> WORDS
>
> lines = input.to_a
> # remove comments
> lines.delete_if {|line| line.match(/^\#/)}
> # transform lines into words
> words = lines.inject([]) {|result, line| result + line.split(/\s/) }
> # replace underscores with spaces
> words.map!{|word| word.gsub!(/_/, ' '); word}
> words.delete_if {|word| word == ''}
> #p words
>
> # let's make a big regexp
> regexp_str = words.map{|word| Regexp.escape(word)}.join('|')
> regexp = Regexp.new(regexp_str, Regexp::IGNORECASE)
> #p regexp.inspect
>
> testdata = [
>   "do DrUgS.. experience new things",
>   "play on casino and win big money",
>   "pleasure with Ruby. BTW: its free",
>   "buy viagra",
>   "christina agulera caught backstage",
>   "britney spears",
>   "get your university diploma here"
> ]
> testdata.each do |str|
>   res = regexp.match(str) ? "BAD" : " OK"
>   puts "#{res}: #{str}"
> end
> server>
>
> --
> Simon Strandgaard
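
For anyone trying the quoted script on a current Ruby: String#to_a was removed in Ruby 1.9, so `input.to_a` no longer works there. Also, because every word goes through Regexp.escape, a trailing `?` (as in `mortgages?`) is matched literally rather than as "optional s". Below is a rough sketch of the same idea, assuming Ruby 2.4 or later. The names WORD_LIST and build_spam_regexp are made up for illustration, the word list is shortened, and treating a trailing `?` as "optional last character" is my guess at the intent, not something the post states.

```ruby
# Shortened word list; WORD_LIST is a hypothetical name, not from the post.
WORD_LIST = <<~WORDS
  # pharmacy
  viagra drugs pharmaceuticals?

  # money
  mortgages? casino

  # music
  britney_spears

  # misc
  diploma
WORDS

# build_spam_regexp is a hypothetical helper, not part of the quoted script.
def build_spam_regexp(text)
  words = text.each_line
              .reject { |line| line.start_with?("#") } # drop comment lines
              .flat_map(&:split)                        # split on whitespace; blank lines yield nothing
              .map { |word| word.tr("_", " ") }         # underscores stand for spaces

  # Assumption: a trailing "?" means "last character is optional".
  # Everything else in the word is escaped and matched literally.
  alternatives = words.map do |word|
    if word.end_with?("?")
      Regexp.escape(word.chomp("?")) + "?"
    else
      Regexp.escape(word)
    end
  end

  Regexp.new(alternatives.join("|"), Regexp::IGNORECASE)
end

regexp = build_spam_regexp(WORD_LIST)
["buy viagra", "pleasure with Ruby. BTW: its free"].each do |str|
  res = regexp.match?(str) ? "BAD" : " OK"
  puts "#{res}: #{str}"
end
```

The classification loop is the same as in the quoted script; only the list parsing and the handling of `?` differ.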