2009/10/15 steve <zyzygy / telstra.com>:
> George George wrote:
>>
>> I have a script in which I would like to match a string against
>> 'many' regular expression patterns.
>>
>> def group(string)
>>   group =1
>> else
>> ...
>>  
>> end
>>
>> My worry is the number of patterns that I have (exceeding 400) and the
>> efficiency and sanity of such an approach. What would you advise?
>>
>>
>> Thank you.
>
> Are the matches randomly distributed across the patterns? If not,
> you could arrange for the most common patterns in the first search, the next
> most common in a subsequent search and so on.
>
> Also recall that compiled regexps can be assigned to variables or constants,
> including arrays.

Yep, this could be used to do something like

PATTERNS = {
  1 => [/p1|p2|p3/, /p4|p5|p6/],
  2 => [/p7|p8/],
}

def group(s)
  PATTERNS.each do |gr, pats|
    return gr if pats.any? {|rx| rx =~ s} # short circuit!
  end
  nil # or exception
end
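
To make that concrete, here is a minimal self-contained sketch of the same
lookup. The patterns and the names SAMPLE_PATTERNS and sample_group are
made-up placeholders for illustration, not George's real patterns:

```ruby
# Hypothetical placeholder patterns; substitute the real ones.
SAMPLE_PATTERNS = {
  1 => [/\Afoo/, /bar\z/],
  2 => [/\A\d+\z/],
}

# Returns the first group with a matching pattern, or nil if none matches.
def sample_group(s)
  SAMPLE_PATTERNS.each do |gr, pats|
    return gr if pats.any? { |rx| rx =~ s } # stops at the first match
  end
  nil
end

sample_group("foobaz")  # => 1
sample_group("12345")   # => 2
sample_group("quux")    # => nil
```

Note that on Ruby 1.9 and later a Hash iterates in insertion order, so the
groups are tried in the order they are written.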

Obviously you want the most frequent patterns at the beginning of the
arrays and the least frequent at the end.
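
One possible way to get that ordering, assuming you can collect hit counts
from a sample run, is sketched below. The counts and the names HITS, ORDERED
and COMBINED are invented for illustration. Whether folding the list into a
single alternation with Regexp.union is actually faster would need measuring:

```ruby
# Hypothetical hit counts per pattern, e.g. gathered from a sample run.
HITS = {
  /rare/     => 3,
  /common/   => 120,
  /moderate/ => 40,
}

# Sort by descending hit count so the most frequent pattern is tried first.
ORDERED = HITS.sort_by { |_rx, count| -count }.map { |rx, _count| rx }

# Optionally fold the ordered list into one alternation, keeping that order.
COMBINED = Regexp.union(ORDERED)
```

ORDERED can then be used as one of the arrays in PATTERNS, or COMBINED as a
single entry of such an array.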

I'm not sure though about the sanity aspect of this.  George, can you
disclose more details about the nature of your matching?  Maybe
there's a better and / or more efficient solution.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/