On 04.09.2006 14:11, singsang wrote:
> Dear all,
> 
> Writing some httpd logfile pre-processing (splitting it up, getting
> already some basic numbers), I think that I should compile the Regexp
> for the logfile entry only once.
> 
> So my guess is that I should have perhaps a class LogFormat that holds
> this as a class variable or a class constant. Below I use a non-tested
> regular expression that is not complete yet.
> 
> So the idea is to have:
> 
> class LogFormat
>   @@RegEx = Regexp.new( '(\S+) (\S+) (\S+) \[(\d+)/(\w+)/(\d+) [+\-]\d+?\]' )
>   def LogFormat.regex
>     @@RegEx
>   end
> end

You don't need a class variable for that.  A simple class instance 
variable or a constant is sufficient.
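
A minimal sketch of both alternatives.  The pattern here is a simplified 
stand-in I made up for illustration (your own regexp is not complete 
yet), and the sample line is an assumed common-log-format entry:

```ruby
class LogFormat
  # Option 1: a constant.  The literal is turned into a Regexp object
  # once, when this class body is executed.
  # (Hypothetical, simplified pattern; not the original poster's regexp.)
  REGEX = %r{\A(\S+) (\S+) (\S+) \[(\d+)/(\w+)/(\d+):\d+:\d+:\d+ ([+\-]\d+)\]}

  # Option 2: a class instance variable, i.e. an instance variable of
  # the LogFormat class object itself.  Unlike @@vars it is not shared
  # with subclasses.
  @regex = REGEX

  class << self
    attr_reader :regex   # defines LogFormat.regex
  end
end

# Assumed sample input for illustration.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'
fields = LogFormat.regex.match(line).captures
```

Both LogFormat::REGEX and LogFormat.regex hand back the very same 
Regexp object on every call.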

> If now from a class LogLine (instantiated for each line in the logfile)
> I use something like
> 
> class LogLine
>   # ...
>   ip, rfc931, user, day, month, year, offset =
> line.match(LogFormat.regex)
>   # ...
> end
> 
> My question: How often is the Regexp compiled? When?
> When the definition of LogFormat is read first?

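It's compiled every time the line "@@RegEx = ..." is evaluated.  That 
line sits in the class body, which is executed when the definition of 
LogFormat is read - so most likely only once.  Calling LogFormat.regex 
afterwards only returns the existing object.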
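
You can convince yourself with a small experiment.  This is only a 
sketch: the class name LogFormatDemo, the global counter and the 
shortened pattern are all made up for the demonstration:

```ruby
# Counts how often the class body below is executed.
# (Hypothetical helper, only for this demonstration.)
$class_body_runs = 0

class LogFormatDemo
  $class_body_runs += 1                    # runs when the class body runs
  @@RegEx = Regexp.new('(\S+) (\S+) (\S+)') # compiled at the same moment

  def LogFormatDemo.regex
    @@RegEx                                 # just returns the stored object
  end
end

# Fetch the regexp several times and record the identity of each result.
ids = Array.new(3) { LogFormatDemo.regex.object_id }
```

After this, $class_body_runs is 1 and all entries in ids are identical, 
i.e. the accessor never creates a new Regexp.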

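Note though that it's usually faster to use a regexp literal inline.  A 
literal without interpolation is compiled only once, when the code is 
parsed, and using it directly saves the method call and the variable 
lookup.  So in your case you might have a method that does the line 
parsing (or multiple line parsing) and that's where you can put the 
inline regexp for max efficiency.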
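
Roughly like this.  Again just a sketch: the pattern is a simplified one 
I assumed for illustration, and raising ArgumentError on a non-matching 
line is my choice, not something from your code.  Note that it uses 
MatchData#captures, because a multiple assignment does not unpack a 
MatchData object by itself:

```ruby
class LogLine
  attr_reader :ip, :rfc931, :user, :day, :month, :year, :offset

  def initialize(line)
    # Inline literal without interpolation: the Regexp object is created
    # once and reused on every call of this method.
    # (Hypothetical, simplified pattern.)
    md = %r{\A(\S+) (\S+) (\S+) \[(\d+)/(\w+)/(\d+):\d+:\d+:\d+ ([+\-]\d+)\]}.match(line)
    raise ArgumentError, "unparsable log line: #{line.inspect}" unless md

    # captures returns only the groups, without the full match at index 0.
    @ip, @rfc931, @user, @day, @month, @year, @offset = md.captures
  end
end

# Assumed sample input for illustration.
entry = LogLine.new('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326')
```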

Kind regards

	robert