On 9/4/06, singsang <tomsingsang / yahoo.com> wrote:
> Dear all,
>
> I am writing some httpd logfile pre-processing (splitting the file up
> and collecting some basic numbers), and I think that I should compile
> the Regexp for the logfile entry only once.
>
> So my guess is that I should have perhaps a class LogFormat that holds
> this as a class variable or a class constant. Below I use a non-tested
> regular expression that is not complete yet.
>
> So the idea is to have:
>
> class LogFormat
>   @@RegEx = Regexp.new( '(\S+) (\S+) (\S+) \[(\d+)/(\w+)/(\d+) [+\-]\d+?\]' )
>   def LogFormat.regex
>     @@RegEx
>   end
> end
>
> If now from a class LogLine (instantiated for each line in the logfile)
> I use something like
>
> class LogLine
>   # ...
>   ip, rfc931, user, day, month, year, offset = line.match(LogFormat.regex).captures
>   # ...
> end
>
> My question: How often is the Regexp compiled? When?
> When the definition of LogFormat is read first?
>
> Btw: If anybody has a ready-to-use regex for the common log format this
> would be great, but I will get that done as far as I need by myself.
> ;-)
> Other question: Does anybody know a "Webalizer" sort of thing written
> in Ruby?
>

I tried to write something like that once.

But it was very slow (especially because it did DNS resolving
synchronously), and I never finished it.
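
As for the Regexp question: as far as I know, a class body is executed
once, when the file defining LogFormat is loaded, so Regexp.new runs
once at that point and LogFormat.regex returns the same object on every
call. Below is a minimal sketch that uses a constant instead of a class
variable. The pattern, the parse helper and the field names are my own
assumptions and have only been checked against the one sample line
shown, not against real logs.

```ruby
# Sketch only: COMMON_LOG, LogFormat.parse and the field names are
# illustrative assumptions, not part of the original post.
class LogFormat
  # The class body runs once, when this file is loaded, so the
  # Regexp object is built once and reused afterwards.
  COMMON_LOG = %r{\A(\S+) (\S+) (\S+) \[(\d+)/(\w+)/(\d+):(\d+:\d+:\d+) ([+\-]\d+)\] "([^"]*)" (\d+) (\d+|-)}

  def self.regex
    COMMON_LOG
  end

  # Returns the captured fields as an Array, or nil if the line
  # does not match the pattern.
  def self.parse(line)
    md = COMMON_LOG.match(line)
    md && md.captures
  end
end

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
ip, rfc931, user, day, month, year, time, offset, request, status, bytes =
  LogFormat.parse(line)
```

Note that a MatchData does not spread over several variables by
itself in a multiple assignment, which is why the sketch goes through
captures and returns a plain Array.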

Thanks

Michal