On Thu, Jun 25, 2009 at 10:39 AM, Andreas Hansen<hansenandr / gmail.com> wrote:
> it's the first time my friend or I have worked with regex. my friend has
> rewritten the regex a bit, maybe it makes more sense now ...
>
> another thing:
> some usernames are really hard to extract from the packets. an example:
> G-eX.Dowden

> cat /tmp/z
p(("12:23:59.378678 IP 85.225.108.54.54707 > 81.227.132.223.6112: P " +
   "5 90518027:590518071(44) ack 2582330461 win 64240\nE..Te.@.t..U." +
   "l6Q .......#2....<]P.........,................wakko0..........@." +
   ".....\n"                                                          +
   "12:23:59.378678 IP 85.225.108.55.54707 > 81.227.132.223.6112: P " +
   "5 90518027:590518071(44) ack 2582330461 win 64240\nE..Te.@.t..U." +
   "l6Q .......#2....<]P.........,................wa-kk.o0.........." +
   "@......"
  ).scan(
    %r{
      # capture the address after "IP"
      IP\s((?:\d{1,3}\.){3}\d{1,3})\.

      .+?  # skip (non-greedy)

      # capture the flag
      :\s([PSF])\s\d

      .+?                # skip (non-greedy)
      ^E.{5}@.{8}Q.{30}  # skip the index pattern
      .+?                # skip (non-greedy)

      # capture the username surrounded by dots: 2+ before, at most 1 after
      \.{2,}(\w[-\w\.]+\w)\.?
    }mx  # m: "make dot match newlines"
  )
)

> ruby /tmp/z
[["85.225.108.54", "P", "wakko0"], ["85.225.108.55", "P", "wa-kk.o0"]]
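
To check the harder name from the question (G-eX.Dowden), the username part of the regex can be tried on its own. This is only a rough sketch: the dot padding in the sample strings is made up for illustration and is not real packet data, and USERNAME is just a name picked here for the sub-pattern.

```ruby
# Username sub-pattern copied from the regex above, tested in isolation.
USERNAME = /\.{2,}(\w[-\w\.]+\w)\.?/

# Hypothetical payload fragments: the dot padding is invented, only the
# names themselves come from this thread.
samples = [
  "........wakko0..........",
  "........wa-kk.o0..........",
  "........G-eX.Dowden..........",
]

# String#[] with a regex and a group index returns that capture (or nil).
results = samples.map { |payload| payload[USERNAME, 1] }
p results
```

Note that the sub-pattern needs at least three characters (one \w, one or more of [-\w.], one \w), so a two-letter username would not be captured, and neither would a name that starts or ends with a dash or dot.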