On Thu, Jun 25, 2009 at 10:39 AM, Andreas Hansen <hansenandr / gmail.com> wrote:
> It's the first time my friend or I have worked with regex. My friend has
> rewritten the regex a bit; maybe it makes more sense now ...
>
> Another thing:
> some usernames are really hard to extract from the packets. An example:
> G-eX.Dowden

> cat /tmp/z
p(("12:23:59.378678 IP 85.225.108.54.54707 > 81.227.132.223.6112: P " +
   "5 90518027:590518071(44) ack 2582330461 win 64240\nE..Te.@.t..U." +
   "l6Q .......#2....<]P.........,................wakko0..........@." +
   ".....\n" +
   "12:23:59.378678 IP 85.225.108.55.54707 > 81.227.132.223.6112: P " +
   "5 90518027:590518071(44) ack 2582330461 win 64240\nE..Te.@.t..U." +
   "l6Q .......#2....<]P.........,................wa-kk.o0.........." +
   "@......"
  ).scan(
    %r{
       # capture the address after "IP"
       IP\s((?:\d{1,3}\.){3}\d{1,3})\.
       .+?                  # skip (non-greedy)
       # capture the flag
       :\s([PSF])\s\d
       .+?                  # skip (non-greedy)
       ^E.{5}@.{8}Q.{30}    # skip the index pattern
       .+?                  # skip (non-greedy)
       # capture the username surrounded by dots: 2+ before, 0+ after
       \.{2,}(\w[-\w\.]+\w)\.?
      }mx                   # m: dot matches newlines; x: whitespace and comments are ignored
  )
)

> ruby /tmp/z
[["85.225.108.54", "P", "wakko0"], ["85.225.108.55", "P", "wa-kk.o0"]]
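
If it helps, here is a rough sketch of the same pattern wrapped in a method, with named groups so each result is labelled. The regex itself is unchanged from the one above. The names PACKET_RE, SAMPLE and extract_logins are made up for illustration. The sample packet is hypothetical too: it is the first packet from the post with the username swapped for G-eX.Dowden, and it assumes the payload character after "E..Te." is "@", which the index pattern requires.

```ruby
# Sketch only: same pattern as in the post, with named groups.
# PACKET_RE, SAMPLE and extract_logins are illustrative names, not an existing API.
PACKET_RE = %r{
  IP\s(?<address>(?:\d{1,3}\.){3}\d{1,3})\.  # address after "IP"
  .+?                                        # skip (non-greedy)
  :\s(?<flag>[PSF])\s\d                      # the flag
  .+?                                        # skip (non-greedy)
  ^E.{5}@.{8}Q.{30}                          # skip the index pattern
  .+?                                        # skip (non-greedy)
  \.{2,}(?<username>\w[-\w\.]+\w)\.?         # username: 2+ dots before, 0+ after
}mx

# Hypothetical packet: the post's first sample with the username replaced.
SAMPLE =
  "12:23:59.378678 IP 85.225.108.54.54707 > 81.227.132.223.6112: P " \
  "5 90518027:590518071(44) ack 2582330461 win 64240\nE..Te.@.t..U." \
  "l6Q .......#2....<]P.........,................G-eX.Dowden.........." \
  "@......\n"

# Returns one hash per packet that matches the pattern.
def extract_logins(dump)
  dump.scan(PACKET_RE).map do |address, flag, username|
    { address: address, flag: flag, username: username }
  end
end

puts extract_logins(SAMPLE).first[:username]
```

On this sample the capture should come back as G-eX.Dowden, because the username class already allows "-" and "." between the first and last word characters. Names that start or end with a non-word character would still be cut short.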