#if Todd Gillespie
 
> Would one rather spend some quality time learning to use regexes fully, or
> spend an indefinite amount of time debugging your one-off parser?

Regexp guru? Perhaps you can tell me how to extract tokens from a
string, where a set of delimiter characters is provided, but you must
also return quoted strings "..." and bracketed strings (...) as separate
tokens. Backslash-escaped characters (\n, \a, etc.) should simply be
'unquoted' (\n -> n, \a -> a, etc.).

No, I'm not just randomly leeching some regexp knowledge here. My first
Ruby project is a port of my internet mail message (RFC822) parser,
which is in C++, using STL (and no regexps, for portability.)

The parser is completely OO and very lazy, which makes it quite nice to
use from Ruby, and also pretty fast.

On my Celeron 600, it takes 10s to parse 1000 messages, ripping out
the subject, first sender (first address in From, falling back to
Sender), date and 'has attachments' (check body part count, check
Content-Disposition of first body part).

Note that I picked these objects because they're the useful ones when
you're building an on-screen index.

The main slowdown is in the character-by-character tokenising function,
which is damn fast in C++, but dog slow in Ruby (according to require
'profile').
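
To give an idea of the shape of that function, here is a simplified
sketch, not the real code. The name tokenise is made up for
illustration, I'm assuming the enclosing quotes and brackets are dropped
from the returned token, and nested brackets are not handled.

```ruby
# Simplified sketch of a character-by-character tokeniser.
# The method name and the exact token format are illustrative assumptions.
# Splits on any character in delims, returns "..." and (...) spans as
# single tokens, and unquotes backslash escapes (\n -> n, \a -> a).
def tokenise(str, delims)
  tokens  = []
  current = "".dup
  closer  = nil     # closing character we are waiting for, or nil
  escaped = false   # true when the previous character was a backslash

  str.each_char do |c|
    if escaped
      current << c              # take the escaped character literally
      escaped = false
    elsif c == "\\"
      escaped = true
    elsif closer
      if c == closer            # end of a quoted or bracketed token
        tokens << current
        current = "".dup
        closer = nil
      else
        current << c            # delimiters are ordinary text in here
      end
    elsif c == '"' || c == "("
      tokens << current unless current.empty?
      current = "".dup
      closer = (c == '"') ? '"' : ")"
    elsif delims.include?(c)
      tokens << current unless current.empty?
      current = "".dup
    else
      current << c
    end
  end

  tokens << current unless current.empty?
  tokens
end

p tokenise('Rik <rik@example.org> "Hello World" (a comment) foo\,bar', " <>,")
```

Every character goes through that chain of comparisons inside a block
call, which is the sort of inner loop that costs next to nothing in C++
and a great deal in Ruby.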

I've been using regexps for as long as I can remember, but I still
haven't the slightest clue how to write this simple parser with regexps
instead of going character by character.

Rik