Here is my try using regexes.  I use the "copy-on-write trick" from
the suffix tree quiz: the regex is always anchored to the beginning of
the string using \A, and the matched text is discarded using
post_match.  In some places where I don't want to discard I use a
lookahead, (?=...).
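
To make the idea concrete, here is a minimal sketch of the pattern on a toy
tokenizer. It is not the code from the pastie link: the method name
`tokenize`, the token kinds, and the toy grammar are all made up for
illustration. Every regex is anchored with \A, the unparsed remainder is
replaced by post_match after each match, and one rule uses (?=...) to peek at
the next character without consuming it.

```ruby
# Hypothetical example: tokenizes input like "foo(12, bar)".
# Names and grammar are illustrative assumptions, not the original solution.
def tokenize(input)
  rest = input
  tokens = []
  until rest.empty?
    if (m = /\A\s+/.match(rest))
      # Whitespace: matched and dropped, no token emitted.
    elsif (m = /\A\d+/.match(rest))
      tokens << [:number, m[0]]
    elsif (m = /\A[a-z]+(?=\()/.match(rest))
      # Lookahead: the "(" must follow, but it stays in post_match.
      tokens << [:call, m[0]]
    elsif (m = /\A[a-z]+/.match(rest))
      tokens << [:word, m[0]]
    elsif (m = /\A[(),]/.match(rest))
      tokens << [:punct, m[0]]
    else
      raise ArgumentError, "unexpected input at #{rest[0, 10].inspect}"
    end
    # Discard the matched prefix; the next regex starts at the new \A.
    rest = m.post_match
  end
  tokens
end

p tokenize("foo(12, bar)")
```

Because the "(" after "foo" is only checked by the lookahead, it is still at
the front of the remaining string and is picked up as its own token on the
next pass.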

Using Eric's benchmark I get 36 KB/sec, but I haven't benchmarked any
other solution.

http://pastie.caboo.se/147201

Paolo