On 11/12/06, Ross Bamford <rosco / roscopeco.remove.co.uk> wrote:
> On Sun, 12 Nov 2006 15:01:56 -0000, Peter Schrammel
> <peter.schrammel / gmx.de> wrote:
> > Why is there a limitation at all? I implemented the same thing in perl
> > and it no complains ...
> > Is the regexp engine of perl that much better?
> >
>
> Irrespective of whether regex the best solution for your needs, it seems
> Oniguruma will improve the situation somewhat with respect to large
> regular expressions.

I built a local version of 1.8.5 with the oniguruma engine:

http://raa.ruby-lang.org/project/oniguruma/

And re-ran (a slight variation of) my test program:

[~]$ ruby foo
Using the <undefined> regex engine.
Converted a list of 1 words into a regex 8 bytes long.
Converted a list of 2 words into a regex 36 bytes long.
Converted a list of 4 words into a regex 48 bytes long.
Converted a list of 8 words into a regex 73 bytes long.
Converted a list of 16 words into a regex 173 bytes long.
Converted a list of 32 words into a regex 352 bytes long.
Converted a list of 64 words into a regex 718 bytes long.
Converted a list of 128 words into a regex 1415 bytes long.
Converted a list of 256 words into a regex 2656 bytes long.
Converted a list of 512 words into a regex 5210 bytes long.
Converted a list of 1024 words into a regex 10105 bytes long.
Converted a list of 2048 words into a regex 19432 bytes long.
Converted a list of 4096 words into a regex 37509 bytes long.
@_@

[~]$ /usr/local/bin/ruby foo
Using the Oniguruma regex engine.
Converted a list of 1 words into a regex 11 bytes long.
Converted a list of 2 words into a regex 16 bytes long.
Converted a list of 4 words into a regex 38 bytes long.
Converted a list of 8 words into a regex 97 bytes long.
Converted a list of 16 words into a regex 185 bytes long.
Converted a list of 32 words into a regex 359 bytes long.
Converted a list of 64 words into a regex 686 bytes long.
Converted a list of 128 words into a regex 1387 bytes long.
Converted a list of 256 words into a regex 2715 bytes long.
Converted a list of 512 words into a regex 5264 bytes long.
Converted a list of 1024 words into a regex 10074 bytes long.
Converted a list of 2048 words into a regex 19439 bytes long.
Converted a list of 4096 words into a regex 37452 bytes long.
Converted a list of 8192 words into a regex 71931 bytes long.
Converted a list of 16384 words into a regex 135572 bytes long.
Converted a list of 32768 words into a regex 253027 bytes long.
Converted a list of 65536 words into a regex 461607 bytes long.
Converted a list of 131072 words into a regex 808171 bytes long.
Converted a list of 262144 words into a regex 1326345 bytes long.
Converted a list of 479625 words into a regex 1873539 bytes long.
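
For anyone who wants to try something similar: the test script itself isn't included in this message, so the following is only a rough sketch of the general shape such a program might take, not the actual code. The helper names (`words_to_regex`, `doubling_sizes`, `report`) are made up for illustration, and plain `Regexp.union` is an assumption standing in for however the real script builds its pattern. The byte counts above suggest the original does something different (random word selection and probably some prefix sharing), so don't expect this to reproduce those numbers.

```ruby
# Hypothetical sketch only; the original test program is not shown here.

# Build one alternation regex from a list of words.
# Regexp.union escapes each string, so metacharacters in words are safe.
def words_to_regex(words)
  Regexp.union(words)
end

# List sizes to try: 1, 2, 4, ... doubling, and finally the full list size.
def doubling_sizes(total)
  sizes = []
  n = 1
  while n < total
    sizes << n
    n *= 2
  end
  sizes << total
  sizes
end

# Return one report line per list size. If the engine rejects the pattern,
# a RegexpError is rescued and reported instead of aborting the run.
def report(words)
  doubling_sizes(words.size).map do |n|
    begin
      regex = words_to_regex(words.first(n))
      "Converted a list of #{n} words into a regex #{regex.source.bytesize} bytes long."
    rescue RegexpError => e
      "Failed at #{n} words: #{e.message}"
    end
  end
end

# Tiny demo with an inline word list (a real run would load a dictionary file).
puts report(%w[apple banana cherry date elder])
```

With a dictionary of 479625 words, `doubling_sizes` yields the same 20 list sizes as the second run above (1 through 262144, then the full list).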