From: "Robert Feldt" <feldt / ce.chalmers.se>
>
> Well, Rockit 0.4 will be built on a parser combinator framework which is
> something of a Negexp, ie. you can express regular and non-regular
> grammars in it. On the really long-term TODO there is a point mentioning
> that this should be used to do a Negexp engine for Ruby but I'm unsure if
> it's really useful. If you want non-regular stuff (or some hairy regular
> stuff) you probably shouldn't compress the description to a short
> string. But I'm not sure...

Checked out Rockit on Sourceforge, looks really nice!  From some of
the bullet points in your documentation it sounds like Rockit
eliminates some of the aspects of lex+yacc that I've often thought
seemed unfortunate or kinda messy (e.g. "No need to write a
lexer/scanner; rockit gives you both a lexer and a parser from the
same grammar." and, "rockit-generated parsers builds the AST; NO need
to write ``action code'' in the grammar. ``Action code'' separated
from grammar.")

By the way, out of curiosity, is the "rockit-grammar.grammar" file
actually used somehow, or is it just an example of what the grammar
_would_ be...?  You haven't found some way to pull it up by its own
bootstraps have you!??  ;-)

> Ps. If you're not familiar with regular vs. non-regular grammars then the
> canonical example of what the former cannot do is balanced/nested
> parentheses. You can't do it with one regexp (although people have shown
> how to do it with one (two?) regexps and supporting code). In a
> "formalism" supporting non-regular grammars it's no problem.

Yes, in fact it was this that got me started.  The Perl (??{ })
extension provides at least a way to specify such a construct in a
regexp.  From the perlre doc: (slightly rubified)

  (??{ code })
    This is a ``postponed'' regular subexpression.  The code is
    evaluated at run time, at the moment this subexpression may match.
    The result of evaluation is considered as a regular expression and
    matched as if it were inserted instead of this construct.

      re = %r{
                \(
                (?:
                   (?> [^()]+ )    # Non-parens without backtracking
                 |
                   (??{ re })      # (Sub-)group with matching parens
                )*
                \)
             }x

(I'm also considering having (??) be shorthand for 'this-regex',
which would look like...

      re = %r{
                \(
                (?:
                   (?> [^()]+ )    # Non-parens without backtracking
                 |
                   (??)            # (Sub-)group with matching parens
                )*
                \)
             }x

...and which also wouldn't require interaction with the Ruby parser.
But I'm hoping to be able to do the more general (??{ code }) version
too.)

Regards,

Bill
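
For comparison, here is a minimal sketch of the same balanced-parens
match written with a named-group subexpression call, \g<paren>, in
place of (??{ re }).  It assumes a Ruby whose regexp engine supports
\g<name> (Oniguruma/Onigmo, i.e. Ruby 1.9 or later), which is not the
engine discussed above.  BALANCED and balanced_groups are names made
up for illustration only.

```ruby
# Assumption: the regexp engine supports \g<name> subexpression calls.
# BALANCED is a hypothetical name, not part of any library.
BALANCED = %r{
  (?<paren>
    \(
    (?:
       (?> [^()]+ )    # Non-parens without backtracking
     |
       \g<paren>       # (Sub-)group with matching parens
    )*
    \)
  )
}x

# Hypothetical helper: collect every outermost balanced group in text.
def balanced_groups(text)
  groups = []
  # scan sets the match data for each match inside the block.
  text.scan(BALANCED) { groups << Regexp.last_match(0) }
  groups
end
```

Calling balanced_groups on a string returns the outermost
parenthesized groups as an array of strings, and an empty array when
no group ever closes.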