From: "Robert Feldt" <feldt / ce.chalmers.se>
>
> On Tue, 20 Nov 2001, Bill Kelly wrote:
> 
> > By the way, out of curiosity is the "rockit-grammar.grammar" file
> > actually used somehow, or is it just an example of what the grammar
> > _would_ be...?  You haven't found some way to pull it up by its own
> > bootstraps have you!??  ;-)
> > 
> It bootstraps itself each time someone installs it. It's not really
> difficult; you simply give a crude way to write grammars directly in Ruby
> and then you write a parser for rockit grammars within it. Then use that
> parser to parse the Rockit grammar file's grammar and output a parser and
> you're all set. Good thing is that it's a fairly extensive test that things
> are working.

Ah, OK.  Cool!  At the start of this regexp project I toyed with the
idea of writing its lexer in terms of a simple regexp and parsing
that at initialization time.  So the real regexp parser would actually
be using a lexer built from a simple regexp. . . but the lexing is so
simple to do directly in code that this extra layer seemed likely to
only complicate or obfuscate the problem.

But for Rockit, since you'd be getting both the lexer and the parser,
that kind of bootstrapping sounds really cool.
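
To make the bootstrapping idea concrete, here is a rough sketch of how I
picture it.  This is not Rockit's actual mechanism, and it only
recognizes input rather than generating a parser.  The grammar-file format,
load_grammar and recognize are all made up for illustration: a crude
hand-written loader reads a grammar description, and the grammar it
produces is then used to check the very file that defines it.

```ruby
# Hypothetical self-describing grammar format (an assumption for this sketch):
#   name -> item item | item      items are rule names or /regex/ terminals
GRAMMAR_TEXT = <<~'GRAMMAR'
  grammar -> rule grammar | rule
  rule -> name /->/ alts /\n/
  alts -> seq /\|/ alts | seq
  seq -> item seq | item
  item -> name | regex
  name -> /[a-z]+/
  regex -> /\/\S+\//
GRAMMAR

# Stage 0: the "crude way" of getting a grammar into Ruby, done by
# splitting strings.  Returns { rule_name => [[item, ...], ...] }.
def load_grammar(text)
  text.each_line.map do |line|
    name, body = line.strip.split(" -> ", 2)
    alternatives = body.split(" | ").map do |alt|
      alt.split.map do |tok|
        if tok.start_with?("/")
          # Terminal: anchor at the current position, skip blanks first.
          Regexp.new("\\G[ \\t]*(?:#{tok[1..-2]})")
        else
          tok.to_sym
        end
      end
    end
    [name.to_sym, alternatives]
  end.to_h
end

# Stage 1: a generic recognizer driven by the loaded grammar.  Tries the
# alternatives in order and returns the end position, or nil on failure.
def recognize(grammar, rule, text, pos = 0)
  grammar.fetch(rule).each do |alternative|
    at = alternative.reduce(pos) do |p, item|
      next nil if p.nil?
      if item.is_a?(Regexp)
        m = item.match(text, p)
        m && m.end(0)
      else
        recognize(grammar, item, text, p)
      end
    end
    return at if at
  end
  nil
end

grammar = load_grammar(GRAMMAR_TEXT)
# The self-check: the grammar accepts its own source text.
self_check = recognize(grammar, :grammar, GRAMMAR_TEXT) == GRAMMAR_TEXT.length
```

If self_check comes out true, the crude loader and the recognizer agree
on the format, which is the "fairly extensive test" aspect mentioned above.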


[...]
> Ok, interesting. Will you be writing this in Ruby or C?

Straight 'C'.  Old-school 'C', in fact, with the parameter list
separate from the type declarations of the parameters. :-/

However we have the potential to expose some of its underlying
functionality through an extension, of course. . .


[...]
> The above "regexp" can for example be written 
> 
>   non_paren = plus( any_but( "(", ")" ) ) 
>   re = seq("(", (non_paren || recurse), ")")
> 
> in the Rockit parser combinators but I'm not sure of a good syntax to
> write the recursion. Unfortunately :?? doesn't work or we could use the
> same "symbol"...

Interesting.  Well, as another poster pointed out, these regexp
extensions are pretty non-intuitive as mnemonics.  I think your
"recurse" or something like "this_seq", etc. would seem to be
much more in line with your more readable grammar specifications. :-)
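
For what it's worth, here is a minimal sketch of how a named "recurse"
could be wired up in plain Ruby.  These are toy combinators of my own, not
Rockit's API: lit, alt and full_match? are names I invented, and the
bodies of plus, any_but and seq are only guesses at the behaviour.  One
caveat: Ruby does not allow || to be redefined, so "non_paren || recurse"
would simply evaluate to non_paren; the sketch uses an explicit alt()
instead.

```ruby
# Hypothetical mini-combinators, not Rockit's real API.
# A parser is a lambda (string, pos) -> end position, or nil on failure.

def lit(s)
  ->(str, pos) { str[pos, s.length] == s ? pos + s.length : nil }
end

def any_but(*chars)
  ->(str, pos) { (pos < str.length && !chars.include?(str[pos])) ? pos + 1 : nil }
end

# One or more repetitions of p (p must consume input when it succeeds).
def plus(p)
  lambda do |str, pos|
    last = nil
    while (nxt = p.call(str, pos))
      last = pos = nxt
    end
    last
  end
end

# All parsers in order; nil as soon as one fails.
def seq(*ps)
  ->(str, pos) { ps.reduce(pos) { |at, p| at && p.call(str, at) } }
end

# First parser that succeeds (ordered choice).
def alt(*ps)
  ->(str, pos) { ps.reduce(nil) { |found, p| found || p.call(str, pos) } }
end

def full_match?(parser, s)
  parser.call(s, 0) == s.length
end

# "recurse" is a lambda that looks up `re` only when it is called, so it
# can be used inside the definition of `re` itself.
re = nil
recurse   = ->(str, pos) { re.call(str, pos) }
non_paren = plus(any_but("(", ")"))
re        = seq(lit("("), alt(non_paren, recurse), lit(")"))

nested_ok = full_match?(re, "((abc))")
```

As written, the grammar accepts "(abc)" and "((abc))" but not mixed
content such as "(a(b)c)", since there is no repetition around the
alternation.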


Regards,

Bill