On Feb 4, 2008 12:14 PM, Clifford Heath <no / spam.please.net> wrote:

> Eric Mahurin wrote:
> >> compare:
> >> space = Grammar::Element[Set[?\ ,?\t].duck!(:===, :include?)].discard
> > So, doing 1..3, in the released grammar v0.5, you'd have this instead:
> > space = (E[?\s] | E[?\t]).discard
>
> Definitely an improvement!
>
> >> Nathan's concept is that "grammar" should become a Ruby keyword,
> > I don't think ruby needs any new keywords. It already has more than it
> > needs in my opinion. There is enough power with just classes, methods,
> > blocks, and operator overloading.
>
> Part of the point of using a packrat-style parser is that there are
> no true keywords, meaning words that must only be used in their KW
> places. Many of Ruby's words are like that, of course. But my point
> was that once you say "grammar", you're now talking a different
> language, which can use as much and *only* as much of the Ruby
> grammar as it needs. And when inside a rule you say {, you're back
> in Ruby grammar... etc.

I definitely need to go learn about packrat/PEG stuff. It sounds
interesting after looking at Wikipedia. I still don't really understand
LR/LALR parsers. My main focus has been LL/recursive-descent, and PEG
sounds like it is recursive-descent.

> The normal lexer/parser split makes that fairly hard to do, as the
> lexer needs to know whether to return the KW meaning or a general
> identifier.

Fortunately, in my grammar package the lexer and parser can be specified
in exactly the same way. Also, you can use anything for tokens. If you
use a pattern to match them, it will just evaluate the expression
(pattern === next_token) to see if it is a match. So, in the lexer, you
might generate a String for any identifier/keyword. The lexer wouldn't
care whether it is a keyword or not.
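As a rough sketch of the token matching described above, here is how Ruby's
case-equality operator === behaves on plain objects. This uses only core
Ruby, not the grammar package itself; token_matches? and the sample tokens
are hypothetical names made up for illustration.

```ruby
# Hypothetical helper (not part of the grammar package): reports whether a
# token matches a pattern using Ruby's case-equality operator.
def token_matches?(pattern, next_token)
  pattern === next_token
end

# A lexer might emit plain Strings for identifiers and keywords alike,
# and other objects (here an Integer) for other kinds of tokens.
tokens = ["begin", "foo", 42]

# Class#=== tests whether the token is an instance of the class.
any_identifier = tokens.map { |t| token_matches?(String, t) }

# String#=== tests string equality, so only the exact keyword matches.
keyword_begin = tokens.map { |t| token_matches?("begin", t) }

# Range#=== tests membership, so a range of values also works as a pattern.
in_range = token_matches?(1..3, 2)
```

Because the match is just a method call on the pattern, any object that
defines a suitable === could serve as a pattern.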
In the parser, you could then match an arbitrary identifier with:

    E(String) # (String === next_token) is true if next_token is a String

or, if you wanted to match a keyword, you'd have this type of Grammar:

    E("begin") # ("begin"