On 11/3/05, nobu.nokada / softhome.net <nobu.nokada / softhome.net> wrote:
>
> Hi,
>
> At Thu, 3 Nov 2005 22:13:46 +0900,
> Christian Neukirchen wrote in [ruby-talk:163903]:
> > I don't think the real problem is the real grammar, that part of
> > parse.y looks like the easier one (and rather readable) to me. The
> > problem is the lexer-parser communication, think heredocs, %q[] etc.
> > There is no way to express that in BNF.
>
> Exactly. It's a headache.
> Too much inheritance from perl :(

heredocs are the worst of all.

Since I'm writing a ruby lexer and parser in my Grammar package, I've really
been diving into the details. I'm trying to match the
lex_state/space_seen/etc way of doing things (from parse.y), but if I were
to start from scratch, I would probably make a lexer-free parser (no tokens;
the parser deals directly with characters). Of course there would be a
performance hit (the parser has to look ahead more), but you wouldn't have
to deal with the parser-driven lexer state stuff.

I would love to see the ruby syntax refactored and simplified, especially
with regard to the lexer state. The first thing would be to get rid of
heredocs, or to allow only whitespace/comments on the line after the initial
<< keyword. Secondly, I think it might be possible to reduce the lexer state
to one bit: whether the next operator is unary or binary. For example:

  %   string vs. modulus
  <<  heredoc vs. leftshift
  `   execution quote vs. method name

I may look at some of this simplification later.
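
To make the one-bit idea concrete, here is a minimal sketch in Ruby. It is a toy and not how parse.y works: the names classify, AMBIGUOUS and TOKEN are made up for illustration, it only covers % and << (the backtick case is left out), it ignores the space_seen rules, and it does not skip over the body of a literal once it has found one. The only state it keeps is a single boolean saying whether an operand is expected next.

```ruby
# Toy classifier: one bit of lexer state decides how an ambiguous
# operator is read. Hypothetical sketch, not parse.y's actual logic.
AMBIGUOUS = {
  "%"  => { operand: :percent_literal, operator: :modulus },
  "<<" => { operand: :heredoc,         operator: :left_shift },
}.freeze

# Whitespace, "<<", integers, identifiers, or any single character.
TOKEN = /\s+|<<|\d+|[A-Za-z_]\w*|./m

def classify(source)
  expect_operand = true # the single bit: true means "unary" position
  result = []
  source.scan(TOKEN) do |tok|
    next if tok =~ /\A\s+\z/ # whitespace does not change the bit here
    if AMBIGUOUS.key?(tok)
      kind = AMBIGUOUS[tok][expect_operand ? :operand : :operator]
      result << [tok, kind]
      expect_operand = true
    elsif tok =~ /\A(\d+|[A-Za-z_]\w*|[\)\]\}])\z/
      expect_operand = false # a value just ended; a binary operator may follow
    else
      expect_operand = true  # other punctuation expects an operand next
    end
  end
  result
end

p classify("a % b")       # "%" read in binary position
p classify("x = %w[a b]") # "%" read in unary position
p classify("a << b")      # "<<" read in binary position
p classify("x = <<EOS")   # "<<" read in unary position
```

The real rules are messier (for instance, whitespace around the operator matters in parse.y), so this only shows the shape of the simplification, not a replacement for lex_state.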