On Nov 16, 9:26 am, Charles Oliver Nutter <charles.nut... / sun.com> wrote:
> Markus Liedl wrote:
> > The grammar is hosting language neutral. It must be interpreted or
> > translated to be run, i.e. to parse something. Currently there are two
> > translators, one to Emacs Lisp, the other to C. Both produce recursive
> > descent parsers.
>
> I would be interested in hearing more about the translation process, and
> the possibility of producing RDPs in Ruby and Java.

You may have a look at the file cp-gen.el. That's the elisp code that
generates the C parser. Every one of the 31 forms I mentioned before
expands nearly directly to some C block, with very little analysis done
beforehand. Lots of temporary variables are generated for various
purposes, and it's left to the C compiler to sort out the mess (which
gcc does pretty well).

If one doesn't want to use code generation, one might create 31 classes,
each of them doing the work of one form, following the Interpreter
design pattern. The grammar reader would create object configurations
mirroring the rules in the grammar. But then you cannot use local
variables for the capture variables and a few others, and that would
cost speed again.

> > Without being sure, I'd like to claim the grammar is close to cover
> > 100% of the Ruby language. It does, for example, parse Ruby stdlib
> > completely.
>
> I'm not sure how much a measure that is; a parser could parse every
> token as a literal "1" and it would parse everything, but it wouldn't
> mean it's correct. Perhaps it's possible to roundtrip from the parsed
> result back to Ruby code and see whether the result is roughly the same
> as the original?

Hmm? There are many correct parses, as seen in the file tests.el. There
may be many bugs too. The unparser won't catch "interesting" bugs like
wrong priority.

> > On the bad side parsers using this grammar work slower. Even the
> > faster of both implementations is many times slower than the MRI
> > parser.
>
> Do you expect this can be improved? If there's a performance hit for
> using this parser it will substantially limit adoption.

Well, I never benchmarked MRI's parser before writing this grammar. The
problem for any other parser is that MRI's is blindingly fast. Still, I
don't believe speed is that important.

To give you at least one number: on my specific hardware, the C parser
groks around 3 MBytes of Ruby code per second. Does this seem fast or
slow to you?

With careful work one may make the C parser maybe two times faster,
maybe more. I don't think I will spend my time on that.

> What's your goal with this?

I will give another try at building a Ruby VM. I didn't want to use one
of those parse.y adapters. I think it's really nice and useful to have a
portable parser.

> At the moment, I don't like that there's
> only two "mostly correct" parsers in existence: Ruby's Bison-based
> parser and JRuby's Jay-based parser. They're both pretty painful to work
> with and evolve.
>
> - Charlie
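
To illustrate the Interpreter-pattern alternative mentioned earlier (one
class per grammar form instead of generated code), here is a minimal
sketch in Python. It is only an illustration under assumptions: the form
names `Lit`, `Seq`, `Alt` and `Capture` are hypothetical and are not the
31 forms of the actual grammar, and the `env` dictionary is just one
possible way to hold capture variables once they can no longer be locals.

```python
from dataclasses import dataclass
from typing import Optional


class Form:
    """Base class: one subclass per grammar form (hypothetical names)."""

    def parse(self, text: str, pos: int, env: dict) -> Optional[int]:
        """Return the position after a successful match, or None on failure."""
        raise NotImplementedError


@dataclass
class Lit(Form):
    s: str  # literal string to match

    def parse(self, text, pos, env):
        return pos + len(self.s) if text.startswith(self.s, pos) else None


@dataclass
class Seq(Form):
    parts: tuple  # sub-forms that must all match, one after another

    def parse(self, text, pos, env):
        for part in self.parts:
            pos = part.parse(text, pos, env)
            if pos is None:
                return None
        return pos


@dataclass
class Alt(Form):
    choices: tuple  # ordered alternatives; the first match wins

    def parse(self, text, pos, env):
        for choice in self.choices:
            end = choice.parse(text, pos, env)
            if end is not None:
                return end
        return None


@dataclass
class Capture(Form):
    name: str    # key under which the matched text is stored
    inner: Form

    def parse(self, text, pos, env):
        end = self.inner.parse(text, pos, env)
        if end is not None:
            # A generated parser could use a local variable here; the
            # interpreter has to go through a shared dictionary instead.
            # Captures from a failed alternative are not rolled back
            # in this sketch.
            env[self.name] = text[pos:end]
        return end


# The "grammar reader" would build an object graph like this from a rule.
rule = Seq((
    Lit("def "),
    Capture("name", Alt((Lit("foo"), Lit("bar")))),
    Lit("()"),
))

env = {}
end = rule.parse("def bar()", 0, env)
print(end, env)
```

Every match goes through a virtual `parse` call and every capture through
a dictionary lookup, which is the kind of overhead the generated C code
avoids by using plain local variables.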