On Sun, Sep 21, 2008 at 7:56 AM, Clifford Heath <no / spam.please.net> wrote:

> arcadio wrote:
>
>> I've asked the following question in the ANTLR mailing list, with no
>> luck. Since it's quite Ruby-centric, perhaps someone here knows the
>> answer, given the popularity of ANTLR.
>>
>> I'm in the process of migrating a YACC grammar to ANTLR.
>>
>> However, since the rest of the project is written in Ruby, I'd like to
>> generate the ANTLR code in Ruby, if possible.
>>
>
> Have you tried just adding the required options section below your grammar
> statement? It just worked for me, though the generated code is a little
> quirky. The Ruby generator has been a standard part of ANTLR for years.
>
> options {
>       language = Ruby;
>       output = AST;
> }
>
> I found that ANTLR was very poor at reporting situations for which it
> couldn't generate correct code (or rather, the designer has a strange
> idea of "correct"), and in particular the lexing behaviour is very
> non-intuitive. I was attempting a very difficult grammar that needs
> LL(*) behaviour, and as I know now, needs to handle lexing using parse
> rules (no lexer at all). After some unhelpful arguments, I gave up and
> implemented my grammar using Treetop. Terribly slow, but powerful and
> simple to use, once you get the hang of PEG parsing.
>
> Clifford Heath.


Hey Clifford,

I still don't know enough about packrat parsing.  Do you think it is
possible for this type of parser to reach the same performance as LL (or
LALR) parsers, or is there extra overhead that must be paid at parse time
(as opposed to being accounted for at parser generation time)?  In my
benchmarking, the 3 packrat parsers I looked at (Treetop, Ghostwheel, and
Peggy) are 10-100x slower than pure Ruby LL and LALR parsers.

I was also an ANTLR user (you might find my C preprocessor in ANTLR
2.7.6 releases) before writing my own parser.  I decided to stick with LL
parsing for my Grammar project.  It handles lexer-free parsers (though you
can still have a lexer) and handles LL(*) parsing, but with a performance
penalty: backtracking.  From my understanding, packrat parsers shine in
backtracking by memoizing to maintain linear performance.  I'm wondering if
I could use this technique and still maintain my non-backtracking
performance.  I'm thinking I could also make an "engine" for Grammar that did
packrat or even LALR parsing instead of LL parsing.  Independently, I should
be able to translate a PEG syntax, Regexp syntax, or BNF (YACC/ANTLR) to a
Grammar.
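
To make the memoizing idea concrete, here is a rough sketch of what I
understand a packrat parser to do. It is not how Treetop or Grammar are
implemented; the class name PackratSketch, the toy grammar, and the call
counter are all made up for illustration. Each (rule, position) pair is
evaluated at most once, so backtracking to an earlier alternative reuses the
cached result instead of re-parsing.

```ruby
# Hypothetical toy packrat parser for:  expr <- term '+' expr / term
#                                       term <- [0-9]+
class PackratSketch
  attr_reader :calls

  def initialize(input)
    @input = input
    @memo  = {}            # [rule, pos] => [value, next_pos], or nil on failure
    @calls = Hash.new(0)   # how many times each rule body actually ran
  end

  # Apply a rule at a position, reusing the cached result when one exists.
  # Failures (nil) are cached too, hence key? rather than a truthiness check.
  def apply(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)
    @calls[rule] += 1
    @memo[key] = send(rule, pos)
  end

  def expr(pos)
    left = apply(:term, pos)
    return nil unless left
    value, nxt = left
    if @input[nxt] == '+'
      right = apply(:expr, nxt + 1)
      return [value + right[0], right[1]] if right
    end
    # Second alternative: backtrack to pos; term here is served from the memo.
    apply(:term, pos)
  end

  def term(pos)
    m = /\G\d+/.match(@input, pos)   # \G anchors the match at pos
    m && [m[0].to_i, m.end(0)]
  end

  # Returns the summed value, or nil unless the whole input was consumed.
  def parse
    result = apply(:expr, 0)
    result && result[1] == @input.length ? result[0] : nil
  end
end
```

The cost is the memo table itself: one entry per rule per input position that
gets tried, which is the parse-time overhead I was asking about above.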

Also, do any of you have a JSON parser for ANTLR with a Ruby target?  JSON is
what I've been benchmarking with (because of the Ruby Quiz) and I'd like to
compare against ANTLR.

Eric
