On Sun, Dec 12, 2010 at 9:55 AM, Magnus Holm <judofyr / gmail.com> wrote:
> Hey folks,
>
> When it comes to working with Ruby files in Ruby (parsing, analyzing,
> etc.), we definitely have a non-optimal solution right now. There are
> plenty of parsers, but almost none work across implementations or are
> compatible with each other:
>
> * ParseTree. 1.8. Only parses 1.8 code (including internal). UnifiedRuby AST.
> * RubyParser. Everywhere. Only parses 1.8 code. UnifiedRuby AST.
> * Ripper. 1.9 / MacRuby. Only parses 1.9 code. Event based.
> * Melbourne. C-extension. Only parses 1.8 code. Rubinius AST.
> * JRuby's parser. JRuby. Parses both 1.8 and most 1.9 code. JRuby AST.
> * RedParse. Everywhere. Parses both 1.8 and some 1.9 code. RedParse
> AST. Not used as much as the other tools.

For the record, JRuby will never freeze its AST. We have made changes
to it (for various reasons) at roughly the same rate since I started
working on the project in 2005. It's likely changes will continue to
happen. So any "standard" AST should be produced by a separate offline
parser.

> ## Suggested plan
>
> 1. Design and standardize an AST together with a test suite (based on
> current parsers' test suites)
> 2. Write converters for all the parsers above
> 3. Provide a gem with all of the converters which automatically
> chooses the best internal parser on each implementation.
> 4. Make sure everyone targets this meta-parser instead of custom AST
> provided by other parsers.
> 5. *If* this gets traction, each implementation can *then* provide
> their own Ruby::Parser class and we can slowly deprecate the gem.
>
> ## How should we proceed with this?
>
> First a few questions:
>
> Everyone, do you also see the value of this, or is it only me?
> Ruby core team, is there a way to make this an "official" AST?
> matz, what are your thoughts on a Ruby::Parser class?

I will cast my vote, whatever it's worth, for just enhancing and
improving Ryan's RubyParser. It already parses 1.8 code very well, and
being racc-based it's closest in structure to the canonical C Ruby
parser, which will make enhancing and improving it far easier than
one-offs. It also has an extensive test suite, backward-compatibility
with ParseTree, and reasonable performance.

Converters will always suffer from changes that happen to each impl's
AST, and will need constant maintenance. RubyParser needs no such
maintenance beyond adding new syntax features as they are added to
Ruby proper.

Native parsers will be faster, but re-walking the native ASTs and
converting them to a standard format could be nearly as "slow" as a
pure-Ruby parser.
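
To make that extra pass concrete, here is a rough sketch of what a
converter has to do, using Ripper's s-expression output as the "native"
tree. The `to_common` helper and the `[:block]` / `[:call]` / `[:lit]`
target shape are made up for illustration; they are not an agreed
standard and not what any existing parser emits. It handles only integer
literals and binary operators:

```ruby
require "ripper"

# Hypothetical converter: walks a Ripper s-expression and rebuilds it in a
# made-up "common" format. Only a tiny subset of node types is handled.
def to_common(node)
  type, *rest = node
  case type
  when :program
    # rest[0] is the list of top-level statements
    [:block] + rest[0].map { |stmt| to_common(stmt) }
  when :binary
    left, op, right = rest
    [:call, to_common(left), op, to_common(right)]
  when :@int
    # rest is [token_string, [line, column]]; position info is dropped
    [:lit, rest[0].to_i]
  else
    raise ArgumentError, "unhandled node type: #{type.inspect}"
  end
end

native = Ripper.sexp("1 + 2 * 3")   # first pass: the native parse
common = to_common(native)          # second pass: full re-walk of the tree
p common
# => [:block, [:call, [:lit, 1], :+, [:call, [:lit, 2], :*, [:lit, 3]]]]
```

Every node is visited and reallocated a second time, and each `when`
branch is tied to the native node layout, so it has to be revisited
whenever that layout changes.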

A pure-Ruby parser will work across all implementations that
sufficiently support standard Ruby.

The only downside at the moment is that RubyParser does not support
1.9 syntax. Ryan has a bounty on adding such support (or at least
adding tests, which may encourage him to add such support) and I'll
toss in a few bucks as well.
http://blog.zenspider.com/2010/12/bounty-ruby-parser-needs-19-lo.html

I'd also love to see someone implement a native racc backend for
JRuby...but that's a separate discussion :)

At any rate, I'd find incorporating a pure-Ruby parser into stdlib the
most acceptable option, and I think RubyParser is currently the best
candidate.

- Charlie