Hi all,

Last week was my Ruby one-year anniversary; I've now been a happy Ruby
user for a year. I just want to take the opportunity to thank matz and
all of you in the community for making Ruby such a wonderful
experience. The language itself is great and fills most of my
programming needs, and the prompt and to-the-point replies, ideas and
code from the community are invaluable.

To "celebrate" I decided it's time I make a public release of rockit.
It's my largest pure-Ruby project so far and you can find the README
below. If you want to try it out, download it at
www.ce.chalmers.se/~feldt/ruby/rockit-0-3-5.tgz.

Note that it's still alpha and there are lots of things to do. There is
a RubyInRuby parser in there but it's very rudimentary and basically
only parses the literals and one-liners. I've simply translated parse.y
to the rockit grammar format. I need to go through and manually tune
things so that it's a valid rockit grammar with proper priorities etc.

Anyway, I hope someone finds this useful. I intend to bug matz and the
list heavily with questions on the Ruby grammar to get a working
RubyInRuby parser. I'd be glad if someone wants to help out.

Regards,

Robert

Sorry for the length; I have to break it into pieces:

*****************************************************
* rockit - Ruby O-o Compiler construction toolKIT   *
*****************************************************
Version: 0.3.5
Release date: 2001-06-11
Available from: https://sourceforge.net/projects/rockit/
  but temporarily download from
  www.ce.chalmers.se/~feldt/ruby/rockit-0-3-5.tgz
  (since I didn't have the time to understand SourceForge's release
  process; I guess it's simple... ;))
Author: Robert Feldt, feldt / ce.chalmers.se
README version: $Id: README,v 1.6 2001/06/11 15:42:51 feldt Exp $

What is it?
-----------
An easy-to-use, object-oriented compiler construction toolkit written
in and generating code for Ruby. Currently focusing on the "front-end"
phases of compiler construction.

Main features of rockit:
* Grammars written in Extended Backus-Naur Form (=> use *, ? and +
  ops).
* Generates both lexer and parser.
* Parsers return abstract syntax trees (ASTs).
* Generated ASTs support simple tree-walking using iterators.
* "Ruby-friendly", with for example Arrays for repetition, Regexps for
  tokens etc.
* More advanced parser than yacc's LALR(1). (If you're curious, it's
  called "Generalized LR parsing with scanner forking"!)
* ASTs can be dumped to PostScript (if you have graphviz/dot).
* Reports when the grammar is ambiguous and shows the alternative ways
  to parse the sentence. Helps you resolve ambiguities.
* Associativity and precedence can be specified based on
  productions/rules in the grammar (NOT on operators, which is less
  "portable").

Installation?
-------------
1. Unpack the tarball (if you haven't already)
2. Install: ruby install.rb
3. If you've got RubyUnit you can also run the tests:
   ruby tests/runtests.rb

Why is it needed?
-----------------
* No need to write a lexer/scanner; rockit gives you both a lexer and a
  parser from the same grammar.
* No need to write standard code for building an abstract syntax tree;
  rockit automatically generates it and you can specify how the tree
  should look.
* rockit-generated parsers build the AST; *NO* need to write "action
  code" in the grammar. "Action code" is separated from the grammar.
* More powerful grammar operators (EBNF's *, ? and +).
* You can write grammars directly in Ruby code.
* Rockit will show you why your grammar is ambiguous (if it is!) by
  showing you the two ways the sentence can be parsed. This helps you
  resolve the ambiguity by introducing priorities.

But we already have two excellent compiler compilers in/for Ruby!
-----------------------------------------------------------------
Yes, but they (racc and rbison) both use the bison/yacc format which,
IMHO, is not optimal for an OO language like Ruby. You also have to
write the "action code" (the code to be executed for each production)
by hand.
This is sometimes a good thing if you simply want to extract some info,
but for general use you probably want to make several passes over the
result from the parse (which will likely be an AST). Instead of you
writing the code for building the AST, rockit does it for you.

In the longer term rockit will include components that are typically
not part of compiler compilers like yacc and bison, for example
syntax-directed translation, pretty-printer generation etc.

Example of using the rockit command-line program?
-------------------------------------------------
  $ rockit my_grammar.grammar myparser.rb MyModule my_parser
  Generated parser for my_grammar.grammar and saved it in myparser.rb.
  Use it by doing:
    require 'myparser'
    ast = MyModule.my_parser.parse "..."

Example of using the rockit lib in Ruby code?
---------------------------------------------
  require 'rockit/rockit'

  parser = Parse.generate_parser <<-'END_OF_GRAMMAR'
  Grammar ExampleGrammar
    Tokens
      Blank  = /\s+/  [:Skip]
      Number = /\d+/
    Productions
      Expr -> Number          [^]
           |  Expr '+' Expr   [Plus: left,_,right]
           |  Expr '-' Expr   [Minus: left,_,right]
           |  Expr '*' Expr   [Mul: left,_,right]
           |  Expr '/' Expr   [Div: left,_,right]
           |  '(' Expr ')'    [^: _,expr,_]
    Priorities
      left(Plus), left(Minus), left(Mul), left(Div)
      Div = Mul > Plus = Minus
  END_OF_GRAMMAR

  # Evaluate the AST by recursing on the node name.
  def calc_eval(ast)
    case ast.name
    when "Plus"
      calc_eval(ast.left) + calc_eval(ast.right)
    when "Minus"
      calc_eval(ast.left) - calc_eval(ast.right)
    when "Mul"
      calc_eval(ast.left) * calc_eval(ast.right)
    when "Div"
      calc_eval(ast.left) / calc_eval(ast.right)
    when "Number"   # leaf: the Number token lifted by [^]
      ast.lexeme.to_i
    end
  end

  calc_eval(parser.parse '(4*((2+6)-3))/2') # => 10

Requirements?
-------------
Memoize from RAA (or my Ruby page) is used for a slight performance
increase, but things should work without it. Please mail me if they
don't! Otherwise it should work with any Ruby >= 1.6.

If you've got strscan by Minero Aoki installed it will be used and give
a slight performance increase.
But things work even if you haven't.

I've successfully used rockit with Ruby 1.7.0 (2001-04-02) and cygwin
1.1.8 (gcc version 2.95.2-6) on Windows 2000 Professional.

NOTE THAT THIS IS AN ALPHA RELEASE SO THERE WILL LIKELY BE BUGS AND THE
API WILL LIKELY CHANGE.

RubyUnit is needed to run the unit tests.

Documentation?
--------------
Not much yet. Check out the examples in the examples dir.

You can get a good intro to writing grammars by looking at the grammar
for rockit grammars. It's in lib and called
'rockit-grammar-files.grammar'. You can also compare it to the grammar
in bootstrap.rb which is (almost) the same grammar but written directly
in Ruby code.

Also check out the tests. Lots of good info and examples in there.

More examples of use?
---------------------
There is some stuff in the examples dir:
* calculator - simple read-eval calculator
* minibasic - interpreter for a subset of BASIC in 46 LOC!
* polynomials - examples of evaluating and differentiating polynomials,
  from the ANTLR tutorial.
* ruby - rudimentary parser for Ruby. Translates parse.y to a rockit
  grammar, but more work is needed for it to be useful. Note that I've
  only tried it with parse.y from Ruby 1.6.3 and there is reported to
  be a problem with later ones. I'll check it out soon...

License and legal issues?
-------------------------
rockit is copyright (c) 2001 Robert Feldt, feldt / ce.chalmers.se.
All rights reserved.

rockit is distributed under LGPL. See LICENSE and COPYING-LESSER.
Parsers you generate are LGPL so that should not restrict you. If it
does, please mail me.

Special things to note?
-----------------------
Rockit is currently:
* SLOW! Especially when generating the parser but also when parsing. I
  haven't given performance much thought yet and haven't profiled, so
  expect significant performance gains when we get to this issue on the
  TODO.
* BAD AT HANDLING AND REPORTING ERRORS! Will be fixed when someone
  shows me "the/a right way" to do it.

Plans for the future?
---------------------
Lots of stuff, see TODO.

Do you have comments or questions?
----------------------------------
I'd appreciate it if you dropped me a note if you're successfully using
rockit. If there are some known users I'll be more motivated to pack up
additions / new versions and post them to RAA. Please give feedback!

What is Generalized LR parsing?
-------------------------------
(You don't need to understand this to use rockit but if you're
interested you might learn something about parsing...)

It is a pseudo-parallel parsing algorithm which runs a dynamically
varying number of LR parsers in parallel. LR parser generators, such as
yacc and bison, generate a table with parsing actions. If there is an
ambiguity in the language, or the generation technique used introduces
ambiguities because it's "imperfect", there are multiple actions in
some position(s) in the table. In ordinary (yacc-style) LALR(1) parsing
these are called conflicts and must be resolved by rewriting the
grammar or introducing associativity and precedence rules, since the
parser must take one and only one action.

In generalized LR parsing all actions are taken, by spawning off a
parser for each one of them. So if the ambiguity arose not because of
the grammar but because of the limitations of the parser generator, all
but the parser taking the correct action will fail. And if there are
multiple ways to parse the sequence they will all be found!

This procedure incurs a performance penalty at parse time, but it can
be overcome by clever encodings of the different parsers and their
data.

Happy coding!

Robert Feldt, feldt / ce.chalmers.se
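
A footnote to the Generalized LR section: to make "multiple ways to parse the sequence" concrete, here is a minimal sketch in plain Ruby. It is NOT rockit's algorithm or API and it does not build LR tables; it simply brute-forces every parse tree for the ambiguous toy grammar Expr -> Number | Expr '-' Expr. The method names all_parses and evaluate are made up for this illustration.

```ruby
# Brute-force enumeration of all parse trees for the ambiguous grammar
#   Expr -> Number | Expr '-' Expr
# Illustration only: a GLR parser finds the same set of trees, but by
# forking LR parsers at table conflicts rather than by trying every split.
def all_parses(tokens)
  # A single token is taken to be a Number leaf.
  return [tokens.first.to_i] if tokens.size == 1

  trees = []
  tokens.each_with_index do |tok, i|
    next unless tok == "-"
    left_tokens  = tokens[0...i]
    right_tokens = tokens[(i + 1)..-1]
    next if left_tokens.empty? || right_tokens.empty?

    # Every combination of a left parse and a right parse is a valid tree.
    all_parses(left_tokens).each do |left|
      all_parses(right_tokens).each do |right|
        trees << [:Minus, left, right]
      end
    end
  end
  trees
end

# Evaluate a tree: leaves are Integers, inner nodes are [:Minus, l, r].
def evaluate(tree)
  return tree if tree.is_a?(Integer)
  _, left, right = tree
  evaluate(left) - evaluate(right)
end

trees = all_parses(%w[8 - 3 - 2])
trees.each { |t| puts "#{t.inspect} => #{evaluate(t)}" }
```

The two trees it prints evaluate to different values (7 for the right-nested tree, 3 for the left-nested one), which is why an ambiguity report plus a priority such as left(Minus) matters: it selects the left-nested reading.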