------art_8314_6373180.1202166478094
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On Feb 1, 2008 7:55 AM, Ruby Quiz <james / grayproductions.net> wrote:

> In honor of that, this week's Ruby Quiz is to write a parser for JSON.


Here is another solution of mine:

http://pastie.caboo.se/147505

In this one, I just made a fast hand-built recursive-descent/LL(1) parser.
This is the kind of parser that I'm trying to get my 'grammar' package to
approach (using lots of optimizations).  It uses no Regexp or ruby eval
(both of which have compiled C to help speed).  And yet, it is the fastest
pure-ruby JSON parser we've seen (see the recursive descent line below):

ch/s    author/gem
----    ----------
-       Pawel Radecki (RE, mismatch)
3214    Justin Ethier (RE lexer + ruby eval, fixed number parsing)
4054    Eric Mahurin (Grammar0, no lexer, no parser generation)
4078    Eric I (Treetop, unicode broken)
6534    oksteev (Treetop, mismatches in benchmark)
8313    Clifford Heath (Treetop, had to remove handling of "\/")
17320   Alexander Stedile (RE)
54586   Eric Mahurin (Grammar, no lexer, v0.5)
137989  Paolo Bonzini (RE)
166041  Thomas Link (RE lexer + ruby eval, ruby 1.9 results)
220289  json
223486  Eric Mahurin (Grammar, no lexer, unreleased)
224823  fjson (uses C extensions)
333368  Thomas Link & Paolo Bonzini (RE + eval, unicode broken)
388670  Eric Mahurin (hand-built recursive descent)
553081  Eric Mahurin (Grammar, no lexer, unreleased, w/ ruby2cext)
1522250 json (w/ C extensions)


#
# JSON hand-built recursive descent/LL(1) parser, by Eric Mahurin
#
require 'stringio'

class JSONParser

    def parse(s)
        @next  @ioringIO.new(s)).getc
        ws
        value(out)
        ws
        raise("EOF expected") if @next
        raise(out.inspect) unless out.length
        out[0]
    end

    def error(expected, found)
        raise("expected #{expected}, found #{found ? ("'"<<found<<?\') :
'EOF'}")
    end

    def value(out)
        if ?\[.equal?(@next)
            # array
            @nexto.getc
            ws
            a  ]
            unless ?\].equal?(@next)
                value(a)
                ws
                until ?\].equal?(@next)
                    ?\,.equal?(@next) ? (@nexto.getc) : error("','",
@next)
                    ws
                    value(a)
                    ws
                end
            end
            @next  io.getc
            out << a
        elsif ?\{.equal?(@next)
            # object
            @nexto.getc
            ws
            h  }
            unless ?\}.equal?(@next)
                ?\".equal?(@next) ? string(kv) : error("a string", @next)
                ws
                ?\:.equal?(@next) ? (@nexto.getc) : error("':'", @next)
                ws
                value(kv)
                ws
                h[kv[0]]  v[1]
                until ?\}.equal?(@next)
                    ?,.equal?(@next) ? (@nexto.getc) : error("','",
@next)
                    ws
                    ?\".equal?(@next) ? string(kv.clear) : error("a string",
@next)
                    ws
                    ?\:.equal?(@next) ? (@nexto.getc) : error("':'",
@next)
                    ws
                    value(kv)
                    ws
                    h[kv[0]]  v[1]
                end
            end
            @next  io.getc
            out << h
        elsif (?a..?z)(@next)
            # boolean
            (s)<<@next
            @next  io.getc
            while (?a..?z)(@next)
                s<<@next;@nexto.getc
            end
            out << case s
                when "true" then true
                when "false" then false
                when "null" then nil
                else error("'true' or 'false' or 'null'", s)
            end
        elsif ?\".equal?(@next)
            string(out)
        else
            # number
            n  "
            (n<<@next;@nexto.getc) if ?-.equal?(@next)
            ?0.equal?(@next) ? (n<<@next;@nexto.getc) : digits(n)
            (?..equal?(@next) ?
                (n<<@next;@nexto.getc;digits(n);exp(n);true) :
                exp(n)) ?
            (out << n.to_f) :
            (out << n.to_i)
        end
    end

    # Flattening any of the methods below will improve performance further

    def ws
        @next  io.getc while (case @next;when ?\s,?\t,?\n,?\r;true;end)
    end

    def digits(out)
        (?0..?9)@next ? (out<<@next;@nexto.getc) : error("a digit",
@next)
        while (?0..?9)@next; (out<<@next;@nexto.getc); end
        true
    end

    def exp(out)
        (case @next;when ?e,?E;true;end) ? (out<<@next;@nexto.getc) :
            return
        (out<<@next;@nexto.getc) if (case @next;when ?-,?+;true;end)
        digits(out)
    end

    def string(out)
        # we've already verified the starting "
        @nexto.getc
        s  "
        until ?\".equal?(@next)
            if ?\\.equal?(@next)
                @next  io.getc
                case @next
                when ?\",?\\,?\/ then (s<<@next;@nexto.getc)
                when ?b then (s<<?\b;@nexto.getc)
                when ?f then (s<<?\f;@nexto.getc)
                when ?n then (s<<?\n;@nexto.getc)
                when ?r then (s<<?\r;@nexto.getc)
                when ?t then (s<<?\t;@nexto.getc)
                when ?u
                    @next  io.getc
                    u  "
                    4.times {
                        case @next
                        when ?0..?9, ?a..?f, ?A..?F
                            u<<@next;@nexto.getc
                        else
                            error("a hex character", @next)
                        end
                    }
                    s << u.to_i(16)
                else
                    error("a valid escape", @next)
                end
            else
                error("a character", @next) unless @next
                s<<@next;@nexto.getc
            end
        end
        @next  io.getc
        out << s
    end

end

------art_8314_6373180.1202166478094--