On Feb 5, 2008 11:44 AM, tho_mica_l <micathom / gmail.com> wrote:
> > Maybe, but then making a fast parser wouldn't be any fun :)
>
> Since the figures differ slightly from Eric
> Mahurin's benchmark it's possible that I did something wrong. But in
> this case I did it equally wrong for all solutions. The code is down
> below.

We should probably assume all of these benchmarks have +/-50% error. The
performance is highly data-set and phase-of-the-moon dependent. You can
still judge whether something has non-linear performance (i.e. quadratic
runtime) or whether one solution is 5-10X faster than another. But if two
solutions are within 2X of each other in a benchmark, I don't think there
is a clear winner.

It does look like some solutions have quadratic runtime on ruby 1.9. I
didn't observe this on 1.8.6.

I added all of the unit tests I found in this thread, plus this one:

  def test_int_parsing
    assert_same(0, @parser.parse("0"))
    assert_same(42, @parser.parse("42"))
    assert_same(-13, @parser.parse("-13"))
  end

and removed these, which don't seem correct:

    #assert_raise(RuntimeError) { @parser.parse(%{"\u0022; p 123; \u0022Busted"}) }
    #assert_equal("\\u0022; p 123; \u0022Busted",
    #             @parser.parse(%{"\\u0022; p 123; \\u0022Busted"}))

Here is a tally of failures (F) and errors (E) using this expanded unit
test suite:

    ch/s  F  E   author/gem
 -------  -  --  ----------
       -  5  0   Pawel Radecki (RE, recursive descent)
       -  6  2   ghostwheel (ghostwheel)
    1226  3  2   James Edward Gray II (peggy)
    3214  5  1   Justin Ethier (RE lexer, ruby eval, fixed numbers)
    4054  0  0   Eric Mahurin (Grammar0, no lexer, no parser generation)
    4078  2  0   Eric I (Treetop, unicode broken)
    6534  2  0   Steve (Treetop, mismatches in benchmark)
    8313  1  1   Clifford Heath (Treetop, removed handling of "\/")
   17320  0  0   Alexander Stedile (RE, recursive descent)
   54586  0  0   Eric Mahurin (Grammar, no lexer, v0.5)
  137989  2  1   Paolo Bonzini (RE, recursive descent)
  166041  2  1   Thomas Link (RE lexer, ruby eval, ruby 1.9 results)
  186042  5  0   James Edward Gray II (RE, recursive descent)
  220289  1  7*  json
  223486  0  0   Eric Mahurin (Grammar, no lexer, unreleased)
  224823  6  0   fjson (uses C extensions)
  287292  5  0   James Edward Gray II (RE, recursive, Eric optimized)
  333368  3  0   Thomas Link & Paolo Bonzini (RE + eval, unicode broken)
  388670  0  0   Eric Mahurin (recursive descent)
  553081  4  9   Eric Mahurin (Grammar, no lexer, unreleased, ruby2cext)
 1522250  0  7*  json (w/ C extensions)

For the json gem, all of the failures happen because the tests are
invalid: top-level json should only be an array or an object.

My Grammar with ruby2cext didn't work well with unit testing because it
didn't handle creating the parser multiple times. I need to fix that.

Has anyone been able to benchmark the ghostwheel json parser? I would
like to see how well it does.

Here is the complete set of unit tests I used:

require "test/unit"

class TestJSONParser < Test::Unit::TestCase
  def setup
    @parser = JSONParser.new
  end

  def test_keyword_parsing
    assert_equal(true, @parser.parse("true"))
    assert_equal(false, @parser.parse("false"))
    assert_equal(nil, @parser.parse("null"))
  end

  def test_number_parsing
    assert_equal(42, @parser.parse("42"))
    assert_equal(-13, @parser.parse("-13"))
    assert_equal(3.1415, @parser.parse("3.1415"))
    assert_equal(-0.01, @parser.parse("-0.01"))

    assert_equal(0.2e1, @parser.parse("0.2e1"))
    assert_equal(0.2e+1, @parser.parse("0.2e+1"))
    assert_equal(0.2e-1, @parser.parse("0.2e-1"))
    assert_equal(0.2E1, @parser.parse("0.2e1"))
  end

  def test_string_parsing
    assert_equal(String.new, @parser.parse(%Q{""}))
    assert_equal("JSON", @parser.parse(%Q{"JSON"}))

    assert_equal( %Q{nested "quotes"},
                  @parser.parse('"nested \"quotes\""') )
    assert_equal("\n", @parser.parse(%Q{"\\n"}))
    assert_equal( "a",
                  @parser.parse(%Q{"\\u#{"%04X" % ?a}"}) )
  end

  def test_array_parsing
    assert_equal(Array.new, @parser.parse(%Q{[]}))
    assert_equal( ["JSON", 3.1415, true],
                  @parser.parse(%Q{["JSON", 3.1415, true]}) )
    assert_equal([1, [2, [3]]], @parser.parse(%Q{[1, [2, [3]]]}))
  end

  def test_object_parsing
    assert_equal(Hash.new, @parser.parse(%Q{{}}))
    assert_equal( {"JSON" => 3.1415, "data" => true},
                  @parser.parse(%Q{{"JSON": 3.1415, "data": true}}) )
    assert_equal( { "Array"  => [1, 2, 3],
                    "Object" => {"nested" => "objects"} },
                  @parser.parse(<<-END_OBJECT) )
    {"Array": [1, 2, 3], "Object": {"nested": "objects"}}
    END_OBJECT
  end

  def test_parse_errors
    assert_raise(RuntimeError) { @parser.parse("{") }
    assert_raise(RuntimeError) { @parser.parse(%q{{"key": true false}}) }

    assert_raise(RuntimeError) { @parser.parse("[") }
    assert_raise(RuntimeError) { @parser.parse("[1,,2]") }

    assert_raise(RuntimeError) { @parser.parse(%Q{"}) }
    assert_raise(RuntimeError) { @parser.parse(%Q{"\\i"}) }

    assert_raise(RuntimeError) { @parser.parse("$1,000") }
    assert_raise(RuntimeError) { @parser.parse("1_000") }
    assert_raise(RuntimeError) { @parser.parse("1K") }

    assert_raise(RuntimeError) { @parser.parse("unknown") }
  end

  def test_int_parsing
    assert_same(0, @parser.parse("0"))
    assert_same(42, @parser.parse("42"))
    assert_same(-13, @parser.parse("-13"))
  end

  def test_more_numbers
    assert_equal(5, @parser.parse("5"))
    assert_equal(-5, @parser.parse("-5"))
    assert_equal 45.33, @parser.parse("45.33")
    assert_equal 0.33, @parser.parse("0.33")
    assert_equal 0.0, @parser.parse("0.0")
    assert_equal 0, @parser.parse("0")
    assert_raises(RuntimeError) { @parser.parse("-5.-4") }
    assert_raises(RuntimeError) { @parser.parse("01234") }
    assert_equal(0.2e1, @parser.parse("0.2E1"))
    assert_equal(42e10, @parser.parse("42E10"))
  end

  def test_more_string
    assert_equal("abc\befg", @parser.parse(%Q{"abc\\befg"}))
    assert_equal("abc\nefg", @parser.parse(%Q{"abc\\nefg"}))
    assert_equal("abc\refg", @parser.parse(%Q{"abc\\refg"}))
    assert_equal("abc\fefg", @parser.parse(%Q{"abc\\fefg"}))
    assert_equal("abc\tefg", @parser.parse(%Q{"abc\\tefg"}))
    assert_equal("abc\\efg", @parser.parse(%Q{"abc\\\\efg"}))
    assert_equal("abc/efg", @parser.parse(%Q{"abc\\/efg"}))
  end

  def test_more_object_parsing
    assert_equal({'a'=>2,'b'=>4}, @parser.parse(%Q{{ "a" : 2 , "b":4 }}))
    assert_raises(RuntimeError) { @parser.parse(%Q{{ "a" : 2, }}) }
    assert_raises(RuntimeError) { @parser.parse(%Q{[ "a" , 2, ]}) }
  end

  def test_alexander
    assert_raise(RuntimeError) { @parser.parse(%Q{"a" "b"}) }
  end

  def test_thomas
    assert_raise(RuntimeError) { @parser.parse(%{p "Busted"}) }
    assert_raise(RuntimeError) { @parser.parse(%{[], p "Busted"}) }
    assert_raise(RuntimeError) { @parser.parse(%{[p "Busted"]}) }
    assert_raise(RuntimeError) { @parser.parse(%{{1 => STDOUT.puts("Busted")}}) }
    #assert_raise(RuntimeError) { @parser.parse(%{"\u0022; p 123; \u0022Busted"}) }
    assert_raise(RuntimeError) { @parser.parse(%{"" p 123; ""}) }
    #assert_equal("\\u0022; p 123; \u0022Busted",
    #             @parser.parse(%{"\\u0022; p 123; \\u0022Busted"}))
    assert_equal('#{p 123}', @parser.parse(%q{"#{p 123}"}))
    assert_equal(['#{`ls -r`}'], @parser.parse(%q{["#{`ls -r`}"]}))
    assert_equal('#{p 123}', @parser.parse(%q{"\\u0023{p 123}"}))
    assert_equal('#{p 123}', @parser.parse(%q{"\u0023{p 123}"}))
  end

  def test_thomas2
    assert_raise(RuntimeError) { @parser.parse(%{[], p "Foo"}) }
    assert_raise(RuntimeError) { @parser.parse(%{""; p 123; "Foo"}) }
    assert_raise(RuntimeError) { @parser.parse(%{"" p 123; ""}) }
    assert_raises(RuntimeError) { @parser.parse("-5.-4") }
    assert_raises(RuntimeError) { @parser.parse(%Q{{ "a" : 2, }}) }
    assert_raise(RuntimeError) { @parser.parse(%q{true false}) }
  end
end
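
For anyone who wants to apply the rule of thumb from the benchmark discussion in this post (ignore differences under 2X, but watch for non-linear runtime), here is a minimal sketch. It is not the script that produced the ch/s figures. The helper names throughput_by_size, growth and clear_winner? are hypothetical, the generated input (a flat array of small integers) is only one possible data set, and the cutoff of 3 between "linear" and "quadratic" is an arbitrary assumption.

```ruby
require "benchmark"

# Hypothetical helper: time parser.parse on generated inputs and return
# [input size in characters, characters per second] for each requested size.
def throughput_by_size(parser, sizes)
  sizes.map do |n|
    # A flat JSON array of n small integers. Other data sets will behave differently.
    src = "[" + (["1"] * n).join(",") + "]"
    secs = Benchmark.realtime { parser.parse(src) }
    [src.size, secs > 0 ? src.size / secs : Float::INFINITY]
  end
end

# Classify how run time grows when the input size doubles.
# A time ratio near 2 suggests linear runtime, near 4 suggests quadratic.
# The cutoff of 3 is an assumed midpoint, used only for illustration.
def growth(secs_small, secs_large)
  secs_large / secs_small.to_f > 3 ? :superlinear : :roughly_linear
end

# Treat two solutions within 2X of each other as a tie.
def clear_winner?(chars_per_sec_a, chars_per_sec_b)
  fast, slow = [chars_per_sec_a, chars_per_sec_b].minmax.reverse
  fast > 2 * slow
end
```

With the figures from the tally, clear_winner?(223486, 224823) is false (the unreleased Grammar and fjson are effectively tied), while clear_winner?(388670, 54586) is true. Single timings are noisy, so in practice each size would be run several times before comparing.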