Ruboids:

Someone recently posted this:

>   o There's a difference between syntax checking and verification of 
> functional correctness

That is indeed why a test case that spot-checks your syntax is less useful
than a test case that understands your needs. All unit testing
starts at the former and aims at the latter. Here's an example of a test
that only checks the syntax of some generated JavaScript:

  ondblclick = div.attributes['ondblclick']
  assert_match(/^new Ajax\.Updater\("hammy_id"/, ondblclick)

It trivially asserts that a DIV (somewhere) contains an ondblclick
handler, and that the handler starts with a Prototype Ajax.Updater call
(Prototype being the library underneath Script.aculo.us). The assertion
naturally cannot test that the Updater will indeed update a DIV.

To get closer to the problem, we might decide to get closer to the
JavaScript. We may need a mock-JavaScript system to evaluate that string.
It could return a record of the calls the script makes, the technique
commonly called a "Log String Test". Here's an example, using a slightly
more verbose language:

public void testPaintGraphicsintint() {
  Mock mockGraphics = new Mock(Graphics.class);
  mockGraphics.expects(once()).method("setColor").with(eq(Color.decode("0x6491EE")));
  mockGraphics.expects(once()).method("setColor").with(same(Color.black));
  mockGraphics.expects(once()).method("drawPolygon");
  mockGraphics.expects(once()).method("drawPolygon");
  hex.paint((Graphics) mockGraphics.proxy());
  mockGraphics.verify();
}

From the top, that mocks your graphics display driver, and retains its
non-retained graphics commands. Then the mockGraphics object
verifies a certain series of calls, with such-and-so parameters.

(That is a Log String Test because it's the equivalent of writing commands
like "setColor" and "drawPolygon" into a log file, and then reading this
to assert things.)

That test case indeed fits the ideal of moving away from testing raw
syntax, and closer to testing semantics. Such a test, for example, could
more easily ignore extraneous calls, and then check that two dynamic
polygons did not overlap.
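
A Log String Test needs no mock library at all. Here's a minimal Ruby
sketch of the same idea, assuming a made-up LogGraphics class and a made-up
paint_hex method (neither comes from any real library). The fake graphics
object draws nothing and only records each call in a string:

```ruby
# Hypothetical stand-in for a graphics context: it draws nothing and
# only appends each call it receives to a log string.
class LogGraphics
  attr_reader :log

  def initialize
    @log = String.new
  end

  def set_color(color)
    @log << "setColor #{color}\n"
  end

  def draw_polygon(points)
    @log << "drawPolygon #{points.length}\n"
  end
end

# Hypothetical code under test: fills a hexagon, then outlines it.
def paint_hex(graphics)
  corners = [[0, 1], [1, 0], [2, 0], [3, 1], [2, 2], [1, 2]]
  graphics.set_color('0x6491EE')
  graphics.draw_polygon(corners)
  graphics.set_color('black')
  graphics.draw_polygon(corners)
end

graphics = LogGraphics.new
paint_hex(graphics)

# Read the log back and assert on it, skipping calls we don't care about.
draws = graphics.log.lines.grep(/^drawPolygon/)
raise 'expected two polygons' unless draws.length == 2
```

Because the assertion greps the log, an extra setColor call would not
break it; only a missing or extra polygon would.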

Now suppose I envision this testage:

  def ondblclick
    %(new Ajax.Updater("node",
        "/ctrl/act",
        { asynchronous:true,
          evalScripts:true,
          method:"get",
          parameters:"i_b_a=Parameter" })
    ).gsub("\n", '').squeeze(' ')
  end

  def test_some_js
    js = ondblclick()
    parse = assert_js(js)  
    statement = parse.first
    assert_equal 'new Ajax.Updater', statement.get_method
    assert_equal '"node"', statement.get_param(0)
    assert_equal '"/ctrl/act"', statement.get_param(1)
    json = statement.get_param(2)
    assert_equal true, json['evalScripts']
  end

The goal is that the target JS can flex easily (reorder its JSON,
change fuzzy details, or add new features) without breaking the tests.
Ideally, only changes that break project requirements will break tests.
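
To make that goal concrete, here's a small sketch comparing a test pinned
to raw text with a lookup on parsed structure. The js_options helper is
made up for this illustration; it only handles bare words and
double-quoted strings without embedded quotes or colons, so treat it as a
toy and not as a parser:

```ruby
# Hypothetical helper: pulls key:value pairs out of a flat JS object
# literal and returns them as a Hash of raw source strings.
def js_options(source)
  pairs = source.scan(/([[:alnum:]_]+):("[^"]*"|[[:alnum:]_]+)/)
  Hash[pairs]
end

original  = '{ asynchronous:true, evalScripts:true, method:"get" }'
reordered = '{ method:"get", evalScripts:true, asynchronous:true }'

# A test pinned to the raw text only matches the original ordering...
brittle = /asynchronous:true, evalScripts:true/
raise 'should match original'      unless original  =~ brittle
raise 'should not match reordered' if     reordered =~ brittle

# ...while a lookup on the parsed structure survives the reordering.
raise 'lookup broke' unless js_options(reordered)['evalScripts'] == 'true'
raise 'meaning changed' unless js_options(original) == js_options(reordered)
```

Ruby compares Hashes without regard to key order, which is exactly the
flexibility the structural test wants.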

Now suppose I want to write that assert_js() using fewer than seven billion
lines of code.

The first shortcut is to only parse code we expect. I'm aware that goes
against the usual philosophy of parsing, but I'm trying to sell an
application, not a JS parser; the parser is a private detail. I can
accept, for example, only parsing the JS emitted by Rails's standard
gizmos.

So before getting down to some actual questions, here's the code my
exquisite parsing skills have thrashed out so far:

  def test_assert_js
    source = 'new Ajax.Updater('+
      '"node", '+
      '"/controller/action", '+
      '{ asynchronous:true, '+
      'evalScripts:true, '+
      'method:"get", '+
      'parameters:"i_b_a=Parameter" })'

    js = assert_js(source)
    assert_equal 'new Ajax.Updater', js.keys.first
    parameters = js.values.first['()']
    assert_equal '"node"', parameters[0]
    assert_equal '"/controller/action"', parameters[1]
    json = parameters[2]['{}']
    assert_equal 'true', json['evalScripts']
    assert_equal '"get"', json['method']
    assert_equal '"i_b_a=Parameter"', json['parameters']
  end

Now that's good enough for government work, and I could probably upgrade
the interface to look more like my idealized example...

...but the implementation is a mish-mash of redundant
Regexps and run-on methods:

  Qstr = /^(["](?:(?:\\["])|(?:[^\\"]+))*?["]),?\s*/
  
  def assert_json(source)
    js = {}
    identifier = /([[:alnum:]_]+):/

    while m = source.match(identifier)
      source = m.post_match
      n = source.match(/^([[:alnum:]_]+),?\s*/)
      n = source.match(Qstr) unless n
      break unless n
      js[m.captures[0]] = n.captures[0]
      source = n.post_match
    end

    return { '{}' => js }
  end

  def assert_js(source)
    js = {}
    qstr = /^(["](?:(?:\\["])|(?:[^\\"]+))*?["]),?\s*/
    json = /^(\{.*\}),?\s*/

    if source =~ /^([^\("]+)(.*)$/
      js[$1] = assert_js($2)
    elsif source =~ /^\((.*)\)$/
      js['()'] = assert_js($1)
    else
      index = 0

      while (m = source.match(qstr)) or
            (m = source.match(json))
        break if m.size < 1
        
        if source =~ /^\{/
          js[index] = assert_json(m.captures[0])
        else
          js[index] = m.captures[0]
        end
        
        source = m.post_match
        index += 1
      end
    end

    return js
  end
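
As for upgrading the interface toward the idealized example, a thin
wrapper over the returned Hash might be enough. Here's a minimal sketch,
assuming the Hash shape that test_assert_js navigates; the JsStatement
class is made up here, and the parsed Hash below is built by hand so the
snippet runs without assert_js:

```ruby
# Hypothetical wrapper: adapts the nested Hash that assert_js returns
# to the get_method / get_param interface of the idealized example.
class JsStatement
  def initialize(parsed)
    # The outer Hash has one entry: call name => { '()' => parameters }.
    @name, body = parsed.first
    @params = body['()']
  end

  def get_method
    @name
  end

  def get_param(index)
    param = @params[index]
    # Unwrap object literals so callers can index them directly.
    param.is_a?(Hash) && param.key?('{}') ? param['{}'] : param
  end
end

# Hand-built stand-in for what assert_js produces from the sample source.
parsed = {
  'new Ajax.Updater' => {
    '()' => {
      0 => '"node"',
      1 => '"/controller/action"',
      2 => { '{}' => { 'evalScripts' => 'true', 'method' => '"get"' } }
    }
  }
}

statement = JsStatement.new(parsed)
statement.get_method              # => 'new Ajax.Updater'
statement.get_param(0)            # => '"node"'
statement.get_param(2)['method']  # => '"get"'
```

Values stay as raw source strings here ('true', not true), so matching
the idealized assert_equal true would still need a conversion step.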

Now the questions. Is there some...

 ...way to severely beautify that implementation?
 ...lexing library I could _easily_ throw in?
 ...robust JS Lexer library already out there?
 ...assert_js already out there?

-- 
  Phlip
  http://c2.com/cgi/wiki?ZeekLand  <-- NOT a blog!!