Issue #9508 has been updated by Yusuke Endoh.

Status changed from Open to Feedback

Sorry for the very late response.  I tried and read through your patch.
However, at first, I'd like to discuss the proposal itself.


## Demand

In fact, I think we can virtually implement this feature by using a Ruby code parser, such as ripper.

For example, method coverage is usually identical to the execution count of the first line of each method.
(Of course, to make it precise, there might be many annoying cases, such as a method defined in one line.)

In similar way, you can measure decision coverage by parsing if/then/else and case/when statements.

If there is a great demand for this feature, I'm not against embedding it to the core.  But, is it really needed?


## Use case

Is it fully-clarified what type of visualization and analysis you want to do?  Does the proposed API give you enough information for your use case?
If not, we will have to extend the API repeatedly, or even worse, the API will turn out not to be unusable after it is released.

For example, method coverage does not include method name.  Decision coverage does not include lineno of "else" statement.  Are they okay?

The current API is designed for an apparent use case: to visualize the execution count of each line.


## Performance

I'm afraid if it is heavy to measure decision coverage because the patch calls `rb_hash_lookup` in each branch.  I consulted the following micro benchmark:

https://gist.github.com/mame/2c1100664d452bff133a

It takes longer two times than the current.  Doesn't it matter?
Just one idea: it would be good to allow a user to specify what s/he want to measure, like:

    Coverage.start(line: true, method: false, decision: false) # default?
    Coverage.start(line: true, method: true, decision: true)   # all

Please let me know if you have a benchmark in practical case.  Though I did "make test-all" with coverage measurement, it caused core dump:

    $ time make test-all RUN_OPTS="--disable-gems -r./sample/coverage.rb"
    *snip*
    [ 3559/15034] TestDir#test_close*** Error in `./test/runner.rb': munmap_chunk(): invalid pointer: 0x00002ae91fa42770 ***
    Aborted (core dumped)

However, I guess this is not a problem of your patch but a pre-existing GC bug that is triggered by your patch.

## Review comments

Hereinafter, I describe review comments for your patch.
I think there is no big problem except ruby.h.

### include/ruby/ruby.h

    #define RUBY_EVENT_DEFN      0x0090
    #define RUBY_EVENT_DECISION_TRUE  0x00a0
    #define RUBY_EVENT_DECISION_FALSE 0x00b0

I think we don't have to declare these three constants here since they are used only in compile.c.

    #define RUBY_EVENT_MCOVERAGE              0x080000

Seems like this event is identical to `RUBY_EVENT_CALL`, i.e., `RUBY_EVENT_MCOVERAGE` is fired if and only if `RUBY_EVENT_CALL` is fired.  If so, this constant is not needed.

    #define RUBY_EVENT_DCOVERAGE_TRUE        0x2000000
    #define RUBY_EVENT_DCOVERAGE_FALSE       0x4000000

These two are needed for decision coverage.
But ko1 hesitates to add a new type of event unless it is really needed.
We must persuade ko1.


### parse.y

    VALUE rb_file_coverage = rb_hash_new();
    VALUE methods = rb_hash_new();
    VALUE decisions = rb_hash_new();

By using `ObjectSpace.each_object`, a user can get a reference to these objects and destroy them.  You should use RBASIC_CLEAR_CLASS to make them invisible for users.  (But invisible objects may cause another problem.  As I recall, `rb_hash_lookup` might not be used for an invisible object.)


### thread.c

`clear_coverage` deletes the coverage information measured so far.  I think it also should delete method and decision coverage.
When `Kernel#fork` is called, this function is used in the child process, because the parent and the child has the same coverage information which may lead to duplicated measurement.


### compile.c

    #define ADD_METHOD_COVERAGE_TRACE(seq, line, event, end_line)

`end_line` is not used.


### sample/coverage.rb

This file must be updated because of the API change.  I'm still unsure if the API change is not harmful, though.


Thank you,

----------------------------------------
Feature #9508: Add method coverage and branch coverage metrics
https://bugs.ruby-lang.org/issues/9508#change-46485

* Author: Sam Rawlins
* Status: Feedback
* Priority: Normal
* Assignee: 
* Category: 
* Target version: 
----------------------------------------
Since the Coverage extension was introduced in Ruby 1.9, Ruby has had built-in line code coverage. Ruby should support more of the basic code coverage metrics [1]. I have a pull request on GitHub ( https://github.com/ruby/ruby/pull/511 ) to add Method Coverage (Function Coverage) and Branch Coverage. I'd love feedback to improve it.

Currently, with this feature, Coverage.result would look like:

    {"/Users/sam/code/ruby/cov_method.rb" => {
      lines: [1, 2, 2, 20, nil, nil, 2, 2, 2, nil, 0, nil, nil, nil, 1, 0, nil, nil, 1, 1, nil, nil, 1],
      methods: {1=>2, 15=>0, 19=>1},
      branches: {8=>2, 11=>0}
    }}

which includes
* the current Ruby line coverage report,
* as well as a method report (The method defined on line 1 was called 2 times; the method on line 15 was called 0 times; ...),
* and a branch report (the branch on line 8 was called 2 times; the branch on line 11 was called 0 times).

Branches
--------

Branches include the bodies of if, elsif, else, unless, and when statements, which are all tracked with this new feature. However, this feature is not aware of void bodies, for example:

    if foo
      :ok
    end

will report that only one branch exists in the file. It would be better to declare that there is a branch body on line 2, and a void branch body on line 3, or perhaps line 1. This would require the keys of the [:branch] Hash to be something other than line numbers. Perhaps label_no? Perhaps nd_type(node) paired with line or label_no?

More Coverage
-------------

I think that Statement Coverage, and Condition Coverage could be added to this feature, using the same techniques.

Caveats
-------

I was not very clear on the bit-arrays used in ruby.h, and just used values for the new macros that seemed to work.

Also, I would much rather use Ranges to identify a branch, so that a Coverage analyzer like SimpleCov won't need any kind of Ruby parser to identify and highlight a full chunk of code as a tested branch, or a not tested branch. I'm trying to find how that could be implemented...

[1] Wikipedia has good definitions: http://en.wikipedia.org/wiki/Code_coverage

---Files--------------------------------
pull-request-511.patch (26.7 KB)
pull-request-511.patch (38.5 KB)
pull-request-511.patch (57 KB)


-- 
https://bugs.ruby-lang.org/