Issue #17930 has been updated by Eregon (Benoit Daloze).


mame (Yusuke Endoh) wrote in #note-5:
> In fact there is no enough information because `Thread::Backtrace::Location` only knows path and node_id.

This seems an implementation detail, not something that should decide what the API is.

For instance, on TruffleRuby, `Thread::Backtrace::Location` knows column information (which is stored on nodes), and is represented as a backtrace + index (https://github.com/oracle/truffleruby/blob/master/src/main/java/org/truffleruby/core/thread/RubyBacktraceLocation.java).
There is no need to re-parse anything, and requiring reparsing for this feature seems really suboptimal (slow) and prone to issues if e.g., the file no longer exists.

Wouldn't a byte offset + byte length be almost as compact as node_id, but without needing to keep the node or to reparse?

> @ko1 doesn't want to use more memory for this feature and I agree with him. I'm not a big fan of the mechanism of `AST.of`, but it is a completely different issue from this ticket.
> 
> My first proposal was four separate methods, but they require keeping the result of `AST.of` for each Thread::Backtrace::Location, which requires additional memory.

The footprint of a Thread::Backtrace::Location does not seem a big issue, those objects are typically GC'd pretty quickly.
Why would they require to keep the result of AST.of?
Can the column information be kept from the parser in the first place?

> Ahh, I've tried but I found that the current prototype draws an underline from "the end column of the receiver of the call" to "the end column of the call", so four integers are clearly not enough. Hardcoding such a location looks very ad-hoc, so AST::Node is only a reasonable way as far as I think of.

Could you show the output of that? I'm not sure I follow the decription.

I think whatever the API is we should also design something for `did_you_mean` that can work on other Ruby implementations.

As far as I see, `RubyVM::AbstractSyntaxTree` exposes MRI internals (e.g., node types, node field names, internal node field order) and seems difficult to support on other Ruby implementations (it reaches deep in MRI internals).
`did_you_mean` is a public gem, so I think it should avoid using such a internal, unstable (the AST will evolve and differs per Ruby implementation), and MRI-specific API.
For instance, `nd_recv = nd_call.children[0]` feels hacky (should be `call_node.receiver` or `call_node[:receiver]`), and hardcoding node types feel brittle (e.g., what if there is a new call node type?)

It would be great to turn `RubyVM::AbstractSyntaxTree` into a proper API which can be supported on other Ruby implementations, and then it could be used by many other gems.
I think that would require to have proper names and more thinking towards making it more future proof.

----------------------------------------
Feature #17930: Add column information into error backtrace
https://bugs.ruby-lang.org/issues/17930#change-92385

* Author: mame (Yusuke Endoh)
* Status: Assigned
* Priority: Normal
* Assignee: mame (Yusuke Endoh)
----------------------------------------
Consider the following code and error.

```
data["data"].first["field"] #=> undefined method `[]` for nil:NilClass
```

There are two possibilities; the variable `data` is nil, or the return value of `first` is nil. Unfortunately, the error message is less informative to say which.

This proposal allows to help identifying which method call failed.

```
$ ruby -r ./sample/no_method_error_ext.rb err1.rb
err1.rb:2:in `<main>': undefined method `[]' for nil:NilClass (NoMethodError)

data["data"].first["field"]
                  ^^^^^^^^^
```

## Proposal

I'd like to propose a feature to get column information from each `Thread::BacktraceLocation`. Maybe it is good to provide the following four methods:

* `Thread::BacktraceLocation#first_lineno`
* `Thread::BacktraceLocation#first_column`
* `Thread::BacktraceLocation#last_lineno`
* `Thread::BacktraceLocation#last_column`

These names came from `RubyVM::AbstraceSyntaxTree::Node`'s methods.

## Implementation

Here is a proof-of-concept implementation: https://github.com/ruby/ruby/pull/4540

See https://github.com/ruby/ruby/pull/4540/commits/6ff516f4985826e9f9c5606638001c3c420f7cad for an example usage.
(Note that, currently, you need to build ruby with `./configure cflags=-DEXPERIMENTAL_ISEQ_NODE_ID` to enable the feature.)

To put it simply, this PR provides only a raw API, `Thread::BacktraceLocation#node_id`. To get actual column information, you need to manually identify `RubyVM::AbstractSyntaxTree::Node` that corresponds to `Thread::BacktraceLocation#node_id`.
But it would be arguable to expose "node_id", so I will wrap it as the above four methods if this is accepted.

Credit: the original implementation was done by @yui-knk.

## Drawback

To use this feature, we need to enable `-DEXPERIMENTAL_ISEQ_NODE_ID` to add "node_id" information (a subtree ID of the original abstract syntax tree) into each byte code instruction. If we provide this feature, the option should be enabled by default. However, the option increases memory consumption.

I performed a simple experiment: I created a scaffold app by `rails new`, and measured the memory usage after `rails s`. The result was 97 MB without `-DEXPERIMENTAL_ISEQ_NODE_ID`, and 100 MB with the option enabled.

In my opinion, it is not so large, but requiring more gems will increase the difference. I will appriciate it if anyone could provide the actual memory increase in a more practical Rails app.

Do you think this feature deserves the memory increase?

---Files--------------------------------
image.png (73.3 KB)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>