Issue #16027 has been updated by dalehamel (Dale Hamel).


Hi Martin,

> Do you have any parts implemented already

Yes, I have a prototype gem that adds StaticTracing.tracepoint as a way to define
a stub library that can be used for a debugger (dtrace/bpftrace) to attach
to, similar to the existing TracePoint API. I've been experimenting with using
this, in combination with the TracePoint api, to execute static tracepoints
from within the existing TracePoint handler context.
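As a rough sketch of what firing a static tracepoint from inside a TracePoint
handler could look like: `StubProbe` below is a hypothetical stand-in written
only for this example (it is not the gem's API), and `attach!` simulates a
debugger attaching. Only `TracePoint` itself is real Ruby API here.

```ruby
# Hypothetical stand-in for a USDT probe. A real probe would be backed by an
# ELF/DOF stub; this one only records what it was asked to send.
class StubProbe
  attr_reader :fired

  def initialize
    @attached = false
    @fired = []
  end

  # Simulates a tracer (dtrace/bpftrace) attaching to the probe.
  def attach!
    @attached = true
  end

  def enabled?
    @attached
  end

  # Drops the data unless a tracer is attached.
  def fire(*args)
    return false unless enabled?
    @fired << args
    true
  end
end

def work(n)
  n * 2
end

probe = StubProbe.new
trace = TracePoint.new(:call) do |tp|
  probe.fire(tp.lineno, tp.method_id.to_s)
end

trace.enable(target: method(:work)) { work(1) } # nobody attached: dropped
probe.attach!
trace.enable(target: method(:work)) { work(2) } # attached: recorded
```

The point of the guard in `fire` is that the handler costs very little while
nothing is listening.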

I've hit a wall, where in order to improve functionality I need more access to context,
which is only available from the RubyVM, and not exposed via the TracePoint API.
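For reference, this is roughly the context a handler can get today using only
the documented TracePoint accessors. The method name `greet` is made up for the
example; anything beyond accessors like these is what I cannot reach from a gem.

```ruby
def greet(name)
  "hello #{name}"
end

events = []
trace = TracePoint.new(:call, :return) do |tp|
  # Everything read here is part of the public TracePoint API.
  events << {
    event: tp.event,
    method_id: tp.method_id,
    lineno: tp.lineno,
    path: tp.path,
    # return_value is only valid on return events, so guard the access.
    return_value: (tp.return_value if tp.event == :return)
  }
end

trace.enable { greet("world") }
```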

I don't yet have a prototype using internal VM context, as I thought it would be
pragmatic to see if there is interest in such functionality before investing the time in
further prototyping.

> any plans for implementation?

Yes, but it's still a work in progress as I learn more through prototyping.
Here's what I've got so far, feedback is welcome (and appreciated):

What is implemented needs to be in line with the existing setup of Ruby, and
be minimally intrusive. Adding this functionality should enhance Ruby, not
bog it down and make it more complex.

For this reason, I feel that it makes sense to tie static tracepoints to Ruby's
existing, built-in TracePoint API. They appear to operate in a similar way,
and are analogous types of breakpoint. What a TracePoint event is to the RubyVM,
a USDT event is to the kernel.

As Ruby already has USDT tracing built in, I believe it makes sense to target
the same libraries already used. On BSD, these headers are provided by the
system.

On Linux, the SDT headers are provided by SystemTap, and do a similar thing.
These headers allow for the RubyVM to build static tracepoints into its own
source code, because Ruby is written in C.

On both platforms, the approach is analogous - the source code has some notes
added to it at a particular location outside of code space, indicating relocation
points for particular addresses in the code from which to read data.

I believe it makes sense to follow this approach, as it is consistent with
the libraries and toolchain already supported by these built-in VM tracepoints.
This feature is an extension of that, allowing Ruby processes to expose
the same debugging metadata, so that kernel-based handlers can read it and
handle the event.

I think the process would probably go like this:

- Prove out as much as I can with a Ruby Gem prototype
- Contrast this with the smallest possible patches to ruby that I can do to
achieve or enhance this functionality.
- Build for this feature to be accepted behind a ./configure flag so that it
can be toggled and optional, as the existing dtrace support is (and perhaps
update --enable-dtrace to --enable-usdt).

I also want to clarify about JIT: I believe that this would be extremely easy to
enable for Ruby JIT'd objects. As the dtrace macros are headers, it seems like
they should be injected in the C generated by Ruby for a particular instruction
sequence. If that instruction sequence contains a tracepoint with USDT
attributes, then the code that prepares the ruby source only needs to add a
single line, and the macro will inject the probe.

This is why I mention JIT - JIT negates the need for an ELF stub, as there is
already a shared object with the executable code - it can use the traditional
header macro approach for injecting USDT probes. By simply inspecting a Ruby
process, these JIT probes should already be transparently available.

For full support in this JIT future, we would need to consider how we enable
these tracepoints on both ruby instruction sequences, for non-JIT code, and
their native counterparts.

To support dynamic objects, we can use ELF stub libraries or DOF debug annotations.

> Would you be ready to do significant implementation work if this feature got accepted?

Yes, that's certainly my intention.

> What parts do you think could be done as a Gem

I don't know yet, but I'm trying to figure this out. As I said, I've hit a wall
as to what I can get from the existing TracePoint API, so I need to peek inside
and see what I can do with VM context to enhance it.

> and what parts would need to be part of the Ruby implementation core?

I don't know where the line is, and it depends mostly on if support for USDT
tracepoints on ruby code (instead of just the VM) is a desirable feature for
ruby Core.

I'd aim to make the changes as small as possible, as it looks like the existing
dtrace integration is pretty minimally intrusive. It would come down to whether
it is better to have the stub generation libraries in Ruby Core, or linked to
externally. I think the latter is probably better.

If that's the case, then some small amount of glue code might be sufficient to
expose the needed context to build a userspace version of this functionality.

> How could this be made more OS-independent (besides Linux, Darvin, and Windows, there are also various BSD flavors,...).

Good news: BSD and Solaris already work with these dtrace APIs. Darwin, Solaris,
BSD, Linux would all be pretty easy to support, it's just Windows that would be
tricky.

To support Linux, ELF stubs are generated. To support BSD, the dtrace DOF format
is generated.



----------------------------------------
Feature #16027: Update Ruby's dtrace / USDT API to match what is exposed via the TracePoint API
https://bugs.ruby-lang.org/issues/16027#change-80220

* Author: dalehamel (Dale Hamel)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
# Abstract

I propose that Ruby's "dtrace" support be extended to match what is available in the TracePoint API, as was the case until [Feature #15289] landed.

# Background

I will refer to Ruby's "dtrace" bindings as USDT bindings for simplicity, as this is the type of dtrace probe that they support.

Prior to [Feature #15289] being merged, Ruby's TracePoint API was able to trace only 'all' instances of a type of event.

Ruby added support for tracing Ruby with dtrace, and so Ruby's USDT and TracePoint APIs were "in sync".

When the TracePoint API recently gained the ability to do filtered tracing in [Feature #15289], the new functionality brought the TracePoint and USDT APIs out of sync.

Currently the TracePoint API is ahead of the USDT API, which presents a problem: there is valuable debug information available, but we do not have a way to access it with dtrace instrumentation.

Additionally, the recent release of bpftrace adds support for USDT tracing on Linux, which makes this a valuable opportunity to be able to use Ruby's TracePoint API in an efficient and targeted way for production tracing. To achieve this, we must synchronize the features of the USDT and TracePoint APIs.

What is currently lacking is the ability to do filtered, selective tracing, which the `TracePoint#enable` call now supports as per [prelude.rb#L141](https://github.com/ruby/ruby/blob/master/prelude.rb#L141).
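For context, this is what filtered tracing looks like with the existing `TracePoint#enable(target:)` keyword (available since Ruby 2.6). The methods `foo` and `bar` are placeholders for this example; there is no USDT equivalent of this filtering today.

```ruby
def foo
  a = 1
  b = 2
  a + b
end

def bar
  :untraced
end

hits = []
trace = TracePoint.new(:line) { |tp| hits << [tp.method_id, tp.lineno] }

# Only :line events inside foo are delivered; bar runs but is never seen.
trace.enable(target: method(:foo)) do
  foo
  bar
end
```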

# Proposal

When enabling a TracePoint, users can specify a flag: `usdt: [LIST_OF_SIMPLE_TYPES]`, which will trigger Ruby to also enable the USDT API when it enables the TracePoint.

Within the TracePoint block, users can call `tp.fire` to send USDT data. So the new default API is:

```ruby
trace.enable(target: nil, target_line: nil, target_thread: nil, usdt: nil)
```

And the usage might look like:

```ruby
trace.enable(target: method(:foo), target_line: 5, usdt: [Integer, String]) do |tp|
  tp.fire(tp.lineno, "Any String I want to send")
end
```

The types specified must be simple types such as `Integer` or `String`, given by their names as constants. When data is sent to the tracepoint, the types must match. If they don't, the tracer won't be able to interpret them properly, but nothing should crash.
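A minimal sketch of how that type matching might behave, assuming a hypothetical `ProbeSignature` helper (not part of Ruby or any existing gem) and assuming the supported simple types are limited to `Integer` and `String`. A mismatch is reported rather than raised, in keeping with "nothing should crash".

```ruby
# Hypothetical helper for illustration only.
class ProbeSignature
  # Assumed set of supported simple types; the real list is undecided.
  SIMPLE_TYPES = [Integer, String].freeze

  def initialize(*types)
    unsupported = types - SIMPLE_TYPES
    unless unsupported.empty?
      raise ArgumentError, "unsupported types: #{unsupported.inspect}"
    end
    @types = types.freeze
  end

  # True when the arguments line up one-to-one with the declared types.
  def match?(*args)
    args.length == @types.length &&
      args.zip(@types).all? { |arg, type| arg.is_a?(type) }
  end
end

signature = ProbeSignature.new(Integer, String)
ok = signature.match?(5, "Any String I want to send") # declared order
bad = signature.match?("5", 5)                         # swapped types
```

Here `ok` is true and `bad` is false, so a `fire` implementation could skip sending mismatched data instead of raising.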

# Details

I propose that Ruby optionally generate ELF (Linux) or DOF (Darwin) annotations for TracePoint targets when they are enabled.

As Ruby is a dynamic language, it cannot do this natively (yet). Ruby JIT may make this easier, but for now it is not suitable for production use.

To get around this, Ruby can generate the DOF or ELF stub shared library itself. For example, it may generate one per class, treating the class as the "provider" for the USDT API, and the methods as tracepoints. This is the approach used by [libusdt](https://github.com/chrisa/libusdt), which generates DOF usable on Darwin, BSD, and other platforms, and [libstapsdt](https://github.com/sthima/libstapsdt), which generates ELF stubs for use on Linux.
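To illustrate the "class as provider, methods as tracepoints" mapping, here is a small sketch. The naming scheme (`::` replaced by `_`, lowercased), the `ProbeName` struct, and the `probe_names_for` helper are all assumptions made for this example; it does not generate any stub, and it does not handle method names such as `empty?` that may not be valid probe names.

```ruby
# Hypothetical naming scheme: one provider per class, one probe per method.
ProbeName = Struct.new(:provider, :probe)

def probe_names_for(klass)
  provider = klass.name.gsub("::", "_").downcase
  # false: only methods defined directly on the class, not inherited ones.
  klass.public_instance_methods(false).sort.map do |method_name|
    ProbeName.new(provider, method_name.to_s)
  end
end

module Shop
  class Cart
    def add_item; end

    def checkout; end
  end
end

names = probe_names_for(Shop::Cart)
```

Each `ProbeName` would then correspond to one entry in the stub generated for that class.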

When a tracepoint is triggered, the user may be able to call a new API, `TracePoint#fire`, to send data to the kernel via the USDT API, using the generated ELF stub as a bridge, giving the kernel an address to target in order to receive this data.

Upon enabling a tracepoint, we can either generate these stubs internally, or link to an external library that must be enabled at configure time (without this, USDT tracing wouldn't be enabled at all).

It may be possible to use the existing bridge that is used by ruby jit, or have an experimental flag such as `--usdt` that enables support for generating these stubs.

It may be more consistent with the future Ruby JIT to do this, or else Ruby can generate these stubs by its own native code, but this will require a sort of merging of libusdt and libstapsdt. This would add a dependency to the libelf development header, but that is probably not a problem on Linux platforms.

I would suggest the first approach, if this feature is accepted, would be to try and implement the ELF / DOF generation directly in Ruby. What libstapsdt and libusdt do isn't that complex and could be done in its own C file that probably wouldn't be too large.

Failing that approach, it may be worth investigating the Ruby JIT code to see if a compiler can generate these stubs for us easily. This approach would be to have Ruby generate C code that results in the necessary DOF/ELF annotations, and have the compiler pipeline used by Ruby JIT generate the file. This couples the feature to Ruby JIT though.

# Usecase

This feature would be used by dtrace / bpftrace users to debug ruby applications. It may be possible for other platforms to benefit from this too, but I think the main use case is for Linux system administrators and developers to use external debuggers (dtrace/bpftrace) to introspect Ruby's behavior.

# Discussion

## Pros:

* Syncs the Ruby TracePoint and USDT API
* Allows for much more dynamic and targeted USDT tracing
* Can help to find problems in both development and production
* Can be used for performance and error analysis
* Is better than printing, as emitting/collecting data is only done while a "debugger is attached"

## Cons:

* Complexity introduced, in order to generate the ELF/DOF stub files
* Not easily ported to other platforms
* Isn't fully consistent with the current dtrace functionality of Ruby, which is built into the VM

# Limitation

This will only work on *Nix platforms, and probably just on Linux to start, as that is where most of the benefits are.

If the Ruby JIT approach is preferred or much simpler, then that functionality will be tied to the Ruby JIT functionality.

# See also

* https://bpf.sh/usdt-report-doc/index.html a document describing my experimental gem ruby-static-tracing, which prototypes this functionality outside of the RubyVM
* https://bpf.sh/production-breakpoints-doc/index.html a work-in-progress on adding more dynamic method and line based USDT tracing to ruby, built atop ruby-static-tracing now using the ruby tracepoint API.



