Issue #18269 has been updated by k0kubun (Takashi Kokubun).


No matter how we implement it (making each instruction unique, telling the compiler to not perform de-duplication if possible, etc.), one direction would be to make every address of trace instructions unique like you suggested. I would need to investigate some compiler macro or trick to force duplicate the same code for different insns since I don't have immediate knowledge to make that happen in GCC.

Another direction could be to have an st for each iseq and optionally insert mapping from such insn index to the original insn to it. 

The trade-off here is that the former approach is more memory-efficient and the latter approach is platform-independent and potentially faster since it would continue to maintain a nice code locality.

----------------------------------------
Bug #18269: trace_opt_not and trace_opt_regexpmatch2 insns are indistinguishable
https://bugs.ruby-lang.org/issues/18269#change-94331

* Author: mame (Yusuke Endoh)
* Status: Open
* Priority: Normal
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
On some platforms and compiler tool chain, `RubyVM::ISeq#to_a` shows a wrong instruction.

```
set_trace_func(Proc.new { })
set_trace_func(nil)

iseq = RubyVM::InstructionSequence.compile("/true/ =~ 'true'")
pp iseq.to_a.last
# expected:
#=> [1,
#    :RUBY_EVENT_LINE,
#    [:putobject, /true/],
#    [:putstring, "true"],
#    [:opt_regexpmatch2, {:mid=>:=~, :flag=>16, :orig_argc=>1}],
#    [:leave]]

# actual:
#=> [1,
#    :RUBY_EVENT_LINE,
#    [:putobject, /true/],
#    [:putstring, "true"],
#    [:opt_not, {:mid=>:=~, :flag=>16, :orig_argc=>1}],
#    [:leave]]
```

In this case, `opt_regexpmatch2` is a correct insn, but actually `opt_not` is printed.

This is caused by the fix of #14870. commit:48c8df9e0eb295af06d593ce37ce1933c0ee1d90 made almost all traced and optimized instructions (`trace_opt_*` insns) delegate to `opt_send_without_block`. Under some conditions, GCC seems to merge them to one code block because the bodies of `trace_opt_*` instructions are all identical. I'm unsure why, but in this case, only the pair of `trace_opt_not` and `trace_opt_regexpmatch2` is merged, and their addresses in direct-threaded code are now indistinguishable.

Fortunately, there seems to be no path to revert `trace_*` insn to its original (non-trace) version, except `RubyVM::ISeq#to_a`. (YJIT might be also affected because it uses `rb_vm_insn_addr2opcode`, but I'm unsure.)
So, AFAIK, this is not a significant problem, at the present time. However, this will bring trouble if we attempt to revert `trace_*` insn in future.

Some tests of MJIT use `RubyVM::ISeq#to_a` to check if the insn that the test targets is correctly tested. This leads to false positive warnings:

http://rubyci.s3.amazonaws.com/graviton2/ruby-master/log/20211026T020530Z.log.html.gz

```
[20397/21172] TestJIT#test_compile_insn_opt_regexpmatch2
/home/chkbuild/chkbuild/tmp/build/20211026T020530Z/ruby/test/ruby/test_jit.rb:599: warning: 'opt_regexpmatch2' insn is not included in the script. Actual insns are: putobject putstring opt_not leave

/home/chkbuild/chkbuild/tmp/build/20211026T020530Z/ruby/test/ruby/test_jit.rb:600: warning: 'opt_regexpmatch2' insn is not included in the script. Actual insns are: putstring putobject opt_not leave
 = 0.58 s
```

@jeremyevans0 @k0kubun Do you have any idea to fix this issue?



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>