Issue #14480 has been updated by vo.x (Vit Ondruch).


It seems they are getting further:

~~~
With -fomit-frame-pointer on *everything*, and hacking out the call to rb_thread_create_timer_thread in Init_Thread (to keep this single-threaded for simplicity), the bug appears to be a problem with setjmp/longjmp.

x29 (the frame pointer) is corrupted deep within the 15th call to vm_exec_core within the 10th call to vm_call_opt_call.

The write of the bogus value to x29 occurs here:

0x000000000057d9bc in rb_ec_tag_jump (ec=0x46b050 <all_iter_i>, st=RUBY_TAG_NONE) at vm.i:10459
10459     __builtin_longjmp(((ec->tag->buf)), (1));
2: /x $x29 = 0x7fffff8080

(gdb) disassemble 
Dump of assembler code for function rb_ec_tag_jump:
   0x000000000057d988 <+0>:     stp     x29, x30, [sp,#-32]!
   0x000000000057d98c <+4>:     mov     x29, sp
   0x000000000057d990 <+8>:     str     x0, [x29,#24]
   0x000000000057d994 <+12>:    str     w1, [x29,#20]
   0x000000000057d998 <+16>:    ldr     x0, [x29,#24]
   0x000000000057d99c <+20>:    ldr     x0, [x0,#24]
   0x000000000057d9a0 <+24>:    ldr     w1, [x29,#20]
   0x000000000057d9a4 <+28>:    str     w1, [x0,#336]
   0x000000000057d9a8 <+32>:    ldr     x0, [x29,#24]
   0x000000000057d9ac <+36>:    ldr     x0, [x0,#24]
   0x000000000057d9b0 <+40>:    add     x0, x0, #0x10
   0x000000000057d9b4 <+44>:    ldr     x1, [x0,#8]
   0x000000000057d9b8 <+48>:    ldr     x29, [x0]
=> 0x000000000057d9bc <+52>:    ldr     x0, [x0,#16]
   0x000000000057d9c0 <+56>:    mov     sp, x0
   0x000000000057d9c4 <+60>:    br      x1

when called from rb_iterate0, where the bogus x29 value has been fetched from the jmp_buf at +48.
 
A watchpoint on that memory shows it being set to the bogus value here in rb_iterate0:

   0x0000000000595e50 <+96>:    str     x0, [sp,#264]
   0x0000000000595e54 <+100>:   ldr     x0, [sp,#232]
   0x0000000000595e58 <+104>:   ldr     x0, [x0,#24]
   0x0000000000595e5c <+108>:   str     x0, [sp,#592]
   0x0000000000595e60 <+112>:   add     x0, sp, #0x108
   0x0000000000595e64 <+116>:   add     x0, x0, #0x10
   0x0000000000595e68 <+120>:   add     x1, sp, #0x260
   0x0000000000595e6c <+124>:   str     x1, [x0]
=> 0x0000000000595e70 <+128>:   adrp    x1, 0x595000 <raise_method_missing+544>

28300       struct rb_vm_tag _tag;
28301       _tag.state = RUBY_TAG_NONE;
28302       _tag.tag = ((VALUE)RUBY_Qundef);
28303       _tag.prev = _ec->tag;
28304       ;
28305       state = (__builtin_setjmp((_tag.buf)) ? rb_ec_tag_state((_ec))
28306                                             : ((void)(_ec->tag = &_tag), 0));

If I'm reading this right, the __builtin_longjmp rb_ec_tag_jump (in rb_iterate0) is attempting to restore x29 from the jmp_buf, but the __builtin_setjmp in rb_iterate0 isn't actually saving x29 there, and hence x29 gets corrupted at the longjmp, deep in the callstack, leading to an eventual crash when vm_call_opt_call eventually tries to use x29.
~~~

----------------------------------------
Bug #14480: miniruby crashing when compiled with -O2 or -O1 on aarch64
https://bugs.ruby-lang.org/issues/14480#change-70578

* Author: vo.x (Vit Ondruch)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [aarch64-linux]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Recently, it is not possible to build Ruby 2.5.0 on aarch64 on Fedora Rawhide, because miniruby fails during build:

~~~
... snip ...

./miniruby -I./lib -I. -I.ext/common  -n \
-e 'BEGIN{version=ARGV.shift;mis=ARGV.dup}' \
-e 'END{abort "UNICODE version mismatch: #{mis}" unless mis.empty?}' \
-e '(mis.delete(ARGF.path); ARGF.close) if /ONIG_UNICODE_VERSION_STRING +"#{Regexp.quote(version)}"/o' \
10.0.0 ./enc/unicode/10.0.0/casefold.h ./enc/unicode/10.0.0/name2ctype.h 
generating encdb.h
./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -c -o encdb.h ./template/encdb.h.tmpl ./enc enc
generating prelude.c
./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -I. -c -o prelude.c \
	./template/prelude.c.tmpl ./prelude.rb ./gem_prelude.rb ./abrt_prelude.rb
*** stack smashing detected ***: <unknown> terminated
encdb.h updated

... snip ...
~~~

This might by Ruby or gcc issue. Not sure yet. However, there is already lengthy analysis available in Fedora's Bugzilla [1]. Would be anybody able to help to resolve this issue?


[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1545239

---Files--------------------------------
Dockerfile (573 Bytes)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>