Issue #8399 has been updated by Eregon (Benoit Daloze).


dbussink (Dirkjan Bussink) wrote: 
> The problem is not rb_ary_new4 itself, but that RARRAY_PTR needs to be called before. Since there's no guarantee on what with happen with that RARRAY_PTR() it suffers from all the drawbacks here. I think this case can however be replaced with rb_ary_subseq() to create a substring from the original one. This could be implemented in an efficient way in Rubinius as well. I will try and update my patch with that accordingly then.

Indeed, I though of that afterwards.

> I've used rb_ary_entry() here so this could be easily backported to 1.9.3 as well (and also for other extensions). That way the extensions that Rubinius has a copy of can still use the 1.9 version for 1.9 mode there and not suffer from the performance degradation. Also the performance on a simple benchmark was not affected by using rb_ary_entry() here at all.

Could you share that benchmark?
I could notice the difference in an highly constrained one
summing a 10000-elements array: 107us instead of 49us (and 85us with RARRAY_PTR on trunk).
But the difference is only in the order of a couple instructions of course,
it might be irrelevant in this case.
----------------------------------------
Bug #8399: Remove usage of RARRAY_PTR in C extensions when not needed
https://bugs.ruby-lang.org/issues/8399#change-39791

Author: dbussink (Dirkjan Bussink)
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 
ruby -v: trunk
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


Rubinius uses quite a few C extensions directly from MRI. Some of these use functionality such as RARRAY_PTR which is not necessary. For compatibility reasons, RARRAY_PTR works on Rubinius but suffers from a heavy performance penalty. Take for example the test of the parser gem (http://github.com/whitequark/parser). These run over 10x faster with the patch applied to Racc that is submitted here:

https://gist.github.com/dbussink/57c32c08fb21c7a41719

Consider issue #8339 where there is work being done on generational GC, I think it is also beneficial to remove usage of internal structures such as RARRAY_PTR where there is the problem of going around the write barrier. In Rubinius, an array is treated special if RARRAY_PTR is used on it in the C-API, so I can imagine MRI being able to optimize the GC better if extensions don't do this. There are functions available for both getting and setting elements in an array and they work fine.

I have only make a patch against Racc here as a showcase, I also want to update all the other extensions to remove RARRAY_PTR. Please consider this change to MRI since in my opinion it has benefits also for MRI and so Rubinius can keep using these extensions directly without having to maintain custom versions just for the considerations described here. I'm also already actively checking C extension gems and sending pull requests for updating this.



-- 
http://bugs.ruby-lang.org/