Issue #8399 has been updated by dbussink (Dirkjan Bussink).


ko1 (Koichi Sasada) wrote:

>  Let us clarify your statement (sorry if I missed in your messages).
>  
>  (1) Do you want to remove RARRAY_PTR() completely?
>  or
>  (2) Reduce usage of RARRAY_PTR()?
>  
>  I believe RARRAY_AREF/ASET will help (2), and I strongly agree to
>  backport 2.0 and 1.9 (or provide some migration way).
>  
>  However, I think (1) is too drastic for us, MRI.

Personally I think RARRAY_PTR should be removed and the resulting issues of that fixed. I can understand if MRI decides not to do this, but I'm going to actively change C extensions and send pull requests / patches to other extensions to remove the usage. This so perhaps a future version of MRI can remove it completely (with a proper announcement to provide sufficient time for people to change). 

So (2) is something I'm actively working on. I think MRI C extensions should set a good example of how to use the C-API, so also remove usage of RARRAY_PTR in extensions. This is also because Rubinius also uses C extensions from MRI, like OpenSSL etc. In this case it is Racc, which is also provided as a gem. This gem however is creating by taking the code from MRI, which is why I would like to fix the issue at the source and not try to patch the Racc gem.

With backporting these additional macro's to 1.9 and 2.0, we can also update the C extensions shipped with those versions to use that macro instead of RARRAY_PTR. This means for Rubinius an instant performance improvement for these extensions since we don't need to copy and scan the array anymore, but can just make those macro's an alias for rb_ary_entry and rb_ary_store which properly work with the Rubinius write barrier.

So in short, these are the two options I see:

(1) Have a mechanism for getting and setting an array element. rb_ary_entry and rb_ary_store already exist and I never saw any measurable performance impact with MRI on C extensions where I replaced RARRAY_PTR() usage with these functions. Therefore this is what I initially proposed. In this case all usages of RARRAY_PTR in C extensions in MRI's ext/ directory would use rb_ary_entry etc. and not RARRAY_PTR.

(2) If MRI objects to doing this, I want to propose using RARRAY_AREF / RARRAY_ASET. This means we should backport these macros' to 1.9 and 2.0, so C extensions (in MRI but also written by others) can use this mechanism as soon as possible instead of RARRAY_PTR. Also in this case I want to update the C extensions in ext/ for 1.9, 2.0 and trunk (2.1) to use these new macro's. In this case I will add these macro's to Rubinius as well (where they will alias to rb_ary_entry and rb_ary_store).

For both approaches I'm perfectly willing to do the changes myself so no one in MRI has to do any work for this, but of course I want approval before starting the work.


----------------------------------------
Bug #8399: Remove usage of RARRAY_PTR in C extensions when not needed
https://bugs.ruby-lang.org/issues/8399#change-39309

Author: dbussink (Dirkjan Bussink)
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 
ruby -v: trunk
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


Rubinius uses quite a few C extensions directly from MRI. Some of these use functionality such as RARRAY_PTR which is not necessary. For compatibility reasons, RARRAY_PTR works on Rubinius but suffers from a heavy performance penalty. Take for example the test of the parser gem (http://github.com/whitequark/parser). These run over 10x faster with the patch applied to Racc that is submitted here:

https://gist.github.com/dbussink/57c32c08fb21c7a41719

Consider issue #8339 where there is work being done on generational GC, I think it is also beneficial to remove usage of internal structures such as RARRAY_PTR where there is the problem of going around the write barrier. In Rubinius, an array is treated special if RARRAY_PTR is used on it in the C-API, so I can imagine MRI being able to optimize the GC better if extensions don't do this. There are functions available for both getting and setting elements in an array and they work fine.

I have only make a patch against Racc here as a showcase, I also want to update all the other extensions to remove RARRAY_PTR. Please consider this change to MRI since in my opinion it has benefits also for MRI and so Rubinius can keep using these extensions directly without having to maintain custom versions just for the considerations described here. I'm also already actively checking C extension gems and sending pull requests for updating this.



-- 
http://bugs.ruby-lang.org/