Issue #8399 has been updated by dbussink (Dirkjan Bussink).


normalperson (Eric Wong) wrote:

>  I am curious how Rubinius implements RARRAY_PTR and why it cannot be
>  made faster (especially in the non-resizing case).
>  
>  Are the native (for Rubinius) memory region for arrays not contiguous?
>  
>  I also remember asking for RSTRUCT_PTR in Rubinius a few years ago and
>  was turned down.  I expect Array/Struct to use a contiguous memory
>  region (regardless of programming language, but I learned C/ASM first)
>  
>  I don't know much about GC implementation, but I think non-resizing
>  accesses from C API ought to have no effect (though entering C code
>  in the first place may be expensive).

We don't want to expose GC managed memory like this directly to a C extension, that is on concern and why we copy. Even if we would directly expose it, it would still be problematic and still perform much worse because we would have to scan every array when going back into Ruby land because of the GC write barrier. Someone could have set a pointer to a young object in a mature array and all hell would break loose if we wouldn't do that. Since we don't know what people will do when using RARRAY_PTR() we always have to do these safety checks. What if we in Rubinius decide we don't want contiguous memory for arrays but something rope like? The C-API should not put up restrictions on this when this is not necessary.

When people properly use rb_ary_store for setting elements we can properly update the write barrier. Concerns like this are exactly why #8339 treats arrays for example special, I think that as the Ruby community we should remove these usages from C extensions so workarounds like described there are not necessary. 

The reason RSTRUCT_PTR was turned down in Rubinius, is the same as why we don't have RHASH. It's not possible, since it exposes implementation details of how Struct and Hash work that simply don't apply on Rubinius. Hash is implemented in Ruby in Rubinius (as in Struct), so internally these objects look very different. An API for extensions should not put restrictions like this onto other implementations. Anything that accesses internal memory layout in my opinion should not be used in C extensions, if there are things that that are missing / can't be done easily that should be possible, methods should be added for them.

>  > https://gist.github.com/dbussink/57c32c08fb21c7a41719
>  
>  What is the performance change for MRI?

Not seeing any difference, so if it's there in some way it's too small to measure on this example.
----------------------------------------
Bug #8399: Remove usage of RARRAY_PTR in C extensions when not needed
https://bugs.ruby-lang.org/issues/8399#change-39294

Author: dbussink (Dirkjan Bussink)
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 
ruby -v: trunk
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


Rubinius uses quite a few C extensions directly from MRI. Some of these use functionality such as RARRAY_PTR which is not necessary. For compatibility reasons, RARRAY_PTR works on Rubinius but suffers from a heavy performance penalty. Take for example the test of the parser gem (http://github.com/whitequark/parser). These run over 10x faster with the patch applied to Racc that is submitted here:

https://gist.github.com/dbussink/57c32c08fb21c7a41719

Consider issue #8339 where there is work being done on generational GC, I think it is also beneficial to remove usage of internal structures such as RARRAY_PTR where there is the problem of going around the write barrier. In Rubinius, an array is treated special if RARRAY_PTR is used on it in the C-API, so I can imagine MRI being able to optimize the GC better if extensions don't do this. There are functions available for both getting and setting elements in an array and they work fine.

I have only make a patch against Racc here as a showcase, I also want to update all the other extensions to remove RARRAY_PTR. Please consider this change to MRI since in my opinion it has benefits also for MRI and so Rubinius can keep using these extensions directly without having to maintain custom versions just for the considerations described here. I'm also already actively checking C extension gems and sending pull requests for updating this.



-- 
http://bugs.ruby-lang.org/