Issue #14370 has been updated by tenderlovemaking (Aaron Patterson).

File iseq_mark.diff added

normalperson (Eric Wong) wrote:
> tenderlove / ruby-lang.org wrote:
>  > I was concerned that this patch might impact GC time, but
>  > `make gcbench-rdoc` didn't seem to show any significant
>  > difference in GC time between trunk and this patch.  If it
>  > turns out there is a performance impact, I think I could
>  > improve the time while still keeping memory usage low by
>  > generating a bitmap during iseq compilation.
>  
>  I like this patch so far.  This improves data locality (at the
>  cost of extra branches), so I'm all for the space reduction.
>  
>  Also, iseq aren't marked frequently anymore since RGenGC;
>  so I don't believe a bitmap index would be worth the extra code
>  and space; this is probably why GC time isn't as noticeable.

I figured this might be the case.  This patch should make ISeq marking slower, but I wasn't sure if it would be mitigated by the fact that the ISeq objects get old.  It seems like RGenGC solves the problem.
 
>  > +static int
>  > +iseq_extract_values(const VALUE *code, size_t pos, iseq_value_itr_t * func, void *data)
>  
>  Did you try without using function pointers?   It may be
>  possible to eek out a few more cycles by calling rb_gc_mark
>  directly.

I did not.  I've got another branch going with a compacting garbage collector, and I'm reusing this function for updating references.  Since that branch isn't ready for merging, I'm happy to change this to direct calls if it helps to get the patch merged (though I'd prefer to keep the function pointer as it makes my diff smaller).  :)

>  > +void
>  > +rb_iseq_each_value(const rb_iseq_t *iseq, iseq_value_itr_t * func, void *data)
>  
>  Shouldn't this be static? (or do you have other changes planned?)

Yep, it should be static.  Thank you!

>  > +static void
>  > +each_insn_value(void *ctx, VALUE obj)
>  > +{
>  > +    return rb_gc_mark(obj);
>  > +}
>  
>  Needless "return" statement.  I remember some compilers/options
>  complain about the "return" despite rb_gc_mark also being void,
>  even.

Removed! 

Thanks for the review.  I've uploaded a new patch with the static change, and removed the needless return statement.

Another benefit of this patch that I forgot to mention is that it makes reading heap dumps much easier.  Before you have to chase references through an array: iseq->array->literal, but now it's just iseq->literal.

Anyway, thanks again for the review! :)


----------------------------------------
Feature #14370: Directly mark instruction operands and avoid mark_ary usage on rb_iseq_constant_body
https://bugs.ruby-lang.org/issues/14370#change-69630

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version: 
----------------------------------------
Hi,

I've attached a patch that changes rb_iseq_mark to directly mark instruction operands rather than adding them to a mark array.  I observed a ~3% memory reduction by directly marking operands, and I didn't observe any difference in GC time.  To test memory usage, I used a basic Rails application, logged all malloc / free calls to a file, then wrote a script that would sum the live memory at each sample (each sample being a call to malloc).  I graphed these totals so that I could see the memory usage as malloc calls were made:

![memory usage graph](https://user-images.githubusercontent.com/3124/35020270-1b0ded20-fae0-11e7-9cbd-1d028a6c9484.png)

The red line is trunk, the blue line is trunk + the patch I've attached.  Since the X axis is sample number (not time), the blue line is not as long as the red line because the blue line calls `malloc` fewer times.  The Y axis in the graph is the total number of "live" bytes that have been allocated (all allocations minus their corresponding frees).  You can see from the graph that memory savings start adding up as more code gets loaded.

I was concerned that this patch might impact GC time, but `make gcbench-rdoc` didn't seem to show any significant difference in GC time between trunk and this patch.  If it turns out there is a performance impact, I think I could improve the time while still keeping memory usage low by generating a bitmap during iseq compilation.

There is a bit more information where I've been working, but I think I've summarized everything here.

  https://github.com/github/ruby/pull/39


---Files--------------------------------
iseq_mark.diff (6.28 KB)
iseq_mark.diff (6.28 KB)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>