On 01/06/2014 04:52 PM, SASADA Koichi wrote:
> Could you try same measurement
> https://github.com/ruby/ruby/pull/495#issuecomment-31580604
> with only adding dummy padding to RVALUE (and not extend embed area) if
> it is easy to try?

Give me a moment.  It is not difficult, but it takes some time.
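
To make sure we mean the same experiment, here is a rough model of the
padding-only variant.  It is only a sketch: the field names and the
2 + 3 word split are a simplified, hypothetical picture of RVALUE, not the
actual definition in gc.c, and `PaddedRValue` / `dummy_padding` are names
made up for illustration.  The point is that the slot grows from 5 to 8
words while the embed area stays at 3 words.

```python
import ctypes

WORD = ctypes.c_void_p  # one machine word (8 bytes on a 64-bit build)


class RValue(ctypes.Structure):
    # Simplified, hypothetical model: 2 header words + 3 embed words = 5 words.
    _fields_ = [
        ("flags", WORD),
        ("klass", WORD),
        ("embed", WORD * 3),
    ]


class PaddedRValue(ctypes.Structure):
    # Same header and same embed area, plus 3 unused words so that
    # one slot occupies 8 words.  The padding is never read or written.
    _fields_ = [
        ("flags", WORD),
        ("klass", WORD),
        ("embed", WORD * 3),
        ("dummy_padding", WORD * 3),
    ]


word_size = ctypes.sizeof(WORD)
print("plain slot :", ctypes.sizeof(RValue) // word_size, "words")
print("padded slot:", ctypes.sizeof(PaddedRValue) // word_size, "words")
print("embed area :", RValue.embed.size // word_size, "vs",
      PaddedRValue.embed.size // word_size, "words")
```

If the padded build is faster even though the embed area is unchanged, that
would point at the cache-line effect rather than at cheaper allocation.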

> If your assumption:
> 
>> The problem is, 5 is a prime number. So cache mechanisms of any size
>> cannot store this struct efficiently. Most notably, CPUs have been
>> equipped with data caches since their mid age; Ruby's objects do not
>> suit there. That does not always mean a breakage but significant
>> slowdown is happening.
> 
> is true, the performance will improve without extending embed data area.
> At least, the improvement of vm3_gc is mainly from lightweight Hash
> allocation, I guess.

Agreed.  The vm3_gc boost comes "mainly" from allocating {""=>""}.  In my
experience, cache optimization alone gives a speedup of at most 10%.
Anything beyond that should be due to side effects.
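
As a back-of-the-envelope illustration of the cache-line point, the sketch
below counts how many slots span two cache lines.  The assumptions are
mine: 8-byte words, 64-byte cache lines, and slots packed back to back
starting at a line-aligned address.  `straddling_fraction` is a helper
name made up for this example.  It says nothing about actual timings,
only about layout.

```python
def straddling_fraction(slot_size, line_size=64, n_slots=1024):
    """Fraction of back-to-back slots that span more than one cache line.

    Assumes the first slot starts at a cache-line-aligned address.
    """
    straddling = 0
    for i in range(n_slots):
        start = i * slot_size
        first_line = start // line_size
        last_line = (start + slot_size - 1) // line_size
        if first_line != last_line:
            straddling += 1
    return straddling / n_slots


WORD_BYTES = 8  # assumption: 64-bit build

for words in (4, 5, 8):
    fraction = straddling_fraction(words * WORD_BYTES)
    print(f"{words}-word slots: {fraction:.0%} span two cache lines")
```

Under these assumptions half of the 5-word slots cross a line boundary,
while 4-word and 8-word slots never do.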

> If the assumption "only allocating overhead is issue" is true, we can
> discuss lightweight memory allocation techniques (which includes
> increasing RVALUE size and expand embed area). If cache line mismatch is
> issue as you said, we can consider about cache line in other area.

Lightweight memory allocation is a good thing to have anyway, no?