Narihiro Nakamura <authornari / gmail.com> wrote:
> 2012/1/8 KOSAKI Motohiro <kosaki.motohiro / gmail.com>:
> >> Narihiro Nakamura <authornari / gmail.com> wrote:
> >>> * A heap block address is aligned to 16KB so its bitmap can be found quickly.
> >>
> >> Just wondering, why/how did you determine 16K alignment is optimal?
> >> Normal page size in Linux is only 4K, so 16K seems large.
> 
> We defined the heap block size in the following commit:
> https://github.com/ruby/ruby/commit/4d93af26df1c322515e535d60cd5b0a66dcc222d
> I ran benchmarks with different heap block sizes, then chose 16KB
> because it was the fastest.
> But I probably should have chosen 4KB, as you say.

A larger (16KB) _size_ I can easily understand to be more efficient.
I just don't see a reason for _alignment_ to be 16KB, too.  Larger
alignment increases the chance of fragmentation.
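
(For reference, the "find a bitmap fast" claim quoted above presumably
means the usual mask trick: if every block starts on a 16KB boundary,
the block holding any object is found by clearing the low 14 bits of
the object's address.  A rough sketch; the names below are made up for
illustration and are not taken from gc.c:)

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; 16KB as discussed in this thread. */
#define HEAP_BLOCK_ALIGN ((uintptr_t)16 * 1024)

/* Round an interior pointer down to the start of its aligned block. */
static uintptr_t block_base(const void *obj)
{
    /* The mask below is only correct for power-of-two alignments. */
    assert((HEAP_BLOCK_ALIGN & (HEAP_BLOCK_ALIGN - 1)) == 0);
    return (uintptr_t)obj & ~(HEAP_BLOCK_ALIGN - 1);
}
```

This only works when alignment >= block size, which is how the two
numbers end up tied together.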

I've only used posix_memalign() to avoid sharing of L1 cache lines
(with alignment being only 32-128 bytes on modern CPUs).  I'm not
even sure if any larger alignment makes sense for the Ruby heap.
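
The cache-line case looks roughly like this.  It is only a sketch: the
64-byte line size is an assumption (it differs between CPUs) and the
function name is invented for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 64-byte cache lines; real code should detect this. */
#define CACHE_LINE 64

/* Return `size` bytes starting on a cache-line boundary, or NULL. */
static void *alloc_cache_aligned(size_t size)
{
    void *p = NULL;

    /* posix_memalign() returns 0 on success, an errno value otherwise. */
    if (posix_memalign(&p, CACHE_LINE, size) != 0)
        return NULL;
    assert(((uintptr_t)p % CACHE_LINE) == 0);
    return p; /* release with free() */
}
```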

> > Moreover, posix_memalign() and memalign() sound like bad choices. They
> > need a few header bytes, so some malloc implementations might allocate
> > 32K instead of 16K.
> >
> > So, why can't we use mmap() directly, or use a length of "16K - a few bytes"?
> 
> No special reason.
> We should use mmap() directly if that's an efficient way.

Using mmap() directly will avoid (userspace) fragmentation entirely.

On newish glibc, I expect posix_memalign()+free() to be fastest,
especially for fast startup and short-lived processes.  So I think
"16K - 2*sizeof(size_t)" may be best (but I'm too lazy to test :)