Issue #12236 has been reported by Koichi Sasada.

----------------------------------------
Feature #12236: Introduce `mmap` managed heap
https://bugs.ruby-lang.org/issues/12236

* Author: Koichi Sasada
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
# Abstract

I propose `mmap` (or similar API on Windows) managed heap.

Now, each heap pages are allocated by posix_memalign(), 16KB for each pages.
I wrote a patch to manage heap pages allocated by bulk `mmap` (256MB),
and provide pages by 4KB each pages.

This technique can reduce several overhead,
and can reduce fragmentation problem by reducing page size.

* https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena
* https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena.patch
* https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena.diff

# Terminology

* Slot: Object representation. Each object consumes a slot. Now, a slot consumes 5 words.
* Page: A set of slots. Now, page size is 16KB (so that 400 slots available in a page on 64bit CPU)
* Heap: A set of pages. Now, there is only one heap.

# Background

## Fragmentation

Before Ruby 1.9, the page size was very big (and only few pages).
It was easy to manage, but we couldn't free pages because of fragmentation problem.

To relax this problem, from Ruby 1.9, we split pages into small fixed size (16KB).
The possibility to release pages are increased,
but there are several overhead to manage huge number of pages.

We introduced sorted list to manage such many pages.
This list is used by conservative marking
(because we need to know that a pointer points a slot or not).

## Alignment

Ruby 2.0.0 introduced a bitmap marking technique, to improve CoW friendliness.
To implement this technique, we need to align each pages to access corresponding bitmaps.
So we have used posix_memalign(). This is why

# Proposal

We propose `mmap` managed heap.

Allocate a big area (call it arena) and provide pages from an arena.
Only `mmap`, we only reserve virtual memory, not acquire physical memory.

We can use `madvise` with `MADV_DONTNEED` or `MADV_FREE` to return a page to OS. 

Redefined words:

* New page: A set of slots. Now, page size is 4KB (so that 100 slots available in a page on 64bit CPU)
* Arena: a set of pages. Now, arena size is 256MB (so that 65536 pages in an arena)
* Heap: a set of arenas.

# Discussion

* Advantages:
  * We don't need to manage sorted_list.
  * We can manage small size pages to improve empty possibility.
  * Easy to search bitmap bits because we can manage bitmaps per arenas.
* Disadvantages
  * We need to manage x4 more pages. We need to avoid O(n) (n: page number) operations from the GC process.
  * Difficult to port on a system which doesn't have `mmap` (or similar API)
  * We consume huge virtual memory (at least 256MB). But anybody worries about that?

# Optimizations

To avoid O(n) operations, we introduced a generational technique for sweeping.
So that we can say "generational sweeping".
The technique is simple. If a page has only old generational slots,
then we can skip sweeping on such page.

We separate pages into "collectible" pages and "uncollectible" pages,
and sweep only "collectible" pages.

# Evaluation

Actually, we can't observe impressive result on micro-benchmarks and my small rails application.
However, we believe that this technique will help future optimizations (such as separated heaps by types or lifetime, and so on).

Any comments?




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>