Issue #12236 has been updated by Nate Berkopec.


Yura Sokolov wrote:
> > I believe we don't have any drawback with 256MB virtual memory allocation. (but it can depend on OSs)
> 
> It may depends on devices: one may wish to run ruby script on MIPS linux box with 64MB memory.
> Doubtfully, kernel will be configured with over-commit on such system.
> 
> Does it really matters for performance ARENA_SIZE to be 256MB ?
> May be it could be as small as 4MB without noticeable degradation?
> Or may it be compile-time parameter with access from `configure` and default to 256MB?

Well, as long as your Ruby process never actually needs more than 64MB of memory, that's not a problem, right? And I don't think there's any safeguards in Ruby today against overcommit anyway, so I don't see how this proposal is any different in that regard (but maybe I misunderstand).

I guess it would break if the OS has overcommit protections enabled. Example: vm.overcommit_memory 2 on Linux. But I don't think this is a common case?


----------------------------------------
Feature #12236: Introduce `mmap` managed heap
https://bugs.ruby-lang.org/issues/12236#change-61240

* Author: Koichi Sasada
* Status: Open
* Priority: Normal
* Assignee: Koichi Sasada
----------------------------------------
# Abstract

I propose `mmap` (or similar API on Windows) managed heap.

Now, each heap pages are allocated by posix_memalign(), 16KB for each pages.
I wrote a patch to manage heap pages allocated by bulk `mmap` (256MB),
and provide pages by 4KB each pages.

This technique can reduce several overhead,
and can reduce fragmentation problem by reducing page size.

* https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena
* https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena.patch
* https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena.diff

# Terminology

* Slot: Object representation. Each object consumes a slot. Now, a slot consumes 5 words.
* Page: A set of slots. Now, page size is 16KB (so that 400 slots available in a page on 64bit CPU)
* Heap: A set of pages. Now, there is only one heap.

# Background

## Fragmentation

Before Ruby 1.9, the page size was very big (and only few pages).
It was easy to manage, but we couldn't free pages because of fragmentation problem.

To relax this problem, from Ruby 1.9, we split pages into small fixed size (16KB).
The possibility to release pages are increased,
but there are several overhead to manage huge number of pages.

We introduced sorted list to manage such many pages.
This list is used by conservative marking
(because we need to know that a pointer points a slot or not).

## Alignment

Ruby 2.0.0 introduced a bitmap marking technique, to improve CoW friendliness.
To implement this technique, we need to align each pages to access corresponding bitmaps.
So we have used posix_memalign(). This is why

# Proposal

We propose `mmap` managed heap.

Allocate a big area (call it arena) and provide pages from an arena.
Only `mmap`, we only reserve virtual memory, not acquire physical memory.

We can use `madvise` with `MADV_DONTNEED` or `MADV_FREE` to return a page to OS. 

Redefined words:

* New page: A set of slots. Now, page size is 4KB (so that 100 slots available in a page on 64bit CPU)
* Arena: a set of pages. Now, arena size is 256MB (so that 65536 pages in an arena)
* Heap: a set of arenas.

# Discussion

* Advantages:
  * We don't need to manage sorted_list.
  * We can manage small size pages to improve empty possibility.
  * Easy to search bitmap bits because we can manage bitmaps per arenas.
* Disadvantages
  * We need to manage x4 more pages. We need to avoid O(n) (n: page number) operations from the GC process.
  * Difficult to port on a system which doesn't have `mmap` (or similar API)
  * We consume huge virtual memory (at least 256MB). But anybody worries about that?

# Optimizations

To avoid O(n) operations, we introduced a generational technique for sweeping.
So that we can say "generational sweeping".
The technique is simple. If a page has only old generational slots,
then we can skip sweeping on such page.

We separate pages into "collectible" pages and "uncollectible" pages,
and sweep only "collectible" pages.

# Evaluation

Actually, we can't observe impressive result on micro-benchmarks and my small rails application.
However, we believe that this technique will help future optimizations (such as separated heaps by types or lifetime, and so on).

Any comments?




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>