Issue #13438 has been updated by jeremyevans0 (Jeremy Evans).

File 0001-Fix-heap-overflow-by-allocating-more-memory-per-heap.patch added

jeremyevans0 (Jeremy Evans) wrote:
> 1) The heap overflow only happens when the operating system uses <16kb pages and ruby is set to use 16k heap pages.
> 
> 2) The heap overflow always happens when ruby uses 16kb heap pages.
> 
> If 1) is true, then this should be the only fix necessary.  If 2) is true, then there is a separate memory issue that should be fixed. I suspect the problem is more likely 2), but this is outside my area of expertise.

I did some testing with different values of HEAP_PAGE_ALIGN_LOG.  Here are the results, with the first column being HEAP_PAGE_ALIGN_LOG, the second the page size, and the third the result:

~~~
6  64B   Couldn't build: SIGFPE
7  128B  No error
8  256B  No error
9  512B  No error
10 1KB   No error
11 2KB   No error
12 4KB   No error
13 8KB   chunk canary corrupted 0x1fd8@0x1fd8
14 16KB  chunk canary corrupted 0x3fd8@0x3fd8
15 32KB  chunk canary corrupted 0x7fd8@0x7fd8
16 64KB  chunk canary corrupted 0xffd8@0xffd8
17 128KB chunk canary corrupted 0x1ffd8@0x1ffd8
18 256KB chunk canary corrupted 0x3ffd8@0x3ffd8
19 512KB Couldn't build: Failed to allocate memory
~~~

At first I thought the heap overflow only happens when using >4KB pages and not when using <=4KB pages.  However, it turns out the OpenBSD malloc canary support is only turned on when allocating >=4KB, so the lack of an error with <=4KB pages doesn't show that there is no overflow.  This leads me to believe that there is always a heap overflow, no matter the HEAP_PAGE_ALIGN_LOG value.

I tried increasing the size passed to `aligned_malloc` to see if I could determine the size of the overflow.  It turns out that it overflows not by a single byte, but by 40 bytes.  Coincidentally, that is also the value of REQUIRED_SIZE_BY_MALLOC.  Maybe REQUIRED_SIZE_BY_MALLOC just needs to be added when calling `aligned_malloc`? I tried that and it appears to fix things.
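The sizes in the canary reports above are consistent with that 40-byte figure: each one is exactly 40 bytes less than the corresponding page size (for example 0x3fd8 = 0x4000 - 40).  A minimal sketch of the arithmetic, assuming the 40-byte value reported here for this 64-bit build; the function and macro names are made up for illustration and are not taken from gc.c:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 40 bytes, the overflow size measured in this message. */
#define ASSUMED_REQUIRED_SIZE_BY_MALLOC 40UL

/* Alignment (and nominal page size) implied by a HEAP_PAGE_ALIGN_LOG value. */
static unsigned long heap_page_align(unsigned int align_log)
{
    return 1UL << align_log;
}

/* Page size minus the assumed 40 bytes; this matches the size
 * printed in each "chunk canary corrupted" line of the table. */
static unsigned long reported_chunk_size(unsigned int align_log)
{
    return heap_page_align(align_log) - ASSUMED_REQUIRED_SIZE_BY_MALLOC;
}
```

Calling `reported_chunk_size(14)` gives 0x3fd8, the value reported for the default 16KB page.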

The attached patch should fix the heap overflow for all page sizes that compile (tested on HEAP_PAGE_ALIGN_LOG 7..18).

----------------------------------------
Bug #13438: Fix heap overflow due to configure.in not being updated for HEAP_* -> HEAP_PAGE_* variable renaming
https://bugs.ruby-lang.org/issues/13438#change-64297

* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version: 
* ruby -v: ruby 2.5.0dev (2017-04-15 trunk 58358) [x86_64-openbsd]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: REQUIRED
----------------------------------------
An OpenBSD user reported that ruby 2.4.1 fails on OpenBSD when malloc canaries are enabled.  I verified this is true, not just on ruby 2.4.1, but also on trunk:

~~~
MALLOC_OPTIONS=C ruby25 -v
ruby 2.5.0dev (2017-04-15 trunk 58358) [x86_64-openbsd]
ruby25(13588) in free(): chunk canary corrupted 0x8dbf2bb0000 0x3fd8@0x3fd8
Abort trap (core dumped)
~~~

`MALLOC_OPTIONS=C` here turns on malloc canaries.  From the OpenBSD malloc.conf(5) man page:

~~~
     C       "Canaries".  Add canaries at the end of allocations in order to
             detect heap overflows.  The canary's content is checked when
             free(3) is called.  If it has been corrupted, the process is
             aborted.
~~~

So what we have here is a (probably small) heap overflow.  Here's the backtrace:

~~~
(gdb) bt
#0  0x000008dcc9a1014a in thrkill () at {standard input}:5
#1  0x000008dcc99ead29 in *_libc_abort () at /usr/src/lib/libc/stdlib/abort.c:52
#2  0x000008dcc99e1346 in wrterror (d=0x7f7ffffceda0, msg=0x8dcc9b6fea0 "chunk canary corrupted %p %#tx@%#zx") at /usr/src/lib/libc/stdlib/malloc.c:306
#3  0x000008dcc99e1422 in validate_canary (d=Variable "d" is not available.
) at /usr/src/lib/libc/stdlib/malloc.c:1047
#4  0x000008dcc99e28ee in ofree (argpool=0x8dcd75392f0, p=0x8dbf2bb0000, clear=0) at /usr/src/lib/libc/stdlib/malloc.c:1334
#5  0x000008dcc99e2bdd in free (ptr=0x8dbf2bb0000) at /usr/src/lib/libc/stdlib/malloc.c:1414
#6  0x000008dcb4ffdb52 in aligned_free (ptr=0x8dbf2bb0000) at gc.c:7678
#7  0x000008dcb4fef9df in heap_page_free (objspace=0x8dbf1f35400, page=0x8dc825ed800) at gc.c:1446
#8  0x000008dcb4fef752 in rb_objspace_free (objspace=0x8dbf1f35400) at gc.c:1341
#9  0x000008dcb515c168 in ruby_vm_destruct (vm=0x8dc8ca68c00) at vm.c:2191
#10 0x000008dcb4fe12c8 in ruby_cleanup (ex=0) at eval.c:227
#11 0x000008dcb4fe1528 in ruby_run_node (n=0x14) at eval.c:297
#12 0x000008d9eb600624 in main (argc=2, argv=0x7f7ffffcf248) at main.c:36
~~~

Note that this doesn't tell you where the heap overflow happened; it only shows where the heap overflow was detected.
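To illustrate why detection is delayed until `free`, here is a toy model of a trailing canary.  It is only a sketch of the general idea described in the man page excerpt, not OpenBSD's implementation; `toy_malloc`, `toy_free`, and the canary bytes are hypothetical names and values chosen for the example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CANARY_LEN 8

/* Arbitrary marker bytes placed right after the caller's region. */
static const unsigned char toy_canary[TOY_CANARY_LEN] = {
    0xde, 0xad, 0xbe, 0xef, 0xde, 0xad, 0xbe, 0xef
};

/* Allocate `size` usable bytes followed by a canary. */
static void *toy_malloc(size_t size)
{
    unsigned char *p = malloc(size + TOY_CANARY_LEN);
    if (p == NULL)
        return NULL;
    memcpy(p + size, toy_canary, TOY_CANARY_LEN);
    return p;
}

/* Check the canary only at free time.  Returns 1 if it is intact and
 * 0 if it was overwritten (a real allocator would abort instead). */
static int toy_free(void *ptr, size_t size)
{
    int intact = memcmp((unsigned char *)ptr + size,
                        toy_canary, TOY_CANARY_LEN) == 0;
    free(ptr);
    return intact;
}
```

A write one byte past the requested size goes unnoticed when it happens; only the later `toy_free` call reports it, which is why the backtrace points at the free path rather than at the code that overflowed.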

I checked and ruby 1.8.7p374, 1.9.3p551, 2.0.0p648, 2.1.9, 2.2.7, and 2.3.4 do not suffer from this issue.

I determined via `git bisect dea8ea61ea0bf08adb35be6ad47abe3ab955afc4 b58b970db5156766d6e19606d79afc68e4c2df7c` that the problem was introduced between r53467 and r53471.

With r53467 (git checkout 066b825400349c559aa3c1ca7769516c967c41b9):

~~~
ruby24 -v
ruby 2.4.0dev (2016-01-08 trunk 53467) [x86_64-openbsd]
# no error
~~~

With r53471 (git checkout fca0cf6e6b8ab8882f3403e5909c8eb91c5c351e):

~~~
ruby24 -v
ruby 2.4.0dev (2016-01-09 trunk 53471) [x86_64-openbsd]
ruby24(85533) in free(): chunk canary corrupted 0xf7bcfc1c000 0x3fd8@0x3fd8
Abort trap (core dumped)
~~~

Nothing between r53467 and r53471 obviously introduces an overflow.  All the diff does is rename variables.  However, I noticed one part of the diff that is interesting:

~~~
-#ifndef HEAP_ALIGN_LOG
+#ifndef HEAP_PAGE_ALIGN_LOG
 /* default tiny heap size: 16KB */
-#define HEAP_ALIGN_LOG 14
+#define HEAP_PAGE_ALIGN_LOG 14
 #endif
~~~

Before the rename, an already-defined HEAP_ALIGN_LOG overrode the default; after the rename, defining HEAP_ALIGN_LOG no longer has any effect.  I searched for HEAP_ALIGN_LOG and sure enough it is still used by configure.in.  Updating configure.in to change it to HEAP_PAGE_ALIGN_LOG fixes things.
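The effect of the rename on a build that still defines the old name can be reproduced in isolation.  In this sketch the `#define HEAP_ALIGN_LOG 12` line stands in for whatever the build passes under the old name (the value 12 is a made-up example), and `effective_heap_page_align` is a hypothetical helper:

```c
#include <assert.h>

/* Stand-in for a build that still sets the OLD macro name.
 * The value 12 is an assumption made for this example. */
#define HEAP_ALIGN_LOG 12

/* Guard as it reads after the rename: it tests the NEW name,
 * so the definition above is silently ignored. */
#ifndef HEAP_PAGE_ALIGN_LOG
/* default tiny heap size: 16KB */
#define HEAP_PAGE_ALIGN_LOG 14
#endif

/* Alignment the code would actually use. */
static unsigned long effective_heap_page_align(void)
{
    return 1UL << HEAP_PAGE_ALIGN_LOG;
}
```

Here `effective_heap_page_align()` returns 16384 even though the build asked for a log of 12, since only the new name is consulted.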

Attached is a patch to fix this.  This should be applied and backported to the 2.4 branch.

---Files--------------------------------
0001-Fix-heap-overflow-if-system-uses-16kb-pages.patch (1.12 KB)
0001-Fix-heap-overflow-by-allocating-more-memory-per-heap.patch (929 Bytes)

