I tried jemalloc 3.5.0 vs eglibc 2.13-38 (Debian x86_64) [1]:

http://80x24.org/bmlog-20140126-003136.7320.gz

Mostly close results, but I think our "make benchmark" suite is
incomplete and we need more fork/concurrency-intensive benchmarks of
large apps.

io_file_read and vm2_bigarray seem to be big losses for jemalloc because
it tends to release large allocations back to the kernel more
aggressively (and the kernel must zero that memory again before handing
it back to the process).
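
A rough way to see this effect outside the benchmark suite is a script
that allocates, fills, and drops one large buffer in a loop. This is
only a sketch: the method name, buffer size, and round count are made
up for illustration and are not taken from io_file_read or
vm2_bigarray.

```ruby
require 'benchmark'

# Hypothetical workload (not part of "make benchmark"): repeatedly
# allocate one large String, touch it, and drop it.  The default size
# and round count are illustrative assumptions only.
def churn_large_buffers(size: 8 * 1024 * 1024, rounds: 50)
  checksum = 0
  rounds.times do |i|
    buf = "\0".b * size               # one large, fully written buffer
    buf.setbyte(size - 1, i & 0xff)   # modify it so the work is observable
    checksum += buf.getbyte(size - 1)
    buf.clear                         # in CRuby this should free the heap buffer promptly
  end
  checksum
end

if __FILE__ == $PROGRAM_NAME
  elapsed = Benchmark.realtime { churn_large_buffers }
  printf("%.3f s\n", elapsed)
end
```

Running it once with the system malloc and once with jemalloc loaded
via LD_PRELOAD (the library path depends on the system) should give a
crude comparison of how each allocator handles repeated large
allocations. It is no substitute for the real benchmarks.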

[1] I have applied two patches for improved benchmark consistency:
        https://bugs.ruby-lang.org/issues/5985#change-44442
        https://bugs.ruby-lang.org/issues/9430
    (Note: I still don't trust the vm_thread* benchmarks too much;
     they seem very inconsistent even with no modifications.)