Btw, I also hope to experiment with a slab allocator since many internal
objects are around the same size (as in an OS kernel).  The idea
originally comes from the Solaris kernel, but is also used in Linux and
FreeBSD.  One benefit of slab allocators over a general-purpose malloc
is that malloc has too little context/information to make some
decisions:

* long-lived vs short-lived (good for CoW)
* shared between threads or not
* future allocations of the same class

Notes on slab: I don't think caching constructed objects like the
reference Solaris implementation does is necessary (or even good),
since without constructors it should be possible to transparently merge
objects of different classes into the same slabs when their sizes match
(like SLUB in Linux, I think).
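
To make the merging idea a bit more concrete, here is a rough C sketch.
It is not the Solaris or SLUB design, only an illustration: every name
in it (slab_cache_get, slab_alloc, slab_free, SLAB_BYTES, SIZE_ALIGN,
MAX_CACHES) is made up for this example, and the 4096-byte slab and
16-byte size rounding are assumptions.  Objects are handed out raw
(no constructor), so two classes whose rounded sizes match can share
one cache:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLAB_BYTES 4096 /* assumed slab size, not from any real allocator */
#define SIZE_ALIGN 16   /* assumed size-class granularity */
#define MAX_CACHES 16   /* assumed upper bound on distinct size classes */

/* A free object stores the free-list link in its own memory. */
struct free_obj { struct free_obj *next; };

struct slab_cache {
    size_t obj_size;            /* rounded object size */
    struct free_obj *free_list; /* raw, unconstructed objects */
};

static struct slab_cache caches[MAX_CACHES];
static size_t ncaches;

/* Look up (or create) the cache for `size`.  Classes whose rounded
 * sizes are equal get the same cache, i.e. they are merged. */
static struct slab_cache *slab_cache_get(size_t size)
{
    if (size == 0)
        size = 1;
    size_t rounded = (size + SIZE_ALIGN - 1) / SIZE_ALIGN * SIZE_ALIGN;
    if (rounded > SLAB_BYTES)
        return NULL; /* too big for a slab; caller should use malloc */
    for (size_t i = 0; i < ncaches; i++)
        if (caches[i].obj_size == rounded)
            return &caches[i];
    if (ncaches == MAX_CACHES)
        return NULL;
    caches[ncaches].obj_size = rounded;
    caches[ncaches].free_list = NULL;
    return &caches[ncaches++];
}

/* Carve one new slab into objects and push them on the free list. */
static int slab_grow(struct slab_cache *c)
{
    char *slab = malloc(SLAB_BYTES);
    if (!slab)
        return -1;
    for (size_t off = 0; off + c->obj_size <= SLAB_BYTES; off += c->obj_size) {
        struct free_obj *o = (struct free_obj *)(slab + off);
        o->next = c->free_list;
        c->free_list = o;
    }
    return 0;
}

static void *slab_alloc(struct slab_cache *c)
{
    if (!c->free_list && slab_grow(c) != 0)
        return NULL;
    struct free_obj *o = c->free_list;
    c->free_list = o->next;
    return o;
}

static void slab_free(struct slab_cache *c, void *p)
{
    assert(p != NULL);
    struct free_obj *o = p;
    o->next = c->free_list; /* LIFO: most recently freed is reused first */
    c->free_list = o;
}
```

With this, slab_cache_get(40) and slab_cache_get(48) return the same
cache, while slab_cache_get(64) gets its own.  The sketch never returns
slabs to the system and is not thread-safe; a real version would need
both.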

Anyways, I think jemalloc is a great general-purpose malloc for things
that don't fit well into slabs.  And it should be easy to let a slab
implementation switch back to general-purpose malloc for
testing/benching.
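
One way the switch could look, as a sketch only: a compile-time flag
that routes the same call sites either to the slab code or straight to
malloc/free.  USE_SLAB, obj_alloc, obj_free, slab_alloc_size and
slab_free_size are all hypothetical names, not existing APIs.

```c
#include <stdlib.h>

/* Hypothetical build flag: compile with -DUSE_SLAB=1 to use slabs. */
#ifndef USE_SLAB
#define USE_SLAB 0
#endif

#if USE_SLAB
/* Hypothetical slab entry points, provided elsewhere. */
void *slab_alloc_size(size_t size);
void slab_free_size(void *p, size_t size);
#define obj_alloc(size)   slab_alloc_size(size)
#define obj_free(p, size) slab_free_size((p), (size))
#else
/* Fallback: general-purpose malloc (jemalloc if linked in). */
#define obj_alloc(size)   malloc(size)
#define obj_free(p, size) ((void)(size), free(p))
#endif
```

Passing the size to obj_free is a design choice here: the slab side can
find the right cache without a lookup, and the malloc side just ignores
it.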