On Tue, Oct 25, 2011 at 11:35 AM, Carter Cheng <cartercheng / gmail.com> wrote:
>
> Hello Charlie,
> I did notice, reading through some of the papers on the current efforts to JIT JavaScript (V8, SpiderMonkey and SquirrelFish), that they are quite similar to Ruby in some ways but that Ruby offers unique challenges. Are there some descriptions out there of the JRuby effort and the internals?

There's an oldish description here, but it could use some refreshing:
https://github.com/jruby/jruby/wiki/JRubyInternalDesign

Basically, there are two phases to JRuby's JITing. First there's the
JRuby compiler, which compiles from the AST (which we interpret
against) into JVM bytecode. The primary goal of the bytecode compiler
is to map Ruby language constructs to JVM constructs as closely as
possible. For example, it maps Ruby local variables to JVM local
variables whenever possible, which requires detecting uses of
closures, methods like 'eval', and other constructs that need to see
those variables at runtime. The set of optimizations that compiler
does is fairly limited.
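
To make that concrete, here's a rough sketch in plain Java of the
decision the compiler is making. It is not JRuby's actual codegen; the
DynamicScope stand-in and the method names are invented for
illustration. A local that no block or 'eval' can see maps straight to
a JVM local slot, while a captured local has to live in a
heap-allocated scope object that the closure can still reach:

public class LocalVariableSketch {

    // Hypothetical stand-in for a heap-allocated variable scope.
    static final class DynamicScope {
        final Object[] values;
        DynamicScope(int size) { values = new Object[size]; }
    }

    // Ruby:  def sum(a, b); x = a + b; x; end
    // No closure and no eval can see 'x', so it can be a plain JVM local.
    static int sumAsJvmLocal(int a, int b) {
        int x = a + b;   // compiles down to an ordinary JVM local slot
        return x;
    }

    // Ruby:  def counter; x = 0; lambda { x += 1 }; end
    // The block captures 'x', so 'x' must live on the heap where the
    // closure can still read and write it after the method returns.
    static java.util.function.IntSupplier counterWithHeapScope() {
        DynamicScope scope = new DynamicScope(1);
        scope.values[0] = 0;                      // 'x' lives in the scope
        return () -> {
            int next = (Integer) scope.values[0] + 1;
            scope.values[0] = next;               // shared, mutable state
            return next;
        };
    }

    public static void main(String[] args) {
        System.out.println(sumAsJvmLocal(2, 3));  // 5
        java.util.function.IntSupplier c = counterWithHeapScope();
        System.out.println(c.getAsInt());         // 1
        System.out.println(c.getAsInt());         // 2
    }
}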

Then, after the bytecode runs for a while, the JVM's own JIT will take
over. The different JVMs all have their own optimization strategies,
but in general they're extremely good at it. I won't go into the
details here; there are papers on each of the JVMs, and lots of
publicly available information about how Hotspot/OpenJDK optimizes.

We also have a third compiler that will be in use soon, which
produces a CFG-based intermediate representation (the so-called "IR
Compiler"). This new compiler will give us the opportunity to apply
more traditional compiler optimization techniques, improving the
quality of the JVM bytecode we produce so that the JVMs have better
visibility to take things from there. I'm not the expert on that
compiler, but I can point you toward folks who are.
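
For a rough idea of what "CFG-based" means here, the sketch below
models a fragment like 'x = a + b; if x > 0 then y = x else y = 0 end'
as basic blocks of simple instructions connected by successor edges,
which is the shape classic optimizations (dead-code elimination,
constant propagation, and so on) operate over. Again this is plain
Java invented for illustration; the Instr and BasicBlock classes are
not JRuby's IR types:

import java.util.ArrayList;
import java.util.List;

public class CfgSketch {
    // Invented for illustration; not JRuby's IR classes.
    record Instr(String target, String op, String... args) {
        public String toString() {
            return target + " = " + op + " " + String.join(", ", args);
        }
    }

    static final class BasicBlock {
        final String label;
        final List<Instr> instrs = new ArrayList<>();
        final List<BasicBlock> successors = new ArrayList<>();
        BasicBlock(String label) { this.label = label; }
    }

    public static void main(String[] args) {
        // Ruby-ish:  x = a + b; if x > 0 then y = x else y = 0 end
        BasicBlock entry = new BasicBlock("entry");
        BasicBlock thenB = new BasicBlock("then");
        BasicBlock elseB = new BasicBlock("else");
        BasicBlock exit  = new BasicBlock("exit");

        entry.instrs.add(new Instr("x",  "add", "a", "b"));
        entry.instrs.add(new Instr("t0", "gt",  "x", "0"));
        entry.successors.add(thenB);   // branch when t0 is true
        entry.successors.add(elseB);   // otherwise

        thenB.instrs.add(new Instr("y", "copy", "x"));
        thenB.successors.add(exit);

        elseB.instrs.add(new Instr("y", "copy", "0"));
        elseB.successors.add(exit);

        // Dump the CFG; an optimizer would walk these blocks and edges.
        for (BasicBlock bb : List.of(entry, thenB, elseB, exit)) {
            System.out.println(bb.label + ":");
            bb.instrs.forEach(i -> System.out.println("  " + i));
        }
    }
}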

> My sense is that there are opportunities here for further improvement, though with MRI I would
> ideally like to restructure the bytecode IL format (though I am uncertain if this is a good idea given the direction it looks like 2.0 may be going).

My understanding of optimizing bytecode compilers tells me that having
a good IL is very important. You want enough visibility into the
fine-grained operations (so you can optimize well) but you don't want
such a massive instruction set that the optimizer becomes gigantic. I
don't think MRI's current bytecode set is "too big", but they've
started to add "superinstructions", which are (to me) a clear sign of
premature optimization.
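
For what it's worth, a "superinstruction" is just two or more opcodes
that frequently appear together fused into one instruction, so the
interpreter pays one dispatch instead of several. The toy stack
interpreter below shows the idea, and also why it grows the
instruction set an optimizer has to understand. The opcode names and
the interpreter itself are made up; this is not MRI's instruction set:

import java.util.ArrayDeque;
import java.util.Deque;

public class SuperinstructionSketch {
    // Fine-grained opcodes plus one fused "superinstruction".
    enum Op { PUSH_CONST, ADD, PUSH_CONST_ADD /* fused */, PRINT }

    static void run(Object[][] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Object[] insn : code) {
            switch ((Op) insn[0]) {
                case PUSH_CONST     -> stack.push((Integer) insn[1]);
                case ADD            -> stack.push(stack.pop() + stack.pop());
                // One dispatch instead of two; the win is fewer dispatches,
                // the cost is a bigger instruction set.
                case PUSH_CONST_ADD -> stack.push(stack.pop() + (Integer) insn[1]);
                case PRINT          -> System.out.println(stack.peek());
            }
        }
    }

    public static void main(String[] args) {
        // 1 + 2, written with fine-grained ops...
        run(new Object[][] {
            { Op.PUSH_CONST, 1 }, { Op.PUSH_CONST, 2 }, { Op.ADD }, { Op.PRINT }
        });
        // ...and with the fused superinstruction.
        run(new Object[][] {
            { Op.PUSH_CONST, 1 }, { Op.PUSH_CONST_ADD, 2 }, { Op.PRINT }
        });
    }
}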

- Charlie