
On 7/26/06, Chad Perrin <perrin / apotheon.com> wrote:
>
> The canonical example for comparison, I suppose, is the Java VM vs. the
> Perl JIT compiler.  In Java, the source is compiled to bytecode and
> stored.  In Perl, the source remains in source form, and is stored as
> ASCII (or whatever).  When execution happens with Java, the VM actually
> interprets the bytecode.  Java bytecode is compiled for a virtual
> computer system (the "virtual machine"), which then runs the code as
> though it were native binary compiled for this virtual machine.  That
> virtual machine is, from the perspective of the OS, an interpreter,
> however.  Thus, Java is generally half-compiled and half-interpreted,
> which speeds up the interpretation process.


Half true. The Java VM could be called "half-compiled and half-interpreted"
at runtime for only a short time, and only if you do not consider VM
bytecode to be a valid "compiled" state. However, most frequently executed
bytecode is very quickly compiled into processor-native code, making those
bits fully compiled. After a long enough runtime (not very long in
practice), essentially all of the code that matters for performance is
running as native code for the target processor (with various degrees of
optimization and overhead).

The difference from AOT compilation with GCC or .NET is that Java's
compiler can use runtime profiling to decide *how* to compile that "last
mile" as well as possible. The bytecode compilation does, as you say,
primarily speed up the interpretation process. However, it's far from the
whole story, and the runtime JITing of bytecode into native code is where
the magic lives. To miss that is to miss the greatest single feature of the
JVM.
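
As a rough illustration of the "profile first, compile later" idea, here is
a toy sketch in Python. It is not how the JVM is implemented: the class
name, the threshold, and the hand-written "fast" version standing in for
generated native code are all assumptions made up for this example.

```python
# Toy model of profile-driven compilation. Everything here is hypothetical:
# a real JIT generates native code; this sketch just swaps in a faster
# hand-written function once a call counter marks the code as "hot".

HOT_THRESHOLD = 3  # made-up value; real VMs use much larger, tunable counts


class ProfiledFunction:
    def __init__(self, slow, build_fast):
        self.slow = slow              # the "interpreted" version
        self.build_fast = build_fast  # stand-in for the JIT compiler
        self.calls = 0                # profiling data gathered at runtime
        self.fast = None              # filled in once the code is hot

    def __call__(self, *args):
        if self.fast is not None:
            return self.fast(*args)   # already "compiled": take the fast path
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.fast = self.build_fast()  # "compile" for later calls
        return self.slow(*args)


def triangular_slow(n):
    # Deliberately naive loop, playing the role of interpreted code.
    total = 0
    for i in range(n + 1):
        total += i
    return total


def build_fast():
    # Stand-in for generated native code: same result, less work per call.
    return lambda n: n * (n + 1) // 2


triangular = ProfiledFunction(triangular_slow, build_fast)
results = [triangular(10) for _ in range(5)]
```

The first three calls run the slow version; from the fourth call on, the
swapped-in version handles everything, and the caller never sees the change.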

> When execution happens in Perl 5.x, on the other hand, a compiler runs
> at execution time, compiling executable binary code from the source.  It
> does so in stages, however, to allow for the dynamic runtime effects of
> Perl to take place -- which is one reason the JIT compiler is generally
> preferable to a compiler of persistent binary executables in the style
> of C.  Perl is, thus, technically a compiled language, and not an
> interpreted language like Ruby.


I am not familiar with Perl's compiler. Does it compile to processor-native
code or to an intermediate bytecode of some kind?

We're also juggling terms pretty loosely here. A compiler converts
human-readable code into machine-readable code. If the "machine" is a VM,
then you're fully compiling. If the VM code later gets compiled into "real
machine" code, that's another compile cycle. Compilation isn't as cut and
dried as you make it out to be, and claiming that, for example, Java is
"half compiled" is just plain wrong.
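
To make the "compiling for a VM is still compiling" point concrete, here is
a minimal sketch in Python. The opcodes, the tuple-based expression format,
and the function names are invented for this example and do not correspond
to Java or Perl bytecode.

```python
# Toy compiler and stack VM. The instruction set (PUSH, ADD, MUL) and the
# nested-tuple source format are hypothetical, chosen only for illustration.

OPCODES = {"+": "ADD", "*": "MUL"}


def compile_expr(node, code=None):
    """Flatten a nested (op, left, right) tuple into a linear instruction list."""
    if code is None:
        code = []
    if isinstance(node, (int, float)):
        code.append(("PUSH", node))
    else:
        op, left, right = node
        compile_expr(left, code)    # operands first...
        compile_expr(right, code)
        code.append((OPCODES[op], None))  # ...then the operator
    return code


def run(code):
    """Execute the instruction list on an operand stack (the 'machine')."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


program = ("+", 2, ("*", 3, 4))    # 2 + 3 * 4
bytecode = compile_expr(program)   # the complete "compile" step
result = run(bytecode)             # the VM executing its machine code
```

Once `compile_expr` has run, the tree is gone and only instructions for the
target machine remain. A later pass that turned `bytecode` into real
processor instructions would simply be a second compile cycle.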

> Something akin to bytecode compilation could be used to improve upon the
> execution speed of Perl programs without diverging from the
> JIT-compilation execution it currently uses and also without giving up
> any of the dynamic runtime capabilities of Perl.  This would involve
> running the first (couple of) pass(es) of the compiler to produce a
> persistent binary compiled file with the dynamic elements still left in
> an uncompiled form, to be JIT-compiled at execution time.  That would
> probably grant the best performance available for a dynamic language,
> and would avoid the overhead of a VM implementation.  It would, however,
> require some pretty clever programmers to implement in a sane fashion.


There are a lot of clever programmers out there.

> I'm not entirely certain that would be appropriate for Ruby, considering
> how much of the language ends up being dynamic in implementation, but it
> bothers me that it doesn't even seem to be up for discussion.  In fact,
> Perl is heading in the direction of a VM implementation with Perl 6,
> despite the performance successes of the Perl 5.x compiler.  Rather than
> improve upon an implementation that is working brilliantly, they seem
> intent upon tossing it out and creating a different implementation
> altogether that, as far as I can see, doesn't hold out much hope for
> improvement.  I could, of course, be wrong about that, but that's how it
> looks from where I'm standing.


Having worked heavily on a Ruby implementation, I can say for certain that
99% of Ruby code is static. There are some dynamic bits, especially within
Rails where methods are juggled about like flaming swords, but even these
dynamic bits eventually settle into mostly-static sections of code.
Compilation of Ruby code into either bytecode for a fast interpreter engine
like YARV or into bytecode for a VM like the JVM is therefore perfectly
valid and very effective. Preliminary compiler results for JRuby show a 50%
performance boost over previous versions, and that's without optimizing
many of the more expensive Ruby operations (call logic, block management).
Whether a VM is present (as in JRuby) or not (as may be the case with YARV),
eliminating the overhead of per-node interpretation is a big positive. JRuby
will also feature a JIT compiler to allow running arbitrary .rb files
directly, optimizing them as needed and as their runtime characteristics
justify. I don't know if YARV will do the same, but it's a good idea.
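
For a feel of what "per-node interpretation overhead" means, here is a
small Python sketch. It is not JRuby or YARV code: the tuple-based tree and
both function names are assumptions made up for this example. The first
function re-inspects every node on every evaluation; the second inspects
each node once and returns closures that only do the arithmetic.

```python
# Toy comparison of tree-walking interpretation vs. one-time compilation.
# The tree format (variable names as strings, numbers, (op, left, right)
# tuples) is hypothetical and exists only for this illustration.

def interpret(node, env):
    """Walk the tree on every call, dispatching on node type each time."""
    if isinstance(node, str):
        return env[node]                 # variable lookup
    if isinstance(node, (int, float)):
        return node                      # literal
    op, left, right = node
    lhs, rhs = interpret(left, env), interpret(right, env)
    return lhs + rhs if op == "+" else lhs * rhs


def compile_node(node):
    """Dispatch on node type once, returning a closure for repeated use."""
    if isinstance(node, str):
        return lambda env: env[node]
    if isinstance(node, (int, float)):
        return lambda env: node
    op, left, right = node
    left_fn, right_fn = compile_node(left), compile_node(right)
    if op == "+":
        return lambda env: left_fn(env) + right_fn(env)
    return lambda env: left_fn(env) * right_fn(env)


tree = ("+", "x", ("*", "y", 3))         # x + y * 3
compiled = compile_node(tree)            # type checks happen here, once

env = {"x": 1, "y": 2}
walked = interpret(tree, env)            # type checks happen on every call
direct = compiled(env)                   # only lookups and arithmetic remain
```

Both paths give the same answer. The compiled form has simply moved the
"what kind of node is this?" work out of the part that runs repeatedly.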

> It just looks to me like everyone's chasing VMs.  While the nontrivial
> problems with Java's VM are in many cases specific to the Java VM (the
> Smalltalk VMs have tended to be rather better designed, for instance),
> there are still issues inherent in the VM approach as currently
> envisioned, and as such it leaves sort of a bad taste in my mouth.


The whole VM thing is such a small issue. Ruby itself is really just a VM,
where its instructions are the elements in its AST. The definition of a VM
is vague enough to include most other interpreters in the same family.
Perhaps you are specifically referring to VMs that provide a set of
"processor-like" fine-grained operations, attempting to simulate some sort
of magical imaginary hardware? That would describe the Java VM pretty well,
though in actuality there are real processors that run Java bytecodes
natively as well. Whether or not a language runs on top of a VM is
irrelevant, especially considering JRuby is a mostly-compatible version of
Ruby running on top of a VM. It matters much more that translation to
whatever underlying machine, virtual or otherwise, is as efficient and
clean as possible.

-- 
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com
