On Fri, Apr 01, 2005 at 07:34:45PM +0900, gabriele renzi wrote:
> On the other hand, yarv can be used as a ruby->c compiler, then you can 
> compile the resulting C files with the standard toolchain, and load them 
> at runtime as any other C extension. No tight interaction between C 
> toolchain and ruby/yarv at runtime.
...
> > Yeah, but there's a difference between the ./configure && make && make install
> > cycle of a standalone C program, and writing a program which dynamically
> > writes C programs and runs them in real time; 
> 
> here is the point: it is not done at runtime :)

Ah, so YARV is a *static* compiler, not JIT? Then I misunderstood
completely. I saw JIT->"native code" in the presentation, and I thought it
meant machine-code; I see now it's JIT compilation to YARV byte-code.

Given that Ruby is so dynamic, clearly it's not always going to be able to
compile into C. There's "eval", and there's also the case of external Ruby
libraries:

  require 'bar.rb'
  Bar.new.run

For Bar to be compiled code, bar.rb would have to be compiled to bar.so
beforehand. Can YARV do that? If so that would be great - it should *reduce*
startup overhead when loading libraries.

[Thinks aloud] Even without eval, the methods an object responds to can
change at any time. So given a statement like

      a.foo

how does this end up in C? If you're saying that there's no dynamic
recompilation at run-time, then essentially you have to compile

      obj = getlocal("a");
      fn = find_method1(obj, "foo");
      (*fn)(obj);
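
A minimal sketch of that uncached path as a standalone toy in C, to show
where the cost goes. Everything here is an assumption for illustration: the
Class/Object/MethodEntry layout and call_foo are made up, and find_method1 is
only the hypothetical helper named above, not a real Ruby or YARV function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy object model; not the Ruby or YARV C API. */
typedef struct Object Object;
typedef Object *(*Method)(Object *self);

typedef struct MethodEntry {
    const char *name;
    Method fn;
} MethodEntry;

typedef struct Class {
    const struct Class *super;   /* NULL at the root of the hierarchy */
    const MethodEntry *methods;  /* array terminated by name == NULL */
} Class;

struct Object {
    const Class *klass;
    long value;
};

/* Walk the class chain comparing names: this cost is paid on every call. */
static Method find_method1(const Object *obj, const char *name)
{
    for (const Class *c = obj->klass; c != NULL; c = c->super)
        for (const MethodEntry *m = c->methods; m->name != NULL; m++)
            if (strcmp(m->name, name) == 0)
                return m->fn;
    return NULL;  /* a real runtime would fall back to method_missing */
}

/* A sample method: increments the receiver and returns it. */
static Object *base_foo(Object *self) { self->value += 1; return self; }

static const MethodEntry base_methods[] = { { "foo", base_foo }, { NULL, NULL } };
static const MethodEntry derived_methods[] = { { NULL, NULL } };
static const Class Base = { NULL, base_methods };
static const Class Derived = { &Base, derived_methods };

/* Roughly what a static compiler would have to emit for "a.foo". */
static Object *call_foo(Object *a)
{
    Method fn = find_method1(a, "foo");
    assert(fn != NULL);
    return (*fn)(a);
}
```

Calling call_foo() on an instance of Derived finds "foo" one level up in
Base, after a string comparison per method-table entry visited. That
repeated search is what the caching below tries to avoid.
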

But then we add some caching. Consider "a = a + 1" inside a loop: on each
iteration 'a' refers to a different object, and we don't want to invalidate
the cache each time, so actually we should cache on the *class*, not the
object. This should be OK, since any object with its own methods will also
have its own singleton class.

       /* a = a + 1 */
       /* a = a.+(1) */

       obj = getlocal("a");
       oc = rb_class_of(obj);
       if (oc != oc1235 || mstate != mstate1235) {
         fn1235 = find_method2(oc, "+");
        oc1235 = oc;
        mstate1235 = mstate;
       }
       obj2 = (*fn1235)(obj, INT2FIX(1));
       putlocal("a", obj2);

(where 'mstate' is a global counter incremented whenever any method or class
is defined or redefined, to invalidate all the caches; I saw something about
that in the RubyConf presentation)

Then if getlocal / rb_class_of / putlocal are inlined, actually the code
doesn't end up being too bad. In the cache-hit case there's a branch forward
(pipeline bubble) and one indirect function call, to the existing fix_plus()
method.
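
A minimal sketch of the class-plus-mstate cache described above, again as a
standalone toy in C. All of it is assumed for illustration: the Class layout,
define_method, call_cached and the slow_lookups counter are made up, and
find_method2 and mstate are only the hypothetical names used in the
pseudocode, not the real Ruby or YARV API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature runtime; not the Ruby or YARV C API. */
typedef struct Class Class;
typedef long (*Method)(long self, long arg);

struct Class {
    const char *names[4];   /* tiny fixed-size method table */
    Method fns[4];
    int count;
};

static unsigned long mstate = 0;        /* bumped on every method definition */
static unsigned long slow_lookups = 0;  /* instrumentation for this demo only */

/* Define or redefine a method, invalidating every call-site cache. */
static void define_method(Class *c, const char *name, Method fn)
{
    mstate++;
    for (int i = 0; i < c->count; i++) {
        if (strcmp(c->names[i], name) == 0) {
            c->fns[i] = fn;
            return;
        }
    }
    assert(c->count < 4);
    c->names[c->count] = name;
    c->fns[c->count] = fn;
    c->count++;
}

/* The slow path: search the class's method table by name. */
static Method find_method2(const Class *c, const char *name)
{
    slow_lookups++;
    for (int i = 0; i < c->count; i++)
        if (strcmp(c->names[i], name) == 0)
            return c->fns[i];
    return NULL;
}

/* One of these per call site (the oc1235/mstate1235/fn1235 triple). */
typedef struct CallCache {
    const Class *klass;
    unsigned long mstate;
    Method fn;
} CallCache;

static long call_cached(CallCache *cc, const Class *klass,
                        const char *name, long self, long arg)
{
    if (cc->klass != klass || cc->mstate != mstate) {
        /* Cache miss: receiver class changed or some method was (re)defined. */
        cc->fn = find_method2(klass, name);
        cc->klass = klass;
        cc->mstate = mstate;
    }
    assert(cc->fn != NULL);
    return (*cc->fn)(self, arg);  /* cache hit: one indirect call */
}

/* Two sample implementations of "+" so a redefinition is observable. */
static long int_plus(long a, long b) { return a + b; }
static long int_plus_twice(long a, long b) { return a + 2 * b; }
```

With a zeroed CallCache, running "a = a + 1" in a loop through call_cached()
takes the slow path only on the first iteration. Redefining "+" with
define_method() bumps mstate, so the next call misses once and picks up the
new function.
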

OK, I think I can see where this is going, and it should eliminate a lot of
interpreter overhead - even if it's never going to be as fast as "a++" in C
:-)

Am I guessing along the right lines here?

Cheers,

Brian.