On Wed, Apr 13, 2011 at 2:20 AM, Xavier Noria <fxn / hashref.com> wrote:
> What's the downside of the naive implementation?
>
> Each class has two slots: list of pointers to direct mixins (no flat
> list, only the direct ones that were mixed in and in the correct
> order) and a pointer to the parent class. Modules have a slot with the
> list of direct mixins.

I assume you're talking about how it could be rather than how it is.
At least in JRuby, Modules do not do anything special wrt mixed-in
modules; they simply go into the superclass hierarchy as with normal
classes. Upon include, the superclasses are walked to get the list of
additional modules to include.

> Method dispatch loops/recurses. We ask each mixin in turn whether he
> responds to the method. At that point of execution, mixins ask
> themselves, and if not loop over their direct mixins. Recurse. If all
> of them fail, I go to the parent class and repeat. Of course the
> complete algorithm would include method_missing, singleton classes...
> but just to give the idea.

This is highly dependent on the implementation. In JRuby, each class
has a local cache of methods from superclasses. Under normal
circumstances, this means even the slowest method lookups are only one
hash hit.

In order to keep this cache consistent, JRuby leverages two relationships:

* Weak references down-hierarchy from superclass to subclass
(including artificial intermediate classes like those that delegate to
modules)
* Weak references from modules to all hierarchies they are included into

This allows invalidating the caches down-hierarchy whether a change
happens in a superclass or in an included module.

In the absence of any caching, your logic would work fine; method
lookup in a Module would search that Module and all included Modules.
It *would* be special-cased logic for Module, since normally only
direct superclasses are searched, and this would introduce a separate
cycle to search superclasses of an included module.

In the presence of caching, it may still be possible; module updates
(new modules included, methods modified) would need to invalidate not
just "first-degree" including hierarchies, but also "second degree"
hierarchies that include a module that includes our updating module.
And so on and so forth until we invalidate all modules and hierarchies
that include (directly or indirectly) the module in question.

It would complicate method lookup, since now two different mechanisms
would be needed. That could impact perf in other ways (i.e. the basic
logic right now inlines everywhere it's used in JRuby).

> I have NO experience writing interpreters, is that a terrible
> implementation? The number of pointers to follow seems to be about the
> same more or less doesn't it... if it is bad why is it bad? Not
> proposing anything of course (I can't propose anything I have no ida
> about VMs :), only doing this as an exercise to try to understand why
> it works the way it works.

It's not a bad implementation, but it complicates the process of
method searching and (in JRuby's case) cache invalidation.

- Charlie