I've been struggling to understand how name resolution is supposed to
work in Ruby.  So far, this has been without looking at the source
code, for two reasons:

	First, the source is rather long and hard to understand.

	Second, I want to know how Ruby is *intended* to operate,
	rather than the details of how a specific release actually
	does operate.

My starting point was the book "Programming Ruby" by Hunt and Thomas
(an excellent book! -- thanks, Andy and Dave).  However, their
discussion is framed almost entirely in terms of identifier scopes,
which only gives a static view of what is largely a dynamic process.
That is, a scope is a static region of a program that can be
identified by the parser at compile time, whereas name resolution and
lookup occur later, when the interpreter is executing the compiled
code.

Here's what I've been able to glean.  Hopefully nobody will mind this
posting.  I could use some feedback -- what I've got is incomplete, and
parts of it are almost certainly wrong.  I trust that somebody in the
newsgroup can help explain things better.


There are several different types of identifiers in Ruby: local
variables, instance variables, class variables, global variables,
constants, and method names.  Identifier types are determined by the
parser as it compiles a program.  (Sometimes there isn't enough
information available for the parser to decide which type a particular
identifier represents; method names can be confused with local
variables and with constants.  In these situations the parser relies
on various heuristics to guess the type, and it doesn't always guess
right.)  Each type of name gets resolved in a different way.
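
To make the local-variable/method ambiguity concrete, here is a
minimal sketch.  The class name ParseDemo and its methods are made up
for illustration, and the values in the comments are what I would
expect from a current Ruby release; older interpreters may differ in
detail.

```ruby
class ParseDemo
  def x
    "method x"
  end

  def run
    before = x                 # no assignment to x seen yet: a method call
    kind_before = defined?(x)  # "method"
    x = "local x" if false     # never executed, but the parser has seen it
    after = x                  # now a local variable, still unset: nil
    kind_after = defined?(x)   # "local-variable"
    explicit = x()             # parentheses force a method call
    [before, kind_before, after, kind_after, explicit]
  end
end

p ParseDemo.new.run
```

The point is that the parser's decision depends only on whether it has
already seen an assignment to the name, not on whether that assignment
was ever executed.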


Global variables are the easiest.  There's a single set of values for
the global variables, and a name always refers to the corresponding
value.

Instance variables are nearly as simple.  Every object contains a
table of instance variable values, and an instance variable name is
looked up in the table of the current object (i.e., the value of
"self").
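
As a small illustration of these two cases (a sketch only; the class
Cell and the global $lookups are hypothetical names, and the
reflection methods used are those of a current Ruby release):

```ruby
$lookups = 0              # one global table: all code sees this same value

class Cell
  def initialize(value)
    @value = value        # stored in the table of the object that is self
  end

  def value
    $lookups += 1         # global: the same variable whoever self is
    @value                # instance variable: looked up in self's table
  end
end

a = Cell.new(1)
b = Cell.new(2)
p a.value                             # 1
p b.value                             # 2
p $lookups                            # 2
p a.instance_variables                # the names in a's own table
p a.instance_variable_get(:@value)    # 1
```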

You would think that class variables would be equally simple to
explain, but they're not.  The question of where one gets looked up or
stored is all tied in with the matter of meta-classes, and I don't
really understand how it's supposed to work.  Also, Matz has mentioned
(in recent newsgroup postings) that he has had to fix some bugs in the
implementation of class variables during the last few weeks, so things
are in a state of flux.  Maybe somebody can fill in the details here?
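
For whatever it is worth, here is a sketch of how class variables
behave in a current Ruby release, which may not match the release
discussed above.  The class names Base and Derived are hypothetical.
It only shows that a class and its subclass end up sharing a single
variable; it does not settle the metaclass question.

```ruby
class Base
  @@shared = "set in Base"

  def self.shared
    @@shared
  end
end

class Derived < Base
  def self.shared=(value)
    @@shared = value      # finds the variable that Base already created
  end
end

Derived.shared = "changed via Derived"
p Base.shared                        # "changed via Derived"
p Base.class_variables               # [:@@shared]
p Derived.class_variables(false)     # [] (nothing stored in Derived itself)
```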


Local variables operate by way of bindings.  A binding is (in
principle, anyway -- the implementation details may be different) just
a table listing variable names and their values.  It corresponds more
or less to the data stored in a Binding object, although that also
includes the value of the current object and the current block, if
there is one.  A binding is the dynamic representation of a scope: A
new blank one is created and installed as the current binding whenever
the flow of control enters a class definition (that is, whenever a
"class" statement or a "module" statement is executed).  When the flow
of control enters a method, a new binding is created and initialized
with mappings for the method's parameters.  When a block is created it
gets its own binding, which is initialized as a copy of (actually, it
must be a reference to) the current binding.  And when the flow of 
control leaves a class definition, method, or block, the binding is 
destroyed and the one previously in force once again becomes the current
binding.

Local variable names are just looked up in the current binding.  When
a new local variable is created, it gets entered into the current
binding.  This explains why local variables created inside a block do
not exist once the block exits; the entries for the variables get made
in the block's binding and not in the original binding (which gets
re-installed as the current one when the block is finished).
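
The three cases (class definition, method, block) can be sketched as
follows.  ScopeDemo and its method names are invented for the example,
and the comments assume a current Ruby release run as a top-level
script.

```ruby
outer = 10    # a local variable in the top-level binding

class ScopeDemo
  # The class body starts with a blank binding, so "outer" is unknown here.
  SEES_OUTER = defined?(outer)          # nil

  def self.method_locals(param)
    local_variables                     # only the parameter so far
  end

  def self.block_locals
    total = 0
    [1, 2, 3].each do |n|
      square = n * n                    # entered in the block's binding
      total += square                   # "total" lives in the method's binding
    end
    [total, defined?(square)]           # the block's "square" is gone
  end
end

p outer                          # 10
p ScopeDemo::SEES_OUTER          # nil
p ScopeDemo.method_locals(1)     # [:param]
p ScopeDemo.block_locals         # [14, nil]
```

Note that the block can still update "total", which fits the idea that
the block's binding refers to the enclosing one rather than copying it.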


Method names are resolved in an elaborate way.  A method call has the
form "r.m(...)", where "r" is the receiver (which defaults to self if
it is omitted) and "m" is the method name.  Each class (or module)
contains a table of instance method names, together with pointers to
their implementations, and the search consists of looking up m in a
sequence of these tables.  The search starts with r's own singleton
methods (the ones defined by "def r.m"), which are kept in r's
metaclass.  If m is not found there, and if r is itself a class, then
a special case comes into play: m is looked up in r's superclass's
metaclass, then the superclass's superclass's metaclass, and so on,
all the way up to Object's metaclass.  (This is how inheritance of
class methods is made to work.)  If none of those searches succeeds,
or if r is not a class, then the search continues in the table for
r's class, then that class's superclass, and so on all the way up to
the Kernel module.  (That's how ordinary inheritance is made to work.)

	[That the order of the lookups works as described can be
demonstrated by the following sample code:

	class Class
		def m
			p "Class#m"
		end
	end

	class A
	end

	class B < A
	end

	B.m	# Prints out "Class#m"

	def A.m
		p "A.m"
	end

	B.m	# Prints out "A.m"

The first use of m is found in B's class (which is Class).  The second
use is found in B's superclass's metaclass, so that must precede B's
class in the search order.]
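
In a recent Ruby release (2.1 or later, as far as I know) the search
order can also be inspected directly, without reopening Class.  This
is only a sketch; LookupA and LookupB are hypothetical names, and
singleton_class is the modern name for what is called the metaclass
here.

```ruby
class LookupA
end

class LookupB < LookupA
end

def LookupA.m
  "LookupA.m"
end

# Search order for a call whose receiver is the class LookupB:
# the metaclasses come first, then Class and its ancestors.
class_chain = LookupB.singleton_class.ancestors
p class_chain
p LookupB.method(:m).owner == LookupA.singleton_class    # true

# Search order for a call whose receiver is an ordinary instance.
instance_chain = LookupB.new.singleton_class.ancestors
p instance_chain
```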


Finally there are constants.  Class objects (and module objects)
contain a table listing the constants defined in that class, as well
as one listing the class variables.  (In fact these tables might
really be the same as the one listing the instance methods for the
class, with just the forms of the names to distinguish the types of
the entries.)  Constant names get looked up in these tables much like
method names, but with one big difference: The class in which the
search starts is not the same.

For a qualified constant name, like AClass::CONST or ::CONST, the
search starts in AClass or the top-level class, respectively (however
the latter gets set; the top-level class is altered during processing
of a file by the Kernel#load command if the wrap parameter is set to
true).
Unadorned constant names (just plain CONST) do not get looked up
starting in the current object's class.  Instead, the lookup starts at
a point we might call the current class, which is not the same thing.
The current class is determined statically by the parser, not
dynamically during execution.  For code outside a class definition,
the current class is just the top-level class.  For code inside a
class definition, the current class is the one being defined.  This
has two unexpected consequences.  First, if m is a class method of
class A that mentions a constant CONST, the place where CONST gets
looked up in m depends on how m is defined:

	class A
		def A.m
			... CONST ...
		end
	end

and

	class << A
		def m
			... CONST ...
		end
	end

and

	def A.m
		... CONST ...
	end

will look up CONST in three different places.  Also, if m is an
instance method of class A that mentions CONST, and B is a class
derived from A that contains its own value for CONST but does not
override the definition of m, and b is an instance of B, then the
method call "b.m" will end up using A's version of CONST, not B's.
The overall effect is that constants behave as though they really do
have a static scope.
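
Both consequences can be sketched in a few lines.  The names ConstA,
ConstB, m1, m2, m3 and m_dynamic are hypothetical, and I am assuming a
current Ruby release, where Module.nesting reports the statically
enclosing classes at the point where it is written.

```ruby
class ConstA
  CONST = "ConstA's CONST"

  def ConstA.m1          # defined inside "class ConstA"
    Module.nesting
  end

  def m                  # instance method that mentions CONST
    CONST
  end

  def m_dynamic          # asks the receiver's class explicitly
    self.class::CONST
  end
end

class << ConstA
  def m2                 # defined inside "class << ConstA"
    Module.nesting
  end
end

def ConstA.m3            # defined outside any class definition
  Module.nesting
end

class ConstB < ConstA
  CONST = "ConstB's CONST"
end

p ConstA.m1               # [ConstA]
p ConstA.m2               # [#<Class:ConstA>]
p ConstA.m3               # [] when run as a top-level script
p ConstB.new.m            # "ConstA's CONST"
p ConstB.new.m_dynamic    # "ConstB's CONST"
```

Writing self.class::CONST is one way to get the dynamic behavior when
that is what is wanted, since a qualified name starts its search in
the class named on the left.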


All contributions welcome.


Alan Stern