On Sun, Feb 05, 2006 at 08:33:40PM +0900, Christian Neukirchen wrote:
> Caleb Clausen <vikkous / gmail.com> writes:
> 
> > 100_000.times{|n|
> >   o=Object.new;
> >   i=o.__id__;
> >   o2=ObjectSpace._id2ref(i);
> >   o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
> > }
> >
> > The exception should never be raised. On my OS X 10.3.9 system (and at
> > least 1 other) it does get eventually raised after a few hundred
> > iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
> > it does not happen. Tests on several Windows and Linux systems have
> > never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
> > a problem on OS X 10.4; I don't have access to any 10.4 systems.
> >
> > The problem seems to be in the call to __id__. Usually, it works
> > correctly, but every once in a while it returns the id of some random
> > symbol. Does anyone know why this is happening?
> 
> I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:
> 
> o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)
> 
> It looks like the object id wrapped in some way and now points to a
> symbol?  Clearly looks like a bug.

0x1d421c.to_s(2)                                   # => "111010100001000011100"
0xea10e.to_s(2)                                    # => "11101010000100001110"
0xea10e.class                                      # => Fixnum
(2 * 0xea10e).to_s(2)                              # => "111010100001000011100"

So far so good: i is a Fixnum, and the id is simply the object's address
divided by two.
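
The relation between address, id and raw VALUE can be sketched in plain
Ruby. This is only an illustration: it assumes the 1.8 Fixnum encoding
(value shifted left by one, low bit set as the tag) and that __id__ of a
heap object is the address reinterpreted as such a Fixnum. The helper names
int2fix and fix2long are mine; they merely mimic the C macros.

```ruby
FIXNUM_FLAG = 0x01   # assumed: low tag bit of a Fixnum VALUE in 1.8

# Hypothetical helpers mimicking INT2FIX / FIX2LONG on plain Integers.
def int2fix(n)
  (n << 1) | FIXNUM_FLAG
end

def fix2long(v)
  v >> 1
end

addr     = 0x1d421c            # address from the error message
id_value = addr | FIXNUM_FLAG  # the id as a raw VALUE (tag bit set)
id       = fix2long(id_value)  # the Integer the script sees

puts id.to_s(16)               # prints "ea10e"
puts int2fix(id).to_s(16)      # prints "1d421d"
```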

Now, in gc.c:

static VALUE
id2ref(obj, id)
    VALUE obj, id;
{
    unsigned long ptr, p0;

    rb_secure(4);
    p0 = ptr = NUM2ULONG(id);
    if (ptr == Qtrue) return Qtrue;
    if (ptr == Qfalse) return Qfalse;
    if (ptr == Qnil) return Qnil;
    if (FIXNUM_P(ptr)) return (VALUE)ptr;
    if (SYMBOL_P(ptr) && rb_id2name(SYM2ID((VALUE)ptr)) != 0) {
	return (VALUE)ptr;
    }

(SYMBOL_FLAG == 0x0e)

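NUM2ULONG is rb_num2ulong, which calls rb_num2long, which uses FIX2LONG to
strip the Fixnum tag bit. As a raw VALUE, id was 111010100001000011101b, so
ptr becomes 11101010000100001110b. Its low 8 bits are 00001110b == 0x0e,
which matches SYMBOL_FLAG, so SYMBOL_P(ptr) is true.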
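
To make the order of the checks concrete, here is a rough Ruby model of the
first few tests in id2ref. It is a sketch, not the real thing: the constant
values are what I remember from the 32-bit 1.8 ruby.h (only SYMBOL_FLAG is
confirmed above), SYMBOL_P is assumed to compare the low 8 bits, SYM2ID is
assumed to shift right by 8, and the rb_id2name lookup is replaced by a
made-up range of "live" symbol IDs passed in by the caller.

```ruby
# Assumed immediate values from 1.8's ruby.h (32-bit build).
QFALSE, QTRUE, QNIL = 0, 2, 4
FIXNUM_FLAG = 0x01
SYMBOL_FLAG = 0x0e

def fixnum_p(v)
  (v & FIXNUM_FLAG) == FIXNUM_FLAG
end

def symbol_p(v)
  (v & 0xff) == SYMBOL_FLAG
end

def sym2id(v)
  v >> 8
end

# Classify ptr in the same order as id2ref does. live_symbol_ids stands in
# for "rb_id2name(...) != 0" and is purely hypothetical.
def classify(ptr, live_symbol_ids)
  return :true   if ptr == QTRUE
  return :false  if ptr == QFALSE
  return :nil    if ptr == QNIL
  return :fixnum if fixnum_p(ptr)
  return :symbol if symbol_p(ptr) && live_symbol_ids.include?(sym2id(ptr))
  :heap_object
end

ptr = 0x1d421d >> 1           # FIX2LONG applied to the id VALUE
p classify(ptr, 0...4000)     # => :symbol (ID 3745 exists in this model)
p classify(ptr, 0...100)      # => :heap_object (ID 3745 does not exist)
```

With a small symbol table the symbol branch is skipped and the lookup falls
through to the heap object, which is what one would want in every case.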

I'd conjecture that the above works on Linux because glibc's malloc() always
returns 8-byte aligned memory addresses (halving such an address leaves the
low two bits clear, so it can never end in 0x0e), which doesn't seem to be
the case on OS X:

 0x1d421c % 8                                      # => 4
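
The alignment argument can be checked by brute force. The sketch below
assumes that the id is the address shifted right by one and that SYMBOL_P
compares the low 8 bits against 0x0e; looks_like_symbol? is a name made up
for this example. It scans one 512-byte window, which is enough because
only the low 9 bits of the address take part in the test.

```ruby
SYMBOL_FLAG = 0x0e

# True when the id derived from addr (addr >> 1) would pass SYMBOL_P.
def looks_like_symbol?(addr)
  ((addr >> 1) & 0xff) == SYMBOL_FLAG
end

# Offsets within a 512-byte window that collide, per alignment.
four_aligned  = (0...512).step(4).select { |a| looks_like_symbol?(a) }
eight_aligned = (0...512).step(8).select { |a| looks_like_symbol?(a) }

p four_aligned    # => [28]
p eight_aligned   # => []
p looks_like_symbol?(0x1d421c)   # => true
```

So under these assumptions one 4-byte aligned address in every 512 bytes
can be mistaken for a symbol, and no 8-byte aligned address can.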

Another possibility would be that the data segment sits at lower addresses
on OS X than on Linux, so SYM2ID(ptr) yields a small ID that matches an
existing symbol (i.e. rb_id2name returns non-NULL):

RUBY_PLATFORM                                      # => "i686-linux"
Object.new.inspect                                 # => "#<Object:0xb7d44d7c>"
0xb7d44d7c >> 9                                    # => 6023718
# we shouldn't have 6 million symbols
0x1d421c >> 9                                      # => 3745
# but 4000 are indeed possible 
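
The same comparison, spelled out as a helper. This is a sketch under the
assumptions that SYMBOL_P tests the low 8 bits and SYM2ID shifts right by 8;
candidate_symbol_id is a hypothetical name, and 0xb7d44c1c is an invented
Linux-range address chosen only because its low bits carry the symbol tag.

```ruby
SYMBOL_FLAG = 0x0e

# For a heap address, return the symbol ID that id2ref would look up, or
# nil when the halved address does not carry the symbol tag at all.
def candidate_symbol_id(addr)
  ptr = addr >> 1
  return nil unless (ptr & 0xff) == SYMBOL_FLAG
  ptr >> 8
end

p candidate_symbol_id(0x1d421c)     # => 3745
p candidate_symbol_id(0xb7d44d7c)   # => nil
p candidate_symbol_id(0xb7d44c1c)   # => 6023718
```

Note that the Linux address printed above does not pass the tag test in this
model in the first place; a high address that does pass it still maps to an
ID in the millions, which no realistic symbol table would contain.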

The relevant code hasn't changed between 1.6 and 1.8; could it be that the
Apple-supplied 1.6 binary was built specially to use 8-byte alignment, or
that the memory layout has changed in the meantime?

If so, possible fixes would include:
* modifying configure to pass whatever options force 8-byte alignment
* allocating the object heap with posix_memalign or a similar aligned
  allocator

-- 
Mauricio Fernandez