On Feb 5, 2006, at 5:05 AM, Mauricio Fernandez wrote:

> On Sun, Feb 05, 2006 at 08:33:40PM +0900, Christian Neukirchen wrote:
>> Caleb Clausen <vikkous / gmail.com> writes:
>>> 100_000.times{|n|
>>>   o=Object.new;
>>>   i=o.__id__;
>>>   o2=ObjectSpace._id2ref(i);
>>>   o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
>>> }
>>
>> I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:
>>
>> o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)
>>
>> It looks like the object id wrapped in some way and now points to a
>> symbol? Clearly looks like a bug.
>
> 0x1d421c.to_s(2)        # => "111010100001000011100"
> 0xea10e.to_s(2)         # => "11101010000100001110"
> 0xea10e.class           # => Fixnum
> (2 * 0xea10e).to_s(2)   # => "111010100001000011100"
>
> So far so good.
>
> Now, in gc.c:
>
>     p0 = ptr = NUM2ULONG(id);
>     if (ptr == Qtrue) return Qtrue;
>     if (ptr == Qfalse) return Qfalse;
>     if (ptr == Qnil) return Qnil;
>     if (FIXNUM_P(ptr)) return (VALUE)ptr;
>     if (SYMBOL_P(ptr) && rb_id2name(SYM2ID((VALUE)ptr)) != 0) {
>         return (VALUE)ptr;
>     }
>
> (SYMBOL_FLAG == 0x0e)
>
> NUM2ULONG is rb_num2ulong, which calls rb_num2long, which uses FIX2LONG.
> id was 111010100001000011101b and ptr becomes 11101010000100001110b, which
> matches the SYMBOL_FLAG.
>
> I'd conjecture that the above works on Linux because glibc's malloc() always
> returns 8-byte aligned memory addresses, which doesn't seem to be the case in
> OSX:
>
> 0x1d421c % 8            # => 4

OS X's malloc aligns memory on 16 byte boundaries. This problem is not
unique to OS X, you just need enough symbols.
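To make the arithmetic quoted above concrete, here is a small sketch that replays the tag check on plain integers. It does not call ObjectSpace._id2ref (which behaves differently on later Rubies); classify_id is a hypothetical helper, and it assumes the 1.8-era layout described in this thread: SYMBOL_FLAG is 0x0e in the low byte, SYM2ID amounts to a right shift by 8 bits, and an ordinary object's address is twice its id.

```ruby
# Tag value quoted in the thread for Ruby 1.8 (SYMBOL_FLAG == 0x0e).
SYMBOL_FLAG = 0x0e

# Hypothetical helper, not part of Ruby: models how the quoted gc.c code
# would classify the integer it gets back from NUM2ULONG(id).
def classify_id(id)
  if (id & 0xff) == SYMBOL_FLAG
    # Low byte looks like a symbol tag; id >> 8 models SYM2ID.
    [:symbol_candidate, id >> 8]
  else
    # Assumed model for ordinary objects: address = id * 2.
    [:heap_pointer, id << 1]
  end
end

address = 0x1d421c      # from "#<Object:0x1d421c>" in the report
id      = address >> 1  # the object's __id__ as an integer (0xea10e)

puts "id = 0x%x, low byte = 0x%02x" % [id, id & 0xff]
kind, value = classify_id(id)
puts "#{kind}: #{value}"
```

For the reported address this prints a low byte of 0x0e and a symbol candidate with ID 3745, so the lookup only goes wrong if a symbol with that ID actually exists (the rb_id2name check in the quoted code).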
> Another possibility would be that the address space for the data segment
> used in OSX is lower than on Linux, so the SYM2ID matches an existent
> symbol:
>
> RUBY_PLATFORM           # => "i686-linux"
> Object.new.inspect      # => "#<Object:0xb7d44d7c>"
> 0xb7d44d7c >> 9         # => 6023718
> # we shouldn't have 6 million symbols
> 0x1d421c >> 9           # => 3745
> # but 4000 are indeed possible

If you're close enough to the beginning of memory, ObjectSpace._id2ref
will pick a Symbol over the real object like you mention above:

$ cat symbol_object_overlap.rb
N = 100_000
Objs = Array.new N
Syms = Array.new 200
STR = 'new_symbol_base'

def symbol_info
  syms = Symbol.all_symbols.sort_by { |s| s.object_id }
  min = syms.first
  max = syms.last

  puts "found #{syms.length} symbols"
  puts "first symbol id: 0x%x (%p) last symbol id: 0x%x (%p)" %
    [min.object_id, min, max.object_id, max]
end

def make_objs
  N.times { |n| Objs[n] = Object.new }

  puts "Made #{N} objects."
  puts "Object ruby heap use:"
  puts "start object_id <--> end object_id (range)"

  first = Objs[0]
  last = Objs[0]

  Objs.each do |o|
    if o.object_id > last.object_id then
      fid = first.object_id
      lid = last.object_id
      puts "0x%x <--> 0x%x (%d)" % [lid, fid, fid - lid]
      first = o
      last = o
    else
      last = o
    end
  end
end

def make_more_syms
  N.times do
    STR.intern
    STR.succ!
  end

  puts "Created #{N} new symbols"
end

def count_symbols
  count = 0

  Objs.each do |o|
    if Symbol === ObjectSpace._id2ref(o.object_id) then
      Syms[count] = o
      count += 1
    end
  end

  puts "Found #{count} symbols overlapping real objects in #{N} objects lookups"
end

symbol_info
make_objs
count_symbols
make_more_syms
count_symbols
symbol_info

#Syms.each do |s|
#  puts "0x%x: %p ==> %p" % [s.object_id, s, ObjectSpace._id2ref(s.object_id)]
#end

On OS X, malloc starts allocating memory from a very low address, so
even the built-in symbols for a small program will overlap valid
object addresses:

$ uname -a
Darwin kaa.local 8.5.0 Darwin Kernel Version 8.5.0: Sun Jan 22 10:38:46 PST 2006; root:xnu-792.6.61.obj~1/RELEASE_PPC Power Macintosh powerpc
$ ruby -v symbol_object_overlap.rb
ruby 1.8.4 (2005-12-24) [powerpc-darwin8.4.0]
found 940 symbols
first symbol id: 0x210e (:"!") last symbol id: 0x27510e (:count_symbols)
Made 100000 objects.
Object ruby heap use:
start object_id <--> end object_id (range)
0xd7800 <--> 0xe47d0 (53200)
0xe4820 <--> 0x1df716 (1027830)
0x282800 <--> 0x2d1996 (323990)
Found 41 symbols overlapping real objects in 100000 objects lookups
Created 100000 new symbols
Found 189 symbols overlapping real objects in 100000 objects lookups
found 101030 symbols
first symbol id: 0x210e (:"!") last symbol id: 0xc5c510e (:new_symbol_gsqh)

FreeBSD starts returning memory from a much higher memory address, so
symbol overlaps take much longer to occur:

$ uname -a
FreeBSD sandbox.robotcoop.com 4.10-RELEASE FreeBSD 4.10-RELEASE #0: Wed Feb 23 15:47:08 CST 2005 root@fbsdbootload:/usr/obj/usr/src/sys/theplanet i386
$ ruby -v symbol_object_overlap.rb
ruby 1.8.4 (2005-12-24) [i386-freebsd4]
found 931 symbols
first symbol id: 0x210e (:"!") last symbol id: 0x27090e (:count_symbols)
Made 100000 objects.
Object ruby heap use:
start object_id <--> end object_id (range)
0x4039000 <--> 0x404611a (53530)
0x4046142 <--> 0x40c3f16 (515540)
0x40c4000 <--> 0x4113196 (323990)
Found 0 symbols overlapping real objects in 100000 objects lookups
Created 100000 new symbols
Found 196 symbols overlapping real objects in 100000 objects lookups
found 101028 symbols
first symbol id: 0x210e (:"!") last symbol id: 0xc5c090e (:new_symbol_gsqh)

-- 
Eric Hodel - drbrain / segment7.net - http://segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com