On Fri, Feb 09, 2007 at 09:50:34AM +0900, Ryan Davis wrote: > On Feb 7, 2007, at 9:21 AM, Mark Alexander Friedgan wrote: > > >We've been struggling with this problem for months. We use TupleSpace to > >implement a distributed processing framework and periodically if the > >number of objects in the TupleBag gets large (not ridiculous but 50,000 or > >so) the TupleSpace begins to take 100pct cpu and is effectively neutered > >and must be restarted. We've been trying to come up with a better > >implementation of TupleBag but have not had much luck so far. Has anyone > >else done this? > > What's happening is that you've populated a hash enough that you've > saturated the bins. You're now hitting a lot of hash collisions. You > can either improve the hash method on the tuples (possibly... really > depends on what you're storing), or switch the underlying data > structure (maybe a balanced tree will suit you better at large tuple > populations). Are you sure? In Ruby, the number of bins is increased (exponentially) when there are more than 5 entries per bin... Actually, in a TupleBag tuples are indexed by their size and stored in an array. Finding a tuple matching a given template in that array is O(n) ## # Finds a live tuple that matches +template+. def find(template) @hash.fetch(template.size, []).find do |tuple| tuple.alive? && template.match(tuple) end end If you want to make TupleSpace perform better, you'll have to make that operation (finding a live tuple matching an arbitrary template) faster. [Of course, the TupleSpace behaves much better when it gets the read request _before_ the tuple, since it just has to scan the list of templates waiting for a tuple.] -- Mauricio Fernandez - http://eigenclass.org - singular Ruby ** Latest postings ** What's new in Ruby 1.9, Feb. 07 update http://eigenclass.org/hiki.rb?Changes-in-Ruby-1.9-update-6 Firebrigade: automated, sandboxed testing of RubyGems packages by other devels http://eigenclass.org/hiki.rb?firebrigade-launched