On 9/17/07, Stephen Bannasch <stephen.bannasch / deanbrook.org> wrote:
> I'm using set in the ruby standard library to produce collections of
> unique objects from enumerable objects with duplicates but it
> doesn't appear to work with hash objects.
>
> $ ruby --version
> ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-darwin8.9.1]
> $ irb
> irb(main):001:0> require 'set'
> => true
> irb(main):002:0> a = [1,1,2,3]
> => [1, 1, 2, 3]
> irb(main):003:0> b = [{:a1 => "123"}, {:a1 => "123"}, {:b1 => "123"}]
> => [{:a1=>"123"}, {:a1=>"123"}, {:b1=>"123"}]
> irb(main):004:0> seta = a.to_set
> => #<Set: {1, 2, 3}>
> irb(main):005:0> setb = b.to_set
> => #<Set: {{:a1=>"123"}, {:a1=>"123"}, {:b1=>"123"}}>
> irb(main):006:0> b[0] == b[1]
> => true
>
> Am I doing something wrong?

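Your problem is in hash comparison. Set uses a Hash internally as its
implementation (the set's values are the keys of that hash). Hash keys
are matched with #hash and #eql?, not with ==, so in order to obtain
uniqueness of the values you need proper Hash#hash and Hash#eql?
definitions. In 1.8.5 the defaults inherited from Object are based on
object identity, i.e. two hashes with identical contents are
considered different keys even though == says they are equal.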
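
To illustrate the mechanism, here is a minimal sketch using two
hypothetical classes (LooseItem and KeyedItem are made up for this
example, they are not part of any library). The first defines only ==,
which is roughly the situation of Hash on 1.8.5; the second also
defines eql? and hash. The sketch assumes Ruby 1.9 or later, where
Hash#hash is computed from the contents.

```ruby
require 'set'

# Hypothetical class that defines == only, much like Hash on 1.8.5.
class LooseItem
  attr_reader :attrs

  def initialize(attrs)
    @attrs = attrs
  end

  def ==(other)
    other.class == self.class && attrs == other.attrs
  end
end

# Same class, plus the two methods Hash (and therefore Set) uses to
# decide whether two keys are the same.
class KeyedItem < LooseItem
  def eql?(other)
    self == other
  end

  # Assumption: attrs.hash is content-based (true for Hash on Ruby 1.9+).
  def hash
    attrs.hash
  end
end

loose_set = [LooseItem.new({:a1 => "123"}), LooseItem.new({:a1 => "123"})].to_set
keyed_set = [KeyedItem.new({:a1 => "123"}), KeyedItem.new({:a1 => "123"})].to_set

puts loose_set.size
puts keyed_set.size
```

The first set keeps both items because their #hash values differ, while
the second collapses them into one.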

Note that Hash dups (and freezes) only String keys on insertion, to
avoid someone else changing the key afterwards. Other objects, hashes
included, are stored by reference, so inserting the very same object
several times adds it only once. It is the equal-but-distinct objects
that pile up.
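
The key-copying behaviour is easy to check for yourself. The following
is only a quick sketch, assuming a reasonably current Ruby (1.9 or
later, so that keys come back in insertion order); the variable names
are arbitrary. As far as I can tell, only an unfrozen String key is
replaced by a frozen copy, while a hash used as a key is stored as the
same object.

```ruby
require 'set'

h = {}

# An unfrozen String key is stored as a frozen copy.
str_key = String.new("key")
h[str_key] = 1
stored = h.keys.first
str_copied = !stored.equal?(str_key) && stored.frozen?

# A Hash used as a key is stored by reference, not copied.
hash_key = {:a1 => "123"}
h[hash_key] = 2
hash_by_ref = h.keys.last.equal?(hash_key)

# Adding the very same object to a Set twice keeps a single entry.
s = Set.new
s << hash_key
s << hash_key
same_object_size = s.size

puts str_copied
puts hash_by_ref
puts same_object_size
```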

For more information, search the archives for something like "hash key
dup". There were recent threads (within the last two months) that
discussed this.