On 4/29/07, SonOfLilit <sonoflilit / gmail.com> wrote:
> Well, you could first count how many repetitions there are of each
> line and then partition the set of pairs [line, count].
>
> e.g.
>
> h = Hash.new {0} # is this how I set a default value?
>
> STDIN.each_line {|l| h[l] += 1}
>
> partitioning = Set.new(h.to_a).divide{|a| a[0][0]}
>
> Aur
>
> On 4/29/07, Peter Szinek <peter / rubyrailways.com> wrote:
> > Hello all,
> >
> > I have been playing with partitioning a set recently and I am stuck with
> > an issue. The whole story is here:
> >
> > http://www.rubyrailways.com/partitioning-sets-in-ruby/
> >
> > A quick version for those who would not like to read the article:
> >
> > Consider this input:
> >
> > a 53 2 3
> > b 8 62 1 23
> > a 9 0 31
> > b 4 45 4 16 7
> > b 1 23
> > c 3 42 2 31 4 6
> > a 1 3 22
> > a 7 83 1 23 3
> > b 1 14 4 15 16 2
> > c 5 16 2 34
> >
> > the goal is to create a partition based on the character in the first
> > column, i.e.:
> >
> > <Set: <Set: {"a 9 0 31", "a 7 83 1 23 3", "a 53 2 3", "a 1 3 22 "}>,
> > <Set: {"b 1 23 ", "b 1 14 4 15 16 2", "b 8 62 1 23", "b 4 45 4 16 7"}>,
> > <Set: {"c 5 16 2 34", "c 3 42 2 31 4 6"}>}>
> >
> > Which is exactly what Set.divide does. However, there is one problem: I
> > would like to know if there are duplicate lines. I.e. divide returns the
> > same result, no matter that the input is this:
> >
> > c 5 16 2 34
> > c 5 16 2 34
> > c 5 16 2 34
> >
> > or this:
> >
> > c 5 16 2 34
> >
> > What I would need is a modified divide which returns also the count of
> > the elements in the input set (at least for those elements which are
> > more than once in the set). Is this doable or do I have to roll some
> > code to do this for me additionally?
> >
> > Cheers,
> > Peter
> >
> > __
> > http://www.rubyrailways.com :: Ruby and Web2.0 blog
> > http://scrubyt.org :: Ruby web scraping framework
> > http://rubykitchensink.ca/ :: The indexed archive of all things Ruby
> >
>

Even better:

require 'set'

h = Hash.new {0} # is this how I set a default value?
STDIN.each_line {|l| h[l] += 1}
partitioning = h.to_set.divide{|a| a[0][0]}
# changed to .to_set, which is great. might need to be .to_set{|k,v| [k,v]}
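
For anyone who wants to try the count-then-partition idea without piping a file into STDIN, here is a minimal self-contained sketch. It is not tested against Peter's full data: the `lines` array is a shortened, made-up variant of his input with one line repeated three times. `Hash.new(0)` is swapped in as the more common way to get a default of 0, and `line[0, 1]` is my assumption for "the character in the first column". The block passed to `divide` takes a single argument on purpose, since a two-argument block makes `divide` treat it as an equivalence test between pairs of elements.

```ruby
require 'set'

# Shortened sample input (illustrative only), with one line repeated.
lines = [
  "a 53 2 3",
  "b 8 62 1 23",
  "a 9 0 31",
  "c 5 16 2 34",
  "c 5 16 2 34",
  "c 5 16 2 34",
  "b 1 23"
]

# Count how often each distinct line occurs; missing keys default to 0.
counts = Hash.new(0)
lines.each { |line| counts[line.chomp] += 1 }

# A Hash converts to a Set of [line, count] pairs. divide then groups the
# pairs by the first character of the line (one-argument block).
partitioning = counts.to_set.divide { |pair| pair.first[0, 1] }

# Print each group with the repetition count next to every line.
partitioning.each do |group|
  group.each { |line, count| puts "#{line} (x#{count})" }
  puts "--"
end
```

Each inner set holds `[line, count]` pairs, so the duplicate information survives the partitioning and no modified `divide` is needed.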