On Tue, Sep 29, 2009 at 2:38 PM, Robert Klemme
<shortcutter / googlemail.com> wrote:
> 2009/9/29 Paul Smith <paul / pollyandpaul.co.uk>:
>> On Tue, Sep 29, 2009 at 12:43 PM, Ne Scripter
>> <stuart.clarke / northumbria.ac.uk> wrote:
>>> I have two data sets loaded into a hash to give the following output
>>>
>>> "2efa4ba470", "00000005"
>>> "2efa4ba470", "00000004"
>>> "02adecfd5c", "00000002"
>>> "c0784b5de101", "00000006"
>>> "68c4bf10539", "00000003"
>>> "c0784b5de101", "00000001"
>>>
>>> My code to get this is as follows:
>>>
>>> source = "C:\\dummyFile.txt"
>>> hashMapping = Hash.new
>>> ocrIDMapping = Hash.new
>>>
>>> IO.foreach(source.to_s) do |data|
>>>   fields = data.split(",")
>>>   hash = fields[0]
>>>   ocrID = fields[1]
>>>   hashMapping[ocrID] = hash
>>> end
>>>
>>> hashMapping.sort { |a, b| a[1] <=> b[1] }.each { |elem|
>>>   puts "#{elem[1]}, #{elem[0]}"
>>> }
>>>
>>> I would like to alter my output to group by the first value to give an
>>> output like this:
>>>
>>> "2efa4ba470", "00000005", "00000004"
>>> "02adecfd5c", "00000002"
>>> "c0784b5de101", "00000006", "00000001"
>>> "68c4bf10539", "00000003"
>>>
>>> As you can see, only unique values are now shown in the first field,
>>> and a list of the corresponding second-field values is formed, grouping
>>> the results. I could do something like this in SQL, but I have never
>>> come across it in Ruby, so does anyone have any pointers?
>>
>> You want a hash where the key is the element you want to group on, and
>> the 'item' is an array of all items with the shared key.  A bit like
>> (untested):
>>
>> hashMapping = {}
>>
>> IO.foreach(source.to_s) do |data|
>>   fields = data.split(",")
>>   hash = fields[0]
>>   ocrID = fields[1]
>>
>>   # If hashMapping has never seen this key before, make an empty array
>>   hashMapping[hash] ||= []
>>
>>   # Add the new element to the array for this key
>>   hashMapping[hash] << ocrID
>> end
>
> It is slightly more efficient to do it in one step:
>
> (hashMapping[hash] ||= []) << ocrID
>
> Even nicer
>
> hashMapping = Hash.new {|h,k| h[k] = []}

Is this defining a default element for the hash?  I had a vague
recollection you could do this but completely forgot how.
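
If I'm reading it right, yes: the block runs whenever a key that isn't
in the hash is looked up, and because it assigns h[k], the new array is
actually stored. A minimal sketch (the variable names 'groups' and
'shared' are made up for illustration), with the Hash.new([]) form
alongside for contrast, since that one hands every missing key the same
array and never stores it:

```ruby
# Block form: called on a missing key; the assignment stores a fresh array.
groups = Hash.new { |h, k| h[k] = [] }
groups["2efa4ba470"] << "00000005"
groups["2efa4ba470"] << "00000004"
groups["02adecfd5c"] << "00000002"

# Object form: every missing key returns the SAME array, and the hash
# itself stays empty because nothing is ever assigned to a key.
shared = Hash.new([])
shared["a"] << 1
shared["b"] << 2
```

So the block form looks like the one to use when each key needs its own
array.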

I'd also rename the 'hash' variable to 'key' or something; I think
it's less confusing.  Then your hashMapping can either be given the
name 'hash', because that's what it is, or a name that actually
describes what the mystical contents of the hash are.
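
Putting the renames together, here is a rough sketch of how the whole
thing might look. It is keyed on the first field, since that is the one
the desired output groups on. The method name group_by_first_field and
the variable names are my own invention, and I'm assuming the file holds
plain "key,ocrID" lines without quotes, so the sample rows are inlined
rather than read from C:\dummyFile.txt:

```ruby
# Hypothetical helper: groups "key,ocr_id" lines by their first field.
def group_by_first_field(lines)
  groups = Hash.new { |h, k| h[k] = [] } # missing keys start as []
  lines.each do |line|
    # Drop the trailing newline and any padding around each field.
    key, ocr_id = line.chomp.split(",").map(&:strip)
    groups[key] << ocr_id
  end
  groups
end

# Sample rows from the original post (assumed unquoted in the file).
lines = [
  "2efa4ba470,00000005\n",
  "2efa4ba470,00000004\n",
  "02adecfd5c,00000002\n",
  "c0784b5de101,00000006\n",
  "68c4bf10539,00000003\n",
  "c0784b5de101,00000001\n"
]

group_by_first_field(lines).each do |key, ocr_ids|
  puts "#{key}, #{ocr_ids.join(', ')}"
end
```

With a real file, File.foreach(source) called without a block should be
usable in place of the array, since it returns an enumerator over the
lines.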

> ...
> hashMapping[hash] << ocrID
>
> Kind regards
>
> robert
>
> --
> remember.guy do |as, often| as.you_can - without end
> http://blog.rubybestpractices.com/
>
>



-- 
Paul Smith
http://www.nomadicfun.co.uk

paul / pollyandpaul.co.uk