Issue #4151 has been updated by marcandre (Marc-Andre Lafortune).


Hi,

trans (Thomas Sawyer) wrote:
> Could you give an example of #categorize?

There are many already in the discussion... A simple one:

     [:a, "foo", :a, :a].categorize(:+){|i| ["occurrences of #{i}", 1]} # => { "occurrences of a" => 3, "occurrences of foo" => 1 }

> Also, I thought this method(s) purpose is to convert Enumerable to Hash not just Hash to Hash. Examples with Arrays would be good too.

They do convert an Enumerable to a Hash. Hash just has specialized versions, in the same way that `Hash#select` is a specialized version of `Enumerable#select`.
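
To make the comparison concrete, here is a small sketch of how the two `select` variants differ on Ruby 1.9 and later. The variable names are only for illustration; going through `h.each` is one way to reach the generic `Enumerable#select` instead of the Hash-specific one.

```ruby
h = { a: 1, b: 2, c: 3 }

# Hash#select is the specialized version: it returns a Hash.
hash_result = h.select { |_key, val| val > 1 }

# Enumerable#select (reached here through an Enumerator) returns an Array of pairs.
enum_result = h.each.select { |_key, val| val > 1 }

p hash_result
p enum_result
```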

> Glancing at ActiveSupport's #index_by, I am not sure about implementation. Can it even handle `|key,value|` pairs?

Not sure what your question is, but that implementation is not relevant to the discussion. Here's a Ruby implementation:

    module Enumerable
      def index_by(merge = nil)
        return to_enum(:index_by, merge) unless block_given?
        if merge.is_a?(Symbol)
          method = merge
          merge = ->(_, before, after) { before.send(method, after) }
        end
        h = {}
        each do |val|
          key = yield val
          h[key] = if h.has_key?(key) && merge
                     merge[key, h[key], val]
                   else
                     val
                   end
        end
        h
      end
    end

    class Hash
      def index_by(merge = nil)
        # same as Enumerable#index_by up to here...
        each do |key, val|
          key = yield key, val
          # ...and the same from here on (store or merge, close the loop, return h)
      end
    end
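
A usage sketch may help here. To keep it self-contained and to avoid patching core classes, the same logic is written as a standalone helper; the name `index_by_sketch` and the sample data are made up for this example and are not part of the proposal.

```ruby
# Hypothetical standalone variant of the index_by logic above.
# merge may be nil (last value wins), a Symbol, or a callable taking
# (key, value_already_stored, new_value).
def index_by_sketch(enum, merge = nil)
  if merge.is_a?(Symbol)
    method = merge
    merge = ->(_key, before, after) { before.send(method, after) }
  end
  enum.each_with_object({}) do |val, h|
    key = yield(val)
    h[key] = h.key?(key) && merge ? merge.call(key, h[key], val) : val
  end
end

words = %w[apple avocado banana]

last_wins  = index_by_sketch(words) { |w| w[0] }
joined     = index_by_sketch(words, :+) { |w| w[0] }
first_wins = index_by_sketch(words, ->(_key, before, _after) { before }) { |w| w[0] }

p last_wins   # {"a"=>"avocado", "b"=>"banana"}
p joined      # {"a"=>"appleavocado", "b"=>"banana"}
p first_wins  # {"a"=>"apple", "b"=>"banana"}
```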


> For Hash, I have always liked Facets #rekey which can take `|key|` or `|key,value|` in block.

I don't know of a single core method that behaves differently depending on the arity of the block.
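
For readers unfamiliar with the idea, this is roughly what dispatching on block arity looks like. It is only an illustration of the technique: `rekey_sketch` is a made-up helper, not the Facets implementation.

```ruby
# Hypothetical helper: builds a new hash with keys computed by the block.
# A one-parameter block receives the key; any other block receives key and value.
def rekey_sketch(hash, &block)
  hash.each_with_object({}) do |(key, val), result|
    new_key = block.arity == 1 ? block.call(key) : block.call(key, val)
    result[new_key] = val
  end
end

by_key  = rekey_sketch({ a: 1, b: 2 }) { |key| key.to_s }
by_pair = rekey_sketch({ a: 1, b: 2 }) { |key, val| "#{key}#{val}" }

p by_key   # {"a"=>1, "b"=>2}
p by_pair  # {"a1"=>1, "b2"=>2}
```

The same call thus means two different things depending on how the block is written, which is the behavior no core method is known to have.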

> Perhaps the best approach (...) would be via intermediate methods to get the enumerable in a prepared state, followed by a simple call of #to_h.

Matz is not positive about `Array#to_h`. I wish I could convince him that `Hash[ary]` is so much uglier than `ary.to_h`.
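
For comparison, the two spellings side by side. Note that `Array#to_h` is not available in the Ruby versions under discussion here; it ships with Ruby 2.1 and later, so the second form assumes such a version.

```ruby
pairs = [[:a, 1], [:b, 2]]

via_brackets = Hash[pairs]  # works on all versions discussed in this thread
via_to_h     = pairs.to_h   # assumes Ruby 2.1 or later

p via_brackets
p via_to_h
```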

----------------------------------------
Feature #4151: Enumerable#categorize
https://bugs.ruby-lang.org/issues/4151#change-27386

Author: akr (Akira Tanaka)
Status: Assigned
Priority: Low
Assignee: akr (Akira Tanaka)
Category: 
Target version: 2.0.0


=begin
 Hi.
 
 How about a method for converting an enumerable to a hash?
 
   enum.categorize([opts]) {|elt| [key1, ..., val] } -> hash
 
 categorizes the elements in _enum_ and returns a hash.
 
 The block is called for each element in _enum_.
 The block should return an array which contains
 one or more keys and one value.
 
   p (0..10).categorize {|e| [e % 3, e % 5] }
   #=> {0=>[0, 3, 1, 4], 1=>[1, 4, 2, 0], 2=>[2, 0, 3]}
 
 The keys and value are used to construct the result hash.
 If two or more keys are provided
 (i.e. the length of the array is longer than 2),
 the result hash will be nested.
 
   p (0..10).categorize {|e| [e&4, e&2, e&1, e] }
   #=> {0=>{0=>{0=>[0, 8],
   #            1=>[1, 9]},
   #        2=>{0=>[2, 10],
   #            1=>[3]}},
   #    4=>{0=>{0=>[4],
   #            1=>[5]},
   #        2=>{0=>[6],
   #            1=>[7]}}}
 
 Each value of the innermost hash is an array which contains the values for
 the corresponding keys.
 This behavior can be customized by the :seed, :op and :update options.
 
 This method can take an option hash.
 Available options are as follows:
 
 - :seed specifies the seed value.
 - :op specifies a procedure that takes the seed and a value and returns the next seed.
 - :update specifies a procedure that takes the seed and the block value and returns the next seed.
 
 :seed, :op and :update customize how the innermost hash value
 is generated.
 :seed and :op behave like Enumerable#inject.
 
 If _seed_ and _op_ are specified, the result value is generated as follows.
   op.call(..., op.call(op.call(seed, v0), v1), ...)
 
 :update works like :op except that the second argument is the whole array
 returned by the block instead of its last element.
 
 If :seed option is not given, the first value is used as the seed.
 
   # The arguments for :op option procedure are the seed and the value.
   # (i.e. the last element of the array returned from the block.)
   r = [0].categorize(:seed => :s,
                      :op => lambda {|x,y|
                        p [x,y]               #=> [:s, :v]
                        1
                      }) {|e|
     p e #=> 0
     [:k, :v]
   }
   p r #=> {:k=>1}
 
   # The arguments for :update option procedure are the seed and the array
   # returned from the block.
   r = [0].categorize(:seed => :s,
                      :update => lambda {|x,y|
                        p [x,y]               #=> [:s, [:k, :v]]
                        1
                      }) {|e|
     p e #=> 0
     [:k, :v]
   }
   p r #=> {:k=>1}
 
 The default behavior, array construction, can be implemented as follows.
   :seed => nil
   :op => lambda {|s, v| !s ? [v] : (s << v) }
 
 Note that matz is not satisfied with the method name, "categorize".
 [ruby-dev:42681]
 
 Also note that matz wants a different method from this one,
 in which the hash value is the last value, not an array of all values.
 This can be implemented by enum.categorize(:op=>lambda {|x,y| y}) { ... }.
 But a good method name has not been found yet.
 [ruby-dev:42643]
 -- 
 Tanaka Akira
 
 Attachment: enum-categorize.patch
=end
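
The behavior described in the proposal above can be approximated in plain Ruby. What follows is a minimal sketch, not the attached patch: `categorize_sketch` and `NO_SEED` are names invented for this example, it is a standalone function rather than an Enumerable method, and it takes keyword arguments instead of an option hash.

```ruby
# Sentinel meaning "no :seed given", so that nil remains a usable seed.
NO_SEED = Object.new

# Hypothetical stand-in for the proposed Enumerable#categorize.
def categorize_sketch(enum, seed: NO_SEED, op: nil, update: nil)
  # Default behavior (assumed when neither :op nor :update is given):
  # collect the values into an array.
  if op.nil? && update.nil?
    seed = nil
    op = ->(s, v) { s ? (s << v) : [v] }
  end
  result = {}
  enum.each do |elt|
    *keys, val = yield(elt)
    *outer, last = keys
    # Walk down, creating nested hashes for every key but the last.
    h = outer.inject(result) { |node, k| node[k] ||= {} }
    step = lambda do |current|
      update ? update.call(current, [*keys, val]) : op.call(current, val)
    end
    h[last] = if h.key?(last)
                step.call(h[last])
              elsif seed.equal?(NO_SEED)
                val # no seed given: the first value becomes the seed
              else
                step.call(seed)
              end
  end
  result
end

flat   = categorize_sketch(0..10) { |e| [e % 3, e % 5] }
nested = categorize_sketch(0..10) { |e| [e & 4, e & 2, e & 1, e] }
last   = categorize_sketch([1, 2, 3], op: ->(_, y) { y }) { |e| [e.odd?, e] }
counts = categorize_sketch([:a, "foo", :a, :a], op: ->(a, b) { a + b }) { |i| [i.to_s, 1] }

p flat    # {0=>[0, 3, 1, 4], 1=>[1, 4, 2, 0], 2=>[2, 0, 3]}
p last    # {true=>3, false=>2}
p counts  # {"a"=>3, "foo"=>1}
```

The `last` call corresponds to the "last value wins" variant mentioned at the end of the proposal, and `counts` to the counting example given in the reply.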



-- 
http://bugs.ruby-lang.org/