Issue #4151 has been updated by Eregon (Benoit Daloze).


Reading this again (including what I could understand of the ruby-dev discussion), I think this is a very important feature, and one that has been requested many times.

I think the need for a Hash counterpart of Array#map is the clearest. I was initially thinking of one generic method to avoid n specialized methods, but I think the "map"-like Hash primitives should be added first.
These are (the names are only meant to make the purpose clear): map_value, map_key and map_pair. map_pair could largely be replaced by categorize/associate though, so it might not be worth adding.

There is the alternative of using #each_with_object, but I think it is not nearly as clear or readable:
    h.each_with_object({}) { |(k,v),acc| acc[k] = v * 2 }
    h.map_value { |v| v * 2 }

I think map_value and map_key are worthwhile as they are, because building an Array just to change only the keys or only the values does not show what the user wants to achieve and is not very readable. I also think that checking whether the block's return value is an Array is not a good specification, because it then becomes hard to produce an Array as a value.
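
To make the intent concrete, here is a minimal sketch of how map_value and map_key could behave. The method names are only the placeholder names used above, and the module name HashMapSketch is made up for this illustration; this is not a proposed implementation.

```ruby
# Hypothetical sketch: map_value and map_key are the placeholder names from
# this message, and HashMapSketch is an invented module name.
module HashMapSketch
  # Returns a new Hash with the same keys and each value replaced by the
  # block's result.
  def map_value
    each_with_object({}) { |(k, v), acc| acc[k] = yield(v) }
  end

  # Returns a new Hash with each key replaced by the block's result.
  # If two keys map to the same new key, the later pair wins.
  def map_key
    each_with_object({}) { |(k, v), acc| acc[yield(k)] = v }
  end
end

Hash.send(:include, HashMapSketch)

prices = { :apple => 1, :pear => 2 }
p prices.map_value { |v| v * 2 }   #=> {:apple=>2, :pear=>4}
p prices.map_key { |k| k.to_s }    #=> {"apple"=>1, "pear"=>2}
```

Because the block result is used as-is, an Array can be returned as a value without any special casing.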

About categorize and associate, I agree the default should be "last value" rather than an Array, even though silently losing information is not an ideal default.

I think the categorize interface could be simplified a little by making it closer to #inject, that is, categorize(init = nil, sym_or_proc). I believe only one of :op and :update should be kept, to simplify the interface. (I think categorize's merge proc should not know about the keys; otherwise one could just write { |init, (*keys, value)| }, but that is not very nice.)
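
As a rough sketch of the #inject-like interface I have in mind (single key only, written as a standalone helper; the name categorize_like_inject is made up for this illustration, and treating a nil init as "no seed given" is my assumption):

```ruby
# Hypothetical sketch of an inject-like interface: an optional init followed
# by a Symbol or Proc that merges the accumulated value with each new value.
# Assumption: init == nil means "use the first value for a key as its seed".
def categorize_like_inject(enum, init = nil, sym_or_proc)
  op = sym_or_proc.to_proc
  enum.each_with_object({}) do |elt, result|
    key, value = yield(elt)                       # block returns [key, value]
    seed = result.key?(key) ? result[key] : init  # first time: fall back to init
    result[key] = seed.nil? ? value : op.call(seed, value)
  end
end

p categorize_like_inject(0..10, 0, :+) { |e| [e % 3, e] }
#=> {0=>18, 1=>22, 2=>15}
p categorize_like_inject(0..10, lambda { |x, y| y }) { |e| [e % 3, e] }
#=> {0=>9, 1=>10, 2=>8}
```

The merge proc only ever sees the accumulated value and the new value, never the keys.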

marcandre wrote:
> Is anyone going to submit a slide-show about this? 

I would like to do so. I don't expect associate or categorize to be decided at the meeting, but I think it would be a good occasion to move this forward and work out what should be added.
----------------------------------------
Feature #4151: Enumerable#categorize
https://bugs.ruby-lang.org/issues/4151#change-27597

Author: akr (Akira Tanaka)
Status: Assigned
Priority: Low
Assignee: akr (Akira Tanaka)
Category: 
Target version: 2.0.0


=begin
 Hi.
 
 How about a method for converting an enumerable to a hash?
 
   enum.categorize([opts]) {|elt| [key1, ..., val] } -> hash
 
 categorizes the elements in _enum_ and returns a hash.
 
 The block is called for each element in _enum_.
 The block should return an array which contains
 one or more keys and one value.
 
   p (0..10).categorize {|e| [e % 3, e % 5] }
   #=> {0=>[0, 3, 1, 4], 1=>[1, 4, 2, 0], 2=>[2, 0, 3]}
 
 The keys and value are used to construct the result hash.
 If two or more keys are provided
 (i.e. the length of the array is longer than 2),
 the result hash will be nested.
 
   p (0..10).categorize {|e| [e&4, e&2, e&1, e] }
   #=> {0=>{0=>{0=>[0, 8],
   #            1=>[1, 9]},
   #        2=>{0=>[2, 10],
   #            1=>[3]}},
   #    4=>{0=>{0=>[4],
   #            1=>[5]},
   #        2=>{0=>[6],
   #            1=>[7]}}}
 
 The values of the innermost hash are arrays which contain the values for
 the corresponding keys.
 This behavior can be customized with the :seed, :op and :update options.
 
 This method can take an option hash.
 The available options are as follows:
 
 - :seed specifies seed value.
 - :op specifies a procedure from seed and value to next seed.
 - :update specifies a procedure from seed and block value to next seed.
 
 :seed, :op and :update customize how the innermost hash value
 is generated.
 :seed and :op behave like Enumerable#inject.
 
 If _seed_ and _op_ are specified, the result value is generated as follows.
   op.call(..., op.call(op.call(seed, v0), v1), ...)
 
 :update works like :op, except that the second argument is the block value itself
 (the whole array) instead of its last element.
 
 If the :seed option is not given, the first value is used as the seed.
 
   # The arguments for :op option procedure are the seed and the value.
   # (i.e. the last element of the array returned from the block.)
   r = [0].categorize(:seed => :s,
                      :op => lambda {|x,y|
                        p [x,y]               #=> [:s, :v]
                        1
                      }) {|e|
     p e #=> 0
     [:k, :v]
   }
   p r #=> {:k=>1}
 
   # The arguments for :update option procedure are the seed and the array
   # returned from the block.
   r = [0].categorize(:seed => :s,
                      :update => lambda {|x,y|
                        p [x,y]               #=> [:s, [:k, :v]]
                        1
                      }) {|e|
     p e #=> 0
     [:k, :v]
   }
   p r #=> {:k=>1}
 
 The default behavior, array construction, can be implemented as follows.
   :seed => nil
   :op => lambda {|s, v| !s ? [v] : (s << v) }
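
 The semantics described above can be sketched in plain Ruby roughly as
 follows. This is only an illustration, not the code in the attached patch:
 the helper name categorize_sketch is hypothetical, it takes the enumerable
 as an explicit argument, and it handles only :seed and :op (:update is left out).

```ruby
# Hypothetical pure-Ruby sketch of the described behavior (NOT the patch).
# Only :seed and :op are handled; :update is intentionally left out.
DEFAULT_OP = lambda { |s, v| !s ? [v] : (s << v) }  # default: collect into an Array

def categorize_sketch(enum, opts = {})
  op = opts.fetch(:op, DEFAULT_OP)
  # Without :op the default seed (nil) is used; with :op but without :seed,
  # the first value for a key becomes its seed.
  has_seed = opts.key?(:seed) || !opts.key?(:op)
  seed = opts[:seed]
  result = {}
  enum.each do |elt|
    *keys, value = yield(elt)          # block returns [key1, ..., val]
    *outer, last = keys
    # Walk (and create) the nested hashes for all keys but the last one.
    inner = outer.inject(result) { |h, k| h[k] ||= {} }
    inner[last] =
      if inner.key?(last) then op.call(inner[last], value)
      elsif has_seed      then op.call(seed, value)
      else                     value
      end
  end
  result
end

p categorize_sketch(0..10) { |e| [e % 3, e % 5] }
#=> {0=>[0, 3, 1, 4], 1=>[1, 4, 2, 0], 2=>[2, 0, 3]}
p categorize_sketch(0..10, :seed => 0, :op => lambda { |s, v| s + v }) { |e| [e % 3, e] }
#=> {0=>18, 1=>22, 2=>15}
```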
 
 Note that matz is not satisfied with the method name, "categorize".
 [ruby-dev:42681]
 
 Also note that matz wants a separate method
 in which the hash value is the last value, not an array of all values.
 This can be implemented by enum.categorize(:op=>lambda {|x,y| y}) { ... }.
 But a good method name has not been found yet.
 [ruby-dev:42643]
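
 For a single key, the "last value" behavior could be sketched without
 categorize at all. The helper name last_value_hash below is made up for
 illustration, since no method name has been agreed on.

```ruby
# Hypothetical sketch of the "last value" variant for a single key:
# a later [key, value] pair simply overwrites an earlier one.
def last_value_hash(enum)
  enum.each_with_object({}) do |elt, h|
    key, value = yield(elt)   # block returns [key, value]
    h[key] = value
  end
end

p last_value_hash(0..10) { |e| [e % 3, e] }
#=> {0=>9, 1=>10, 2=>8}
```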
 -- 
 Tanaka Akira
 
 Attachment: enum-categorize.patch
=end



-- 
http://bugs.ruby-lang.org/