On Oct 13, 10:50 pm, Jeremy Bopp <jer... / bopp.net> wrote:
> On 10/13/2010 09:50 PM, Steve Howell wrote:
>
> > It would be nice if Ruby supported a sort_by on steroids.
>
> >   sorted_list = list.multi_field_sort_by(
> >     { |x| x.department.name },
> >     { |x| x.position.name },
> >     desc { |x| x.level },
> >     desc { |x| x.salary ? x.salary : x.rate * db_lookup(x, 'hours_worked') }
> >   )
>
> > I believe you could write a decent multi_field_sort_by in Ruby that
> > would be efficient for large enough lists to outperform more tedious
> > approaches, but it would be even better if Ruby natively supported it.
>
> > My proposed syntax might be slightly off, but you get the idea.  You'd
> > pass a list of blocks that represent the successive tiebreakers, and
> > multi_field_sort_by would presumably cache the results from each
> > transformation, evaluating the blocks only as necessary.  The "desc"
> > thingy would actually produce some kind of wrapper that
> > multi_field_sort_by could introspect to know that it needs to apply a
> > particular tiebreaker in reverse order.
>
> How about something like this:
>
> module Enumerable
>   # sort_by will take a key generator and an optional comparator
>   # and perform a Schwartzian Transform over the data.
>   #
>   # This is a general solution that is less efficient than
>   # a purpose-built sorting function if the keys are trivial to
>   # generate.
>   def sort_by(cmp = lambda { |a, b| a <=> b }, &key)
>     collect do |item|   # Generate keys from the list items.
>       [key[item], item]
>     end.sort do |a, b|  # Sort the keys.
>       cmp[a[0], b[0]]
>     end.collect do |kv| # Return the items in key sort order.
>       kv[1]
>     end
>   end
> end
>
> # This will cause the sort to operate over the second and then
> # the first column.
> key = lambda { |item| [item[1], item[0]] }
>
> # This will cause the sort to operate normally on the second
> # column and in reverse on the first column.
> cmp = lambda do |a, b|
>   res = a[0] <=> b[0]
>   break res unless res == 0
>   b[1] <=> a[1]
> end
>
> # Sample data from Ryan's post.
> a = [
>   ["radio", 30, 5],
>   ["radio", 20, 5],
>   ["archie", 20, 5],
>   ["newton", 10, 3]
> ]
>
> # Sort over the second and then first columns each in normal order.
> p a.sort_by(&key)
> # => [["newton", 10, 3], ["archie", 20, 5], ["radio", 20, 5],
> #     ["radio", 30, 5]]
>
> # Sort over the second column in normal order and then the first
> # column in reverse order.
> p a.sort_by(cmp, &key)
> # => [["newton", 10, 3], ["radio", 20, 5], ["archie", 20, 5],
> #     ["radio", 30, 5]]

I think it's a good start, but it still requires you to write a
somewhat customized cmp function.  A truly optimized multi-field sort
could also avoid calculating every element of the key up front, since
the later elements are often needed only as tiebreakers.
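
Here is a rough sketch of what I have in mind, not a tuned
implementation.  A Ruby method accepts only one block, so this version
assumes the tiebreakers are passed as lambdas rather than as a list of
blocks.  The names multi_field_sort_by, desc and DescKey are
hypothetical, and the keys are assumed to be mutually comparable with
<=>.  Each key is computed at most once per item, and only when an
earlier key has tied.

```ruby
# Hypothetical wrapper that marks a key as "sort in descending order".
DescKey = Struct.new(:block)

# Hypothetical helper: wrap a block so the sort applies it in reverse.
def desc(&block)
  DescKey.new(block)
end

module Enumerable
  # Hypothetical method: sort by successive tiebreaker keys.  Keys are
  # evaluated lazily and cached per item.
  def multi_field_sort_by(*keys)
    # Normalize every key to a [callable, direction] pair.
    specs = keys.map { |k| k.is_a?(DescKey) ? [k.block, -1] : [k, 1] }

    # Pair each item with its own cache of already-computed keys.
    decorated = map { |item| [item, {}] }

    # Compute key number i for an item on first use, then reuse it.
    fetch = lambda do |pair, i|
      item, cache = pair
      cache.fetch(i) { cache[i] = specs[i][0].call(item) }
    end

    decorated.sort do |a, b|
      result = 0
      specs.each_index do |i|
        result = (fetch[a, i] <=> fetch[b, i]) * specs[i][1]
        break unless result.zero?   # Later keys are only needed on a tie.
      end
      result
    end.map(&:first)
  end
end

# Sample data from earlier in the thread.
rows = [
  ["radio", 30, 5],
  ["radio", 20, 5],
  ["archie", 20, 5],
  ["newton", 10, 3]
]

calls = 0   # Counts how often the tiebreaker key is actually computed.
sorted = rows.multi_field_sort_by(
  lambda { |r| r[1] },
  desc { |r| calls += 1; r[0] }
)
p sorted
p calls
```

With this data only the two rows that tie on the second column ever
have their tiebreaker evaluated.  The per-item hash costs some memory,
so this probably pays off only when the later keys are expensive to
compute.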