Issue #4766 has been updated by Yusuke Endoh.

File bsearch.patch added

Hello,

Let me summarize.
There are two typical use cases for bsearch: from array soreted
in ascending order,

  1) find minimum i so that s <= a[i]
  2) find any i so that s == a[i]


Case 2 is the point I overlooked; "find minimum" is not inherent
in binary search.


In fact, the API that I proposed covers both use cases.

  1) r.bsearch {|i| s <= a[i] }
  2) r.bsearch {|i| break i if s == a[i]; s < a[i] }

But,


> My proposed bsearch_matches would clearly need to know which
> direction to break ties before it commenced (perhaps a 1/0/-1
> argument), which starts to limit its usefulness.

indeed, Case 2 can be shorter by using 1/0/-1.

  2) r.bsearch {|i| s - a[i] }

As Michael said, it is too cryptic for casual use, but certainly
more generic, and compatible to bsearch of stdlib.h.


How about combining 1/0/-1 and my proposal?  That is,

  if the block returns zero,
    the current value satisfies the condition;
    return the current value immediately

  if the block returns an integer less than zero,
    the current value is too big to satisfy the condition;
    continue to search smaller values

  if the block returns an integer greater than zero,
    the current value too big to satisfy the condition;
    continue to search larger values

  if the block returns true,
    the current value satisfies the condition;
    continue to search minimum bound

  if the block returns false or nil, same as -1


It is still needed to invert the condition when you want the
maximum bound, but I guess this is acceptable compromise.


I updated rdoc and a patch.  As you know I'm not a native
speaker, so could you check it?


/*
 *  call-seq:
 *     rng.bsearch {|obj| block }  -> element
 *
 *  By using binary search, finds a value in range which meets the given
 *  condition in O(n log n) where n = (rng.begin - rng.end).
 *
 *  The given block receives a current value, determines if it meets the
 *  condition and controls search.
 *  When the condition is satisfied and you want to stop search, the block
 *  should return zero, and then this method return the value immediately.
 *  When the condition is satisfied and you want to find minimum bound,
 *  the block should return true.  When the condition is not satisfied and
 *  the current value is smaller than wanted, the block should return false,
 *  nil or an integer greater than zero. When the condition is not satisfied
 *  and the current value is larger than wanted, the block should return an
 *  integer less than zero.
 *  Unless the block returns zero, the search will continue until a minimum
 *  bound is found or no match is found.  Returns the minimum bound if any,
 *  or returns nil when no match is found.
 *
 *  The block must be monotone; there must be two values a and b so that
 *  the block returns:
 *  - false, nil or an integer greater than zero for all x of [begin of
 *    range, a), and
 *  - zero or true for all x of [a, b), and
 *  - an integer less than zero for all x of [b, end of range).
 *  If the block is not monotone, the result is unspecified.
 *
 *  This method takes O(n log n), but it is unspecified which value is
 *  actually picked up at each iteration.
 *
 *     ary = [0, 4, 7, 10, 12]
 *     (0...ary.size).bsearch {|i| ary[i] >= 4 } #=> 1
 *     (0...ary.size).bsearch {|i| ary[i] >= 6 } #=> 2
 *     (0...ary.size).bsearch {|i| ary[i] >= 8 } #=> 3
 *     (0...ary.size).bsearch {|i| ary[i] >= 100 } #=> nil
 *
 *     (0.0...Float::INFINITY).bsearch {|x| Math.log(x) >= 0 } #=> 1.0
 *
 *     ary = [0, 100, 100, 100, 200]
 *     (0..4).bsearch {|i| 100 - i } #=> 1, 2 or 3
 *     (0..4).bsearch {|i| 300 - i } #=> nil
 *     (0..4).bsearch {|i|  50 - i } #=> nil
 */

Thanks,

-- 
Yusuke Endoh <mame / tsg.ne.jp>
----------------------------------------
Feature #4766: Range#bsearch
http://redmine.ruby-lang.org/issues/4766

Author: Yusuke Endoh
Status: Assigned
Priority: Normal
Assignee: Yusuke Endoh
Category: 
Target version: 1.9.x


Hello,

I propose Range#bsearch for binary search:

  ary = [0, 4, 7, 10, 12]
  p (0..4).bserach {|i| ary[i] >= 4 } #=> 1
  p (0..4).bserach {|i| ary[i] >= 6 } #=> 2

  p (0.0...Float::INFINITY).bsearch {|x| Math.log(x) >= 0 } #=> 1.0


Rdoc:

 *  call-seq:
 *     rng.bsearch {|obj| block }  -> element
 *
 *  Returns the minimum value in range which meets the block by binary search.
 *  The block must be monotone for arguments; in other words, it must have any
 *  following properties:
 *
 *    - there is a value so that the block returns false for any smaller value
 *      than the value, and that the block returns true for any bigger (or
 *      equal) value than the value,
 *    - the block always return true, or
 *    - the block always return false.
 *
 *  If the block is not monotone, the behavior is not unspecified.
 *
 *  Returns nil when there is no value that meets the block..
 *
 *
 *     ary = [0, 4, 7, 10, 12]
 *     (0...ary.size).bsearch {|i| ary[i] >= 4 } #=> 1
 *     (0...ary.size).bsearch {|i| ary[i] >= 6 } #=> 2
 *     (0...ary.size).bsearch {|i| ary[i] >= 8 } #=> 3
 *     (0...ary.size).bsearch {|i| ary[i] >= 100 } #=> nil
 *
 *     (0.0...Float::INFINITY).bsearch {|x| Math.log(x) >= 0 } #=> 1.0
 *
 *
 *  Note that this method does not stop search immediately when the block
 *  returns true.  This is because this method find the *minimum* value:
 *
 *     ary = [0, 100, 100, 100, 200]
 *     (0..4).bsearch {|i| p ary[i]; ary[i] >= 100 }
 *       # may output "100" more than once



Discussion:

You might think Array#bsearch is better.  But Array#bsearch has some problems:

  - Binary search is usable for not only an array but also a function.
    (See the above example for Math.log)
    Nevertheless, Array#bsearch can be used only for an array.
    Range#bsearch covers both.  You can use it for an array as follows:

      ary.sort!
      i = (0...ary.size).bsearch {|i| condition(ary[i]) }
      p ary[i]

  - matz hated Array#bsearch because its precondition (Array must be sorted)
    seems too strong (to him).  [ruby-dev:17125]
    Range#bsearch has the same precondition in effect (the block must be
    monotone), but there is a difference that this condition is for "code",
    not "state".  In fact, Array#sort has a similar condition.


I think there is demand for this feature because similar feature requests
are often proposed: [ruby-dev:17122] [ruby-core:30892] [ruby-dev:38545]

More importantly, this feature is often required in many programming
contests :-)


A patch is attached.  Thank you for your consideration.

-- 
Yusuke Endoh <mame / tsg.ne.jp>


-- 
http://redmine.ruby-lang.org