On Thu, Jan 26, 2012 at 5:29 AM, Alex Chaffee <alex / stinky.com> wrote:
>
>> Just throwing this out there: judging float equality based on
>> _difference_ is incorrect.  You should use _ratio_, or perhaps in some
>> cases a combination of both.
>
> Fascinating! So if we used division-and-proximity-to-1 instead of
> subtraction-and-proximity-to-0 then we could possibly do away with the
> tolerance parameter altogether... or at least redefine it.
>
> Of course, your argument reverses itself when dealing with very large
> numbers!

I don't think so.  Floats are implemented using a coefficient and an
exponent.  So are the following two floats essentially equal?

  A:  6.30912402 E 59
  B:  6.30912401999999999999 E 59

I'd say yes.  What about these two?

  A:  6.30912402 E 59
  B:  6.30912401999999999999 E 58

Of course not!  There is an order-of-magnitude difference.  So perhaps
a unit testing float comparison should work like this (pseudo-code):

   def float_equal? a, b
      c1, m1 = coefficient(a), exponent(a)
      c2, m2 = coefficient(b), exponent(b)
      m1 == m2 and (c1/c2 - 1).abs < 0.000000000001
   end

If you take the magnitude away, then dealing with very large numbers
shouldn't be a problem.
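
To make that concrete, here is a minimal runnable sketch of the
pseudo-code above.  The decimal_parts helper is hypothetical (not from
any library); it splits a float into a decimal coefficient and exponent
via sprintf:

  # Hypothetical helper: decimal coefficient and exponent of a float.
  def decimal_parts(x)
    coeff, exp = ("%.17e" % x).split("e")
    [coeff.to_f, exp.to_i]
  end

  def float_equal?(a, b, epsilon = 1e-12)
    c1, m1 = decimal_parts(a)
    c2, m2 = decimal_parts(b)
    m1 == m2 and (c1/c2 - 1).abs < epsilon
  end

  float_equal?(6.30912402e59, 6.30912401999999999999e59)   # true
  float_equal?(6.30912402e59, 6.30912401999999999999e58)   # false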

> Let's say I'm dealing with Time. If I say time a should be
> close to time b, then I probably want the same default precision (say,
> 10 seconds) no matter when I'm performing the test, but using ratios
> will give me quite different tolerances depending on whether my
> baseline is epoch (0 = 1/1/1970) or Time.now.

If your application or library wants to know if two times are within 10
seconds of each other, then that's a property of _your code_ and has
nothing to do with float implementations.  In other words, to compare
Time objects, use Time objects, not Float objects :)
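
For example (just a sketch, not a proposal for the library): Time
subtraction already yields seconds as a Float, so a domain-level
"within 10 seconds" check needs no float-equality machinery at all.

  t1 = Time.now
  t2 = t1 + 7              # seven seconds later

  (t1 - t2).abs < 10       # true: within the 10-second tolerance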

> In any case... I will be happy to review your patch! :-)

Hard to offer a patch to code I don't even have installed :), but here
is an excerpt from my implementation.  See http://bit.ly/AblEvx, line
274, for context [1].

      def run
        if @actual.zero? or @expected.zero?
          # There's no scale, so we can only go on difference.
          (@actual - @expected).abs < @epsilon
        else
          # We go by ratio. The ratio of two equal numbers is one, so the ratio
          # of two practically-equal floats will be very nearly one.
          @ratio = (@actual/@expected - 1).abs
          @ratio < @epsilon
        end
      end

The problem with this is it's using @epsilon for two different
purposes: a "difference" epsilon and a "ratio" epsilon.  That is
clearly wrong, but I just implemented something that would work for
me.  I figured there _must_ be a best-practice approach out there
somewhere that I could learn from.  I firmly believe this problem
should be solved once and for all; it won't be solved by testing
difference, and there should be a value for epsilon that is justified by
the engineering. [2]
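
One obvious split, purely as a sketch mirroring the excerpt above (the
constant names and values are assumptions, not anything in whitestone),
would be an absolute tolerance for the zero case and a relative one for
the ratio case:

  ABSOLUTE_EPSILON = 1e-12   # assumed value, for the zero case
  RELATIVE_EPSILON = 1e-12   # assumed value, for the ratio case

  def run
    if @actual.zero? or @expected.zero?
      # No scale available: fall back to absolute difference.
      (@actual - @expected).abs < ABSOLUTE_EPSILON
    else
      # Scale available: compare the ratio to one.
      (@actual/@expected - 1).abs < RELATIVE_EPSILON
    end
  end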

I also believe the built-in Float class should provide methods to
assist us.  It gives us inaccuracy, so it should give us the tools to
deal with it.

  class Float
    def essentially_equal_to?(other)
      # Best-practice implementation here with scientifically valid
      # value for epsilon.
    end

    def within_delta_of?(other, delta)
      # No default value for delta because it is entirely context-dependent.
      # This is a convenience method only.
      (self - other).abs < delta
    end
  end

  a = 1.1 - 1.0
  a.essentially_equal_to?(0.1)          # true

  4.7.within_delta_of?(4.9251, 0.2)   # false

[1] Full link for posterity:
https://github.com/gsinclair/whitestone/blob/master/lib/whitestone/assertion_classes.rb

[2] While "one epsilon to rule them all" is appealing, the problem is
that the errors inherent in float representation get magnified by
computation.  However, even raising two "essentially equal" floats to
the power of 50 doesn't change their essential equality, assuming a
ratio of 1e-10 is good enough:

  a = 0.1            # 0.1
  b = 1.1 - 1.0      # 0.10000000000000009

  xa = a ** 50       # 1.0000000000000027e-50
  xb = b ** 50       # 1.0000000000000444e-50

  proximity_ratio = (xa/xb - 1).abs
                     # 4.163336342344337e-14

  proximity_ratio < 1e-10
                     # true

By the way, the proximity_ratio for the original a and b was
7.77156e-16, so I hastily conclude:
 * The engineering compromises in the representation of floats give
us a proximity ratio of around 1e-15 (7.77156e-16 above).
 * Raising to an enormous power changes the proximity ratio to around
1e-13 (4.163e-14 above).
 * A reasonable value for epsilon might therefore be 1e-12.

I expect this conclusion might depend on my choice of values for a and
b, though.

If you made it this far, congratulations.