Issue #14729 has been updated by shevegen (Robert A. Heiler).


> This behavior is inconsistent; underscores are verified throughout the entire
> string so why not look for other invalid characters?

I don't think that reasoning holds for underscores: they are valid in both
variants, where they are simply ignored, as far as I can see:

    "123_456".to_f # => 123456.0
    Float("123_456") # => 123456.0

I have not checked on the size restriction yet though.
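
For what it's worth, here is a minimal sketch of how Float() treats underscores. The well-formed case matches the examples above; the malformed cases rest on my understanding that Float() only accepts an underscore between two digits, so treat that part as an assumption to check against your Ruby version.

```ruby
# Well-formed underscore: both variants accept it.
p "123_456".to_f     # lenient conversion
p Float("123_456")   # strict conversion

# Malformed underscores (leading, trailing, doubled):
# Float() is expected to raise ArgumentError for each of these.
["_1", "1_", "1__2"].each do |s|
  begin
    Float(s)
    puts "#{s.inspect}: accepted"
  rescue ArgumentError => e
    puts "#{s.inspect}: #{e.message}"
  end
end
```

So underscores are not unconditionally ignored by Float(); only the well-formed ones are.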

----------------------------------------
Bug #14729: Float("long_invalid_string") fails to throw an exception
https://bugs.ruby-lang.org/issues/14729#change-71765

* Author: samiam (Sam Napolitano)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.6.0dev (2018-04-29 trunk 63298) [x86_64-darwin16]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
When Float() is used to convert a string into a float, invalid characters in the string raise an error.

But when a really long string is passed to Float(), invalid characters beyond the size of the internal C buffer are ignored and no error is raised.

This behavior is inconsistent; underscores are verified throughout the entire string so why not look for other invalid characters?

I have a rough patch but would prefer to hear what the developers think of this bug before I post it. Should Float() accept a string of any size, or limit it?

Code details:

The code in question is object.c:rb_cstr_to_dbl_raise().
https://bugs.ruby-lang.org/projects/ruby-trunk/repository/entry/object.c#L3232

Specifically, the buffer limit is usually 70-1 digits. For reference, 2^64 is 20 digits, so this may be an academic exercise.
https://bugs.ruby-lang.org/projects/ruby-trunk/repository/entry/object.c#L3271
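
The arithmetic behind those numbers can be checked from Ruby itself. This is only a sketch: it assumes Float::DIG mirrors the C constant used to size the buffer (15 for IEEE 754 doubles), and the variable names are mine.

```ruby
# Assumed buffer size, using the same expression as the C source.
buf_len = Float::DIG * 4 + 10
# One slot is reserved, leaving this many usable characters.
usable = buf_len - 1
# Number of decimal digits in 2**64, for comparison.
digits_in_2_64 = (2**64).to_s.length

puts "buffer length:   #{buf_len}"
puts "usable chars:    #{usable}"
puts "digits in 2**64: #{digits_in_2_64}"
```

On a platform where Float::DIG is 15, the usable length comes out at 69, well beyond the 20 digits needed for 2**64.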

As an aside, I believe the last check on errno in the function is unnecessary. errno should be examined immediately after the call that sets it, which it is, so it's unclear why it's checked again at the end of the function.
https://bugs.ruby-lang.org/projects/ruby-trunk/repository/entry/object.c#L3307

The following code demonstrates the issue with some additional comments.

```ruby
#!/usr/bin/env ruby

require 'test/unit'
require 'test/unit/assertions'
include Test::Unit::Assertions

class TestFloat < Test::Unit::TestCase

  # https://bugs.ruby-lang.org/projects/ruby-trunk/repository/entry/object.c#L3271
  # BUF_SIZE = 69 on most machines
  # -1 leaves room for the terminating NUL
  # Bonus points if you can explain the constants 4 and 10.
  BUF_SIZE = Float::DIG * 4 + 10 - 1

  # case 1: invalid char 'a' is within buffer size
  # Result: Float() correctly raises ArgumentError
  def test_strtod_ok
    assert_raise(ArgumentError){Float('1' * (BUF_SIZE-1) + 'a')}
  end

  # case 2: invalid char 'a' is outside buffer size
  # Result: no error is raised because the buffer doesn't contain the invalid char.
  # It's confusing why Ruby's behavior differs between case 1 and 2 until you look at the C code.
  def test_strtod_no_error
    assert_equal(1.1111111111111112e+68, Float('1' * BUF_SIZE + 'a is ignored'))
  end

  # case 3: entire string is scanned for underscores
  # Result: when '_' is found in the string, the previous char is checked and MUST be ISDIGIT,
  # or an error is raised by rb_cstr_to_dbl_raise, not strtod.
  def test_underscores_checked_whole_string
    assert_raise(ArgumentError){Float('1' * BUF_SIZE  + '234_56a_890')}
  end

  # case 4: the bug - should Ruby scan the entire string and detect invalid chars,
  # just as it does for invalid underscores, so that this test passes?
  # Result: no exception raised (currently)
  def test_check_whole_string_for_invalid_chars
    assert_raise(ArgumentError){Float('1' * BUF_SIZE  + 'a')}
  end

end
```
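
In the meantime, one way to sidestep the buffer limit from Ruby code is to validate the whole string before handing it to Float(). The sketch below is only an illustration: `strict_float` and `PLAIN_DECIMAL` are hypothetical names, not part of Ruby, and the pattern deliberately covers plain decimal notation only (no hex floats), so it is stricter than Float() itself.

```ruby
# Hypothetical pattern (not part of Ruby): optional sign, digits with
# single underscores between digits, optional fraction, optional exponent.
PLAIN_DECIMAL = /\A\s*[+-]?\d+(?:_\d+)*(?:\.\d+(?:_\d+)*)?(?:[eE][+-]?\d+(?:_\d+)*)?\s*\z/

# Hypothetical helper: checks every character of the string, regardless
# of length, and only then delegates the conversion to Float().
def strict_float(str)
  unless PLAIN_DECIMAL.match?(str)
    raise ArgumentError, "invalid value for strict_float: #{str.inspect}"
  end
  Float(str)
end

p strict_float("123_456")   # => 123456.0
p strict_float("1.5e3")     # => 1500.0

begin
  strict_float("1" * 69 + "a")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Because the check runs before the conversion, a trailing invalid character is rejected no matter how far into the string it sits.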


