Issue #14400 has been updated by akr (Akira Tanaka).

Status changed from Assigned to Feedback

Basically, ungetbyte/ungetc works if there is enough space in the IO buffer.

Since the IO buffer is not dynamically extended and
the buffer size is internal behavior,
it is very small guarantee that we have enough space:
(1) There is 1 byte space at beginning and
(2) there is 1 byte space just after reading some data.

Note that current implementation of ungetbyte allocates
enough buffer to save given string if IO buffer is not allocated yet.
(This is the reason we can ungetbyte large data at beginning.)

Some idea:
- describe the above guarantee in the document.
- refine error message: "ungetbyte failed" to "ungetbyte: IO buffer space not enough"

Also, I found a situation that we cannot push back one (multibyte) character, now.

```
% ./ruby -e 'print "a" * 1024*32'|./ruby -e 'p STDIN.getc; STDIN.ungetc "\u3042"'
"a"
Traceback (most recent call last):
	1: from -e:1:in `<main>'
-e:1:in `ungetc': ungetbyte failed (IOError)
```

I'm not sure what we can here.


----------------------------------------
Bug #14400: IO#ungetc and IO#ungetbyte documentation is inconsistent with the behavior
https://bugs.ruby-lang.org/issues/14400#change-70470

* Author: Eregon (Benoit Daloze)
* Status: Feedback
* Priority: Normal
* Assignee: akr (Akira Tanaka)
* Target version: 
* ruby -v: ruby 2.6.0dev (2018-01-25 trunk 62035) [x86_64-linux]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
The documentation of IO#ungetc states:

> Pushes back one character (passed as a parameter) onto ios, such that a
> subsequent buffered character read will return it. Only one character may be
> pushed back before a subsequent read operation (that is, you will be able to
> read only the last of several characters that have been pushed back). Has no
> effect with unbuffered reads (such as IO#sysread).

And similar for IO#ungetbyte:

> Pushes back bytes (passed as a parameter) onto ios, such that a
> subsequent buffered read will return it. Only one byte may be pushed back
> before a subsequent read operation (that is, you will be able to read only the
> last of several bytes that have been pushed back). Has no effect with
> unbuffered reads (such as IO#sysread).

The part about only one byte/character is inconsistent with the actual behavior,
most notably because both of these methods accept a String with multiple characters as argument.

~~~ ruby
STDIN.ungetc "Hello World!"
STDIN.read 12 #=> "Hello World!"

STDIN.ungetbyte "Foo Bar"
STDIN.read 7 #=> "Foo Bar"
~~~
(There are even specs for it:
https://github.com/ruby/spec/blob/7fa22023d69620ea3ff4d0ed2eb71fd7b02dd950/core/io/ungetc_spec.rb#L98
https://github.com/ruby/spec/blob/7fa22023d69620ea3ff4d0ed2eb71fd7b02dd950/core/io/ungetbyte_spec.rb#L21)

> that is, you will be able to read only the last of several characters that have been pushed back

is contradicting what happens.

The behavior with large Strings is confusing.
It seems to allow arbitrarily large strings (but only if there was not a ungetbyte already/the buffer was empty?).

~~~
$ pry
[1] pry(main)> STDIN.ungetbyte "a"*10_000
=> nil
[2] pry(main)> STDIN.ungetbyte "a"*10_000
IOError: ungetbyte failed

$ pry
[1] pry(main)> STDIN.ungetbyte "a"*100_000
=> nil
[2] pry(main)> STDIN.ungetbyte "a"*100_000
IOError: ungetbyte failed
from (pry):2:in `ungetbyte'

$ pry
[1] pry(main)> STDIN.ungetbyte "a"*100_000
=> nil
[2] pry(main)> STDIN.read(100_000).size
=> 100000
[3] pry(main)> STDIN.ungetbyte "a"*100_000
=> nil
[4] pry(main)> STDIN.read(100_000).size
=> 100000
~~~

And it's not as simple as if two consecutive ungetbyte were forbidden:

~~~
$ pry
[1] pry(main)> STDIN.ungetbyte "a"*10_000_000
=> nil
[2] pry(main)> STDIN.ungetbyte "a"
IOError: ungetbyte failed
from (pry):2:in `ungetbyte'

$ pry
[1] pry(main)> STDIN.ungetbyte "a"
=> nil
[2] pry(main)> STDIN.ungetbyte "a"
=> nil
~~~

So how are those methods supposed to behave?
Can the documentation be updated to match the behavior and/or the behavior be fixed to be simpler?

I also wonder when those methods are useful.
There seems to be very few usages in the stdlib.
Maybe they should just be removed?
It seems easy to make a custom IO wrapper/buffer supporting pushing characters/bytes back.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>