Issue #13626 has been updated by duerst (Martin Dürst).


normalperson (Eric Wong) wrote:

>  Fwiw, I'm also not convinced String#<< behavior about changing
>  write_buffer to Encoding::UTF-8 in your above example is good
>  behavior on Ruby's part...  But I don't know much about human
>  language encodings, I am just a *nix plumber where a byte is a
>  byte.

This behavior may not be the best for this specific case, but in general, if one string is US-ASCII and the other is UTF-8, then because UTF-8 is a superset of US-ASCII, concatenating the two will produce a string in UTF-8. Dropping the encoding would lose important information.

Please also note that you are actually on dangerous ground here. The above only works because the string doesn't contain any non-ASCII (high bit set) bytes. As soon as there is such a byte, there will be an error.

````
s = "abcde".b
s.encoding   # => #<Encoding:ASCII-8BIT>
s << "αβγδε" # => "abcdeαβγδε"
s.encoding   # => #<Encoding:UTF-8>
````

but:
````
t = "αβγδε".b # => "\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4\xCE\xB5"
t.encoding    # => #<Encoding:ASCII-8BIT>
t << "αβγδε"  # => Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
````

So if you have an ASCII-8BIT buffer, and want to append something, always make sure you make the appended stuff also ASCII-8BIT.
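As a minimal sketch of that advice (the variable names are only illustrative), `String#b` returns a copy of the appended data tagged as ASCII-8BIT, so the append succeeds even when both sides contain high-bit bytes:

```ruby
# Greek alpha..epsilon written with \u escapes, so the literal is UTF-8
# regardless of the source file's encoding.
greek = "\u03B1\u03B2\u03B3\u03B4\u03B5"

buffer = greek.b   # binary buffer that already holds high-bit bytes
chunk  = greek     # UTF-8 data that should be appended

# Appending chunk directly would raise Encoding::CompatibilityError.
# Converting it to a binary copy first keeps both sides ASCII-8BIT.
buffer << chunk.b

buffer.encoding == Encoding::BINARY  # => true
buffer.bytesize                      # => 20
```

`chunk.force_encoding(Encoding::BINARY)` would also work, but it retags `chunk` in place instead of returning a copy.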


----------------------------------------
Feature #13626: Add String#byteslice!
https://bugs.ruby-lang.org/issues/13626#change-66884

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
A common pattern in IO buffering is to read part of a string while leaving the remainder.

~~~
# Consume only part of the read buffer:
result = @read_buffer.byteslice(0, size)
@read_buffer = @read_buffer.byteslice(size, @read_buffer.bytesize)
~~~

It would be nice if this code could be simplified to:

~~~
result = @read_buffer.byteslice!(size)
~~~

Additionally, this allows a significantly improved implementation by the interpreter.
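Until something like this exists, the intended behavior can be approximated in plain Ruby. The sketch below is only an illustration: `consume_bytes!` is a hypothetical helper name, and it assumes the proposed method would return the first `size` bytes and leave the remainder in the receiver.

```ruby
# Hypothetical helper approximating the proposed String#byteslice!.
# Returns the first `size` bytes and leaves the rest in `buffer`.
def consume_bytes!(buffer, size)
  head = buffer.byteslice(0, size)
  # byteslice returns nil when the offset is past the end of the string,
  # so fall back to an empty binary string in that case.
  rest = buffer.byteslice(size, buffer.bytesize) || "".b
  buffer.replace(rest)  # mutate in place so callers keep the same object
  head
end

read_buffer = "hello world".b
result = consume_bytes!(read_buffer, 5)
result       # => "hello"
read_buffer  # => " world"
```

This still copies the remainder on every call; a native implementation could presumably avoid that copy, which is the improvement the request refers to.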



