Issue #13216 has been updated by Nobuyoshi Nakada.


Shyouhei Urabe wrote:
> > $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
> > ï
> 
> This IS weird.  Smells like a bug to me.

Not a bug.

`pack("U")` packs just one code point, taken from the first element of the array (239), and U+00EF is LATIN SMALL LETTER I WITH DIAERESIS, which is exactly what is printed.

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U*")'
ï»¿id
```
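
To make the difference concrete, here is a minimal sketch comparing the `"U"`, `"U*"` and `"C*"` directives on the same byte array. The variable names are only illustrative, and stripping the BOM with `sub` is one possible approach, not something prescribed by this ticket.

```ruby
# Bytes of "\uFEFFid" in UTF-8: a 3-byte BOM followed by "id".
bytes = [0xEF, 0xBB, 0xBF, 0x69, 0x64]

# "U" consumes only the first element and treats it as a code point (U+00EF).
first_only = bytes.pack("U")

# "U*" consumes every element, still one code point per element,
# so the BOM bytes become U+00EF U+00BB U+00BF rather than U+FEFF.
per_element = bytes.pack("U*")

# "C*" writes the elements back as raw bytes; relabel the result as UTF-8.
raw = bytes.pack("C*").force_encoding(Encoding::UTF_8)

p first_only.codepoints   # [239]
p per_element.codepoints  # [239, 187, 191, 105, 100]
p raw.codepoints          # [65279, 105, 100]

# The BOM is part of the string, so the resulting symbol is not :id.
p raw.to_sym == :id       # false

# Illustrative workaround (an assumption, not from the ticket):
# drop a leading U+FEFF before converting to a symbol.
stripped = raw.sub(/\A\uFEFF/, "")
p stripped.to_sym == :id  # true
```

The symbol comparison in the report returns `false` for the same reason: the string read from STDIN still begins with U+FEFF, which is invisible when printed but remains part of the symbol name.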


----------------------------------------
Bug #13216: Possible unexpected behaviour reading string starting with a byte order mark
https://bugs.ruby-lang.org/issues/13216#change-62991

* Author: Gabriel Giordano
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Comparing symbols may behave unexpectedly when the source string starts with a byte order mark. Tested with Ruby 2.4.0.

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes'
239
187
191
105
100

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.bytes'
105
100

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym == :id' 
false

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym == :id'
true

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
ï
```
