Issue #13216 has been updated by Shyouhei Urabe.

Description updated

Hello.

Gabriel Giordano wrote:
> $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes'
> 239
> 187
> 191
> 105
> 100
> 
> $ echo -n -e 'id' | ruby -e 'puts STDIN.read.bytes'
> 105
> 100

These two are as expected, aren't they?

> $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym'
> id

I think the BOM is still there; it just does not show up in what `puts` prints.

```
% echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym.to_s.dump'
"\uFEFFid"
```

This symbol actually includes U+FEFF, which is normally invisible in the middle of a string.

> $ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym'
> id

This is OK, I believe.

> $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym == :id'
> false

Given that the symbol generated by reading stdin does contain U+FEFF, this is natural.
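
The same comparison can be sketched without the shell pipeline, assuming the input is UTF-8. The `strip_bom` helper is hypothetical (it is not part of Ruby); it just drops a leading U+FEFF before the string is turned into a symbol.

```ruby
# Bytes as they would arrive from the pipe: UTF-8 BOM followed by "id".
raw = [0xEF, 0xBB, 0xBF, 0x69, 0x64].pack("C*").force_encoding(Encoding::UTF_8)

# Hypothetical helper: remove a single leading U+FEFF, if there is one.
def strip_bom(str)
  str.sub(/\A\uFEFF/, "")
end

with_bom    = raw.to_sym
without_bom = strip_bom(raw).to_sym

puts raw.codepoints.inspect  # the first codepoint is U+FEFF
puts with_bom == :id         # false: the symbol still carries the BOM
puts without_bom == :id      # true: the BOM was removed first
```

Whether stripping the BOM is appropriate depends on where the input comes from; this only illustrates why the two symbols differ.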

> $ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym == :id'
> true

No problem here.

> $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
> 

This IS weird.  Smells like a bug to me.

----

So all but the last one are working as expected (at least it seems so to me).  The last one needs more inspection.
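
One thing that may be worth checking during that inspection: the `U` directive of `Array#pack`, when given no count, consumes a single array element and encodes it as one UTF-8 character. If that is what happens here, `pack("U")` would encode only 239 (U+00EF) and ignore the rest. A small sketch of the difference, with `bytes` standing in for `STDIN.read.bytes`:

```ruby
bytes = [239, 187, 191, 105, 100]  # same values as STDIN.read.bytes above

one_char  = bytes.pack("U")        # only the first element is used
all_chars = bytes.pack("U*")       # every element treated as a codepoint
raw       = bytes.pack("C*")       # every element treated as a raw byte
roundtrip = raw.dup.force_encoding(Encoding::UTF_8)

puts one_char.dump
puts all_chars.dump
puts roundtrip.dump
```

Under this reading, `pack("C*")` followed by `force_encoding` is the variant that reproduces the original input, BOM included.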

----------------------------------------
Bug #13216: Possible unexpected behaviour reading string starting with a byte order mark
https://bugs.ruby-lang.org/issues/13216#change-62987

* Author: Gabriel Giordano
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Maybe the comparison between symbols has unexpected behaviour. Tested with ruby 2.4.0.

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes'
239
187
191
105
100

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.bytes'
105
100

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym == :id'
false

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym == :id'
true

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'

```
