2011/11/29 Yukihiro Matsumoto <matz / ruby-lang.org>:
> In message "Re: [ruby-core:41362] Re: Wrong encoding of Symbol"
>    ﲰ   Ŭ  >
> |> Matz wrote:
> |>> a = :foo
> |>> p a.encoding  => #<Encoding:US-ASCII> (a)
> |
> |A symbol which consists of only US-ASCII characters is US-ASCII.
> |
> |>> b = "foo"
> |>> p b.encoding  => #<Encoding:UTF-8> (b)
> |
> |Of course, it is UTF-8.
>
> Considering the fact that a string which consists of only US-ASCII
> characters is UTF-8, a symbol which consists of only US-ASCII being
> US-ASCII is a bit awkward for me.
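
Concretely, the asymmetry above (a minimal sketch of the current behavior,
assuming a UTF-8 source file; output in comments):

# coding: utf-8
p "foo".encoding  # => #<Encoding:UTF-8>    ASCII-only String keeps the source encoding
p :foo.encoding   # => #<Encoding:US-ASCII> ASCII-only Symbol is unified to US-ASCII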

If a symbol kept its source encoding even when it consists of only ASCII
characters, that symbol would be distinct from the US-ASCII one.
That means the following code would raise NoMethodError, because a UTF-8
:length doesn't exist:

# coding: utf-8
"foo".__send__(:length)

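For comparison, a minimal sketch of what actually happens with the current
behavior, where ASCII-only symbols are unified to US-ASCII and the lookup
succeeds (assuming a UTF-8 source file; output in comments):

# coding: utf-8
p :length.encoding        # => #<Encoding:US-ASCII>  unified, not UTF-8
p "foo".__send__(:length) # => 3  the same symbol String#length is registered under
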
> |>> c = "#{a}foo"
> |>> p c.encoding  => #<Encoding:US-ASCII> (c)
> |
> |(str + another)'s encoding will be str's encoding when both str and
> |another consist of only ASCII characters.
> |
> |>> d = "foo#{a}"
> |>> p d.encoding  => #<Encoding:UTF-8> (d)
> |
> |So this is "foo"'s encoding.
>
> I understand the internal reason.  But c consists of only ASCII
> characters, yet it becomes US-ASCII, so (c) and (d) feel
> inconsistent.  I think strings that consist of only ASCII
> characters should be UTF-8, even when a US-ASCII symbol is embedded at
> the top of the string.

This doesn't depend on symbols; the result is the same when a US-ASCII
String is interpolated instead.
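
A minimal sketch of that point (the encodings are those described in this
thread; `s` is just an illustrative US-ASCII String standing in for the Symbol):

# coding: utf-8
s = "foo".encode(Encoding::US_ASCII)  # US-ASCII String instead of the Symbol a
p "#{s}foo".encoding  # => #<Encoding:US-ASCII>  like (c)
p "foo#{s}".encoding  # => #<Encoding:UTF-8>     like (d)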

-- 
NARUSE, Yui <naruse / airemix.jp>