2011/11/29 Yukihiro Matsumoto <matz / ruby-lang.org>:
> In message "Re: [ruby-core:41362] Re: Wrong encoding of Symbol"
>     on Mon, 28 Nov 2011 21:41:08 +0900, "NARUSE, Yui" <naruse / airemix.jp> writes:
>
> |> Matz wrote :
> |>> a = :foo
> |>> p a.encoding    => #<Encoding:US-ASCII>    (a)
> |
> |A symbol which consists of only US-ASCII characters is US-ASCII.
> |
> |>> b = "foo"
> |>> p b.encoding    => #<Encoding:UTF-8>    (b)
> |
> |Of course, it is UTF-8.
>
> Considering the fact that a string which consists of only US-ASCII
> characters is UTF-8, a symbol which consists of only US-ASCII being
> US-ASCII is a bit awkward for me.

If a symbol kept its source encoding even when it consists of only
ASCII characters, it would be distinct from the US-ASCII one.
That means the following code would raise NoMethodError, because a
UTF-8 :length doesn't exist.

  # coding: utf-8
  "foo".__send__(:length)

> |>> c = "#{a}foo"
> |>> p c.encoding    => #<Encoding:US-ASCII>    (c)
> |
> |(str + another)'s encoding will be str's encoding when both str and
> |another consist of only ASCII characters.
> |
> |>> d = "foo#{a}"
> |>> p d.encoding    => #<Encoding:UTF-8>    (d)
> |
> |So this is "foo"'s encoding.
>
> I understand the internal.  I know the encoding of the symbol is
> US-ASCII.  But having the above awkwardness, this issue still makes me
> feel inconsistent.  A string which consists of only US-ASCII
> characters should be UTF-8, even when a US-ASCII symbol is embedded at
> the top of the string.

It doesn't depend on the symbol.
It behaves the same way strings do.

-- 
NARUSE, Yui  <naruse / airemix.jp>
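
For reference, the two rules discussed above (ASCII-only symbols are US-ASCII, and str + another takes str's encoding when both are ASCII only) can be checked with a small script. This is only a sketch: the method name encoding_report and the hash keys are made up for illustration, and explicit String#encode calls stand in for the "# coding: utf-8" magic comment so the result does not depend on the source encoding. Interpolation cases such as (c) and (d) are left out on purpose, since their result may differ between Ruby versions.

```ruby
# Hypothetical helper (not from the thread): collects the encodings
# discussed above into a hash so they can be inspected together.
def encoding_report
  # Build ASCII-only strings with explicit encodings.
  ascii = "foo".encode(Encoding::US_ASCII)
  utf8  = "foo".encode(Encoding::UTF_8)

  {
    # (a) an ASCII-only symbol reports US-ASCII
    symbol: :foo.encoding,
    # with two ASCII-only operands, String#+ keeps the receiver's encoding
    ascii_plus_utf8: (ascii + utf8).encoding,
    utf8_plus_ascii: (utf8 + ascii).encoding,
    # an ASCII-only symbol made from a UTF-8 string is the same object
    # as the literal, which is why method lookup by name keeps working
    same_symbol: "length".encode(Encoding::UTF_8).to_sym.equal?(:length),
    send_result: "foo".__send__(:length)
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value.inspect}" }
```

If ASCII-only symbols kept their source encoding instead, the same_symbol check is the part that would be expected to fail, which is the NoMethodError concern raised above.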