Hi,

At Tue, 30 Oct 2007 15:30:30 +0900,
Martin Duerst wrote in [ruby-core:13082]:
> Please don't. If you really want, you might use \x{...} for a big-endian
> representation of the underlying byte sequence for all encodings,
> including UTF-8. This would mean e.g. the following:

In [ruby-dev:16603], Matz said that `a codepoint isn't a byte
representation but is a "number"'.

> Directly encoded string:      "中田 伸悦"
>
> Using \x for UTF-8:           "\xE4\xB8\xAD\xE7\x94\xB0 \xE4\xBC\xB8\xE6\x82\xA6"
> Using \x for Shift_JIS:       "\x92\x86\x93\x63 \x90\x4c\x89\x78"
>
> Using \x{...} for UTF-8:      "\x{E4B8AD}\x{E794B0} \x{E4BCB8}\x{E682A6}"
> Using \x{...} for Shift_JIS:  "\x{9286}\x{9363} \x{904c}\x{8978}"
>
> Using \u (currently only UTF-8): "\u4E2D\u7530 \u4F38\u60A6"
> Using \u (in the future potentially for Shift_JIS and others):
>                               "\u4E2D\u7530 \u4F38\u60A6"

Rather, I'd expect "\x{4366 4544} \x{3f2d 3159}" for both Shift_JIS and
EUC-JP, which are based on JIS X 0208, and "\x{4E2D 7530} \x{4F38 60A6}"
for UTF-8.

> As you can see, and as discussed earlier, \x{} is very shallow syntactic
> sugar, based on the actual binary representation, and therefore not really
> necessary. It is slightly more readable than a sequence of \x bytes,
> but I don't think this is so important, because I don't think it will
> be used very much (most people who use a specific legacy encoding have
> the fonts and editing tools needed).

My thought is to make \x{} different from \x: \x{} would take the
character's number in the coded character set, not its bytes. However,
it may be better to use another escape character for it, as Michal
wrote in [ruby-core:13092].

-- 
Nobu Nakada
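
The byte sequences and code values quoted in this thread can be checked with a short Ruby sketch. It assumes Ruby 1.9 or later (for String#encode and \u escapes). The names NAME, hex_bytes and jis_x_0208_codes are made up for illustration and are not part of any proposal here. The JIS X 0208 values are derived by clearing the high bit of each EUC-JP byte, which only applies to two-byte EUC-JP characters.

```ruby
# Sample string from the thread, written with \u escapes (UTF-8 string).
NAME = "\u4E2D\u7530 \u4F38\u60A6"

# Hypothetical helper: the raw bytes of a string as space-separated hex.
def hex_bytes(str)
  str.each_byte.map { |b| "%02X" % b }.join(" ")
end

# Hypothetical helper: JIS X 0208 code values for the two-byte characters,
# obtained by converting to EUC-JP and masking off the high bit of each byte.
def jis_x_0208_codes(str)
  str.encode("EUC-JP").each_char.select { |ch| ch.bytesize == 2 }.map do |ch|
    hi, lo = ch.bytes.to_a
    "%02X%02X" % [hi & 0x7F, lo & 0x7F]
  end
end

puts "UTF-8 bytes:      " + hex_bytes(NAME)
puts "Shift_JIS bytes:  " + hex_bytes(NAME.encode("Shift_JIS"))
puts "EUC-JP bytes:     " + hex_bytes(NAME.encode("EUC-JP"))
puts "Unicode numbers:  " + NAME.codepoints.to_a.map { |c| "%04X" % c }.join(" ")
puts "JIS X 0208 codes: " + jis_x_0208_codes(NAME).join(" ")
```

The first two lines of output correspond to the \x forms quoted above, while the last two correspond to the "number" reading of \x{}: one value per character, independent of how Shift_JIS or EUC-JP happen to lay out the bytes.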