The XML spec says, 
"All XML processors must accept the UTF-8 and UTF-16 encodings of [ISO/IEC] 10646 [...]" 
http://www.w3.org/TR/2000/REC-xml-20001006#charsets

So a conforming parser has to accept both encodings on input. Ideally, parser
users should be able to code as if all data were UTF-16, even if the actual
representation inside the parser is UTF-8.
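
As a rough illustration of accepting both encodings while keeping one internal
representation, here is a minimal sketch. The name to_internal_utf8 is
hypothetical and not part of any existing parser. It assumes a Ruby with
String#encode and the Encoding constants, looks only at the byte order mark,
and ignores the encoding declaration that a real XML processor would also have
to honour. Input without a BOM is assumed to be UTF-8.

```ruby
# Hypothetical helper: normalise raw document bytes to UTF-8 for internal use.
# Only the byte order mark is inspected; the XML encoding declaration is not.
def to_internal_utf8(bytes)
  raw = bytes.dup.force_encoding(Encoding::BINARY)

  source, skip =
    if raw.start_with?("\xEF\xBB\xBF".b) then [Encoding::UTF_8, 3]
    elsif raw.start_with?("\xFE\xFF".b)  then [Encoding::UTF_16BE, 2]
    elsif raw.start_with?("\xFF\xFE".b)  then [Encoding::UTF_16LE, 2]
    else [Encoding::UTF_8, 0] # assumption: no BOM means UTF-8
    end

  # Drop the BOM, label the remaining bytes, then transcode to UTF-8.
  raw.byteslice(skip..-1).force_encoding(source).encode(Encoding::UTF_8)
end

# The same document in UTF-16LE and in UTF-8 ends up identical internally.
utf16_doc = "\uFEFF<a>\u00E9</a>".encode(Encoding::UTF_16LE)
utf8_doc  = "<a>\u00E9</a>"
same = to_internal_utf8(utf16_doc) == to_internal_utf8(utf8_doc)
```

With something along these lines at the input boundary, the rest of the parser
only ever sees one encoding, and callers never have to convert anything
themselves.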

James

> On 01/11/11 2:20 AM, "TAKAHASHI Masayoshi" <maki / open-news.com> wrote:
> >
> > <snip />
> >
> > Is it a good solution?
> 
> No, I don't think so. How you represent the character stream internally is
> entirely up to you (immediate *internal* conversion to UTF-8 by your parser
> is OK). But restricting input to UTF-8 would place an unworkable constraint
> on the use of your parser. Presumably the point of having an XML parser is
> to allow Ruby programs to participate in a larger context -- and this larger
> context isn't going to provide encoding conversions.
> 
> Bob
>