Bob Hutchison wrote:

>>IMHO, default encoding of XML parser in Ruby should be UTF-8.
>>Because XML is in Unicode world, not ISO-8859-* nor EUC world
>>(unfortunately for me). And Ruby's regex doesn't support
>>UTF-16.
>>So, if the parser support only one encoding, it should be UTF-8,
>>and documents in other encoding should be converted to UTF-8.
>>
>>Is it good solution?
>>
>
> No I don't think so. How you represent the character stream internally is
> entirely up to you (immediate *internal* conversion to UTF-8 by your parser
> is OK). Restricting input to UTF-8 will place an impossible to live with
> constraint on the use of your parser. Presumably having an XML parser is to
> allow ruby programs to participate in a larger context -- and this larger
> context isn't going to provide encoding conversions.

http://www.w3.org/TR/REC-xml.html :
http://www.w3.org/TR/REC-xml.html#charencoding :
"All XML processors must be able to read entities in both the UTF-8 and
UTF-16 encodings."

Tobi

--
Tobias Reif
http://www.pinkjuice.com/myDigitalProfile.xhtml

go_to('www.ruby-lang.org').get(ruby).play.create.have_fun
http://www.pinkjuice.com/ruby/
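
To make the point about internal conversion concrete: a parser can accept both required encodings on input and still work on UTF-8 only inside. The following is a minimal sketch, not code from any existing Ruby XML parser. The helper name to_internal_utf8 is hypothetical, and it assumes a Ruby with built-in string encodings (String#encode, String#force_encoding). It relies on the XML 1.0 rule that UTF-16 entities begin with a byte order mark, and it ignores encoding declarations for other encodings such as ISO-8859-* or EUC.

```ruby
# Hypothetical helper (an assumption for illustration, not an existing API):
# take the raw bytes of an XML entity and return a UTF-8 string that the
# rest of the parser can process with ordinary regexes.
def to_internal_utf8(bytes)
  # Work on a binary copy so the caller's string is left untouched.
  raw = bytes.dup.force_encoding(Encoding::BINARY)

  if raw.start_with?("\xFE\xFF".b)
    # UTF-16 big-endian BOM: drop it and transcode the rest.
    raw[2..-1].force_encoding(Encoding::UTF_16BE).encode(Encoding::UTF_8)
  elsif raw.start_with?("\xFF\xFE".b)
    # UTF-16 little-endian BOM: drop it and transcode the rest.
    raw[2..-1].force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
  else
    # No UTF-16 BOM: treat the input as UTF-8 and drop an optional UTF-8 BOM.
    # A real parser would also inspect the encoding declaration here.
    raw = raw[3..-1] if raw.start_with?("\xEF\xBB\xBF".b)
    raw.force_encoding(Encoding::UTF_8)
  end
end
```

With this shape, callers pass in whatever bytes they read from disk or the network, and only the entry point needs to know about UTF-16.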