Hi,

danielcavanagh / aanet.com.au wrote:

I don't mean to shoot you down in flames, but a lot of thought and effort has gone into Ruby's encoding support. Ruby could have followed the Python route of converting everything to Unicode, but that was rejected for various good reasons. Automatic transcoding to resolve incompatible encodings was also rejected because it causes a number of problems. In particular, I believe that transcoding isn't necessarily accurate because, for example, there may be multiple or ambiguous representations of the same character.

What *was* introduced is the concept of a "default_internal" encoding which, if set by the programmer, causes I/O and other interfaces to transcode to the internal encoding on input and back to the external encoding on output. Typically the default_internal encoding, if used, is UTF-8, and in this case the programmer would have to accept that, on doing I/O to a file in a different encoding, the transcoding *may* cause data loss.

> we first add a function
> to do actual conversions between two encodings based on character, not
> just reinterpreting the byte values. so c in latin-1 (0x63) would become c
> in utf-32 (0x00000063).

String#encode does this, I believe.

> it could have lists of which encodings are
> supersets of other encodings

Unfortunately it turns out that the only encoding that we can reliably state is a subset of other encodings is US-ASCII (and then only of the ASCII-compatible ones), and Ruby knows about this and optimizes for it.

Cheers

Mike
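
For reference, here is a minimal sketch of the points above, assuming Ruby 1.9 or later with UTF-8 as the source encoding. The variable names are illustrative only and do not come from the thread. It contrasts String#encode (which converts by character) with force_encoding (which only relabels the bytes), shows the kind of data loss transcoding can cause, and shows how an ASCII-only string mixes freely with an ASCII-compatible encoding.

```ruby
require "tempfile"

# "c" in Latin-1 is the single byte 0x63.
latin1 = [0x63].pack("C").force_encoding(Encoding::ISO_8859_1)

# String#encode transcodes by character: same character, new bytes.
utf32 = latin1.encode(Encoding::UTF_32BE)
p utf32.bytes        # [0, 0, 0, 99]

# force_encoding only changes the label; the bytes are untouched.
relabeled = latin1.dup.force_encoding(Encoding::UTF_32BE)
p relabeled.bytes    # [99]

# Transcoding can lose data: Latin-1 has no euro sign.
raised = false
begin
  "\u20AC".encode(Encoding::ISO_8859_1)
rescue Encoding::UndefinedConversionError
  raised = true
end
# Opting in to replacement silently substitutes a placeholder character.
lossy = "\u20AC".encode(Encoding::ISO_8859_1, undef: :replace)
p lossy.bytes        # [63], i.e. "?"

# An ASCII-only string can be combined with an ASCII-compatible encoding.
ascii = "abc".encode(Encoding::US_ASCII)
mixed = ascii + "\u00E9"
p mixed.encoding     # #<Encoding:UTF-8>
p Encoding::UTF_16LE.ascii_compatible?   # false

# Per-stream equivalent of default_internal: read Latin-1, get UTF-8.
read_back = nil
Tempfile.create("latin1") do |f|
  f.binmode
  f.write([0x63, 0xE9].pack("C*"))   # "c" followed by e-acute in Latin-1
  f.flush
  read_back = File.read(f.path, encoding: "ISO-8859-1:UTF-8")
end
p read_back.encoding # #<Encoding:UTF-8>
```

The "external:internal" form passed to File.read does for one stream what setting default_internal does globally, so it is a convenient way to try the behaviour without changing process-wide settings.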