On 27/06/06, Alex Young <alex / blackkettle.org> wrote:
> Is there any way to use the Iconv library to lossily convert between
> partially incompatible encodings?  In other words, if, for example, I've
> got a UTF-8 string that I need to convert down to 7-bit ASCII, and I
> don't especially care what happens to the extended characters (short of
> a single character being mapped to a single character - ideally one I
> can specify), is there any way of forcing the recode?
Yes, there is. Add //IGNORE to the destination encoding to ignore
unavailable characters, or //TRANSLIT to transliterate them into
combinations of ASCII characters (e.g. `e for è).
E.g.:
#!/usr/bin/env ruby
$KCODE = 'u'
require 'iconv'
s = 'caffè'
ic_ignore = Iconv.new('US-ASCII//IGNORE', 'UTF-8')
puts ic_ignore.iconv(s) # => caff
ic_translit = Iconv.new('US-ASCII//TRANSLIT', 'UTF-8')
puts ic_translit.iconv(s) # => caff`e
//TRANSLIT will raise an exception on characters it can't
transliterate, however; this can be solved by using
'//IGNORE//TRANSLIT' together (in that order).
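For Rubies where the iconv library is not available (it left the standard library in Ruby 2.0), the "drop it" and "map it to one character I specify" cases can probably be covered with String#encode instead. This is only a sketch, not the Iconv approach above: it assumes Ruby 1.9 or later, the variable names are made up for illustration, and it does no transliteration (that would need a hand-written :fallback table).

```ruby
# Sketch only: assumes Ruby 1.9+ and uses String#encode, not Iconv.
s = "caff\u00E8" # 'caffè' with a precomposed è, as a UTF-8 string

# Drop anything US-ASCII cannot represent (roughly what //IGNORE does).
dropped = s.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '')

# Map each unrepresentable character to a single character of your choice.
replaced = s.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')

puts dropped  # => caff
puts replaced # => caff?
```

The :replace string is where you pick the substitute character; an empty string simply removes the offending characters.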
Paul.