Hi Carlos,

Thanks v much for the advice. Thought I'd start by looking at what's already in the database using unpack.

> 1. that application, the MySQL query tool, is not UTF-8 aware. So, it
> interprets the 2 bytes of "ł" (197, 130) as 2 characters in some single-byte
> encoding (probably latin-1), which gives "Å" and an unprintable character.
> Your test line wasn't UTF-8 encoded at all.

Yeah, for another db on the server it works fine, so I'm guessing it's your 2nd option. Your explanation of the 2 bytes solves another question I had though :-)

> 2. The application is UTF-8 aware, the test line is in UTF-8, but the data
> from your web pages was already in UTF-8 and you thought it wasn't and
> encoded it again to UTF-8.
> To test if a string is encoded in UTF-8, just examine its bytes
>   p str.unpack("C*")
> and see if the diacritic letters are encoded with 2 or more bytes (UTF-8),
> or only one (iso-8859-*, cp*, etc.). (If you see *four* then you encoded
> them twice :).

Here's a test case:

On web page after being loaded from DB:
  "Wyślij" [This is correct!]

In MySQL Analyser:
  "Wylij" [bad, even though MySQL analyser is UTF-8]

In Interactive Ruby (IRB) printed to console, after loading from DB:
  "Wy┼ølij" [expected in a DOS prompt!]

In IRB unpacked, after loading from DB:
  [87, 121, 197, 155, 108, 105, 106]

So, I can see that the character "ś" must correspond to the 3rd and 4th bytes of "Wyślij".

Looking at the Ruby help, I see I can do this
  p str.unpack("U*")
to get the Unicode code points, which gives:
  [87, 121, 347, 108, 105, 106]

According to this, http://www.fileformat.info/info/unicode/char/015b/index.htm, character 347 is in fact a "ś".

This would suggest that the database has UTF-8 text, and it's getting into Ruby without corruption! Is this right?

So, the question now is why Iconv doesn't convert my UTF-8 to Latin2 correctly... That could just be because the original text can't be converted due to additional characters outside of the Latin2 set.

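For what it's worth, the checks above can be sketched roughly like this. It's only a minimal sketch, not what the app actually runs: it assumes a recent Ruby (1.9 or later), where String#encode stands in for Iconv, and the helper name latin2_convertible? is made up for illustration. The bytes are the ones from the IRB output above.

```ruby
# Bytes exactly as they came back from the DB in the test case.
bytes = [87, 121, 197, 155, 108, 105, 106]
str = bytes.pack("C*").force_encoding("UTF-8")

p str.valid_encoding?   # is this a well-formed UTF-8 byte sequence?
p str.unpack("C*")      # raw bytes:   [87, 121, 197, 155, 108, 105, 106]
p str.unpack("U*")      # code points: [87, 121, 347, 108, 105, 106]

# What "encoded twice" would look like: treat the UTF-8 bytes as
# latin-1 and encode them to UTF-8 again. The two bytes of the
# diacritic letter become four.
twice = str.dup.force_encoding("ISO-8859-1").encode("UTF-8")
p twice.unpack("C*")

# Hypothetical helper: true if every character of s exists in
# Latin2 (ISO-8859-2), false if the conversion would fail.
def latin2_convertible?(s)
  s.encode("ISO-8859-2")
  true
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  false
end

p latin2_convertible?(str)        # the test string
p latin2_convertible?("\u20AC")   # euro sign, which Latin2 lacks

# The actual conversion; character 347 becomes a single Latin2 byte.
latin2 = str.encode("ISO-8859-2")
p latin2.unpack("C*")   # [87, 121, 182, 108, 105, 106]
```

If the helper returns true for the real page text, then characters outside Latin2 are probably not the problem and the cause is somewhere else in the conversion step; if it returns false, the offending characters need some explicit handling.
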
I could probably give Iconv explicit mapping codes for how to handle certain characters; that may do the trick. I'll re-read your post and see if I can find anything else.

Thanks for the help, feels like I'm a few steps forward now! If you can spot any errors in the above, a hint would be most welcome!

Tobin

> HTH. Good luck.
> --