[Tobes <tobin / tobinharris.com>, 2007-01-04 16.55 CET]
> Thanks for the links and the advice Carlos.
> 
> I'm actually using Ruby FPDF (http://zeropluszero.com/software/fpdf/),
> and couldn't see a dependency on PDF::Writer. However, using iconv to
> convert to UTF-16 gives a different result
> http://www.tobinharris.com/media/mtq38_utf16.jpg.
> 
> Do you know of any tools that will let me reliably inspect the data in
> the database to see what encoding the information is being stored in?
> MySQL was set up to store UTF-8, and since the text data is sent from a
> UTF-8 formatted web page, I assumed this would be the case. However,
> I'm thinking that it wasn't UTF-8 at all, and so I need to know what the
> original encoding is.
> 
> I'm also definitely lacking some knowledge in this area, so any
> pointers to resources/tools would be appreciated.

Hi. I assumed you were using "railspdfplugin"
  http://rubyforge.org/projects/railspdfplugin/

which is the first Google result for RPDF, and depends on PDF::Writer.

I can't access the Ruby FPDF page right now ("502 Bad Gateway" error
message), but if it is based on PHP's FPDF, then you just have to follow the
steps here:
  http://www.fpdf.org/en/tutorial/tuto7.htm

(extrapolated to Ruby's FPDF, of course).

WRT the screenshot of your other message, there are two possibilities:

1. That application, the MySQL query tool, is not UTF-8 aware. So it
interprets the 2 bytes of "ł" (197, 130) as 2 characters in some single-byte
encoding (probably latin-1), which gives "Å" and an unprintable character.
In that case, your test line wasn't UTF-8 encoded at all.

2. The application is UTF-8 aware, the test line is in UTF-8, but the data
from your web pages was already in UTF-8 and you thought it wasn't and
encoded it again to UTF-8.
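
To make the first case concrete, here is a minimal sketch of how the same two
bytes read under each interpretation. It assumes a current Ruby (1.9 or later),
where strings carry an encoding; the variable names are only for illustration.

```ruby
# The two bytes from the screenshot discussion.
bytes = [197, 130]

# Read as UTF-8, they form a single character: U+0142 (l with stroke).
as_utf8 = bytes.pack("C*").force_encoding("UTF-8")

# Read as latin-1, each byte is its own character: U+00C5 (A with ring)
# followed by U+0082, a control character that does not print.
as_latin1 = bytes.pack("C*").force_encoding("ISO-8859-1").encode("UTF-8")

p as_utf8.codepoints    # => [322]
p as_latin1.codepoints  # => [197, 130]
```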

To test if a string is encoded in UTF-8, just examine its bytes:
  p str.unpack("C*")

and see if the diacritic letters are encoded with 2 or more bytes (UTF-8),
or only one (iso-8859-*, cp*, etc.). (If you see *four* then you encoded
them twice :).
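
As a rough illustration of what to look for, the sketch below prints the bytes
of one word in three states: proper UTF-8, a single-byte encoding, and UTF-8
that was encoded a second time. The sample word, the choice of ISO-8859-2, and
the helper name bytes_of are my own assumptions, and it needs Ruby 1.9 or later
for the encoding methods.

```ruby
# Helper (hypothetical name): list the raw bytes of a string.
def bytes_of(str)
  str.unpack("C*")
end

utf8   = "z\u0142oty"                  # sample word, stored as UTF-8
latin2 = utf8.encode("ISO-8859-2")     # same word in a single-byte encoding

# Simulate double encoding: pretend the UTF-8 bytes are latin-1 text
# and convert that to UTF-8 once more.
double = utf8.dup.force_encoding("ISO-8859-1").encode("UTF-8")

p bytes_of(utf8)    # => [122, 197, 130, 111, 116, 121]            2 bytes for the letter
p bytes_of(latin2)  # => [122, 179, 111, 116, 121]                 1 byte
p bytes_of(double)  # => [122, 195, 133, 194, 130, 111, 116, 121]  4 bytes
```

On the strings coming out of your database you would call the same unpack on
the value as fetched, before anything else converts it.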

HTH. Good luck.
--