On Aug 10, 2009, at 1:36 PM, Calvin Nguyen wrote:

> Question:
> Hi, our company is using Ruby 1.8.6 with Rails 2.2.2.  Does anyone
> know we can explicitly specify what encoding to use when calling

Rails pretty much assumes UTF-8 data everywhere.  The path of least
pain is definitely to try to work exclusively with UTF-8, since that's
mainly what Ruby 1.8.x can handle.

However, I believe the default encoding of a web page is ISO-8859-1,
unless you specify otherwise.  If you served up a form, a browser sent
you some data from that form, you saved it into the database, and you
never specified an encoding or tried to transcode the content, your
data is probably in ISO-8859-1.  Indeed, that seems to be the case,
from what you are showing:

> If we use the browser to submit HTTP Get requesting JSON format, save
> the file and view it in binary mode in Hexadecimal representation,
> this is what we get.  It looks like this is using extended ASCII.
>
> Bytes                          Text
> 43 61 66 E9                    C a f (should be accented e but get weird
> block unprintable character)

That's ISO-8859-1 data:

$ ruby -KU -r iconv -e 'puts Iconv.conv("UTF-8", "ISO-8859-1", [0x43, 0x61, 0x66, 0xE9].pack("C*"))'
Café
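
If you want to confirm that those bytes are not already UTF-8, a rough check could look like the sketch below.  The helper name valid_utf8? is made up for illustration.  The Ruby 1.8 branch assumes that unpacking with "U*" raises ArgumentError on malformed input, and the other branch only applies if you later move to Ruby 1.9 or newer.

```ruby
# Hypothetical helper: true if the bytes form well-formed UTF-8.
def valid_utf8?(bytes)
  if bytes.respond_to?(:valid_encoding?)
    # Ruby 1.9 and later: strings carry an encoding we can validate.
    bytes.dup.force_encoding("UTF-8").valid_encoding?
  else
    # Ruby 1.8.x (assumption): unpacking as UTF-8 raises on bad input.
    begin
      bytes.unpack("U*")
      true
    rescue ArgumentError
      false
    end
  end
end

latin1 = [0x43, 0x61, 0x66, 0xE9].pack("C*")        # the bytes from your dump
utf8   = [0x43, 0x61, 0x66, 0xC3, 0xA9].pack("C*")  # the same text in UTF-8

p valid_utf8?(latin1)  # => false
p valid_utf8?(utf8)    # => true
```

A lone 0xE9 announces a multi-byte UTF-8 sequence that never arrives, which is why the first check fails.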

Thus, you need to transcode it to UTF-8 before using operations like
to_json() that assume UTF-8, using the same transform I just
showed.  Even better, you could transcode the existing data in your
database to UTF-8 and then mark all pages on your site as UTF-8
encoded, possibly by adding this line to the head of your HTML:

<meta http-equiv="Content-type" content="text/html; charset=utf-8">

You may also want to instruct your web server to return a proper
Content-Type header that declares the charset as UTF-8.
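
For the transcoding step before to_json(), something along these lines might serve as a starting point.  It is only a sketch: latin1_to_utf8 is a name I made up, and it assumes the stored bytes really are ISO-8859-1.  On Ruby 1.8.x it goes through Iconv.  The String#encode branch is there only in case the code ever runs on 1.9 or later.

```ruby
# Hypothetical helper: convert ISO-8859-1 bytes to UTF-8.
def latin1_to_utf8(bytes)
  if bytes.respond_to?(:encode)
    # Ruby 1.9 and later: label the bytes as ISO-8859-1, then transcode.
    bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
  else
    # Ruby 1.8.x: strings are plain bytes, so hand them to Iconv.
    require "iconv"
    Iconv.conv("UTF-8", "ISO-8859-1", bytes)
  end
end

raw  = [0x43, 0x61, 0x66, 0xE9].pack("C*")  # "Caf" plus 0xE9, as stored
utf8 = latin1_to_utf8(raw)

p utf8.unpack("C*")  # => [67, 97, 102, 195, 169]
```

The 0xE9 byte becomes the two bytes 0xC3 0xA9, which is the UTF-8 form of the accented e, and the result should then be safe to pass to to_json().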

I hope that helps.

James Edward Gray II