On Mon, Feb 25, 2008 at 8:25 AM, Simone Carletti <weppos / gmail.com> wrote:
>  I ran a deep search through this group and other resources online but
>  I have been unable to find out whether there is a way to guess the
>  charset of a string in Ruby 1.8.6.
>
>  I need to ensure a string is always UTF-8 encoded but Iconv requires
>  the developer to specify both in and out charset.
>  On the other hand, Kconv provides a #guess() method but doesn't
>  support Latin or Western encodings.
>
>  Any suggestion?

Kconv can guess because the encodings it handles, those for the set
of Asian written languages, have distinctive byte patterns (they don't
share much with the Latin character sets). What you want is nearly
impossible without a large body of text to analyse, and even then the
best commercial programs are only taking stabs at probabilities.
(Here's an example: how do you tell the difference between ISO-8859-1
and ISO-8859-15 programmatically? The two differ in only eight code
points: -15 adds the Euro symbol and a handful of extra letters in
place of symbols from -1. Any byte string decodes cleanly as either
one, so nothing in the bytes themselves says which was meant.)
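
To make the ISO-8859-1 versus ISO-8859-15 point concrete, here is a
minimal sketch. It assumes Ruby 1.9 or later, where String#encode
exists; on 1.8.6 the same conversion would go through Iconv. The
helper name decode_byte is hypothetical, made up for this example.

```ruby
# Hypothetical helper: interpret one raw byte as text in +charset+ and
# return the result as UTF-8. Assumes Ruby 1.9+ (String#encode).
def decode_byte(byte_value, charset)
  [byte_value].pack("C").force_encoding(charset).encode("UTF-8")
end

latin1 = decode_byte(0xA4, "ISO-8859-1")   # U+00A4 CURRENCY SIGN
latin9 = decode_byte(0xA4, "ISO-8859-15")  # U+20AC EURO SIGN

# Both conversions succeed without error, so the byte alone cannot
# tell you which charset the writer had in mind.
puts latin1 == "\u00A4"
puts latin9 == "\u20AC"
```

The same byte yields two different, equally legal characters, which
is why a guesser can only rank candidates by how plausible the
resulting text looks.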

You're better off seeking a slightly different approach.
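
One possible shape for such an approach, offered as an assumption
rather than anything prescribed above: keep input that is already
valid UTF-8, and convert everything else from a source charset that
the caller declares instead of guessing. The sketch assumes Ruby 1.9
or later (on 1.8.6 Iconv would do the conversion), and the name
ensure_utf8 and its ISO-8859-1 default are hypothetical.

```ruby
# Hypothetical helper: return +bytes+ as a UTF-8 string.
# Input that is already valid UTF-8 is kept unchanged; anything else
# is decoded from +fallback+, which the caller declares up front.
# Assumes Ruby 1.9+ (force_encoding, valid_encoding?, encode).
def ensure_utf8(bytes, fallback = "ISO-8859-1")
  candidate = bytes.dup.force_encoding("UTF-8")
  return candidate if candidate.valid_encoding?

  bytes.dup.force_encoding(fallback).encode("UTF-8")
end

utf8_input   = [0x63, 0x61, 0x66, 0xC3, 0xA9].pack("C*")  # "cafe" with e-acute, UTF-8
latin1_input = [0x63, 0x61, 0x66, 0xE9].pack("C*")        # same word, ISO-8859-1

puts ensure_utf8(utf8_input) == ensure_utf8(latin1_input)
```

The validity test is a heuristic, not a detector: single-byte Latin
text containing non-ASCII characters is rarely valid UTF-8 by
accident, but the fallback charset still has to come from somewhere
you trust, such as an HTTP header or a configuration setting.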

-austin
-- 
Austin Ziegler * halostatue / gmail.com * http://www.halostatue.ca/
               * austin / halostatue.ca * http://www.halostatue.ca/feed/
               * austin / zieglers.ca