Simone Carletti wrote:
>
> If I'm right, both ISO-8859-1 and ISO-8859-15 belong to Latin1, so I
> can convert them in the same way using Iconv.iconv('UTF-8', 'LATIN1',
> 'a string').join.
>

You'll probably lose the € (euro) sign from ISO-8859-15 sources, as 
LATIN1 is probably equivalent to ISO-8859-1.
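
For example (a rough sketch, using Iconv as above on a literal byte
string), the euro sign only survives if you name ISO-8859-15 explicitly:

  require 'iconv'

  # 0xA4 is the euro sign in ISO-8859-15, but only the generic
  # currency sign in ISO-8859-1 / LATIN1:
  Iconv.iconv('UTF-8', 'ISO-8859-15', "\xA4").join  # => "€"
  Iconv.iconv('UTF-8', 'LATIN1',      "\xA4").join  # => "¤", not "€"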

> My goal is not to be able to detect each single charset, but to
> convert all strings from an input into UTF-8.
>

In fact... it's the same problem: if you don't know the original 
charset, you can't convert properly to UTF-8.

> In the meantime I was reading the code of rFeedParser, the Ruby
> implementation of the Python FeedParser.
> I just discovered it depends on a project called https://rubyforge.org/projects/rchardet/
>
> I gave it a look and it seems to do exactly what I was looking for.
>
> Is anyone using this library?
>

I use chardet 0.9.0. I believe they work more or less the same.

I use it as a fallback mechanism when I can't reliably get the original 
charset from feeds. Some feeds actually claim to be UTF-8 encoded but 
contain invalid code points (your database isn't happy when you try to 
feed it something like that...). This becomes a mess when you find out 
that each item in the feed may use a different charset, because people 
aggregate different sources without checking their charsets themselves...

The behavior I'm using is:
1/ Try the advertised charset with Iconv('utf-8', charset), even if
   charset =~ /^utf-?8$/i
   succeeds? -> END
   fails? (Exception) -> continue
2/ Use chardet to guess the charset,
3/ Iconv('utf-8', chardet_charset).
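
In Ruby that fallback might look roughly like this (a minimal sketch;
it assumes the rchardet gem mentioned above with its CharDet.detect
method, and the to_utf8 helper name is mine):

  require 'iconv'
  require 'rchardet'

  def to_utf8(str, advertised_charset)
    # 1/ trust the advertised charset first, even if it claims UTF-8,
    #    since declared-UTF-8 feeds can still contain invalid sequences
    begin
      return Iconv.iconv('UTF-8', advertised_charset, str).join
    rescue Iconv::Failure
      # fall through to detection
    end

    # 2/ guess the charset with chardet
    guess = CharDet.detect(str)

    # 3/ convert from the guessed charset (this may still raise if the
    #    guess is wrong)
    Iconv.iconv('UTF-8', guess['encoding'], str).join
  end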

Good luck, you're in for a lot of pain...

Lionel