On Jan 6, 2009, at 3:20 AM, Brian Candler wrote:

> Kenneth McDonald wrote:
>> Any advice most appreciated,
>
> Use hexdump -C on the file to see what the actual byte sequences are.
> If these are single-byte characters then it's probably ISO-8859-1. If
> they are two bytes then it's probably UTF-8.

I have some code that detects valid UTF-8 data here:

http://blog.grayproductions.net/articles/the_unicode_character_set_and_encodings#comment_14649
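
For a rough idea of what such a check looks like, here is a minimal sketch. It is not the code from the link above. It assumes Ruby 1.9 or later, where strings carry an encoding, and the method name valid_utf8? is just something I made up for illustration:

```ruby
# Hypothetical helper, not the code from the linked comment.
# Assumes Ruby 1.9+ (String#force_encoding and #valid_encoding?).
def valid_utf8?(bytes)
  # dup first so the caller's string keeps its original encoding tag
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# "café" with a single-byte e-acute (0xE9), as ISO-8859-1 would store it
latin1 = [0x63, 0x61, 0x66, 0xE9].pack("C*")

# "café" with the two-byte UTF-8 sequence for e-acute (0xC3 0xA9)
utf8 = [0x63, 0x61, 0x66, 0xC3, 0xA9].pack("C*")

puts valid_utf8?(latin1)
puts valid_utf8?(utf8)
```

A passing check only says the bytes are well-formed UTF-8. Plain ASCII passes too, and it is equally valid ISO-8859-1, so treat the result as a strong hint rather than proof.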

James Edward Gray II