--------------enigD4B9C75A9F44A5850B75F727
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

ciapecki wrote:
> Paul Lutus wrote:
> 
>> ciapecki wrote:
>>
>>> Another follow-up question.
>>> Is there a way to find out which encoding a file is in (ucs-2le or
>>> utf-8)?
>>> When I open a file in Vim I can check it with :set fileencoding,
>>> so there must be some way to recognize a file's encoding.
>> The fact that you can choose a particular encoding doesn't mean that
>> encoding is innate to the file. In the case of a unicode text file without
>> an identifying header, strictly speaking it is not possible to determine
>> the encoding -- I mean, apart from a human being using common sense and
>> text recognition.
>>
>> --
>> Paul Lutus
>> http://www.arachnoid.com
> 
> Hi Paul,
> 
> in Vim, :set fileencoding does not only set fileencoding; it also
> shows the current fileencoding when run the way I wrote it.
> So when I open a utf-8 file and enter :set fileencoding I get utf8,
> and when I open a ucs-2le file I get ucs-2le. I do not know how it
> recognizes them,
> but the same thing happens (though not always) in Microsoft Notepad. When
> you open a file that is in UTF-8, Notepad marks UTF-8 as the encoding; when
> the file is ucs-2le, it marks Unicode as the encoding.
> So there must be something characteristic in those files.
> 
> chris
> 
> 

Byte order marks[1]? They're a hack of sorts that you can abuse to
indicate "This file is in Unicode encoding $FOO" in a text-file context.
However, they're a form of in-band signalling, and therefore a potential
Bad Thing depending on what the data will be passing through.
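
To make that concrete, here is a rough sketch of BOM sniffing in Python
(my choice of language, nothing in this thread depends on it). The names
sniff_bom and sniff_file are made up for illustration, and I'm assuming
that what Vim reports as ucs-2le is close enough to Python's "utf-16-le"
for this purpose. It only looks at the first few bytes, so a file without
a mark comes back as None, which is exactly the case Paul described.

```python
import codecs

# Longer marks go first: the UTF-32 LE mark (FF FE 00 00) begins with
# the UTF-16 LE mark (FF FE), so checking UTF-16 first would misfire.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def sniff_file(path):
    """Read just enough of the file (4 bytes) to cover the longest mark."""
    with open(path, "rb") as f:
        return sniff_bom(f.read(4))
```

Calling sniff_file("some.txt") on a file of your own would give you a
guess or None. Note the guess can still be wrong: a UTF-16 LE file whose
first real character is U+0000 looks like UTF-32 LE to this check.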

David Vallner

[1]: http://unicode.org/unicode/faq/utf_bom.html#BOM


--------------enigD4B9C75A9F44A5850B75F727--