ciapecki wrote:
> Paul Lutus wrote:
>
>> ciapecki wrote:
>>
>>> Another follow-up question.
>>> Is there a way to find out which encoding a file is in (is it
>>> ucs-2le or utf-8)?
>>> When I open a file in VIM I can check it with :set fileencoding
>>> so there must be some way to recognize the file and its encoding.
>>
>> The fact that you can choose a particular encoding doesn't mean that
>> encoding is innate to the file. In the case of a unicode text file without
>> an identifying header, strictly speaking it is not possible to determine
>> the encoding -- I mean, apart from a human being using common sense and
>> text recognition.
>>
>> --
>> Paul Lutus
>> http://www.arachnoid.com
>
> Hi Paul,
>
> in VIM, :set fileencoding does not only set the fileencoding, it also
> shows the current fileencoding when run the way I wrote it.
> So when I open a utf-8 file and enter :set fileencoding I get utf8,
> and when I open a ucs-2le file I get ucs-2le. I do not know how it
> recognizes them,
> but the same thing happens (though not always) in Microsoft Notepad. When
> you open a file which is in UTF-8, Notepad marks UTF-8 as the encoding;
> when the file is ucs-2le, it marks Unicode as the encoding.
> So there must be something characteristic in those files.
>
> chris

Byte order marks[1]? They're a hack of sorts that you can abuse to
indicate "This file is in Unicode encoding $FOO" in a text-file context.
However, they're a form of in-band signalling, and therefore a potential
Bad Thing depending on what the data will be passing through.
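For what it's worth, here is a minimal sketch of BOM sniffing. The thread doesn't name a language, so Python is my assumption, and the names `BOMS` and `sniff_bom` are made up for illustration; only the `codecs.BOM_*` constants come from the standard library. It only reports what a leading BOM claims. A result of `None` means "no BOM", not "not Unicode", which is exactly Paul's point about files without an identifying header.

```python
import codecs

# Check longer BOMs first: the UTF-32-LE BOM (FF FE 00 00) begins with
# the same two bytes as the UTF-16-LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # what Vim calls ucs-2le, roughly
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# In-band signalling in practice: a plain "utf-8" decode keeps the BOM
# in the text as a U+FEFF character, while "utf-8-sig" strips it.
raw = codecs.BOM_UTF8 + b"hi"
plain = raw.decode("utf-8")         # first character is U+FEFF
stripped = raw.decode("utf-8-sig")  # BOM removed
```

To check a file, read its first four bytes in binary mode and pass them to `sniff_bom`. One caveat on the ordering: a UTF-16-LE file whose first character after the BOM is U+0000 would be misread as UTF-32-LE, so treat the answer as a hint rather than proof.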
David Vallner

[1]: http://unicode.org/unicode/faq/utf_bom.html#BOM