Quoth Jamal Bengeloun:
> ...
>
> The app I am working on gets its data from different sources (yaml
> files, dBaseIV files, MS Access files) and then produces xml files (via
> builder).
>
> When using print you get the original character. When using p, you get
> the escaped equivalent.
>
> And that's only the start of your problems! When trying to get those
> characters into utf-8
>
> ...
>
> Does someone have an explanation?
>
> Does anyone know how to get those characters into the final xml files?
>
> Any help would be greatly appreciated.
>
> Jamal

In short, you're asking what the difference is between "\303\251", "é",
and "&#130;".

The first is a pair of octal byte escapes embedded in a string (they
happen to be the two bytes of UTF-8 'é'). The second is also UTF-8 'é'.
These two are the same string ("\303\251" == "é").

The last, '&#130;', is the HTML-escaped notation for an 'é' (I'm trusting
your email for the correct number here). That is, literally
"&#130;" != "é", but they should render the same in a browser capable of
displaying UTF-8.

HTH,
-- 
Konrad Meyer <konrad / tylerc.org> http://konrad.sobertillnoon.com/
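
The three notations discussed in the reply can be compared directly in Ruby. This is only a rough sketch, assuming the script is saved as UTF-8. The variable names are made up for illustration and are not from the original thread. It also shows one way to derive the number for the HTML notation from the character itself, rather than trusting a number copied from elsewhere.

```ruby
# encoding: utf-8
# Sketch only; assumes this file is saved as UTF-8.
# All variable names here are hypothetical.

octal   = "\303\251"  # 'é' spelled as two octal byte escapes
literal = "é"         # 'é' typed directly

same_string = (octal == literal)          # same bytes, so the same string
utf8_bytes  = literal.unpack("C*")        # the UTF-8 bytes as integers
code_point  = literal.unpack("U*").first  # the Unicode code point of 'é'

# Build an HTML numeric character reference from the code point.
# It is a separate, ASCII-only string; only a browser turns it into 'é'.
reference = format("&#%d;", code_point)

puts same_string           # true
puts utf8_bytes.inspect    # [195, 169]
puts reference             # &#233;
puts reference == literal  # false
```

The octal escapes 303 and 251 are simply the byte values 195 and 169 written in base 8, which is why the first two strings compare equal while the reference does not.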