--nextPart1384615.vNGKNoWs82
Content-Type: text/plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Quoth Jamal Bengeloun:
> ...
>
> The app I am working on gets its data from different sources (yaml
> files, dBaseIV files, MS Access files) and then produces xml files (via
> builder).
>
> When using print you get the original character. When using p, you get
> the escaped equivalent.
>
> And that's only the start of your problems! When trying to get those
> characters into utf-8
>
> ...
>
> Does someone have an explanation?
>
> Does anyone know how to get those characters into the final xml files?
>
> Any help would be greatly appreciated.
>
> Jamal

  In short, you're asking what the difference is between "\303\251",
"é", and "&#130;".

  The first is an octal escape sequence embedded in a string (it happens
to be the same bytes as utf-8 'é'). The second is also utf-8 'é'. These
two are the same string ("\303\251" == "é"). The last, '&#130;', is the
html-escaped notation for an 'é' (I'm trusting your email for the
correct number here). That is, literally "&#130;" != "é", but they
should render the same in a browser capable of displaying utf-8.
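
  If it helps to see this in irb, here is a rough sketch rather than
anything definitive. The constant names are made up for the example, and
I'm assuming Ruby 2.0 or later with the source saved as utf-8. It checks
that the octal form and the typed character are the same bytes, decodes
those bytes to a code point, and shows the escaped form that p prints
for the raw bytes.

```ruby
# A sketch only; the constant names are made up for this example.
# Assumes Ruby 2.0 or later and a source file saved as UTF-8.

OCTAL   = "\303\251"  # the two bytes 0xC3 0xA9, written as octal escapes
LITERAL = "é"         # the same character typed directly

# Compare byte-for-byte, which sidesteps any encoding-tag differences.
SAME_BYTES = OCTAL.bytes == LITERAL.bytes

# Decode the UTF-8 bytes into Unicode code points and take the first.
CODEPOINT = OCTAL.unpack("U*").first

# Build a decimal numeric character reference from that code point.
ENTITY = format("&#%d;", CODEPOINT)

# The escaped form of the raw bytes (String#b returns a binary copy).
ESCAPED = LITERAL.b.inspect

puts SAME_BYTES   # true
puts CODEPOINT    # 233
puts ENTITY       # &#233;
puts ESCAPED      # "\xC3\xA9"
```

  Note that the code point comes out as 233 here, so it is worth
checking the number in your mail against the encoding your data really
arrived in before it goes through builder.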

HTH,
-- 
Konrad Meyer <konrad / tylerc.org> http://konrad.sobertillnoon.com/

--nextPart1384615.vNGKNoWs82--