--nextPart1384615.vNGKNoWs82
Content-Type: text/plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Quoth Jamal Bengeloun:
> ...
> 
> The app I am working on gets its data from different sources (yaml 
> files, dBaseIV files, MS Access files) and then produces xml files (via 
> builder).
> 
> When using print you get the original character. When using p, you get 
> the escaped equivalent.
> 
> And that's only the start of your problems! When trying to get those 
> characters into utf-8
> 
> ...
> 
> Does someone have an explanation?
> 
> Does anyone know how to get those characters into the final xml files?
> 
> Any help would be greatly appreciated.
> 
> Jamal

  In short, you're asking what the difference is between "\303\251", "é", 
and "&#130;".

  The first is a string written with octal escapes (the two bytes happen to 
be the UTF-8 encoding of 'é'). The second is also UTF-8 'é'. These two are 
the same string ("\303\251" == "é"). The last, '&#130;', is the HTML-escaped 
notation (a numeric character reference) for an 'é' (I'm trusting your email 
for the correct number here). That is, literally "&#130;" != "é", but they 
should render the same in a browser capable of displaying UTF-8.
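
  If you want to check this yourself, here is a minimal sketch, assuming 
Ruby 1.9 or later. The variable names are mine and purely illustrative. It 
compares the octal-escaped string with 'é' and builds the HTML escape from 
the character's code point rather than from its bytes:

```ruby
# "\303\251" spells out two bytes in octal. Tagging them as UTF-8 is a no-op
# when the source file is already UTF-8 (the default since Ruby 2.0).
octal   = "\303\251".dup.force_encoding(Encoding::UTF_8)
literal = "\u00E9"                      # é, written as its code point U+00E9

same_string = (octal == literal)        # same bytes, same encoding
utf8_bytes  = literal.bytes             # the two bytes 0xC3 0xA9 (0303 0251)

# A numeric character reference names the code point, not the UTF-8 bytes.
char_ref = format("&#%d;", literal.ord)

puts same_string   # true
puts char_ref      # &#233;
```

  Going by the code point, the escape for 'é' comes out as 233, so it may be 
worth double-checking the number your data sources actually produce.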

HTH,
-- 
Konrad Meyer <konrad / tylerc.org> http://konrad.sobertillnoon.com/

--nextPart1384615.vNGKNoWs82
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQBHJo8DCHB0oCiR2cwRAvd4AKCbJqFvY5oJpu8E+ca0nG3l5+rvTQCeKIVc
/7R9O+FLq1w5/rG+os0R6k09
-----END PGP SIGNATURE-----

--nextPart1384615.vNGKNoWs82--