Sorry, but I do not get it. Also, I am not sure it is related only to 
YAML.

I am working on something similar, and the only answers I can relate to 
are those for Python (such as: 
http://www.reportlab.com/i18n/python_unicode_tutorial.html). So far I 
have understood that:

é gets translated to \202
è gets translated to \212
à gets translated to \205
ç gets translated to \207
â gets translated to \203
ê gets translated to \210
î gets translated to \214
ô gets translated to \223
û gets translated to \226
ä gets translated to \204
ë gets translated to \211
ï gets translated to \213
ö gets translated to \224
ù gets translated to \227

But why?
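
One possible reading, and it is only an assumption on my part: the octal 
values above are the single-byte codes these letters have in DOS code page 
850 (called IBM850 in Ruby). A minimal sketch for a recent Ruby (1.9 or 
later, UTF-8 source file) that rebuilds the table under that assumption:

```ruby
# Assumption: the source data is stored in DOS code page 850 (IBM850).
chars = %w[é è à ç â ê î ô û ä ë ï ö ù]

# Encode each character to IBM850 and record its single byte in octal and hex.
table = chars.map do |ch|
  byte = ch.encode("IBM850").bytes.first
  [ch, format("\\%o", byte), format("\\x%02X", byte)]
end

table.each { |ch, octal, hex| puts "#{ch}  #{octal}  #{hex}" }
```

If the octal column matches the list above, the data is probably in that 
code page (or a close relative such as code page 437) rather than in UTF-8.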

The app I am working on gets its data from different sources (YAML 
files, dBase IV files, MS Access files) and then produces XML files (via 
Builder).

When using print you get the original character. When using p, you get 
the escaped equivalent.
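
A small sketch of that difference, assuming a recent Ruby (2.0 or later): 
p goes through inspect, which escapes bytes it cannot show, while print 
writes the bytes untouched and leaves the glyph to the terminal. Newer Ruby 
versions show the escape in hex (\x82) where older ones showed octal (\202); 
it is the same byte either way.

```ruby
# One byte written with an octal escape; .b marks it as raw binary data.
s = "\202".b

# inspect (used by p) escapes the byte instead of writing it out as-is.
escaped = s.inspect

puts escaped          # the escaped form, as p would show it
puts s.bytes.inspect  # the underlying byte value in decimal
```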

And that's only the start of the problems! When trying to get those 
characters into UTF-8:

é gets translated to \202 that then gets translated to ‚
è gets translated to \212 that then gets translated to Š
à gets translated to \205 that then gets translated to …
ç gets translated to \207 that then gets translated to ‡
â gets translated to \203 that then gets translated to ƒ
ê gets translated to \210 that then gets translated to ˆ
î gets translated to \214 that then gets translated to Œ
ô gets translated to \223 that then gets translated to “
û gets translated to \226 that then gets translated to –
ä gets translated to \204 that then gets translated to „
ë gets translated to \211 that then gets translated to ‰
ï gets translated to \213 that then gets translated to ‹
ö gets translated to \224 that then gets translated to ”
ù gets translated to \227 that then gets translated to —
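
The last column looks like what you would get if each byte were read as 
Windows-1252 instead of the encoding it was written in. That is a guess 
about my data, not something I have confirmed. A minimal sketch, assuming 
the input really is IBM850 and a recent Ruby (2.0 or later):

```ruby
# Assumption: the raw input byte is IBM850 (CP850), where octal 202 is "é".
raw = "\202".b

# Wrong guess: reading the byte as Windows-1252 yields a low quotation mark.
wrong = raw.dup.force_encoding("Windows-1252").encode("UTF-8")

# Matching guess: reading it as IBM850 recovers the original letter.
right = raw.dup.force_encoding("IBM850").encode("UTF-8")

puts "as Windows-1252: #{wrong}"
puts "as IBM850:       #{right}"
```

If this holds for the real data, tagging each source string with its actual 
encoding before converting to UTF-8 should give clean text to pass on to the 
XML output.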

Does someone have an explanation?

Does anyone know how to get those characters into the final XML files?

Any help would be greatly appreciated.

Jamal

Luis Parravicini wrote:
> On 10/23/07, h3raLd <h3rald / gmail.com> wrote:
>> => "test \225\227\212"
> \225\227\212 is the same as \x95\x97\x8A, the former in octal, and the
> latter in hex.
> 
> irb(main):002:0> 0x95.to_s(8)
> => "225"
> irb(main):003:0> 0x97.to_s(8)
> => "227"
> irb(main):004:0> 0x8a.to_s(8)
> => "212"
> 
> 
> Bye

-- 
Posted via http://www.ruby-forum.com/.