Bob Hutchison <hutch / xampl.com> writes:

> > Alternatively, offer the application to
> > a) receive the data in an encoding of their choice, or
> 
> I think that this problem is in Ruby (i.e. the application from the parser's
> point of view) not the parser. 

So are you saying "until Ruby 'properly' supports Unicode, Ruby XML
parsers should not care about encodings"?

> > b) offer the application to receive all data in the input encoding,
> >  reporting an error when you get data that cannot be represented
> >  in the input encoding (such as character entities).
> 
> I'm not sure what you are getting at here. What errors? 

Suppose you are processing

<?xml version="1.0" encoding="iso-8859-1"?>
<foo>&#x03C0;</foo>

(which contains GREEK SMALL LETTER PI), then my option b) is to export
all data to the application in iso-8859-1. Of course, you cannot
represent the character U+03C0 in Latin-1, hence you should report an
error to the application when trying to convert the character
reference into a character.
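
As a rough sketch of the behaviour I have in mind, assuming a Ruby
runtime with built-in transcoding (String#encode and Integer#chr with
an encoding argument): the helper name expand_char_ref is hypothetical
and not part of any existing parser API.

```ruby
# Hypothetical helper: expands one numeric character reference and
# converts the result to the document's declared input encoding.
# Raises Encoding::UndefinedConversionError if the character has no
# representation in that encoding.
def expand_char_ref(ref, target_encoding)
  m = /\A&#(?:x([0-9A-Fa-f]+)|([0-9]+));\z/.match(ref)
  raise ArgumentError, "not a character reference: #{ref}" unless m

  codepoint = m[1] ? m[1].to_i(16) : m[2].to_i(10)
  char = codepoint.chr(Encoding::UTF_8) # one-character UTF-8 string
  char.encode(target_encoding)          # raises if not representable
end

# U+00E9 exists in Latin-1, so this conversion succeeds.
expand_char_ref("&#xE9;", "ISO-8859-1")

# U+03C0 does not, so the parser would report an error here.
begin
  expand_char_ref("&#x03C0;", "ISO-8859-1")
rescue Encoding::UndefinedConversionError => e
  warn "cannot represent character reference: #{e.message}"
end
```

A real parser would presumably wrap the exception in its own error
type and attach the position in the input; that part is left out here.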

> detected by what software?

By the XML parser.

> If you mean errors in the XML file then it is easy:

No, I don't.

> Do you mean output? 

No, I don't.

> If you mean in the application,

No, I don't.

I was talking about errors that occur when trying to represent all
character data in the input encoding, which may not be possible due
to character references.

Regards,
Martin