David Alan Black wrote:

AFAIK:


> (Do you mean entities *not* resolved? 


No. Entities should get resolved. :)

The parser should resolve entities. The writer just serializes the 
result out.

So
input &lt;
parsed and serialized could be &#60;

If the writer gets fed a string directly, it could take care that it 
gets escaped to prevent well-formedness violations.
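
To make the escaping step concrete, here is a minimal sketch of what a writer could do with a raw string before serializing it. The name escape_text and the numeric: option are made up for illustration; this is not REXML's API, just plain Ruby string substitution.

```ruby
# Hypothetical helper (not part of REXML): escape the characters that
# would break well-formedness if written literally into element content.
NAMED_ESCAPES   = { "&" => "&amp;", "<" => "&lt;",  ">" => "&gt;"  }.freeze
NUMERIC_ESCAPES = { "&" => "&#38;", "<" => "&#60;", ">" => "&#62;" }.freeze

def escape_text(str, numeric: false)
  table = numeric ? NUMERIC_ESCAPES : NAMED_ESCAPES
  # A single pass over the string, so the "&" of a freshly inserted
  # reference is never escaped a second time.
  str.gsub(/[&<>]/) { |char| table[char] }
end

puts escape_text("1 < 2 & 3 > 2")
puts escape_text("1 < 2 & 3 > 2", numeric: true)
```

Either table gives well-formed output; whether the writer prefers named entities or NCRs is a serialization choice.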

> So that &gt; in the input is
> still &gt; in the output?)


No. It could be output as the Numerical Character Reference &#62;.


> I now see that the write method of rexml is, ummmm, an XML writing
> method.  Whereas retrieving some_element.text *does* do (some, though
> not all) built-in XML substitutions.
> 
> So my main concern -- namely, that the parsed text have its entities
> resolved -- is addressed, or at least well on its way to being
> addressed :-)


oh yes, entities should get resolved by the parser. :)


>>Absolutely. So what about replacing < with the NCRs &#60; or &#x3C; ?
>>http://www.w3.org/TR/REC-xml.html#sec-predefined-ent
>>
> 
> Entity substitution is recursive, so those get resolved too.


But they are NCRs (Numerical Character References), not named entities.
NCRs don't have to get resolved to the corresponding characters.
&#233; (the NCR for é) in the input can be written by the serializer/XMLwriter as &#233;

&lt; is a (predefined named) entity. The parser can resolve it by 
substituting its replacement text <; the serializer then writes it out 
escaped again, for example as &lt; or &#60;
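
As a rough illustration of the parser side, the sketch below turns decimal NCRs, hexadecimal NCRs and the five predefined named entities into characters. The method name resolve_references is hypothetical and this is not how REXML does it internally; it only shows that the different spellings end up as the same character once resolved.

```ruby
# Hypothetical resolver, for illustration only (not REXML's implementation).
PREDEFINED = {
  "lt" => "<", "gt" => ">", "amp" => "&", "apos" => "'", "quot" => '"'
}.freeze

REFERENCE = /&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|([A-Za-z]+));/

def resolve_references(text)
  text.gsub(REFERENCE) do
    whole = Regexp.last_match(0)
    hex   = Regexp.last_match(1)
    dec   = Regexp.last_match(2)
    name  = Regexp.last_match(3)
    if hex
      [hex.to_i(16)].pack("U")      # hexadecimal NCR, e.g. &#x3C;
    elsif dec
      [dec.to_i].pack("U")          # decimal NCR, e.g. &#60;
    else
      PREDEFINED.fetch(name, whole) # unknown names are left untouched
    end
  end
end

puts resolve_references("&lt; &#60; &#x3C;")
```

After this step the information about which spelling the input used is gone, which is why the serializer is free to pick its own when writing the text back out.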

Tobi




-- 
Tobias Reif
http://www.pinkjuice.com/myDigitalProfile.xhtml

go_to('www.ruby-lang.org').get(ruby).play.create.have_fun
http://www.pinkjuice.com/ruby/