Hello --

On Thu, 15 Nov 2001, James Britt (rubydev) wrote:

> >
> > When an XML parser receives XML that is not well-formed (be it from an
> > XML document, or as a string), it must report an error.
> > AFAICS: It must not try to recover by itself, for reasons of
> > interoperability.
>
> REXML does not behave like a standard XML parser.

That will change, hopefully.  Otherwise we've just wasted a lot of
ruby-talk bandwidth :-)

> It helps (among other things) to make building XML strings easier.
> It currently will automagically escape problem characters, like '&' or '<'.
>
> Having REXML escape the quotes is consistent with current behavior.
> It is the "REXML Way", so to speak.

That's no good, though.  As Tobi said, either it's an XML parser, or
it isn't.

Actually, I'm confused by what REXML does in this area.  Given this:

  Here's an ampersand and a greater-than sign: &amp; &gt;

one would expect to end up with this, on output:

  Here's an ampersand and a greater-than sign: & >

Right now, the #write method in rexml-1.1a is producing:

  Here's an ampersand and a greater-than sign: &amp;amp; &amp;gt;

and rexml-1.1a3 is producing:

  Here's an ampersand and a greater-than sign: &amp; &gt;

(There's a "write_with_substitution" method, but that goes the other
way: it escapes XML special characters (> becomes &gt;), rather than
replacing the entities with the characters they represent.)
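
To make the two directions concrete, here's a rough sketch in plain
Ruby.  It doesn't use REXML at all: unescape_entities and
escape_specials are names I've made up for illustration, and it only
handles the five predefined XML entities (no character references, no
DTD-declared entities).

```ruby
# Hypothetical helpers, not part of REXML.

# Replace the predefined entity references with the characters they
# represent.  A single gsub pass means "&amp;lt;" becomes "&lt;", not "<".
def unescape_entities(text)
  entities = {
    "&amp;"  => "&",
    "&lt;"   => "<",
    "&gt;"   => ">",
    "&quot;" => '"',
    "&apos;" => "'"
  }
  text.gsub(/&(?:amp|lt|gt|quot|apos);/, entities)
end

# The other direction: escape XML special characters as entity references.
def escape_specials(text)
  specials = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }
  text.gsub(/[&<>]/, specials)
end

source = "Here's an ampersand and a greater-than sign: &amp; &gt;"

puts unescape_entities(source)
# Here's an ampersand and a greater-than sign: & >

puts escape_specials(source)
# Here's an ampersand and a greater-than sign: &amp;amp; &amp;gt;
```

The second result is what you get when text that is already escaped
gets escaped a second time.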


David

-- 
David Alan Black
home: dblack / candle.superlink.net
work: blackdav / shu.edu
Web:  http://pirate.shu.edu/~blackdav