Hi, I'm a REXML newbie :-), and I have a comment.

Sean Russell <ser / germane-software.com> wrote:
> 1) REXML hasn't been handling entities in parsed documents very well.  This 
> has been fixed, but I'm wondering if REXML's behavior is confusing.  REXML 
> inherited Electric XML's behavior of converting &, <, and > to entities on 
> write.  This is convenient, but it may be confusing for users to know when 
> they have to quote their own entities, and when to leave them alone.  1.1a3 
> fixes this to a certain extent; REXML ignores entities in text, but it 
> still converts &, <, and >.  It also reverse-converts &amp;, &lt;, and &gt; 
> back to characters on a read.  &#xxx; entities are now correctly handled.  
> I'm accepting opinions about this matter.  I'd like entity handling to be 
> fairly painless, but I don't want ambiguous behavior if I can avoid it.

The XML 1.0 Recommendation defines five predefined general entities,
for the characters "&", "<", ">", "'" and '"'.
(cf. 4.6 Predefined Entity
     http://www.w3.org/TR/REC-xml#sec-predefined-ent )

Should REXML also convert "'" and '"' into &apos; and &quot; on write?
At the very least, the following behavior:

  foo_str = ""
  foo = REXML::Element.new("foo")
  foo.attributes["bar"] = "aaa'bbb\"ccc"
  foo.write(foo_str)
  p foo_str         #=> "<foo bar='aaa'bbb\"ccc'/>"

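is odd: the unescaped ' inside the value ends the single-quoted
attribute early, so the output is not well-formed XML.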
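
For illustration, here is a minimal sketch of what I would expect on write.
It is not REXML's code: PREDEFINED_ENTITIES, escape_attribute_value and
write_attribute are hypothetical names I made up for this mail, and the
sketch assumes the attribute is always written with single quotes, as in
the output above.

```ruby
# Hypothetical helper, not part of REXML: maps each of the five characters
# to its predefined entity reference (XML 1.0, section 4.6).
PREDEFINED_ENTITIES = {
  "&"  => "&amp;",
  "<"  => "&lt;",
  ">"  => "&gt;",
  "'"  => "&apos;",
  "\"" => "&quot;"
}.freeze

# Escape a raw attribute value in a single gsub pass, so the "&" that a
# replacement introduces is never escaped a second time.
def escape_attribute_value(value)
  value.gsub(/[&<>'"]/) { |ch| PREDEFINED_ENTITIES[ch] }
end

# Serialize one attribute with single-quote delimiters (an assumption that
# mirrors the output shown above).
def write_attribute(name, value)
  "#{name}='#{escape_attribute_value(value)}'"
end

puts "<foo #{write_attribute('bar', "aaa'bbb\"ccc")}/>"
# prints: <foo bar='aaa&apos;bbb&quot;ccc'/>
```

Strictly speaking, only the quote character used as the delimiter has to be
escaped inside an attribute value, but escaping both is harmless and keeps
the rule simple.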


Regards,

TAKAHASHI Masayoshi (maki / inac.co.jp)