Hello --

On Fri, 16 Nov 2001, Sean Russell wrote:

> Actually, I'm afraid I let this get out of hand.  Let me clarify:

Don't take all the credit :-)

> The source document must be well formed.  If an '&', '<', or '>' are
> encountered in the source document, an error is (should be) reported.
> However:
>
>   element.text = "cats & dogs"
>
> is valid; REXML will auto-convert '&' to '&amp;' on output.  With the newer
> versions (1.1a+, I believe), it also correctly processes text such as:
>
>   element.text = "cats &amp; dogs"
>
> When you write out the element, '&' will be converted, and the entity will
> be ignored.

There's a possible conceptual problem with this: namely, that when you
set the text this way, you are actually writing to the source document.
(I don't mean a file on disk; I mean modifying the document as it's
viewed by the parser.)  So one could argue that the argument to
Element#text= should follow the same rules as other input.  As you say,
the auto-escaping is just a convenience, but I'm wondering whether it
could be misleading or too convenient :-)

> When you *read* an XML source, '&amp;', '&lt;', and '&gt;' are converted to
> '&', '<', and '>' automatically for you, for convenience; all other
> entities are ignored.  Unquoted '&', '<', and '>' generate errors.

What if I've defined an entity?  (And there are the other two
built-ins, but I think you've added handling for those.)  For example:

require "rexml/document"
include REXML

doc = Document.new <<EOS
<?xml version='1.0'?>
<!DOCTYPE doc [
  <!ENTITY me "David Alan Black">
]>
<doc>
  <person>
    Hello, I am &me;
  </person>
</doc>
EOS

puts doc.root.elements["person"].text   # => Hello, I am &me;

Well, I guess that raises the whole DTD question :-)

David

-- 
David Alan Black
home: dblack / candle.superlink.net
work: blackdav / shu.edu
Web:  http://pirate.shu.edu/~blackdav