Peter Higgins wrote:
> I've written a small script to parse an xml doc with SaxParser and
> everything goes well until the parser encounters a Unicode character.
> For example, in the following snippet:
> 
> <key>Name</key><string>90’s Music</string>
> 
> In case it doesn't come through correctly, the "'" character above is an
> apostrophe, represented as <E2><80><99> when I view the xml with less.
> 
> When the on_characters method is called for the string "90's Music", the
> buffer only contains "90", with no error or warning being presented.
> After this is encountered parsing occurs normally; the first I saw of
> the bug was when I noticed some of my strings being truncated. Is there
> some setting of libxml or ruby that I've overlooked to cause this
> behavior?

As part of researching the problem, I wrote a small test script with
REXML looking for that particular string, and it returned the correct,
full string: "90’s Music". It looks like this is a bug in libxml, then,
so I'll post on their mailing list.
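
For anyone who wants to repeat the check, here is a minimal sketch of that kind of REXML test. It is not the exact script I ran: the sample document, the XML_SAMPLE constant, and the string_values helper are made up for illustration, and the apostrophe is assumed to be U+2019 (the bytes E2 80 99 in UTF-8).

```ruby
require 'rexml/document'

# Hypothetical sample document. "\u2019" is the curly apostrophe,
# which is encoded as the three bytes E2 80 99 in UTF-8.
XML_SAMPLE = "<dict><key>Name</key><string>90\u2019s Music</string></dict>"

# Illustrative helper: return the text of every <string> element.
def string_values(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//string').map(&:text)
end

values = string_values(XML_SAMPLE)

# Print each value with its character and byte counts, so a string cut
# off at the multibyte character is easy to spot.
values.each do |v|
  puts "#{v.inspect} (#{v.length} chars, #{v.bytesize} bytes)"
end
```

If the same input comes back shorter from the SAX callback than it does here, that would point at the parser binding rather than at the document.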

-- 
Posted via http://www.ruby-forum.com/.