Issue #9539 has been updated by Kouhei Sutou.

Status changed from Open to Closed
% Done changed from 0 to 100

Applied in changeset r45153.

----------
* lib/rexml/xmltokens.rb: Add missing valid non-ASCII characters
  to element name characters. REXML name tokens now exactly
  match "[5] Name" in the XML spec and "[4] NCName" in the
  Namespaces in XML spec. See the comment for details.
  [Bug #9539]  [ruby-core:60901]
  Reported by Mario Barcala. Thanks!!!

* test/rexml/xpath/test_node.rb: Add tests for the above case.
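
For illustration only, here is a minimal sketch of what matching "[5] Name" and "[4] NCName" means. The character ranges are taken from the NameStartChar and NameChar productions of XML 1.0 (Fifth Edition); the constant and method names are hypothetical and are not the ones used in lib/rexml/xmltokens.rb.

```ruby
# Ranges from XML 1.0 (Fifth Edition), production [4] NameStartChar.
# All names in this sketch are hypothetical, not REXML's own.
NAME_START_CHARS = "A-Z_a-z:" \
  "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF" \
  "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D" \
  "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF" \
  "\uF900-\uFDCF\uFDF0-\uFFFD\u{10000}-\u{EFFFF}"

# Production [4a] NameChar adds "-", ".", digits and a few more ranges.
NAME_CHARS = NAME_START_CHARS + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

# "[5] Name": one NameStartChar followed by any number of NameChar.
XML_NAME = /\A[#{NAME_START_CHARS}][#{NAME_CHARS}]*\z/

# "[4] NCName" (Namespaces in XML): a Name that contains no colon.
NCNAME_START_CHARS = NAME_START_CHARS.sub(":", "")
NCNAME_CHARS = NAME_CHARS.sub(":", "")
XML_NCNAME = /\A[#{NCNAME_START_CHARS}][#{NCNAME_CHARS}]*\z/

def xml_name?(string)
  !(XML_NAME =~ string).nil?
end

def xml_ncname?(string)
  !(XML_NCNAME =~ string).nil?
end
```

With these definitions, accented names such as "título" are accepted as both Name and NCName, while a prefixed name such as "ns:título" is a Name but not an NCName.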

----------------------------------------
Bug #9539: REXML XPath UTF8 encoding problem
https://bugs.ruby-lang.org/issues/9539#change-45434

* Author: Mario Barcala
* Status: Closed
* Priority: Normal
* Assignee: 
* Category: 
* Target version: 
* ruby -v: ruby 2.1.0
* Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
I found some problems in REXML when processing XPath expressions that contain non-ASCII Unicode characters. I have attached a sample script and a sample document. The script output shows two different problems:

1) The text() XPath function does not work properly when an accented character or a tilde is present.

2) Two different XPath paths, one with an accent and the other without, are treated as the same.
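
The two problems can be sketched roughly as follows. The attached sample.rb and sample.xml are not reproduced here, so the document and element names below are a hypothetical stand-in, not the original files.

```ruby
require "rexml/document"

# Hypothetical stand-in for sample.xml: two sibling elements whose
# names differ only by an accent.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <raíz>
    <título>Canción</título>
    <titulo>Cancion</titulo>
  </raíz>
XML

doc = REXML::Document.new(xml)

# Problem 1: text() on a path that contains accented element names.
accented = REXML::XPath.match(doc, "/raíz/título/text()").map(&:value)

# Problem 2: the unaccented path should select a different node.
plain = REXML::XPath.match(doc, "/raíz/titulo/text()").map(&:value)

puts accented.inspect
puts plain.inspect
```

On a REXML version that handles non-ASCII name characters correctly, the two queries should return different text nodes.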

Thank you,

  Mario Barcala

---Files--------------------------------
sample.rb (366 Bytes)
sample.xml (224 Bytes)


-- 
http://bugs.ruby-lang.org/