--GcuyunM1iFaMYZNm
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Everyone,

There appears to be a bug in REXML 2.7.1 external entity parsing.  The
following code throws an error in Ruby 1.8.0/REXML 2.7.1, but not in
Ruby 1.6.8/REXML 2.3.5:

----
#!/usr/bin/env ruby

require 'rexml/document'

XP = '//channel/title'

# dump versions
puts 'Ruby %s, REXML %s' % [RUBY_VERSION, REXML::Version]

# check both examples
%w{working.rss broken.rss}.each do |path|
  File.open(path) do |file|
    doc = REXML::Document.new file.readlines.join('')

    puts 'File: ' << path

    # check to make sure everything is kosher
    puts 'doc.root.class = ' << doc.root.class.to_s
    puts 'doc.root.elements.class = ' << doc.root.elements.class.to_s

    # get the title of the feed
    puts (e = doc.root.elements[XP]) ? e.class.to_s : "Couldn't find #{XP}."
  end
end
----

2.3.5 Output
============
Ruby 1.6.8, REXML 2.3.5
File: working.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
<title>Paul Duncan</title>
File: broken.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
<title>O'Reilly Network Articles</title>

2.7.1 Output
============
Ruby 1.8.0, REXML 2.7.1
File: working.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
REXML::Element
File: broken.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
/usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:83:in `internal_parse': undefined method `node_type' for #<REXML::Entity:0x4027d9d0> (NoMethodError)
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:81:in `delete_if'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:81:in `internal_parse'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:60:in `match'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:315:in `d_o_s'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:313:in `each_index'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:313:in `d_o_s'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:317:in `d_o_s'
  from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:313:in `each_index'
   ... 8 levels...
  from ./rexml_test.rb:12:in `open'
  from ./rexml_test.rb:12
  from ./rexml_test.rb:11:in `each'
  from ./rexml_test.rb:11

The files in question and additional information are available at
http://www.raggle.org/files/rexml-external_entity_bug/ .  We're
stripping external entity declarations before parsing feeds in Raggle as
an interim solution.
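
For anyone hitting the same error, the workaround can be sketched
roughly as follows.  This is a minimal sketch, not Raggle's actual
code: the helper name strip_external_entities, the regular expression,
and the sample feed are all assumptions made for illustration, and the
pattern assumes no literal '>' appears inside the quoted identifiers of
a declaration.

```ruby
require 'rexml/document'

# Hypothetical pattern (an assumption, not taken from Raggle): matches
# <!ENTITY ...> declarations that use SYSTEM or PUBLIC, including
# parameter entities declared with '%'.
EXTERNAL_ENTITY_DECL =
  /<!ENTITY\s+(?:%\s+)?[^\s>]+\s+(?:SYSTEM|PUBLIC)\b[^>]*>/m

# Hypothetical helper: remove external entity declarations from the raw
# XML text before it reaches the parser.  Internal entities are kept.
def strip_external_entities(xml)
  xml.gsub(EXTERNAL_ENTITY_DECL, '')
end

# Made-up sample feed with one external and one internal declaration.
xml = <<XML
<?xml version="1.0"?>
<!DOCTYPE rss [
  <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
  <!ENTITY local "kept">
]>
<rss version="0.91"><channel><title>Example</title></channel></rss>
XML

doc = REXML::Document.new(strip_external_entities(xml))
title = REXML::XPath.first(doc, '//channel/title')
puts title.text
```

Note that a document which actually references a stripped entity would
still need those references handled separately; the sketch only removes
the declarations.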


PS.  I attempted to use the REXML bug report page on the Germane
Software site, but it gave me the following error:

    The system encountered a fatal error
    failed to chroot(/home/jitterbug/rexml)
    The last error code was: Operation not permitted
    uid/gid=81/81

-- 
Paul Duncan <pabs / pablotron.org>        OpenPGP Key ID: 0x82C29562
http://www.pablotron.org/               http://www.paulduncan.org/

--GcuyunM1iFaMYZNm
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/WHYvzdlT34LClWIRAksWAKDHdGet3Dc3D/KN0dqnpUboGzrTYwCgjgWh
CD9WfZN4tohdbYF2yuirXnE=
=pwCD
-----END PGP SIGNATURE-----

--GcuyunM1iFaMYZNm--
