I bit the bullet and moved from Ruby 1.6.8 to Ruby CVS last week.
The new libraries are great! Today I'm playing with REXML and
open-uri. Something is b0rked, though.

I tried parsing my homepage, since it's XHTML and I don't have
loads of XML files knocking about.

open-uri loads the doc fine (*nice* lib, incidentally), but REXML
fails to parse it with a ParseException. If I remove this line:

<link rel=stylesheet href="css/9.css" type="text/css" />

from <head>, it all works OK. Anything I'm missing?
Here's the program, both inputs and the output (I cut out all the
bits that didn't seem relevant):

1rasputin@lb:xml$ cat wtf.xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>oh dear</title>
<link rel=stylesheet href="css/9.css" type="text/css" />
</head>
<body> hmm </body>
</html>

1rasputin@lb:xml$ cat poc.rb
#!/data/ruby/bin/ruby -w
require "rexml/document"
require "open-uri"
xml = open(ARGV[0])
doc = REXML::Document.new xml

1rasputin@lb:xml$ ./poc.rb wtf.xml
/data/ruby/lib/ruby/1.9/rexml/parsers/baseparser.rb:291:in `pull': Missing end tag for 'head' (got "html") (REXML::ParseException)
Line: 9
Position: 280
Last 80 unconsumed characters:
        from /data/ruby/lib/ruby/1.9/rexml/document.rb:180:in `build'
        from /data/ruby/lib/ruby/1.9/rexml/document.rb:44:in `initialize'
        from ./poc.rb:6:in `new'
        from ./poc.rb:6
1rasputin@lb:xml$ ./poc.rb ok.xml

1rasputin@lb:xml$ diff ok.xml wtf.xml
5a6
> <link rel=stylesheet href="css/9.css" type="text/css" />
1rasputin@lb:xml$
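
For what it's worth, the one thing that stands out in the offending
line is that rel=stylesheet has no quotes around the value. XML
(unlike HTML) requires every attribute value to be quoted, so that may
well be what trips the parser. Below is a minimal sketch for checking
that guess, not a confirmed diagnosis. The helper name well_formed?
and the two constants are made up for illustration and are not part
of REXML; the DOCTYPE is left out to keep the test input small.

```ruby
require "rexml/document"

# Hypothetical helper (not part of REXML): true if REXML parses the
# string without complaint, false if it raises a ParseException.
def well_formed?(xml)
  REXML::Document.new(xml)
  true
rescue REXML::ParseException
  false
end

# Same markup as wtf.xml, minus the DOCTYPE.
UNQUOTED = <<XML
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>oh dear</title>
<link rel=stylesheet href="css/9.css" type="text/css" />
</head>
<body> hmm </body>
</html>
XML

# Identical document, with the rel value quoted.
QUOTED = UNQUOTED.sub('rel=stylesheet', 'rel="stylesheet"')

puts "unquoted rel parses: #{well_formed?(UNQUOTED)}"
puts "quoted rel parses:   #{well_formed?(QUOTED)}"
```

If the quoted version parses and the unquoted one does not, the fix
belongs in the page itself rather than in REXML.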

-- 
Serenity through viciousness.
Rasputin :: Jack of All Trades - Master of Nuns