If you get errors complaining of undefined entities like &nbsp; when
parsing XHTML, it means you need to install the DTD for XHTML 1.0 or
1.1.
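
XML itself predefines only five entities (amp, lt, gt, quot, apos);
every other named entity has to come from the DTD. As a rough sketch,
the Ruby below lists the entity references in a document that depend on
the DTD. The helper name dtd_dependent_entities is made up for this
note, and the regex scan is only an approximation (it does not skip
comments or CDATA sections).

```ruby
# The five entities every XML parser knows without a DTD.
PREDEFINED_ENTITIES = %w[amp lt gt quot apos].freeze

# Hypothetical helper: return the named entity references in +xml+ that
# only a DTD can define. Numeric references such as &#160; are not
# matched, because they never need a DTD.
def dtd_dependent_entities(xml)
  xml.scan(/&([A-Za-z][A-Za-z0-9]*);/).flatten.uniq - PREDEFINED_ENTITIES
end

sample = "<p>a&nbsp;b &amp; c&copy; &#160;</p>"
p dtd_dependent_entities(sample)   # prints ["nbsp", "copy"]
```

If this returns anything for your document, the parser needs the DTD to
resolve those names.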

Example of a doctype for xhtml 1.1:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

You want to install the DTDs locally, following the model in /etc/xml.
If you don't, libxml will fetch the DTD from www.w3.org each time you
parse a document. Needing to install these DTDs was not obvious to me
and should be part of the documentation. There is an RPM for XHTML 1.0,
"xhtml1-dtds-1.0-7". I couldn't find one for XHTML 1.1, so I downloaded
it piecemeal from w3.org.
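
A local install boils down to an XML catalog that maps the public and
system identifiers from the doctype to a file on disk. As a minimal
sketch, the Ruby below builds such a catalog for XHTML 1.1. The install
directory /usr/share/xml/xhtml11 and the helper name xhtml11_catalog
are assumptions for illustration, and it assumes the module files that
xhtml11.dtd pulls in sit in the same directory.

```ruby
# Identifiers taken from the XHTML 1.1 doctype.
XHTML11_PUBLIC_ID = "-//W3C//DTD XHTML 1.1//EN"
XHTML11_SYSTEM_ID = "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"

# Hypothetical helper: build an OASIS XML catalog that points both
# identifiers at a local copy of xhtml11.dtd inside +dtd_dir+.
def xhtml11_catalog(dtd_dir)
  local_uri = "file://#{File.join(dtd_dir, 'xhtml11.dtd')}"
  <<~XML
    <?xml version="1.0"?>
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <public publicId="#{XHTML11_PUBLIC_ID}" uri="#{local_uri}"/>
      <system systemId="#{XHTML11_SYSTEM_ID}" uri="#{local_uri}"/>
    </catalog>
  XML
end

# The directory is an assumed install location; adjust it to your system.
puts xhtml11_catalog("/usr/share/xml/xhtml11")
```

Save the output to a file and point libxml2 at it, for instance through
the XML_CATALOG_FILES environment variable or an entry in
/etc/xml/catalog.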

Installing the DTD does not automatically turn on validation. If you
want to validate, you need to turn it on:
XML::Parser::default_validity_checking = true

XML::Parser::default_load_external_dtd controls the loading of the
'external subset' (the definitions of character entities like
&nbsp;). It defaults to true.

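XML::Parser::default_load_external_dtd is broken in revision 1.1.1.1 of
ruby_xml_parser.c: it tests xmlSubstituteEntitiesDefaultValue instead of
xmlLoadExtDtdDefaultValue, and the getter and setter functions are
registered the wrong way round. This patch fixes it.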

Index: ruby_xml_parser.c
==========================================================
RCS file: /var/cvs/xml-tools/libxml-ruby/ruby_xml_parser.c,v
retrieving revision 1.1.1.1
diff -r1.1.1.1 ruby_xml_parser.c
274c274
<   if (xmlSubstituteEntitiesDefaultValue)
---
>   if (xmlLoadExtDtdDefaultValue)
916c916
<                            ruby_xml_parser_default_load_external_dtd_set, 0);
---
>                            ruby_xml_parser_default_load_external_dtd_get, 0);
918c918
<                            ruby_xml_parser_default_load_external_dtd_get, 1);
---
>                            ruby_xml_parser_default_load_external_dtd_set, 1);


Sam's patches for libxml are also needed:
http://www.intertwingly.net/blog/2005/11/05/Patch-for-libxml2s-Ruby-binding