If you get errors complaining of undefined entities (for example &nbsp;) when parsing XHTML, it means you need to install the DTD for XHTML 1.0 or 1.1. Example of a doctype for XHTML 1.1:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

You want to install the DTDs locally, following the model in /etc/xml. If you don't, libxml will fetch the DTD from www.w3.org each time you parse a document. Needing to install these DTDs was not obvious to me and should be part of the documentation.

There is an RPM for XHTML 1.0: "xhtml1-dtds-1.0-7". I couldn't find one for XHTML 1.1, so I downloaded it piecemeal from w3.org.

Installing the DTD does not automatically turn on validation. If you want to validate, you need to turn it on:

    XML::Parser::default_validity_checking = true

XML::Parser::default_load_external_dtd controls the loading of the 'external subset' (the definitions of the character entities). It defaults to true.

XML::Parser::default_load_external_dtd is broken: it tests the wrong libxml variable, and its getter and setter registrations are swapped. This fixes it:

    Index: ruby_xml_parser.c
    ==========================================================
    RCS file: /var/cvs/xml-tools/libxml-ruby/ruby_xml_parser.c,v
    retrieving revision 1.1.1.1
    diff -r1.1.1.1 ruby_xml_parser.c
    274c274
    < if (xmlSubstituteEntitiesDefaultValue)
    ---
    > if (xmlLoadExtDtdDefaultValue)
    916c916
    < ruby_xml_parser_default_load_external_dtd_set, 0);
    ---
    > ruby_xml_parser_default_load_external_dtd_get, 0);
    918c918
    < ruby_xml_parser_default_load_external_dtd_get, 1);
    ---
    > ruby_xml_parser_default_load_external_dtd_set, 1);

Sam's patches for libxml are also needed:
http://www.intertwingly.net/blog/2005/11/05/Patch-for-libxml2s-Ruby-binding
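
For the DTD installation step above, it can help to see exactly which identifiers a document asks for, since those are what a local catalog entry has to match. The following is only a rough sketch using plain Ruby with no libxml calls. It assumes a regex is good enough to read the DOCTYPE (a real parser would be more robust), and the helper names doctype_ids and catalog_entry, as well as the local path /usr/share/xml/xhtml11, are hypothetical choices made for illustration.

```ruby
# Sketch only: a regex stands in for a real XML parser, and the local
# install path used at the bottom is a hypothetical location.

DOCTYPE_RE = /<!DOCTYPE\s+(\S+)\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*>/

# Returns [public_id, system_id] from the DOCTYPE declaration, or nil when
# the document has no DOCTYPE with a PUBLIC identifier.
def doctype_ids(xml)
  m = DOCTYPE_RE.match(xml)
  m && [m[2], m[3]]
end

# Builds an XML catalog <public> entry mapping a public identifier to a
# local file. local_path must be an absolute path.
def catalog_entry(public_id, local_path)
  %(<public publicId="#{public_id}" uri="file://#{local_path}"/>)
end

doc = <<~XML
  <?xml version="1.0"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head><body/></html>
XML

public_id, system_id = doctype_ids(doc)
puts public_id
puts system_id
puts catalog_entry(public_id, "/usr/share/xml/xhtml11/xhtml11.dtd")
```

The last line printed is the kind of entry that would go inside the catalog element of a catalog file under /etc/xml, assuming the DTD files were copied to that directory. Whether libxml actually picks it up depends on how the catalogs are set up on your system.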