Hpricot is certainly one tool you should consider.
Also look at REXML and scRUBYt.
scRUBYt is aimed more at web scraping, but if you can scrape it, you can
remove it too.
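
To show the "find it, then remove it" idea, here is a rough sketch using REXML. It assumes the input is well-formed XML/XHTML and that the thing you want gone can be matched with an XPath expression. The `remove_elements` helper, the sample markup, and the `ad` class are made up for illustration, so adjust them to your own document.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of REXML): parse the markup, find every
# element matching the XPath expression, and detach it from its parent.
def remove_elements(xml, xpath)
  doc = REXML::Document.new(xml)
  # Collect the matches first, then remove each one from the tree.
  REXML::XPath.match(doc, xpath).each(&:remove)
  doc.to_s
end

# Made-up sample input; the span with class "ad" is what we want to drop.
SAMPLE = '<div><p>keep</p><span class="ad">drop</span></div>'
CLEANED = remove_elements(SAMPLE, '//span[@class="ad"]')

puts CLEANED
```

REXML needs well-formed input, so for messy real-world HTML a more forgiving parser such as Hpricot is probably the better fit; the select-then-remove approach stays the same.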