Hello,

I'm looking for libraries to do text extraction from MS Office and PDF file formats. I'm also looking for libraries to do HTML rendering of documents in the same formats. I know of a couple of commercial libraries from Oracle and Autonomy, but they only have C and/or Java APIs. I also found this project: http://poi.apache.org/poi-ruby.html

Are there other open source alternatives, and/or alternatives with Ruby bindings?

Thanks,
Vitali