On Mon, Aug 18, 2008 at 11:10 PM, dare ruby <martin / angleritech.com> wrote:
> I have some of the study materials as PDF documents. I need to parse the
> PDF to any text format like Microsoft Word or a text pad on Windows. I
> need to do the parsing using a Ruby program. Could anyone suggest something?

Your best bet is a Ruby script that calls out to xpdf to do the actual
PDF-to-text conversion, then parses the text. There's a Windows port of the
xpdf command-line utilities.

http://gnuwin32.sourceforge.net/packages/xpdf.htm
http://www.perlmonks.org/?node_id=298041
http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/
http://forjournalists.com/cookbook/index.php?title=XPDF

martin
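
A rough sketch of the "call out to xpdf, then parse the text" approach might look like the following. It assumes the xpdf `pdftotext` executable is installed and on your PATH. The helper names `pdf_to_text` and `split_pages` are made up for illustration, and splitting on form feeds is just one simple way to start parsing the output.

```ruby
require "open3"

# Hypothetical helper: runs xpdf's pdftotext and returns the extracted text.
# Assumes the pdftotext executable is on PATH (pdftotext.exe on Windows);
# pass a full path via the pdftotext: keyword if it is not.
def pdf_to_text(pdf_path, pdftotext: "pdftotext")
  # "-layout" keeps the physical page layout, "-enc UTF-8" selects the
  # output encoding, and "-" as the output file sends the text to stdout.
  stdout, stderr, status =
    Open3.capture3(pdftotext, "-layout", "-enc", "UTF-8", pdf_path, "-")
  raise "pdftotext failed: #{stderr.strip}" unless status.success?
  stdout.force_encoding(Encoding::UTF_8)
end

# Hypothetical helper: pdftotext ends each page with a form feed ("\f"),
# so splitting on it gives one string per page. Blank pages are dropped.
def split_pages(text)
  text.split("\f").map(&:strip).reject(&:empty?)
end
```

Something like `split_pages(pdf_to_text("notes.pdf"))` (with `notes.pdf` standing in for one of your files) would then give you an array of page strings to write to a .txt file or parse further.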