On Mon, Aug 18, 2008 at 11:10 PM, dare ruby <martin / angleritech.com> wrote:
> I have some study materials as PDF documents. I need to convert the
> PDFs to a text format, such as Microsoft Word or text pad on Windows.
> I need to do the parsing with a Ruby program. Could anyone suggest how?

Your best bet is a Ruby script that calls out to xpdf's pdftotext to do
the actual PDF-to-text conversion, then parses the resulting text.
There's a Windows port of the xpdf command-line utilities.
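
Something along these lines might serve as a starting point. It is only a
rough sketch: it assumes pdftotext is on your PATH, and the helper names
(pdftotext_command, pdf_to_text, split_pages) and the file names in the
usage comment are made up for illustration. It relies on pdftotext
writing to stdout when the output file is "-" and on it separating pages
with a form feed character.

```ruby
require "open3"

# Hypothetical helper: builds the argument list for xpdf's pdftotext.
# "-layout" tries to preserve the page layout; "-" as the output file
# sends the extracted text to stdout instead of a file.
def pdftotext_command(pdf_path, exe: "pdftotext")
  [exe, "-layout", pdf_path, "-"]
end

# Runs pdftotext and returns the extracted text as a string.
# Raises if the tool exits with a non-zero status.
def pdf_to_text(pdf_path, exe: "pdftotext")
  out, err, status = Open3.capture3(*pdftotext_command(pdf_path, exe: exe))
  raise "pdftotext failed: #{err.strip}" unless status.success?
  out
end

# pdftotext ends each page with a form feed ("\f"), so splitting on it
# gives one string per page. Blank pages are dropped.
def split_pages(text)
  text.split("\f").map(&:strip).reject(&:empty?)
end

# Usage (file names are placeholders):
#   pages = split_pages(pdf_to_text("notes.pdf"))
#   File.write("notes.txt", pages.join("\n\n"))
```

On Windows you may need to pass the full path to the executable via the
exe: argument if the gnuwin32 bin directory is not on your PATH.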

http://gnuwin32.sourceforge.net/packages/xpdf.htm
http://www.perlmonks.org/?node_id=298041
http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/
http://forjournalists.com/cookbook/index.php?title=XPDF

martin