Axel

On 11/21/06, Nuralanur / aol.com <Nuralanur / aol.com> wrote:
> is there a way of extracting text from a PDF, if the latter
> is in some non-European language, such as Arabic or
> Chinese?

rpdf2txt (1) _should_ work with Unicode PDF documents. If you run into
any problems, let me know; I'm happy to tinker with the beast.

(1) http://download.ywesee.com/rpdf2txt/rpdf2txt-1.0.6.tar.bz2
    http://raa.ruby-lang.org/project/rpdf2txt/

hth

Hannes