Regular expressions are pretty much the standard way of parsing text files, aren't they? Certainly they're what I've been using for years now. What's the problem you're having with them?

On Mon, May 9, 2011 at 11:32 AM, Felipe Espinoza <fespinozacast / gmail.com> wrote:
> Hi,
>
> I'm looking for an example of parsing PDFs. I tried to implement this
> with Ruby and the docsplit gem, but it uses an external tool to extract
> the text, there are problems with number references, and you then have
> to parse the text file with regular expressions.
>
> I want to parse some papers in PDF format to extract each one's title,
> keywords, authors, authors' email addresses, institutions, etc.
>
> I'm looking for an experienced Ruby developer who knows a better way to
> do this without parsing a text file through regular expressions.
>
> Greetings
>
> --
> Posted via http://www.ruby-forum.com/.
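
For reference, a minimal sketch of the regular-expression approach discussed above, assuming the PDF's text has already been extracted into a Ruby string by some external tool. The sample text, the method names, and the patterns are all hypothetical and only illustrate the idea; real papers will need patterns tuned to their layout.

```ruby
# Hypothetical sample standing in for text already extracted from a PDF.
SAMPLE_TEXT = <<~TEXT
  A Study of Widgets
  Jane Doe (jane.doe@example.org), John Roe (jroe@example.com)
  Keywords: widgets, parsing, ruby
TEXT

# Deliberately loose e-mail pattern; real addresses can be more varied.
EMAIL_PATTERN = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/

# Returns every substring that looks like an e-mail address.
def extract_emails(text)
  text.scan(EMAIL_PATTERN)
end

# Assumption: keywords sit on a single line starting with "Keywords:".
def extract_keywords(text)
  match = text.match(/^Keywords:\s*(.+)$/i)
  match ? match[1].split(/\s*,\s*/).map(&:strip) : []
end

# Assumption: the title is the first non-blank line of the extracted text.
def extract_title(text)
  text.each_line.map(&:strip).find { |line| !line.empty? }
end
```

Calling `extract_emails(SAMPLE_TEXT)`, `extract_keywords(SAMPLE_TEXT)`, and `extract_title(SAMPLE_TEXT)` pulls the corresponding fields out of the sample. The weak point is the layout assumptions, not the regular expressions themselves: if the extraction tool reorders lines or splits a field across lines, the patterns have to change with it.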