Hi Kirk,

I've had pretty good luck shelling out to pdftotext to get the text from
PDFs into a searchable format. Is this what you mean by "break them down
into a database"?
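
Here is roughly what I mean, as a minimal sketch rather than a finished tool. It assumes the pdftotext command-line program (it ships with poppler-utils and xpdf) is installed and on your PATH. The method names pdftotext_command and extract_text are made up for illustration.

```ruby
require "open3"

# Build the argument list for pdftotext.
# "-layout" asks pdftotext to preserve the physical page layout, which
# tends to help with tabular data; "-" as the output file sends the
# extracted text to stdout instead of writing a .txt file.
def pdftotext_command(pdf_path, layout: true)
  cmd = ["pdftotext"]
  cmd << "-layout" if layout
  cmd << pdf_path << "-"
  cmd
end

# Run pdftotext on one file and return the extracted text as a String.
# The arguments are passed as a list, so no shell is involved and file
# names containing spaces need no quoting.
def extract_text(pdf_path, layout: true)
  stdout, stderr, status = Open3.capture3(*pdftotext_command(pdf_path, layout: layout))
  raise "pdftotext failed for #{pdf_path}: #{stderr.strip}" unless status.success?
  stdout
end
```

Calling extract_text("report.pdf") (a hypothetical file name) would give you a plain string that you can search, split into fields, or store in a database column.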

Best Regards,
Jason



On Mon, Jun 3, 2013 at 4:28 PM, Kirk Keeter <kirkkeeter / gmail.com> wrote:

> Team,
>
> I'm working on a project that will involve processing 15,000+ complex
> financial documents.  They are in PDF form.
>
> Unfortunately, the documents are not available in a non-PDF form -- so I
> have to electronically scan the documents and "break them down" into a
> database.
>
> I'm familiar enough with Rails that I feel comfortable doing it with the
> Rails framework -- but I'm not sure this is a good use of Rails.
>
> Ruby and JavaScript are the only programming languages I know, so I'd
> either need to somehow do this as a Rails project (with Ruby and
> JavaScript), or as a Ruby project.
>
> If I do it as a Ruby project (not Rails), can you make recommendations
> about the best way to go about it?
>
> Kirk Keeter
>