Hi --

On Fri, 26 Jun 2009, Peter Bailey wrote:

> David A. Black wrote:
>> Hi --
>>
>> On Fri, 26 Jun 2009, Peter Bailey wrote:
>>
>>>>> method, which is an array.
>>> it does a bunch, then dies.
>>> C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:8:in `block in
>>> <main>'
>>>      pages = pages[0][0].to_i
>>> C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:5:in `each'
>>>    Dir.glob("*.pdf").each do |pdffile|
>>> C:\Users\pb4072\Documents\scripts\RUBY\multitiffs.rb:5:in `<main>'
>>>    Dir.glob("*.pdf").each do |pdffile|
>>>
>>> 
>>> Exception: undefined method `[]' for nil:NilClass
>>
>> That means that somewhere along the line, the scan operation isn't
>> finding what you expect it to. Is it possible that you have a document
>> with more than 99 occurrences of [] in a row?
>>
>> I'd still recommend trying the technique I suggested in my earlier
>> answer. Getting a nested array of one element and unnesting it seems
>> like the long way around.
>>
>>
>> David
>
> But, it does hundreds of files just fine. Then, it dies. So, you're
> saying that in one file in particular it can find what's in the scan?
> I'm sorry, but, I don't understand the technique you described earlier,
> David. You say to do this:
>  pages[/Pages:\D+(\d+),1/]
>  pages = pages.to_i
> I get "0" as output with this.
>
> The output of pdfinfo is simple. Here's an example:
> Author:         pb4072
> Creator:        Microsoft® Office Word 2007
> Producer:       Microsoft® Office Word 2007
> CreationDate:   09/27/07 13:36:28
> ModDate:        02/19/09 14:13:47
> Tagged:         no
> Pages:          1
> Encrypted:      no
> Page size:      612 x 792 pts (letter)
> File size:      55418 bytes
> Optimized:      yes
> PDF version:    1.6
> As you can see, there are no [] characters in here.

Sorry, I was spacing out and remembering (wrongly) that you were
looking for [] characters. I think my brain was leading me astray by
images of dvips output and such.

Anyway... here's the pages[] technique in action:

>> pages = "Pages:        1234"
"Pages:        1234"
>> pages[/\D+(\d+)/,1]
"1234"

When you subscript a string with a regex like that, Ruby matches the
regex against the string and returns the matched text, or nil if there
is no match. If you also provide a number, it returns only the
corresponding parenthetical capture. Another example:

"David A. Black"[/\S+ (\S+) (\S+)/,2]    # "Black"
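
Applied to pdfinfo-style output, the same subscript might look like the
sketch below. The page_count helper and the sample string are made up
for illustration and are not taken from the multitiffs.rb script. The
check for nil is there because the subscript returns nil when nothing
matches, and nil.to_i is 0.

```ruby
# Hypothetical helper (not from the original script): pull the page
# count out of pdfinfo-style text, or return nil if there is no
# "Pages:" line.
def page_count(info)
  digits = info[/^Pages:\s+(\d+)/, 1]   # String#[] with a capture index
  digits && digits.to_i                 # nil stays nil instead of becoming 0
end

# Made-up sample modelled on the pdfinfo output quoted above.
sample = "Tagged:         no\nPages:          1\nEncrypted:      no\n"

p page_count(sample)            # prints 1
p page_count("Encrypted: no")   # prints nil
```

Returning nil rather than 0 makes it easy to spot and skip any file
whose pdfinfo output lacks a Pages: line.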


David

-- 
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com