On Wednesday 16 March 2011 21:41:25 Peter Bailey wrote:
> Hello,
> I've got some enormous RTF files and I need to get a count of the number
> of footnotes in them. So, I'm trying this:
> 
> Dir.chdir("T:/rtf")
> file_contents = File.read("1.rtf")
>  count = file_contents.count "\footnote"
>  puts count
> 
> I'm getting an enormous value (115683) in the hundreds of thousands.
> And, with my text editor, I know that there are only hundreds (582). Can
> someone please explain why this is happening?
> 
> Thanks,
> Peter

String#count doesn't work the way you expect. It doesn't count the number of 
occurrences of the argument in the receiver, but the number of occurrences of 
any one of the characters making up the argument. For example:

"ab ac ad".count "ab"
=> 4

"ab ac ad".count "ae"
=> 3

In the first example, 4 is obtained by summing the 3 occurrences of 'a' and 
the one occurrence of 'b'. In the second, 'e' is never found, so only the 
three occurrences of 'a' are counted. For other examples, see the ri 
documentation for String#count.
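
A small sketch of the same behaviour, in case it helps to run it. The variable
names are mine, chosen only for illustration. It also shows a second detail in
your snippet: inside double quotes, "\f" is the form feed escape, so
"\footnote" holds no backslash at all, and count ends up tallying every form
feed, 'o', 't', 'n' and 'e' in the file.

```ruby
# String#count treats its argument as a set of characters, not a substring.
sample = "ab ac ad"

both   = sample.count("ab")  # every 'a' plus every 'b'
only_a = sample.count("a")
only_b = sample.count("b")

puts both             # same value as the sum below
puts only_a + only_b

# In a double-quoted literal, "\f" is a single form feed character,
# so this string is form feed + "ootnote", with no backslash in it.
pattern = "\footnote"
puts pattern.length
puts pattern.chars.uniq.inspect  # the distinct characters that count would tally
```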

To obtain what you want, you can use

file_contents.scan(/\\footnote/).count
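
Here is a minimal sketch of that approach on a made-up string, so you can check
it without the real file. The sample text and variable names are hypothetical.
The last variant is only an assumption about your data: it refuses a match when
another letter follows, which would matter if the file contained a longer
control word that begins with \footnote.

```ruby
# Hypothetical RTF fragment, invented for illustration (not taken from 1.rtf).
# Single quotes keep the backslash literal.
sample_rtf = 'Text{\footnote first note} more text{\footnote second note}'

# scan returns an array of all non-overlapping matches; its length is the count.
by_regexp = sample_rtf.scan(/\\footnote/).length

# A plain string pattern also works with scan and is matched literally.
by_string = sample_rtf.scan('\footnote').length

# Stricter variant (assumption): do not match if a letter follows directly.
strict = sample_rtf.scan(/\\footnote(?![a-zA-Z])/).length

puts by_regexp
puts by_string
puts strict
```

All three give the same number on this sample; they would differ only if the
stricter pattern rejected something.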

I don't know if there's a better way.

I hope this helps

Stefano