On Wednesday 16 March 2011 21:41:25 Peter Bailey wrote:
> Hello,
> I've got some enormous RTF files and I need to get a count of the number
> of footnotes in them. So, I'm trying this:
>
> Dir.chdir("T:/rtf")
> file_contents = File.read("1.rtf")
> count = file_contents.count "\footnote"
> puts count
>
> I'm getting an enormous value (115683) in the hundreds of thousands.
> And, with my text editor, I know that there are only hundreds (582). Can
> someone please explain why this is happening?
>
> Thanks,
> Peter

String#count doesn't work the way you expect. It doesn't count the number of
occurrences of the argument in the receiver, but the number of occurrences of
any one of the characters making up the argument. For example:

  "ab ac ad".count "ab"
  => 4

  "ab ac ad".count "ae"
  => 3

In the first example, 4 is obtained by summing the three occurrences of 'a' and
the one occurrence of 'b'. In the second, 'e' is never found, so only the three
occurrences of 'a' are counted. For other examples, see the ri documentation
for String#count.

To obtain what you want, you can use

  file_contents.scan(/\\footnote/).count

I don't know if there's a better way.

I hope this helps.

Stefano
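
To make the difference between the two approaches concrete, here is a minimal sketch run against a short made-up string rather than the real RTF file (the `sample` text and the variable names are assumptions for illustration only). It also shows a detail of the original attempt: inside double quotes, "\f" is a form-feed escape, so "\footnote" contains no backslash and no 'f' at all.

```ruby
# Hypothetical sample text standing in for the contents of an RTF file.
sample = 'Intro{\footnote first note} body{\footnote second note} end'

# String#count treats its argument as a set of characters.
# In a double-quoted string "\f" is a form feed, so the set here is
# form feed, 'o', 't', 'n' and 'e'.
char_total = sample.count("\footnote")

# Counting occurrences of the literal text \footnote instead.
via_regexp = sample.scan(/\\footnote/).size
# A single-quoted string keeps the backslash, and String#scan matches
# a String pattern literally.
via_string = sample.scan('\footnote').size

puts "count with a character set: #{char_total}"
puts "scan with a regexp:         #{via_regexp}"
puts "scan with a string:         #{via_string}"
```

Both scan variants report the two footnotes in the sample, while count reports a much larger number because it adds up every matching character. Note that scan builds an array of all matches before counting them, which should be fine for a few hundred footnotes.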