Martin Pirker <crf / sbox.tu-graz.ac.dfgdfhjhzjgfdfsddadshrhdrhdfdsasaff.at> writes:

> given: text files with values, one per line, sorted, e.g.
> 10100
> 10234
> 10292
> ......
> 
> so:
> arr1 = IO.readlines(file1)
> arr2 = IO.readlines(file2)
> 
> arr1 consists of ~40000 lines/elements
> arr2 size is ~10000
> 
> when I want to take the "set difference", arr3 = arr1-arr2, meaning "take
> all elements from arr1 which don't appear in arr2" this takes forever - I
> don't even know how long because I stopped early ;)

The following runs in about 0.5s on my pokey old box:

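   # Remember every line of the second file as a hash key
   s1 = {}
   File.foreach(ARGV[1]) {|line| s1[line] = 1}
   # Print each line of the first file that isn't in the hash
   File.foreach(ARGV[0]) {|line| puts(line) unless s1[line]}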

Note that it's comparing the lines as Strings (trailing newline
included), not as integers, but if both of your files are generated
the same way that won't be a problem.
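
If the two files might not be formatted identically (say, one is
missing the newline after its last line), one option is to compare
the values as integers instead. A minimal sketch of that idea, working
on arrays like your arr1 and arr2; the name int_difference is made up
for illustration and the sample data is invented:

```ruby
# Hypothetical helper, not part of the snippet above: same hash-lookup
# idea, but every line goes through to_i first, so "10234\n" and
# "10234" count as the same value.
def int_difference(lines1, lines2)
  seen = {}
  lines2.each { |line| seen[line.to_i] = true }
  lines1.map { |line| line.to_i }.reject { |n| seen[n] }
end

arr1 = ["10100\n", "10234\n", "10292\n"]
arr2 = ["10234"]   # e.g. the last line of a file with no trailing newline
p int_difference(arr1, arr2)   # prints [10100, 10292]
```

This returns Integers rather than the original lines, so it only
makes sense if every line really is a plain number.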


Cheers


Dave