On Saturday 06 July 2002 07:18 am, Martin Pirker wrote:
> Hi...
>
> I have a Ruby speed problem, maybe you have some suggestions:
>
> given: text files with values, one per line, sorted, e.g.
> 10100
> 10234
> 10292
> ......
>
> so:
> arr1 = IO.readlines(file1)
> arr2 = IO.readlines(file2)
>
> arr1 consists of ~40000 lines/elements
> arr2 size is ~10000
>
> when I want to take the "set difference", arr3 = arr1-arr2, meaning
> "take all elements from arr1 which don't appear in arr2" this takes
> forever - I don't even know how long because I stopped early ;)

How fast does this go? It indexes the smaller file in a hash, so each
line of the larger file needs only one lookup:


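# Index every line of the second (smaller) file in a hash.
h2 = Hash.new(false)
IO.readlines(ARGV[1]).each { |line| h2[line] = true }
puts "h2 has #{h2.size} elements"

# Keep each line of the first file that is not in the hash.
diff = []
IO.readlines(ARGV[0]).each { |line| diff << line unless h2[line] }

puts "diff has #{diff.size} elements"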
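
If you already have arr1 and arr2 in memory, the same idea can be wrapped in a method. This is only a sketch: the name hash_difference and the sample data are made up for illustration. It does about arr1.size + arr2.size hash operations in total, instead of searching arr2 once for every element of arr1.

```ruby
# Hypothetical helper (name invented for this example): returns the
# elements of big that do not appear in small, preserving order.
def hash_difference(big, small)
  seen = {}
  small.each { |line| seen[line] = true }  # one hash insert per element
  big.reject { |line| seen.key?(line) }    # one hash lookup per element
end

# Made-up sample data in the same shape IO.readlines returns.
arr1 = ["10100\n", "10234\n", "10292\n", "10300\n"]
arr2 = ["10234\n", "10300\n"]

diff = hash_difference(arr1, arr2)
puts "diff has #{diff.size} elements"
```

Like the loop above, this keeps duplicate lines from arr1 as long as they are not in arr2.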

-- 
Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE