< :the previous in number
^ :the list in numerical order
> :the next in number
P :the previous artilce (have the same parent)
N :the next artilce (have the same parent)
|<:the top of this thread
>|:the next thread
^ :the parent (reply-to)
_:the child (an article replying to this)
>:the elder article having the same parent
<:the youger article having the same parent
---:split window and show thread lists
| :split window (vertically) and show thread lists
~ :close the thread frame
.:the index
..:the index of indices
On Sat, Jul 06, 2002 at 11:18:47PM +0900, Martin Pirker wrote:
> I have a Ruby speed problem, maybe you have some suggestions:
>
> given: text files with values, one per line, sorted, e.g.
> 10100
> 10234
> 10292
> ......
>
> so:
> arr1 = IO.readlines(file1)
> arr2 = IO.readlines(file2)
>
> arr1 consists of ~40000 lines/elements
> arr2 size is ~10000
>
> when I want to take the "set difference", arr3 = arr1-arr2, meaning "take
> all elements from arr1 which dont appear in arr2" this takes forever - I
> don't even know how long because I stopped early ;)
>
> So the built-in operator is too slow - taking advantage of my knowledge
> of the sorting I handcoded a "loop { compare arr1[0] with arr2[0], either
> puts or remove }"
> This gets it down to 5-10s, much better, but still too slow - 1s would be
> a nice target
>
> As I'm still quite novice in Ruby (always having Hal's book on my lap ;) ..)
> how would you code a speedier solution?
You could also do it without hashes (though that solution is clearly the
simplest). This algorithm feels to me like the merge stage of a merge
sort, sorta (if you're not familiar with merge sort, google for it):
a1 = [1, 3, 4, 10, 13, 31, 53, 54, 55]
a2 = [3, 10, 10, 11, 13, 54, 55]
i = 0
a3 = []
a1.each { |a|
i = i + 1 while a2[i] < a
a3 << a if a2[i] > a
}
a3.each {|a| puts a}
--
Evan Martin
martine / cs.washington.edu
http://neugierig.org