On 5/5/10, Intransition <transfire / gmail.com> wrote:
> On May 5, 12:58 pm, Caleb Clausen <vikk... / gmail.com> wrote:
>> Why not create a temporary hash before the other.modules.each loop and
>> use it to tell which modules are present.... something like (assuming
>> you rewrite @modules as an array):
>>
>>   known_mods={}
>>   c.modules.each{|mod| known_mods[mod.base]=mod }
>>   other.modules.each do |ofmod|
>>      if known_mods[ofmod.base]
>>        ...
>>      end
>>   end
>>
>> Your Snapshot#- won't be as efficient as it is now, since you have to
>> build that index up every time it's called, but it should be a fairly
>> minor performance degradation, I would think.
>
> That was my first alternative idea too. I'm just not sure if it's
> worth the efficiency trade-off.

From what you say about how this is used, it doesn't sound like this
method is called all that much. Once per file in the lib tested? I'd
say just use a temporary hash and get on with your life until and
unless the performance actually proves to be a problem. Or stick with
your current solution of a permanent hash.
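
For what it's worth, here is a self-contained version of the temporary-hash
idea. It's only a sketch: Mod is a made-up stand-in for whatever you keep in
@modules, and common_modules is a hypothetical helper name, not anything
from Lemon. All it assumes is that each entry responds to #base.

```ruby
# Hypothetical stand-in for the entries stored in @modules.
Mod = Struct.new(:base, :tag)

# Pair up entries from two arrays that share the same #base.
# Builds a throwaway index of `mine` on every call.
def common_modules(mine, theirs)
  known_mods = {}
  mine.each { |mod| known_mods[mod.base] = mod }   # temporary index

  pairs = []
  theirs.each do |ofmod|
    mod = known_mods[ofmod.base]
    pairs << [mod, ofmod] if mod                  # present on both sides
  end
  pairs
end
```

Building the index is one pass over the first array, and each lookup
after that is a hash probe, so the whole thing stays linear in the
number of modules.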

It's worth noting that (if I recall correctly) Array#-, which you are
calling 6 times inside that if statement, creates a temporary hash of
its argument's contents in order to do its own work efficiently. So,
the cost of your one temporary hash is probably dwarfed by the
cost of the 6 temporary hashes created by Array#- on every loop
iteration.
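
To make that concrete, this is roughly what a hash-based array
difference looks like written out by hand. It is a conceptual sketch
only, not the actual C code behind Array#-, and hash_diff is a made-up
name. In this sketch the hash is built from the array being subtracted.

```ruby
# Hand-rolled array difference using a temporary hash.
# Conceptual illustration; hash_diff is a hypothetical name.
def hash_diff(a, b)
  seen = {}
  b.each { |x| seen[x] = true }      # temporary hash, rebuilt on each call
  a.reject { |x| seen.key?(x) }      # keep elements of a not found in b
end
```

Six calls like this per loop iteration means six throwaway hashes per
iteration, versus one for the whole loop in the known_mods approach.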

I expect that a temp hash would perform reasonably even with thousands
of files to scan and/or thousands of classes/modules in ObjectSpace.
So, it should scale to all but the very largest projects, and those
should probably expect a performance hit for their large size.

> I was thinking there might be a way to
> do it where the two arrays are sorted by name and then iterate down the
> list popping off one or the other and merging based on <=>, but I
> haven't worked it out yet. Even though there are two sorts involved it
> should be just as fast I think.

It's a neat idea.... but sounds a little complex to implement.
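
If you want to try it, I imagine the sorted walk would look something
like the following. This is a sketch under assumptions: Entry and
merge_by_name are made-up names, each entry is assumed to respond to
#name, and names are assumed to be unique within each array.

```ruby
# Hypothetical stand-in for the entries being compared.
Entry = Struct.new(:name)

# Walk two arrays in name order, pairing entries with equal names.
# Returns [left, right] pairs; nil marks the side that lacks the name.
def merge_by_name(left, right)
  a = left.sort_by { |m| m.name }
  b = right.sort_by { |m| m.name }
  i = j = 0
  out = []
  while i < a.size && j < b.size
    case a[i].name <=> b[j].name
    when 0                         # same name on both sides
      out << [a[i], b[j]]
      i += 1
      j += 1
    when -1                        # name only on the left so far
      out << [a[i], nil]
      i += 1
    else                           # name only on the right so far
      out << [nil, b[j]]
      j += 1
    end
  end
  a[i..-1].each { |m| out << [m, nil] }   # leftovers on the left
  b[j..-1].each { |m| out << [nil, m] }   # leftovers on the right
  out
end
```

The walk itself is linear, but the two sorts are O(n log n), so on
paper the hash version does less work. In practice I doubt you would
notice the difference either way.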

> Lemon is a unit testing framework that has a strict testcase<->class/
> module and unit<->method correspondence. By taking a snapshot of the
> system before and after a target library is loaded, it can provide test
> coverage information.

Is this just verifying that there is a test of some sort for every
method? Or actually a deeper level of coverage (line coverage) like
what rcov does?