"Christoph Rippel" <crippel / primenet.com> wrote: > > > From: Ben Tilly [mailto:ben_tilly / hotmail.com] [...] >I also think that your objections towards symmetrization - i.e. swapping >are not valid since they cost very little compared to the reminder division >invoked in any hashing scheme and running the interpreter it self of course >(in my test it makes virtually non difference if you swap or don't swap - >even so swapping does not gain anything in the ``examples'') >- swapping IMO simply feels more robust. I hate the phrase "simply feels more robust". And my reaction is so strongly opposite that I want to explain the cause of my reaction in some detail. First of all it is an unnecessary step. We are both agreed on that. The algorithm works just fine without it. Secondly let us stop and ask ourselves what we want == to mean for complex data structures. A reasonable definition is that two data structures are == when there is no nested set of lookups that you can find which give apparently different results. So, for instance if we take this code: a = ["foo"] a.push a then a is == to ["foo", a] which is == to ["foo", ["foo", a]] and in general if you try to access an element in any of them you get the same result as from the infinite data structure: ["foo", ["foo", ["foo", ... ]]] I maintain that this definition is both reasonable and understandable. Now it is not hard for me to come up with a convincing demonstration that caching without swapping is always, no matter how complex the data structure, going to result in a correct algorithm. But I had more trouble producing a demonstration for the general case with swapping where you might have 10 variables that got compared with each other, which have a variety of recursive relationships and appear on both sides of the comparison. It turns out that there is a simple proof, but it was not as easy for me to produce it. Conversely since both algorithms accomplish the same thing, there is no way that adding the swap can make your code more robust. So we have a step that is not needed, that makes the code harder to analyze, that does not change what the end result is. There is no case that will work out which would not have have worked out anyways. Thus it cannot make the code more robust. It can only make the code more complex to understand. And in my books, all else being equal, complexity is the dire enemy of robustness. Sure, it is only one line. But what does that line accomplish? What purpose does it serve? How does it justify its continued existence? (I know, my math background is showing...) Cheers, Ben PS I don't mind complications if they buy you something. Linear searches are far simpler than hashing, BTrees, etc, but they are also much slower. But unless a complication actually wins me something specific, I try to avoid it. And while it is definitely true that there are good reasons to use these more complex algorithms, it is also undeniable that many bugs have been created by people trying to switch to more complex algorithms... _________________________________________________________________ Get your FREE download of MSN Explorer at http://explorer.msn.com