On Fri, 28 Jan 2005, Trans wrote:

> Curt Sampson wrote:
>> Sure, but it's not hard if that's how the programming culture works. And
>> the only option beyond that is to make ruby purely functional, so that
>> it's impossible to have side-effects. Not even Scheme went that far.
>
> But maybe it _is_ a good idea to go that far.

Well, that may well be, but it seems to me that as soon as a language does
go that far, people start putting stuff on top of it so that they can do
things in a more "imperative-looking" way. The whole topic gets quite
interesting, though; see this for an example:

     http://www.cs.pdx.edu/~antoy/Courses/TPFLP/lectures/MONADS/Noel/research/monads.html

(Particularly note the bit at the end where it points out that,
"getContents and readFile read an entire stream lazily (stdin or a file
respectively), returning a lazily constructed list. writeFile does the
opposite, writing a (possibly lazy) list as a file." You've seen this
kind of thing with generators in Ruby.)
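
A rough sketch of the Ruby side of that comparison, assuming a current
Ruby where Enumerator::Lazy is available. The StringIO stands in for a
real file, and the names io, lines_read and first_two are made up for
illustration. Only as many lines as the consumer asks for get processed:

```ruby
require "stringio"

# A StringIO stands in for a real file here (an assumption for the sketch).
io = StringIO.new("one\ntwo\nthree\n")

lines_read = 0

# each_line without a block returns an Enumerator; .lazy defers the work
# until a consumer actually asks for elements.
lazy_lines = io.each_line.lazy.map do |line|
  lines_read += 1
  line.chomp
end

# Only two lines are pulled through the pipeline.
first_two = lazy_lines.first(2)
```

Nothing is read when lazy_lines is built; the work happens in first(2).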

>>> And it is terribly inefficient.
>>
>> Again, I don't buy this. Show me some proof.
>
> Well, I shouldn't say terribly I guess. But it will be slower by the
> simple fact that when passing by-value, a copy of the object must be
> made.

Oh! I agree that that is inefficient. That's why I proposed passing
immutable objects by reference.

> Try this on for size. Calling 'a.foo(b,c)' flags the parameters b and c
> as by-value, but nothing actually happens yet. Only if during the
> course of executing foo should b (or c) be _affected_ does a copy get
> made prior to the actual affect, and the copy is used from then on.

Ah, the standard copy-on-write. That gets interesting. Say I have a file
containing "abc" and a reference to an I/O object for it with the file
pointer for the next character to read/write pointing to "b". I then run:

     def read_one_character(file_handle)
       file_handle.get_next_character
     end
     c1 = read_one_character(file_handle)
     c2 = read_one_character(file_handle)

Under copy-on-write, c1 contains "b". What does c2 contain? Also "b",
not "c". Why? Because read_one_character invokes a method that modifies
the file handle object (moves the location of the file pointer), and so
it's copied first, leaving the caller with the original one that was not
modified.
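
To make that concrete, here is a minimal sketch that emulates the
proposed semantics in today's Ruby. CharReader is a hypothetical
stand-in for the I/O object, and read_one_character_by_value fakes the
copy by calling dup before the callee touches the argument; neither is a
real Ruby API.

```ruby
# Hypothetical stand-in for an I/O object with a file pointer.
class CharReader
  def initialize(text, pos = 0)
    @text = text
    @pos = pos
  end

  # Returns the character at the file pointer and advances the pointer.
  def get_next_character
    ch = @text[@pos]
    @pos += 1
    ch
  end
end

def read_one_character(file_handle)
  file_handle.get_next_character
end

# Emulates by-value passing: the callee only ever sees a copy.
def read_one_character_by_value(file_handle)
  read_one_character(file_handle.dup)
end

# Current Ruby behaviour: the caller's object is advanced.
shared = CharReader.new("abc", 1)
by_reference = [read_one_character(shared), read_one_character(shared)]

# Emulated copy semantics: the caller's pointer never moves.
original = CharReader.new("abc", 1)
by_value = [read_one_character_by_value(original),
            read_one_character_by_value(original)]
```

With reference passing the two reads yield "b" then "c"; with the
emulated copy both reads yield "b".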

> x += y
>
> Presently, of course, this creates a new object referenced by x.

Indeed. That was pretty much what I was proposing: do this for a lot
more classes than we currently do.
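
A small illustration with String, which is one class where Ruby already
offers both behaviours. The variable names are made up for the example,
and String.new is used only so the strings are mutable regardless of
frozen-literal settings.

```ruby
# += builds a new String and rebinds x; other references keep the old object.
x = String.new("foo")
other = x
x += "bar"
rebound = !x.equal?(other)

# << modifies the receiver in place; every reference sees the change.
a = String.new("foo")
alias_of_a = a
a << "bar"
same_object = a.equal?(alias_of_a)
```

After this runs, other still holds "foo" while x holds "foobar", whereas
a and alias_of_a are one object holding "foobar".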

> Yet does it need to? Well, only if there are other references to x's
> original object. If there are not then it may just as well be modified
> in place, it won't effect anything.

That's really more a matter of GC tuning. Figuring out whether there are
other references to that object is essentially doing a good chunk
of a garbage collection. Doing this on every assignment does not
instinctively strike me as a performance improvement. By separating
the marking and freeing phases of the GC you could make unreferenced
objects available for reuse, with potential modification, and achieve
pretty much the same effect (though in the case above you'd be finding
another now-unreferenced object of the same class, modifying it, and
setting x to reference it, which frees up the object that x used
to reference for the same treatment later on). But given the type of
tactics GCs are using these days to deal with, e.g., short-lived versus
long-lived objects, this might just as easily backfire on you and kill
the GC's performance.

For this sort of thing, you need to look at the entire system, not just
object creation, because the system may be tuned such that it's actually
cheaper to create and delete a lot of short-lived objects than it is to
keep fewer long-lived objects around.
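
If you want numbers rather than intuition, one place to start is
counting allocations. This is only a sketch: it assumes an MRI recent
enough to expose GC.stat(:total_allocated_objects), the allocations
helper is made up for the example, and an allocation count is just one
input. It says nothing about what those allocations actually cost the
system as a whole.

```ruby
# Hypothetical helper: how many objects were allocated while the block ran.
# Relies on MRI's GC.stat(:total_allocated_objects).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

suffix = "y"

# += creates a new String on every iteration.
plus_allocs = allocations do
  x = String.new("x")
  1_000.times { x += suffix }
end

# << keeps modifying the same String.
append_allocs = allocations do
  x = String.new("x")
  1_000.times { x << suffix }
end
```

Comparing plus_allocs with append_allocs shows the difference in object
churn; whether that difference matters is something only timing the
whole program can tell you.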

cjs
-- 
Curt Sampson  <cjs / cynic.net>   +81 90 7737 2974   http://www.NetBSD.org
      Make up enjoying your city life...produced by BIC CAMERA