On Thu, Apr 17, 2003 at 10:11:55PM +0900, David King Landrith wrote:
> >Do you think that mmap will get me speeds near cp?
> 
> I've never looked at the source for GNU fileutils, so I don't know how 
> cp works.  [BEGIN SPECULATION] The fact that much of it is a straight 
> bit copy of binary data may allow for optimizations that the more 
> general approach we're using (in which our reading of the data allows 
> pretty much any use of it) cannot make.  So cp may well remain somewhat 
> faster.  [END SPECULATION]  Perhaps someone else on the list can speak 
> on this topic with more authority.

I have the source code for FreeBSD's cp on my machine. If the file is under
8M, it mmaps the source for reading and calls write() repeatedly on the
target file until everything has been written. Otherwise it goes into a
simple read() / write() loop with a buffer of MAXBSIZE, which is 65536 bytes.

There is a comment which says:

        /*
         * Mmap and write if less than 8M (the limit is so we don't totally
         * trash memory on big files.  This is really a minor hack, but it
         * wins some CPU back.
         */

I guess the main difference is that instead of two syscalls for each 64K of
data, you have one syscall for however much data write() will accept at
once.

If you're doing things where this is significant, then Ruby is probably the
wrong language for you - unless you write your entire read-process-write
loop in C as a single function, in which case using Ruby as a convenient way
to start it is fine. If it calls 'yield' on each line of the input, then
that overhead is almost certain to dominate...

Regards,

Brian.