On Sun, Nov 28, 2004 at 05:27:53AM +0900, Robert Klemme scribed:
> 
> >kludges like splitting the data into several system calls.
> 
> Your example is quite special.  Usually, when writing servers that serve 
> huge chunks of data (like HTTP servers that also serve binary content, e.g. 
> for download) then the usual (and proper) approach is to copy the file in 
> chunks.  Nobody writes a server that reads a 1GB file into memory first 
> before sending it over the line.  So IMHO your test case is a bit 
> artificial.

  Actually, this is an interesting special case -- most high-performance
webservers and FTP servers do exactly what you've described using
the sendfile() system call or by mmap()ing the file and then
sending the mapped region.  The reason to use sendfile() is to reduce
the number of data copies.  The kernel feeds the socket straight from
its file cache, so the data never passes through a user-space buffer
(zero-copy write):

  fd = open("file", O_RDONLY);
  sock = accept(...);

  /* FreeBSD argument order; Linux takes (sock, fd, ...) */
  sendfile(fd, sock, ...);

Unfortunately, sendfile() doesn't seem to have particularly friendly
nonblocking IO semantics or an asynchronous IO implementation.
You _can_ call sendfile() in a select loop, but on a nonblocking
socket each call only sends as much as fits in the socket buffer,
so you pay for a select() and a sendfile() per bufferful, which
greatly bumps up your system call count.  Kernel threads are pretty
much the only way to get around it.

sendfile() is a very nice trick for high-performance servers.
It could be, for instance, a convenient tiny C component to
add to WEBrick or other HTTP server frameworks to reduce some of
the overhead of file transmission by taking all of the
data touches out of Ruby and putting them in the OS.

  -Dave

-- 
work: dga / lcs.mit.edu                          me:  dga / pobox.com
      MIT Laboratory for Computer Science           http://www.angio.net/