Clifford Heath <cjh_nospam / managesoft.com> writes:

> Note that Linux "cat" doesn't move the data twice. Instead it mmap's

That's strange. My GNU cat does not use mmap. It uses read() and
write().

> Either way, ruby can't do this, but it can use fread! What about it Matz?
> $ time ./cat_read < /tmp/ten_megabytes > /dev/null
> real 0m0.085s
> user 0m0.000s
> sys 0m0.080s
> $ time ./cat_fread < /tmp/ten_megabytes > /dev/null
> real 0m0.086s
> user 0m0.000s
> sys 0m0.080s

In the above benchmark, fread(3) and read(2) do not differ by much.
But, at least in glibc 2.2.4, fread(3) is merely a portability layer on
top of read(2). So, theoretically, using read(2) directly should be at
least as fast as fread(3).

On another note, mmap cannot be used as a generic reading mechanism. It
requires an fd that refers to something mappable, such as a regular
file, so $stdin (which may be a pipe or a terminal) would have to be
handled differently from other IO objects. That is too much hassle, and
for the case of cat-ing there won't be any improvement, since the file
access is linear, not random. In fact, hunting down mmap() in filemap.c
in the Linux 2.4.18 source (which gave me a headache) makes me think
that for strictly linear access, mmap will suffer because of its
overhead.

> Since fread is almost as fast as read, the restriction on not mixing
> sysread and read could perhaps be relaxed too?

You're not the only one confused about the existence of both #sysread
and #read; I am too. Both rb_io_read and rb_io_sysread do basically the
same thing. The only difference is that one uses getc(3) and the other
uses read(2). Since they are not on the same layer (getc(3) goes through
the stdio buffer, read(2) goes straight to the fd), calling one after
the other confuses the system. Simply changing #sysread to use fread(3)
would eliminate the confusion, and the price is a very small overhead.

But Matz didn't do it. Is there anything that can be done with read(2)
but can't be done with fread(3)? If not, then the only reason I can
think of is that #sysread is there to let you use the full capability
of the OS. Could this be true?

YS.
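
The layering problem with mixing #read-style calls and #sysread can be tried from Ruby itself. What follows is only a minimal sketch, not code from this thread: the helper names mix_buffered_and_sysread and sysread_only and the temp-file contents are made up for illustration, and it assumes a Ruby recent enough to provide Tempfile.create. The idea is that a buffered getc may pull far more than one character off the descriptor, so a following sysread on the same IO object no longer sees the bytes you would expect.

```ruby
require "tempfile"

# Hypothetical helper (not from the thread): read one character through
# the buffered layer, then try an unbuffered sysread on the same IO.
# Returns the character plus either the sysread result or the exception
# that sysread raised.
def mix_buffered_and_sysread(path)
  File.open(path, "r") do |f|
    first = f.getc              # buffered: may read a whole block ahead
    begin
      [first, f.sysread(4)]     # unbuffered: goes straight to the descriptor
    rescue IOError => e         # EOFError is a subclass of IOError
      [first, e]
    end
  end
end

# Hypothetical helper: the same two reads, both on the unbuffered layer.
def sysread_only(path)
  File.open(path, "r") { |f| [f.sysread(1), f.sysread(4)] }
end

Tempfile.create("mix") do |tmp|
  tmp.write("abcdefghijklmnopqrstuvwxyz")
  tmp.flush
  p mix_buffered_and_sysread(tmp.path)
  p sysread_only(tmp.path)
end
```

When both reads stay on the sysread layer, the second read simply continues where the first one stopped. When getc comes first, the sysread does not hand back the next four letters; in the Ruby versions I have tried it is refused with an IOError about buffered IO, though the exact behaviour and message should be treated as version-dependent.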