yes, I can see that most of the time in the
script is spent copying the file, more so when
the file is huge.

the sh script is doing classic unix, gluing together
several smaller utilities, and I thought these
separate processes would take their toll in
overhead.

Note how the sh script determines if a file is a link:
    `ls -l foo.c | grep '>'`
it's looking for the "pointer" that "ls -l" prints out
to the screen indicating a link to another file!
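
For comparison, here is a minimal Ruby sketch of the same check and
replacement. It assumes a current Ruby with the standard fileutils
library, and the method name replace_link_with_copy is made up for
illustration; it is not the script we use at work.

```ruby
require "fileutils"

# Hypothetical helper: replace a symlink with a copy of the file it points to.
# Returns true if a link was replaced, false if path was not a symlink.
def replace_link_with_copy(path)
  # File.symlink? asks the filesystem directly, so no parsing of ls output
  return false unless File.symlink?(path)

  target = File.realpath(path)  # resolve the link to the real file
  File.unlink(path)             # remove the link itself, not the target
  FileUtils.cp(target, path)    # copy the target's contents into place
  true
end
```

Called as replace_link_with_copy("foo.c"), this leaves a regular file
where the link used to be. A dangling link makes File.realpath raise,
which seems preferable to silently deleting the link.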

we use the sh script at work in a makefile where
it gets called many times and I thought if I could
rewrite it in ruby it would be much faster, but it
turns out to be much slower. 

My enthusiasm for ruby just hit a speed bump.
- greg s.

In article <slrn98npeo.3to.behrends / allegro.cse.msu.edu>,
behrends / cse.msu.edu  wrote:

> greg strockbine (gstrock / pacbell.net) wrote:
>> why is ruby so damn slow?
>> 
>> I don't understand this.  I rewrote a /bin/sh script in ruby and the sh
>> script runs much faster.  I ran the script on  both Linux and Solaris. 
>> The script removes a symlink and  replaces it with a copy of the linked
>> to file.  
> 
> You are essentially matching /bin/cp against ruby's File.syscopy routine
> (written in ruby), not sh against ruby.
> 
>> The ruby script took 20 seconds and the sh script took 7 seconds.  I
>> don't remember the file size.   I don't understand the difference.  The
>> sh script looks so sloppy.
> 
> For what it's worth, I can't quite reproduce your numbers. While ruby is
> (naturally, we have some overhead) a bit slower than /bin/cp, it's not
> nearly by that much. Remember that when you're working on UNIX,
> benchmarking file operations can be extremely tricky. For instance:
> 
> * The disk cache is going to throw everything off entirely. If the
>   file has been read/copied before, chances are that it is still cached
>   in memory, and will take only a fraction of the time to read again.
> 
> * Conversely, writing the file may be miraculously fast, but unless
>   you add a sync right after the last write, you will have zero reliable
>   information as to how long _that_ takes.
> 
> * Networked filesystems add another huge question mark.
> 
> Essentially, to benchmark file system operations, you will have at the
> very least to start with a freshly booted system and take a few
> precautions to make sure that you actually measure what you want to
> measure. And without further information, it's really not possible to
> tell exactly what went wrong. As I mentioned above, I cannot reproduce
> those discrepancies.
> 
> 			Reimer Behrends
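
To illustrate the point about sync in the reply quoted above, here is
a rough Ruby sketch that times a copy and includes the flush to disk
in the measurement. The method name timed_copy is hypothetical, and
the sketch does nothing about the read cache, so a second run on the
same file is still likely to look faster than the first.

```ruby
require "benchmark"

# Hypothetical helper: copy src to dst and return the elapsed wall-clock
# time in seconds, including the time needed to flush the data to disk.
def timed_copy(src, dst)
  Benchmark.realtime do
    File.open(src, "rb") do |input|
      File.open(dst, "wb") do |output|
        IO.copy_stream(input, output)
        output.fsync  # without this the data may only have reached the cache
      end
    end
  end
end
```

Something like puts timed_copy("big.dat", "big.copy") prints the
elapsed seconds; the file names here are placeholders.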