On 09.11.2008 18:04, Sebastian Newstream wrote:
> Hello fellow Rubyists!
> 
> I'm trying to impress my boss and co-workers with Ruby so we
> can hopefully start using it at work more often. I was given
> the task of moving a *large* repository of images from one
> source to another. The repository consists of around 1,750,000
> images and requires around 350 GB of space.

> My question is this: How do I speed up my application?
> I reused my file handle and skipped printing to the console,
> but it is still taking a long time.
> 
> Also, if anyone has previous experience handling this many files,
> any tips are welcome. I'm quite worried that the array containing
> the paths to all the files will flood the stack.

Sorry to disappoint you, but copying this amount of data won't be fast 
regardless of the programming language.  You do not mention what a 
"source" is in your case, which operating systems are involved, or what 
transport medium you intend to use (local, network).  If you need to 
transfer over a network, in my experience tar with a pipe works pretty 
well.  But no matter what you do, the slowest link determines your 
throughput: you cannot go faster than the network or than your "sources" 
can read or write.

Here's the tar variant; since you are copying images I assume the data 
is already compressed and needs no further compression (on your 
favorite Unix shell prompt):

$> ( cd "$source" && tar cf - . ) | ssh user@target "cd '$target' && tar xf -"
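Since you want to show off Ruby anyway, you can drive the same pipeline 
from a Ruby script.  A minimal sketch - source, target and the ssh login 
are placeholders for your setup:

  source = '/path/to/source'   # local image repository (placeholder)
  target = '/path/to/target'   # directory on the remote host (placeholder)
  host   = 'user@target'       # ssh login for the target machine

  cmd = %{( cd "#{source}" && tar cf - . ) | } +
        %{ssh #{host} "cd '#{target}' && tar xf -"}

  # system() hands the string to /bin/sh, so the pipe works exactly as
  # on the shell prompt; it returns true only on a zero exit status.
  system(cmd) or abort "transfer failed"

All the real work is still done by tar and ssh; Ruby just glues the 
pieces together, which is about all it can usefully add here.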

If you can physically move the source disk to the target host and then 
do a local copy with cp -a, that's probably the fastest you can go - 
unless the physical move takes ages (e.g. to the moon or other remote 
locations).

Kind regards

	robert