On 15/11/06, Devesh Agrawal <dagrawal / cs.umass.edu> wrote:
> Hi,
> Farrel Lifson wrote:
> > On 15/11/06, Devesh Agrawal <dagrawal / cs.umass.edu> wrote:
> >> [snip: the original post describes reading and parsing many
> >> traceroute files, where each line is approx 200-400 bytes; just
> >> reading all the lines is already slow, and it asks whether coding
> >> the readline part in C using RubyInline, or doing fixed-size binary
> >> IO, would be faster]
> >
> > Could you not parallelise the processing of each file? Perhaps using
> > something like starfish (http://www.rufy.com/starfish/doc/)?
>
> Did you mean parallelizing across multiple files, or parallelizing the
> processing of one file?
>
> Yes and no, but answering that means describing my problem in more
> depth:
>
> Each file contains traceroutes taken at many times t1, t2, .... The
> objective is to collect all traceroutes that happened at the same time
> into one structure. Hence I could do something like this: using Ruby
> threads (or whatever), read each file and store its lines in one
> *common* hashtable, then sync the hashtable to disk whenever it has
> grown large enough.
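
For what it's worth, here is a minimal sketch of that merge-and-flush
approach using a plain hash (the 'traces/*.log' layout, the timestamp
regexp, the flush threshold, and the Marshal output format are all
assumptions; note a time key spanning two flushes ends up split across
two chunks):

  MAX_KEYS = 10_000                # assumed flush threshold

  def flush(traces, out)
    Marshal.dump(traces, out)      # append one marshalled chunk per flush
    traces.clear
  end

  out = File.open('merged.dump', 'wb')
  traces = {}

  Dir.glob('traces/*.log').each do |path|
    File.foreach(path) do |line|
      t = line[/\A\S+/]            # assumes the timestamp is the first field
      next unless t
      (traces[t] ||= []) << line
      flush(traces, out) if traces.size >= MAX_KEYS
    end
  end
  flush(traces, out) unless traces.empty?
  out.close
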
>
> I was more hoping for some fast way to read lines, using, say, mmap or
> something like that; I read a few posts about how mmap helped someone
> else. I will look into starfish. I rejected Ruby threads because,
> unlike pthreads, they are not true native threads.
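
On the read speed: even without mmap, you can cut per-line IO overhead
by reading in large fixed-size chunks and splitting lines yourself. A
sketch using only core IO (the file name and the 1 MB chunk size are
assumptions to tune):

  CHUNK = 1 << 20                  # 1 MB per read

  File.open('big.trace', 'rb') do |f|
    leftover = ''
    while chunk = f.read(CHUNK)
      lines = (leftover + chunk).split("\n", -1)
      leftover = lines.pop || ''   # last element may be a partial line
      lines.each do |line|
        # ... per-line processing goes here ...
      end
    end
    # a final line with no trailing newline is still in 'leftover' here
  end
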
>
> Thanks for replying. Is there something wrong or inherently slow with
> what I am doing? Can it be sped up?
>
> > Farrel
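
On parallelising across the files: even without starfish, forking one
worker process per file is a cheap experiment. A sketch using only core
Process.fork (Unix only; the file layout and the per-line work are
placeholders, and the per-file outputs would still need a merge step):

  Dir.glob('traces/*.log').each do |path|
    fork do
      File.open(path + '.out', 'w') do |out|
        File.foreach(path) do |line|
          out.puts(line)           # stand-in for the real per-line work
        end
      end
    end
  end
  Process.waitall                  # wait for every worker to finish
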

Also try running your code on some sample data under the Ruby profiler
(just run 'ruby -rprofile yourcode.rb'); it should give you an idea of
where your program is spending its time.
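
If profiling the whole run is too slow, the stdlib Profiler__ module
lets you profile just the suspect region (a sketch; the loop here is
only a stand-in for your read/parse code):

  require 'profiler'

  Profiler__::start_profile
  10_000.times { "t1 192.0.2.1 10ms".split }   # the code under suspicion
  Profiler__::stop_profile
  Profiler__::print_profile($stderr)           # per-method time breakdown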

Farrel