2007/10/24, Daniel Berger <djberg96 / gmail.com>:
>
>
> On Oct 24, 2:34 am, "Robert Klemme" <shortcut... / googlemail.com>
> wrote:
> > 2007/10/23, Daniel Berger <djber... / gmail.com>:
> >
> > > Hi all,
> >
> > > Park Heesob and I came up with a custom implementation for
> > > IO.readlines using scattered I/O that I thought would be fun to share. I
> > > think I'm seeing a 2x performance increase, but page caching is making
> > > it difficult to tell. Also, it looks like the main profiling issue is
> > > the call to 'split' at the end, so you can remove that last bit of
> > > logic if you want to see the speed without it.
> >
> > > What do folks think? Are you seeing a performance increase? You
> > > probably won't see any noticeable difference unless the file is
> > > greater than 25 MB or so, btw.
> >
> > Can't test it at the moment.  But I wonder how a scattered read can
> > help with #readlines which reads a file sequentially.  AFAIK scattered
> > reading is for reading different portions of the same file.
>
> No, it's just a different technique for reading files. You
> pre-allocate the line buffers up front and it then reads the lines
> into those buffers asynchronously. But the resulting array still ends
> up "in order". At least, that's how it seems to work on MS Windows. :)

Ah, I see!  So the "scatter" does not refer to the user request but to
the fact that blocks of a file are likely scattered on the disk and
the OS attempts to do an optimized read.  Thanks for clarifying!
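
For anyone following along, here is a rough Ruby sketch of the idea as
described in the quoted reply: buffers are allocated up front, filled from
the file, and still end up in order, with the split into lines as the last
step. This is not the implementation Daniel and Park wrote, which I have
not seen. The method name chunked_readlines and the 64 KB default chunk
size are made up for illustration, and the buffers are filled one after
another with plain IO#read rather than handed to the OS in a single
asynchronous scatter request, so it only shows the shape of the approach,
not its performance.

```ruby
# Hypothetical helper, NOT the implementation discussed in this thread.
# Assumption: buffers are filled sequentially with IO#read(length, outbuf);
# a real scatter read would pass all buffers to the OS in one request.
def chunked_readlines(path, chunk_size = 64 * 1024)
  size  = File.size(path)
  count = (size + chunk_size - 1) / chunk_size # chunks needed, rounded up

  # Pre-allocate one buffer per chunk up front.
  buffers = Array.new(count) { String.new(capacity: chunk_size) }

  # Binary mode so no newline translation happens while filling buffers.
  File.open(path, "rb") do |file|
    buffers.each { |buf| file.read(chunk_size, buf) }
  end

  # The buffers are in file order, so joining them restores the content.
  # Splitting into lines is the final step, the part the original post
  # identified as the main cost when profiling.
  buffers.join.lines
end
```

Calling chunked_readlines("some_file.txt") should give the same lines as
IO.readlines for plain ASCII content, although the strings come back with
binary encoding because the file is opened in "rb" mode.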

Kind regards

robert