"Thomas Søndergaard" <tsondergaard / speakanet.com> writes:

> I hope I'm not violating the rubiqette by cross-posting this from
> www.rubygarden.org, but I find this is important, and matz has
> requested opinions about what the behaviour of IO should be. I hope
> to get more competent answers and not just my meaningless ramblings
> by posting this :-)
>
> http://www.rubygarden.org/article.php?sid=179
> http://www.rubygarden.org/ruby?IORCR

Has this discussion stalled?  Matz asked for a concrete proposal about
a "Stream" API but none materialized.  I understand his request -- I
am a bit fuzzy on what this RCR is actually requesting myself.

A few folks said "I want it to be like Java's xxx" or "I wanted to
implement Smalltalk's xyz but couldn't."  So we have the beginnings of
what is actually required.  I'm personally not familiar with either
Java's or Smalltalk's IO capabilities, so these references don't help
me much.

I can only add to the fuzzy requirements.  In a MIME multipart parser
I recently wrote, I wanted to be able to nest IO streams.  This would
allow me to write one stream that understood unix mbox format, another
that understood RFC2822 messages, another that could parse MIME
multiparts.

      For example, an IO stream could be defined that parses unix mbox
      files.  It would return data between "From " lines, returning
      EOF when the next message is reached.  The caller would then
      have to call #next on the stream to advance to the next message
      in the mailbox.  But in this way, you can read messages out of
      an mbox very similarly to how you'd read a single message out of
      a file.

      Another IO stream would parse email messages.  It would take
      another IO stream as an input source and return data until the
      first blank line (the message header).  Then #next would be
      called and it would return the body until eof.

      Another IO stream would parse multipart messages.  It would take
      another IO stream as input as well as a boundary marker (fetched
      from the message header).  It would then return the data from
      each part, returning eof after each one and requiring #next to
      be called for the next part.

      In the end you might have this:

      IO::File <- IO::MboxReader <- IO::MessageReader <- IO::MultipartReader

In other words, the MultipartReader would call read on the
MessageReader, and so on.  This way, the MultipartReader need not
worry about how the end of the message is reached (details of the Unix
mbox format).  You can even nest MultipartReaders when you run into
nested multipart MIME messages.
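
A minimal sketch of the first two links of that chain, to make the
nesting concrete.  This is not the actual implementation: the class
names follow the ones above, but the LineSection module, the use of
#gets between layers, and the sample data are assumptions made for
illustration.  Real mbox handling (">From " quoting, the blank line
before each "From " separator) is deliberately left out.

```ruby
require "stringio"

# Hypothetical helper: builds #read on top of a section-aware #gets.
module LineSection
  # Returns the rest of the current section ("" if nothing is left).
  def read
    out = +""
    while (line = gets)
      out << line
    end
    out
  end
end

# Hypothetical: exposes one mbox message at a time.
class MboxReader
  include LineSection

  def initialize(source)
    @source = source       # anything that responds to #gets
    @pending = nil         # a "From " line read ahead but not yet consumed
    @in_message = false
  end

  # Next line of the current message, or nil at the end of the message.
  def gets
    return nil if !@in_message || @pending
    line = @source.gets
    if line && line.start_with?("From ")
      @pending = line
      return nil
    end
    line
  end

  # Advances to the next message; returns false when the mbox is exhausted.
  def next
    nil while gets                     # discard any unread remainder
    from = @pending || @source.gets
    @pending = nil
    @in_message = !from.nil? && from.start_with?("From ")
  end
end

# Hypothetical: first section is the header, after #next comes the body.
class MessageReader
  include LineSection

  def initialize(source)
    @source = source
    @in_header = true
    @header_done = false
  end

  def gets
    return @source.gets unless @in_header
    return nil if @header_done
    line = @source.gets
    if line.nil? || line.chomp.empty?  # a blank line ends the header
      @header_done = true
      return nil
    end
    line
  end

  def next
    return false unless @in_header
    nil while gets                     # skip unread header lines
    @in_header = false
    true
  end
end

mbox = MboxReader.new(StringIO.new(<<~MBOX))
  From alice@example.com Mon Jan  1 00:00:00 2001
  Subject: first

  hello
  From bob@example.com Mon Jan  1 00:00:01 2001
  Subject: second

  world
MBOX

headers = []
bodies = []
while mbox.next
  msg = MessageReader.new(mbox)        # MessageReader reads from MboxReader
  headers << msg.read
  msg.next
  bodies << msg.read
end
```

The MessageReader never looks for "From " lines; it just sees EOF
from the layer below it, which is the point of the nesting.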

In my implementation, MboxReader, MessageReader and MultipartReader
are not actually IO objects.  They just support #read and #next --
they support reading chunks of data this way, but not most of the
useful methods of IO.

This made some parts of the implementation fairly difficult.  It would
have been handy to have been able to treat each reader as a file.

It would be cool if classes like MultipartReader could implement a few
simple methods (like read and optionally seek) and then be passed to
an IO::Wrapper class that turned them into full-fledged objects that
had all the neat methods of IO like each, each_byte, each_line, eof,
eof?, getc, gets, lineno, print, printf, seek, etc.
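
To illustrate the kind of wrapper meant here, a rough sketch follows.
No such class exists in Ruby; IOWrapper and TinyReader are
hypothetical names, and the sketch assumes the wrapped object offers
only #read(length), returning a String or nil at EOF.  Data is treated
as raw bytes, and only a handful of the methods listed above are
covered.

```ruby
# Hypothetical wrapper: derives IO-style methods from a bare #read(length).
class IOWrapper
  include Enumerable

  CHUNK = 4096

  attr_reader :lineno

  def initialize(reader)
    @reader = reader       # must respond to #read(length)
    @buffer = String.new   # bytes read from @reader but not yet handed out
    @lineno = 0
  end

  def eof?
    fill if @buffer.empty?
    @buffer.empty?
  end

  # Returns the next one-character String, or nil at EOF.
  def getc
    return nil if eof?
    @buffer.slice!(0, 1)
  end

  # Returns the next line including the separator, or nil at EOF.
  def gets(sep = "\n")
    until (idx = @buffer.index(sep))
      break unless fill
    end
    return nil if @buffer.empty?
    line = idx ? @buffer.slice!(0, idx + sep.length) : @buffer.slice!(0, @buffer.length)
    @lineno += 1
    line
  end

  def each_line
    return enum_for(:each_line) unless block_given?
    while (line = gets)
      yield line
    end
    self
  end
  alias each each_line     # lets Enumerable supply map, select, to_a, ...

  private

  # Pulls one more chunk from the reader; false once it is exhausted.
  def fill
    chunk = @reader.read(CHUNK)
    return false if chunk.nil? || chunk.empty?
    @buffer << chunk
    true
  end
end

# Stand-in for something like MultipartReader: it offers #read and nothing else.
class TinyReader
  def initialize(data)
    @data = data.dup
  end

  def read(length)
    @data.empty? ? nil : @data.slice!(0, length)
  end
end

io = IOWrapper.new(TinyReader.new("one\ntwo\nthree"))
first_char = io.getc
first_line = io.gets
remaining  = io.each_line.to_a
```

Seeking and the output methods (print, printf) are left out; they
would need more from the wrapped object than #read alone.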

-- 
matt