in response to George Ogata.

great idea, IMHO. i was starting to have similar notions, to a lesser
extent. i especially like how you bring IO into the fold. i think one of
the things that is becoming clear in this discussion is the desire to
have as consistent an interface as possible across all classes,
insofar as they are alike. currently there are a number of
inconsistencies begging to be fixed: between String and Array, between
String and IO, etc. so it would be nice to see those changes in the
future. at first this seemed like a very difficult thing to do without
breaking lots of code. but massilliano had a great notion for making it
easy to preserve backward compatibility (assuming that is reasonably
doable). so there is hope that these improvements can be made --and
matz's adored child can become an ever more beautiful creation.

question: how do you determine paragraph separation?

i think the #useX is a great notion. but we may be able to make it more
general. we already have the $/ record separator, but i do not care for
it myself --too perlish. and even so it doesn't support character
separation straightforwardly. so what if we had a more general
#separator=(x)? i would very much prefer something like this over $/=x.
but also x="" would mean each character, not "\n" to which it currently
defaults.
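
to make the #separator= notion concrete, here is a minimal sketch. the
class name SeparatedText and its whole interface are made up for
illustration; nothing like it exists in String or IO. it also assumes
the separators themselves get dropped, which is a separate question.

```ruby
# hypothetical sketch: SeparatedText and #separator= are illustrative
# names only, not part of ruby's String or IO.
class SeparatedText
  include Enumerable

  # the string that pieces are split on; "" means one character at a time
  attr_accessor :separator

  def initialize(text, separator = "\n")
    @text = text
    @separator = separator
  end

  # Enumerable builds collect, select, to_a, etc. on top of this
  def each(&block)
    pieces = @separator.empty? ? @text.chars : @text.split(@separator)
    pieces.each(&block)
    self
  end
end

t = SeparatedText.new("ab\ncd")
t.to_a                # splits on "\n" by default
t.separator = ""
t.to_a                # now yields single characters
```

the point is just that one setting drives #each, and everything in
Enumerable follows along.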

~transami



> Hi,
> 
> Yukihiro Matsumoto wrote:
> 
> > Hi,
> > 
> > In message "Re: is there a better string.each?"
> >     on 02/07/08, Tom Sawyer <transami / transami.net> writes:
> > 
> > |the only point i am trying to make is that in so far as both account for
> > |order and list, i want their exposed methodologies to be the same.
> > |that's it. (which is why .each should work differently)
> > 
> >   * a string is not an array.
> > 
> >   * but still, a string can be seen as an array of characters
> >     sometimes, hence str[0] returns the first character (a byte in the
> >     current implementation) in the string.
> > 
> >   * text processing model of Ruby is based on lines, not characters.
> >     That's why I made "each" to be line oriented, not character
> >     oriented.  I took usefulness over consistency here.
> > 
> > So, if you want String consistent with Array, you need to express
> > either:
> > 
> >   * why it is important over usefulness
> >   * or line-oriented "each" is not useful at all
> > 
> > good enough to bring incompatibility.
> > 
> > matz.
> 
> Well, "usefulness" depends on the application.  And it seems that 
> sometimes, strings are better seen as arrays (not Arrays!) of characters, 
> and at other times, as arrays of lines (and at yet other times, as arrays 
> of words or paragraphs).
> 
> Since there doesn't seem to be a universally happy medium, perhaps the 
> problem lies in assuming there is one.  I think the method "each" is the 
> problem: it's ambiguous ("each what?").
> 
> Couldn't we have separate methods (as suggested earlier) #bytes, #chars (or 
> #characters), #words (maybe), #lines, #pars (or #paragraphs) for String?  
> We could have identical methods for the IO class, so IOs and Strings can be 
> used interchangeably with these methods.  These methods could be both used 
> to iterate over the object (when called with a block), or used to retrieve 
> an Array (not array!) of the elements we're interested in (when called 
> without a block).  E.g.:
> 
> s = "abc\ndef\nghi"
> s.lines                            >> ["abc", "def", "ghi"]
> a = []
> s.lines {|l| a << (l + 'xyz')}     >> nil (or maybe something else)
> a                                  >> ["abcxyz", "defxyz", "ghixyz"]
> 
> Thus, io.lines &proc has the same effect as io.lines.each &proc (though 
> would hopefully require less memory when say, reading in a large file, 
> since the former doesn't need to read the whole thing into a gigantic Array 
> first).
> 
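
a rough sketch of the block / no-block behaviour described above, shared
by String and an IO-like object. the module name LineAccess and the
method name text_lines are made up here (chosen so as not to clash with
anything already defined); it assumes only that the receiver responds to
#each_line, and it strips the record separators as george wishes for.

```ruby
require 'stringio'

# hypothetical sketch: LineAccess and #text_lines are illustrative names.
module LineAccess
  # with a block: yield each line, separator removed, one at a time,
  # so nothing has to be collected into an Array first.
  # without a block: return an Array of the lines.
  def text_lines
    return enum_for(:text_lines).to_a unless block_given?
    each_line { |line| yield line.chomp }
    nil
  end
end

class String;   include LineAccess; end
class StringIO; include LineAccess; end

s = "abc\ndef\nghi"
s.text_lines                                # Array of lines
a = []
s.text_lines { |l| a << (l + 'xyz') }       # iterates, returns nil
StringIO.new(s).text_lines                  # same call on an IO-like object
```
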
> Now, what about #each?  Well, we've made it essentially useless for calling 
> directly (since the others are more readable and unambiguous (agree?)).  
> But there's still the interaction of String and IO with the Enumerable 
> mixin.  What about this?:
> 
> We have methods corresponding to the iterator/collector methods to set the 
> one that #each points to:  #useBytes, #useChars (or #useCharacters), 
> #useWords, #useLines, #usePars (or #useParagraphs).
> 
> These could be defined in their own mixin module:
> 
> module StringProcessing
>   def useBytes
>     class << self
>       alias each bytes
>     end
>   end
> 
>   def useChars
>     class << self
>       alias each chars
>     end
>   end
> 
>   def useWords
>     class << self
>       alias each words
>     end
>   end
> 
>   def useLines
>     class << self
>       alias each lines
>     end
>   end
> 
>   def usePars
>     class << self
>       alias each pars
>     end
>   end
> end
> 
> These would change the behaviour of #each for the String or IO instance 
> they're called on.  E.g.:
> 
> s = "abc\ndef\nghi"
> s.useChars
> s.collect {|char| frob char}  # iterates over chars
> s.useLines
> s.collect {|line| frob line}  # now it iterates over lines
> 
> The default behaviour would be to iterate over lines, so as to be backward 
> compatible.  Well, except for my little, implicit wish that the record 
> separators would be removed automatically.  E.g., I'd rather 
> "abc\ndef".lines would return ["abc", "def"] than ["abc\n", "def"].  But I 
> guess that's another war...  Still, even if the record separators stay, 
> this'd be a pretty flexible String/IO model, wouldn't it?
> 
> Thoughts?
> 
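
for anyone who wants to play with the #useX switching, here is a small
runnable sketch. it keeps the names useChars and useLines from the
proposal, but building them on #each_char and #each_line is my own
assumption, and the string has to be extended with Enumerable by hand.
the record separators stay on the lines here, so the example chomps them.

```ruby
# hypothetical sketch of the #useX idea: each call re-points #each on
# this one object only, and Enumerable's methods follow it.
module StringProcessing
  def useChars
    singleton_class.send(:alias_method, :each, :each_char)
    self
  end

  def useLines
    singleton_class.send(:alias_method, :each, :each_line)
    self
  end
end

s = String.new("ab\ncd")
s.extend(Enumerable, StringProcessing)

s.useChars
s.collect { |char| char.upcase }   # iterates over characters
s.useLines
s.collect { |line| line.chomp }    # now iterates over lines
```

since the alias lives in the object's singleton class, other strings are
left untouched.
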
-- 
~transami

"They that can give up essential liberty to obtain a little
 temporary safety deserve neither liberty nor safety."
	-- Benjamin Franklin