On Mon, 8 Jul 2002 20:43:57 +0900, George Ogata wrote:
> Well, "usefulness" depends on the application. And it seems that
> sometimes, strings are better seen as arrays (not Arrays!) of
> characters, and at other times, as arrays of lines (and at yet
> other times, as arrays of words or paragraphs).

> Since there doesn't seem to be a universally happy medium, perhaps
> the problem lies in assuming there is one. I think the method
> "each" is the problem: it's ambiguous ("each what?").

I don't actually find it ambiguous, although I think that it's
poorly documented. String#each is said to be the same as
String#each_line, but I would argue that it isn't. String#each_line
implies \n in all cases (or /\n+/m), but the way that it's
implemented is actually the same as the (not-implemented)
String#each_record. The default for String#each is to treat \n as
the record separator.
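
To make the record-separator reading concrete, here is a minimal sketch.
It uses String#each_line, which takes the same optional separator
argument; the sample strings are made up for illustration.

```ruby
# Hypothetical sample data, made up for illustration.
text = "alpha\nbeta\ngamma\n"

# No argument: the separator defaults to $/ (normally "\n"),
# so the records happen to be lines.
lines = []
text.each_line { |record| lines << record }

# Explicit argument: any string can act as the record separator.
# Each record keeps its trailing separator.
fields = []
"alpha,beta,gamma".each_line(",") { |record| fields << record }

p lines   # ["alpha\n", "beta\n", "gamma\n"]
p fields  # ["alpha,", "beta,", "gamma"]
```

Nothing here is line-specific; "\n" is just the default record
separator.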

If $/ (the default argument to String#each) could be made to accept a
regex, then an empty pattern, as in String#each(//), could serve as
the record separator, allowing character-by-character parsing. This
is not currently possible.
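
For comparison, String#split does accept a regex, so the
character-by-character view is available that way. This is only a
sketch of the idea, not the String#each(//) proposed above, and the
sample string is hypothetical.

```ruby
# Hypothetical sample string.
word = "ruby"

# An empty regex matches between every character, so split(//)
# returns one element per character.
chars = word.split(//)

chars.each { |c| puts c }
```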

Excepting String#each_byte, all of the other conditions are simply a
matter of choosing a different record separator. (Words could be
/\s/m; paragraphs could be /\n\n/m.)
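
A rough sketch of those two separators, assuming split for words
(since the separator argument takes only strings) and the
empty-string "paragraph mode" of each_line for paragraphs. The sample
text and the choice of /\s+/ are mine, not a definition anyone has
agreed on.

```ruby
# Hypothetical sample text: two paragraphs separated by a blank line.
text = "first line\nstill first\n\nsecond paragraph\n"

# Words: split on runs of whitespace (one possible definition).
words = text.split(/\s+/)

# Paragraphs: an empty-string separator puts each_line into
# paragraph mode, breaking records at blank lines.
paragraphs = []
text.each_line("") { |record| paragraphs << record }

p words
p paragraphs.length  # 2
```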

Strings, by and large, aren't better seen as arrays or Arrays of
anything -- but there are times when it is useful to see them so.
Where we obviously differ is on when those times are. As
an application developer, I can see few reasons for dealing with
strings as arrays of characters or bytes -- I am more interested in
dealing with whole strings or substrings. Library developers are
more likely to be interested in dealing with strings as arrays.

I don't think that your suggestion (String#bytes, String#chars,
String#words, etc.) is applicable to most cases -- and for string
entities larger than #chars, it is likely to do it The Wrong Way.

> Now, what about #each? Well, we've made it essentially useless for
> calling directly (since the others are more readable and
> unambiguous (agree?)).

Disagree. They are neither more readable nor unambiguous. The
definitions of a Word, a Paragraph, etc. are too varied to settle
with a single decision.
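
As a small illustration of how quickly "word" becomes a judgment
call, here are two equally plausible definitions applied to the same
(made-up) string. Neither is offered as the right one.

```ruby
# Hypothetical sample string.
sample = "don't stop-now"

# Definition 1: a word is anything between runs of whitespace.
by_whitespace = sample.split(/\s+/)

# Definition 2: a word is a run of word characters.
by_word_chars = sample.scan(/\w+/)

p by_whitespace  # ["don't", "stop-now"]
p by_word_chars  # ["don", "t", "stop", "now"]
```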

I think that the mistake here is equating String#each with
String#each_line. That isn't the case -- and when you look at the
documentation, it is clear that it really isn't.
String#each, as I said, is REALLY String#each_record -- and IMO
that's a good thing for it to be.

I don't much care for the mixins you've suggested, either.

-austin
-- Austin Ziegler, austin / halostatue.ca on 2002.07.08 at 08.59.35