On Mar 30, 2011, at 2:08 PM, 7stud -- wrote:
> Patrick Tyler wrote in post #990031:
>> Hello,
>>
>> I know that this has been covered a bit here:
>> http://www.ruby-forum.com/topic/186437 but I'm still not certain that I
>> understand.
>>
>> s = "foo"
>>
>> s[3] is nil, like I would expect.
>>
>> s[3,0] is "", instead of nil.
>
> That behaviour is contrary to the description in the 1.9.2 docs here:
>
>    http://www.ruby-doc.org/core/classes/Array.html

The docs certainly could be clearer, but the actual behavior is
self-consistent and useful.
Note: I'm assuming the 1.9.x version of String.

 It helps to consider the numbering in the following way:

  -4  -3  -2  -1    <-- numbering for single argument indexing
   0   1   2   3
 +---+---+---+---+
 | a | b | c | d |
 +---+---+---+---+
 0   1   2   3   4  <-- numbering for two argument indexing or start of range
-4  -3  -2  -1

The common (and understandable) mistake is to assume that the semantics
of the single argument index are the same as the semantics of the
*first* argument in the two argument scenario (or range).  They are not
the same thing in practice, and the documentation doesn't reflect this.
The error, though, is definitely in the documentation and not in the
implementation:

single argument:  the index represents a single character position
within the string.  The result is either the single character string
found at the index or nil because there is no character at the given
index.

  s = ""
  s[0]    # nil because no character at that position

  s = "abcd"
  s[0]    # "a"
  s[-4]   # "a"
  s[-5]   # nil, no characters before the first one

two integer arguments: the arguments identify a portion of the string to
extract or to replace.  In particular, zero-width portions of the string
can also be identified so that text can be inserted before or after
existing characters, including at the front or end of the string. In this
case, the first argument does *not* identify a character position but
instead identifies the space between characters as shown in the diagram
above.  The second argument is the length, which can be 0.

s = "abcd"   # each example below assumes s is reset to "abcd"

To insert text before 'a':        s[0,0] = "X"      #  "Xabcd"
To insert text after 'd':         s[4,0] = "Z"      #  "abcdZ"
To replace first two characters:  s[0,2] = "AB"     #  "ABcd"
To replace last two characters:   s[-2,2] = "CD"    #  "abCD"
To replace middle two characters: s[1,2] = "XX"     #  "aXXd"
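
The same edge numbering explains the reads in the original question.
Here is a small sketch, assuming a 1.9 or later String; the variable
names are mine and only there for illustration:

```ruby
s = "foo"   # character positions 0..2, edges 0..3

at_char_3 = s[3]      # single index: no character at position 3
at_edge_3 = s[3, 0]   # two arguments: edge 3 exists, zero-width slice
past_end  = s[4, 0]   # two arguments: there is no edge 4
clamped   = s[1, 10]  # the length is cut off at the end of the string

p at_char_3   # nil
p at_edge_3   # ""
p past_end    # nil
p clamped     # "oo"
```

So s[3,0] is "" because edge 3 (just after the last 'o') is a valid
place to start, while s[4,0] is nil because there is no such edge.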

The behavior of a range is pretty interesting. The starting point is the
same as the first argument when two arguments are provided (as described
above), but the end point of the range can be either a "character
position", as with single indexing, or an "edge position", as with two
integer arguments.  Which one you get depends on the kind of range: a
double-dot range ends at a character position (inclusive), while a
triple-dot range ends at an edge position:

s = "abcd"        # each assignment below assumes s is reset to "abcd"
s[1..1]           # "b"
s[1..1] = "X"     # "aXcd"

s[1...1]          # ""
s[1...1] = "X"    # "aXbcd", the range specifies a zero-width portion of the string

s[1..3]           # "bcd"
s[1..3] = "X"     # "aX",  positions 1, 2, and 3 are replaced.

s[1...3]          # "bc"
s[1...3] = "X"    # "aXd", positions 1 and 2, but not 3, are replaced.
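
One way to check the range examples is to convert each range to the
two-argument form and compare results. The helper below is hypothetical
(it is not part of Ruby) and is only a sketch that assumes non-negative,
in-bounds endpoints with start <= end:

```ruby
# Hypothetical helper, not part of Ruby: turns a range into the
# equivalent [start, length] pair. Assumes 0 <= begin <= end, in bounds.
def range_to_start_length(range)
  length = range.end - range.begin
  length += 1 unless range.exclude_end?   # ".." includes the end character
  [range.begin, length]
end

s = "abcd"
[1..1, 1...1, 1..3, 1...3].each do |r|
  start, length = range_to_start_length(r)
  puts "s[#{r}] => #{s[r].inspect}   s[#{start}, #{length}] => #{s[start, length].inspect}"
end
```

Each line printed should show the same string on both sides, which is
just another way of saying that the triple-dot range ends at an edge
and the double-dot range ends one character later.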


If you go back through these examples and insist on using the single
index semantics for the two argument or range indexing examples, you'll
just get confused.  You've got to use the alternate numbering I show in
the ASCII diagram to model the actual behavior.
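
If it helps, the edge model can be written out in a few lines of Ruby
and compared against the real thing. This is only my sketch of the
model, not how String is actually implemented, and slice_by_edges is a
made-up name:

```ruby
# Hypothetical helper, not part of Ruby: models s[start, length] using
# edge numbering, building the result from single-character lookups.
def slice_by_edges(s, start, length)
  edges = s.length                       # edges run from 0 to s.length
  start += edges if start < 0            # negative edges count back from the end
  return nil if start < 0 || start > edges || length < 0
  stop = [start + length, edges].min     # clamp the right-hand edge
  (start...stop).map { |i| s[i] }.join   # characters between the two edges
end

s = "abcd"
(-6..6).each do |start|
  (0..5).each do |length|
    expected = s[start, length]
    actual   = slice_by_edges(s, start, length)
    raise "mismatch at [#{start}, #{length}]" unless expected == actual
  end
end
puts "edge model agrees with String#[] for every case tried"
```

Note that the only place the single-index form appears is inside the
loop that picks up the characters lying between the two edges.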


Gary Wright