Issue #5120 has been updated by Alexey Muranov.


I understand why
",,1".split(',')  # => ["","","1"]
and why
".5".split('.')  # => ["","5"]
By the same logic, though, ",1,".split(',') should return ["", "1", ""], whereas it actually returns ["", "1"].
It is not clear why a negative limit has to be passed to get that result:
",1,".split(',',-1)  # => ["", "1", ""]

The decision to discard trailing empty elements seems arbitrary (perhaps it is aimed at processing input from particular programming languages or applications, where optional parameters are placed at the end?).
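
As far as the examples in this thread show, the default result equals the result with a negative limit once the trailing empty strings are removed. A minimal sketch of that relationship follows; the helper name split_dropping_trailing_empties is hypothetical and exists only for illustration.

```ruby
# Hypothetical helper (not part of Ruby): emulates the default behavior
# by splitting with limit -1 and then dropping trailing empty strings.
def split_dropping_trailing_empties(str, sep)
  parts = str.split(sep, -1)          # keeps every field, including trailing ""
  parts.pop while parts.last == ""    # discard empty fields at the end only
  parts
end

p ",1,".split(',', -1)                          # => ["", "1", ""]
p split_dropping_trailing_empties(",1,", ',')   # same as ",1,".split(',')
p split_dropping_trailing_empties("aa", 'a')    # same as "aa".split('a')
```

Leading empty strings survive in both forms; only the ones at the end are dropped.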

Splitting on the empty string, on the other hand, does not make sense, because it cannot be made consistent with these examples.
In my opinion, splitting on the empty string should be forbidden.
To obtain the array of letters (in the given encoding), it would be more logical to introduce a #letters method or to use #split without parameters.
The current implementation of split('') seems inconsistent with the rest. Why
"ab".split('')  # => ["a", "b"] and not ["", "a", "b"] or ["", "a", "", "b"]?
"ab".split('',-1)  # => ["a", "b", ""] and not ["", "a", "b", ""]?
Does splitting on the empty string work this way because that is how the general implementation behaves when given the empty string, or is it implemented as a separate case?
In the latter case, it is not a good solution.
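
For comparison, here is a minimal sketch of how splitting on the empty string relates to iterating over characters. It uses the standard String#chars method; .to_a is added because the return type of chars differs between Ruby versions. It only demonstrates the observable behavior and says nothing about how split is implemented internally.

```ruby
str = "ab"

by_split = str.split('')        # one element per character
by_chars = str.chars.to_a       # same characters, without going through split

p by_split == by_chars          # => true
p str.split('', -1)             # => ["a", "b", ""], a trailing "" but no leading one
```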

Alexey.
----------------------------------------
Feature #5120: String#split needs to be logical
http://redmine.ruby-lang.org/issues/5120

Author: Alexey Muranov
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 


I would call this a bug, but I am new to Ruby, so I am reporting it as a feature request.

Here are examples showing surprising and inconsistent behavior of the String#split method:

"aa".split('a')  # => []
"aab".split('a')  # => ["", "", "b"]

"aaa".split('aa')  # => ["", "a"] 
"aaaa".split('aa')  # => []
"aaaaa".split('aa')  # => ["", "", "a"] 

"".split('')  # => []
"a".split('')  # => ["a"]

What is the definition of *split*?
In my opinion, a simple definition should be given that makes it clear what to expect.
For example:

  str1.split(str2) returns a maximal array of non-empty substrings of str1 which can be concatenated with copies of str2 to form str1.

This definition can be refined further to make clear what to expect as the result of "baaab".split("aa").
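
One property that such a definition suggests can be checked directly: joining the pieces with copies of the separator should reproduce the original string. As a sketch, the hypothetical helper round_trips? below tests this for the default call and for a negative limit. It is an illustration only, not a proposed implementation.

```ruby
# Hypothetical helper: does joining the pieces with sep give back str?
def round_trips?(str, sep, *limit)
  str.split(sep, *limit).join(sep) == str
end

p round_trips?("aa", 'a')          # => false, default split returns []
p round_trips?("aa", 'a', -1)      # => true, ["", "", ""] joins back to "aa"
p "baaab".split("aa")              # => ["b", "ab"], matches are taken left to right
```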

Thanks for your attention.


-- 
http://redmine.ruby-lang.org