"Robert Klemme" <bob.news / gmx.net> writes:

> "Lloyd Zusman" <ljz / asfast.com> wrote in message
> news:m3vfgq6hg6.fsf / asfast.com...
>> "Robert Klemme" <bob.news / gmx.net> writes:
>>
>> > "Cameron McBride" <cameron.mcbride / gmail.com> wrote in message
>> > news:dcedf5e204071320562e0ad096 / mail.gmail.com...
>> >> > What possible benefit is there to typing split(" ") vs. split(/\s/)?
>> >> > One saved character (but two shift key presses!)?
>> >>
>> >> they are not the same:
>> >>
>> >> irb(main):001:0> s = "this is\tfun \tno?"
>> >> => "this is\tfun \tno?"
>> >> irb(main):003:0> s.split(" ")
>> >> => ["this", "is", "fun", "no?"]
>> >> irb(main):002:0> s.split(/\s/)
>> >> => ["this", "is", "fun", "", "no?"]
>> >
>> > I'd rather compare split(" ") to split(/\s+/), which is what I use
>> > when I need this functionality.  [ ... ]
>>
>> However, the two cases are not equivalent:
>>
>>   irb(main):001:0> "  spaces  of  doom  ".split(/\s+/)
>>   => ["", "spaces", "of", "doom"]
>>   irb(main):002:0> "  spaces  of  doom  ".split(" ")
>>   => ["spaces", "of", "doom"]
>>
>> You'd have to compare split(" ") with strip.split(/\s+/).  I'll do that
>> later this morning, when I have more time, and I'll then post my
>> results.
>
> You're right, the strip makes
>
> [ ... etc. ... ]

Well, you saved me some time by running these yourself.  Thanks.

Hmm ... if you know for sure ahead of time whether or not there's
leading whitespace, split(' ') is not the best.

However, without this knowledge about the existence of leading
whitespace or lack thereof, I believe that the best bet is still
split(' ') and its cousins split(nil) and split().
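
To make the "cousins" remark concrete, here is a minimal sketch showing
that the three forms return the same thing.  The sample strings and the
awk_splits helper are made up for illustration, and the sketch assumes
the global field separator $; is still at its default of nil:

```ruby
# Hypothetical sample inputs with leading, trailing and repeated whitespace.
SAMPLES = ["  spaces  of  doom  ", "no leading", "\ttabs\tand  spaces "]

# Returns the results of the three awk-style split forms for one string.
# Assumes $; is nil (the default), so split(nil) and split() fall back
# to splitting on runs of whitespace.
def awk_splits(s)
  [s.split(' '), s.split(nil), s.split]
end

SAMPLES.each do |s|
  a, b, c = awk_splits(s)
  # All three forms skip leading whitespace and collapse runs of it.
  puts "#{s.inspect} => #{a.inspect} (all equal: #{a == b && b == c})"
end
```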

Using a random number of spaces between the items and a random amount of
leading whitespace (including none), I got the following results.  Note
that the split(' ')/split(nil)/split() cases are the fastest ones when
you leave out the split(/\s+/) case.  That one should really be left out
of these random whitespace tests, because it doesn't give the same
results as the others.
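
For anyone who wants to check the "doesn't give the same results" claim
on random data, here is a rough sketch.  It counts how often each regexp
variant disagrees with split(' ').  The helper names, the hash keys and
the fixed seed are my own assumptions, not part of the tests in this post:

```ruby
srand(1) # arbitrary fixed seed (an assumption) so the run is repeatable

# Same construction as the test data: the numbers 1..100 with 0 to 2
# spaces before each one, so a line may or may not start with whitespace.
def random_line
  (1..100).map { |x| (' ' * rand(3)) + x.to_s }.join
end

# Counts, per variant, how many lines split differently from split(' ').
def mismatch_counts(lines)
  counts = Hash.new(0)
  lines.each do |s|
    reference = s.split(' ')
    counts[:plain_regex] += 1 unless s.split(/\s+/) == reference
    counts[:strip_regex] += 1 unless s.strip.split(/\s+/) == reference
    counts[:sub_regex]   += 1 unless s.sub(/^\s+/, '').split(/\s+/) == reference
  end
  counts
end

lines = Array.new(1000) { random_line }
counts = mismatch_counts(lines)
[:plain_regex, :strip_regex, :sub_regex].each do |name|
  puts "#{name}: #{counts[name]} mismatches out of #{lines.size}"
end
```

Only the plain split(/\s+/) variant should show mismatches, namely on
the lines that happen to start with whitespace.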

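  # Build 1000 test strings; each holds the numbers 1..100, with 0 to 2
  # spaces before every number (so leading whitespace may be absent).
  testArray = []

  1000.times {
    string = ''
    (1..100).each { |x| string += ((" " * rand(3)) + x.to_s) }
    testArray << string
  }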

  # Load the profiler only now, so building testArray is not profiled.
  require 'profile'

  def test1(s) s.split(' ') end
  def test2(s) s.split(nil) end
  def test3(s) s.split() end
  def test4(s) s.split(/\s+/) end
  def test5(s) s.strip.split(/\s+/) end
  def test6(s) s.sub(/^\s+/, '').split(/\s+/) end

  testArray.each { |x| test1(x) }
  testArray.each { |x| test2(x) }
  testArray.each { |x| test3(x) }
  testArray.each { |x| test4(x) }
  testArray.each { |x| test5(x) }
  testArray.each { |x| test6(x) }
  
  %   cumulative   self              self     total
   time   seconds   seconds    calls  ms/call  ms/call  name
   33.17     3.80      3.80     6000     0.63     0.63  String#split
   31.54     7.42      3.62        1  3617.19  3617.19  Profiler__.start_profile
   24.59    10.24      2.82        6   470.05  1911.46  Array#each
    8.17    11.18      0.94     1000     0.94     1.66  Object#test5
    6.68    11.95      0.77     1000     0.77     1.93  Object#test6
    6.34    12.67      0.73     1000     0.73     1.20  Object#test1
    6.27    13.39      0.72     1000     0.72     1.60  Object#test4
    6.27    14.11      0.72     1000     0.72     1.13  Object#test3
    5.18    14.70      0.59     1000     0.59     1.12  Object#test2
    2.18    14.95      0.25     1000     0.25     0.25  String#sub
    1.16    15.09      0.13     1000     0.13     0.13  String#strip
    0.00    15.09      0.00        6     0.00     0.00  Module#method_added
    0.00    15.09      0.00        1     0.00 11468.75  #toplevel
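
The profiler adds a lot of overhead of its own (note the
Profiler__.start_profile line above), so here is a sketch of the same
comparison using the standard Benchmark module instead.  This is not
what produced the numbers above; the constant names, the labels, the
time_variants helper and the fixed seed are all my own assumptions:

```ruby
require 'benchmark'

srand(42) # fixed seed is an assumption, only to make runs repeatable

# Same data shape as the test above: 1000 strings, each holding the
# numbers 1..100 with 0 to 2 spaces before every number.
TEST_STRINGS = Array.new(1000) do
  (1..100).map { |x| (' ' * rand(3)) + x.to_s }.join
end

# The six variants from the post, keyed by a descriptive label.
VARIANTS = {
  "split(' ')"                    => ->(s) { s.split(' ') },
  "split(nil)"                    => ->(s) { s.split(nil) },
  "split()"                       => ->(s) { s.split },
  "split(/\\s+/)"                 => ->(s) { s.split(/\s+/) },
  "strip.split(/\\s+/)"           => ->(s) { s.strip.split(/\s+/) },
  "sub(/^\\s+/, '').split(/\\s+/)" => ->(s) { s.sub(/^\s+/, '').split(/\s+/) }
}

# Returns a hash of label => wall-clock seconds for one pass over strings.
def time_variants(strings, variants)
  variants.each_with_object({}) do |(label, fn), out|
    out[label] = Benchmark.realtime { strings.each { |s| fn.call(s) } }
  end
end

time_variants(TEST_STRINGS, VARIANTS).each do |label, secs|
  printf("%-30s %.4f s\n", label, secs)
end
```

A single pass is noisy, so repeating the run a few times before drawing
conclusions is probably wise.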
  
--
 Lloyd Zusman
 ljz / asfast.com
 God bless you.