On Wed, Jun 30, 2010 at 8:32 AM, Cs Webgrl <cschaller / gmail.com> wrote:

> Hello,
>
> I am working with scraping quite a bit of data and I would like to make
> sure that I'm following some best practices for string manipulation.  I
> would like to be sure to take into account any speed and garbage
> collection issues.
>
> Does anyone know of any posts, websites, books or other resources that
> provide  "do this, not that" types of guidance?
>
> For example, my understanding is that globbing everything into one line
> when manipulating a string is not the best use of resources.
>
> not good
> "string+var".gsub('+','').strip.capitalize
>
>
> better
> s = "string+var
> s.gsub('+','')
> s.strip!
> s.capitalize
> s => 'String Var'
>
> Are there resources that explain why one is better than the other that
> also provides more best practices like this?
>
> Thanks.
> --
> Posted via http://www.ruby-forum.com/.
>
>
I don't know of a specific site, but if you do not need to keep the original
value of string, then string << var is generally better than string + var:
<< mutates the receiver, whereas + creates a new String object. I once read
benchmarks about this, but I can't remember where I read them, and I can't
seem to recreate them, so maybe I am wrong.

# plus returns a new String
string, var = 'abc', 'def'
string + var  # => "abcdef"
string        # => "abc"

# << mutates the receiver
string << var # => "abcdef"
string        # => "abcdef"
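
Since I couldn't reproduce the benchmarks I remember, here is a minimal
sketch of how one might measure it with the standard Benchmark library. The
helper names and the iteration count N are my own placeholders, not from any
published benchmark, and the timings will vary by Ruby version and machine,
so I am not quoting any numbers.

```ruby
require 'benchmark'

# N is an arbitrary iteration count chosen for illustration.
N = 20_000

# Builds a string with +=, which creates a new String on every iteration.
def build_with_plus(n)
  s = ''
  n.times { s += 'x' }
  s
end

# Builds a string with <<, which appends to the same String object.
def build_with_append(n)
  s = ''.dup # dup so the string is mutable even if literals are frozen
  n.times { s << 'x' }
  s
end

Benchmark.bm(8) do |bm|
  bm.report('+=') { build_with_plus(N) }
  bm.report('<<') { build_with_append(N) }
end
```

Both helpers return the same text; only the number of intermediate objects
differs.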



You can use s.delete('+') instead of s.gsub('+','') and it will be faster,
prettier, and more expressive.
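
A small sketch of the difference, using the string from your example. The
variable names are mine. I have also included String#tr on the assumption
that you actually wanted the '+' turned into a space, as your expected
result suggests; if not, ignore that line.

```ruby
raw = 'string+var'

# delete removes every occurrence of the listed characters
deleted = raw.delete('+')     # => "stringvar"

# gsub with an empty replacement gives the same result
replaced = raw.gsub('+', '')  # => "stringvar"

# tr swaps one character for another (assumption: a space was wanted)
spaced = raw.tr('+', ' ')     # => "string var"

# none of the three modifies the receiver
raw                           # => "string+var"
```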



I expect the reason you heard that it is better to do it on multiple lines
is that it lets you use the bang methods, which return nil when they make no
change to the object and so are unsafe in the middle of a chain. In general,
it is faster to say s.capitalize! than s.capitalize, because the bang version
mutates s itself, while the non-bang version creates a new, modified object.
If we are not interested in keeping the original value of s, creating all
these intermediate objects adds up.

# capitalize returns the capital version regardless of the original string
# so you can use it in the middle of a method chain
'Abc'.capitalize  # => "Abc"
'abc'.capitalize  # => "Abc"

# don't use capitalize! in the middle of a method chain because it can
# return nil
'Abc'.capitalize! # => nil
'abc'.capitalize! # => "Abc"

# capitalize creates a new string, so is less efficient if you don't care
# about the original
# also does not modify the receiver, so you have to capture its result
s = 'abc'
s.capitalize  # => "Abc"
s             # => "abc"

# capitalize! mutates the original string, so is more efficient if you don't
# care about the original
# does modify the receiver, so don't have to capture its result
# in fact, _don't_ capture its result, because as shown above, result could
# be nil
s = 'abc'
s.capitalize! # => "Abc"
s             # => "Abc"
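
Putting the pieces together, here is a rough sketch of your example written
with bang methods, one per line. The helper name clean_up! is hypothetical,
something I made up for illustration. Each bang call's return value is
ignored on purpose, and the string itself is returned at the end.

```ruby
# clean_up! is a made-up helper name, not a core Ruby method.
# It mutates s in place and returns s.
def clean_up!(s)
  s.delete!('+')  # returns nil if there was no '+', so ignore the result
  s.strip!        # returns nil if there was no surrounding whitespace
  s.capitalize!   # returns nil if s was already capitalized
  s               # return the receiver explicitly
end

s = ' string+var '.dup # dup so the string is mutable even if literals are frozen
clean_up!(s)  # => "Stringvar"
s             # => "Stringvar"
```

Only one String object is touched here, where the chained non-bang version
would create a new one at each step.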