Issue #14900 has been updated by funny_falcon (Yura Sokolov).


@ioquatix, your patch doesn't seems to be correct for me on first glance. 

Imagine pipelined RPC server:
- we read data into buffer
- while buffer larger than request size
 - detect first request and split buffer into request and rest of buffer

Same for any other binary parser.

With current behavior, operation "get rest of buffer" will copy buffer into shared frozen string only once.
With your patch it will copy every time. 
So instead on linear complexity we will have quadratic complexity.

Thinking second time, it is possible to use frozen string explicitely for buffer, just not so trivial (while there are not enough data for request, buffer should not frozen, and `<<` should be used, otherwise it should be frozen, and `+` used).

Some programs will certainly become slower with this change, until they fixed. 

I'm not against the patch, but new behavior should be carefully documented and mentioned in a Changelog as a change, that could negatively affect performance if not concerned.

----------------------------------------
Bug #14900: Extra allocation in String#byteslice
https://bugs.ruby-lang.org/issues/14900#change-72890

* Author: janko (Janko Marohni)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
When executing `String#byteslice` with a range, I noticed that sometimes the original string is allocated again. When I run the following script:

~~~ ruby
require "objspace"

string = "a" * 100_000

GC.start
GC.disable
generation = GC.count

ObjectSpace.trace_object_allocations do
  string.byteslice(50_000..-1)

  ObjectSpace.each_object(String) do |string|
    p string.bytesize if ObjectSpace.allocation_generation(string) == generation
  end
end
~~~

it outputs

~~~
50000
100000
6
5
~~~

The one with 50000 bytes is the result of `String#byteslice`, but the one with 100000 bytes is the duplicated original string. I expected only the result of `String#byteslice` to be amongst new allocations.

If instead of the last 50000 bytes I slice the *first* 50000 bytes, the extra duplication doesn't occur.

~~~ ruby
# ...
  string.byteslice(0, 50_000)
# ...
~~~

~~~
50000
5
~~~

It's definitely ok if the implementation of `String#bytesize` allocates extra strings as part of the implementation, but it would be nice if they were deallocated before returning the result.

EDIT: It seems that `String#slice` has the same issue.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>