Issue #13085 has been updated by Eric Wong.

File 0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch added

OK, different strategy; not as fast, but still better than what
we currently have.

[PATCH v2] io.c (io_fwrite): copy to hidden buffer when writing

This avoids garbage from IO#write for [Bug #13085] when
called in a read-write loop while protecting the VM
from race conditions forced by the user.

Memory usage from benchmark/bm_io_copy_stream_write.rb
is reduced greatly:

  target 0: a (ruby 2.5.0dev (2017-01-05 trunk 57270) [x86_64-linux])
  target 1: b (ruby 2.5.0dev (2017-01-05) [x86_64-linux])

  Memory usage (last size) (B)
  name                  a             b
  io_copy_stream_write  81899520.000  6561792.000

  Memory consuming ratio (size) with the result of `a' (greater is better)
  name                  b
  io_copy_stream_write  12.481

Despite the extra deep data copy, there is a small speedup in
execution time due to GC avoidance:

  Execution time (sec)
  name                  a      b
  io_copy_stream_write  0.393  0.296

  Speedup ratio: compare with the result of `a' (greater is better)
  name                  b
  io_copy_stream_write  1.328

This patch increases memory bandwidth use by pessimistically
copying the data into a temporary hidden buffer.  Using a
lightweight frozen copy (as before this patch) is ineffective
in read-write loops, since the read operation will trigger
a heavy copy behind our back due to the CoW operation.
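
To make the read-write interaction concrete, here is a rough
Ruby-level analogy.  `String#dup` plus `freeze` only stands in for
the internal `rb_str_new_frozen` call; whether the two strings
actually share storage is an MRI implementation detail that this
sketch does not assert.  The `StringIO` source and the `snapshot`
name are assumptions for illustration:

  ```ruby
  require 'stringio'

  # Two chunks with different contents so the overwrite is visible.
  input = StringIO.new(('a' * 16384) + ('b' * 16384))
  buf = ''.b

  input.read(16384, buf)
  # Stand-in for the lightweight frozen copy taken before writing.
  snapshot = buf.dup.freeze

  # The next read overwrites buf.  The snapshot must keep the old
  # bytes, so any storage shared between the two has to be copied
  # or detached at this point.
  input.read(16384, buf)

  puts snapshot.start_with?('a')  # snapshot still holds the first chunk
  puts buf.start_with?('b')       # buf now holds the second chunk
  ```

If the loop never needs the snapshot again, that copy is pure
overhead, which is the case the commit message describes.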

It is also impossible to safely release memory from the
lightweight CoW string, because we have no idea how many
lightweight duplicates exist by the time we reacquire GVL.

So, we now make a heavy copy up front which we recycle
immediately upon completion.

Ideally, Ruby should have a way of detecting Strings which are
not visible to other threads and be able to optimize away the
internal copy.  Or, we give up on the idea of implicit data
sharing between threads, since it's dangerous anyway.


----------------------------------------
Bug #13085: io.c io_fwrite creates garbage
https://bugs.ruby-lang.org/issues/13085#change-62398

* Author: Eric Wong
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Relying on rb_str_new_frozen for unconverted strings does not
save memory because copy-on-write is always triggered in
read-write I/O loops where subsequent IO#read calls will
clobber the given write buffer.

  ```ruby
  buf = ''.b
  while input.read(16384, buf)
    output.write(buf)
  end
  ```

This generates a lot of garbage starting with Ruby 2.2 (r44471).
For my use case, even `IO.copy_stream` generates garbage, since
I wrap "write" to do Digest calculation in a single pass.
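
A wrapper of that kind might look roughly like the following.
`DigestWriter` is a hypothetical name, SHA-256 is an arbitrary
choice, and `StringIO` stands in for real files or sockets; the only
thing relied upon is that `IO.copy_stream` accepts any destination
that responds to `write`:

  ```ruby
  require 'digest'
  require 'stringio'

  # Hypothetical wrapper: forwards every chunk to the underlying IO
  # while feeding the same bytes to a digest, so the data is only
  # traversed once.
  class DigestWriter
    attr_reader :digest

    def initialize(io, digest = Digest::SHA256.new)
      @io = io
      @digest = digest
    end

    # Called by IO.copy_stream for each chunk; returns bytes written.
    def write(buf)
      @digest.update(buf)
      @io.write(buf)
    end
  end

  payload = 'hello world' * 1000
  src = StringIO.new(payload)
  dst = StringIO.new
  writer = DigestWriter.new(dst)

  copied = IO.copy_stream(src, writer)
  puts "#{copied} bytes, sha256=#{writer.digest.hexdigest}"
  ```

Because each chunk passes through a Ruby-level `write`, every call
ends up in the same code path as the loop above.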

I tried using rb_str_replace and reusing the string as a hidden
`(klass == 0)` thread-local, but `rb_str_replace` attempts CoW
optimization by creating new frozen objects, too:

  https://80x24.org/spew/20161229004417.12304-1-e / 80x24.org/raw


So, I'm not sure what to do.  Temporal locking seems wrong for
writing strings (I guess it's for reading?).  I get
`test_threaded_flush` failures with the following:

  https://80x24.org/spew/20161229005701.9712-1-e / 80x24.org/raw


`IO#syswrite` has the same problem with garbage.  I can use
`IO#write_nonblock` on fast filesystems while holding GVL,
I guess...
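
One way to compare the three write methods is to count object
allocations around the same read-write loop.  This is only a sketch:
`allocations_for` is a hypothetical helper, the input size and the
use of `File::NULL` as the sink are arbitrary, and the numbers depend
on the Ruby version, so no particular result is claimed here:

  ```ruby
  require 'tempfile'

  # Hypothetical helper: copy the file at path to the null device
  # using the given write method and return how many objects were
  # allocated while the loop ran.
  def allocations_for(write_method, path, chunk = 16384)
    buf = ''.b
    File.open(path, 'rb') do |input|
      File.open(File::NULL, 'wb') do |output|
        before = GC.stat(:total_allocated_objects)
        while input.read(chunk, buf)
          output.public_send(write_method, buf)
        end
        GC.stat(:total_allocated_objects) - before
      end
    end
  end

  results = Tempfile.create('bug13085') do |src|
    src.binmode
    src.write('x' * (16384 * 64))
    src.flush
    [:write, :syswrite, :write_nonblock].map do |m|
      [m, allocations_for(m, src.path)]
    end.to_h
  end

  results.each { |m, n| puts "#{m}: #{n} objects allocated" }
  ```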


---Files--------------------------------
0001-io.c-io_fwrite-temporarily-freeze-string-when-writin.patch (2.6 KB)
0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch (2.9 KB)

