Issue #9632 has been updated by Eric Wong.


 ko1 / atdot.net wrote:
 > 1. How performance improved?
 
 There is less pointer chasing for iteration:
 
 Before: st_table_entry->rb_thread_t->st_table_entry->rb_thread_t ...
  After: rb_thread->rb_thread ...
 
 This is made possible by the container_of macro.
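
 As a rough illustration (not the actual patch), the sketch below shows how
 container_of lets iteration go straight from one embedded list node to the
 next. The macro here is a simplified form without the type checks of the
 ccan and Linux versions, and `struct thread`, `struct list_node` and
 `sum_thread_ids` are hypothetical names, not the real Ruby structs.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */

/* Simplified container_of: recover the enclosing struct from a pointer
 * to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for illustration only. */
struct list_node {
    struct list_node *next, *prev;
};

struct thread {
    int id;
    struct list_node node;  /* embedded: no separate entry allocation */
};

/* Sum the ids of every thread on a circular list headed by `head`.
 * Each step follows one pointer (node->next) and then subtracts a
 * constant offset; there is no separate table entry to dereference. */
static int sum_thread_ids(const struct list_node *head)
{
    int sum = 0;
    const struct list_node *n;

    assert(head != NULL);
    for (n = head->next; n != head; n = n->next) {
        const struct thread *th = container_of(n, struct thread, node);
        sum += th->id;
    }
    return sum;
}
```

 With the st table, each step had to load a table entry and then follow its
 value pointer to reach the thread; here the node lives inside the thread.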
 
 I plan to use container_of in the method/constant/symbol tables, too
 (ihash in Feature #9614).
 
 > 2. Should we modify ccan/* files? Or should we sync with originals?
 
 It is probably best to sync with originals.  I removed parts of
 ccan/str/str.h we are not using, but we can use more of str.h later.
 I may also put ihash in CCAN so other projects may use it easily.
 But I am not sure about the name "ihash".
 
 > 3. What mean the name "CCAN"?
 
 Comprehensive C Archive Network - ccodearchive.net
 
 > 4. Should we use it on compile.c?
 
 Maybe.  I do not know compile.c well enough...
 If we can reduce allocations and pointer chasing without regressions,
 we should use it.

----------------------------------------
Feature #9632: [PATCH 0/2] speedup IO#close with linked-list from ccan
https://bugs.ruby-lang.org/issues/9632#change-46683

* Author: Eric Wong
* Status: Open
* Priority: Normal
* Assignee: Koichi Sasada
* Category: core
* Target version: current: 2.2.0
----------------------------------------
This imports the ccan linked-list (BSD-MIT licensed version of the Linux kernel
linked list).  I cut out some of the unused str* code (only for debugging),
but it's still a big import of new code.  Modifications to existing code are
minimal, and it makes the living_threads iteration functions simpler.

The improvement is great, and there may be future places where we could
use a doubly linked list.

= vm->living_threads:

* before: st hash table had extra malloc overhead, and slow iteration due
to bad cache locality

* after: guaranteed O(1) insert/remove performance (branchless!);
iteration is still O(n), but it is faster in IO#close
due to less pointer chasing
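
To show why insert and remove can be branchless, here is a minimal sketch of
a circular doubly linked list with a sentinel head, in the style of the
ccan/Linux list. The function names are simplified assumptions, not the
exact ccan API.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next, *prev;
};

/* An empty list is a head that points at itself, so no pointer in the
 * list is ever NULL. */
static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

/* O(1) append: four stores, no NULL or empty-list checks. */
static void list_add_tail(struct list_node *head, struct list_node *n)
{
    assert(n != head);  /* debug-only check; compiled out with NDEBUG */
    n->next = head;
    n->prev = head->prev;
    head->prev->next = n;
    head->prev = n;
}

/* O(1) unlink: two stores.  The node's neighbours always exist because
 * the sentinel head closes the circle, so first/last need no special case. */
static void list_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* O(n) walk, used here only to observe the list. */
static size_t list_length(const struct list_node *head)
{
    size_t len = 0;
    const struct list_node *n;

    for (n = head->next; n != head; n = n->next)
        len++;
    return len;
}
```

Removal does not need the list head at all, which is what makes it cheap to
unlink a thread from wherever it sits in the list.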


= IO#close: further improvement with second linked list

* before: IO#close is linear in the number of living threads
* after: IO#close is linear in the number of waiting threads

No extra malloc is needed (only 2 new pointers in existing structs)
for a secondary linked-list for waiting FDs.
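
A sketch of the idea, under assumed names: one struct embeds two list nodes,
so it can sit on the list of all living threads and on a shorter list of
waiting threads at the same time. `struct thread`, `waiting_fd` and
`count_waiters` are hypothetical and do not mirror the real fields in the
patch.

```c
#include <assert.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_node *head, struct list_node *n)
{
    n->next = head;
    n->prev = head->prev;
    head->prev->next = n;
    head->prev = n;
}

static void list_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Hypothetical thread struct: two embedded nodes, no extra malloc. */
struct thread {
    int id;
    int waiting_fd;              /* fd being waited on, -1 if none */
    struct list_node vm_node;    /* membership in the living-threads list */
    struct list_node wait_node;  /* membership in the waiting list */
};

/* Count threads blocked on `fd` by walking only the waiting list, so the
 * cost depends on the number of waiters, not on all living threads. */
static int count_waiters(const struct list_node *waiting, int fd)
{
    int count = 0;
    const struct list_node *n;

    assert(waiting != NULL);
    for (n = waiting->next; n != waiting; n = n->next) {
        const struct thread *th = container_of(n, struct thread, wait_node);
        if (th->waiting_fd == fd)
            count++;
    }
    return count;
}
```

A thread would be added to the waiting list before it blocks on an fd and
unlinked with list_del when it wakes up; its place on the living-threads
list is unaffected.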


I chose the ccan linked list over BSD <sys/queue.h> for two reasons:
1) insertion and removal are both branchless
2) locality is improved if a struct may be a member of multiple lists

git://80x24.org/ruby.git threads-list


---Files--------------------------------
0002-speedup-IO-close-with-many-living-threads.patch (2.86 KB)
0001-doubly-linked-list-from-ccan-to-manage-vm-living_thr.patch (68.1 KB)


-- 
https://bugs.ruby-lang.org/