Issue #5306 has been updated by Charlie Savage.


Some more information, from running the datagrams test under GDB:

$ gdb --args ruby -I.:lib:tests tests/test_epoll.rb -n test_datagrams
(gdb) run

... hangs ...
hit ctrl+c

Program received signal SIGINT, Interrupt.
0x000000375200d91b in read () from /lib64/libpthread.so.0

(gdb) bt
#0  0x000000375200d91b in read () from /lib64/libpthread.so.0
#1  0x00002aaaae9ea3ce in EventMachine_t::_ReadLoopBreaker (this=0xd61b50)
    at em.cpp:998
#2  0x00002aaaae9ebc9a in EventMachine_t::_RunSelectOnce (this=0xd61b50)
    at em.cpp:935
#3  0x00002aaaae9ec4f5 in EventMachine_t::_RunOnce (this=0x9) at em.cpp:498
#4  0x00002aaaae9ee183 in EventMachine_t::Run (this=0xd61b50) at em.cpp:478
#5  0x00002aaaae9e86a9 in t_run_machine_without_threads (self=9)
    at rubymain.cpp:219
#6  0x00002aaaaac1b2d0 in vm_call_cfunc (th=0x602520, cfp=0x2aaaae5c7778,
    num=0, blockptr=0x1, flag=24, id=0, me=0x8e9f90, recv=9127040)
    at vm_insnhelper.c:404
etc.

(gdb) frame 0

#0  0x000000375200d91b in read () from /lib64/libpthread.so.0
(gdb) list
1013            // inspecting the whole list every time we come here.
1014            // Just keep inspecting and processing the list head until we hit
1015            // one that hasn't expired yet.
1016
1017            while (true) {
1018                    multimap<uint64_t,Timer_t>::iterator i = Timers.begin();
1019                    if (i == Timers.end())
1020                            break;
1021                    if (i->first > MyCurrentLoopTime)
1022                            break;

(gdb) frame 1
#1  0x00002aaaae9ea3ce in EventMachine_t::_ReadLoopBreaker (this=0xd61ce0)
    at em.cpp:998
998             read (LoopBreakerReader, buffer, sizeof(buffer));
(gdb) list
993             /* The loop breaker has selected readable.
994              * Read it ONCE (it may block if we try to read it twice)
995              * and send a loop-break event back to user code.
996              */
997             char buffer [1024];
998             read (LoopBreakerReader, buffer, sizeof(buffer));
999             if (EventCallback)
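
One possible reading of the backtrace (an assumption, not a confirmed diagnosis) is that select reported the loop-breaker descriptor as readable, but the blocking read() at em.cpp:998 then found nothing to read. The Ruby sketch below illustrates the general self-pipe pattern with a non-blocking drain, which returns instead of hanging on a spurious wakeup. The names reader, writer and drain_loop_breaker are hypothetical and are not taken from eventmachine.

```ruby
# Hypothetical sketch of a self-pipe "loop breaker" drained without blocking.
# Nothing here is eventmachine code; the names are illustrative only.

def drain_loop_breaker(reader)
  # read_nonblock returns the buffered bytes, or raises an exception that
  # includes IO::WaitReadable when the pipe is empty (instead of blocking).
  reader.read_nonblock(1024)
rescue IO::WaitReadable
  nil # spurious wakeup: nothing to read, go back to the event loop
end

reader, writer = IO.pipe

# Empty pipe: a blocking read would hang here; the non-blocking drain returns nil.
first = drain_loop_breaker(reader)

# Signal the loop, wait (up to 1 second) for readability, then drain once.
writer.write("x")
ready, = IO.select([reader], nil, nil, 1)
second = drain_loop_breaker(reader) if ready
```

Whether a change like this belongs in eventmachine or the select behaviour should be fixed in Ruby itself is a separate question; the sketch only shows why a blocking read after an unreliable readiness report can hang.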


----------------------------------------
Bug #5306: Application Hangs Due to Recent rb_thread_select Changes
http://redmine.ruby-lang.org/issues/5306

Author: Charlie Savage
Status: Open
Priority: High
Assignee: 
Category: core
Target version: 1.9.3
ruby -v: ruby 1.9.3dev (2011-09-09 revision 33236) [x86_64-linux]


This commit:

4e9438bc9153f7a1f4ea0af85c8dbe359e1a55d8

Changed the implementation of rb_thread_select.  

It causes eventmachine to hang on CentOS 5.5.  Not sure what the issue is, but it's easily reproduced by running the test eventmachine/tests/test_epoll.rb.

We noticed this because it also causes the tweetstream gem to hang.

The same setup works on Fedora 14 and an up-to-date Arch Linux.  Specific version information is included below.

We temporarily fixed this by reverting the commit.

Since CentOS is a common production environment (and the one we are using), this seems to us to be a blocker for 1.9.3.

We are happy to provide any additional information or test fixes.  

Thanks - Charlie

--------------
We are running this version of CentOS:

Linux app1.zerista.com 2.6.18-238.19.1.el5.centos.plus #1 SMP Mon Jul 18 10:05:09 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

And this version of Fedora:

Linux ammonite.internal.zerista.com 2.6.35.14-95.fc14.x86_64 #1 SMP Tue Aug 16 21:01:58 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

And this version of eventmachine:

eventmachine (1.0.0.beta.3)

And this version of tweetstream:

tweetstream (1.0.4)

