On Sat, Aug 02, 2003 at 06:37:40PM +0900, Kero van Gelder wrote:
> Find some code below to reliably cause a rb_sys_fail() for server.rb,
> whenever it is run by ruby 1.8.0-previewX on a Linux system (2.2 kernel RH
> 6.something, 2.4.21-smp kernel RH 7.3, 2.5 kernel debian unstable). The
> client can be ran with 1.6.8, still causes rb_sys_fail().
> 
> I get normal exceptions/nil-from-gets for Ruby 1.6.8, or when running on
> HP-UX. If I run the server with 1.6.8 and the client with 1.8.0-pX, all is
> fine, too.
> 
> The code below is simplified from my real codebase at work, where the
> rb_sys_fail() means the server crashes, without possibility of recovery.

rb_sys_fail should just raise an exception:

void
rb_sys_fail(mesg)
    const char *mesg;
{
    extern int errno;
    int n = errno;
    VALUE arg;

    errno = 0;
    if (n == 0) {
        rb_bug("rb_sys_fail() - errno == 0");
    }

    arg = mesg ? rb_str_new2(mesg) : Qnil;
    rb_exc_raise(rb_class_new_instance(1, &arg, get_syserr(n)));
}

Are you saying that the ruby interpreter is dying at this point? Do you get
a core dump? (If so, gdb can be used to examine it.)

Do you get the message "rb_sys_fail() - errno == 0" printed?
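
For comparison, here is a minimal sketch of what I'd normally expect to see
from the Ruby side when a system call fails: an ordinary Errno::* exception
that can be rescued. The failing_open helper and the path are made up for
illustration; they are not from your server.rb.

```ruby
# Hypothetical helper (not from server.rb): make a system call fail and
# return the exception Ruby raises for it, or nil if the open succeeds.
def failing_open(path)
  File.open(path)
  nil
rescue SystemCallError => e
  e
end

err = failing_open("/no/such/dir/no-such-file")
puts err.class      # an Errno::* subclass of SystemCallError
puts err.message
puts err.errno      # the errno value saved when the call failed
```

If your server dies instead of getting something like this, then the problem
is probably not rb_sys_fail itself but whatever happens around it.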

I tried your code under FreeBSD-4.8; from server.rb I get a zillion 'nils'
printed (from the infinite gets/prints loop), followed by an EPIPE when the
other server thread tries to write to the closed socket. Is that what you
get on the working platforms?

...
nil
nil
nil
server.rb:17:in `write': Broken pipe (Errno::EPIPE)
        from server.rb:17:in `puts'
        from server.rb:17
        from server.rb:16:in `loop'
        from server.rb:20

One thing to try might be adding
      trap('PIPE') { }
to the top of server.rb, just to see if SIGPIPE is interacting in some way.
Just a thought.
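
To try the SIGPIPE idea in isolation, something like the following rough
sketch might help. It uses a pipe with the read end closed as a stand-in for
your socket whose peer has gone away; write_to_closed_pipe is a name I made
up, and I am assuming a Unix-like system. With the empty handler installed I
would expect the write to fail with Errno::EPIPE rather than kill the process.

```ruby
# Install an empty SIGPIPE handler, as suggested above.
trap('PIPE') { }

# Hypothetical stand-in for the server's write to a dead socket:
# write into a pipe whose read end has already been closed.
def write_to_closed_pipe
  reader, writer = IO.pipe
  reader.close
  writer.write("hello\n")
  writer.flush
  :no_error
rescue Errno::EPIPE
  :epipe
ensure
  # Closing may try to flush again; ignore any error from that.
  writer.close rescue nil
end

puts write_to_closed_pipe
```

If this prints something other than epipe, or the process just dies, that
would point at signal handling rather than at your socket code.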

Cheers,

Brian.