Jesús Gabriel y Galán wrote:
> Did you increase the ulimit for file descriptors for your process (I
> did ulimit -n 50000)?

Oh you're right, I had -n 1024. I changed it to -n 4096 and it stopped 
successfully at "All sockets connected (1086)"
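
In case it helps anyone else, the limit can also be checked from inside the script instead of in the shell. This is only a small sketch using Process.getrlimit; the helper name fd_limit is my own and is not part of ert.rb:

```ruby
# Returns the soft and hard limits on open file descriptors
# for the current process (what `ulimit -n` reports in the shell).
def fd_limit
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { :soft => soft, :hard => hard }
end

limits = fd_limit
puts "soft=#{limits[:soft]} hard=#{limits[:hard]}"
```

The soft value is the one that matters for how many sockets the script can open.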

If I increase the loop count to 2048 then it stops at:

1213,ert.rb:10:in `connect': Socket operation on non-socket - connect(2) (Errno::ENOTSOCK)

[although I didn't get a SEGV]

Apache error.log showed I was hitting Apache MaxClients too, so I 
tweaked some Apache mpm_worker values to allow 1500, but the ruby script 
still failed at 1213.

>> I suspect you have managed to go 8 bytes beyond the limit (1024+64=1088)
>> before overwriting something important and killing the ruby interpreter
>> - presumably the newer 1.8.7 I have fixes this.
>>
>> To go beyond this limit you may need to use something like eventmachine
>> which uses poll/epoll instead of select.
> 
> But who is calling select in my code?

Ruby itself. Use strace to see:

connect(1088, {sa_family=AF_INET, sin_port=htons(80), 
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in 
progress)
select(1089, NULL, [64 1028 1030 1031 1032 1034 1037 1038 1039 1043 1044 
1045 1050 1051 1054 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 
1066 1067 1068 1069 1070 1088], [1088], NULL) = 31 (out [1028 1030 1031 
1032 1034 1037 1038 1039 1043 1044 1045 1050 1051 1054 1056 1057 1058 
1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1088]])
connect(1088, {sa_family=AF_INET, sin_port=htons(80), 
sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fcntl(1088, F_SETFL, O_RDWR)            = 0
write(1, "All sockets connected (1086)", 28All sockets connected (1086)) 
= 28

It looks like ruby makes a non-blocking connect, then calls select to 
wait for the connect to complete (so that any other green thread can 
run in the meantime).
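
For reference, the same connect/select/connect sequence can be written out by hand in Ruby. This is only a sketch of the pattern, not what the interpreter does internally: it assumes a Ruby recent enough to have Socket#connect_nonblock and IO::WaitWritable, and nonblocking_connect is a made-up helper name:

```ruby
require 'socket'

# Hypothetical helper: start a non-blocking connect, wait in select
# until the socket is writable, then call connect again for the result.
def nonblocking_connect(host, port, timeout = 5)
  sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  addr = Socket.pack_sockaddr_in(port, host)
  begin
    sock.connect_nonblock(addr)
  rescue IO::WaitWritable
    # EINPROGRESS: the connect is still under way, so wait for it.
    unless IO.select(nil, [sock], nil, timeout)
      sock.close
      raise Errno::ETIMEDOUT
    end
    begin
      # The second call reports success or the real error.
      sock.connect_nonblock(addr)
    rescue Errno::EISCONN
      # Already connected; some platforms report success this way.
    end
  end
  sock
end
```

Calling nonblocking_connect("127.0.0.1", 80) in a loop should produce the same pattern of system calls as in the strace output above, although whether IO.select ends up in select(2) or something else depends on the Ruby version.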

But I don't understand how select can be given fds above 1024, when 
this little C program shows that FD_SETSIZE is 1024 and sizeof(fd_set) 
is 128 bytes:

#include <stdio.h>
#include <sys/select.h>
int main(void)
{
  printf("%ld\n", (long)FD_SETSIZE);
  printf("%ld\n", (long)sizeof(fd_set)); // bytes
  return 0;
}

I can only imagine that it is stomping on data beyond its bounds, which 
if true is a really serious bug.
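
To put numbers on that suspicion, here is a rough sketch of the arithmetic. It assumes fd_set is a plain bit array with one bit per descriptor (which is how I understand glibc lays it out, but I have not checked other platforms), and the helper names are made up:

```ruby
FD_SETSIZE   = 1024            # value printed by the C program above
FD_SET_BYTES = FD_SETSIZE / 8  # one bit per fd, so 128 bytes

# Byte offset within the bit array that holds the bit for this fd.
def fd_set_byte_offset(fd)
  fd / 8
end

# True if the bit for this fd lies inside a standard-sized fd_set.
def fits_in_fd_set?(fd)
  fd >= 0 && fd < FD_SETSIZE
end

[1023, 1024, 1088].each do |fd|
  puts "fd #{fd}: byte offset #{fd_set_byte_offset(fd)}, " \
       "fits=#{fits_in_fd_set?(fd)}"
end
```

Under that assumption the bit for fd 1088 lives at byte offset 136, which starts 8 bytes past the end of a 128-byte fd_set.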

Regards,

Brian.
-- 
Posted via http://www.ruby-forum.com/.