Eric Hodel <drbrain / segment7.net> writes:

> $ ruby test.rb
>    1 HTTP/1.1 200 OK
> [...]
>  252 HTTP/1.1 200 OK
> test.rb:5:in `initialize': Too many open files - socket(2)
> (Errno::EMFILE)
>         from test.rb:5:in `open'
>         from test.rb:5
>         from test.rb:2:in `map'
>         from test.rb:4:in `each'
>         from test.rb:4:in `map'
>         from test.rb:4

Fair enough.  I wasn't able to replicate this on my system with
exactly this code (ruby 1.8.5, but on Debian, not OS X), but it did
point me in the right direction.  (I was able to replicate it by using
"127.0.0.1" instead of "localhost" and first using ulimit to reduce
my open-descriptors limit to something small.)

Okay, so now the theory is that when ftp.rb throws the EOFError upon
connection, what's happened is that the remote end has fallen over
from having too many open connections.

Specifically, if you hit an inetd-based service too many times without
closing the connection then you put the server into a state where it
will accept the connection and then immediately close it.  On the
server, what's happening is that inetd is accepting the connection
but then the ftp daemon is exiting immediately when it tries to open
some files that it reads on startup, like libc.  What I get in the
logs is:

Jan 24 08:43:56 esau inetd[1641]: ftp/tcp server failing (looping),
service terminated for 10 min

"looping" here means that the ftpd program is exiting immediately when
inetd tries to start it.

It may be that non-inetd ftp servers exhibit similar behavior
(i.e. accept a connection yet close it immediately) when hit
with too many simultaneous open connections, but I haven't
investigated those.  I'll note that my non-inetd ssh server gets into
this state (where it can accept a connection, but then closes it
before spitting out the identification string) after a mere 10
simultaneous unclosed connections.
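
That accept-then-immediately-close behavior is easy to demonstrate
locally with a throwaway TCPServer; no ftpd is involved, and the
server thread here is just my stand-in for an overloaded inetd:

```ruby
require 'socket'

# Stand-in for an overloaded inetd service: accept the connection,
# then close it immediately without writing any greeting banner.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
acceptor = Thread.new do
  sock = server.accept
  sock.close                 # close before sending anything
end

client = TCPSocket.new('127.0.0.1', port)
begin
  client.readline            # the same call ftp.rb's getline makes
  result = :no_error
rescue EOFError
  result = :eof_error        # what the caller of Net::FTP sees
end
acceptor.join
client.close
server.close
puts result                  # => eof_error
```

The client's readline hits end-of-file before any data arrives, which
is exactly the EOFError that surfaces out of ftp.rb on connect.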

So if ftp.rb is encountering this situation on initial connect, it
means that the server closed the control socket immediately after
connecting, and the ftp server is probably overloaded.

If ftp.rb is encountering this EOFError elsewhere (as was alleged
elsewhere in the thread, where someone mentioned ftp.rb blowing up
with an EOFError after retrieving 60 files), then the culprit is
probably that the other end closed the control connection
unexpectedly.  Although this could mean that ftp.rb was hammering the
other end with connections that weren't being closed, the ftp.rb code
doesn't look like it would do that.  More likely, the ftp server on
the other end is clumsily enforcing some limit on the number of
transactions per connection.  (Or some NATting firewall in the middle
is getting confused and shutting the connection down; I've seen that
happen when dealing with an FTP server behind a firewall.)
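
In the meantime, callers can defend themselves by rescuing EOFError
and retrying with a fresh session.  A minimal sketch of that
workaround; `with_reconnect` is a name I made up, and the block below
stands in for a real reconnect-and-retransfer against Net::FTP:

```ruby
# Hypothetical caller-side workaround: treat EOFError as a dropped
# control connection and retry the whole operation a few times.
def with_reconnect(attempts = 3)
  tries = 0
  begin
    yield
  rescue EOFError
    tries += 1
    retry if tries < attempts
    raise
  end
end

calls = 0
result = with_reconnect do
  calls += 1
  raise EOFError if calls < 3   # simulate two dropped connections
  :transferred                  # a real caller would reopen Net::FTP here
end
puts result                     # => transferred
```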

Probably, the getline function in ftp.rb should be modified to catch
EOFError and turn it into an FTPProtoError (the same type of error
you'd get if the socket closed unexpectedly partway through a
response), though the comments seem to indicate that the current
situation is deliberate:

    def getline
      line = @sock.readline # if get EOF, raise EOFError
      line.sub!(/(\r\n|\n|\r)\z/n, "")
      if @debug_mode
        print "get: ", sanitize(line), "\n"
      end
      return line
    end
    private :getline

I contend that the current situation is a bug, and getline inside
ftp.rb should read:

    def getline
      begin
        line = @sock.readline # if get EOF, raise EOFError
      rescue EOFError
        raise FTPProtoError, "Connection closed unexpectedly"
      end
      line.sub!(/(\r\n|\n|\r)\z/n, "")
      if @debug_mode
        print "get: ", sanitize(line), "\n"
      end
      return line
    end
    private :getline

At the very least, it should raise some sort of FTPError.
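
The converted error is easy to exercise without a server by
substituting a stub for @sock.  The ClosedSock class, the
free-standing getline, and the local error classes below are all my
own scaffolding; in ftp.rb the real Net::FTPProtoError and the
private method would be used instead:

```ruby
# Stand-ins so this sketch runs without net/ftp loaded; in ftp.rb
# these already exist as Net::FTPError and Net::FTPProtoError.
class FTPError < StandardError; end
class FTPProtoError < FTPError; end

# Stub for @sock: readline on an exhausted control connection
# raises EOFError, just as TCPSocket#readline does.
class ClosedSock
  def readline
    raise EOFError
  end
end

# The proposed getline, extracted as a free-standing method.
def getline(sock)
  begin
    line = sock.readline
  rescue EOFError
    raise FTPProtoError, "Connection closed unexpectedly"
  end
  line.sub(/(\r\n|\n|\r)\z/, "")
end

caught = nil
begin
  getline(ClosedSock.new)
rescue FTPProtoError => e
  caught = e.message
end
puts caught   # => Connection closed unexpectedly
```

A caller that already rescues Net::FTPError (as callers of Net::FTP
should) would then handle this case for free, instead of needing a
separate rescue for EOFError.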

-- 
s=%q(  Daniel Martin -- martin / snowplow.org
       puts "s=%q(#{s})",s.map{|i|i}[1]       )
       puts "s=%q(#{s})",s.map{|i|i}[1]