On Friday 11 December 2009 05:33:06 pm Iñaki Baz Castillo wrote:
> The good point of using:
> 
>   start-stop-daemon --stop --pidfile /var/run/rb_program.pid --name
>  rb_program
> 
> is that it would stop the process just if it's called "rb_program" and its
>  pid matches the value of /var/run/rb_program.pid, so you cannot kill any
>  other process using that pid by accident (it could occur if your program
>  didn't delete the pidfile and a new process has taken same pid value).

That's a good point -- though I would guess that in theory, it shouldn't be 
possible for a program to die in such a way that it wouldn't be able to delete 
that file. The only thing that would make sense is a reboot, and on my system, 
/var/run is a tmpfs (only exists in RAM/swap), so it's not stored anywhere 
that would survive a reboot.
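
For what it's worth, the check described in the quoted text can be sketched 
in Ruby. This is only my reading of the idea, not how start-stop-daemon is 
actually implemented: pid_if_name_matches and its proc_root parameter are 
names I made up, and it assumes a Linux-style /proc where the second field of 
/proc/PID/stat is the kernel's name for the process, in parentheses.

```ruby
# Sketch only: return the pid from the pidfile if a process with that pid
# exists AND the kernel's name for it matches expected_name; otherwise nil.
# proc_root is a hypothetical parameter so this can be pointed at a fake tree.
def pid_if_name_matches(pidfile, expected_name, proc_root = "/proc")
  pid = Integer(File.read(pidfile).strip)
  stat = File.binread(File.join(proc_root, pid.to_s, "stat"))
  # The name sits between the first "(" and the last ")" of the stat line.
  name = stat[/\((.*)\)/m, 1]
  name == expected_name ? pid : nil
rescue SystemCallError, ArgumentError
  # Missing pidfile, stale pid, or garbage in the pidfile: treat as no match.
  nil
end
```

A stale pidfile whose pid now belongs to some other program comes back as 
nil, so nothing would be signalled.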

I can think of a few other possibilities, like checking directly (with fuser, 
for example) which process is controlling the resource your daemon is 
associated with, or even talking to the old daemon over a socket, or some sort 
of formal IPC like dbus.

I'm not sure about Debian, but on Ubuntu, the upstart system might also be 
worth looking into.

I did find a solution, though:

#!/usr/bin/env ruby

doesn't work. However:

#!/usr/bin/ruby1.9.1

produces a process which responds both to killall and to 'start-stop-daemon 
--stop --name'. But hardcoding the location of the Ruby interpreter is 
antisocial, especially when there are so many of them. The whole point of env 
is that I can always override PATH and point to a different Ruby interpreter, 
to easily switch between 1.8, 1.9, JRuby, Rubinius, etc.

I thought I'd found a workaround, and I've been getting messy with C trying to 
figure out how to replace env with a more appropriate program, but I'm not 
sure how to change the program name at all. That is, this isn't a Ruby issue, 
it's a Linux/Unix issue. The best I could figure out was this:


#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[]) {
  if (argc == 1) {
    fprintf(stderr, "Usage: %s PROGRAM ARGUMENTS\n", argv[0]);
    return 1;
  }

  /* Remember what to execute, then overwrite argv[1] so the new
     process sees "foo" as its argv[0]. */
  char *filename = argv[1];
  argv[1] = "foo";

  /* execvp only returns if it failed. */
  execvp(filename, argv + 1);

  /* Save errno before anything can clobber it, and report the program
     we tried to run rather than the replaced argv[1]. */
  int err = errno;
  fprintf(stderr, "%s: %s: %s\n", argv[0], filename, strerror(err));
  return err;
}


Compile that, then change the Ruby script to be:

#!/path/to/that/binary ruby

The same should work for Python.

Except it doesn't.

It does indeed change the program name in /proc/self/cmdline -- it becomes 
'foo /name/of/my/program.rb'. And killall and start-stop-daemon both seem to 
work, here, but only when I give them "ruby" as the name of the program to 
kill.
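
My understanding, which I haven't checked against the source of either tool, 
is that killall and 'start-stop-daemon --name' compare against the kernel's 
own name for the process (the parenthesized field in /proc/PID/stat), which 
Linux takes from the file that was last exec'd, rather than against argv[0]. 
Here's a rough Ruby sketch for looking at both side by side. describe_process 
and proc_root are names I invented for illustration.

```ruby
# Sketch only: show the kernel's name for a process next to its argv.
# proc_root is a hypothetical parameter so a fake /proc tree can be used.
def describe_process(pid = "self", proc_root = "/proc")
  dir = File.join(proc_root, pid.to_s)
  stat = File.binread(File.join(dir, "stat"))
  # cmdline is the argv array, with each argument terminated by a NUL byte.
  argv = File.binread(File.join(dir, "cmdline")).split("\0")
  { comm: stat[/\((.*)\)/m, 1], argv: argv }
end
```

If the guess above is right, a process started through the C wrapper would 
show "foo" as the first element of :argv while :comm still says "ruby".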

Notice, also, I'm explicitly setting 'foo' as the program name. If this 
worked, I'd detect that dynamically -- but it doesn't.

At this point, I'm about ready to write a script that would copy your script 
(or create a wrapper for your script) and change the shebang to match `which 
ruby` at the point of invocation, but that wouldn't work either -- that will 
fool the program name in /proc, but it won't fool your program -- there's $0 
and probably a dozen other things I haven't thought of to tell it that it's 
being loaded by a separate script.
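
The shebang-rewriting part of that idea would look roughly like this in Ruby. 
It's a sketch with made-up helper names (resolve_on_path, rewrite_shebang), 
it assumes the script already starts with a "#!" line, and it does nothing 
about the $0 problem.

```ruby
# Sketch only: find the first executable file called `name` in a
# PATH-style string, roughly what `which` does. Returns nil if none.
def resolve_on_path(name, path_env = ENV["PATH"].to_s)
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Replace the "#!..." first line of a script's text with an absolute
# interpreter path, leaving the rest of the text untouched.
def rewrite_shebang(source, interpreter)
  source.sub(/\A#![^\n]*/) { "#!#{interpreter}" }
end
```

The copy would then be written out with something like 
rewrite_shebang(File.read(script), resolve_on_path("ruby")) and made 
executable.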

I don't really know what else to try. /proc/self/exe is no help; that points 
to the Ruby binary no matter what. You might write your own script that greps 
through /proc/*/cmdline, but I don't see a way to fool start-stop-daemon 
without changing to #!/usr/bin/ruby1.9.1.
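
For the "grep through /proc/*/cmdline" option, a minimal Ruby sketch might 
look like this. pids_running_script and proc_root are names I made up, it 
assumes a Linux-style /proc, and it only matches when the script path appears 
as an exact argument.

```ruby
# Sketch only: list pids whose argv contains script_path as an exact argument.
# proc_root is a hypothetical parameter so a fake /proc tree can be used.
def pids_running_script(script_path, proc_root = "/proc")
  # Compare as raw bytes, since cmdline is not guaranteed to be valid UTF-8.
  target = script_path.dup.force_encoding(Encoding::BINARY)
  pids = []
  Dir.glob(File.join(proc_root, "[0-9]*", "cmdline")).each do |path|
    begin
      argv = File.binread(path).split("\0")
    rescue SystemCallError
      next # the process exited, or we are not allowed to read it
    end
    pids << File.basename(File.dirname(path)).to_i if argv.include?(target)
  end
  pids.sort
end
```

This finds the script no matter which interpreter ended up running it, but it 
is your own code doing the matching, not start-stop-daemon.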