On Fri, 17 Sep 2004, Markus wrote:

> Ara --
>
> Random thoughts:
>
>    * It could be a race condition of some sort

yes - perhaps even in some library code i'm exercising - this is my current
best guess.

>    * It could be that closing the file in the child closes it for the
>      parent even though closing it for the parent does not close it
>      for the child

hmmm - not that one:

  harp:~ > ruby -e'f = open "f","w";fork{ f.close };Process.wait;f.puts 42'
  harp:~ > cat f
  42

>    * It could be that you omitted a file from your keep list that the
>      child actually needs.  It tries to access it, goes boom,...

i do an exec of bash immediately after, so i think that's out, since bash
cannot possibly require anything ruby or sqlite has open other than stdin,
stdout, and stderr.

>    * can you make it happen in a simplified situation (e.g. one
>      child, etc.)

yes.  but not predictably either.  it can run for days, or minutes.
unfortunately (for debugging) it is usually about 3 days before a core dump -
difficult to work with...

>    * is it possible to make nfs put the ugly files somewhere you
>      can't see them?  I know much of the software I run has lots of
>      ugly files (e.g. the web browser cache), but they don't bother
>      me because I don't look at them.

i handle that this way now:

  def sillyclean dir = @dirname
#{{{
    glob = File.join dir,'.nfs*'
    orgsilly = Dir[glob]      # silly files present before the block runs
    yield
    newsilly = Dir[glob]      # silly files present after the block runs
    silly = newsilly - orgsilly
    silly.each{|path| FileUtils::rm_rf path}
#}}}
  end

this code wraps ONLY the transaction/fork code.  it is safe because i know any
silly file left over from a transaction was created due to sqlite not setting
close-on-exec on its tmp files.  plus removing a silly file cannot hurt,
because they spring back into existence (by definition) if someone actually
still needs them.  so, if the remove succeeds, no-one was actually using them.
this is indeed what happens - they are removed never to return.  i just hate
this sort of thing.
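for what it's worth, the missing close-on-exec flag could in principle be set
from the ruby side rather than cleaned up after.  the following is only a
sketch under assumptions, not something from the thread: the method name
mark_close_on_exec and the 3..255 descriptor range are made up for
illustration, and it relies on IO.for_fd with autoclose: false and
IO#close_on_exec=, which need ruby 1.9 or later.  it only affects descriptors
that are already open when it runs, so it would have to be called in the child
after the fork and just before the exec.

```ruby
# Sketch: set FD_CLOEXEC on every open descriptor above stderr, so that a
# later exec does not hand them to the new program.
# Assumptions: the name, the default fd range, and Ruby >= 1.9.
def mark_close_on_exec(first_fd = 3, max_fd = 256)
  marked = []
  (first_fd...max_fd).each do |fd|
    begin
      # autoclose: false keeps this wrapper from closing the real descriptor
      # when the wrapper object is garbage collected.
      io = IO.for_fd(fd, autoclose: false)
      io.close_on_exec = true
      marked << fd
    rescue Errno::EBADF, Errno::EINVAL, ArgumentError
      # fd is not open, or the ruby vm reserves it for itself: skip it.
    end
  end
  marked
end
```

usage would be something like fork{ mark_close_on_exec; exec 'bash' }.  the
wrapper objects never close anything themselves; they only flip the flag.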

>    * Instead of specifying the files you want to keep (STDIN, etc)
>      could you list the ones you want to close, and narrow the
>      problem down that way?

yes - i'm working on that.  the problem is that i actually KNOW the filename
that gets unlinked and causes the sillyname - it's the 'db-journal' file (i can
see a .nfsXXXX file come into existence with its exact contents).  the problem
is that the sqlite api opens this file and i have no file handle on it.
problem two is that ruby does not provide a way to get at this info that i know
of.  you could

  256.times do |fd|
    begin
      file = IO::new fd
      File::unlink file.path if file.path =~ %r/db-journal/o
    rescue Errno::EBADF, Errno::EINVAL
    end
  end

__except__ that File objects created this way do not have a path!  (nor
respond_to?('path') for that matter) - at least on my ruby.  i'm not sure if
this is a bug or not...

> I don't know if any of these will help, but I can't see that they
> could hurt (I used to say that "ideas can't hurt you" but I'm older
> now).

funny.  yeah - anything helps - i'm grasping at straws!

cheers.

-a
--
===============================================================================
| EMAIL   :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE   :: 303.497.6469
| A flower falls, even though we love it;
| and a weed grows, even though we do not love it.
|   --Dogen
===============================================================================