Robert Klemme wrote in post #972187:
> On Tue, Jan 4, 2011 at 8:41 AM, Vishnu I. <pathsny / gmail.com> wrote:
>
>> I was recently trying to solve a pet problem of mine where threading
>> seems to be the most natural way to solve a problem. I needed a thread
>> that would scan a fixed list of tuples, each of which contained a single
>> filename, to sha1 them and add the sha1 to each tuple, and a second
>> thread that would run through this list and contact this service on the
>> internet to get metadata on these files. (The service exposes a UDP API
>> that allows a given IP to use only a fixed local port and has strict
>> rate limits, so I don't want to request metadata for files
>> concurrently.) The thing is I have almost never worked with
>> threading since I've mostly built web apps.
>
> This is the point where I would immediately switch to using a queue.

Oh, absolutely. Once I learnt that Queues exist, I got the thing working
with queues. I just want to understand how to reason about threads that
share data.

>
>> So I decided to maintain a list of tuples and a last updated index that
>> would be visible to both threads.
>>
>> So something like this
>> https://gist.github.com/764495#file_simple_multithreaded_example .
>> However, I understand that this may not work, because there is no
>> guarantee of the order in which the two operations are performed and
>> these might be reordered. But this understanding comes from reading
>> about this in other languages. Is this also true in Ruby?
>
> Yes. Without synchronization of some sort all bets are off and there
> are no guarantees about the order in which code in concurrent threads
> is executed.
>
>> I understand that if I were doing this in .NET or Java I could just make
>> index volatile, which is a directive to the compiler and runtime that
>> operations surrounding it should not be reordered. Am I correct in
>> understanding that there is no such concept in Ruby? So I cannot do this
>> locklessly.
>
> Exactly.
>
>> So I decided to lock on a mutex and I came up with this.
>> https://gist.github.com/764495#file_simple_multithreaded_example_with_locks
>> Now in the scanner thread I have updated the list inside the synchronized
>> block and then updated index,
>
> As far as I can see you forgot to increment local_index after
> #update_metadata.
>

Ah yes, that was an accident when I was typing it out right now.

>> but is it safe to move the list update
>> outside? (I assume no, since regular statements and sync blocks can get
>> reordered).
>
> There is no list update. You only update Hash instances contained in
> the Array. With your logic it should be safe to only synchronize on
> the index access because you modify only those entries in the Array
> with index > current index.
>

This is the part that I'm not sure about. If I synchronize only on the
update of the index, there's nothing preventing the runtime from first
updating the index and THEN updating the hash in the array, unless
synchronizing guarantees that all statements prior to the synchronized
block have been executed.

>> Also I have synchronized in the updater thread because I don't
>> know if statements inside synchronized blocks can be reordered. Is that
>> right?
>
> Not sure what you mean here. You properly synchronize access to index
> which is exactly what you need to do to make the code safe.
>

Sorry, what I meant was: inside the top synchronized block I update the
hash and the index. But again there are no guarantees about the order in
which these actions happen. So I have to synchronize on the mutex while
retrieving index too.

>> p.s. After I considered this solution, I came to know about the Queue
>> class in thread, so I know that that is a more elegant solution. But I
>> would still like to know the solution.
>
> https://gist.github.com/764540

Oops, I wasn't clear at all here, I see :o. I meant I understood and got
the solution working with queues.
I was wondering how to solve this problem with locks :)

thanks
Vishnu

-- 
Posted via http://www.ruby-forum.com/.
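
For reference, here is one way the lock-based version discussed above could look. It is only a minimal sketch, not the code from the gist: `entries`, `published` and the placeholder metadata string are hypothetical names, the SHA1 is taken of the name rather than of file contents so the example runs on its own, and a ConditionVariable is added so the updater sleeps instead of polling. It assumes that leaving a Mutex#synchronize block makes earlier writes visible to a thread that later takes the same lock. As far as I know Ruby documents no formal memory model, so treat that as an assumption about the implementation rather than a guarantee.

```ruby
require 'digest/sha1'
require 'thread'

# Hypothetical stand-ins for the tuples in the gist: one Hash per file.
entries = %w[alpha beta gamma].map { |name| { :name => name } }

mutex     = Mutex.new
ready     = ConditionVariable.new
published = 0 # count of entries whose :sha1 is complete; only touched under mutex

scanner = Thread.new do
  entries.each do |entry|
    # The updater has not been told about this entry yet, so it is
    # written outside the lock (Robert's point about index > current index).
    entry[:sha1] = Digest::SHA1.hexdigest(entry[:name])
    mutex.synchronize do
      published += 1 # publish only after the write above
      ready.signal
    end
  end
end

updater = Thread.new do
  local_index = 0
  while local_index < entries.size
    # Read the shared counter under the same mutex the scanner uses.
    limit = mutex.synchronize do
      ready.wait(mutex) while published <= local_index
      published
    end
    while local_index < limit
      entry = entries[local_index]
      # Placeholder for the rate-limited UDP metadata lookup.
      entry[:metadata] = "metadata for #{entry[:sha1]}"
      local_index += 1
    end
  end
end

[scanner, updater].each(&:join)
```

The updater only ever touches entries below the value of `published` it read under the lock, and the scanner only ever touches entries at or above it, so the two threads never work on the same Hash at the same time.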