On 10-04-2012 00:33, Urabe Shyouhei wrote:
>>> Myth2: "Threading boosts your application performance"
>>>
>>> This is also no.  Multi-threaded programming is _very_ difficult to do
>>> properly.  And  if you  do it badly,  its performance gets  even worse
>>> than a  single threaded one.   A multi-threaded application  that does
>>> scale can also scale much easily by using processes.
>> This is not a myth. This entire thread was born from a real issue:
>>
>> http://rosenfeld.herokuapp.com/en/articles/ruby-rails/2012-03-04-how-nokogiri-and-jruby-saved-my-week
>>
>> Multi-thread is not always difficult and being difficult doesn't mean it can't be implemented in a more performant way.
> Yes, using JRuby was one of  the easiest approach for you.  Said that,
> I still believe  it is possible to fork your  process to fully utilize
> your  multi-core  machine.  Your  problem  needed parallel  execution;
> fully  multi-threaded paradigm (such  as interaction  between threads)
> was not required, was it?

Sorry for the late response, but things have been crazy here since
Monday night, when, due to some mysterious problem, I lost several of my
directories (seemingly random ones), including all my hard work
converting part of the Grails application I'm maintaining to Rails, plus
lots of bug fixes. Together that was almost a full week of work
(including some I did over the holiday) :( And it happened just a few
minutes before I was going to commit everything and send it to the
server... I shouldn't rely that much on Linux. My latest backup was
from March 15 :(

I was comparing some deployment strategies for my new Rails application
(now lost) when this happened, and I managed to measure some numbers.
With thin (thin start -e production --threaded), Apache benchmark
(ab -n 100 -c 10 ...) reported about 59s to complete, while it took
about 1.9s on WEBrick, which I found very strange. So I was trying to
run it on Java (warble executable war; java -jar app.war) when
everything went crazy; I lost control and could hardly sleep that
night... In the Grails application the benchmark completed in about
0.4s.

Now, answering your question: for this particular scenario I didn't need
any communication between threads. I just needed the procedure to use
all my cores.

I guess you're right. I'd probably get about the same result if I had
used a pool of processes instead of a pool of threads in MRI.
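
For illustration, here is a minimal sketch of what such a pool of
processes could look like in MRI, assuming a platform where fork is
available. The helper name parallel_map and the squaring workload are
made up for the example and are not taken from the application
discussed here. Each child handles one slice of the input and sends its
results back to the parent through a pipe.

```ruby
# Hypothetical helper: maps `items` through the given block using up to
# `workers` forked child processes (requires a platform with fork).
def parallel_map(items, workers, &block)
  return [] if items.empty?

  slice_size = (items.size / workers.to_f).ceil
  readers = items.each_slice(slice_size).map do |slice|
    reader, writer = IO.pipe
    [reader, writer].each(&:binmode)
    fork do
      # Child: compute this slice and send the marshalled results back.
      reader.close
      writer.write(Marshal.dump(slice.map(&block)))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close # the parent only reads from this pipe
    reader
  end

  # Parent: collect the results in the original order, then reap children.
  results = readers.flat_map do |reader|
    data = reader.read
    reader.close
    Marshal.load(data)
  end
  Process.waitall
  results
end

squares = parallel_map((1..8).to_a, 4) { |n| n * n }
```

Since the children share nothing with each other, no locks are needed;
the cost is that the results have to be serialized to get back to the
parent.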

But since the migration was already done, and given that I have lots of
work to do, I don't think I'll test this anytime soon.

>>> Myth3: "Threading is the way to go"
>>>
>>> This is  quite doubtful.  Multi-threaded programming  is too difficult
>>> to  develop, almost  impossible to  debug,  and normally  do not  work
>>> properly.   Multi-threading  is too  easy  to  fail.  Some  restricted
>>> model, such  as Shared Nothing  architecture, seems to have  much more
>>> potential.
>> People have been doing multi-threaded programs for decades and while certainly there can be bugs hard to track on threaded applications who said that programming was supposed to be easy? Sometimes some solutions require a more complicated approach.
> We _want_  to make it  easy.  For instance  in Ruby you don't  have to
> care about  malloc/free counterbalance.  You don't have  to care about
> manipulating  struct sockaddr_storage.   Then why  you have  to bother
> mutex dead-locks?

You can get rid of some manual memory management in lots of scenarios by
using a garbage collector but you can't ensure that programs won't leak
in Java or Ruby.

I think the same applies to threads. We can create a great API for
dealing with threads, but I don't think it is possible for a system to
know where to put the locks.

They're very specific to each problem being solved. Creating a good
design is hard in any language, and although you can try a shared-nothing
architecture to avoid some pitfalls of threaded programming, that doesn't
mean we shouldn't support threads properly. JRuby is able to provide
more useful threads while still implementing the Ruby language. The
limitation is specific to the C Ruby implementation and is not the
language's fault. That was what I meant.
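
As a small sketch of why lock placement is problem-specific, consider a
hypothetical Account class (invented for this example). The check and
the update in withdraw must happen as one step, and only the programmer
knows that invariant, so only the programmer knows the lock belongs
around both.

```ruby
# Hypothetical example class; not from any real application.
class Account
  attr_reader :balance

  def initialize(balance)
    @balance = balance
    @lock = Mutex.new
  end

  # The balance check and the subtraction form one critical section.
  # Locking only one of them would still allow overdrafts.
  def withdraw(amount)
    @lock.synchronize do
      return false if @balance < amount
      @balance -= amount
      true
    end
  end
end

account = Account.new(100)

# 10 threads each try 20 withdrawals of 1, i.e. 200 attempts against 100.
threads = 10.times.map do
  Thread.new { 20.times { account.withdraw(1) } }
end
threads.each(&:join)
```

With the lock in place, exactly 100 withdrawals succeed and the balance
never goes negative, on MRI or on an implementation with parallel
threads such as JRuby.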

> I want you, programmers, to code happily.

I want to code happily too, just as much as I want to keep my code safe
from hard-disk disasters ;)

>  I believe a language can let
> you do so.  Does a  parallel-execution really make you happy?  Doesn't
> it give you more pain than gain?

In some cases it will make me much happier, as in the case I mentioned.
It is much better to wait 2 hours for some task to complete than to
wait 12 hours.

Not all threaded programming is hard. But you can't protect anyone from
their own ignorance. We programmers will always have to study techniques
to make our code more robust, and this includes how to do multi-threaded
programming. You'd be wrong to think that Ruby will prevent beginners
from writing bad code.

The languages and frameworks must try to provide the best possible API /
semantics but they shouldn't assume their consumers are idiots.

I never advertised Rails as a framework for beginners. You need a
strong web programming background to get the most out of it. But it
lets experienced developers take full advantage of the framework
(except for the poor stream support). It is not designed to protect
beginners but to give more power to experienced developers.

I like to think that the same is true of Ruby.

>
>> I agree that Shared Nothing architectures can be interesting too, but choosing some solution architecture should be the programmers decision in my opinion.
> We are not going to deprecate Threads that we already have.

This never crossed my mind :D

> If we add SN-architecture  to our  core, that  should  be an  opt-in.  So  don't
> worry, you can decide your architecture.  I promise.

Not worried about this, but thanks :)

>>> Myth4: "We are ignoring Rails"
>>>
>>> Definitely no.
>>>
>>> Myth5: "You can make MRI lock-free"
>>>
>>> Do it yourself if you think so.  Patch welcomed.
>> I don't think any threaded application can be lock-free, including a language interpreter. But having locks (instead of a single global lock) doesn't mean you can't use the full power of processors.
> Technically  speaking, there  are  reasons why  MRI  cannot take  this
> approach.  One  reason for it is  that MRI's GC needs  a giant locking
> because no  modifications to  any objects shall  be allowed  during GC
> (this restriction can theoretically be weakened, but in practice it is
> very hard).

Even if it is hard to get a better GC mechanism, having the big lock
only around the GC code wouldn't be that much of a problem, I guess.

But I guess that lock is also used in other situations, is that right?

>   Another  reason is that most extension  libraries are not
> designed to  be multi-thread ready;

I think this is a chicken-and-egg problem. Why would extension writers
worry about threads if Ruby itself doesn't? I mean, they won't invest
time in making their extensions thread-ready if they can't see real
performance gains from it.

>   for instance the  SQLite database
> does  not  support  multiple  transactions  per  a  connection,  which
> effectively kills multi-threaded usage.
> cf: http://www.sqlite.org/faq.html#q6
>  
>> Unfortunately I don't have time nor skills for patching CRuby even for much easier patches... :( Maybe in the future...
>>
>> I was just stating that I don't think someone with skills has invested much time trying to do so either because they don't think CRuby could benefit from parallel processing that much.
>>
>> That is what I was trying to get them to reflect on when I launched this thread. If they agree that parallel threading can be very useful, then it would be clear that the only reason for CRuby not supporting it is the fact that it is currently difficult to patch Ruby to get rid of the global lock approach.
> Someone with skills is always welcomed!

Hopefully some day I can find some time and become one of them :)