It's also worth mentioning that this is going to be a problem for
*all* implementations that can't or don't want to juggle the call
stack. IronRuby will be in the "can't" column. MacRuby and Rubinius
are probably in the "don't want to" column. Other implementations on
runtimes without stack access or continuations will not be able to
support Enumerator#next easily (or at all?).

- Charlie

On Fri, Sep 25, 2009 at 3:18 PM, Charles Oliver Nutter
<headius / headius.com> wrote:
> I have a challenge for anyone who wants to discuss, propose
> suggestions, or help us fix this problem.
>
> Ruby 1.8.7 added the ability to "next" your way through an Enumerator.
> At a glance, this seems fine; it's just external enumeration. The
> problem, however, is that enumeration can be arbitrarily complex.
>
> Take this code for example:
>
> class Foo
>   def each
>     5.times {|i| yield i}
>   end
> end
>
> enum = Foo.new.to_enum
> puts enum.next # => 0
> puts enum.next # => 1
>
> What's actually happening here?
>
> to_enum creates a new Enumerator to wrap our Foo type. All it requires
> is that an "each" be implemented. The Enumerator then uses each to
> perform iterations for collect, select, etc. In those cases, you're
> really just deferring the call to #collect until a later time, and
> enumeration proceeds as normal with #collect running until #each has
> completed. But the "next" case is different.
>
> With "next", we have a more complicated situation: enumeration
> *pauses* after each element. Here's how things work when using
> Enumerator#next:
>
> 1. On the first call to #next, a fiber or generator is spun up to
> start the call to each, similar to this:
>    f = Fiber.new { collection.each {|i| Fiber.yield i} }
> 2. For each element next returns, the fiber/generator is invoked to
> produce the next result:
>    def next
>      f.resume
>    end
> 3. When the enumeration completes (or at any time) you can rewind and
> start from the beginning.
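
The three steps above can be put together into a small runnable
sketch. This is only an illustration, assuming Ruby 1.9+ where Fiber
is built in; the class name FiberNext is made up for this example and
is not how any implementation actually names or structures its
Enumerator internals.

```ruby
# Hypothetical illustration only: FiberNext is not a real Ruby class.
class FiberNext
  def initialize(collection)
    @collection = collection
    rewind
  end

  # Step 3: throw away the paused fiber and set up a fresh one.
  # The fiber is created here but does not start running #each
  # until the first resume.
  def rewind
    @fiber = Fiber.new do
      @collection.each { |i| Fiber.yield i }
      # #each returned, so there is nothing left to hand out.
      raise StopIteration, "iteration reached an end"
    end
    self
  end

  # Steps 1 and 2: the first resume starts #each; every later resume
  # continues the paused #each until it yields one more element.
  def next
    @fiber.resume
  end
end

class Foo
  def each
    5.times { |i| yield i }
  end
end

e = FiberNext.new(Foo.new)
first  = e.next   # first element produced by Foo#each
second = e.next   # second element
e.rewind
after_rewind = e.next # enumeration starts over
```

The point to notice is that Foo#each is suspended in the middle of
its 5.times block between calls to #next, which is exactly the part
that needs stack juggling of some kind.
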
>
> In Ruby 1.8.7 and Ruby 1.9, this is implemented using continuations
> (delimited continuations, i.e. Fibers or coroutines), making it
> dreadfully slow to "next" your way through a collection. On JRuby,
> because there's an in-progress #each we have to pause for every
> element, Enumerator#next has to spin up a *new native thread*. Each
> #next call then pings the thread to produce a new result.
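
For readers who want to see the thread-based approach in miniature,
here is a rough sketch using a worker thread and two queues. It is
not JRuby's actual code; ThreadNext, DONE, and the queue names are
all invented for the example, and it skips rewind and error handling.

```ruby
require "thread" # Queue (already loaded on modern Rubies)

# Hypothetical illustration only: ThreadNext is not JRuby's actual code.
class ThreadNext
  DONE = Object.new # sentinel marking the end of the enumeration

  def initialize(collection)
    @collection = collection
    @requests = Queue.new # caller -> worker: "produce one more element"
    @results  = Queue.new # worker -> caller: the element, or DONE
    @worker   = nil
    @done     = false
  end

  def next
    raise StopIteration, "iteration reached an end" if @done
    start_worker unless @worker
    @requests << :next      # ping the worker thread
    value = @results.pop    # block until it answers
    if value.equal?(DONE)
      @done = true
      raise StopIteration, "iteration reached an end"
    end
    value
  end

  private

  def start_worker
    @worker = Thread.new do
      @collection.each do |i|
        @requests.pop       # sleep inside #each until the caller asks
        @results << i
      end
      @requests.pop
      @results << DONE
    end
  end
end

class Foo
  def each
    5.times { |i| yield i }
  end
end

e = ThreadNext.new(Foo.new)
a = e.next
b = e.next
# The worker is now parked inside Foo#each, waiting for a request.
worker_alive = e.instance_variable_get(:@worker).alive?
```

If the caller simply stops calling #next here, the worker stays
blocked on its queue and keeps references to the collection and to
the ThreadNext object itself through the block's closure.
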
>
> Functionally, this works just fine, other than the cost of us spinning
> up a thread. But there's a larger problem: an Enumerator-created
> thread has a full lifecycle apart from the caller's thread. As a
> result the enumerator thread can root objects (preventing them from
> being GCed), including the Enumerator itself.
>
> So the small problem with Enumerator#next is that it's slow on MRI
> because of continuations and slow on JRuby because of native threads.
> But on JRuby, we have the additional large problem of managing the
> associated thread and making sure it doesn't live forever if you don't
> complete an enumeration.
>
> Bottom line is that Enumerator#next is a real problem for JRuby. I
> hope it's not going to be impossible to support, but at this point the
> path forward is unclear.
>
> Here are the options as I see them:
>
> 1. Soldier on, attempting to find a way to use a native thread for
> Enumerator#next without rooting objects, etc.
> 2. Support Enumerator#next only on core types where we know how to do
> enumeration without #each
> 3. Provide a way to cancel an enumeration, so implementations like
> JRuby will know when to forcibly end the fiber/thread
> 4. Require that along with "each", arbitrary collection types must
> implement "to_enum" to avoid requiring fibers/continuations in all
> Ruby impls
> 5. Not support Enumerator#next in JRuby at all
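
To make option 2 concrete: for a core type like Array, external
enumeration needs no fiber or thread at all, because the position can
be kept as a plain index. The sketch below is only an illustration;
ArrayCursor is an invented name, not something any Ruby
implementation provides.

```ruby
# Hypothetical illustration only: ArrayCursor is not part of any
# Ruby implementation.
class ArrayCursor
  def initialize(array)
    @array = array
    @index = 0
  end

  # No call to #each is ever in progress; the only state is @index.
  def next
    raise StopIteration, "iteration reached an end" if @index >= @array.size
    value = @array[@index]
    @index += 1
    value
  end

  def rewind
    @index = 0
    self
  end
end

c = ArrayCursor.new([:a, :b, :c])
x = c.next
y = c.next
c.rewind
z = c.next
```

This only works because the cursor knows how Array stores its
elements. A user-defined #each like Foo#each above gives no such
handle, which is why the general case is the hard one.
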
>
> Obviously we want to avoid #5, and ideally #3 as well. We don't want
> to stand in the way of 1.8.7 (and 1.9.2) adoption, but without a
> satisfactory solution to this problem JRuby may have a crippled
> Enumerator#next implementation, making Enumerator#next less reliable
> across implementations.
>
> - Charlie