Issue #14781 has been updated by knu (Akinori MUSHA).

zverok (Victor Shepelev) wrote:
> @knu
> The _ultimate_ goal for my proposal is, in fact, promoting Enumerator as a "Ruby way" for doing all-the-things with loops; not just "new useful feature".
>
> That's why I feel really uneasy about your changes to the proposal.

Thanks for your quick feedback, and for bringing up this issue.

> **drop**
> ```ruby
> # from: drop: 2 is part of Enumerator.from API
> Enumerator.from([node], drop: 2, &:parent).map(&:name)
> # generate: drop(2) is part of standard Enumerator API
> Enumerator.generate(node, &:parent).take(6).map(&:name).drop(2)
> ```

I presume .take(6) was inserted by mistake, but with or without it, the map and drop methods that follow belong to Enumerable and are Array-based operations, each of which creates an intermediate array.  So I consider them part of the Array/Enumerable API rather than the Enumerator API.  Creating intermediate arrays is not only a waste of memory but also goes against the key concept of Enumerator: dealing with an object as a stream, which may be infinite.
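
To make the point about intermediate arrays concrete, here is a minimal sketch.  It does not use the proposed API; `naturals` is a hand-rolled infinite enumerator made up for this example, and only existing Enumerable/Enumerator methods are called.

```ruby
# A hand-rolled infinite stream of 1, 2, 3, ... (illustrative only).
naturals = Enumerator.new do |y|
  n = 1
  loop do
    y << n
    n += 1
  end
end

# Enumerable#take, #map and #drop each build a new Array.
eager = naturals.take(6).map { |n| n * 2 }.drop(2)
p eager.class  # Array
p eager        # [6, 8, 10, 12]

# Without the take, naturals.map { ... } would never return,
# because Enumerable#map tries to collect every element.

# Going through Lazy keeps it a stream, at the cost of getting a Lazy back.
lazy = naturals.lazy.map { |n| n * 2 }.drop(2)
p lazy.class     # Enumerator::Lazy
p lazy.first(4)  # [6, 8, 10, 12]
```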

Adding .lazy before .drop(2) can be a cure, but then the value you get is a lazy enumerator, which is incompatible with a non-lazy enumerator.  For instance, Lazy#map, Lazy#select, etc. return Lazy objects, so you can't always pass one to methods that expect a normal Enumerable object.

I've always thought that a Lazy#eager method that turns a lazy enumerator back into a non-lazy one would be nice, but .lazy.map{}.eager would look messy anyway.
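
As a rough illustration of what such a conversion would do, a lazy enumerator can be re-wrapped in a plain Enumerator.  This is only a sketch: `eager_from` is a hypothetical helper name, not an existing method.

```ruby
# Hypothetical helper: wrap a lazy enumerator in a plain Enumerator,
# so that map/select on the result return Arrays again.
def eager_from(lazy)
  Enumerator.new do |y|
    lazy.each { |x| y << x }
  end
end

lazy  = (1..Float::INFINITY).lazy.map { |n| n * n }
plain = eager_from(lazy)

p lazy.class                 # Enumerator::Lazy
p plain.class                # Enumerator
p plain.first(3)             # [1, 4, 9]
p plain.take(3).map(&:to_s)  # ["1", "4", "9"], an Array
```

The wrapper still pulls elements one at a time from the lazy chain, so it stays usable on an infinite source as long as the consumer stops early.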

> # implicit "stop on nil" is part of Enumerator.from convention that code reader should be aware of

I think treating nil as an end is a good and reasonable default.  Taking your Octokit example, if nil were treated as an end, the block could be { |response| response.rels[:next]&.get } to go through all pages and stop automatically.  You omitted a .take_while in the example, but you'd get an error if there were fewer than 3 pages.  Without a default end, you'd almost always need to either raise StopIteration explicitly in the block or chain .take_while/.take, and the choice between them is not obvious.
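
Here is a minimal sketch of the "nil ends the sequence" convention, written with Enumerator.new.  `generate_until_nil` and the `Node` struct are names invented for this example; this is not the attached implementation.

```ruby
# A tiny stand-in for a tree node with a parent link (illustrative only).
Node = Struct.new(:name, :parent)

# Hypothetical sketch: each element is the block applied to the previous
# one, and a nil result ends the sequence.
def generate_until_nil(initial, &block)
  Enumerator.new do |y|
    value = initial
    until value.nil?
      y << value
      value = block.call(value)
    end
  end
end

html = Node.new("html", nil)
body = Node.new("body", html)
a    = Node.new("a", body)

p generate_until_nil(a, &:parent).map(&:name)  # ["a", "body", "html"]
```

With this convention the caller needs neither .take_while nor an explicit StopIteration to walk up to the root.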

> **start with array** (I believe 1 and 0 initial values are the MOST used cases)
> ```ruby
> # from: we should start from empty array, expression nothing but Enumerator.from API limitation
> Enumerator.from([]) { 0 }.take(10)
> # generate: no start value
> Enumerator.generate { 0 }.take(10)
> ```

The limitation only came from how the word "from" sounds.  Having picked the name "from", I felt Enumerator.from {} just didn't sound right, so I made the argument mandatory.  The first argument could simply default to [] if that reads and writes better, possibly under a different name than "from", which I won't insist on.
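
A sketch of what defaulting the first argument to [] could look like follows.  It assumes the semantics as discussed in this thread (seeds are emitted first, then the block receives as many previous values as there were seeds); `enum_from` is a made-up name, and the drop: option and nil handling are left out for brevity.

```ruby
# Hypothetical sketch, not the attached enumerator_from.rb.
# Seeds are emitted first; the block is then called with the last
# seeds.size values to produce each following element.
def enum_from(seeds = [], &block)
  Enumerator.new do |y|
    window = seeds.dup
    window.each { |v| y << v }
    loop do
      value = block.call(*window)
      y << value
      unless window.empty?
        window.shift        # slide the window of previous values
        window.push(value)
      end
    end
  end
end

p enum_from { 0 }.take(3)                      # [0, 0, 0]
p enum_from([1], &:succ).take(5)               # [1, 2, 3, 4, 5]
p enum_from([0, 1]) { |i, j| i + j }.take(10)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```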

> ```ruby
> # from: work with one value requires not forgetting to arrayify it
> Enumerator.from([1], &:succ).take(10)
> # generate: just use the value
> Enumerator.generate(1, &:succ).take(10)
> ```

Yeah, because our keyword arguments are pseudo ones, you can't use variable-length arguments for a list of objects that might end with a hash.  Hopefully we'll get that right by Ruby 3.0.

There's much room for consideration of the name and method signature.  Perhaps multiple factory methods could work better.

> ```ruby
> # from: "we pass as much of previous values as initial array had" convention
> Enumerator.from([0, 1]) { |i, j| i + j }.take(10)
> # generate: regular value enumeration, next block receives exactly what previous returns
> Enumerator.generate([0, 1]) { |i, j| [j, i + j] }.take(10).map(&:last)
> # ^ yes, it will require additional trick to include 0 in final result, but I believe this is worthy sacrifice
> ```

The former directly generates an infinite Fibonacci sequence, and that's a major difference.  Taking the first few elements with .take is just for testing (assertion) purposes and not part of the use case.  When solving a problem like "Find the least n such that \sum_{k=1}^{n} fib(k) >= 1000", take wouldn't work optimally.
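
To illustrate that problem, here is a small sketch that consumes an infinite Fibonacci enumerator directly.  The enumerator is built by hand with Enumerator.new, so it works regardless of which factory method is eventually adopted; the variable names are just for this example.

```ruby
# Infinite Fibonacci stream 1, 1, 2, 3, 5, ... (fib(1), fib(2), ...).
fib = Enumerator.new do |y|
  a, b = 1, 1
  loop do
    y << a
    a, b = b, a + b
  end
end

sum = 0
# with_index(1) numbers the elements from 1; find stops at the first hit,
# so only as many Fibonacci numbers as needed are generated.
_, n = fib.with_index(1).find { |f, _k| (sum += f) >= 1000 }

p n    # 15
p sum  # 1596
```

With .take(k) one would have to guess k in advance, either generating too few elements or building an unnecessarily large array.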

> The problem with "API complication" is inconsistency. Like, a newcomer may ask: Why Enumerator.from has "this handy drop: 2 initial arg", and each don't? Use cases could exist, too!

I understand the sentiment, but it's no surprise that a factory/constructor method of a dedicated class often takes many tunables while individual instance methods do not.  If everyone said they needed it as a generic feature, I'd be open to adding something like Enumerable#skip(n) that returns an offset enumerator.
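
For the sake of discussion, an offset enumerator could behave roughly like the sketch below.  `skip` here is a plain helper function standing in for the hypothetical Enumerable#skip(n); nothing of the sort exists at the moment.

```ruby
# Hypothetical helper standing in for the Enumerable#skip(n) idea:
# returns a plain Enumerator that omits the first n elements
# without building an intermediate array.
def skip(enum, n)
  Enumerator.new do |y|
    enum.each_with_index { |x, i| y << x if i >= n }
  end
end

p skip(%w[a b c d e], 2).to_a                # ["c", "d", "e"]
p skip((1..Float::INFINITY), 2).first(3)     # [3, 4, 5]
p skip((1..Float::INFINITY), 2).class        # Enumerator
```

Unlike Enumerable#drop, the result is still an enumerator, so it can be chained on an infinite source.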

----------------------------------------
Feature #14781: Enumerator#generate
https://bugs.ruby-lang.org/issues/14781#change-74506

* Author: zverok (Victor Shepelev)
* Status: Feedback
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
This is an alternative proposal to Object#enumerate (#14423), which many considered a good idea, but with uncertain naming and too radical (an Object extension). This one is _less_ radical and, at the same time, more powerful.

**Synopsis**:
* Enumerator.generate(initial, &block): produces an infinite sequence where each next element is calculated by applying the block to the previous one; initial is the first element of the sequence;
* Enumerator.generate(&block): the same; the first element of the sequence is the result of calling the block with no args.

This method makes it possible to produce enumerators that replace a lot of common while and loop cycles, in the same way #each replaces for.
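
The synopsis above can be approximated with a few lines of plain Ruby.  This is only a sketch under the stated semantics, not the reference implementation; `generate` is defined as a top-level helper and `NO_INITIAL` is a sentinel invented for the example.

```ruby
# Sentinel meaning "no initial value was given" (hypothetical name).
NO_INITIAL = Object.new

# Sketch of the proposed semantics: yield the initial value (or the result
# of calling the block with no args), then keep applying the block to the
# previous element.
def generate(initial = NO_INITIAL, &block)
  raise ArgumentError, "no block given" unless block

  Enumerator.new do |y|
    value = initial.equal?(NO_INITIAL) ? block.call : initial
    loop do
      y << value
      value = block.call(value)
    end
  end
end

p generate(1, &:succ).take(5)
# => [1, 2, 3, 4, 5]
p generate([0, 1]) { |f0, f1| [f1, f0 + f1] }.take(10).map(&:first)
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
p generate { 7 }.take(3)
# => [7, 7, 7]
```

Because the body runs inside loop, a block that raises StopIteration ends the sequence in this sketch.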

**Examples:**

With initial value

```ruby
# Infinite sequence
p Enumerator.generate(1, &:succ).take(5)
# => [1, 2, 3, 4, 5]

# Easy Fibonacci
p Enumerator.generate([0, 1]) { |f0, f1| [f1, f0 + f1] }.take(10).map(&:first)
#=> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

require 'date'

# Find next Tuesday
p Enumerator.generate(Date.today, &:succ).detect { |d| d.wday == 2 }
# => #<Date: 2018-05-22 ((2458261j,0s,0n),+0s,2299161j)>

# ---------------
require 'nokogiri'
require 'open-uri'

# Find some element on page, then make list of all parents
p Nokogiri::HTML(open('https://www.ruby-lang.org/en/'))
  .at('a:contains("Ruby 2.2.10 Released")')
  .yield_self { |a| Enumerator.generate(a, &:parent) }
  .take_while { |node| node.respond_to?(:parent) }
  .map(&:name)
# => ["a", "h3", "div", "div", "div", "div", "div", "div", "body", "html"]

# Pagination
# ----------
require 'octokit'

Octokit.stargazers('rails/rails')
# ^ this method returned just an array, but has set .last_response to the full response, with data
# and pagination. So now we can do this:
p Enumerator.generate(Octokit.last_response) { |response|
  response.rels[:next].get  # pagination: get fetches next Response
}
  .first(3)                 # take just 3 pages of stargazers
  .flat_map(&:data)         # data is parsed response content (stargazers themselves)
# => ["wycats", "brynary", "macournoyer", "topfunky", "tomtt", "jamesgolick", ...
```

Without initial value

```ruby
# Random search
target = 7
p Enumerator.generate { rand(10) }.take_while { |i| i != target }.to_a
# => [0, 6, 3, 5,....]

# External while condition
require 'strscan'
scanner = StringScanner.new('7+38/6')
p Enumerator.generate { scanner.scan(%r{\d+|[-+*/]}) }.slice_after { scanner.eos? }.first
# => ["7", "+", "38", "/", "6"]

# Potential message loop system:
Enumerator.generate { Message.receive }.take_while { |msg| msg != :exit }
```

**Reference implementation**: https://github.com/zverok/enumerator_generate

I want to **thank** all peers that participated in the discussion here, on Twitter and Reddit.

---Files--------------------------------
enumerator_from.rb (3.16 KB)
