Issue #11815 has been updated by Daniel P. Clark.


I like how your Array#difference method works well with duplicate entries.  So far I've only come across cases where the difference of id references between two lists needed to be determined.  In my case it looks like this:

~~~ruby
a = [2, 4, 6, 8, 2, 4, 6, 8]
b = [1, 2, 3, 4, 1, 2, 3, 4]

# example
b - a
# => [1, 3, 1, 3]

b - a | a - b
# => [1, 3, 6, 8]
~~~

This is like the first example you gave, with `| b - a` added to evaluate uniqueness in both directions.  For this input, doing the same thing with Array#difference gives the same result as my example above.

~~~ruby
a = [2, 4, 6, 8, 2, 4, 6, 8]
b = [1, 2, 3, 4, 1, 2, 3, 4]

# example
b.difference(a)
# => [1, 3, 1, 3]

b.difference(a) | a.difference(b)
# => [1, 3, 6, 8]
~~~

To avoid confusion: the two are not the same in general, as I will demonstrate with Cary Swoveland's input.

~~~ruby
a = [1,2,3,4,3,2,2,4]
b = [2,3,4,4,4]

b.difference(a)
# => [4] 
b - a
# => []

a.difference(b)
# => [1, 3, 2, 2] 
a - b
# => [1]
~~~

As for a real-world use case for Array#difference: Service (A) exports all data to CSV files with a background worker. Service (B) exports to a database with a background worker.  Sometimes a background worker crashes.  To figure out what's missing, we compare the difference between the two datasets.  *One flaw in my example is that it does not determine the position at which the new data needs to be inserted to match the other dataset.  For that we would need to use something like Enumerator#with_index.*
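A minimal sketch of that comparison, under stated assumptions: the row data is invented for illustration, and `multiset_difference` is a hypothetical standalone helper that follows the proposed semantics. It is written as a plain method rather than by reopening Array, so it does not depend on how any given Ruby version defines `Array#difference`.

```ruby
# Hypothetical helper: for each element of `other`, remove at most one
# matching element from a copy of `receiver` (the proposed semantics).
def multiset_difference(receiver, other)
  result = receiver.dup
  other.each do |e|
    idx = result.index(e)
    result.delete_at(idx) if idx
  end
  result
end

# Invented sample rows; the CSV export holds one duplicate that the
# database export lost.
csv_rows = [[1, "signup"], [2, "login"], [2, "login"], [3, "logout"]]
db_rows  = [[1, "signup"], [2, "login"], [3, "logout"]]

missing_from_db  = multiset_difference(csv_rows, db_rows)
# => [[2, "login"]]
missing_from_csv = multiset_difference(db_rows, csv_rows)
# => []

# Array#- drops every matching row, so it cannot see the lost duplicate.
csv_rows - db_rows
# => []
```

The duplicate row is only visible because one match is removed per element, which is the case where `-` falls short.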

@Cary Swoveland: if I could make one recommendation on the implementation, I think it would be best to have it return an Enumerator so that it can be evaluated lazily.  That way, while the difference is being computed, we can perform operations along the way and save system resources.
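One way such a lazy variant might look is sketched below. This is not the proposed implementation: `lazy_difference` is a hypothetical name, and it swaps in a counting technique (a hash of how many copies of each element still need to be removed) in place of repeated `index`/`delete_at` calls.

```ruby
# Hypothetical lazy variant: yields the elements of `receiver` that survive
# after removing one occurrence per element of `other`, in receiver order.
def lazy_difference(receiver, other)
  to_remove = Hash.new(0)
  other.each { |e| to_remove[e] += 1 }

  Enumerator.new do |yielder|
    # Work on a copy so the enumerator can be run more than once.
    remaining = to_remove.dup
    receiver.each do |e|
      if remaining[e] > 0
        remaining[e] -= 1   # consume one pending removal, skip this element
      else
        yielder << e        # nothing left to remove for e, so keep it
      end
    end
  end.lazy
end

a = [1,2,3,4,3,2,2,4]
b = [2,3,4,4,4]

lazy_difference(a, b).first(2)
# => [1, 3]
lazy_difference(a, b).to_a
# => [1, 3, 2, 2]
```

One design caveat: hash lookup compares with `eql?` and `hash`, whereas `Array#index` compares with `==`, so the two approaches can disagree for elements such as `1` and `1.0`.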

----------------------------------------
Feature #11815: Proposal for method `Array#difference`
https://bugs.ruby-lang.org/issues/11815#change-55552

* Author: Cary Swoveland
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
I propose that a method `Array#difference` be added to the Ruby core. It is similar to [Array#-](http://ruby-doc.org/core-2.2.0/Array.html#method-i-2D) but for each element of the (array) argument it removes only one matching element from the receiver. For example:

    a = [1,2,3,4,3,2,2,4]
    b = [2,3,4,4,4]

    a - b #=> [1]
    c = a.difference b #=> [1, 3, 2, 2] 

As you see, `a` contains three `2`'s and `b` contains one, so the first `2` in `a` has been removed from `a` in constructing `c`. When `b` contains at least as many instances of an element as does `a`, `c` contains no instances of that element.
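The rule can also be read in terms of counts: each element appears in `c` as many times as it appears in `a`, minus its count in `b`, but never fewer than zero times. The sketch below checks that reading against the result stated above; the helper name `counts` is an assumption made for illustration.

```ruby
# Hypothetical helper: map each element to its number of occurrences.
def counts(array)
  array.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
end

a = [1,2,3,4,3,2,2,4]
b = [2,3,4,4,4]
c = [1, 3, 2, 2]   # the value of a.difference(b) given above

ca, cb, cc = counts(a), counts(b), counts(c)

# Expected count in c: count in a minus count in b, floored at zero.
expected = ca.keys.map { |e| [e, [ca[e] - cb[e], 0].max] }.to_h
# => {1=>1, 2=>2, 3=>1, 4=>0}

expected.all? { |e, n| cc[e] == n }
# => true
```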

It could be implemented as follows:

    class Array
      def difference(other)
        dup.tap do |cpy|
          other.each do |e|
            ndx = cpy.index(e)
            cpy.delete_at(ndx) if ndx
          end
        end
      end
    end

Here are a few examples of its use:

*Identify an array's unique elements*

      a = [1,3,2,4,3,4]
      u = a.uniq #=> [1, 3, 2, 4]
      u - a.difference(u) #=> [1, 2]

*Determine if two words of the same size are anagrams of each other*

      w1, w2 = "stop", "pots"
      w1.chars.difference(w2.chars).empty?
        #=> true

*Identify a maximal number of 1-1 matches between the elements of two arrays and return an array of all elements from both arrays that were not matched*

      a = [1, 2, 4, 2, 1, 7, 4, 2, 9] 
      b = [4, 7, 3, 2, 2, 7] 
      a.difference(b).concat(b.difference(a))
        #=> [1, 1, 4, 2, 9, 3, 7] 
  
To remove elements from `a` starting at the end (rather than the beginning) of `a`:

    a = [1,2,3,4,3,2,2,4]
    b = [2,3,4,4,4]

    a.reverse.difference(b).reverse #=> [1,2,3,2]

`Array#difference!` could be defined in the obvious way.
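As a sketch of what "the obvious way" might mean, the version below mutates the receiver directly instead of working on a copy. Returning `self` in every case is an assumption; some bang methods in Ruby return `nil` when nothing changes, and the proposal does not say which behavior is intended.

```ruby
class Array
  # Hypothetical in-place variant: for each element of `other`, delete at
  # most one matching element from the receiver itself.
  def difference!(other)
    other.each do |e|
      ndx = index(e)
      delete_at(ndx) if ndx
    end
    self   # assumption: always return the (possibly modified) receiver
  end
end

a = [1,2,3,4,3,2,2,4]
b = [2,3,4,4,4]

a.difference!(b)
a #=> [1, 3, 2, 2]
```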



-- 
https://bugs.ruby-lang.org/