Issue #12745 has been updated by Shyouhei Urabe.


We looked at this issue at today's developer meeting.

While matz's earlier suggestion of a new method was dug up from the past, others and I argued for extending the gsub method that exists today.  Proposals include:

1. What about just passing a MatchData instead of a String?  That hurts no one, because almost every time we need to pass a block to gsub etc., what is actually needed is $1.
2. What about passing a MatchData _in addition_ to a String, like gsub(){|string, md|}?  It is 100% safe even when the block parameter is used.
3. What about extending the block parameter string with a module, as open-uri does, to define additional methods?
4. There are, of course, attendees who think a separate method is the cleanest solution.

My idea was #3, but in reality #2 can be a perfect and easy solution.  What do you think, matz?
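
To make proposals 2 and 3 concrete, here is a rough sketch that emulates both on top of today's `gsub`. The helper names (`gsub_with_match`, `gsub_with_extended_string`) and the `MatchInfo` module are hypothetical and exist only for illustration; an actual implementation would live inside `String#gsub` itself.

~~~ruby
# Proposal 2 (emulated): yield the matched String first, then the MatchData.
# Hypothetical helper, not part of Ruby.
def gsub_with_match(str, pattern)
  # gsub sets the last match in this method's frame, so Regexp.last_match
  # inside the block refers to the match currently being replaced.
  str.gsub(pattern) { |matched| yield matched, Regexp.last_match }
end

# Proposal 3 (emulated): extend each yielded String with a module that
# carries the MatchData. MatchInfo is a hypothetical name.
module MatchInfo
  attr_accessor :match_data
end

def gsub_with_extended_string(str, pattern)
  str.gsub(pattern) do |matched|
    matched.extend(MatchInfo)
    matched.match_data = Regexp.last_match
    yield matched
  end
end

str = '1.2.[three].4'
pattern = /\[(\w+)\]/

# A block that takes only one parameter keeps working under proposal 2.
old_style = gsub_with_match(str, pattern) { |s| s.upcase }
# A block that takes two parameters gets at the capture group directly.
new_style = gsub_with_match(str, pattern) { |_s, md| md[1].upcase }
# Under proposal 3 the String itself answers to the extra method.
extended = gsub_with_extended_string(str, pattern) { |s| s.match_data[1].upcase }
~~~

In this sketch, existing one-parameter blocks are unaffected in both variants, which is the compatibility property that makes #2 and #3 attractive.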

----------------------------------------
Feature #12745: String#(g)sub(!) should pass a MatchData to the block, not a String
https://bugs.ruby-lang.org/issues/12745#change-60848

* Author: Herwin W
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
A simplified (and stupid) example: replace some placeholders in a string with function calls

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

str = '1.2.[three].4'
str.gsub!(/\[(\w+)\]/) { |m| placeholder(m) }
~~~

This raises 'Incorrect value' because the block receives the full match '[three]', not the captured group 'three'. It looks like we have 3 options to fix that:

1. Match `[three]` instead of `three` in the placeholder replacement method
2. Pass `m[1..-2]` instead of `m` to the method (or strip it in `placeholder`)
3. Use `$1` in the method call, ignore the value that's passed to the block
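
For reference, the three workarounds might look roughly like this with current Ruby. `placeholder_with_brackets` is a hypothetical variant of `placeholder`, introduced only to illustrate option 1.

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

# Option 1: a variant of the method that expects the brackets (hypothetical name).
def placeholder_with_brackets(val)
  raise 'Incorrect value' unless val == '[three]'
  '3'
end

str = '1.2.[three].4'
pattern = /\[(\w+)\]/

# Option 1: pass the full match and let the method deal with the brackets.
via_full_match = str.gsub(pattern) { |m| placeholder_with_brackets(m) }

# Option 2: strip the first and last character from the full match.
via_slice = str.gsub(pattern) { |m| placeholder(m[1..-2]) }

# Option 3: ignore the block parameter and read the capture from $1.
via_global = str.gsub(pattern) { placeholder($1) }
~~~

All three produce the same replacement, but each one repeats knowledge about the pattern outside the regexp or bypasses the block parameter.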

Options 1 and 2 look like code duplication to me (they're possible in this simplified example, but might get tricky in real situations). I don't like option 3 because you completely ignore the value that's been passed to the block in favor of global variables, you can't use named captures, and writing code this way makes it incompatible with Rubinius.

I think it would be more logical to pass a `MatchData` (like what you'd get with `String#match`) instead of a `String` to the block. `MatchData#to_s` returns the whole matched string, so in 90% of the use cases the code could remain unaltered, but the remaining 10% makes it a change that shouldn't be backported to 2.3.
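
As a small illustration of that trade-off, the sketch below uses `String#match` to obtain the kind of object the block would receive. It only shows existing `MatchData` behaviour and does not reflect the attached patch.

~~~ruby
md = '1.2.[three].4'.match(/\[(\w+)\]/)

# Code that only converts the block argument to a String keeps working,
# because MatchData#to_s returns the full match.
full_match   = md.to_s
interpolated = "<#{md}>"

# The capture group is available without touching $1.
capture = md[1]

# Named captures are reachable through the same object.
named = '1.2.[three].4'.match(/\[(?<name>\w+)\]/)[:name]

# Code that calls String-only methods on the block argument would break,
# since MatchData does not implement them.
has_upcase = md.respond_to?(:upcase)
~~~

Blocks that call methods such as `upcase` directly on their argument fall into the cases that would need changes.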

Attached is a very naive patch to pass a `MatchData` to the block called by `String#sub`. The additional change in `rbinstall.rb` was required to run `make install`, which actually shows an incompatibility (which I hadn't anticipated).

---Files--------------------------------
ruby_string_sub_matchdata.diff (952 Bytes)


-- 
https://bugs.ruby-lang.org/
