Issue #12745 has been updated by Herwin W.


About the suggestions by Shyouhei Urabe: even though option 2 is very pragmatic (it solves the problem, keeps backwards compatibility and is a small change), the idea of the following code looks a bit off to me:

~~~ruby
str.gsub!(/\[(\w+)\]/) { |_, m| placeholder(m) }
~~~

I guess you would use either a String or a MatchData in the block, but never both (if you needed both, you could get the String from the MatchData, of course). Always having to ignore the first block argument feels a bit like a hack. Not that I have a better suggestion, though.
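As a small illustration of why a block would not need both arguments: a `MatchData` already carries the full matched text as well as the capture groups. The snippet below uses `String#match` on the example string from the issue description.

~~~ruby
# A MatchData exposes both the whole match and the individual captures,
# so a block that received one would not need a separate String argument.
md = '1.2.[three].4'.match(/\[(\w+)\]/)

whole   = md[0]    # the full matched text, brackets included
as_text = md.to_s  # same value as md[0]
capture = md[1]    # the first capture group only

puts whole, as_text, capture
~~~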


A separate method that passes a MatchData object would be another solution, but it also feels a bit hacky: String#gsub and String#gs would be almost identical, and which method would be preferred if you don't pass a block but use a second parameter as the replacement string? Also, without looking at the documentation, I would have no idea what String#gs/String#sg does; the names are not very descriptive.

----------------------------------------
Feature #12745: String#(g)sub(!) should pass a MatchData to the block, not a String
https://bugs.ruby-lang.org/issues/12745#change-60862

* Author: Herwin W
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
A simplified (and stupid) example: replace some placeholders in a string with function calls

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

str = '1.2.[three].4'
str.gsub!(/\[(\w+)\]/) { |m| placeholder(m) }
~~~

This raises 'Incorrect value' because the block receives the full matched text `'[three]'`, not the capture group `'three'`. It looks like we have 3 options to fix that:

1. Match `[three]` instead of `three` in the placeholder replacement method
2. Pass `m[1..-2]` instead of `m` to the method (or strip it in `placeholder`)
3. Use `$1` in the method call, ignore the value that's passed to the block

Options 1 and 2 look like code duplication to me (they're possible in this simplified example, but might get tricky in real situations). I don't like option 3 because you completely ignore the value that's been passed to the block in favor of global variables, you can't use named captures, and writing code this way makes it incompatible with Rubinius.
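For comparison, the three workarounds could be sketched roughly as follows against the simplified example. `placeholder_with_brackets` is a hypothetical name introduced here only to show option 1; it is not part of the original example. The non-destructive `gsub` is used so the same input can be reused for each variant.

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

# Option 1 (hypothetical variant): compare against the full match,
# brackets included.
def placeholder_with_brackets(val)
  raise 'Incorrect value' unless val == '[three]'
  '3'
end

str = '1.2.[three].4'

# Option 1: the method itself knows about the surrounding brackets.
opt1 = str.gsub(/\[(\w+)\]/) { |m| placeholder_with_brackets(m) }

# Option 2: strip the first and last character before calling the method.
opt2 = str.gsub(/\[(\w+)\]/) { |m| placeholder(m[1..-2]) }

# Option 3: ignore the block argument and read the capture from $1.
opt3 = str.gsub(/\[(\w+)\]/) { placeholder($1) }

puts opt1, opt2, opt3
~~~

In options 1 and 2 the knowledge about the brackets lives in two places (the regexp and the stripping or comparison code), which is the duplication mentioned above.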

I think it would be more logical to pass a `MatchData` (like what you'd get with `String#match`) instead of a `String` to the block. `MatchData#to_s` returns the whole match, so in 90% of the use cases the code could remain unaltered, but the remaining 10% makes it a change that shouldn't be backported to 2.3.
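To give an idea of what the block interface would look like, here is a minimal sketch that approximates the proposal in current Ruby. `gsub_with_match` is a hypothetical helper, not an existing method and not the attached patch; it still relies on the implicit last-match state via `Regexp.last_match`, so it only illustrates the calling convention.

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

# Hypothetical helper: yields the MatchData of each match instead of
# the matched String. The inner block runs in this method's frame, so
# Regexp.last_match refers to the match gsub has just made.
def gsub_with_match(str, pattern)
  str.gsub(pattern) { yield Regexp.last_match }
end

# With a MatchData in hand, named captures work without any globals
# in the calling code.
result = gsub_with_match('1.2.[three].4', /\[(?<name>\w+)\]/) do |m|
  placeholder(m[:name])
end

puts result
~~~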

Attached is a very naive patch to pass a `MatchData` to the block called by `String#sub`. The additional change in `rbinstall.rb` was required to run `make install`, which actually shows an incompatibility (which I hadn't anticipated).

---Files--------------------------------
ruby_string_sub_matchdata.diff (952 Bytes)

