Issue #12745 has been updated by Herwin W.


I posted this link to IRC (#ruby on freenode) to see if anyone had a good name suggestion. The suggestions `#matchsub` and `#msub` were offered. I don't really like them: `#matchsub` sounds like it tries to substitute a match (which is exactly what `#sub` does), and `#msub` reminds me of the `/m` regex modifier (multiline), just as `#gsub` is named after the `/g` modifier (global), rather than of a `MatchData` object. (Although I smiled at the suggestion `#gsubthewayitshouldhavebeenallalong`, the name reminds me too much of `mysql_real_escape_string` to take it seriously.)

Another option that was suggested was adding a keyword argument to toggle the behaviour. I actually liked that proposal: it keeps backwards compatibility and doesn't lead to an explosion in the number of methods.
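
To make the keyword-argument idea a bit more concrete, here is a rough sketch written as a standalone helper rather than as a patch to `String`. The helper name `sub_all` and the keyword `matchdata:` are made up for illustration; nobody in the discussion settled on a name or signature.

```ruby
# Hypothetical helper: `sub_all` and the `matchdata:` keyword are
# illustrative names only, not an existing or agreed-upon API.
def sub_all(str, pattern, matchdata: false)
  str.gsub(pattern) do |matched|
    # Default: yield the matched String, as String#gsub does today.
    # Opt-in: yield the MatchData of the current match instead.
    yield(matchdata ? Regexp.last_match : matched)
  end
end

str = '1.2.[three].4'

# Existing behaviour: the block sees the String '[three]'.
old_style = sub_all(str, /\[(\w+)\]/) { |m| m.upcase }

# Opt-in behaviour: the block sees a MatchData and can pick a capture.
new_style = sub_all(str, /\[(\w+)\]/, matchdata: true) { |m| m[1].upcase }
```

Callers that don't pass the keyword keep working unchanged, which is the backwards-compatibility argument above.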

----------------------------------------
Feature #12745: String#(g)sub(!) should pass a MatchData to the block, not a String
https://bugs.ruby-lang.org/issues/12745#change-61867

* Author: Herwin W
* Status: Feedback
* Priority: Normal
* Assignee: Yukihiro Matsumoto
----------------------------------------
A simplified (and stupid) example: replace some placeholders in a string with the results of function calls.

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

str = '1.2.[three].4'
str.gsub!(/\[(\w+)\]/) { |m| placeholder(m) }
~~~

This raises 'Incorrect value', because the block doesn't receive the capture 'three' but the full match '[three]'. It looks like we have 3 options to fix that:

1. Match `[three]` instead of `three` in the placeholder replacement method
2. Pass `m[1..-2]` instead of `m` to the method (or strip it in `placeholder`)
3. Use `$1` in the method call, ignore the value that's passed to the block

Options 1 and 2 look like code duplication to me (they're possible in this simplified example, but might get tricky in real situations). I don't like option 3 because you completely ignore the value that's been passed to the block in favor of global variables, you can't use named captures, and writing code this way makes it incompatible with Rubinius.
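
For reference, the three workarounds could look roughly like this with the current `String#gsub`. The method `placeholder_with_brackets` is a made-up name for the option 1 variant; the rest reuses the example above.

```ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

# Option 1 variant (hypothetical name): expects the brackets as well.
def placeholder_with_brackets(val)
  raise 'Incorrect value' unless val == '[three]'
  '3'
end

str = '1.2.[three].4'
pattern = /\[(\w+)\]/

# Option 1: match '[three]' inside the replacement method.
option1 = str.gsub(pattern) { |m| placeholder_with_brackets(m) }

# Option 2: strip the brackets from the String passed to the block.
option2 = str.gsub(pattern) { |m| placeholder(m[1..-2]) }

# Option 3: ignore the block argument and read the capture from $1.
option3 = str.gsub(pattern) { placeholder($1) }
```

All three produce the same string; they differ only in where the knowledge about the brackets ends up.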

I think it would be more logical to pass a `MatchData` (like what you'd get with `String#match`) instead of a `String` to the block. `MatchData#to_s` returns the whole matched string, so in 90% of the use cases the code could remain unaltered, but the remaining 10% makes it a change that shouldn't be backported to 2.3.
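
As a small illustration of what a block would have to work with, this is what `String#match` returns today for the example string. The named group `name` is added here only to show named captures; it is not part of the original example.

```ruby
m = '1.2.[three].4'.match(/\[(?<name>\w+)\]/)

whole  = m.to_s        # entire matched text, brackets included
first  = m[1]          # first capture group
named  = m[:name]      # the same group through its (illustrative) name
before = m.pre_match   # text in front of the match
after  = m.post_match  # text behind the match

# String interpolation calls #to_s, so this style of code keeps working.
interpolated = "<#{m}>"

# String-only methods are where existing blocks would break.
has_upcase = m.respond_to?(:upcase)
```

Blocks that only interpolate or call `#to_s` on their argument would be unaffected, while blocks that call `String` methods such as `#upcase` directly fall into the incompatible cases.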

Attached is a very naive patch to pass a `MatchData` to the block called by `String#sub`. The additional change in `rbinstall.rb` was required to run `make install`, which actually shows an incompatibility (which I hadn't anticipated).

---Files--------------------------------
ruby_string_sub_matchdata.diff (952 Bytes)


-- 
https://bugs.ruby-lang.org/
