Issue #12306 has been updated by Sam Saffron.


Shyouhei Urabe wrote:

> is clear that the XINCLUDES indicates this string is passed to a shell script.  The call of strip here surely intends stripping shell's definition of whitespaces, not the Unicode one.  Do you think Unicode-aware strip cannot introduce regressions for this kind of situations?

In this particular case I fail to see how this can cause any problem. 

```
sam@ubuntu ruby % ruby -e 'p ARGV[0]' test
"test"
```

Most shells these days allow you to pass in UTF-8 strings, but saying that there is one real case in the universe where someone really wanted to include the library `hacky` as opposed to the lib called `hacky` when building tk is not true. blank? and utf8 aware strip would work fine here and not break a single real example in this usage. 

That said, internal uses of strip while dealing with shell params passed to ARGV may need some care, but requiring people start quoting leading and trailing unicode spaces is not really a major breaking change imo. 


----------------------------------------
Feature #12306: Implement String #blank? #present? and improve #strip and family to handle unicode
https://bugs.ruby-lang.org/issues/12306#change-58332

* Author: Sam Saffron
* Status: Open
* Priority: Normal
* Assignee: Yukihiro Matsumoto
----------------------------------------
Time and again there have been rejected feature requests to Ruby core to implement `blank` and `present` protocols across all objects as ActiveSupport does. I am fine with this call and think it is fair. 

However, for the narrow case of String having `#blank?` and `#present?` makes sense. 

- Provides a natural extension over `#strip`, `#lstrip` and `#rstrip`. `("   ".strip.length == 0) == "    ".blank?`

- Plays nicely with ActiveSupport, providing an efficient implementation in Ruby core: see: https://github.com/SamSaffron/fast_blank, implementing blank efficiently requires a c extension. 

However, if this work is to be done, `#strip` and should probably start dealing with unicode blanks, eg: 

```
irb(main):008:0> [0x3000].pack("U")
=> "กก"
irb(main):009:0> [0x3000].pack("U").strip.length
=> 1
```

So there are 2 questions / feature requests here

1. Can we add blank? and present? to String? 
2. Can we amend strip and family to account for unicode per: https://github.com/SamSaffron/fast_blank/blob/master/ext/fast_blank/fast_blank.c#L43-L74



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>