Issue #15764 has been updated by naruse (Yui NARUSE).


Whether an issue is a "Bug" or a "Feature" depends, in practice, on whether the fix should be backported.

Anyway, as far as I understand, Ruby has the following categories of characters:
* first character of constants (upper case)
* first character of local variables (lower case and '_')
* trailing characters of tokens (upper case, lower case, '_', and numerics)
* white space
* sigils (ASCII punctuation except '_')
* invalid characters (control characters other than NUL and white space)

So we need a mapping between the Unicode categories and the categories above.
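
To make the idea concrete, here is a rough sketch of what such a mapping could look like. The method name `classify`, the returned symbols, and the choice of Unicode general categories for each bucket are all assumptions made for illustration; this is not how the parser works today and not a proposed specification.

```ruby
# Hypothetical sketch: map one character to one of the categories listed above,
# using Unicode general categories. The bucket boundaries are assumptions.
def classify(char)
  case char
  when "_"                            then :local_start    # '_' behaves like lower case
  when /[\p{Z}\u0009-\u000D\u0085]/   then :whitespace     # separators plus ASCII/NEL white space
  when /[\p{Cc}\p{Cf}]/
    char == "\0" ? :other : :invalid                       # NUL is left out of "invalid"
  when /\p{Lu}/                       then :constant_start # upper case letters
  when /\p{L}/                        then :local_start    # assumption: every other letter
  when /[\p{N}\p{M}]/                 then :trailing       # numerics and combining marks
  when /[\p{P}\p{S}]/                 then :sigil          # assumption: extends ASCII punct to Unicode
  else                                     :other
  end
end

p classify("A")        # upper case letter
p classify("\u00A0")   # NO-BREAK SPACE
p classify("\u0001")   # control character
```

With a table like this, the lexer could reject `:whitespace` and `:invalid` characters inside tokens instead of accepting every non-ASCII character.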

In addition, we may also want to consider canonicalization for each sigil, such as `!`, `?`, and so on.
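
One possible reading of "canonicalization" here is Unicode compatibility normalization, where look-alike forms of a sigil fold to the ASCII character. The sketch below uses `String#unicode_normalize` with NFKC on the fullwidth forms of `!` and `?`; whether NFKC is the right form for this purpose is an assumption, not something decided in this ticket.

```ruby
# Sketch: fold fullwidth look-alikes of sigils to their ASCII forms with NFKC.
# Choosing NFKC is an assumption made for illustration only.
fullwidth_bang     = "\uFF01"  # FULLWIDTH EXCLAMATION MARK
fullwidth_question = "\uFF1F"  # FULLWIDTH QUESTION MARK

canonical_bang     = fullwidth_bang.unicode_normalize(:nfkc)
canonical_question = fullwidth_question.unicode_normalize(:nfkc)

p canonical_bang == "!"
p canonical_question == "?"
```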

----------------------------------------
Bug #15764: Whitespace and control characters should not be permitted in tokens
https://bugs.ruby-lang.org/issues/15764#change-77706

* Author: BatmanAoD (Kyle Strand)
* Status: Open
* Priority: Normal
* Assignee: matz (Yukihiro Matsumoto)
* Target version: 
* ruby -v: 
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
As of Ruby 2.5.1p57, it appears that all valid Unicode code-points above 128 are permitted in tokens. This includes whitespace and control characters.
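
The behavior can be probed on any given interpreter with a small helper like the one below. The names `fresh_binding` and `single_local_variable?` are hypothetical and introduced only for this sketch. The helper evaluates `NAME = 1` in an empty scope and reports whether that created exactly one local variable called NAME; on an interpreter that behaves as described above, the NO-BREAK SPACE case is expected to report true, but the result depends on the Ruby version in use.

```ruby
# Hypothetical helper: returns a binding with no local variables of its own.
def fresh_binding
  binding
end

# Reports whether `name` is lexed as a single local variable name.
# Anything that fails to parse, or parses as something else, yields false.
def single_local_variable?(name)
  scope = fresh_binding
  scope.eval("#{name} = 1")
  scope.local_variable_defined?(name.to_sym)
rescue SyntaxError, StandardError
  false
end

p single_local_variable?("foo")          # plain ASCII identifier
p single_local_variable?("foo bar")      # ASCII space splits the token
p single_local_variable?("foo\u00A0bar") # NO-BREAK SPACE: depends on the interpreter
```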

This was demonstrated here: https://gist.github.com/qrohlf/7045823

I have attached the raw download from the above gist.

The issue has been discussed on StackOverflow: https://stackoverflow.com/q/34455427/1858225

I would say this is arguably a bug, but I am marking this ticket as a "feature" since the current behavior could be considered by-design.

---Files--------------------------------
helloworld.rb (543 Bytes)


-- 
https://bugs.ruby-lang.org/
