Hello Yui,

On 2014/10/21 16:34, naruse / airemix.jp wrote:
> Issue #10084 has been updated by Yui NARUSE.
>
>
>>    class Unicode < self
>>      def self.download(name, *rest)
>>        super("http://www.unicode.org/Public/UCD/latest/ucd/#{name}", name, *rest)
>>      end
>>    end
>
> "latest" is not acceptable because released Ruby's table must be a specific version.

[I disagree with this policy, but I will of course respect it until I 
can convince others that a more dynamic policy is better.]
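
As a rough sketch of what pinning to a specific version could look like,
the URL can be built from a version constant instead of "latest". The
names UNICODE_VERSION and ucd_url are hypothetical, and 7.0.0 is only an
assumed value for illustration; the sketch relies on unicode.org serving
each release under a versioned directory.

```ruby
# Illustrative only: UNICODE_VERSION and ucd_url are not names from the
# actual build scripts, and "7.0.0" is an assumed version.
UNICODE_VERSION = "7.0.0"

# Build the URL of one Unicode Character Database file for a fixed version,
# so that a released Ruby always sees the same data.
def ucd_url(name, version = UNICODE_VERSION)
  "http://www.unicode.org/Public/#{version}/ucd/#{name}"
end

puts ucd_url("UnicodeData.txt")
```

Moving to a newer Unicode version then becomes an explicit, reviewable
change to one constant.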

> Moreover generated lib/unicode_normalize/tables.rb is only 200KB. How about committing it to the repo like other conversion tables?

I came to the same conclusion, and I have just done so at r48072.

Nobu and I tried to make updating the Unicode data files automatic and 
unobtrusive, but we found that it is difficult to satisfy all of the 
following at once:
- Use the already downloaded Unicode data files if there is no network
   connection.
- Check for updates dynamically.
- Make sure that this happens regularly (I think currently it is done
   with "make up", but not everybody packaging Ruby uses "make up").

I hope we can try to keep the makefile logic for automatic update of 
Unicode data files and lib/unicode_normalize/tables.rb but change it so 
that it is triggered only on request.
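
To illustrate the "only on request" idea, here is a minimal sketch in 
Ruby rather than in makefile terms. It is not the logic Nobu and I 
committed: fetch_data_file, its update: flag, and the fetcher: parameter 
are hypothetical names. An existing file is used as is unless an update 
is requested, and a failed update falls back to the existing file.

```ruby
require "fileutils"
require "net/http"
require "uri"

# Hypothetical helper, not part of Ruby's build scripts.
# The default fetcher does a plain GET and does not check the HTTP status.
def fetch_data_file(url, path, update: false,
                    fetcher: ->(u) { Net::HTTP.get(URI(u)) })
  cached = File.exist?(path)
  return path if cached && !update      # no network access unless requested

  begin
    body = fetcher.call(url)
  rescue SocketError, SystemCallError, Timeout::Error, IOError => e
    raise unless cached                 # nothing downloaded earlier: give up
    warn "update failed (#{e.class}); keeping existing #{path}"
    return path
  end

  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, body)
  path
end
```

A call with update: true would correspond to an explicit update target; 
an ordinary build would call it without the flag and never touch the 
network once the files are present.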

Regards,   Martin.

> ----------------------------------------
> Feature #10084: Add Unicode String Normalization to String class
> https://bugs.ruby-lang.org/issues/10084#change-49562
>
> * Author: Martin Dürst
> * Status: Assigned
> * Priority: Normal
> * Assignee: Martin Dürst
> * Category:
> * Target version: Ruby 2.2.0
> ----------------------------------------
> Unicode string normalization is a frequent operation when comparing or normalizing strings.
>
> This should be available directly on the String class.
>
> The proposed syntax is:
>
>     'string'.normalize       # normalize 'string' according to NFC (most frequent on the Web)
>     'string'.normalize :nfc  # normalize 'string' according to NFC; :nfd, :nfkc, :nfkd also usable
>     'string'.nfc             # shorter variant, but maybe too many methods
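
A small sketch of the comparison use case, assuming the method name 
String#unicode_normalize (matching lib/unicode_normalize, Ruby 2.2 or 
later) instead of the normalize name proposed in the issue. The helper 
same_text? is a hypothetical name used only for illustration.

```ruby
# Compare two strings by their NFC forms, so that precomposed and
# decomposed spellings of the same text are treated as equal.
def same_text?(a, b)
  a.unicode_normalize(:nfc) == b.unicode_normalize(:nfc)
end

composed   = "\u00E9"   # e with acute accent as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

puts composed == decomposed            # false: different code point sequences
puts same_text?(composed, decomposed)  # true: identical after normalization
```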
>
> There are several "unofficial" but convenient normalization variants that could be offered, e.g.:
>
>     'string'.normalize :mac  # use Macintosh file system normalization variant
>
> Implementations are already available in pure Ruby (easy for other Ruby implementations; e.g. eprun: https://github.com/duerst/eprun) and in C (unf, http://bibwild.wordpress.com/2013/11/19/benchmarking-ruby-unicode-normalization-alternatives/)
>
> ---Files--------------------------------
> Normalization.pdf (576 KB)
>
>