David Flanagan wrote:
> Sam Ruby wrote:
>> I've tried porting a few small codebases, and a few experiments, and 
>> documented some of my findings here:
>>
>>   http://intertwingly.net/blog/2007/12/28/3-1-2
>>
>> - Sam Ruby
> 
> Here's the response I left on Sam's blog:

I responded, also on my weblog.  I would appreciate an explanation (or 
a pointer to the documentation) of the apparently inconsistent results 
in my table:

http://intertwingly.net/stories/2007/12/28/hearts.rb
http://intertwingly.net/stories/2007/12/28/hearts.html

An explanation of the differences between the following two would also 
be appreciated:

http://intertwingly.net/stories/2007/12/28/test1.rb
http://intertwingly.net/stories/2007/12/28/test2.rb

> Sam,
> 
> It sounds like your complaint is with Array.pack and the rexml library, 
> not with all of Unicode in Ruby 1.9.
> 
> Given that the point of Array.pack is to serialize data into byte 
> strings, I think its behavior is probably correct as it is.  Admittedly 
> confusing, though.  A documentation clarification is probably in order. 
>  (Though pack() has always been a confusing method!)
> 
> Instead of using pack to convert Unicode codepoints to strings, try the 
> Integer#chr method, with the desired encoding as an argument. (Your 
> comment system won't allow me to enter an example: it must think that 
> I'm embedding JS or something).
> 
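
For reference, a minimal sketch of the Integer#chr suggestion above, 
assuming a Ruby 1.9-or-later interpreter.  The code point (U+2665) and 
the variable names are chosen only for illustration; they are not taken 
from the blog post or the comment.

```ruby
# Build a one-character string from a Unicode code point with Integer#chr,
# passing the desired encoding as the argument.
codepoint = 0x2665                        # BLACK HEART SUIT, illustrative choice
heart = codepoint.chr(Encoding::UTF_8)

puts heart.encoding                       # the string is tagged as UTF-8
puts heart.length                         # counted in characters
puts heart.bytesize                       # counted in bytes

# Array#pack with the "C*" directive serializes integers into a byte string,
# so the result is tagged as binary (ASCII-8BIT), not as UTF-8.
raw = heart.bytes.to_a.pack("C*")
puts raw.encoding

# The bytes are identical; only the encoding tag differs.
puts raw.force_encoding(Encoding::UTF_8) == heart
```

The last line is the point of the comparison: pack hands back bytes, 
and it is up to the caller to say what encoding those bytes are in.
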
> I don't know anything about the rexml library.  But 1.9.0 is not 
> really expected to be stable yet, and I suspect that there are a number 
> of libraries that haven't been carefully ported yet.
> 
> Like so much of Ruby, I think you've got to give the Unicode support a 
> chance to grow on you.  I don't understand why Matz made some of the 
> choices he did, but they seem to work okay.  Keep in mind, too, that the 
> goal was not just to support Unicode but also to support Japanese 
> encodings.  So some of the design decisions might make a lot more sense 
> to programmers who have to work with SJIS and EUC every day.
> 
> Finally, Ruby does inherit the default external encoding from the locale 
> if you don't specify an encoding with -K, -E or --encoding.  This is the 
> encoding assumed when you read from a file and do not specify a 
> different encoding.  (It is not used when you write to a file or read or 
> write from a socket or pipe, however.)  It respects the standard 
> LC_CTYPE, LC_ALL, and LANG variables.  Encoding.default_external returns 
> the value.  Encoding.locale_encoding didn't make it into 1.9.0, but it 
> is in the current sources and returns the default encoding for the 
> locale even if -K, -E, or --encoding is specified.
> 
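
A small sketch for checking the default external encoding described 
above on a given machine.  It assumes a Ruby release where 
Encoding.find("locale") is available (used here in place of 
Encoding.locale_encoding, which the quoted text says is not in 1.9.0); 
the temporary file and variable names are illustrative only.

```ruby
require "tempfile"

# Encoding assumed when reading a file without naming an encoding.
external = Encoding.default_external
# Encoding derived from the locale (LC_ALL, LC_CTYPE, LANG).
locale = Encoding.find("locale")

puts "default external: #{external.name}"
puts "locale:           #{locale.name}"

# Write a scratch file so there is something to open for reading.
file = Tempfile.new("enc-demo")
file.write("abc")
file.close

# No encoding given: the read side falls back to the default external encoding.
default_read = File.open(file.path, "r") { |f| f.external_encoding }
# Encoding named in the mode string: it overrides the default for this file.
latin1_read = File.open(file.path, "r:iso-8859-1") { |f| f.external_encoding }

puts "plain open:       #{default_read.name}"
puts "r:iso-8859-1:     #{latin1_read.name}"

file.unlink
```

Running it once as is and once with -E or --encoding on the command 
line should show which of the values follow the switch.
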
> (I attempt to explain all this in The Ruby Programming Language which 
> should be in bookstores in about a month.  I'm making the last-minute 
> changes today.)
> 
> David Flanagan
>