On Wed, 23 Apr 2008 15:35:11 -0500, Evgeni Belin wrote:

> This is happening in ruby 1.8.6:
> 
> % ruby --version
> ruby 1.8.6 (2007-09-23 patchlevel 110) [i686-darwin9.1.0]

>>> "".chop
> => "\320"
>>> "".chop.chop
> => ""
> 
> 
> As you can see chop removes last byte vs last char.  Btw, problem

Ruby 1.8 does not have multi-byte character support built into String, 
so it treats each byte as one character. That is why chop removes only 
the last byte. If you would like UTF-8 support, add this at the top of 
your scripts:

$KCODE='u'
require 'jcode'

Then chop will remove a whole character instead of a single byte. 
(Though I'm not sure every String method becomes multi-byte aware.)

Ruby 1.9 has Unicode support built in and handles this properly out of 
the box.
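
To make the difference concrete, here is a minimal sketch for Ruby 1.9 
or later (it will not run on 1.8). The sample character U+0434 is my 
own assumption, since the character in your report did not survive the 
mail; it was picked only because its first UTF-8 byte is 0xD0, which is 
the "\320" you saw. Forcing the string to BINARY is just a way to mimic 
the byte-wise behaviour of plain 1.8.

```ruby
# Sketch for Ruby 1.9+. The sample character is an assumption, not the
# one from the original report.
s = "\u0434"             # one character, two bytes in UTF-8 (0xD0 0xB4)

puts s.length            # 1 (characters)
puts s.bytesize          # 2 (bytes)

# Character-aware chop: removes the whole two-byte character.
char_chop = s.chop

# Byte-wise chop: reinterpret a copy as raw bytes, then chop one byte off.
byte_chop = s.dup.force_encoding("BINARY").chop

p char_chop              # empty string
p byte_chop.bytes        # the single leftover byte, 208 (octal 320)
```

The first chop is what you expected to get; the second reproduces what 
1.8 gave you without $KCODE and jcode.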




-- 
Ken (Chanoch) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/