hi!

recently, we've been seeing some weird segfaults with the Unicode
library [1] in our Rails app. i have tracked them down to a simple
expression like this:

  Unicode.downcase(''[//])        # OK
  Unicode.downcase(''[//].strip)  # BOOM!

where the latter ends up with a NULL pointer in
wstring.c:WStr_allocWithUTF8, so presumably all (?) Unicode methods
are affected, not just downcase. the offending line is 118:

  for (i = 0; in[i] != '\0'; i++) {

with 'const char* in' being NULL in the latter case, but not in the
former. so how can this be? both are just empty strings, are they
not? i suppose this is not a bug in Unicode, but rather in Ruby, right?

now we could easily work around this by only Unicode.downcase()ing
non-empty strings, but maybe there's some deeper issue behind this.
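
a minimal sketch of that workaround, in case it helps anyone else. note
that `unless_empty` is a made-up helper name, not part of the Unicode
library; it only hands non-empty strings to the block and returns empty
ones untouched:

```ruby
# hypothetical helper (not part of the Unicode library):
# yields non-empty strings to the block, returns empty ones as-is,
# so the extension never gets to see an empty string.
def unless_empty(str)
  str.empty? ? str : yield(str)
end

# intended use, assuming the Unicode library is loaded:
#   unless_empty(''[//].strip) { |s| Unicode.downcase(s) }
```

this only sidesteps the crash, of course; it says nothing about why the
pointer is NULL in the first place.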

[1] <http://www.yoshidam.net/Ruby.html#unicode>

cheers
jens

p.s.: ruby -v
ruby 1.8.6 (2007-06-07 patchlevel 36) [x86_64-linux]

-- 
Jens Wille, Dipl.-Bibl. (FH)
prometheus - Das verteilte digitale Bildarchiv für Forschung & Lehre
Kunsthistorisches Institut der Universität zu Köln
Albertus-Magnus-Platz, D-50923 Köln
Tel.: +49 (0)221 470-6668, E-Mail: jens.wille / uni-koeln.de
http://www.prometheus-bildarchiv.de/