On 11/6/05, Park Heesob <phasis68 / hotmail.com> wrote:
> Hi,
>
> >From: Gyoung-Yoon Noh <nohmad / gmail.com>
> >Reply-To: ruby-talk / ruby-lang.org
> >To: ruby-talk / ruby-lang.org (ruby-talk ML)
> >Subject: Problem in writing ruby extension.
> >Date: Sat, 5 Nov 2005 20:56:29 +0900
> >
> >Hi,
> >
> >I am writing a Ruby binding for a C library that implements an automaton
> >supporting non-English keyboard input. The following shows example
> >usage:
> >
> >   require 'hangul'
> >
> >   hic = Hangul::InputContext.new(Hangul::KEYBOARD_2)
> >   input = "fnql gksrmf fkdlqmfjfl xptmxm"
> >   buffer = ''
> >   input.each_byte do |c|
> >     ret = hic.filter(c)         # filtering [a-zA-Z] for automaton
> >     commit = hic.commit_string  # output produced by automaton
> >     buffer << commit if commit
> >     buffer << c.chr unless ret  # just append unfiltered chars.
> >   end
> >   hic.flush
> >   buffer << hic.commit_string.to_s
> >
> >It works as expected when I paste the code into IRB, or run it with the
> >built-in Ruby debugger (-r debug). See the actual session:
> >
> ...
> >
> >FYI, the variable 'buffer' ends up as a UTF-8 encoded string, which is
> >the expected result.
> >
> >But it returns a strange result when I run the code directly from the
> >command line. After running the code, 'buffer' is filled with
> >only 3 spaces (0x20), and 'hic.commit_string' always returns nil. I don't
> >know why the results differ. It is not related to concurrency issues.
> >
> >My extension code can be found at:
> >http://nohmad.sub-port.net/tmp/ruby-hangul/
> >
> >Any comments would be appreciated.
> >
> >--
> >http://nohmad.sub-port.net
> >
>
> The behaviour of wcstombs depends on the LC_CTYPE category of the current
> locale.
>
> Modify hangul.c like this:
>
> #include "ruby.h"
> #include "hangul.h"
> #include <locale.h>                           // ADDED
>
> static void
> rbhic_free(HangulInputContext *hic)
> {
>     hangul_ic_delete(hic);
> }
>
> static VALUE
> rbhic_alloc(VALUE klass)
> {
>     HangulInputContext *hic = hangul_ic_new(HANGUL_KEYBOARD_2);
>     setlocale(LC_CTYPE,"ko_KR.eucKR");   // ADDED
>     return Data_Wrap_Struct(klass, 0, rbhic_free, hic);
> }
> ....
>
> HTH,
>
> Park Heesob
>

Thanks, it works great!

I think hard-coding a specific locale would not be a good idea,
so I changed the setlocale call to respect the user's environment:

static VALUE
rbhic_alloc(VALUE klass)
{
    HangulInputContext *hic = hangul_ic_new(HANGUL_KEYBOARD_2);
    /* "" selects the locale from the environment (LC_ALL, LC_CTYPE, LANG) */
    setlocale(LC_CTYPE, "");
    return Data_Wrap_Struct(klass, 0, rbhic_free, hic);
}
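
For anyone who wants to see the locale dependence in isolation, here is a
minimal sketch. The helper name wide_to_multibyte is hypothetical and is not
part of ruby-hangul or libhangul; it only wraps the standard wcstombs call
and reports failure as -1. My assumption is that a failed conversion like
this is what made commit_string come back as nil.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper for illustration only.
 * Converts a wide string to the multibyte encoding selected by the
 * current LC_CTYPE locale. Returns the number of bytes written
 * (excluding the terminating NUL), or -1 if a character cannot be
 * represented in that locale or the buffer is too small. */
static long
wide_to_multibyte(const wchar_t *ws, char *out, size_t outsize)
{
    size_t n;

    if (outsize == 0)
        return -1;

    n = wcstombs(out, ws, outsize);
    if (n == (size_t)-1)    /* character not representable in this locale */
        return -1;
    if (n == outsize)       /* output filled up, no room for the NUL */
        return -1;
    return (long)n;
}
```

A C program starts in the "C" locale until it calls setlocale, so calling
this helper on a Hangul syllable before setlocale(LC_CTYPE, "") would
normally return -1, while the same call made afterwards under a UTF-8 or
EUC-KR environment should succeed. Plain ASCII converts in either case.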

--
http://nohmad.sub-port.net