Hi,

At Fri, 9 Nov 2007 16:08:02 +0900,
David Flanagan wrote in [ruby-core:13326]:
> >> Q1) In step 1 above, should the default primary encoding come from the 
> >> locale environment variables (LC_ALL, LC_CTYPE, and LANG) instead of 
> >> defaulting to ASCII?
> > 
> > It's planned, but we have no mappings from locale name to
> > encoding name.  Attached is a quick hack I tried the week
> > before last.
> 
> Did you consider nl_langinfo(CODESET)?

It needs setlocale() to be called before it, which sets global
state.
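
To illustrate the concern, here is a minimal sketch of what a
nl_langinfo(CODESET) lookup would involve, assuming a POSIX system.
codeset_from_environment() is a hypothetical name and is not part of
the attached patch.  It saves the current LC_CTYPE locale, switches to
the one named by the environment, queries the codeset, and switches
back.  The locale is still changed process-wide in between, so this is
not safe if other threads depend on it.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the codeset of the environment's locale
 * into out.  Returns 0 on success, -1 on failure. */
static int
codeset_from_environment(char *out, size_t outlen)
{
    char saved[128];
    const char *cur, *codeset;
    int ok = -1;

    /* Remember the current locale; the returned string may be
     * overwritten by the next setlocale() call, so copy it. */
    cur = setlocale(LC_CTYPE, NULL);
    if (!cur || strlen(cur) >= sizeof(saved)) return -1;
    strcpy(saved, cur);

    /* "" selects the locale from LC_ALL, LC_CTYPE, and LANG.
     * This is the step that modifies global state. */
    if (setlocale(LC_CTYPE, "")) {
        codeset = nl_langinfo(CODESET);
        if (strlen(codeset) < outlen) {
            strcpy(out, codeset);
            ok = 0;
        }
        /* Restore the previous locale. */
        setlocale(LC_CTYPE, saved);
    }
    return ok;
}
```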

> Your code looks good to me, 
> except that you don't check LC_CTYPE.  Is it your intent to be 
> conservative and assume ASCII unless the locale explicitly specifies an 
> encoding name following a .?  You're not going to choose either EUC-JP 
> or SJIS as the default for Japanese locales?

I forgot that, thank you.  I changed locale_encoding() as
follows.


static rb_encoding *
locale_encoding(void)
{
    static const char *const langs[] = {"LC_ALL", "LC_CTYPE", "LANG",};
    const char *lang, *at;
    int i, len, idx = 0;
    char buf[32];
    rb_encoding *enc;

    for (i = 0; i < sizeof(langs) / sizeof(langs[0]); ++i) {
	if (!(lang = getenv(langs[i]))) continue;
	if (!(lang = strchr(lang, '.'))) continue;
	at = strchr(++lang, '@');
	if ((len = (at ? at - lang : strlen(lang))) >= sizeof(buf) - 1) continue;
	MEMCPY(buf, lang, char, len);
	buf[len] = 0;
	idx = rb_enc_find_index(buf);
	if (idx < 0 && len > 3 &&
	    (strncasecmp(buf, "euc", 3) == 0 ||
	     strncasecmp(buf, "utf", 3) == 0) &&
	    buf[3]) {
	    MEMMOVE(buf + 4, buf + 3, char, len - 2);
	    buf[3] = '-';
	    idx = rb_enc_find_index(buf);
	}
	enc = rb_enc_from_index(idx);
	if (enc) return enc;
    }
    return rb_enc_default();
}
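
The string handling can be tried outside the interpreter with a
standalone sketch like the one below.  locale_codeset() and
hyphenate_codeset() are hypothetical names used only for illustration.
Unlike the function above, the sketch does no encoding lookup: it
inserts the hyphen unconditionally unless one is already present,
rather than only after a failed rb_enc_find_index().

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/* Hypothetical helper: copy the codeset part of a locale string such
 * as "ja_JP.eucJP@cjk" into buf.  Returns its length, or -1 if there
 * is no codeset or it does not fit with room for one extra '-'. */
static int
locale_codeset(const char *locale, char *buf, size_t bufsize)
{
    const char *dot, *at;
    size_t len;

    if (!locale || !(dot = strchr(locale, '.'))) return -1;
    ++dot;			/* skip the '.' */
    at = strchr(dot, '@');	/* optional modifier */
    len = at ? (size_t)(at - dot) : strlen(dot);
    if (len == 0 || len + 2 > bufsize) return -1;
    memcpy(buf, dot, len);
    buf[len] = '\0';
    return (int)len;
}

/* Hypothetical helper: rewrite "eucJP" or "utf8" as "euc-JP" or
 * "utf-8".  buf must have one spare byte after the terminator. */
static void
hyphenate_codeset(char *buf)
{
    size_t len = strlen(buf);

    if (len > 3 && buf[3] != '-' &&
	(strncasecmp(buf, "euc", 3) == 0 ||
	 strncasecmp(buf, "utf", 3) == 0)) {
	/* Shift the tail, including the terminator, right by one. */
	memmove(buf + 4, buf + 3, len - 3 + 1);
	buf[3] = '-';
    }
}
```

A locale without a '.' part, such as "C" or "ja_JP", yields -1 here,
which corresponds to the case where the function above falls through
to rb_enc_default().
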
-- Nobu Nakada