The meaning of \w can change if you alter the global $KCODE variable.
If you know exactly which characters you want, it's best to spell them
out explicitly (e.g., follow Robert's advice).  Specifying \w says that
you want "word" characters in general; depending on $KCODE, that can
include non-English characters, even CJK.

irb(main):001:0> s = "שבת שלום"
=> "\327\251\327\221\327\252 \327\251\327\234\327\225\327\235"
irb(main):002:0> s =~ /\w/ ? "match" : "no match"
=> "no match"
irb(main):003:0> $KCODE = "u"
=> "u"
irb(main):004:0> s =~ /\w/ ? "match" : "no match"
=> "match"
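
If only ASCII letters and digits should get through, one option is to
skip \w and write the class out, anchored at both ends.  This is just a
rough sketch: ascii_alnum? is a made-up helper name, not anything from
the standard library, and it assumes the empty string should be
rejected too.

```ruby
# Hypothetical helper (name is an assumption for illustration).
# True only when str is made up entirely of ASCII letters and digits.
# The explicit class does not depend on $KCODE the way \w does.
def ascii_alnum?(str)
  # \A and \z anchor to the real start and end of the string, so a
  # trailing newline or an embedded line break cannot sneak past.
  !(str =~ /\A[a-zA-Z0-9]+\z/).nil?
end

puts ascii_alnum?("hello123")
# Same bytes as the first Hebrew word in the irb session above.
puts ascii_alnum?("\327\251\327\221\327\252")
```

Add an underscore to the class if you want to mirror what the
documentation lists for \w.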

On May 10, 11:10 pm, Ehud <ehud... / gmail.com> wrote:
> Hi everyone...
> I'm looking for a way to only allow English characters through a
> simple regex.
> It seems that \w (although the documentation states it is equivalent
> to [a-zA-Z0-9]) still allows
> non-English characters (in my case Hebrew).
>
> Has anyone come up with a solution other than specifying [abcdef...]?
>
> Thanks!
> Ehud