> I just noticed that accented letters like èàéòùì (assuming anyone
> can see them correctly in this message)
> are not matched by /[a-z]/ or \w on Windows.

> I haven't tried on *nix with a proper locale set, but I wonder if,
> in any case, there is something special I should do to allow this
> kind of special letter to be matched as a letter.

1. If you don't mind altering the string before checking, look at unac (http://www.gnu.org/directory/text/Misc/unac.html).
'unac' is a C library and command that removes accents from a string, so the result can then be matched with a plain /[a-z]/.

2. In one app I had to match words in different languages (including right-to-left languages). I ended up doing it in reverse: matching every run of characters that wasn't a space or punctuation mark. I can send it to you if you want (it's not complete, as I had some constraints on the text).
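
A rough sketch of both ideas follows, written in Python only because the thread does not pin down a language. Everything named here is an assumption made for illustration: strip_accents uses Unicode decomposition from the standard library as a stand-in for what unac does (it is not unac's API), and PUNCTUATION, WORD_RE and words are hypothetical names. It is not the code from the app mentioned in point 2.

```python
import re
import unicodedata


def strip_accents(text):
    """Approach 1 (stand-in for unac): remove accents before matching.

    NFD normalization splits an accented letter into its base letter
    followed by combining marks; the combining marks are then dropped.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Approach 2: match "in reverse". A word is any run of characters that is
# neither whitespace nor one of the punctuation marks listed here.
# The punctuation set is an assumption; extend it to suit the text.
PUNCTUATION = ".,;:!?\"'()[]{}<>"
WORD_RE = re.compile("[^\\s" + re.escape(PUNCTUATION) + "]+")


def words(text):
    """Return every run of non-space, non-punctuation characters."""
    return WORD_RE.findall(text)
```

With this, strip_accents turns "perché" into "perche", which a plain [a-z]+ then matches, and words splits a sentence into its words without needing to know which letters belong to which alphabet, so it behaves the same for accented Latin text and for right-to-left scripts.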

HTH,
Assaph

P.S. The regexp engine seems to handle UTF-8 with no problem.