On Mon, 15 Sep 2008 13:09:31 +1000, James Gray <james / grayproductions.net> wrote:

> However, if I can't make a regex work with a non-ASCII encoding, is
> there any point? My all ASCII regular expressions will work on pretty
> much everything else, right? If you pass me non-ASCII separators,
> there's nothing I can do anyway, right? Seems like this isn't possible,
> which is a big disappointment for me.

I don't think it is quite that bad. Regexps appear to be broken on UTF-16 and UTF-32. UTF-8, for example, certainly seems OK to me. So I believe that if you pass UTF-8 separators and operate on UTF-8 text, it should work.

So perhaps you have a choice:

1) Proceed and hope that Regexps are fixed sometime soon
2) Convert the broken encodings to UTF-8 and back again
3) Don't support the encodings that are broken (I don't think they are commonly used anyhow - maybe on Windows - not sure)

Mike
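
For what it's worth, option 2 (convert to UTF-8 and back) might look roughly like the sketch below. It assumes a Ruby that has String#encode (1.9 or later). The helper name split_via_utf8 is made up for illustration and is not part of any library, and the sample uses UTF-16LE because the encoding has to be one that Ruby can transcode to and from UTF-8.

```ruby
# Hypothetical helper (not a library method): split text in an
# ASCII-incompatible encoding by doing the regex work in UTF-8.
def split_via_utf8(text, separator)
  original = text.encoding                       # remember the caller's encoding
  utf8_text = text.encode(Encoding::UTF_8)       # transcode the text to UTF-8
  utf8_sep  = separator.encode(Encoding::UTF_8)  # transcode the separator too
  # Escape the separator so it is matched literally, not as regex syntax.
  pattern = Regexp.new(Regexp.escape(utf8_sep))
  # Split in UTF-8, then hand each field back in the original encoding.
  utf8_text.split(pattern).map { |field| field.encode(original) }
end

utf16 = "a,b,c".encode("UTF-16LE")
fields = split_via_utf8(utf16, ",")
p fields.map(&:encoding)
```

The cost is two transcoding passes per call, so this is a workaround for the affected encodings rather than something you would want on the UTF-8 path.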