Austin Ziegler wrote:
> No. #index always returns character position; it just so happens that
> some encodings use bytes for their character position.

That's just a different way of wording my concern :-P
You'll still have inconsistencies if you forget to set the proper encoding for one of your strings. THAT is my (admittedly tiny) concern.

> I think that m17n is a better solution than Unicode all the time given
> the amount of legacy data out there, despite the definite value of
> Unicode as a long-term solution.

I agree, but I wasn't talking about Unicode; I was talking about putting the m17n in the strings (1.9) vs. putting it in the regexen (1.8). But that's a moot point, as that discussion, if it ever occurred, was over months/years ago.

Daniel
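
P.S. A minimal sketch of the inconsistency I mean, assuming 1.9-style strings that carry their own encoding. The sample text and variable names are made up for illustration; the only calls used are String#force_encoding and String#index.

```ruby
# The same bytes, tagged two different ways.
text = "h\u00E9llo w\u00F6rld"            # "héllo wörld", UTF-8

tagged   = text.dup.force_encoding(Encoding::UTF_8)   # encoding set properly
untagged = text.dup.force_encoding(Encoding::BINARY)  # encoding "forgotten"

# #index counts characters in whatever encoding the string claims to have.
char_pos = tagged.index("w")    # "é" counts as one character
byte_pos = untagged.index("w")  # "é" counts as two bytes

puts "UTF-8:  #{char_pos}"
puts "BINARY: #{byte_pos}"
```

Both calls return a "character position", as Austin says, but the two numbers differ because the second string treats every byte as a character. Mix the two offsets and you slice in the wrong place.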