On Jun 25, 2009, at 14:29, Ad Ad wrote:
> Hi,
> I am retrieving a string from a txt file.
> The file contains some utf8 characters.
>
> I am comparing these characters against a default string.
>
> The problem is that some of the characters are not stored in a default
> format.
>
> For example:
> A is stored as Ａ
>
> Naturally when I compare the character it fails.
> Strangely when I unpacked the character it appears as 65313 which is the
> correct utf8 number for A.
>
> Any way around this?

Well, Ａ is "Fullwidth Latin Capital Letter A" from the "Halfwidth and Fullwidth Forms" block (U+FF21, decimal 65313), whereas A is "Latin Capital Letter A" from the "Basic Latin" block (U+0041, decimal 65). They are two distinct code points, which is why the comparison fails. I don't know of a way to translate between the two blocks, but