On Sun, 14 Dec 2008 17:26:10 +1100, Daniel Luz <dev / mernen.com> wrote:

> On Sat, Dec 13, 2008 at 22:57, Michael Selig <michael.selig / fs.com.au>
> wrote:
>> From my testing:
>> - String equality comparisons seem to be simply done on a byte-by-byte
>> basis, without regard to the encoding
>
> Am I misinterpreting something here?
>
> u = "café".encode("utf-8")
> b = u.dup.force_encoding("binary")
> i = u.dup.force_encoding("iso-8859-1")
> u == b # => false
> b == i # => false
> u == i # => false
> u.eql?(b) # => false

Sorry, you are quite right. Equality is false if the encodings are not
compatible. If they are compatible, it is done on a byte-by-byte basis.

> I only knew of ASCII-compatibility. Are there other cases? ISO-8859-1
> and Windows-1252 (a superset) at least are not compatible:
>
> i = "café".encode("iso-8859-1")
> w = "café".encode("windows-1252")
> i == w # => false
> i + w # Encoding::CompatibilityError
> w + i # Encoding::CompatibilityError

I think this might be a bug.

Cheers
Mike.
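
For reference, a minimal sketch of the behaviour discussed in this message, assuming Ruby 1.9 or later. The helper name same_text? is hypothetical (it is not from the thread or from Ruby itself). It works around the encoding check by transcoding both strings to UTF-8 before comparing, which only works when both encodings can be converted to UTF-8 (so not for strings tagged as binary that hold non-ASCII bytes).

```ruby
# Hypothetical helper (not part of Ruby or of this thread): compare two
# strings by their characters rather than by encoding tag plus bytes.
def same_text?(a, b)
  a.encode("UTF-8") == b.encode("UTF-8")
end

u = "caf\u00E9"                      # UTF-8, contains one non-ASCII character
b = u.dup.force_encoding("BINARY")   # same bytes, tagged as binary
i = u.encode("ISO-8859-1")           # transcoded: the last character is byte 0xE9
w = u.encode("Windows-1252")         # same bytes as i, different encoding tag

# Non-ASCII content with different encodings: not equal, even with equal bytes.
p u == b                # => false
p u.bytes == b.bytes    # => true
p i == w                # => false
p i.bytes == w.bytes    # => true

# ASCII-only content in ASCII-compatible encodings: compared byte by byte.
p "cafe".encode("UTF-8") == "cafe".encode("ISO-8859-1")   # => true

# Concatenating the incompatible pair raises, as reported in the quoted message.
begin
  i + w
rescue Encoding::CompatibilityError => e
  puts "concatenation failed: #{e.class}"
end

# Transcoding to a common encoding first makes the comparison succeed.
p same_text?(i, w)      # => true
```

Whether ISO-8859-1 and Windows-1252 ought to be treated as compatible is the open question in the message above; the sketch only shows what the interpreter does and takes no position on that.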