> Could that be an optimization in encode: since the string is already
> thought to be UTF-8, just return it?

Not sure; it isn't obvious (to me) from looking at encode()'s source.

There's no charset specified in the response headers from IIS. The Content-Type meta tag specifies "text/html; charset=UTF-8", though I'm not sure whether Firefox respects that. `file -I` on one of the downloaded files reports "text/html; charset=unknown-8bit".

Firefox is choosing UTF-8, but the special characters aren't displayed properly. Switching within the browser to one of the Western encodings displays the characters correctly (as mentioned, this is all MS stuff, and I assume people just copy and paste from MS Office).
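For what it's worth, here is a minimal Python sketch of one way to handle files like these: try a strict UTF-8 decode first, and only fall back to a Western encoding if that fails. It assumes the "Western" encoding is Windows-1252 (cp1252), which is my guess given the MS Office copy-and-paste; the function names `decode_html_bytes` and `to_utf8` are made up for illustration and are not part of any library discussed here.

```python
from typing import Tuple


def decode_html_bytes(raw: bytes) -> Tuple[str, str]:
    """Decode raw page bytes, returning (text, encoding_used).

    Strict UTF-8 is tried first. If the bytes are not valid UTF-8,
    fall back to cp1252. ASSUMPTION: the pages really are
    Windows-1252; that is a guess, not something the server declares.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # errors="replace" covers the few byte values cp1252 leaves undefined.
        return raw.decode("cp1252", errors="replace"), "cp1252"


def to_utf8(raw: bytes) -> bytes:
    """Re-encode the page as real UTF-8 so it matches its meta tag."""
    text, _ = decode_html_bytes(raw)
    return text.encode("utf-8")


if __name__ == "__main__":
    # Curly quotes as Windows-1252 bytes, the kind of thing pasted from Office.
    sample = b"\x93smart quotes\x94"
    text, used = decode_html_bytes(sample)
    print(used, repr(text))
```

The strict decode matters here: decoding leniently as UTF-8 would quietly turn the Office characters into replacement characters rather than signalling that the declared charset is wrong.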