Hi,

Every now and then I get errors relating to UTF-8 encoding, and each time I fail to find (or guess) the right combination of words to get Ruby 1.9.2 to play nicely with a string it doesn't like.

Right now I want to open a log file and read it, but some script kiddie has decided to connect using some crazy non-ASCII characters, and this line in my script

    File.readlines(logfile, :encoding => "UTF-8" )

now spits out the error:

    ArgumentError - invalid byte sequence in UTF-8

when encountering lines like this:

    83.44.178.124 - - [19/Jul/2011:19:15:00 +0100] ?.???S\x08\x02?N~],>~Q?~@6\x15`?~Vg?'dR\x1C??\x08?F\x06w?~H?~F?\x08P~V?\x0Bf\x22?\x17~M^??{??j\x1E??p?~AU~\\
     "400" 166 "-" "-" "-"

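For anyone who wants to try this without my log file, here is a minimal sketch that should trigger the same kind of error. The byte string is made up for illustration (it is not the real log line), and `line` and `error_message` are just names picked for this example. The idea is that a string tagged as UTF-8 but containing a byte such as `\xFF` fails as soon as something like a regex match has to interpret its characters.

```ruby
# Made-up stand-in for the offending log line (an assumption, not real data).
# The byte \xFF can never appear in valid UTF-8.
line = "83.44.178.124 - - \xFF\x08\x02 \"400\" 166".dup.force_encoding("UTF-8")

# Reports whether the bytes are well-formed for the string's encoding.
puts line.valid_encoding?

# A regex match has to decode the string, so it raises on the bad byte.
error_message = nil
begin
  line =~ /"(\d{3})"/
rescue ArgumentError => e
  error_message = e.message
end
puts error_message
```

Note that `force_encoding` only changes the label on the string and leaves the bytes alone, which is probably why my first attempt below makes no difference.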

I'd really like to know how to fix this without dropping Ruby 1.9. Does anyone know the magic words that will get this log file read? These are my best efforts:

    File.readlines(logfile, :encoding => "UTF-8" ).map{|e| e.force_encoding('UTF-8')} 

    File.readlines(logfile, :encoding => "UTF-8" ).map{|e| e.encode('UTF-8', undef: :replace, replace: "??")}

    File.readlines(logfile, :encoding => "UTF-8" ).map{|e| e.encode('iso-8859-1', undef: :replace, replace: "??")}

They all fail on this file :( They do work on a log file that contains only valid UTF-8. Any help is much appreciated.
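One direction that is sometimes suggested for this kind of problem, sketched here under assumptions and not verified against the actual log file: transcode each line to a different encoding and back, replacing whatever cannot be converted. The helper name `scrub_line` and the sample bytes are hypothetical, and the `"?"` replacement is an arbitrary choice.

```ruby
# Hypothetical helper: replace bytes that are not valid UTF-8 with "?"
# by round-tripping the string through UTF-16BE.
def scrub_line(line)
  utf8  = line.dup.force_encoding("UTF-8")
  utf16 = utf8.encode("UTF-16BE", :invalid => :replace, :undef => :replace, :replace => "?")
  utf16.encode("UTF-8")
end

# Made-up sample standing in for a bad log line (an assumption, not real data).
raw   = "GET /\xFF\x08 \"400\" 166\n"
clean = scrub_line(raw)

puts clean.valid_encoding?
puts clean =~ /"400"/ ? "status found" : "status missing"
```

Applied to the file, this would look something like `File.open(logfile, "rb") { |f| f.readlines.map { |l| scrub_line(l) } }`, reading the raw bytes first so nothing is interpreted as UTF-8 until each line has been cleaned. The round trip is there because converting from UTF-8 straight to UTF-8 may not touch the string at all on some Ruby versions.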


Regards,