Bug #3252: 1.9.2 -> 1.9.1 backport fail: invalid byte sequence in UTF-8 (ArgumentError):  "\xA0;" =~ /\./
http://redmine.ruby-lang.org/issues/show/3252

Author: Mike L
Status: Open, Priority: Normal
Category: core
ruby -v: ruby 1.9.1p378 (2010-01-10 revision 26273) [i686-linux]

Test case file:

# coding=utf-8
# vi: set fileencoding=utf-8 :
puts RUBY_DESCRIPTION
puts "ext: " << Encoding.default_external.to_s # UTF-8
puts "int: " << Encoding.default_internal.to_s
puts "loc: " << Encoding.locale_charmap.to_s   # UTF-8
"\xA0;" =~ /\./


produces:

$ ~/ruby19/bin/ruby test.rb
ruby 1.9.1p378 (2010-01-10 revision 26273) [i686-linux]
ext: UTF-8
int:
loc: UTF-8
test.rb:10:in `<main>': invalid byte sequence in UTF-8 (ArgumentError)
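
A minimal diagnostic sketch, assuming the string is tagged as UTF-8 (as it is under the coding comment above). The variable name `s` is only illustrative. It shows what Ruby sees before the match is attempted: 0xA0 on its own is a UTF-8 continuation byte with no lead byte, so the string is reported as invalid.

```ruby
# Tag the literal explicitly so the sketch does not depend on the source encoding.
s = "\xA0;".dup.force_encoding("UTF-8")

puts s.encoding                                        # UTF-8
puts s.valid_encoding?                                 # false
puts s.bytes.to_a.map { |b| "%02X" % b }.join(" ")     # A0 3B
```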

Synopsis of web research: reportedly fixed in 1.9.2 (the fix must be in git only, since the RC on the FTP server still fails), but meanwhile it is breaking things in 1.9.1.

I came across this when parsing an HTML &nbsp; (from a quirks-mode web page encoded as windows-1252) with Nokogiri (via mechanize, IIRC).

(matchable with [\302\240] see: http://www.vitarara.org/cms/hpricot_to_nokogiri_day_1)
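
A possible workaround sketch, not a fix for the underlying behaviour: assuming the bytes really are windows-1252, retag them and transcode to UTF-8 before matching. The names `raw` and `utf8` are illustrative only. After transcoding, the no-break space becomes the two bytes \302\240 mentioned above.

```ruby
# Assumption: the input came from a windows-1252 page, where 0xA0 is a no-break space.
raw  = "\xA0;".dup.force_encoding("Windows-1252")
utf8 = raw.encode("UTF-8")      # transcode; the bytes change from A0 to C2 A0

puts utf8.valid_encoding?       # true
p utf8.bytes.to_a               # [194, 160, 59], i.e. \302\240 followed by ";"
p(utf8 =~ /\./)                 # nil, no dot in the string and no exception
p(utf8 =~ /\u00A0/)             # 0, the no-break space is at index 0
```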


Related:

relevant history: http://redmine.ruby-lang.org/issues/show/2762

is this the patch (or one of the patches) to backport?: http://groups.google.com/group/rubyonrails-core/browse_thread/thread/5c1718cdbeb1ba17

http://redmine.ruby-lang.org/issues/show/1370 -- a year ago; still no 1.9.2 release nor 1.9.1 backport.  boo hoo

not backported: http://redmine.ruby-lang.org/issues/show/1839

old (but good?): http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/

worked around: https://rails.lighthouseapp.com/projects/8994/tickets/2628-ruby-19-and-activesupport

Invalid byte sequence in UTF-8 error for anything but ASCII
http://www.redmine.org/boards/2/topics/9842

and so on ... http://www.google.com/search?q=ruby%20%22invalid%20byte%20sequence%20in%20utf-8%22

*** alternate vector of breakage?: ***
from the end of: http://blog.grayproductions.net/articles/ruby_19s_string
<quote>
massi added 9 months later:

Hi,

I'm trying to render an image from mysql using send_data and I'm getting this error : invalid byte sequence in UTF-8 Here is my code :

def get_photo
    @image_data = Photo.find(params[:id])
    @image = @image_data.binary_data
    @url  = @image_data.url
    send_data(@image, :type => 'image/jpeg',
                      :filename => "#{params[:id]}.jpg",
                      :disposition => 'inline')
end

BTW, I'm using ruby 1.9.1 with rails 2.3.5.
</quote>
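
If the send_data case is the same problem, one guess is that the image blob ends up tagged as UTF-8. A minimal plain-Ruby sketch (no Rails; `blob` is a made-up stand-in for `binary_data`, not real image data) showing that retagging such bytes as ASCII-8BIT makes them valid. Whether this helps in massi's setup is untested.

```ruby
# Hypothetical stand-in for the first bytes of a JPEG read from the database.
blob = "\xFF\xD8\xFF\xE0".dup.force_encoding("UTF-8")
puts blob.valid_encoding?       # false: 0xFF never occurs in valid UTF-8

# Retag (no transcoding) as binary; the bytes themselves are unchanged.
binary = blob.dup.force_encoding("ASCII-8BIT")
puts binary.valid_encoding?     # true
puts binary.bytesize            # 4
p(binary =~ /\./)               # nil, and no ArgumentError
```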

