On Thu, Sep 15, 2011 at 11:50 AM, Wayne Brissette <waynefb / earthlink.net> wrote:
> First off, I'm very new to Ruby and I'm trying to wrap my head around a few things, so if this sounds simplistic I apologize in advance. I did do some searching but I'm not sure how to fix this error.
>
> I'm using Mac OS X (Lion) and I've started a script that reads in an xml file (ditamap) and parses the data, so I only end up with a listing of files that map uses.
>
> When I run the script using Ruby 1.8.x, the script works as I expected it to work. However, when I run it using Ruby 1.9.x, I get the following error:
>
> `gsub': invalid byte sequence in US-ASCII (ArgumentError)
>
> From what I've determined via the web, this has to do with some mismatch of what the OS is using vs. what Ruby is using. One post I read recommended reading the file as a binary to get around this. However, I'm wondering what the real fix is for this problem, and why is it happening in 1.9 vs. 1.8.
>
> For the record, here is how I'm opening my files:
>
> ditamap_file= File.read("v5630097.ditamap")

Your issue is likely a late consequence of reading the file with improper encoding. I can provoke the same behavior:

irb(main):001:0> s="aß"
=> "aß"
irb(main):003:0> s.bytes.to_a
=> [97, 195, 159]
irb(main):004:0> File.open("x","w:UTF-8"){|io|io.write s}
=> 3
irb(main):005:0> t = File.open("x","r:UTF-8"){|io|io.read}
=> "aß"

Now we are reading with the wrong encoding:

irb(main):008:0> t = File.open("x","r:ASCII"){|io|io.read}
=> "a\xC3\x9F"
irb(main):009:0> t.bytes.to_a
=> [97, 195, 159]
irb(main):010:0> t.gsub(/./){"X"}
ArgumentError: invalid byte sequence in US-ASCII
        from (irb):10:in `gsub'
        from (irb):10
        from /opt/bin/irb19:12:in `<main>'
irb(main):011:0>

The error does not show up during loading but during gsub: reading only tags the bytes with an encoding, and they are not validated until an operation such as gsub has to interpret them as characters.
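To make the failure show up at load time instead of deep inside a later gsub, one option is to state the external encoding when opening the file and check the result with String#valid_encoding?. The following is only a minimal sketch: read_checked is a hypothetical helper name, and UTF-8 is an assumption about what the ditamap actually contains (its XML declaration should say which encoding it uses).

```ruby
require "tempfile"

# Hypothetical helper (not part of the original script): read a file with an
# explicit external encoding and fail right away if the bytes do not fit it.
# The default of UTF-8 is an assumption; use whatever the file really is.
def read_checked(path, encoding = "UTF-8")
  text = File.open(path, "r:#{encoding}") { |io| io.read }
  unless text.valid_encoding?
    raise ArgumentError, "#{path} is not valid #{encoding}"
  end
  text
end

# Demo data: "a" followed by the two UTF-8 bytes of "ß" (195, 159),
# the same three bytes as in the irb session.
demo = Tempfile.new("encoding-demo")
demo.binmode
demo.write([97, 195, 159].pack("C*"))
demo.close

good = read_checked(demo.path, "UTF-8")
puts good.gsub(/./) { "X" }        # gsub works: the bytes match the encoding

begin
  read_checked(demo.path, "US-ASCII")
rescue ArgumentError => e
  puts "caught at load time: #{e.message}"
end
```

With a helper like this, the original call would become something like read_checked("v5630097.ditamap"), again assuming the file is UTF-8.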
If you define the target (internal) encoding as well, Ruby has to transcode while reading, so the error pops up earlier:

irb(main):012:0> t = File.open("x","r:ASCII:UTF-8"){|io|io.read}
Encoding::InvalidByteSequenceError: "\xC3" on US-ASCII
        from (irb):12:in `read'
        from (irb):12:in `block in irb_binding'
        from (irb):12:in `open'
        from (irb):12
        from /opt/bin/irb19:12:in `<main>'

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/