I have some legacy text data that's gone through several databases and
web services in its life, playing promiscuously with dirty web
servers, browsers, and encodings.

It's coming out of the source database as ASCII-8BIT. I'm trying to
bring it all into UTF-8. I've found ways to coerce many of the bad
entries into compliance, but now I've hit one that is simply bad. I
want to just delete the minimum necessary to make it valid UTF-8. What
I'm trying isn't working. Here's my code:

  if new_value.is_a? String
    begin
      # Optimistic path: the bytes may already be valid UTF-8.
      utf8 = new_value.force_encoding('UTF-8')
      if utf8.valid_encoding?
        new_value = utf8
      else
        # Not valid UTF-8; assume the bytes are really Windows-1252.
        new_value.encode!('UTF-8', 'Windows-1252')
      end
    rescue EncodingError
      puts "Bad encoding: #{old_table}.#{pk}:#{old_row[pk]} - #{new_value.inspect}"
      # Last resort: drop anything invalid or untranslatable.
      new_value.encode!('UTF-8', invalid: :replace, undef: :replace, replace: '')
      p new_value.encoding unless new_value.valid_encoding?
    end
  end

When I fall into the rescue clause, this is the output I get:
  Bad encoding: bugs.id:2469 - "Indexing C:\\\\длебдл\xE3\x81E \x81E \x81EZCa_zu5.264"
  #<Encoding:UTF-8>
The conversion produced an invalid UTF-8 string (which, as far as I
can tell, is identical to the original). I'm surprised, because I
thought the whole point of the invalid/undef replace options was to
clean this sort of thing up.
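
Here's the surprise boiled down to a minimal irb session (the \x81
byte is just a stand-in for my bad data; this is my reading of what's
happening, so correct me if the diagnosis is off):

  str = "abc\x81def".force_encoding('UTF-8')
  str.valid_encoding?      # => false
  # Source and destination encodings are both UTF-8 here, so (on my
  # Ruby, at least) encode appears to skip the conversion entirely --
  # and with it the invalid/undef replacement.
  scrubbed = str.encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
  scrubbed.valid_encoding? # => still false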

How do I force it into a valid UTF-8 encoding, losing as little data
as possible but happily throwing out the senseless bits?
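
The one idea I've turned up so far (untested beyond a quick irb
session) is to force a real transcode by round-tripping through an
intermediate encoding such as UTF-16, since an actual conversion does
honor the invalid/undef options:

  # UTF-8 -> UTF-16 is a genuine conversion, so invalid byte sequences
  # really do get replaced (dropped, with replace: ''); UTF-16 -> UTF-8
  # is then lossless for everything that survives.
  new_value.encode!('UTF-16', invalid: :replace, undef: :replace, replace: '')
  new_value.encode!('UTF-8')

Is that the right approach, or is there a more direct way to scrub the
string in place?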