Hi,
I needed a method to convert a piece of text to plain ASCII and replace all non-ASCII characters with a placeholder. I could not find anything in the stdlib, so I wrote one.
I'd love to hear your comments (or pointers to existing libraries for this task).
-Levin
#!/usr/bin/ruby
require 'iconv'

class String
  # Removes all characters which are not part of ASCII
  # and replaces them with +replacement+.
  #
  # +replacement+ is supposed to be in the same encoding as +source+.
  #
  def asciify(replacement = "?", target = "ASCII", source = "UTF-8")
    intermediate = "UCS-4"
    pack_format = "N*"  # one 32-bit big-endian integer per codepoint
    i = Iconv.new(intermediate, source)

    ucs4 = i.iconv(self)
    repl = i.iconv(replacement).unpack(pack_format)

    # Keep ASCII codepoints; swap every other codepoint for the
    # codepoints of the replacement string.
    s = ucs4.unpack(pack_format).collect { |codepoint|
      codepoint < 128 ? codepoint : repl
    }.flatten.pack(pack_format)

    return Iconv.new(target, intermediate).iconv(s)
  end
end

if __FILE__ == $0
  require 'test/unit'

  class TestAsciify < Test::Unit::TestCase
    def test_asciify
      assert_equal "I?t?rn?ti?n?liz?ti?n", "Iñtërnâtiônàlizætiøn".asciify
      assert_equal "M(removed)torhead", "Mötorhead".asciify("(removed)")
    end
  end
end
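
For anyone who only cares about UTF-8 input, the codepoint-filtering step can probably be done without iconv at all, using the "U*" pack format. This is only a rough sketch: the name asciify_utf8 is made up for illustration, and it assumes the receiver and the replacement are both valid UTF-8.

```ruby
class String
  # Hypothetical iconv-free variant: assumes self and +replacement+
  # are valid UTF-8 (unpack("U*") raises ArgumentError otherwise).
  def asciify_utf8(replacement = "?")
    repl = replacement.unpack("U*")

    # Keep codepoints below 128; replace every other codepoint with
    # the codepoints of the replacement string.
    unpack("U*").collect { |codepoint|
      codepoint < 128 ? codepoint : repl
    }.flatten.pack("U*")
  end
end

puts "M\u00F6torhead".asciify_utf8("(removed)")
```

Unlike asciify above, this does not convert between arbitrary source and target encodings; it only covers the UTF-8 case.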