Dear buddies,

I am using ruby to run some map reduce job in hadoop streaming.
Unfortunately, we have some dirty data which have invalid byte sequence as
the input. So while running things like

line.chomp.split("\t")

I will get

Best wishes,
Stanley Xu