I was playing around with the RMail package and I was missing RFC 2047
support. I found the "module Rfc2047" in
<20031204151316.GC849@jupp%gmx.de> but noticed the following:

In the regex to discover encoded words:

| WORD = %r{=\?([!#$%&'*+-/0-9A-Z\\^\`a-z{|}~]+)\?([BbQq])\?([!->@-~]+)\?=} # :nodoc:

I had to change % to \% to get it to run. Maybe it's just Cygwin.

The second thing is that the module doesn't correctly interpret the
"encoded-word, linear white space, encoded-word" sequence, where all
the white space should be deleted. So I added a regex to delete this
whitespace before further processing:

> module Rfc2047
>
>   WORD = %r{=\?([!#$\%&'*+-/0-9A-Z\\^\`a-z{|}~]+)\?([BbQq])\?([!->@-~]+)\?=} # :nodoc:
>|  WORDSEQ = %r{(=\?[!#$\%&'*+-/0-9A-Z\\^\`a-z{|}~]+\?[BbQq]\?[!->@-~]+\?=)\s*(=\?[!#$\%&'*+-/0-9A-Z\\^\`a-z{|}~]+\?[BbQq]\?[!->@-~]+\?=)}

[Comment skipped]

>   def Rfc2047.decode_to(target, from)
>|    from.gsub!(WORDSEQ, '\1\2')
>
>     out = from.gsub(WORD) do
>       |word|
>       charset, encoding, text = $1, $2, $3

It works so far, but I wonder whether '\s*' is the correct expression
and whether there is a more efficient way to do this.

I also observed that decoding of non-Western character sets (Win-1251
to Big5) to UTF-8 didn't work. Does anybody already suspect why, or do
I have to track down the error further?

-- 
Oliver Cromm
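
P.S. One possible alternative to WORDSEQ, as a minimal sketch rather
than a tested patch to the module. The names ENCODED_WORD,
INTERWORD_SPACE and strip_interword_space are made up for this example
and are not part of the posted Rfc2047 module. The idea is to match the
second encoded-word with a lookahead only, so that it is not consumed
and can serve as the first word of the next match. In the sketch, '#'
is backslash-escaped so that "#$" cannot be read as interpolation of a
global variable; whether that is related to the parse problem mentioned
above is only a guess.

```ruby
# Sketch only: all names below are hypothetical, not part of Rfc2047.

# Same character classes as WORD in the post, but without capture groups.
# '#' is escaped so the literal cannot trigger "#$..." interpolation.
ENCODED_WORD = %r{=\?[!\#$%&'*+-/0-9A-Z\\^\`a-z{|}~]+\?[BbQq]\?[!->@-~]+\?=}

# Whitespace between two encoded-words. The following word is only
# checked by the lookahead, so it stays available for the next match.
INTERWORD_SPACE = /(#{ENCODED_WORD})\s+(?=#{ENCODED_WORD})/

# Returns a copy of +header+ with the whitespace between adjacent
# encoded-words removed; all other whitespace is left alone.
def strip_interword_space(header)
  header.gsub(INTERWORD_SPACE, '\1')
end

three = "=?UTF-8?Q?a?= =?UTF-8?Q?b?=\r\n =?UTF-8?Q?c?="
puts strip_interword_space(three)
# prints =?UTF-8?Q?a?==?UTF-8?Q?b?==?UTF-8?Q?c?=
```

With a pattern that consumes both words, as WORDSEQ does, a single gsub
over three consecutive encoded-words joins the first two but leaves the
whitespace before the third, because the second word has already been
used up by the first match. As for '\s*' versus '\s+': with zero
whitespace characters there is nothing to delete, so '\s+' should be
enough here.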