Thanks for the update on the RFC; I guess I should have just read it myself.
I don't want to "litter" the newsgroup, but I hate to have incorrect code
out there with my name on it. If you want, follow the link
(http://hurleyhome.com/~patrick/quiz23.rb) to see the fixed code.

Also of note is the now-commented (just for Dave) regexp for parsing long
lines, for the curious:

  lines = result.scan(/
    # Match one of the three following cases
    (?:
      # This will match the special case of an escape that would generally
      # have split across line boundaries
      (?: [^\n]{74}(?==[\dA-F]{2}) ) |
      # This will match the case of a line of text that does not need to split
      (?: [^\n]{0,76}(?=\n) ) |
      # This will match the case of a line of text that needs to be split
      # without special adjustment
      (?:[^\n]{1,75}(?!\n{2}))
    )
    # Match zero or more newlines
    (?-x:#{$/}*)/x);

pth

On Wed, 16 Mar 2005 05:40:15 +0900, Dave Burt <dave / burt.id.au> wrote:
> "Patrick Hurley" <phurley / gmail.com> continued:
> > Thanks for the kind response.
> >
> > When I said the test case failed, I meant the actual output: our
> > resulting output, after encoding the line, has trailing space at the
> > end of a line. We both escape trailing spaces before we break lines - if
> > the line breaking moves some code, is that not an issue? (The
> > continuation = might mean that it is not.)
>
> From the RFC (2045, section 6.7):
> Any TAB (HT) or SPACE characters
> on an encoded line MUST thus be followed on that line
> by a printable character. In particular, an "=" at the
> end of an encoded line, indicating a soft line break
> (see rule #5) may follow one or more TAB (HT) or SPACE
> characters.
>
> So it's all good - unescaped tabs and spaces are fine as long as they're
> followed by a printable non-whitespace character, and "=" is fine for that.
>
> ...
> Therefore, when decoding a Quoted-Printable
> body, any trailing white space on a line must be
> deleted, as it will necessarily have been added by
> intermediate transport agents.
>
> There's something I think we've all forgotten to do -- strip trailing
> unescaped whitespace. I've added the following test:
>
> def test_decode_strip_trailing_space
>   assert_equal(
>     "The following whitespace must be ignored: \r\n".from_quoted_printable,
>     "The following whitespace must be ignored:\n")
> end
>
> And the following line to decode_string:
>   result.gsub!(/[\t ]+(?=\r\n|$)/, '')
>
> >
> > Yup, there was an issue with masks; I fixed that and removed the globals
> > (my Perl habit of just throwing in a $ when in doubt :-). There was also
> > a bug in the command line driver, which I have fixed. The patched code
> > follows.
> >
> >> (/(?:(?:[^\n]{74}(?==[\dA-F]{2}))|(?:[^\n]{0,76}(?=\n))|(?:[^\n]{1,75}(?!\n{2})))(?:#{$/}*)/)
> >> makes you look like a Perl 5 junkie,
> >
> > I did this to allow the use of a gsub, which is much faster than the
> > looping solution. The lookaheads and general ugliness handle the
> > special cases. I probably should use /x and space it out and comment,
> > but when I am in the regexp zone, I know what I am typing <grin>.
>
> Write-only? No, I'm not in a fantastic position to comment; mine is not that
> much shorter.
>
> > ...
> > def QuotedPrintable.decode
> >   STDIN.binmode
> >   while (line = gets) do
> >     # I am a ruby newbie, and I could
> >     # not get gets to get the \r\n pairs
> >     # no matter how I set $/ - any pointers?
>
> | C:\WINDOWS>ruby
> | STDIN.binmode
> | gets.each_byte do |b| puts b end
> | ^Z
> |
> | 13
> | 10
> |
> Seems to work for me - that output says I wouldn't need the following line:
>
> >     line = line.chomp + "\r\n"
>
> Cheers,
> Dave
>
>
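
For anyone following along, the two RFC 2045 rules quoted above (a trailing "=" marks a soft line break, and trailing unescaped whitespace is dropped when decoding) can be sketched in Ruby roughly as below. This is only a minimal sketch under some assumptions, not the code from either quiz solution: qp_decode_sketch is a made-up name, it leaves line endings as they are instead of converting \r\n to \n, and it assumes well-formed uppercase =XX escapes.

```ruby
# Hypothetical helper for illustration only; not taken from quiz23.rb.
def qp_decode_sketch(text)
  text.b                                   # work on raw bytes so any =XX value is safe
      .gsub(/[\t ]+(?=\r?\n|\z)/, '')      # drop trailing unescaped tabs and spaces
      .gsub(/=\r?\n/, '')                  # remove soft line breaks ("=" at end of line)
      .gsub(/=([\dA-F]{2})/) { $1.hex.chr } # decode =XX escapes into single bytes
end

puts qp_decode_sketch("soft \t=\r\nbreak").inspect
puts qp_decode_sketch("a=3Db=20\n").inspect
```

The order of the three steps matters in this sketch: whitespace is stripped while escaped spaces are still spelled =20, so an intentionally encoded trailing space survives, and whitespace sitting in front of a soft-break "=" is not treated as trailing.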