Trans wrote: > I think one needs to be careful in understanding this though. The > sorting only provides a benefit in contrast to less efficient > algorithms. No amount of reordering will ever make a difference to the > lower bound of compressibility. That's the crux of my point. And > hence, for a better algorithm, this kind of sorting should be a moot > point. (Of course, one could argue that this sorting Is the better > algorithm -- fair enough.) Your analysis is well-meaning, but flawed. Compression relies on prediction. If you know I'm going to send you a message telling you how many bananas to buy, and you already know that's the subject of the message, then I never need to say "bananas", I can just say "12". If you know you're receiving a message that contains large amounts of English text, or C language text, or i386 binary opcodes, or you can recognise and adapt when you find that you are, you'll be able to compress better than any algorithm that only has a single trick, repetition, as deflate does. Otherwise we'd all be using deflate for download pirated video streams and no-one would need H.264 :-) BWT improves deflate's predictor for the kinds of text we tend to use. Period. Nothing really interesting there, except how it does it. Clifford Heath.