sea's blog → Algebra, Lisp, and miscellaneous thoughts

Table of Contents

There is no magic compression algorithm

You can use information theory to prove this, but compression is actually related to permutations. The set of all strings in your language, you permute. Some of them become representable in shorter strings, but others necessarily become longer strings. In total, the sum of information remains unchanged (information is conserved).

Good compression functions are those which map interesting inputs to shorter strings, while uninteresting inputs become longer ones.

So for instance, you'd want an algorithm which encodes written text and all human-created interesting media to compress well, but randomized strings of gibberish to become much longer under 'compression', because we don't care about those.