2014-10-23

Compression of plain text files - 7z vs zip vs rar

This blog post is a translation of the original post in Polish, for my friends who do not speak Polish :-)

Quantum-mechanical calculation codes write their output as plain text files. Sometimes these files are very large: anywhere from several to a few dozen MB of plain text. Some time ago I decided to clean up some directories containing calculation outputs stored in large text files and compress them to save disk space. While I was at it, I checked which of the three common compression formats (zip, rar, or 7z) compresses text files best.

Directory                           | Size without compression (MB) | Zip archive (MB) | 7z archive (MB) | Rar archive (MB)
Directory 1 (only text files)       | 193.7                         | 38.8             | 2.4             | 3.5
Directory 2 (only text files)       | 633.7                         | 128.6            | 5.9             | 8.5
Directory 3 (text and binary files) | 699.6                         | 153.3            | 99.7            | 144.4
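
To make the numbers easier to compare, here is a small sketch that turns the sizes from the table above into "how many times smaller" ratios. The sizes are copied from the table; the names `sizes` and `ratios` are just illustrative helpers, not part of any tool I used.

```python
# Sizes in MB, copied from the table above: (uncompressed, zip, 7z, rar).
sizes = {
    "Directory 1": (193.7, 38.8, 2.4, 3.5),
    "Directory 2": (633.7, 128.6, 5.9, 8.5),
    "Directory 3": (699.6, 153.3, 99.7, 144.4),
}


def ratios(row):
    """Return how many times smaller each archive is than the uncompressed data."""
    raw, *archives = row
    return tuple(round(raw / size, 1) for size in archives)


# Print one line per directory with the zip, 7z and rar ratios.
for name, row in sizes.items():
    zip_ratio, sevenz_ratio, rar_ratio = ratios(row)
    print(f"{name}: zip {zip_ratio}x, 7z {sevenz_ratio}x, rar {rar_ratio}x")
```

A ratio of 5 means the archive is five times smaller than the original files.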

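The results are very interesting: for plain text files, which are highly redundant, 7z produced archives roughly 80 to 107 times smaller than the uncompressed files. These archives were also about 16 to 22 times smaller than the zip archives and about 1.4 to 1.5 times smaller than the rar archives. For the directory that also contained binary files, the advantage of 7z was much smaller.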
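
If you want to try a similar comparison on a small scale without installing any archivers, the sketch below may help. It is not how I produced the table. It assumes that Python's standard `zlib` module (Deflate, the method zip uses by default) and `lzma` module (the LZMA family that 7z uses by default) are reasonable stand-ins for the two formats, and it leaves rar out because the standard library has no RAR support. The text it compresses is synthetic and far more repetitive than real calculation output, and `make_log` and `compare` are hypothetical helper names, so the exact numbers will not match the table.

```python
import lzma
import zlib


def make_log(n_lines=50_000):
    """Build synthetic, highly repetitive text (a stand-in for real output files)."""
    lines = (
        f"SCF iteration {i % 50:3d}  energy = {-76.0 - (i % 50) * 1e-4:.8f} Ha\n"
        for i in range(n_lines)
    )
    return "".join(lines).encode("ascii")


def compare(data):
    """Return (raw, deflate, lzma) sizes in bytes for the given data."""
    deflated = zlib.compress(data, 9)  # Deflate at maximum level
    xz = lzma.compress(data)           # LZMA2 in an .xz container, default preset
    return len(data), len(deflated), len(xz)


raw_size, deflate_size, lzma_size = compare(make_log())
print(f"raw:     {raw_size} bytes")
print(f"deflate: {deflate_size} bytes ({raw_size / deflate_size:.0f}x smaller)")
print(f"lzma:    {lzma_size} bytes ({raw_size / lzma_size:.0f}x smaller)")
```

To test your own files, read them with `open(path, "rb").read()` and pass the bytes to `compare` instead of the synthetic text.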