7-Zip DEFLATE Compression Ratios

by Sam Allen - Updated February 2, 2010

You want the best compression you can get with DEFLATE. You are using 7-Zip and want the best compression ratio for GZIP files. 7-Zip exposes two knobs for this: "fast bytes" and "number of passes". Here we see how changing them affects DEFLATE compression in 7-Zip.

~~~ 7-Zip DEFLATE compression options test ~~~
    262 files were tested.
    258 fast bytes with 13 passes gave the smallest archive.

128 fast bytes, 10 passes: 915282 bytes [biggest]
                11 passes: 915020 bytes
                12 passes: 914958 bytes
                13 passes: 914898 bytes
                14 passes: 914938 bytes
                15 passes: 914899 bytes

258 fast bytes, 10 passes: 915277 bytes
                11 passes: 915017 bytes
                12 passes: 914953 bytes
                13 passes: 914897 bytes [smallest]
                14 passes: 914933 bytes
                15 passes: 914898 bytes [second smallest]

Using 7-Zip command line

7-Zip offers far fewer options for GZIP, ZIP, and DEFLATE files than for its own 7z format. However, it lets you adjust the number of fast bytes and the number of passes. For simple tasks, you can use the -mx switch on the command line. In this article, the starting point is 7-Zip's ultra compression for GZIP, which already delivers the majority of the gains.

7za.exe a -tgzip archive.gz input.html -mx=9

7za.exe:     the 7-Zip executable
a:           the command that adds files to an archive
-tgzip:      sets the archive type to GZIP, which uses DEFLATE
archive.gz:  the target file
             will be created or overwritten
input.html:  the input file to be compressed
-mx=9:       specifies ultra compression

Turning knobs—7za.exe

As the 7-Zip manual states, the two options with DEFLATE are -mpass={NumPasses} and -mfb={NumFastBytes}. We replace the -mx=9 switch with combinations of these two switches. I tested the 7-Zip 4.60 beta version for Windows Vista, which at the time I write this is the latest and greatest. The files tested were small HTML files.

In the results graph, the first bar on the left is equivalent to -mx=9. It has 10 passes and 128 fast bytes. All the bars to the right of the first one have increased compression switch values. [See graph above]

As you increase the number of passes from 10 to 13, the archive gets smaller; beyond 13 passes the size fluctuates by a few dozen bytes. Additionally, specifying more fast bytes (258 instead of 128) never made the archive larger. From left to right on the graph, these commands were run:

7za.exe a -tgzip file2 file1 -mpass=10 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=10 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=11 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=11 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=12 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=12 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=13 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=13 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=14 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=14 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=15 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=15 -mfb=258
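
Typing twelve commands by hand is error-prone, so the grid can be generated instead. The following is a minimal Python sketch, not the script used for the test. It assumes 7za.exe is on the PATH, and the names archive.gz and input.html are placeholders. The a (add) command is included because 7-Zip's command line expects a command before the archive name.

```python
from itertools import product

# Assumption: 7za.exe is on PATH; "archive.gz" and "input.html" are
# placeholder names, not files from the original test.
def build_commands(archive="archive.gz", source="input.html",
                   passes=range(10, 16), fast_bytes=(128, 258)):
    """Return one 7za argument list per (passes, fast bytes) pair."""
    commands = []
    # product() varies fast bytes fastest, matching the order listed above.
    for n_pass, n_fb in product(passes, fast_bytes):
        commands.append([
            "7za.exe", "a", "-tgzip", archive, source,
            f"-mpass={n_pass}", f"-mfb={n_fb}",
        ])
    return commands

if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each argument list can be passed to subprocess.run to execute it; the archive size can then be read with os.path.getsize before the next run overwrites it.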

Interpretation

Adding passes and fast bytes on top of ultra mode's already excellent compression ratio reduced the file size by 0.042%. In other words, it saved 385 bytes in a 915282-byte archive.
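
The arithmetic behind these figures can be checked directly from the measured sizes. This short Python sketch uses only the numbers in the results listing at the top of the article; the variable names are my own.

```python
# Archive sizes in bytes, copied from the results listing at the top,
# keyed by (fast bytes, passes).
sizes = {
    (128, 10): 915282, (128, 11): 915020, (128, 12): 914958,
    (128, 13): 914898, (128, 14): 914938, (128, 15): 914899,
    (258, 10): 915277, (258, 11): 915017, (258, 12): 914953,
    (258, 13): 914897, (258, 14): 914933, (258, 15): 914898,
}

baseline = sizes[(128, 10)]            # equivalent to -mx=9
best_key = min(sizes, key=sizes.get)   # setting with the smallest archive
saved = baseline - sizes[best_key]     # bytes saved over the baseline
percent = 100.0 * saved / baseline     # saving as a percentage

print(best_key, saved, round(percent, 3))
```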

Certainly, this isn't impressive, but when dealing with compression, understanding the knobs is important. In this case, going beyond ultra mode in 7-Zip wasn't very useful. Keep in mind that most GZIP implementations, including the one in the .NET Framework, commonly produce results about 10% larger than 7-Zip's. That difference is much more significant.
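
To see how another GZIP implementation compares on your own files, you can measure a baseline and express the gap as a percentage. The sketch below uses Python's gzip module (backed by zlib) as a stand-in; it is not the .NET implementation, and the sample data is made up. The function names are hypothetical helpers, not part of any library.

```python
import gzip

def gzip_size(data: bytes, level: int = 9) -> int:
    """Size in bytes of data compressed by Python's zlib-backed gzip."""
    # mtime=0 keeps the output reproducible between runs.
    return len(gzip.compress(data, compresslevel=level, mtime=0))

def percent_larger(other_size: int, reference_size: int) -> float:
    """How much larger other_size is than reference_size, in percent."""
    return 100.0 * (other_size - reference_size) / reference_size

if __name__ == "__main__":
    # Made-up sample; substitute the bytes of a real HTML file.
    sample = b"<html><body>" + b"<p>Hello, world.</p>" * 200 + b"</body></html>"
    print(len(sample), gzip_size(sample))
```

Pass the size of the 7-Zip archive as reference_size and the size from gzip_size as other_size to get a figure comparable to the 10% mentioned above.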

Recommendation

If you are going to create archives frequently, don't use compression switches above -mx=9. Also, if there are more important improvements to make, pursue those first. However, if your data is going to be compressed once and left, then you might as well add the most aggressive options. There is no reason not to.

Adding more passes. I found that 7-Zip accepts many more passes than 15, and I even bumped the value up to 100. However, there was absolutely no improvement past 15 passes.

File names and compression. You don't always need the original file name. In that case, before you archive your files, rename the original files to a single-character file name. This will save several bytes in your archive.
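
The saving comes from the GZIP header, which can store the original file name as a zero-terminated string. The following Python sketch illustrates the effect with the standard gzip module rather than 7-Zip; the sample content and the names index.html and a are made up for the demonstration.

```python
import gzip
import io

def gzip_with_name(data: bytes, stored_name: str) -> bytes:
    """Compress data, storing stored_name in the gzip header."""
    buffer = io.BytesIO()
    # mtime=0 fixes the timestamp so only the stored name differs.
    with gzip.GzipFile(filename=stored_name, mode="wb",
                       fileobj=buffer, mtime=0) as gz:
        gz.write(data)
    return buffer.getvalue()

data = b"<html><body>example</body></html>"   # made-up sample content
long_name = gzip_with_name(data, "index.html")
short_name = gzip_with_name(data, "a")

# The archives differ only by the length of the stored name.
print(len(long_name) - len(short_name))
```

The compressed payload is identical in both cases, so the difference in archive size equals the difference in name length.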

Summary

Here we saw that with 7-Zip, there are no substantial gains in DEFLATE when you go beyond the top preset option of ultra. However, knowing this is useful to some extent, and the knowledge can save time and frustration. My experiments here shaved 0.042% off of my archive's final size, which is better than nothing. I won't continue pursuing better 7-Zip DEFLATE options.

© 2009 Sam Allen. All rights reserved.
Dot Net Perls | Sam Allen