With compression we trade time for space. In compression we apply algorithms that change data to require less physical memory.
Compressing and decompressing costs processing time, but it makes other parts faster: less data needs transferring. An important compression algorithm is GZIP.
Many web pages are transferred with GZIP compression. This reduces the time required to load pages. In Python, we can use the gzip module.
We use the gzip.open() method to write the compressed file. The paths are written as raw strings so the backslashes are not treated as escape characters.

    import gzip

    # Open source file.
    with open(r"C:\perls.txt", "rb") as file_in:
        # Open output file.
        with gzip.open(r"C:\perls.gz", "wb") as file_out:
            # Write output.
            file_out.writelines(file_in)

    perls.txt size: 4404 bytes
    perls.gz size: 2110 bytes
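As a variation on the program above, here is a minimal sketch that can run without an existing C:\perls.txt. It assumes a hypothetical helper named compress_file and made-up sample data written to a temporary directory, and it swaps writelines() for shutil.copyfileobj(), which copies in chunks. The sizes it prints depend on the sample data and will not match the perls.txt figures.

```python
import gzip
import os
import shutil
import tempfile


def compress_file(src_path, dst_path):
    """Gzip-compress src_path into dst_path and return both sizes in bytes.

    Hypothetical helper for illustration; not part of the gzip module.
    """
    with open(src_path, "rb") as file_in, gzip.open(dst_path, "wb") as file_out:
        # Stream in chunks so large files need not fit in memory.
        shutil.copyfileobj(file_in, file_out)
    return os.path.getsize(src_path), os.path.getsize(dst_path)


# Build a throwaway input file; the repeated text is made-up sample data.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.txt")
    dst = os.path.join(tmp, "sample.gz")
    with open(src, "wb") as f:
        f.write(b"compression trades time for space\n" * 200)
    original_size, compressed_size = compress_file(src, dst)
    print(original_size, compressed_size)
```

Highly repetitive input like this compresses far better than typical text, so the ratio here should not be read as representative.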
The gzip module has two main functions. It compresses data and decompresses data. In this next example we decompress the same file that was written in the previous program.
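The program opens the compressed file and reads its decompressed contents into memory. Because the file is opened in "rb" mode, read() returns a bytes object. We then print the length of the contents, which matches the size of the original file.

    import gzip

    # Use open method.
    with gzip.open(r"C:\perls.gz", "rb") as f:
        # Read in the decompressed contents.
        content = f.read()
        # Print length.
        print(len(content))

    4404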
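The two functions of the module, compressing and decompressing, can also be tried without any files. This is a small sketch using gzip.compress() and gzip.decompress() on an in-memory bytes value; the sample data is made up for illustration.

```python
import gzip

# Made-up sample data; repetitive input compresses well.
original = b"perls " * 1000

# Compress in memory, then reverse the operation.
compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

# The compressed form is smaller, and the round trip loses nothing.
print(len(original), len(compressed))
print(restored == original)
```

The round trip returns exactly the original bytes, which is the same property the file-based example relies on when the decompressed length equals the original file size.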
We can invoke 7-Zip with the subprocess module in Python. This can be done on Windows or other platforms (but the executable name must be changed to match the platform).
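A rough sketch of the subprocess approach follows. It assumes the 7-Zip command-line executable is named "7z" and is on the PATH; on some systems the name differs (for example 7z.exe or 7za), so the name is a parameter. The helpers build_7zip_command and run_7zip and the file names perls.7z and perls.txt are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def build_7zip_command(archive_path, source_path, exe="7z"):
    """Return the argument list for adding source_path to archive_path."""
    # "a" is 7-Zip's add-to-archive command.
    return [exe, "a", archive_path, source_path]


def run_7zip(archive_path, source_path, exe="7z"):
    """Run 7-Zip if it is on PATH; return its exit code, or None if missing."""
    if shutil.which(exe) is None:
        # Executable not found; adjust the exe name for this platform.
        return None
    result = subprocess.run(build_7zip_command(archive_path, source_path, exe))
    return result.returncode


# Show the command that would be run, without executing it.
print(build_7zip_command("perls.7z", "perls.txt"))
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with paths that contain spaces.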
The Python site has some useful information on GZIP. Compression collapses space. And it often reduces the time required to process data.
Data compression becomes increasingly important. As time passes, the quantity of data increases. And with the gzip module, we have an easy way to use a powerful, compatible algorithm.