Suppose we have a file and want a single value that identifies the contents of that file. With hashlib, we can use hash algorithms like SHA256 and MD5 to compute such digests.
Hashlib computes cryptographic hashes, which are designed to make collisions extremely unlikely but can be slow to compute. Collections like the dictionary use a different kind of hash (the built-in hash function) that is faster but offers no such guarantees.
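To make that distinction concrete, here is a minimal sketch contrasting the built-in hash function with hashlib.sha256. The data value is just an illustrative assumption.

import hashlib

data = b"Some example text"

# Built-in hash(): fast, used by dict and set internally, but its
# result can differ between Python processes (hash randomization).
print(hash(data))

# hashlib.sha256: slower, but deterministic across runs and machines,
# so it can serve as a stable fingerprint of the data.
print(hashlib.sha256(data).hexdigest())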
This Python example program first creates a file called "example.txt" (opened in binary mode) that contains some data. It then hashes the file's contents in several ways.
Version 1 calls update() with bytes, and then calls hexdigest() to get the hash as a string.
Version 2 reads the file's bytes and uses the update() method as well.
Version 3 uses file_digest (available in Python 3.11 and newer), a more efficient way to compute a hash for a file object. We pass the algorithm name as a string to file_digest.
Versions 4 and 5 show that other algorithms, like blake2s and md5, work with file_digest as well.

import hashlib

# Create the needed example file.
with open("example.txt", "wb") as f:
    f.write(b"Some example text")

# Version 1: get sha256 hash from bytes.
m = hashlib.sha256()
m.update(b"Some example text")
print(m.hexdigest())

# Version 2: get sha256 hash from file using update.
with open("example.txt", "rb") as f:
    m2 = hashlib.sha256()
    data = f.read()
    m2.update(data)
    print(m2.hexdigest())

# Version 3: get sha256 hash from file with file_digest.
with open("example.txt", "rb") as f:
    m3 = hashlib.file_digest(f, "sha256")
    print(m3.hexdigest())

# Version 4: use blake2s hash.
with open("example.txt", "rb") as f:
    m4 = hashlib.file_digest(f, "blake2s")
    print(m4.hexdigest())

# Version 5: use md5 hash.
with open("example.txt", "rb") as f:
    m5 = hashlib.file_digest(f, "md5")
    print(m5.hexdigest())

Output

ba94b1c49abbd67b58019d6295f070913f499a774173c1b951b28525b0fb7193
ba94b1c49abbd67b58019d6295f070913f499a774173c1b951b28525b0fb7193
ba94b1c49abbd67b58019d6295f070913f499a774173c1b951b28525b0fb7193
cd154ac1356589b52ee8ad35899e36129bd62dcbf741dc7c08748fd10900e555
5b3a9b7b92bd8217bf5ffbd301043cea
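For large files, calling f.read() with no arguments loads the entire file into memory. A minimal sketch of incremental hashing, feeding update() one chunk at a time; the chunk size here is an arbitrary, reasonable assumption:

import hashlib

CHUNK_SIZE = 65536  # 64 KiB per read; an illustrative choice.

m = hashlib.sha256()
with open("example.txt", "rb") as f:
    # Read fixed-size chunks until read() returns b"" at end of file.
    while chunk := f.read(CHUNK_SIZE):
        m.update(chunk)
print(m.hexdigest())

Because update() can be called repeatedly, this produces the same digest as hashing the whole file at once, without holding all the bytes in memory.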
Hashing the contents of files lets us compare 2 files for equality without comparing the contents themselves. This can be a performance optimization, and it can help verify that data has not been tampered with.
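As a sketch of that idea, assuming both files exist on disk, we can compare their digests instead of their raw bytes. The files_match helper is hypothetical, introduced only for illustration:

import hashlib

def files_match(path_a, path_b):
    # Compare two files by their SHA256 digests. Equal digests mean
    # the contents are, for practical purposes, identical.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        digest_a = hashlib.file_digest(fa, "sha256")
        digest_b = hashlib.file_digest(fb, "sha256")
    return digest_a.hexdigest() == digest_b.hexdigest()

# Comparing a file with itself always prints True.
print(files_match("example.txt", "example.txt"))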