Python Lower Dictionary: String Performance

Use a dictionary to optimize the computation of lowercase strings.

Lower, dictionary. String operations such as lower() are often slow compared to dictionary lookups. Consider a program that must lowercase many strings.

With a dictionary, we could cache lowercase versions of each string. Depending on the program's data, this could speed it up considerably.
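A cache sketch. The idea above can be wrapped in a small helper. This is a minimal sketch, not part of the benchmark: the name cached_lower is hypothetical, and it assumes the same strings are passed in repeatedly.

```python
# Hypothetical helper: maps each original string to its lowercase form.
lower_cache = {}

def cached_lower(value):
    # Try the cache first; get() returns None on a miss.
    res = lower_cache.get(value)
    if res is None:
        # Miss: compute the lowercase string once and remember it.
        res = value.lower()
        lower_cache[value] = res
    return res

print(cached_lower("ESSAY ON MAN"))  # first call computes and stores
print(cached_lower("ESSAY ON MAN"))  # second call is a dictionary hit
print(len(lower_cache))              # one entry is cached
```

The cache grows by one entry per distinct input, so this only pays off when inputs repeat and the set of distinct strings fits comfortably in memory.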

An example benchmark. This benchmark runs two tests. It measures how much of a performance advantage we can achieve by caching lowercase strings.

Version 1: This loop calls lower() on each iteration. The string is 12 characters long.

Version 2: This loop uses a dictionary lookup on each iteration. If the string is not found, it computes the lowercase version and stores it with the original (uppercase) string as the key.

Python program that benchmarks lower, dictionary cache

    import time

    value = "ESSAY ON MAN"
    lower_cache = {}

    print(time.time())

    # Version 1: lowercase the string each iteration.
    for i in range(2000000):
        res = value.lower()
        # Test result.
        if res != "essay on man":
            print("X")
            break

    print(time.time())

    # Version 2: use a cache in a dictionary to get the lowercase string.
    for i in range(2000000):
        res = lower_cache.get(value)
        # Set in cache if needed.
        if res is None:
            res = value.lower()
            lower_cache[value] = res
        # Test result.
        if res != "essay on man":
            print("X")
            break

    print(time.time())

Output

    1478465952.604
    1478465953.01     lower          = 0.406 s
    1478465953.057    dictionary get = 0.047 s

Benchmark results. With a cache that hits on every lookup after the first, the dictionary version was about 9 times faster in PyPy (0.047 s versus 0.406 s). For real programs, where strings vary and the hit rate is lower, the results will be less stunning.
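Hit rate, sketch. To see how the hit rate affects the result, the same comparison can be run on repeated and on distinct strings. This is a rough sketch under assumptions: the function names and the list sizes are hypothetical, and timings vary by interpreter and machine, so no output is shown.

```python
import time

def time_lower(values):
    # Call lower() on every string, with no caching.
    start = time.perf_counter()
    for v in values:
        v.lower()
    return time.perf_counter() - start

def time_cached(values):
    # Look each string up in a dictionary; compute lower() only on a miss.
    cache = {}
    start = time.perf_counter()
    for v in values:
        res = cache.get(v)
        if res is None:
            res = v.lower()
            cache[v] = res
    return time.perf_counter() - start

# Assumed workloads: one string repeated (high hit rate)
# versus all-distinct strings (every lookup misses).
repeated = ["ESSAY ON MAN"] * 200000
distinct = ["ESSAY ON MAN %d" % i for i in range(200000)]

for name, data in (("repeated", repeated), ("distinct", distinct)):
    print(name, "lower:", time_lower(data), "cached:", time_cached(data))
```

With distinct strings every lookup misses, so the cached version does the lower() call plus the dictionary work and should not be expected to win.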

Memoization, notes. This program is an example of memoization. A memoized function caches its results, keyed by its arguments, so that a repeated call returns the stored value instead of recomputing it. This optimization could help real-world programs.
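Memoize, lru_cache. The standard library can do the bookkeeping for us. The following is a minimal sketch using functools.lru_cache; the function name lower_memo is hypothetical, and the decorator adds its own call overhead, so whether it beats a plain lower() call should be measured rather than assumed.

```python
from functools import lru_cache

# maxsize=None means the cache is unbounded, like the plain dictionary.
@lru_cache(maxsize=None)
def lower_memo(value):
    return value.lower()

lower_memo("ESSAY ON MAN")  # miss: computes and stores the result
lower_memo("ESSAY ON MAN")  # hit: returns the stored result

info = lower_memo.cache_info()
print(info.hits, info.misses)  # prints: 1 1
```

Passing a number for maxsize instead of None bounds the cache and evicts the least recently used entries, which may suit programs that see many distinct strings.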

A summary. Using a dictionary cache is often beneficial in Python. In PyPy or CPython, a dictionary lookup is usually faster than calling lower(), which must build a new string each time.
