In Python we have many string manipulation methods. These handle most common requirements. To handle whitespace, strip() is useful.
With strip, we remove certain characters (such as whitespace) from the left and right ends of strings. We invoke lstrip, rstrip and the versatile strip().
Here we invoke the lstrip, rstrip and strip methods. Strip, with no argument, removes leading and trailing whitespace. Lstrip, with no argument, removes whitespace only from the start of the string. The L stands for left.

# Has two leading spaces and a trailing one.
value = "  a line "

# Remove left spaces.
value1 = value.lstrip()
print("[" + value1 + "]")

# Remove right spaces.
value2 = value.rstrip()
print("[" + value2 + "]")

# Remove left and right spaces.
value3 = value.strip()
print("[" + value3 + "]")

Output:
[a line ]
[  a line]
[a line]
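With no argument, these methods remove every kind of leading or trailing whitespace, not only spaces: tabs and newlines are removed too. A minimal sketch, assuming a hypothetical list of raw lines such as a text file might yield:

```python
# Hypothetical raw lines, as might be read from a text file.
raw_lines = ["\tfirst line\n", "  second line \r\n", "third line"]

# strip() with no argument removes spaces, tabs and newlines at both ends.
clean_lines = [line.strip() for line in raw_lines]
print(clean_lines)

# rstrip() removes only trailing whitespace, so leading indentation is kept.
kept_indent = [line.rstrip() for line in raw_lines]
print(kept_indent)
```

Rstrip is a reasonable choice when the indentation at the start of a line carries meaning and only the line ending should go.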
Strip can be used for more than whitespace. Try passing an argument to it. Strip will remove every leading and trailing character that is found in the argument string.
Strip() does not match substrings; it treats the argument as a set of characters. Here we specify all digits and some punctuation.

# Has numbers on left and right, and some syntax.
value = "50342=Data,231"

# Strip all digits.
# ... Also remove equals sign and comma.
result = value.strip("0123456789=,")
print(result)

Output:
Data
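Because the argument is a set of characters, passing a suffix such as ".txt" to rstrip can remove more than intended. The sketch below shows the pitfall; remove_suffix is a hypothetical helper written for this illustration, not a built-in method.

```python
name = "report.txt"

# rstrip treats ".txt" as the character set {'.', 't', 'x'},
# so it also removes the final "t" of "report".
print(name.rstrip(".txt"))


def remove_suffix(text, suffix):
    # Hypothetical helper: remove suffix only when text really ends with it.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(remove_suffix(name, ".txt"))
```

The first call prints "repor" and the second prints "report". On Python 3.9 and later, the built-in str.removesuffix and str.removeprefix methods do the same job as this helper.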
Strip can be combined with methods like lower() to preprocess keys for a dictionary. So we can look up the string "CAT" with any leading or trailing spaces.
With lower() we can treat "cat" and "CAT" and "Cat" the same. This example is not optimally fast. But it works.

def preprocess(text):
    # Strip and lowercase the input.
    return text.strip().lower()

# Create a new dictionary.
lookup = {}

# Use preprocess to create key from string.
# ... Use key in the dictionary.
lookup[preprocess(" CAT")] = 10

# Get value from dictionary with preprocessed key.
print(lookup[preprocess("Cat ")])

Output:
10
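The same idea can be applied when the whole dictionary is built from untidy data. This is only a sketch: normalize and raw_pairs are hypothetical names chosen for the illustration.

```python
def normalize(key):
    # Hypothetical helper: trim whitespace, then lowercase.
    return key.strip().lower()


# Hypothetical raw data with inconsistent spacing and casing.
raw_pairs = [(" CAT", 10), ("Dog  ", 20), ("\tbird\n", 30)]

# Normalize each key once, while the dictionary is built.
lookup = {normalize(name): count for name, count in raw_pairs}

# Normalize again at lookup time; get() supplies a default for missing keys.
print(lookup.get(normalize("cat ")))
print(lookup.get(normalize(" DOG")))
print(lookup.get(normalize("fish"), 0))
```

Using get() with a default avoids a KeyError when a cleaned key is not present.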
How important are the arguments you pass to strip, lstrip and rstrip? Should you remove unneeded characters from the string argument to improve performance?
import time

# Input data.
s = "100200input"
print(s.lstrip("0123456789"))
print(s.lstrip("012"))

# Time 1.
print(time.time())

# Version 1: specify all digits.
i = 0
while i < 10000000:
    result = s.lstrip("0123456789")
    i += 1

# Time 2.
print(time.time())

# Version 2: specify needed digits.
i = 0
while i < 10000000:
    result = s.lstrip("012")
    i += 1

# Time 3.
print(time.time())

Output:
input
input
1380915621.696
1380915626.524    [With all digits: 4.83 s]
1380915631.384    [With needed digits: 4.86 s]

In this test the two versions take nearly the same time, so shortening the argument gives no measurable speedup.
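The same comparison can also be sketched with the standard timeit module, which handles the loop and the clock for us. The iteration count here is an assumption, chosen much smaller than in the test above so the sketch finishes quickly; the timings it prints will vary by machine.

```python
import timeit

s = "100200input"

# Assumed iteration count, kept small so the sketch runs quickly.
runs = 100_000

# Time each variant; timeit returns the total elapsed seconds.
all_digits = timeit.timeit(lambda: s.lstrip("0123456789"), number=runs)
needed_digits = timeit.timeit(lambda: s.lstrip("012"), number=runs)

print(f"All digits:    {all_digits:.4f} s")
print(f"Needed digits: {needed_digits:.4f} s")
```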
Split
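The strip method is commonly used before calling another method, such as split. When split is called with an explicit separator, any leading or trailing whitespace ends up inside the first and last parts, so split works better on a string that has been stripped first.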
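A short sketch of strip used together with split, assuming a hypothetical comma-separated line with stray whitespace:

```python
# Hypothetical input line with untidy spacing and a trailing newline.
line = "  red, green ,blue  \n"

# Without strip, the whitespace stays inside the parts.
print(line.split(","))

# Strip the whole line, split it, then strip each part.
parts = [part.strip() for part in line.strip().split(",")]
print(parts)
```

Stripping each part as well handles the spaces around the separators, which stripping the whole line cannot reach.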
Strip, and its friends lstrip and rstrip, aid in preprocessing string data. This simplifies using other methods later in your program, like split or even custom ones.