A Python string may contain letters, whitespace, numbers, and punctuation. Punctuation characters include commas, periods, and semicolons.
With Python, we can access the string.punctuation constant. This contains the common ASCII punctuation characters. It can be tested and used in programs.
Here we use the in operator on the string.punctuation constant. This allows us to test whether a char in a string is a punctuation character.

import string

# An input string.
name = "hey, my friend!"

for c in name:
    # See if the char is punctuation.
    if c in string.punctuation:
        print("Punctuation: " + c)

Punctuation: ,
Punctuation: !

In this example we use a for-in loop over the string.punctuation characters. We print each character with surrounding brackets.
The string.punctuation values do not include Unicode symbols or whitespace characters.

import string

# Display punctuation.
for c in string.punctuation:
    print("[" + c + "]")

[!]
["]
[#]
[$]
[%]
[&]
[']
[(]
[)]
[*]
[+]
[,]
[-]
[.]
[/]
[:]
[;]
[<]
[=]
[>]
[?]
[@]
[[]
[\]
[]]
[^]
[_]
[`]
[{]
[|]
[}]
[~]

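To illustrate the point that string.punctuation does not cover Unicode symbols or whitespace, here is a minimal sketch. The helper name is_unicode_punctuation is hypothetical (it is not part of the string module); it relies on the standard unicodedata module, where general categories beginning with "P" denote punctuation.

```python
import string
import unicodedata

def is_unicode_punctuation(ch):
    # Hypothetical helper: Unicode general categories that start
    # with "P" (Po, Pd, Ps, ...) are punctuation categories.
    return unicodedata.category(ch).startswith("P")

inverted = "\u00a1"  # Inverted exclamation mark, a non-ASCII character.

print(inverted in string.punctuation)    # False
print(is_unicode_punctuation(inverted))  # True
print(" " in string.punctuation)         # False
```

The two tests are not interchangeable. Unicode classifies some string.punctuation characters, such as "$" and "+", as symbols rather than punctuation, so the helper returns False for them.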
Remove punctuation. With the in operator and the string.punctuation constant, we can remove all punctuation chars from a string.

import string

def remove_punctuation(value):
    result = ""
    for c in value:
        # If char is not punctuation, add it to the result.
        if c not in string.punctuation:
            result += c
    return result

# Test our method.
temp = "hello, friend!... welcome."
print(temp)
print(remove_punctuation(temp))

hello, friend!... welcome.
hello friend welcome

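As an alternative to the character-by-character loop, the built-in str.maketrans and str.translate methods can delete the same characters in one call. This is a different technique from the one shown above, offered as a sketch; the names DELETE_PUNCTUATION and remove_punctuation_translate are made up for this example.

```python
import string

# Translation table that maps every char in string.punctuation to None,
# which makes translate() delete it.
DELETE_PUNCTUATION = str.maketrans("", "", string.punctuation)

def remove_punctuation_translate(value):
    # Return a copy of value with all ASCII punctuation removed.
    return value.translate(DELETE_PUNCTUATION)

print(remove_punctuation_translate("hello, friend!... welcome."))
# hello friend welcome
```

The table is built once at module level, so repeated calls do not rebuild it.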
Let us look at another occasionally helpful constant in the string module: string.whitespace. This too can be looped over or tested. For whitespace, the built-in isspace() string method is an alternative.

import string

print(" " in string.whitespace)
print("\n" in string.whitespace)
print("X" in string.whitespace)

True
True
False

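Since isspace() is mentioned alongside string.whitespace, a short comparison may help. The sketch below assumes Python 3, where str.isspace() also recognizes Unicode whitespace such as the no-break space, while string.whitespace holds only ASCII whitespace characters.

```python
import string

nbsp = "\u00a0"  # No-break space, a non-ASCII whitespace character.

# Both tests agree on ASCII whitespace.
print("\t" in string.whitespace)  # True
print("\t".isspace())             # True

# They disagree on Unicode whitespace.
print(nbsp in string.whitespace)  # False
print(nbsp.isspace())             # True
```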
For performance, the in operator is fast on a short string like string.punctuation, but it scans the string one character at a time. A specialized lookup table may do better in heavily used code. We could use a dictionary to store a value indicating whether a char is punctuation or not.
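The dictionary idea can be sketched as follows. The names punctuation_table, is_punctuation, and count_punctuation are hypothetical, and whether this is actually faster than the in operator on a given interpreter should be measured (for example with the timeit module) rather than assumed.

```python
import string

# Lookup table: each punctuation char maps to True.
punctuation_table = {c: True for c in string.punctuation}

def is_punctuation(c):
    # Chars missing from the table are reported as False.
    return punctuation_table.get(c, False)

def count_punctuation(value):
    # Count how many chars in value are punctuation.
    return sum(1 for c in value if is_punctuation(c))

print(count_punctuation("hey, my friend!"))  # 2
```

A set built with set(string.punctuation) would work similarly when only membership matters.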
Python contains many helpful constants in its string module. For example, we can test for whitespace, digits, and punctuation. This way we avoid writing these characters out by hand in our own code.
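As a closing sketch, several string module constants can be combined to classify characters. The function name classify is made up for this example, and it assumes the argument is a single character.

```python
import string

def classify(c):
    # Assumes c is a single character; checks each constant in turn.
    if c in string.digits:
        return "digit"
    if c in string.ascii_letters:
        return "letter"
    if c in string.whitespace:
        return "whitespace"
    if c in string.punctuation:
        return "punctuation"
    return "other"

for c in "a1 !":
    print(repr(c), classify(c))
```

These constants cover ASCII only, so a non-ASCII letter such as "é" is reported as "other" here.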