Re.sub
In regular expressions, sub stands for substitution. The re.sub method evaluates a pattern against a string and, for each match, calls a method (or lambda) to produce the replacement text.
This method can modify strings in complex ways. We can apply transformations, such as changing the numbers within a string. The syntax can be hard to follow at first.
This example introduces a method "multiply" that receives a match object. It accesses group(0), the full matched text, and converts it into an integer. It multiplies that number by two, converts the result back to a string, and returns it.
The first argument to re.sub() is the pattern "\d+", which means one or more digit characters. The second argument is the method that re.sub calls on each substitution. The third argument is the input string for processing. In this example, we use a sample string with several 2-digit numbers.
The re.sub method matched each group of digits (each number) and the multiply method doubled it.

    import re

    def multiply(m):
        # Convert group 0 to an integer.
        v = int(m.group(0))
        # Multiply integer by 2.
        # ... Convert back into string and return it.
        return str(v * 2)

    # Use pattern of 1 or more digits.
    # ... Use multiply method as second argument.
    result = re.sub(r"\d+", multiply, "10 20 30 40 50")
    print(result)

Output:

    20 40 60 80 100
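To see what the method actually receives, here is a minimal sketch that records each match's text and start position before doubling the number. The names seen and double_and_record, and the shorter sample string, are hypothetical and chosen only for this illustration.

```python
import re

# Hypothetical list used to record what re.sub passes to the method.
seen = []

def double_and_record(m):
    # m is a match object; group(0) is the full matched text
    # and start() is its position in the original string.
    seen.append((m.group(0), m.start()))
    return str(int(m.group(0)) * 2)

result = re.sub(r"\d+", double_and_record, "10 20 30")
print(result)
print(seen)
```

The method is called once per match, in left-to-right order, and the positions refer to the original string rather than the partially rewritten one.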
String
Re.sub can replace a pattern match with a simple string. No method call or lambda is required. Here we replace each pattern match with the string "x".

    import re

    # An example string.
    v = "running eating reading"

    # Replace each run from an "r" to the nearest following "g" with a new string.
    # ... In this string, that covers the words "running" and "reading".
    v = re.sub(r"r.*?g", "x", v)
    print(v)

Output:

    x eating x
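The pattern "r.*?g" is not tied to word boundaries, so it can also match across a space. The sketch below compares it with one possible stricter pattern; the "\b" anchors, the "\w*" class, and the longer sample string are assumptions made for this illustration, not part of the example above.

```python
import re

# Hypothetical sample with two extra words appended.
text = "running eating reading rat dog"

# The original pattern can span a space: "rat dog" runs from "r" to "g".
loose = re.sub(r"r.*?g", "x", text)

# Assumed alternative: \b keeps each match inside a single word.
strict = re.sub(r"\br\w*g\b", "x", text)

print(loose)
print(strict)
```

With the loose pattern, "rat dog" collapses into a single "x"; with the stricter one, only "running" and "reading" are replaced.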
Usually re.sub() is sufficient. But another option exists. The re.subn method has an extra feature. It returns a tuple: the new string is the first element, and a count of substitutions is the second element.
If you need to know the number of substitutions made, using re.subn is an ideal choice. If your program has no use for this information, re.sub is probably best. It is simpler and more commonly used.

    import re

    def add(m):
        # Convert.
        v = int(m.group(0))
        # Add 1.
        return str(v + 1)

    # Call re.subn.
    result = re.subn(r"\d+", add, "1 2 3 4 5")
    print("Result string:", result[0])
    print("Number of substitutions:", result[1])

Output:

    Result string: 2 3 4 5 6
    Number of substitutions: 5
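One common use of the count is to check whether anything changed at all. This is a minimal sketch, with a hypothetical input string and variable names, that unpacks the tuple from re.subn directly.

```python
import re

# Hypothetical input: mask every run of digits with "#".
masked, count = re.subn(r"\d+", "#", "a1 b22 c")

# The count tells us whether any substitution happened.
if count:
    print("Replaced", count, "matches:", masked)
else:
    print("Nothing to replace")
```

Unpacking into two names avoids indexing the tuple with [0] and [1].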
A method can be used in re.sub. But a lambda offers a more terse alternative. Here we specify a lambda expression directly within the re.sub argument list.

    import re

    # The input string (named text to avoid shadowing the built-in input).
    text = "laugh eat sleep think"

    # Use lambda to add "!" to all words.
    result = re.sub(r"\w+", lambda m: m.group(0) + "!", text)

    # Display result.
    print(result)

Output:

    laugh! eat! sleep! think!
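A lambda and a named method are interchangeable as the second argument. The sketch below runs the same transformation both ways; the name shout and the choice of upper-casing are hypothetical, used only to compare the two forms.

```python
import re

words = "laugh eat sleep think"

# Hypothetical named method: upper-cases the matched word.
def shout(m):
    return m.group(0).upper()

with_def = re.sub(r"\w+", shout, words)

# The same logic written inline as a lambda.
with_lambda = re.sub(r"\w+", lambda m: m.group(0).upper(), words)

print(with_lambda)
print(with_def == with_lambda)
```

A lambda suits one-expression replacements; a named method is easier to read once the logic needs several statements.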
Dictionary example
The re.sub method can be used with a dictionary. In the method provided to re.sub, we access a dictionary to influence our action.
If the matched word is a key in the plants dictionary, modify() returns the string PLANT. Otherwise it returns the match unchanged. In other words, modify() takes no action.

    import re

    plants = {"flower": 1, "tree": 1, "grass": 1}

    def modify(m):
        v = m.group(0)
        # If string is in dictionary, return different string.
        if v in plants:
            return "PLANT"
        # Do not change anything.
        return v

    # Replace all words found in the dictionary with PLANT.
    result = re.sub(r"\w+", modify, "bird flower dog fish tree")
    print(result)

Output:

    bird PLANT dog fish PLANT
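A related approach stores the replacement text itself in the dictionary. This is a sketch under that assumption; the replacements mapping and the swap method are hypothetical names, not part of the example above.

```python
import re

# Hypothetical mapping from a word to its replacement.
replacements = {"flower": "rose", "tree": "oak"}

def swap(m):
    word = m.group(0)
    # dict.get falls back to the original word when there is no entry.
    return replacements.get(word, word)

result = re.sub(r"\w+", swap, "bird flower dog fish tree")
print(result)
```

Using dict.get with the word as its own default keeps the "take no action" case to a single line.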
Re.sub, and its friend re.subn, can replace substrings in arbitrary ways. A method can test the contents of a match and change it using any algorithm.
And with a pattern, we can specify nearly any textual sequence to match. We can change a string to any other string (with a sufficient algorithm).