Re.sub. In Python's re module, sub stands for substitution. The re.sub method replaces all matches of a pattern. The replacement can be a string, or a method (or lambda) that re.sub calls for each match.
This method can modify strings in complex ways. We can apply transformations, such as changing numbers within a string. The syntax can be hard to follow.
This example introduces a method "multiply" that receives a match. It accesses group(0) and converts it into an integer. It multiplies that number by two, and converts it to a string.
Argument 1: The first argument to re.sub() is "\d+" which means one or more digit chars.
Argument 2: The second argument is the multiply method name. This is the method that re.sub calls for each match.
Argument 3: We pass a string for processing. In this example, we use a sample string with several 2-digit numbers.
Result: The re.sub method matched each group of digits (each number) and the multiply method doubled it.
import re

def multiply(m):
    # Convert group 0 to an integer.
    v = int(m.group(0))
    # Multiply integer by 2.
    # ... Convert back into string and return it.
    return str(v * 2)

# Use pattern of 1 or more digits.
# ... Use multiply method as second argument.
result = re.sub(r"\d+", multiply, "10 20 30 40 50")
print(result)

20 40 60 80 100
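The method passed to re.sub receives a match object, so capture groups other than group(0) are available too. Here is a minimal sketch of that idea. The pattern, the sample string and the name double_price are illustrative assumptions, not part of the original example.

```python
import re

def double_price(m):
    # group(1) is the dollar sign, group(2) is the run of digits.
    # Only the digits are doubled; the symbol is kept as-is.
    return m.group(1) + str(int(m.group(2)) * 2)

# Only numbers preceded by "$" match, so the bare "7" is left alone.
doubled = re.sub(r"(\$)(\d+)", double_price, "apple $3, pear $12, 7 items")
print(doubled)  # apple $6, pear $24, 7 items
```

With groups, the method can change one part of a match and leave the rest untouched.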
String. Re.sub can replace a pattern match with a simple string. No method call or lambda is required. Here we replace a pattern with the string "x".
import re

# An example string.
v = "running eating reading"
# Replace words starting with "r" and ending in "g" with a new string.
v = re.sub(r"r.*?g", "x", v)
print(v)

x eating x
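A string replacement can be tuned in two further ways. The optional count argument limits how many matches are replaced, and a backreference such as \1 in the replacement string inserts the text captured by a group. The sketch below reuses the sample string; the names first_only and bracketed are illustrative.

```python
import re

v = "running eating reading"

# count=1 stops after the first substitution.
first_only = re.sub(r"r.*?g", "x", v, count=1)
print(first_only)  # x eating reading

# \1 in the replacement refers to the text captured by group 1.
bracketed = re.sub(r"(r\w*g)", r"[\1]", v)
print(bracketed)  # [running] eating [reading]
```

Passing count as a keyword argument keeps the call readable and avoids confusing it with the flags argument.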
Subn. Usually re.sub() is sufficient. But another option exists. The re.subn method has an extra feature. It returns a tuple with a count of substitutions in the second element.
Tip: If you must know the number of substitutions made by re.sub, using re.subn is an ideal choice.
However: If your program has no use for this information, using re.sub is probably best. It is simpler and more commonly used.
import re

def add(m):
    # Convert.
    v = int(m.group(0))
    # Add 1.
    return str(v + 1)

# Call re.subn.
result = re.subn(r"\d+", add, "1 2 3 4 5")
print("Result string:", result[0])
print("Number of substitutions:", result[1])

Result string: 2 3 4 5 6
Number of substitutions: 5
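Because re.subn returns a tuple, it can be unpacked directly into two variables. This is a small sketch; the names add_one, new_text and count, and the sample inputs, are chosen here for illustration. It also shows that the count is 0 when nothing matches.

```python
import re

def add_one(m):
    # Increment the matched number by 1.
    return str(int(m.group(0)) + 1)

# Unpack the (string, count) tuple in one step.
new_text, count = re.subn(r"\d+", add_one, "10 20 30")
print(new_text)  # 11 21 31
print(count)     # 3

# With no matches, the string is returned unchanged and the count is 0.
same_text, zero = re.subn(r"\d+", add_one, "no digits here")
print(same_text, zero)  # no digits here 0
```

The count can serve as a quick test of whether any replacement happened at all.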
Lambda. A method can be used in re.sub. But a lambda offers a more terse alternative. Here we specify a lambda expression directly within the re.sub argument list.
Here: We add an exclamation mark to the end of all words within the input string.
import re

# The input string.
text = "laugh eat sleep think"
# Use lambda to add "!" to all words.
result = re.sub(r"\w+", lambda m: m.group(0) + "!", text)
# Display result.
print(result)

laugh! eat! sleep! think!
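A lambda and a named method are interchangeable as the second argument. The sketch below runs both forms on the same string to show they give the same result. The names exclaim, with_def and with_lambda are illustrative.

```python
import re

words = "laugh eat sleep think"

def exclaim(m):
    # Same logic as the lambda below, written as a named method.
    return m.group(0) + "!"

with_def = re.sub(r"\w+", exclaim, words)
with_lambda = re.sub(r"\w+", lambda m: m.group(0) + "!", words)

print(with_def)                 # laugh! eat! sleep! think!
print(with_def == with_lambda)  # True
```

A lambda suits one-line transformations. A named method is usually easier to read once the logic needs several statements.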
Dictionary example. The re.sub method can be used with a dictionary. In the method provided to re.sub, we access a dictionary to influence our action.
Here: We replace all known "plant" strings with the string PLANT. For other words, modify() returns the match unchanged.
import re

plants = {"flower": 1, "tree": 1, "grass": 1}

def modify(m):
    v = m.group(0)
    # If string is in dictionary, return different string.
    if v in plants:
        return "PLANT"
    # Do not change anything.
    return v

# Replace all strings found in the dictionary.
result = re.sub(r"\w+", modify,
    "bird flower dog fish tree")
print(result)

bird PLANT dog fish PLANT
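A variation on this idea stores the replacement text itself as the dictionary value, so each known word can map to a different string. This is a sketch under that assumption. The replacements dictionary and the lookup method are hypothetical names, not part of the original example.

```python
import re

# Hypothetical mapping from a word to its replacement text.
replacements = {"flower": "PLANT", "tree": "PLANT", "bird": "ANIMAL"}

def lookup(m):
    word = m.group(0)
    # dict.get falls back to the original word when no entry exists.
    return replacements.get(word, word)

mapped = re.sub(r"\w+", lookup, "bird flower dog fish tree")
print(mapped)  # ANIMAL PLANT dog fish PLANT
```

Using dict.get with a default removes the need for a separate membership test.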
Summary. Re.sub, and its friend re.subn, can replace substrings in arbitrary ways. A method can test the contents of a match and change it using any algorithm.
And with a pattern, we can specify nearly any textual sequence to match. We can change a string to any other string (with a sufficient algorithm).
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Sep 19, 2024 (new example).