Python re.sub, subn Methods

Use re.sub and re.subn to replace the parts of a string that match a pattern, with either a fixed string or the result of a function call.

Re.sub. In regular expressions, sub stands for substitution. The re.sub method applies a function to all matches. It evaluates a pattern and, for each match, calls a function (or lambda) whose return value replaces the matched text.

This method can modify strings in complex ways. We can apply transformations, such as changing the numbers within a string. The syntax can be hard to follow at first.

This example introduces a method "multiply" that receives a match. It accesses group(0) and converts it into an integer. It multiplies that number by two, and converts it to a string.

Sub: In the main program body, we call the re.sub method. The first argument is "\d+" which means one or more digit chars.

And: The second argument is the multiply method name. We also pass a sample string for processing.

Result: The re.sub method matched each group of digits (each number) and the multiply method doubled it.

Python program that uses re.sub

    import re

    def multiply(m):
        # Convert group 0 to an integer.
        v = int(m.group(0))
        # Multiply integer by 2.
        # ... Convert back into string and return it.
        return str(v * 2)

    # Use pattern of 1 or more digits.
    # ... Use multiply method as second argument.
    result = re.sub(r"\d+", multiply, "10 20 30 40 50")
    print(result)

Output

    20 40 60 80 100
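
To see what the replacement function actually receives, a minimal sketch can record each match object as re.sub passes it in. The names "seen" and "record" are hypothetical, chosen only for this illustration.

```python
import re

# Collects (matched text, (start, end)) for every call made by re.sub.
seen = []

def record(m):
    # m is a match object: group(0) is the matched text, span() its position.
    seen.append((m.group(0), m.span()))
    # Return the text unchanged so the string is not modified.
    return m.group(0)

result = re.sub(r"\d+", record, "10 20 30")
print(result)  # 10 20 30
print(seen)    # [('10', (0, 2)), ('20', (3, 5)), ('30', (6, 8))]
```

The function is called once per match, from left to right, and whatever it returns must be a string.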

String. Re.sub can replace a pattern match with a simple string. No method call or lambda is required. Here we replace a pattern with the string "ring."

Python program that uses string replacement

    import re

    # An example string.
    v = "running eating reading"

    # Replace words starting with "r" and ending in "ing"
    # ... with a new string.
    v = re.sub(r"r.*?ing", "ring", v)
    print(v)

Output

    ring eating ring
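
One caveat with the pattern above: ".*?" can also match spaces, so a match may run across several words. The sketch below compares it with a word-bounded variant. The sample string and the stricter pattern are assumptions added for illustration, not part of the original example.

```python
import re

text = "rat eating reading"

# The lazy ".*?" may cross spaces: "rat eating" is consumed as one match.
loose = re.sub(r"r.*?ing", "ring", text)

# \b anchors at word boundaries and \w* stays inside a single word.
strict = re.sub(r"\br\w*ing\b", "ring", text)

print(loose)   # ring ring
print(strict)  # rat eating ring
```

If the goal is to replace whole words only, the bounded form is probably the safer choice.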

Subn. Usually re.sub() is sufficient. But another option exists. The re.subn method has an extra feature. It returns a tuple: the new string is the first element, and the count of substitutions is the second.

Tip: If you must know the number of substitutions made by re.sub, using re.subn is an ideal choice.

However: If your program has no use for this information, using re.sub is probably best. It is simpler and more commonly used.

Python program that calls re.subn

    import re

    def add(m):
        # Convert.
        v = int(m.group(0))
        # Add 1.
        return str(v + 1)

    # Call re.subn.
    result = re.subn(r"\d+", add, "1 2 3 4 5")

    print("Result string:", result[0])
    print("Number of substitutions:", result[1])

Output

    Result string: 2 3 4 5 6
    Number of substitutions: 5
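
The count is most useful when the program needs to know whether anything changed at all. Here is a small sketch that unpacks the tuple directly. The helper name "mask_digits" and its messages are hypothetical.

```python
import re

def mask_digits(text):
    # re.subn returns (new_string, number_of_substitutions).
    masked, count = re.subn(r"\d+", "#", text)
    if count == 0:
        # Nothing matched, so the text comes back unchanged.
        return text, "no digits found"
    return masked, f"{count} number(s) masked"

print(mask_digits("room 12, floor 3"))  # ('room #, floor #', '2 number(s) masked')
print(mask_digits("no numbers here"))   # ('no numbers here', 'no digits found')
```

Unpacking into two names tends to read more clearly than indexing with result[0] and result[1].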

Lambda. The name of a function defined with def can be passed to re.sub. But a lambda offers a terser alternative. Here we specify a lambda expression directly within the re.sub argument list.

Here: We add the string "ing" to the end of all words within the input string. Additional logic could be used to make the results better.

Tip: The gerund form of a verb cannot always be made this way. Sometimes other spelling changes are needed.

Python program that uses re.sub, lambda

    import re

    # The input string.
    text = "laugh eat sleep think"

    # Use lambda to add "ing" to all words.
    result = re.sub(r"\w+", lambda m: m.group(0) + "ing", text)

    # Display result.
    print(result)

Output

    laughing eating sleeping thinking
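
As a rough sketch of the "additional logic" mentioned above, a named function can apply one spelling rule before adding the suffix. The function "to_gerund" and its single rule are assumptions made for illustration. It does not handle consonant doubling ("run" to "running") or short words such as "be".

```python
import re

def to_gerund(m):
    word = m.group(0)
    # Assumed rule: drop a single trailing "e" (make -> making),
    # but keep a double "ee" (see -> seeing).
    if word.endswith("e") and not word.endswith("ee"):
        return word[:-1] + "ing"
    return word + "ing"

result = re.sub(r"\w+", to_gerund, "laugh make see write")
print(result)  # laughing making seeing writing
```

Once the logic needs a branch like this, a def function is usually easier to read than a lambda.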

Dictionary example. The re.sub method can be used with a dictionary. In the function passed to re.sub, we look up the matched text in a dictionary to decide what to return.

Here: We replace all known "plant" strings with the string PLANT. For other words, modify() returns the match unchanged.

Python program that uses re.sub with dictionary

    import re

    plants = {"flower": 1, "tree": 1, "grass": 1}

    def modify(m):
        v = m.group(0)
        # If string is in dictionary, return different string.
        if v in plants:
            return "PLANT"
        # Do not change anything.
        return v

    # Replace all strings found in the dictionary.
    result = re.sub(r"\w+", modify, "bird flower dog fish tree")
    print(result)

Output

    bird PLANT dog fish PLANT
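
A possible variation stores the replacement text as the dictionary value, so a single lookup both tests the word and supplies its replacement. The "categories" mapping and the "categorize" function are hypothetical names used only in this sketch.

```python
import re

# Hypothetical mapping from a word to its replacement text.
categories = {"flower": "PLANT", "tree": "PLANT", "bird": "ANIMAL"}

def categorize(m):
    word = m.group(0)
    # dict.get returns the word itself when it has no entry.
    return categories.get(word, word)

result = re.sub(r"\w+", categorize, "bird flower dog fish tree")
print(result)  # ANIMAL PLANT dog fish PLANT
```

With this layout, supporting a new category only requires a new dictionary entry, not a change to the function.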

A summary. Re.sub, and its friend re.subn, can replace substrings in arbitrary ways. A method can test the contents of a match and change it using any algorithm.

And with a pattern, we can specify nearly any textual sequence to match. We can change a string to any other string (with a sufficient algorithm).
