Often in parsing text, a string
's position relative to other characters is important. For example, a parameter to a method is between two parentheses.
For simple text languages, these surrounding characters are easy to handle. We can implement a between()
method that returns a substring surrounded by two others.
Here we see some example input and output of the between parsing def. It is important to note the exact character position of the letter "A."
Input = "DEFINE:A=TWO" Between("DEFINE:", "=TWO") = "A"
This Python program includes three new methods: between, before and after. We internally call find()
and rfind()
. These methods return -1 when nothing is found.
find()
and rfind to get the indexes of the desired string
slice.string
slice syntax in Python to get substrings of the strings. We thus get substrings relative to other substrings.def between(value, a, b): # Find and validate before-part. pos_a = value.find(a) if pos_a == -1: return "" # Find and validate after part. pos_b = value.rfind(b) if pos_b == -1: return "" # Return middle part. adjusted_pos_a = pos_a + len(a) if adjusted_pos_a >= pos_b: return "" return value[adjusted_pos_a:pos_b] def before(value, a): # Find first part and return slice before it. pos_a = value.find(a) if pos_a == -1: return "" return value[0:pos_a] def after(value, a): # Find and validate first part. pos_a = value.rfind(a) if pos_a == -1: return "" # Returns chars after the found string. adjusted_pos_a = pos_a + len(a) if adjusted_pos_a >= len(value): return "" return value[adjusted_pos_a:] # Test the methods with this literal. test = "DEFINE:A=TWO" print(between(test, "DEFINE:", "=")) print(between(test, ":", "=")) print(before(test, ":")) print(before(test, "=")) print(after(test, ":")) print(after(test, "DEFINE:")) print(after(test, "="))A A DEFINE DEFINE:A A=TWO A=TWO TWO
The between, before and after methods here return an empty string
literal when no value is found. A None
result is probably a better choice for these error cases.
None
constant is desired, the return statements could be easily modified to return None
instead.I have not needed this code for any Python programs. But often in code I have found that implementing small, domain-specific languages is a good strategy.
Substring
logic can be complexIt is a disaster to be off by one character. Your entire program might fail and ruin your day. With methods that handle this logic, things are easier.