Levenshtein. In 1965 Vladmir Levenshtein introduced a distance algorithm. This tells us the number of changes (edits) one string must go through to become another string.
Function notes. In VB.NET we compute Levenshtein distance with a Function. This function returns the number of character edits that must occur to get from String A to String B.
Example. The internals of the Levenshtein distance Function are complex. Internally it allocates a 2D array. This 2D array has dimensions equal to the lengths of the 2 argument strings.
Then Array elements are assigned, considering adjacent array elements and the cost. The ending array element is returned as the distance.
Module Module1
Sub Main()
Console.WriteLine(LevenshteinDistance("aunt", "ant"))
Console.WriteLine(LevenshteinDistance("Sam", "Samantha"))
Console.WriteLine(LevenshteinDistance("flomax", "volmax"))
End Sub
''' <summary>
''' Compute LevenshteinDistance.
''' </summary>
Public Function LevenshteinDistance(ByVal s As String,
ByVal t As String) As Integer
Dim n As Integer = s.Length
Dim m As Integer = t.Length
Dim d(n + 1, m + 1) As Integer
If n = 0 Then
Return m
End If
If m = 0 Then
Return n
End If
Dim i As Integer
Dim j As Integer
For i = 0 To n
d(i, 0) = i
Next
For j = 0 To m
d(0, j) = j
Next
For i = 1 To n
For j = 1 To m
Dim cost As Integer
If t(j - 1) = s(i - 1) Then
cost = 0
Else
cost = 1
End If
d(i, j) = Math.Min(Math.Min(d(i - 1, j) + 1, d(i, j - 1) + 1),
d(i - 1, j - 1) + cost)
Next
Next
Return d(n, m)
End Function
End Module1
5
3
Function output. The number returned by Levenshtein for the arguments "aunt" and "ant" is 1. This means only one edit is required to get from aunt to ant—the letter U is removed.
And For the second result, we have 5 edits. The name "Sam" must be edited 5 times to get "Samantha"—5 letters are added.
Finally And to get to "volmax" from "flomax" 3 edits must occur. Swapping 2 characters is considered 2 separate edits.
The theory of the Levenshtein distance algorithm is not covered here. But the implementation here is correct on the example Strings tested. It is not fast.
Certain optimizations could be used to enhance performance, such as memoization. A data structure such as a Dictionary could store the two strings and also a cached distance.
Dot Net Perls is a collection of tested code examples. Pages are continually updated to stay current, with code correctness a top priority.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Jan 24, 2022 (edit).