Sort strings alphanumerically, handling numbers and strings together logically for humans. Where I live, highways have names like 50F and 100F. If we use a plain alphabetic or ASCII sort, the 100F will come before the 50F, because 1 comes before 5. We need alphanumeric sorting instead of alphabetic sorting.
Alphanumeric sorting is the technique of sorting strings containing numbers and regular characters in a logical and human-friendly order. It is not an alphabetic sort, an ASCII sort, or a numeric sort. This kind of sorting is used in file managers, and the following example shows some highway names.
| ASCII sort | Alphanumeric sort |
|
100F 50F SR100 SR9 |
50F 100F SR9 SR100 |
By implementing a comparator. Alphanumeric sort order makes most sense to humans, but it isn't readily apparent to the computer. The key is to compare numbers and strings separately. Let's look at the calling convention for the comparator here.
// We want to sort a string[] array called highways in an // alphanumeric way. Call the static Array.Sort method, and use the // custom AlphanumComparatorFast for the sort method parameter. Array.Sort(highways, new AlphanumComparatorFast());
Like this. I took the first version I found at another site and made it 40% faster, and this version is 45% faster than that. You will see reduced memory usage and far better performance, while being entirely accurate, according to my testing. Let's look at the implementation.
public class AlphanumComparatorFast : IComparer
{
public int Compare(object x, object y)
{
string s1 = x as string;
if (s1 == null)
{
return 0;
}
string s2 = y as string;
if (s2 == null)
{
return 0;
}
int len1 = s1.Length;
int len2 = s2.Length;
int marker1 = 0;
int marker2 = 0;
// Walk through two the strings with two markers.
while (marker1 < len1 && marker2 < len2)
{
char ch1 = s1[marker1];
char ch2 = s2[marker2];
// Some buffers we can build up characters in for each chunk.
char[] space1 = new char[len1];
int loc1 = 0;
char[] space2 = new char[len2];
int loc2 = 0;
// Walk through all following characters that are digits or
// characters in BOTH strings starting at the appropriate marker.
// Collect char arrays.
do
{
space1[loc1++] = ch1;
marker1++;
if (marker1 < len1)
{
ch1 = s1[marker1];
}
else
{
break;
}
} while (char.IsDigit(ch1) == char.IsDigit(space1[0]));
do
{
space2[loc2++] = ch2;
marker2++;
if (marker2 < len2)
{
ch2 = s2[marker2];
}
else
{
break;
}
} while (char.IsDigit(ch2) == char.IsDigit(space2[0]));
// If we have collected numbers, compare them numerically.
// Otherwise, if we have strings, compare them alphabetically.
string str1 = new string(space1);
string str2 = new string(space2);
int result;
if (char.IsDigit(space1[0]) && char.IsDigit(space2[0]))
{
int thisNumericChunk = int.Parse(str1);
int thatNumericChunk = int.Parse(str2);
result = thisNumericChunk.CompareTo(thatNumericChunk);
}
else
{
result = str1.CompareTo(str2);
}
if (result != 0)
{
return result;
}
}
return len1 - len2;
}
}
Copy the above IComparer code and use it directly. For more general information about sorting, there is an article about sorting string arrays in C#, which can give you more background. I have material about sorting strings by length along with example code, and an article on sorting individual characters. You can see the code here in a more convenient form at the source archive.