C#Dot Net Perls

C#
Alphanumeric Sorting

by Sam Allen

Problem

Sort strings alphanumerically, handling numbers and strings together logically for humans. Where I live, highways have names like 50F and 100F. If we use a plain alphabetic or ASCII sort, the 100F will come before the 50F, because 1 comes before 5. We need alphanumeric sorting instead of alphabetic sorting.

C# Solution

Alphanumeric sorting is the technique of sorting strings containing numbers and regular characters in a logical and human-friendly order. It is not an alphabetic sort, an ASCII sort, or a numeric sort. This kind of sorting is used in file managers, and the following example shows some highway names.

ASCII sort Alphanumeric sort
100F
50F
SR100
SR9
50F
100F
SR9
SR100

How can I call this sort method?

By implementing a comparator. Alphanumeric sort order makes most sense to humans, but it isn't readily apparent to the computer. The key is to compare numbers and strings separately. Let's look at the calling convention for the comparator here.

// We want to sort a string[] array called highways in an
// alphanumeric way. Call the static Array.Sort method, and use the
// custom AlphanumComparatorFast for the sort method parameter.
Array.Sort(highways, new AlphanumComparatorFast());

How do we implement the sort?

Like this. I took the first version I found at another site and made it 40% faster, and this version is 45% faster than that. You will see reduced memory usage and far better performance, while being entirely accurate, according to my testing. Let's look at the implementation.

public class AlphanumComparatorFast : IComparer
{
    public int Compare(object x, object y)
    {
        string s1 = x as string;
        if (s1 == null)
        {
            return 0;
        }
        string s2 = y as string;
        if (s2 == null)
        {
            return 0;
        }

        int len1 = s1.Length;
        int len2 = s2.Length;
        int marker1 = 0;
        int marker2 = 0;

        // Walk through two the strings with two markers.
        while (marker1 < len1 && marker2 < len2)
        {
            char ch1 = s1[marker1];
            char ch2 = s2[marker2];

            // Some buffers we can build up characters in for each chunk.
            char[] space1 = new char[len1];
            int loc1 = 0;
            char[] space2 = new char[len2];
            int loc2 = 0;

            // Walk through all following characters that are digits or
            // characters in BOTH strings starting at the appropriate marker.
            // Collect char arrays.
            do
            {
                space1[loc1++] = ch1;
                marker1++;

                if (marker1 < len1)
                {
                    ch1 = s1[marker1];
                }
                else
                {
                    break;
                }
            } while (char.IsDigit(ch1) == char.IsDigit(space1[0]));

            do
            {
                space2[loc2++] = ch2;
                marker2++;

                if (marker2 < len2)
                {
                    ch2 = s2[marker2];
                }
                else
                {
                    break;
                }
            } while (char.IsDigit(ch2) == char.IsDigit(space2[0]));

            // If we have collected numbers, compare them numerically.
            // Otherwise, if we have strings, compare them alphabetically.
            string str1 = new string(space1);
            string str2 = new string(space2);
            
            int result;

            if (char.IsDigit(space1[0]) && char.IsDigit(space2[0]))
            {
                int thisNumericChunk = int.Parse(str1);
                int thatNumericChunk = int.Parse(str2);
                result = thisNumericChunk.CompareTo(thatNumericChunk);
            }
            else
            {
                result = str1.CompareTo(str2);
            }

            if (result != 0)
            {
                return result;
            }
        }
        return len1 - len2;
    }
}

Conclusion

Copy the above IComparer code and use it directly. For more general information about sorting, there is an article about sorting string arrays in C#, which can give you more background. I have material about sorting strings by length along with example code, and an article on sorting individual characters. You can see the code here in a more convenient form at the source archive.

Dot Net Perls is dedicated to sharing code and knowledge. It has
© 2007-2008 Sam Allen. All rights reserved.

Ads by The Lounge