Dot Net Perls

Alphanumeric Sorting in C#

by Sam Allen - Updated June 13, 2009

Problem. You want to sort strings alphanumerically, handling numbers and strings together in a way that makes sense to humans. For example, highway names like 50F and 100F will be sorted wrongly with ASCII sort. Solution. This article contains an implementation of an alphanumeric sorting method, using the C# programming language.

       ASCII Sort: 100F 
                   50F  
                   SR100
                   SR9  
Alphanumeric Sort: 50F  
                   100F 
                   SR9  
                   SR100

1. Using alphanumeric sorting

Here we use the alphanumeric sorting comparer to deal with strings containing numbers and regular characters in a human-friendly order. It is different from alphabetic sorting, ASCII sorting, or numeric sorting. It is used in most file managers, and is useful on highway names.

=== Program that uses comparator (C#) ===

using System;
using System.Collections;

class Program
{
    static void Main()
    {
        string[] highways = new string[]
        {
            "100F",
            "50F",
            "SR100",
            "SR9"
        };
        //
        // We want to sort a string[] array called highways in an
        // alphanumeric way. Call the static Array.Sort method.
        //
        Array.Sort(highways, new AlphanumComparatorFast());
        //
        // Display the results
        //
        foreach (string h in highways)
        {
            Console.WriteLine(h);
        }
    }
}

=== Output of the program ===

50F
100F
SR9
SR100

2. Implementing IComparer interface

Here we see the comparator for alphanumeric sorting. The author took open source code and made it 40% faster, and this version is 45% faster yet. It has reduced memory usage and far better performance. Most importantly, however, it is accurate, according to his limited testing. [This code was tested on the .NET Framework 3.5 SP1]

--- Implementation of IComparer (C#) ---

public class AlphanumComparatorFast : IComparer
{
    public int Compare(object x, object y)
    {
        string s1 = x as string;
        if (s1 == null)
        {
            return 0;
        }
        string s2 = y as string;
        if (s2 == null)
        {
            return 0;
        }

        int len1 = s1.Length;
        int len2 = s2.Length;
        int marker1 = 0;
        int marker2 = 0;

        // Walk through two the strings with two markers.
        while (marker1 < len1 && marker2 < len2)
        {
            char ch1 = s1[marker1];
            char ch2 = s2[marker2];

            // Some buffers we can build up characters in for each chunk.
            char[] space1 = new char[len1];
            int loc1 = 0;
            char[] space2 = new char[len2];
            int loc2 = 0;

            // Walk through all following characters that are digits or
            // characters in BOTH strings starting at the appropriate marker.
            // Collect char arrays.
            do
            {
                space1[loc1++] = ch1;
                marker1++;

                if (marker1 < len1)
                {
                    ch1 = s1[marker1];
                }
                else
                {
                    break;
                }
            } while (char.IsDigit(ch1) == char.IsDigit(space1[0]));

            do
            {
                space2[loc2++] = ch2;
                marker2++;

                if (marker2 < len2)
                {
                    ch2 = s2[marker2];
                }
                else
                {
                    break;
                }
            } while (char.IsDigit(ch2) == char.IsDigit(space2[0]));

            // If we have collected numbers, compare them numerically.
            // Otherwise, if we have strings, compare them alphabetically.
            string str1 = new string(space1);
            string str2 = new string(space2);

            int result;

            if (char.IsDigit(space1[0]) && char.IsDigit(space2[0]))
            {
                int thisNumericChunk = int.Parse(str1);
                int thatNumericChunk = int.Parse(str2);
                result = thisNumericChunk.CompareTo(thatNumericChunk);
            }
            else
            {
                result = str1.CompareTo(str2);
            }

            if (result != 0)
            {
                return result;
            }
        }
        return len1 - len2;
    }
}

Explanation of the class. The above IComparer implementation returns an integer. This result indicates which string comes first. It walks through both strings. This is done in a single loop. It tries to find equivalent chunks of those two strings at a position. It uses char arrays for performance. [C# Char Array Use - dotnetperls.com]

It uses two comparisons. It selects either an alphabetical comparison or a numeric comparison, depending on the chunk type. Next, it uses CompareTo. This method returns an integer that indicates whether the first object is bigger or smaller than the second object.

3. Summary

Here we saw how you can sort strings alphanumerically in your programs written in the C# language, resulting in a more natural presentation of lists for your users. Use this IComparer implementation for alphanumeric or natural sorting. This code isn't the fastest or clearest code, but it works for small applications and its results are predictable.

Dot Net Perls
Sorting | Reverse String | Shuffle Array | Sort Dictionary Values | Sort List Method, Sorting and... | Sort String Arrays
C# | Dictionary StringComparer Tip | DateTime.TryParse Example | Reflection Field Example | Validate Characters in String
© 2009 Sam Allen. All rights reserved.