Dot Net Perls

List Element Equality - C#

by Sam Allen

Problem

You want to determine if two Lists or arrays have the same element values, regardless of order. There are many possible approaches, but you want the simplest one, which is easiest to maintain.

List 1List 2Equal?
1, 2, 42, 1, 4True
5, 4, 66, 5, 4True
1, 2, 41, 4False
1, 52, 5False
1, 21, 2True

Solution: unordered Lists and hashes in C#

Before developing the solution, I researched on Google and found a variety of approaches. One solution copies both Lists to arrays, sorts them, and then loops over the elements. The best versions use Dictionary and compare the frequencies.

Some solutions I found will have severe performance problems, as with nested loops. Another solution has the potential to be inaccurate, making it worse than useless.

Figure 1: using Dictionary to count key frequencies in generic method

Here we see a generic method in C#, meaning it receives parameters of a caller-specified type. Methods can be generic just as classes can. The syntax still uses the angle brackets, < and >.

TryGetValue is also used instead of ContainsKey, which eliminates extra hash key computations. This enhances performance and yields a method that is faster than most.

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        List<int> la = new List<int>() { 1, 0, 4, 200, -40 };
        List<int> lb = new List<int>() { -40, 200, 4, 1, 0 };
        List<int> lc = new List<int>() { 3, 5, 4, 9, 11 };
        List<int> ld = new List<int>() { 6, 6, 100 };
        List<int> le = new List<int>() { 6, 100, 100 };
        Console.WriteLine(UnorderedEqual(la, lb)); // true
        Console.WriteLine(UnorderedEqual(la, lc)); // false
        Console.WriteLine(UnorderedEqual(lc, ld)); // false
        Console.WriteLine(UnorderedEqual(ld, le)); // false

        int[] a1 = new int[] { 1, 2, 5 };
        int[] a2 = new int[] { 2, 5, 1 };
        int[] a3 = new int[] { 1, 1, 3 };
        int[] a4 = new int[] { 3, 3, 1 };
        Console.WriteLine(UnorderedEqual(a1, a2)); // true
        Console.WriteLine(UnorderedEqual(a1, a3)); // false
        Console.WriteLine(UnorderedEqual(a3, a4)); // false
    }

    static bool UnorderedEqual<T>(ICollection<T> a, ICollection<T> b)
    {
        // 1
        // Require that the counts are equal
        if (a.Count != b.Count)
        {
            return false;
        }
        // 2
        // Initialize new Dictionary of the type
        Dictionary<T, int> d = new Dictionary<T, int>();
        // 3
        // Add each key's frequency from collection A to the Dictionary
        foreach (T item in a)
        {
            int c;
            if (d.TryGetValue(item, out c))
            {
                d[item] = c + 1;
            }
            else
            {
                d.Add(item, 1);
            }
        }
        // 4
        // Add each key's frequency from collection B to the Dictionary
        // Return early if we detect a mismatch
        foreach (T item in b)
        {
            int c;
            if (d.TryGetValue(item, out c))
            {
                if (c == 0)
                {
                    return false;
                }
                else
                {
                    d[item] = c - 1;
                }
            }
            else
            {
                // Not in dictionary
                return false;
            }
        }
        // 5
        // Verify that all frequencies are zero
        foreach (int v in d.Values)
        {
            if (v != 0)
            {
                return false;
            }
        }
        // 6
        // We know the collections are equal
        return true;
    }
}
  1. UnorderedEqual receives ICollections and checks length.
    Two collections that implement ICollection<T> are received and their Counts are compared. If the two parameters don't have the same number of elements, they cannot be equal.
  2. It uses a Dictionary.
    The Dictionary used in this method is for storing each key of type T and its frequency. If the key is found once, its frequency will be set to 1, for example.
  3. It adds the first collection's keys.
    The method places each element into the Dictionary as a key. If an element occurs twice, its frequency is incremented. TryGetValue is used to avoid another hashtable computation. [C# - TryGetValue - dotnetperls.com]
  4. It subtracts the second collection's keys.
    Each element occurring in the second collection is decremented from the Dictionary. If any value goes below zero, we already know the collections are not equal.
  5. It verifies the keys.
    The method enforces that each key have a frequency of 0. This ensures there are no extra elements in the first collection.
  6. It returns true.
    Here we know for sure that the collections are equal.

Information: my comments on this method

There is a large thread of postings on this topic from the C# newsgroups. The solution that adds up hash values would result in collisions and errors. The solutions that copy, sort and then compare are good. [Comparing two arrays when element order does not matter]

Elsewhere, I found some solutions that didn't return early and continued through the entire loops even after the result was known. This is needlessly inefficient.

You cannot use HashSet because it simply tests existence, not frequency, of the elements. HashSet was my first attempt, and the solution was 2 lines long but not accurate.

Question: do I need generic methods?

No, absolutely not. I don't use custom generics normally and discourage them unless necessary. I feel they add needless complexity to small programs.

You can rewrite the above solution simply by changing the arguments to int[], and the Dictionary and loops to use int instead of T. However, the implementation here will work on both strings and numeric types.

Summary: Lists contain equal elements

Here we saw a solution to checking unordered elements that is a generic method, works for both string and int, uses a more efficient Dictionary syntax that avoids excessive lookups, and also iterates over each value in the Dictionary, avoiding more lookups.

Most importantly, I proved the method's correctness on the data provided. Never use a method that is not proven or can return incorrect results. In fact, always test methods rigorously, except perhaps .NET framework methods that are widespread.

Dot Net Perls
About
Sitemap
Arrays
Sort String Arrays
2D Array Use
Count Elements in Array
Char Array Performance
Shuffle Array
New
GZIP Accept-Encoding Request
Convert bool to int
© 2008 Sam Allen. All rights reserved.