C# Dictionary Order, Use Keys Added Last

Understand how to optimize Dictionary lookups by changing the order keys are added.

Dictionary order. The order you add keys to a Dictionary is important. It affects the performance of accessing those keys. Because the Dictionary uses a chaining algorithm, the keys that were added last are often faster to locate.OptimizationDictionary

Example. This program is a complete benchmark harness that adds ten million and two random path strings to a Dictionary collection in the C# language. A reference to the first and last keys added are stored as variables.
Note: This is a benchmark for testing the performance of looking up a key that was added first, and another key that was added last.
And: A reference is retained to the first and last keys added. This involves two string variables.
Result: The two tight loops then show the performance difference between looking up entries in the Dictionary that were added first, and last.
C# program that tests Dictionary performance and order using System; using System.Collections.Generic; using System.Diagnostics; using System.IO; class Program { const int _max = 10000000; static void Main() { double msA = 0, msB = 0; // // Ten tests of dictionary order. // for (int y = 0; y < 10; y++) { var dictionary = new Dictionary<string, int>(StringComparer.Ordinal); string first = Path.GetRandomFileName(); dictionary.Add(first, 1); for (int i = 0; i < 1000000; i++) { dictionary.Add(Path.GetRandomFileName(), 1); } string last = Path.GetRandomFileName(); dictionary.Add(last, 1); // // Test the dictionaries. // var s1 = Stopwatch.StartNew(); for (int i = 0; i < _max; i++) { int v; dictionary.TryGetValue(first, out v); } s1.Stop(); var s2 = Stopwatch.StartNew(); for (int i = 0; i < _max; i++) { int v; dictionary.TryGetValue(last, out v); } s2.Stop(); msA += s1.Elapsed.TotalMilliseconds; msB += s2.Elapsed.TotalMilliseconds; Console.Write("."); // // Collect the garbage. // GC.Collect(); } Console.WriteLine(); Console.WriteLine(msA.ToString("0.00") + " ms"); Console.WriteLine(msB.ToString("0.00") + " ms"); Console.Read(); } } Output (Second number is the last key search time elapsed.) .......... 3539.57 ms 3203.58 ms [Faster]

I varied the order of the two tests. This did not affect the performance of the lookups. I added the forced garbage collection call. Otherwise the runtime might start a garbage collection during one of the two tests.
And: The program uses completely random path strings, so results will vary each pass through the loop.

The last key added to the Dictionary was faster to access than the first key that was added. Usually the performance gain was around 10% faster for a Dictionary with ten million separate keys.

Discussion. This section explains why the last key can be accessed faster than the first key. The Dictionary type internally uses a chaining algorithm. Each key you look up is converted to an index through the hash code computation.
Note: Each hash code is not unique. And each index is even less likely to be unique. Collisions are stored in a logical linked list.

Steps. When you add a key to the Dictionary and another key was added at this index, the new key is put in the first place in this logical linked list. And the old key is demoted to the next slot.
Then: When you look up that key, the search algorithm loops through those entries in order, with the last key checked first.

Performance. The performance impact of changing the ordering of the keys you add will likely not be large. For this reason, I emphasize the theory and implementation issues rather than offering a recommendation or suggestion about performance.
Generally: The Dictionary lookup performance is good regardless of ordering if the hash code computation is well-distributed.
However: For certain in-memory database programs, adding keys in order of ascending hit frequency could improve performance.

Summary. We saw a performance test for Dictionary that showed that the first key added to a Dictionary instance is slower to access than the last key. We noted that this is because of the internal chaining algorithm used in the Dictionary.

© 2007-2020 Sam Allen. Every person is special and unique. Send bug reports to info@dotnetperls.com.