Capacity
Is it worthwhile to set capacities? We are able to specify capacity for C# collections. This optional capacity property adjusts the amount of memory allocated.
Capacity
is specified in the constructor of List
and Dictionary
. Some other types, like StringBuilder
, can also set a capacity.
Here is a benchmark that tests capacity settings. Many Dictionaries in real C# programs have a string
key and around 100-1000 elements.
Count
property at the end.static
class
and use const
fields called ElementCountEstimate
or other fields with the word estimate.using System; using System.Collections.Generic; class Program { const int _m = 100000; static List<string> _values = new List<string>(); public static void Main() { // Add 100 strings for testing. for (int i = 0; i < 100; i++) { _values.Add("value" + i.ToString()); } long t1 = Environment.TickCount; A_Dictionary(); long t2 = Environment.TickCount; B_DictionaryCapacity(); long t3 = Environment.TickCount; C_DictionaryCapacity(); long t4 = Environment.TickCount; D_DictionaryCapacity(); long t5 = Environment.TickCount; E_List(); long t6 = Environment.TickCount; F_ListCapacity(); long t7 = Environment.TickCount; // Write Dictionary times. Console.WriteLine("A_Dictionary: " + (t2 - t1) + " ms"); Console.WriteLine("B_DictionaryCapacity: " + (t3 - t2) + " ms"); Console.WriteLine("C_DictionaryCapacity: " + (t4 - t3) + " ms"); Console.WriteLine("D_DictionaryCapacity: " + (t5 - t4) + " ms"); // Write List times. Console.WriteLine("E_List: " + (t6 - t5) + " ms"); Console.WriteLine("F_ListCapacity: " + (t7 - t6) + " ms"); } static void A_Dictionary() { // No capacity. for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(); foreach (string k in _values) { d.Add(k, 0); } } } static void B_DictionaryCapacity() { // Capacity from collection Count. for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(_values.Count); foreach (string k in _values) { d.Add(k, 0); } } } static void C_DictionaryCapacity() { // Const capacity. for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(100); foreach (string k in _values) { d.Add(k, 0); } } } static void D_DictionaryCapacity() { // Huge capacity (10 times too large). for (int i = 0; i < _m; i++) { var d = new Dictionary<string, int>(1000); foreach (string k in _values) { d.Add(k, 0); } } } static void E_List() { // No capacity. for (int i = 0; i < _m * 5; i++) { var l = new List<string>(); foreach (string k in _values) { l.Add(k); } } } static void F_ListCapacity() { // Exact capacity. for (int i = 0; i < _m * 5; i++) { var l = new List<string>(100); foreach (string k in _values) { l.Add(k); } } } }A_Dictionary: 500 ms B_DictionaryCapacity: 328 ms C_DictionaryCapacity: 329 ms D_DictionaryCapacity: 484 ms E_List: 547 ms F_ListCapacity: 437 ms
When we don't specify a capacity, the buffer will be reallocated as we keep adding elements. This makes populating a Dictionary
or List
much slower.
Should developers bother with Dictionary
or List
capacity? Having the runtime resize the underlying arrays is expensive.
double
the time required to add elements—even in small collections.The memory model of C# programs is complex. But giving programs hints about the sizes of the collections proves to be a substantial optimization.