Count Letter Frequencies
This page was last reviewed on Nov 23, 2021.
Dot Net Perls
Letter frequencies. English text contains some letters more than others—for example, the lowercase "e" is often the most common. It is possible to compute a table of letter frequencies.
C# method. We can perform letter frequency analysis with a method in the C# language. This may be part of a project or required for simple textual analysis.
Example. The first part of the solution is declaring an array to store the frequencies. The char type is 2 bytes, which means it contains 65535 elements to store.
char Array
int Array
Part 1 We use the new char syntax and initialize the array size to char.MaxValue, which is a constant value defined in the .NET Framework.
Part 2 We read in the file. File.ReadAllText() efficiently reads the file and places its contents into a string.
Next We add up frequencies by iterating over each letter, and casting it to an int. We increment each "slot" for the letters we find.
Loop, String Chars
Finally We display all letter or digit characters we found. The Console.WriteLine statement actually prints the results to the screen.
using System; using System.IO; class Program { static void Main() { // Array to store frequencies. int[] c = new int[(int)char.MaxValue]; // Read entire text file. string s = File.ReadAllText("text.txt"); // Iterate over each character. foreach (char t in s) { // Increment table. c[(int)t]++; } // Write all letters found. for (int i = 0; i < (int)char.MaxValue; i++) { if (c[i] > 0 && char.IsLetterOrDigit((char)i)) { Console.WriteLine("Letter: {0} Frequency: {1}", (char)i, c[i]); } } } }
aaaa bbbbb aaaa bbbbb aaaa bbbbb CCcc xx y y y y y Z
Letter: C Frequency: 2 Letter: Z Frequency: 1 Letter: a Frequency: 12 Letter: b Frequency: 15 Letter: c Frequency: 2 Letter: x Frequency: 2 Letter: y Frequency: 5
Some notes. Most important in the solution is how the array is declared and the constants are used. An array can be used to record counts.
Info The sections in the code show the input file, and the frequency counts, so we can easily see the method is correct.
A summary. We counted the letter frequencies in a string or file. This is handy code to have for text-based processing. It can be used to determine if a string is likely English or not.
Dot Net Perls is a collection of tested code examples. Pages are continually updated to stay current, with code correctness a top priority.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Nov 23, 2021 (rewrite).
© 2007-2024 Sam Allen.