English text uses some letters more than others. For example, the lowercase "e" is often the most common. We can compute a table of letter frequencies to measure this.
We can perform letter frequency analysis with a method in the C# language. This may be part of a project or required for simple textual analysis.
The first part of the solution is declaring an array to store the frequencies. The char type is 2 bytes (16 bits), which means it has 65536 possible values, from 0 to 65535.
We use each char as an index into the array, so the array needs char.MaxValue + 1 elements. The char.MaxValue constant is defined in .NET and equals 65535.
File.ReadAllText() reads the entire file and places its contents into a string.
In the foreach loop, we cast each char to int. We increment each "slot" for the letters we find.
The Console.WriteLine statement prints the results to the screen.

    using System;
    using System.IO;

    class Program
    {
        static void Main()
        {
            // Array to store frequencies: one slot per possible char value (0 to 65535).
            int[] c = new int[char.MaxValue + 1];

            // Read entire text file.
            string s = File.ReadAllText("text.txt");

            // Iterate over each character.
            foreach (char t in s)
            {
                // Increment table.
                c[(int)t]++;
            }

            // Write all letters and digits found, in order of character value.
            for (int i = 0; i <= char.MaxValue; i++)
            {
                if (c[i] > 0 && char.IsLetterOrDigit((char)i))
                {
                    Console.WriteLine("Letter: {0} Frequency: {1}", (char)i, c[i]);
                }
            }
        }
    }

Contents of text.txt:

    aaaa bbbbb aaaa bbbbb aaaa bbbbb CCcc xx y y y y y Z

Output:

    Letter: C Frequency: 2
    Letter: Z Frequency: 1
    Letter: a Frequency: 12
    Letter: b Frequency: 15
    Letter: c Frequency: 2
    Letter: x Frequency: 2
    Letter: y Frequency: 5
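The same counting step can also be wrapped in a method that works on any string and returns the table sorted by frequency. This is only a sketch: the LetterFrequency class and its Count method are hypothetical names, not part of the program above, and the descending sort is an added choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LetterFrequency
{
    // Hypothetical helper: counts letters and digits in a string
    // with an array-based table, then sorts by frequency.
    public static List<KeyValuePair<char, int>> Count(string text)
    {
        // One slot for every possible char value (0 through char.MaxValue).
        int[] counts = new int[char.MaxValue + 1];
        foreach (char ch in text)
        {
            counts[ch]++;
        }

        // Collect only the letters and digits that actually occurred.
        var result = new List<KeyValuePair<char, int>>();
        for (int i = 0; i <= char.MaxValue; i++)
        {
            if (counts[i] > 0 && char.IsLetterOrDigit((char)i))
            {
                result.Add(new KeyValuePair<char, int>((char)i, counts[i]));
            }
        }

        // Most frequent first; ties are broken by character value.
        return result
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key)
            .ToList();
    }
}
```

For the string "aaaa bbbbb CCcc", the first entry returned would be 'b' with a count of 5, followed by 'a' with a count of 4.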
The most important parts of the solution are how the array is declared and how the char.MaxValue constant is used to size it. An array indexed by character value is a simple way to record counts.
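An array is not the only option for recording counts. As a point of comparison, here is a minimal sketch that uses a Dictionary instead, so memory is used only for characters that actually appear. The SparseLetterFrequency class name is hypothetical, and this is a swapped-in alternative, not the approach used in the program above.

```csharp
using System;
using System.Collections.Generic;

static class SparseLetterFrequency
{
    // Hypothetical helper: counts letters and digits with a Dictionary
    // rather than a fixed-size array.
    public static Dictionary<char, int> Count(string text)
    {
        var counts = new Dictionary<char, int>();
        foreach (char ch in text)
        {
            // Skip spaces, punctuation, and other non-alphanumeric characters.
            if (!char.IsLetterOrDigit(ch))
            {
                continue;
            }

            // TryGetValue leaves current at 0 when the key is missing.
            int current;
            counts.TryGetValue(ch, out current);
            counts[ch] = current + 1;
        }
        return counts;
    }
}
```

The array version avoids hashing and has a fixed cost per character, while the Dictionary version avoids allocating a slot for every possible char value. Which one fits better likely depends on the size of the input.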
We counted the letter frequencies in a string or file. This is handy code to have for text-based processing. It can be used to determine if a string is likely English or not.
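One rough way to apply this to the "likely English" question is to check how much of the text is made up of letters that are commonly cited as the most frequent in English (e, t, a, o, i, n). The sketch below is only an illustration: the EnglishGuess class, its method names, and the 0.4 threshold are all assumptions, and the threshold has not been tuned against real data.

```csharp
using System;

static class EnglishGuess
{
    // Hypothetical heuristic: the fraction of letters in the text
    // that belong to the set "etaoin" (case-insensitive).
    public static double CommonLetterShare(string text)
    {
        const string common = "etaoin";
        int letters = 0;
        int hits = 0;
        foreach (char ch in text)
        {
            // Only letters take part in the ratio.
            if (!char.IsLetter(ch))
            {
                continue;
            }
            letters++;
            if (common.IndexOf(char.ToLowerInvariant(ch)) >= 0)
            {
                hits++;
            }
        }
        // Avoid dividing by zero when the text has no letters.
        return letters == 0 ? 0.0 : (double)hits / letters;
    }

    // The 0.4 threshold is an arbitrary assumption for illustration.
    public static bool LooksLikeEnglish(string text)
    {
        return CommonLetterShare(text) >= 0.4;
    }
}
```

A heuristic like this is unreliable on very short strings, and other languages written in the Latin alphabet may also pass the check, so it should be treated as a first filter at most.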