Count Letter Frequencies
This page was last reviewed on Sep 25, 2022.
Dot Net Perls
Counting letter frequencies. Sometimes an analysis of a file's contents is helpful. A Java method can be used to read in a file, count letters, and display the results.
Example file. Let us first consider the example text file. It contains some patterns of letters like 2 "x" characters and one uppercase Z.
This program uses BufferedReader and HashMap. With BufferedReader, it reads in a file line-by-line. We then loop over the line and call charAt to access characters.
Detail We test to see if we have a space—we avoid hashing spaces. We then put the char in the HashMap.
Detail We call this method to get an existing value from the HashMap or 0. Then we put one plus that value.
Finally We use a for-loop on the result of keySet(). We display all characters found and their frequencies in the file.
import java.io.BufferedReader; import java.io.FileReader; import java.io.IOException; import java.util.HashMap; public class Program { public static void main(String[] args) throws IOException { HashMap<Integer, Integer> hash = new HashMap<>(); // Load this text file. BufferedReader reader = new BufferedReader(new FileReader( "C:\\programs\\file.txt")); while (true) { String line = reader.readLine(); if (line == null) { break; } for (int i = 0; i < line.length(); i++) { char c = line.charAt(i); if (c != ' ') { // Increment existing value in HashMap. // ... Start with zero if no key exists. int value = hash.getOrDefault((int) c, 0); hash.put((int) c, value + 1); } } } // Close object. reader.close(); // Display values of all keys. for (int key : hash.keySet()) { System.out.println((char) key + ": " + hash.get(key)); } } }
aaaa bbbbb aaaa bbbbb aaaa bbbbb CCcc xx y y y y y Z
a: 12 b: 15 C: 2 c: 2 x: 2 y: 5 Z: 1
For performance, an array of all possible char values can be used. We can use a char as an index by casting it to an int. Then we increment each char index as it is encountered.
With some inspection, we can see that the results of the program are correct. This kind of program will need to be modified to suit your exact goals. But the general approach is durable.
Hashing, frequencies. This is an important tip—often in programs we can use a HashMap to count how often an element occurs. We can increment the value on each encountered key.
Dot Net Perls is a collection of tested code examples. Pages are continually updated to stay current, with code correctness a top priority.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Sep 25, 2022 (edit).
© 2007-2024 Sam Allen.