Word count. Counting the words in a string can help determine if the string is appropriate for a specific usage in a program. For example, a summary may need to be a certain number of words.
In Rust, we are often focused on performance. A word counting function should iterate over the chars in a string, and it should avoid unnecessary allocations.
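As a minimal sketch of allocation-free iteration, the helper below walks a string with chars(), which decodes one char at a time from the borrowed data. The name count_whitespace is hypothetical and is not part of the example program. Collecting the chars into a Vec first would give the same answer, but it would allocate.

```rust
// Hypothetical helper for illustration: counts ASCII whitespace chars.
fn count_whitespace(s: &str) -> usize {
    // chars() is a lazy iterator over the borrowed string, so no buffer is allocated.
    s.chars().filter(|c| c.is_ascii_whitespace()).count()
}

fn main() {
    let text = "cat, bird, and dog";
    println!("WHITESPACE: {}", count_whitespace(text));
}
```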
Example program. To begin, we introduce the count_words function. This returns a usize, which is the number of words in the argument string.
Finally, we test the chars against ASCII character ranges using the built-in "is_ascii" methods on the char type.
Tip: A word is defined as a sequence of letters, digits and punctuation that follows a whitespace char or begins the string.
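To make the definition concrete, here is a small sketch that applies the same "is_ascii" tests to a few sample chars. The helper name starts_word is an assumption made for this illustration. Note that a non-ASCII char such as 'é' is neither whitespace nor a word start under these tests.

```rust
// Hypothetical helper for illustration: true if a char can begin a word
// under the definition used here (ASCII letter, digit or punctuation).
fn starts_word(c: char) -> bool {
    c.is_ascii_alphabetic() || c.is_ascii_digit() || c.is_ascii_punctuation()
}

fn main() {
    // Classify a letter, a digit, punctuation, a space and a non-ASCII char.
    for c in "a7, é".chars() {
        println!(
            "{:?}: whitespace = {}, starts word = {}",
            c,
            c.is_ascii_whitespace(),
            starts_word(c)
        );
    }
}
```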
fn count_words(s: &str) -> usize {
    let mut total = 0;
    let mut previous = char::MAX;
    for c in s.chars() {
        // If previous char is whitespace, we are on a new word.
        if previous.is_ascii_whitespace() {
            // New word has alphabetic, digit or punctuation start.
            if c.is_ascii_alphabetic() || c.is_ascii_digit() || c.is_ascii_punctuation() {
                total += 1;
            }
        }
        // Set previous.
        previous = c;
    }
    // Count the first word, which has no whitespace before it.
    // Skip this when the string is empty or begins with whitespace.
    if matches!(s.chars().next(), Some(c) if !c.is_ascii_whitespace()) {
        total += 1;
    }
    total
}
fn main() {
    let mut data = String::new();
    data.push_str("cat, ");
    data.push_str("bird, ");
    data.push_str("and dog");
    // Borrow String to call count_words.
    let count = count_words(&data);
    println!(" DATA: {}", data);
    println!("COUNT: {}", count);
}

 DATA: cat, bird, and dog
COUNT: 4
Function results. In our simple example, the number of words is correctly determined. The first word has no whitespace before it, so the loop does not count it. The check after the loop adds it to the total.
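One way to cross-check the result is to compare it against the standard library's split_ascii_whitespace method, which yields each run of non-whitespace chars without allocating. The sketch below is an alternative for comparison, not the method used above. The name count_words_split is hypothetical, and unlike count_words it also counts words that begin with a non-ASCII char.

```rust
// Hypothetical alternative for comparison: count tokens separated by ASCII whitespace.
fn count_words_split(s: &str) -> usize {
    // The iterator yields borrowed sub-slices; count() consumes it without building a Vec.
    s.split_ascii_whitespace().count()
}

fn main() {
    // Include edge cases: leading whitespace, trailing whitespace, empty, whitespace only.
    let samples = ["cat, bird, and dog", "  leading space", "trailing space  ", "", "   "];
    for s in samples {
        println!("{:?}: {}", s, count_words_split(s));
    }
}
```

For the example string this also gives 4. The edge cases in the list are useful inputs to try against any word counting function.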
A summary. Many functions in other languages can be ported to Rust with minimal changes. Here count_words only borrows its argument as a &str and allocates nothing, so ownership and lifetime issues do not complicate the port.
Sam Allen is passionate about computer languages, and he maintains 100% of the material available on this website. He hopes it makes the world a nicer place.
This page was last updated on Mar 14, 2023.