Counting the words in a string can help determine if the string is appropriate for a specific usage in a program. For example, a summary may need to be a certain number of words.
In Rust, we are often focused on performance. A word-counting function should iterate over the chars in a string and avoid unnecessary allocation.
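As a rough illustration of the allocation point, here is a sketch that is not the function developed in this article. The two hypothetical helpers below both count ASCII whitespace characters. The first walks the chars() iterator directly, while the second collects every char into a Vec first, which allocates a heap buffer sized to the whole string without changing the result.

```rust
// Hypothetical helper: counts ASCII whitespace by iterating over chars.
// No heap allocation is needed.
fn whitespace_count_iter(s: &str) -> usize {
    s.chars().filter(|c| c.is_ascii_whitespace()).count()
}

// Hypothetical helper: same result, but collects into a Vec<char> first.
// The Vec is an extra allocation proportional to the string length.
fn whitespace_count_vec(s: &str) -> usize {
    let chars: Vec<char> = s.chars().collect();
    chars.iter().filter(|c| c.is_ascii_whitespace()).count()
}
```

Both helpers return 3 for "cat, bird, and dog". The iterator version is the style used for word counting here.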
To begin, we introduce the count_words function. This returns a usize, which is the number of words in the argument string.
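We use a str reference as the argument in count_words. This is ideal, as either a string literal or a String can be passed to the function.
We loop over the characters with chars(). We store the previous char in a "mut" local variable.
We test each character with the "is_ascii" functions in Rust, which are methods on char.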

fn count_words(s: &str) -> usize {
    let mut total = 0;
    let mut previous = char::MAX;
    for c in s.chars() {
        // If previous char is whitespace, we are on a new word.
        if previous.is_ascii_whitespace() {
            // New word has alphabetic, digit or punctuation start.
            if c.is_ascii_alphabetic() || c.is_ascii_digit() || c.is_ascii_punctuation() {
                total += 1;
            }
        }
        // Set previous.
        previous = c;
    }
    if s.len() >= 1 {
        total += 1;
    }
    total
}

fn main() {
    let mut data = String::new();
    data.push_str("cat, ");
    data.push_str("bird, ");
    data.push_str("and dog");
    // Borrow String to call count_words.
    let count = count_words(&data);
    println!(" DATA: {}", data);
    println!("COUNT: {}", count);
}

 DATA: cat, bird, and dog
COUNT: 4

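The test inside the loop can also be pulled out and checked on its own. The sketch below uses a hypothetical helper, starts_word, which is not part of the original program but applies the same is_ascii checks to a pair of chars.

```rust
// Hypothetical helper: true if `c` would begin a new word when it
// follows `previous`, using the same tests as the loop in count_words.
fn starts_word(previous: char, c: char) -> bool {
    previous.is_ascii_whitespace()
        && (c.is_ascii_alphabetic() || c.is_ascii_digit() || c.is_ascii_punctuation())
}
```

With this helper, starts_word(' ', 'c') and starts_word('\n', '7') are true, while starts_word('a', 't') and starts_word(' ', ' ') are false. The initial value char::MAX is not ASCII whitespace, so starts_word(char::MAX, 'c') is also false.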
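In our simple example, the number of words is correctly determined. The loop only counts a word when it follows whitespace, so the first word is not counted inside the loop. The final check adds 1 for any non-empty string to account for it, which assumes the string begins with a word rather than with whitespace.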
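One way to handle strings that begin with whitespace, or contain only whitespace, is sketched below. This is an assumed variant, not the original function: the hypothetical count_words_lenient starts previous as a space, so the first word is counted inside the loop and no final increment is needed.

```rust
// Hypothetical variant of count_words that tolerates leading whitespace.
fn count_words_lenient(s: &str) -> usize {
    let mut total = 0;
    // Treat the start of the string as if it followed whitespace.
    let mut previous = ' ';
    for c in s.chars() {
        // Same word-start test as the original loop.
        if previous.is_ascii_whitespace()
            && (c.is_ascii_alphabetic() || c.is_ascii_digit() || c.is_ascii_punctuation())
        {
            total += 1;
        }
        previous = c;
    }
    total
}
```

This returns 4 for "cat, bird, and dog", 1 for " cat", and 0 for both "   " and the empty string. For printable ASCII input it should agree with the standard library's s.split_ascii_whitespace().count(), which is another way to count words without allocating.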
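Many functions from other languages can be ported to Rust with minimal changes. Here count_words only reads its argument through a shared str reference and returns a plain usize, so the borrow checker raises no complaints and no ownership changes are needed.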