This sentence has a certain number of words. But how many? With a Go method, we can count sequences of non-whitespace characters—these are words.
With a word count method, we can generate statistics about text, and we can verify that the method is correct by running it on strings whose word counts we already know. A regexp can be used for accurate word counting.
To write a simple word count func in Go, we can use regexp. Think of a series of words: the one thing we can do to detect a word is find a sequence of characters that are not whitespace (letters, digits, and any attached punctuation).

We use the regexp \S metacharacter to detect these. In WordCount we invoke FindAllString with a -1 argument, which finds all matches. We then return the length of the results. This is our entire word counting method.

```go
package main

import (
	"fmt"
	"regexp"
)

func WordCount(value string) int {
	// Match non-space character sequences.
	re := regexp.MustCompile(`[\S]+`)

	// Find all matches and return count.
	results := re.FindAllString(value, -1)
	return len(results)
}

func main() {
	// This has 10 words.
	fmt.Println(WordCount("To be or not to be, that is the question."))
	// This has 1 word.
	fmt.Println(WordCount("Hello"))
	// This has 2 words.
	fmt.Println(WordCount("Hello friend"))
}
```

```
10
1
2
```
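The calls in main can be turned into a repeatable check. The sketch below is one possible way to do that, not part of the original program: the `wordCase` type and the `failures` helper are hypothetical names introduced here, and the cross-check against `strings.Fields` assumes plain ASCII whitespace in the inputs, since `\S` and `strings.Fields` differ slightly on some unusual space characters.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordRe matches runs of non-whitespace characters. It is compiled once
// at package level instead of on every call.
var wordRe = regexp.MustCompile(`\S+`)

// WordCount counts runs of non-whitespace characters, as in the article.
func WordCount(value string) int {
	return len(wordRe.FindAllString(value, -1))
}

// wordCase pairs an input with the count we expect (hypothetical type).
type wordCase struct {
	input string
	want  int
}

// failures returns the inputs for which WordCount disagrees with the
// expected count or with a whitespace split done by strings.Fields.
func failures(cases []wordCase) []string {
	var failed []string
	for _, c := range cases {
		got := WordCount(c.input)
		if got != c.want || got != len(strings.Fields(c.input)) {
			failed = append(failed, c.input)
		}
	}
	return failed
}

func main() {
	cases := []wordCase{
		{"To be or not to be, that is the question.", 10},
		{"Hello", 1},
		{"Hello friend", 2},
		{"", 0},
	}
	// Prints "0 failing cases" when every count matches.
	fmt.Println(len(failures(cases)), "failing cases")
}
```

Keeping the expected counts in a slice makes it easy to add awkward inputs later, such as strings with tabs or repeated spaces.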
With some simple string tests, we can see that the results appear correct. The string "Hello" has just 1 word. The string "Hello friend" has 2.
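An important thing to get right in word counting is how punctuation separates words. A double-hyphen can separate 2 words, and the regexp must handle this case. A pattern that matches only runs of non-whitespace characters counts "one--two" as a single word.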
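One possible way to handle the double-hyphen case is sketched below. It is an illustration under stated assumptions, not the article's method: `WordCountHyphen` and `hyphenWordRe` are hypothetical names, and the pattern assumes that a single hyphen joins parts of one word while two or more hyphens in a row separate words.

```go
package main

import (
	"fmt"
	"regexp"
)

// hyphenWordRe matches a run of characters that are neither whitespace nor
// hyphens, optionally followed by more such runs joined by a single hyphen.
// A double hyphen therefore ends one match and lets the next one begin.
var hyphenWordRe = regexp.MustCompile(`[^\s-]+(?:-[^\s-]+)*`)

// WordCountHyphen is a hypothetical variant of WordCount that treats "--"
// as a word separator while keeping "well-known" as one word.
func WordCountHyphen(value string) int {
	return len(hyphenWordRe.FindAllString(value, -1))
}

func main() {
	fmt.Println(WordCountHyphen("one--two"))                  // 2
	fmt.Println(WordCountHyphen("a well-known fact--really")) // 4
	fmt.Println(WordCountHyphen("Hello friend"))              // 2
}
```

A side effect of this pattern is that a token made only of hyphens, such as a free-standing "--", is not counted as a word at all.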
For an easy-to-write and fairly accurate word counting method, counting sequences of 1 or more non-whitespace chars is a good option. More advanced approaches are possible.
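As one example of a different approach, the sketch below counts runs of letters or digits instead of runs of non-whitespace characters. The name `WordCountLetters` is hypothetical, and the choice of the Unicode classes `\p{L}` (letters) and `\p{N}` (numbers) is an assumption made for illustration. Every punctuation mark becomes a separator here, so the result differs from WordCount on some inputs.

```go
package main

import (
	"fmt"
	"regexp"
)

// letterDigitRe matches runs of Unicode letters or numbers.
var letterDigitRe = regexp.MustCompile(`[\p{L}\p{N}]+`)

// WordCountLetters is a hypothetical variant that counts runs of letters
// or digits, so all punctuation (hyphens, commas, apostrophes) splits words.
func WordCountLetters(value string) int {
	return len(letterDigitRe.FindAllString(value, -1))
}

func main() {
	fmt.Println(WordCountLetters("To be or not to be, that is the question.")) // 10
	fmt.Println(WordCountLetters("one--two"))                                  // 2
	// The apostrophe splits this into "don" and "t".
	fmt.Println(WordCountLetters("don't")) // 2
}
```

The apostrophe case shows the trade-off: stricter separators fix the double-hyphen problem but split contractions, so the right pattern depends on the text being counted.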