Go - String Remove HTML Tags

Remove HTML. Suppose a string in a Go program contains HTML markup, but we do not want to keep the markup. We can remove the tags with a for-loop.

With a rune slice, we can build up the runes for the result. We detect markup by testing each rune in the original string for angle brackets.
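A small sketch of why a rune slice, rather than a byte slice, suits this job: a for-range loop over a string yields whole runes, so non-ASCII text survives the round trip. The helper name runesOf is hypothetical and used only for illustration.

```go
package main

import "fmt"

// runesOf is a hypothetical helper: it collects every rune that a
// for-range loop over the string yields.
func runesOf(s string) []rune {
	out := []rune{}
	for _, c := range s {
		out = append(out, c)
	}
	return out
}

func main() {
	// \u00e9 is a single rune that takes 2 bytes in UTF-8.
	s := "<b>h\u00e9llo</b>"
	fmt.Println(len(s))                  // byte count: 13
	fmt.Println(len(runesOf(s)))         // rune count: 12
	fmt.Println(string(runesOf(s)) == s) // true
}
```

Because the angle brackets are single-byte ASCII runes, comparing each rune against '<' and '>' is safe even when the surrounding text contains multi-byte characters.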

Example. We introduce the stripHtml function, which receives a string and returns another string. We call stripHtml from the main() function to test it.

Step 1 We loop over the input string with a for-range loop. This gives us each individual rune in the string.

Step 2 We append each rune that is not inside a tag to the data rune slice. The angle brackets themselves, and every rune between them, are skipped.

Step 3 We convert the rune slice back into a string. This string now contains all non-markup runes.

package main

import (
    "fmt"
)

func stripHtml(source string) string {
    data := []rune{}
    inside := false
    // Step 1: loop over string with range loop.
    for _, c := range source {
        if c == '<' {
            inside = true
            continue
        }
        if c == '>' {
            inside = false
            continue
        }
        // Step 2: append chars not inside markup tags starting and ending with brackets.
        if !inside {
            data = append(data, c)
        }
    }
    // Step 3: return string based on the rune slice.
    return string(data)
}

func main() {
    // Call the stripHtml function.
    input := "<p>Hello <b>world</b>!</p>"
    result := stripHtml(input)
    fmt.Println(input)
    fmt.Println(result)
}

<p>Hello <b>world</b>!</p>
Hello world!

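In the results, we can see that the "p" and "b" tags were removed from the markup. Note that this function can fail on HTML comments: a ">" character inside a comment ends the skipping early, so part of the comment leaks into the output. The same happens with a ">" inside an attribute value. A more powerful parser would be needed for such input.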
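As a rough sketch of one way to cope with comments, the variant below skips everything from "<!--" to "-->" before falling back to bracket matching. The name stripHtmlSkipComments is hypothetical, and this is still not a real parser: scripts, attribute values containing ">" and malformed markup are not handled. Unlike stripHtml, it also keeps a stray ">" that appears outside any tag.

```go
package main

import (
	"fmt"
	"strings"
)

// stripHtmlSkipComments is a hypothetical variant of stripHtml.
// It removes whole comments first, then ordinary tags.
func stripHtmlSkipComments(source string) string {
	var b strings.Builder
	for len(source) > 0 {
		// A comment runs until the first "-->", even if it contains '>'.
		if strings.HasPrefix(source, "<!--") {
			end := strings.Index(source, "-->")
			if end < 0 {
				break // Unterminated comment: drop the rest.
			}
			source = source[end+3:]
			continue
		}
		// An ordinary tag runs until the next '>'.
		if source[0] == '<' {
			end := strings.IndexByte(source, '>')
			if end < 0 {
				break // Unterminated tag: drop the rest.
			}
			source = source[end+1:]
			continue
		}
		// Copy plain text up to the next '<' (or the end of the string).
		next := strings.IndexByte(source, '<')
		if next < 0 {
			next = len(source)
		}
		b.WriteString(source[:next])
		source = source[next:]
	}
	return b.String()
}

func main() {
	input := "<p>a<!-- x > y -->b</p>"
	fmt.Println(stripHtmlSkipComments(input)) // ab
}
```

For the same input, the simpler stripHtml function returns "a y --b", because the ">" inside the comment is treated as the end of a tag.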

Summary. It is possible to use regular expressions to remove markup from strings, but this offers little advantage over a for-loop and is usually slower.
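For comparison, here is a minimal sketch of the regular expression approach using the standard regexp package. The names tagPattern and stripHtmlRegexp are hypothetical, and the pattern has the same limitations as the loop: it treats the first ">" after a "<" as the end of a tag.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern matches '<', then any run of characters other than '>',
// then '>'. It is compiled once and reused on every call.
var tagPattern = regexp.MustCompile(`<[^>]*>`)

// stripHtmlRegexp is a hypothetical regexp-based counterpart to stripHtml.
func stripHtmlRegexp(source string) string {
	return tagPattern.ReplaceAllString(source, "")
}

func main() {
	input := "<p>Hello <b>world</b>!</p>"
	fmt.Println(stripHtmlRegexp(input)) // Hello world!
}
```

The two versions differ on stray brackets: the loop drops a ">" that appears outside a tag, while this pattern leaves it in place.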

Sam Allen is passionate about computer languages, and he maintains 100% of the material available on this website. He hopes it makes the world a nicer place.

This page was last updated on Aug 25, 2023 (new).
