Remove HTML. Often we encounter Strings that contain HTML markup. It is possible to remove this markup with a custom VB.NET Function.
A Function. We develop a custom Function based on the Regex type. It uses a regular expression to strip HTML markup tags. This works on many source strings, provided the markup is well formed.
An example. To begin, this program introduces the StripTags Function, which performs the HTML removal. It calls Regex.Replace with the pattern "<.*?>", which lazily matches each tag, and replaces every match with an empty string.
Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        ' Input.
        Dim html As String = "<p>There was a <b>.NET</b> programmer " +
            "and he stripped the <i>HTML</i> tags.</p>"
        ' Call Function.
        Dim res As String = StripTags(html)
        ' Write.
        Console.WriteLine(res)
    End Sub

    ''' <summary>
    ''' Strip HTML tags.
    ''' </summary>
    Function StripTags(ByVal html As String) As String
        ' Remove HTML tags.
        Return Regex.Replace(html, "<.*?>", "")
    End Function
End Module

There was a .NET programmer and he stripped the HTML tags.
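Lazy versus greedy. The question mark in "<.*?>" matters. The sketch below compares the lazy pattern with a greedy "<.*>" pattern on a short input. The module and function names here (LazyVersusGreedy, StripLazy, StripGreedy) are hypothetical and exist only for this illustration.

```vbnet
Imports System.Text.RegularExpressions

Module LazyVersusGreedy
    ' Lazy: each match stops at the first ">" after a "<".
    Function StripLazy(ByVal html As String) As String
        Return Regex.Replace(html, "<.*?>", "")
    End Function

    ' Greedy: a single match runs from the first "<" to the last ">" on the line.
    Function StripGreedy(ByVal html As String) As String
        Return Regex.Replace(html, "<.*>", "")
    End Function

    Sub Main()
        Dim html As String = "<b>bold</b> and <i>italic</i>"
        ' The lazy version keeps the text between the tags.
        Console.WriteLine(StripLazy(html))
        ' The greedy version removes the text as well, so the length is 0.
        Console.WriteLine(StripGreedy(html).Length)
    End Sub
End Module
```

With the greedy pattern, the text between the first and last tag is lost, which is why the lazy form is used in StripTags.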
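A warning. If the HTML markup is malformed in any way, or has comments, this method may return incorrect results. For example, a comment that contains a ">" character is only partly removed, and a tag that spans several lines is not matched, because "." does not match a newline by default. You may wish to first validate the markup.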
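Failure cases. The following sketch shows two inputs where the "<.*?>" pattern leaves markup behind. The StripTagsSingleline variant is a hypothetical addition, not part of the original program. It passes RegexOptions.Singleline so that "." also matches newlines, which helps with tags split across lines but does not fix the comment case.

```vbnet
Imports System.Text.RegularExpressions

Module WarningDemo
    ' Same pattern as the StripTags Function in the example program.
    Function StripTags(ByVal html As String) As String
        Return Regex.Replace(html, "<.*?>", "")
    End Function

    ' Hypothetical variant: Singleline makes "." match newline characters too.
    Function StripTagsSingleline(ByVal html As String) As String
        Return Regex.Replace(html, "<.*?>", "", RegexOptions.Singleline)
    End Function

    Sub Main()
        ' The match ends at the first ">", so part of the comment remains.
        Dim withComment As String = "<!-- a > b -->text"
        Console.WriteLine(StripTags(withComment))

        ' The opening tag contains a line feed, so the default pattern skips it.
        Dim multiline As String = "<a" & vbLf & "href='x'>link</a>"
        Console.WriteLine(StripTags(multiline))

        ' With Singleline, both tags are removed and only "link" is left.
        Console.WriteLine(StripTagsSingleline(multiline))
    End Sub
End Module
```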
Tip You can validate HTML markup using a simple parser that matches tag characters.
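A validator sketch. One way to read the tip above is a loop that checks that every "<" is closed by a ">" before the next "<" appears. The function name HasBalancedTagChars is hypothetical, and the check is deliberately minimal: it looks only at the bracket characters and does not verify tag names, nesting, or attributes.

```vbnet
Module ValidateDemo
    ' Hypothetical helper: returns True if "<" and ">" strictly alternate.
    Function HasBalancedTagChars(ByVal html As String) As Boolean
        Dim inside As Boolean = False
        For Each c As Char In html
            If c = "<"c Then
                ' A second "<" before a ">" means an unclosed tag.
                If inside Then Return False
                inside = True
            ElseIf c = ">"c Then
                ' A ">" with no open tag is a stray bracket.
                If Not inside Then Return False
                inside = False
            End If
        Next
        ' The input must not end in the middle of a tag.
        Return Not inside
    End Function

    Sub Main()
        Console.WriteLine(HasBalancedTagChars("<p>ok</p>"))      ' True
        Console.WriteLine(HasBalancedTagChars("<p>broken</p"))   ' False
        Console.WriteLine(HasBalancedTagChars("<!-- a > b -->")) ' False
    End Sub
End Module
```

Input that fails this check could be rejected or handled separately before the Regex is applied.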
Summary. A simple way to strip HTML tags is to use the Regex type. Other methods that scan the String and copy characters into a Char array can be more efficient, but they are also more complicated.
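A Char array version. As a rough illustration of the scanning approach mentioned in the summary, the sketch below copies every character that is outside a tag into a Char array. The name StripTagsCharArray is hypothetical, no performance measurement is claimed here, and the same caveats about malformed markup and comments apply. Note that a stray ">" outside any tag is dropped by this version.

```vbnet
Module CharArrayDemo
    ' Hypothetical alternative to StripTags that avoids the Regex type.
    Function StripTagsCharArray(ByVal html As String) As String
        If String.IsNullOrEmpty(html) Then Return String.Empty

        ' The result can never be longer than the input.
        Dim buffer(html.Length - 1) As Char
        Dim count As Integer = 0
        Dim inside As Boolean = False

        For Each c As Char In html
            If c = "<"c Then
                inside = True
            ElseIf c = ">"c Then
                inside = False
            ElseIf Not inside Then
                ' Keep only characters that are outside a tag.
                buffer(count) = c
                count += 1
            End If
        Next

        Return New String(buffer, 0, count)
    End Function

    Sub Main()
        Dim html As String = "<p>There was a <b>.NET</b> programmer " &
            "and he stripped the <i>HTML</i> tags.</p>"
        Console.WriteLine(StripTagsCharArray(html))
    End Sub
End Module
```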
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Mar 20, 2023.