
When your program requires complex text processing for strings or files, the Regex type in the C# language is well suited to the purpose. Built upon a custom text-processing language, the Regex type exposes many methods; on this page, we describe usages and examples of the Regex type.
The Regex type can be used to process or extract parts of HTML strings. The examples linked to here show how you can pull the title or the contents of paragraphs in your HTML documents. You can also remove all HTML tags, although this can be problematic.
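As a rough illustration of pulling a title out of HTML, here is a minimal sketch. The class and method names (HtmlTitleSketch, GetTitle) are hypothetical, and the pattern assumes a plain title tag without attributes; real HTML can easily defeat it.

```csharp
using System;
using System.Text.RegularExpressions;

static class HtmlTitleSketch
{
    // Returns the text between <title> and </title>, or an empty string.
    public static string GetTitle(string html)
    {
        // Singleline lets '.' span newlines; IgnoreCase accepts <TITLE>.
        Match m = Regex.Match(html, @"<title>\s*(.+?)\s*</title>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        string html = "<html><head><title>Regex Examples</title></head></html>";
        Console.WriteLine(GetTitle(html)); // prints "Regex Examples"
    }
}
```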
It is possible to extract only the string content inside quotes in text. This is useful for processing certain output files from other programs; the article demonstrates this usage of the Regex type.
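A minimal sketch of the quote extraction might look like the following. The names QuoteSketch and GetQuoted are hypothetical, and the pattern assumes double quotes with no escaped quotes inside them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class QuoteSketch
{
    // Collects the text found between each pair of double quotes.
    public static List<string> GetQuoted(string text)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, "\"([^\"]*)\""))
        {
            results.Add(m.Groups[1].Value); // group 1 excludes the quotes
        }
        return results;
    }

    static void Main()
    {
        foreach (string s in GetQuoted("name=\"alpha\" value=\"beta\""))
        {
            Console.WriteLine(s); // prints "alpha", then "beta"
        }
    }
}
```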
When you use parentheses in your Regex pattern, you can then access the Groups property of the resulting Match. The problem here is that the index of the groups is somewhat non-intuitive: Groups[0] holds the entire match, and the first parenthesized group is found at Groups[1]. Read the linked article to get a working example.
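The group indexing can be seen in a short sketch. The class name GroupsSketch, the helper GroupValues, and the sample input are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupsSketch
{
    // Returns every group value of the first match, index 0 first.
    public static string[] GroupValues(string input)
    {
        Match m = Regex.Match(input, @"(\w+)=(\d+)");
        var values = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            values[i] = m.Groups[i].Value;
        }
        return values;
    }

    static void Main()
    {
        string[] v = GroupValues("width=100");
        Console.WriteLine(v[0]); // prints "width=100" (the whole match)
        Console.WriteLine(v[1]); // prints "width"     (first parentheses)
        Console.WriteLine(v[2]); // prints "100"       (second parentheses)
    }
}
```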
You will often need to process text files from the disk. The Regex type and its methods can definitely be used for this, but you will need to combine a file input method with the Regex code; this tutorial details ways you can do this.
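One possible way to combine file input with Regex is to read the file line by line and test each line. In this sketch the names FileRegexSketch and MatchingLines are hypothetical, and the temporary file and its contents exist only so the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class FileRegexSketch
{
    // Returns the lines of the file at 'path' that match 'pattern'.
    public static List<string> MatchingLines(string path, string pattern)
    {
        var regex = new Regex(pattern); // built once, reused for every line
        var hits = new List<string>();
        foreach (string line in File.ReadLines(path))
        {
            if (regex.IsMatch(line))
            {
                hits.Add(line);
            }
        }
        return hits;
    }

    static void Main()
    {
        // Write a small sample file so the sketch is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "error 12", "ok", "error 7" });

        foreach (string line in MatchingLines(path, @"^error \d+$"))
        {
            Console.WriteLine(line); // prints "error 12", then "error 7"
        }
        File.Delete(path);
    }
}
```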
Are Regex matches fast? Unfortunately, Regex usage often results in slower code than imperative loops. However, the two tutorials shown here provide ways to optimize Regex performance, or to entirely replace Regex with a switch construct.
See Validate Characters in String.
The Escape method on the Regex type can be used to change a user input to a valid Regex pattern: the method assumes no metacharacters were intended, and the input string should be literal characters only. Please see the article for a complete example.
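The effect of Regex.Escape can be sketched as follows. The class name EscapeSketch, the helper ContainsLiteral, and the sample strings are assumptions chosen to show how an unescaped input is read as a pattern.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeSketch
{
    // Treats 'userInput' as literal text when searching 'text'.
    public static bool ContainsLiteral(string text, string userInput)
    {
        return Regex.IsMatch(text, Regex.Escape(userInput));
    }

    static void Main()
    {
        string input = "1+1=2?";
        Console.WriteLine(Regex.Escape(input)); // prints 1\+1=2\?

        // Unescaped, '+' and '?' act as quantifiers, so this matches.
        Console.WriteLine(Regex.IsMatch("1111=2", input));    // prints True
        // Escaped, only the literal characters match.
        Console.WriteLine(ContainsLiteral("1111=2", input));  // prints False
        Console.WriteLine(ContainsLiteral("what is 1+1=2?", input)); // prints True
    }
}
```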
The Regex.Match method is one of the most useful ones on the Regex type. The article here describes its use with some example patterns that were useful in the real world.
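A small sketch of Regex.Match is shown here. The class name MatchSketch, the helper FirstNumber, and the sample path are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchSketch
{
    // Returns the first run of digits in 'input', or null if none exists.
    public static string FirstNumber(string input)
    {
        Match m = Regex.Match(input, @"\d+");
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        Match m = Regex.Match("api/1234/items", @"\d+");
        if (m.Success)
        {
            Console.WriteLine(m.Value); // prints "1234"
            Console.WriteLine(m.Index); // prints 4
        }
    }
}
```

Checking the Success property before reading Value avoids treating a failed match as an empty result.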
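The IsMatch version is different from Match in that it only returns a boolean answer that tells you if the pattern matches. It gives you less information than Match, which makes it less useful for extraction but convenient for simple validation; it is fully documented in the linked article.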
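IsMatch fits simple validation, as in this sketch. The names IsMatchSketch and IsAlphanumeric are hypothetical, and the pattern is just one assumed rule (ASCII letters and digits only).

```csharp
using System;
using System.Text.RegularExpressions;

static class IsMatchSketch
{
    // True when 'input' consists only of ASCII letters and digits.
    public static bool IsAlphanumeric(string input)
    {
        // \z anchors at the very end, so a trailing newline is rejected.
        return Regex.IsMatch(input, @"^[a-zA-Z0-9]+\z");
    }

    static void Main()
    {
        Console.WriteLine(IsAlphanumeric("abc123"));  // prints True
        Console.WriteLine(IsAlphanumeric("abc 123")); // prints False
    }
}
```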
What if you have to replace a certain pattern of text with some other text? The Regex.Replace method solves this problem well: you can replace strings that match a pattern with a simple string, or with a value that is determined through a computation with MatchEvaluator.
See Regex.Replace and MatchEvaluator.
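Both forms of replacement can be sketched briefly. The class name ReplaceSketch and its two helpers are hypothetical; the second helper passes a lambda where a MatchEvaluator delegate is expected.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceSketch
{
    // Replaces every run of digits with a fixed string.
    public static string MaskDigits(string input)
    {
        return Regex.Replace(input, @"\d+", "#");
    }

    // Uppercases the first letter of each word via a MatchEvaluator.
    public static string Capitalize(string input)
    {
        return Regex.Replace(input, @"\b[a-z]", m => m.Value.ToUpperInvariant());
    }

    static void Main()
    {
        Console.WriteLine(MaskDigits("a1b22"));       // prints "a#b#"
        Console.WriteLine(Capitalize("hello world")); // prints "Hello World"
    }
}
```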
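Do you need to extract substrings from your text that contain only certain characters, such as certain digits or letters? The Split method breaks the input apart wherever the pattern matches and returns a string array containing the substrings between those matches; when the pattern matches the characters you do not want, the array holds the parts you do want. Its usage solves complicated text problems.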
See Regex.Split Method Examples.
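Here is a hedged sketch of Regex.Split that keeps only the digit runs. The names SplitSketch and DigitRuns are hypothetical. The pattern \D+ matches the non-digit separators, and empty entries (which appear when the input starts or ends with a separator) are filtered out.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitSketch
{
    // Splits on runs of non-digits and drops any empty entries.
    public static string[] DigitRuns(string input)
    {
        return Regex.Split(input, @"\D+")
                    .Where(s => s.Length > 0)
                    .ToArray();
    }

    static void Main()
    {
        foreach (string s in DigitRuns("10 cats, 20 dogs, 30"))
        {
            Console.WriteLine(s); // prints "10", "20", "30"
        }
    }
}
```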
Lowercase and uppercase letters are distinct in the Regex text language. You can, however, use a RegexOptions enumerated constant to change the engine's behavior so that the letters 'A' and 'a' are treated as equal; this article describes the approach.
See RegexOptions.IgnoreCase for Case-Insensitive Regex.
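The difference RegexOptions.IgnoreCase makes can be seen in a few lines. The class and method names here are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class IgnoreCaseSketch
{
    // Matches 'pattern' against 'input' without regard to letter case.
    public static bool MatchesAnyCase(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(Regex.IsMatch("HELLO", "hello")); // prints False
        Console.WriteLine(MatchesAnyCase("HELLO", "hello")); // prints True
    }
}
```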
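You can change how the Regex type acts upon newlines using RegexOptions. For example, RegexOptions.Multiline makes the ^ and $ metacharacters match at the start and end of each line rather than only at the start and end of the whole string. For more details, check the article linked here.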
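To make the newline behavior concrete, this sketch counts lines that consist of a single word, with and without RegexOptions.Multiline. The names MultilineSketch and CountWordLines are hypothetical, and the sample text uses "\n" line endings.

```csharp
using System;
using System.Text.RegularExpressions;

static class MultilineSketch
{
    // Counts matches of a whole-line word pattern under the given options.
    public static int CountWordLines(string text, RegexOptions options)
    {
        return Regex.Matches(text, @"^\w+$", options).Count;
    }

    static void Main()
    {
        string text = "one\ntwo\nthree";
        // Without Multiline, ^ and $ anchor to the whole string only.
        Console.WriteLine(CountWordLines(text, RegexOptions.None));      // prints 0
        // With Multiline, ^ and $ also match at each line boundary.
        Console.WriteLine(CountWordLines(text, RegexOptions.Multiline)); // prints 3
    }
}
```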
This article describes a specific way you can use the terminating metacharacter ($), which anchors a match to the end of the string, in a Regex pattern to remove the ending part of a string. The article might provide clues whenever you need to act upon the ending part of a string.
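A small sketch of removing an ending with the $ anchor follows. The names EndAnchorSketch and RemoveTxtSuffix, and the choice of a ".txt" suffix, are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class EndAnchorSketch
{
    // Removes a trailing ".txt" only when it ends the string.
    public static string RemoveTxtSuffix(string name)
    {
        return Regex.Replace(name, @"\.txt$", "");
    }

    static void Main()
    {
        Console.WriteLine(RemoveTxtSuffix("report.txt"));     // prints "report"
        Console.WriteLine(RemoveTxtSuffix("notes.txt.bak"));  // prints "notes.txt.bak"
    }
}
```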
If you are reading this, you are well acquainted with numbers. But how can you handle numbers in text strings using the Regex type in the C# language? These two examples show how you can get and remove numbers from strings using the Regex type.
See Remove Numbers From String.
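Getting and removing numbers can be sketched in two short helpers. The class and method names are hypothetical, and "number" here means a run of decimal digits.

```csharp
using System;
using System.Text.RegularExpressions;

static class NumbersSketch
{
    // Returns the first run of digits, or an empty string if none exists.
    public static string GetFirstNumber(string input)
    {
        return Regex.Match(input, @"\d+").Value;
    }

    // Returns 'input' with every digit removed.
    public static string RemoveNumbers(string input)
    {
        return Regex.Replace(input, @"\d", "");
    }

    static void Main()
    {
        Console.WriteLine(GetFirstNumber("item 42 of 100")); // prints "42"
        Console.WriteLine(RemoveNumbers("a1b2c3"));          // prints "abc"
    }
}
```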
Whitespace isn't actually white, but it is often not needed for future processing of data. In this linked article, we demonstrate how you can Trim whitespace using Regex methods; this is an alternative to the string methods.
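Trimming with Regex might be sketched like this. The names TrimSketch and TrimBoth are hypothetical. For this input the result equals what the string Trim method returns.

```csharp
using System;
using System.Text.RegularExpressions;

static class TrimSketch
{
    // Removes whitespace from the start and the end of 'input'.
    public static string TrimBoth(string input)
    {
        return Regex.Replace(input, @"^\s+|\s+$", "");
    }

    static void Main()
    {
        string result = TrimBoth("  hello world \t\n");
        Console.WriteLine("[" + result + "]"); // prints "[hello world]"
    }
}
```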
What does the star character in Regex patterns do? The star is also known as a Kleene closure in language theory. Check out the article linked here for more details about the star.
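The star means "zero or more of the preceding element", which a short sketch can show. The names StarSketch and IsAbStarC are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class StarSketch
{
    // True for "a", then zero or more "b", then "c", and nothing else.
    public static bool IsAbStarC(string input)
    {
        return Regex.IsMatch(input, @"^ab*c$");
    }

    static void Main()
    {
        Console.WriteLine(IsAbStarC("ac"));    // prints True  (zero b's)
        Console.WriteLine(IsAbStarC("abbbc")); // prints True  (three b's)
        Console.WriteLine(IsAbStarC("adc"));   // prints False
    }
}
```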
Using a static Regex instance can improve performance and even simplify your Regex code. The tutorial shown here demonstrates how you can use a static Regex.
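A static Regex might be declared as in this sketch, so the pattern is constructed once and shared by every call. The names StaticRegexSketch, Digits, and CountNumbers are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class StaticRegexSketch
{
    // Constructed once when the class is first used, then reused.
    static readonly Regex Digits = new Regex(@"\d+");

    // Counts the runs of digits in 'input' using the shared instance.
    public static int CountNumbers(string input)
    {
        return Digits.Matches(input).Count;
    }

    static void Main()
    {
        Console.WriteLine(CountNumbers("10 cats, 20 dogs")); // prints 2
        Console.WriteLine(CountNumbers("no digits"));        // prints 0
    }
}
```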
Another usage of the Regex type in the C# language is to count words in strings. The article here shows how you can implement a word count method that is very close to that present in Microsoft Word 2007. This works on English text.
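As a rough approximation only, a word count can treat each run of non-whitespace characters as a word. This sketch is not claimed to reproduce the linked article's method or Microsoft Word's exact rules; the names WordCountSketch and CountWords are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordCountSketch
{
    // Counts runs of non-whitespace characters as words.
    public static int CountWords(string text)
    {
        return Regex.Matches(text, @"\S+").Count;
    }

    static void Main()
    {
        Console.WriteLine(CountWords("The quick  brown fox.")); // prints 4
    }
}
```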