Regex.Split
This C# method separates strings based on a pattern. It handles a delimiter specified as a pattern, such as "\D+", which matches one or more non-digit characters.
Using a Regex yields a greater level of flexibility and power than string.Split. The syntax is more complicated, and performance may be worse.
Here we extract all substrings that are separated by whitespace characters. We could use string.Split. But this version is simpler and can be more easily extended.
The input string is an equation. An operator is a character like "*" that acts on operands. With Regex, we implement a simple tokenizer. Lexical analysis and tokenization are done in many programs.

using System;
using System.Text.RegularExpressions;

// Input string.
string operation = "3 * 5 = 15";

// Part 1: split it on whitespace sequences.
string[] tokens = Regex.Split(operation, @"\s+");

// Part 2: display results.
// ... Now we have each token.
foreach (string token in tokens)
{
    Console.WriteLine(token);
}

3
*
5
=
15
We use Regex.Split to split on all non-digit values in the input string. We then loop through the result strings with a foreach-loop and use int.TryParse.
The input string contains the numbers 10, 20, 40 and 1, and the static Regex.Split method is called with two parameters. The trailing text after the last number produces an empty string in the result, which int.TryParse rejects.

using System;
using System.Text.RegularExpressions;

// String containing numbers.
string sentence = "10 cats, 20 dogs, 40 fish and 1 programmer.";

// Get all digit sequences as strings.
string[] digits = Regex.Split(sentence, @"\D+");

// Now we have each number string.
foreach (string value in digits)
{
    // Parse the value to get the number.
    if (int.TryParse(value, out int number))
    {
        Console.WriteLine(value);
    }
}

10
20
40
1
Here we get all the words that have an initial uppercase letter in a string. The Regex.Split call gets all the words. And the foreach-loop checks the first letters.
The loop relies on ordinary string operations: string.IsNullOrEmpty skips empty entries, and char.IsUpper tests the first letter.

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// String containing uppercased words.
string sentence = "Bob and Michelle are from Indiana.";

// Get all words.
string[] uppercaseWords = Regex.Split(sentence, @"\W");

// Get all uppercased words.
var list = new List<string>();
foreach (string value in uppercaseWords)
{
    // Check the word.
    if (!string.IsNullOrEmpty(value) && char.IsUpper(value[0]))
    {
        list.Add(value);
    }
}

// Write all proper nouns.
foreach (var value in list)
{
    Console.WriteLine(value);
}

Bob
Michelle
Indiana
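The uppercase-word example checks for empty strings because the pattern "\W" matches one non-word character at a time. The following is a minimal sketch of that behavior, using a made-up input string ("Bob, Michelle") that is not part of the example above; it compares "\W" with "\W+".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: a comma followed by a space separates the words.
string input = "Bob, Michelle";

// \W matches a single non-word character, so "," and " " are two
// separate delimiters and an empty string appears between them.
string[] single = Regex.Split(input, @"\W");

// \W+ treats the whole run ", " as one delimiter.
string[] run = Regex.Split(input, @"\W+");

Console.WriteLine(single.Length); // 3: "Bob", "", "Michelle"
Console.WriteLine(run.Length);    // 2: "Bob", "Michelle"
```

With either pattern, a delimiter at the very start or end of the input still yields an empty string, so a check such as string.IsNullOrEmpty remains useful.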
For performance, consider the Split method on the string type instead of regular expressions. That method is more appropriate for precise and predictable input.
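As a rough illustration of the string-based alternative, the tokenizer input from earlier can be split without a regular expression. This sketch assumes the tokens are separated only by space characters; the variable names are chosen for illustration.

```csharp
using System;

// Same input as the tokenizer example.
string operation = "3 * 5 = 15";

// Split on the space character; no regular expression is involved.
// RemoveEmptyEntries drops the empty strings that repeated spaces would produce.
string[] tokens = operation.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

foreach (string token in tokens)
{
    Console.WriteLine(token);
}
```

If the input may also contain tabs or newlines, more separator characters can be added to the array, but at that point the "\s+" pattern may be easier to maintain.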
Also, you can change the static Regex.Split method call into an instance Regex. This enhances performance and reduces memory pressure. Consider the RegexOptions.Compiled enumerated constant for greater performance.

We extracted strings with the Regex.Split method. We used patterns of non-digit characters, whitespace characters, and non-word characters.
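The instance-based approach mentioned in the performance notes might look like the sketch below. The input strings and variable names are assumptions made for illustration. RegexOptions.Compiled adds construction cost, so whether it helps depends on how often the pattern is reused; measuring in the target program is advisable.

```csharp
using System;
using System.Text.RegularExpressions;

// Build the Regex once and reuse it; RegexOptions.Compiled is optional.
Regex nonDigits = new Regex(@"\D+", RegexOptions.Compiled);

// Hypothetical inputs, adapted from the numbers example.
string[] sentences =
{
    "10 cats, 20 dogs",
    "40 fish and 1 programmer."
};

int count = 0;
foreach (string sentence in sentences)
{
    // Instance Split: only the input is passed, the pattern lives in the Regex.
    foreach (string value in nonDigits.Split(sentence))
    {
        // Empty entries fail to parse and are skipped.
        if (int.TryParse(value, out int number))
        {
            Console.WriteLine(number);
            count++;
        }
    }
}
```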