C# Regex.Match Examples

The Regex.Match method in C# isolates part of a string based on the patterns surrounding it. Here we see the Match method and several ways of using it, with sample input and output.

  Input string: /content/some-page.aspx
Required match: some-page

  Input string: /content/alternate-1.aspx
Required match: alternate-1

  Input string: /images/something.png
Required match: -

Using Regex.Match method

Here we see how to match the filename in a directory path with Regex. Note that this pattern is stricter about the acceptable characters than many approaches are. The character class in the pattern, which is the second argument to Regex.Match, lists exactly which characters the filename may contain.

--- Program that uses Regex.Match (C#) ---

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // First we see the input string.
        string input = "/content/alternate-1.aspx";

        // Here we call Regex.Match.
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
            RegexOptions.IgnoreCase);

        // Here we check the Match instance.
        if (match.Success)
        {
            // Finally, we get the Group value and display it.
            string key = match.Groups[1].Value;
            Console.WriteLine(key);
        }
    }
}

--- Output of the program ---

alternate-1

Overview of the example. The pattern is written as a verbatim string literal, prefixed with the @ symbol. In a verbatim literal the backslash is not a C# escape character, so regex escapes such as \. and \- can be written directly instead of being doubled.

Pattern information. The pattern starts with "content/". The group, which is in parentheses, must come directly after the "content/" string. The symbols inside [ and ] are ranges of characters or single characters; these are the allowed characters in the group, and the + requires one or more of them. The pattern ends with \.aspx$, so the input must end with the literal ".aspx" extension.

What it captures from the string. The text matched by the parenthesized part of the pattern is collected as a Group. We first check that the match succeeded with match.Success, and then read the captured text with Groups[1].Value.
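To tie the pattern back to the three sample inputs at the top of the article, here is a minimal sketch that runs each of them through the same expression. The SampleInputs class and its GetKey method are hypothetical names introduced only for this illustration; GetKey returns "-" when there is no match, as in the third sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original program.
static class SampleInputs
{
    // Returns the captured key, or "-" when the input does not match.
    public static string GetKey(string input)
    {
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
            RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : "-";
    }
}

class Program
{
    static void Main()
    {
        string[] inputs =
        {
            "/content/some-page.aspx",
            "/content/alternate-1.aspx",
            "/images/something.png"
        };

        // Print each input next to the key that was captured from it.
        foreach (string input in inputs)
        {
            Console.WriteLine(input + " -> " + SampleInputs.GetKey(input));
        }
    }
}
```

The first two inputs print some-page and alternate-1. The third prints "-", because the path contains neither "content/" nor the ".aspx" extension.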

Captured groups start at index 1

It is important to note that captured groups in the Groups collection on Match objects start at index 1. The collection itself is zero-based, like most C# collections, but Groups[0] always holds the entire matched text, so the first parenthesized group is Groups[1]. We must remember this when reading captures.
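The following sketch prints the first two entries of the Groups collection for the example input. It shows that Groups[0] holds the entire matched text, while the first parenthesized group is at index 1. The GroupIndexDemo class is a hypothetical name used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original program.
static class GroupIndexDemo
{
    // Runs the article's pattern against the input and returns the Match.
    public static Match Run(string input)
    {
        return Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$");
    }
}

class Program
{
    static void Main()
    {
        Match match = GroupIndexDemo.Run("/content/alternate-1.aspx");

        // Groups[0] is the whole match; Groups[1] is the first capture.
        Console.WriteLine(match.Groups[0].Value); // content/alternate-1.aspx
        Console.WriteLine(match.Groups[1].Value); // alternate-1
        Console.WriteLine(match.Groups.Count);    // 2
    }
}
```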

Using ToLower instead

Here I found that calling ToLower on the input instead of passing RegexOptions.IgnoreCase to the Regex yielded a 10% or higher improvement. Since I needed a lowercase result anyway, calling the C# string ToLower method first was also simpler.

See ToLower String Method.

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // This is the input string.
        string input = "/content/alternate-1.aspx";

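        // Here we lowercase our input first, so IgnoreCase is not needed.
        input = input.ToLower();
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$");

        // Display the captured group if the match succeeded.
        if (match.Success)
        {
            Console.WriteLine(match.Groups[1].Value);
        }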
    }
}

Using static Regex instance

Here we see that using a Regex instance object is faster than calling the static Regex.Match method. When the same pattern is used repeatedly, prefer an instance object stored in a static field. It can be shared throughout the entire project.

--- Program that uses static Regex (C#) ---

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // The input string again.
        string input = "/content/alternate-1.aspx";

        // This calls the static method specified.
        Console.WriteLine(RegexUtil.MatchKey(input));
    }
}

static class RegexUtil
{
    static readonly Regex _regex = new Regex(@"/content/([a-z0-9\-]+)\.aspx$");

    /// <summary>
    /// This returns the key that is matched within the input.
    /// </summary>
    public static string MatchKey(string input)
    {
        Match match = _regex.Match(input.ToLower());
        if (match.Success)
        {
            return match.Groups[1].Value;
        }
        else
        {
            return null;
        }
    }
}

--- Output of the program ---

alternate-1

Explanation. The static class RegexUtil stores a Regex instance in a static field, so it can be used project-wide. We initialize it inline. The class exposes a MatchKey method, which returns the key we want from the input value, or null when the input does not match.

Pattern description. It uses a letter range. In this code the Regex has the "A-Z" range removed, because the string is already lowercased before matching. I found that removing as many options from the Regex as possible boosted performance.

Optimizing Regexes

You can add the RegexOptions.Compiled flag to your regular expressions for a substantial performance gain at runtime. However, this makes your program start up slower, because the expression is compiled when the Regex is constructed. In this example, RegexOptions.Compiled yielded 30% faster performance.

See Regex Performance.
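As a rough sketch of how the flag is applied and how its effect could be measured, the program below builds the same pattern with and without RegexOptions.Compiled and times a loop of Match calls with Stopwatch. The CompiledDemo class, its Time method, and the iteration count are assumptions made for this illustration, not the benchmark used for the figure above. The timings vary by machine and runtime, so no expected numbers are shown.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original program.
static class CompiledDemo
{
    const string Pattern = @"/content/([a-z0-9\-]+)\.aspx$";

    // The same pattern, without and with the Compiled flag.
    public static readonly Regex Interpreted = new Regex(Pattern);
    public static readonly Regex Compiled = new Regex(Pattern, RegexOptions.Compiled);

    // Times the given number of Match calls and returns elapsed milliseconds.
    public static long Time(Regex regex, string input, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.Match(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}

class Program
{
    static void Main()
    {
        string input = "/content/alternate-1.aspx";

        // Both instances capture the same key.
        Console.WriteLine(CompiledDemo.Compiled.Match(input).Groups[1].Value);

        // Arbitrary iteration count chosen for illustration.
        const int iterations = 100000;
        Console.WriteLine("Interpreted ms: " +
            CompiledDemo.Time(CompiledDemo.Interpreted, input, iterations));
        Console.WriteLine("Compiled ms: " +
            CompiledDemo.Time(CompiledDemo.Compiled, input, iterations));
    }
}
```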

Consider RegexOptions.RightToLeft

With this code, I found that using RegexOptions.RightToLeft made the pattern slightly faster as well. Because the pattern is anchored to the end of the input, the expression engine has fewer characters to evaluate when it starts from the right. Depending on the pattern and the input, this option could slow down or speed up your Regex.
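A minimal sketch of the option in use follows, based on the RegexUtil class shown earlier. The RightToLeftUtil name is hypothetical. For this pattern the captured value is the same as before; only the direction in which the engine scans the input changes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant of RegexUtil that scans from the end of the input.
static class RightToLeftUtil
{
    static readonly Regex _regex = new Regex(@"/content/([a-z0-9\-]+)\.aspx$",
        RegexOptions.RightToLeft);

    // Returns the matched key, or null when the input does not match.
    public static string MatchKey(string input)
    {
        Match match = _regex.Match(input.ToLower());
        return match.Success ? match.Groups[1].Value : null;
    }
}

class Program
{
    static void Main()
    {
        // Prints: alternate-1
        Console.WriteLine(RightToLeftUtil.MatchKey("/content/alternate-1.aspx"));
    }
}
```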

Regex.IsMatch method

The Regex type has a similar method called IsMatch, which can be called as Regex.IsMatch. Instead of a Match object, it returns a boolean value that indicates whether the pattern was found in the input. It reduces the result of the match to that boolean by calling the Run method internally in the .NET Framework code.

See Regex.IsMatch Method.
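When only a yes-or-no answer is needed, a sketch like the following may be enough. It tests two of the sample inputs against the article's pattern. The IsMatchDemo class and its IsContentPage method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original program.
static class IsMatchDemo
{
    // Returns true when the input ends with a content page path.
    public static bool IsContentPage(string input)
    {
        return Regex.IsMatch(input, @"content/([A-Za-z0-9\-]+)\.aspx$");
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(IsMatchDemo.IsContentPage("/content/alternate-1.aspx")); // True
        Console.WriteLine(IsMatchDemo.IsContentPage("/images/something.png"));     // False
    }
}
```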

Summary

Here we saw three examples of the same regular expression in C#, all of which function similarly and use Regex.Match from the System.Text.RegularExpressions namespace. The final example was benchmarked as the fastest, although many factors determine Regex performance.

See Regex Overview.

© 2007-2010 Sam Allen. All rights reserved.

Dot Net Perls  Sam Allen