Regular expression performance comparisons

October 24, 2010

Pop quiz, hotshot. You need to match a string using a regular expression and you need it to execute as fast as possible. Application startup time doesn’t matter and each method will execute 10,000 times. Which option do you choose?

//A
void Method_A()
{
    Regex x = new Regex("somepattern");
    x.IsMatch("test.string!");
}

//B
void Method_B()
{
    Regex.IsMatch("test.string!", "somepattern");
}

//C
static void Method_C()
{
    Regex x = new Regex("somepattern");
    x.IsMatch("test.string!");
}

//D
static readonly Regex regexForMethod_D = new Regex("somepattern");
public void Method_D()
{
    regexForMethod_D.IsMatch("test.string!");
}

//E
static readonly Regex regexForMethod_E = new Regex("somepattern",
	RegexOptions.Compiled);
public void Method_E()
{
    regexForMethod_E.IsMatch("test.string!");
}  

Bonus question: two pairs of these options take just about the same time to execute. Which are they, and which pair is faster? The answers are below.

Note: for brevity, I’ve listed “somepattern” as the regex pattern. For my actual testing, I used a pattern to match an email address – the one in the code sample just below.

Programming regular expressions by coincidence

I’ll admit, regular expressions intimidate me just a little. They’re powerful and fast, but the complex, archaic-looking patterns just make me want to avoid them. When I did have to use a Regex, I would just google for a piece of code, confirm that it generally worked and move on.

//need to validate an email... [copy/paste from a website:]
Regex x = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");
x.IsMatch("e@mail.com");
//it works... moving on

This is the definition of programming by coincidence. Regular expressions are somewhat complex. Initialization of a Regex object is not a trivial operation, and then there’s that whole question of whether you should compile or not.

You can get away with programming by coincidence most of the time. When your servers are melting in a giant ball of fire because your website is initializing and compiling several regular expressions on every request… you can’t.

Regular expression options for an ASP.NET application

This article deals with regular expressions in a general manner, but I want to focus on using them in the context of an ASP.NET website: an application that is typically long-running, and has several global (static) classes/members as well as instance objects (such as a web form Page) in which a regular expression may be used.

Static vs Instance members

The first thing to consider is whether your regular expression object should be defined as an instance member, or as a static member. Unless the Regex pattern is variable or only ever used once, I can’t think of a reason why you would ever want to create a Regex as an instance variable in an ASP.NET application.

As a general rule of thumb, create regular expressions as static read-only objects if they are going to be executed several times. D is much faster than A.

//A
void Method_A()
{
    Regex x = new Regex("somepattern");
    x.IsMatch("test.string!");
}

//D
static readonly Regex regexForMethod_D = new Regex("somepattern");
public void Method_D()
{
    regexForMethod_D.IsMatch("test.string!");
}

I’m currently maintaining code that has a lot of the following in it: a static method that initializes a local regular expression. When I was programming by coincidence, I’d see this and at first be taken aback: “a non-static Regex – this is probably slow because it gets initialized on every call!” Then I’d see it was in a static method and think, “…well, the compiler must surely optimize this. It’s a local variable of a static method, so that’s fine.” Wrong.

The following – option C in our quiz – is no faster than option A.

//C
static void Method_C()
{
    Regex x = new Regex("somepattern");
    x.IsMatch("test.string!");
}

What about the static method Regex.IsMatch()?

There’s a static IsMatch() method on the Regex class that can be called as a quick and easy way to evaluate an expression. My understanding is that this is defined just like option C, but by my measurements it can perform up to twice as fast. In fact, Regex.IsMatch() – option B – performs at about the same speed as option D under my testing scenario.

//B
void Method_B()
{
    Regex.IsMatch("test.string!", "somepattern");
}

I’ll have to investigate this further.
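One possible explanation, which the measurements in this post don't confirm: the .NET documentation describes an internal cache of parsed patterns that the static Regex methods reuse, sized by the static Regex.CacheSize property (15 entries by default). If that is what happens here, option B only pays the construction cost on its first call. The sketch below is one way to check; the class name, method names and iteration count are illustrative, and the absolute timings will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class StaticCacheSketch
{
    // Email pattern from the earlier sample in this post.
    const string Pattern = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

    // Option B: static call, which may reuse an already-parsed, cached pattern.
    public static bool MatchStatic(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    // Options A/C: a new Regex is constructed (and the pattern parsed) on every call.
    public static bool MatchNewInstance(string input)
    {
        return new Regex(Pattern).IsMatch(input);
    }

    // Runs the given matcher repeatedly and returns elapsed milliseconds.
    public static long TimeMs(Func<string, bool> match, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) match("e@mail.com");
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine("Regex.CacheSize = " + Regex.CacheSize);
        Console.WriteLine("static IsMatch:      " + TimeMs(MatchStatic, 10000) + " ms");
        Console.WriteLine("new Regex each call: " + TimeMs(MatchNewInstance, 10000) + " ms");
    }
}
```

If the cache is responsible, the benefit would only hold while the application uses fewer distinct patterns through the static methods than the cache can hold.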

Compile vs not compile

The final question is whether you should initialize your regular expression object using RegexOptions.Compiled. It depends. A compiled Regex gives about 30% better runtime performance but can take as much as ten times longer to initialize.

With an ASP.NET app, I’d say the answer is pretty clear in most cases: go with the compiled version (on a static Regex object!). The BCL team states “…the bottom line is that you should only use this mode for a finite set of expressions which you know will be used repeatedly”, which is the case for most of the regular expressions I come across in an ASP.NET application. Your mileage may vary, of course.

Option E, when ignoring the increased startup cost, is the fastest of all the options presented.

//E
static readonly Regex regexForMethod_E = new Regex("somepattern", RegexOptions.Compiled);
public void Method_E()
{
    regexForMethod_E.IsMatch("test.string!");
}  

The verdict

Ignoring initialization time, here are the results of the quiz – where each method was executed 10,000 times.

[Figure: regex execution time comparison]

Calling IsMatch() on the compiled static Regex was by far the fastest operation, and in my opinion, the best way to use a regular expression in an ASP.NET application provided it will be used often.

What about the initialization costs? It is definitely not trivial to use RegexOptions.Compiled!

[Figure: regex startup times]

But given this only has to happen once (and not 10,000 times!) for the life of an ASP.NET application, I’m willing to accept this startup cost in exchange for the comparatively smaller runtime performance gain over the non-compiled version.
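For anyone who wants to reproduce the comparison, here is a minimal sketch of how the startup and runtime costs of options D and E could be measured separately with Stopwatch. It is not the harness used for the charts in this post; the class and method names are illustrative, and the numbers it prints depend on the machine and the .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTimingSketch
{
    // Email pattern from the earlier sample in this post.
    const string Pattern = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";
    const string Input = "e@mail.com";

    // Times construction of a Regex with the given options, in milliseconds.
    public static double TimeConstruction(RegexOptions options, out Regex regex)
    {
        Stopwatch sw = Stopwatch.StartNew();
        regex = new Regex(Pattern, options);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    // Times repeated IsMatch calls on an already-constructed Regex, in milliseconds.
    public static double TimeMatches(Regex regex, int iterations)
    {
        regex.IsMatch(Input); // warm-up call so first-use costs are not counted
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) regex.IsMatch(Input);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // RegexOptions.None corresponds to option D, RegexOptions.Compiled to option E.
        foreach (RegexOptions options in new[] { RegexOptions.None, RegexOptions.Compiled })
        {
            Regex regex;
            double init = TimeConstruction(options, out regex);
            double run = TimeMatches(regex, 10000);
            Console.WriteLine("{0}: init {1:F2} ms, 10,000 matches {2:F2} ms", options, init, run);
        }
    }
}
```

One caveat with a single-pass measurement like this: whichever Regex is constructed first also absorbs one-time costs such as loading and JIT-compiling the regex library itself, so it is worth running each configuration in a fresh process or repeating the runs before trusting the startup figures.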


About Kurt

I'm a senior consultant at Headspring in Austin, TX. My passion is creating web-based applications that are well crafted and solve real problems for real people. Want to know more? Check out my about page.
