What Are Regular Expressions (Regex) & How to Use Them in A/B Testing?
Regular expressions (or regex) can be a powerful tool in the arsenal of any CRO practitioner.
Many data scientists, analysts, and others have undoubtedly come across them at some point during their careers. They can be difficult for those without a technical background, but mastering these patterns is a reliable way to elevate your experimentation program!
In this blog post, we’ll attempt to demystify regular expressions so that you can start confidently using them in your testing.
We’ll start by analyzing the structure and different types of regular expressions. We then show you some examples of regular expressions you might want to use and how to implement these patterns into various parts of A/B testing. Finally, we look at a few ways these can be used in the Convert Experiences app.
What Is Regex?
Regular expressions are a miniature, widely used language for describing text patterns. They let you match complex patterns that would otherwise take many hours of manual searching.
They combine ordinary alphanumeric characters with their own set of symbols, such as braces { }, parentheses ( ), asterisks (*), question marks (?), and square brackets [ ].
If you are a bit familiar with the regex below, then this is the right article for you.
/https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#()?&//=]*)/
How Can You Write, Test & Debug Regular Expressions?
Now, let’s show you what’s possible with regular expressions. First, we’re going to take a look at a few ways you can create your own complex regexes from scratch!
How to Write Regex
To build your first regular expression, you must use specific syntax—that is, special characters (metacharacters) and construction rules. For example, the following is a simple regular expression that matches any 10-digit telephone number, in the pattern nnn-nnn-nnnn:
\d{3}-\d{3}-\d{4}
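As a quick check, the phone-number pattern above can be tried in JavaScript. This is only a minimal sketch, and the sample strings are made up for illustration:

```javascript
// Pattern from the article: three digits, a dash, three digits, a dash, four digits.
const phonePattern = /\d{3}-\d{3}-\d{4}/;

// Illustrative inputs (not from the article).
const phoneSamples = ["555-123-4567", "5551234567", "call 555-123-4567 now"];

const phoneResults = phoneSamples.map((sample) => phonePattern.test(sample));
console.log(phoneResults); // [ true, false, true ]
```

Because the pattern is not anchored, it also finds a phone number embedded in a longer string.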
You can either start writing the specific syntax yourself (and make many mistakes until you have a validated regex pattern) or use one of the available regex generators that exist. One of the most user-friendly and easy to use is the Regex Generator.
Simply paste a text sample you want to match using a regex (as an example, I used Convert’s URL www.convert.com below), and then, select different parts of the text that you want to build the regex on.
That’s it! So simple.
The regex pattern is then ready to be used:
w+\.[a-zA-Z]+\.com
Another example could be to enter a support email address and specify the domain part of the address to build the regex on:
The regex is ready for you, and you can now target all support email addresses of the tools you use without having to copy and paste them one by one:
support@[a-zA-Z]+\.com
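To see what the two generated patterns accept, here is a small sketch in JavaScript. The patterns are copied as shown above; the non-matching sample strings are assumptions added for contrast:

```javascript
// Patterns copied from the generator output shown above.
const domainPattern = /w+\.[a-zA-Z]+\.com/;
const supportPattern = /support@[a-zA-Z]+\.com/;

// Sample strings are illustrative.
const generatedChecks = [
  domainPattern.test("www.convert.com"),      // true
  domainPattern.test("www.convert.org"),      // false: no ".com"
  supportPattern.test("support@convert.com"), // true
  supportPattern.test("sales@convert.com"),   // false: no "support@"
];
console.log(generatedChecks);
```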
If you are more tech-savvy and want to write that pattern yourself, you can start learning the basic characters and quantifiers plus some construction rules.
Regular Expression Basic Characters
Here is a quick “cheat sheet” for those of you who want to learn the most common rules of regex.
Quantifiers
So what if you want to match several characters? You need to use a quantifier. The most important quantifiers are *, ?, and +. They may look familiar from wildcard searches, but in regex they do not behave exactly the same way.
- * matches zero or more of what comes before it.
- ? matches zero or one of what comes before it.
- + matches one or more of what comes before it.
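The three quantifiers can be compared side by side in JavaScript. This is a minimal sketch with made-up patterns; it uses ^ and $ (start and end of input) so that each quantifier's effect is easy to see:

```javascript
// ^ and $ pin the pattern to the whole string so each quantifier's effect is visible.
const zeroOrMore = /^ab*c$/;   // "b" may appear any number of times, including none
const zeroOrOne = /^colou?r$/; // "u" is optional
const oneOrMore = /^go+gle$/;  // at least one "o" is required

const quantifierResults = {
  star: [zeroOrMore.test("ac"), zeroOrMore.test("abbbc")],        // [true, true]
  question: [zeroOrOne.test("color"), zeroOrOne.test("colouur")], // [true, false]
  plus: [oneOrMore.test("ggle"), oneOrMore.test("google")],       // [false, true]
};
console.log(quantifierResults);
```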
Special Characters
A lot of special characters are available for regex building. Here are some of the most frequent ones:
. | The dot matches any single character. |
\n | Matches a newline (line feed) character. |
\t | Matches a tab (ASCII 9). |
\d | Matches a digit [0-9]. |
\D | Matches a non-digit. |
\w | Matches a word character: a letter, a digit, or an underscore. |
\W | Matches a non-word character (anything other than a letter, digit, or underscore). |
\s | Matches a whitespace character. |
\S | Matches a non-whitespace character. |
\ | Use \ to escape special characters. For example, \. matches a dot, and \\ matches a backslash. |
^ | Match at the beginning of the input string. |
$ | Match at the end of the input string. |
Character Classes
You can group characters by putting them between square brackets. This way, any character in the class will match one character in the input.
[abc] | Match any of a, b, and c. |
[a-z] | Match any character between a and z. (ASCII order) |
[^abc] | A caret ^ at the beginning of the square bracket indicates “not”. In this case, match anything other than a, b, or c. |
[+*?.] | Most special characters have no meaning inside the square brackets. This expression matches literally any of +, *, ? or the dot. |
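A few of the special characters and character classes from the cheat sheet, tried out in JavaScript. The patterns and inputs below are illustrative assumptions, not taken from a real setup:

```javascript
const digitsOnly = /^\d+$/;      // one or more digits and nothing else
const escapedDot = /^v\d\.\d$/;  // \. matches a literal dot
const unescapedDot = /^v\d.\d$/; // . matches any single character
const notVowel = /^[^aeiou]$/;   // negated class: one character that is not a lowercase vowel
const literalSymbols = /[+*?.]/; // inside brackets these symbols are literal

const classResults = [
  digitsOnly.test("12345"),   // true
  digitsOnly.test("12a45"),   // false
  escapedDot.test("v1x2"),    // false: "x" is not a dot
  unescapedDot.test("v1x2"),  // true: "." accepts "x"
  notVowel.test("e"),         // false
  literalSymbols.test("a+b"), // true: contains a literal "+"
];
console.log(classResults);
```

The escapedDot and unescapedDot pair shows why forgetting to escape a dot makes a pattern more permissive than intended.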
Need help building your regular expressions?
If you’re unfamiliar with regular expressions and would like to learn more, we highly recommend taking a quick crash course! Regex is a powerful tool that only requires a small time investment to learn.
How to Test Regular Expressions
You now have your regex pattern ready, but you want to check that it behaves as intended. You could do it manually and spend many hours reading the validation rules. Mathias Bynens, for example, has a great article that compares many regular expressions against one another: In search of the perfect URL validation regex. That is the hard way to move forward.
But thankfully, there are many free online regex validators you can take advantage of and speedily test your strings against the regex pattern you built. We can recommend two of them, RegEx101 and RegExr. The screenshots below are from the latter but feel free to use any you feel most comfortable with.
Simply add your regex pattern in the Expression field, then, in the Text field, add any text that you would like to see if it matches your pattern. You can see on the fly how many of the texts you entered are matching the specific pattern.
These regex validators are very powerful!
How to Debug Regex
Testing your regex is much more important than debugging it. You can usually figure out what’s going on with a regex quite easily by looking at the result, but to be sure it does what you mean it to, you should test your regex against all the edge cases you can think of. Testing will eventually clarify what you really want to do and make most debugging unnecessary.
However, if you still want to debug your regex pattern you can type it into https://regex101.com/. Not only does it let you test out your regexes on a sample set, color coding your match groups, but it also gives you a full explanation of what’s going on under the hood.
Keep in mind though, you’ll have to refer to the specific documentation for the particular programming language you’re using the regex in. Each has its particular restrictions. Some things might be unsupported in a particular language.
If you want a more “visual debugging” experience, try Debuggex.
It shows pathways in your regex like this:
How to Use Regex in JavaScript
There are two ways to create a regular expression in JavaScript: with the RegExp constructor, or by enclosing the pattern in forward slashes ( / ). Slashes /…/ tell JavaScript that we are creating a regular expression. They play the same role as quotes for strings.
In both cases, the result is an instance of the built-in RegExp class.
The main difference between these two syntaxes is that the pattern using slashes /…/ is fully static while the other can generate regular expressions on the fly.
Method 1 Example
Let’s look at the example below of RegExp used to validate the user’s input and ensure that their input contains only numbers:
let num = 'me';
// ^ and $ anchor the pattern so the whole input must consist of digits
let regex = new RegExp('^[0-9]+$');
console.log(regex.test(num)); // this will output false
Method 2 Example
Let’s look at a simple expression with the literal notation that looks for an exact match in a string. The search is case sensitive, so the lowercase pattern does not match the capitalized word:
let str = "Hello Studytonight";
let result = /hello/.test(str);
console.log(result); // outputs false
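The difference between the static literal and the dynamic constructor can be sketched as follows. The campaign prefix and URL are hypothetical values chosen for illustration, and escapeRegExp is a small helper defined here (it is not a built-in) so that runtime text is matched literally:

```javascript
// Literal syntax: the pattern is fixed when the script is written.
const staticPattern = /utm_campaign=shoes-/i;

// Helper (defined here, not built in): escapes regex metacharacters
// so that runtime text is treated as literal characters.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Constructor syntax: the pattern is assembled at runtime.
// `campaignPrefix` is a hypothetical value, e.g. read from a form field.
const campaignPrefix = "rings";
const dynamicPattern = new RegExp(`utm_campaign=${escapeRegExp(campaignPrefix)}-`, "i");

const sampleUrl = "https://www.convert.com/?utm_campaign=rings-purchases-desktop";
console.log(staticPattern.test(sampleUrl));    // false
console.log(dynamicPattern.test(sampleUrl));   // true
console.log(dynamicPattern instanceof RegExp); // true
```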
After you’ve written them, there are two interesting methods for testing your JavaScript regular expressions:
- RegExp.prototype.test(): Tests whether a match has been found. It accepts a string that we test against a regular expression and returns true if a match is found, or false otherwise.
- RegExp.prototype.exec(): Returns an array containing the matched text and any captured groups, or null if there is no match. It accepts a string that we test against a regular expression.
In the following example, the pattern /JavaScript/ is tested against the string to see whether a match is found:
var re = /JavaScript/;
var str = "JavaScript";
if (re.test(str)) {
  document.writeln("true");
}
In the following snippet of code, the RegExp method exec searches the entire string (g) for the pattern /javascript*/, ignoring case (i). Note that the * applies only to the final t, so the pattern matches javascrip followed by zero or more t characters:
var re = /javascript*/ig;
var str = "cfdsjavascript *(&Yjavascriptjs 888javascript";
var resultArray = re.exec(str);
// With the g flag, each call to exec continues from the previous match
while (resultArray) {
  document.writeln(resultArray[0]);
  resultArray = re.exec(str);
}
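The same exec loop also exposes captured groups, which is handy for pulling values out of a URL. Here is a minimal sketch; the URL and the parameter names being captured are assumptions for illustration:

```javascript
// Two capture groups: the UTM key (source or medium) and its value.
const utmPattern = /utm_(source|medium)=([^&]+)/g;
const trackedUrl = "https://www.convert.com/?utm_source=google&utm_medium=cpc";

const utmPairs = [];
let utmMatch;
// With the g flag, exec resumes after the previous match and returns null when done.
while ((utmMatch = utmPattern.exec(trackedUrl)) !== null) {
  utmPairs.push([utmMatch[1], utmMatch[2]]);
}
console.log(utmPairs); // [ [ 'source', 'google' ], [ 'medium', 'cpc' ] ]

// exec returns null when nothing matches.
const noMatch = /gclid=/.exec(trackedUrl);
console.log(noMatch); // null
```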
Why Do We Need Regex in A/B Testing?
Regex in A/B testing is mainly used for targeting. Targeting controls the who and the where of any experience.
Through targeting you are telling your testing platform who (which website visitor conditions) to show the experience to and where (which specific URLs) the experience should run on your site.
By defining audiences, you can decide who will see the experience. Audience conditions can define traffic sources, geographical data, behavioral data, specific cookies your visitors have, and endless conditions you can specify yourself.
By defining URL targeting, you decide where the experience will run. URL targeting conditions can include several domains, subdomains, query parameters, and paths.
Sometimes it is just not feasible to use the “exact match” or “contains” or “starts with” operators to bucket traffic to your experiences. This is where regexes come in.
These are 5 sample audiences that can be excluded or included in an experience and defined with regex:
- Visitors coming from ad campaigns that have a common term in their names but differ in the rest (e.g. shoes-purchases-mobile, rings-purchases-desktop).
- Visitors using a specific browser version (e.g. Firefox 3.6.4).
- Visitors coming from a third-party site like Facebook or TikTok where you need to specifically define a group of names.
- Visitors who have previously seen a promotion.
- Visitors who are logged in and their cookies for controlling the login feature have a unique identifier.
These are 5 sample locations you might want to include or exclude from an experience and can be defined with regex:
- Pages with dynamic/unique query string values.
- Specific landing pages with common terms but unique identifiers.
- Category and subcategory pages.
- Multiple pages in the checkout funnel while visitors flow from one step to the next.
- Everywhere except for a few pages.
How to Use Regex in A/B Testing?
Regular expressions are useful in any A/B / MVT / Personalization / A/A / Multipage / Split URL experience that benefits from full or partial URL pattern matches.
We can use regex in A/B testing to:
- verify the structure of a URL
- extract substrings from structured URLs
- search / replace / rearrange parts of the URL
- split a URL into tokens
- find a constant part of the URL.
All of these come up regularly when drafting a Convert experience.
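The operations listed above can be sketched in plain JavaScript. The URL below is assembled from paths used elsewhere in this article and serves only as an illustration; how you would apply each step inside an experience depends on your setup:

```javascript
const pageUrl = "https://www.convert.com/collections/fitting/products/shop-size?v=t";

// Verify the structure of a URL.
const isCollectionProduct = /\/collections\/[^\/]+\/products\//.test(pageUrl);

// Extract a substring (the collection name) with a capture group.
const collectionMatch = pageUrl.match(/\/collections\/([^\/]+)/);
const collectionName = collectionMatch ? collectionMatch[1] : null;

// Search and replace part of the URL.
const rewrittenUrl = pageUrl.replace(/\/products\/[^\/?]+/, "/products/the-original-fittings");

// Split the URL into tokens on runs of "/", "?" and "=".
const urlTokens = pageUrl.split(/[\/?=]+/);

console.log(isCollectionProduct); // true
console.log(collectionName);      // "fitting"
console.log(rewrittenUrl);
console.log(urlTokens);
```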
Regex matches are useful when the path, trailing parameters, or both, can vary in the URLs for the same webpage.
For example, if a user comes from one of many subdomains and your URLs use session identifiers, you could use a regular expression to define the constant element of your URL. Pretty handy, right?
At Convert, we use regular expressions (shortened to regex or regexes) to let you target your experiences to a specific set of pages, or to URLs that are complex or dynamic. You can also use them to define audiences with multiple variables that have something in common, which lets you target specific website visitors, and in several other use cases that we present below.
There is a lot of information about regexes on the internet and a lot of it isn’t really applicable to how you will be using them with Convert Experiences, so we have created this regex guide to help you get started.
Regex Use Case: Convert’s Regular Expression Interface with Checker
There are many regex testers/validators you can make use of before you bring your formulas and patterns into the Convert UI.
We have designed a regex section (see below) to make it simple for users who are new to regex to write their own formulas and validate them with our checker.
Regular expression matches are then evaluated using JavaScript’s built-in RegExp object.
Here are some examples of how the checker looks in different places in the app:
How to Use Regular Expressions in the Convert Experiences App (with Examples)
Now, let’s go through each of these use cases and see a few examples of instances where regexes are immensely useful.
1. Site Area with Regex
The Site Area is the place within the Convert Experiences app where you configure the page targeting criteria that trigger your experiences.
The most basic URL configuration triggers the experiment based on a URL, for example: “https://www.convert.com”.
This setting is configured automatically when you first create your experiment. It is set to the URL you enter when creating your A/B Experiment, MVT Experiment, or Personalization, or to the Original URL of a Split URL Experiment.
However, you can change this default configuration by selecting one of the several operators that the Site Area provides for triggering your experience.
One of the operators is called “Matches Regex” and another is called “Does not match exactly regex”.
You can use these two options to define the pages where you want to run your Convert experiences when no other operator can be helpful to apply the URL settings you want.
Let’s see some use cases to make this easier to understand!
Example 1
Let’s say you want to run an experience with these two conditions:
- Traffic source = Google Adwords
- URL contains prg=ABTEST
Here’s how you’d write the regex in your Site Area:
https://convert.com/\?(?=.*utm_source=google)(?=.*prg=ABTEST).*
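Before saving a pattern like this, it can help to try it against a few URLs in plain JavaScript. The sketch below builds the pattern with the RegExp constructor; the test URLs are assumptions, and the checker inside the app may behave differently in details:

```javascript
// Pattern from Example 1. String.raw keeps the backslash in \? intact.
const example1Pattern = new RegExp(
  String.raw`https://convert.com/\?(?=.*utm_source=google)(?=.*prg=ABTEST).*`
);

const example1Results = {
  bothParams: example1Pattern.test("https://convert.com/?utm_source=google&prg=ABTEST"),        // true
  reversedOrder: example1Pattern.test("https://convert.com/?prg=ABTEST&utm_source=google"),     // true
  missingPrg: example1Pattern.test("https://convert.com/?utm_source=google"),                   // false
  otherPage: example1Pattern.test("https://convert.com/pricing?utm_source=google&prg=ABTEST"),  // false
};
console.log(example1Results);
```

The two lookaheads make the parameter order irrelevant. As written, though, the pattern expects the query string to follow the root path directly, so the same parameters on another page do not match.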
Example 2
Let’s say you want to compare 3 landing pages to one variant.
The landing pages are:
- https://www.convert.com/lp-home
- https://www.convert.com/lp-home-agencies
- https://www.convert.com/lp-home-clients
with the variant being https://www.convert.com/lp-semhome/desktop
In this example, you’d write the regex in your Site Area like this:
https:\/\/www.convert.com\/lp-home(\/|-agencies|-clients|)
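A quick way to sanity-check this pattern is to run it against the three landing pages and the variant. This is a sketch in plain JavaScript; the lp-home-partners URL is a hypothetical page added to show a side effect:

```javascript
// Pattern from Example 2, passed as a string to the RegExp constructor.
const example2Pattern = new RegExp(
  String.raw`https:\/\/www.convert.com\/lp-home(\/|-agencies|-clients|)`
);

const example2Results = {
  home: example2Pattern.test("https://www.convert.com/lp-home"),              // true
  agencies: example2Pattern.test("https://www.convert.com/lp-home-agencies"), // true
  clients: example2Pattern.test("https://www.convert.com/lp-home-clients"),   // true
  variant: example2Pattern.test("https://www.convert.com/lp-semhome/desktop"), // false
  // Hypothetical page: still matches because of the empty alternative.
  partners: example2Pattern.test("https://www.convert.com/lp-home-partners"), // true
};
console.log(example2Results);
```

Because the group ends with an empty alternative and the pattern has no end anchor, any URL that starts with the lp-home path matches. If other lp-home-* pages exist on your site, the pattern would need to be tightened.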
Example 3
Now, let’s imagine your colleagues ask you to set up an experience where:
- Traffic to Original should be 0
- Query parameter contains utm_bucket=competitor
- Traffic is split 50/50 between the two variants, thus when traffic gets to https://convert.com/?utm_bucket=competitor then 50% of traffic goes to https://convert.com/vs-offerpad/ and the other 50% goes to https://convert.com/vs-zillow/
In this case, the regex would look like this:
https://www.convert.com/([^\?]+)?\?{0,1}(.*)([&,\?]utm_bucket=competitor)(.*)$
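The pattern can be exercised with a few URLs in plain JavaScript. This is a sketch under the assumption that the pattern is evaluated as a standard JavaScript RegExp; the test URLs other than the one from the example are made up:

```javascript
// Pattern from Example 3, passed as a string to the RegExp constructor.
const example3Pattern = new RegExp(
  String.raw`https://www.convert.com/([^\?]+)?\?{0,1}(.*)([&,\?]utm_bucket=competitor)(.*)$`
);

const example3Results = {
  rootPage: example3Pattern.test("https://www.convert.com/?utm_bucket=competitor"),              // true
  innerPage: example3Pattern.test("https://www.convert.com/pricing?ref=x&utm_bucket=competitor"), // true
  otherBucket: example3Pattern.test("https://www.convert.com/?utm_bucket=partner"),              // false
  withoutWww: example3Pattern.test("https://convert.com/?utm_bucket=competitor"),                // false
};
console.log(example3Results);
```

Note the last case: the pattern spells out www.convert.com, so a URL on the bare convert.com host does not match. Make sure the host in the pattern is the one your visitors actually land on.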
Example 4
Another case could be that you want to test the conditions below:
- Page URL should contain /collections/
- Page URL should not contain /products/
- Page URL should not match exactly: https://convert.com/collections/
- URL Query parameter should not contain ?v=t
- Original URL could be any page under collections
Here, you need to combine regex with audiences to fulfill all conditions. Thus, the regex in your Site Area will look something like this:
https://www.convert.com/collections/(?!(.*\/)products)(.*)([^\?]+)?\?{0,1}(.*)$
And don’t forget to define the audience to exclude visitors that have ?v=t in their URL.
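The Site Area pattern from Example 4 can be tried in plain JavaScript to see which conditions it covers on its own. This is a sketch; the collection and product names in the test URLs are hypothetical:

```javascript
// Pattern from Example 4, passed as a string to the RegExp constructor.
const example4Pattern = new RegExp(
  String.raw`https://www.convert.com/collections/(?!(.*\/)products)(.*)([^\?]+)?\?{0,1}(.*)$`
);

const example4Results = {
  collectionPage: example4Pattern.test("https://www.convert.com/collections/shoes"),             // true
  productPage: example4Pattern.test("https://www.convert.com/collections/shoes/products/boot"),  // false
  collectionsRoot: example4Pattern.test("https://www.convert.com/collections/"),                 // true
  withQueryParam: example4Pattern.test("https://www.convert.com/collections/shoes?v=t"),         // true
};
console.log(example4Results);
```

The negative lookahead handles the /products/ exclusion. The collections root and URLs carrying ?v=t still match the regex itself, which is why those two conditions have to be handled by additional conditions such as the audience.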
Example 5
In this final example, let’s say you want to run a Split URL experience where, when the shop-size is included in the URL, you want to run the test and split the traffic between the original and the variant.
1. The original can be any of the below:
https://convert.com/products/shop-size
https://convert.com/collections/new-products-deals/products/shop-size
https://convert.com/collections/fitting/products/shop-size
2. The variation URL might look like this: https://convert.com/products/the-original-fittings
Here, this will be your regex:
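As a rough sketch only, and not necessarily the exact pattern you would end up with in the app, a regex along the following lines matches all three original URLs while leaving the variation URL out. The pattern itself is an assumption written for this illustration:

```javascript
// Hypothetical pattern: an optional "collections/<name>/" segment
// followed by "products/shop-size".
const example5Pattern = new RegExp(
  String.raw`https://convert\.com/(collections/[^/]+/)?products/shop-size`
);

const example5Originals = [
  "https://convert.com/products/shop-size",
  "https://convert.com/collections/new-products-deals/products/shop-size",
  "https://convert.com/collections/fitting/products/shop-size",
];
const example5Variation = "https://convert.com/products/the-original-fittings";

console.log(example5Originals.map((url) => example5Pattern.test(url))); // [ true, true, true ]
console.log(example5Pattern.test(example5Variation));                   // false
```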
2. Audiences with Regex
Another section where you can take advantage of regular expressions in the Convert Experiences app is Audiences.
An audience is a group of users/website visitors that have something in common. With audiences, you categorize your website visitors into groups based on specific criteria such as location, the device used to access the site, the hour of the day, their landing page, or any other user behaviors.
Visitors in the same subgroup are likely to behave or buy in the same way. You create audiences by specifying the conditions that allow Convert to decide which audience a visitor is eligible for and to run the correct test or variation.
We only support regex in one of the 3 audience types we provide, Segmentation.
When you select this type of audience, these conditions become available:
Example
Let’s say you want to run an experience targeting website visitors whose landing page URL contains a common term like “products”. In this case, you’d select the “Page URL” condition from the list on the left, and then “Matches Regex” as your operator.
And you are done!
3. Goals with Regex
To track goal conversions for your experiences, you need to specify the page URLs where you wish to record the conversion. Convert Experiences allows you to enter specific URLs, page patterns, or regular expressions (regex) of the pages where you want to record goal conversion.
Example
Let’s say your goal is to check how many users access a particular page of your website.
In this case, you need to define the goal type as “Visit a specific page” and enter the page URL users need to visit, to record the conversion:
And this is what your regex looks like:
https://convert.com/$1/privacy/?$3
4. Regex in Active Websites
Convert supports wildcards in your “Active Websites” settings.
For example, if you want to include all subdomains under “domain.com”, you should set up the “Active Domain” entry like this: “http://*.domain.com”.
Common Mistakes to Avoid when Using Regex
It is not enough to define your regex for your URL targeting once and then ignore it. Regular cleanup and checks are required to ensure the right pages/audiences/goals are continuously in the right experiments.
Here are the top mistakes we often see show up in our support tickets:
1. Including the Start and End Characters
If you include the start and end characters (^ and $), then any URL that includes text before or after the pattern will not be matched.
Avoid using them.
It’s very common for URLs to include query strings at the end, such as the UTM parameters that are added to URLs for tracking purposes.
An example of this would be:
https://www.convert.com/?utm_campaign=ads
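The effect is easy to reproduce in plain JavaScript. In this minimal sketch, the two patterns are assumptions that differ only in the ^ and $ anchors:

```javascript
const anchoredPattern = /^https:\/\/www\.convert\.com\/$/;
const unanchoredPattern = /https:\/\/www\.convert\.com\//;

const taggedUrl = "https://www.convert.com/?utm_campaign=ads";

console.log(anchoredPattern.test(taggedUrl));   // false: the query string follows the pattern
console.log(unanchoredPattern.test(taggedUrl)); // true
```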
2. Including a Forward Slash
A forward slash (/) at the end of the URL is generally optional.
If your regex includes that character at the end, then a visit to the same URL but without the forward-slash wouldn’t match. It is better not to include that final forward slash character.
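A short sketch of the difference in plain JavaScript, using a hypothetical /pricing page:

```javascript
const withTrailingSlash = /https:\/\/www\.convert\.com\/pricing\//;
const withoutTrailingSlash = /https:\/\/www\.convert\.com\/pricing/;

// The same page, requested with and without the final slash.
const pricingUrls = ["https://www.convert.com/pricing/", "https://www.convert.com/pricing"];

const slashResults = pricingUrls.map((url) => withTrailingSlash.test(url));
const noSlashResults = pricingUrls.map((url) => withoutTrailingSlash.test(url));

console.log(slashResults);   // [ true, false ]
console.log(noSlashResults); // [ true, true ]
```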
3. Exceeding the Character Limit
There is a limit of 750 characters for all of our regex targeting rules. If you go over this limit, no error is thrown to alert you of the issue (although I believe this limit cannot be reached easily).
4. Running Simultaneous Experiments on the Same Page
If you try to run multiple experiments on the same page(s) simultaneously, this leads to a collision over which experiment a visitor should participate in and which changes should be applied first.
Because of this, you should be careful with your regex URL targeting. If you target the same page with the targeting rules of more than one A/B test, you need to use these instructions to prevent the collision.
If you’re looking for help understanding regular expressions or with the URL targeting of your Convert Experiences, our support team is ready to answer your questions. You can reach us at any time through the in-app chat. We’ll be happy to provide an overview and show you some examples so that you can start using regex confidently in your testing!