Using Regex in Unqork

Overview

When working with end-user input fields, you'll want to control what's entered into them. (End-users, also known as Express Users, are the individuals accessing an application through Express View. In most cases, end-users are the customers using the product.) One way to control end-user input is by using Regex. Regex, or Regular Expression, is a set of rules that controls the patterns of strings as they're entered into a field.

Regex is a concept that exists outside of Unqork, but can be used in your Unqork application in various ways.

After completing this article, you’ll know what Regex is and how to use it in Unqork.

What Is Regex?

Regex is a character sequence that describes what pattern a string must follow. A string is a sequence of characters, typically holding data in text form. One of the most common places you find Regex is when signing up for an account. You're asked to enter a password with specific characters. If you enter a password that does not meet the specified criteria, you're given an error. You can even use Regex to validate that an entered email address follows the format of an actual email address.

Regex Syntax

Let’s take a look at the Regex syntax used to validate an email address.

/^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$/

Let's explore this syntax. The @ symbol sits between the two main parts of an email address. So, we know the Regex is looking for a _________@____ pattern in the email address string. The @ in the expression is a literal match: the input must contain an @ symbol at that position.

Notice how the @ sign separates two sets of square brackets. There's also a third set of square brackets toward the end of the expression that is separated by a period. Simplifying the expression, you get the following:

[ ] @ [ ] . [ ]

Let's understand the first set of square brackets and what it contains:

[a-zA-Z0-9._%-]

Here is the list of characters allowed in this section of the email address syntax. Regex uses ASCII codes to define the following ranges:

Display    ASCII Range    Character Range    Notes
a-z        97-122         a-z                Case sensitive, lowercase
A-Z        65-90          A-Z                Case sensitive, uppercase
0-9        48-57          0-9                Numeric digits

You can find the full ASCII code index here: https://www.ascii-code.com/.
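As a quick check of the ranges above, here is a minimal Python sketch that looks up the ASCII code of each range's first and last character. Python is used only for illustration, and the variable names are hypothetical.

```python
# Each range in a character class is defined by the ASCII codes of its endpoints.
ranges = {"a-z": ("a", "z"), "A-Z": ("A", "Z"), "0-9": ("0", "9")}

# ord() returns the character code, which equals the ASCII code for these characters.
ascii_ranges = {
    name: (ord(first), ord(last)) for name, (first, last) in ranges.items()
}

for name, (low, high) in ascii_ranges.items():
    print(f"{name}: {low}-{high}")
```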

The a-zA-Z0-9 syntax in the brackets states that any letter (uppercase or lowercase) and any number is valid. The ._%- portion of the syntax lists specific supported characters instead of a range. So, end-users are permitted to use periods (.), underscores (_), percentage signs (%), or hyphens (-) in the first part of their email address.

The plus sign (+) that follows the first set of square brackets means the preceding section can have as many characters as needed to match the criteria, but it must contain at least one character.

This first portion of the Regex states that the string:

  • Can include uppercase or lowercase letters.

  • Can include a number.

  • Can include periods, underscores, percentage signs, and hyphens.

  • Must include at least one character.
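The rules in this list can be tried out with a short sketch. The example below uses Python's re module as a stand-in for whichever Regex engine runs in your application. The function name and sample values are hypothetical.

```python
import re

# The first set of square brackets, followed by the plus sign.
LOCAL_PART = re.compile(r"[a-zA-Z0-9._%-]+")


def is_valid_local_part(text: str) -> bool:
    """Return True when every character is allowed and at least one is present."""
    return LOCAL_PART.fullmatch(text) is not None


samples = ["jane.doe_99", "sales%west-1", "", "jane doe", "jane+doe"]
results = {sample: is_valid_local_part(sample) for sample in samples}

for sample, ok in results.items():
    print(repr(sample), ok)
```

The empty string fails because the plus sign requires at least one character. The last two samples fail because a space and a plus sign are not in the list of allowed characters.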

Let's explore the second set of square brackets:

[a-zA-Z0-9.-]

The only difference between the first set of square brackets and the second is the removal of underscores and percentage signs. So, these characters are not allowed in this portion of the email address.

The next symbol, between the second and third set of square brackets, is the combination of a backslash and a period (\.). On its own, a period outside square brackets matches any character. Placing a backslash before the period tells the Regex to treat it as a literal period. So, the \. combination means a period is required at this point of the email address.
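To see the difference the backslash makes, here is a small sketch in Python's re module, used for illustration only. The sample strings are hypothetical.

```python
import re

# Without the backslash, the period matches any single character.
unescaped = re.compile(r"example.com")
# With the backslash, only a literal period matches.
escaped = re.compile(r"example\.com")

unescaped_accepts_x = unescaped.fullmatch("exampleXcom") is not None
escaped_accepts_x = escaped.fullmatch("exampleXcom") is not None
escaped_accepts_period = escaped.fullmatch("example.com") is not None

print(unescaped_accepts_x, escaped_accepts_x, escaped_accepts_period)
```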

Lastly, let's explore the final set of square brackets:

[a-zA-Z]

This syntax only allows uppercase and lowercase letters. No numbers or special characters are allowed in the final set of square brackets. This portion represents the com, net, org, and so on part of an email address, which typically contains only letters.

After the final set of square brackets, you see the following syntax:

{2,6}

Notice how this section uses curly braces instead of square brackets. It's not specifying acceptable characters. Instead, it indicates how many characters from the preceding set of square brackets are allowed. So, the portion of the email address after the final period must be between two and six characters long, which makes sense for .com, .net, .org, and so on.
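A minimal sketch of the {2,6} quantifier, again using Python's re module for illustration. The sample endings are hypothetical.

```python
import re

# The final set of square brackets with its length limit.
ENDING = re.compile(r"[a-zA-Z]{2,6}")

samples = ["c", "io", "com", "museum", "company", "c0m"]
results = {sample: ENDING.fullmatch(sample) is not None for sample in samples}

for sample, ok in results.items():
    print(sample, ok)
```

Here, "c" is too short, "company" is too long at seven letters, and "c0m" contains a number, so those three fail.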

The remainder of the syntax is standard for Regex:

  • The forward slashes (/) at the beginning and end mark where the regular expression starts and stops.

  • The caret (^), the second character of the syntax, anchors the match to the beginning of the string.

  • The parentheses group what's enclosed so it's treated as a single unit.

  • The asterisk (*) near the end means the preceding group can repeat zero or more times. So, the string could contain zero valid email addresses or as many as you want. Because zero repetitions are allowed, an empty field also passes this check.

  • The dollar sign ($) anchors the match to the end of the string.
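Putting the pieces together, the sketch below runs the full expression against a few sample inputs. It uses Python's re module as a stand-in, so the surrounding forward slashes are omitted. The function name and sample values are hypothetical.

```python
import re

# The full expression from this article, without the surrounding forward slashes.
EMAIL = re.compile(r"^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$")


def passes(text: str) -> bool:
    """Return True when the whole input satisfies the expression."""
    return EMAIL.search(text) is not None


samples = [
    "jane.doe@example.com",  # one valid address
    "jane.doe@example",      # no period and ending after the domain
    "jane doe@example.com",  # space is not an allowed character
    "",                      # zero repetitions of the group
    "a@b.comc@d.org",        # two repetitions of the group
]
results = {sample: passes(sample) for sample in samples}

for sample, ok in results.items():
    print(repr(sample), ok)
```

The last two samples pass because of the asterisk: the group can match zero times or more than once. If a field must contain exactly one address, removing the asterisk is one possible adjustment.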

A static image displaying a Regex string broken down into several parts.

To learn more about Regex syntax, visit https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Cheatsheet. For syntax specific to querying MongoDB, visit https://docs.mongodb.com/manual/reference/operator/query/regex/.

Using Regex in Unqork

There are two options for using Regex when configuring in Unqork:

Native Unqork Components

The Email and Phone Number components include built-in Regex to confirm that an email address or phone number is formatted correctly. But you'll often need custom restrictions on other fields, such as a Text Field component.

The Text Field component has dedicated settings where you can add custom Regex. In the Advanced settings, use Regular Expression Pattern (Regex) to add Regex syntax and Pattern Error Message to add a custom error message that displays when the criteria are not met.

Regular Expression Pattern is a designated Regex setting, so the opening and closing forward slashes are not required.

A static image displaying the Text Field configuration screen. The Regular Expression Pattern (Regex) and Pattern Error Message fields are highlighted.

Here's what the configuration window looks like with the Regex set to check for a valid email address:

A static image displaying the Text Field configuration window. The Regular Expression Pattern field is filled out with the following Regex code: ([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$

In Express View, the field displays the pattern error message when the input is not an email address. Express View is how your end-users view and interact with your application. After configuring a module, click Preview in the Module Builder to interact with the module in Express View.

Regex Formulas

Another option for using Regex in Unqork is to use Regex formulas. REGEXMATCH is a formula you can use in components to validate a string against a regular expression.

Below is the syntax for a REGEXMATCH formula:

REGEXMATCH(stringToCheck,regexPattern)

In this formula, regexPattern is also a dedicated Regex argument, so you'll omit the opening and closing forward slashes.

In a Data Workflow component, you can use the Formula or Create Value operators to validate a single value. Or, you can use the Create Field operator to validate a field across an entire table. The stringToCheck and regexPattern values can be hard-coded directly into the formula or set dynamically using the argument ports and inserting A into the formula. Either way, your result is a Boolean value of true (indicating a match) or false (indicating no match).
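The behavior described above can be sketched outside of Unqork. The Python function below is a hypothetical stand-in for REGEXMATCH, not Unqork's implementation. It assumes a match anywhere in the string counts, so the sample pattern uses ^ and $ to cover the whole value. The field names and sample data are also assumptions.

```python
import re


def regex_match(string_to_check: str, regex_pattern: str) -> bool:
    """Hypothetical stand-in for REGEXMATCH: True on a match, False otherwise."""
    return re.search(regex_pattern, string_to_check) is not None


FIVE_DIGITS = r"^[0-9]{5}$"  # sample pattern: exactly five numbers

# Validating a single value.
single_result = regex_match("10001", FIVE_DIGITS)

# Validating one field across every row of a table.
rows = [{"zip": "10001"}, {"zip": "1000A"}, {"zip": "94105"}]
for row in rows:
    row["zipIsValid"] = regex_match(row["zip"], FIVE_DIGITS)

print(single_result, [row["zipIsValid"] for row in rows])
```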

REGEXEXTRACT is a formula that returns the portion of a string that matches a regular expression. Let's say you only want to retrieve the portion of an email address before the @ symbol.

Below is the syntax for a REGEXEXTRACT formula:

REGEXEXTRACT(stringToCheck,regexPattern)

The syntax is the same as REGEXMATCH, but instead of a Boolean value, the output is the portion of the string that matches the regular expression. If no portion of the string matches, the output is a null value.
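As an illustration of the email example, the Python function below is a hypothetical stand-in for REGEXEXTRACT, not Unqork's implementation. It reuses the first set of square brackets from earlier in this article to capture everything before the @ symbol. Python's None plays the role of the null value.

```python
import re
from typing import Optional


def regex_extract(string_to_check: str, regex_pattern: str) -> Optional[str]:
    """Hypothetical stand-in for REGEXEXTRACT: the matching text, or None."""
    match = re.search(regex_pattern, string_to_check)
    return match.group(0) if match else None


# One or more allowed characters, starting at the beginning of the string.
BEFORE_AT = r"^[a-zA-Z0-9._%-]+"

extracted = regex_extract("jane.doe@example.com", BEFORE_AT)
nothing = regex_extract("@example.com", BEFORE_AT)

print(extracted, nothing)
```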

Resources