The following examples illustrate the use and construction of simple regular expressions. Each example includes the type of text to match, one or more regular expressions that match that text, and notes that explain the use of the special characters and formatting.
- Match exact phrase only
- Match word or phrase in a list
- Match word with different spellings or special characters
- Match any email address from a specific domain
- Match any IP address in a range
- Match an alphanumeric format
Important: We support RE2 syntax only, which differs slightly from PCRE. Regular expressions are case-sensitive by default.
Example 2: (\W|^)stock\s{0,3}tips(\W|$)
Example 3: (\W|^)stock\s{0,3}tip(s){0,1}(\W|$)
- \W matches any character that’s not a letter, digit, or underscore. It prevents the regex from matching characters before or after the phrase.
- In example 2, \s matches a space character, and {0,3} indicates that from 0 to 3 spaces can occur between the words stock and tips.
- ^ matches the start of a new line. Allows the regex to match the phrase if it appears at the beginning of a line, with no characters before it.
- $ matches the end of a line. Allows the regex to match the phrase if it appears at the end of a line, with no characters after it.
- In example 3, (s) matches the letter s, and {0,1} indicates that the letter can occur 0 or 1 times after the word tip. Therefore, the regex matches stock tip and stock tips. Alternatively, you can use the character ? instead of {0,1}.
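The notes above can be checked with a short script. This is a minimal sketch using Python's `re` module, which is not RE2, although these particular constructs are expected to behave the same way in both; the `matches` helper and the sample strings are illustrative and not part of the original examples.

```python
import re

# Patterns copied from Examples 2 and 3 above.
EXAMPLE_2 = re.compile(r"(\W|^)stock\s{0,3}tips(\W|$)")
EXAMPLE_3 = re.compile(r"(\W|^)stock\s{0,3}tip(s){0,1}(\W|$)")


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


# Illustrative sample strings (assumptions, not from the article).
samples = [
    "stock tips",       # plain phrase
    "stocktips",        # zero spaces is allowed by {0,3}
    "stock tip",        # singular: only Example 3 allows the missing s
    "livestock tips",   # a letter precedes "stock", so (\W|^) fails
    "stock tipster",    # a letter follows the phrase, so (\W|$) fails
]

for text in samples:
    print(f"{text!r}: example 2 = {matches(EXAMPLE_2, text)}, "
          f"example 3 = {matches(EXAMPLE_3, text)}")
```

The last two samples show why the `(\W|^)` and `(\W|$)` groups matter: without them, the phrase would also be found inside longer words.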
- baloney
- darn
- drat
- fooey
- gosh darnit
- heck
- (...) groups all the words, such that the \W character class applies to all of the words within the parentheses.
- (?i) makes the content matching case insensitive.
- \W matches any character that’s not a letter, digit, or underscore. It prevents the regex from matching characters before or after the words or phrases in the list.
- ^ matches the start of a new line. Allows the regex to match the word if it appears at the beginning of a line, with no characters before it.
- $ matches the end of a line. Allows the regex to match the word if it appears at the end of a line, with no characters after it.
- | indicates an “or,” so the regex matches any one of the words in the list.
- \s matches a space character. Use this character to separate words in a phrase.
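As an illustration, the elements described above can be assembled into a single expression. The pattern below is a reconstruction built from those notes and the word list, not an expression quoted from this page, and it is run with Python's `re` module rather than RE2; the sample strings are assumptions.

```python
import re

# Assumed pattern, assembled from the notes above:
# (?i) for case-insensitive matching, (\W|^) and (\W|$) as boundaries,
# | between the alternatives, and \s inside the two-word phrase.
WORD_LIST = re.compile(
    r"(?i)(\W|^)(baloney|darn|drat|fooey|gosh\sdarnit|heck)(\W|$)"
)


def find_word(text):
    """Return the listed word found in the text, or None."""
    match = WORD_LIST.search(text)
    return match.group(2) if match else None


for text in ["Oh DARN, it broke", "gosh darnit!", "What the heck",
             "darning socks", "heckle"]:
    print(f"{text!r} -> {find_word(text)!r}")
```

The last two samples are not matched because a letter follows the listed word, which is the effect of the trailing `(\W|$)` group.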
Usage example
- f@st c@sh
- f@$t c@$h
- fa$t ca$h
f[a4@][s5\$][t7] +c[a4@][s5\$]h
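- \W isn't included, so that other characters can appear before or after any of the variants of fast cash. For example, the regex still matches fast cash in the following text:
Fast cash!! or ***f@st ca$h***
Note that matching is case-sensitive by default, so the capitalized Fast cash!! matches only if (?i) is added at the start of the expression.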
- [a4@] matches the characters a, 4, or @ in the second character position of the word, reflecting common letter substitutions spammers use to evade simple text matches.
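The character-class approach can be tried against the listed variants. This is a minimal sketch in Python's `re` module (not RE2, though these constructs are expected to behave the same); the `(?i)` variant and the extra sample strings are additions for illustration.

```python
import re

# Pattern copied from the example above (case-sensitive by default).
FAST_CASH = re.compile(r"f[a4@][s5\$][t7] +c[a4@][s5\$]h")

# Same pattern with (?i) prepended; an illustrative variant, not from the article.
FAST_CASH_ANY_CASE = re.compile(r"(?i)f[a4@][s5\$][t7] +c[a4@][s5\$]h")

variants = ["f@st c@sh", "f@$t c@$h", "fa$t ca$h", "***f@st ca$h***"]
for text in variants:
    # No \W boundaries, so surrounding characters such as *** do not block the match.
    print(f"{text!r}: {FAST_CASH.search(text) is not None}")

# The capitalized form needs the case-insensitive variant.
print(FAST_CASH.search("Fast cash!!") is not None)
print(FAST_CASH_ANY_CASE.search("Fast cash!!") is not None)
```

Each bracketed set stands for exactly one character position, so adding another substitution (for example, another stand-in for s) only requires extending the relevant set.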
- \W matches any character that’s not a letter, digit, or underscore. It prevents the regex from matching characters before or after the email address.
- ^ matches the start of a new line. Allows the regex to match the address if it appears at the beginning of a line, with no characters before it.
- $ matches the end of a line. Allows the regex to match the address if it appears at the end of a line, with no characters after it.
- [\w.\-] matches any word character (a-z, A-Z, 0-9, or an underscore), a period, or a hyphen. These are the most commonly used valid characters in the first part of an email address. The \- (which indicates a hyphen) must occur last in the list of characters within the square brackets.
- The \ before the dash and period “escapes” these characters; that is, it indicates that the dash and period aren't regex special characters themselves. There's no need to escape the period within the square brackets.
- {0,25} indicates that from 0 to 25 characters in the preceding character set can occur before the @ symbol. The Content Compliance email setting supports matching of up to 25 characters for each character set in a regular expression.
- The (...) formatting groups the domains, and the | character that separates them indicates an “or.”
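The notes above describe the parts of a domain-matching expression; the sketch below assembles them into one pattern to show how they fit together. The pattern is a reconstruction, and the domains `example.com` and `sample.com` are hypothetical placeholders, not the domains from the original example. It is run with Python's `re` module rather than RE2.

```python
import re

# Assumed pattern built from the notes above; "example" and "sample"
# are placeholder domains chosen for illustration.
EMAIL_FROM_DOMAIN = re.compile(
    r"(\W|^)[\w.\-]{0,25}@(example|sample)\.com(\W|$)"
)


def matched_domain(text):
    """Return the matched domain name (without .com), or None."""
    match = EMAIL_FROM_DOMAIN.search(text)
    return match.group(2) if match else None


for text in ["jane.doe@example.com",
             "Write to jane-doe_1@sample.com.",
             "jane@other.com",
             "jane@example.community"]:
    print(f"{text!r} -> {matched_domain(text)!r}")
```

The last sample is rejected because a letter follows `.com`, so the trailing `(\W|$)` group cannot match.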
Example 2: 192\.168\.1\.\d{1,3}
- The \ before each period “escapes” the period—that is, it indicates that the period isn't a regex special character itself.
- In Example 1, no characters follow the last period, so the regex matches any IP address beginning with 192.168.1., regardless of the number that follows.
- In Example 2, \d matches any digit from 0 to 9 after the last period, and {1,3} indicates that from 1 to 3 digits can appear after that last period. In this case, the regex matches any complete IP address beginning with 192.168.1. This regex also matches invalid IP addresses, such as 192.168.1.999.
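The effect of escaping the periods, and the difference between the two examples, can be seen in a short script. This is a sketch in Python's `re` module (not RE2); the prefix-only pattern is inferred from the description of Example 1 above, and the unescaped pattern and sample strings are illustrative additions.

```python
import re

# Inferred from the description of Example 1: stops after the last period.
PREFIX_ONLY = re.compile(r"192\.168\.1\.")

# Example 2, copied from above: 1 to 3 digits after the last period.
FULL_ADDRESS = re.compile(r"192\.168\.1\.\d{1,3}")

# Illustrative counter-example: unescaped periods match any character.
UNESCAPED = re.compile(r"192.168.1.")

print(FULL_ADDRESS.search("host 192.168.1.42 up").group(0))
print(FULL_ADDRESS.search("192.168.1.999") is not None)  # invalid address, still matched
print(FULL_ADDRESS.search("192.168.2.1") is not None)    # different subnet
print(PREFIX_ONLY.search("192.168.1.") is not None)      # prefix alone is enough
print(UNESCAPED.search("192x168y1z") is not None)        # why the escapes matter
print(PREFIX_ONLY.search("192x168y1z") is not None)
```

If only valid octets should match, the final `\d{1,3}` would need to be replaced with a stricter alternation, which is outside the scope of these simple examples.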
- PO nn-nnnn
- PO-nn-nnnn
- PO# nn nnnn
- PO#nn-nnnn
- PO nnnnnn
- \W matches any character that’s not a letter, digit, or underscore. It prevents the regex from matching characters before or after the number.
- ^ matches the start of a new line. Allows the regex to match the number if it appears at the beginning of a line, with no characters before it.
- $ matches the end of a line. Allows the regex to match the number if it appears at the end of a line, with no characters after it.
- [#\-] matches a pound sign or a hyphen after the letters po, and {0,1} indicates that one of those characters can occur zero or one times. The \- (which indicates a hyphen) must occur last in the list of characters within the square brackets.
- \s matches a space, and {0,1} indicates that a space can occur zero or one times.
- \d matches any digit from 0 to 9, and {2} indicates that exactly 2 digits must appear in this position in the number.
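To show how these pieces combine, the sketch below assembles them into one expression and tests it against the listed formats. The pattern is a reconstruction from the notes above, not an expression quoted from this page: the `(?i)` prefix, the optional space-or-hyphen separator, and the final group of 4 digits are assumptions inferred from the formats in the list. It is run with Python's `re` module rather than RE2.

```python
import re

# Assumed pattern: (?i) so that "PO" and "po" both match, an optional
# # or -, an optional space, 2 digits, an optional space or hyphen,
# then 4 digits, all bounded by (\W|^) and (\W|$).
PO_NUMBER = re.compile(
    r"(?i)(\W|^)po[#\-]{0,1}\s{0,1}\d{2}[\s\-]{0,1}\d{4}(\W|$)"
)


def is_po_number(text):
    """Return True if the text contains a purchase order number."""
    return PO_NUMBER.search(text) is not None


formats = ["PO 12-3456", "PO-12-3456", "PO# 12 3456", "PO#12-3456", "PO 123456"]
for text in formats:
    print(f"{text!r}: {is_po_number(text)}")

# A letter before "po" or an extra trailing digit breaks the boundaries.
print(is_po_number("repo 12-3456"))
print(is_po_number("PO 12-34567"))
```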