A regex (short for regular expression) is a sequence of characters that defines a search pattern. It is used in programming and text processing to find, match, validate, and manipulate strings. Regular expressions are supported by virtually every programming language, including JavaScript, Python, Ruby, Java, Go, and PHP. They can match simple literal text or complex patterns involving character classes, quantifiers, anchors, groups, and lookaheads. Despite their compact syntax, regex patterns can express sophisticated matching logic.

What are capturing groups in regex?

Capturing groups are created by enclosing part of a regex pattern in parentheses. They serve two purposes: they group tokens so quantifiers can apply to the entire group, and they capture the matched text for later reference. For example, (\w+)@(\w+) captures the username and domain parts of an email-like pattern separately. Non-capturing groups (?:...) provide grouping without capturing. Named groups (?<name>...) let you reference captures by name instead of number.
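The three kinds of group can be compared side by side in a short Python sketch. The sample string and the group names user and domain are assumptions made for illustration. Note that Python's re module spells named groups (?P<name>...) rather than (?<name>...).

```python
import re

sample = "contact: alice@example"  # hypothetical input

# Two numbered capturing groups: username and domain.
match = re.search(r"(\w+)@(\w+)", sample)
username, domain = match.group(1), match.group(2)

# Named groups expose the same captures by name (Python syntax: ?P<name>).
named_match = re.search(r"(?P<user>\w+)@(?P<domain>\w+)", sample)

# A non-capturing group still groups tokens but creates no capture,
# so only the domain shows up in groups().
only_domain = re.search(r"(?:\w+)@(\w+)", sample).groups()

print(username, domain)
print(named_match.group("user"), named_match.group("domain"))
print(only_domain)
```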

Regex to English Explainer - Free Online Tool

Q: How do I read a regular expression?

Read a regex from left to right, breaking it into tokens. Each token is either a literal character, a special character class (like \d for digits or \w for word characters), a quantifier (like + for one or more, * for zero or more, ? for optional), an anchor (^ for start, $ for end), or a group (parentheses). The key is to identify each token and understand what it matches. For example, ^\d{3}-\d{4}$ reads as: start of string, exactly 3 digits, a literal hyphen, exactly 4 digits, end of string.
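One way to practice this token-by-token reading is to write the pattern in verbose mode, one token per line with a comment. The sketch below does that for ^\d{3}-\d{4}$ in Python; the test strings are made up for illustration.

```python
import re

# Each line of the verbose pattern is one token, read left to right.
phone = re.compile(r"""
    ^        # start of string
    \d{3}    # exactly 3 digits
    -        # a literal hyphen
    \d{4}    # exactly 4 digits
    $        # end of string
""", re.VERBOSE)

# Hypothetical inputs: one valid, two that break a single token each.
candidates = ["555-1234", "5555-1234", "555-123"]
results = {s: bool(phone.search(s)) for s in candidates}
print(results)
```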

Q: What is lazy vs greedy matching?

By default, quantifiers like *, +, and {n,m} are greedy, meaning they match as many characters as possible. Adding a ? after the quantifier makes it lazy, matching as few characters as possible. For example, given the string '<b>bold</b>', the greedy pattern <.+> matches the entire string '<b>bold</b>', while the lazy pattern <.+?> matches only '<b>'. Lazy matching is essential when you want to match the shortest possible substring, such as individual HTML tags.
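The difference is easy to see by running both forms against the same input. This is a minimal Python sketch; the sample string and the patterns <.+> and <.+?> are assumptions chosen to illustrate tag matching.

```python
import re

text = "<b>bold</b>"  # hypothetical sample string

greedy = re.search(r"<.+>", text).group()   # extends to the last '>'
lazy = re.search(r"<.+?>", text).group()    # stops at the first '>'
all_tags = re.findall(r"<.+?>", text)       # lazy form finds each tag

print(greedy)
print(lazy)
print(all_tags)
```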

Q: What are lookaheads and lookbehinds?

Lookaheads and lookbehinds are zero-width assertions that check for a pattern without including it in the match. A positive lookahead (?=...) asserts that what follows matches the pattern. A negative lookahead (?!...) asserts that what follows does not match. Similarly, (?<=...) is a positive lookbehind and (?<!...) is a negative lookbehind. For example, \d+(?= dollars) matches digits only when followed by ' dollars' but does not include ' dollars' in the match result.
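The four assertions can be tried on one string, as in the Python sketch below. The sample text is invented for illustration. The word boundaries (\b) in the negative forms are an added precaution: without them, the engine can backtrack and match part of a number (for example "10" out of "100").

```python
import re

text = "100 dollars, 200 euros, $300"  # hypothetical sample

# Positive lookahead: digits followed by " dollars" (not included in match).
before_dollars = re.findall(r"\d+(?= dollars)", text)

# Negative lookahead: whole numbers NOT followed by " dollars".
not_before_dollars = re.findall(r"\b\d+\b(?! dollars)", text)

# Positive lookbehind: digits preceded by a dollar sign.
after_sign = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
not_after_sign = re.findall(r"(?<!\$)\b\d+", text)

print(before_dollars, not_before_dollars, after_sign, not_after_sign)
```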

Paste a regular expression to get a plain English explanation of what each part matches. Understand complex regex patterns instantly.

How to Use the Regex Explainer

Type or paste a regular expression into the input field and the tool immediately breaks it down into a plain English explanation. Each token in the regex is identified and described: character classes, quantifiers, anchors, groups, lookaheads, and literal characters. The stats panel shows the total number of tokens, whether the pattern contains capturing groups, and whether it uses quantifiers. The explanation updates in real time as you type, so you can build up a pattern and see how each addition changes the meaning.
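To give a feel for how such a breakdown could work, here is a minimal Python sketch of a token-by-token explainer. It is not the tool's actual implementation: the RULES table, the explain function, and the wording of each description are all assumptions, and the sketch covers only a small subset of regex syntax.

```python
import re

# Hypothetical rule table: (token pattern, description template).
# Rules are tried in order and the first match wins.
RULES = [
    (r"\(\?:", "start of non-capturing group"),
    (r"\(\?=", "start of positive lookahead"),
    (r"\(\?!", "start of negative lookahead"),
    (r"\(", "start of capturing group"),
    (r"\)", "end of group"),
    (r"\\d", "any digit"),
    (r"\\w", "any word character"),
    (r"\\s", "any whitespace character"),
    (r"\\b", "word boundary"),
    (r"\[[^\]]*\]", "one character from the set {0}"),
    (r"\{(\d+)\}", "exactly {1} times"),
    (r"\+", "one or more times"),
    (r"\*", "zero or more times"),
    (r"\?", "zero or one time (optional)"),
    (r"\^", "start of string"),
    (r"\$", "end of string"),
    (r"\|", "or"),
    (r"\.", "any character except newline"),
    (r"\\(.)", "literal '{1}'"),
    (r".", "literal '{0}'"),  # fallback: always consumes one character
]
COMPILED = [(re.compile(p, re.DOTALL), d) for p, d in RULES]


def explain(regex):
    """Return one plain-English line per token, reading left to right."""
    lines, pos = [], 0
    while pos < len(regex):
        for token_re, template in COMPILED:
            m = token_re.match(regex, pos)
            if m:
                # {0} is the whole token, {1} the first captured part.
                lines.append(template.format(m.group(0), *m.groups()))
                pos = m.end()
                break
    return lines


explanation = explain(r"^\d{3}-\d{4}$")
print(len(explanation), "tokens")
for line in explanation:
    print("-", line)
```

The length of the returned list doubles as a simple token count, similar in spirit to a stats panel.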

This tool is especially useful when reading someone else's regex code, debugging a pattern that does not match as expected, or learning regex syntax by experimenting with different tokens. Copy the explanation to include as a comment in your source code, making complex patterns understandable for future maintainers.

Regular Expression Token Types

Regular expressions are built from several types of tokens, each serving a specific purpose. Understanding these building blocks is the key to both reading and writing regex patterns effectively. While the syntax can look cryptic at first, there are only a handful of fundamental concepts that combine to create complex matching behavior.

Character Classes and Shorthand

Character classes match one character from a defined set. Square brackets define custom classes: [abc] matches a, b, or c. Ranges are defined with hyphens: [a-z] matches any lowercase letter. Negated classes use a caret: [^0-9] matches anything except a digit. Shorthand classes provide common patterns: \d matches any digit (equivalent to [0-9] for ASCII text; some engines also accept other Unicode digits), \w matches word characters (letters, digits, underscore), \s matches whitespace (space, tab, newline), and the dot . matches any character except newline by default.
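Each class described above can be checked with a one-line search. The Python sketch below uses made-up sample strings, and every variable name is illustrative only.

```python
import re

text = "Room 42b, floor_3\tEast"  # hypothetical sample

custom = re.findall(r"[abc]", text)            # a, b, or c
lowercase = re.findall(r"[a-z]+", text)        # runs of lowercase letters
non_digits = re.findall(r"[^0-9]+", "a1b22c")  # runs of anything but digits
digits = re.findall(r"\d", text)               # single digits
words = re.findall(r"\w+", text)               # letters, digits, underscore
spaces = re.findall(r"\s", text)               # space, tab, newline
dot = re.findall(r".", "a\nb")                 # newline is skipped by default

print(custom, lowercase, non_digits, digits, words, spaces, dot)
```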

Quantifiers

Quantifiers control how many times the preceding token must appear. The + quantifier matches one or more times. The * quantifier matches zero or more times. The ? quantifier matches zero or one time (optional). Curly braces specify exact counts: {3} matches exactly 3 times, {2,5} matches between 2 and 5 times, and {3,} matches 3 or more times. By default quantifiers are greedy (match as much as possible), but adding ? after them makes them lazy (match as little as possible).
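A quick way to confirm what each quantifier accepts is to test whole strings against it. In this Python sketch, the helper full and the sample inputs are assumptions for illustration; re.fullmatch requires the entire string to match.

```python
import re


def full(pattern, s):
    """True when the whole string matches the pattern (helper for this sketch)."""
    return re.fullmatch(pattern, s) is not None


one_or_more = [full(r"a+", s) for s in ["", "a", "aaa"]]
zero_or_more = [full(r"a*", s) for s in ["", "a", "aaa"]]
optional = [full(r"colou?r", s) for s in ["color", "colour", "colouur"]]
exactly_three = [full(r"\d{3}", s) for s in ["12", "123", "1234"]]
two_to_five = [full(r"\d{2,5}", s) for s in ["1", "12", "12345", "123456"]]
three_or_more = [full(r"\d{3,}", s) for s in ["12", "123", "123456"]]

print(one_or_more, zero_or_more, optional)
print(exactly_three, two_to_five, three_or_more)
```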

Anchors and Boundaries

Anchors do not match characters but instead match positions in the string. The caret ^ matches the start of the string (or of each line in multiline mode). The dollar sign $ matches the end of the string (or of each line in multiline mode). The word boundary \b matches the position between a word character and a non-word character, which is useful for matching whole words without capturing surrounding whitespace or punctuation.
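Because anchors match positions, it helps to look at where matches start rather than what they contain. The Python sketch below records start offsets for the word "cat" in an invented two-line string; the helper starts is hypothetical.

```python
import re

text = "cat\ncat concat"  # hypothetical sample with two lines


def starts(pattern, flags=0):
    """Start offsets of every match (helper for this sketch)."""
    return [m.start() for m in re.finditer(pattern, text, flags)]


anywhere = starts(r"cat")                       # includes the "cat" in "concat"
string_start = starts(r"^cat")                  # only the very start
line_starts = starts(r"^cat", re.MULTILINE)     # start of each line
string_end = starts(r"cat$")                    # only at the very end
whole_words = starts(r"\bcat\b")                # skips "concat"

print(anywhere, string_start, line_starts, string_end, whole_words)
```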

Groups and Alternation

Parentheses create groups that can be quantified as a unit and optionally capture matched text. The pattern (abc)+ matches one or more repetitions of the sequence "abc". The pipe symbol | creates alternation, and wrapping it in a group limits its scope: (cat|dog) matches either "cat" or "dog". Non-capturing groups (?:...) provide grouping without the overhead of capturing. Named groups (?<name>...) assign a name to the captured text for easier reference in code.
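The practical effect of capturing versus non-capturing groups shows up in what a search returns. This Python sketch uses invented sample strings and a hypothetical group name animal; Python writes named groups as (?P<name>...).

```python
import re

# A quantified group: the whole sequence "abc" repeats.
repeated = re.fullmatch(r"(abc)+", "abcabcabc") is not None
partial = re.fullmatch(r"(abc)+", "abcab") is not None

# Alternation inside a capturing group: findall returns the captures.
captured = re.findall(r"(cat|dog)s", "cats dogs cows")

# Same pattern with a non-capturing group: findall returns whole matches.
whole = re.findall(r"(?:cat|dog)s", "cats dogs cows")

# Named group: look up the capture by name instead of number.
animal = re.search(r"(?P<animal>cat|dog)", "hotdog").group("animal")

print(repeated, partial, captured, whole, animal)
```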

Frequently Asked Questions

What is regex?

A sequence of characters defining a search pattern, used in programming for finding, matching, validating, and manipulating strings. Supported by virtually every programming language.

How do I read a regular expression?

Read left to right, identifying each token: literal characters, character classes (\d, \w), quantifiers (+, *, ?), anchors (^, $), and groups (parentheses). This tool automates that process.

What are capturing groups?

Portions of a pattern in parentheses that capture matched text for later reference. Non-capturing groups (?:...) group without capturing. Named groups (?<name>...) allow reference by name.

What is lazy vs greedy matching?

Greedy quantifiers (*, +) match as much as possible. Adding ? makes them lazy, matching as little as possible. Use lazy matching to find the shortest possible match.

What are lookaheads and lookbehinds?

Zero-width assertions that check for a pattern without including it in the match. (?=...) is positive lookahead, (?!...) is negative lookahead, (?<=...) is positive lookbehind, (?<!...) is negative lookbehind.
