
Regex Generator

Visual regex builder + live tester: highlight matches, capture groups, lookahead, replace preview, export valid JS/Python/PHP/Java/Ruby/Go code. No install.


Regex Generator - Visual Regular Expression Builder and Tester

Regular expressions are the universal language of pattern matching. They grew out of mathematician Stephen Kleene's work on regular sets in the 1950s and are now built into most text-processing tools: grep, sed, awk, the string library of every mainstream programming language, databases that offer regex operators, form-validation frameworks, and the find-and-replace of most code editors. (SQL's LIKE clause is not regex; it only supports simple wildcards.)

This builder removes the two main pain points of writing regex by hand: forgetting the exact syntax for character classes, quantifiers, and anchors, and never being sure what your pattern actually does until you test it against real strings. The library ships 12 ready-to-use patterns covering the most common cases (email as a pragmatic subset of RFC 5322, HTTP/HTTPS URLs, US/international phone, IPv4, ISO date, 24h time, usernames, strong passwords, hex colors, credit cards, US ZIP, HTML tags). Quick Components let you click to insert \d \w \s . ^ $ \b * + ? () [] | {n} {n,m} into your pattern at the cursor, so there is no need to memorize the syntax.

The live tester runs your pattern against the input as you type, highlights matches, shows captured groups, and produces a plain-English explanation of every component. When you're happy, the Generate Code button outputs idiomatic snippets for JavaScript, Python (re module), PHP (preg_match), Java (Pattern/Matcher), Ruby and Go (regexp package) that you can paste into your project.

What are Regular Expressions?

**Regular Expressions (Regex)** are powerful pattern-matching tools used to search, validate, and manipulate text.

**Common Uses:**

1. **Validation:**
• Email addresses
• Phone numbers
• Passwords
• URLs
• Credit cards

2. **Text Processing:**
• Search and replace
• Data extraction
• Log parsing
• Text cleaning

3. **Data Validation:**
• Form inputs
• API parameters
• File formats
• Configuration files

**Basic Syntax:**

• **Literals:** Match exact characters
- `abc` matches "abc"

• **Character Classes:**
- `\d` = digit (0-9)
- `\w` = word character (a-z, A-Z, 0-9, _)
- `\s` = whitespace
- `.` = any character except a newline (unless the `s` flag is set)

• **Quantifiers:**
- `*` = 0 or more
- `+` = 1 or more
- `?` = 0 or 1 (optional)
- `{n}` = exactly n times
- `{n,m}` = between n and m times

• **Anchors:**
- `^` = start of the string (or of each line with the `m` flag)
- `$` = end of the string (or of each line with the `m` flag)
- `\b` = word boundary

• **Groups:**
- `(...)` = capturing group
- `(?:...)` = non-capturing group

• **Character Sets:**
- `[abc]` = matches a, b, or c
- `[a-z]` = matches any lowercase letter
- `[^abc]` = matches anything except a, b, or c

**Examples:**

```regex
# Email validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

# Phone number (US)
^\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$

# URL
^https?:\/\/[\w\-]+(\.[\w\-]+)+[/#?]?.*$

# Strong password
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
```
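
As a rough illustration of how the patterns above behave in code, the sketch below runs the email and US phone patterns with JavaScript's built-in `RegExp`. The sample strings and variable names are made up for the example.

```javascript
// Patterns copied from the examples above.
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const usPhonePattern = /^\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$/;

// test() answers a yes/no question about the whole string.
const emailOk = emailPattern.test('jane.doe@example.com');
const emailBad = emailPattern.test('jane.doe@example'); // no dot + TLD

// match() on a non-global regex exposes the capture groups by index.
const phoneMatch = '(555) 123-4567'.match(usPhonePattern);
const [areaCode, prefix, lineNumber] = phoneMatch.slice(1);

console.log(emailOk, emailBad);            // true false
console.log(areaCode, prefix, lineNumber); // 555 123 4567
```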

How to use this Regex Generator?

**Step-by-Step Guide:**

**1. Choose a Starting Point:**

**Option A - Pattern Library:**
• Browse common patterns (email, URL, phone, etc.)
• Click a pattern to load it
• Modify as needed

**Option B - Build from Scratch:**
• Type pattern directly in the input
• Use Quick Components to insert syntax
• Click components to add at cursor position

**2. Set Flags:**

• **g (Global):** Find all matches, not just first
• **i (Case Insensitive):** Ignore case (A = a)
• **m (Multiline):** ^ and $ match at the start and end of each line, not just of the whole string
• **s (Dot All):** . matches newlines

**3. Test Your Pattern:**

• Enter test text in the Test Input area
• Click "Test Pattern"
• View matches highlighted in real-time
• See match count and positions
• Inspect captured groups

**4. Understand Your Pattern:**

• Click "Explain Pattern"
• See breakdown of each component
• Learn what each symbol means
• Understand pattern structure

**5. Generate Code:**

• Click "Generate Code"
• Get ready-to-use code in:
- JavaScript
- Python
- PHP
- Java
- Ruby
- Go
• Copy to your project

**Tips:**

✓ Start with a pattern library example
✓ Test with multiple input variations
✓ Use Explain to learn syntax
✓ Test edge cases
✓ Keep patterns simple when possible
✓ Use non-capturing groups (?:) for performance
✓ Escape special characters: \. \* \+ \? etc.

Common Regex Patterns and Use Cases

**Validation Patterns:**

**1. Email Address:**
```regex
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
```
• Basic email validation
• Allows common special characters
• Requires @ and domain extension

**2. URL (HTTP/HTTPS):**
```regex
^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/\/=]*)$
```
• Matches web URLs
• Optional www prefix
• Supports query strings and anchors

**3. Phone Numbers:**
```regex
# US Format
^\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})$

# International
^\+?[1-9]\d{1,14}$
```

**4. Strong Password:**
```regex
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
```
• Minimum 8 characters
• At least one uppercase letter
• At least one lowercase letter
• At least one digit
• At least one special character
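
Each `(?=...)` lookahead in the password pattern is an independent check made from the start of the string. One way to see this is to run the same checks one at a time, which also lets you report which rule failed. This is a minimal sketch; `passwordChecks` and `missingRequirements` are hypothetical names, not part of this tool.

```javascript
// The combined pattern from above.
const strongPassword =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// The same requirements, split into one small pattern per rule.
const passwordChecks = [
  ['a lowercase letter', /[a-z]/],
  ['an uppercase letter', /[A-Z]/],
  ['a digit', /\d/],
  ['a special character (@$!%*?&)', /[@$!%*?&]/],
  ['8+ characters from the allowed set', /^[A-Za-z\d@$!%*?&]{8,}$/],
];

// Returns the labels of every rule the password does not satisfy.
function missingRequirements(password) {
  return passwordChecks
    .filter(([, pattern]) => !pattern.test(password))
    .map(([label]) => label);
}

console.log(strongPassword.test('Passw0rd!'));  // true
console.log(missingRequirements('Passw0rd!'));  // []
console.log(missingRequirements('password'));
// [ 'an uppercase letter', 'a digit', 'a special character (@$!%*?&)' ]
```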

**Data Extraction:**

**5. Date Formats:**
```regex
# YYYY-MM-DD (ISO)
^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12][0-9]|3[01])$

# MM/DD/YYYY
^(?:0[1-9]|1[0-2])\/(?:0[1-9]|[12][0-9]|3[01])\/\d{4}$

# DD-MM-YYYY
^(?:0[1-9]|[12][0-9]|3[01])-(?:0[1-9]|1[0-2])-\d{4}$
```
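
These date patterns only check the shape of the text, so an impossible date such as 2025-02-31 still passes. A possible follow-up check, sketched here with JavaScript's built-in `Date`, rebuilds the date and confirms it did not roll over. The function name `isRealIsoDate` is hypothetical, and the non-capturing groups are swapped for capturing ones so the parts can be read back.

```javascript
// ISO pattern from above, with capturing groups for year, month, day.
const isoDatePattern = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;

function isRealIsoDate(text) {
  const match = isoDatePattern.exec(text);
  if (!match) return false;
  const [year, month, day] = match.slice(1).map(Number);
  // Date.UTC rolls impossible days (Feb 31) over into the next month,
  // so a mismatch after the round trip means the date does not exist.
  // Note: years 0000-0099 are remapped by Date.UTC and will be rejected.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isoDatePattern.test('2025-02-31')); // true  (shape is fine)
console.log(isRealIsoDate('2025-02-31'));       // false (no such day)
console.log(isRealIsoDate('2024-02-29'));       // true  (leap year)
```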

**6. IP Addresses:**
```regex
# IPv4
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$

# IPv6 (simplified)
^([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$
```

**7. Credit Card:**
```regex
# Visa, MasterCard, Amex, Discover
^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9][0-9])[0-9]{12})$
```
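
The card pattern only checks prefix and length. Card numbers also carry a Luhn check digit, so a regex screen is usually paired with a checksum test. The sketch below is one straightforward way to write it; `passesLuhn` is a hypothetical helper name, and 4111111111111111 is a widely used test number, not a real card.

```javascript
// Card pattern from above.
const cardPattern =
  /^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9][0-9])[0-9]{12})$/;

// Luhn checksum: walking from the right, double every second digit,
// subtract 9 from any result above 9, and require the total to end in 0.
function passesLuhn(cardNumber) {
  const digits = cardNumber.replace(/\D/g, '');
  if (digits.length === 0) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let digit = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
  }
  return sum % 10 === 0;
}

console.log(cardPattern.test('4111111111111111')); // true
console.log(passesLuhn('4111111111111111'));       // true
console.log(passesLuhn('4111111111111112'));       // false
```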

**Text Processing:**

**8. Extract Hashtags:**
```regex
#[a-zA-Z0-9_]+
```

**9. Extract Mentions:**
```regex
@[a-zA-Z0-9_]+
```
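
A short sketch of how the hashtag and mention patterns might be used for extraction in JavaScript. `matchAll` needs the `g` flag; the sample post is invented.

```javascript
const post = 'Shipping v2 today with @alice and @bob_92 #release #regex_tips';

// matchAll returns an iterator of match objects; m[0] is the matched text.
const hashtags = [...post.matchAll(/#[a-zA-Z0-9_]+/g)].map((m) => m[0]);
const mentions = [...post.matchAll(/@[a-zA-Z0-9_]+/g)].map((m) => m[0]);

console.log(hashtags); // [ '#release', '#regex_tips' ]
console.log(mentions); // [ '@alice', '@bob_92' ]
```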

**10. HTML Tags:**
```regex
<([a-z]+)([^<>]*)(?:>(.*)<\/\1>|\s+\/>)
```

**11. Remove Extra Whitespace:**
```regex
\s+
# Replace with single space
```
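
In JavaScript the whitespace cleanup could look like this (the `g` flag is needed so every run is replaced, and `trim()` drops the leftover space at each end):

```javascript
const messy = '  too   many\n\tspaces  ';

// Collapse each run of whitespace to one space, then trim the ends.
const cleaned = messy.replace(/\s+/g, ' ').trim();

console.log(cleaned); // 'too many spaces'
```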

**Code Patterns:**

**12. JavaScript Variables:**
```regex
(var|let|const)\s+([a-zA-Z_$][a-zA-Z0-9_$]*)\s*=
```

**13. Hex Color Codes:**
```regex
^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$
```

**14. File Extensions:**
```regex
\.(jpg|jpeg|png|gif|pdf|doc|docx)$
```

**Tips:**

• Test patterns with edge cases
• Use online regex testers for debugging
• Consider performance for large texts
• Simpler is often better
• Document complex patterns

Regex Flags Explained

**Regular Expression Flags:**

Flags modify how the regex engine processes patterns.

**1. Global Flag (g):**

```javascript
const text = 'cat bat rat';
const regex1 = /at/; // No global
const regex2 = /at/g; // Global

text.match(regex1); // ['at'] - first match only
text.match(regex2); // ['at', 'at', 'at'] - all matches
```

**Use when:**
• Finding all occurrences
• Replace all instances
• Counting matches

**2. Case Insensitive Flag (i):**

```javascript
const regex1 = /hello/; // Case sensitive
const regex2 = /hello/i; // Case insensitive

regex1.test('Hello'); // false
regex2.test('Hello'); // true
regex2.test('HELLO'); // true
```

**Use when:**
• User input validation
• Case-insensitive search
• Flexible matching

**3. Multiline Flag (m):**

```javascript
const text = `line 1
line 2
line 3`;

const regex1 = /^line/g; // ^ matches only at the start of the string
const regex2 = /^line/gm; // ^ matches at the start of each line

text.match(regex1); // ['line'] - one match
text.match(regex2); // ['line', 'line', 'line'] - three matches
```

**Effect:**
• `^` matches start of each line (not just string)
• `$` matches end of each line (not just string)

**Use when:**
• Processing multi-line text
• Line-by-line validation
• Log file parsing

**4. Dot All Flag (s):**

```javascript
const text = 'hello\nworld';

const regex1 = /hello.world/; // . doesn't match \n
const regex2 = /hello.world/s; // . matches everything

regex1.test(text); // false
regex2.test(text); // true
```

**Effect:**
• `.` matches newline characters
• Normally `.` matches anything except line terminators such as `\n`

**Use when:**
• Matching across lines
• HTML/XML parsing
• Multi-line content extraction

**Flag Combinations:**

```javascript
// Multiple flags together
const regex = /pattern/gim;
// g = global
// i = case insensitive
// m = multiline

// Common combinations:
/email/gi // Find every occurrence of "email", any case
/^error/gim // Find all lines starting with "error", any case
/start.*?end/gs // Match from "start" to the nearest "end", even across lines
```

**Language-Specific Flags:**

**JavaScript:**
```javascript
/pattern/gimsuy
// y = sticky (matches from lastIndex)
// u = unicode (proper unicode handling)
```

**Python:**
```python
import re
re.IGNORECASE # i
re.MULTILINE # m
re.DOTALL # s
re.VERBOSE # x - ignore whitespace, allow comments
```

**PHP:**
```php
/pattern/imsx
// x = extended (ignore whitespace)
// PHP has no g modifier: use preg_match_all() to find every match
```

**Best Practices:**

✓ Only use flags you need
✓ Global flag for replace/count operations
✓ Case insensitive for user input
✓ Multiline for text processing
✓ Test with and without flags
✓ Document flag usage in code comments

Why does my regex match too much or too little? (Greedy vs Lazy)

By default, every quantifier (* + ? {n,m}) in a regex is GREEDY: it grabs as many characters as it can while still allowing the rest of the pattern to match. The classic trap: against the input '<b>foo</b><i>bar</i>', the pattern '<.*>' returns the whole string '<b>foo</b><i>bar</i>' instead of just '<b>'. The fix is to make the quantifier LAZY by appending ?: '<.*?>' returns '<b>' first, then '</b>', then '<i>', then '</i>' separately. Rule of thumb: when the next thing in your pattern is a literal character (like the closing >), use lazy quantifiers. When you want to consume everything up to the END of input, use greedy. Better still, replace .* with a negated character class: '<[^>]*>' is usually faster than '<.*?>' because the engine never has to reconsider how far the quantifier should extend. Unbounded .* is also a frequent ingredient in 'catastrophic backtracking', which can hang a regex engine for seconds on a 100-character input.
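
The three variants can be compared directly in JavaScript, using the same input as the paragraph above:

```javascript
const html = '<b>foo</b><i>bar</i>';

const greedy = html.match(/<.*>/g);     // one match spanning the whole string
const lazy = html.match(/<.*?>/g);      // each tag separately
const negated = html.match(/<[^>]*>/g); // same result as lazy

console.log(greedy);  // [ '<b>foo</b><i>bar</i>' ]
console.log(lazy);    // [ '<b>', '</b>', '<i>', '</i>' ]
console.log(negated); // [ '<b>', '</b>', '<i>', '</i>' ]
```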

What's the difference between capturing, non-capturing, and named groups?

Parentheses in regex serve three different purposes that beginners often conflate.

(1) Capturing group: (\d{4}) matches AND saves the matched text for later retrieval via $1 in a replacement, \1 inside the pattern, or match[1] in code. Every set of plain parens creates a new numbered group.

(2) Non-capturing group: (?:\d{4}) matches but does NOT save the result. Use it when you only need parentheses for grouping (e.g. to apply a quantifier to multiple characters, or for alternation like (?:cat|dog)s?). Skipping the capture has two benefits: cleaner output (your match array isn't cluttered with values you'll never use) and slightly faster execution.

(3) Named group: (?<year>\d{4}) captures and gives the group a meaningful name. Access via match.groups.year (JS), m.group('year') (Python) or m['year'] (Ruby). Use named groups whenever your regex has 3+ captures: '$1', '$2', '$3' get unreadable fast, but '${year}-${month}-${day}' is self-documenting. The (?<name>...) syntax is supported in modern JavaScript (ES2018+), Perl, PHP, Ruby, Java and .NET. Python's re module requires the (?P<year>\d{4}) spelling, which Go also accepts. Older environments fall back to numbered groups.
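
A small JavaScript sketch of the three kinds of group side by side (sample text invented for the example):

```javascript
const sample = 'Released on 2025-01-15.';

// Numbered capture groups: read by position.
const numbered = /(\d{4})-(\d{2})-(\d{2})/.exec(sample);
console.log(numbered[1], numbered[2], numbered[3]); // 2025 01 15

// Named capture groups: read by name via match.groups.
const named = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/.exec(sample);
const { year, month, day } = named.groups;
const label = `${day}/${month}/${year}`;
console.log(label); // 15/01/2025

// Non-capturing group: groups the alternation but adds no entry
// to the match array beyond the whole match at index 0.
const pet = /(?:cat|dog)s?/.exec('two dogs');
console.log(pet[0], pet.length); // dogs 1
```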


Why does the email pattern in the library not match every valid email address?

Because a regex that implements the full RFC 5322 address grammar runs to thousands of characters and is impractical to use. The library pattern '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' is a widely used pragmatic compromise: it accepts the vast majority of real-world addresses while rejecting obviously malformed ones (no @, no domain, no TLD). What it misses: legitimate but rare forms like quoted local parts ("John Doe"@example.com), IP-literal domains (user@[192.168.1.1]), and internationalized domain names (user@münchen.de), including Punycode TLDs such as xn--p1ai, which contain digits and hyphens. What it allows that's technically invalid: a local part with a leading dot or with consecutive dots. For production use, the only truly safe email validation is to send a confirmation email; any regex is a screening step, not a guarantee. Use the pattern to reject typos; verify ownership via a click-to-confirm link.

How do I use the Replace / Substitution preview, and how do backreferences differ per language?

Type a pattern, add test text, then fill the **Replacement (optional)** field and press **Test Pattern**. The Replacement Preview shows your test string with every match substituted, using JavaScript's native String.replace under the current flags — so turn on the **g** flag to replace all occurrences (without it only the first match is replaced).

**Inserting captured groups:**

• Numbered: `$1`, `$2`, `$3` … reference capture groups in order.
• Named: `$<name>` references a named group `(?<name>...)`.
• `$&` inserts the whole match; `` $` `` and `$'` insert text before/after the match.

**Example:** with the **g** flag on, pattern `(\w+)@(\w+)` with replacement `$2-$1` turns `a@b c@d` into `b-a d-c`.

**Per-language replace syntax** (the Generate Code button emits all of these for you):

• **JavaScript:** `text.replace(/pat/g, '$1')` — `$1`, `$<name>`.
• **Python:** `re.sub(r'pat', r'\1', text)` — numbered `\1`, named `\g<name>`.
• **PHP:** `preg_replace('#pat#', '$1', $text)` uses `$1` or `${1}`; named groups cannot be referenced in the replacement string, so refer to them by number.
• **Java:** `matcher.replaceAll("$1")` — `$1`, named `${name}`.
• **Ruby:** `text.gsub(/pat/, '\1')` — numbered `\1`, named `\k<name>`.
• **Go:** `re.ReplaceAllString(text, "$1")` — `$1` or `${1}`, named `${name}`.

Note that `$` (JS/Java/Go/PHP) becomes `\` (Python/Ruby) for numbered groups — the generated snippets convert this automatically.
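
For reference, here is how the replacement tokens described above behave with JavaScript's `String.prototype.replace`, using the same sample input:

```javascript
const input = 'a@b c@d';

// Without g only the first match is replaced.
const firstOnly = input.replace(/(\w+)@(\w+)/, '$2-$1');
// With g every match is replaced.
const all = input.replace(/(\w+)@(\w+)/g, '$2-$1');
// Named groups are referenced as $<name>.
const byName = input.replace(/(?<user>\w+)@(?<host>\w+)/g, '$<host>-$<user>');
// $& inserts the whole match.
const wrapped = input.replace(/\w+@\w+/g, '[$&]');

console.log(firstOnly); // 'b-a c@d'
console.log(all);       // 'b-a d-c'
console.log(byName);    // 'b-a d-c'
console.log(wrapped);   // '[a@b] [c@d]'
```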

Why do the generated snippets drop the 'g' flag and change some flags between languages?

JavaScript flags do not map 1:1 onto other regex engines, and copying them literally produces broken or silently-wrong code. The generator normalizes them per flavor:

• **'g' (global)** is a *JavaScript-only* flag. In Python, PHP, Java, Ruby and Go, 'match all vs match one' is decided by the **function you call** (`re.findall` vs `re.search`, `preg_match_all` vs `preg_match`, `gsub` vs `sub`, `FindAll...` vs `Find...`), not by a flag — so the snippets omit 'g' and pick the right call instead.

• **Ruby flag remapping:** Ruby's `m` means DOTALL (dot matches newline), NOT multiline, and `s` is not a regex modifier at all (it is an encoding suffix). So JavaScript `s` (dotAll) maps to Ruby `m`, JavaScript `m` (multiline) is dropped because Ruby's `^`/`$` are *always* line-anchored, and `i` stays `i`. Passing JS flags straight through would silently break behavior or raise a syntax error.

• **Python / Java** use named constants instead of letters: `re.IGNORECASE | re.MULTILINE | re.DOTALL`, `Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL`.

• **PHP delimiters:** the snippet uses `#...#` instead of `/.../` so a forward slash in your pattern does not terminate the delimiter, and literal `#` is auto-escaped.

• **Engine differences:** Go's `regexp` package uses **RE2**, which guarantees linear-time matching but *does not support backreferences or lookaround*. Patterns using `\1`, `(?=...)` or `(?<=...)` will compile in JS/PCRE but fail to compile in Go — test there before shipping.

• **JS-only flags:** the sticky `y` and unicode `u`/`v` flags have no portable equivalent and are not emitted for other languages.
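
To make the Ruby remapping concrete, here is a minimal sketch of the rule as described above. `toRubyFlags` is a hypothetical helper written for illustration; it is not the generator's actual code.

```javascript
// Convert a JavaScript flag string into Ruby regex modifiers,
// following the mapping described in the list above.
function toRubyFlags(jsFlags) {
  let rubyFlags = '';
  for (const flag of jsFlags) {
    if (flag === 'i') {
      rubyFlags += 'i'; // same meaning in both languages
    } else if (flag === 's') {
      rubyFlags += 'm'; // JS dotAll corresponds to Ruby's m
    }
    // g, m, y, u and v are dropped: Ruby has no equivalent flag
    // (g is chosen by method, and ^/$ are already line-anchored).
  }
  return rubyFlags;
}

console.log(toRubyFlags('gims')); // 'im'
console.log(toRubyFlags('gm'));   // ''
```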

Are these generated patterns safe to use for server-side input validation (ReDoS)?

Treat any regex you paste into production input validation as a potential denial-of-service vector. **Catastrophic backtracking (ReDoS)** happens when a backtracking engine (JavaScript, Python `re`, Java, PCRE/PHP, Ruby) explores exponentially many paths on a crafted input, hanging a request thread for seconds or minutes on a short string.

**Danger signs in a pattern:**

• Nested quantifiers: `(a+)+`, `(a*)*`, `(.*)*`.
• Overlapping alternation under a quantifier: `(a|a)+`, `(\d+|\w+)+`.
• Adjacent ambiguous quantifiers: `\s*.*\s*$` on attacker-controlled input.

**Mitigations:**

1. Replace `.*`/`.+` with a **negated character class** like `[^>]*` so the engine cannot backtrack across the delimiter.
2. **Anchor** with `^` and `$`, and prefer possessive quantifiers / atomic groups `(?>...)` where supported (PCRE, Java, Ruby) to forbid backtracking.
3. Set an **execution timeout** on the match, or run untrusted matching in a worker/sandbox you can kill.
4. For purely linear-time guarantees, use **RE2** (Go's `regexp`, or `re2` bindings) — it is immune to ReDoS by design, at the cost of no backreferences/lookaround.
5. Keep the regex a *first-pass screen*; do authoritative validation (e.g. email ownership, checksum) separately.

Test every pattern against deliberately adversarial inputs (long repeated runs, near-misses) before deploying it on user-facing endpoints.
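
As a starting point for such a review, the nested-quantifier danger sign can be flagged with a rough lint. This is a heuristic sketch only, with a hypothetical helper name; it ignores escapes and nested groups, so it will produce both false positives and misses and is no substitute for adversarial testing.

```javascript
// Heuristic: a parenthesized group that contains + or * and is
// itself followed by + or *, e.g. (a+)+ or (.*)*.
const nestedQuantifier = /\([^()]*[+*][^()]*\)[+*]/;

// Hypothetical helper: takes the pattern source as a string.
function looksRisky(patternSource) {
  return nestedQuantifier.test(patternSource);
}

console.log(looksRisky('(a+)+'));         // true
console.log(looksRisky('(\\d+|\\w+)+'));  // true
console.log(looksRisky('(?:cat|dog)s?')); // false
console.log(looksRisky('<[^>]*>'));       // false
```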

Advanced Regex Techniques

**1. Lookahead and Lookbehind:**

**Positive Lookahead (?=...)**
```regex
# Match 'foo' only if followed by 'bar'
foo(?=bar)

# Matches: 'foo' in 'foobar'
# Doesn't match: 'foo' in 'foobaz'
```

**Negative Lookahead (?!...)**
```regex
# Match 'foo' only if NOT followed by 'bar'
foo(?!bar)

# Matches: 'foo' in 'foobaz'
# Doesn't match: 'foo' in 'foobar'
```

**Positive Lookbehind (?<=...)**
```regex
# Match 'bar' only if preceded by 'foo'
(?<=foo)bar

# Matches: 'bar' in 'foobar'
# Doesn't match: 'bar' in 'bazbar'
```

**Negative Lookbehind (?<!...)**
```regex
# Match 'bar' only if NOT preceded by 'foo'
(?<!foo)bar

# Matches: 'bar' in 'bazbar'
# Doesn't match: 'bar' in 'foobar'
```
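
The four lookarounds can be checked against one sample string in JavaScript (lookbehind requires an ES2018+ engine). Each line collects the index where the match starts:

```javascript
const text = 'foobar foobaz bazbar';

// Helper for this example: start index of every match.
const indexes = (regex) => [...text.matchAll(regex)].map((m) => m.index);

const fooBeforeBar = indexes(/foo(?=bar)/g);    // 'foo' in 'foobar'
const fooNotBeforeBar = indexes(/foo(?!bar)/g); // 'foo' in 'foobaz'
const barAfterFoo = indexes(/(?<=foo)bar/g);    // 'bar' in 'foobar'
const barNotAfterFoo = indexes(/(?<!foo)bar/g); // 'bar' in 'bazbar'

console.log(fooBeforeBar, fooNotBeforeBar, barAfterFoo, barNotAfterFoo);
// [ 0 ] [ 7 ] [ 3 ] [ 17 ]
```

The lookaround text is never part of the match itself, which is why the first two results point at `foo` and the last two at `bar`.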

**2. Capturing Groups:**

```regex
# Extract parts of a date
(\d{4})-(\d{2})-(\d{2})

# Match: '2025-01-15'
# Group 1: '2025' (year)
# Group 2: '01' (month)
# Group 3: '15' (day)
```

**Named Capturing Groups:**
```regex
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})

# Access by name in code:
# match.groups.year
# match.groups.month
# match.groups.day
```

**Non-Capturing Groups (?:...)**
```regex
# Group for alternation, but don't capture
(?:cat|dog)s?

# Matches: cat, cats, dog, dogs
# Doesn't create capture group
# Better performance
```

**3. Backreferences:**

```regex
# Match repeated words
\b(\w+)\s+\1\b

# Matches: 'the the', 'hello hello'
# \1 refers to first captured group
```
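
A possible use of the repeated-word pattern in JavaScript: list the duplicated words, then collapse each pair by replacing it with the captured word. The sample sentence is invented.

```javascript
const sentence = 'This is is a test of the the detector.';
const repeatedWord = /\b(\w+)\s+\1\b/g;

// m[1] is the word captured by the first group.
const repeated = [...sentence.matchAll(repeatedWord)].map((m) => m[1]);
// $1 in the replacement keeps a single copy of the word.
const fixed = sentence.replace(repeatedWord, '$1');

console.log(repeated); // [ 'is', 'the' ]
console.log(fixed);    // 'This is a test of the detector.'
```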

**HTML Tag Matching:**
```regex
# Match opening and closing tags
<([a-z]+)>.*?<\/\1>

# Matches: <div>content</div>
# Doesn't match: <div>content</span>
```

**4. Greedy vs Lazy Quantifiers:**

**Greedy (default):**
```regex
# Matches as much as possible
<.*>

# In: '<div>Hello</div> <span>World</span>'
# Matches: '<div>Hello</div> <span>World</span>' (entire string)
```

**Lazy (add ?):**
```regex
# Matches as little as possible
<.*?>

# In: '<div>Hello</div> <span>World</span>'
# Matches: '<div>', '</div>', '<span>', '</span>' (separately)
```

**Lazy quantifiers:**
• `*?` = 0 or more (lazy)
• `+?` = 1 or more (lazy)
• `??` = 0 or 1 (lazy)
• `{n,m}?` = n to m (lazy)

**5. Atomic Groups (?>...)**

```regex
# Prevent backtracking
(?>\d+)\.

# More efficient for long numbers
# Doesn't retry if decimal point doesn't match
# Not available in JavaScript; supported in PCRE, Java, Ruby and .NET
```

**6. Conditional Patterns:**

```regex
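# If group 1 matched, use pattern A, else pattern B
# (anchored so that a partial match cannot slip through)
^(\()?[^()]+(?(1)\))$

# Matches: 'hello' or '(hello)'
# Doesn't match: 'hello)' or '(hello'
# Conditionals work in PCRE, Python and .NET, but not in JavaScript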
```

**7. Unicode Properties:**

```regex
# Match any letter (any language); JavaScript needs the u flag for \p{...}
\p{L}+

# Match emoji (note: \p{Emoji} also matches digits, # and *)
\p{Emoji}

# Match currency symbols
\p{Sc}
```
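
In JavaScript, property escapes only work when the `u` flag is set. A brief sketch with invented sample strings:

```javascript
// \p{L}+ matches runs of letters in any script.
const words = 'hello Москва 東京'.match(/\p{L}+/gu);

// \p{Sc} matches currency symbols.
const currencies = '€5 $10 ¥300'.match(/\p{Sc}/gu);

console.log(words);      // [ 'hello', 'Москва', '東京' ]
console.log(currencies); // [ '€', '$', '¥' ]
```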

**Performance Tips:**

✓ Use non-capturing groups (?:) when you don't need the capture
✓ Be specific (avoid .* when possible)
✓ Use atomic groups to prevent backtracking
✓ Anchor patterns when possible (^, $, \b)
✓ Test with large inputs
✓ Avoid nested quantifiers
✓ Use character classes [abc] instead of (a|b|c)

**Common Pitfalls:**

❌ Catastrophic backtracking: `(a+)+b`
❌ Overly greedy: `.*` matching too much
❌ Missing anchors: partial matches
❌ Forgetting to escape: `. * + ? [ ] ( ) { } ^ $ | \`
❌ Not testing edge cases

Key Features

  • Visual pattern builder interface
  • Comprehensive pattern library
  • Common regex patterns (email, URL, phone, etc.)
  • Quick component insertion
  • Real-time pattern testing
  • Match highlighting and visualization
  • Captured groups display
  • Match count and positions
  • Pattern explanation generator
  • Component-by-component breakdown
  • Regex flags support (g, i, m, s)
  • Multi-language code generation
  • JavaScript code output
  • Python code output
  • PHP code output
  • Java code output
  • Ruby code output
  • Go code output
  • Syntax highlighting
  • Copy to clipboard
  • Download as code file
  • No data sent to server
  • Works offline
  • Mobile-friendly interface
  • Dark mode support
  • 100% free
  • No registration required