Regular Expression Matching
Description
Given an input string s and a pattern p, implement regular expression matching that supports two special characters:
'.'— Matches any single character. For example, the pattern'.'can match'a','b','z', or any other single character.'*'— Matches zero or more of the preceding element. The'*'does not stand alone — it always refers to the character directly before it. For example,'a*'can match""(empty — zeroas),"a"(onea),"aa"(twoas),"aaa"(threeas), and so on. Similarly,'.*'can match any sequence of characters (including an empty sequence), because'.'matches any character and'*'repeats it zero or more times.
The matching must cover the entire input string, not just a substring. The pattern must account for every character in s from start to finish.
In simpler terms: You are building a mini regex engine. Given a text string and a pattern with wildcards, determine whether the pattern describes the entire text exactly.
Examples
Example 1
Input: s = "aa", p = "a"
Output: false
Explanation: The pattern "a" matches only a single 'a' character. The string "aa" has two characters, so the pattern cannot cover the entire string. There is no '*' in the pattern to allow repetition.
Example 2
Input: s = "aa", p = "a*"
Output: true
Explanation: The '*' means zero or more of the preceding element 'a'. By repeating 'a' twice, the pattern "a*" produces "aa", which matches the entire input string.
Example 3
Input: s = "ab", p = ".*"
Output: true
Explanation: The pattern ".*" means zero or more of any character ('.'). Since '.' matches any character, ".*" can match any string — including "ab". It effectively matches 'a' first (using .), then 'b' (using . again), with the '*' allowing two repetitions of ..
Example 4
Input: s = "aab", p = "cab"
Output: true
Explanation: The pattern "c*a*b" works as follows: "c*" matches zero cs (skipping it entirely), "a*" matches two as, and "b" matches the final b. So the pattern describes "" + "aa" + "b" = "aab", which matches the input.
Constraints
- 1 ≤ s.length ≤ 20
- 1 ≤ p.length ≤ 20
scontains only lowercase English letterspcontains only lowercase English letters,'.', and'*'- It is guaranteed that for each appearance of
'*', there will be a previous valid character to match
Editorial
Brute Force
Intuition
The most natural way to think about this problem is recursion. At each point, we look at the current character in the string and the current character in the pattern, and decide what to do next.
Imagine you are reading a book (the string) and following a set of instructions (the pattern) simultaneously. At each step, you check: does my current instruction match the current word in the book?
There are three scenarios at each position:
-
No
'*'ahead: The current pattern character must match the current string character (either they are the same letter, or the pattern has'.'). If they match, move forward in both the string and pattern. -
'*'is the next pattern character: This is the tricky case. The'*'gives us a choice:- Use zero occurrences: Skip the current pattern character and its
'*'entirely (advance pattern by 2 positions). This is like saying "this part of the pattern matches nothing." - Use one or more occurrences: If the current characters match, consume one character from the string but keep the pattern position the same (so
'*'can match again).
- Use zero occurrences: Skip the current pattern character and its
-
Base case: If we have exhausted the pattern, the match succeeds only if we have also exhausted the string.
Without any optimization, this recursive approach explores every possible combination of choices at each '*', leading to an exponential number of recursive calls.
Step-by-Step Explanation
Let's trace through matching s = "aab", p = "cab":
Step 1: Call match(i=0, j=0). s[0]='a', p[0]='c'. Next pattern char p[1]='*', so we have the star case.
- Option A: Skip "c*" → call match(i=0, j=2)
- Option B: Try to match 'c' with 'a' → 'a' ≠ 'c', so this branch fails.
- We proceed with Option A.
Step 2: Call match(i=0, j=2). s[0]='a', p[2]='a'. Next pattern char p[3]='*', so we have another star case.
- Option A: Skip "a*" → call match(i=0, j=4)
- Option B: Match 'a' with 'a' → they match! → call match(i=1, j=2)
- We need to try both.
Step 3 (Option A path): Call match(i=0, j=4). s[0]='a', p[4]='b'. No star follows.
- Does 'a' match 'b'? No. Return false.
Step 4 (Option B path): Call match(i=1, j=2). s[1]='a', p[2]='a'. Next pattern char p[3]='*'.
- Option B1: Skip "a*" → call match(i=1, j=4)
- Option B2: Match 'a' with 'a' → match! → call match(i=2, j=2)
Step 5 (B1 path): Call match(i=1, j=4). s[1]='a', p[4]='b'. No star.
- 'a' ≠ 'b'. Return false.
Step 6 (B2 path): Call match(i=2, j=2). s[2]='b', p[2]='a'. Next pattern char p[3]='*'.
- Option: Skip "a*" → call match(i=2, j=4)
- Option: Match 'a' with 'b' → 'b' ≠ 'a', fails.
Step 7: Call match(i=2, j=4). s[2]='b', p[4]='b'. No star.
- 'b' == 'b'. They match! → call match(i=3, j=5)
Step 8: Call match(i=3, j=5). i=3 = len(s), j=5 = len(p). Both exhausted.
- Return true!
The successful path: skip "c*" → match two 'a's with "a*" → match 'b' with 'b'.
Recursive Matching: s = "aab", p = "c*a*b" — Watch the recursive exploration tree as the algorithm tries different ways to match the string against the pattern, branching at every '*' character.
Algorithm
- Define a recursive function match(i, j) that checks if s[i:] matches p[j:]
- Base case: if j reaches the end of pattern p, return true only if i also reaches the end of s
- If j+1 is within bounds and p[j+1] == '*':
- Try skipping: call match(i, j+2) — zero occurrences of p[j]
- Try using: if s[i] matches p[j] (same char or p[j] is '.'), call match(i+1, j) — consume one character, keep pattern
- Return true if either option succeeds
- Otherwise (no star): if s[i] matches p[j], return match(i+1, j+1)
- If characters don't match, return false
Code
class Solution {
public:
bool isMatch(string s, string p) {
return match(s, p, 0, 0);
}
private:
bool match(const string& s, const string& p, int i, int j) {
// Base case: pattern exhausted
if (j == p.size()) {
return i == s.size();
}
// Check if current characters match
bool first_match = (i < s.size()) &&
(s[i] == p[j] || p[j] == '.');
// Star case: next pattern char is '*'
if (j + 1 < p.size() && p[j + 1] == '*') {
// Skip (zero occurrences) OR use (one occurrence + repeat)
return match(s, p, i, j + 2) ||
(first_match && match(s, p, i + 1, j));
}
// Regular case: must match and advance both
return first_match && match(s, p, i + 1, j + 1);
}
};class Solution:
def isMatch(self, s: str, p: str) -> bool:
def match(i: int, j: int) -> bool:
# Base case: pattern exhausted
if j == len(p):
return i == len(s)
# Check if current characters match
first_match = (i < len(s)) and (s[i] == p[j] or p[j] == '.')
# Star case
if j + 1 < len(p) and p[j + 1] == '*':
# Skip (zero) OR use (one + repeat)
return match(i, j + 2) or (first_match and match(i + 1, j))
# Regular case
return first_match and match(i + 1, j + 1)
return match(0, 0)class Solution {
public boolean isMatch(String s, String p) {
return match(s, p, 0, 0);
}
private boolean match(String s, String p, int i, int j) {
// Base case: pattern exhausted
if (j == p.length()) {
return i == s.length();
}
// Check if current characters match
boolean firstMatch = (i < s.length()) &&
(s.charAt(i) == p.charAt(j) || p.charAt(j) == '.');
// Star case
if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
return match(s, p, i, j + 2) ||
(firstMatch && match(s, p, i + 1, j));
}
// Regular case
return firstMatch && match(s, p, i + 1, j + 1);
}
}Complexity Analysis
Time Complexity: O((m + n) × 2^(m + n/2))
In the worst case, every '*' in the pattern creates a branching point where we explore two paths. If the pattern has many '*' characters, the recursion tree grows exponentially. For a pattern like "a*a*a*a*...b" matched against "aaaa...a", the algorithm explores a vast number of combinations. The exact bound depends on the pattern structure, but it can be exponential in the combined length of s and p.
Space Complexity: O(m + n)
The recursion depth is bounded by m + n in the worst case (at each step we advance at least one of i or j). Each recursive call uses constant stack space, so the total stack space is O(m + n).
Why This Approach Is Not Efficient
The pure recursive approach has exponential time complexity because it recomputes the same subproblems repeatedly.
Consider matching "aaa" with pattern "a*a*a*". When the recursion branches at each '*', different branches eventually ask the same question: "does s[2:] match p[4:]?" But without remembering the answer, we compute it from scratch every time.
The number of unique subproblems is actually small — at most (m+1) × (n+1) pairs of (i, j) positions. But the brute force explores the same pairs through different recursive paths, wasting enormous effort.
The key insight: If we store (memoize) the result of each (i, j) pair the first time we compute it, we can look it up in O(1) time for all subsequent calls. This transforms the exponential recursion into a polynomial-time dynamic programming solution. Alternatively, we can build a bottom-up DP table that fills in answers for all (i, j) pairs systematically.
Optimal Approach - Dynamic Programming
Intuition
The recursive solution already contains the right logic — we just need to avoid redundant computation. Dynamic programming achieves this by building a 2D table dp[i][j] where each cell answers: "does the substring s[i:] match the pattern p[j:]?"
Think of it like filling in a crossword puzzle from the bottom-right corner upward. Each cell's answer depends on cells below and to the right (smaller subproblems). By the time we reach dp[0][0], we have our final answer.
We fill the table bottom-up:
- Last row and column are base cases (what happens when the string or pattern is empty)
- dp[m][n] = true because empty string matches empty pattern
- dp[m][j] (empty string, remaining pattern): true only if the remaining pattern can match empty string (e.g.,
"a*b*"can match empty, but"a*b"cannot) - dp[i][j] follows the same logic as our recursion: check for star, decide to skip or use
The beauty of bottom-up DP is that it avoids recursion overhead entirely and guarantees we compute each subproblem exactly once.
Step-by-Step Explanation
Let's trace with s = "aab" (m=3), p = "cab" (n=5).
We build a (m+1) × (n+1) = 4 × 6 table. dp[i][j] = "does s[i:] match p[j:]?"
Step 1: Base cases
- dp[3][5] = true (both empty)
- dp[3][4]: p[4:]="b", s[3:]="" → 'b' can't match empty → false
- dp[3][3]: p[3:]="*b", but * applies to p[2]='a', so check dp[3][2]
- dp[3][2]: p[2:]="ab", 'a' can match empty → check dp[3][4] = false → false
- dp[3][0]: p[0:]="cab", 'c*' can match empty → check dp[3][2] = false → false
Step 2: Fill row i=2 (s[2:]="b")
- dp[2][5] = false (pattern empty, string not)
- dp[2][4]: s[2:]="b", p[4:]="b" → 'b'=='b' and dp[3][5]=true → true
- dp[2][3]: p[3]='*' → this is handled as part of j=2
- dp[2][2]: p[2]='a', p[3]='*' → skip: dp[2][4]=true → true (no need to check use, skip already succeeds)
- dp[2][1]: p[1]='*' → handled as part of j=0
- dp[2][0]: p[0]='c', p[1]='*' → skip: dp[2][2]=true → true
Step 3: Fill row i=1 (s[1:]="ab")
- dp[1][4]: s[1:]="ab", p[4:]="b" → 'a'≠'b' → false
- dp[1][2]: p[2]='a', p[3]='*' → skip: dp[1][4]=false, use: 'a'=='a' and dp[2][2]=true → true
- dp[1][0]: p[0]='c', p[1]='*' → skip: dp[1][2]=true → true
Step 4: Fill row i=0 (s[0:]="aab")
- dp[0][4]: s[0:]="aab", p[4:]="b" → 'a'≠'b' → false
- dp[0][2]: p[2]='a', p[3]='*' → skip: dp[0][4]=false, use: 'a'=='a' and dp[1][2]=true → true
- dp[0][0]: p[0]='c', p[1]='*' → skip: dp[0][2]=true → true
Result: dp[0][0] = true. The pattern "cab" matches "aab".
DP Table for s = "aab", p = "c*a*b" — Watch how the DP table fills from bottom-right to top-left, with each cell answering whether the remaining string matches the remaining pattern.
Algorithm
- Create a 2D boolean table dp of size (m+1) × (n+1), initialized to false
- Set base case: dp[m][n] = true (empty string matches empty pattern)
- Fill base row (i = m): for j from n-2 down to 0, if p[j+1] == '*', then dp[m][j] = dp[m][j+2] (star can match zero occurrences)
- For i from m-1 down to 0, for j from n-1 down to 0:
a. Compute first_match = (s[i] == p[j] or p[j] == '.')
b. If j+1 < n and p[j+1] == '*':- dp[i][j] = dp[i][j+2] (skip) OR (first_match AND dp[i+1][j]) (use)
c. Else: dp[i][j] = first_match AND dp[i+1][j+1]
- dp[i][j] = dp[i][j+2] (skip) OR (first_match AND dp[i+1][j]) (use)
- Return dp[0][0]
Code
class Solution {
public:
bool isMatch(string s, string p) {
int m = s.size(), n = p.size();
// dp[i][j] = does s[i:] match p[j:]?
vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));
// Base case: both empty
dp[m][n] = true;
// Fill the table bottom-up
for (int i = m; i >= 0; i--) {
for (int j = n - 1; j >= 0; j--) {
bool first_match = (i < m) &&
(s[i] == p[j] || p[j] == '.');
if (j + 1 < n && p[j + 1] == '*') {
// Star case: skip or use
dp[i][j] = dp[i][j + 2] ||
(first_match && dp[i + 1][j]);
} else {
// Regular case
dp[i][j] = first_match && dp[i + 1][j + 1];
}
}
}
return dp[0][0];
}
};class Solution:
def isMatch(self, s: str, p: str) -> bool:
m, n = len(s), len(p)
# dp[i][j] = does s[i:] match p[j:]?
dp = [[False] * (n + 1) for _ in range(m + 1)]
# Base case: both empty
dp[m][n] = True
# Fill table bottom-up
for i in range(m, -1, -1):
for j in range(n - 1, -1, -1):
first_match = (i < m) and (s[i] == p[j] or p[j] == '.')
if j + 1 < n and p[j + 1] == '*':
# Star case: skip or use
dp[i][j] = dp[i][j + 2] or (first_match and dp[i + 1][j])
else:
# Regular case
dp[i][j] = first_match and dp[i + 1][j + 1]
return dp[0][0]class Solution {
public boolean isMatch(String s, String p) {
int m = s.length(), n = p.length();
// dp[i][j] = does s[i:] match p[j:]?
boolean[][] dp = new boolean[m + 1][n + 1];
// Base case: both empty
dp[m][n] = true;
// Fill table bottom-up
for (int i = m; i >= 0; i--) {
for (int j = n - 1; j >= 0; j--) {
boolean firstMatch = (i < m) &&
(s.charAt(i) == p.charAt(j) || p.charAt(j) == '.');
if (j + 1 < n && p.charAt(j + 1) == '*') {
// Star case: skip or use
dp[i][j] = dp[i][j + 2] ||
(firstMatch && dp[i + 1][j]);
} else {
// Regular case
dp[i][j] = firstMatch && dp[i + 1][j + 1];
}
}
}
return dp[0][0];
}
}Complexity Analysis
Time Complexity: O(m × n)
We fill a 2D table of size (m+1) × (n+1). Each cell requires O(1) work — a constant number of comparisons and lookups into already-computed cells. Total operations: (m+1) × (n+1) ≈ O(m × n). With constraints m, n ≤ 20, this is at most ~441 operations — extremely fast.
Space Complexity: O(m × n)
The DP table itself requires (m+1) × (n+1) boolean values. This could be optimized to O(n) by keeping only two rows at a time (since each row only depends on the row below it), but with the small constraint sizes, this optimization is unnecessary.