Skip to main content

Edit Distance

MEDIUMProblemSolveExternal Links

Description

Given two strings word1 and word2, return the minimum number of operations required to convert word1 into word2.

You are allowed exactly three types of single-character operations:

  1. Insert a character anywhere in word1
  2. Delete a character from word1
  3. Replace a character in word1 with a different character

Each operation counts as one step. Your goal is to find the shortest sequence of these operations that transforms word1 into word2.

This problem is a classic example of dynamic programming and is also known as the Levenshtein distance in computer science. It has practical applications in spell checkers, DNA sequence alignment, and diff utilities.

Examples

Example 1

Input: word1 = "horse", word2 = "ros"

Output: 3

Explanation: We can convert "horse" into "ros" in 3 operations:

  1. Replace 'h' with 'r' → "rorse"
  2. Remove the second 'r' → "rose"
  3. Remove 'e' → "ros"

No sequence of fewer than 3 operations can achieve this transformation.

Example 2

Input: word1 = "intention", word2 = "execution"

Output: 5

Explanation: One possible sequence of 5 operations:

  1. Remove 't' → "inention"
  2. Replace 'i' with 'e' → "enention"
  3. Replace first 'n' with 'x' → "exention"
  4. Replace second 'n' with 'c' → "exection"
  5. Insert 'u' before the last 'n' → "execution"

Example 3

Input: word1 = "", word2 = "abc"

Output: 3

Explanation: word1 is empty, so we must insert all 3 characters of word2 one by one. This is the simplest edge case: the answer is always the length of the non-empty string.

Constraints

  • 0 ≤ word1.length, word2.length ≤ 500
  • word1 and word2 consist of lowercase English letters only

Editorial

Brute Force

Intuition

The most natural way to think about this problem is to work character by character from the end of both strings. At each step, we compare the last characters of the remaining portions of word1 and word2.

Imagine you have two words written on paper and you are trying to morph one into the other by erasing and rewriting characters. At each position, you have a simple decision:

  • If the characters already match, great — no operation needed. Move on to the rest.
  • If they do not match, you have exactly three choices: insert a character, delete a character, or replace a character. You try all three and pick whichever leads to the fewest total operations.

This naturally forms a recursive solution: break the problem into smaller subproblems by reducing the strings one character at a time.

Step-by-Step Explanation

Let's trace through with word1 = "horse" (m=5), word2 = "ros" (n=3). We define ed(m, n) as the edit distance between word1[0..m-1] and word2[0..n-1].

Step 1: Start with ed(5, 3). Compare word1[4]='e' and word2[2]='s'. They do not match. We must try all three operations.

Step 2: Three recursive branches are created:

  • Replace 'e' with 's': solve ed(4, 2) — both strings shrink by one
  • Insert 's' at end of word1: solve ed(5, 2) — only word2 shrinks
  • Delete 'e' from word1: solve ed(4, 3) — only word1 shrinks

Step 3: Follow the Replace branch → ed(4, 2). Compare word1[3]='s' and word2[1]='o'. They do not match. This creates 3 more children: ed(3, 1), ed(4, 1), ed(3, 2).

Step 4: Follow the Insert branch → ed(5, 2). Compare word1[4]='e' and word2[1]='o'. They do not match. Creates children: ed(4, 1), ed(5, 1), ed(4, 2). Notice that ed(4, 2) and ed(4, 1) already appeared in Step 3 — these are overlapping subproblems!

Step 5: Follow the Delete branch → ed(4, 3). Compare word1[3]='s' and word2[2]='s'. They match! No operation needed. Directly calls ed(3, 2) — which also appeared in Step 3.

Step 6: The recursion continues deeper until base cases: when one string becomes empty. ed(0, n) = n (insert n characters) and ed(m, 0) = m (delete m characters).

Step 7: After exploring ALL branches exhaustively and taking the minimum at each level, the answer is ed(5, 3) = 3.

Edit Distance — Recursive Branching with Overlapping Subproblems — Watch how the recursive approach creates an exponentially growing tree where many subproblems are computed multiple times.

Algorithm

  1. Define a recursive function ed(m, n) that returns the edit distance between word1[0..m-1] and word2[0..n-1]
  2. Base cases:
    • If m == 0, return n (insert all remaining characters of word2)
    • If n == 0, return m (delete all remaining characters of word1)
  3. Recursive cases:
    • If word1[m-1] == word2[n-1]: characters match, return ed(m-1, n-1)
    • Otherwise: return 1 + min(ed(m-1, n-1), ed(m, n-1), ed(m-1, n))
      • ed(m-1, n-1) + 1 = replace
      • ed(m, n-1) + 1 = insert
      • ed(m-1, n) + 1 = delete

Code

class Solution {
public:
    int minDistance(string word1, string word2) {
        return solve(word1, word2, word1.size(), word2.size());
    }

private:
    int solve(string& w1, string& w2, int m, int n) {
        if (m == 0) return n;
        if (n == 0) return m;

        if (w1[m - 1] == w2[n - 1])
            return solve(w1, w2, m - 1, n - 1);

        return 1 + min({solve(w1, w2, m - 1, n - 1),
                        solve(w1, w2, m, n - 1),
                        solve(w1, w2, m - 1, n)});
    }
};
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        def solve(m: int, n: int) -> int:
            if m == 0:
                return n
            if n == 0:
                return m

            if word1[m - 1] == word2[n - 1]:
                return solve(m - 1, n - 1)

            return 1 + min(
                solve(m - 1, n - 1),
                solve(m, n - 1),
                solve(m - 1, n)
            )

        return solve(len(word1), len(word2))
class Solution {
    public int minDistance(String word1, String word2) {
        return solve(word1, word2, word1.length(), word2.length());
    }

    private int solve(String w1, String w2, int m, int n) {
        if (m == 0) return n;
        if (n == 0) return m;

        if (w1.charAt(m - 1) == w2.charAt(n - 1))
            return solve(w1, w2, m - 1, n - 1);

        return 1 + Math.min(
            solve(w1, w2, m - 1, n - 1),
            Math.min(solve(w1, w2, m, n - 1),
                     solve(w1, w2, m - 1, n))
        );
    }
}

Complexity Analysis

Time Complexity: O(3^max(m, n))

At each non-matching character pair, we make 3 recursive calls. In the worst case (no characters match), the recursion depth is max(m, n) and we branch 3 ways at each level, giving an exponential number of calls.

Space Complexity: O(m + n)

The recursion stack can grow as deep as m + n in the worst case, since each call reduces at least one of m or n by 1.

Why This Approach Is Not Efficient

The brute force recursion has exponential time complexity O(3^max(m, n)). With m and n up to 500, this would require approximately 3^500 operations — a number far beyond what any computer can handle in any reasonable timeframe.

The root cause is overlapping subproblems. As we saw in the trace, subproblems like ed(4,2), ed(4,1), and ed(3,2) appeared multiple times in the recursion tree. In fact, there are only m × n = 5 × 3 = 15 unique subproblems for our example, but the recursive tree explores exponentially many paths through them.

Key insight: If we could remember the result of each subproblem the first time we solve it and reuse that result whenever the same subproblem appears again, we would eliminate all redundant computation. This technique is called memoization.

Better Approach - Top-Down DP (Memoization)

Intuition

The memoization approach uses the exact same recursive logic as brute force, but adds a cache (dictionary or 2D array) to store results of subproblems we have already solved.

Think of it like a student solving homework problems. Without memoization, the student re-derives the answer every time the same sub-question appears in different contexts. With memoization, the student writes down each answer the first time and simply looks it up if the same sub-question comes up again.

Since there are only (m+1) × (n+1) unique subproblems (each defined by a pair of indices), and each subproblem is solved at most once, the total work drops from exponential to polynomial.

Step-by-Step Explanation

Let's trace the memoized approach with word1 = "horse", word2 = "ros":

Step 1: Initialize an empty memo table. Start with ed(5, 3).

Step 2: ed(5, 3): 'e' ≠ 's'. Need ed(4, 2), ed(5, 2), ed(4, 3). Process depth-first, starting with ed(4, 2).

Step 3: ed(4, 2) recursively solves its subproblems (ed(3,1), ed(4,1), ed(3,2) and their children). After reaching base cases and bubbling up, ed(4, 2) = 3. Store in memo: {(4,2): 3}.

Step 4: Now process ed(5, 2). It needs ed(4, 1), ed(5, 1), and ed(4, 2).

Step 5: ed(4, 2) is already in the memo table! Instead of recomputing its entire subtree, we return 3 instantly. This single cache hit saved dozens of recursive calls.

Step 6: After computing the remaining subproblems, ed(5, 2) = 4. Store in memo.

Step 7: Process ed(4, 3): word1[3]='s' == word2[2]='s'. Match! Calls ed(3, 2). This was computed during ed(4, 2)'s exploration — found in memo! Return 2 instantly.

Step 8: ed(4, 3) = 2 (no +1 since characters matched). All three children resolved.

Step 9: ed(5, 3) = 1 + min(3, 4, 2) = 1 + 2 = 3. Only 15 unique subproblems computed instead of exponentially many.

Memoized Recursion — Cache Hits Prune the Tree — Watch how storing results in a memo table eliminates redundant computation. Pruned (green) nodes represent instant cache lookups instead of full recomputation.

Algorithm

  1. Create a memo table (2D array or dictionary) initialized to -1 (uncomputed)
  2. Define recursive function ed(m, n) with memo lookup:
    • If memo[m][n] is already computed, return it immediately
    • Base cases: if m == 0 return n; if n == 0 return m
    • If characters match: memo[m][n] = ed(m-1, n-1)
    • If characters differ: memo[m][n] = 1 + min(ed(m-1, n-1), ed(m, n-1), ed(m-1, n))
  3. Return ed(len(word1), len(word2))

Code

class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.size(), n = word2.size();
        vector<vector<int>> memo(m + 1, vector<int>(n + 1, -1));
        return solve(word1, word2, m, n, memo);
    }

private:
    int solve(string& w1, string& w2, int m, int n,
              vector<vector<int>>& memo) {
        if (m == 0) return n;
        if (n == 0) return m;
        if (memo[m][n] != -1) return memo[m][n];

        if (w1[m - 1] == w2[n - 1])
            return memo[m][n] = solve(w1, w2, m - 1, n - 1, memo);

        return memo[m][n] = 1 + min({
            solve(w1, w2, m - 1, n - 1, memo),
            solve(w1, w2, m, n - 1, memo),
            solve(w1, w2, m - 1, n, memo)
        });
    }
};
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        memo = {}

        def solve(m: int, n: int) -> int:
            if m == 0:
                return n
            if n == 0:
                return m
            if (m, n) in memo:
                return memo[(m, n)]

            if word1[m - 1] == word2[n - 1]:
                memo[(m, n)] = solve(m - 1, n - 1)
            else:
                memo[(m, n)] = 1 + min(
                    solve(m - 1, n - 1),
                    solve(m, n - 1),
                    solve(m - 1, n)
                )
            return memo[(m, n)]

        return solve(len(word1), len(word2))
class Solution {
    private int[][] memo;

    public int minDistance(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        memo = new int[m + 1][n + 1];
        for (int[] row : memo) Arrays.fill(row, -1);
        return solve(word1, word2, m, n);
    }

    private int solve(String w1, String w2, int m, int n) {
        if (m == 0) return n;
        if (n == 0) return m;
        if (memo[m][n] != -1) return memo[m][n];

        if (w1.charAt(m - 1) == w2.charAt(n - 1))
            return memo[m][n] = solve(w1, w2, m - 1, n - 1);

        return memo[m][n] = 1 + Math.min(
            solve(w1, w2, m - 1, n - 1),
            Math.min(solve(w1, w2, m, n - 1),
                     solve(w1, w2, m - 1, n))
        );
    }
}

Complexity Analysis

Time Complexity: O(m × n)

There are (m+1) × (n+1) unique subproblems, and each is solved exactly once. Each subproblem does O(1) work (a comparison and at most three lookups). Total: O(m × n).

Space Complexity: O(m × n)

The memo table stores up to (m+1) × (n+1) entries. Additionally, the recursion call stack can go as deep as O(m + n), since each call reduces at least one parameter. Total space: O(m × n) for the memo table + O(m + n) for the stack.

Why This Approach Is Not Efficient

Memoization reduces the time from exponential to O(m × n), which is a massive improvement. However, it still has drawbacks:

  1. Recursion stack overhead: Each recursive call adds a frame to the call stack. With m and n up to 500, the recursion depth can reach 1000, which risks stack overflow in some languages or environments.

  2. Function call overhead: Thousands of recursive function calls have higher constant-factor overhead than a simple nested loop.

  3. Non-sequential memory access: The top-down approach fills the memo table in an unpredictable order, leading to poor CPU cache utilization.

Key insight: We can compute the same results iteratively using a bottom-up approach. Instead of starting from ed(m, n) and recursing downward, we start from the base cases ed(0, j) and ed(i, 0) and fill a table row by row. This eliminates recursion entirely and allows further space optimization from O(m × n) to O(n).

Optimal Approach - Bottom-Up DP (Tabulation)

Intuition

Instead of thinking top-down ("how do I solve the big problem?"), we think bottom-up: "what are the smallest subproblems, and how do they build up to the answer?"

Imagine a grid where each cell dp[i][j] represents the minimum edits needed to convert the first i characters of word1 into the first j characters of word2. The bottom-right corner dp[m][n] is our answer.

The base cases are straightforward:

  • dp[i][0] = i: converting a string of length i to an empty string requires i deletions
  • dp[0][j] = j: converting an empty string to a string of length j requires j insertions

For each inner cell, we look at the character pair word1[i-1] and word2[j-1]:

  • If they match, dp[i][j] = dp[i-1][j-1] — no operation needed
  • If they differ, dp[i][j] = 1 + min(dp[i-1][j-1], dp[i][j-1], dp[i-1][j]) — take the best of replace, insert, or delete

Step-by-Step Explanation

Let's trace through with word1 = "horse", word2 = "ros". Our table has 6 rows (for "", h, o, r, s, e) and 4 columns (for "", r, o, s).

Step 1: Fill base cases. First row dp[0][j] = [0, 1, 2, 3]. First column dp[i][0] = [0, 1, 2, 3, 4, 5].

Step 2: dp[1][1]: 'h' vs 'r' — no match. 1 + min(dp[0][0]=0, dp[0][1]=1, dp[1][0]=1) = 1 (replace 'h' with 'r').

Step 3: dp[1][2]: 'h' vs 'o' — no match. 1 + min(dp[0][1]=1, dp[0][2]=2, dp[1][1]=1) = 2.

Step 4: dp[1][3]: 'h' vs 's' — no match. 1 + min(dp[0][2]=2, dp[0][3]=3, dp[1][2]=2) = 3.

Step 5: dp[2][1]: 'o' vs 'r' — no match. 1 + min(dp[1][0]=1, dp[1][1]=1, dp[2][0]=2) = 2.

Step 6: dp[2][2]: 'o' vs 'o' — MATCH! dp[1][1] = 1. Converting "ho" to "ro" costs the same as converting "h" to "r".

Step 7: dp[2][3]: 'o' vs 's' — no match. 1 + min(dp[1][2]=2, dp[1][3]=3, dp[2][2]=1) = 2.

Step 8: dp[3][1]: 'r' vs 'r' — MATCH! dp[2][0] = 2.

Step 9: dp[3][2]: 'r' vs 'o' — no match. 1 + min(dp[2][1]=2, dp[2][2]=1, dp[3][1]=2) = 2.

Step 10: dp[3][3]: 'r' vs 's' — no match. 1 + min(dp[2][2]=1, dp[2][3]=2, dp[3][2]=2) = 2.

Step 11: dp[4][3]: 's' vs 's' — MATCH! dp[3][2] = 2.

Step 12: dp[5][3]: 'e' vs 's' — no match. 1 + min(dp[4][2]=3, dp[4][3]=2, dp[5][2]=4) = 3.

Step 13: The answer is dp[5][3] = 3.

Edit Distance DP Table — Building the Solution Bottom-Up — Watch the DP table fill cell by cell. Each cell depends on its top, left, and diagonal neighbors. Green highlights show character matches where no operation is needed.

Algorithm

  1. Create a 2D table dp of size (m+1) × (n+1)
  2. Fill base cases:
    • dp[i][0] = i for all i (delete all characters)
    • dp[0][j] = j for all j (insert all characters)
  3. For each cell dp[i][j] where i ∈ [1, m] and j ∈ [1, n]:
    • If word1[i-1] == word2[j-1]: dp[i][j] = dp[i-1][j-1]
    • Else: dp[i][j] = 1 + min(dp[i-1][j-1], dp[i][j-1], dp[i-1][j])
  4. Return dp[m][n]

Code

class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.size(), n = word2.size();
        vector<vector<int>> dp(m + 1, vector<int>(n + 1));

        for (int i = 0; i <= m; i++) dp[i][0] = i;
        for (int j = 0; j <= n; j++) dp[0][j] = j;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1[i - 1] == word2[j - 1]) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    dp[i][j] = 1 + min({dp[i - 1][j - 1],
                                         dp[i][j - 1],
                                         dp[i - 1][j]});
                }
            }
        }
        return dp[m][n];
    }
};
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        m, n = len(word1), len(word2)
        dp = [[0] * (n + 1) for _ in range(m + 1)]

        for i in range(m + 1):
            dp[i][0] = i
        for j in range(n + 1):
            dp[0][j] = j

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if word1[i - 1] == word2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1]
                else:
                    dp[i][j] = 1 + min(
                        dp[i - 1][j - 1],
                        dp[i][j - 1],
                        dp[i - 1][j]
                    )

        return dp[m][n]
class Solution {
    public int minDistance(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[][] dp = new int[m + 1][n + 1];

        for (int i = 0; i <= m; i++) dp[i][0] = i;
        for (int j = 0; j <= n; j++) dp[0][j] = j;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    dp[i][j] = 1 + Math.min(
                        dp[i - 1][j - 1],
                        Math.min(dp[i][j - 1], dp[i - 1][j])
                    );
                }
            }
        }
        return dp[m][n];
    }
}

Complexity Analysis

Time Complexity: O(m × n)

We fill a table of (m+1) × (n+1) cells. Each cell requires O(1) work: one character comparison and at most three lookups plus a min operation. Total: O(m × n).

For the given constraints (m, n ≤ 500), this means at most 250,000 operations — extremely fast.

Space Complexity: O(m × n)

We store the full (m+1) × (n+1) table. This can be further optimized to O(n) by observing that each row only depends on the previous row. We can use two 1D arrays (or even one with a saved diagonal value) instead of the full 2D table. However, the 2D version is clearer and sufficient for the given constraints.