Word Break II

Description

You are given a string s and a list of words called wordDict (a dictionary). Your task is to insert spaces into s so that every segment between spaces is a valid word found in the dictionary. You need to return all possible ways to do this.

A few important details:

The same dictionary word can be used more than once in a single sentence.
The order of words in the output sentences must correspond to a left-to-right reading of the original string, meaning the concatenation of words (without spaces) must exactly reproduce s.
If no valid segmentation exists, return an empty list.

Examples

Example 1

Input: s = "catsanddog", wordDict = ["cat", "cats", "and", "sand", "dog"]

Output: ["cats and dog", "cat sand dog"]

Explanation: The string "catsanddog" can be split in two valid ways:

"cat" + "sand" + "dog" → all three words exist in the dictionary.
"cats" + "and" + "dog" → all three words exist in the dictionary.
Both segmentations fully cover the original string, and every piece is a valid dictionary word.

Example 2

Input: s = "pineapplepenapple", wordDict = ["apple", "pen", "applepen", "pine", "pineapple"]

Output: ["pine apple pen apple", "pineapple pen apple", "pine applepen apple"]

Explanation: The string can be split three ways:

"pine" + "apple" + "pen" + "apple"
"pineapple" + "pen" + "apple"
"pine" + "applepen" + "apple"
Notice that "apple" is reused in the first segmentation, which is allowed.

Example 3

Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]

Output: []

Explanation: No matter how we split the string, we cannot form a complete sentence using only dictionary words. For instance, "cat" + "san" leaves "san" which is not in the dictionary, and "cats" + "andog" leaves "andog" which is also not valid. There is no valid segmentation.

Constraints

1 ≤ s.length ≤ 20
1 ≤ wordDict.length ≤ 1000
1 ≤ wordDict[i].length ≤ 10
s and wordDict[i] consist of only lowercase English letters
All the strings of wordDict are unique
Input is generated such that the length of the answer does not exceed 10^5

Editorial

Brute Force

Intuition

The simplest way to think about this problem is: at every position in the string, we have a choice of where to place the next "cut" (space). We can try cutting after 1 character, 2 characters, 3 characters, and so on, all the way to the end of the remaining string.

Imagine you are reading a long word written on a whiteboard without any spaces. You start from the left and try covering 1 letter, 2 letters, 3 letters... each time asking: "Is this chunk a real word?" If it is, you mark it as a word, move your starting point to the right, and repeat the process for the remaining letters.

This approach explores every possible way to partition the string. At each position, for every possible prefix that matches a dictionary word, we recursively attempt to break the rest of the string. When we reach the end of the string with all parts being valid words, we have found one valid sentence.

Since we try every possible partition, this is essentially generating all 2^(n-1) possible ways to insert or not insert a space between consecutive characters (where n is the length of the string), and then filtering for valid ones.

Step-by-Step Explanation

Let's trace through with s = "catsanddog", wordDict = ["cat", "cats", "and", "sand", "dog"].

Step 1: Start at index 0. Try substrings starting at index 0:

s[0..0] = "c" → not in dict. Skip.
s[0..1] = "ca" → not in dict. Skip.
s[0..2] = "cat" → in dict! Take "cat" and recurse from index 3.

Step 2: Recurse from index 3. Try substrings:

s[3..3] = "s" → not in dict.
s[3..4] = "sa" → not in dict.
s[3..5] = "san" → not in dict.
s[3..6] = "sand" → in dict! Take "sand" and recurse from index 7.

Step 3: Recurse from index 7. Try substrings:

s[7..7] = "d" → not in dict.
s[7..8] = "do" → not in dict.
s[7..9] = "dog" → in dict! Take "dog" and recurse from index 10.

Step 4: Index 10 == length of s. We have reached the end! Current sentence: ["cat", "sand", "dog"] → "cat sand dog". Add to results.

Step 5: Backtrack to Step 2, continue trying substrings from index 3:

s[3..7] = "sandd" → not in dict.
s[3..8] = "sanddo" → not in dict.
s[3..9] = "sanddog" → not in dict.
All exhausted. Backtrack to Step 1.

Step 6: Continue trying substrings from index 0:

s[0..3] = "cats" → in dict! Take "cats" and recurse from index 4.

Step 7: Recurse from index 4. Try substrings:

s[4..4] = "a" → not in dict.
s[4..5] = "an" → not in dict.
s[4..6] = "and" → in dict! Take "and" and recurse from index 7.

Step 8: Recurse from index 7. s[7..9] = "dog" → in dict! Recurse from 10 → end reached. Sentence: ["cats", "and", "dog"] → "cats and dog". Add to results.

Step 9: Continue from index 4: s[4..7] = "andd" → not in dict. And so on. No more valid paths.

Step 10: Final result: ["cat sand dog", "cats and dog"].

Brute Force — Exploring All Partitions of "catsanddog" — Watch how recursive calls branch at every position, trying all possible prefix words and backtracking when no dictionary word matches.

Algorithm

Convert wordDict into a hash set for O(1) lookups.
Define a recursive function solve(start, path):
a. If start == s.length, join the words in path with spaces and add to results.
b. For each end from start+1 to s.length:
- Extract substring word = s[start..end-1].
- If word is in the dictionary set:
  - Add word to path.
  - Recursively call solve(end, path).
  - Remove word from path (backtrack).
Call solve(0, []) and return the collected results.

Code

class Solution {
public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> dict(wordDict.begin(), wordDict.end());
        vector<string> results;
        vector<string> path;

        function<void(int)> solve = [&](int start) {
            if (start == s.size()) {
                string sentence;
                for (int i = 0; i < path.size(); i++) {
                    if (i > 0) sentence += " ";
                    sentence += path[i];
                }
                results.push_back(sentence);
                return;
            }
            for (int end = start + 1; end <= (int)s.size(); end++) {
                string word = s.substr(start, end - start);
                if (dict.count(word)) {
                    path.push_back(word);
                    solve(end);
                    path.pop_back();
                }
            }
        };

        solve(0);
        return results;
    }
};

class Solution:
    def wordBreak(self, s: str, wordDict: list[str]) -> list[str]:
        word_set = set(wordDict)
        results = []
        path = []

        def solve(start: int) -> None:
            if start == len(s):
                results.append(" ".join(path))
                return
            for end in range(start + 1, len(s) + 1):
                word = s[start:end]
                if word in word_set:
                    path.append(word)
                    solve(end)
                    path.pop()

        solve(0)
        return results

class Solution {
    public List<String> wordBreak(String s, List<String> wordDict) {
        Set<String> dict = new HashSet<>(wordDict);
        List<String> results = new ArrayList<>();
        List<String> path = new ArrayList<>();
        solve(s, dict, 0, path, results);
        return results;
    }

    private void solve(String s, Set<String> dict, int start,
                       List<String> path, List<String> results) {
        if (start == s.length()) {
            results.add(String.join(" ", path));
            return;
        }
        for (int end = start + 1; end <= s.length(); end++) {
            String word = s.substring(start, end);
            if (dict.contains(word)) {
                path.add(word);
                solve(s, dict, end, path, results);
                path.remove(path.size() - 1);
            }
        }
    }
}

Complexity Analysis

Time Complexity: O(2^n × n)

In the worst case, there are 2^(n-1) ways to partition a string of length n (each of the n-1 gaps between characters can either have a space or not). For each partition, we spend O(n) time to construct substrings and check the dictionary. So the worst case is O(2^n × n). In practice, many partitions are pruned early because the prefix is not a dictionary word.

Space Complexity: O(n × 2^n)

In the worst case, we may store up to 2^(n-1) valid sentences, each of length n. The recursion stack uses O(n) space.

Why This Approach Is Not Efficient

The brute force approach can be extremely slow because it re-explores the same subproblems. Consider the string "aaaa" with dictionary ["a", "aa"]. The recursion starting at index 2 is called multiple times:

Once when we chose "a" + "a" for indices 0-1
Once when we chose "aa" for indices 0-1

Each time, the function recomputes all valid sentences for the suffix starting at index 2, doing redundant work. For longer strings with many valid splits, these overlapping subproblems compound exponentially.

The key insight is that the set of valid sentences from any starting index is fixed regardless of how we arrived at that index. If we cache (memoize) the results for each starting index, we avoid recomputing them.

Optimal Approach - Backtracking with Memoization

Intuition

We keep the same recursive structure — at each index, try all prefixes that are dictionary words and recurse for the remainder — but we add a crucial optimization: memoization.

We use a hash map where the key is a starting index and the value is the list of all valid sentences that can be formed from s[start..end]. Before computing the result for any starting index, we first check if we have already computed it. If yes, we return the cached result immediately.

Think of it like solving a jigsaw puzzle. If you have already figured out all the ways to complete the bottom-right corner, you write them down on a sticky note. Whenever another part of the puzzle leads you back to the same bottom-right corner, you just read the sticky note instead of re-solving it.

This memoization is especially effective when the string has many overlapping prefixes. For example, in "catsanddog", the suffix "dog" starting at index 7 might be reached from multiple paths (via "cat"+"sand" or "cats"+"and"), but we only compute its breakdowns once.

Note: While the worst-case number of valid sentences can still be exponential (meaning we cannot reduce the output-related cost), memoization ensures that the computation for each unique suffix is done exactly once, avoiding the redundant re-exploration that plagues the pure brute force approach.

Step-by-Step Explanation

Let's trace through with s = "catsanddog", wordDict = ["cat", "cats", "and", "sand", "dog"]:

Step 1: Call solve(0). memo is empty. Try prefixes from index 0.

Step 2: s[0..2] = "cat" is in dict. Recurse: solve(3).

Step 3: solve(3): not in memo. Try prefixes from index 3. s[3..6] = "sand" is in dict. Recurse: solve(7).

Step 4: solve(7): not in memo. Try prefixes from index 7. s[7..9] = "dog" is in dict. Recurse: solve(10).

Step 5: solve(10): index == len(s). Return [""] (empty string signaling complete sentence). Cache: memo[10] = [""].

Step 6: Back in solve(7): "dog" + solve(10) → ["dog"]. No more prefixes from index 7. Cache: memo[7] = ["dog"].

Step 7: Back in solve(3): "sand" + memo[7] → ["sand dog"]. Continue trying: s[3..5] = "and" is NOT at index 3 (it's s[4..6]). Actually from index 3: s[3..6] = "sand", no more dictionary matches. Cache: memo[3] = ["sand dog"].

Step 8: Back in solve(0): "cat" + memo[3] → ["cat sand dog"]. Continue trying more prefixes.

Step 9: s[0..3] = "cats" is in dict. Recurse: solve(4).

Step 10: solve(4): not in memo. Try prefixes from index 4. s[4..6] = "and" is in dict. Recurse: solve(7).

Step 11: solve(7) is ALREADY in memo! Return cached ["dog"] immediately. No recomputation needed.

Step 12: Back in solve(4): "and" + memo[7] → ["and dog"]. Cache: memo[4] = ["and dog"].

Step 13: Back in solve(0): "cats" + memo[4] → ["cats and dog"]. No more prefixes from index 0.

Step 14: Final result from solve(0): ["cat sand dog", "cats and dog"]. The memoization saved us from recomputing solve(7) when it was needed the second time.

Memoized Backtracking — Caching Suffix Results — Watch how memoization prevents redundant computation. When solve(7) is called the second time, the cached result is returned instantly instead of re-exploring the subtree.

Algorithm

Convert wordDict into a hash set for O(1) lookups.
Create a memo hash map: index → list of valid suffix sentences.
Define a recursive function solve(start):
a. If start is in memo, return memo[start].
b. If start == s.length, return [""] (base case: empty suffix).
c. Initialize an empty list sentences for this start index.
d. For each end from start+1 to s.length:
- Extract word = s[start..end-1].
- If word is in the dictionary:
  - Recursively get suffix_sentences = solve(end).
  - For each suffix in suffix_sentences:
    - If suffix is empty, add word to sentences.
    - Else, add (word + " " + suffix) to sentences.
      e. Store sentences in memo[start].
      f. Return sentences.
Call solve(0) and return the result.

Code

class Solution {
public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> dict(wordDict.begin(), wordDict.end());
        unordered_map<int, vector<string>> memo;

        function<vector<string>(int)> solve = [&](int start) -> vector<string> {
            if (memo.count(start)) return memo[start];
            if (start == (int)s.size()) return {""};

            vector<string> sentences;
            for (int end = start + 1; end <= (int)s.size(); end++) {
                string word = s.substr(start, end - start);
                if (dict.count(word)) {
                    vector<string> suffixes = solve(end);
                    for (const string& suffix : suffixes) {
                        if (suffix.empty()) {
                            sentences.push_back(word);
                        } else {
                            sentences.push_back(word + " " + suffix);
                        }
                    }
                }
            }
            memo[start] = sentences;
            return sentences;
        };

        return solve(0);
    }
};

class Solution:
    def wordBreak(self, s: str, wordDict: list[str]) -> list[str]:
        word_set = set(wordDict)
        memo = {}

        def solve(start: int) -> list[str]:
            if start in memo:
                return memo[start]
            if start == len(s):
                return [""]

            sentences = []
            for end in range(start + 1, len(s) + 1):
                word = s[start:end]
                if word in word_set:
                    suffixes = solve(end)
                    for suffix in suffixes:
                        if suffix:
                            sentences.append(word + " " + suffix)
                        else:
                            sentences.append(word)
            memo[start] = sentences
            return sentences

        return solve(0)

class Solution {
    public List<String> wordBreak(String s, List<String> wordDict) {
        Set<String> dict = new HashSet<>(wordDict);
        Map<Integer, List<String>> memo = new HashMap<>();
        return solve(s, dict, 0, memo);
    }

    private List<String> solve(String s, Set<String> dict, int start,
                               Map<Integer, List<String>> memo) {
        if (memo.containsKey(start)) return memo.get(start);
        if (start == s.length()) {
            List<String> base = new ArrayList<>();
            base.add("");
            return base;
        }

        List<String> sentences = new ArrayList<>();
        for (int end = start + 1; end <= s.length(); end++) {
            String word = s.substring(start, end);
            if (dict.contains(word)) {
                List<String> suffixes = solve(s, dict, end, memo);
                for (String suffix : suffixes) {
                    if (suffix.isEmpty()) {
                        sentences.add(word);
                    } else {
                        sentences.add(word + " " + suffix);
                    }
                }
            }
        }
        memo.put(start, sentences);
        return sentences;
    }
}

Complexity Analysis

Time Complexity: O(n × 2^n) in the worst case

The number of valid sentences can be exponential — consider s = "aaa...a" with dict = ["a", "aa", "aaa", ...]. In such cases, the total length of all output sentences dominates. However, memoization ensures that each unique starting index is computed exactly once. For each of the n starting indices, we try up to n prefixes. The key improvement is that solve(i) is never recomputed — its result is cached and returned in O(1) for subsequent calls.

For most practical inputs (especially with the constraint s.length ≤ 20 and answer size ≤ 10^5), the memoized solution runs efficiently.

Space Complexity: O(n × 2^n) in the worst case

The memo cache stores the list of valid sentences for each starting index. In the worst case, the total number of sentences across all indices can be exponential. The recursion stack uses O(n) additional space.

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

MEDIUM

HARD

EASY

MEDIUM

HARD

MEDIUM

HARD

EASY

MEDIUM

HARD

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

HARD

EASY

MEDIUM

EASY

MEDIUM

Word Break II

Description

Examples

Example 1

Example 2

Example 3

Constraints

Editorial

Brute Force

Intuition

Step-by-Step Explanation

Algorithm

Code

Complexity Analysis

Why This Approach Is Not Efficient

Optimal Approach - Backtracking with Memoization

Intuition

Step-by-Step Explanation

Algorithm

Code

Complexity Analysis