Skip to main content

Perfect Squares

MEDIUMProblemSolveExternal Links

Description

Given a positive integer n, find the least number of perfect square numbers that sum to n.

A perfect square is any integer that equals some integer multiplied by itself: 1 (1×1), 4 (2×2), 9 (3×3), 16 (4×4), 25 (5×5), and so on. Numbers like 3, 7, and 11 are not perfect squares.

You can use the same perfect square multiple times. For example, to reach 12, you could use three copies of 4: 12 = 4 + 4 + 4. Your goal is to minimize the count of perfect squares used.

Examples

Example 1

Input: n = 12

Output: 3

Explanation: 12 = 4 + 4 + 4. Using three copies of the perfect square 4 (= 2²) gives the minimum count. Other decompositions exist — like 12 = 9 + 1 + 1 + 1 (four squares) or 12 = 1 + 1 + ... + 1 (twelve squares) — but none use fewer than 3 perfect squares.

Example 2

Input: n = 13

Output: 2

Explanation: 13 = 4 + 9 = 2² + 3². Just two perfect squares suffice. No single perfect square equals 13 (since 3² = 9 < 13 < 16 = 4²), so 1 square is impossible. But 2 squares work: 4 + 9 = 13.

Example 3

Input: n = 1

Output: 1

Explanation: 1 = 1² is itself a perfect square. Only one square is needed.

Constraints

  • 1 ≤ n ≤ 10^4

Editorial

Brute Force

Intuition

The most direct way to solve this is pure recursion. From the current value n, try subtracting every perfect square that fits (1², 2², 3², ... up to floor(√n)²). Each subtraction gives a smaller subproblem. Recursively find the minimum squares needed for each smaller value, add 1 (for the square we just used), and return the overall minimum.

Imagine you have a pile of 12 coins and buckets that can hold 1, 4, or 9 coins. You want to fill the fewest buckets possible. You try filling a bucket of 1 (11 coins left), a bucket of 4 (8 left), or a bucket of 9 (3 left). For each remaining pile, you repeat the same process. The path that reaches 0 coins with the fewest buckets wins.

The base case: when n = 0, we need 0 squares — we have already decomposed the number completely.

Step-by-Step Explanation

Let's trace the recursion for n = 12. We write f(k) = minimum squares to sum to k.

Step 1: Call f(12). Perfect squares ≤ 12 are 1, 4, 9.

  • Subtract 1²: 1 + f(11)
  • Subtract 2²: 1 + f(8)
  • Subtract 3²: 1 + f(3)
  • f(12) = 1 + min(f(11), f(8), f(3))

Step 2: Let's follow the f(8) branch. Squares ≤ 8: 1, 4.

  • Subtract 1²: 1 + f(7)
  • Subtract 2²: 1 + f(4)
  • f(8) = 1 + min(f(7), f(4))

Step 3: Follow f(4). Squares ≤ 4: 1, 4.

  • Subtract 1²: 1 + f(3)
  • Subtract 2²: 1 + f(0) = 1 + 0 = 1
  • f(4) = min(1 + f(3), 1) = 1 (using 4 = 2² directly)

Step 4: So f(8) = 1 + min(f(7), f(4)) = 1 + min(f(7), 1). Eventually f(7) = 4 (1+1+1+4), so f(8) = 1 + 1 = 2 (using 4+4).

Step 5: Follow f(3). Squares ≤ 3: 1.

  • Subtract 1²: 1 + f(2) = 1 + 2 = 3
  • f(3) = 3 (using 1+1+1)

Step 6: Now follow f(11). This branches into f(10), f(7), f(2), each of which branches further. The recursion tree explodes — f(7) alone generates 4 more branches.

Step 7: Eventually all branches resolve:

  • f(3) = 3, f(8) = 2, f(11) = 3
  • f(12) = 1 + min(3, 2, 3) = 1 + 2 = 3

The problem: f(3) is computed from f(12), from f(4), and from within f(11). The same subproblems are solved repeatedly across different branches.

Algorithm

  1. Define f(n):
    • If n == 0: return 0
    • Initialize min_count = infinity
    • For i = 1, 2, 3, ... while i² ≤ n:
      • min_count = min(min_count, 1 + f(n − i²))
    • Return min_count
  2. Call f(n)

Code

class Solution {
public:
    int numSquares(int n) {
        if (n == 0) return 0;
        int minCount = INT_MAX;
        for (int i = 1; i * i <= n; i++) {
            minCount = min(minCount, 1 + numSquares(n - i * i));
        }
        return minCount;
    }
};
class Solution:
    def numSquares(self, n: int) -> int:
        if n == 0:
            return 0
        min_count = float('inf')
        i = 1
        while i * i <= n:
            min_count = min(min_count, 1 + self.numSquares(n - i * i))
            i += 1
        return min_count
class Solution {
    public int numSquares(int n) {
        if (n == 0) return 0;
        int minCount = Integer.MAX_VALUE;
        for (int i = 1; i * i <= n; i++) {
            minCount = Math.min(minCount, 1 + numSquares(n - i * i));
        }
        return minCount;
    }
}

Complexity Analysis

Time Complexity: O(√n ^ (n))

At each level of recursion, we branch into up to √n children (one per valid perfect square). The tree depth can reach n (subtracting 1² = 1 each time). This creates an exponential number of nodes. For n = 10,000, this is completely impractical.

Space Complexity: O(n)

The recursion stack depth is at most n (when we subtract 1 at every step). Each stack frame uses O(1) space.

Why This Approach Is Not Efficient

The brute force recursion suffers from the same disease as many recursive solutions: massive overlapping subproblems. The value f(3) is needed when computing f(12) (via 12−9), f(4) (via 4−1), f(7) (via 7−4), and many others. Each time, the entire subtree is recomputed from scratch.

With n up to 10,000, the recursion tree has billions of redundant nodes. The brute force would time out even for n = 100.

However, there are only n + 1 unique subproblems (f(0) through f(n)). This suggests two efficient strategies:

  1. BFS on an implicit graph — model each value 0..n as a node, with edges from k to k−j² for each valid j. Finding the shortest path (fewest edges) from n to 0 gives the minimum squares.
  2. Dynamic programming — compute f(0), f(1), ..., f(n) in order, each in O(√n) time.

Better Approach - BFS on Implicit Graph

Intuition

Here is a creative reframing: treat the problem as a shortest path problem on an unweighted graph.

Imagine every number from 0 to n as a node. There is a directed edge from node a to node b whenever b = a − j² for some integer j (meaning we can reach b from a by subtracting one perfect square). The weight of every edge is 1 (each subtraction counts as using one perfect square).

Now the question becomes: what is the shortest path from node n to node 0? Since all edge weights are equal (1), BFS finds the shortest path. BFS explores all nodes reachable in 1 step first, then 2 steps, then 3 steps, and so on. The first time we reach node 0, the current BFS level is the answer.

This is elegant because BFS naturally handles the "minimum" requirement — it guarantees the first path found to any node is the shortest.

Step-by-Step Explanation

Let's trace BFS for n = 12. Perfect squares ≤ 12: {1, 4, 9}.

Step 1: Start BFS from node 12. Queue = [12]. Level = 0. Visited = {12}.

Step 2: Level 1 — process node 12:

  • 12 − 1 = 11 → enqueue 11
  • 12 − 4 = 8 → enqueue 8
  • 12 − 9 = 3 → enqueue 3
  • None reached 0. Queue = [11, 8, 3].

Step 3: Level 2 — process node 11:

  • 11 − 1 = 10, 11 − 4 = 7, 11 − 9 = 2 → enqueue 10, 7, 2

Step 4: Level 2 — process node 8:

  • 8 − 1 = 7 (already visited), 8 − 4 = 4 → enqueue 4

Step 5: Level 2 — process node 3:

  • 3 − 1 = 2 (already visited), 3 − 4 < 0 → skip
  • Queue = [10, 7, 2, 4].

Step 6: Level 3 — process node 10:

  • 10 − 1 = 9, 10 − 4 = 6, 10 − 9 = 1 → enqueue 9, 6, 1

Step 7: Level 3 — process node 7:

  • 7 − 1 = 6 (visited), 7 − 4 = 3 (visited) → nothing new

Step 8: Level 3 — process node 2:

  • 2 − 1 = 1 (visited) → nothing new

Step 9: Level 3 — process node 4:

  • 4 − 1 = 3 (visited), 4 − 4 = 0 → FOUND!

Result: Reached 0 at level 3. The path: 12 → 8 → 4 → 0 (using 4 + 4 + 4). Answer = 3.

BFS — Shortest Path from 12 to 0 via Perfect Squares — Watch BFS explore nodes level by level. Each level represents using one more perfect square. The first time we reach 0, the level number is the answer.

Algorithm

  1. Precompute all perfect squares ≤ n: squares = [1, 4, 9, ...]
  2. Initialize BFS: queue = [n], visited = {n}, level = 0
  3. While queue is not empty:
    a. Increment level
    b. For each node in the current level:
    • Dequeue the node
    • For each perfect square sq:
      • remain = node − sq
      • If remain == 0: return level
      • If remain > 0 and not visited: mark visited, enqueue
  4. Return level (should reach 0 before queue empties)

Code

class Solution {
public:
    int numSquares(int n) {
        vector<int> squares;
        for (int i = 1; i * i <= n; i++)
            squares.push_back(i * i);
        
        queue<int> q;
        vector<bool> visited(n + 1, false);
        q.push(n);
        visited[n] = true;
        int level = 0;
        
        while (!q.empty()) {
            level++;
            int size = q.size();
            for (int k = 0; k < size; k++) {
                int val = q.front();
                q.pop();
                for (int sq : squares) {
                    int remain = val - sq;
                    if (remain == 0) return level;
                    if (remain > 0 && !visited[remain]) {
                        visited[remain] = true;
                        q.push(remain);
                    }
                }
            }
        }
        return level;
    }
};
from collections import deque

class Solution:
    def numSquares(self, n: int) -> int:
        squares = []
        i = 1
        while i * i <= n:
            squares.append(i * i)
            i += 1
        
        queue = deque([n])
        visited = {n}
        level = 0
        
        while queue:
            level += 1
            for _ in range(len(queue)):
                val = queue.popleft()
                for sq in squares:
                    remain = val - sq
                    if remain == 0:
                        return level
                    if remain > 0 and remain not in visited:
                        visited.add(remain)
                        queue.append(remain)
        
        return level
class Solution {
    public int numSquares(int n) {
        List<Integer> squares = new ArrayList<>();
        for (int i = 1; i * i <= n; i++)
            squares.add(i * i);
        
        Queue<Integer> queue = new LinkedList<>();
        boolean[] visited = new boolean[n + 1];
        queue.offer(n);
        visited[n] = true;
        int level = 0;
        
        while (!queue.isEmpty()) {
            level++;
            int size = queue.size();
            for (int k = 0; k < size; k++) {
                int val = queue.poll();
                for (int sq : squares) {
                    int remain = val - sq;
                    if (remain == 0) return level;
                    if (remain > 0 && !visited[remain]) {
                        visited[remain] = true;
                        queue.offer(remain);
                    }
                }
            }
        }
        return level;
    }
}

Complexity Analysis

Time Complexity: O(n × √n)

In the worst case, BFS visits all n + 1 nodes (values 0 through n). For each node, it tries at most √n perfect squares. So total work is O(n × √n). With n = 10,000, that is about 10,000 × 100 = 1,000,000 operations.

Space Complexity: O(n)

The visited array and BFS queue together store up to n + 1 entries. The squares list has O(√n) entries. Total: O(n).

Why This Approach Is Not Efficient

BFS correctly finds the minimum in O(n × √n) time, but it has practical overhead:

1. Queue operations: BFS repeatedly enqueues and dequeues elements, each requiring memory allocation and pointer manipulation. For n = 10,000, the queue can hold thousands of elements simultaneously.

2. Visited set overhead: BFS needs a visited set (or boolean array) to avoid revisiting nodes. While a boolean array is O(1) per lookup, the BFS approach processes nodes in an unpredictable order, leading to poor cache locality.

3. Early termination depends on input: BFS can terminate early if 0 is found quickly (e.g., n = 16 = 4² terminates at level 1). But for adversarial inputs, it may explore most nodes before finding 0.

4. Conceptual complexity: BFS requires modeling the problem as a graph with nodes and edges, which adds conceptual overhead. The DP approach directly computes the answer without graph abstraction.

The bottom-up DP approach achieves the same O(n × √n) time but processes cells sequentially from dp[0] to dp[n], which is maximally cache-friendly. It uses a simple array with no queue or visited set. The code is shorter and easier to reason about.

Optimal Approach - Bottom-Up Dynamic Programming

Intuition

Instead of thinking about graphs and shortest paths, we can solve this directly with dynamic programming.

Define dp[i] = the minimum number of perfect squares that sum to i.

Base case: dp[0] = 0. Zero requires zero squares.

Transition: For each value i from 1 to n, try every perfect square j² ≤ i:

dp[i] = min over all valid j of (dp[i − j²] + 1)

This says: to form the value i using the fewest squares, consider every possible "last square" j² we could use. If we use j², the remaining amount is i − j², which requires dp[i − j²] squares. Adding 1 for the j² we just used gives a candidate answer. We take the minimum over all candidates.

The beauty of bottom-up DP: we compute dp[0], dp[1], dp[2], ..., dp[n] in order. When we compute dp[i], all smaller dp values are already available. No recursion, no memoization overhead, no graph abstraction — just a simple nested loop.

Step-by-Step Explanation

Let's trace the DP for n = 12. Perfect squares to consider at each step: 1² = 1, 2² = 4, 3² = 9.

Step 1: dp[0] = 0 (base case).

Step 2: dp[1]: try 1²→ dp[0]+1 = 1. dp[1] = 1.

Step 3: dp[2]: try 1²→ dp[1]+1 = 2. dp[2] = 2.

Step 4: dp[3]: try 1²→ dp[2]+1 = 3. dp[3] = 3.

Step 5: dp[4]: try 1²→ dp[3]+1 = 4. Try 2²→ dp[0]+1 = 1. dp[4] = min(4, 1) = 1. Key insight: 4 is itself a perfect square!

Step 6: dp[5]: try 1²→ dp[4]+1 = 2. Try 2²→ dp[1]+1 = 2. dp[5] = 2.

Step 7: dp[6] = 3, dp[7] = 4 (computed similarly).

Step 8: dp[8]: try 1²→ dp[7]+1 = 5. Try 2²→ dp[4]+1 = 2. dp[8] = 2. Two copies of 4.

Step 9: dp[9]: try 1²→ dp[8]+1 = 3. Try 2²→ dp[5]+1 = 3. Try 3²→ dp[0]+1 = 1. dp[9] = 1. Perfect square!

Step 10: dp[10] = 2, dp[11] = 3.

Step 11: dp[12]: try 1²→ dp[11]+1 = 4. Try 2²→ dp[8]+1 = 3. Try 3²→ dp[3]+1 = 4. dp[12] = min(4, 3, 4) = 3.

Result: dp[12] = 3. Decomposition: 12 = 4 + 4 + 4.

Bottom-Up DP — Filling the Table Left to Right — Watch each dp cell get computed by checking all perfect squares that fit. The minimum over all options determines each cell's value.

Algorithm

  1. Create dp array of size n + 1, initialized to infinity
  2. Set dp[0] = 0
  3. For i from 1 to n:
    • For j = 1, 2, 3, ... while j² ≤ i:
      • dp[i] = min(dp[i], dp[i − j²] + 1)
  4. Return dp[n]

Code

class Solution {
public:
    int numSquares(int n) {
        vector<int> dp(n + 1, INT_MAX);
        dp[0] = 0;
        
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j * j <= i; j++) {
                dp[i] = min(dp[i], dp[i - j * j] + 1);
            }
        }
        
        return dp[n];
    }
};
class Solution:
    def numSquares(self, n: int) -> int:
        dp = [float('inf')] * (n + 1)
        dp[0] = 0
        
        for i in range(1, n + 1):
            j = 1
            while j * j <= i:
                dp[i] = min(dp[i], dp[i - j * j] + 1)
                j += 1
        
        return dp[n]
class Solution {
    public int numSquares(int n) {
        int[] dp = new int[n + 1];
        Arrays.fill(dp, Integer.MAX_VALUE);
        dp[0] = 0;
        
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j * j <= i; j++) {
                dp[i] = Math.min(dp[i], dp[i - j * j] + 1);
            }
        }
        
        return dp[n];
    }
}

Complexity Analysis

Time Complexity: O(n × √n)

The outer loop runs n times. For each value i, the inner loop tries all perfect squares j² ≤ i, which is at most √i ≤ √n iterations. Each iteration does O(1) work (one comparison and one min operation). Total: Σ(i=1 to n) √i, which is bounded by n × √n. With n = 10,000, this is about 1,000,000 operations — very fast.

Space Complexity: O(n)

We use a single dp array of size n + 1. No recursion stack, no queue, no visited set. This is the minimum space needed to store all subproblem answers.