Count Square Submatrices with All Ones

Description

You are given an m × n binary matrix (containing only 0s and 1s).

Return the total number of square submatrices that have all ones.

A square submatrix is any contiguous square-shaped region within the matrix. It can be of any size — 1×1, 2×2, 3×3, and so on — as long as every element within it is 1.

Examples

Example 1

Input: matrix = [[0,1,1,1],[1,1,1,1],[0,1,1,1]]

Output: 15

Explanation:

There are 10 squares of side 1 (cells containing 1).
There are 4 squares of side 2. For example, the 2×2 submatrix at rows [0-1], cols [1-2] = [[1,1],[1,1]].
There is 1 square of side 3. The 3×3 submatrix at rows [0-2], cols [1-3] = [[1,1,1],[1,1,1],[1,1,1]].
Total = 10 + 4 + 1 = 15.

Example 2

Input: matrix = [[1,0,1],[1,1,0],[1,1,0]]

Output: 7

Explanation:

There are 6 squares of side 1 (cells containing 1).
There is 1 square of side 2. The 2×2 submatrix at rows [1-2], cols [0-1] = [[1,1],[1,1]].
No squares of side 3 or larger exist with all ones.
Total = 6 + 1 = 7.

Constraints

1 ≤ m, n ≤ 300
matrix[i][j] is either 0 or 1

Editorial

Brute Force

Intuition

The most straightforward approach is to directly enumerate every possible square submatrix and check if it contains all ones.

For each cell (i, j), we treat it as the top-left corner of a potential square. We then try every possible square size k (from 1 up to the maximum that fits within the matrix boundaries). For each candidate square of size k×k starting at (i, j), we scan every cell inside to verify all values are 1.

If all cells are 1, we increment our count. This exhaustive enumeration guarantees correctness — we never miss any valid square.

Step-by-Step Explanation

Let's trace with matrix = [[1,0,1],[1,1,0],[1,1,0]] (3×3).

Step 1: Start at cell (0,0), value=1. Try 1×1: just cell (0,0)=1 → all ones ✓. Count = 1. Try 2×2 from (0,0): cells are (0,0)=1, (0,1)=0 → contains a zero ✗. Stop expanding from (0,0).

Step 2: Cell (0,1), value=0. Any square with top-left at (0,1) will include this 0. Skip.

Step 3: Cell (0,2), value=1. Try 1×1: (0,2)=1 ✓. Count = 2. Try 2×2 from (0,2): needs column 3, which is out of bounds. Stop.

Step 4: Cell (1,0), value=1. Try 1×1: ✓. Count = 3. Try 2×2: cells (1,0)=1, (1,1)=1, (2,0)=1, (2,1)=1 → all ones ✓. Count = 4. Try 3×3: needs column 2, but (1,2)=0 ✗. Stop.

Step 5: Cell (1,1), value=1. Try 1×1: ✓. Count = 5. Try 2×2: cells (1,1)=1, (1,2)=0 → ✗. Stop.

Step 6: Cell (1,2), value=0. Skip.

Step 7: Cell (2,0), value=1. Try 1×1: ✓. Count = 6. Try 2×2: out of bounds (row 3). Stop.

Step 8: Cell (2,1), value=1. Try 1×1: ✓. Count = 7. Try 2×2: out of bounds. Stop.

Step 9: Cell (2,2), value=0. Skip.

Result: Count = 7.

Brute Force — Checking Every Possible Square — Watch as we try every cell as a potential top-left corner and expand squares of increasing size, checking if all cells are 1.

Algorithm

Initialize count = 0.
For each cell (i, j) as a potential top-left corner:
- Compute maxSide = min(m - i, n - j) — the largest square that fits.
- For each side length k from 1 to maxSide:
  - Check if all cells in the k×k submatrix starting at (i, j) are 1.
  - If yes, increment count.
  - If no, break (no larger square from this corner can be all ones).
Return count.

Code

#include <vector>
using namespace std;

class Solution {
public:
    int countSquares(vector<vector<int>>& matrix) {
        int m = matrix.size(), n = matrix[0].size();
        int count = 0;
        
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                int maxSide = min(m - i, n - j);
                for (int k = 1; k <= maxSide; k++) {
                    bool allOnes = true;
                    // Check the new row and column added by expanding to size k
                    for (int r = i; r < i + k && allOnes; r++) {
                        for (int c = j; c < j + k && allOnes; c++) {
                            if (matrix[r][c] == 0) allOnes = false;
                        }
                    }
                    if (allOnes) count++;
                    else break;
                }
            }
        }
        
        return count;
    }
};

class Solution:
    def countSquares(self, matrix: list[list[int]]) -> int:
        m, n = len(matrix), len(matrix[0])
        count = 0
        
        for i in range(m):
            for j in range(n):
                max_side = min(m - i, n - j)
                for k in range(1, max_side + 1):
                    all_ones = True
                    for r in range(i, i + k):
                        for c in range(j, j + k):
                            if matrix[r][c] == 0:
                                all_ones = False
                                break
                        if not all_ones:
                            break
                    if all_ones:
                        count += 1
                    else:
                        break
        
        return count

class Solution {
    public int countSquares(int[][] matrix) {
        int m = matrix.length, n = matrix[0].length;
        int count = 0;
        
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                int maxSide = Math.min(m - i, n - j);
                for (int k = 1; k <= maxSide; k++) {
                    boolean allOnes = true;
                    for (int r = i; r < i + k && allOnes; r++) {
                        for (int c = j; c < j + k && allOnes; c++) {
                            if (matrix[r][c] == 0) allOnes = false;
                        }
                    }
                    if (allOnes) count++;
                    else break;
                }
            }
        }
        
        return count;
    }
}

Complexity Analysis

Time Complexity: O(m × n × min(m,n)²)

For each of the m × n cells, we try squares of side up to min(m, n). For each candidate square of side k, we check all k² cells. In the worst case (all 1s), the inner checking loop runs approximately min(m,n)² times per starting cell. So the total work is O(m × n × min(m,n)²). With m = n = 300, this is 300 × 300 × 300² ≈ 8.1 × 10⁹ — too slow.

Space Complexity: O(1)

No additional data structures are used beyond a few loop variables.

Why This Approach Is Not Efficient

The brute force checks every cell of every candidate square from scratch. This leads to massive redundant work — the same cell can be visited as part of hundreds of different overlapping squares.

For a 300×300 matrix of all ones, the brute force performs roughly 8 billion operations, far exceeding typical time limits.

The core redundancy: To know if a k×k square starting at (i,j) is all ones, we don't need to re-examine the entire square. If we already know the answers for the three adjacent (k-1)×(k-1) squares — at (i+1,j), (i,j+1), and (i+1,j+1) — then their overlap covers most of the k×k region. This is exactly the insight behind a dynamic programming approach, where we compute the maximum square size at each cell using its neighbors, avoiding any redundant cell checking.

Optimal Approach - Dynamic Programming

Intuition

The key insight is to define dp[i][j] as the side length of the largest square submatrix with all ones whose bottom-right corner is at cell (i, j).

If matrix[i][j] = 0, no square can end here, so dp[i][j] = 0.

If matrix[i][j] = 1:

We look at three neighbors: dp[i-1][j] (above), dp[i][j-1] (left), and dp[i-1][j-1] (diagonal).
dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1.

Why does this recurrence work? To form a k×k all-ones square ending at (i,j), we need:

A (k-1)×(k-1) square ending at (i-1,j) → ensures the rows above are covered.
A (k-1)×(k-1) square ending at (i,j-1) → ensures the columns to the left are covered.
A (k-1)×(k-1) square ending at (i-1,j-1) → ensures the interior is covered.

The minimum of these three constrains how large a square can extend to (i,j).

The beautiful connection to counting: dp[i][j] doesn't just tell us the largest square size — it also equals the number of square submatrices that end at (i,j). If dp[i][j] = 3, that means there are squares of size 1×1, 2×2, and 3×3 all ending at (i,j) — exactly 3 squares. So the total count is simply the sum of all dp[i][j] values!

Step-by-Step Explanation

Let's trace with matrix = [[1,0,1],[1,1,0],[1,1,0]].

Step 1: Initialize dp table (same dimensions as matrix). First row and first column of dp are copied directly from the matrix — a cell on the boundary can only form a 1×1 square at best.

Step 2: dp[0][0] = matrix[0][0] = 1 (1×1 square). dp[0][1] = matrix[0][1] = 0. dp[0][2] = matrix[0][2] = 1. First row dp: [1, 0, 1].

Step 3: dp[1][0] = matrix[1][0] = 1 (first column, copy directly). dp[2][0] = matrix[2][0] = 1. First column dp: [1, 1, 1].

Step 4: Compute dp[1][1]. matrix[1][1] = 1. Look at neighbors: dp[0][1]=0, dp[1][0]=1, dp[0][0]=1. dp[1][1] = min(0, 1, 1) + 1 = 0 + 1 = 1. The 0 above blocks a larger square — we can only form a 1×1 here.

Step 5: Compute dp[1][2]. matrix[1][2] = 0. Since the cell is 0, dp[1][2] = 0. No square ends here.

Step 6: Compute dp[2][1]. matrix[2][1] = 1. Neighbors: dp[1][1]=1, dp[2][0]=1, dp[1][0]=1. dp[2][1] = min(1, 1, 1) + 1 = 1 + 1 = 2. All three neighbors support at least a 1×1 square, so we can extend to a 2×2 square ending here!

Step 7: Compute dp[2][2]. matrix[2][2] = 0. dp[2][2] = 0.

Step 8: Final dp table: [[1,0,1],[1,1,0],[1,2,0]]. Sum = 1+0+1+1+1+0+1+2+0 = 7.

This matches our expected answer of 7.

DP — Building Maximum Square Sizes Bottom-Right — Watch how each dp[i][j] is computed from its three neighbors (above, left, diagonal). The value represents the largest all-ones square ending at that cell AND the count of squares ending there.

Algorithm

Create a dp table of size m × n.
Initialize: for the first row and first column, dp[i][j] = matrix[i][j].
For each interior cell (i, j) where i ≥ 1 and j ≥ 1:
- If matrix[i][j] = 0, set dp[i][j] = 0.
- If matrix[i][j] = 1, set dp[i][j] = min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1]) + 1.
Return the sum of all dp[i][j] values.

Code

#include <vector>
#include <algorithm>
using namespace std;

class Solution {
public:
    int countSquares(vector<vector<int>>& matrix) {
        int m = matrix.size(), n = matrix[0].size();
        vector<vector<int>> dp(m, vector<int>(n, 0));
        int count = 0;
        
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                if (matrix[i][j] == 1) {
                    if (i == 0 || j == 0) {
                        dp[i][j] = 1;
                    } else {
                        dp[i][j] = min({dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]}) + 1;
                    }
                }
                count += dp[i][j];
            }
        }
        
        return count;
    }
};

class Solution:
    def countSquares(self, matrix: list[list[int]]) -> int:
        m, n = len(matrix), len(matrix[0])
        dp = [[0] * n for _ in range(m)]
        count = 0
        
        for i in range(m):
            for j in range(n):
                if matrix[i][j] == 1:
                    if i == 0 or j == 0:
                        dp[i][j] = 1
                    else:
                        dp[i][j] = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]) + 1
                count += dp[i][j]
        
        return count

class Solution {
    public int countSquares(int[][] matrix) {
        int m = matrix.length, n = matrix[0].length;
        int[][] dp = new int[m][n];
        int count = 0;
        
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                if (matrix[i][j] == 1) {
                    if (i == 0 || j == 0) {
                        dp[i][j] = 1;
                    } else {
                        dp[i][j] = Math.min(Math.min(dp[i - 1][j], dp[i][j - 1]), dp[i - 1][j - 1]) + 1;
                    }
                }
                count += dp[i][j];
            }
        }
        
        return count;
    }
}

Complexity Analysis

Time Complexity: O(m × n)

We iterate over every cell exactly once. For each cell, the computation (min of three neighbors plus addition) takes O(1). Total: O(m × n). With m = n = 300, this is 90,000 operations — blazing fast.

Space Complexity: O(m × n)

The dp table has the same dimensions as the input matrix. This can be optimized to O(n) by storing only the current and previous rows, since dp[i][j] depends only on dp[i-1][j], dp[i][j-1], and dp[i-1][j-1]. However, the O(m × n) version is simpler and sufficient for the given constraints.

Language Essentials & Syntax

EASY

Pattern-Based Logic Building

EASY

Standard Libraries & Collections

EASY

Essential Mathematics

EASY

MEDIUM

Recursion Fundamentals

EASY

Hashing Essentials

EASY

MEDIUM

Comparison-Based Sorts

EASY

Divide & Conquer Sorts

EASY

MEDIUM

Array Fundamentals

EASY

MEDIUM

Intermediate Array Problems

EASY

MEDIUM

Advanced Array Challenges

EASY

MEDIUM

HARD

Searching in Sorted Arrays

EASY

MEDIUM

Search Space Reduction

EASY

MEDIUM

HARD

Matrix Binary Search

EASY

MEDIUM

HARD

Core String Operations

EASY

MEDIUM

Intermediate String Challenges

EASY

MEDIUM

Singly Linked List Basics

EASY

MEDIUM

Doubly Linked List Basics

EASY

MEDIUM

Linked List Techniques

EASY

MEDIUM

Doubly Linked List Techniques

MEDIUM

HARD

Advanced Linked List Challenges

MEDIUM

HARD

Building Recursive Thinking

EASY

MEDIUM

Generating Subsequences

EASY

MEDIUM

Exhaustive Search & Constraint Solving

MEDIUM

HARD

Bitwise Operations Fundamentals

EASY

MEDIUM

Bit Manipulation Applications

EASY

MEDIUM

Number Theory & Primes

EASY

MEDIUM

HARD