Matrix Chain Multiplication Bottom-Up Approach

Description

Given the dimensions of a sequence of matrices in an array arr[], where the i-th matrix has dimensions arr[i-1] × arr[i] (for i ≥ 1), find the most efficient way to multiply these matrices together such that the total number of scalar multiplications is minimized.

The problem is not to actually perform the multiplications, but to decide the order (parenthesization) in which to multiply the matrices. Matrix multiplication is associative — the result is the same regardless of how you parenthesize — but the number of scalar multiplications changes dramatically depending on the grouping.

When a matrix of size m × n is multiplied with a matrix of size n × p, it produces a matrix of size m × p and requires m × n × p scalar multiplications.

Return the minimum number of scalar multiplications needed to compute the product of the entire chain.

Focus: This problem emphasizes the bottom-up (tabulation) dynamic programming approach, where we iteratively fill a DP table by solving subproblems in order of increasing chain length — from pairs of adjacent matrices up to the full chain.

Examples

Example 1

Input: arr = [2, 1, 3, 4]

Output: 20

Explanation: There are 3 matrices: M1 (2×1), M2 (1×3), M3 (3×4). Two possible parenthesizations:

((M1 × M2) × M3): Cost = (2×1×3) + (2×3×4) = 6 + 24 = 30.
(M1 × (M2 × M3)): Cost = (1×3×4) + (2×1×4) = 12 + 8 = 20.

The minimum is 20.

Example 2

Input: arr = [1, 2, 3, 4, 3]

Output: 30

Explanation: There are 4 matrices: M1 (1×2), M2 (2×3), M3 (3×4), M4 (4×3). The optimal parenthesization is ((M1 × M2) × M3) × M4:

M1 × M2 = 1×2×3 = 6
(M1M2) × M3 = 1×3×4 = 12
((M1M2)M3) × M4 = 1×4×3 = 12
Total = 6 + 12 + 12 = 30.

Example 3

Input: arr = [10, 30, 5, 60]

Output: 4500

Explanation: There are 3 matrices: A (10×30), B (30×5), C (5×60).

(AB)C: (10×30×5) + (10×5×60) = 1500 + 3000 = 4500
A(BC): (30×5×60) + (10×30×60) = 9000 + 18000 = 27000

The minimum is 4500, achieved by first multiplying A×B to get a 10×5 matrix, then multiplying the result by C.

Example 4

Input: arr = [3, 4]

Output: 0

Explanation: Only one matrix (3×4). No multiplication is needed.

Constraints

2 ≤ arr.size() ≤ 100
1 ≤ arr[i] ≤ 200

Editorial

Brute Force

Intuition

The most straightforward approach is to recursively try every possible way to split the matrix chain and pick the cheapest one.

For a chain of matrices represented by indices [i..j] in the dimensions array, we can split at any position k (where i < k < j). This divides the chain into two groups: [i..k] and [k..j]. We recursively find the minimum cost for each group, then add the cost of multiplying the two resulting matrices together: arr[i] × arr[k] × arr[j].

We try every possible split point and return the minimum total cost. The base case is when there is only one matrix (i+1 == j), which costs nothing.

Step-by-Step Explanation

Let's trace with arr = [10, 30, 5, 60] (matrices: A=10×30, B=30×5, C=5×60):

Step 1: Call minCost(0, 3). Try splits k=1 and k=2.

Step 2: Split k=1: left=[0,1] (A alone, cost 0), right=[1,3] (B×C).

minCost(1,3): split k=2 → 0 + 0 + 30×5×60 = 9000.
Merge cost for k=1 = 10×30×60 = 18000. Total = 0 + 9000 + 18000 = 27000.

Step 3: Split k=2: left=[0,2] (A×B), right=[2,3] (C alone, cost 0).

minCost(0,2): split k=1 → 0 + 0 + 10×30×5 = 1500.
Merge cost for k=2 = 10×5×60 = 3000. Total = 1500 + 0 + 3000 = 4500.

Step 4: Compare: k=1 → 27000, k=2 → 4500. Minimum = 4500.

Result: 4500.

Algorithm

Define minCost(i, j) representing the minimum cost to multiply chain [i..j].
Base case: if i + 1 == j, return 0.
For each split k from i+1 to j-1:
- cost = minCost(i, k) + minCost(k, j) + arr[i] × arr[k] × arr[j]
- Track minimum.
Return minimum cost. Answer is minCost(0, n-1).

Code

#include <bits/stdc++.h>
using namespace std;

int minCostRec(vector<int>& arr, int i, int j) {
    if (i + 1 == j)
        return 0;
    
    int res = INT_MAX;
    for (int k = i + 1; k < j; k++) {
        int cost = minCostRec(arr, i, k)
                 + minCostRec(arr, k, j)
                 + arr[i] * arr[k] * arr[j];
        res = min(res, cost);
    }
    return res;
}

int matrixMultiplication(vector<int>& arr) {
    return minCostRec(arr, 0, arr.size() - 1);
}

import sys

def matrixMultiplication(arr):
    def minCostRec(i, j):
        if i + 1 == j:
            return 0
        
        res = sys.maxsize
        for k in range(i + 1, j):
            cost = (minCostRec(i, k)
                   + minCostRec(k, j)
                   + arr[i] * arr[k] * arr[j])
            res = min(res, cost)
        return res
    
    return minCostRec(0, len(arr) - 1)

class Solution {
    static int minCostRec(int[] arr, int i, int j) {
        if (i + 1 == j)
            return 0;
        
        int res = Integer.MAX_VALUE;
        for (int k = i + 1; k < j; k++) {
            int cost = minCostRec(arr, i, k)
                     + minCostRec(arr, k, j)
                     + arr[i] * arr[k] * arr[j];
            res = Math.min(res, cost);
        }
        return res;
    }
    
    static int matrixMultiplication(int[] arr) {
        return minCostRec(arr, 0, arr.length - 1);
    }
}

Complexity Analysis

Time Complexity: O(2^n)

The number of possible parenthesizations is the Catalan number C(n-2), which grows exponentially. The recursion explores all of them due to massive overlapping subproblems that are recomputed from scratch.

Space Complexity: O(n)

The recursion depth is at most n-1. Each stack frame uses O(1) space, so the call stack uses O(n) total.

Why This Approach Is Not Efficient

The brute force has exponential time complexity because it recomputes the same subproblems repeatedly. For example, minCost(1,3) might be needed by both the k=1 split of minCost(0,3) and the k=1 split of minCost(1,4) (in a larger chain). Each recomputation triggers its own cascade of recursive calls.

With arr.size() up to 100, we have up to 99 matrices. The exponential branching makes brute force completely impractical — it would take longer than the age of the universe to finish.

Key insight: There are only O(n²) unique subproblems defined by pairs (i, j). We can cache each result using memoization to avoid recomputation, bringing time down to O(n³).

Better Approach - Memoization (Top-Down DP)

Intuition

We keep the same recursive structure but add a cache: a 2D table memo[i][j] that stores the result of each subproblem after its first computation. Before solving any subproblem, we check the cache. If the answer is there, we return it instantly without re-recursing.

This transforms the exponential recursion into an O(n³) algorithm because:

There are O(n²) unique (i, j) pairs.
Each is computed exactly once, trying O(n) split points.
Total: O(n²) × O(n) = O(n³).

However, this top-down approach still uses the call stack for recursion, which adds function-call overhead and uses O(n) stack space on top of the O(n²) memo table.

Step-by-Step Explanation

Let's trace with arr = [10, 30, 5, 60] using memoization:

Step 1: Call minCost(0, 3). memo is empty.

Step 2: Try k=1. Need minCost(0,1) → base case, 0. Cache it. Need minCost(1,3) → not cached.

Step 3: Solve minCost(1,3): k=2, left=minCost(1,2)=0, right=minCost(2,3)=0, merge=30×5×60=9000. Cache memo[1][3]=9000.

Step 4: k=1 total: 0 + 9000 + 10×30×60 = 0 + 9000 + 18000 = 27000. Best so far = 27000.

Step 5: Try k=2. Need minCost(0,2) → not cached. Solve: k=1, left=0 (CACHE HIT), right=0 (CACHE HIT), merge=10×30×5=1500. Cache memo[0][2]=1500.

Step 6: Need minCost(2,3)=0 (CACHE HIT). k=2 total: 1500 + 0 + 10×5×60 = 4500. Compare: 4500 < 27000. Best = 4500.

Step 7: Cache memo[0][3]=4500. Return 4500.

Algorithm

Create memo[n][n] initialized to -1.
Define minCost(i, j): if cached, return cache; else compute by trying all splits, cache and return minimum.
Return minCost(0, n-1).

Code

#include <bits/stdc++.h>
using namespace std;

int minCostRec(vector<int>& arr, int i, int j,
               vector<vector<int>>& memo) {
    if (i + 1 == j)
        return 0;
    if (memo[i][j] != -1)
        return memo[i][j];
    
    int res = INT_MAX;
    for (int k = i + 1; k < j; k++) {
        int cost = minCostRec(arr, i, k, memo)
                 + minCostRec(arr, k, j, memo)
                 + arr[i] * arr[k] * arr[j];
        res = min(res, cost);
    }
    return memo[i][j] = res;
}

int matrixMultiplication(vector<int>& arr) {
    int n = arr.size();
    vector<vector<int>> memo(n, vector<int>(n, -1));
    return minCostRec(arr, 0, n - 1, memo);
}

import sys

def matrixMultiplication(arr):
    n = len(arr)
    memo = [[-1] * n for _ in range(n)]
    
    def minCostRec(i, j):
        if i + 1 == j:
            return 0
        if memo[i][j] != -1:
            return memo[i][j]
        
        res = sys.maxsize
        for k in range(i + 1, j):
            cost = (minCostRec(i, k)
                   + minCostRec(k, j)
                   + arr[i] * arr[k] * arr[j])
            res = min(res, cost)
        
        memo[i][j] = res
        return res
    
    return minCostRec(0, n - 1)

class Solution {
    static int minCostRec(int[] arr, int i, int j, int[][] memo) {
        if (i + 1 == j)
            return 0;
        if (memo[i][j] != -1)
            return memo[i][j];
        
        int res = Integer.MAX_VALUE;
        for (int k = i + 1; k < j; k++) {
            int cost = minCostRec(arr, i, k, memo)
                     + minCostRec(arr, k, j, memo)
                     + arr[i] * arr[k] * arr[j];
            res = Math.min(res, cost);
        }
        return memo[i][j] = res;
    }
    
    static int matrixMultiplication(int[] arr) {
        int n = arr.length;
        int[][] memo = new int[n][n];
        for (int[] row : memo)
            java.util.Arrays.fill(row, -1);
        return minCostRec(arr, 0, n - 1, memo);
    }
}

Complexity Analysis

Time Complexity: O(n³)

O(n²) unique subproblems, each trying O(n) split points with O(1) work per split.

Space Complexity: O(n²)

The memo table uses O(n²) space. The recursion stack adds O(n) depth. Total: O(n²).

Why This Approach Is Not Efficient

The memoization approach achieves O(n³) time and O(n²) space, which are asymptotically optimal for this problem. However, there are practical drawbacks:

Recursion overhead: Each recursive call incurs function-call overhead — pushing/popping stack frames, managing return addresses. For n=100, the recursion can go up to 99 levels deep, and the total number of function calls can reach hundreds of thousands.
Stack space: The O(n) recursion depth adds to memory usage. For very deep recursion or constrained environments, this can be a concern.
Unpredictable order: The top-down approach computes subproblems in the order they're first needed by the recursion, which may not be cache-friendly for the CPU.

Key insight: We can eliminate recursion entirely by computing subproblems iteratively in a specific order — from smaller subchains to larger ones. Since every subproblem dp[i][j] only depends on subproblems with shorter chain lengths (which have smaller j - i values), we can fill the table diagonal by diagonal. This is the bottom-up tabulation approach — same O(n³) time and O(n²) space, but with no recursion overhead and better cache locality.

Optimal Approach - Bottom-Up DP (Tabulation)

Intuition

Instead of starting from the full chain and recursing down to base cases, we flip the computation: start from the smallest subchains and build up to the full chain iteratively.

We create a 2D table dp[i][j] where dp[i][j] represents the minimum multiplication cost for the subchain from index i to index j. The key observation is:

Chain length 1 (single matrix, j = i+1): dp[i][j] = 0. No multiplication needed.
Chain length 2 (two matrices, j = i+2): dp[i][j] = arr[i] × arr[i+1] × arr[j]. Only one way to multiply.
Chain length L (L matrices, j = i+L): Try every split k from i+1 to j-1. dp[i][j] = min over k of (dp[i][k] + dp[k][j] + arr[i] × arr[k] × arr[j]).

We fill the table diagonally — first all length-1 chains (the main diagonal), then all length-2 chains (one diagonal above), then length-3, and so on. When we compute dp[i][j] for a chain of length L, all shorter subchains dp[i][k] and dp[k][j] are already computed and available in the table.

Imagine building a pyramid brick by brick. The bottom row (length-1 chains) is the foundation — all zeros. Each higher row uses bricks from the row below. The top brick dp[0][n-1] is the answer.

The table is upper-triangular: we only fill cells where j > i. The diagonal dp[i][i+1] is all zeros, and the answer sits at dp[0][n-1] in the top-right corner.

DP table showing diagonal fill order for Matrix Chain Multiplication with arr = [10, 30, 5, 60]

Step-by-Step Explanation

Let's trace through with arr = [1, 2, 3, 4, 3] (4 matrices: M1=1×2, M2=2×3, M3=3×4, M4=4×3):

Step 1: Initialize DP table. Create dp[5][5] filled with 0. The diagonal dp[i][i+1] = 0 for all i (single-matrix base cases).

Step 2: Fill chain length 2 (two adjacent matrices).

dp[0][2]: M1×M2. Only split k=1: dp[0][1] + dp[1][2] + arr[0]×arr[1]×arr[2] = 0 + 0 + 1×2×3 = 6.
dp[1][3]: M2×M3. Only split k=2: 0 + 0 + arr[1]×arr[2]×arr[3] = 2×3×4 = 24.
dp[2][4]: M3×M4. Only split k=3: 0 + 0 + arr[2]×arr[3]×arr[4] = 3×4×3 = 36.

Step 3: Fill chain length 3 (three adjacent matrices).

dp[0][3]: M1×M2×M3. Try two splits:
- k=1: dp[0][1] + dp[1][3] + arr[0]×arr[1]×arr[3] = 0 + 24 + 1×2×4 = 32.
- k=2: dp[0][2] + dp[2][3] + arr[0]×arr[2]×arr[3] = 6 + 0 + 1×3×4 = 18.
- Minimum = 18. dp[0][3] = 18.
dp[1][4]: M2×M3×M4. Try two splits:
- k=2: dp[1][2] + dp[2][4] + arr[1]×arr[2]×arr[4] = 0 + 36 + 2×3×3 = 54.
- k=3: dp[1][3] + dp[3][4] + arr[1]×arr[3]×arr[4] = 24 + 0 + 2×4×3 = 48.
- Minimum = 48. dp[1][4] = 48.

Step 4: Fill chain length 4 (full chain M1×M2×M3×M4).

dp[0][4]: Try three splits:
- k=1: dp[0][1] + dp[1][4] + arr[0]×arr[1]×arr[4] = 0 + 48 + 1×2×3 = 54.
- k=2: dp[0][2] + dp[2][4] + arr[0]×arr[2]×arr[4] = 6 + 36 + 1×3×3 = 51.
- k=3: dp[0][3] + dp[3][4] + arr[0]×arr[3]×arr[4] = 18 + 0 + 1×4×3 = 30.
- Minimum = 30. dp[0][4] = 30.

Result: dp[0][4] = 30.

Bottom-Up DP — Filling the Table Diagonal by Diagonal — Watch how the DP table fills from the base-case diagonal outward. Each cell dp[i][j] is computed using already-filled cells with shorter chain lengths, building toward the answer at dp[0][n-1].

Algorithm

Let n = arr.size(). Create a 2D table dp[n][n] initialized to 0.
Iterate over chain lengths len from 2 to n-1:
- For each starting index i from 0 to n-1-len:
  - Set j = i + len
  - Set dp[i][j] = infinity
  - For each split point k from i+1 to j-1:
    - Compute cost = dp[i][k] + dp[k][j] + arr[i] × arr[k] × arr[j]
    - Update dp[i][j] = min(dp[i][j], cost)
Return dp[0][n-1].

Code

#include <bits/stdc++.h>
using namespace std;

int matrixMultiplication(vector<int>& arr) {
    int n = arr.size();
    
    // dp[i][j] = minimum cost to multiply chain [i..j]
    vector<vector<int>> dp(n, vector<int>(n, 0));
    
    // Fill by increasing chain length
    for (int len = 2; len < n; len++) {
        for (int i = 0; i < n - len; i++) {
            int j = i + len;
            dp[i][j] = INT_MAX;
            
            // Try every split point
            for (int k = i + 1; k < j; k++) {
                int cost = dp[i][k] + dp[k][j]
                         + arr[i] * arr[k] * arr[j];
                dp[i][j] = min(dp[i][j], cost);
            }
        }
    }
    
    return dp[0][n - 1];
}

import sys

def matrixMultiplication(arr):
    n = len(arr)
    
    # dp[i][j] = minimum cost to multiply chain [i..j]
    dp = [[0] * n for _ in range(n)]
    
    # Fill by increasing chain length
    for length in range(2, n):
        for i in range(n - length):
            j = i + length
            dp[i][j] = sys.maxsize
            
            # Try every split point
            for k in range(i + 1, j):
                cost = (dp[i][k] + dp[k][j]
                       + arr[i] * arr[k] * arr[j])
                dp[i][j] = min(dp[i][j], cost)
    
    return dp[0][n - 1]

class Solution {
    static int matrixMultiplication(int[] arr) {
        int n = arr.length;
        
        // dp[i][j] = minimum cost to multiply chain [i..j]
        int[][] dp = new int[n][n];
        
        // Fill by increasing chain length
        for (int len = 2; len < n; len++) {
            for (int i = 0; i < n - len; i++) {
                int j = i + len;
                dp[i][j] = Integer.MAX_VALUE;
                
                // Try every split point
                for (int k = i + 1; k < j; k++) {
                    int cost = dp[i][k] + dp[k][j]
                             + arr[i] * arr[k] * arr[j];
                    dp[i][j] = Math.min(dp[i][j], cost);
                }
            }
        }
        
        return dp[0][n - 1];
    }
}

Complexity Analysis

Time Complexity: O(n³)

The outer loop runs for chain lengths from 2 to n-1 (n-2 iterations). For each length, the middle loop iterates over starting positions (up to n times). For each cell, the inner loop tries all split points (up to n-2 times). The three nested loops give O(n³) total. Each iteration does O(1) work (one multiplication, two additions, one comparison).

Space Complexity: O(n²)

The DP table dp[n][n] uses O(n²) space. No recursion stack is needed — this is the key advantage over the memoization approach. All computation is done iteratively with predictable, cache-friendly memory access patterns.

Comparison with Memoization:

Both are O(n³) time and O(n²) space asymptotically.
Bottom-up eliminates recursion overhead (no function calls, no stack frames).
Bottom-up has better cache locality (fills table in a predictable diagonal pattern).
In practice, bottom-up is faster by a constant factor, especially for larger inputs.

Language Essentials & Syntax

EASY

Pattern-Based Logic Building

EASY

Standard Libraries & Collections

EASY

Essential Mathematics

EASY

MEDIUM

Recursion Fundamentals

EASY

Hashing Essentials

EASY

MEDIUM

Comparison-Based Sorts

EASY

Divide & Conquer Sorts

EASY

MEDIUM

Array Fundamentals

EASY

MEDIUM

Intermediate Array Problems

EASY

MEDIUM

Advanced Array Challenges

EASY

MEDIUM

HARD

Searching in Sorted Arrays

EASY

MEDIUM

Search Space Reduction

EASY

MEDIUM

HARD

Matrix Binary Search

EASY

MEDIUM

HARD

Core String Operations

EASY

MEDIUM

Intermediate String Challenges

EASY

MEDIUM

Singly Linked List Basics

EASY

MEDIUM

Doubly Linked List Basics

EASY

MEDIUM

Linked List Techniques

EASY

MEDIUM

Doubly Linked List Techniques

MEDIUM

HARD

Advanced Linked List Challenges

MEDIUM

HARD

Building Recursive Thinking

EASY

MEDIUM

Generating Subsequences

EASY

MEDIUM

Exhaustive Search & Constraint Solving

MEDIUM

HARD

Bitwise Operations Fundamentals

EASY

MEDIUM

Bit Manipulation Applications

EASY

MEDIUM

Number Theory & Primes

EASY

MEDIUM

HARD