Skip to main content

Tree Data Structure Basics

Description

A tree is a hierarchical, non-linear data structure composed of nodes connected by edges. Unlike arrays or linked lists where data is stored sequentially, a tree organizes data across multiple levels, forming a parent–child relationship.

Given a tree with N nodes numbered 0 to N-1 and N-1 edges (where each edge [u, v] means node u is the parent of node v), implement a tree data structure that supports these fundamental operations:

  1. getRoot() — Return the root node (the node with no parent).
  2. getParent(node) — Return the parent of the given node.
  3. getChildren(node) — Return a list of all direct children of the given node.
  4. getLeaves() — Return all leaf nodes (nodes with no children).
  5. getHeight() — Return the height of the tree (the length of the longest path from root to any leaf).

Key Terminology:

  • Root: The topmost node with no parent.
  • Leaf (External Node): A node with no children.
  • Internal Node: A node with at least one child.
  • Parent: The immediate predecessor of a node.
  • Child: The immediate successor of a node.
  • Depth of a node: Number of edges from the root to that node.
  • Height of the tree: Maximum depth among all nodes.
  • Degree of a node: The number of children it has.
A tree with 7 nodes showing root, internal nodes, leaf nodes, edges, levels, and height annotations
A tree with 7 nodes showing root, internal nodes, leaf nodes, edges, levels, and height annotations

Examples

Example 1

Input:

N = 7
edges = [[0,1], [0,2], [0,3], [1,4], [1,5], [2,6]]

Tree structure:

       0
      /|\
     1  2  3
    /\   \
   4  5   6

Queries and Outputs:

  • getRoot() → 0
  • getParent(5) → 1
  • getChildren(0) → [1, 2, 3]
  • getChildren(1) → [4, 5]
  • getLeaves() → [3, 4, 5, 6]
  • getHeight() → 2

Explanation: Node 0 has no parent, so it is the root. Node 5's parent is node 1 (from edge [1,5]). Node 0 has three children: 1, 2, and 3. Nodes 3, 4, 5, and 6 have no children, making them leaves. The longest path from root to a leaf is 0→1→4 (or 0→1→5 or 0→2→6), which has 2 edges, so the height is 2.

Example 2

Input:

N = 4
edges = [[0,1], [1,2], [2,3]]

Tree structure (skewed / linear):

0 → 1 → 2 → 3

Queries and Outputs:

  • getRoot() → 0
  • getParent(2) → 1
  • getChildren(1) → [2]
  • getLeaves() → [3]
  • getHeight() → 3

Explanation: This is a skewed tree where every node except the last has exactly one child. The only leaf is node 3. The height equals 3 because the single root-to-leaf path 0→1→2→3 has 3 edges.

Example 3

Input:

N = 1
edges = []

Queries and Outputs:

  • getRoot() → 0
  • getChildren(0) → []
  • getLeaves() → [0]
  • getHeight() → 0

Explanation: A single-node tree has the root as its only node. Since the root has no children, it is also a leaf. The height is 0 because there are no edges at all.

Constraints

  • 1 ≤ N ≤ 10^5
  • The edges form a valid tree (connected, acyclic, exactly N-1 edges)
  • 0 ≤ node < N for all query arguments
  • Each node has at most N-1 children (general / N-ary tree)

Editorial

Brute Force

Intuition

The simplest way to represent a tree is with a parent array. Imagine you have a class register and the only thing you write down for each student is the name of their direct supervisor. That single column of information — 'who is my parent?' — is enough to reconstruct the entire tree.

We create an array parent of size N where parent[i] stores the parent of node i. The root has no parent, so we mark it as -1.

This representation makes it trivial to look up any node's parent (just read the array), but answering other questions becomes cumbersome. For example, to find the children of a node, you have no direct way — you must scan the entire parent array and collect every node that lists the target as its parent. Similarly, finding leaves requires checking, for every single node, whether anyone else claims it as a parent — a doubly-nested scan.

Step-by-Step Explanation

Let us trace the getChildren(0) operation on the tree from Example 1.

First, we build the parent array from edges [[0,1],[0,2],[0,3],[1,4],[1,5],[2,6]]:

parent = [-1, 0, 0, 0, 1, 1, 2]

(Index 0 has parent -1 because it is the root.)

Now, to find children of node 0, we scan every index and check if parent[i] == 0:

Step 1: Initialize scan. We have parent = [-1, 0, 0, 0, 1, 1, 2] and target node = 0. Result list is empty.

Step 2: Check index 0 — parent[0] = -1. Is -1 equal to 0? No. Node 0 is the root itself, not a child of node 0. Skip.

Step 3: Check index 1 — parent[1] = 0. Is 0 equal to 0? YES. Node 1 is a child of node 0. Add 1 to result. Result = [1].

Step 4: Check index 2 — parent[2] = 0. Is 0 equal to 0? YES. Node 2 is a child of node 0. Result = [1, 2].

Step 5: Check index 3 — parent[3] = 0. Is 0 equal to 0? YES. Node 3 is a child. Result = [1, 2, 3].

Step 6: Check index 4 — parent[4] = 1. Is 1 equal to 0? No. Node 4's parent is 1, not 0. Skip.

Step 7: Check index 5 — parent[5] = 1. Is 1 equal to 0? No. Skip.

Step 8: Check index 6 — parent[6] = 2. Is 2 equal to 0? No. Skip.

Step 9: Scan complete. Children of node 0 = [1, 2, 3]. We had to examine all 7 nodes just to find 3 children.

Brute Force — Scanning Parent Array for Children of Node 0 — Watch how we must scan every single element of the parent array to find which nodes have node 0 as their parent.

Algorithm

Build:

  1. Create a parent array of size N, initialized to -1.
  2. For each edge [u, v], set parent[v] = u.

getRoot():

  1. Scan the parent array and return the index where parent[i] == -1.

getParent(node):

  1. Return parent[node].

getChildren(node):

  1. Initialize an empty list children.
  2. For each index i from 0 to N-1, if parent[i] == node, append i to children.
  3. Return children.

getLeaves():

  1. For each node i, scan the entire parent array to check if any node has i as its parent.
  2. If no node lists i as parent, then i is a leaf.

getHeight():

  1. For each node i, trace the parent chain from i up to the root, counting edges to compute depth.
  2. Return the maximum depth across all nodes.

Code

#include <vector>
#include <algorithm>
using namespace std;

class TreeParentArray {
    vector<int> parent;
    int n;

public:
    void buildTree(int numNodes, vector<vector<int>>& edges) {
        n = numNodes;
        parent.assign(n, -1);
        for (auto& edge : edges) {
            parent[edge[1]] = edge[0];
        }
    }

    int getRoot() {
        for (int i = 0; i < n; i++) {
            if (parent[i] == -1) return i;
        }
        return -1;
    }

    int getParent(int node) {
        return parent[node];
    }

    vector<int> getChildren(int node) {
        vector<int> children;
        for (int i = 0; i < n; i++) {
            if (parent[i] == node) {
                children.push_back(i);
            }
        }
        return children;
    }

    vector<int> getLeaves() {
        vector<int> leaves;
        for (int i = 0; i < n; i++) {
            bool isLeaf = true;
            for (int j = 0; j < n; j++) {
                if (parent[j] == i) {
                    isLeaf = false;
                    break;
                }
            }
            if (isLeaf) leaves.push_back(i);
        }
        return leaves;
    }

    int getHeight() {
        int maxDepth = 0;
        for (int i = 0; i < n; i++) {
            int depth = 0;
            int curr = i;
            while (parent[curr] != -1) {
                depth++;
                curr = parent[curr];
            }
            maxDepth = max(maxDepth, depth);
        }
        return maxDepth;
    }
};
class TreeParentArray:
    def build_tree(self, num_nodes, edges):
        self.n = num_nodes
        self.parent = [-1] * num_nodes
        for u, v in edges:
            self.parent[v] = u

    def get_root(self):
        for i in range(self.n):
            if self.parent[i] == -1:
                return i
        return -1

    def get_parent(self, node):
        return self.parent[node]

    def get_children(self, node):
        children = []
        for i in range(self.n):
            if self.parent[i] == node:
                children.append(i)
        return children

    def get_leaves(self):
        leaves = []
        for i in range(self.n):
            is_leaf = True
            for j in range(self.n):
                if self.parent[j] == i:
                    is_leaf = False
                    break
            if is_leaf:
                leaves.append(i)
        return leaves

    def get_height(self):
        max_depth = 0
        for i in range(self.n):
            depth = 0
            curr = i
            while self.parent[curr] != -1:
                depth += 1
                curr = self.parent[curr]
            max_depth = max(max_depth, depth)
        return max_depth
import java.util.*;

class TreeParentArray {
    int[] parent;
    int n;

    void buildTree(int numNodes, int[][] edges) {
        n = numNodes;
        parent = new int[n];
        Arrays.fill(parent, -1);
        for (int[] edge : edges) {
            parent[edge[1]] = edge[0];
        }
    }

    int getRoot() {
        for (int i = 0; i < n; i++) {
            if (parent[i] == -1) return i;
        }
        return -1;
    }

    int getParent(int node) {
        return parent[node];
    }

    List<Integer> getChildren(int node) {
        List<Integer> children = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (parent[i] == node) {
                children.add(i);
            }
        }
        return children;
    }

    List<Integer> getLeaves() {
        List<Integer> leaves = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            boolean isLeaf = true;
            for (int j = 0; j < n; j++) {
                if (parent[j] == i) {
                    isLeaf = false;
                    break;
                }
            }
            if (isLeaf) leaves.add(i);
        }
        return leaves;
    }

    int getHeight() {
        int maxDepth = 0;
        for (int i = 0; i < n; i++) {
            int depth = 0;
            int curr = i;
            while (parent[curr] != -1) {
                depth++;
                curr = parent[curr];
            }
            maxDepth = Math.max(maxDepth, depth);
        }
        return maxDepth;
    }
}

Complexity Analysis

Build Time: O(N)
We iterate over the N-1 edges once, performing a constant-time array assignment for each.

getRoot: O(N)
We scan the parent array until we find -1. In the worst case, the root is the last element.

getParent: O(1)
Direct array lookup.

getChildren: O(N)
We must scan all N entries of the parent array since children are not stored explicitly. Every single call costs O(N) regardless of how many children exist.

getLeaves: O(N²)
For each of the N nodes, we scan the entire parent array to determine whether it appears as any node's parent. Outer loop × inner loop = N × N = O(N²).

getHeight: O(N × H) where H is the tree height
For each node, we trace the parent chain upward to the root. In a balanced tree H = O(log N), but in a skewed tree H = O(N), making the worst case O(N²).

Space Complexity: O(N)
Only the parent array of size N is stored.

Why This Approach Is Not Efficient

The parent-array-only approach has a critical flaw: it stores the tree from the child's perspective but forces us to answer questions from the parent's perspective.

Finding children requires a full O(N) scan every time because we have no direct way to look up which nodes list a given node as their parent. With N up to 10^5, even a single getChildren call touches 10^5 entries. If we call getChildren for every node (which getLeaves essentially does), we end up with 10^5 × 10^5 = 10^10 operations — far beyond any reasonable time limit.

The getHeight operation suffers similarly. For a skewed tree (essentially a linked list), tracing parent pointers from the deepest node to the root takes O(N) steps, and doing this for all N nodes yields O(N²).

The core insight: If we also store the relationship from the parent's perspective — that is, maintain an explicit list of children for each node — then getChildren becomes a direct O(1) lookup, getLeaves drops to O(N), and getHeight can be computed in O(N) with a single DFS traversal.

Comparison table showing operation complexities for Parent Array vs Adjacency List representations
Comparison table showing operation complexities for Parent Array vs Adjacency List representations

Optimal Approach - Adjacency List

Intuition

The fix is beautifully simple: store the tree from both perspectives.

Imagine a school directory. The parent-array approach is like only having a column 'Student → Teacher'. If the principal asks 'who are Teacher X's students?', you must read every row. But if you also maintain a column 'Teacher → List of Students', the answer is instant.

We build two data structures:

  1. A parent array (same as before) for O(1) parent lookups.
  2. A children adjacency list — an array of lists where children[v] stores all direct children of node v.

Both are built simultaneously in a single O(N) pass over the edges. Now:

  • getChildren is a direct lookup into the children list — O(1).
  • getLeaves scans the children array once and collects nodes with empty lists — O(N).
  • getHeight uses a single DFS (Depth-First Search) traversal starting from the root. At each node, we recursively compute the height of all subtrees and take the maximum, adding 1 for the current edge. This visits each node exactly once — O(N).

The DFS for height follows a simple recursive formula:

height(node) = 0                          if node is a leaf
height(node) = 1 + max(height(child))     for each child

Step-by-Step Explanation

Let us trace the getHeight() operation using DFS on the tree from Example 1:

       0
      /|\
     1  2  3
    /\   \
   4  5   6

We perform DFS from the root (node 0) and compute the height bottom-up.

Step 1: Start DFS at root node 0. Depth = 0. Visit its first child.

Step 2: Recurse into node 1 (child of 0). Depth = 1. Visit node 1's first child.

Step 3: Recurse into node 4 (child of 1). Depth = 2. Node 4 has no children — it is a leaf. Return height = 0.

Step 4: Backtrack to node 1. Node 4 returned 0. Now visit node 1's next child: node 5.

Step 5: Recurse into node 5 (child of 1). Depth = 2. Node 5 has no children — leaf. Return height = 0.

Step 6: Backtrack to node 1. Both children processed. height(1) = 1 + max(0, 0) = 1. Return 1 to node 0.

Step 7: Back at node 0. Node 1 returned 1. Now visit next child: node 2. Depth = 1.

Step 8: Recurse into node 6 (child of 2). Depth = 2. Node 6 is a leaf. Return height = 0.

Step 9: Backtrack to node 2. height(2) = 1 + 0 = 1. Return 1 to node 0.

Step 10: Back at node 0. Now visit last child: node 3. Node 3 is a leaf. Return height = 0.

Step 11: All children of node 0 processed. height(0) = 1 + max(1, 1, 0) = 2. The tree height is 2. We visited each of the 7 nodes exactly once.

DFS Height Computation — Visiting Every Node Once — Watch how a single DFS traversal computes the height bottom-up by visiting each node exactly once and propagating subtree heights upward.

Algorithm

Build:

  1. Create a parent array of size N (initialized to -1) and an array of lists children of size N.
  2. For each edge [u, v]: set parent[v] = u and append v to children[u].
  3. Identify the root as the node with parent[i] == -1. Store it for O(1) access.

getRoot():

  1. Return the stored root value.

getParent(node):

  1. Return parent[node].

getChildren(node):

  1. Return children[node] (the pre-built list).

getLeaves():

  1. Iterate over all nodes. If children[i] is empty, add i to the result.

getHeight():

  1. Define a recursive function dfs(node):
    • If children[node] is empty, return 0 (leaf base case).
    • Otherwise, return 1 + max(dfs(child) for child in children[node]).
  2. Call dfs(root) and return the result.

Code

#include <vector>
#include <algorithm>
using namespace std;

class TreeAdjList {
    vector<vector<int>> children;
    vector<int> parent;
    int root;
    int n;

public:
    void buildTree(int numNodes, vector<vector<int>>& edges) {
        n = numNodes;
        children.resize(n);
        parent.assign(n, -1);

        for (auto& edge : edges) {
            children[edge[0]].push_back(edge[1]);
            parent[edge[1]] = edge[0];
        }

        root = -1;
        for (int i = 0; i < n; i++) {
            if (parent[i] == -1) {
                root = i;
                break;
            }
        }
    }

    int getRoot() {
        return root;
    }

    int getParent(int node) {
        return parent[node];
    }

    vector<int> getChildren(int node) {
        return children[node];
    }

    vector<int> getLeaves() {
        vector<int> leaves;
        for (int i = 0; i < n; i++) {
            if (children[i].empty()) {
                leaves.push_back(i);
            }
        }
        return leaves;
    }

    int getHeight() {
        return dfs(root);
    }

private:
    int dfs(int node) {
        if (children[node].empty()) return 0;
        int maxChildHeight = 0;
        for (int child : children[node]) {
            maxChildHeight = max(maxChildHeight, dfs(child));
        }
        return 1 + maxChildHeight;
    }
};
class TreeAdjList:
    def build_tree(self, num_nodes, edges):
        self.n = num_nodes
        self.children = [[] for _ in range(num_nodes)]
        self.parent = [-1] * num_nodes

        for u, v in edges:
            self.children[u].append(v)
            self.parent[v] = u

        self.root = -1
        for i in range(num_nodes):
            if self.parent[i] == -1:
                self.root = i
                break

    def get_root(self):
        return self.root

    def get_parent(self, node):
        return self.parent[node]

    def get_children(self, node):
        return self.children[node]

    def get_leaves(self):
        return [i for i in range(self.n)
                if not self.children[i]]

    def get_height(self):
        def dfs(node):
            if not self.children[node]:
                return 0
            return 1 + max(dfs(child)
                           for child in self.children[node])
        return dfs(self.root)
import java.util.*;

class TreeAdjList {
    List<List<Integer>> children;
    int[] parent;
    int root;
    int n;

    void buildTree(int numNodes, int[][] edges) {
        n = numNodes;
        children = new ArrayList<>();
        parent = new int[n];
        Arrays.fill(parent, -1);

        for (int i = 0; i < n; i++) {
            children.add(new ArrayList<>());
        }

        for (int[] edge : edges) {
            children.get(edge[0]).add(edge[1]);
            parent[edge[1]] = edge[0];
        }

        root = -1;
        for (int i = 0; i < n; i++) {
            if (parent[i] == -1) {
                root = i;
                break;
            }
        }
    }

    int getRoot() {
        return root;
    }

    int getParent(int node) {
        return parent[node];
    }

    List<Integer> getChildren(int node) {
        return children.get(node);
    }

    List<Integer> getLeaves() {
        List<Integer> leaves = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (children.get(i).isEmpty()) {
                leaves.add(i);
            }
        }
        return leaves;
    }

    int getHeight() {
        return dfs(root);
    }

    private int dfs(int node) {
        if (children.get(node).isEmpty()) return 0;
        int maxChildHeight = 0;
        for (int child : children.get(node)) {
            maxChildHeight = Math.max(maxChildHeight,
                                       dfs(child));
        }
        return 1 + maxChildHeight;
    }
}

Complexity Analysis

Build Time: O(N)
We iterate over N-1 edges once, performing O(1) work per edge (one array assignment and one list append).

getRoot: O(1)
The root is identified and stored during the build phase.

getParent: O(1)
Direct array lookup, same as the brute force.

getChildren: O(1)
Direct lookup into the pre-built children list. The list itself has size equal to the number of children, but accessing the list reference is constant time.

getLeaves: O(N)
A single scan over all N nodes, checking if their children list is empty. Each check is O(1).

getHeight: O(N)
The DFS visits each node exactly once. At each node, we examine its children (which sums to N-1 edge lookups total across the entire traversal). Total work: O(N).

Space Complexity: O(N)
We store the parent array (N entries) and the children adjacency list. The total number of entries across all children lists equals N-1 (one per edge). So total space is O(N) + O(N) = O(N).

Comparison with Brute Force:
The most dramatic improvement is in getChildren (O(N) → O(1)), getLeaves (O(N²) → O(N)), and getHeight (O(N²) → O(N)). By investing O(N) extra space for the children lists, every operation becomes optimal.