Union of Two Arrays

Description

This problem is about the Disjoint Set Union (DSU) data structure, also known as Union-Find.

You are given N nodes numbered from 1 to N. Initially, every node belongs to its own separate set (no nodes are connected). You need to implement two operations:

union_(x, y): Merge the set containing node x with the set containing node y. After this operation, x and y (and all nodes in their respective sets) belong to the same set.
isConnected(x, y): Determine whether nodes x and y belong to the same set. Return 1 if they are in the same set, and 0 otherwise.

You are given two arrays par[] and rank1[] initialized as par[i] = i and rank1[i] = 1 for all i. Use these arrays to efficiently manage the disjoint sets.

Examples

Example 1

Input: N = 5, Queries: Union(1,3), isConnected(1,2), Union(1,5), isConnected(3,5)

Output: 0 1

Explanation:

Initially, each node is its own set: {1}, {2}, {3}, {4}, {5}.
Union(1,3): Merge sets containing 1 and 3. Now: {1,3}, {2}, {4}, {5}.
isConnected(1,2): Nodes 1 and 2 are in different sets. Return 0.
Union(1,5): Merge sets containing 1 and 5. Node 1 is in set {1,3}, so now: {1,3,5}, {2}, {4}.
isConnected(3,5): Both 3 and 5 are in set {1,3,5}. Return 1.

Constraints

1 ≤ N ≤ 10^5
1 ≤ Q ≤ 10^5 (number of union/isConnected queries)
1 ≤ x, y ≤ N

Expected Complexity:

Time: O(N + Q) (nearly O(α(N)) per operation with optimizations)
Space: O(1) auxiliary (the par[] and rank1[] arrays are provided)

Editorial

Brute Force - Naive Union-Find

Intuition

The core idea behind Union-Find is to represent each set as a tree, where every node has a parent pointer. The root of a tree is the representative (or "leader") of that set. Two nodes belong to the same set if and only if they have the same root.

Initially, each node is its own root — it points to itself: par[i] = i. This means every node starts in a set containing just itself.

Find operation: To determine which set a node belongs to, we follow parent pointers upward from that node until we reach a node that is its own parent — that's the root. The root uniquely identifies the set.

Union operation: To merge two sets, we find the root of each set. If the roots are different, we make one root point to the other, effectively merging the two trees into one.

isConnected: Simply check if find(x) == find(y) — if both nodes have the same root, they are in the same set.

The simplest version of this is the naive approach: during union, we always attach the root of the second node under the root of the first node, regardless of tree size. This works correctly but can create tall, skinny trees (essentially linked lists), making the find operation slow — up to O(n) per call in the worst case.

Step-by-Step Explanation

Let's trace through the example: N = 5, queries: Union(1,3), isConnected(1,2), Union(1,5), isConnected(3,5).

Initialization: par = [_, 1, 2, 3, 4, 5] (1-indexed). Every node is its own parent.
Sets: {1}, {2}, {3}, {4}, {5}

Query 1: Union(1, 3)

Find root of 1: par[1] = 1 → root is 1.
Find root of 3: par[3] = 3 → root is 3.
Roots differ (1 ≠ 3), so merge: set par[3] = 1. Now node 3's parent is 1.
Sets: {1, 3}, {2}, {4}, {5}

Query 2: isConnected(1, 2)

Find root of 1: par[1] = 1 → root is 1.
Find root of 2: par[2] = 2 → root is 2.
Roots differ (1 ≠ 2). Return 0.

Query 3: Union(1, 5)

Find root of 1: par[1] = 1 → root is 1.
Find root of 5: par[5] = 5 → root is 5.
Roots differ (1 ≠ 5), so merge: set par[5] = 1.
Sets: {1, 3, 5}, {2}, {4}

Query 4: isConnected(3, 5)

Find root of 3: par[3] = 1, par[1] = 1 → root is 1.
Find root of 5: par[5] = 1, par[1] = 1 → root is 1.
Roots are the same (1 == 1). Return 1.

Naive Union-Find — Building the Parent Array — Watch how the parent array evolves as union operations merge sets, and how find operations traverse parent chains to locate roots.

Algorithm

Find(x):

While par[x] ≠ x, set x = par[x]
Return x (the root)

Union(x, y):

Find root_x = Find(x)
Find root_y = Find(y)
If root_x ≠ root_y, set par[root_y] = root_x

isConnected(x, y):

Return 1 if Find(x) == Find(y), else return 0

Code

// Naive Union-Find (without optimizations)
int find(int par[], int x) {
    while (par[x] != x) {
        x = par[x];
    }
    return x;
}

void union_(int par[], int rank1[], int x, int y) {
    int root_x = find(par, x);
    int root_y = find(par, y);
    if (root_x != root_y) {
        par[root_y] = root_x;
    }
}

int isConnected(int par[], int rank1[], int x, int y) {
    return find(par, x) == find(par, y) ? 1 : 0;
}

# Naive Union-Find (without optimizations)
def find(par, x):
    while par[x] != x:
        x = par[x]
    return x

def union_(par, rank1, x, y):
    root_x = find(par, x)
    root_y = find(par, y)
    if root_x != root_y:
        par[root_y] = root_x

def isConnected(par, rank1, x, y):
    return 1 if find(par, x) == find(par, y) else 0

// Naive Union-Find (without optimizations)
int find(int[] par, int x) {
    while (par[x] != x) {
        x = par[x];
    }
    return x;
}

void union_(int[] par, int[] rank1, int x, int y) {
    int rootX = find(par, x);
    int rootY = find(par, y);
    if (rootX != rootY) {
        par[rootY] = rootX;
    }
}

int isConnected(int[] par, int[] rank1, int x, int y) {
    return find(par, x) == find(par, y) ? 1 : 0;
}

Complexity Analysis

Time Complexity: O(n) per find operation in the worst case, O(n × Q) total

In the worst case, the tree degenerates into a straight chain (like a linked list). For example, if we union nodes 1→2, then 2→3, then 3→4, the tree becomes a chain of depth n. Each find operation must traverse the entire chain from a leaf to the root — O(n) per call. With Q queries, total time is O(n × Q).

For n and Q up to 10^5, this could mean 10^10 operations — far too slow.

Space Complexity: O(1) auxiliary

We only use the provided par[] and rank1[] arrays. No additional data structures are created.

Why This Approach Is Not Efficient

The naive approach has a critical flaw: tree height is unbounded. When we always attach one root under another without considering tree sizes, we can create arbitrarily tall trees.

Consider this sequence of unions: Union(1,2), Union(2,3), Union(3,4), Union(4,5). If each time the second root is attached under the first:

After Union(1,2): 2→1
After Union(2,3): To union 2 and 3, we find root of 2 (which is 1), find root of 3 (which is 3), and set par[3] = 1.
But in a degenerate case, we could build a chain: 5→4→3→2→1

Finding the root of node 5 now requires traversing 4 edges — the entire depth of the tree. With n nodes, this is O(n) per find.

Two optimizations fix this:

Union by Rank: Always attach the shorter tree under the taller tree. This limits the maximum height to O(log n).
Path Compression: During find, make every node on the path point directly to the root. This flattens the tree for future queries.

Together, these optimizations reduce the amortized time per operation to O(α(n)), where α is the inverse Ackermann function — effectively constant for all practical inputs.

Optimal Approach - Union by Rank with Path Compression

Intuition

The optimal approach adds two powerful optimizations to the basic Union-Find structure:

1. Union by Rank:
Each tree maintains a rank — an upper bound on its height. When merging two trees, we attach the tree with smaller rank under the tree with larger rank. This ensures the resulting tree stays as flat as possible.

If both trees have the same rank, we pick one as the new root and increment its rank by 1.
If ranks differ, we attach the smaller-rank tree under the larger-rank tree. The rank of the result does not change.

This guarantees the tree height is at most O(log n), improving find from O(n) to O(log n) per call.

2. Path Compression:
Every time we call find(x), after locating the root, we update the parent of every node on the path from x to the root to point directly to the root. This dramatically flattens the tree, so future find calls on these nodes are O(1).

With both optimizations combined, the amortized time per union or find operation becomes O(α(n)), where α(n) is the inverse Ackermann function. For all practical values of n (up to 10^80), α(n) ≤ 4. This is effectively constant time per operation.

The rank array is already provided as rank1[] initialized to 1. We will treat rank as the tree size (union by size variant) which achieves the same bounds.

Step-by-Step Explanation

Let's trace the same example with Union by Rank + Path Compression:

N = 5, queries: Union(1,3), isConnected(1,2), Union(1,5), isConnected(3,5)

Initialization:

par = [_, 1, 2, 3, 4, 5]
rank1 = [_, 1, 1, 1, 1, 1]

Query 1: Union(1, 3)

find(1): par[1]=1 → root=1 (already at root, no path to compress).
find(3): par[3]=3 → root=3.
root_x=1, root_y=3. Ranks: rank1[1]=1, rank1[3]=1. Equal ranks!
Attach 3 under 1: par[3]=1, rank1[1]=2.
par = [, 1, 2, 1, 4, 5], rank1 = [, 2, 1, 1, 1, 1]

Query 2: isConnected(1, 2)

find(1): root=1. find(2): root=2.
1 ≠ 2 → Return 0.

Query 3: Union(1, 5)

find(1): root=1. find(5): root=5.
rank1[1]=2 > rank1[5]=1, so attach 5 under 1: par[5]=1.
rank stays the same since rank of root 1 is already larger.
par = [, 1, 2, 1, 4, 1], rank1 = [, 2, 1, 1, 1, 1]

Query 4: isConnected(3, 5)

find(3): par[3]=1, par[1]=1 → root=1. Path compression: par[3] already points to 1, no change needed.
find(5): par[5]=1, par[1]=1 → root=1.
1 == 1 → Return 1.

Output: 0 1

Union by Rank + Path Compression — Keeping Trees Flat — Watch how union by rank prevents tall trees by attaching shorter trees under taller ones, and how path compression flattens the structure during find operations.

Algorithm

Find(par, x) with Path Compression:

If par[x] ≠ x:
- Recursively set par[x] = Find(par, par[x])
Return par[x]

(This makes every node on the find path point directly to the root.)

Union(par, rank1, x, y) with Union by Rank:

root_x = Find(par, x)
root_y = Find(par, y)
If root_x == root_y, do nothing (already in same set)
If rank1[root_x] < rank1[root_y], set par[root_x] = root_y
Else if rank1[root_x] > rank1[root_y], set par[root_y] = root_x
Else (equal ranks): set par[root_y] = root_x, increment rank1[root_x] by 1

isConnected(par, rank1, x, y):

Return 1 if Find(par, x) == Find(par, y), else return 0

Code

// Union by Rank + Path Compression
int find(int par[], int x) {
    if (par[x] != x) {
        par[x] = find(par, par[x]); // path compression
    }
    return par[x];
}

void union_(int par[], int rank1[], int x, int y) {
    int root_x = find(par, x);
    int root_y = find(par, y);
    
    if (root_x == root_y) return;
    
    // union by rank
    if (rank1[root_x] < rank1[root_y]) {
        par[root_x] = root_y;
    } else if (rank1[root_x] > rank1[root_y]) {
        par[root_y] = root_x;
    } else {
        par[root_y] = root_x;
        rank1[root_x]++;
    }
}

int isConnected(int par[], int rank1[], int x, int y) {
    return find(par, x) == find(par, y) ? 1 : 0;
}

# Union by Rank + Path Compression
def find(par, x):
    if par[x] != x:
        par[x] = find(par, par[x])  # path compression
    return par[x]

def union_(par, rank1, x, y):
    root_x = find(par, x)
    root_y = find(par, y)
    
    if root_x == root_y:
        return
    
    # union by rank
    if rank1[root_x] < rank1[root_y]:
        par[root_x] = root_y
    elif rank1[root_x] > rank1[root_y]:
        par[root_y] = root_x
    else:
        par[root_y] = root_x
        rank1[root_x] += 1

def isConnected(par, rank1, x, y):
    return 1 if find(par, x) == find(par, y) else 0

// Union by Rank + Path Compression
int find(int[] par, int x) {
    if (par[x] != x) {
        par[x] = find(par, par[x]); // path compression
    }
    return par[x];
}

void union_(int[] par, int[] rank1, int x, int y) {
    int rootX = find(par, x);
    int rootY = find(par, y);
    
    if (rootX == rootY) return;
    
    // union by rank
    if (rank1[rootX] < rank1[rootY]) {
        par[rootX] = rootY;
    } else if (rank1[rootX] > rank1[rootY]) {
        par[rootY] = rootX;
    } else {
        par[rootY] = rootX;
        rank1[rootX]++;
    }
}

int isConnected(int[] par, int[] rank1, int x, int y) {
    return find(par, x) == find(par, y) ? 1 : 0;
}

Complexity Analysis

Time Complexity: O((N + Q) × α(N)) ≈ O(N + Q)

With both Union by Rank and Path Compression, each union and find operation runs in amortized O(α(N)) time, where α is the inverse Ackermann function. The inverse Ackermann function grows incredibly slowly — for any practical input size (up to 10^80), α(N) ≤ 4.

For N and Q up to 10^5, the total time is effectively O(N + Q) — a massive improvement over the naive approach's O(N × Q).

Union by Rank alone gives O(log N) per operation (tree height bounded by log N).
Path Compression alone gives amortized O(log N) per operation.
Both together give amortized O(α(N)) — nearly constant.

Space Complexity: O(1) auxiliary

We use only the provided par[] and rank1[] arrays. The recursive path compression uses O(log N) stack space in the worst case, but this can be converted to an iterative two-pass approach for strict O(1) space.

Language Essentials & Syntax

EASY

Pattern-Based Logic Building

EASY

Standard Libraries & Collections

EASY

Essential Mathematics

EASY

MEDIUM

Recursion Fundamentals

EASY

Hashing Essentials

EASY

MEDIUM

Comparison-Based Sorts

EASY

Divide & Conquer Sorts

EASY

MEDIUM

Array Fundamentals

EASY

MEDIUM

Intermediate Array Problems

EASY

MEDIUM

Advanced Array Challenges

EASY

MEDIUM

HARD

Searching in Sorted Arrays

EASY

MEDIUM

Search Space Reduction

EASY

MEDIUM

HARD

Matrix Binary Search

EASY

MEDIUM

HARD

Core String Operations

EASY

MEDIUM

Intermediate String Challenges

EASY

MEDIUM

Singly Linked List Basics

EASY

MEDIUM

Doubly Linked List Basics

EASY

MEDIUM

Linked List Techniques

EASY

MEDIUM

Doubly Linked List Techniques

MEDIUM

HARD

Advanced Linked List Challenges

MEDIUM

HARD

Building Recursive Thinking

EASY

MEDIUM

Generating Subsequences

EASY

MEDIUM

Exhaustive Search & Constraint Solving

MEDIUM

HARD

Bitwise Operations Fundamentals

EASY

MEDIUM

Bit Manipulation Applications

EASY

MEDIUM

Number Theory & Primes

EASY

MEDIUM

HARD