Contains Duplicate II
Description
Given an integer array nums and an integer k, determine whether there exist two distinct indices i and j in the array such that:
nums[i] == nums[j](the values at those positions are equal), ANDabs(i - j) <= k(the two positions are at mostkapart)
Return true if such a pair exists, and false otherwise.
In other words, you need to check if any value appears more than once within a sliding window of size k + 1. The problem is not just about finding duplicates — it is about finding duplicates that are close enough to each other in terms of their positions in the array.
Examples
Example 1
Input: nums = [1, 2, 3, 1], k = 3
Output: true
Explanation: The value 1 appears at index 0 and index 3. The distance between these indices is |0 - 3| = 3, which is ≤ k = 3. Since we found two identical values within the required distance, we return true.
Example 2
Input: nums = [1, 0, 1, 1], k = 1
Output: true
Explanation: The value 1 appears at indices 0, 2, and 3. While |0 - 2| = 2 > 1, we also have |2 - 3| = 1 ≤ k = 1. The pair at indices 2 and 3 satisfies both conditions, so we return true.
Example 3
Input: nums = [1, 2, 3, 1, 2, 3], k = 2
Output: false
Explanation: Let us check every pair of equal values:
- Value 1 appears at indices 0 and 3: |0 - 3| = 3 > 2
- Value 2 appears at indices 1 and 4: |1 - 4| = 3 > 2
- Value 3 appears at indices 2 and 5: |2 - 5| = 3 > 2
No pair of duplicates has distance ≤ 2, so we return false.
Constraints
- 1 ≤ nums.length ≤ 10^5
- -10^9 ≤ nums[i] ≤ 10^9
- 0 ≤ k ≤ 10^5
Editorial
Brute Force
Intuition
The most straightforward way to solve this problem is to check every possible pair of indices. For each element in the array, we look at every other element that comes after it (within a distance of k) and check if they share the same value.
Think of it like proofreading a document for repeated words. You read each word and then scan the next few words after it to see if the same word appears again within a certain range. If you find a repeat within that range, you flag it.
Step-by-Step Explanation
Let's trace through with nums = [1, 2, 3, 1], k = 3:
Step 1: Start with i = 0, nums[0] = 1. Check all j from i+1 to min(i+k, n-1) = min(3, 3) = 3.
Step 2: Check pair (i=0, j=1): nums[0]=1, nums[1]=2. Are they equal? No.
Step 3: Check pair (i=0, j=2): nums[0]=1, nums[2]=3. Are they equal? No.
Step 4: Check pair (i=0, j=3): nums[0]=1, nums[3]=1. Are they equal? Yes! And |0-3| = 3 ≤ 3. Found a valid pair!
Step 5: Return true.
Brute Force — Checking Pairs Within Distance k — For each element, we scan forward up to k positions looking for a duplicate value. We stop as soon as we find one.
Algorithm
- For each index i from 0 to n-1:
- For each index j from i+1 to min(i+k, n-1):
- If nums[i] == nums[j], return true
- For each index j from i+1 to min(i+k, n-1):
- If no pair found, return false
Code
class Solution {
public:
bool containsNearbyDuplicate(vector<int>& nums, int k) {
int n = nums.size();
for (int i = 0; i < n; i++) {
for (int j = i + 1; j <= min(i + k, n - 1); j++) {
if (nums[i] == nums[j]) {
return true;
}
}
}
return false;
}
};class Solution:
def containsNearbyDuplicate(self, nums: list[int], k: int) -> bool:
n = len(nums)
for i in range(n):
for j in range(i + 1, min(i + k + 1, n)):
if nums[i] == nums[j]:
return True
return Falseclass Solution {
public boolean containsNearbyDuplicate(int[] nums, int k) {
int n = nums.length;
for (int i = 0; i < n; i++) {
for (int j = i + 1; j <= Math.min(i + k, n - 1); j++) {
if (nums[i] == nums[j]) {
return true;
}
}
}
return false;
}
}Complexity Analysis
Time Complexity: O(n × min(n, k))
For each of the n elements, the inner loop runs up to min(k, n-1) times. In the worst case where k ≥ n, this becomes O(n²). When k is small, it is O(n × k).
Space Complexity: O(1)
We only use a constant number of loop variables. No additional data structures are needed.
Why This Approach Is Not Efficient
The brute force approach becomes very slow when both n and k are large. With n up to 10^5 and k up to 10^5, the worst case involves 10^5 × 10^5 = 10^{10} comparisons — far too many for typical time constraints.
The core inefficiency is that we repeatedly compare the same elements. When we move from index i to i+1, most of the elements we need to check overlap with what we already checked. We are not leveraging any memory of past comparisons.
The key insight: instead of re-scanning a window of k elements every time, we can maintain a set of the most recent k elements. For each new element, we check if it already exists in the set (O(1) lookup). If it does, we have found a nearby duplicate. If not, we add it to the set and remove the element that is now more than k positions behind. This transforms repeated linear scans into constant-time lookups.
Better Approach - Hash Map (Last Seen Index)
Intuition
Instead of scanning forward from each element, we can look backward using a hash map. As we traverse the array from left to right, we store each value along with the most recent index where we saw it.
For each new element, we check the hash map: has this value been seen before? If yes, how far back was it? If the distance is within k, we have our answer.
Think of it like a receptionist at a hotel checking repeat visitors. Each time a guest arrives, the receptionist checks the guest log: "Have you been here before? When was your last visit?" If the last visit was recent enough (within k days), the receptionist flags it.
This approach is simple and effective: one pass through the array, storing and checking indices as we go.
Step-by-Step Explanation
Let's trace through with nums = [1, 2, 3, 1], k = 3:
Step 1: Initialize an empty hash map to store value → last seen index.
Step 2: Process nums[0] = 1. Is 1 in the map? No (map is empty). Store {1 → 0}. Map: {1: 0}.
Step 3: Process nums[1] = 2. Is 2 in the map? No. Store {2 → 1}. Map: {1: 0, 2: 1}.
Step 4: Process nums[2] = 3. Is 3 in the map? No. Store {3 → 2}. Map: {1: 0, 2: 1, 3: 2}.
Step 5: Process nums[3] = 1. Is 1 in the map? Yes, last seen at index 0.
- Distance = |3 - 0| = 3 ≤ k = 3. Within range!
Step 6: Return true.
Hash Map — Tracking Last Seen Index — We store each value's most recent index in a hash map. When we encounter a value already in the map, we check if the distance to its last occurrence is within k.
Algorithm
- Create an empty hash map
last_seen(maps value → most recent index) - For each index i from 0 to n-1:
- If nums[i] exists in
last_seen:- Let prev_index = last_seen[nums[i]]
- If i - prev_index ≤ k, return true
- Update last_seen[nums[i]] = i (store/overwrite with latest index)
- If nums[i] exists in
- Return false
Code
class Solution {
public:
bool containsNearbyDuplicate(vector<int>& nums, int k) {
unordered_map<int, int> lastSeen;
for (int i = 0; i < nums.size(); i++) {
if (lastSeen.count(nums[i]) && i - lastSeen[nums[i]] <= k) {
return true;
}
lastSeen[nums[i]] = i;
}
return false;
}
};class Solution:
def containsNearbyDuplicate(self, nums: list[int], k: int) -> bool:
last_seen = {}
for i, num in enumerate(nums):
if num in last_seen and i - last_seen[num] <= k:
return True
last_seen[num] = i
return Falseclass Solution {
public boolean containsNearbyDuplicate(int[] nums, int k) {
Map<Integer, Integer> lastSeen = new HashMap<>();
for (int i = 0; i < nums.length; i++) {
if (lastSeen.containsKey(nums[i]) && i - lastSeen.get(nums[i]) <= k) {
return true;
}
lastSeen.put(nums[i], i);
}
return false;
}
}Complexity Analysis
Time Complexity: O(n)
We traverse the array exactly once. Each hash map lookup and insertion is O(1) on average. So the total work is n iterations × O(1) per iteration = O(n).
Space Complexity: O(n)
In the worst case (all distinct elements), the hash map stores all n elements. The map grows up to size n.
Why This Approach Is Not Efficient
The hash map approach is already O(n) in time, which is optimal. However, its space usage can be improved. The map stores every unique value ever seen, even values that are far behind our current position and no longer relevant (more than k positions back).
For example, if k = 3 and we are at index 1000, we only care about elements at indices 997, 998, 999, and 1000. But the hash map still holds entries from indices 0 through 996 — wasting memory.
The insight: we can maintain a sliding window set of size at most k + 1. Instead of storing all values with their indices, we keep only the values within the current window of k + 1 elements. When we slide the window forward, we add the new element and remove the element that fell out of the window. This reduces space from O(n) to O(min(n, k)).
For cases where k is much smaller than n, this is a meaningful improvement.
Optimal Approach - Sliding Window with Hash Set
Intuition
Instead of remembering every element we have ever seen, we only need to remember the elements within the last k positions. This is essentially a sliding window of size k + 1 (the current element plus the k elements before it).
Imagine you are a security guard monitoring a hallway with a window that shows only the last k + 1 people who walked by. Each time a new person appears, you check: "Is this person already visible in my window?" If yes, flag a duplicate. Then, if your window is full (already showing k + 1 people), remove the oldest person from view before adding the new one.
We use a hash set (not a hash map) because we no longer need to track indices — the window boundaries implicitly enforce the distance constraint. If a value is in the set, it must be within the last k positions.
Step-by-Step Explanation
Let's trace through with nums = [1, 0, 1, 1], k = 1.
The algorithm: for each i, (1) check if nums[i] is in the set → if yes, return true, (2) add nums[i] to the set, (3) if set size > k, remove nums[i - k] to maintain the window.
Step 1: Initialize empty set. Window holds at most k + 1 = 2 values.
Step 2: i = 0, nums[0] = 1. Is 1 in the set? No. Add 1 → set: {1}. Size 1 ≤ k = 1, no eviction.
Step 3: i = 1, nums[1] = 0. Is 0 in the set {1}? No. Add 0 → set: {1, 0}. Size 2 > k = 1, so evict nums[1 - 1] = nums[0] = 1. Set: {0}.
Step 4: i = 2, nums[2] = 1. Is 1 in the set {0}? No. Add 1 → set: {0, 1}. Size 2 > 1, evict nums[2 - 1] = nums[1] = 0. Set: {1}.
Step 5: i = 3, nums[3] = 1. Is 1 in the set {1}? Yes! Duplicate found within distance k. Return true.
Sliding Window Set — Fixed-Size Window for Duplicate Detection — We maintain a hash set of at most k+1 elements. For each new element, we check membership, add it, then evict the oldest if the window overflows.
Algorithm
- Create an empty hash set
window - For each index i from 0 to n-1:
- If nums[i] is already in
window, return true - Add nums[i] to
window - If the set size exceeds k, remove nums[i - k] from the set (evict the element that just left the window)
- If nums[i] is already in
- Return false
Code
class Solution {
public:
bool containsNearbyDuplicate(vector<int>& nums, int k) {
unordered_set<int> window;
for (int i = 0; i < nums.size(); i++) {
if (window.count(nums[i])) {
return true;
}
window.insert(nums[i]);
if (window.size() > k) {
window.erase(nums[i - k]);
}
}
return false;
}
};class Solution:
def containsNearbyDuplicate(self, nums: list[int], k: int) -> bool:
window = set()
for i in range(len(nums)):
if nums[i] in window:
return True
window.add(nums[i])
if len(window) > k:
window.remove(nums[i - k])
return Falseclass Solution {
public boolean containsNearbyDuplicate(int[] nums, int k) {
Set<Integer> window = new HashSet<>();
for (int i = 0; i < nums.length; i++) {
if (window.contains(nums[i])) {
return true;
}
window.add(nums[i]);
if (window.size() > k) {
window.remove(nums[i - k]);
}
}
return false;
}
}Complexity Analysis
Time Complexity: O(n)
We traverse the array once. Each set operation (lookup, insert, remove) takes O(1) on average. Total: n iterations × O(1) work each = O(n).
Space Complexity: O(min(n, k))
The hash set stores at most k + 1 elements at any time (the current window). If k ≥ n, the set holds at most n elements. This is an improvement over the hash map approach which always uses O(n) space regardless of k, since we never store elements outside the active window.