Subarrays with K Different Integers
Description
Given an integer array nums and an integer k, return the number of good subarrays of nums.
A good subarray is a contiguous portion of the array that contains exactly k different (distinct) integers.
A subarray is a contiguous sequence of elements within the array. For example, [2,1] is a subarray of [1,2,1,2,3] (indices 1–2), but [1,3] is not because those elements are not adjacent.
You must count every occurrence of every valid subarray — two subarrays that contain the same values but span different index ranges count separately.
Examples
Example 1
Input: nums = [1,2,1,2,3], k = 2
Output: 7
Explanation: The 7 subarrays with exactly 2 distinct integers are:
- [1,2] (indices 0–1)
- [2,1] (indices 1–2)
- [1,2] (indices 2–3)
- [2,3] (indices 3–4)
- [1,2,1] (indices 0–2)
- [2,1,2] (indices 1–3)
- [1,2,1,2] (indices 0–3)
Subarrays like [1] have only 1 distinct integer, and [1,2,1,2,3] has 3 distinct integers — neither qualifies.
Example 2
Input: nums = [1,2,1,3,4], k = 3
Output: 3
Explanation: The 3 subarrays with exactly 3 distinct integers are:
- [1,2,1,3] (indices 0–3) — distinct values: 1, 2, 3
- [2,1,3] (indices 1–3) — distinct values: 1, 2, 3
- [1,3,4] (indices 2–4) — distinct values: 1, 3, 4
No other contiguous portion has exactly 3 distinct integers.
Constraints
- 1 ≤ nums.length ≤ 2 × 10^4
- 1 ≤ nums[i], k ≤ nums.length
Editorial
Brute Force
Intuition
The most direct way to solve this is to check every possible subarray and count how many have exactly k distinct integers.
Imagine you have a row of numbered cards. You pick a starting card, then slide your finger to the right one card at a time. With each extension, you count how many different numbers are visible in your selection. If the count equals k, you tally one more good subarray. If the count exceeds k, you know that extending further will only add more numbers (never fewer), so you stop and move your starting card one position to the right.
To avoid recounting from scratch on every extension, we maintain a frequency map that we update incrementally as the right boundary expands. When we move the starting position, we discard the map and start a fresh one.
Step-by-Step Explanation
Let us trace through nums = [1,2,1,2,3], k = 2:
Step 1: Start with i = 0. Initialize frequency map.
- j = 0: [1], distinct = 1. Since 1 < 2, not a good subarray.
Step 2: Expand j = 1.
- [1,2], distinct = 2 = k. Good subarray! count = 1.
Step 3: Expand j = 2.
- [1,2,1], distinct = 2 = k. Good subarray! count = 2.
Step 4: Expand j = 3.
- [1,2,1,2], distinct = 2 = k. Good subarray! count = 3.
Step 5: Expand j = 4.
- [1,2,1,2,3], distinct = 3 > k. Break — further expansion cannot reduce distinct.
Step 6: Move to i = 1. Reset frequency map.
- j = 1: [2], distinct = 1. Skip.
- j = 2: [2,1], distinct = 2 = k. count = 4.
- j = 3: [2,1,2], distinct = 2 = k. count = 5.
- j = 4: [2,1,2,3], distinct = 3 > 2. Break.
Step 7: Continue i = 2, 3, 4.
- i = 2: [1,2] is valid (count = 6), [1,2,3] has 3 distinct — break.
- i = 3: [2,3] is valid (count = 7).
- i = 4: [3] has 1 distinct. No valid subarray.
Result: count = 7.
Brute Force — Checking All Subarrays — Watch how brute force fixes each starting position, expands rightward counting distinct integers, and restarts from scratch at each new start.
Algorithm
- Initialize
count = 0. - For each starting index
ifrom0ton - 1:
a. Create an empty frequency map.
b. For each ending indexjfromiton - 1:- Increment
freq[nums[j]]. - If the number of distinct keys exceeds
k, break (further expansion only adds distinct values). - If the number of distinct keys equals
k, incrementcount.
- Increment
- Return
count.
Code
class Solution {
public:
int subarraysWithKDistinct(vector<int>& nums, int k) {
int n = nums.size();
int count = 0;
for (int i = 0; i < n; i++) {
unordered_map<int, int> freq;
for (int j = i; j < n; j++) {
freq[nums[j]]++;
if ((int)freq.size() > k) break;
if ((int)freq.size() == k) count++;
}
}
return count;
}
};class Solution:
def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
n = len(nums)
count = 0
for i in range(n):
freq = {}
for j in range(i, n):
freq[nums[j]] = freq.get(nums[j], 0) + 1
if len(freq) > k:
break
if len(freq) == k:
count += 1
return countclass Solution {
public int subarraysWithKDistinct(int[] nums, int k) {
int n = nums.length;
int count = 0;
for (int i = 0; i < n; i++) {
Map<Integer, Integer> freq = new HashMap<>();
for (int j = i; j < n; j++) {
freq.merge(nums[j], 1, Integer::sum);
if (freq.size() > k) break;
if (freq.size() == k) count++;
}
}
return count;
}
}Complexity Analysis
Time Complexity: O(n²)
The outer loop runs n times. For each starting index, the inner loop can extend up to the end of the array, giving O(n) work per start. In the worst case (e.g., when the array has many repeated values), nearly every subarray is examined, yielding O(n²) total operations.
Space Complexity: O(n)
The frequency map can store at most min(n, k) entries at a time. Since k ≤ n by the constraints, this is O(n) in the worst case, though typically much smaller.
Why This Approach Is Not Efficient
With n up to 2 × 10^4, the brute force performs up to (2 × 10^4)² = 4 × 10^8 operations — too slow for typical time limits.
A natural thought is to apply a sliding window: expand right, and when distinct exceeds k, shrink from left. However, there is a subtle complication: we need to count subarrays with exactly k distinct integers, not "at most" k. A standard sliding window that shrinks until distinct ≤ k can tell us the leftmost valid starting position, but it cannot directly enumerate how many starting positions yield exactly k distinct integers for a given right.
For instance, with nums = [1,2,1,2,3] and k = 2 at right = 3:
- Starting at left = 0 gives [1,2,1,2] → 2 distinct ✓
- Starting at left = 1 gives [2,1,2] → 2 distinct ✓
- Starting at left = 2 gives [1,2] → 2 distinct ✓
- Starting at left = 3 gives [2] → 1 distinct ✗
We need to count 3 subarrays, not just detect the range. This is hard with a single window.
Key insight: Counting subarrays with exactly k distinct integers is difficult directly, but counting subarrays with at most k is easy with a sliding window. And we can decompose:
exactly(k) = atMost(k) − atMost(k − 1)
This works because the set of subarrays with at most k distinct integers includes all subarrays with 1, 2, …, k distinct. Subtracting those with at most k − 1 leaves exactly those with k. Running two sliding windows — one for atMost(k) and one for atMost(k − 1) — gives us the answer in O(n) total time.
Optimal Approach - Sliding Window with At-Most Decomposition
Intuition
The core idea rests on one powerful observation:
The number of subarrays with exactly
kdistinct integers equals the number with at mostkminus the number with at mostk − 1.
Why? Think of it with sets: "at most k" includes everything with 1, 2, …, or k distinct values. "At most k − 1" includes everything with 1, 2, …, or k − 1. The difference is precisely the set with exactly k.
Now counting subarrays with at most m distinct integers is a textbook sliding window problem. We maintain a window [left, right] with a frequency map. Whenever distinct exceeds m, we shrink from the left. At each position of right, every subarray ending at right and starting between left and right is valid — that is right − left + 1 subarrays. We sum this quantity across all positions of right.
Imagine a conveyor belt of boxes. For each new box that arrives on the right, you check how many consecutive boxes (from your current leftmost position) you can include without breaking the rule. That count of boxes is the number of valid subarrays ending at the current position.
We run this sliding window twice — once for m = k and once for m = k − 1 — and subtract the results. Each pass is O(n), so the total is O(n).
Step-by-Step Explanation
Let us trace through nums = [1,2,1,2,3], k = 2.
Pass 1: atMost(k = 2)
Step 1: left = 0, freq = {}, total = 0.
Step 2: right = 0. Add 1: freq = {1:1}. Distinct = 1 ≤ 2.
- Valid subarrays ending at 0: [1]. Count += 1. Total = 1.
Step 3: right = 1. Add 2: freq = {1:1, 2:1}. Distinct = 2 ≤ 2.
- Valid subarrays ending at 1: [2], [1,2]. Count += 2. Total = 3.
Step 4: right = 2. Add 1: freq = {1:2, 2:1}. Distinct = 2 ≤ 2.
- Valid subarrays ending at 2: [1], [2,1], [1,2,1]. Count += 3. Total = 6.
Step 5: right = 3. Add 2: freq = {1:2, 2:2}. Distinct = 2 ≤ 2.
- Valid subarrays ending at 3: [2], [1,2], [2,1,2], [1,2,1,2]. Count += 4. Total = 10.
Step 6: right = 4. Add 3: freq = {1:2, 2:2, 3:1}. Distinct = 3 > 2!
- Shrink: remove nums[0]=1 → freq = {1:1, 2:2, 3:1}. Still 3 distinct.
- Shrink: remove nums[1]=2 → freq = {1:1, 2:1, 3:1}. Still 3 distinct.
- Shrink: remove nums[2]=1 → freq = {2:1, 3:1}. Distinct = 2 ≤ 2. left = 3.
- Valid subarrays ending at 4: [3], [2,3]. Count += 2. Total = 12.
atMost(2) = 12
Pass 2: atMost(k − 1 = 1)
Each position can only contain a single repeated value. The sliding window yields counts [1, 1, 1, 1, 1] = 5.
atMost(1) = 5
Step 7: exactly(2) = atMost(2) − atMost(1) = 12 − 5 = 7.
Sliding Window — Counting Subarrays with At Most K Distinct — Watch the atMost(k=2) pass: the window expands right, shrinks left when distinct exceeds 2, and accumulates the count of valid subarrays at each step.
Algorithm
- Define a helper function
countAtMost(m)that returns the total number of subarrays with at mostmdistinct integers:
a. Initializeleft = 0,total = 0, and an empty frequency map.
b. For eachrightfrom0ton − 1:- Increment
freq[nums[right]]. - While
freqhas more thanmkeys:- Decrement
freq[nums[left]]. If it reaches 0, remove the key. - Increment
left.
- Decrement
- Add
right − left + 1tototal.
c. Returntotal.
- Increment
- Return
countAtMost(k) − countAtMost(k − 1).
Code
class Solution {
public:
int subarraysWithKDistinct(vector<int>& nums, int k) {
return countAtMost(nums, k) - countAtMost(nums, k - 1);
}
private:
int countAtMost(vector<int>& nums, int m) {
int n = nums.size();
int total = 0, left = 0;
unordered_map<int, int> freq;
for (int right = 0; right < n; right++) {
freq[nums[right]]++;
while ((int)freq.size() > m) {
freq[nums[left]]--;
if (freq[nums[left]] == 0) {
freq.erase(nums[left]);
}
left++;
}
total += right - left + 1;
}
return total;
}
};class Solution:
def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
def count_at_most(m: int) -> int:
total = 0
left = 0
freq = {}
for right in range(len(nums)):
freq[nums[right]] = freq.get(nums[right], 0) + 1
while len(freq) > m:
freq[nums[left]] -= 1
if freq[nums[left]] == 0:
del freq[nums[left]]
left += 1
total += right - left + 1
return total
return count_at_most(k) - count_at_most(k - 1)class Solution {
public int subarraysWithKDistinct(int[] nums, int k) {
return countAtMost(nums, k) - countAtMost(nums, k - 1);
}
private int countAtMost(int[] nums, int m) {
int n = nums.length;
int total = 0, left = 0;
Map<Integer, Integer> freq = new HashMap<>();
for (int right = 0; right < n; right++) {
freq.merge(nums[right], 1, Integer::sum);
while (freq.size() > m) {
int leftVal = nums[left];
freq.merge(leftVal, -1, Integer::sum);
if (freq.get(leftVal) == 0) {
freq.remove(leftVal);
}
left++;
}
total += right - left + 1;
}
return total;
}
}Complexity Analysis
Time Complexity: O(n)
The countAtMost helper is called twice (for k and k − 1). Each call performs a single pass of the right pointer across all n elements. Within each call, the left pointer also only moves forward and advances at most n times in total. Every element enters and leaves the frequency map at most once per call. All hash map operations are O(1) amortized. So each call is O(n), and the total is O(n) + O(n) = O(n).
Space Complexity: O(n)
The frequency map stores at most min(n, k) entries at any time. Since k ≤ n by the constraints, the worst case is O(n). In practice, for small k, space is O(k). No additional data structures proportional to input size are needed beyond the map.