Skip to main content

Top K Frequent Elements

MEDIUMProblemSolveExternal Links

Description

Given an integer array nums and an integer k, return the k elements that appear most frequently in the array.

You may return the answer in any order. It is guaranteed that the answer is always unique — there will never be a tie that makes the k-th most frequent element ambiguous.

Examples

Example 1

Input: nums = [1, 1, 1, 2, 2, 3], k = 2

Output: [1, 2]

Explanation: The element 1 appears 3 times, 2 appears 2 times, and 3 appears 1 time. The two most frequent elements are 1 and 2. Returning [2, 1] is also accepted since order does not matter.

Example 2

Input: nums = [1], k = 1

Output: [1]

Explanation: There is only one element in the array, so it is trivially the most frequent. Return [1].

Example 3

Input: nums = [4, 4, 4, 6, 6, 2, 2, 2, 2], k = 2

Output: [2, 4]

Explanation: Element 2 appears 4 times, element 4 appears 3 times, and element 6 appears 2 times. The top 2 most frequent are 2 and 4.

Constraints

  • 1 ≤ nums.length ≤ 10^5
  • -10^4 ≤ nums[i] ≤ 10^4
  • k is in the range [1, number of unique elements in the array]
  • It is guaranteed that the answer is unique

Editorial

Brute Force

Intuition

The most natural approach is straightforward: first, count how many times each element appears, then sort all elements by their frequency in descending order, and finally pick the first k elements from the sorted list.

Think of it like a popularity contest at school. You first tally every student's votes (counting frequencies), then you rank all students from most popular to least popular (sorting by frequency), and finally you announce the top k winners.

Step-by-Step Explanation

Let's trace through with nums = [1, 1, 1, 2, 2, 3], k = 2.

Step 1: Build a frequency map by scanning the array.

  • Process 1: freq = {1: 1}
  • Process 1: freq = {1: 2}
  • Process 1: freq = {1: 3}
  • Process 2: freq = {1: 3, 2: 1}
  • Process 2: freq = {1: 3, 2: 2}
  • Process 3: freq = {1: 3, 2: 2, 3: 1}

Step 2: Convert the frequency map into a list of (element, frequency) pairs.

  • pairs = [(1, 3), (2, 2), (3, 1)]

Step 3: Sort pairs by frequency in descending order.

  • sorted = [(1, 3), (2, 2), (3, 1)] (already sorted in this case)

Step 4: Pick the first k = 2 elements from the sorted list.

  • result = [1, 2]

Result: [1, 2]

Brute Force — Count Frequencies Then Sort — Watch how we build a frequency map from the input array, then sort entries by frequency to find the top k elements.

Algorithm

  1. Create a hash map to count the frequency of each element in nums.
  2. Convert the hash map entries into a list of (element, frequency) pairs.
  3. Sort the list by frequency in descending order.
  4. Return the first k elements from the sorted list.

Code

class Solution {
public:
    vector<int> topKFrequent(vector<int>& nums, int k) {
        unordered_map<int, int> freq;
        for (int num : nums) {
            freq[num]++;
        }
        
        vector<pair<int, int>> pairs(freq.begin(), freq.end());
        sort(pairs.begin(), pairs.end(), [](const pair<int,int>& a, const pair<int,int>& b) {
            return a.second > b.second;
        });
        
        vector<int> result;
        for (int i = 0; i < k; i++) {
            result.push_back(pairs[i].first);
        }
        return result;
    }
};
class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        freq = {}
        for num in nums:
            freq[num] = freq.get(num, 0) + 1
        
        sorted_items = sorted(freq.items(), key=lambda x: x[1], reverse=True)
        
        return [item[0] for item in sorted_items[:k]]
class Solution {
    public int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.put(num, freq.getOrDefault(num, 0) + 1);
        }
        
        List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort((a, b) -> b.getValue() - a.getValue());
        
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = entries.get(i).getKey();
        }
        return result;
    }
}

Complexity Analysis

Time Complexity: O(n + d log d)

Building the frequency map takes O(n) where n is the length of the array. Sorting the d distinct elements by frequency takes O(d log d). In the worst case, all elements are distinct (d = n), making this O(n log n).

Space Complexity: O(d)

The frequency map and sorted list each store at most d distinct elements. In the worst case d = n, so space is O(n).

Why This Approach Is Not Efficient

The sorting step dominates the time complexity at O(d log d), which in the worst case is O(n log n). The problem's follow-up explicitly asks for a solution better than O(n log n).

The key observation is that we do not need a full sort of all elements by frequency. We only care about the top k — the rest of the ordering is wasted work. If we could extract just the k most frequent elements without sorting everything, we would save time.

A min-heap of size k can do exactly this: maintain only the k most frequent elements seen so far, discarding the rest. This reduces sorting from O(d log d) to O(d log k).

Better Approach - Min Heap

Intuition

Instead of sorting all elements, we use a min-heap (priority queue) of size k to track only the k most frequent elements as we iterate through the frequency map.

Imagine you are a talent show judge and can only keep k contestants on stage. As each new contestant performs (we process each distinct element), you compare their score (frequency) with the weakest performer currently on stage (the heap's minimum). If the new contestant is better, the weakest one gets eliminated and the new one takes their spot. After all contestants have auditioned, whoever remains on stage is the top k.

The min-heap always gives us the contestant with the lowest frequency in O(1), and replacing them costs only O(log k). Since k is typically much smaller than n, this is a significant improvement over full sorting.

Step-by-Step Explanation

Let's trace through with nums = [1, 1, 1, 2, 2, 3], k = 2.

Step 1: Build frequency map: {1: 3, 2: 2, 3: 1}.

Step 2: Initialize an empty min-heap (ordered by frequency).

Step 3: Process element 1 (frequency 3).

  • Heap size (0) < k (2), so push (3, 1) into heap.
  • Heap: [(3, 1)]

Step 4: Process element 2 (frequency 2).

  • Heap size (1) < k (2), so push (2, 2) into heap.
  • Heap: [(2, 2), (3, 1)] — min-heap, so (2, 2) is on top.

Step 5: Process element 3 (frequency 1).

  • Heap size (2) == k (2). Compare frequency 1 with heap top frequency 2.
  • 1 < 2, so element 3 is not frequent enough. Do NOT push.
  • Heap remains: [(2, 2), (3, 1)]

Step 6: Extract elements from heap.

  • Pop (2, 2) → element 2. Pop (3, 1) → element 1.
  • Result: [1, 2]

Result: [1, 2]

Min Heap — Maintaining Top K Frequent Elements — Watch how a min-heap of size k efficiently filters out less frequent elements, keeping only the k most frequent at all times.

Algorithm

  1. Build a frequency map by scanning the array: O(n).
  2. Create an empty min-heap.
  3. For each (element, frequency) pair in the map:
    a. If heap size < k, push the pair.
    b. Else if frequency > heap-top frequency, pop the top and push the new pair.
  4. Extract all elements from the heap into the result array.
  5. Return the result.

Code

class Solution {
public:
    vector<int> topKFrequent(vector<int>& nums, int k) {
        unordered_map<int, int> freq;
        for (int num : nums) {
            freq[num]++;
        }
        
        // Min-heap of (frequency, element)
        priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> minHeap;
        
        for (auto& [element, count] : freq) {
            minHeap.push({count, element});
            if ((int)minHeap.size() > k) {
                minHeap.pop();
            }
        }
        
        vector<int> result;
        while (!minHeap.empty()) {
            result.push_back(minHeap.top().second);
            minHeap.pop();
        }
        return result;
    }
};
import heapq
from collections import Counter

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        freq = Counter(nums)
        
        # Use a min-heap of size k
        min_heap = []
        for element, count in freq.items():
            heapq.heappush(min_heap, (count, element))
            if len(min_heap) > k:
                heapq.heappop(min_heap)
        
        return [element for count, element in min_heap]
class Solution {
    public int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.put(num, freq.getOrDefault(num, 0) + 1);
        }
        
        // Min-heap ordered by frequency
        PriorityQueue<int[]> minHeap = new PriorityQueue<>((a, b) -> a[1] - b[1]);
        
        for (Map.Entry<Integer, Integer> entry : freq.entrySet()) {
            minHeap.offer(new int[]{entry.getKey(), entry.getValue()});
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }
        
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = minHeap.poll()[0];
        }
        return result;
    }
}

Complexity Analysis

Time Complexity: O(n + d log k)

Building the frequency map takes O(n). For each of the d distinct elements, we perform a heap push/pop costing O(log k). Total: O(n + d log k). Since d ≤ n and k ≤ d, and because log k is typically much smaller than log n, this is a meaningful improvement over O(n log n) when k is small.

Space Complexity: O(d + k)

The frequency map stores d distinct elements. The heap holds at most k elements at any time. Total extra space is O(d).

Why This Approach Is Not Efficient

The heap approach runs in O(n + d log k), which is better than the sorting approach's O(n + d log d). However, it still involves O(d log k) work for the heap operations.

Can we do even better? If we could avoid all comparison-based sorting entirely, we could achieve O(n) time. The insight is that the maximum possible frequency of any element is n (the length of the array). This means frequencies are bounded integers in the range [1, n], which is a scenario perfectly suited for bucket sort — a non-comparison sort that runs in O(n).

By placing elements into buckets indexed by their frequency, we can collect the top k elements by simply walking the buckets from highest frequency downward. No heap, no sorting, just direct placement and linear traversal.

Optimal Approach - Bucket Sort

Intuition

Since the frequency of any element can range from 1 to n (where n is the array length), we create an array of n + 1 buckets. Bucket i holds all elements that appear exactly i times. After building the frequency map and distributing elements into buckets, we walk backwards from bucket n down to bucket 1, collecting elements until we have gathered k.

Think of a mail sorting room with numbered shelves from 1 to n. Each shelf number represents a frequency. When you know that element 5 appears 3 times, you place it on shelf 3. After all elements are shelved, you start from the highest shelf and work your way down, picking items until you have k. Since you visit the highest frequencies first, you naturally get the most frequent elements.

This approach is brilliant because it completely avoids comparison-based sorting. The bucket array acts as a counting sort, giving us O(n) total time.

Step-by-Step Explanation

Let's trace through with nums = [1, 1, 1, 2, 2, 3], k = 2.

Step 1: Build frequency map.

  • freq = {1: 3, 2: 2, 3: 1}

Step 2: Create bucket array of size n + 1 = 7 (indices 0 through 6). Each bucket is an empty list.

  • buckets = [[], [], [], [], [], [], []]

Step 3: Distribute elements into buckets based on frequency.

  • Element 1 has frequency 3 → place in bucket[3].
  • buckets = [[], [], [], [1], [], [], []]

Step 4: Element 2 has frequency 2 → place in bucket[2].

  • buckets = [[], [], [2], [1], [], [], []]

Step 5: Element 3 has frequency 1 → place in bucket[1].

  • buckets = [[], [3], [2], [1], [], [], []]

Step 6: Walk from highest bucket (index 6) downward, collecting elements until we have k = 2.

  • bucket[6]: empty. Skip.
  • bucket[5]: empty. Skip.
  • bucket[4]: empty. Skip.
  • bucket[3]: contains [1]. Collect element 1. result = [1]. Count = 1.
  • bucket[2]: contains [2]. Collect element 2. result = [1, 2]. Count = 2.
  • Count == k. Stop.

Result: [1, 2]

Bucket Sort — O(n) Frequency-Based Collection — Watch how elements are placed into frequency buckets and then collected from highest to lowest frequency to find the top k elements in linear time.

Algorithm

  1. Build a frequency map by scanning the array: O(n).
  2. Create a bucket array of size n + 1, where each bucket is an empty list.
  3. For each (element, frequency) in the frequency map, append the element to bucket[frequency].
  4. Initialize an empty result list.
  5. Walk from bucket[n] down to bucket[1]:
    • For each element in the current bucket, add it to the result.
    • If the result has k elements, stop and return.
  6. Return the result.

Code

class Solution {
public:
    vector<int> topKFrequent(vector<int>& nums, int k) {
        int n = nums.size();
        unordered_map<int, int> freq;
        for (int num : nums) {
            freq[num]++;
        }
        
        // Create buckets: index = frequency, value = list of elements
        vector<vector<int>> buckets(n + 1);
        for (auto& [element, count] : freq) {
            buckets[count].push_back(element);
        }
        
        // Collect from highest frequency bucket downward
        vector<int> result;
        for (int i = n; i >= 1 && (int)result.size() < k; i--) {
            for (int element : buckets[i]) {
                result.push_back(element);
                if ((int)result.size() == k) break;
            }
        }
        return result;
    }
};
from collections import Counter

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        n = len(nums)
        freq = Counter(nums)
        
        # Create buckets: index = frequency
        buckets = [[] for _ in range(n + 1)]
        for element, count in freq.items():
            buckets[count].append(element)
        
        # Collect from highest frequency downward
        result = []
        for i in range(n, 0, -1):
            for element in buckets[i]:
                result.append(element)
                if len(result) == k:
                    return result
        return result
class Solution {
    public int[] topKFrequent(int[] nums, int k) {
        int n = nums.length;
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.put(num, freq.getOrDefault(num, 0) + 1);
        }
        
        // Create buckets: index = frequency
        @SuppressWarnings("unchecked")
        List<Integer>[] buckets = new ArrayList[n + 1];
        for (int i = 0; i <= n; i++) {
            buckets[i] = new ArrayList<>();
        }
        for (Map.Entry<Integer, Integer> entry : freq.entrySet()) {
            buckets[entry.getValue()].add(entry.getKey());
        }
        
        // Collect from highest frequency downward
        int[] result = new int[k];
        int idx = 0;
        for (int i = n; i >= 1 && idx < k; i--) {
            for (int element : buckets[i]) {
                result[idx++] = element;
                if (idx == k) break;
            }
        }
        return result;
    }
}

Complexity Analysis

Time Complexity: O(n)

Building the frequency map takes O(n). Creating and filling the bucket array takes O(d) where d is the number of distinct elements (d ≤ n). The collection phase scans at most n + 1 buckets, and across all buckets there are at most d elements total, so the scan is O(n + d) = O(n). Every step is linear, giving O(n) overall.

Space Complexity: O(n)

The frequency map stores up to d entries. The bucket array has n + 1 slots, and the elements distributed across all buckets total d. Overall extra space is O(n).

This is the most time-efficient solution possible for this problem — we cannot do better than O(n) since we must read every element at least once.