Group Anagrams
Description
You are given an array of strings. Your task is to group together all strings that are anagrams of each other.
Two strings are considered anagrams if one can be rearranged to form the other. In other words, they contain exactly the same characters with the same frequencies, just in a different order.
Return the grouped anagrams as a list of lists. The groups can appear in any order, and the strings within each group can also appear in any order.
Examples
Example 1
Input: strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["bat"], ["nat", "tan"], ["ate", "eat", "tea"]]
Explanation: The string "bat" has no anagram partner in the list, so it forms its own group. The strings "nat" and "tan" are anagrams of each other because rearranging the letters of "nat" gives "tan" and vice versa. Similarly, "ate", "eat", and "tea" are all anagrams — they each contain exactly one 'a', one 'e', and one 't'.
Example 2
Input: strs = [""]
Output: [[""]]
Explanation: There is only one string, which is an empty string. It forms a group by itself. An empty string is considered an anagram of itself since both have zero characters.
Example 3
Input: strs = ["a"]
Output: [["a"]]
Explanation: A single-character string can only be an anagram of itself. Since there is no other string to pair with, it forms its own group.
Constraints
- 1 ≤ strs.length ≤ 10^4
- 0 ≤ strs[i].length ≤ 100
- strs[i] consists of lowercase English letters only
Editorial
Brute Force
Intuition
The most straightforward way to group anagrams is to compare every string with every other string and check whether they are anagrams. Two strings are anagrams if sorting both of them produces the same result.
Imagine you have a pile of Scrabble tiles arranged into different words. To find which words use the same tiles, you could pick up each word, sort its letters alphabetically, and then compare the sorted version with the sorted version of every other word. Words that look the same after sorting belong to the same group.
For each string that hasn't been grouped yet, we scan all remaining strings, check if they are anagrams (by sorting and comparing), and collect them into a group.
Step-by-Step Explanation
Let's trace through with strs = ["eat", "tea", "tan", "ate"]:
Step 1: Mark all strings as ungrouped. Start with i=0, word "eat".
Step 2: Sort "eat" → "aet". Create a new group starting with "eat".
Step 3: Compare with j=1, word "tea". Sort "tea" → "aet". "aet" == "aet"? YES — add "tea" to the group. Mark "tea" as grouped.
Step 4: Compare with j=2, word "tan". Sort "tan" → "ant". "aet" == "ant"? NO — skip.
Step 5: Compare with j=3, word "ate". Sort "ate" → "aet". "aet" == "aet"? YES — add "ate" to the group. Mark "ate" as grouped. Group 1: ["eat", "tea", "ate"].
Step 6: Move to i=1. "tea" is already grouped — skip.
Step 7: Move to i=2, word "tan". Sort "tan" → "ant". Create a new group with "tan".
Step 8: Compare with j=3. "ate" is already grouped — skip. Group 2: ["tan"].
Step 9: Move to i=3. "ate" is already grouped — skip. All strings processed.
Result: [["eat", "tea", "ate"], ["tan"]]
Brute Force — Comparing Every Pair — Watch how we compare each ungrouped string with all others by sorting their characters and checking for equality.
Algorithm
- Create a boolean array to track which strings have been grouped
- For each ungrouped string at index i:
a. Sort the string to get its anagram fingerprint
b. Start a new group containing this string
c. For each ungrouped string at index j > i:- Sort it and compare with the fingerprint from step 2a
- If they match, add it to the current group and mark as grouped
- Add the completed group to the result
- Return all groups
Code
class Solution {
public:
vector<vector<string>> groupAnagrams(vector<string>& strs) {
int n = strs.size();
vector<bool> grouped(n, false);
vector<vector<string>> result;
for (int i = 0; i < n; i++) {
if (grouped[i]) continue;
string sortedI = strs[i];
sort(sortedI.begin(), sortedI.end());
vector<string> group = {strs[i]};
for (int j = i + 1; j < n; j++) {
if (grouped[j]) continue;
string sortedJ = strs[j];
sort(sortedJ.begin(), sortedJ.end());
if (sortedI == sortedJ) {
group.push_back(strs[j]);
grouped[j] = true;
}
}
grouped[i] = true;
result.push_back(group);
}
return result;
}
};class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
n = len(strs)
grouped = [False] * n
result = []
for i in range(n):
if grouped[i]:
continue
sorted_i = ''.join(sorted(strs[i]))
group = [strs[i]]
for j in range(i + 1, n):
if grouped[j]:
continue
sorted_j = ''.join(sorted(strs[j]))
if sorted_i == sorted_j:
group.append(strs[j])
grouped[j] = True
grouped[i] = True
result.append(group)
return resultclass Solution {
public List<List<String>> groupAnagrams(String[] strs) {
int n = strs.length;
boolean[] grouped = new boolean[n];
List<List<String>> result = new ArrayList<>();
for (int i = 0; i < n; i++) {
if (grouped[i]) continue;
char[] charsI = strs[i].toCharArray();
Arrays.sort(charsI);
String sortedI = new String(charsI);
List<String> group = new ArrayList<>();
group.add(strs[i]);
for (int j = i + 1; j < n; j++) {
if (grouped[j]) continue;
char[] charsJ = strs[j].toCharArray();
Arrays.sort(charsJ);
String sortedJ = new String(charsJ);
if (sortedI.equals(sortedJ)) {
group.add(strs[j]);
grouped[j] = true;
}
}
grouped[i] = true;
result.add(group);
}
return result;
}
}Complexity Analysis
Time Complexity: O(n² × k log k)
We have two nested loops over the n strings, giving O(n²) pairs. For each pair, we sort one string of average length k, costing O(k log k). In practice, each string gets sorted multiple times across different pair comparisons, making the total work O(n² × k log k).
Space Complexity: O(n × k)
We store the grouped boolean array of size n and create sorted copies of strings. The result storage itself requires O(n × k) space to hold all the strings in their groups.
Why This Approach Is Not Efficient
The brute force approach compares every pair of strings, resulting in O(n²) comparisons. With n up to 10^4, that means up to 50 million pair comparisons, each involving a sort operation of O(k log k) where k can be up to 100. This totals roughly 50 million × 700 ≈ 35 billion operations — far too slow.
The core inefficiency is that we repeatedly sort the same strings and compare them pairwise. We are essentially answering the question 'are these two strings anagrams?' n² times, when we could instead compute a fingerprint for each string once and group by that fingerprint.
Key insight: if we sort each string just once and use the sorted version as a key in a hash map, strings that are anagrams will naturally map to the same key. This eliminates the need for pairwise comparisons entirely.
Better Approach - Sorted String Key
Intuition
Instead of comparing every pair, we can use a clever observation: all anagrams, when their characters are sorted alphabetically, produce the exact same string. For example, "eat", "tea", and "ate" all become "aet" when sorted.
Think of it like organizing books by their ISBN rather than comparing every book's content with every other book. The sorted string acts as a unique identifier (ISBN) for each anagram group.
We sort each string once, use the sorted version as a key in a hash map, and push the original string into the list associated with that key. At the end, every key in the map represents one anagram group, and its value is the list of all original strings belonging to that group.
Step-by-Step Explanation
Let's trace through with strs = ["eat", "tea", "tan", "ate", "nat", "bat"]:
Step 1: Initialize an empty hash map.
Step 2: Process "eat". Sort → "aet". Map has no key "aet". Create new entry: {"aet": ["eat"]}.
Step 3: Process "tea". Sort → "aet". Map already has key "aet". Append: {"aet": ["eat", "tea"]}.
Step 4: Process "tan". Sort → "ant". Map has no key "ant". Create new entry: {"aet": ["eat", "tea"], "ant": ["tan"]}.
Step 5: Process "ate". Sort → "aet". Map has key "aet". Append: {"aet": ["eat", "tea", "ate"], "ant": ["tan"]}.
Step 6: Process "nat". Sort → "ant". Map has key "ant". Append: {"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"]}.
Step 7: Process "bat". Sort → "abt". Map has no key "abt". Create: {"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"], "abt": ["bat"]}.
Step 8: Collect all values from the map.
Result: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
Sorted Key Hash Map — Grouping by Sorted Fingerprint — Watch how sorting each string creates a canonical key, and the hash map automatically groups anagrams together under the same key.
Algorithm
- Create an empty hash map where keys are sorted strings and values are lists of original strings
- For each string in the input array:
a. Sort the characters of the string to create a canonical key
b. If the key exists in the map, append the original string to that key's list
c. If the key does not exist, create a new entry with the key and a list containing the original string - Return all values from the hash map as the result
Code
class Solution {
public:
vector<vector<string>> groupAnagrams(vector<string>& strs) {
unordered_map<string, vector<string>> groups;
for (const string& s : strs) {
string key = s;
sort(key.begin(), key.end());
groups[key].push_back(s);
}
vector<vector<string>> result;
for (auto& pair : groups) {
result.push_back(pair.second);
}
return result;
}
};class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
groups = {}
for s in strs:
key = ''.join(sorted(s))
if key not in groups:
groups[key] = []
groups[key].append(s)
return list(groups.values())class Solution {
public List<List<String>> groupAnagrams(String[] strs) {
Map<String, List<String>> groups = new HashMap<>();
for (String s : strs) {
char[] chars = s.toCharArray();
Arrays.sort(chars);
String key = new String(chars);
groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(groups.values());
}
}Complexity Analysis
Time Complexity: O(n × k log k)
We iterate through all n strings. For each string of length k, we sort it in O(k log k) time. The hash map insertion and lookup are O(k) on average (for hashing the string key). Since k log k dominates k, the total time is O(n × k log k), where n is the number of strings and k is the maximum string length.
Space Complexity: O(n × k)
We store all n strings in the hash map. Each string has length up to k, so the total space for storing all strings is O(n × k). The sorted keys also use O(k) space each, but they are reused.
Why This Approach Is Not Efficient
The sorted key approach is a massive improvement over brute force — going from O(n² × k log k) down to O(n × k log k). However, there is still a hidden cost: sorting each string takes O(k log k) time per string.
With n up to 10^4 and k up to 100, this gives roughly 10^4 × 100 × 7 ≈ 7 million operations. While this is efficient enough to pass, we can ask: can we avoid sorting entirely?
The insight is that instead of sorting characters to create a canonical key, we can count the frequency of each character. Since there are only 26 lowercase letters, we can represent any string as a fixed-length array of 26 counts. Two strings are anagrams if and only if their character count arrays are identical. Building this count array takes O(k) time — no sorting needed — shaving off the log k factor.
Optimal Approach - Character Count Key
Intuition
Rather than sorting each string to find its canonical form, we can represent each string by counting how many times each letter appears. Since the input only contains lowercase English letters (26 possible characters), we create an array of 26 integers where position 0 counts 'a's, position 1 counts 'b's, and so on.
Think of it like a recipe. Instead of listing ingredients in alphabetical order to compare recipes (the sorting approach), you create a standardized ingredient card: '2 cups flour, 1 cup sugar, 3 eggs'. Two recipes are the same if their ingredient cards match — regardless of the order the ingredients were listed.
For the string "eat": a=1, e=1, t=1 → key = "1#0#0#0#1#...#0#1#0#0#..."
For the string "tea": a=1, e=1, t=1 → key = "1#0#0#0#1#...#0#1#0#0#..."
Both produce the same count-based key, so they are grouped together. This runs in O(k) per string instead of O(k log k).
Step-by-Step Explanation
Let's trace through with strs = ["eat", "tea", "tan", "ate", "nat", "bat"]:
Step 1: Initialize an empty hash map where keys are character count strings.
Step 2: Process "eat". Count characters: a=1, e=1, t=1. Build key string: "1#0#0#0#1#0#...#0#1#0#..." (26 counts separated by '#'). Map has no such key → create new entry.
Step 3: Process "tea". Count: a=1, e=1, t=1. Key is identical to "eat"'s key → append "tea" to existing group.
Step 4: Process "tan". Count: a=1, n=1, t=1. This produces a different key than "eat" → create new entry.
Step 5: Process "ate". Count: a=1, e=1, t=1. Same key as "eat" and "tea" → append to group.
Step 6: Process "nat". Count: a=1, n=1, t=1. Same key as "tan" → append to group.
Step 7: Process "bat". Count: a=1, b=1, t=1. New unique key → create new entry.
Step 8: Collect all groups from the map.
Result: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
Character Count Key — O(k) Fingerprinting — Watch how counting character frequencies (instead of sorting) creates a canonical key for each string, grouping anagrams in O(n × k) total time.
Algorithm
- Create an empty hash map where keys are character count representations and values are lists of strings
- For each string in the input:
a. Create an array of 26 zeros (one for each letter a-z)
b. For each character in the string, increment the corresponding count
c. Convert the count array to a string key (e.g., "1#0#0#0#1#0#...#1")
d. Add the original string to the list for that key in the map - Return all values from the hash map
Code
class Solution {
public:
vector<vector<string>> groupAnagrams(vector<string>& strs) {
unordered_map<string, vector<string>> groups;
for (const string& s : strs) {
vector<int> count(26, 0);
for (char c : s) {
count[c - 'a']++;
}
string key = "";
for (int i = 0; i < 26; i++) {
key += to_string(count[i]) + "#";
}
groups[key].push_back(s);
}
vector<vector<string>> result;
for (auto& pair : groups) {
result.push_back(pair.second);
}
return result;
}
};class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
groups = {}
for s in strs:
count = [0] * 26
for c in s:
count[ord(c) - ord('a')] += 1
key = tuple(count)
if key not in groups:
groups[key] = []
groups[key].append(s)
return list(groups.values())class Solution {
public List<List<String>> groupAnagrams(String[] strs) {
Map<String, List<String>> groups = new HashMap<>();
for (String s : strs) {
int[] count = new int[26];
for (char c : s.toCharArray()) {
count[c - 'a']++;
}
StringBuilder keyBuilder = new StringBuilder();
for (int i = 0; i < 26; i++) {
keyBuilder.append(count[i]).append('#');
}
String key = keyBuilder.toString();
groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(groups.values());
}
}Complexity Analysis
Time Complexity: O(n × k)
We iterate through all n strings. For each string of length k, we scan its characters once to build the count array (O(k)), then build the key string from 26 fixed counts (O(26) = O(1) since 26 is constant). Hash map operations are O(k) on average for hashing the key. Total: O(n × k).
Compared to the sorting approach (O(n × k log k)), we eliminated the log k factor. For k = 100, this means roughly 7× fewer operations per string.
Space Complexity: O(n × k)
We store all n strings in the hash map, each of length up to k. The count arrays use O(26) = O(1) extra space per string. The key strings use O(26) = O(1) space each (since the count values are bounded). Total auxiliary space beyond the output is O(n × k).