Remove duplicates from a sorted doubly linked list
Description
You are given the head of a doubly linked list containing n nodes, where the list is sorted in non-decreasing order by node values. Your task is to remove duplicate nodes so that each distinct value appears only once. When multiple nodes share the same value, keep the first occurrence and delete all subsequent copies.
In a doubly linked list, every node has three components: a data field holding the value, a next pointer referencing the following node, and a prev pointer referencing the preceding node. When you remove a node, you must update the neighboring nodes' pointers so the list remains properly connected in both the forward and backward directions.
Return the head of the modified list after all duplicates have been removed.
Examples
Example 1
Input: n = 6, DLL: 1 ↔ 1 ↔ 1 ↔ 2 ↔ 3 ↔ 4
Output: 1 ↔ 2 ↔ 3 ↔ 4
Explanation: The value 1 appears three times at the beginning of the list. We keep only the first occurrence and remove the second and third nodes. The remaining nodes (2, 3, 4) each appear once so they stay unchanged. The final list has 4 nodes: 1 ↔ 2 ↔ 3 ↔ 4.
Example 2
Input: n = 7, DLL: 1 ↔ 2 ↔ 2 ↔ 3 ↔ 3 ↔ 4 ↔ 4
Output: 1 ↔ 2 ↔ 3 ↔ 4
Explanation: Three different values have duplicates: 2 appears twice, 3 appears twice, and 4 appears twice. For each value, we keep the first occurrence and remove the second. The value 1 has no duplicate, so it stays as is. After removing three duplicate nodes, the list has 4 distinct nodes: 1 ↔ 2 ↔ 3 ↔ 4.
Example 3
Input: n = 1, DLL: 5
Output: 5
Explanation: A single-node list has no duplicates to remove. The list remains unchanged.
Constraints
- 1 ≤ n ≤ 10^5
- The list is sorted in non-decreasing order
- Expected Time Complexity: O(n)
- Expected Auxiliary Space: O(1)
Editorial
Brute Force
Intuition
Before we leverage the fact that the list is sorted, let us think of a general-purpose approach that works on any list — sorted or not.
Imagine you are checking attendance for a class. You walk down a line of students and keep a roster of names you have already recorded. Each time you encounter a new student, you write their name on the roster. If someone's name is already on the roster, you know they are a repeat and ask them to step out of the line.
Translated to code, the roster is a hash set. We traverse the doubly linked list from head to tail. For each node, we check whether its value is already in our set. If the value is new, we add it to the set and move on. If the value is already in the set, we know this node is a duplicate — we delete it by relinking its neighbors.
This approach is universal: it handles unsorted lists, lists with duplicates scattered randomly, and of course sorted lists. However, it uses extra memory for the hash set, which we will question shortly.
Step-by-Step Explanation
Let us trace through with DLL: 1 ↔ 1 ↔ 1 ↔ 2 ↔ 3 ↔ 4 (Example 1):
Step 1: Initialize an empty hash set: seen = {}. Set curr to the head node (value 1). The list has 6 nodes.
- State: List = [1, 1, 1, 2, 3, 4], seen = {}, curr at node 0 (value 1)
Step 2: Check curr.data = 1. Is 1 in seen? No — this is the first time we encounter value 1.
- Add 1 to seen. seen = {1}.
- Advance curr to the next node (node 1, value 1).
- State: List = [1, 1, 1, 2, 3, 4], seen = {1}, curr at node 1
Step 3: Check curr.data = 1. Is 1 in seen? Yes — this is a duplicate!
- Save reference to curr.next (node 2, value 1) before relinking.
- Set curr.prev.next (node 0's next) to curr.next (node 2). This bypasses the current duplicate.
- Set curr.next.prev (node 2's prev) to curr.prev (node 0). This completes the backward link.
- Move curr to the saved next reference (node 2, value 1).
- State: List = [1, 1, 2, 3, 4], seen = {1}, curr at new node 1 (value 1)
Step 4: Check curr.data = 1. Is 1 in seen? Yes — another duplicate!
- Save next reference (node with value 2).
- Relink: node 0's next → value 2 node. Value 2 node's prev → node 0.
- Move curr to the value 2 node.
- State: List = [1, 2, 3, 4], seen = {1}, curr at new node 1 (value 2)
Step 5: Check curr.data = 2. Is 2 in seen? No — first encounter.
- Add 2 to seen. seen = {1, 2}.
- Advance curr to next node (value 3).
- State: List = [1, 2, 3, 4], seen = {1, 2}, curr at node 2 (value 3)
Step 6: Check curr.data = 3. Is 3 in seen? No — first encounter.
- Add 3 to seen. seen = {1, 2, 3}.
- Advance curr to next node (value 4).
- State: List = [1, 2, 3, 4], seen = {1, 2, 3}, curr at node 3 (value 4)
Step 7: Check curr.data = 4. Is 4 in seen? No — first encounter.
- Add 4 to seen. seen = {1, 2, 3, 4}.
- Advance curr to curr.next, which is null.
- State: List = [1, 2, 3, 4], seen = {1, 2, 3, 4}, curr = null
Step 8: curr is null — traversal complete.
Result: 1 ↔ 2 ↔ 3 ↔ 4. We removed 2 duplicate nodes. The hash set consumed space proportional to the number of unique values.
Algorithm
- Create an empty hash set called
seen. - Set
currto the head of the doubly linked list. - While
curris not null:
a. Ifcurr.datais already inseen:- Save
curr.nextasnextNode. - If
curr.prevexists, setcurr.prev.nexttocurr.next. - If
curr.nextexists, setcurr.next.prevtocurr.prev. - Set
curr = nextNode.
b. Ifcurr.datais not inseen: - Add
curr.datatoseen. - Advance
curr = curr.next.
- Save
- Return the head of the modified list.
Code
/*
* Definition for doubly-linked list node:
* struct Node {
* int data;
* Node* next;
* Node* prev;
* Node(int val) : data(val), next(nullptr), prev(nullptr) {}
* };
*/
class Solution {
public:
Node* removeDuplicates(Node* head) {
if (head == nullptr) return nullptr;
unordered_set<int> seen;
Node* curr = head;
while (curr != nullptr) {
if (seen.count(curr->data)) {
// Duplicate found — remove this node
Node* nextNode = curr->next;
if (curr->prev != nullptr) {
curr->prev->next = curr->next;
}
if (curr->next != nullptr) {
curr->next->prev = curr->prev;
}
curr = nextNode;
} else {
// First occurrence — record it
seen.insert(curr->data);
curr = curr->next;
}
}
return head;
}
};class Solution:
def removeDuplicates(self, head):
if head is None:
return None
seen = set()
curr = head
while curr is not None:
if curr.data in seen:
# Duplicate found — remove this node
next_node = curr.next
if curr.prev is not None:
curr.prev.next = curr.next
if curr.next is not None:
curr.next.prev = curr.prev
curr = next_node
else:
# First occurrence — record it
seen.add(curr.data)
curr = curr.next
return head/*
* Definition for doubly-linked list node:
* class Node {
* int data;
* Node next;
* Node prev;
* Node(int val) { data = val; next = null; prev = null; }
* }
*/
class Solution {
Node removeDuplicates(Node head) {
if (head == null) return null;
HashSet<Integer> seen = new HashSet<>();
Node curr = head;
while (curr != null) {
if (seen.contains(curr.data)) {
// Duplicate found — remove this node
Node nextNode = curr.next;
if (curr.prev != null) {
curr.prev.next = curr.next;
}
if (curr.next != null) {
curr.next.prev = curr.prev;
}
curr = nextNode;
} else {
// First occurrence — record it
seen.add(curr.data);
curr = curr.next;
}
}
return head;
}
}Complexity Analysis
Time Complexity: O(n)
We traverse the entire list once, visiting each of the n nodes exactly one time. For each node, we perform a hash set lookup and possibly an insertion, both of which are O(1) on average. Deletion of a node in a doubly linked list is O(1) since we have direct access to the previous and next pointers. Total: n iterations × O(1) work = O(n).
Space Complexity: O(n)
The hash set stores every unique value we encounter. In the worst case (all values are distinct, no duplicates to remove), the set holds all n values, consuming O(n) extra memory. Even in the average case, the set size equals the number of unique values, which can be up to n.
Why This Approach Is Not Efficient
While the hash set approach runs in O(n) time, it uses O(n) auxiliary space — which is wasteful given an important property we have not yet exploited: the list is already sorted.
In a sorted list, all copies of the same value are guaranteed to be adjacent. If the value 3 appears three times, those three nodes sit right next to each other. There is no scenario where a duplicate hides further away in the list.
This adjacency property means we never need to "remember" values we have seen in the past. We only need to compare each node with its immediate next neighbor. If they share the same value, the next node is a duplicate. If they differ, we move forward.
The hash set is a general-purpose tool designed for unsorted data where duplicates could appear anywhere. Using it on a sorted list is like bringing a metal detector to find your keys when they are already sitting on the table in front of you — the tool works, but it is unnecessarily powerful and costly.
Key insight: since duplicates are always adjacent in a sorted list, a simple neighbor comparison replaces the entire hash set. This reduces space from O(n) to O(1).
Optimal Approach - Leveraging Sorted Order
Intuition
Picture a bookshelf where books are arranged alphabetically. If you want to remove duplicate titles, you do not need to scan the entire shelf for each book. You simply look at the book next to the current one. If both titles match, you pull out the duplicate. If the titles differ, you move on to the next book. Because the shelf is sorted, you are guaranteed that any duplicates will be sitting right beside each other.
The same logic applies to our sorted doubly linked list. We place a pointer called curr at the head. At each step, we compare curr.data with curr.next.data:
-
If they are equal: The next node is a duplicate. We remove it by adjusting pointers — set
curr.nextto skip over the duplicate node, and update the backward pointer of the node after the duplicate to point back tocurr. We do NOT advancecurrbecause there might be more copies of the same value right after the one we just removed. -
If they are different: The next node holds a new value, so no deletion is needed. We advance
currtocurr.nextand continue.
We repeat until curr reaches a node with no next neighbor, meaning we have processed the entire list. The head never changes because the first occurrence of each value is always kept.
Step-by-Step Explanation
Let us trace through with DLL: 1 ↔ 2 ↔ 2 ↔ 3 ↔ 3 ↔ 4 ↔ 4 (Example 2):
Step 1: Set curr to head (node 0, value 1). The full list is [1, 2, 2, 3, 3, 4, 4].
Step 2: Compare curr.data (1) with curr.next.data (2). Values differ — no duplicate. Advance curr to node 1 (value 2).
Step 3: Compare curr.data (2) with curr.next.data (2). Values are equal — duplicate found!
Step 4: Remove the duplicate. Set curr.next to the node after the duplicate (value 3). Update that node's prev to point to curr. List becomes [1, 2, 3, 3, 4, 4]. Do not advance curr yet — there could be more 2s.
Step 5: Compare curr.data (2) with new curr.next.data (3). Values differ — no more duplicates of 2. Advance curr to node 2 (value 3).
Step 6: Compare curr.data (3) with curr.next.data (3). Values are equal — duplicate found!
Step 7: Remove the duplicate. Set curr.next to skip to the node with value 4. Update backward pointer. List becomes [1, 2, 3, 4, 4]. Do not advance curr.
Step 8: Compare curr.data (3) with new curr.next.data (4). Values differ — no more duplicates of 3. Advance curr to node 3 (value 4).
Step 9: Compare curr.data (4) with curr.next.data (4). Values are equal — duplicate found!
Step 10: Remove the duplicate. Since the duplicate is the tail node, set curr.next to null. List becomes [1, 2, 3, 4].
Step 11: curr.next is now null — no more nodes to check. Traversal complete.
Result: 1 ↔ 2 ↔ 3 ↔ 4. We removed 3 duplicate nodes in a single pass using no extra space.
Sorted Order Deduplication — Removing Adjacent Duplicates In Place — Watch how we compare each node with its next neighbor in a single pass. Because the list is sorted, duplicates are always adjacent, so a simple neighbor comparison replaces the need for any extra data structure.
Algorithm
- If the head is null, return null (empty list).
- Set
currto the head of the list. - While
curris not null ANDcurr.nextis not null:
a. Ifcurr.dataequalscurr.next.data:- The next node is a duplicate. Save a reference to it.
- Set
curr.nexttoduplicate.next(skip over the duplicate). - If
duplicate.nextexists, setduplicate.next.prevtocurr(update backward link). - Do NOT advance
curr— check for more consecutive duplicates of the same value.
b. Ifcurr.datadoes not equalcurr.next.data: - No duplicate at this position. Advance
curr = curr.next.
- Return the head (unchanged, since we always keep the first occurrence).
Code
/*
* Definition for doubly-linked list node:
* struct Node {
* int data;
* Node* next;
* Node* prev;
* Node(int val) : data(val), next(nullptr), prev(nullptr) {}
* };
*/
class Solution {
public:
Node* removeDuplicates(Node* head) {
if (head == nullptr) return nullptr;
Node* curr = head;
while (curr != nullptr && curr->next != nullptr) {
if (curr->data == curr->next->data) {
// Next node is a duplicate — remove it
Node* duplicate = curr->next;
curr->next = duplicate->next;
if (duplicate->next != nullptr) {
duplicate->next->prev = curr;
}
delete duplicate;
// Do not advance curr; check for more duplicates
} else {
// Values differ — move forward
curr = curr->next;
}
}
return head;
}
};class Solution:
def removeDuplicates(self, head):
if head is None:
return None
curr = head
while curr is not None and curr.next is not None:
if curr.data == curr.next.data:
# Next node is a duplicate — remove it
duplicate = curr.next
curr.next = duplicate.next
if duplicate.next is not None:
duplicate.next.prev = curr
# Do not advance curr; check for more duplicates
else:
# Values differ — move forward
curr = curr.next
return head/*
* Definition for doubly-linked list node:
* class Node {
* int data;
* Node next;
* Node prev;
* Node(int val) { data = val; next = null; prev = null; }
* }
*/
class Solution {
Node removeDuplicates(Node head) {
if (head == null) return null;
Node curr = head;
while (curr != null && curr.next != null) {
if (curr.data == curr.next.data) {
// Next node is a duplicate — remove it
Node duplicate = curr.next;
curr.next = duplicate.next;
if (duplicate.next != null) {
duplicate.next.prev = curr;
}
// Do not advance curr; check for more duplicates
} else {
// Values differ — move forward
curr = curr.next;
}
}
return head;
}
}Complexity Analysis
Time Complexity: O(n)
We traverse the list from head to tail in a single pass. At each step, we either advance curr forward by one node (when values differ) or delete curr.next (when values match). Each node is either visited as curr or deleted as curr.next — but never both. This means each of the n original nodes is processed exactly once. Every individual operation (comparison, pointer update, deletion) is O(1). Total: O(n).
Space Complexity: O(1)
We use only a single pointer variable (curr) and a temporary reference to the node being deleted (duplicate). No hash sets, no arrays, no auxiliary data structures. The number of variables is constant regardless of input size. The deletions happen in place by rewiring existing pointers, so auxiliary space is O(1).