Skip to main content

Remove duplicates from a sorted doubly linked list

Description

You are given the head of a doubly linked list containing n nodes, where the list is sorted in non-decreasing order by node values. Your task is to remove duplicate nodes so that each distinct value appears only once. When multiple nodes share the same value, keep the first occurrence and delete all subsequent copies.

In a doubly linked list, every node has three components: a data field holding the value, a next pointer referencing the following node, and a prev pointer referencing the preceding node. When you remove a node, you must update the neighboring nodes' pointers so the list remains properly connected in both the forward and backward directions.

Return the head of the modified list after all duplicates have been removed.

Examples

Example 1

Input: n = 6, DLL: 1 ↔ 1 ↔ 1 ↔ 2 ↔ 3 ↔ 4

Output: 1 ↔ 2 ↔ 3 ↔ 4

Explanation: The value 1 appears three times at the beginning of the list. We keep only the first occurrence and remove the second and third nodes. The remaining nodes (2, 3, 4) each appear once so they stay unchanged. The final list has 4 nodes: 1 ↔ 2 ↔ 3 ↔ 4.

Example 2

Input: n = 7, DLL: 1 ↔ 2 ↔ 2 ↔ 3 ↔ 3 ↔ 4 ↔ 4

Output: 1 ↔ 2 ↔ 3 ↔ 4

Explanation: Three different values have duplicates: 2 appears twice, 3 appears twice, and 4 appears twice. For each value, we keep the first occurrence and remove the second. The value 1 has no duplicate, so it stays as is. After removing three duplicate nodes, the list has 4 distinct nodes: 1 ↔ 2 ↔ 3 ↔ 4.

Example 3

Input: n = 1, DLL: 5

Output: 5

Explanation: A single-node list has no duplicates to remove. The list remains unchanged.

Constraints

  • 1 ≤ n ≤ 10^5
  • The list is sorted in non-decreasing order
  • Expected Time Complexity: O(n)
  • Expected Auxiliary Space: O(1)

Editorial

Brute Force

Intuition

Before we leverage the fact that the list is sorted, let us think of a general-purpose approach that works on any list — sorted or not.

Imagine you are checking attendance for a class. You walk down a line of students and keep a roster of names you have already recorded. Each time you encounter a new student, you write their name on the roster. If someone's name is already on the roster, you know they are a repeat and ask them to step out of the line.

Translated to code, the roster is a hash set. We traverse the doubly linked list from head to tail. For each node, we check whether its value is already in our set. If the value is new, we add it to the set and move on. If the value is already in the set, we know this node is a duplicate — we delete it by relinking its neighbors.

This approach is universal: it handles unsorted lists, lists with duplicates scattered randomly, and of course sorted lists. However, it uses extra memory for the hash set, which we will question shortly.

Step-by-Step Explanation

Let us trace through with DLL: 1 ↔ 1 ↔ 1 ↔ 2 ↔ 3 ↔ 4 (Example 1):

Step 1: Initialize an empty hash set: seen = {}. Set curr to the head node (value 1). The list has 6 nodes.

  • State: List = [1, 1, 1, 2, 3, 4], seen = {}, curr at node 0 (value 1)

Step 2: Check curr.data = 1. Is 1 in seen? No — this is the first time we encounter value 1.

  • Add 1 to seen. seen = {1}.
  • Advance curr to the next node (node 1, value 1).
  • State: List = [1, 1, 1, 2, 3, 4], seen = {1}, curr at node 1

Step 3: Check curr.data = 1. Is 1 in seen? Yes — this is a duplicate!

  • Save reference to curr.next (node 2, value 1) before relinking.
  • Set curr.prev.next (node 0's next) to curr.next (node 2). This bypasses the current duplicate.
  • Set curr.next.prev (node 2's prev) to curr.prev (node 0). This completes the backward link.
  • Move curr to the saved next reference (node 2, value 1).
  • State: List = [1, 1, 2, 3, 4], seen = {1}, curr at new node 1 (value 1)

Step 4: Check curr.data = 1. Is 1 in seen? Yes — another duplicate!

  • Save next reference (node with value 2).
  • Relink: node 0's next → value 2 node. Value 2 node's prev → node 0.
  • Move curr to the value 2 node.
  • State: List = [1, 2, 3, 4], seen = {1}, curr at new node 1 (value 2)

Step 5: Check curr.data = 2. Is 2 in seen? No — first encounter.

  • Add 2 to seen. seen = {1, 2}.
  • Advance curr to next node (value 3).
  • State: List = [1, 2, 3, 4], seen = {1, 2}, curr at node 2 (value 3)

Step 6: Check curr.data = 3. Is 3 in seen? No — first encounter.

  • Add 3 to seen. seen = {1, 2, 3}.
  • Advance curr to next node (value 4).
  • State: List = [1, 2, 3, 4], seen = {1, 2, 3}, curr at node 3 (value 4)

Step 7: Check curr.data = 4. Is 4 in seen? No — first encounter.

  • Add 4 to seen. seen = {1, 2, 3, 4}.
  • Advance curr to curr.next, which is null.
  • State: List = [1, 2, 3, 4], seen = {1, 2, 3, 4}, curr = null

Step 8: curr is null — traversal complete.

Result: 1 ↔ 2 ↔ 3 ↔ 4. We removed 2 duplicate nodes. The hash set consumed space proportional to the number of unique values.

Algorithm

  1. Create an empty hash set called seen.
  2. Set curr to the head of the doubly linked list.
  3. While curr is not null:
    a. If curr.data is already in seen:
    • Save curr.next as nextNode.
    • If curr.prev exists, set curr.prev.next to curr.next.
    • If curr.next exists, set curr.next.prev to curr.prev.
    • Set curr = nextNode.
      b. If curr.data is not in seen:
    • Add curr.data to seen.
    • Advance curr = curr.next.
  4. Return the head of the modified list.

Code

/*
 * Definition for doubly-linked list node:
 * struct Node {
 *     int data;
 *     Node* next;
 *     Node* prev;
 *     Node(int val) : data(val), next(nullptr), prev(nullptr) {}
 * };
 */

class Solution {
public:
    Node* removeDuplicates(Node* head) {
        if (head == nullptr) return nullptr;

        unordered_set<int> seen;
        Node* curr = head;

        while (curr != nullptr) {
            if (seen.count(curr->data)) {
                // Duplicate found — remove this node
                Node* nextNode = curr->next;

                if (curr->prev != nullptr) {
                    curr->prev->next = curr->next;
                }
                if (curr->next != nullptr) {
                    curr->next->prev = curr->prev;
                }

                curr = nextNode;
            } else {
                // First occurrence — record it
                seen.insert(curr->data);
                curr = curr->next;
            }
        }

        return head;
    }
};
class Solution:
    def removeDuplicates(self, head):
        if head is None:
            return None

        seen = set()
        curr = head

        while curr is not None:
            if curr.data in seen:
                # Duplicate found — remove this node
                next_node = curr.next

                if curr.prev is not None:
                    curr.prev.next = curr.next
                if curr.next is not None:
                    curr.next.prev = curr.prev

                curr = next_node
            else:
                # First occurrence — record it
                seen.add(curr.data)
                curr = curr.next

        return head
/*
 * Definition for doubly-linked list node:
 * class Node {
 *     int data;
 *     Node next;
 *     Node prev;
 *     Node(int val) { data = val; next = null; prev = null; }
 * }
 */

class Solution {
    Node removeDuplicates(Node head) {
        if (head == null) return null;

        HashSet<Integer> seen = new HashSet<>();
        Node curr = head;

        while (curr != null) {
            if (seen.contains(curr.data)) {
                // Duplicate found — remove this node
                Node nextNode = curr.next;

                if (curr.prev != null) {
                    curr.prev.next = curr.next;
                }
                if (curr.next != null) {
                    curr.next.prev = curr.prev;
                }

                curr = nextNode;
            } else {
                // First occurrence — record it
                seen.add(curr.data);
                curr = curr.next;
            }
        }

        return head;
    }
}

Complexity Analysis

Time Complexity: O(n)

We traverse the entire list once, visiting each of the n nodes exactly one time. For each node, we perform a hash set lookup and possibly an insertion, both of which are O(1) on average. Deletion of a node in a doubly linked list is O(1) since we have direct access to the previous and next pointers. Total: n iterations × O(1) work = O(n).

Space Complexity: O(n)

The hash set stores every unique value we encounter. In the worst case (all values are distinct, no duplicates to remove), the set holds all n values, consuming O(n) extra memory. Even in the average case, the set size equals the number of unique values, which can be up to n.

Why This Approach Is Not Efficient

While the hash set approach runs in O(n) time, it uses O(n) auxiliary space — which is wasteful given an important property we have not yet exploited: the list is already sorted.

In a sorted list, all copies of the same value are guaranteed to be adjacent. If the value 3 appears three times, those three nodes sit right next to each other. There is no scenario where a duplicate hides further away in the list.

This adjacency property means we never need to "remember" values we have seen in the past. We only need to compare each node with its immediate next neighbor. If they share the same value, the next node is a duplicate. If they differ, we move forward.

The hash set is a general-purpose tool designed for unsorted data where duplicates could appear anywhere. Using it on a sorted list is like bringing a metal detector to find your keys when they are already sitting on the table in front of you — the tool works, but it is unnecessarily powerful and costly.

Key insight: since duplicates are always adjacent in a sorted list, a simple neighbor comparison replaces the entire hash set. This reduces space from O(n) to O(1).

Optimal Approach - Leveraging Sorted Order

Intuition

Picture a bookshelf where books are arranged alphabetically. If you want to remove duplicate titles, you do not need to scan the entire shelf for each book. You simply look at the book next to the current one. If both titles match, you pull out the duplicate. If the titles differ, you move on to the next book. Because the shelf is sorted, you are guaranteed that any duplicates will be sitting right beside each other.

The same logic applies to our sorted doubly linked list. We place a pointer called curr at the head. At each step, we compare curr.data with curr.next.data:

  • If they are equal: The next node is a duplicate. We remove it by adjusting pointers — set curr.next to skip over the duplicate node, and update the backward pointer of the node after the duplicate to point back to curr. We do NOT advance curr because there might be more copies of the same value right after the one we just removed.

  • If they are different: The next node holds a new value, so no deletion is needed. We advance curr to curr.next and continue.

We repeat until curr reaches a node with no next neighbor, meaning we have processed the entire list. The head never changes because the first occurrence of each value is always kept.

Step-by-Step Explanation

Let us trace through with DLL: 1 ↔ 2 ↔ 2 ↔ 3 ↔ 3 ↔ 4 ↔ 4 (Example 2):

Step 1: Set curr to head (node 0, value 1). The full list is [1, 2, 2, 3, 3, 4, 4].

Step 2: Compare curr.data (1) with curr.next.data (2). Values differ — no duplicate. Advance curr to node 1 (value 2).

Step 3: Compare curr.data (2) with curr.next.data (2). Values are equal — duplicate found!

Step 4: Remove the duplicate. Set curr.next to the node after the duplicate (value 3). Update that node's prev to point to curr. List becomes [1, 2, 3, 3, 4, 4]. Do not advance curr yet — there could be more 2s.

Step 5: Compare curr.data (2) with new curr.next.data (3). Values differ — no more duplicates of 2. Advance curr to node 2 (value 3).

Step 6: Compare curr.data (3) with curr.next.data (3). Values are equal — duplicate found!

Step 7: Remove the duplicate. Set curr.next to skip to the node with value 4. Update backward pointer. List becomes [1, 2, 3, 4, 4]. Do not advance curr.

Step 8: Compare curr.data (3) with new curr.next.data (4). Values differ — no more duplicates of 3. Advance curr to node 3 (value 4).

Step 9: Compare curr.data (4) with curr.next.data (4). Values are equal — duplicate found!

Step 10: Remove the duplicate. Since the duplicate is the tail node, set curr.next to null. List becomes [1, 2, 3, 4].

Step 11: curr.next is now null — no more nodes to check. Traversal complete.

Result: 1 ↔ 2 ↔ 3 ↔ 4. We removed 3 duplicate nodes in a single pass using no extra space.

Sorted Order Deduplication — Removing Adjacent Duplicates In Place — Watch how we compare each node with its next neighbor in a single pass. Because the list is sorted, duplicates are always adjacent, so a simple neighbor comparison replaces the need for any extra data structure.

Algorithm

  1. If the head is null, return null (empty list).
  2. Set curr to the head of the list.
  3. While curr is not null AND curr.next is not null:
    a. If curr.data equals curr.next.data:
    • The next node is a duplicate. Save a reference to it.
    • Set curr.next to duplicate.next (skip over the duplicate).
    • If duplicate.next exists, set duplicate.next.prev to curr (update backward link).
    • Do NOT advance curr — check for more consecutive duplicates of the same value.
      b. If curr.data does not equal curr.next.data:
    • No duplicate at this position. Advance curr = curr.next.
  4. Return the head (unchanged, since we always keep the first occurrence).

Code

/*
 * Definition for doubly-linked list node:
 * struct Node {
 *     int data;
 *     Node* next;
 *     Node* prev;
 *     Node(int val) : data(val), next(nullptr), prev(nullptr) {}
 * };
 */

class Solution {
public:
    Node* removeDuplicates(Node* head) {
        if (head == nullptr) return nullptr;

        Node* curr = head;

        while (curr != nullptr && curr->next != nullptr) {
            if (curr->data == curr->next->data) {
                // Next node is a duplicate — remove it
                Node* duplicate = curr->next;
                curr->next = duplicate->next;

                if (duplicate->next != nullptr) {
                    duplicate->next->prev = curr;
                }

                delete duplicate;
                // Do not advance curr; check for more duplicates
            } else {
                // Values differ — move forward
                curr = curr->next;
            }
        }

        return head;
    }
};
class Solution:
    def removeDuplicates(self, head):
        if head is None:
            return None

        curr = head

        while curr is not None and curr.next is not None:
            if curr.data == curr.next.data:
                # Next node is a duplicate — remove it
                duplicate = curr.next
                curr.next = duplicate.next

                if duplicate.next is not None:
                    duplicate.next.prev = curr

                # Do not advance curr; check for more duplicates
            else:
                # Values differ — move forward
                curr = curr.next

        return head
/*
 * Definition for doubly-linked list node:
 * class Node {
 *     int data;
 *     Node next;
 *     Node prev;
 *     Node(int val) { data = val; next = null; prev = null; }
 * }
 */

class Solution {
    Node removeDuplicates(Node head) {
        if (head == null) return null;

        Node curr = head;

        while (curr != null && curr.next != null) {
            if (curr.data == curr.next.data) {
                // Next node is a duplicate — remove it
                Node duplicate = curr.next;
                curr.next = duplicate.next;

                if (duplicate.next != null) {
                    duplicate.next.prev = curr;
                }

                // Do not advance curr; check for more duplicates
            } else {
                // Values differ — move forward
                curr = curr.next;
            }
        }

        return head;
    }
}

Complexity Analysis

Time Complexity: O(n)

We traverse the list from head to tail in a single pass. At each step, we either advance curr forward by one node (when values differ) or delete curr.next (when values match). Each node is either visited as curr or deleted as curr.next — but never both. This means each of the n original nodes is processed exactly once. Every individual operation (comparison, pointer update, deletion) is O(1). Total: O(n).

Space Complexity: O(1)

We use only a single pointer variable (curr) and a temporary reference to the node being deleted (duplicate). No hash sets, no arrays, no auxiliary data structures. The number of variables is constant regardless of input size. The deletions happen in place by rewiring existing pointers, so auxiliary space is O(1).