Introduction to Concurrency — Why It Matters for LLD

Introduction

Imagine you're designing a movie ticket booking system. Two users — Alice and Bob — try to book the last seat for the same show at the exact same time. If your system isn't designed to handle this, both see the seat as available, both click "Book," and both get a confirmation. Now one seat has been sold twice.

This isn't a logic bug. Your booking code is correct for a single user. The problem is concurrency — multiple things happening at the same time, interacting in ways you didn't plan for.

Every real-world system you'll design in this course — parking lots, elevators, chat applications, stock brokerages — must handle concurrent access. Users don't politely wait in line. They all hit your system at once.

This tutorial introduces what concurrency is, why it's a first-class concern in Low Level Design, and what problems arise when you ignore it. Over the next seven tutorials in this section, we'll build the full toolkit: threads, synchronization, locks, deadlocks, and thread safety. Then in Section 15, we'll apply these concepts through concurrency patterns like Producer-Consumer and Thread Pool.

What is Concurrency?

Concurrency is the ability of a system to handle multiple tasks that are in progress at the same time. The key phrase is in progress — not necessarily executing simultaneously.

Think of a chef in a kitchen. A single chef can manage three dishes at once: while the pasta boils, they chop vegetables for the salad and check on the roasting chicken. The chef isn't doing three things at the exact same instant — they're switching between tasks, making progress on each one. That's concurrency.

If you hire three chefs, each working on one dish at the same time on separate stoves, that's parallelism — actual simultaneous execution. We'll explore the distinction in the next tutorial.

The Computer Science Definition

In computer science, concurrency refers to the ability of a system to execute multiple tasks through either:

Time-slicing (interleaving): A single CPU rapidly switches between tasks, giving each a small slice of time. To users, everything appears to run simultaneously, even though only one task executes at any instant.
True parallelism: Multiple CPU cores execute tasks at the exact same time. Each core runs a different task independently.

Most modern systems use both — your operating system interleaves thousands of tasks across a handful of CPU cores.

Why "Concurrent" and Not Just "Fast"?

Concurrency isn't primarily about speed. It's about structure — organizing your program so that multiple activities can be in progress, respond to events, and share resources safely. A well-designed concurrent system:

Stays responsive even when doing heavy work (the UI doesn't freeze)
Uses system resources efficiently (doesn't waste CPU time waiting for I/O)
Scales to handle many users or requests
Remains correct even when actions overlap

Why Concurrency Matters for LLD

In Sections 1-13, we designed systems assuming one user at a time. That simplification let us focus on class design, relationships, and patterns. But every system we'll build in the Design Problems sections (17-52) lives in the real world, where multiple things happen at once.

Real Concurrency Scenarios in This Course

Design Problem	Concurrency Challenge
Parking Lot (S17)	Two cars arrive at the same entrance — who gets the last spot?
LRU Cache (S20)	Multiple threads read/write the cache simultaneously
Elevator System (S24)	Multiple passengers press buttons while elevators are already moving
Movie Ticket Booking (S35)	Hundreds of users try to book the same showtime
Ride-Sharing / Uber (S38)	Drivers and riders match concurrently — a driver can't be assigned to two riders
Stock Brokerage (S42)	Buy and sell orders arrive in rapid succession and must be matched correctly
Thread-Safe Queue (S49)	Producers and consumers access the same queue from different threads
Rate Limiter (S51)	Concurrent requests must be tracked and limited per-user

The Three Pillars of Concurrent LLD

When designing any concurrent system, three concerns must be addressed:

Safety (Correctness): The system produces correct results even when multiple actions overlap. No double-bookings, no lost updates, no corrupted data.
Liveness: The system makes progress and doesn't get stuck. No deadlocks where two threads wait for each other forever. No starvation where one thread is permanently blocked.
Performance: Concurrency control doesn't become a bottleneck. Locking everything with a single global lock is safe but destroys parallelism. The art is finding the right granularity.

What Goes Wrong Without Concurrency Control

Let's make the movie ticket problem concrete with code. Here's a naive ticket booking system:

import threading
import time


class TicketBookingSystem:
    """A BROKEN booking system — no concurrency control."""

    def __init__(self, total_seats: int):
        self.available_seats = total_seats
        self.bookings: list[str] = []

    def book_seat(self, user: str) -> str:
        # Step 1: Check availability
        if self.available_seats > 0:
            # Simulating a small delay (network, DB write, etc.)
            time.sleep(0.001)

            # Step 2: Decrement seat count
            self.available_seats -= 1

            # Step 3: Record booking
            self.bookings.append(user)
            return f"{user}: Booking confirmed! Seats left: {self.available_seats}"
        else:
            return f"{user}: No seats available."


def simulate_concurrent_bookings():
    system = TicketBookingSystem(total_seats=1)  # Only ONE seat
    results = []

    def try_book(user_name: str):
        result = system.book_seat(user_name)
        results.append(result)

    # Two users try to book the SAME last seat simultaneously
    thread_alice = threading.Thread(target=try_book, args=("Alice",))
    thread_bob = threading.Thread(target=try_book, args=("Bob",))

    thread_alice.start()
    thread_bob.start()

    thread_alice.join()
    thread_bob.join()

    print("--- Results ---")
    for r in results:
        print(f"  {r}")
    print(f"  Final seats: {system.available_seats}")
    print(f"  Bookings: {system.bookings}")


simulate_concurrent_bookings()

# Possible (WRONG) output — both users book the same seat:
#   Alice: Booking confirmed! Seats left: 0
#   Bob: Booking confirmed! Seats left: -1
#   Final seats: -1
#   Bookings: ['Alice', 'Bob']
#
# The seat count went NEGATIVE. One seat was sold to two people.

import java.util.ArrayList;
import java.util.List;

public class TicketBookingSystem {
    // A BROKEN booking system — no concurrency control.
    private int availableSeats;
    private final List<String> bookings = new ArrayList<>();

    public TicketBookingSystem(int totalSeats) {
        this.availableSeats = totalSeats;
    }

    public String bookSeat(String user) {
        // Step 1: Check availability
        if (availableSeats > 0) {
            // Simulating delay
            try { Thread.sleep(1); } catch (InterruptedException e) {}

            // Step 2: Decrement seat count
            availableSeats--;

            // Step 3: Record booking
            bookings.add(user);
            return user + ": Booking confirmed! Seats left: " + availableSeats;
        } else {
            return user + ": No seats available.";
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TicketBookingSystem system = new TicketBookingSystem(1); // ONE seat
        List<String> results = new ArrayList<>();

        Thread alice = new Thread(() -> results.add(system.bookSeat("Alice")));
        Thread bob = new Thread(() -> results.add(system.bookSeat("Bob")));

        alice.start();
        bob.start();
        alice.join();
        bob.join();

        System.out.println("--- Results ---");
        for (String r : results) System.out.println("  " + r);
        System.out.println("  Final seats: " + system.availableSeats);
        System.out.println("  Bookings: " + system.bookings);

        // Possible WRONG output:
        //   Alice: Booking confirmed! Seats left: 0
        //   Bob: Booking confirmed! Seats left: -1
        //   Final seats: -1
        //   Bookings: [Alice, Bob]
    }
}

What Happened?

Here's the timeline of the bug:

Step	Alice's Thread	Bob's Thread	`available_seats`
1	Checks: `available_seats > 0`? Yes		1
2		Checks: `available_seats > 0`? Yes	1
3	Decrements: `available_seats = 0`		0
4	Records: `bookings.append("Alice")`		0
5		Decrements: `available_seats = -1`	-1
6		Records: `bookings.append("Bob")`	-1

Both threads read available_seats as 1 before either had a chance to decrement it. This is called a race condition — the result depends on the unpredictable timing of thread execution. We'll study race conditions in depth in Tutorial 14.4.

The fix requires synchronization — ensuring that the check-and-decrement happens as one indivisible operation. We'll learn exactly how to do this with mutexes (Tutorial 14.5) and locks (Tutorial 14.7).

Visualization

Thread interleaving race condition timeline showing two threads (Alice and Bob) reading shared seats=1 before either decrements, causing double-booking. Includes the three pillars of concurrency (Safety, Liveness, Performance) and the synchronized fix.

Thread Interleaving Simulator — toggle between 'No Lock' (race condition) and 'With Lock' (correct) to see how two threads (Alice/Bob) race for the last seat. Watch the animated timeline as both read seats=1 simultaneously without a lock, causing double-booking.

The Vocabulary of Concurrency

Before we dive into the rest of this section, here are the key terms you'll encounter. Don't worry about memorizing them now — each will get its own tutorial.

Term	What It Means	Covered In
Thread	A lightweight unit of execution within a process. Multiple threads share the same memory space.	S14.3
Process	A heavyweight unit of execution with its own isolated memory. Creating and switching processes is expensive.	S14.3
Race Condition	A bug that occurs when the outcome depends on the unpredictable timing of thread execution.	S14.4
Critical Section	A block of code that accesses shared resources and must not be executed by multiple threads simultaneously.	S14.4
Mutex (Mutual Exclusion)	A lock that ensures only one thread can enter a critical section at a time.	S14.5
Semaphore	A counter-based synchronization tool that allows a limited number of threads to access a resource.	S14.5
Deadlock	A situation where two or more threads wait for each other forever, so none can proceed.	S14.6
Livelock	Threads keep responding to each other but make no actual progress — like two people trying to pass each other in a hallway.	S14.6
Starvation	A thread is perpetually denied access to a resource because other threads keep getting priority.	S14.6
Thread Safety	A property of code that guarantees correct behavior when accessed from multiple threads simultaneously.	S14.8

Think of this as a roadmap for the section. By Tutorial 14.8, you'll understand every entry in this table.

Concurrency in Python vs Java

Throughout this section and Section 15 (Concurrency Patterns), we'll implement every concept in both Python and Java. Here's how each language handles concurrency:

Python: The GIL Limitation

CPython (the standard Python interpreter) has a Global Interpreter Lock (GIL) — a mutex that allows only one thread to execute Python bytecode at a time. This means:

CPU-bound tasks (heavy computation) don't benefit from multiple threads — use multiprocessing instead
I/O-bound tasks (network calls, file reads, database queries) do benefit — while one thread waits for I/O, another can run
Python threads are still useful for learning concurrency concepts — race conditions, deadlocks, and synchronization all apply

import threading  # Standard library — OS-level threads

Note: Python 3.13+ is experimenting with a "free-threaded" mode that removes the GIL. But for this course, we use standard CPython behavior.

Java: Native Thread Support

Java was designed for concurrency from the start. Every Java program runs in the Java Virtual Machine (JVM), which maps Java threads to operating system threads.

No GIL — true parallel execution on multi-core CPUs
Rich concurrency APIs in java.util.concurrent
Built-in synchronized keyword for locking
Memory model formally defines thread interaction rules

// Java Thread creation — two approaches:
Thread t = new Thread(() -> System.out.println("Hello from thread!"));
t.start();

Why Both Languages?

Python makes concurrency concepts readable and approachable — less boilerplate, easier to focus on the concept. Java shows how concurrency works in a production-grade language with true parallelism and a rich standard library. Together, they give you the full picture.

Common Mistakes

Mistake 1: "I'll Add Concurrency Later"

Retrofitting concurrency into a single-threaded design is extremely difficult. Shared mutable state that's safe in single-threaded code becomes a minefield in multi-threaded code. Design for concurrency from the start — even if the first version is single-threaded, make your data access patterns compatible with future locking.

Mistake 2: "More Threads = Faster"

Creating a thread for every request or task can actually slow down your system. Thread creation has overhead (memory for the stack, OS scheduling). Context switching between thousands of threads wastes CPU cycles. In Tutorial 15.3 (Thread Pool Pattern), we'll learn how to manage a fixed pool of reusable threads.

Mistake 3: "Concurrency Only Matters for Servers"

Even a simple desktop application needs concurrency. If you run a file download on the main thread, the UI freezes. If your game processes physics on the render thread, the framerate drops. Concurrency is about responsiveness, not just throughput.

Mistake 4: Ignoring the GIL in Python

Assuming Python threads provide true parallelism for CPU-bound work leads to disappointment and mysterious non-speedups. For CPU-heavy tasks in Python, use multiprocessing or concurrent.futures.ProcessPoolExecutor. For I/O-bound work, threads are fine.

Key Takeaways

Concurrency is the ability to handle multiple tasks in progress at once — through interleaving on a single core or true parallelism across multiple cores.
Every design problem in this course (Sections 17-52) involves concurrent access: multiple users, multiple threads, simultaneous requests. Ignoring concurrency leads to bugs like race conditions (double-booking the last seat).
The three pillars of concurrent design are Safety (correct results), Liveness (no deadlocks/starvation), and Performance (concurrency control doesn't bottleneck).
Python's GIL limits true parallelism for CPU-bound work but threads still apply for I/O-bound tasks and for learning concurrency concepts. Java supports true multi-threaded parallelism natively.
This section builds the complete concurrency toolkit: threads (S14.3), race conditions (S14.4), synchronization primitives (S14.5), deadlocks (S14.6), locks (S14.7), and thread safety (S14.8). Section 15 applies these concepts through concurrency patterns like Producer-Consumer and Thread Pool.