DRY — Don't Repeat Yourself

Introduction

Imagine you move to a new apartment. You update your address on your bank's website, then on your insurance portal, then on your employer's HR system, then on your gym membership form. Four different places storing the same piece of information. When you move again, you have to remember every single place — and if you miss one, your mail goes to the wrong address.

This is exactly what happens in code when the same logic or data exists in multiple places. Change one copy and forget another, and your system is inconsistent.

The DRY principle (Don't Repeat Yourself) states that every piece of knowledge should have a single, authoritative representation in your system. It was coined by Andy Hunt and Dave Thomas in The Pragmatic Programmer and is one of the most fundamental software design principles. You will see its influence in many of the design patterns (Sections 10–12) and refactoring techniques (Section 9) we cover.

In this tutorial, you will learn what DRY actually means (it is not just about avoiding copy-paste), how to recognize all three forms of duplication, and, critically, when extracting shared logic actually makes things worse.

What DRY Actually Means

The original definition from The Pragmatic Programmer is:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

Notice the word knowledge, not code. DRY is about avoiding duplication of knowledge and intent, not just textually identical lines. Two pieces of code can look identical but represent different pieces of knowledge that happen to look the same today — and that is not a DRY violation.

There are three types of duplication that DRY targets:

  1. Code duplication — The same logic written in multiple places. This is the most visible form. It leads to the "shotgun surgery" code smell (Section 9) — one change requires modifying many files.

  2. Data duplication — The same fact stored in multiple locations without a single source of truth. For example, storing a customer's full name in both the users table and the orders table. When the customer changes their name, which copy is authoritative?

  3. Knowledge duplication — The same business rule encoded in multiple ways across different modules. For example, a tax rate hardcoded as 0.08 in the pricing module and as 8 (percent) in the reporting module. They express the same knowledge in different forms.

DRY is closely connected to the Single Responsibility Principle from Section 6. SRP says each class has one reason to change. DRY says each piece of knowledge lives in one place. Together, they prevent the cascading modifications that make codebases fragile.

DRY Violation — Duplicated Validation Logic

Let us look at a real scenario. An e-commerce system validates email addresses in three different places: user registration, newsletter subscription, and contact form submission. Each copy re-implements the same regex and error messages.

import re


class UserRegistration:
    def register(self, name: str, email: str, password: str) -> None:
        # Email validation — copy 1
        if not email or "@" not in email:
            raise ValueError("Invalid email address")
        if not re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email):
            raise ValueError("Invalid email format")

        print(f"User {name} registered with {email}")


class NewsletterSubscription:
    def subscribe(self, email: str) -> None:
        # Email validation — copy 2
        if not email or "@" not in email:
            raise ValueError("Invalid email address")
        if not re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email):
            raise ValueError("Invalid email format")

        print(f"{email} subscribed to newsletter")


class ContactForm:
    def submit(self, email: str, message: str) -> None:
        # Email validation — copy 3
        if not email or "@" not in email:
            raise ValueError("Invalid email address")
        if not re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email):
            raise ValueError("Invalid email format")

        print(f"Message from {email}: {message}")

import java.util.regex.Pattern;

public class UserRegistration {
    public void register(String name, String email, String password) {
        // Email validation — copy 1
        if (email == null || !email.contains("@")) {
            throw new IllegalArgumentException("Invalid email address");
        }
        if (!Pattern.matches("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$", email)) {
            throw new IllegalArgumentException("Invalid email format");
        }
        System.out.println("User " + name + " registered with " + email);
    }
}

public class NewsletterSubscription {
    public void subscribe(String email) {
        // Email validation — copy 2
        if (email == null || !email.contains("@")) {
            throw new IllegalArgumentException("Invalid email address");
        }
        if (!Pattern.matches("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$", email)) {
            throw new IllegalArgumentException("Invalid email format");
        }
        System.out.println(email + " subscribed to newsletter");
    }
}

public class ContactForm {
    public void submit(String email, String message) {
        // Email validation — copy 3
        if (email == null || !email.contains("@")) {
            throw new IllegalArgumentException("Invalid email address");
        }
        if (!Pattern.matches("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$", email)) {
            throw new IllegalArgumentException("Invalid email format");
        }
        System.out.println("Message from " + email + ": " + message);
    }
}

The same email validation regex and logic appears in three classes. When the business decides to block disposable email domains or add a maximum length limit, a developer must hunt down and update all three copies. Miss one, and the same address is accepted by one form and rejected by another.

This is the practical cost of a DRY violation: one change requires modifications in multiple places, and forgetting one creates a bug that is invisible until a user hits the inconsistent path.
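To make that failure mode concrete, here is a minimal sketch. It assumes a hypothetical new rule (a 254-character limit) that was added to the registration copy but forgotten in the newsletter copy. The function names, the limit, and the error message are illustrative and not part of the system above.

```python
import re

MAX_LEN = 254  # hypothetical new rule, for illustration only


def validate_for_registration(email: str) -> None:
    # Copy 1: updated with the new length rule.
    if not email or "@" not in email:
        raise ValueError("Invalid email address")
    if len(email) > MAX_LEN:
        raise ValueError("Email address too long")
    if not re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email):
        raise ValueError("Invalid email format")


def validate_for_newsletter(email: str) -> None:
    # Copy 2: the developer missed this one, so it still has the old rules.
    if not email or "@" not in email:
        raise ValueError("Invalid email address")
    if not re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email):
        raise ValueError("Invalid email format")


def is_accepted(validator, email: str) -> bool:
    """Returns True if the validator raises no error for this address."""
    try:
        validator(email)
        return True
    except ValueError:
        return False


long_email = "a" * 300 + "@example.com"
registration_ok = is_accepted(validate_for_registration, long_email)
newsletter_ok = is_accepted(validate_for_newsletter, long_email)
print(registration_ok, newsletter_ok)  # False True
```

The same address is rejected on one path and accepted on the other, and nothing in either function hints that a sibling copy exists.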

Applying DRY — Single Source of Truth

The fix is to extract the shared knowledge into a single location. Now there is exactly one place that defines what a valid email looks like.

import re


class EmailValidator:
    """Single source of truth for email validation rules."""

    _EMAIL_PATTERN = re.compile(
        r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
    )

    @staticmethod
    def validate(email: str) -> None:
        """Validates an email address. Raises ValueError if invalid."""
        if not email or "@" not in email:
            raise ValueError("Invalid email address")
        if not EmailValidator._EMAIL_PATTERN.match(email):
            raise ValueError("Invalid email format")


class UserRegistration:
    def register(self, name: str, email: str, password: str) -> None:
        EmailValidator.validate(email)
        print(f"User {name} registered with {email}")


class NewsletterSubscription:
    def subscribe(self, email: str) -> None:
        EmailValidator.validate(email)
        print(f"{email} subscribed to newsletter")


class ContactForm:
    def submit(self, email: str, message: str) -> None:
        EmailValidator.validate(email)
        print(f"Message from {email}: {message}")

import java.util.regex.Pattern;

public class EmailValidator {
    private static final Pattern EMAIL_PATTERN =
        Pattern.compile("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$");

    public static void validate(String email) {
        if (email == null || !email.contains("@")) {
            throw new IllegalArgumentException("Invalid email address");
        }
        if (!EMAIL_PATTERN.matcher(email).matches()) {
            throw new IllegalArgumentException("Invalid email format");
        }
    }
}

public class UserRegistration {
    public void register(String name, String email, String password) {
        EmailValidator.validate(email);
        System.out.println("User " + name + " registered with " + email);
    }
}

public class NewsletterSubscription {
    public void subscribe(String email) {
        EmailValidator.validate(email);
        System.out.println(email + " subscribed to newsletter");
    }
}

public class ContactForm {
    public void submit(String email, String message) {
        EmailValidator.validate(email);
        System.out.println("Message from " + email + ": " + message);
    }
}

Now when the validation rules change (add a length limit, block disposable domains, require corporate domains), there is exactly one place to update. Every consumer automatically gets the updated behavior.
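As a rough illustration of such a change, the sketch below adds one assumed rule (a 254-character limit) to a copy of the validator. The limit, its error message, and the two free functions standing in for the consumer classes are hypothetical. The point is only that the callers are not edited.

```python
import re


class EmailValidator:
    """Variant of the validator with one hypothetical extra rule."""

    MAX_LENGTH = 254  # assumed limit, chosen for illustration

    _EMAIL_PATTERN = re.compile(
        r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
    )

    @staticmethod
    def validate(email: str) -> None:
        if not email or "@" not in email:
            raise ValueError("Invalid email address")
        # The only new lines: every caller picks this rule up automatically.
        if len(email) > EmailValidator.MAX_LENGTH:
            raise ValueError("Email address too long")
        if not EmailValidator._EMAIL_PATTERN.match(email):
            raise ValueError("Invalid email format")


def subscribe(email: str) -> str:
    EmailValidator.validate(email)
    return f"{email} subscribed to newsletter"


def submit_contact_form(email: str, message: str) -> str:
    EmailValidator.validate(email)
    return f"Message from {email}: {message}"


too_long = "a" * 300 + "@example.com"
errors = []
for call in (lambda: subscribe(too_long),
             lambda: submit_contact_form(too_long, "hi")):
    try:
        call()
    except ValueError as exc:
        errors.append(str(exc))

print(errors)  # ['Email address too long', 'Email address too long']
```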

Notice that EmailValidator does not represent a domain concept like Customer or Order — it exists purely to centralize knowledge and achieve low coupling. This is what GRASP calls a Pure Fabrication (covered later in this section).

[Visualization: DRY Principle, from duplication to single source of truth]

Beyond Code — Data and Knowledge Duplication

Code duplication is the most visible form of DRY violation, but the two subtler forms cause equally painful bugs.

Data Duplication

Consider a system that stores the customer's shipping address in both the customers table and in every orders row. When the customer updates their address, the customers table is updated, but existing orders, including ones that have not shipped yet, still hold the old copy. Which is authoritative? The system is now in an inconsistent state.

Fix: Store the address in one place (customers), and have orders reference it by customer ID. If orders need to preserve the address at the time of purchase (a legitimate business requirement), store it as an explicit snapshot labeled shipping_address_at_order_time — making it clear that it is a historical copy, not the authoritative source.
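One way to picture this fix is with in-memory records instead of database tables. This is a sketch under assumptions: the Customer and Order classes, the customers dictionary, and the helper functions are hypothetical stand-ins. Only the field name shipping_address_at_order_time comes from the text above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Customer:
    customer_id: int
    shipping_address: str  # the single authoritative copy


@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int  # a reference, not a copy of customer data
    shipping_address_at_order_time: str  # explicit historical snapshot


# Stand-in for the customers table.
customers: Dict[int, Customer] = {1: Customer(1, "12 Oak St")}


def place_order(order_id: int, customer_id: int) -> Order:
    customer = customers[customer_id]
    return Order(order_id, customer_id, customer.shipping_address)


def current_address(order: Order) -> str:
    # Always resolve the authoritative value through the reference.
    return customers[order.customer_id].shipping_address


order = place_order(100, 1)
customers[1].shipping_address = "98 Elm Ave"  # the customer moves

print(current_address(order))                # 98 Elm Ave
print(order.shipping_address_at_order_time)  # 12 Oak St
```

The snapshot is frozen and named for what it is, so nobody mistakes it for the customer's current address.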

Knowledge Duplication

A tax calculation system uses the rate 0.08 in the pricing module and 8 (as a percentage integer) in the reporting module. Both express the same knowledge — the sales tax rate — but in different forms. When the rate changes to 8.5%, a developer updates the pricing module to 0.085 but the reporting module still shows 8.

Fix: Define the tax rate in one configuration source — a constant, a config file, or a database setting. Both modules read from that single source.
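A minimal sketch of that fix, assuming the rate lives in a module-level constant. The constant name and both functions are hypothetical, and a config file or database setting would serve the same role.

```python
from decimal import Decimal

# Single source of truth; the name and value are illustrative.
SALES_TAX_RATE = Decimal("0.085")


def price_with_tax(net_price: Decimal) -> Decimal:
    """Pricing module: applies the rate as a fraction."""
    return (net_price * (1 + SALES_TAX_RATE)).quantize(Decimal("0.01"))


def tax_rate_label() -> str:
    """Reporting module: derives the percentage from the same constant."""
    return f"{SALES_TAX_RATE * 100:.1f}%"


print(price_with_tax(Decimal("100.00")))  # 108.50
print(tax_rate_label())                   # 8.5%
```

The reporting module no longer stores its own 8. It computes the percentage from the shared constant, so a rate change cannot leave the two modules disagreeing.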

Knowledge duplication is the hardest to detect because the duplicates do not look identical. They require understanding the intent behind the code, not just its text.

Common Mistakes

Mistake 1: Confusing Coincidental Similarity with Duplication

❌ Wrong: Two functions look identical, so you extract them into one shared function — even though they represent completely different business rules.

For example, a calculate_employee_bonus() and calculate_customer_discount() both happen to multiply an amount by a percentage today. But employee bonuses and customer discounts are different knowledge domains. They will evolve independently. Merging them creates a false coupling — changing the discount logic accidentally changes bonus calculations.

✅ Right: Before extracting, ask: "Are these expressing the same piece of knowledge, or do they just happen to look the same right now?" If they represent different business concepts, leave them separate even if the code is identical.
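The sketch below shows why keeping them separate pays off. It assumes a hypothetical later requirement (a cap on discounts) that applies to customers only. The cap value and the parameter names are made up for illustration.

```python
from decimal import Decimal


def calculate_employee_bonus(salary: Decimal, rate: Decimal) -> Decimal:
    # HR rule: still a plain percentage of salary.
    return salary * rate


def calculate_customer_discount(order_total: Decimal, rate: Decimal) -> Decimal:
    # Sales rule: started out identical to the bonus formula, then gained
    # a cap (hypothetical) without touching the bonus calculation.
    max_discount = Decimal("50.00")
    return min(order_total * rate, max_discount)


print(calculate_employee_bonus(Decimal("60000"), Decimal("0.10")))    # 6000.00
print(calculate_customer_discount(Decimal("1000"), Decimal("0.10")))  # 50.00
```

Had both been routed through one shared multiply-by-percentage helper, adding the cap would have meant either a flag on the helper or an accidental cap on bonuses.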

Mistake 2: Over-DRY — Premature Abstraction

❌ Wrong: You see two lines of similar code and immediately extract a parameterized helper with three configuration flags. The helper is harder to understand than the original duplication.

✅ Right: Apply the Rule of Three: wait until you see the same pattern three times before extracting. Two occurrences might be coincidence. Three is a pattern. This connects directly to YAGNI (covered later in this section): do not build abstractions for problems you do not have yet.
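For a sense of what premature abstraction looks like in practice, compare the two versions below. This is an invented example: the name-formatting functions and their flags are hypothetical and not taken from any system in this tutorial.

```python
# Over-abstracted: one helper, three flags, and every caller must know them all.
def format_name(first: str, last: str, upper: bool = False,
                last_first: bool = False, initial_only: bool = False) -> str:
    if initial_only:
        first = first[0] + "."
    name = f"{last}, {first}" if last_first else f"{first} {last}"
    return name.upper() if upper else name


# Simpler: two small functions that each say what they do.
def display_name(first: str, last: str) -> str:
    return f"{first} {last}"


def sortable_name(first: str, last: str) -> str:
    return f"{last}, {first}"


print(format_name("Ada", "Lovelace", last_first=True))  # Lovelace, Ada
print(sortable_name("Ada", "Lovelace"))                 # Lovelace, Ada
```

Both calls produce the same string, but only the second can be understood without reading the helper's body.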

Mistake 3: DRY Across Service Boundaries

❌ Wrong: Two microservices share validation logic, so you create a shared library. Now both services are coupled to the library's release cycle, and a breaking change to it can force coordinated deployments.

✅ Right: DRY applies most strongly within a module or service. Across system boundaries, some duplication is acceptable — even healthy — if it preserves independent deployability. The cost of coupling can outweigh the cost of duplication.

When to Apply DRY

Apply DRY when:

  • The same business rule is encoded in multiple places — tax rates, validation rules, formatting logic
  • Changing one piece of logic requires modifying multiple files (the shotgun surgery smell from Section 9)
  • You find yourself copying and pasting code between classes or modules
  • Configuration values (timeouts, URLs, feature flags) are hardcoded in multiple locations

Do NOT apply DRY when:

  • Two pieces of code look similar but represent different business concepts (coincidental duplication)
  • Extracting shared logic would create a dependency between modules that should be independent
  • The shared abstraction would be more complex than the duplication it replaces
  • You have seen the pattern only twice — wait for the third occurrence (Rule of Three)
  • The duplication is across service or deployment boundaries where coupling is costly

DRY is a powerful principle, but like all principles, it can be misapplied. When DRY conflicts with KISS (next tutorial) or module independence, you need to make a judgment call. The goal is reducing the cost of change, not achieving zero textual duplication at all costs.

Key Takeaways

  • DRY means every piece of knowledge has a single authoritative representation — it is about knowledge and intent, not just identical code
  • Three forms of duplication: code (copy-paste), data (same fact in multiple stores), and knowledge (same rule expressed in different forms)
  • DRY violations cause shotgun surgery — one logical change requires modifications across many files, and missing one creates subtle bugs
  • Apply the Rule of Three before extracting: two occurrences might be coincidence, three is a pattern
  • Do not force DRY across system boundaries or between unrelated business concepts — some duplication is acceptable to preserve independence
  • DRY works hand-in-hand with the Single Responsibility Principle (Section 6) and connects to GRASP's Pure Fabrication (later in this section)