Skip to main content

Introduction to CTE

Introduction

A Common Table Expression (CTE) is one of the most powerful features in SQL for writing clean, readable, and maintainable queries. It allows you to define a temporary named result set that exists only for the duration of a single query.

In this tutorial, you will learn:

  • What a CTE is and how it works
  • Basic CTE syntax and structure
  • How CTEs differ from subqueries
  • When to use CTEs in your SQL queries

What is a CTE?

Definition

A Common Table Expression (CTE) is a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. Think of it as creating a temporary table that exists only for the duration of your query.

Simple Analogy

Imagine you're writing a complex report and need to use the same calculated data multiple times. Instead of recalculating it each time, you:

  1. Calculate once and give it a name
  2. Reference that name wherever you need the data

A CTE works exactly like this – you define a result set once, name it, and reference it throughout your query.

Visualization

Diagram showing a CTE as a named temporary result set defined with WITH keyword, used by the main query below
Diagram showing a CTE as a named temporary result set defined with WITH keyword, used by the main query below

Introduction to CTEs -- Named Temporary Queries

Basic Syntax

CTE Structure

WITH cte_name AS (
    -- CTE query definition
    SELECT column1, column2
    FROM table_name
    WHERE condition
)
-- Main query that uses the CTE
SELECT *
FROM cte_name;

Key Components

ComponentDescriptionRequired
WITHKeyword that starts a CTE definitionYes
cte_nameThe name you assign to your temporary result setYes
ASKeyword connecting name to query definitionYes
(query)The SELECT statement that defines the CTE's dataYes
Main QueryThe query that references and uses the CTEYes

Naming Rules

  • CTE names follow the same rules as table names
  • Use descriptive names that indicate what data the CTE contains
  • Names are case-insensitive in most databases
  • Cannot use reserved SQL keywords as CTE names

CTE vs Subquery

Side-by-Side Comparison

Using a Subquery:

SELECT *
FROM (
    SELECT seller_id, SUM(price) as total_sales
    FROM olist_order_items_dataset
    GROUP BY seller_id
) AS seller_totals
WHERE total_sales > 1000;

Using a CTE:

WITH seller_totals AS (
    SELECT seller_id, SUM(price) as total_sales
    FROM olist_order_items_dataset
    GROUP BY seller_id
)
SELECT *
FROM seller_totals
WHERE total_sales > 1000;

Key Differences

AspectSubqueryCTE
PlacementInside the main queryBefore the main query
ReadabilityCan become nested and hard to readLinear, top-to-bottom reading
ReusabilityMust repeat for each useDefine once, use multiple times
DebuggingHarder to isolate issuesEasy to test independently
Self-referenceNot possiblePossible (recursive CTEs)

When CTEs Shine

CTEs are particularly useful when:

  • You need to reference the same derived data multiple times
  • Your query has multiple levels of nesting
  • You want to break complex logic into named, logical steps
  • You need recursion (which subqueries cannot do)

Practice

Let's practice writing basic CTEs with our e-commerce dataset.

Exercise 1: Your First CTE

Create a simple CTE that calculates total payments per order:

-- Define a CTE to calculate total payment per order
WITH order_totals AS (
    SELECT 
        order_id,
        SUM(payment_value) AS total_payment
    FROM olist_order_payments_dataset
    GROUP BY order_id
)
-- Use the CTE in the main query
SELECT 
    order_id,
    total_payment
FROM order_totals
WHERE total_payment > 500
ORDER BY total_payment DESC
LIMIT 10;

Exercise 2: CTE with Column Aliases

You can specify column names in the CTE definition for clearer code:

-- CTE with explicit column names
WITH customer_orders (customer_id, order_count, first_order, last_order) AS (
    SELECT 
        customer_id,
        COUNT(*) AS order_count,
        MIN(order_purchase_timestamp) AS first_order,
        MAX(order_purchase_timestamp) AS last_order
    FROM olist_orders_dataset
    GROUP BY customer_id
)
SELECT *
FROM customer_orders
WHERE order_count > 1
LIMIT 10;

Exercise 3: Using CTE Multiple Times

One of the main advantages of CTEs is referencing them multiple times:

-- CTE referenced multiple times in the main query
WITH state_stats AS (
    SELECT 
        customer_state,
        COUNT(*) AS customer_count
    FROM olist_customers_dataset
    GROUP BY customer_state
)
SELECT 
    s1.customer_state,
    s1.customer_count,
    -- Compare each state to the average
    ROUND(s1.customer_count * 1.0 / (SELECT AVG(customer_count) FROM state_stats), 2) AS ratio_to_avg
FROM state_stats s1
ORDER BY s1.customer_count DESC
LIMIT 10;

Common Mistakes

Mistake 1: Missing the Main Query

Wrong:

WITH my_cte AS (
    SELECT * FROM orders
);
-- Error: CTE defined but never used!

Correct:

WITH my_cte AS (
    SELECT * FROM orders
)
SELECT * FROM my_cte;  -- Must have a main query

Mistake 2: Referencing CTE Before Definition

Wrong:

SELECT * FROM my_cte;  -- CTE doesn't exist yet!

WITH my_cte AS (
    SELECT * FROM orders
)

Correct:

WITH my_cte AS (
    SELECT * FROM orders
)
SELECT * FROM my_cte;  -- CTE is defined first

Mistake 3: Trying to Use CTE Outside Its Scope

Wrong:

WITH my_cte AS (
    SELECT * FROM orders
)
SELECT * FROM my_cte;

SELECT * FROM my_cte;  -- Error: CTE no longer exists!

CTEs only exist for the single statement that follows them. Each new statement requires its own CTE definition.

Summary

Key Takeaways

  1. A CTE (Common Table Expression) is a temporary named result set defined with the WITH clause

  2. Syntax: WITH cte_name AS (SELECT ...) SELECT ... FROM cte_name

  3. Benefits over subqueries:

  • More readable (top-to-bottom flow)
  • Reusable within the same query
  • Can be self-referencing (recursive)
  • Easier to debug and maintain
  1. Scope: CTEs exist only for the duration of the single statement that follows

  2. Naming: Use descriptive names that indicate the CTE's purpose

Quick Reference

-- Basic CTE template
WITH cte_name AS (
SELECT columns
FROM table
WHERE conditions
)
SELECT *
FROM cte_name;

What's Next?

Now that you understand what a CTE is, let's explore why you should use CTEs and the specific advantages they provide for writing better SQL queries.