Introduction to Partitions

Introduction

As your database tables grow to tens of millions or billions of rows, even well-indexed queries can start to slow down. At that scale, managing a single massive table becomes a real challenge, not just for query performance but also for maintenance operations such as backups, index rebuilds, and data archival.

This is where table partitioning comes in. Partitioning allows you to split a large table into smaller, more manageable pieces while still treating it as a single logical table from a query perspective.

In this tutorial, you will learn:

  • What partitioning is and why it exists
  • The core concept behind horizontal partitioning
  • How partitioning differs from simply creating multiple tables
  • Real-world scenarios where partitioning makes a significant difference
  • The key terminology you need before diving deeper

What is Table Partitioning?

Table partitioning is the process of dividing a large table into smaller physical segments called partitions. Each partition holds a subset of the table's rows based on a specific rule — typically the value of a chosen column.

The crucial point is this: while the data is physically split across partitions, it still appears as one single table to the user. You query it, join it, and filter it exactly as before. The database engine handles the complexity of knowing where each row lives.

Think of it like organizing a massive filing cabinet. Instead of dumping every document into one giant drawer, you create labeled drawers — one for each year, each month, or each region. When someone asks for a document from 2023, you go straight to the 2023 drawer instead of searching through everything.

Visualization

Diagram showing a large table divided into smaller partitions based on a partition key, with each partition stored independently

Horizontal vs Vertical Partitioning

There are two ways to partition data, and it is important to know the difference:

Horizontal Partitioning (Row-Based)

This is what people usually mean when they say "partitioning." The table is split by rows — each partition contains a subset of the rows based on some column value.

Vertical Partitioning (Column-Based)

The table is split by columns. This is less common, and it is more a schema design decision than a built-in engine feature.

Type | Splits By | Common Use
Horizontal | Rows | Large tables with time-series or category-based data
Vertical | Columns | Tables with many columns where some are rarely accessed
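
To make the column-based split concrete, here is a minimal sketch of vertical partitioning done by hand. The table and column names (customer_core, customer_profile, biography, preferences) are hypothetical and chosen only for illustration.

```sql
-- Vertical partitioning sketch: one wide table split by columns.
-- All table and column names here are hypothetical.

-- Frequently accessed columns stay in a narrow table.
CREATE TABLE customer_core (
    customer_id INT          NOT NULL PRIMARY KEY,
    name        VARCHAR(100) NOT NULL,
    email       VARCHAR(255) NOT NULL
);

-- Rarely accessed columns move to a second table that shares the key.
CREATE TABLE customer_profile (
    customer_id INT           NOT NULL PRIMARY KEY
        REFERENCES customer_core (customer_id),
    biography   VARCHAR(4000) NULL,
    preferences VARCHAR(4000) NULL
);

-- The full row is reassembled with a join only when it is needed.
SELECT c.customer_id, c.name, p.biography
FROM customer_core AS c
LEFT JOIN customer_profile AS p
    ON p.customer_id = c.customer_id;
```

Both tables hold every customer, but each holds only some of the columns. Horizontal partitioning does the opposite: every partition has all the columns but only some of the rows.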

For the rest of this module, when we say "partitioning" we mean horizontal partitioning — dividing a table by rows.

Why Not Just Create Separate Tables?

You might wonder: why not just create separate tables like orders_2022, orders_2023, orders_2024? That achieves the same physical separation, right?

While separate tables do split data physically, partitioning provides significant advantages over that manual approach:

Aspect | Separate Tables | Partitioned Table
Query syntax | Must use UNION ALL across tables | Single table name in queries
Schema changes | Must ALTER every table | ALTER once, applies to all partitions
Constraints | Must define per table | Defined once at the table level
Foreign keys | Cannot easily reference | Standard foreign key references
Application code | Must know which table to query | No application changes needed
Adding new partitions | Create new table, update all queries | Add a partition; queries work automatically
Cross-partition queries | Complex UNION ALL statements | Handled transparently by the engine
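
The query-syntax difference is easiest to see side by side. The sketch below assumes the yearly tables named above (orders_2022, orders_2023, orders_2024) and a partitioned orders table, all with hypothetical columns order_id, order_date, and amount.

```sql
-- Manual approach: the query must name every table that may hold matching rows.
SELECT order_id, order_date, amount
FROM orders_2022
WHERE order_date >= '2022-07-01'
UNION ALL
SELECT order_id, order_date, amount
FROM orders_2023
UNION ALL
SELECT order_id, order_date, amount
FROM orders_2024
WHERE order_date < '2024-07-01';

-- Partitioned table: one table name, and the engine decides which partitions to read.
SELECT order_id, order_date, amount
FROM orders
WHERE order_date >= '2022-07-01'
  AND order_date <  '2024-07-01';
```

When a 2025 table is added in the manual approach, every query like the first one has to be revisited. The second query keeps working unchanged.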

The Key Benefit: Transparency

With partitioning, your queries do not need to change. A query like this works before and after partitioning:

SELECT *
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

If orders is partitioned by year on order_date, the database engine reads only the 2023 partition and skips the rest. This behavior is called partition elimination (or partition pruning), and it is one of the biggest performance gains of partitioning.
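
One way to see where rows live is to ask the engine directly. The sketch below assumes SQL Server and a hypothetical partition function named pf_OrderYear that partitions orders by order_date, one partition per calendar year. Neither name comes from the text above.

```sql
-- Assumes SQL Server and a hypothetical partition function pf_OrderYear
-- defined on order_date.

-- Count rows per partition using the built-in $PARTITION function.
SELECT $PARTITION.pf_OrderYear(order_date) AS partition_number,
       COUNT(*)                            AS row_count
FROM orders
GROUP BY $PARTITION.pf_OrderYear(order_date)
ORDER BY partition_number;

-- A range predicate written directly on the partition key gives the
-- optimizer what it needs to skip partitions outside the range.
SELECT order_id, order_date
FROM orders
WHERE order_date >= '2023-01-01'
  AND order_date <  '2024-01-01';
```

The half-open range (>= start, < next start) is a design choice: it selects the same 2023 rows as the BETWEEN version for a date column, and it stays correct if the column later becomes a datetime type.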

When Should You Use Partitioning?

Partitioning is not for every table. It adds complexity to your database design and should be used when the benefits clearly outweigh the overhead.

Good Candidates for Partitioning

Scenario | Why Partitioning Helps
Very large tables (millions+ rows) | Smaller partitions are faster to scan and maintain
Time-series data (logs, orders, events) | Old data can be archived by dropping entire partitions
Rolling window queries | Queries that always filter by date benefit from partition elimination
Data lifecycle management | Easy to purge old data without expensive DELETE operations
Maintenance operations | Index rebuilds and backups can target individual partitions
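
The lifecycle and maintenance rows can be sketched as follows. This assumes SQL Server 2016 or later, a hypothetical dbo.orders table partitioned by year in which partition 1 holds the oldest data, and a hypothetical partition-aligned index named IX_orders_order_date. The first two statements are alternatives, not steps to run together.

```sql
-- Assumes SQL Server 2016+; table, index, and partition numbers are hypothetical.

-- Alternative 1, row-by-row purge: each deleted row is logged individually.
DELETE FROM dbo.orders
WHERE order_date < '2022-01-01';

-- Alternative 2, partition-level purge: empty partition 1 in a single operation.
TRUNCATE TABLE dbo.orders WITH (PARTITIONS (1));

-- Maintenance: rebuild one partition of an index instead of the whole index.
ALTER INDEX IX_orders_order_date ON dbo.orders
REBUILD PARTITION = 3;
```

The partition-level statements touch only the named partition, which is why they tend to be much cheaper than their whole-table equivalents on very large tables.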

Poor Candidates for Partitioning

Scenario | Why Partitioning May Hurt
Small tables (under 1M rows) | Overhead of partitioning is not justified
Queries that span all partitions | No partition elimination means no benefit
Highly transactional OLTP | Cross-partition transactions add complexity
Frequently updated partition keys | Moving rows between partitions is expensive
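
Two of these cases are easy to recognize in a query. The sketch below assumes a hypothetical dbo.orders table partitioned on order_date, with a hypothetical customer_id column that is not part of the partition key.

```sql
-- Table and column names are hypothetical.

-- No filter on the partition key: the engine cannot rule out any partition,
-- so every partition has to be checked for matching rows.
SELECT order_id, order_date, amount
FROM dbo.orders
WHERE customer_id = 42;

-- Changing the partition key value: the row now belongs to a different
-- partition, so the engine has to move it there.
UPDATE dbo.orders
SET order_date = '2024-02-15'
WHERE order_id = 1001;
```

If most of a workload looks like the first query, partitioning on order_date adds overhead without enabling elimination.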

Key Partitioning Terminology

Before going deeper into partitioning, here are the essential terms you need to know:

Term | Definition
Partition Key | The column used to determine which partition a row belongs to (e.g., order_date)
Partition Function | A rule that maps column values to partition numbers (e.g., each year maps to a partition)
Partition Scheme | Maps partitions to physical storage locations (filegroups)
Partition Elimination | The optimizer's ability to skip irrelevant partitions during query execution
Boundary Values | The cutoff points that define where one partition ends and the next begins
Range LEFT / RIGHT | Determines whether the boundary value belongs to the left or right partition
Filegroup | A logical storage container where partition data is physically stored
Sliding Window | A pattern where you regularly add new partitions and remove old ones
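
Range LEFT and Range RIGHT are the terms most often confused, so a small sketch may help. It assumes SQL Server, whose terminology this table appears to follow. The function names pf_Left and pf_Right are hypothetical.

```sql
-- Assumes SQL Server. Function names are hypothetical.
-- Two boundary values produce three partitions in both cases.

CREATE PARTITION FUNCTION pf_Left (date)
AS RANGE LEFT FOR VALUES ('2023-01-01', '2024-01-01');

CREATE PARTITION FUNCTION pf_Right (date)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01');

-- Ask each function where the boundary value itself lands.
-- RANGE LEFT puts it at the end of the left partition (partition 1);
-- RANGE RIGHT puts it at the start of the right partition (partition 2).
SELECT $PARTITION.pf_Left(CAST('2023-01-01' AS date))  AS left_partition,
       $PARTITION.pf_Right(CAST('2023-01-01' AS date)) AS right_partition;
```

For date-based partitions, RANGE RIGHT with boundaries on the first day of each period is a common choice, because each partition then starts exactly at its boundary value.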

How These Pieces Fit Together

The partition key supplies the value for each row. The partition function compares that value against the boundary values to choose a partition number, and the partition scheme maps that partition number to a filegroup, where the rows are physically stored.
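
As a rough end-to-end sketch, assuming SQL Server, the terms above correspond to the statements below. Every object name (pf_OrderYear, ps_OrderYear, dbo.orders, PK_orders) is hypothetical, and all partitions are placed on the PRIMARY filegroup only to keep the example short.

```sql
-- Assumes SQL Server. All object names are hypothetical.

-- 1. Partition function: maps order_date values to partition numbers.
--    Partition 1: before 2023, partition 2: 2023, partition 3: 2024 onward.
CREATE PARTITION FUNCTION pf_OrderYear (date)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01');

-- 2. Partition scheme: maps each partition to a filegroup.
CREATE PARTITION SCHEME ps_OrderYear
AS PARTITION pf_OrderYear ALL TO ([PRIMARY]);

-- 3. Table: created on the scheme, with order_date as the partition key.
CREATE TABLE dbo.orders (
    order_id   int            NOT NULL,
    order_date date           NOT NULL,
    amount     decimal(10, 2) NOT NULL,
    -- The partition key is part of the primary key so that the
    -- primary key index can be partitioned along with the table.
    CONSTRAINT PK_orders PRIMARY KEY CLUSTERED (order_id, order_date)
) ON ps_OrderYear (order_date);

-- 4. Sliding window, adding: name the filegroup for the next partition,
--    then split off a new partition that starts at 2025.
ALTER PARTITION SCHEME ps_OrderYear NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_OrderYear() SPLIT RANGE ('2025-01-01');

-- 5. Sliding window, removing: drop the oldest boundary once the data
--    on either side of it has been archived or purged.
ALTER PARTITION FUNCTION pf_OrderYear() MERGE RANGE ('2023-01-01');
```

In a production setup the scheme would more likely spread partitions across several filegroups. Using a single filegroup here is a simplification, not a recommendation.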

Summary

Here is what you should remember about table partitioning:

Concept | Key Takeaway
What | Splitting a large table into smaller physical segments (partitions)
Why | Improves query performance, simplifies maintenance, enables efficient data lifecycle management
How | Based on a partition key column and boundary values
Transparency | Queries treat the partitioned table as a single table, so no code changes are needed
Best for | Very large tables with time-series data, rolling window queries, and data archival needs
Not for | Small tables, queries that always scan all data, or frequently changing partition keys

Key Differences from Other Optimization Techniques

Technique | What It Does
Indexes | Speed up lookups within a table by creating sorted data structures
Partitioning | Splits the table itself into smaller physical pieces
Views | Provide a virtual layer over data, no physical changes
CTAS/Temp Tables | Create copies of data for specific use cases

Partitioning and indexing work together: you can (and should) create indexes on partitioned tables. The next tutorials walk through setting up partitioning and creating partitioned tables step by step.