Skip to main content

Introduction to Window Value Functions

Introduction

Welcome to Window Value Functions! These are specialized window functions that let you access values from other rows in your result set without using self-joins or subqueries. They are incredibly powerful for comparing rows, calculating changes over time, and analyzing sequential data.

In this lesson, you will learn:

  • What window value functions are and how they differ from aggregate window functions
  • The four main window value functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE
  • When and why to use each function
  • Common use cases in business analysis

What are Window Value Functions?

Window value functions retrieve values from specific rows within a window frame. Unlike aggregate functions (SUM, AVG, COUNT) that combine multiple values into one, value functions return a single value from a specific row.

The Key Insight

Window value functions answer questions like:

  • What was the previous value? (LAG)
  • What is the next value? (LEAD)
  • What was the first value in the group? (FIRST_VALUE)
  • What is the last value in the group? (LAST_VALUE)

Comparison with Aggregate Functions

Function TypeWhat It DoesExample
AggregateCombines valuesSUM(price) → total of all prices
ValueRetrieves a specific valueLAG(price) → previous row's price

Visualization

Overview diagram showing 4 window value functions: LAG looks back, LEAD looks forward, FIRST_VALUE gets partition start, LAST_VALUE gets partition end
Overview diagram showing 4 window value functions: LAG looks back, LEAD looks forward, FIRST_VALUE gets partition start, LAST_VALUE gets partition end

Window Value Functions -- Introduction

The Four Main Window Value Functions

SQL provides four primary window value functions:

1. LAG - Look Back

LAG(column, offset, default) retrieves a value from a previous row.

LAG(price, 1, 0) OVER (ORDER BY order_date)
-- Returns the price from 1 row before the current row
-- Returns 0 if there is no previous row

2. LEAD - Look Forward

LEAD(column, offset, default) retrieves a value from a following row.

LEAD(price, 1, 0) OVER (ORDER BY order_date)
-- Returns the price from 1 row after the current row
-- Returns 0 if there is no next row

3. FIRST_VALUE - First in Window

FIRST_VALUE(column) retrieves the value from the first row in the window.

FIRST_VALUE(price) OVER (PARTITION BY seller_id ORDER BY order_date)
-- Returns the first price for each seller

4. LAST_VALUE - Last in Window

LAST_VALUE(column) retrieves the value from the last row in the window.

LAST_VALUE(price) OVER (
    PARTITION BY seller_id 
    ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
-- Returns the last price for each seller

Note: LAST_VALUE requires a proper frame clause to work as expected.

Quick Visual Comparison

Consider this ordered dataset:

Row | Value
----|------
  1 |  100   ← FIRST_VALUE returns this
  2 |  150   ← LAG returns this when current is row 3
  3 |  200   ← CURRENT ROW
  4 |  175   ← LEAD returns this when current is row 3
  5 |  225   ← LAST_VALUE returns this

For row 3 (value = 200):

  • LAG(value, 1) = 150 (previous row)
  • LEAD(value, 1) = 175 (next row)
  • FIRST_VALUE(value) = 100 (first row)
  • LAST_VALUE(value) = 225 (last row, with proper frame)

Function Reference Table

FunctionDirectionOffsetReturns
LAGBackwardOptional (default 1)Previous row value
LEADForwardOptional (default 1)Next row value
FIRST_VALUEFirstN/AFirst row in window
LAST_VALUELastN/ALast row in window

Basic Syntax Overview

LAG and LEAD Syntax

LAG(column [, offset [, default]])
    OVER ([PARTITION BY partition_column] ORDER BY sort_column)

LEAD(column [, offset [, default]])
    OVER ([PARTITION BY partition_column] ORDER BY sort_column)

Parameters:

  • column: The column to retrieve
  • offset: How many rows back (LAG) or forward (LEAD). Default is 1.
  • default: Value to return if no row exists. Default is NULL.

FIRST_VALUE and LAST_VALUE Syntax

FIRST_VALUE(column)
    OVER ([PARTITION BY partition_column] ORDER BY sort_column [frame_clause])

LAST_VALUE(column)
    OVER (
        [PARTITION BY partition_column] 
        ORDER BY sort_column
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    )

Important: LAST_VALUE almost always needs an explicit frame clause to get the actual last value in the partition.

Why ORDER BY is Required

Window value functions require ORDER BY because they need a defined sequence to determine "previous" and "next" rows.

Without ORDER BY

Without ordering, the concept of "previous" or "next" row is meaningless:

-- This will NOT work correctly
LAG(price) OVER ()  -- Which row is "previous"?

With ORDER BY

With ordering, the sequence is clear:

-- This works: previous row in order_date sequence
LAG(price) OVER (ORDER BY order_date)

Common Ordering Strategies

Use CaseORDER BY
Time-series analysisTimestamp or date column
Sequential comparisonID or sequence number
Ranked comparisonMetric (price, score, etc.)

Quick Examples

Let's see each function in action with simple examples.

LAG Example: Compare to Previous Order

-- Compare each order's price to the previous order
SELECT 
    order_id,
    price,
    LAG(price, 1, 0) OVER (ORDER BY order_id) AS previous_price,
    price - LAG(price, 1, 0) OVER (ORDER BY order_id) AS price_change
FROM olist_order_items_dataset
LIMIT 15;

LEAD Example: Compare to Next Order

-- Compare each order's price to the next order
SELECT 
    order_id,
    price,
    LEAD(price, 1, 0) OVER (ORDER BY order_id) AS next_price,
    LEAD(price, 1, 0) OVER (ORDER BY order_id) - price AS upcoming_change
FROM olist_order_items_dataset
LIMIT 15;

FIRST_VALUE Example: Compare to First Order

-- Compare each seller's orders to their first order price
SELECT 
    seller_id,
    order_id,
    price,
    FIRST_VALUE(price) OVER (
        PARTITION BY seller_id 
        ORDER BY order_id
    ) AS first_price,
    price - FIRST_VALUE(price) OVER (
        PARTITION BY seller_id 
        ORDER BY order_id
    ) AS change_from_first
FROM olist_order_items_dataset
LIMIT 20;

Common Use Cases

Window value functions excel at solving specific analytical problems:

1. Period-Over-Period Comparison

Compare current period to previous period (month-over-month, year-over-year):

-- Monthly sales with previous month comparison
LAG(monthly_sales, 1) OVER (ORDER BY month)

2. Gap Analysis

Identify gaps or anomalies in sequential data:

-- Time between consecutive orders
order_date - LAG(order_date, 1) OVER (ORDER BY order_date)

3. Customer Journey Analysis

Track customer behavior over time:

-- Days between purchases for each customer
LAG(purchase_date) OVER (PARTITION BY customer_id ORDER BY purchase_date)

4. Baseline Comparison

Compare all values to a reference point:

-- Compare all prices to the first price
price / FIRST_VALUE(price) OVER (ORDER BY date) AS price_index

5. Trend Detection

Identify upward or downward trends:

CASE 
    WHEN price > LAG(price) OVER (ORDER BY date) THEN 'Increasing'
    WHEN price < LAG(price) OVER (ORDER BY date) THEN 'Decreasing'
    ELSE 'Stable'
END AS trend

Value Functions vs Aggregate Functions

It's important to understand when to use each type:

Use Aggregate Window Functions When:

  • You need to calculate totals, averages, or counts
  • You want running totals or moving averages
  • You need to compare values to group statistics (avg, sum, min, max)

Use Value Window Functions When:

  • You need to compare to a specific row (previous, next, first, last)
  • You're calculating period-over-period changes
  • You're analyzing gaps or sequences in data
  • You need to detect trends or patterns

Decision Guide

QuestionUse
Compare to previous/next row?LAG / LEAD
Compare to group average?AVG() OVER()
Calculate running total?SUM() OVER()
Compare to first/last in group?FIRST_VALUE / LAST_VALUE
Calculate percentage change?LAG with arithmetic

Practice

Try these exercises to familiarize yourself with window value functions.

Exercise 1: Basic LAG

Show each payment with the previous payment value:

-- Show current and previous payment

SELECT 
    order_id,
    payment_value,
    LAG(payment_value, 1, 0) OVER (ORDER BY order_id) AS previous_payment
FROM olist_order_payments_dataset
LIMIT 15;

Exercise 2: LEAD for Forecasting

Show each price with what the next price will be:

-- Show current price and next price

SELECT 
    order_id,
    price,
    LEAD(price, 1) OVER (ORDER BY order_id) AS next_price
FROM olist_order_items_dataset
LIMIT 15;

Exercise 3: FIRST_VALUE by Group

Show each order with the first order price for that seller:

-- First price per seller

SELECT 
    seller_id,
    order_id,
    price,
    FIRST_VALUE(price) OVER (
        PARTITION BY seller_id 
        ORDER BY order_id
    ) AS sellers_first_price
FROM olist_order_items_dataset
LIMIT 20;

Key Takeaways

Here's what you've learned about window value functions:

  1. Window value functions retrieve values from specific rows, not aggregates
  2. LAG looks backward to previous rows
  3. LEAD looks forward to following rows
  4. FIRST_VALUE gets the first value in the window
  5. LAST_VALUE gets the last value (requires frame clause)
  6. ORDER BY is required to define row sequence
  7. Key use cases: period-over-period, gap analysis, trend detection

Coming Up Next

In the following lessons, we'll explore:

  • Detailed usage of LAG and LEAD for month-over-month analysis
  • Customer retention analysis with value functions
  • Time gap analysis techniques
  • FIRST_VALUE and LAST_VALUE patterns
  • Real-world business applications