Navigating MySQL with Window Functions

Navigating MySQL with Window Functions

MySQL's window functions, including LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE(), offer powerful capabilities for data analysis by providing seamless access to result sets while preserving context. This guide explores each function's purpose, application, and practical use cases.
Cemil Tokatli
July 4, 2025
Topics:
Share:

Window functions in MySQL transform how you can query and analyze data by allowing efficient row comparisons, retrieving data from various positions, and extracting specific values within result sets. In this article, we delve into some of the most potent window functions: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE(). To demonstrate these functions, we'll use a sample table filled with dummy data to reflect real-world applications.

Sample Data Setup

Let's create a sample table sales to illustrate the use of these window functions. This table includes some 20 rows of sales data, complete with salespersons, dates, and sales amounts to serve as our real-world example:

CREATE TABLE sales (
    id INT PRIMARY KEY AUTO_INCREMENT,
    salesperson VARCHAR(50),
    sale_date DATE,
    sale_amount DECIMAL(10,2)
);

INSERT INTO sales (salesperson, sale_date, sale_amount) VALUES
('Alice', '2023-01-01', 1000.00),
('Bob', '2023-01-02', 1500.00),
('Alice', '2023-01-03', 2000.00),
('Charlie', '2023-01-03', 1750.00),
('Alice', '2023-01-04', 2500.00),
('Bob', '2023-01-05', 3000.00),
('Charlie', '2023-01-05', 1200.00),
('Alice', '2023-01-06', 2200.00),
('Bob', '2023-01-07', 1800.00),
('Charlie', '2023-01-08', 1450.00),
('Alice', '2023-01-09', 1900.00),
('Bob', '2023-01-10', 2100.00),
('Charlie', '2023-01-11', 2300.00),
('Alice', '2023-01-12', 2600.00),
('Bob', '2023-01-13', 1700.00),
('Charlie', '2023-01-14', 1850.00),
('Alice', '2023-01-15', 2100.00),
('Bob', '2023-01-16', 1100.00),
('Charlie', '2023-01-17', 1550.00),
('Alice', '2023-01-18', 2400.00);

Using LAG()

The LAG() function lets you see a previous row's value in your result set, making it perfect for finding trends by comparing current values with earlier ones.

SELECT 
    salesperson,
    sale_date,
    sale_amount,
    LAG(sale_amount, 1) OVER (PARTITION BY salesperson ORDER BY sale_date) AS previous_sale,
    sale_amount - LAG(sale_amount, 1) OVER (PARTITION BY salesperson ORDER BY sale_date) AS difference
FROM sales
ORDER BY salesperson, sale_date;

This output shows how each sale amount compares to the previous sale by the same salesperson. For the first entry of each salesperson, there is no previous sale, hence the NULL values.

salesperson sale_date sale_amount previous_sale difference
Alice 2023-01-01 1000.00 NULL NULL
Alice 2023-01-03 2000.00 1000.00 1000.00
Alice 2023-01-04 2500.00 2000.00 500.00
Alice 2023-01-06 2200.00 2500.00 -300.00
Alice 2023-01-09 1900.00 2200.00 -300.00
... ... ... ... ...

Using LEAD()

LEAD() gives you the value of the next row, helping you anticipate future outcomes by examining increases or decreases between data points.

SELECT 
    salesperson,
    sale_date,
    sale_amount,
    LEAD(sale_amount, 1) OVER (PARTITION BY salesperson ORDER BY sale_date) AS next_sale,
    LEAD(sale_amount, 1) OVER (PARTITION BY salesperson ORDER BY sale_date) - sale_amount AS difference_next
FROM sales
ORDER BY salesperson, sale_date;

Here, we observe the sale amount predictions for the next transaction. For the last transaction of any salesperson, next_sale shows NULL as there isn't a subsequent transaction.

salesperson sale_date sale_amount next_sale difference_next
Alice 2023-01-01 1000.00 2000.00 1000.00
Alice 2023-01-03 2000.00 2500.00 500.00
Alice 2023-01-04 2500.00 2200.00 -300.00
Alice 2023-01-06 2200.00 1900.00 -300.00
Alice 2023-01-09 1900.00 2600.00 700.00
... ... ... ... ...

Using FIRST_VALUE()

FIRST_VALUE() is handy when you need to benchmark against the first value of a set. It's useful for initial condition comparisons.

SELECT 
    salesperson,
    sale_date,
    sale_amount,
    FIRST_VALUE(sale_amount) OVER (PARTITION BY salesperson ORDER BY sale_date) AS first_sale
FROM sales
ORDER BY salesperson, sale_date;

This output depicts how each sale measure up against the first ever recorded sale of the salesperson, underlining each transaction's growth or drop.

salesperson sale_date sale_amount first_sale
Alice 2023-01-01 1000.00 1000.00
Alice 2023-01-03 2000.00 1000.00
Alice 2023-01-04 2500.00 1000.00
Alice 2023-01-06 2200.00 1000.00
Alice 2023-01-09 1900.00 1000.00
... ... ... ...

Using LAST_VALUE()

The LAST_VALUE() function helps you get the last value in an ordered set, perfect for comparing rows against the most recent entry in the data.

SELECT 
    salesperson,
    sale_date,
    sale_amount,
    LAST_VALUE(sale_amount) OVER (PARTITION BY salesperson ORDER BY sale_date RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_sale
FROM sales
ORDER BY salesperson, sale_date;

Illustrating each transaction's distance from the most recent entry by the same salesperson, which is insightful in identifying patterns within data seasonality.

salesperson sale_date sale_amount last_sale
Alice 2023-01-01 1000.00 2400.00
Alice 2023-01-03 2000.00 2400.00
Alice 2023-01-04 2500.00 2400.00
Alice 2023-01-06 2200.00 2400.00
Alice 2023-01-09 1900.00 2400.00
... ... ... ...

Using NTH_VALUE()

NTH_VALUE() retrieves the nth value in an ordered set, ideal for pinpointing specific points within a dataset.

SELECT 
    salesperson,
    sale_date,
    sale_amount,
    NTH_VALUE(sale_amount, 3) OVER (PARTITION BY salesperson ORDER BY sale_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS third_sale
FROM sales
ORDER BY salesperson, sale_date;

Focuses on retrieving the exact third sale for each salesperson. It helps identify trends or anomalies at specific points within the sales cycle.

salesperson sale_date sale_amount third_sale
Alice 2023-01-01 1000.00 2500.00
Alice 2023-01-03 2000.00 2500.00
Alice 2023-01-04 2500.00 2500.00
Alice 2023-01-06 2200.00 2500.00
Alice 2023-01-09 1900.00 2500.00
... ... ... ...

Conclusion

Mastering window functions like LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE() enhances your MySQL toolkit, enabling you to perform complex data analysis and extraction with ease. These functions allow for seamless data iteration, efficient row comparison, and precise value referencing across result sets, fostering efficient database management and insightful data discovery.