Window functions in SQL are a powerful tool for performing calculations across a set of rows related to the current row. Unlike aggregate functions, which return one result for a group of rows, window functions do not collapse the result set. Instead, they allow you to include both the raw data and the calculated values in the result set.
A window function performs a calculation across a specified range of rows (a window) in a result set. Unlike regular aggregate functions, which group data, window functions operate within the scope of the rows defined by the window.
The general syntax for a window function is:
function_name (expression)
OVER (PARTITION BY column_name ORDER BY column_name ROWS BETWEEN frame_start AND frame_end)
function_name(expression)
: Specifies the window function and the column to operate on.OVER
: Defines the window for the function.PARTITION BY
: Divides the data into subsets for the calculation.ORDER BY
: Specifies the order of rows within each partition.ROWS BETWEEN
: Defines the range of rows in the window.The PARTITION BY
clause divides the result set into partitions to perform calculations within each partition.
Suppose we have a sales
table:
EmployeeID | Region | Sales |
---|---|---|
1 | North | 100 |
2 | North | 200 |
3 | South | 150 |
4 | South | 250 |
SELECT EmployeeID, Region, Sales,
SUM(Sales) OVER (PARTITION BY Region) AS TotalSalesByRegion
FROM sales;
EmployeeID | Region | Sales | TotalSalesByRegion |
---|---|---|---|
1 | North | 100 | 300 |
2 | North | 200 | 300 |
3 | South | 150 | 400 |
4 | South | 250 | 400 |
The ORDER BY
clause specifies the order of rows within a partition.
SELECT EmployeeID, Region, Sales,
RANK() OVER (PARTITION BY Region ORDER BY Sales DESC) AS Rank
FROM sales;
EmployeeID | Region | Sales | Rank |
---|---|---|---|
2 | North | 200 | 1 |
1 | North | 100 | 2 |
4 | South | 250 | 1 |
3 | South | 150 | 2 |
A window frame specifies a subset of rows relative to the current row.
SELECT EmployeeID, Sales,
SUM(Sales) OVER (ORDER BY EmployeeID ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS RunningTotal
FROM sales;
EmployeeID | Sales | RunningTotal |
---|---|---|
1 | 100 | 100 |
2 | 200 | 300 |
3 | 150 | 350 |
4 | 250 | 400 |
RANK()
, but without gaps.
SELECT EmployeeID, Sales,
SUM(Sales) OVER (ORDER BY EmployeeID) AS RunningTotal
FROM sales;
EmployeeID | Sales | RunningTotal |
---|---|---|
1 | 100 | 100 |
2 | 200 | 300 |
3 | 150 | 450 |
4 | 250 | 700 |
SELECT EmployeeID, Sales,
PERCENT_RANK() OVER (ORDER BY Sales) AS PercentileRank
FROM sales;
EmployeeID | Sales | PercentileRank |
---|---|---|
1 | 100 | 0.0 |
3 | 150 | 0.33 |
2 | 200 | 0.66 |
4 | 250 | 1.0 |
SELECT EmployeeID, Sales,
LAG(Sales) OVER (ORDER BY EmployeeID) AS PreviousSales,
LEAD(Sales) OVER (ORDER BY EmployeeID) AS NextSales
FROM sales;
EmployeeID | Sales | PreviousSales | NextSales |
---|---|---|---|
1 | 100 | NULL | 200 |
2 | 200 | 100 | 150 |
3 | 150 | 200 | 250 |
4 | 250 | 150 | NULL |
Window functions are an essential feature in SQL for advanced data analysis. They provide unmatched flexibility for ranking, trend analysis, and cumulative calculations without collapsing the result set. Happy Coding!❤️