If you think you never had encounter window functions, maybe you remember the question from an interview:

What’s difference between **GROUP BY** and **PARTITION BY**.

Think is that **PARTITION BY** is one of many window functions.

Ok, so let’s start with the difference between **GROUP BY** and **PARTITION BY**.

#### Understanding window functions concept

Let’s consider stored data of payments from home budget.

CREATE SCHEMA WF; GO CREATE TABLE [WF].[Payments] ( [Id] INT NOT NULL IDENTITY(1,1) , [Amount] DECIMAL(10,2) NOT NULL , [Date] DATE NOT NULL , [Category] NVARCHAR(20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [Person] NVARCHAR(20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , CONSTRAINT [PK_WF.Payments] PRIMARY KEY ([Id] ASC) ) GO SET IDENTITY_INSERT WF.Payments ON; GO INSERT INTO WF.Payments ([Id], [Amount], [Date], [Category], [Person]) VALUES (1, 30.15, '20180701', N'Travel', N'Alice'), (2, 37.49, '20180712', N'Car', N'Alice'), (3, 80.01, '20180704', N'Home', N'Mark'), (4, 43.10, '20180804', N'Car', N'Alice'), (5, 4.96, '20180826', N'Travel', N'Alice'), (6, 31.32, '20180805', N'Home', N'Alice'), (7, 30.52, '20180913', N'Home', N'Mark'), (8, 52.94, '20180903', N'Car', N'Mark'), (9, 14.45, '20181021', N'Travel', N'Alice'), (10, 29.53, '20181120', N'Travel', N'Mark'), (11, 0.95, '20181123', N'Car', N'Alice'), (12, 19.95, '20181210', N'Car', N'Alice'), (13, 0.68, '20181222', N'Travel', N'Alice'); GO SET IDENTITY_INSERT WF.Payments OFF; GO

We should receive table

I will try to explain the idea of Window Functions basing on the following example with ROW_NUMBER() and PARTITION BY keywords.

SELECT ROW_NUMBER() OVER (ORDER BY Date ASC) as '[1] ROW_NUMBER()' , Amount as '[2] Amount' , ROW_NUMBER() OVER (PARTITION BY MONTH(Date) ORDER BY Date ASC) AS '[3] ROW_NUMBER()-Date-ASC', ROW_NUMBER() OVER (PARTITION BY MONTH(Date) ORDER BY Date DESC) AS '[4] ROW_NUMBER()-Date-DESC', Date as '[5] Date', ROW_NUMBER() over (PARTITION BY Person ORDER BY Date ASC) AS '[6] ROW-NUMBER()-Person' , Person '[7] Person' FROM WF.Payments P ORDER BY P.[Date] ASC

And here’s the result:

- PARTITION BY in OVER clause creates a subset of rows from the base query
- ORDER BY in OVER clause sorts item for calculating purpose of the window function
- ORDER BY in OVER clause has nothing in common with ORDER BY in the base query – although then can be identical in some cases
- When window function is used without PARTITION BY clause, a whole query result is treated like one big window (column [1])
- For every window function (in different columns) different windows are created – compare results of ROW_NUMBER() in columns [1], [3] and [6]
- In columns [3] and [4] there is the same function, window, but with different sorting
- Window functions can be used with SELECT and ORDER BY keywords

Here’s generic syntax for window functions

function (expression) OVER ( [ PARTITION BY expression_list ] [ ORDER BY order_list ] [ ROWS frame_clause ])

##### Frame clauses

Apart from partitioning and sorting there is another one way to precise scope of windowing.

Frame clauses can limit scope for calculation with respect to currently calculated expression in row.

For example if you need progressive sum of payment amount, you can use frame clauses.

Here’s list of available frame clauses.

UNBOUNDED PRECEDING | ||

n PRECEDING | ||

CURRENT ROW | AND | CURRENT ROW |

n FOLLOWING | ||

UNBOUNDED FOLLOWING | ||

#### Window functions types

RANKING | |

ROW_NUMBER() | Determines the ordinal number of the current row within its partition. |

RANK() | Determines the rank of a value in a group of values. Rows with equal values for the ranking criteria receive the same rank. |

DENSE_RANK() | Determines the rank of a value in a group of values. Rows with equal values receive the same rank. There are no gaps in the sequence of ranked values if two or more rows have the same rank. |

NTILE(n) | Divides the rows for each window partition, as equally as possible, into a specified number of ranked groups. |

OFFSET | |

LEAD (col) | Returns the value for the row after the current row in a partition. |

LAG (col) | Returns the value for the row before the current row in a partition. |

FIRST_VALUE(col) | Returns the value of the specified expression with respect to the first row in the window frame. |

LAST_VALUE(col) | Returns the value of the specified expression with respect to the last row in the window frame. |

AGGREGATE | |

SUM(col) | Sum of the expression across all input values. Ignores NULL values. |

AVG(col) | Average value. Ignores NULL values. |

COUNT(col) | COUNT(*) counts all of the rows in the target table if they do or do not include nulls. COUNT(expression) computes the number of rows with non-NULL values. |

MAX(col) | Maximum value of the expression across all input values. Ignores NULL values. |

MIN(col) | Minimum value of the expression across all input values. Ignores NULL values. |

#### Examples

What % of the total amount in a month is the amount of every payment?

SELECT [Date] AS 'Date [1]', Amount as 'Amount [2]', SUM(Amount) OVER (PARTITION BY MONTH(Date)) AS 'SUM(Amount) [3]', CAST ((Amount / (SUM(Amount) OVER (PARTITION BY MONTH(Date))))*100 AS NUMERIC(4,1)) AS '% of total in month [4]' FROM WF.Payments P ORDER BY P.[Date] ASC

SELECT Date AS 'Date [1]', SUM(Amount) OVER (ORDER BY Date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as 'SUM(Amount) [2]', Amount AS 'Amount [3]', SUM(Amount) OVER (PARTITION BY MONTH(Date) ORDER BY Date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as 'SUM(Amount) [4]' FROM WF.Payments P ORDER BY P.[Date] ASC

-- (5) Difference between the current and previous payment amount SELECT ROW_NUMBER() OVER (ORDER BY Date asc) AS '#' , Date, LAG(Date) OVER (ORDER BY Date ASC) as 'LAG(Date) - Date of previous payment', DATEDIFF(day, LAG(Date) OVER (ORDER BY Date ASC), Date) AS 'Payments day interval', Amount FROM WF.Payments P ORDER BY P.[Date] ASC

partition definition | order | frames | |||||||||

RANKING | ROW_NUMBER() | ||||||||||

RANK() | |||||||||||

DENSE_RANK() | |||||||||||

NTILE(n) | |||||||||||

OFFSET | LEAD (col) | ||||||||||

LAG (col) | |||||||||||

FIRST_VALUE(col) | OVER | PARTITION BY col | ORDER BY col | ROWS BETWEEN | AND | ||||||

LAST_VALUE(col) | |||||||||||

AGGREGATE | SUM(col) | UNBOUNDED PRECEDING | CURRENT ROW | ||||||||

AVG(col) | n PRECEDING | n FOLLOWING | |||||||||

COUNT(col) | CURRENT ROW | UNBOUNDED FOLLOWING | |||||||||

MAX(col) | |||||||||||

MIN(col) | |||||||||||

http://tableizer.journalistopia.com/tableizer.php |