SQL PARTITION BY Clause overview (2023)

This article will cover the SQL PARTITION BY clause and, in particular, the difference with GROUP BY in a select statement. We will also explore various use cases of SQL PARTITION BY.

We use SQL PARTITION BY to divide the result set into partitions and perform computation on each subset of partitioned data.

Preparing Sample Data

Let us create an Orders table in my sample database SQLShackDemo and insert records to write further queries.

1

2

3

4

5

6

7

8

9

10

Use SQLShackDemo

Go

CREATE TABLE [dbo].[Orders]

(

[orderid] INT,

[Orderdate] DATE,

[CustomerName] VARCHAR(100),

[Customercity] VARCHAR(100),

[Orderamount] MONEY

)

I use ApexSQL Generate to insert sample data into this article. Right click on the Orders table and Generate test data.

SQL PARTITION BY Clause overview (1)

It launches the ApexSQL Generate. I generated a script to insert data into the Orders table. Execute this script to insert 100 records in the Orders table.

1

2

3

4

5

6

7

USE [SQLShackDemo]

GO

INSERT [dbo].[Orders]VALUES (216090, CAST(N'1826-12-19' AS Date), N'Edward', N'Phoenix', 4713.8900)

GO

INSERT [dbo].[Orders]VALUES (508220, CAST(N'1826-12-09' AS Date), N'Aria', N'San Francisco', 9832.7200)

GO

Once we execute insert statements, we can see the data in the Orders table in the following image.

SQL PARTITION BY Clause overview (2)

We use SQL GROUP BY clause to group results by specified column and use aggregate functions such as Avg(), Min(), Max() to calculate required values.

Group By function syntax

1

2

3

4

SELECT expression, aggregate function ()

FROM tables

WHERE conditions

GROUP BY expression

Suppose we want to find the following values in the Orders table

  • Minimum order value in a city
  • Maximum order value in a city
  • Average order value in a city

Execute the following query with GROUP BY clause to calculate these values.

1

2

3

4

5

6

SELECT Customercity,

AVG(Orderamount) AS AvgOrderAmount,

MIN(OrderAmount) AS MinOrderAmount,

SUM(Orderamount) TotalOrderAmount

FROM [dbo].[Orders]

GROUP BY Customercity;

(Video) SQL Partition By Explained

In the following screenshot, we can see Average, Minimum and maximum values grouped by CustomerCity.

SQL PARTITION BY Clause overview (3)

Now, we want to add CustomerName and OrderAmount column as well in the output. Let’s add these columns in the select statement and execute the following code.

1

2

3

4

5

6

SELECT Customercity, CustomerName ,OrderAmount,

AVG(Orderamount) AS AvgOrderAmount,

MIN(OrderAmount) AS MinOrderAmount,

SUM(Orderamount) TotalOrderAmount

FROM [dbo].[Orders]

GROUP BY Customercity;

Once we execute this query, we get an error message. In the SQL GROUP BY clause, we can use a column in the select statement if it is used in Group by clause as well. It does not allow any column in the select clause that is not part of GROUP BY clause.

SQL PARTITION BY Clause overview (4)

We can use the SQL PARTITION BY clause to resolve this issue. Let us explore it further in the next section.

SQL PARTITION BY

We can use the SQL PARTITION BY clause with the OVER clause to specify the column on which we need to perform aggregation. In the previous example, we used Group By with CustomerCity column and calculated average, minimum and maximum values.

Let us rerun this scenario with the SQL PARTITION BY clause using the following query.

1

2

3

4

5

SELECT Customercity,

AVG(Orderamount) OVER(PARTITION BY Customercity) AS AvgOrderAmount,

MIN(OrderAmount) OVER(PARTITION BY Customercity) AS MinOrderAmount,

SUM(Orderamount) OVER(PARTITION BY Customercity) TotalOrderAmount

FROM [dbo].[Orders];

In the output, we get aggregated values similar to a GROUP By clause. You might notice a difference in output of the SQL PARTITION BY and GROUP BY clause output.

SQL PARTITION BY Clause overview (5)


Group By

SQL PARTITION BY

We get a limited number of records using the Group By clause

We get all records in a table using the PARTITION BY clause.

It gives one row per group in result set. For example, we get a result for each group of CustomerCity in the GROUP BY clause.

It gives aggregated columns with each record in the specified table.

We have 15 records in the Orders table. In the query output of SQL PARTITION BY, we also get 15 rows along with Min, Max and average values.

In the previous example, we get an error message if we try to add a column that is not a part of the GROUP BY clause.

We can add required columns in a select statement with the SQL PARTITION BY clause. Let us add CustomerName and OrderAmount columns and execute the following query.

1

2

3

4

5

6

7

SELECT Customercity,

CustomerName,

OrderAmount,

AVG(Orderamount) OVER(PARTITION BY Customercity) AS AvgOrderAmount,

MIN(OrderAmount) OVER(PARTITION BY Customercity) AS MinOrderAmount,

SUM(Orderamount) OVER(PARTITION BY Customercity) TotalOrderAmount

FROM [dbo].[Orders];

(Video) Intermediate SQL Tutorial | Partition By

We get CustomerName and OrderAmount column along with the output of the aggregated function. We also get all rows available in the Orders table.

SQL PARTITION BY Clause overview (6)

In the following screenshot, you can for CustomerCity Chicago, it performs aggregations (Avg, Min and Max) and gives values in respective columns.

SQL PARTITION BY Clause overview (7)

Similarly, we can use other aggregate functions such as count to find out total no of orders in a particular city with the SQL PARTITION BY clause.

1

2

3

4

5

6

7

8

SELECT Customercity,

CustomerName,

OrderAmount,

COUNT(OrderID) OVER(PARTITION BY Customercity) AS CountOfOrders,

AVG(Orderamount) OVER(PARTITION BY Customercity) AS AvgOrderAmount,

MIN(OrderAmount) OVER(PARTITION BY Customercity) AS MinOrderAmount,

SUM(Orderamount) OVER(PARTITION BY Customercity) TotalOrderAmount

FROM [dbo].[Orders];

We can see order counts for a particular city. For example, we have two orders from Austin city therefore; it shows value 2 in CountofOrders column.

SQL PARTITION BY Clause overview (8)

PARTITION BY clause with ROW_NUMBER()

We can use the SQL PARTITION BY clause with ROW_NUMBER() function to have a row number of each row. We define the following parameters to use ROW_NUMBER with the SQL PARTITION BY clause.

  • PARTITION BY column – In this example, we want to partition data on CustomerCity column
  • Order By: In the ORDER BY column, we define a column or condition that defines row number. In this example, we want to sort data on the OrderAmount column

1

2

3

4

5

6

7

8

9

10

SELECT Customercity,

CustomerName,

ROW_NUMBER() OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC) AS "Row Number",

OrderAmount,

COUNT(OrderID) OVER(PARTITION BY Customercity) AS CountOfOrders,

AVG(Orderamount) OVER(PARTITION BY Customercity) AS AvgOrderAmount,

MIN(OrderAmount) OVER(PARTITION BY Customercity) AS MinOrderAmount,

SUM(Orderamount) OVER(PARTITION BY Customercity) TotalOrderAmount

FROM [dbo].[Orders];

In the following screenshot, we get see for CustomerCity Chicago, we have Row number 1 for order with highest amount 7577.90. it provides row number with descending OrderAmount.

SQL PARTITION BY Clause overview (9)

PARTITION BY clause with Cumulative total value

Suppose we want to get a cumulative total for the orders in a partition. Cumulative total should be of the current row and the following row in the partition.

SQL PARTITION BY Clause overview (10)

For example, in the Chicago city, we have four orders.

(Video) Over clause in SQL Server

CustomerCity

CustomerName

Rank

OrderAmount

Cumulative Total Rows

Cumulative Total

Chicago

Marvin

1

7577.9

Rank 1 +2

14777.51

Chicago

Lawrence

2

7199.61

Rank 2+3

14047.21

Chicago

Alex

3

6847.66

Rank 3+4

8691.49

Chicago

Jerome

4

1843.83

Rank 4

1843.83

In the following query, we the specified ROWS clause to select the current row (using CURRENT ROW) and next row (using 1 FOLLOWING). It further calculates sum on those rows using sum(Orderamount) with a partition on CustomerCity ( using OVER(PARTITION BY Customercity ORDER BY OrderAmount DESC).

1

2

3

4

5

6

7

SELECT Customercity,

CustomerName,

OrderAmount,

ROW_NUMBER() OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC) AS "Row Number",

CONVERT(VARCHAR(20), SUM(orderamount) OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING), 1) AS CumulativeTotal,


SQL PARTITION BY Clause overview (11)

Similarly, we can calculate the cumulative average using the following query with the SQL PARTITION BY clause.

1

2

3

4

5

6

7

SELECT Customercity,

CustomerName,

OrderAmount,

ROW_NUMBER() OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC) AS "Row Number",

CONVERT(VARCHAR(20), AVG(orderamount) OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING), 1) AS CumulativeAVG


SQL PARTITION BY Clause overview (12)

ROWS UNBOUNDED PRECEDING with the PARTITION BY clause

We can use ROWS UNBOUNDED PRECEDING with the SQL PARTITION BY clause to select a row in a partition before the current row and the highest value row after current row.

In the following table, we can see for row 1; it does not have any row with a high value in this partition. Therefore, Cumulative average value is the same as of row 1 OrderAmount.

For Row2, It looks for current row value (7199.61) and highest value row 1(7577.9). It calculates the average for these two amounts.

For Row 3, it looks for current value (6847.66) and higher amount value than this value that is 7199.61 and 7577.90. It calculates the average of these and returns.


CustomerCity

CustomerName

Rank

OrderAmount

Cumulative Average Rows

Cumulative Average

Chicago

Marvin

1

7577.9

Rank 1

7577.90

Chicago

Lawrence

2

7199.61

Rank 1+2

7388.76

Chicago

Alex

3

6847.66

Rank 1+2+3

7208.39

Chicago

Jerome

4

1843.83

Rank 1+2+3+4

5867.25

Execute the following query to get this result with our sample data.

1

2

3

4

5

6

7

8

SELECT Customercity,

CustomerName,

OrderAmount,

ROW_NUMBER() OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC) AS "Row Number",

CONVERT(VARCHAR(20), AVG(orderamount) OVER(PARTITION BY Customercity

ORDER BY OrderAmount DESC ROWS UNBOUNDED PRECEDING), 1) AS CumulativeAvg

FROM [dbo].[Orders];


SQL PARTITION BY Clause overview (13)

Conclusion

In this article, we explored the SQL PARTIION BY clause and its comparison with GROUP BY clause. We also learned its usage with a few examples. I hope you find this article useful and feel free to ask any questions in the comments below

  • Author
  • Recent Posts

Rajendra Gupta

Hi! I am Rajendra Gupta, Database Specialist and Architect, helping organizations implement Microsoft SQL Server, Azure, Couchbase, AWS solutions fast and efficiently, fix related issues, and Performance Tuning with over 14 years of experience.

I am the author of the book "DP-300 Administering Relational Database on Microsoft Azure". I published more than 650 technical articles on MSSQLTips, SQLShack, Quest, CodingSight, and SeveralNines.

I am the creator of one of the biggest free online collections of articles on a single topic, with his 50-part series on SQL Server Always On Availability Groups.

Based on my contribution to the SQL Server community, I have been recognized as the prestigious Best Author of the Year continuously in 2019, 2020, and 2021 (2nd Rank) at SQLShack and the MSSQLTIPS champions award in 2020.

Personal Blog: https://www.dbblogger.com
I am always interested in new challenges so if you need consulting help, reach me at rajendra.gupta16@gmail.com

View all posts by Rajendra Gupta

Latest posts by Rajendra Gupta (see all)

  • Rename on-premises SQL Server database and Azure SQL database - October 18, 2022
  • SQL Commands to check current Date and Time (Timestamp) in SQL Server - October 7, 2022
  • Exploring Azure SQL Database Geo-Restore - September 2, 2022
(Video) What the PARTITION BY clause does in Analytic Functions Oracle SQL

Related posts:

  1. Descripción general de la cláusula PARTITION BY de SQL
  2. Overview of SQL RANK functions
  3. How to use Window functions in SQL Server
  4. An overview of the SQL Server Update Join
  5. SQL Order by Clause overview and examples

FAQs

What does partition by clause do in SQL? ›

The PARTITION BY clause divides a query's result set into partitions. The window function is operated on each partition separately and recalculate for each partition. You can specify one or more columns or expressions to partition the result set.

Does partitioning improve query performance? ›

So In a way partitioning distribute your table's data across the file system, so when query is run on a table only a fraction of data is processed which result in better performance.

What does ROW_NUMBER () over partition by do? ›

PARTITION BY

The ROW_NUMBER() method is then applied to each partition, which assigns a separate rank number to each partition. If the partition by clause is not specified, the ROW NUMBER function will treat the entire result as a single partition and rank it from top to bottom.

What does order by do in partition by SQL? ›

Partition By: This divides the rows or query result set into small partitions. Order By: This arranges the rows in ascending or descending order for the partition window. The default order is ascending. Row or Range: You can further limit the rows in a partition by specifying the start and endpoints.

Is partition by better than GROUP BY? ›

Therefore, in conclusion, the PARTITION BY retrieves all the records in the table, while the GROUP BY only returns a limited number. One more thing is that GROUP BY does not allow to add columns which are not parts of GROUP BY clause in select statement. However, with PARTITION BY clause, we can add required columns.

Does partition by order matter? ›

Reordering partitions won't affect performance, but it can break things. For example, the Windows Recovery partition is specified by partition number, and that partition number will be different if the sequence is different, which can break your ability to access the Windows Recovery environment.

Can data partitioning reduce performance? ›

In many large-scale solutions, data is divided into partitions that can be managed and accessed separately. Partitioning can improve scalability, reduce contention, and optimize performance.

How do I partition a large table? ›

Decide on the partition design: date range, numeric range, or list of values. Choose the column(s) on which to partition the table. Decide how many levels of partitions you want. For example, you can create a date range partition table by month and then subpartition the monthly partitions by sales region.

When should you partition a table? ›

When to Partition a Table
  1. Tables greater than 2GB should always be considered for partitioning.
  2. Tables containing historical data, in which new data is added into the newest partition. A typical example is a historical table where only the current month's data is updatable and the other 11 months are read-only.

Which is faster group by or partition by? ›

Group By with not be always be faster than Partition by... its more important to understand the semantics of how the work. - Group BY with hashout the keys and then apply distinct on it.. so If you have nested queries or Views then its a never ending story.

What is rank () over partition by? ›

The RANK() function is a window function that assigns a rank to each row in the partition of a result set. The rank of a row is determined by one plus the number of ranks that come before it. RANK() OVER ( PARTITION BY <expr1>[{,<expr2>...}] ORDER BY <expr1> [ASC|DESC], [{,<expr2>...}] )

What is difference between rank () ROW_NUMBER () and Dense_rank () in SQL? ›

row_number numbers the rows 1, 2, 3, etc by the columns in the ORDER BY clause, and if there are ties, it is arbitrary which rows that gets the same number. rank and dense_rank are similar to row_number , but when there are ties, they will give the same value to the tied values.

What is difference between GROUP BY and partition by? ›

1) GROUP BY clause is used to group data using single or multiple columns based on the requirements. 2) Used with aggregate functions like min, max, avg, sum, etc. 1) PARTITION BY clause is used to divide the result set into partitions and perform computation on each subset of partitioned data.

Can we use WHERE in partition by? ›

bhushan - we cannot use a where clause after the partition by is over. so, your first example is perfect like Frank's.

Can we use over without partition? ›

The default window frame depends on whether ORDER BY is omitted or not. There are two possible scenarios when ORDER BY is omitted: Using OVER () , omitting both PARTITION BY and ORDER BY . Using OVER (PARTITION BY …) , omitting only ORDER BY .

What is GROUP BY and partition by in SQL? ›

1) GROUP BY clause is used to group data using single or multiple columns based on the requirements. 2) Used with aggregate functions like min, max, avg, sum, etc. Partition By clause:- 1) PARTITION BY clause is used to divide the result set into partitions and perform computation on each subset of partitioned data.

What is rank () over partition by? ›

The RANK() function is a window function that assigns a rank to each row in the partition of a result set. The rank of a row is determined by one plus the number of ranks that come before it. RANK() OVER ( PARTITION BY <expr1>[{,<expr2>...}] ORDER BY <expr1> [ASC|DESC], [{,<expr2>...}] )

Can we use where in partition by? ›

bhushan - we cannot use a where clause after the partition by is over. so, your first example is perfect like Frank's.

Why do we use partition? ›

A PARTITION BY clause is used to partition rows of table into groups. It is useful when we have to perform a calculation on individual rows of a group using other rows of that group. It is always used inside OVER() clause. The partition formed by partition clause are also known as Window.

Videos

1. Window Functions-Part 1(Basics of Windowing,OVER() Clause,Partition By Clause)
(Stellar Smart Academy)
2. Learn GROUP BY vs PARTITION BY in SQL
(Anthony B. Smoak)
3. SQL Tutorial: PARTITION BY
(DataCamp)
4. Lesson 38 Over Partition by Clause in SQL
(Vijay Mulsaniya)
5. MSSQL - What is Table Partitioning Explained
(CodeCowboyOrg)
6. SQL Ranking Functions: Part 1 The Over Clause
(Database by Doug)
Top Articles
Latest Posts
Article information

Author: Allyn Kozey

Last Updated: 03/10/2023

Views: 5730

Rating: 4.2 / 5 (43 voted)

Reviews: 82% of readers found this page helpful

Author information

Name: Allyn Kozey

Birthday: 1993-12-21

Address: Suite 454 40343 Larson Union, Port Melia, TX 16164

Phone: +2456904400762

Job: Investor Administrator

Hobby: Sketching, Puzzles, Pet, Mountaineering, Skydiving, Dowsing, Sports

Introduction: My name is Allyn Kozey, I am a outstanding, colorful, adventurous, encouraging, zealous, tender, helpful person who loves writing and wants to share my knowledge and understanding with you.