Introducing Performance Improvement of Window Functions in PostgreSQL 15

When working with databases, there are always some projects oriented to performing analytics and reporting tasks over the information stored in the database. Usually, these tasks leverage window functions to do calculations “across a set of table rows that are somehow related to the current row,” as is described in the documentation.

There are several built-in windows functions available in PostgreSQL. In the latest release, PostgreSQL 15, some performance improvements were added for the rank(), row_number(), and count() functions. First, let’s review what these functions can do.

The window functions

As mentioned above, the window functions let us perform some calculations on a set of table rows related to the current one. The “set of table rows” is usually identified as a “partition” defined by a column or columns. As we read in the documentation, the named functions work for:

rank () → bigint

Returns the rank of the current row, with gaps; that is, the row_number of the first row in its peer group.

row_number () → bigint

Returns the current row number within its partition, counting from 1.

count ( * ) → bigint

Computes the number of input rows.

Consider the following table and data:

                                          Table "public.employee"
    Column      |           Type           | Collation | Nullable |                 Default
-----------------+--------------------------+-----------+----------+------------------------------------------
emp_id          | integer                  |           | not null | nextval('employee_emp_id_seq'::regclass)
emp_name        | character varying(20)    |           | not null |
emp_country     | character varying(35)    |           | not null |
emp_salary      | integer                  |           | not null |
date_of_joining | timestamp with time zone |           | not null | now()
Indexes:
    "employee_pkey" PRIMARY KEY, btree (emp_id)

Table "public.employee"

Column | Type | Collation | Nullable | Default

-----------------+--------------------------+-----------+----------+------------------------------------------

emp_id | integer | | not null | nextval('employee_emp_id_seq'::regclass)

emp_name | character varying(20) | | not null |

emp_country | character varying(35) | | not null |

emp_salary | integer | | not null |

date_of_joining | timestamp with time zone | | not null | now()

Indexes:

"employee_pkey" PRIMARY KEY, btree (emp_id)

demo=# SELECT * FROM employee ;
emp_id  |   emp_name   | emp_country | emp_salary |    date_of_joining
--------+--------------+-------------+------------+------------------------
      1 | KELIO        | Japon       |       2000 | 2021-10-26 00:00:00+00
      2 | JEAN-VINCENT | Canada      |       6500 | 2021-01-22 00:00:00+00
      3 | JUNO         | Japon       |       4000 | 2021-02-27 00:00:00+00
      4 | GUY-EMMANUEL | Salvador    |       2000 | 2020-07-27 00:00:00+00
      5 | WALI         | Japon       |       7000 | 2021-01-31 00:00:00+00
      6 | HENRI-PAUL   | Canada      |       4500 | 2021-08-19 00:00:00+00
      7 | MUHAMED      | France      |       5000 | 2021-07-20 00:00:00+00
      8 | MUHITTIN     | Madagascar  |       2500 | 2021-12-31 00:00:00+00
      9 | DEVLIN       | Madagascar  |       7000 | 2022-04-03 00:00:00+00
     10 | JOSUE        | Salvador    |       5500 | 2020-09-25 00:00:00+00
(10 rows)

demo=# SELECT * FROM employee ;

emp_id | emp_name | emp_country | emp_salary | date_of_joining

--------+--------------+-------------+------------+------------------------

1 | KELIO | Japon | 2000 | 2021-10-26 00:00:00+00

2 | JEAN-VINCENT | Canada | 6500 | 2021-01-22 00:00:00+00

3 | JUNO | Japon | 4000 | 2021-02-27 00:00:00+00

4 | GUY-EMMANUEL | Salvador | 2000 | 2020-07-27 00:00:00+00

5 | WALI | Japon | 7000 | 2021-01-31 00:00:00+00

6 | HENRI-PAUL | Canada | 4500 | 2021-08-19 00:00:00+00

7 | MUHAMED | France | 5000 | 2021-07-20 00:00:00+00

8 | MUHITTIN | Madagascar | 2500 | 2021-12-31 00:00:00+00

9 | DEVLIN | Madagascar | 7000 | 2022-04-03 00:00:00+00

10 | JOSUE | Salvador | 5500 | 2020-09-25 00:00:00+00

(10 rows)

rank()

We can use the rank() function to get the rank of employees (id) per country based on their salary. Look at the next example.

demo=# SELECT 
         emp_id, 
         emp_salary, 
         emp_country, 
         rank() OVER (PARTITION BY emp_country ORDER BY emp_salary DESC) 
       FROM employee;

emp_id  | emp_salary | emp_country | rank
--------+------------+-------------+------
      2 |       6500 | Canada      |    1
      6 |       4500 | Canada      |    2
      7 |       5000 | France      |    1
      5 |       7000 | Japon       |    1
      3 |       4000 | Japon       |    2
      1 |       2000 | Japon       |    3
      9 |       7000 | Madagascar  |    1
      8 |       2500 | Madagascar  |    2
     10 |       5500 | Salvador    |    1
      4 |       2000 | Salvador    |    2
(10 rows)

demo=# SELECT

emp_id,

emp_salary,

emp_country,

rank() OVER (PARTITION BY emp_country ORDER BY emp_salary DESC)

FROM employee;

emp_id | emp_salary | emp_country | rank

--------+------------+-------------+------

2 | 6500 | Canada | 1

6 | 4500 | Canada | 2

7 | 5000 | France | 1

5 | 7000 | Japon | 1

3 | 4000 | Japon | 2

1 | 2000 | Japon | 3

9 | 7000 | Madagascar | 1

8 | 2500 | Madagascar | 2

10 | 5500 | Salvador | 1

4 | 2000 | Salvador | 2

(10 rows)

row_number()

In the next example, the row_number() function gets a sorted list of employees’ names per country and their relative numeric position.

demo=# SELECT 
         emp_id, 
         emp_name, 
         emp_country, 
         row_number() OVER (PARTITION BY emp_country ORDER BY emp_name) 
       FROM employee;

emp_id  |   emp_name   | emp_country | row_number
--------+--------------+-------------+------------
      6 | HENRI-PAUL   | Canada      |          1
      2 | JEAN-VINCENT | Canada      |          2
      7 | MUHAMED      | France      |          1
      3 | JUNO         | Japon       |          1
      1 | KELIO        | Japon       |          2
      5 | WALI         | Japon       |          3
      9 | DEVLIN       | Madagascar  |          1
      8 | MUHITTIN     | Madagascar  |          2
      4 | GUY-EMMANUEL | Salvador    |          1
     10 | JOSUE        | Salvador    |          2
(10 rows)

demo=# SELECT

emp_id,

emp_name,

emp_country,

row_number() OVER (PARTITION BY emp_country ORDER BY emp_name)

FROM employee;

emp_id | emp_name | emp_country | row_number

--------+--------------+-------------+------------

6 | HENRI-PAUL | Canada | 1

2 | JEAN-VINCENT | Canada | 2

7 | MUHAMED | France | 1

3 | JUNO | Japon | 1

1 | KELIO | Japon | 2

5 | WALI | Japon | 3

9 | DEVLIN | Madagascar | 1

8 | MUHITTIN | Madagascar | 2

4 | GUY-EMMANUEL | Salvador | 1

10 | JOSUE | Salvador | 2

(10 rows)

count()

The count() function is an “old known” tool used by almost everyone with access to a SQL engine. This function is part of the aggregate functions list but can act as windows functions when the OVER clause follows the call. So we can use it to know how many employees share the same salary as a given employee name.

demo=# SELECT 
         emp_name, 
         emp_salary, 
         emp_country, 
         count(*) OVER (PARTITION BY emp_salary) 
       FROM employee;

  emp_name   | emp_salary | emp_country | count
--------------+------------+-------------+-------
KELIO        |       2000 | Japon       |     2
GUY-EMMANUEL |       2000 | Salvador    |     2
MUHITTIN     |       2500 | Madagascar  |     1
JUNO         |       4000 | Japon       |     1
HENRI-PAUL   |       4500 | Canada      |     1
MUHAMED      |       5000 | France      |     1
JOSUE        |       5500 | Salvador    |     1
JEAN-VINCENT |       6500 | Canada      |     1
WALI         |       7000 | Japon       |     2
DEVLIN       |       7000 | Madagascar  |     2
(10 rows)

demo=# SELECT

emp_name,

emp_salary,

emp_country,

count(*) OVER (PARTITION BY emp_salary)

FROM employee;

emp_name | emp_salary | emp_country | count

--------------+------------+-------------+-------

KELIO | 2000 | Japon | 2

GUY-EMMANUEL | 2000 | Salvador | 2

MUHITTIN | 2500 | Madagascar | 1

JUNO | 4000 | Japon | 1

HENRI-PAUL | 4500 | Canada | 1

MUHAMED | 5000 | France | 1

JOSUE | 5500 | Salvador | 1

JEAN-VINCENT | 6500 | Canada | 1

WALI | 7000 | Japon | 2

DEVLIN | 7000 | Madagascar | 2

(10 rows)

Window functions in PostgreSQL 15

Now that we have refreshed what the window functions are, let’s consider what the PostgreSQL 15 release notes say:

Improve the performance of window functions that use row_number(), rank(), and count() (David Rowley)

Accordingly, if we are users of the window functions and move from a previous version to version 15, we should see an improvement in the performance of our workload. Let’s test it.

Laboratory case

To test the performance of the window functions, I created three instances of PostgreSQL, (a) version 13, (b) version 14, and (c) version 15.

I used the same public.employee table used in the previous examples, and I loaded it with 10K rows. Then I executed the same queries we saw before for the window functions. I got the output from an EXPLAIN (ANALYZE) command which executes the query, and we can see the timing for the specific window function.

The EXPLAIN (ANALYZE) output was the same for each version of PostgreSQL.

rank()

PG15

                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
WindowAgg  (cost=1288.49..1488.49 rows=10000 width=25) (actual time=11.946..18.732 rows=10000 loops=1)
  ->  Sort  (cost=1288.49..1313.49 rows=10000 width=17) (actual time=11.928..12.803 rows=10000 loops=1)
        Sort Key: emp_country, emp_salary DESC
        Sort Method: quicksort  Memory: 980kB
        ->  Seq Scan on employee  (cost=0.00..180.00 rows=10000 width=17) (actual time=0.008..2.402 rows=10000 loops=1)
Planning Time: 0.143 ms
Execution Time: 19.268 ms
(7 rows)

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------

WindowAgg (cost=1288.49..1488.49 rows=10000 width=25) (actual time=11.946..18.732 rows=10000 loops=1)

-> Sort (cost=1288.49..1313.49 rows=10000 width=17) (actual time=11.928..12.803 rows=10000 loops=1)

Sort Key: emp_country, emp_salary DESC

Sort Method: quicksort Memory: 980kB

-> Seq Scan on employee (cost=0.00..180.00 rows=10000 width=17) (actual time=0.008..2.402 rows=10000 loops=1)

Planning Time: 0.143 ms

Execution Time: 19.268 ms

(7 rows)

PG14

                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
WindowAgg  (cost=844.39..1044.39 rows=10000 width=25) (actual time=12.585..20.921 rows=10000 loops=1)
  ->  Sort  (cost=844.39..869.39 rows=10000 width=17) (actual time=12.560..13.545 rows=10000 loops=1)
        Sort Key: emp_country, emp_salary DESC
        Sort Method: quicksort  Memory: 1020kB
        ->  Seq Scan on employee  (cost=0.00..180.00 rows=10000 width=17) (actual time=0.011..1.741 rows=10000 loops=1)
Planning Time: 0.449 ms
Execution Time: 21.407 ms
(7 rows)

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------

WindowAgg (cost=844.39..1044.39 rows=10000 width=25) (actual time=12.585..20.921 rows=10000 loops=1)

-> Sort (cost=844.39..869.39 rows=10000 width=17) (actual time=12.560..13.545 rows=10000 loops=1)

Sort Key: emp_country, emp_salary DESC

Sort Method: quicksort Memory: 1020kB

-> Seq Scan on employee (cost=0.00..180.00 rows=10000 width=17) (actual time=0.011..1.741 rows=10000 loops=1)

Planning Time: 0.449 ms

Execution Time: 21.407 ms

(7 rows)

PG13

                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
WindowAgg  (cost=844.39..1044.39 rows=10000 width=25) (actual time=18.949..28.619 rows=10000 loops=1)
  ->  Sort  (cost=844.39..869.39 rows=10000 width=17) (actual time=18.896..19.998 rows=10000 loops=1)
        Sort Key: emp_country, emp_salary DESC
        Sort Method: quicksort  Memory: 1020kB
        ->  Seq Scan on employee  (cost=0.00..180.00 rows=10000 width=17) (actual time=0.011..1.228 rows=10000 loops=1)
Planning Time: 0.460 ms
Execution Time: 29.111 ms
(7 rows)

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------

WindowAgg (cost=844.39..1044.39 rows=10000 width=25) (actual time=18.949..28.619 rows=10000 loops=1)

-> Sort (cost=844.39..869.39 rows=10000 width=17) (actual time=18.896..19.998 rows=10000 loops=1)

Sort Key: emp_country, emp_salary DESC

Sort Method: quicksort Memory: 1020kB

-> Seq Scan on employee (cost=0.00..180.00 rows=10000 width=17) (actual time=0.011..1.228 rows=10000 loops=1)

Planning Time: 0.460 ms

Execution Time: 29.111 ms

(7 rows)

We can easily see from the WindowAgg node that the total time was smaller in the PG15 than in the other two. The performance improvement is clear here.

To verify this is consistent, I got the Total Time for the WindowAgg node from 500 executions and plotted the next graph.

postgresql total time

We can see the timing from the PG15 version is better than the other versions. Also, I added a trending line. We see the PG13 performed the “worst,” and even when the PG14 showed a better trend, the PG15 was the best.

I did the same exercise for the row_number() and count() functions.

row_number()

postgresql row number

count()

I also got the results from 500 executions for these last two functions. In both cases, the difference was smaller than the rank() function, but the PG15 still showed better results, as seen in the trend line.

The performance improvement for the rank(), row_number(), and count() window function introduced in the new PostgreSQL 15 will let all those analytic and reporting projects process their data faster than in previous versions. As always, every environment has its characteristics and challenges, but as we saw in these quick tests, the new version delivers the improvement just out of the box.

0 Comments

Inline Feedbacks

View all comments

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Introducing Performance Improvement of Window Functions in PostgreSQL 15

The window functions

rank()

row_number()

count()

Window functions in PostgreSQL 15

Laboratory case

Related

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Valkey/Redis: Not-So-Good Practices

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Introducing Performance Improvement of Window Functions in PostgreSQL 15

The window functions

rank()

row_number()

count()

Window functions in PostgreSQL 15

Laboratory case

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Valkey/Redis: Not-So-Good Practices

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation