Introduction to MySQL 8.0 Common Table Expressions (Part 1)

MySQL Common Table Expressions This blog is the first part of a two-articles series. In this article, I’m going to introduce the Common Table Expression (CTE), a new feature available on MySQL 8.0, as well as Percona Server for MySQL 8.

What is a Common Table Expression?

We can define a CTE as an alternative to a derived table. In a small way, CTE simplifies complex joins and subqueries, improving the readability of the queries. CTE is part of ANSI SQL 99 and was introduced in MySQL 8.0.1. The same feature is available even on Percona Server for MySQL 8.0.

The main reasons for using CTE are:

Better readability of the queries
Can be referenced multiple times in the same query
Improved performance
A valid alternative to a VIEW, if your user cannot create VIEWs
Easier chaining of multiple CTE
Possibility to create recursive queries: this can be really useful when dealing with hierarchical data

SELECT, UPDATE, and DELETE statements can reference the CTE.

Note: for the examples in this article, I’ll use the world database you can download from the MySQL site.

How to Create and Use a CTE

We have said that a CTE is like a derived table when using a subquery, but the declaration is moved before the main query. A new dedicated clause is needed: WITH.

Let’s start with a subquery with a derived table to find the most populated countries in Europe:

mysql> SELECT Name, Population

-> FROM (SELECT * FROM country WHERE continent='Europe') AS derived_t

-> ORDER BY Population DESC LIMIT 5;

+--------------------+------------+

| Name | Population |

+--------------------+------------+

| Russian Federation | 146934000 |

| Germany | 82164700 |

| United Kingdom | 59623400 |

| France | 59225700 |

| Italy | 57680000 |

+--------------------+------------+

Let’s rewrite it using the CTE instead:

mysql> WITH cte AS (SELECT * FROM country WHERE continent='Europe')

-> SELECT Name, Population

-> FROM cte

-> ORDER BY Population DESC LIMIT 5;

+--------------------+------------+

| Name | Population |

+--------------------+------------+

| Russian Federation | 146934000 |

| Germany | 82164700 |

| United Kingdom | 59623400 |

| France | 59225700 |

| Italy | 57680000 |

+--------------------+------------+

The syntax is quite simple. Before your query, using WITH, you can define the CTE or even multiple CTEs. After that, you can reference in the query all the CTEs as many times as you need. You can think about a CTE as a sort pre-materialized query available as a temporary table for the scope of your main query.

We can also specify the column names if a parenthesized list of names follows the CTE name:

mysql> WITH cte(eur_name, eur_population) AS (SELECT Name, Population FROM country WHERE continent='Europe')

-> SELECT eur_name, eur_population

-> FROM cte

-> ORDER BY eur_opulation DESC LIMIT 5;

+--------------------+----------------+

| eur_name | eur_population |

+--------------------+----------------+

| Russian Federation | 146934000 |

| Germany | 82164700 |

| United Kingdom | 59623400 |

| France | 59225700 |

| Italy | 57680000 |

+--------------------+----------------+

CTE can also be used as the data source for updating other tables as in the following examples:

# create a new table 
mysql> CREATE TABLE country_2020 ( Code char(3), Name char(52), Population_2020 int, PRIMARY KEY(Code) );
Query OK, 0 rows affected (0.10 sec)

# copy original data
mysql> INSERT INTO country_2020 SELECT Code, Name, Population FROM country;
Query OK, 239 rows affected (0.01 sec)
Records: 239  Duplicates: 0  Warnings: 0

# increase population of european countries by 10%
mysql> WITH cte(eur_code, eur_population) AS (SELECT Code, Population FROM country WHERE continent='Europe')  
    -> UPDATE country_2020, cte 
    -> SET Population_2020 = ROUND(eur_population*1.1) 
    -> WHERE Code=cte.eur_code;
Query OK, 46 rows affected (0.01 sec)
Rows matched: 46  Changed: 46  Warnings: 0

# delete from the new table all non-europian countries
mysql> WITH cte AS (SELECT Code FROM country WHERE continent <> 'Europe') 
    -> DELETE country_2020 
    -> FROM country_2020, cte 
    -> WHERE country_2020.Code=cte.Code;
Query OK, 193 rows affected (0.02 sec)

mysql> SELECT * FROM country_2020 ORDER BY Population_2020 DESC LIMIT 5;
+------+--------------------+-----------------+
| Code | Name               | Population_2020 |
+------+--------------------+-----------------+
| RUS  | Russian Federation |       161627400 |
| DEU  | Germany            |        90381170 |
| GBR  | United Kingdom     |        65585740 |
| FRA  | France             |        65148270 |
| ITA  | Italy              |        63448000 |
+------+--------------------+-----------------+

# create a new table

mysql> CREATE TABLE country_2020 ( Code char(3), Name char(52), Population_2020 int, PRIMARY KEY(Code) );

Query OK, 0 rows affected (0.10 sec)

# copy original data

mysql> INSERT INTO country_2020 SELECT Code, Name, Population FROM country;

Query OK, 239 rows affected (0.01 sec)

Records: 239 Duplicates: 0 Warnings: 0

# increase population of european countries by 10%

mysql> WITH cte(eur_code, eur_population) AS (SELECT Code, Population FROM country WHERE continent='Europe')

-> UPDATE country_2020, cte

-> SET Population_2020 = ROUND(eur_population*1.1)

-> WHERE Code=cte.eur_code;

Query OK, 46 rows affected (0.01 sec)

Rows matched: 46 Changed: 46 Warnings: 0

# delete from the new table all non-europian countries

mysql> WITH cte AS (SELECT Code FROM country WHERE continent <> 'Europe')

-> DELETE country_2020

-> FROM country_2020, cte

-> WHERE country_2020.Code=cte.Code;

Query OK, 193 rows affected (0.02 sec)

mysql> SELECT * FROM country_2020 ORDER BY Population_2020 DESC LIMIT 5;

+------+--------------------+-----------------+

| Code | Name | Population_2020 |

+------+--------------------+-----------------+

| RUS | Russian Federation | 161627400 |

| DEU | Germany | 90381170 |

| GBR | United Kingdom | 65585740 |

| FRA | France | 65148270 |

| ITA | Italy | 63448000 |

+------+--------------------+-----------------+

CTE can also be used for INSERT … SELECT queries like the following:

mysql> CREATE TABLE largest_countries (Code char(3), Name char(52), SurfaceArea decimal(10,2), PRIMARY KEY(Code) );
Query OK, 0 rows affected (0.08 sec)

mysql> INSERT INTO largest_countries
    -> WITH cte AS (SELECT Code, Name, SurfaceArea FROM country ORDER BY SurfaceArea DESC LIMIT 10)
    -> SELECT * FROM cte;
Query OK, 10 rows affected (0.02 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM largest_countries;
+------+--------------------+-------------+
| Code | Name               | SurfaceArea |
+------+--------------------+-------------+
| ARG  | Argentina          |  2780400.00 |
| ATA  | Antarctica         | 13120000.00 |
| AUS  | Australia          |  7741220.00 |
| BRA  | Brazil             |  8547403.00 |
| CAN  | Canada             |  9970610.00 |
| CHN  | China              |  9572900.00 |
| IND  | India              |  3287263.00 |
| KAZ  | Kazakstan          |  2724900.00 |
| RUS  | Russian Federation | 17075400.00 |
| USA  | United States      |  9363520.00 |
+------+--------------------+-------------+
10 rows in set (0.00 sec)

mysql> CREATE TABLE largest_countries (Code char(3), Name char(52), SurfaceArea decimal(10,2), PRIMARY KEY(Code) );

Query OK, 0 rows affected (0.08 sec)

mysql> INSERT INTO largest_countries

-> WITH cte AS (SELECT Code, Name, SurfaceArea FROM country ORDER BY SurfaceArea DESC LIMIT 10)

-> SELECT * FROM cte;

Query OK, 10 rows affected (0.02 sec)

Records: 10 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM largest_countries;

+------+--------------------+-------------+

| Code | Name | SurfaceArea |

+------+--------------------+-------------+

| ARG | Argentina | 2780400.00 |

| ATA | Antarctica | 13120000.00 |

| AUS | Australia | 7741220.00 |

| BRA | Brazil | 8547403.00 |

| CAN | Canada | 9970610.00 |

| CHN | China | 9572900.00 |

| IND | India | 3287263.00 |

| KAZ | Kazakstan | 2724900.00 |

| RUS | Russian Federation | 17075400.00 |

| USA | United States | 9363520.00 |

+------+--------------------+-------------+

10 rows in set (0.00 sec)

Let’s think about a CTE like a temporary table pre-calculated or materialized before the main query. Then you can reference that temporary table as many times you need in your query.

Also, you can create multiple CTEs, and all of them can be used in the main query. The syntax is like the following:

WITH cte1 AS (SELECT ... FROM ... WHERE ...),

cte2 AS (SELECT ... FROM ... WHERE ...)

SELECT ...

FROM table1, table1, cte1, cte2 ....

WHERE .....

Scope of CTE

A CTE can be used even in subqueries, but in such a case, be aware of the scope. The CTE exists for the scope of a single statement.

Consider the following valid queries:

WITH cte AS (SELECT Code FROM country WHERE Population<1000000)

SELECT * FROM city WHERE city.CountryCode IN

(SELECT Code FROM cte); # Scope: "cte" is visible to top SELECT

SELECT * FROM city WHERE city.CountryCode IN

(WITH cte AS (SELECT Code FROM country WHERE Population<1000000)

SELECT Code from cte); # Scope: "cte" is not visible to top SELECT

To avoid any trouble with the scope, the best way to use CTEs is by creating all of them at the beginning of the top query. This way, all CTEs can be used wherever you need, multiple times.

Chaining multiple CTEs

When creating multiple CTEs for a query, another interesting feature is the chaining. It is possible to define any CTE containing one or more references to previous CTEs in the chain.

The following example shows how the chaining can be used. We would like to find out the countries with the highest and lowest population density in the world. We create a chain of three CTEs. The last two contain a reference to the first one.

mysql> WITH density_by_country(country,density) AS

(SELECT Name, Population/SurfaceArea

FROM country

WHERE Population>0 and surfacearea>0),

max_density(country,maxdensity,label) AS

(SELECT country, density, 'max density'

FROM density_by_country

WHERE density=(SELECT MAX(density) FROM density_by_country)),

min_density(country,mindensity,label) AS

(SELECT country, density, 'min density'

FROM density_by_country

WHERE density=(SELECT MIN(density) FROM density_by_country))

SELECT * FROM max_density UNION ALL SELECT * FROM min_density;

+-----------+------------+-------------+

| country | maxdensity | label |

+-----------+------------+-------------+

| Macao | 26277.7778 | max density |

| Greenland | 0.0259 | min density |

+-----------+------------+-------------+

Let’s think now how you can rewrite the same query using derived tables instead. You would need to copy (several times) the definition of density_by_country. The final query would be really very large and probably less readable.

Use CTE Instead of VIEW

It could happen that your database user doesn’t have the right to create a VIEW. A CTE doesn’t require specific grants apart from the capability to read table and columns, the same for any regular SELECT.

Then you can use CTE instead of a VIEW. Apart from the grants options, a CTE has generally better performance than a VIEW, as we’ll show later.

Let’s create a view and run a query using it.

mysql> CREATE VIEW city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city GROUP BY countrycode);

mysql> SELECT name, city_pop_by_country.sum_population/country.population ratio 
       FROM country, city_pop_by_country 
       WHERE country.code=city_pop_by_country.countrycode 
         AND country.population > (SELECT 10*AVG(sum_population) FROM city_pop_by_country);
+--------------------+--------+
| name               | ratio  |
+--------------------+--------+
| Bangladesh         | 0.0664 |
| Brazil             | 0.5048 |
| China              | 0.1377 |
| Germany            | 0.3194 |
| Egypt              | 0.2933 |
| Ethiopia           | 0.0510 |
| Indonesia          | 0.1767 |
| India              | 0.1216 |
| Iran               | 0.3845 |
| Japan              | 0.6153 |
| Mexico             | 0.6043 |
| Nigeria            | 0.1557 |
| Pakistan           | 0.2016 |
| Philippines        | 0.4072 |
| Russian Federation | 0.4706 |
| Turkey             | 0.4254 |
| United States      | 0.2825 |
| Vietnam            | 0.1173 |
+--------------------+--------+

mysql> CREATE VIEW city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city GROUP BY countrycode);

mysql> SELECT name, city_pop_by_country.sum_population/country.population ratio

FROM country, city_pop_by_country

WHERE country.code=city_pop_by_country.countrycode

AND country.population > (SELECT 10*AVG(sum_population) FROM city_pop_by_country);

+--------------------+--------+

| name | ratio |

+--------------------+--------+

| Bangladesh | 0.0664 |

| Brazil | 0.5048 |

| China | 0.1377 |

| Germany | 0.3194 |

| Egypt | 0.2933 |

| Ethiopia | 0.0510 |

| Indonesia | 0.1767 |

| India | 0.1216 |

| Iran | 0.3845 |

| Japan | 0.6153 |

| Mexico | 0.6043 |

| Nigeria | 0.1557 |

| Pakistan | 0.2016 |

| Philippines | 0.4072 |

| Russian Federation | 0.4706 |

| Turkey | 0.4254 |

| United States | 0.2825 |

| Vietnam | 0.1173 |

+--------------------+--------+

# here is the explain of the query using the VIEW

mysql> EXPLAIN SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, city_pop_by_country WHERE country.code=city_pop_by_country.countrycode AND country.population > (SELECT 10*AVG(sum_population) FROM city_pop_by_country);

+----+-------------+------------+------------+-------+---------------+-------------+---------+--------------------+------+----------+-------------+

+----+-------------+------------+------------+-------+---------------+-------------+---------+--------------------+------+----------+-------------+

| 1 | PRIMARY | <derived3> | NULL | ref | <auto_key0> | <auto_key0> | 12 | world.country.Code | 16 | 100.00 | NULL |

+----+-------------+------------+------------+-------+---------------+-------------+---------+--------------------+------+----------+-------------+

Let’s try now to rewrite the same query using the CTE.

mysql> WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city GROUP BY countrycode)
       SELECT name, city_pop_by_country.sum_population/country.population ratio 
       FROM country, city_pop_by_country 
       WHERE country.code=city_pop_by_country.countrycode 
         AND country.population > (SELECT 10*AVG(sum_population) FROM city_pop_by_country);
+--------------------+--------+
| name               | ratio  |
+--------------------+--------+
| Bangladesh         | 0.0664 |
| Brazil             | 0.5048 |
| China              | 0.1377 |
| Germany            | 0.3194 |
| Egypt              | 0.2933 |
| Ethiopia           | 0.0510 |
| Indonesia          | 0.1767 |
| India              | 0.1216 |
| Iran               | 0.3845 |
| Japan              | 0.6153 |
| Mexico             | 0.6043 |
| Nigeria            | 0.1557 |
| Pakistan           | 0.2016 |
| Philippines        | 0.4072 |
| Russian Federation | 0.4706 |
| Turkey             | 0.4254 |
| United States      | 0.2825 |
| Vietnam            | 0.1173 |
+--------------------+--------+

mysql> WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city GROUP BY countrycode)

SELECT name, city_pop_by_country.sum_population/country.population ratio

FROM country, city_pop_by_country

WHERE country.code=city_pop_by_country.countrycode

AND country.population > (SELECT 10*AVG(sum_population) FROM city_pop_by_country);

+--------------------+--------+

| name | ratio |

+--------------------+--------+

| Bangladesh | 0.0664 |

| Brazil | 0.5048 |

| China | 0.1377 |

| Germany | 0.3194 |

| Egypt | 0.2933 |

| Ethiopia | 0.0510 |

| Indonesia | 0.1767 |

| India | 0.1216 |

| Iran | 0.3845 |

| Japan | 0.6153 |

| Mexico | 0.6043 |

| Nigeria | 0.1557 |

| Pakistan | 0.2016 |

| Philippines | 0.4072 |

| Russian Federation | 0.4706 |

| Turkey | 0.4254 |

| United States | 0.2825 |

| Vietnam | 0.1173 |

+--------------------+--------+

# here is the EXPLAIN of the query using CTE

mysql> EXPLAIN WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city GROUP BY countrycode)

SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, city_pop_by_country WHERE country.code=city_pop_by_country.countrycode AND country.population > (SELECT 10*AVG(sum_population) FROM city_pop_by_country);

+----+-------------+------------+------------+-------+---------------+-------------+---------+--------------------+------+----------+-------------+

+----+-------------+------------+------------+-------+---------------+-------------+---------+--------------------+------+----------+-------------+

| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 12 | world.country.Code | 16 | 100.00 | NULL |

+----+-------------+------------+------------+-------+---------------+-------------+---------+--------------------+------+----------+-------------+

Taking a look at the two execution plans, we can notice that with the query, there are DERIVED stages. The materialization of the VIEW is needed more times, anytime the view is referenced.

The sample database is small, but we can enable the profiling to compare the execution time of the two queries.

mysql> SET profiling=1;

# execute the queries several times

mysql> show profiles;
+----------+------------+-----------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                         |
+----------+------------+-----------------------------------------------------------------------------------------------+
...
...                                                                                                                                                                                                                                                                                             |
|       35 | 0.00971925 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ...    |
|       36 | 0.00963100 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ...    |
|       37 | 0.00976900 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ...    |
|       38 | 0.00963875 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ...    |
|       39 | 0.00971200 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ...    |
|       40 | 0.00546550 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |
|       41 | 0.00546975 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |
|       42 | 0.00550325 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |
|       43 | 0.00548000 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |
|       44 | 0.00545675 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |
+----------+------------+-----------------------------------------------------------------------------------------------+

mysql> SET profiling=1;

# execute the queries several times

mysql> show profiles;

+----------+------------+-----------------------------------------------------------------------------------------------+

| Query_ID | Duration | Query |

+----------+------------+-----------------------------------------------------------------------------------------------+

...

... |

| 35 | 0.00971925 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ... |

| 36 | 0.00963100 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ... |

| 37 | 0.00976900 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ... |

| 38 | 0.00963875 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ... |

| 39 | 0.00971200 | SELECT name, city_pop_by_country.sum_population/country.population ratio FROM country, ... |

| 40 | 0.00546550 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |

| 41 | 0.00546975 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |

| 42 | 0.00550325 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |

| 43 | 0.00548000 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |

| 44 | 0.00545675 | WITH city_pop_by_country AS (SELECT countrycode, SUM(population) sum_population FROM city ... |

+----------+------------+-----------------------------------------------------------------------------------------------+

The execution time of the query with the view is around 0.0097 seconds, while with CTE, it is around 0.0054 seconds. So, the CTE is faster. Using larger tables and having more references of the view means the difference between the queries will be more relevant.

Using CTE instead of a VIEW is more efficient because only a single materialization is needed, and the temporary table created can be referenced many times in the main query.

Conclusion

We have introduced the new Common Table Expression feature available on MySQL 8.0. Using CTE, you can simplify, in most cases, the readability of the queries, but you can also use CTE instead of VIEWs to improve the overall performance.

Using CTE, it is also possible to create a recursive query. In the next article of this series, we’ll see examples about how to use recursive CTE to generate series or to query hierarchies.

4 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Adnan

4 years ago

So can we take CTE as an alternative to View ? and what about memory efficiency in CTE ? is it efficient than View or standard sub queries ? if so, then it will give a big benefit on performance optimization.

Corrado Pandiani

Author

Reply to Adnan

4 years ago

Hi Adnan. Basically the execution plan of a view and a CTE is the same if the derived query is referenced only once. The CTE will be significantly better than a view if the derived query is referenced more times. The CTE materialize the query at the beginning anche than can be referenced as many times you need. For the view is different. Any sinle reference of a derived table requires materialization. This involves more memiry usage and worse performance. I hope this helps.

Øystein Grøvlen

4 years ago

Note that a CTE is not always materialized. Like a derived table, it may be merged into the main query if it does not contain aggregation, limit, union etc.

Corrado Pandiani

Author

Reply to Øystein Grøvlen

4 years ago

Hi, you’re right. It’s the same as the view. Even for CTE the optimizer can merge the CTE code or materialize it. The aim of the article was to show the usage and the benefit of CTE by the application perspective. Anyway it is worth considering to write anotehr article discussing about internals and the optimizer optimization when using CTE. Thanks

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Introduction to MySQL 8.0 Common Table Expressions (Part 1)

What is a Common Table Expression?

How to Create and Use a CTE

Scope of CTE

Chaining multiple CTEs

Use CTE Instead of VIEW

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

Valkey/Redis: Not-So-Good Practices

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Introduction to MySQL 8.0 Common Table Expressions (Part 1)

What is a Common Table Expression?

How to Create and Use a CTE

Scope of CTE

Chaining multiple CTEs

Use CTE Instead of VIEW

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

Valkey/Redis: Not-So-Good Practices

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation