Abstract:
By diving into the details of our case study, we will explain how incorrect table statistics can lead the optimizer to choose a suboptimal execution plan. We will also cover how MySQL calculates table statistics and how to correct them to prevent the problem from recurring.
Case study: Incorrect table statistics lead the optimizer to choose a poor execution plan.
A customer reported a drastic performance degradation of a query while there were no code changes and no configuration changes made. The data in this article has been edited for brevity and modified to mitigate the exposure of confidential information. The case has also been approved for publication by the customer.
We obtained the query execution plan and got the results shown below (execution plan #1):
mysql> explain
    -> SELECT count(con.id) ,
    ->        MAX(DAYNAME(con.date)) ,
    ->        now() ,
    ->        pcz.type,
    ->        pcz.c_c
    -> FROM con AS con
    -> join orders o on con.o_id = o.id
    -> JOIN p_c_z AS pcz ON o.d_p_c_z_id = pcz.id
    -> left join c c on con.c_id = c.id
    -> WHERE con.date = current_date() and pcz.type = "T_D"
    -> GROUP BY con.date, pcz.c_c, pcz.type;
+----+-------------+-------+------------+--------+-------------------+----------+---------+-----------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys     | key      | key_len | ref             | rows | filtered | Extra                                        |
+----+-------------+-------+------------+--------+-------------------+----------+---------+-----------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | pcz   | NULL       | ALL    | PRIMARY           | NULL     | NULL    | NULL            |  194 |    10.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | NULL       | ref    | PRIMARY,dpcz_FK   | dpcz_FK  | 9       | custom.pcz.id   | 1642 |   100.00 | Using index                                  |
|  1 | SIMPLE      | con   | NULL       | ref    | FK_order,IDX_date | FK_order | 8       | custom.o.id     |    1 |     4.23 | Using where                                  |
|  1 | SIMPLE      | c     | NULL       | eq_ref | PRIMARY           | PRIMARY  | 8       | custom.con.c_id |    1 |   100.00 | Using index                                  |
+----+-------------+-------+------------+--------+-------------------+----------+---------+-----------------+------+----------+----------------------------------------------+
The estimated number of rows to be examined = (194 * 10%) * 1642 * (1 * 4.23%) ≈ 1347.
Looking more closely at the query, we see the condition con.date = current_date(). This condition looks like a much better filter, so why did the MySQL optimizer skip the index on it? Let's look at the execution plan when we force the use of the index on the con.date field. The EXPLAIN output (execution plan #2) is:
mysql> explain
    -> SELECT count(con.id) ,
    ->        MAX(DAYNAME(con.date)) ,
    ->        now() ,
    ->        pcz.type,
    ->        pcz.c_c
    -> FROM con AS con USE INDEX(IDX_date)
    -> join orders o on con.o_id = o.id
    -> JOIN p_c_z AS pcz ON o.d_p_c_z_id = pcz.id
    -> left join c c on con.c_id = c.id
    -> WHERE con.date = current_date() and pcz.type = "T_D"
    -> GROUP BY con.date, pcz.c_c, pcz.type;
+----+-------------+-------+------------+--------+-----------------+----------+---------+---------------------+--------+----------+---------------------------------+
| id | select_type | table | partitions | type   | possible_keys   | key      | key_len | ref                 | rows   | filtered | Extra                           |
+----+-------------+-------+------------+--------+-----------------+----------+---------+---------------------+--------+----------+---------------------------------+
|  1 | SIMPLE      | con   | NULL       | ref    | IDX_date        | IDX_date | 3       | const               | 110446 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | c     | NULL       | eq_ref | PRIMARY         | PRIMARY  | 8       | custom.con.c_id     |      1 |   100.00 | Using index                     |
|  1 | SIMPLE      | o     | NULL       | eq_ref | PRIMARY,dpcz_FK | PRIMARY  | 8       | custom.con.o_id     |      1 |   100.00 | Using where                     |
|  1 | SIMPLE      | pcz   | NULL       | eq_ref | PRIMARY         | PRIMARY  | 8       | custom.o.d_p_c_z_id |      1 |    10.00 | Using where                     |
+----+-------------+-------+------------+--------+-----------------+----------+---------+---------------------+--------+----------+---------------------------------+
The estimated number of rows to be examined = 110446 * (1 * 10%) ≈ 11045 rows.
Based on these estimates, 1347 is around one-tenth of 11045, so it is reasonable for MySQL to choose execution plan #1: on paper, it appears to be the cheaper plan.
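A quick way to sanity-check these numbers is to multiply out the rows and filtered columns ourselves. The sketch below (plain Python, with the figures copied from the two EXPLAIN outputs above) reproduces both estimates:

```python
# Rough model of the optimizer's "rows to examine" estimate: the product
# of rows * filtered (as a fraction) across the tables in join order.
def estimated_rows_examined(plan):
    est = 1.0
    for rows, filtered_pct in plan:
        est *= rows * filtered_pct / 100.0
    return round(est)

# (rows, filtered) per table, taken from the EXPLAIN outputs above:
plan1 = [(194, 10.00), (1642, 100.00), (1, 4.23), (1, 100.00)]    # pcz, o, con, c
plan2 = [(110446, 100.00), (1, 100.00), (1, 100.00), (1, 10.00)]  # con, c, o, pcz

print(estimated_rows_examined(plan1))  # 1347
print(estimated_rows_examined(plan2))  # 11045
```

This is only the row-count side of the comparison, of course; the real optimizer weighs full costs, but the ordering of the two plans comes out the same.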
However, when we compared the expected response times with the actual results, something was clearly off. The second execution plan returned results within the expected time (the time observed before the performance degradation), while execution plan #1 far exceeded it.
Taking a closer look at the structure of the table orders and at execution plan #1, we found that the table pcz actually contains 194 rows, and that each lookup on the index orders.dpcz_FK (the index backing the FOREIGN KEY constraint orders_ibfk_10 shown below) is expected to return 1642 rows. That implies the table orders should hold about 194 * 1642 = 318548 rows, yet it actually holds 32508150 rows, more than 100 times the implied amount.
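This mismatch is easy to quantify with a back-of-the-envelope check (the figures come from the plan and the innodb_table_stats output):

```python
# If each of the 194 pcz rows matches ~1642 orders rows via dpcz_FK,
# the orders table should hold roughly this many rows:
rows_per_key = 1642          # "rows" estimate for orders in plan #1
distinct_keys = 194          # rows in pcz
implied_rows = rows_per_key * distinct_keys

actual_rows = 32_508_150     # n_rows from mysql.innodb_table_stats

print(implied_rows)                # 318548
print(actual_rows / implied_rows)  # ~102, i.e. over 100x off
```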
CREATE TABLE `orders` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  ...
  `d_p_c_z_id` bigint(20) DEFAULT NULL,
  ...,
  PRIMARY KEY (`id`),
  ...
  KEY `dpcz_FK` (`d_p_c_z_id`),
  ...
  CONSTRAINT `orders_ibfk_10` FOREIGN KEY (`d_p_c_z_id`) REFERENCES `p_c_z` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  ...
) ENGINE=InnoDB ....

mysql> select * from mysql.innodb_table_stats where database_name='custom' and table_name='orders';
+---------------+------------+---------------------+----------+----------------------+--------------------------+
| database_name | table_name | last_update         | n_rows   | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+----------+----------------------+--------------------------+
| custom        | orders     | 2022-03-03 21:58:18 | 32508150 | 349120               | 697618                   |
+---------------+------------+---------------------+----------+----------------------+--------------------------+
As such, we suspected that the table statistics of orders.dpcz_FK are not accurate. We verified it by running the test below:
mysql> select * from mysql.innodb_index_stats where database_name='custom' and table_name='orders' and index_name='dpcz_FK';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| custom        | orders     | dpcz_FK    | 2022-02-28 12:35:30 | n_diff_pfx01 | 19498      | 50          | d_p_c_z_id                        |
| custom        | orders     | dpcz_FK    | 2022-02-28 12:35:30 | n_diff_pfx02 | 32283087   | 128         | d_p_c_z_id,id                     |
| custom        | orders     | dpcz_FK    | 2022-02-28 12:35:30 | n_leaf_pages | 55653      | NULL        | Number of leaf pages in the index |
| custom        | orders     | dpcz_FK    | 2022-02-28 12:35:30 | size         | 63864      | NULL        | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+

mysql> select count(distinct d_p_c_z_id) from orders;
+----------------------------+
| count(distinct d_p_c_z_id) |
+----------------------------+
|                        195 |
+----------------------------+
Bingo! The actual cardinality of the column d_p_c_z_id (the column the index dpcz_FK is built on) in the table orders is 195, while in the statistics table mysql.innodb_index_stats the stat_value for the index dpcz_FK is 19498. Since the stat_value of an index is supposed to be the cardinality of its column, the stored value is off by roughly a factor of 100.
Plugging in the correct stat_value of 195 for the index dpcz_FK, the number of rows returned in line 2 of execution plan #1 becomes 32508150 / 195 ≈ 166708, and the estimated rows to be examined becomes (194 * 10%) * 166708 * (1 * 4.23%) ≈ 136804. Since this value is over 10 times larger than 11045, the estimate for execution plan #2, MySQL would now correctly select execution plan #2 without us having to force the index.
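The corrected arithmetic can be checked in the same way as before (plain Python, reusing the figures above):

```python
table_rows = 32_508_150
true_cardinality = 195                   # COUNT(DISTINCT d_p_c_z_id)

# Rows a single dpcz_FK lookup should really return:
rows_per_key = table_rows // true_cardinality
print(rows_per_key)                      # 166708

# Corrected estimate for plan #1: (194 * 10%) * rows_per_key * (1 * 4.23%)
corrected_estimate = (194 * 0.10) * rows_per_key * (1 * 0.0423)
print(round(corrected_estimate))         # 136804
print(corrected_estimate > 10 * 11045)   # True: plan #2 now wins
```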
But why did MySQL calculate the table statistics incorrectly, and how can we fix it?
To answer that question, we first need to understand how MySQL calculates table statistics and which parameters control the process; from there, the path to a solution follows naturally.
How InnoDB Calculates the Table Statistics
Table statistics can be collected explicitly or automatically. With innodb_stats_auto_recalc enabled (the default), InnoDB automatically recalculates persistent statistics after the data in a table changes substantially, specifically, when more than 10% of the rows in the table have been changed. Alternatively, we can run ANALYZE TABLE to recalculate statistics explicitly.
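The auto-recalc trigger can be pictured as a simple counter. The toy model below is not the actual InnoDB implementation, just an illustration of the 10% threshold:

```python
# Toy model of innodb_stats_auto_recalc: count modified rows and
# recalculate statistics once more than 10% of the table has changed.
class StatsTracker:
    def __init__(self, table_rows):
        self.table_rows = table_rows
        self.changed_rows = 0
        self.recalc_count = 0

    def on_rows_changed(self, n):
        self.changed_rows += n
        if self.changed_rows > self.table_rows // 10:
            self.recalc_count += 1    # stats recalculated in the background
            self.changed_rows = 0     # counter resets after the recalc

t = StatsTracker(table_rows=1000)
t.on_rows_changed(50)    # 5% changed: no recalculation yet
t.on_rows_changed(60)    # cumulative 11% changed: triggers a recalculation
print(t.recalc_count)    # 1
```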
Under the hood, InnoDB uses a sampling technique known as a random dive, which samples random pages from each index on a table to estimate the cardinality of the index. The parameter innodb_stats_persistent_sample_pages controls the number of sampled pages.
As the code and its comments make clear, the sampling is not fully random: the sampled pages are selected by a deterministic algorithm. Ultimately, the total number of different key values, i.e., the stat_value of the index, is calculated by the formula stat_value = N * R * N_DIFF_AVG_LEAF, where:
- N: the number of leaf pages in the index
- R: the ratio of the number of different key values on level LA to the total number of records on level LA
- N_DIFF_AVG_LEAF: the average number of different key values found in the A sampled leaf pages (A = innodb_stats_persistent_sample_pages)
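The formula itself is straightforward arithmetic. Here is a sketch with made-up inputs (in reality, N, R, and N_DIFF_AVG_LEAF come from InnoDB's sampled pages, not from user code):

```python
# stat_value = N * R * N_DIFF_AVG_LEAF, per the description above.
def estimate_cardinality(n_leaf_pages, r, n_diff_avg_leaf):
    """n_leaf_pages: N; r: distinct/total ratio on level LA;
    n_diff_avg_leaf: avg distinct values per sampled leaf page."""
    return round(n_leaf_pages * r * n_diff_avg_leaf)

# Hypothetical index: 1000 leaf pages, 5% of level-LA records are
# distinct, and each sampled leaf page holds ~2 distinct values:
print(estimate_cardinality(1000, 0.05, 2))   # 100
```

Note that N and R are both byproducts of the physical tree layout, which is why fragmentation can skew the result.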
The details of the sampling algorithm can be found in the MySQL source code, in storage/innobase/dict/dict0stats.cc.
With the above understanding, we can see that when the index of a table becomes fragmented, both the number of leaf pages (N) and the ratio of distinct key values to total records on level LA (R) become increasingly inaccurate, so the calculated stat_value may be incorrect. Once that happens, explicitly recalculating the statistics (manually running ANALYZE TABLE) will not produce a correct stat_value unless the parameter innodb_stats_persistent_sample_pages is increased or the index is rebuilt.
Solution: How can we correct the table statistics and prevent it from happening again?
From the sampling algorithm discussed above, we now know there are only two factors that impact the calculation: the parameter innodb_stats_persistent_sample_pages (the A in the algorithm) and how the index is physically organized.
To allow InnoDB to compute correct table statistics, we have to either increase innodb_stats_persistent_sample_pages or rebuild the index. The most direct way to rebuild the index is to rebuild the whole table, for example by executing a no-op ALTER against it.
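In SQL terms, the two paths come down to a handful of statements. The helper below is just a convenience sketch (the table name and sample-page count are illustrative) that assembles them:

```python
def remediation_sql(table, sample_pages=None):
    """Build the statements for the two fixes discussed above: either
    raise the sampling rate and re-analyze, or rebuild the table."""
    stmts = []
    if sample_pages is not None:
        # Path 1: sample more pages, then recalculate the statistics.
        stmts.append(
            f"SET GLOBAL innodb_stats_persistent_sample_pages = {sample_pages};")
        stmts.append(f"ANALYZE TABLE {table};")
    else:
        # Path 2: rebuild (defragment) the table with a no-op ALTER,
        # which also forces a full statistics recalculation.
        stmts.append(f"ALTER TABLE {table} ENGINE=InnoDB;")
    return stmts

print(remediation_sql("orders", sample_pages=512))
print(remediation_sql("orders"))
```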
Let’s take a look at the following three examples:
- ANALYZE TABLE without rebuilding, keeping innodb_stats_persistent_sample_pages as it is (128): the stat_value only changed slightly, to 19582, close to the original incorrect 19498 and still far off. The number of leaf pages in the index changed slightly from 55653 to 55891, and the number of pages in the index also changed slightly, from 63864 to 64248:

mysql> show variables like 'innodb_stats_persistent_sample_pages';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| innodb_stats_persistent_sample_pages | 128   |
+--------------------------------------+-------+

mysql> analyze table orders;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| custom.orders | analyze | status   | OK       |
+---------------+---------+----------+----------+

mysql> select * from mysql.innodb_index_stats where database_name='custom' and table_name='orders' and index_name='dpcz_FK';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| custom        | orders     | dpcz_FK    | 2022-03-03 21:58:18 | n_diff_pfx01 | 19582      | 50          | d_p_c_z_id                        |
| custom        | orders     | dpcz_FK    | 2022-03-03 21:58:18 | n_diff_pfx02 | 32425512   | 128         | d_p_c_z_id,id                     |
| custom        | orders     | dpcz_FK    | 2022-03-03 21:58:18 | n_leaf_pages | 55891      | NULL        | Number of leaf pages in the index |
| custom        | orders     | dpcz_FK    | 2022-03-03 21:58:18 | size         | 64248      | NULL        | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
- ANALYZE TABLE without rebuilding, but with innodb_stats_persistent_sample_pages increased from 128 to 512: the stat_value became 192, very close to the real cardinality of 195. There was a big change in the number of leaf pages in the index, from 55653 to 44188, and the number of pages in the index also changed drastically, from 63864 to 50304:

mysql> show variables like '%persistent_sample%';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| innodb_stats_persistent_sample_pages | 512   |
+--------------------------------------+-------+

mysql> analyze table orders;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| custom.orders | analyze | status   | OK       |
+---------------+---------+----------+----------+

mysql> select * from mysql.innodb_index_stats where database_name='custom' and table_name='orders' and index_name='dpcz_FK';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| custom        | orders     | dpcz_FK    | 2022-03-09 06:54:29 | n_diff_pfx01 | 192        | 179         | d_p_c_z_id                        |
| custom        | orders     | dpcz_FK    | 2022-03-09 06:54:29 | n_diff_pfx02 | 31751321   | 512         | d_p_c_z_id,id                     |
| custom        | orders     | dpcz_FK    | 2022-03-09 06:54:29 | n_leaf_pages | 44188      | NULL        | Number of leaf pages in the index |
| custom        | orders     | dpcz_FK    | 2022-03-09 06:54:29 | size         | 50304      | NULL        | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
- Rebuilding the table, keeping innodb_stats_persistent_sample_pages at 128: this also produced a much more accurate stat_value of 187, close to the real cardinality of 195. The number of leaf pages in the index changed substantially, from 55653 to 43733, and the number of pages in the index also changed, from 63864 to 50111:
mysql> show variables like 'innodb_stats_persistent_sample_pages';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| innodb_stats_persistent_sample_pages | 128   |
+--------------------------------------+-------+

mysql> alter table orders engine=innodb;
Query OK, 0 rows affected (11 min 16.37 sec)

mysql> select * from mysql.innodb_index_stats where database_name='custom' and table_name='orders' and index_name='dpcz_FK';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| custom        | orders     | dpcz_FK    | 2022-03-07 18:44:43 | n_diff_pfx01 | 187        | 128         | d_p_c_z_id                        |
| custom        | orders     | dpcz_FK    | 2022-03-07 18:44:43 | n_diff_pfx02 | 31531493   | 128         | d_p_c_z_id,id                     |
| custom        | orders     | dpcz_FK    | 2022-03-07 18:44:43 | n_leaf_pages | 43733      | NULL        | Number of leaf pages in the index |
| custom        | orders     | dpcz_FK    | 2022-03-07 18:44:43 | size         | 50111      | NULL        | Number of pages in the index      |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
After the table statistics are corrected, the MySQL optimizer chooses the correct execution plan as well:
mysql> explain SELECT count(con.id) , MAX(DAYNAME(con.date)) , now() , pcz.type, pcz.c_c FROM con AS con join orders o on con.order_id = o.id JOIN p_c_z AS pcz ON o.d_p_c_z_id = pcz.id left join c c on con.c_id = c.id WHERE con.date = current_date() and pcz.type = "T_D" GROUP BY con.date, pcz.c_c, pcz.type;
+----+-------------+-------+------------+--------+-------------------+----------+---------+---------------------+------+----------+---------------------------------+
| id | select_type | table | partitions | type   | possible_keys     | key      | key_len | ref                 | rows | filtered | Extra                           |
+----+-------------+-------+------------+--------+-------------------+----------+---------+---------------------+------+----------+---------------------------------+
|  1 | SIMPLE      | con   | NULL       | ref    | FK_order,IDX_date | IDX_date | 3       | const               | 3074 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | c     | NULL       | eq_ref | PRIMARY           | PRIMARY  | 8       | custom.con.c_id     |    1 |   100.00 | Using index                     |
|  1 | SIMPLE      | o     | NULL       | eq_ref | PRIMARY,dpcz_FK   | PRIMARY  | 8       | custom.con.order_id |    1 |   100.00 | Using where                     |
|  1 | SIMPLE      | pcz   | NULL       | eq_ref | PRIMARY           | PRIMARY  | 8       | custom.o.d_p_c_z_id |    1 |    10.00 | Using where                     |
+----+-------------+-------+------------+--------+-------------------+----------+---------+---------------------+------+----------+---------------------------------+
4 rows in set, 1 warning (0.01 sec)
Conclusion
The MySQL optimizer depends on the accuracy of table statistics to select the optimal execution plan. We can control the accuracy of the statistics by tuning the parameter innodb_stats_persistent_sample_pages. We can also force a full, accurate recalculation of the statistics by rebuilding the table, which defragments the indexes at the same time. To rebuild the table, we can run a no-op ALTER against it directly or use pt-online-schema-change to achieve the same effect.
Is there a tool to automatically determine which stats are off?
I realize that doing a count() on a big table could be expensive, but maybe it could be done during off-peak hours or on a replica.
Hey Ivan, AFAIK there is no such tool to automatically determine which stats are off. However, a count(distinct ...) on an indexed column only reads the index, so it usually shouldn't take too long.
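A rough sketch of the comparison logic for such a check (the two inputs would come from mysql.innodb_index_stats and a COUNT(DISTINCT ...) run off-peak or on a replica; the tolerance is an arbitrary choice):

```python
def stats_drift(stat_value, actual_cardinality, tolerance=2.0):
    """Flag an index whose stored stat_value differs from the actual
    cardinality by more than `tolerance` times in either direction."""
    if actual_cardinality == 0:
        return stat_value != 0
    ratio = stat_value / actual_cardinality
    return ratio > tolerance or ratio < 1.0 / tolerance

print(stats_drift(19498, 195))  # True: the dpcz_FK case above, ~100x off
print(stats_drift(192, 195))    # False: within tolerance after the fix
```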
What does level LA mean?
Hello ss,
As mentioned, the details can be found in the code:
https://github.com/mysql/mysql-server/blob/6846e6b2f72931991cc9fd589dc9946ea2ab58c9/storage/innobase/dict/dict0stats.cc
“Definition: N-prefix-boring record is a record on a non-leaf page that equals
the next (to the right, cross page boundaries, skipping the supremum and
infimum) record on the same level when looking at the first n-prefix columns.
The last (user) record on a level is not boring (it does not match the
non-existent user record to the right). We call the records boring because all
the records on the page below a boring record are equal to that boring record.
We avoid diving below boring records when searching for a leaf page to
estimate the number of distinct records because we know that such a leaf
page will have number of distinct records == 1.
For each n-prefix: start from the root level and full scan subsequent lower
levels until a level that contains at least A*10 distinct records is found.
Lets call this level LA.”
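In other words, LA is the shallowest level, scanning down from the root, that contains at least A*10 distinct records (A being innodb_stats_persistent_sample_pages). A toy illustration of that selection, with made-up per-level distinct counts:

```python
def find_level_LA(distinct_per_level, sample_pages_A):
    """Return the index of the first level, from the root down,
    holding at least A*10 distinct records."""
    threshold = sample_pages_A * 10
    for level, n_distinct in enumerate(distinct_per_level):
        if n_distinct >= threshold:
            return level
    return len(distinct_per_level) - 1  # fall back to the leaf level

# With A=128 sample pages the threshold is 1280 distinct records;
# level 0 is the root, the last entry is the leaf level:
print(find_level_LA([1, 40, 900, 15000, 300000], sample_pages_A=128))  # 3
```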
Thanks for this writeup! Very helpful.
We’re encountering the opposite issue, wondering if you have any insight.
My understanding is that if the server setting innodb_stats_persistent is turned ON, then cardinality (from SHOW INDEXES for example) should be reading from the table mysql.innodb_index_stats.
That doesn’t seem to be the case from what we’re experiencing.
What we’re seeing is that the result stored in mysql.innodb_index_stats is completely accurate, however SHOW INDEXES is returning something completely wrong. (This is after running ANALYZE and FLUSH on the table).
Further, if we change innodb_stats_persistent to OFF, we still get an incorrect cardinality report from SHOW INDEXES, however it’s slightly less wrong.
Have you seen anything like this before?
We’re on MariaDB 10.6.10, and there seems to be a bug report on something similar but not the exact issue: https://jira.mariadb.org/browse/MDEV-28327
Hey Brandon,
innodb_stats_persistent specifies "whether InnoDB index statistics are persisted to disk". When it is turned off, "statistics may be recalculated frequently", so it is possible that the result is "slightly less wrong", but in the end the setting itself shouldn't change how the statistics are calculated.

With innodb_stats_persistent=OFF, innodb_stats_transient_sample_pages is the parameter to adjust for the statistics calculation; with innodb_stats_persistent=ON, use innodb_stats_persistent_sample_pages instead.

BTW, "as of MySQL 5.6.6, optimizer statistics are persisted to disk by default".

Hope the info helps!
Best
Thanks for the reply Edwin!
The one thing we’re missing is: Why would mysql.innodb_index_stats not match the result of SHOW INDEXES when innodb_stats_persistent=ON?
We have that turned on, and we’ve increased innodb_stats_persistent_sample_pages significantly, and see the correct result in mysql.innodb_index_stats, however SHOW INDEXES still returns a wildly different number for cardinality.
Are we missing something?