Comments on: Correcting MySQL Inaccurate Table Statistics for Better Execution Plan https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/ Wed, 16 Nov 2022 20:03:37 +0000 hourly 1 https://wordpress.org/?v=6.5.2 By: Brandon https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973819 Wed, 16 Nov 2022 20:03:37 +0000 https://www.percona.com/blog/?p=80706#comment-10973819 Thanks for the reply Edwin!

The one thing we’re missing is: Why would mysql.innodb_index_stats not match the result of SHOW INDEXES when innodb_stats_persistent=ON?

We have that turned on, and we’ve increased innodb_stats_persistent_sample_pages significantly, and see the correct result in mysql.innodb_index_stats, however SHOW INDEXES still returns a wildly different number for cardinality.

Are we missing something?

]]>
By: Edwin Wang https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973816 Wed, 16 Nov 2022 00:30:33 +0000 https://www.percona.com/blog/?p=80706#comment-10973816 Hey Brandon,
innodb_stats_persistent is to specify “whether InnoDB index statistics are persisted to disk”. when it is turned off, “statistics may be recalculated frequently”, so it is possible that “it’s slightly less wrong”. but in the end, it shouldn’t impact the statistic calculation.

innodb_stats_persistent=OFF, innodb_stats_transient_sample_pages would be the factor to adjust the statistic calculation.
innodb_stats_persistent=ON, use innodb_stats_persistent_sample_pages instead.

btw, “As of MySQL 5.6.6, optimizer statistics are persisted to disk by default”
Hope the info help

Best

]]>
By: Brandon https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973814 Tue, 15 Nov 2022 00:15:28 +0000 https://www.percona.com/blog/?p=80706#comment-10973814 Thanks for this writeup! Very helpful.

We’re encountering the opposite issue, wondering if you have any insight.

My understanding is that if the server setting innodb_stats_persistent is turned ON, then cardinality (from SHOW INDEXES for example) should be reading from the table mysql.innodb_index_stats.

That doesn’t seem to be the case from what we’re experiencing.

What we’re seeing is that the result stored in mysql.innodb_index_stats is completely accurate, however SHOW INDEXES is returning something completely wrong. (This is after running ANALYZE and FLUSH on the table).

Further, if we change innodb_stats_persistent to OFF, we still get an incorrect cardinality report from SHOW INDEXES, however it’s slightly less wrong.

Have you seen anything like this before?

We’re on MariaDB 10.6.10, and there seems to be a bug report on something similar but not the exact issue: https://jira.mariadb.org/browse/MDEV-28327

]]>
By: Edwin Wang https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973591 Mon, 18 Apr 2022 17:23:52 +0000 https://www.percona.com/blog/?p=80706#comment-10973591 Hello ss,
As mentioned, the details can be found even the code:
https://github.com/mysql/mysql-server/blob/6846e6b2f72931991cc9fd589dc9946ea2ab58c9/storage/innobase/dict/dict0stats.cc

“Definition: N-prefix-boring record is a record on a non-leaf page that equals
the next (to the right, cross page boundaries, skipping the supremum and
infimum) record on the same level when looking at the fist n-prefix columns.
The last (user) record on a level is not boring (it does not match the
non-existent user record to the right). We call the records boring because all
the records on the page below a boring record are equal to that boring record.
We avoid diving below boring records when searching for a leaf page to
estimate the number of distinct records because we know that such a leaf
page will have number of distinct records == 1.
For each n-prefix: start from the root level and full scan subsequent lower
levels until a level that contains at least A*10 distinct records is found.
Lets call this level LA.”

]]>
By: Edwin Wang https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973590 Mon, 18 Apr 2022 17:19:59 +0000 https://www.percona.com/blog/?p=80706#comment-10973590 Hey Ivan, AFAIK, there is no such a tool to automatically determine which stats are off. However, doing a count(distinct columns) of a indexed column will only read the index, so usually shouldn’t take too long.

]]>
By: ss https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973586 Sat, 16 Apr 2022 09:06:03 +0000 https://www.percona.com/blog/?p=80706#comment-10973586 What does level LA mean?

]]>
By: Ivan Baldo https://www.percona.com/blog/correcting-mysql-inaccurate-table-statistics-for-better-execution-plan/#comment-10973585 Sat, 16 Apr 2022 00:40:45 +0000 https://www.percona.com/blog/?p=80706#comment-10973585 Is there a tool to automatically determine which stats are off?
I realize that doing a count() of a big table could be expensive but maybe could be done on off peak hours or a replica.

]]>