pt-table-checksum ErrorsIn this blog post, we’ll look at how to approach pt-table-checksum errors.

pt-table-checksum is one of the most popular tools in Percona Toolkit, and it is widely used to identify data differences between masters and slaves. Therefore, as Percona Support Engineers we have customers often asking questions related to the pt-table-checksum errors and warnings produced. Below are the most common issues raised with pt-table-checksum, and we decided to address those issues to help with how to mitigate related warnings or errors.

Unable to detect slaves

It’s possible that the tool cannot connect to the slaves due to not specific enough information found on the master. By default, it is looking for slaves based on the replica threads visible in master’s processlist. This could be the problem if, for example, the slave’s MySQL runs with a different TCP port, the hostname is not resolved correctly or both the master and slave are on the same host, or this is Galera-based replication. In this case, there is –recursion-method option to try with different discovery methods: ‘hosts’ or ‘cluster’. And if all of them fail, you can specify each slave details manually using the ‘dsn’ method.

An example using this option for the cluster looks like this:

and when a DSN is needed (like for mysqlsandbox instances), we have to add the slave(s) details to the table, similar to below:

ROW format on slave

The problem is that second and next level replicas (in chain replication topology) will not calculate the diffs as expected. So this message warns that the slave is using binlog_format=ROW, as the tool needs STATEMENT format to calculate the diffs separately on the slave and master. This is done by replicating the command (e.g., INSERT INTO percona.checksum SELECT CRC32 …. WHERE … ) as the original statement, not as a row copy of CRC values already computed on the master. And that is possible as the tool sets the binlog_format=STATEMENT in its session. This session setting does not propagate further into the slave’s own binary log though. This is not a problem when all the slaves are replicating directly from the master, and in such cases we can ignore that message and use the –no-check-binlog-format option.

By the way, the warning message is misleading regarding breaking replication claim, hence the bug reported.

Unable to switch session binlog_format to STATEMENT

or:

This can be an issue if STATEMENT mode is unsupported in the MySQL variant or special edition of it – Amazon RDS for example, or when switching is prohibited either by lack of SUPER privilege (limitation for Amazon Aurora), or Percona XtraDB Cluster Strict Mode safety precaution as seen on the example above. To workaround it in Percona XtraDB Cluster, temporarily relaxing the strict mode (be careful as this may be dangerous) will work:

For Aurora though (only in case asynchronous replication is used between Aurora clusters or from Aurora to non-Aurora MySQL), you will have to change the binlog_format globally to STATEMENT using the option groups.

Too large chunk size or no good index

or

Instead of examining each table with a single big query, the pt-table-checksum splits tables into chunks to ensure that the checksum is non-intrusive and doesn’t cause too much replication lag or load on the server. To create these chunks, it needs an index of some sort (preferably a primary key or unique index). If there is no index, and the table contains a suitably small number of rows, the tool tries to checksum the table in a single chunk.

Skipping the table, as in the second message example, is a common issue with pt-table-checksum and can be caused by different/outdated table statistics on the master or slave side. To alleviate this issue, make sure all your tables contain a primary or unique key. pt-table-checksum requires that to divide a table into chunks effectively. We also suggest that you make sure these messages are not related to real differences in this table (maybe a row count is significantly different). Also, executing pt-table-checksum with PTDEBUG is a good idea as it captures a lot of debugging info and it provides better insight into what is causing the issue.

There can be some random skipping of tables across many tool runs, and it’s probably because of a mix of two variables. One of it is innodb_stats_on_metadata. Turn it off, at least during the checksum running, such that InnoDB index stats won’t change so often. We remind you it’s a dynamic variable, which means you can change it without MySQL server restart. On the other hand, if constant statistics change for a table (even though the innodb_stats_on_metadata=0, statistics change with each significant amount of writes) is a problem, you may want to disable it for the duration of checksum. Check innodb_stats_auto_update option in Percona Server for MySQL for details.

pt-table-checksum uses an EXPLAIN query to determine the number of rows in the chunk, so ever-changing table statistics is most likely the reason for skipped tables. This is where pt-table-checksum decides to skip a chunk or not. This avoids the scenario that a table has fewer rows on the master but many on a replica, and is checksummed in a single large query, which causes a very long delay in replication. This is also affected by –chunk-size-limit, which defaults to 2. Try setting up higher chunk-size-limit or chunk-time so that pt-table-checksum allows larger chunks, but do it during off-peak periods. Of course, allowing too big of a chunk makes the server suffer for heavy selects, and slave lag may also be a problem while –chunk-time adjusts the chunk size dynamically so that the checksum query executes in a defined amount of time.

For tables that can’t be chunked and must be checksummed in a single run, the chunk size should be sufficiently large, and sometimes is not enough. That’s where the chunk-size-limit comes into play. The –chunk-size-limit modifier is a multiplier for chunk-size and allows larger chunks. To make sure your server is not heavily loaded, you can set a threshold at which pt-table-checksum pauses itself. This can be done by using –-max-load parameter of pt-table-checksum so, in this way –chunk-time and –chunk-size-limit won’t noticeably impact your server. We would suggest to start with default value –chunk-size-limit and increase it gradually till it succeeds. High values of –chunk-size-limit guarantee higher rates of successful runs, but there’s no way to tell if it will always be successful because the number of rows processed is only an estimate. It’s worth mentioning that you can also try running ANALYZE TABLE on “skipped tables” before running pt-table-checksum to make sure statistics are up to date. This may help or may not help, as statistics are estimated and it still might not be inaccurate.

Also, scripting retries of skipped chunks can be a good approach. You can redirect the pt-table-checksum output to a log file and parse that log to find out which tables need to be re-tried separately. You can do many re-tries for a single table if necessary, and the checksum result for a particular table in the checksums table gets overwritten without affecting other results.

All the problems described above will not take place when a table has a primary key on auto_increment int column.

Suboptimal query plan

The tool uses several heuristics to determine whether an execution plan is good or bad. The first is whether EXPLAIN reports that MySQL intends to use the desired index to access the rows. If MySQL chooses a different index, the tool considers the query unsafe. The tool also checks how much of the index MySQL reports that it uses for the query. The EXPLAIN output shows this in the key_len column. The tool remembers the largest key_len seen, and skips chunks where MySQL reports that it uses a smaller prefix of the index. However, it stretches the overall time to run checksum as it runs several heuristics to decide whether execution path is good or bad. This helps to decide the chunk. By default, –check-plan is on. It can bring a little bit of additional load to the server, but if that’s the case you can always monitor the checksum progress during execution and cancel pt-table-checksum at any moment if necessary. In general, it’s good to keep it enabled. Further, it’s best to run pt-table-checksum during low database traffic time.

To deal with the above error, disable the feature by using –no-check-plan when you get one “Skipping chunk” error. The only drawback of using it is leaving the door open for possible (costly) table scans.

Missing or filtered tables on the slave

This above error is clear that table test.dummy exists on the master but is missing on the slave server. This usually occurs with replication filters. pt-table-checksum failed because test.dummy checksummed on the master while failed on replica to checksum. This can be easily reproduced as per the below example:

As per the above example, the ‘test’ database is ignored to replicate via replication filter Replicate_Ignore_DB, which means any updates on that database will not fall to slave.

That is actually not an error, but it means that pt-table-checksum is waiting on replicas to run checksum queries. We have customers reporting that the tool runs forever and never came out from “Waiting to check replicas for differences”.  We noticed this problem occurs when database tables exist on replicas but are ignored by replication filters. Because pt-table-checksum checksums each chunk with an INSERT/REPLACE…SELECT query, and those queries from the master never fall to replicas via replication because the tables in question are blocked by replication filters. So the tool waits forever to check the checksum result on replicas, which will never happen. To remedy this issue, use the –ignore-databases or –ignore-tables option to ignore filtered tables from the checksum process.

Replication filters can bring unexpected issues as the last two warnings/errors demonstrated.

Conclusion

pt-table-checksum is a robust tool that validates data between master/slaves in a replication environment. However, in some scenarios the task can be quite challenging. Fortunately, there are options to deal with these obstacles. Some, however, involve not only using specific options for the tool, but also properly (re-)designing your schema. A proper primary key may not only allow the tool to work much faster, less expensive, but sometimes to work at all.

5 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Marisela W. Kegley

Does pt-table-checksum also support Galera plugin? Or only master-slave replication?

Jagannath

How to use pt-table-checksum and pt-table-sync in master-master replication with mixed binlog and both the servers are having differences?

lei

Hi,i have some trouble in using pt-table-checksum tools,it can not work .by adding PTDBUG=1 ,it rase Database xxxxx is a system database, ignoring.

Felix Ostmann

There is also another solution for the “Too large chunk size or no good index” problem:
Increasing –slave-skip-tolerance from 1.0 (default). This option is a simple multiplier for the skip table check (see code below).
I changed this to 2 and every faulty skip has gone. For my setup it is no problem to run a query which perhaps check up to 2*rows as calculated.

if ( $n_rows and $n_rows > ($tbl->{chunk_size} * $chunk_size_limit) * $slave_skip_tolerance ) {
# skip with “Too large chunk size or no good index”-error
}