PREPARE StatementIn this blog, we’ll look at the differences in how a PREPARE statement handles errors in binary and text protocols.

Introduction

Since Percona XtraDB Cluster is a multi-master solution, when an application executes conflicting workloads one of the workloads gets rolled back with a DEADLOCK error. While the same holds true even if you fire the workload through a PREPARE statement, there are differences between using the MySQL connector API (with binary protocol) and the MySQL client (with text protocol). Let’s look at these differences with the help of an example.

Base Workload

  • Say we have a two-node cluster (n1 and n2) with the following base schema and tables:
  • Workload (w1) on node-1 (n1):
  • Workload (w2) on node-2 (n2):
  • The workloads are conflicting, and the first node to commit wins. Let’s assume n1 commits first, and the write-set is replicated to n2 while n2 is busy executing the local workload.

Different Scenarios Based on Configuration

wsrep_retry_autocommit > 0
  • n2 tries to apply the replicated workload, thereby forcefully aborting (brute-force abort) the local workload (w2).
  • The forcefully aborted workload is retried (based on wsrep_retry_autocommit value (default is 1)).
  • w2 succeeds on retry, and the new state of table t would be:
  • The user can increase wsrep_retry_autocommit to a higher value, though we don’t recommend this as a higher value increases pressure on the system to retry failed workloads. A retry doesn’t ensure things always pass if additional conflicting replicated workload are lined up in the certification queue. Also, it is important to note that a retry works only if AUTO_COMMIT=ON. For a begin-commit transaction, retry is not applicable. If we abort this transaction, then the complete transaction is rolled back. The workload executor application should have the logic to expect workload failure due to conflict and handle it accordingly.
wsrep_retry_autocommit = 0
  • Let’s say the user sets wsrep_retry_autocommit=0 on node-2 and executes the same workload.
  • This time it doesn’t retry the local workload (w2). Instead, it sends an error to the end-user:
  • Only the replicated workload (w1) is applied, and local workload w2 fails with a DEADLOCK error.
Use of MySQL Connector API

Now let’s try to execute these workloads through Connector API. Connector API uses a binary protocol, which means it directly invokes the PREPARE statement and executes the statement using dedicated command codes ( COM_STMT_PREPARE and COM_STMT_EXECUTE).

While the conflicting transaction internal abort remains same, the COM_QUERY command code (text protocol) handling corrects the “query interrupted error” and resets it to “deadlock error” based on the wsrep_conflict_state value. It also has logic to retry the auto-commit enable statement.

This error handling and retry logic is not present when the COM_STMT_PREPARE and COM_STMT_EXECUTE command codes are used. This means the user sees a different error (non-masked error) when we try the same workload through an application using the MySQL Connector API:

As you can see above, the statement/workload execution fails with “Query execution was interrupted” error vs. “deadlock” error (as seen with the normal MySQL client).

Conclusion

While the core execution remains same, error handling is different with text and binary protocol. This issue/limitation was always present, but it’s more prominent since sysbench-1.0 started using the PREPARE statement as a default.

What does this means to end-user/application?

  • Nothing has changed from the Percona XtraDB Cluster perspective, except that if your application is planning to use a binary protocol (like Connector API), then the application should able to handle both the errors and retry the failed transaction accordingly.

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Leila Steward

Im not sure with the new error handling procedure using protocol based percona method but this article is cool. i learned new today.

Alexey Kopytov

Thanks for explaining the limitation, I’m going to reference this post from the sysbench FAQ. As a workaround for this PXC/Galera issue, one can add “1317” to the –mysql-ignore-errors option, e.g.:

$ sysbench oltp_read_write –mysql-ignore-errors=1213,1020,1205,1317 run

I’m also considering adding 1317 to the default list of ignored errors.