In this blog post, we’ll look at how we’ve improved wsrep-stages and related instrumentation in Percona XtraDB Cluster.
Introduction
When you execute a workload and need to find out what the given thread is working on, “SHOW PROCESSLIST” comes to the top of your mind. It is an effective way to track the thread status. We decided to improve the stages in Percona XtraDB Cluster to make “SHOW PROCESSLIST” more meaningful.
In the blog below, we will check out the different wsrep-stages and the significance associated with them.
Loading of data
Running a simple insert/sysbench prepare workload. The state is stable as it mainly captures MySQL stages indicating that the table is being updated:
1 | | 9 | root | localhost | test | Query | 0 | update | INSERT INTO sbtest3(id, k, c, pad) VALUES(893929,515608,'28459951974-62599818307-78562787160-7859397 | 0 | 0 | |
Running UPDATE workload
Running simple sysbench update-key workload. Let’s look at the different states that the user sees and their significance. (MASTER and SLAVE states are different and are marked accordingly.)
MASTER view:
- This stage indicates that the write-set is trying to replicate. Global sequence numbers are assigned after the write-set is replicated and so the global-seqno is currently -1:
1 | | 80 | root | localhost | test | Query | 0 | wsrep: initiating replication for write set (-1) | UPDATE sbtest4 SET k=k+1 WHERE id=502338 | 0 | 1 | |
- This stage indicates successful replication of the write-set. This means the write-set is now added to the group-channel. Global-seqno is updated in the message too:
1 | | 79 | root | localhost | test | Query | 0 | wsrep: write set replicated (196575) | UPDATE sbtest3 SET k=k+1 WHERE id=502723 | 0 | 1 | |
- This stage indicates the write-set has successfully passed the certification stage (making its path clear for commit):
1 | | 85 | root | localhost | test | Query | 0 | wsrep: pre-commit/certification passed (196574) | UPDATE sbtest7 SET k=k+1 WHERE id=495551 | 0 | 1 | |
- This stage indicates that InnoDB commit has been triggered for the write-set:
1 | | 138 | root | localhost | test | Query | 0 | innobase_commit_low (585721) | UPDATE sbtest6 SET k=k+1 WHERE id=500551 | 0 | 1 | |
SLAVE/Replicating node view:
- This stage indicates that the slave thread is trying to commit the replicated write-set with the given seqno. It is likely waiting for its turn of the CommitMonitor:
1 | | 6 | system user | | NULL | Sleep | 0 | wsrep: committing write set (224905) | NULL | 0 | 0 | |
- This stage indicates a successful commit of the replicated write-set with the given seqno:
1 | | 2 | system user | | NULL | Sleep | 0 | wsrep: committed write set (224896) | NULL | 0 | 0 | |
- This stage indicates that updating the rows is in progress. (Often it was difficult to know what the workload is trying to do: UPDATE/INSERT/DELETE.) Now there is an easy way to find out:
1 | | 13 | system user | | NULL | Sleep | 0 | wsrep: updating row for write-set (178711) | NULL | 0 | 0 | |
1 | | 18 | system user | | NULL | Sleep | 0 | wsrep: writing row for write-set (793454) | NULL | 0 | 0 | |
1 | | 11 | system user | | NULL | Sleep | 0 | wsrep: deleting row for write-set (842123) | NULL | 0 | 0 | |
- This stage indicates that the given write-set is being applied:
1 | | 10 | system user | | NULL | Sleep | 0 | wsrep: applying write-set (899370) | NULL | 0 | 0 | |
Improved Instrumentation
Let’s answer some simple questions that most profiling experts will face:
- How long did replication take (adding write-set to channel)?
1 2 3 4 5 6 7 8 9 10 11 | mysql> select event_name, timer_wait/1000000 as time_in_mics from events_stages_history where event_name like '%in replicate%' order by time_in_mics desc limit 5; +---------------------------------------+--------------+ | event_name | time_in_mics | +---------------------------------------+--------------+ | stage/wsrep/wsrep: in replicate stage | 1.2020 | | stage/wsrep/wsrep: in replicate stage | 0.7880 | | stage/wsrep/wsrep: in replicate stage | 0.7740 | | stage/wsrep/wsrep: in replicate stage | 0.7550 | | stage/wsrep/wsrep: in replicate stage | 0.7480 | +---------------------------------------+--------------+ 5 rows in set (0.01 sec) |
- How long did it take for pre-commit/certification checks?
1 2 3 4 5 6 7 8 9 10 11 | mysql> select event_name, timer_wait/1000000 as time_in_mics from events_stages_history where event_name like '%in pre-commit%' order by time_in_mics desc limit 5; +----------------------------------------+--------------+ | event_name | time_in_mics | +----------------------------------------+--------------+ | stage/wsrep/wsrep: in pre-commit stage | 1.3450 | | stage/wsrep/wsrep: in pre-commit stage | 1.0000 | | stage/wsrep/wsrep: in pre-commit stage | 0.9480 | | stage/wsrep/wsrep: in pre-commit stage | 0.9180 | | stage/wsrep/wsrep: in pre-commit stage | 0.9030 | +----------------------------------------+--------------+ 5 rows in set (0.01 sec) |
- How long did it take to commit a transaction on the slave (slave_thread=16 threads)?
1 2 3 4 5 6 7 8 9 10 11 | mysql> select thread_id, event_name, timer_wait/1000000 as time_in_mics from events_stages_history where event_name like '%committing%' order by time_in_mics desc limit 5; +-----------+-------------------------------+--------------+ | thread_id | event_name | time_in_mics | +-----------+-------------------------------+--------------+ | 56 | stage/wsrep/wsrep: committing | 0.5870 | | 58 | stage/wsrep/wsrep: committing | 0.5860 | | 47 | stage/wsrep/wsrep: committing | 0.5810 | | 59 | stage/wsrep/wsrep: committing | 0.5740 | | 60 | stage/wsrep/wsrep: committing | 0.5220 | +-----------+-------------------------------+--------------+ 5 rows in set (0.00 sec) |
- Increasing the number of slave thread creates more contention (slave_thread=64):
1 2 3 4 5 6 7 8 9 10 11 | mysql> select thread_id, event_name, timer_wait/1000000 as time_in_mics from events_stages_history where event_name like '%committing%' order by time_in_mics desc limit 5; +-----------+-------------------------------+--------------+ | thread_id | event_name | time_in_mics | +-----------+-------------------------------+--------------+ | 90 | stage/wsrep/wsrep: committing | 1.6930 | | 97 | stage/wsrep/wsrep: committing | 1.5870 | | 103 | stage/wsrep/wsrep: committing | 1.5140 | | 87 | stage/wsrep/wsrep: committing | 1.2560 | | 102 | stage/wsrep/wsrep: committing | 1.1040 | +-----------+-------------------------------+--------------+ 5 rows in set (0.00 sec) |
- The amount oftTime taken to apply a write-set:
1 2 3 4 5 6 7 8 9 10 11 | mysql> select thread_id, event_name, timer_wait/1000000 as time_in_mics from events_stages_history where event_name like '%applying%' order by time_in_mics desc limit 5; +-----------+---------------------------------------+--------------+ | thread_id | event_name | time_in_mics | +-----------+---------------------------------------+--------------+ | 166 | stage/wsrep/wsrep: applying write set | 1.6880 | | 168 | stage/wsrep/wsrep: applying write set | 1.5820 | | 146 | stage/wsrep/wsrep: applying write set | 1.5270 | | 124 | stage/wsrep/wsrep: applying write set | 1.4760 | | 120 | stage/wsrep/wsrep: applying write set | 1.4440 | +-----------+---------------------------------------+--------------+ 5 rows in set (0.00 sec) |
Conclusion
The improved wsrep-stage framework makes it more effective for a user to find out the state of a given thread. Using the derived instrumentation through wsrep-stage is a good way to understand where the time is being spent.
mysql> select * from performance_schema.events_stages_history;
Empty set (0.00 sec)
It should be noted that P_S.events_stages_history needs to be enabled in performance_schema.setup_consumers. It is disabled by default at least on older versions. There are also variables to control the history length.
A great slide deck that includes more about this is Percona’s Bill Karwin’s “Getting the Most Out of MySQL 5.6” from 2014.
Right. If one need to use Performance Schema all those pre-condition as per MySQL semantics needs to be followed.
consumers, instruments …etc.