At my latest webinar “MySQL Test Framework (MTR) for Troubleshooting”, I received an interesting question about MTR test cases for Percona XtraDB Cluster (PXC), particularly about testing SST and IST.
This post is intended to answer this question. It assumes you are familiar with MTR and can write tests for MySQL servers. If you are not, please watch the webinar recording first.
You can find example tests in any PXC tarball package. They are located in the directories mysql-test/suite/galera, mysql-test/suite/galera_3nodes, and mysql-test/suite/wsrep, though the last directory contains only a configuration file.
If you simply try to run the tests in the galera suite, you will find that they are all skipped because the environment variable WSREP_PROVIDER is not set:
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ ./mtr --suite=galera
    Logging: ./mtr --suite=galera
    MySQL Version 5.7.19
    Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
     - using tmpdir: '/tmp/xYgQqOa5b7'
    Checking supported features...
     - SSL connections supported
     - binaries built with wsrep patch
    Using suites: galera
    Collecting tests...
    Checking leftover processes...
     - found old pid 30624 in 'mysqld.3.pid', killing it...
       process did not exist!
    Removing old var directory...
    Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
    Installing system database...
    Using parallel: 1

    ==============================================================================

    TEST                                      RESULT   TIME (ms) or COMMENT
    --------------------------------------------------------------------------

    worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
    galera.GAL-419                           [ skipped ]  Test needs 'big-test' option
    ...
    galera.galera_binlog_checksum            [ skipped ]  Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
    galera.galera_binlog_event_max_size_min  [ skipped ]  Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
    galera.galera_flush_gtid                 [ skipped ]  Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
    galera.galera_gtid                       [ skipped ]  Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
    galera.lp1435482                         [ skipped ]  Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
    ^Cmysql-test-run: *** ERROR: Got ^C signal
In order to run these tests you need to set this variable first.
I use the quite outdated 5.7.19 PXC package (the version does not matter for the purpose of this post) and run tests as:
    WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so ./mtr --suite=galera
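Because a wrong path in WSREP_PROVIDER leads to the same "skipped" result as an unset variable, it can help to verify the library before starting mtr. The following is a minimal sketch, not part of the PXC package: the function name `find_wsrep_provider` and the variable `PXC_BASE` are hypothetical, and the only assumption taken from the post is that the library lives in `lib/libgalera_smm.so` under the unpacked tarball.

```shell
#!/bin/sh
# Hypothetical helper: print the path to libgalera_smm.so under a PXC
# base directory, or fail with a message if the file is not there.
find_wsrep_provider() {
    # $1: PXC base directory (the unpacked tarball)
    candidate="$1/lib/libgalera_smm.so"
    if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        echo "libgalera_smm.so not found under $1/lib" >&2
        return 1
    fi
}

# Possible usage (commented out so that sourcing this file has no side effects;
# PXC_BASE is an assumed variable pointing at the unpacked package):
# WSREP_PROVIDER=$(find_wsrep_provider "$PXC_BASE") && export WSREP_PROVIDER
# cd "$PXC_BASE/mysql-test" && ./mtr --suite=galera
```

Exporting the variable once, instead of prefixing every command with it, also keeps later mtr invocations in the same shell session shorter.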
After the variable WSREP_PROVIDER is set, mtr can successfully run:
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so ./mtr --suite=galera
    Logging: ./mtr --suite=galera
    MySQL Version 5.7.19
    Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
     - using tmpdir: '/tmp/I6HfuqkwR1'
    Checking supported features...
     - SSL connections supported
     - binaries built with wsrep patch
    Using suites: galera
    Collecting tests...
    Checking leftover processes...
     - found old pid 14271 in 'mysqld.1.pid', killing it...
       process did not exist!
     - found old pid 14273 in 'mysqld.2.pid', killing it...
       process did not exist!
    Removing old var directory...
    Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
    Installing system database...
    Using parallel: 1

    ==============================================================================

    TEST                                      RESULT   TIME (ms) or COMMENT
    --------------------------------------------------------------------------

    worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
    galera.GAL-419                           [ skipped ]  Test needs 'big-test' option
    ...
    worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
    galera.galera_binlog_checksum            [ pass ]   2787
    worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
    galera.galera_binlog_event_max_size_min  [ pass ]   2200
    ...
Now we are ready to write our first PXC test. The easiest way to get started is to open any existing test and check how it is written. Then modify it so that it replays our own scenario.
Since the question was about testing IST and SST, I will use the test galera_ist_progress as an example. First, let’s check that it runs successfully and that it does not have any requirements that could prevent it from running with regular production binaries:
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so ./mtr --suite=galera galera_ist_progress
    Logging: ./mtr --suite=galera galera_ist_progress
    MySQL Version 5.7.19
    Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
     - using tmpdir: '/tmp/EodvOyCJwo'
    Checking supported features...
     - SSL connections supported
     - binaries built with wsrep patch
    Collecting tests...
    Checking leftover processes...
    Removing old var directory...
    Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
    Installing system database...
    Using parallel: 1

    ==============================================================================

    TEST                                      RESULT   TIME (ms) or COMMENT
    --------------------------------------------------------------------------

    worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
    worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
    galera.galera_ist_progress               [ pass ]  17970
    --------------------------------------------------------------------------
    The servers were restarted 0 times
    Spent 17.970 of 218 seconds executing testcases

    Completed: All 1 tests were successful.
Everything is fine. Now let’s look into the test itself.
First, this test has its own configuration file. Let’s check what’s in there:
    $ cat suite/galera/t/galera_ist_progress.cnf
    !include ../galera_2nodes.cnf

    [mysqld.1]

galera_2nodes.cnf is one of the standard configuration files in the galera suite. If we look into it, we will see that it already defines wsrep_provider_options, so a test needs to override this option only when it requires non-default provider settings.
We’ll continue our review. The test script includes the galera_cluster.inc file:
    --source include/galera_cluster.inc

This file is located outside of the galera suite and contains two lines:

    --let $galera_cluster_size = 2
    --source include/galera_init.inc

galera_init.inc, in turn, creates as many nodes as defined by the galera_cluster_size variable and additionally creates a default connection for each of them.
Now let’s step out from galera_ist_progress and check if this knowledge is enough to create our first PXC test.
I created a simple test based on a two-node setup. It checks a few status and system variables, creates a table, inserts data into it, and ensures that the content is accessible on both nodes:

    $ cat ~/src/tests/t/pxc.test
    --source include/galera_cluster.inc

    --connection node_1
    --echo We are on node 1
    select @@hostname, @@port;
    show status like 'wsrep_cluster_size';
    show status like 'wsrep_cluster_status';
    show status like 'wsrep_connected';

    create table t1(id int not null auto_increment primary key, f1 int) engine=innodb;
    insert into t1(f1) values(1),(2),(3);
    select * from t1;

    --connection node_2
    --echo We are on node 2
    select @@hostname, @@port;
    show status like 'wsrep_cluster_size';
    show status like 'wsrep_cluster_status';
    show status like 'wsrep_connected';
    select * from t1;

    insert into t1(f1) values(1),(2),(3);
    select * from t1;

    --connection node_1
    --echo We are on node 1
    select * from t1;

    drop table t1;

However, if I run this test in the main suite, it will be skipped, even though WSREP_PROVIDER is set:
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ export WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ do_test.sh -s ~/mysql_packages -b Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100
    Logging: ./mysql-test-run.pl --record --force pxc
    MySQL Version 5.7.19
    Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
     - using tmpdir: '/tmp/uUmBztSWUA'
    Checking supported features...
     - SSL connections supported
     - binaries built with wsrep patch
    Collecting tests...
    Checking leftover processes...
    Removing old var directory...
    Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
    Installing system database...
    Using parallel: 1

    ==============================================================================

    TEST                                      RESULT   TIME (ms) or COMMENT
    --------------------------------------------------------------------------

    worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
    main.pxc                                 [ skipped ]  Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
    --------------------------------------------------------------------------
    The servers were restarted 0 times
    Spent 0.000 of 108 seconds executing testcases

    Completed: All 0 tests were successful.

    1 tests were skipped, 1 by the test itself.

    =====Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100=====
    =====pxc=====
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ echo $WSREP_PROVIDER
    /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so
The test is skipped because the main suite lacks the default option files that the galera suite uses to set the necessary variables. Let’s leave those option files aside for now and simply run our test in the galera suite:
    sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ do_test.sh -s ~/mysql_packages -b Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100 -t galera
    Logging: ./mysql-test-run.pl --record --force --suite=galera pxc
    MySQL Version 5.7.19
    Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
     - using tmpdir: '/tmp/ytqEjnfM7i'
    Checking supported features...
     - SSL connections supported
     - binaries built with wsrep patch
    Collecting tests...
    Checking leftover processes...
    Removing old var directory...
    Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
    Installing system database...
    Using parallel: 1

    ==============================================================================

    TEST                                      RESULT   TIME (ms) or COMMENT
    --------------------------------------------------------------------------

    worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
    worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
    galera.pxc                               [ pass ]   2420
    --------------------------------------------------------------------------
    The servers were restarted 0 times
    Spent 2.420 of 208 seconds executing testcases

    Completed: All 1 tests were successful.

    pxc.result
    =====Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100=====
    =====pxc=====
    We are on node 1
    select @@hostname, @@port;
    @@hostname  @@port
    Thinkie     13000
    show status like 'wsrep_cluster_size';
    Variable_name       Value
    wsrep_cluster_size  2
    show status like 'wsrep_cluster_status';
    Variable_name         Value
    wsrep_cluster_status  Primary
    show status like 'wsrep_connected';
    Variable_name    Value
    wsrep_connected  ON
    create table t1(id int not null auto_increment primary key, f1 int) engine=innodb;
    insert into t1(f1) values(1),(2),(3);
    select * from t1;
    id  f1
    2   1
    4   2
    6   3
    We are on node 2
    select @@hostname, @@port;
    @@hostname  @@port
    Thinkie     13004
    show status like 'wsrep_cluster_size';
    Variable_name       Value
    wsrep_cluster_size  2
    show status like 'wsrep_cluster_status';
    Variable_name         Value
    wsrep_cluster_status  Primary
    show status like 'wsrep_connected';
    Variable_name    Value
    wsrep_connected  ON
    select * from t1;
    id  f1
    2   1
    4   2
    6   3
    insert into t1(f1) values(1),(2),(3);
    select * from t1;
    id  f1
    2   1
    4   2
    6   3
    7   1
    9   2
    11  3
    We are on node 1
    select * from t1;
    id  f1
    2   1
    4   2
    6   3
    7   1
    9   2
    11  3
    drop table t1;
You will see that the test reports that the two nodes run on different ports:
    We are on node 1
    select @@hostname, @@port;
    @@hostname  @@port
    Thinkie     13000
    ...
    We are on node 2
    select @@hostname, @@port;
    @@hostname  @@port
    Thinkie     13004
… and that PXC started:
    show status like 'wsrep_cluster_size';
    Variable_name       Value
    wsrep_cluster_size  2
    show status like 'wsrep_cluster_status';
    Variable_name         Value
    wsrep_cluster_status  Primary
    show status like 'wsrep_connected';
    Variable_name    Value
    wsrep_connected  ON
And we can also clearly see that each node sees the changes to our test table that were made by the other node.
Now let’s get back to the IST test, defined in galera_ist_progress.test.
In order to test IST, it first isolates node 2 from the rest of the cluster:

    # Isolate node #2
    --connection node_2
    SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 1';
Then it connects to node 1 and waits until wsrep_cluster_size becomes 1:
    --connection node_1
    --let $wait_condition = SELECT VARIABLE_VALUE = 1 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
    --source include/wait_condition.inc
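The wait_condition.inc include used here repeats the query until it returns true or a timeout expires, so the test does not depend on how quickly the cluster notices the isolated node. For readers who want the same pattern outside MTR, here is a rough shell analogue. It is only a sketch: the function name `wait_until` and its one-second polling interval are my own choices, not something taken from the MTR sources.

```shell
#!/bin/sh
# Hypothetical helper: poll a command once per second until it succeeds
# or the timeout (in seconds) is reached. Returns 0 on success, 1 on timeout.
wait_until() {
    # $1: timeout in seconds; remaining arguments: the command to poll
    timeout_s=$1
    shift
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Inside MTR tests, keep using wait_condition.inc. A helper like this is only useful for ad hoc checks against a manually started cluster, where the polled command would be a client call that checks wsrep_cluster_size.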
Then, on node 2, it temporarily turns wsrep_on OFF for the session, waits until the node reports the non-Primary status, and turns wsrep_on back ON:

    --connection node_2
    SET SESSION wsrep_on = OFF;
    --let $wait_condition = SELECT VARIABLE_VALUE = 'non-Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
    --source include/wait_condition.inc
    SET SESSION wsrep_on = ON;
Now node 2 is completely isolated and node 1 can be updated, so we can test IST when we bring node 2 back online.
    --connection node_1
    CREATE TABLE t1 (f1 INTEGER) ENGINE=InnoDB;
    INSERT INTO t1 VALUES (1);
    INSERT INTO t1 VALUES (2);
    INSERT INTO t1 VALUES (3);
    INSERT INTO t1 VALUES (4);
    INSERT INTO t1 VALUES (5);
    INSERT INTO t1 VALUES (6);
    INSERT INTO t1 VALUES (7);
    INSERT INTO t1 VALUES (8);
    INSERT INTO t1 VALUES (9);
    INSERT INTO t1 VALUES (10);
After the update is done, node 2 is brought online:
    --connection node_2
    SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 0';

    --connection node_1
    --let $wait_condition = SELECT VARIABLE_VALUE = 2 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
    --source include/wait_condition.inc

    --connection node_2
    --let $wait_condition = SELECT VARIABLE_VALUE = 'Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
    --source include/wait_condition.inc
Once node 2 is online, checks for IST progress are performed. To check for IST progress, the test greps the error log file from node 2 where any messages about IST progress are printed:
    #
    # Grep for expected IST output in joiner log
    #

    --connection node_1
    --let $assert_count = 1
    --let $assert_file = $MYSQLTEST_VARDIR/log/mysqld.2.err
    --let $assert_only_after = Need state transfer

    --let $assert_text = Receiving IST: 11 writesets, seqnos
    --let $assert_select = Receiving IST: 11 writesets, seqnos
    --source include/assert_grep.inc

    --let $assert_text = Receiving IST... 0.0% ( 0/11 events) complete
    --let $assert_select = Receiving IST... 0.0% ( 0/11 events) complete
    --source include/assert_grep.inc

    --let $assert_text = Receiving IST...100.0% (11/11 events) complete
    --let $assert_select = Receiving IST...100.0% (11/11 events) complete
    --source include/assert_grep.inc

Here is the error log snippet from node 2 when it rejoined the cluster and initiated state transfer:
    2018-05-25T17:00:46.908569Z 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 13)
    2018-05-25T17:00:46.908637Z 2 [Note] WSREP: State transfer required:
        Group state: f364a69b-603c-11e8-a632-ce5a4a7d5964:13
        Local state: f364a69b-603c-11e8-a632-ce5a4a7d5964:2
    2018-05-25T17:00:46.908673Z 2 [Note] WSREP: New cluster view: global state: f364a69b-603c-11e8-a632-ce5a4a7d5964:13, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 3
    2018-05-25T17:00:46.908694Z 2 [Note] WSREP: Setting wsrep_ready to true
    2018-05-25T17:00:46.908717Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
    2018-05-25T17:00:46.908737Z 2 [Note] WSREP: Setting wsrep_ready to false
    2018-05-25T17:00:46.908757Z 2 [Note] WSREP: You have configured 'xtrabackup-v2' state snapshot transfer method which cannot be performed on a running server. Wsrep provider won't be able to fall back to it if other means of state transfer are unavailable. In that case you will need to restart the server.
    2018-05-25T17:00:46.908777Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    2018-05-25T17:00:46.908799Z 2 [Note] WSREP: REPL Protocols: 7 (3, 2)
    2018-05-25T17:00:46.908831Z 2 [Note] WSREP: Assign initial position for certification: 13, protocol version: 3
    2018-05-25T17:00:46.908886Z 0 [Note] WSREP: Service thread queue flushed.
    2018-05-25T17:00:46.908934Z 2 [Note] WSREP: Check if state gap can be serviced using IST
    2018-05-25T17:00:46.909062Z 2 [Note] WSREP: IST receiver addr using tcp://127.0.0.1:13006
    2018-05-25T17:00:46.909232Z 2 [Note] WSREP: Prepared IST receiver, listening at: tcp://127.0.0.1:13006
    2018-05-25T17:00:46.909267Z 2 [Note] WSREP: State gap can be likely serviced using IST. SST request though present would be void.
    2018-05-25T17:00:46.909489Z 0 [Note] WSREP: Member 1.0 (Thinkie) requested state transfer from '*any*'. Selected 0.0 (Thinkie)(SYNCED) as donor.
    2018-05-25T17:00:46.909513Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 13)
    2018-05-25T17:00:46.909557Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
    2018-05-25T17:00:46.909602Z 2 [Note] WSREP: GCache history reset: f364a69b-603c-11e8-a632-ce5a4a7d5964:2 -> f364a69b-603c-11e8-a632-ce5a4a7d5964:13
    2018-05-25T17:00:46.910221Z 0 [Note] WSREP: 0.0 (Thinkie): State transfer to 1.0 (Thinkie) complete.
    2018-05-25T17:00:46.910422Z 0 [Note] WSREP: Member 0.0 (Thinkie) synced with group.
    2018-05-25T17:00:47.006802Z 2 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset
    2018-05-25T17:00:47.106423Z 2 [Note] WSREP: Receiving IST: 11 writesets, seqnos 2-13
    2018-05-25T17:00:47.106764Z 0 [Note] WSREP: Receiving IST... 0.0% ( 0/11 events) complete.
    2018-05-25T17:00:47.109740Z 0 [Note] WSREP: Receiving IST...100.0% (11/11 events) complete.
    2018-05-25T17:00:47.110029Z 2 [Note] WSREP: IST received: f364a69b-603c-11e8-a632-ce5a4a7d5964:13
    2018-05-25T17:00:47.110433Z 0 [Note] WSREP: 1.0 (Thinkie): State transfer from 0.0 (Thinkie) complete.
    2018-05-25T17:00:47.110480Z 0 [Note] WSREP: SST leaving flow control
    2018-05-25T17:00:47.110509Z 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 13)
    2018-05-25T17:00:47.110778Z 0 [Note] WSREP: Member 1.0 (Thinkie) synced with group.
    2018-05-25T17:00:47.110830Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 13)
    2018-05-25T17:00:47.110890Z 2 [Note] WSREP: Synchronized with group, ready for connections
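When a grep-based assertion fails, it is handy to pull the same lines out of the joiner log by hand. The sketch below imitates the idea of the test, which is to look only at the part of the log that follows the "Need state transfer" marker and pick the "Receiving IST" messages from it. The function name `ist_lines_after_marker` is hypothetical, and the script is a simplified stand-in for illustration, not a reimplementation of assert_grep.inc.

```shell
#!/bin/sh
# Hypothetical helper: print the "Receiving IST" lines that appear in a log
# file after the last line containing the given marker text.
ist_lines_after_marker() {
    # $1: path to the joiner error log; $2: marker text (plain string, not a regex)
    awk -v marker="$2" '
        # Each marker line restarts the collection, so only the lines
        # after the last marker are kept.
        index($0, marker) { n = 0; seen = 1; next }
        seen && index($0, "Receiving IST") { buf[n++] = $0 }
        END { for (i = 0; i < n; i++) print buf[i] }
    ' "$1"
}

# Possible usage from the mysql-test directory after a test run
# (the path matches the one the test itself uses for node 2):
# ist_lines_after_marker var/log/mysqld.2.err "Need state transfer"
```

For the log shown above, this would print the three "Receiving IST" lines, which are the same messages the test asserts on.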
If you want to write your own tests for IST and SST operations, you can use existing test cases as a baseline. You are not required to use grep, and you can explore your own scenarios. The important points are:
- The variable WSREP_PROVIDER must be set before the test run
- The test should either be in the galera suite or, if you choose to use your own suite, that suite must copy the definitions from the galera suite's default configuration file
- The test should include the file include/galera_cluster.inc
- To isolate the node from the cluster run the following code:
    # Isolate node #2
    --connection node_2
    SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 1';

    --connection node_1
    --let $wait_condition = SELECT VARIABLE_VALUE = 1 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
    --source include/wait_condition.inc

    --connection node_2
    SET SESSION wsrep_on = OFF;
    --let $wait_condition = SELECT VARIABLE_VALUE = 'non-Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
    --source include/wait_condition.inc
    SET SESSION wsrep_on = ON;
Replace the node numbers if needed.
To bring the node back to the cluster run the following code:
    # Restore node #2, IST is performed
    --connection node_2
    SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 0';

    --connection node_1
    --let $wait_condition = SELECT VARIABLE_VALUE = 2 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
    --source include/wait_condition.inc

    --connection node_2
    --let $wait_condition = SELECT VARIABLE_VALUE = 'Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
    --source include/wait_condition.inc
Depending on the size of the updates and gcache you can test either IST or SST in this way.