In this post, the first in a MaxScale series, I describe how to use MariaDB’s MaxScale and MySQL Utilities with MySQL asynchronous replication.
When we talk about high availability with asynchronous replication, we usually think of MHA or PRM. But if we want to transparently use the slave(s) for READs, what can we use?
Description:
- Three MySQL servers, but one has very limited resources and will never be able to handle the production load. In fact this node is used for backup and some back-office queries.
- We would like to use one of the nodes as a master and the other two as slaves, but only one will be addressed by the application for the READs. If needed, that same node will become the master.
- The application doesn’t handle READ and WRITE connections, and it’s impossible to change it.
To achieve our goals, we will use MaxScale and its read/write split router (`readwritesplit`). When using MaxScale and asynchronous replication with MariaDB, it’s possible to use MariaDB’s replication manager, a wonderful tool written in Go. Unfortunately, this tool doesn’t support standard MySQL. To replace it, I used Oracle’s MySQL Utilities.
Our three nodes are:
- percona1 (master)
- percona2 (powerful slave)
- percona3 (weak slave)
It’s mandatory in this solution to use GTID, as it’s the only method supported by the mysql-utilities we are using.
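As a quick sanity check before going further, here is a minimal sketch that reports which GTID-related options are missing from a `my.cnf`-style file. The function name `check_gtid_cnf` is hypothetical, and the option list (`gtid_mode`, `enforce_gtid_consistency`, `log_bin`, `log_slave_updates`) is my assumption for a MySQL 5.6-era server, not something taken from the nodes described here.

```shell
#!/bin/bash
# Hypothetical helper: print the GTID-related options that are NOT set
# in the given my.cnf-style file (empty output means all were found).
# The option list is an assumption for MySQL 5.6-era servers.
check_gtid_cnf() {
    local file=$1 missing="" opt
    for opt in gtid_mode enforce_gtid_consistency log_bin log_slave_updates; do
        # Normalise "gtid-mode" to "gtid_mode"; commented lines never match
        # because the option must be the first token on the line.
        if ! sed 's/-/_/g' "$file" |
             grep -Eq "^[[:space:]]*${opt}([[:space:]]*=|[[:space:]]*\$)"; then
            missing="$missing $opt"
        fi
    done
    echo "${missing# }"
}
```

Running `check_gtid_cnf /etc/my.cnf` on each node prints nothing when all four options are present. It only looks at the file, so the running server should still be checked for the values actually in effect.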
This is the MaxScale configuration:
```
[maxscale]
threads=4

[Splitter Service]
type=service
router=readwritesplit
servers=percona1, percona2
user=maxscale
passwd=264D375EC77998F13F4D0EC739AABAD4

[Splitter Listener]
type=listener
service=Splitter Service
protocol=MySQLClient
port=3306
socket=/tmp/ClusterMaster

[percona1]
type=server
address=192.168.90.2
port=3306
protocol=MySQLBackend

[percona2]
type=server
address=192.168.90.3
port=3306
protocol=MySQLBackend

[percona3]
type=server
address=192.168.90.4
port=3306
protocol=MySQLBackend

[Replication Monitor]
type=monitor
module=mysqlmon
servers=percona1, percona2, percona3
user=maxscale
passwd=264D375EC77998F13F4D0EC739AABAD4
monitor_interval=1000
script=/usr/local/bin/failover.sh
events=master_down

[CLI]
type=service
router=cli

[CLI Listener]
type=listener
service=CLI
protocol=maxscaled
address=localhost
port=6603
```
As you can see, the `Splitter Service` contains only the two nodes able to handle the load.
To perform the failover, in the `Replication Monitor` section, we define a script to run when the master is down.
That script calls `mysqlrpladmin` from the mysql-utilities.
In the script we also define the following line to be sure the weak slave will never become a master.
```
never_master=192.168.90.4
```
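Taken in isolation, the exclusion rule can be sketched as a small function. This is only an illustration: the name `pick_candidates` is hypothetical, it assumes the node list arrives as comma-separated `host:port` entries, and it assumes both the failed master and the `never_master` value are non-empty strings.

```shell
#!/bin/bash
# Sketch: list failover candidates from a comma-separated node list,
# skipping the failed master (initiator) and the never_master host.
# Matching is a plain substring test, so 192.168.90.4 would also
# exclude a node such as 192.168.90.40.
pick_candidates() {
    local nodelist=$1 initiator=$2 never_master=$3
    local node out=""
    local IFS=','
    for node in $nodelist; do
        case "$node" in
            *"$initiator"*|*"$never_master"*) continue ;;
        esac
        out="${out:+$out,}$node"
    done
    echo "$out"
}
```

For example, `pick_candidates "192.168.90.2:3306,192.168.90.3:3306,192.168.90.4:3306" "192.168.90.2:3306" "192.168.90.4"` leaves only `192.168.90.3:3306`, so the weak slave can never be promoted.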
When everything is set up and running, you should see something like this:
```
# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
percona1           | 192.168.90.2    |  3306 |          15 | Master, Running
percona2           | 192.168.90.3    |  3306 |        1025 | Slave, Running
percona3           | 192.168.90.4    |  3306 |           0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
```
As you can see, MaxScale discovers on its own which server is the master; this doesn’t need to be specified in the configuration.
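If you want to use that information in your own scripts, a minimal sketch is to filter the `list servers` table with `awk`. The function name `current_master` is hypothetical, and the parsing assumes the five-column layout shown above.

```shell
#!/bin/bash
# Sketch: read "maxadmin list servers" output on stdin and print the
# name of every server whose Status column contains "Master".
# Assumes the pipe-separated, five-column layout shown in this post.
current_master() {
    awk -F'|' '$5 ~ /Master/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $1)   # trim the server name
        print $1
    }'
}
```

It can be used as `maxadmin -pmariadb list servers | current_master`.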
You can also use the `mysqlrpladmin` utility to verify the cluster’s health:
```
# /usr/bin/mysqlrpladmin --rpl-user=repl:replpercona --master=manager:percona@192.168.90.2:3306 --slaves=manager:percona@192.168.90.3:3306,manager:percona@192.168.90.4:3306 health
# Checking privileges.
#
# Replication Topology Health:
+---------------+-------+---------+--------+------------+---------+
| host          | port  | role    | state  | gtid_mode  | health  |
+---------------+-------+---------+--------+------------+---------+
| 192.168.90.2  | 3306  | MASTER  | UP     | ON         | OK      |
| 192.168.90.3  | 3306  | SLAVE   | UP     | ON         | OK      |
| 192.168.90.4  | 3306  | SLAVE   | UP     | ON         | OK      |
+---------------+-------+---------+--------+------------+---------+
```
Try it with `--verbose` 😉
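For a cron job or a monitoring check, the health table can be reduced to an exit code. The following is a rough sketch, assuming the six-column, non-verbose layout shown above; the function name `unhealthy_hosts` is hypothetical.

```shell
#!/bin/bash
# Sketch: read "mysqlrpladmin ... health" output on stdin, print every
# host whose health column is not "OK", and return non-zero if any
# such host was found. Assumes the six-column table shown in this post.
unhealthy_hosts() {
    awk -F'|' '
        NF >= 7 && $2 !~ /host/ {          # data rows only, skip the header
            host = $2; health = $7
            gsub(/[ \t]/, "", host)
            gsub(/[ \t]/, "", health)
            if (health != "OK") { print host; bad = 1 }
        }
        END { exit bad + 0 }
    '
}
```

Pipe the `health` command into it, for example `mysqlrpladmin ... health | unhealthy_hosts || echo "replication problem"`, reusing the same connection options as above.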
When we run sysbench and stop the master, we see some errors due to disconnects. During the promotion of the new master, sysbench also can’t reconnect:
```
[ 20s] queue length: 0, concurrency: 0
[ 21s] threads: 8, tps: 2.00, reads: 28.00, writes: 8.00, response time: 107.61ms (95%), errors: 0.00, reconnects: 0.00
[ 21s] queue length: 0, concurrency: 0
[ 22s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 22s] queue length: 0, concurrency: 0
[ 23s] threads: 8, tps: 1.00, reads: 14.00, writes: 4.00, response time: 100.85ms (95%), errors: 0.00, reconnects: 0.00
[ 23s] queue length: 0, concurrency: 0
[ 24s] threads: 8, tps: 0.00, reads: 11.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 24s] queue length: 0, concurrency: 1
[ 25s] threads: 8, tps: 1.00, reads: 3.00, writes: 4.00, response time: 235.41ms (95%), errors: 0.00, reconnects: 0.00
[ 25s] queue length: 0, concurrency: 0
[ 26s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 26s] queue length: 0, concurrency: 0
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
[ 27s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 27s] queue length: 0, concurrency: 3
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
[ 28s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 28s] queue length: 0, concurrency: 4
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
[ 29s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 29s] queue length: 0, concurrency: 5
[ 30s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 30s] queue length: 0, concurrency: 5
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
[ 31s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 31s] queue length: 0, concurrency: 7
FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
WARNING: Both max-requests and max-time are 0, running endless test
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 8
Target transaction rate: 1/sec
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored


Threads started!

FATAL: unable to connect to MySQL server, aborting...
FATAL: error 1045: failed to create new session
PANIC: unprotected error in call to Lua API (Failed to connect to the database)
WARNING: Both max-requests and max-time are 0, running endless test
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 8
Target transaction rate: 1/sec
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored


Threads started!

[ 1s] threads: 8, tps: 1.99, reads: 27.93, writes: 7.98, response time: 211.49ms (95%), errors: 0.00, reconnects: 0.00
[ 1s] queue length: 0, concurrency: 0
[ 2s] threads: 8, tps: 1.00, reads: 14.00, writes: 4.00, response time: 51.01ms (95%), errors: 0.00, reconnects: 0.00
[ 2s] queue length: 0, concurrency: 0
[ 3s] threads: 8, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00
[ 3s] queue length: 0, concurrency: 0
[ 4s] threads: 8, tps: 1.00, reads: 13.99, writes: 4.00, response time: 80.28ms (95%), errors: 0.00, reconnects: 0.00
```
It took 8 seconds to fail over automatically.
Then we can see the status of the servers:
```
# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
percona1           | 192.168.90.2    |  3306 |          17 | Down
percona2           | 192.168.90.3    |  3306 |        1025 | Master, Running
percona3           | 192.168.90.4    |  3306 |           0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
```
If we restart `percona1`, we now see:
```
# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
percona1           | 192.168.90.2    |  3306 |          17 | Running
percona2           | 192.168.90.3    |  3306 |        1025 | Master, Running
percona3           | 192.168.90.4    |  3306 |           0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
```
To add the node back into the asynchronous replication as a slave, we need to use another MySQL utility, `mysqlreplicate`:
```
# mysqlreplicate --master=manager:percona@192.168.90.3 --slave=manager:percona@192.168.90.2 --rpl-user=repl:replpercona
# master on 192.168.90.3: ... connected.
# slave on 192.168.90.2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
```
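Since the master changes after each failover, it can help to build this command from variables rather than typing it by hand. Here is a minimal sketch that only prints the command so it can be reviewed before running; the function name `rejoin_cmd` is hypothetical, and it uses only the `--master`, `--slave` and `--rpl-user` options shown above.

```shell
#!/bin/bash
# Sketch: print (but do not run) the mysqlreplicate command that
# re-attaches a recovered node to the current master.
# Arguments: admin user:password, replication user:password,
#            current master host, recovered node host.
rejoin_cmd() {
    local admin=$1 repl=$2 master=$3 slave=$4
    printf 'mysqlreplicate --master=%s@%s --slave=%s@%s --rpl-user=%s\n' \
        "$admin" "$master" "$admin" "$slave" "$repl"
}
```

With the credentials from `failover.sh`, `rejoin_cmd manager:percona repl:replpercona 192.168.90.3 192.168.90.2` prints the command for re-attaching the old master to the new one.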
This is the source of `failover.sh`:
```bash
#!/bin/bash
# failover.sh
# wrapper script to mysqlrpladmin

# user:password pair, must have administrative privileges.
user=manager:percona
# user:password pair, must have REPLICATION SLAVE privileges.
repluser=repl:replpercona
never_master=192.168.90.4

# parse the --event, --initiator and --nodelist options
ARGS=$(getopt -o '' --long 'event:,initiator:,nodelist:' -- "$@")
eval set -- "$ARGS"

while true; do
    case "$1" in
        --event)
            shift;
            event=$1
            shift;
        ;;
        --initiator)
            shift;
            initiator=$1
            shift;
        ;;
        --nodelist)
            shift;
            nodelist=$1
            shift;
        ;;
        --)
            shift;
            break;
        ;;
    esac
done

# find the candidates
# (split the comma-separated node list into one node per line)
for i in $(echo $nodelist | sed 's/,/\n/g')
do
  if [[ "$i" =~ "$never_master" ]]
  then
     # do nothing
     echo nothing >/dev/null
  else
     if [[ "$i" =~ "$initiator" ]]
     then
        # do nothing
        echo nothing >/dev/null
     else
        candidates="$candidates,${user}@${i}"
     fi
  fi

  # every node except the failed master stays in the slave list
  if [[ "$i" =~ "$initiator" ]]
  then
     # do nothing
     echo nothing >/dev/null
  else
     slaves="$slaves,${user}@${i}"
  fi
done

# ${var#?} strips the leading comma from each list
cmd="/usr/bin/mysqlrpladmin --rpl-user=$repluser --slaves=${slaves#?} --candidates=${candidates#?} failover"
# uncomment following line for debug
#echo $cmd >> /tmp/fred
eval $cmd
```
In the next post, we will focus on the monitoring module used in this configuration.
for master failover don’t forget orchestrator
Hi Frederic,
This is a great blog.
I want to know if I can use the same mysql utility and configuration for PXC instead of MariaDB
Thx a lot
What if percona1 (master & MaxScale) fails?
We would need another node to deploy MaxScale, which means we need at least 4 nodes to achieve true high availability?
Hi!
Is this working for you with the new version? Everything seems to work, but MaxScale doesn’t detect the new topology with the new master. I also tried to set the new master manually through maxadmin, but it gives this after a while: “lost_master. [Master, Running] -> [Slave, Running]”
Do you have any idea about this?