This blog post discusses ProxySQL and MHA integration, and how the two work together.
MHA (Master High Availability Manager and tools for MySQL) is almost fully integrated with ProxySQL. This means you can count on MHA's standard features to manage the failover, and on ProxySQL to manage the traffic and shift it from one server to another.
This is one of the main differences between using MHA with a VIP and using MHA with ProxySQL: with MHA/ProxySQL there is no need to move IPs around or redefine DNS entries.
The following is an example of an MHA configuration file for use with ProxySQL:
[server default]
user=mha
password=mha
ssh_user=root
repl_password=replica
manager_log=/tmp/mha.log
manager_workdir=/tmp
remote_workdir=/tmp
master_binlog_dir=/opt/mysql_instances/mha1/logs
client_bindir=/opt/mysql_templates/mysql-57/bin
client_libdir=/opt/mysql_templates/mysql-57/lib
master_ip_failover_script=/opt/tools/mha/mha4mysql-manager/samples/scripts/master_ip_failover
master_ip_online_change_script=/opt/tools/mha/mha4mysql-manager/samples/scripts/master_ip_online_change
log_level=debug

[server1]
hostname=mha1r
ip=192.168.1.104
candidate_master=1

[server2]
hostname=mha2r
ip=192.168.1.107
candidate_master=1

[server3]
hostname=mha3r
ip=192.168.1.111
candidate_master=1

[server4]
hostname=mha4r
ip=192.168.1.109
no_master=1
NOTE: Be sure to comment out the "FIX ME" lines in the sample scripts.
After that, just install MHA as you normally would.
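Before moving on, it is usually worth validating the MHA setup itself. A minimal sketch, assuming the configuration shown above is saved as /etc/mha.cnf (the same path used later in this post):

# Verify SSH connectivity between all the nodes defined in the MHA configuration
masterha_check_ssh --conf=/etc/mha.cnf

# Verify the replication topology and the failover scripts referenced in the configuration
masterha_check_repl --conf=/etc/mha.cnf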
In ProxySQL, be sure to have all the MHA users and servers set.
When using ProxySQL with standard replication, it's important to set additional privileges for the ProxySQL monitor user: it must also have the REPLICATION CLIENT privilege, or it will fail to check the slave lag. The servers MUST also have a value defined for the max_replication_lag attribute, or the check will be ignored.
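As a sketch of what that means on the MySQL side, assuming the monitor user is the default one called monitor (the real name and password are whatever mysql-monitor_username and mysql-monitor_password are set to in your ProxySQL installation):

-- On each MySQL backend: the monitor user needs REPLICATION CLIENT
-- so ProxySQL can run SHOW SLAVE STATUS and measure the replication lag.
-- User name and password here are only examples.
GRANT USAGE, REPLICATION CLIENT ON *.* TO 'monitor'@'%' IDENTIFIED BY 'monitor';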
As a reminder:
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.104',600,3306,1000,0);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.104',601,3306,1000,10);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.107',601,3306,1000,10);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.111',601,3306,1000,10);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.109',601,3306,1000,10);
INSERT INTO mysql_replication_hostgroups VALUES (600,601);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

INSERT INTO mysql_query_rules (username,destination_hostgroup,active) VALUES ('mha_W',600,1);
INSERT INTO mysql_query_rules (username,destination_hostgroup,active) VALUES ('mha_R',601,1);
INSERT INTO mysql_query_rules (username,destination_hostgroup,active,retries,match_digest) VALUES ('mha_RW',600,1,3,'^SELECT.*FOR UPDATE');
INSERT INTO mysql_query_rules (username,destination_hostgroup,active,retries,match_digest) VALUES ('mha_RW',601,1,3,'^SELECT');
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema) VALUES ('mha_W','test',1,600,'test_mha');
INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema) VALUES ('mha_R','test',1,601,'test_mha');
INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema) VALUES ('mha_RW','test',1,600,'test_mha');
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
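Once everything is loaded to runtime, a quick sanity check (not part of the original setup, just a sketch against ProxySQL's standard admin tables) confirms what ProxySQL is actually using:

-- From the ProxySQL admin interface (port 6032 by default)
SELECT hostgroup_id, hostname, port, status, weight, max_replication_lag
FROM runtime_mysql_servers
ORDER BY hostgroup_id, hostname;

-- The writer/reader hostgroup pair driving the read_only based moves
SELECT * FROM runtime_mysql_replication_hostgroups;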
OK, now that all is ready, let’s rock’n’roll!
Controlled fail-over
First of all, masterha_manager should not be running, or you will get an error.
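If in doubt, here is a quick sketch using the standard MHA helper commands to check and, if needed, stop the manager:

# Is the MHA manager currently monitoring this cluster?
masterha_check_status --conf=/etc/mha.cnf

# Stop it so the manual switch can be performed
masterha_stop --conf=/etc/mha.cnf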
Now let’s start some traffic:
Write:
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.50 --mysql-port=3311 --mysql-user=mha_RW --mysql-password=test --mysql-db=mha_test --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=5000 --max-requests=0 --max-time=900 --oltp-point-selects=5 --oltp-read-only=off --oltp-dist-type=uniform --oltp-reconnect-mode=transaction --oltp-skip-trx=off --num-threads=10 --report-interval=10 --mysql-ignore-errors=all run

Read only:
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.50 --mysql-port=3311 --mysql-user=mha_RW --mysql-password=test --mysql-db=mha_test --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=5000 --max-requests=0 --max-time=900 --oltp-point-selects=5 --oltp-read-only=on --num-threads=10 --oltp-reconnect-mode=query --oltp-skip-trx=on --report-interval=10 --mysql-ignore-errors=all run
Let it run for a bit, then check:
mysql> select * from stats_mysql_connection_pool where hostgroup between 600 and 601 order by hostgroup,srv_host desc;
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host      | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 600       | 192.168.1.104 | 3306     | ONLINE | 10       | 0        | 20     | 0       | 551256  | 44307633        | 0               | 285        | <--- current Master
| 601       | 192.168.1.111 | 3306     | ONLINE | 5        | 3        | 11     | 0       | 1053685 | 52798199        | 4245883580      | 1133       |
| 601       | 192.168.1.109 | 3306     | ONLINE | 3        | 5        | 10     | 0       | 1006880 | 50473746        | 4052079567      | 369        |
| 601       | 192.168.1.107 | 3306     | ONLINE | 3        | 5        | 13     | 0       | 1040524 | 52102581        | 4178965796      | 604        |
| 601       | 192.168.1.104 | 3306     | ONLINE | 7        | 1        | 16     | 0       | 987548  | 49458526        | 3954722258      | 285        |
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
Now perform the failover. To do this, instruct MHA to do a switch, and to set the OLD master as a new slave:
masterha_master_switch --master_state=alive --conf=/etc/mha.cnf --orig_master_is_new_slave --interactive=0 --running_updates_limit=0
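The 2.7 seconds reported later in this post is the kind of figure you get by simply wrapping the command in time; a sketch of the same invocation:

# Measure how long the whole controlled switch takes
time masterha_master_switch --master_state=alive --conf=/etc/mha.cnf \
     --orig_master_is_new_slave --interactive=0 --running_updates_limit=0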
Check what happened:
[ 160s] threads: 10, tps: 354.50, reads: 3191.10, writes: 1418.50, response time: 48.96ms (95%), errors: 0.00, reconnects: 0.00
[ 170s] threads: 10, tps: 322.50, reads: 2901.98, writes: 1289.89, response time: 55.45ms (95%), errors: 0.00, reconnects: 0.00
[ 180s] threads: 10, tps: 304.60, reads: 2743.12, writes: 1219.91, response time: 58.09ms (95%), errors: 0.10, reconnects: 0.00 <--- moment of the switch
[ 190s] threads: 10, tps: 330.40, reads: 2973.40, writes: 1321.00, response time: 50.52ms (95%), errors: 0.00, reconnects: 0.00
[ 200s] threads: 10, tps: 304.20, reads: 2745.60, writes: 1217.60, response time: 58.40ms (95%), errors: 0.00, reconnects: 1.00
[ 210s] threads: 10, tps: 353.80, reads: 3183.80, writes: 1414.40, response time: 48.15ms (95%), errors: 0.00, reconnects: 0.00
Check ProxySQL:
mysql> select * from stats_mysql_connection_pool where hostgroup between 600 and 601 order by hostgroup,srv_host desc;
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host      | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 600       | 192.168.1.107 | 3306     | ONLINE | 10       | 0        | 10     | 0       | 123457  | 9922280         | 0               | 658        | <--- new master
| 601       | 192.168.1.111 | 3306     | ONLINE | 2        | 6        | 14     | 0       | 1848302 | 91513537        | 7590137770      | 1044       |
| 601       | 192.168.1.109 | 3306     | ONLINE | 5        | 3        | 12     | 0       | 1688789 | 83717258        | 6927354689      | 220        |
| 601       | 192.168.1.107 | 3306     | ONLINE | 3        | 5        | 13     | 0       | 1834415 | 90789405        | 7524861792      | 658        |
| 601       | 192.168.1.104 | 3306     | ONLINE | 6        | 2        | 24     | 0       | 1667252 | 82509124        | 6789724589      | 265        |
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
In this case, the slaves were not lagging behind the master, and the switch happened quite fast.
We can see that the WRITE operations, which are normally the issue (given the need to move a VIP around or change name resolution), suffered only a brief hiccup.
Read operations were not affected at all. Nice, eh?
Do you know how long it takes to do a switch under these conditions?

real 0m2.710s

Yes, 2.7 seconds.
This is more evidence that most of the time spent in an MHA-based switch is consumed by the need to redirect traffic from A to B over the network.
Crash fail-over
What happens if, instead of a controlled switch, we have to cover a real failover?
First of all, let’s start masterha_manager:
nohup masterha_manager --conf=/etc/mha.cnf --wait_on_monitor_error=60 --wait_on_failover_error=60 >> /tmp/mha.log 2>&1
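To confirm the manager is actually up and monitoring before breaking anything, a quick sketch reusing the commands and paths already shown in this post:

# Confirm the manager reached its monitoring phase
masterha_check_status --conf=/etc/mha.cnf

# Or simply follow the log it writes to
tail -f /tmp/mha.log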
Then let's start the load again. Finally, go to the MySQL node hosting the current master, xxx.xxx.xxx.107:
ps aux | grep mysql
mysql 18755 0.0 0.0 113248 1608 pts/0 S Aug28 0:00 /bin/sh /opt/mysql_templates/mysql-57/bin/mysqld_safe --defaults-file=/opt/mysql_instances/mha1/my.cnf
mysql 21975 3.2 30.4 4398248 941748 pts/0 Sl Aug28 93:21 /opt/mysql_templates/mysql-57/bin/mysqld --defaults-file=/opt/mysql_instances/mha1/my.cnf --basedir=/opt/mysql_templates/mysql-57/ --datadir=/opt/mysql_instances/mha1/data --plugin-dir=/opt/mysql_templates/mysql-57//lib/plugin --log-error=/opt/mysql_instances/mha1/mysql-3306.err --open-files-limit=65536 --pid-file=/opt/mysql_instances/mha1/mysql.pid --socket=/opt/mysql_instances/mha1/mysql.sock --port=3306

And kill the MySQL process (both mysqld and mysqld_safe):

kill -9 21975 18755
As before, check what happened on the application side:
[ 80s] threads: 4, tps: 213.20, reads: 1919.10, writes: 853.20, response time: 28.74ms (95%), errors: 0.00, reconnects: 0.00
[ 90s] threads: 4, tps: 211.30, reads: 1901.80, writes: 844.70, response time: 28.63ms (95%), errors: 0.00, reconnects: 0.00
[ 100s] threads: 4, tps: 211.90, reads: 1906.40, writes: 847.90, response time: 28.60ms (95%), errors: 0.00, reconnects: 0.00
[ 110s] threads: 4, tps: 211.10, reads: 1903.10, writes: 845.30, response time: 29.27ms (95%), errors: 0.30, reconnects: 0.00 <-- issue starts
[ 120s] threads: 4, tps: 198.30, reads: 1785.10, writes: 792.40, response time: 28.43ms (95%), errors: 0.00, reconnects: 0.00
[ 130s] threads: 4, tps: 0.00, reads: 0.60, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.40 <-- total stop in write
[ 140s] threads: 4, tps: 173.80, reads: 1567.80, writes: 696.30, response time: 34.89ms (95%), errors: 0.40, reconnects: 0.00 <-- writes restart
[ 150s] threads: 4, tps: 195.20, reads: 1755.10, writes: 780.50, response time: 33.98ms (95%), errors: 0.00, reconnects: 0.00
[ 160s] threads: 4, tps: 196.90, reads: 1771.30, writes: 786.80, response time: 33.49ms (95%), errors: 0.00, reconnects: 0.00
[ 170s] threads: 4, tps: 193.70, reads: 1745.40, writes: 775.40, response time: 34.39ms (95%), errors: 0.00, reconnects: 0.00
[ 180s] threads: 4, tps: 191.60, reads: 1723.70, writes: 766.20, response time: 35.82ms (95%), errors: 0.00, reconnects: 0.00
So it takes ~10 seconds to perform the failover.
To understand this better, let's see what happened in MHA-land:
Tue Aug 30 09:33:33 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 30 09:33:33 2016 - [info] Reading application default configuration from /etc/mha.cnf..
... Read conf and start
Tue Aug 30 09:33:47 2016 - [debug] Trying to get advisory lock..
Tue Aug 30 09:33:47 2016 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
... Wait for errors
Tue Aug 30 09:34:47 2016 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away) <--- Error time
Tue Aug 30 09:34:56 2016 - [warning] Connection failed 4 time(s)..                                     <--- Finally MHA decides to do something
Tue Aug 30 09:34:56 2016 - [warning] Master is not reachable from health checker!
Tue Aug 30 09:34:56 2016 - [warning] Master mha2r(192.168.1.107:3306) is not reachable!
Tue Aug 30 09:34:56 2016 - [warning] SSH is reachable.
Tue Aug 30 09:34:58 2016 - [info] Master failover to mha1r(192.168.1.104:3306) completed successfully. <--- end of the failover
MHA sees the server failing at xx:47, but because of the retries and check validation it only fully acknowledges the downtime at xx:56 (roughly nine seconds later).
The failover itself then takes only ~2 seconds (again). Because no movable IPs or DNS changes were involved, the operations were fast. This holds as long as the slaves already have the binary log events; it's a different story if MHA also has to retrieve missing data from the master's binary logs and apply it to MySQL.
As you can see, ProxySQL helps reduce the time needed in this scenario as well, by completely skipping the network-related operations, which are the ones that cause the most trouble in these cases.
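On the ProxySQL side, the switch is driven by its read_only monitoring of the hostgroup pair defined in mysql_replication_hostgroups. As a sketch (using ProxySQL's standard admin and monitor tables), you can check how often it polls the backends and when it detected the new master:

-- How often ProxySQL checks read_only on the backends (values in milliseconds)
SELECT * FROM global_variables
WHERE variable_name IN ('mysql-monitor_read_only_interval','mysql-monitor_read_only_timeout');

-- History of the read_only checks: shows when the new master was seen with read_only=0
SELECT * FROM monitor.mysql_server_read_only_log
ORDER BY time_start_us DESC LIMIT 10;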
Is there a reason to use mysql_replication_hostgroups? This table implies that some hosts are read-only, in your case the 601 hostgroup. On failover, will MHA change read_only on the old master to 1 and on the new master to 0?
Hi Alessandro,
yes, that is correct. Here we are talking about standard MySQL replication with an MHA controller. As such, we have a master with READ_ONLY=0 and slaves with READ_ONLY=1. In case of failover, MHA will change the READ_ONLY variable on the newly elected master to 0.
ProxySQL will "see" that event and will copy the entry to the 600 hostgroup, re-enabling writes against the newly elected master.
Hope this clarifies things.
Marco, good article, as always!
Just calling attention to a small thing: when you define the default schema to hold the sysbench tables in ProxySQL, you used "test_mha", but when you run sysbench, you're running it against "mha_test":
#: proxysql default_schema for mha_* users
insert into mysql_users (username,password,active,default_hostgroup,default_schema) values ('mha_W','test',1,10,'test_mha');
#: sysbench
sysbench … --mysql-db=mha_test --db-driver=mysql … run
If it's not actually a problem, pardon me. Cheers!
Hey, I'm using Orchestrator for failover. It also sets read_only=OFF on the new master, but I didn't see any hostgroup changes. Did you use any particular parameters in ProxySQL?
Are you using the replication hostgroups table?
The best way to debug your problem is to share the contents of your mysql_servers and mysql_replication_hostgroups tables.
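For example, a sketch of the queries to run from the ProxySQL admin interface and share:

-- Configured and runtime view of the backend servers
SELECT * FROM mysql_servers;
SELECT * FROM runtime_mysql_servers;

-- The writer/reader hostgroup pair driving the read_only based moves
SELECT * FROM mysql_replication_hostgroups;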
We have a configuration with one master server and two slaves, and MHA installed on top of it. The MHA manager is installed on, and operated from, a dedicated server. Our objective is to set up MHA in such a way that when the master host (the server where the master database is running) goes down, it performs a failover. With our present infrastructure, database failover happens whenever the database running on the master host is down, but it does not happen when the host itself is down or unreachable.
As we are using virtual machines and not physical servers, the power manager script is not deployed in MHA.
Will we be able to meet our requirement if we integrate ProxySQL with MHA?
We understand that with ProxySQL there is no need to move an IP from one server to another and that ProxySQL routes the write connections, but will a database failover still take place when the master host is down?