Just about a month ago, Pavel Ivanov released Ripple under the Apache-2.0 license. Ripple is a MySQL binlog server: software which receives binary logs from MySQL or MariaDB servers and delivers them to another MySQL or MariaDB server. In practice, this is an intermediary master which does not store any data except the binary logs themselves, and does not apply events. This saves a lot of resources on the server, which acts only as a middleman between the master and its actual slave(s).
An intermediary server that keeps binary logs only and does no other job is a prevalent use case. It removes IO (binlog read) and network (binlog retrieval via network) load from the actual master and frees its resources for updates. The intermediary master, which does no other work, distributes binary logs to the slaves connected to it. This way you can attach more slaves to such a server without affecting the application that runs the updates.
Currently, users exploit the Blackhole storage engine to emulate similar behavior. But Blackhole is just a workaround: it still executes all the events in the binary logs, requires a valid MySQL installation, and has a lot of issues. Such a pain!
Therefore, a new product which can do the same job and is released under an open source license is something worth trying.
A simple test
For this blog, I did a simple test. First, I installed Ripple as described in the README file. The instructions are pretty straightforward, and I successfully built the server on my Ubuntu 18.04.2 LTS laptop. The README suggests installing libmariadbclient-dev, so I replaced libmysqlclient-dev, which I already had on my machine, with it. Probably this was not needed, but since the tool claims to support both MySQL and MariaDB binary log formats, I preferred to install the MariaDB client.
There is no manual or usage instructions. However, the tool supports the -help option, and its output is, again, straightforward.
The server can be started with options:
    $ ./bazel-bin/rippled -ripple_datadir=./data -ripple_master_address=127.0.0.1 -ripple_master_port=13001 -ripple_master_user=root -ripple_server_ports=15000
Where:
- -ripple_datadir : datadir where Ripple stores binary logs
- -ripple_master_address : master host
- -ripple_master_port : master port
- -ripple_master_user : replication user
- -ripple_server_ports : comma-separated list of ports on which Ripple will listen
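To keep the options above in one place when scripting a test setup, the command line can be assembled programmatically. This is a minimal sketch in Python: the helper name `rippled_command` is hypothetical, and it only uses the options listed above with the values from this post.

```python
import shlex


def rippled_command(binary, datadir, master_address, master_port,
                    master_user, server_ports):
    """Build the rippled argument list (hypothetical helper, not part of Ripple)."""
    # -ripple_server_ports takes a comma-separated list of ports.
    ports = ",".join(str(port) for port in server_ports)
    return [
        binary,
        f"-ripple_datadir={datadir}",
        f"-ripple_master_address={master_address}",
        f"-ripple_master_port={master_port}",
        f"-ripple_master_user={master_user}",
        f"-ripple_server_ports={ports}",
    ]


# Values taken from the command shown in this post.
cmd = rippled_command("./bazel-bin/rippled", "./data",
                      "127.0.0.1", 13001, "root", [15000])
# Quote each argument so the line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.Popen` as is, which avoids shell quoting problems if the data directory path contains spaces.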
I did not find an option for securing binary log retrieval. The slave can connect to the Ripple server with any credentials. Keep this in mind when deploying Ripple in production.
Now, let’s run a simple test. I have two servers, both running on localhost: one on port 13001 (master) and another on port 13002 (slave). The command line which I used to start rippled points to the master. Binary logs are stored in the data directory:
    $ ls -l data/
    total 14920
    -rw-rw-r-- 1 sveta sveta 15251024 Mar 6 01:43 binlog.000000
    -rw-rw-r-- 1 sveta sveta 71 Mar 6 00:50 binlog.index
I pointed the slave to the Ripple server with the command:
    mysql> change master to master_host='127.0.0.1',master_port=15000, master_user='ripple';
    Query OK, 0 rows affected, 1 warning (0.02 sec)
Then started the slave.
On the master, I created the database sbtest and ran the sysbench oltp_read_write.lua test for a single table. After some time, I stopped the load and checked the content of the table on the master and the slave:
    master> select count(*) from sbtest1;
    +----------+
    | count(*) |
    +----------+
    |    10000 |
    +----------+
    1 row in set (0.08 sec)

    master> checksum table sbtest1;
    +----------------+------------+
    | Table          | Checksum   |
    +----------------+------------+
    | sbtest.sbtest1 | 4162333567 |
    +----------------+------------+
    1 row in set (0.11 sec)

    slave> select count(*) from sbtest1;
    +----------+
    | count(*) |
    +----------+
    |    10000 |
    +----------+
    1 row in set (0.40 sec)

    slave> checksum table sbtest1;
    +----------------+------------+
    | Table          | Checksum   |
    +----------------+------------+
    | sbtest.sbtest1 | 1797645970 |
    +----------------+------------+
    1 row in set (0.13 sec)

    slave> checksum table sbtest1;
    +----------------+------------+
    | Table          | Checksum   |
    +----------------+------------+
    | sbtest.sbtest1 | 4162333567 |
    +----------------+------------+
    1 row in set (0.10 sec)
It took some time for the slave to catch up, but everything was applied successfully.
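The manual check above (run the checksum on the slave again until it matches the master) can be automated with a small polling loop. The sketch below is only an illustration: `wait_for_matching_checksum` is a hypothetical helper, and the two callables stand in for whatever code runs `checksum table sbtest1` against each server. It assumes, as in this test, that the load on the master has already stopped; otherwise the master checksum keeps changing.

```python
import time


def wait_for_matching_checksum(master_checksum, slave_checksum,
                               timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll until both callables return the same checksum or the timeout expires.

    master_checksum and slave_checksum are hypothetical callables that return
    the current CHECKSUM TABLE value from the master and the slave.
    """
    deadline = time.monotonic() + timeout
    while True:
        if master_checksum() == slave_checksum():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Stand-in values from the output above: the slave first reports a stale
# checksum, then the same checksum as the master.
slave_values = iter([1797645970, 4162333567])
caught_up = wait_for_matching_checksum(
    lambda: 4162333567,
    lambda: next(slave_values),
    sleep=lambda seconds: None,  # skip real sleeping in this demonstration
)
print(caught_up)
```

In a real setup the lambdas would be replaced by functions that query the two servers through a MySQL client library of your choice.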
Ripple has nice verbose logging:
    $ ./bazel-bin/rippled -ripple_datadir=./data -ripple_master_address=127.0.0.1 -ripple_master_port=13001 -ripple_master_user=root -ripple_server_ports=15000
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I0306 15:57:13.641451 27908 rippled.cc:48] InitPlugins
    I0306 15:57:13.642007 27908 rippled.cc:60] Setup
    I0306 15:57:13.642937 27908 binlog.cc:307] Starting binlog recovery
    I0306 15:57:13.644090 27908 binlog.cc:350] Scanning binlog file: binlog.000000
    I0306 15:57:13.872016 27908 binlog.cc:417] Binlog recovery complete binlog file: binlog.000000, offset: 15251088, gtid: 6ddac507-3f90-11e9-8ee9-00163e000000:0-0-7192
    I0306 15:57:13.872050 27908 rippled.cc:106] Recovered binlog
    I0306 15:57:13.873811 27908 mysql_server_port_tcpip.cc:150] Listen on host: localhost, port: 15000
    I0306 15:57:13.874282 27908 rippled.cc:62] Start
    I0306 15:57:13.874511 27910 mysql_master_session.cc:181] Master session starting
    I0306 15:57:13.882601 27910 mysql_client_connection.cc:148] connected to host: 127.0.0.1, port: 13001
    I0306 15:57:13.895349 27910 mysql_master_session.cc:137] Connected to host: 127.0.0.1, port: 13001, server_id: 1, server_name:
    W0306 15:57:13.898556 27910 mysql_master_session.cc:197] master does not support semi sync
    I0306 15:57:13.898583 27910 mysql_master_session.cc:206] start replicating from '6ddac507-3f90-11e9-8ee9-00163e000000:0-0-7192'
    I0306 15:57:13.899031 27910 mysql_master_session.cc:229] Master session entering main loop
    I0306 15:57:13.899550 27910 binlog.cc:626] Update binlog position to end_pos: binlog.000000:15251152, gtid: 0-0-7192
    I0306 15:57:13.899572 27910 binlog.cc:616] Skip writing event [ Previous_gtids len = 67 ]
    I0306 15:57:13.899585 27910 binlog.cc:626] Update binlog position to end_pos: binlog.000000:15251152, gtid: 0-0-7192
    ...
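The log lines follow the usual Google logging (glog) prefix: severity letter, month and day, time with microseconds, thread id, source file and line, then the message. If you want to filter warnings or track binlog positions from a script, a small parser is enough. The following is a sketch, with the regular expression and the `parse_glog_line` name being my own; it is only checked against the lines shown in this post.

```python
import re

# Severity letter, MMDD, time with microseconds, thread id, file:line] message.
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\]\s?(?P<message>.*)$"
)


def parse_glog_line(line):
    """Return a dict of fields for a glog-style line, or None if it does not match."""
    match = GLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    record["thread"] = int(record["thread"])
    record["line"] = int(record["line"])
    return record


sample = ("W0306 15:57:13.898556 27910 mysql_master_session.cc:197] "
          "master does not support semi sync")
record = parse_glog_line(sample)
print(record["level"], record["file"], record["line"], record["message"])
```

Lines without the prefix, such as the initial `WARNING: Logging before InitGoogleLogging()` notice, return `None` and can be skipped.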
Conclusion
It may be good to run more tests before using Ripple in production, and to explore its other options, but at first glance it seems to be a very nice and useful product.
Interesting. We also use the Blackhole storage engine, but this looks great; we will explore it further.
Thanks for sharing.
I added -ripple_master_password some time ago (https://github.com/google/mysql-ripple/pull/5). I have open pull requests to add support for authenticating clients with mysql_native_password and to handle MySQL 8.0 clients, which need AuthSwitchRequest/AuthSwitchResponse because they stubbornly try to use caching_sha2_password.
Does anyone know if Ripple manages the number of expire_logs_days for each M/S relationship?
I am very interested in this, not so much for relaying to many, many slaves, but to see if it will serve as a temporary stopgap for very large, multi-TB databases, as in 30TB, where the 3TB binlog space is not large enough to store enough days of logs to SAFELY do backups. The binlog data transfer on these databases is > 41GB per hour, so to date XTRABACKUP is unable to back up the data; it continuously errors out on logs wrapping. In the past we increased log file sizes from 2GB to 6GB, and that usually resolved these issues, but recently not even 41GB log file sizes were enough.
In summary, we are now LFTP-ing one of the database's offline slaves to rebuild it, so the expire_logs_days issue risks running short of binlog space while we risk an estimated 3.5 day copy with 6 days of logs.
Yes I would love to redesign these databases, sharding to multiple smaller nodes, but management has been unwilling to improve its design. Any insights appreciated!
It has two options:
-ripple_purge_expire_logs_days (Purge binlog files older than this days
(0=disable). This is evaluated independently of ripple_purge_keep_size.)
type: int32 default: 0
-ripple_purge_logs_keep_size (Purge binlog files if total size exeeds this
value (0=disable). This is evaluated independently of
ripple_purge_expire_logs_days.) type: uint64 default: 0
I am not sure if this is what you need. But if you have multiple masters, you can use multiple rippled instances for better purge tuning.
Looking forward to binary packages of this in the Percona yum repository, it would be useful for us!
LeFred talks about binary packages here:
https://lefred.be/content/ripple-binlog-server-for-mysql/
Does Ripple have functionality similar to mysqld's, i.e. filtering tables (--replicate-* options)?
I did not find filters among its options.