Large replication topologies are quite common nowadays, and this kind of architecture often requires a quick method to rebuild a replica from another server.
The Clone Plugin, available since MySQL 8.0.17, is a great feature that allows cloning databases out of the box. It is easy to rebuild a replica or to add new nodes to a cluster using the plugin. Before the release of the plugin, the best open-source alternative was Percona XtraBackup for MySQL Databases.
In this blog post, we compare both alternatives for cloning purposes. If you need to perform backups, Percona XtraBackup is a better tool as it supports compression and incremental backups, among other features not provided by the plugin. The plugin supports compression only for network transmission, not for storage.
But one of the plugin’s strong points is simplicity. Once installed and configured, cloning a database is straightforward. Just issuing a command from the destination database is enough.
Percona XtraBackup, on the other hand, is a more complex tool. The cloning process involves several stages: backup, stream, write, and prepare. Most of these stages can run in parallel: we can stream the backup to the new server using netcat and, at the same time, write it into the destination directory. The only stage that must run sequentially, after the others finish, is the last one: prepare.
Test Characteristics
We used sysbench to create 200 tables of 124 MB each, for a total of about 24 GB. Both the source and replica virtual machines have 4 cores, 8 GB of RAM, and 60 GB of storage. We created the disks on the same datastore.
During the tests, we did not generate additional operations on the database. We measured only the clone process, reducing the benchmark complexity. Otherwise, we would have to take into consideration things like application response time, or the number of transactions executed. This is beyond the scope of this assessment.
We tested different combinations of clone and Percona XtraBackup operations. For XtraBackup, we tested 1 to 4 threads, with and without compression. In the case of compression, we allocated the same number of threads to compression and decompression. For the clone plugin, we tested auto (which lets the server decide how many threads will perform the clone) and 1 to 4 threads. We also tested with and without compression. Finally, we executed all the tests using three different network limits: 500 Mbps, 1000 Mbps, and 4000 Mbps. This makes a total of 54 tests, executed 12+1 times each.
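The total of 54 can be checked by enumerating the combinations described above. The following is a minimal sketch; the names `THREAD_OPTIONS`, `COMPRESSION`, `BANDWIDTHS_MBPS`, and `build_matrix` are ours for illustration and are not part of the original test harness.

```python
from itertools import product

# Thread settings per tool, as described in the text.
# "auto" stands for the clone plugin choosing the thread count itself.
THREAD_OPTIONS = {
    "xtrabackup": [1, 2, 3, 4],
    "clone": ["auto", 1, 2, 3, 4],
}
COMPRESSION = [False, True]
BANDWIDTHS_MBPS = [500, 1000, 4000]


def build_matrix():
    """Return every (tool, threads, compressed, mbps) combination."""
    return [
        (tool, threads, compressed, mbps)
        for tool, options in THREAD_OPTIONS.items()
        for threads, compressed, mbps in product(
            options, COMPRESSION, BANDWIDTHS_MBPS
        )
    ]


if __name__ == "__main__":
    print(len(build_matrix()))  # 24 XtraBackup + 30 clone = 54
```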
All times are in seconds. In the graphs below, lower values are better.
Method
Clone
Of the parameters required to operate the clone plugin, we set the following on the recipient server:
clone_max_concurrency=<maximum number of threads>
Defines the maximum number of threads used for a remote cloning operation with autotune enabled. Otherwise, this is the exact number of threads that remote cloning uses. The default value is 16.
clone_autotune_concurrency
If enabled, the clone operation uses up to clone_max_concurrency threads.
clone_enable_compression
If enabled, the remote clone operation will use compression.
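To illustrate how these settings combine, here is a minimal sketch that builds the SQL statements one might run on the recipient for a single test case. It is not the script used for the benchmark. The function name `clone_statements` and the host, user, and password values are hypothetical. `clone_valid_donor_list` is not mentioned above, but the recipient needs it before `CLONE INSTANCE` will accept a donor. The strings are built by plain interpolation, so this is only suitable for trusted input.

```python
def clone_statements(donor_host, donor_port, clone_user, password,
                     threads=None, compression=False):
    """Build the SQL to run on the recipient for one remote clone.

    threads=None keeps autotune on (the "auto" case in the tests);
    an integer disables autotune so exactly that many threads are used.
    """
    autotune = threads is None
    statements = [
        f"SET GLOBAL clone_autotune_concurrency = {'ON' if autotune else 'OFF'}",
        f"SET GLOBAL clone_enable_compression = {'ON' if compression else 'OFF'}",
        # Assumption: the donor must be allowed explicitly on the recipient.
        f"SET GLOBAL clone_valid_donor_list = '{donor_host}:{donor_port}'",
    ]
    if not autotune:
        statements.insert(
            1, f"SET GLOBAL clone_max_concurrency = {int(threads)}"
        )
    statements.append(
        f"CLONE INSTANCE FROM '{clone_user}'@'{donor_host}':{donor_port} "
        f"IDENTIFIED BY '{password}'"
    )
    return statements


if __name__ == "__main__":
    # Hypothetical donor and credentials, two threads, no compression.
    for sql in clone_statements("donor.example", 3306, "clone_user",
                                "secret", threads=2):
        print(sql + ";")
```

With `threads=None` the sketch leaves `clone_max_concurrency` at its current value and lets the server tune the thread count.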
Percona XtraBackup
To stream the backup we used the xbstream format and sent the data to the remote server using netcat. We applied the following parameters:
parallel=<number of threads>
XtraBackup and xbstream parameter that defines the number of threads used for backup and restore operations.
rebuild-threads
The number of threads used for the rebuild (prepare) operation.
decompress_threads and compress_threads
XtraBackup and xbstream parameters that define the number of threads used for compression operations.
Some people use additional parameters like innodb-read-io-threads, innodb-write-io-threads, or innodb-io-capacity, but these parameters only affect the behavior of InnoDB background threads. They have no impact during backup and restore operations.
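As a rough illustration of the backup, stream, write, and prepare stages, the sketch below assembles the three command lines involved. It is an assumption-laden example, not the exact commands used in the tests: the function name `xtrabackup_commands`, the host, port, and directory are hypothetical, connection options are omitted, and the option spellings follow the XtraBackup command line as we understand it (hyphenated, for example `--compress-threads`). Netcat listen syntax also differs between variants, so check `xtrabackup --help`, `xbstream --help`, and your `nc` manual before use. The rebuild-threads parameter is left out.

```python
import shlex


def xtrabackup_commands(dest_host, port, datadir, threads=1,
                        compression=False):
    """Return (receiver, sender, prepare) shell command lines as strings."""
    n = int(threads)
    d = shlex.quote(datadir)

    send = ["xtrabackup", "--backup", "--stream=xbstream", f"--parallel={n}"]
    recv = ["xbstream", "-x", "-C", d, f"--parallel={n}"]
    if compression:
        # Same thread count on both sides, as in the tests.
        send += ["--compress", f"--compress-threads={n}"]
        recv += ["--decompress", f"--decompress-threads={n}"]

    # Start the receiver first on the destination, then the sender on the source.
    receiver = f"nc -l {int(port)} | " + " ".join(recv)
    sender = " ".join(send) + f" | nc {shlex.quote(dest_host)} {int(port)}"
    # Prepare runs on the destination once the stream has been fully written.
    prepare = f"xtrabackup --prepare --target-dir={d}"
    return receiver, sender, prepare


if __name__ == "__main__":
    for cmd in xtrabackup_commands("replica.example", 9999,
                                   "/var/lib/mysql", threads=4,
                                   compression=True):
        print(cmd)
```

The receiver and sender run concurrently, which is what lets the backup, stream, and write stages overlap. Only the prepare command has to wait.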
Results
Clone
No compression
In the lower-bandwidth tests, the number of threads makes no difference. Once we increase the bandwidth, the time is cut in half when we move from one thread to two. Going beyond two threads improves the time only slightly, probably because we reach the disk I/O limit.
The auto option is consistently the fastest one.
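One way to see why the thread count stops mattering at low bandwidth is to compute the theoretical minimum transfer time for the dataset at each network limit. The sketch below is a back-of-the-envelope estimate, not a measured result: it assumes decimal units (1 GB = 8000 megabits), the roughly 24 GB dataset described earlier, and no protocol overhead. The function name `min_transfer_seconds` is ours.

```python
def min_transfer_seconds(size_gb, bandwidth_mbps):
    """Lower bound on transfer time if the network were the only limit.

    Assumes decimal units (1 GB = 8000 megabits) and ignores overhead.
    """
    return size_gb * 8000 / bandwidth_mbps


if __name__ == "__main__":
    for mbps in (500, 1000, 4000):
        print(mbps, min_transfer_seconds(24, mbps))
```

Under these assumptions the floor is 384 seconds at 500 Mbps, 192 seconds at 1000 Mbps, and 48 seconds at 4000 Mbps. When measured times sit close to the floor, the network is the bottleneck and extra threads cannot help. When they sit well above it, something else, such as disk I/O or CPU, is the limit.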
Compression
Compression is supposed to improve performance on lower-bandwidth connections, but we see that this is not the case. Bandwidth has no impact on execution time, and compression makes the clone slower. Again, auto gives the best results, equivalent to 4 threads.
Percona XtraBackup
No Compression
Without compression, we see again that the number of threads makes no difference in the lower-bandwidth test. When we increase the bandwidth, the number of threads becomes important, but we quickly reach I/O limits.
Compression
When using compression, we see that XtraBackup requires less time to complete in almost every case than it does without compression, even when bandwidth is not the limit.
Conclusion
We see that, when using compression, the clone plugin is the slower option while Percona XtraBackup gives great results for all bandwidths. Without compression, the clone plugin is faster when using more than 2 threads. XtraBackup is faster for fewer threads.
Below, we have a chart comparing the worst and best results. As expected, the worst results correspond to one thread executions.
The Clone Plugin is a great option for simplicity. Percona XtraBackup is excellent for saving bandwidth and provides better results with fewer threads. With enough threads and bandwidth available, both solutions provide comparable results.
Hi Pep Pla 😉
Thank you for the interest you bring to MySQL clone.
However, there is one point not covered in your blog post. Maybe this is not clear, but clone has been developed to have no more than approximately 5% impact on user threads and is therefore not optimized for idle servers.
I don’t see in your tests if there was load on the servers. If not, this is like a file copy over the net.
Cheers,
Hi,
I was not aware of that limitation. I haven’t been able to find any reference in the documentation to this behavior (5%). The documentation says that “During a cloning operation, the number of threads increases incrementally toward a target of double the current thread count”. I guess this limit is overridden using clone_max_concurrency and disabling autotune.
IMHO it is a bit counterintuitive to measure the number of threads (user threads? running? connected? background?… my recently installed 8.0 reports 44 threads in ps.threads…) to know how many threads can be allocated for a transfer. This could impose a penalty on servers designated as clone sources only, or on clone processes that take place during low-activity windows.
If you can share more information with me about this (and compression), I would be pleased to write: Episode II Attack of the MySQL Clones.
Cheers
Hey Pep, I am more interested in incremental backup vs. incremental cloning. Did you happen to verify whether the plugin supports incremental cloning? Thanks
Hi Edwin,
The current version of the clone plugin does not support incremental cloning. I guess you could implement some sort of incremental cloning using the clone plugin + binlog point-in-time recovery… or using the incremental feature of XtraBackup.
Keep in mind that you can’t apply incremental backups once your database has been opened. A better approach could involve filesystem snapshots.
Actually, it depends on what you want to do with “incremental cloning”.
P.
Some time ago I did a couple of benchmarks for SSTs in Galera, where I was trying to optimize for cross-region SSTs and get them over as fast as possible. My conclusion was that the most determining factor is the compression algorithm. Xtrabackup uses Qpress (LZ4 based algorithm) by default to compress and this is one of the least CPU demanding algorithms. I wasn’t able to determine which compression algorithm MySQL Clone uses, but if it’s the Zstd algorithm it’s definitely more CPU bound. You can find my findings here: https://mysqlquicksand.wordpress.com/2019/09/27/putting-galera-sst-compression-on-the-benchmark/ or in my presentation: https://www.percona.com/sites/default/files/ple19-slides/day2-am/benchmarking-is-never-optional.pdf
Also, the lowest bandwidth tier you chose, 500 Mbit, is quite high for cross-region throughput: the maximum we could achieve between our DC and GCP was 300 Mbit. It would be more interesting to have more benchmarks with lower-bandwidth transfers.
Hi,
I also tried to find out which algorithm the clone plugin uses for compression. Regarding the low-bandwidth test, I think it is not only bandwidth but also latency that impacts streaming. Did you consider using a UDP-based transfer utility rather than TCP-based ones? I ran tests and, in LAN environments, it does not make any difference.
P.