Percona XtraDB Cluster (PXC) offers a great deal of flexibility in its State Snapshot Transfer (SST) options (used when a new node is automatically provisioned with data). In many environments, the on-the-fly compression capability brings significant network bandwidth savings when sending what is sometimes terabytes of data. The usual choice for compression here is either Percona XtraBackup's built-in compress option (which uses qpress internally) or the compressor/decompressor options with a compression tool of choice. In the latter case, a popular option is gzip or its multi-threaded version pigz, which offers a better compression ratio than qpress.
In this writeup, I would like to mention another important compression alternative that has been gaining popularity recently – zstd.
I decided to do a simple test of various SST settings in terms of compression method and number of parallel threads. Note that my test is limited to basically one hardware scenario and a generic mix of TPCC and sysbench data.
I tested with PXC 8.0.25 on the following setup: two Qemu-KVM VMs, each with 6 GB of RAM, 8 vCPUs (i7 11th gen), disk storage on a fast NVMe drive, and a 1 Gbps virtual network link. Therefore, my goal is only to give some hints and to encourage you to test various options, as the potential benefit may be quite significant in some environments.
To set a particular compression method, I used the following configuration options, where x stands for the number of parallel threads:
- No compression

```
[sst]
backup_threads=x
```

- qpress used internally by XtraBackup

```
[sst]
backup_threads=x
[xtrabackup]
compress
parallel=x
compress-threads=x
```

- qpress

```
[sst]
compressor='qpress -io -Tx 1'
decompressor='qpress -dio'
backup_threads=x
[xtrabackup]
parallel=x
```

- pigz

```
[sst]
compressor='pigz -px'
decompressor='pigz -px -d'
backup_threads=x
[xtrabackup]
parallel=x
```

- zstd

```
[sst]
compressor='zstd -1 -Tx'
decompressor='zstd -d -Tx'
backup_threads=x
[xtrabackup]
parallel=x
```
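For context, the custom compressor/decompressor commands act as stream filters on the xbstream transfer between the nodes. Conceptually, the zstd variant corresponds to a pipeline like the one below. This is only an illustration of the data flow, not the exact commands the wsrep_sst_xtrabackup-v2 script runs; the host name, port, and datadir path are placeholders:

```
# donor side (conceptual data flow):
xtrabackup --backup --stream=xbstream --parallel=4 \
  | zstd -1 -T4 \
  | socat - TCP:joiner:4444

# joiner side (conceptual data flow):
socat TCP-LISTEN:4444,reuseaddr - \
  | zstd -d -T4 \
  | xbstream -x -C /var/lib/mysql
```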
For each SST test, I measured the total time to start the new node, the network bytes received by the joiner during the SST process, and the data written to the joiner's disk.
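The post doesn't show the exact measurement commands, but on Linux the network counter can be read straight from sysfs. A minimal sketch of the time/traffic measurement, assuming the joiner's interface name is passed as the first argument (defaulting to lo here just so the sketch is runnable):

```shell
#!/bin/sh
# Snapshot the interface RX counter and the clock before the SST...
IFACE=${1:-lo}
rx0=$(cat /sys/class/net/"$IFACE"/statistics/rx_bytes)
t0=$(date +%s)

# ...start the new node here and wait for the SST to complete
# (a short sleep stands in for it in this sketch)...
sleep 1

# ...then snapshot again and report the deltas.
rx1=$(cat /sys/class/net/"$IFACE"/statistics/rx_bytes)
t1=$(date +%s)
echo "SST time: $((t1 - t0)) s"
echo "Data received: $(( (rx1 - rx0) / 1048576 )) MB"
```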
Here are the results:
SST time in seconds:

| Threads | No compression | qpress built-in | qpress | gzip (pigz) | zstd |
|---------|----------------|-----------------|--------|-------------|------|
| 1       | 102            | 156             | 130    | 976         | 118  |
| 2       | 92             | 123             | 112    | 474         | 92   |
| 4       | 85             | 106             | 109    | 258         | 95   |
| 8       | 86             | 99              | 109    | 182         | 97   |
Data received by the joiner during SST [MB]:

| No compression | qpress built-in | qpress | gzip (pigz) | zstd |
|----------------|-----------------|--------|-------------|------|
| 20762          | 6122            | 6138   | 4041        | 4148 |
Data written by the joiner to disk during SST [MB]:

| No compression | qpress built-in | qpress | gzip (pigz) | zstd  |
|----------------|-----------------|--------|-------------|-------|
| 20683          | 26515           | 20683  | 20684       | 20683 |
And some graphical views for convenience:
In this test case, the small gain from using multiple threads with no compression or with lightweight compression is due to the fact that the network link and disk I/O become the bottleneck sooner than the CPU does.
The test also shows how inefficient gzip is in terms of CPU utilization compared to the other compression methods: here, the CPU remained the main bottleneck even with 8 threads.
Quite excellent results came from zstd, which, while offering a compression ratio as good as gzip's, completely outperforms it in terms of CPU utilization – and all of that at the lowest compression level of 1!
One thing that needs clarification is the difference between the two methods using qpress (QuickLZ) compression. When the compress option for Percona XtraBackup is used, the tool first compresses each file and sends it with a .qp suffix to the joiner. The joiner then has to decompress those files before it can prepare the backup. This method is therefore always the more expensive one, as it requires more disk space during the process.
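To illustrate, with the built-in method the joiner ends up with .qp files that must be decompressed before the prepare step – something along these lines (the path and thread count are illustrative):

```
# decompress the .qp files received from the donor, in parallel
xtrabackup --decompress --parallel=8 --target-dir=/var/lib/mysql

# the original .qp files are kept unless --remove-original is also passed,
# which is why extra disk space is needed while this runs
```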
Any real-life examples of introducing better compression methods are very welcome in the comments! I wonder if zstd turns out to be as effective in your real use cases.
I found only 3% better compression using zstd vs gzip, but I mainly use it with Dovecot, where it is much faster than gzip to decompress the same data many times, and compression speed doesn't matter much. I would think using it for column compression would be good.
By 3% better you mean only that much faster, right? Interesting – I wonder if you have tried low compression levels as well, since I saw some bad behavior when using zstd with level 4+, where it was much slower than levels 1-2 while not actually producing smaller result files at all. Of course, a lot depends on the actual data, but often level 1 already gives a compression ratio comparable to gzip while being a lot faster.
When it comes to column compression (https://www.percona.com/doc/percona-server/LATEST/flexibility/compressed_columns.html), you may always file a feature request (https://www.percona.com/blog/2019/06/12/report-bugs-improvements-new-feature-requests-for-percona-products/).
I already did for page compression: https://bugs.mysql.com/bug.php?id=105962