InnoDB Page Compression

In this blog post, we’ll look at some of the facets of InnoDB page compression.

Somebody recently asked me about the best way to handle JSON data compression in MySQL. I took a quick look at InnoDB page compression and wanted to share my findings.

First, the good part.

InnoDB page compression is actually really easy to use and provides a decent compression ratio. To use it, I just ran CREATE TABLE commententry (...) COMPRESSION="zlib"; – and that’s all. By the way, for my experiment I used the subset of Reddit comments stored in JSON (described here: Big Dataset: All Reddit Comments – Analyzing with ClickHouse).
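For reference, here is a minimal sketch of what that table definition looks like. The actual column list of commententry isn’t shown in this post, so the columns below are illustrative; only the COMPRESSION clause matters:

-- Page compression needs file-per-table tablespaces and a filesystem
-- that supports hole punching (e.g., ext4 or XFS on a recent kernel).
-- Column list is illustrative.
CREATE TABLE commententry (
  id BIGINT NOT NULL AUTO_INCREMENT,
  doc JSON,
  PRIMARY KEY (id)
) ENGINE=InnoDB COMPRESSION="zlib";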

This method got me a compressed table of 3.9GB. Compare this to 8.4GB for an uncompressed table and it’s about a 2.15x compression ratio.

Now, the bad part.

As InnoDB page compression uses “hole punching,” the standard Linux utils do not always properly support files created this way. In fact, to see the size “3.9GB” I had to use du --block-size=1 tablespace_name.ibd , as the standard ls -l tablespace_name.ibd shows the wrong size (8.4GB). There is a similar limitation on copying files. The standard way cp old_file new_file may not always work, and to be sure I had to use cp --sparse=always old_file new_file.
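To illustrate, here are the commands side by side (tablespace_name.ibd stands for the table’s .ibd file; the .copy name is just a placeholder):

# ls reports the apparent, non-sparse size – 8.4GB in this case
ls -l tablespace_name.ibd
# du reports the actually allocated size, reflecting the punched holes – 3.9GB
du --block-size=1 tablespace_name.ibd
# a plain cp may not preserve the sparse layout; force sparse handling to be sure
cp --sparse=always tablespace_name.ibd tablespace_name.ibd.copy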

Speaking of copying, here’s the ugly part.

The actual time to copy the sparse file was really bad.

On a fairly fast device (a Samsung SM863), copying the 3.9GB sparse file mentioned above took 52 minutes! That’s shocking, so let me repeat it: 52 minutes to copy a 3.9GB file on an enterprise SATA SSD.

By comparison, copying the regular 8.4GB uncompressed file takes 9 seconds! That’s 9 seconds versus 52 minutes.

To be fair, the NVMe device (Intel® SSD DC D3600) handles sparse files much better: it took only 12 seconds to copy the same sparse file on this device.
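If you want to reproduce the measurement, something along these lines works; the file name and mount points here are hypothetical, and the timings are roughly what I observed:

# time the sparse-aware copy to each device (paths are placeholders)
time cp --sparse=always commententry.ibd /mnt/sata_sm863/commententry.ibd   # ~52 min in my test
time cp --sparse=always commententry.ibd /mnt/nvme_d3600/commententry.ibd   # ~12 sec in my test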

Having considered all this, it is hard to recommend that you use InnoDB page compression for serious production. Well, unless you power your database servers with NVMe storage.

For JSON data, the Compressed Columns feature in Percona Server for MySQL should work quite well, using a compression dictionary to store JSON keys – give it a try!
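As a rough sketch, based on the Compressed Columns syntax in Percona Server 5.7 (check the Percona documentation for exact details – the dictionary contents and table definition below are just an example):

-- A compression dictionary pre-seeds the compressor with strings that repeat
-- across rows, such as common JSON keys
SET @dictionary_data = 'body' 'author' 'subreddit' 'created_utc';
CREATE COMPRESSION_DICTIONARY reddit_keys (@dictionary_data);

CREATE TABLE commententry_cc (
  id BIGINT NOT NULL AUTO_INCREMENT,
  doc JSON COLUMN_FORMAT COMPRESSED WITH COMPRESSION_DICTIONARY reddit_keys,
  PRIMARY KEY (id)
) ENGINE=InnoDB;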

9 Comments
Peter Zaitsev

Vadim, when you talk about the SATA SM863 vs. the NVMe DC D3600, is it about NVMe vs. SATA, or some other device differences?

With just two devices it is very hard to say which result is representative here – the relatively modest overhead you see on NVMe, or the major slowdown you have on the SM863. Could it be that this particular device is extremely poor at handling sparse/fragmented files?

Rick Pizzi

Sorry Vadim, I miss the rationale – why would someone ever want to copy an .ibd file?

Rick Pizzi

Are you talking about xtrabackup here? Well, we have a large number of very large tables (hundreds of gigabytes) and we leverage InnoDB compression on all of them. Needless to say, we have never experienced such bad performance; our backups take a reasonable time… following what you described, copying a 1 TB compressed table should take days, which is not the case…

Rick Pizzi

My bad – of course we use table-level compression, which doesn’t exhibit the same behaviour (I guess it is not using hole punching). In this case I would recommend everyone steer away from page-level compression if they want to take backups in a reasonable time 🙂

Andy

What about InnoDB table compression? Would you recommend that?

Between InnoDB table compression and Percona Server Compressed Columns, how would you choose?