Previously I tested Tokutek’s Fractal Trees (TokuMX & TokuMXse) as MongoDB storage engines – today let’s look into the MySQL area.

I am going to use a modified LinkBench under a heavy IO load.

I compared InnoDB without compression, InnoDB with 8k compression, and TokuDB with QuickLZ compression.
The uncompressed data size is 115GiB, and the cache size is 12GiB for InnoDB and 8GiB (plus roughly 4GiB of OS cache) for TokuDB.

It is important to note that I used tokudb_fanout=128, which is only available in our latest Percona Server release.
I will write more on Fractal Tree internals and what tokudb_fanout means later. For now, let's just say it changes the shape of the fractal tree (compared to the default tokudb_fanout=16).
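
As a rough sketch, the relevant settings look like this in my.cnf (only the values mentioned in this post are shown; the full engine option files are linked below):

```ini
[mysqld]
# InnoDB runs
innodb_buffer_pool_size = 12G      # InnoDB cache size
# (8k compression is enabled per table:
#  CREATE TABLE ... ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8)

# TokuDB runs
tokudb_cache_size = 8G             # internal cache; ~4GiB left to the OS cache
tokudb_row_format = tokudb_quicklz # QuickLZ compression
tokudb_fanout     = 128            # Percona Server only; default is 16
```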

I am using two storage options:

  • Intel P3600 PCIe SSD 1.6TB (marked as “i3600” on charts) – as a high-end performance option
  • Crucial M500 SATA SSD 900GB (marked as “M500” on charts) – as a low-end SATA SSD

The full results and engine options are available here.

Results on Crucial M500 (throughput, more is better)

[Chart: throughput on Crucial M500]

Engine throughput [ADD_LINK/10sec]:
  • InnoDB: 6029
  • InnoDB 8K: 6911
  • TokuDB: 14633

Here TokuDB outperforms InnoDB by almost a factor of two, but it also shows great variance in the results, which I attribute to checkpoint activity.

Results on Intel P3600 (throughput, more is better)

[Chart: throughput on Intel P3600]

Engine throughput [ADD_LINK/10sec]:
  • InnoDB: 27739
  • InnoDB 8K: 9853
  • TokuDB: 20594

To understand why InnoDB shines on fast storage, let's review the IO usage of all engines.
The following chart shows how many KiB each engine reads, on average, per client request.

[Chart: IO reads, KiB per client request]

The following chart shows how many KiB each engine writes, on average, per client request.

[Chart: IO writes, KiB per client request]

Here we can make an interesting observation: TokuDB on average performs half as many writes as InnoDB, and this is what allows TokuDB to do better on slow storage. On fast storage, where a high write volume carries no performance penalty, InnoDB is able to get ahead, as InnoDB is still better at using CPUs.
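
For reference, here is a minimal sketch (not the tooling used for these charts; the device name and request count are placeholders) of how such per-request numbers can be derived from the kernel's block-device counters:

```python
import time

def sectors(device):
    """Return (sectors_read, sectors_written) for a block device from
    /proc/diskstats; sector counters are in 512-byte units."""
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[5]), int(fields[9])
    raise ValueError("device not found: " + device)

# Sample the counters around a measurement window, then divide by the
# number of requests the benchmark reports for the same window.
r0, w0 = sectors("sda")   # device name is a placeholder
time.sleep(10)
r1, w1 = sectors("sda")
requests = 6029           # e.g. ADD_LINK ops per 10 sec from LinkBench stats
print("KiB read per request:    %.2f" % ((r1 - r0) / 2.0 / requests))
print("KiB written per request: %.2f" % ((w1 - w0) / 2.0 / requests))
```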

Though it is worth remembering that:

  • On fast, expensive storage, TokuDB provides better compression, which allows you to store more data in a limited capacity.
  • TokuDB still writes half as much as InnoDB, which means twice the lifetime for an SSD (which is still expensive).

Also, looking at the results, I can conclude that InnoDB compression is inefficient in its implementation: it is not able to benefit, first, from doing fewer reads (that does help it beat uncompressed InnoDB, but not by much) and, second, from fast storage.

9 Comments
Ovais Tariq

Because of the title of the post, which suggests that it's a general benchmark, wouldn't it be pertinent to conclude that uncompressed InnoDB performs much better on fast storage, as shown by the later part of your benchmark?

There are quite a few popular flash storage implementations (Pure Storage, SolidFire, etc.) that provide compression, in which case it doesn't really matter whether InnoDB compresses efficiently, as compression is pushed down to the storage layer. In such a case, what really matters is whether InnoDB is able to take advantage of fast storage.

In the end it all depends on the storage layer. Obviously, the InnoDB compression implementation needs a great deal of improvement.

Mark Callaghan

Would be great to see results for Pure Storage compression. Can someone give Percona access to one? Otherwise I am skeptical that a clever storage device will do the right thing for compression with the IO done by an update-in-place b-tree.

Ovais Tariq

Hi Mark,

Pure Storage uses both deduplication and compression. The process is roughly as follows: the blocks are first written to NVRAM. Then deduplication is done at 512-byte granularity, by calculating hashes of 512-byte blocks and storing them in a hash table; if a hash already exists in the table, the block is skipped. Every new block that cannot be found in the hash table is then compressed using the LZO algorithm. The compressed block finally gets written to the SSD. This is what Pure calls inline deduplication and compression. They also have a background deduplication and compression process. This is as much as I know about the internals. To me, deduplication is essentially another level of compression, just at a higher level.
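
In pseudocode, my understanding of that inline path is roughly this (SHA-256 and zlib here are only stand-ins for Pure's actual hash function and LZO codec):

```python
import hashlib
import zlib  # stand-in for LZO

BLOCK = 512  # deduplication granularity

def inline_dedup_compress(data, seen_hashes):
    """Toy model of the inline path: hash each 512-byte block, skip
    blocks already seen, and compress only the new ones before they
    would be written to the SSD."""
    physically_written = []
    for off in range(0, len(data), BLOCK):
        block = data[off:off + BLOCK]
        digest = hashlib.sha256(block).digest()
        if digest in seen_hashes:   # duplicate: store a reference only
            continue
        seen_hashes.add(digest)
        physically_written.append(zlib.compress(block))
    return physically_written

seen = set()
payload = (b"A" * 512) * 100 + b"one unique block".ljust(512, b".")
out = inline_dedup_compress(payload, seen)
print(len(out), "of", len(payload) // BLOCK, "blocks physically written")  # 2 of 101
```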

On the practical side, I can share the numbers from one of our replica sets here at Lithium. The size of the uncompressed dataset as seen by the OS is 2TB. This gets compressed to about 470GB (22.9% of the uncompressed size).
And when we were using compressed InnoDB tables, we were only able to reduce the data to 75% of the uncompressed size. Most of our data is textual, which is a good candidate for compression. Granted, we were not too smart with compression in the past and were not using padding and such (well, I wasn't at Lithium then). But even then, I wouldn't expect the current implementation of InnoDB compression to get me anywhere near the numbers I get with Pure Storage.

That is why I care more about performance: I know I can push compression down to the storage layer. And the fact that InnoDB is able to utilize fast storage makes it the winner for me.

I see more and more storage companies getting smart about storage and employing different techniques to compress data at the storage layer; that is making flash more economical and will increase its adoption. There are small players in the market doing that, such as Pure Storage, SolidFire, and Kaminario, as well as big players like EMC, IBM, etc.

Peter Zaitsev

Ovais,

I wonder whenever you’re getting the “compression” from really compression or rather deduplication. 512 byte blocks are way to small for meaningful compression. Even 16K in Innodb are rather small for good compression.

I think this is one of the nicer features of TokuDB: being able to change the block size over a wide range to trade compression level against the speed of random uncached access.
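
A toy illustration of the block-size effect, with zlib standing in for whatever codec the storage layer or engine uses: the same data compressed in independent small chunks gives the codec less context (plus per-block overhead), so it compresses worse.

```python
import random
import zlib

# Generate some word-like, compressible data.
random.seed(42)
words = [b"link", b"bench", b"node", b"edge", b"toku", b"inno", b"page"]
data = b" ".join(random.choice(words) for _ in range(100_000))

def compressed_fraction(chunk_size):
    """Compress the data in independent chunks of the given size and
    return the total compressed size as a fraction of the original."""
    total = sum(len(zlib.compress(data[i:i + chunk_size]))
                for i in range(0, len(data), chunk_size))
    return total / len(data)

# Smaller blocks compress worse; larger blocks approach the
# whole-stream compression ratio.
for size in (512, 16 * 1024, 4 * 1024 * 1024):
    print("%9d-byte blocks: compressed to %.0f%% of original"
          % (size, compressed_fraction(size) * 100))
```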

Peter Zaitsev

Vadim,

I wonder whether the 8G+4G cache setup for TokuDB was found to be optimal in this case. It would be interesting to see a graph of results for different cache allocations between the OS cache and the internal TokuDB cache. The internal cache should be a lot more efficient, but the OS cache holds compressed data and so can fit more. I guess this will be an important decision to make for TokuDB at this point.

Simon J Mudd

One thing I miss is a clear comment on the amount of RAM in the server(s) concerned. There are comments about wanting to let the OS cache data, but it's not clear how much RAM there is to be used for caching, and this may make quite a difference in performance.

So please, if you can, apart from showing the storage devices and the data sets, also add a comment showing exactly how much RAM was in the server you were using. Thanks.

Ovais Tariq

@Peter, in the end what matters is how much data can fit on the flash; that's exactly why we look at using compression: so that we can fit more data on less flash and make flash more economical.
I agree with your point, though, about TokuDB's compression compared to InnoDB's. Out of the box, that is one of the benefits of using TokuDB.
I would love to have Percona do some tests on Pure Storage to compare its storage efficiency vs. compressed InnoDB vs. compressed TokuDB.

Cristian Vasile

Vadim, could you run the same tests against a RAM disk, and let us know the results?