InnoDB File Growth WeirdnessThere is a common pattern in life, you often discover or understand things by accident. Many scientific discoveries fit such a description. In our database world, I was looking to see how BLOB/TEXT columns are allocated using overlay pages and I stumbled upon something interesting and unexpected. Let me present to you my findings, along with my attempt at explaining what is happening.

InnoDB Tablespaces

The first oddity I found is a bunch of free pages in each tablespace it is skipping. Here’s an example from a simple table with only an integer primary key and a char(32) column:

The innodb_space tool comes from the InnoDB ruby project of Jeremy Cole. If you want to explore InnoDB file formats, you need these tools. As you can see, the first extent has 27 free pages. These free pages are reserved for node pages (non-leaf) and will eventually be all used. At this point, I thought a table with 34 index pages, just starting to use the second extent for the leaf pages, would have 90 free pages (27 + 63) and use 2MB of disk space. While the previous statement proved to be true, I was up for quite a surprise.

InnoDB File Growth

To better illustrate the amount of free space available in an InnoDB data file, I decided to follow the evolution of the tablespace file size as I added index pages. The following figure shows my result.

Show the evolution of the size of an Innodb tablespace as data is added.

As I added rows, more leaf pages were allocated until the file segment of the leaf pages reached 32 pages. At this point, the table has 33 index pages, one root, and 32 leaves. The allocation of another page forces InnoDB to fully allocate the first extent and add the second one for the leaves. At this point, the size on the disk is 2MB. If we keep inserting rows, the following page allocation triggers InnoDB to allocate 7 reserved extents of 1MB each. At this point, the tablespace size on the disk reaches 9MB.

 

Register for Percona Live ONLINE
A Virtual Event about Open Source Databases

 

InnoDB uses the reserved extents for btree maintenance operations. They are not accounted for in the free space of the tablespace. Now, reserving 7 extents of 1MB each in a table containing only 560KB of data is pretty insane. At this point, the InnoDB tablespace has a size of 9MB on disk. This is extremely inefficient, about 8.4MB if just free space filled with “0”. Of course, as the table grows, the size impact of these reserved extents is diluted. The amount of reserved space will grow by about 1MB (1 extent) for every 100MB allocated.

This allocation of reserved extents is far from optimal, especially in a multi-tenants era where it is fairly common to see MySQL servers handling more than 100k tables. 100k tables, each with only 1MB of data in them will use 900GB of disk space. This phenomenon is not new, a bug report was created back in 2013 and is still open. The bug is considered a low priority and non-critical.

A lot of effort has been devoted to improving the capacity of MySQL 8.0 to handle a large number of tables. Until the allocation of reserved extents is fixed, be aware of this issue when planning your storage allocation. Of course, if you are using ZFS, the impacts are more limited…

My lab setup uses ZFS for LXC and KVM instances. LZ4 compression does magic on extents full of “0”, the actual consumed space is reduced to 225KB. If you want to explore the use of ZFS with MySQL, I invite you to read a ZFS post I wrote a few years ago.

Ok, time now to close this parenthesis and go back to the storage of BLOB/TEXT columns. That will be for a future post though!