Percona XtraBackup Smart Memory Estimation

Taking a MySQL backup using Percona XtraBackup (PXB) consists of two steps: 1) take the backup and 2) prepare the backup.

Briefly speaking, taking a backup means that PXB will copy all of the files from your instance and transfer them to another location. While it does the copy, it spawns a thread that will monitor the InnoDB redo log (WAL/transaction log) and store a copy of all the new redo log entries generated by the server during the backup.

Before restoring the backup into a new instance, users have to prepare the backup. This operation is the same as the crash recovery steps that the MySQL server does after a server crash.

It consists of reading all the redo log entries into memory, categorizing them by space id and page id, reading the relevant pages into memory, and comparing the LSN on the page with the LSN on the redo log record. If the LSN from the redo log is more recent than the one read from the page, the redo log change has to be applied to the page.
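The steps above can be illustrated with a minimal sketch. It is not PXB or InnoDB code: the `RedoRecord` and `Page` classes and the `apply_redo` function are hypothetical names, and the redo payload is reduced to a string. It only shows the grouping by (space id, page id) and the LSN comparison.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RedoRecord:
    space_id: int
    page_id: int
    lsn: int
    change: str  # stand-in for the real redo payload (assumption)


@dataclass
class Page:
    lsn: int
    changes: list = field(default_factory=list)


def group_by_page(records):
    """Categorize redo records by (space_id, page_id), keeping log order."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec.space_id, rec.page_id)].append(rec)
    return grouped


def apply_redo(pages, records):
    """Apply a record only if its LSN is newer than the LSN on its page."""
    applied = 0
    for key, recs in group_by_page(records).items():
        page = pages[key]  # stands in for reading the page into memory
        for rec in recs:
            if rec.lsn > page.lsn:
                page.changes.append(rec.change)
                page.lsn = rec.lsn
                applied += 1
    return applied


pages = {(1, 10): Page(lsn=100), (1, 11): Page(lsn=300)}
records = [
    RedoRecord(1, 10, 150, "a"),
    RedoRecord(1, 11, 250, "b"),  # older than the page, so it is skipped
    RedoRecord(1, 10, 200, "c"),
]
applied = apply_redo(pages, records)
```

In this example two of the three records are applied; the record for page (1, 11) is skipped because the page already carries a newer LSN.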

Memory usage

Percona XtraBackup/MySQL server utilizes InnoDB Buffer Pool memory to perform this operation. Memory for 256 pages is reserved for loading the pages into the buffer pool, while the remaining memory is utilized for hashing/categorizing the redo log entries.

This is controlled by the PXB parameter --use-memory. The more memory you give this parameter, the better. If the memory available in the buffer pool is not enough, the work has to be performed in multiple batches. After each batch, the memory structures have to be freed to make room for the next batch.

This has a huge performance impact because an InnoDB page holds data from multiple rows. If changes to the same page fall into different batches, that page has to be fetched and evicted multiple times.
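To make the cost of batching concrete, here is a small simulation. It is a simplified model under stated assumptions, not PXB's actual batching logic: memory is expressed as a hypothetical "records per batch" limit, and every distinct page touched in a batch counts as one fetch.

```python
def count_page_fetches(record_pages, records_per_batch):
    """Count page fetches when redo records are applied in fixed-size batches.

    record_pages: the page id touched by each redo record, in log order.
    records_per_batch: how many records fit in memory at once (a hypothetical
    stand-in for the space left in the buffer pool).
    """
    fetches = 0
    for start in range(0, len(record_pages), records_per_batch):
        batch = record_pages[start:start + records_per_batch]
        # Within a batch each distinct page is fetched once. Memory is freed
        # afterwards, so a later batch has to fetch the same page again.
        fetches += len(set(batch))
    return fetches


# 12 redo records cycling over 4 pages: 0, 1, 2, 3, 0, 1, 2, 3, ...
log = [i % 4 for i in range(12)]
one_batch = count_page_fetches(log, records_per_batch=12)
small_batches = count_page_fetches(log, records_per_batch=4)
```

With everything in one batch, the four pages are fetched once each (4 fetches). With three batches of four records, each page is fetched in every batch (12 fetches).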

Motivation

When it comes to increasing the memory available for the prepare phase, we identified two challenges that motivated us to enhance this aspect of the software:

  1. Not every user reads the manual and knows what each parameter does. Countless times we have seen users complain about PXB being slow, and the solution was to increase --use-memory: the user did not know about it and was running with the default value, causing the prepare to need a huge number of batches to complete.
  2. Even for users who know how to tune PXB memory, there is no simple way to set --use-memory to the memory required for that particular prepare. The amount of memory required depends on various factors, such as the number of redo log entries, the number of entries per page, and so on. For example, x records on the same page versus x records on different pages will have different memory requirements.
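The last point of the list can be illustrated with a toy model. The constants and the `estimate_hash_memory` function below are purely hypothetical and are not the formula InnoDB or PXB uses; the sketch only assumes that every distinct page carries a fixed bookkeeping cost on top of the cost of each record.

```python
PER_PAGE_ENTRY_BYTES = 64  # hypothetical cost of one hash entry per page
PER_RECORD_BYTES = 32      # hypothetical bookkeeping cost per redo record


def estimate_hash_memory(record_pages, record_payload_bytes=100):
    """Toy estimate of the memory needed to hash a set of redo records.

    record_pages: the (space_id, page_id) touched by each record.
    record_payload_bytes: assumed payload size of every record.
    """
    distinct_pages = len(set(record_pages))
    per_page_cost = distinct_pages * PER_PAGE_ENTRY_BYTES
    per_record_cost = len(record_pages) * (PER_RECORD_BYTES + record_payload_bytes)
    return per_page_cost + per_record_cost


# 1000 records that all touch one page vs. 1000 records on 1000 pages.
same_page = estimate_hash_memory([(0, 7)] * 1000)
different_pages = estimate_hash_memory([(0, p) for p in range(1000)])
```

Under these made-up constants, the same number of records needs 132,064 bytes when they share a page and 196,000 bytes when they are spread over 1000 pages, which is why the record count alone is not enough to size the memory.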

Smart Memory Estimation

We are glad to announce Percona XtraBackup Smart Memory Estimation as of the release of Percona XtraBackup 8.0.30.

PXB has extended the crash recovery logic to extract the formula used to allocate memory.

Now, during the backup phase, while PXB is copying the redo log entries, it will compute the required memory for prepare. Not only that, but it will also take into consideration the number of InnoDB pages that will be required to be fetched from the disk. If enough memory is available for parsing the redo logs in one go, we also increase the 256 frames limit.

With this information gathered during the backup, PXB will check the server’s available free memory and will use up to --use-free-memory-pct of that memory for prepare.

This new feature will be released as a Tech Preview to give users the chance to test it and provide feedback. --use-free-memory-pct will be released as 0% (disabled) for the duration of the tech preview. The aim is to enable it by default at 50% once we make the feature GA, which will allow PXB to use up to 50% of free memory to complete the prepare process as fast as possible.
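The decision described here can be sketched roughly as follows. This is an illustration under assumptions, not PXB's implementation: the function name `prepare_memory_budget` is hypothetical, the fallback to 128M mirrors the buffer pool default used in the benchmark, and how PXB handles edge cases (for example, an estimate smaller than the default) is not covered.

```python
DEFAULT_BUFFER_POOL_BYTES = 128 * 1024**2  # 128M default used without the feature


def prepare_memory_budget(estimated_bytes, free_bytes, use_free_memory_pct):
    """Pick how much memory prepare may use (hypothetical sketch).

    estimated_bytes: memory computed during the backup phase.
    free_bytes: free memory on the server at prepare time.
    use_free_memory_pct: 0 disables the feature, 1..100 caps the usage.
    """
    if not 0 <= use_free_memory_pct <= 100:
        raise ValueError("use_free_memory_pct must be between 0 and 100")
    if use_free_memory_pct == 0:
        return DEFAULT_BUFFER_POOL_BYTES  # feature disabled
    cap = free_bytes * use_free_memory_pct // 100
    # Use what the estimate asks for, but never more than the allowed share.
    return min(estimated_bytes, cap)


GIB = 1024**3
fits = prepare_memory_budget(8 * GIB, 60 * GIB, 50)     # estimate below the cap
capped = prepare_memory_budget(40 * GIB, 60 * GIB, 50)  # estimate above the cap
disabled = prepare_memory_budget(8 * GIB, 60 * GIB, 0)
```

On a server with 60 GiB free and a 50% allowance, an 8 GiB estimate is granted in full, while a 40 GiB estimate is capped at 30 GiB and the prepare would still need more than one batch.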

Users will have the ability to adjust PXB memory allowance from 0 to 100%.

Benchmarks

To compare the new default, we ran three backups, using sysbench with 16, 32, and 64 tables containing 1M rows each.

We used an EC2 c4.8xlarge instance (36 vCPUs / 60G memory / General Purpose SSD (gp2)).

During each --backup we ran the below sysbench:

Each --prepare operation was run three times and the best time was extracted.

Please note: The purpose of this experiment is to show what the new default will look like once this feature becomes generally available.

The above table shows the amount of memory required by PXB with --use-free-memory-pct=50, the size of the PXB log file (redo log entries copied during the backup), and the size of the resulting backup folder. Operations done without Smart Memory Estimation used the default of 128M for the buffer pool.

Percona XtraBackup time to run

16 tables result – prepare time dropped to ~5.7% of the original time. An improvement in recovery time of about 17X.

32 tables result – prepare time dropped to ~8.2% of the original time. An improvement in recovery time of about 12X.

64 tables result – prepare time dropped to ~9.9% of the original time. An improvement in recovery time of about 10X.

Summary

As we can see, the best relative results come from the smaller datasets, where the same page is more likely to be present in multiple redo log records and therefore to be fetched in multiple different batches. However, the bigger the dataset, the bigger the absolute impact on recovery time: for example, the 64-table prepare dropped from 35 minutes and 28 seconds to only three minutes and 31 seconds.
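The 64-table figures can be checked from the two times quoted above. The short calculation below uses only those two numbers; the helper name `to_seconds` is ours.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


before = to_seconds(35, 28)  # prepare with the 128M default
after = to_seconds(3, 31)    # prepare with Smart Memory Estimation

fraction = after / before    # share of the original time still needed
speedup = before / after     # how many times faster the prepare became

print(f"{fraction:.1%} of the original time, about {speedup:.1f}X faster")
```

This prints `9.9% of the original time, about 10.1X faster`, matching the ~9.9% and ~10X reported for 64 tables.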

As mentioned before, this feature will remain a Tech Preview for a few releases, and users are welcome to try it via the --use-free-memory-pct parameter and provide feedback.
