Restore Strategies with MyDumper

Restore Strategies with MyDumper In my previous post Back From a Long Sleep, MyDumper Lives!, I told you that Fast Index Creation was in the queue and (after fixing several bugs) it will package in release 0.10.7 next month. But why am I so excited about it? Well, this feature opens new opportunities, what I’m going to call Restore Strategies.

In the Past…

In the past, the only option was to first restore the table definition and then insert the data rows in two simple steps. On tables with millions of rows, we already know why it takes more time as it inserts in the clustered index and in the secondary index, instead of building the secondary indexes after the data has been inserted, as myloader is able to do now.

Nowadays, we have the option to do it, in three steps:

Create the table with the primary/unique key.
Execute the inserts statements in parallel.
Create the indexes and constraints.

But, we don’t have just one table, we have multiple tables with different amounts of indexes, different amounts of columns per index, different amounts of columns, different amounts of rows… and depending on the table order selected will be the restoration time. So, the question is: what is the best table order?

Currently…

Currently, myloader has no rule or strategy that defines the order, as we read the files using g_dir_read_name and enqueue them. We know that larger tables should be imported first, if not, the whole process could take longer, as might be possible that importing the data of the largest tables and then creating the indexes, take the same amount of time that the rest of the database to import. After working on several restore operations, I realized that not always the largest table is the one that is going to take longer. So, what we need to determine is the table that is going to take the longest to import. During a mydumper execution, we know the number of rows that each table has and we can get that info and pass it to myloader. We can also estimate the index length based on the amount of index and the type of each one.

Another thing to consider is that the three steps are for all the tables, which means that we create all the tables, execute the inserts for all the tables, and finally, we create the indexes for all the tables. But, I think that doing inserts in a table and running in parallel a process that creates the indexes, will speed up the whole process.

Do We Know Enough?

From my point of view, there is a lot of information that we are missing during a table restore, we don’t know inserts or MB per second per table or database. I think that this will be valuable information if we want to develop the tools to use the best Restore Strategy on our database.

Conclusion

I shared some ideas and opened some discussions about how to import using myloader. I hope we can develop some of them in the future to have a better understanding and reduce the restore timings. Feel free to open an issue on MyDumper repository or add a comment below!

0 Comments

Inline Feedbacks

View all comments

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Restore Strategies with MyDumper

In the Past…

Currently…

Do We Know Enough?

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

High Availability: Choosing the Right Option for Your Percona Monitoring and Management

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Restore Strategies with MyDumper

In the Past…

Currently…

Do We Know Enough?

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

High Availability: Choosing the Right Option for Your Percona Monitoring and Management

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation