When you design a backup strategy, you need to consider the business requirements, because your backups must be shaped to meet them. Let’s briefly review the basics: you need to define the RPO and the RTO. The RPO (Recovery Point Objective) defines how far back in time you must be able to recover, that is, how much data loss is acceptable. The RTO (Recovery Time Objective) defines how quickly the business expects the data to be recovered. This article focuses on one scenario that can help meet the RTO.
The scenario
Imagine a company running a Percona Server for MongoDB 6.0 (PSMDB) replica set with a footprint of one terabyte. The operations team has also configured Percona Backup for MongoDB (PBM) to generate physical and logical backups. One terrible day, a chain of unfortunate events unfolds: a developer gets a call from his manager about a critical bug found on the PRODUCTION system. He quickly goes through the code that was released yesterday and finds that the issue can be easily fixed by removing a set of documents that were inserted into a collection. Since he has read-write access to the PRODUCTION database, he decides to run the delete command directly on PRODUCTION to mitigate the issue as fast as possible. As you can imagine, when someone rushes, the tendency to make a mistake is high. That was the case here: 90% of the documents in the collection were removed, and now the problem is even bigger than a critical bug, because the system is completely down.
The solution
Since the database is rather large, it can take a long time to restore the whole thing, and given that a single collection is the culprit, it will be faster to execute a restore for that single collection.
The first thing to do is to list the backups available:
```shell
$ pbm list
Backup snapshots:
  2024-03-22T20:42:50Z <logical> [restore_to_time: 2024-03-22T20:43:12Z]
  2024-03-22T21:45:35Z <physical> [restore_to_time: 2024-03-22T21:45:36Z]
...
PITR <off>:
  2024-03-22T20:43:13Z - 2024-03-22T20:52:58Z
```
Next, we need to find the most recent logical backup, as a selective restore (restoring only specific namespaces) requires a logical backup. In this case, the backup that we need is “2024-03-22T20:42:50Z”.
Now, we have two options:
- Restore the collection on the live database: This will overwrite the existing data in the collection. If you are sure that no additional data was added to the collection after the backup, this is definitely the fastest and simplest path.
- Restore the collection on a temporary instance: This lets you export the data and import it into the live database without overwriting data generated after the backup. This alternative adds more steps to the process, but it preserves the existing data.
Option one
Restore the single collection into the live database:
```shell
$ pbm restore 2024-03-22T20:42:50Z --ns "sample_training.zips"
Starting restore 2024-03-22T22:23:56.715785074Z from '2024-03-22T20:42:50Z'...Restore of the snapshot from '2024-03-22T20:42:50Z' has started
```
You can check what PBM is doing by running the following command:
```shell
$ pbm status -s running
Currently running:
==================
(none)
```
In this case, the restore process is complete; you can list the restore operations with the following command:
```shell
$ pbm list --restore
Restores history:
  2024-03-22T22:23:56.715785074Z [backup: snapshot, selective] done
```
You can see the restore details with this command:
```shell
$ pbm describe-restore 2024-03-22T22:23:56.715785074Z
name: "2024-03-22T22:23:56.715785074Z"
opid: 65fe04fccc46cf421780bab5
backup: "2024-03-22T20:42:50Z"
type: logical
status: done
namespaces:
- sample_training.zips
last_transition_time: "2024-03-22T22:24:05Z"
replsets:
- name: rs0
  status: done
  last_transition_time: "2024-03-22T22:24:04Z"
```
Finally, confirm the data was restored as expected:
```shell
rs0 [direct: primary] sample_training> db.zips.find().count()
29470
```
Option two
Create a mongod config file for the temporary instance:
```shell
$ cat /etc/mongod_tmp.conf | grep -v "^$" | grep -v "^#"
storage:
  dbPath: /var/lib/mongodb_tmp            ### Different dbPath
  journal:
    enabled: true
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod_tmp.log   ### Different log file
processManagement:
  fork: true
  pidFilePath: /var/run/mongod_tmp.pid    ### Different pidFilePath
net:
  port: 27018                             ### Different port
  bindIp: 127.0.0.1
replication:
  replSetName: rs0
```
Create the dbPath:
```shell
$ sudo mkdir /var/lib/mongodb_tmp
$ sudo chown mongod.mongod /var/lib/mongodb_tmp/
```
Start the temporary instance:
```shell
$ sudo -u mongod /usr/bin/mongod -f /etc/mongod_tmp.conf
about to fork child process, waiting until server is ready for connections.
forked process: 16270
child process started successfully, parent exiting
```
Configure PBM to run on the new instance and make sure it has point-in-time recovery (PITR) disabled. In this case, the new instance is running on port 27018.
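As a sketch of how this could be done (the connection string, log path, and storage configuration file are illustrative, not from the original setup), you would point a pbm-agent at the temporary instance, reuse the existing storage configuration, and explicitly disable PITR:

```shell
# Point the PBM CLI and agent at the temporary instance on port 27018.
# The URI and file paths below are illustrative assumptions.
export PBM_MONGODB_URI="mongodb://127.0.0.1:27018/"

# Start a pbm-agent for the temporary node
sudo -u mongod pbm-agent --mongodb-uri "$PBM_MONGODB_URI" \
  > /var/log/pbm-agent-tmp.log 2>&1 &

# Apply the same remote storage configuration used by the original cluster,
# so the agent can see the existing backups in the S3 bucket
pbm config --file /etc/pbm_storage.yaml

# Make sure point-in-time recovery is disabled on this temporary deployment
pbm config --set pitr.enabled=false
```

The key point is that the temporary instance must read from the same remote storage as the original cluster, while PITR stays off so the temporary deployment does not write new oplog chunks to the bucket.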
```shell
$ pbm status
Cluster:
========
rs0:
  - rs0/192.168.56.3:27018 [P]: pbm-agent v2.4.0 OK

PITR incremental backup:
========================
Status [ON]

Currently running:
==================
(none)

Backups:
========
S3 us-east-1 s3://bucket-s3/mongodb_backup/test1
  (none)
```
Force a sync to pull the list of backups stored on the S3 bucket:
```shell
$ pbm config --force-resync
Storage resync started
```
List the backups and make sure the logical backup you require is present:
```shell
$ pbm list
Backup snapshots:
  2024-03-22T20:42:50Z <logical> [restore_to_time: 2024-03-22T20:43:12Z]
  2024-03-22T21:45:35Z <physical> [restore_to_time: 2024-03-22T21:45:36Z]
...
PITR <off>:
  2024-03-22T20:43:13Z - 2024-03-22T20:52:58Z
```
Restore the collection you need to recover:
```shell
$ pbm restore 2024-03-22T20:42:50Z --ns "sample_training.zips"
Starting restore 2024-03-22T22:47:52.180513787Z from '2024-03-22T20:42:50Z'...Restore of the snapshot from '2024-03-22T20:42:50Z' has started
```
Export all the documents from the recovered collection:
```shell
$ mongodump --uri=$MONGODB_URI --archive=/tmp/sample_training.zips.archive.gzip --gzip --db=sample_training --collection=zips
2024-03-22T23:05:46.047+0000	WARNING: ignoring unsupported URI parameter 'replsetname'
2024-03-22T23:05:46.099+0000	writing sample_training.zips to archive '/tmp/sample_training.zips.archive.gzip'
2024-03-22T23:05:46.368+0000	done dumping sample_training.zips (29470 documents)
```
Import the archive file into the live database; this will append the data to the existing collection:
```shell
$ mongorestore --uri=$MONGODB_URI --nsInclude="sample_training.zips" --gzip --archive=/tmp/sample_training.zips.archive.gzip
2024-03-22T23:13:28.180+0000	WARNING: ignoring unsupported URI parameter 'replsetname'
2024-03-22T23:13:28.221+0000	preparing collections to restore from
2024-03-22T23:13:28.246+0000	reading metadata for sample_training.zips from archive '/tmp/sample_training.zips.archive.gzip'
2024-03-22T23:13:28.250+0000	restoring to existing collection sample_training.zips without dropping
2024-03-22T23:13:28.251+0000	restoring sample_training.zips from archive '/tmp/sample_training.zips.archive.gzip'
2024-03-22T23:13:29.934+0000	finished restoring sample_training.zips (29470 documents, 0 failures)
2024-03-22T23:13:29.935+0000	no indexes to restore for collection sample_training.zips
2024-03-22T23:13:29.935+0000	29470 document(s) restored successfully. 0 document(s) failed to restore.
```
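As in option one, it is worth confirming the document count on the live database after the import. For example, with mongosh (the connection string here is an illustrative assumption):

```shell
# Connect to the live replica set and count the restored documents.
# The URI is a placeholder; use your production connection string.
$ mongosh "mongodb://127.0.0.1:27017/sample_training" --quiet \
    --eval 'db.zips.countDocuments({})'
```

The count should match the number of documents reported by mongorestore, plus any documents that were added to the collection after the backup was taken.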
Alternate solution
If the requirement is to recover the data up to the second before it was deleted, then we should perform a point-in-time restore (PITR). For this option to be viable, the PITR feature must be enabled in PBM. Due to the size of the database, we will need a separate server to execute the restore process. The steps are detailed in the PBM documentation page Make a point-in-time restore. Once you have the database restored up to the time you need, you can export the collection documents and import them as we did in option two.
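On the separate restore environment, the restore itself boils down to something like the following sketch (the timestamps are illustrative; you would use the moment just before the accidental delete):

```shell
# Point-in-time restore: replay the oplog on top of a base snapshot up to
# the given moment. Requires PITR oplog chunks covering that time range
# to exist in the backup storage. Timestamps below are placeholders.
$ pbm restore --base-snapshot 2024-03-22T20:42:50Z --time="2024-03-22T20:50:00"
```

After this completes, the temporary deployment holds the data exactly as it was at the requested time, and you can export and import the affected collection with mongodump and mongorestore as shown above.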
Percona Backup for MongoDB flexibility
The flexibility that PBM offers for managing backup and restore operations is unique, and this is just one simple scenario where PBM can help. It is important to understand how PBM works so you can build strategies that meet the business needs. If you need help managing your databases, don’t hesitate to contact us; we have an excellent team of experts ready to help.
Percona Distribution for MongoDB is a source-available alternative for enterprise MongoDB. A bundling of Percona Server for MongoDB and Percona Backup for MongoDB, Percona Distribution for MongoDB combines the best and most critical enterprise components from the open source community into a single feature-rich and freely available solution.
Download Percona Distribution for MongoDB Today!