Test/Dev Environments With Prod Data Using Percona Backup for MongoDB

This is a straightforward article that shows how easy it is to refresh your Test/Dev environments with PROD data using Percona Backup for MongoDB (PBM). It covers every step from the PBM configuration through the restore, assuming that the PBM agents are up and running on all replica set members of both the PROD and Dev/Test servers.

Taking the Backup on PROD

This step is quite simple and it demands no more than two commands:

1. Configuring the Backup

Two things are worth noting: I will send my backups to an S3 bucket, and I am defining a prefix. When a prefix is defined in the PBM storage configuration, a subdirectory is created automatically and the backup files are stored in that subdirectory instead of the root of the S3 bucket.
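A minimal sketch of this configuration step follows. The bucket name `my-pbm-backups`, the region, the `prod` prefix, and the credential values are all placeholders rather than values from my setup, and the sketch assumes the PBM CLI can already connect to the PROD cluster.

```shell
# Write the PBM storage configuration to a file.
# Bucket, region, prefix, and credentials are placeholders; use your own.
cat > pbm_config.yaml <<'EOF'
storage:
  type: s3
  s3:
    region: us-east-1
    bucket: my-pbm-backups
    prefix: prod
    credentials:
      access-key-id: YOUR_ACCESS_KEY_ID
      secret-access-key: YOUR_SECRET_ACCESS_KEY
EOF

# Apply the configuration (skipped on machines without the PBM CLI).
if command -v pbm >/dev/null 2>&1; then
  pbm config --file pbm_config.yaml
fi
```

With `prefix: prod`, the backup files end up under `prod/` inside the bucket rather than at its root.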

2. Taking the Backup

With PBM properly configured, it is time to take the backup. (You can skip this step if you already have PBM backups to use, of course.)

If we run the PBM status command, we will see the snapshot running, and once it finishes, the status output will show it as complete.
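The two commands involved can be sketched as follows. `pbm backup` and `pbm status` are standard PBM CLI commands; the `backup_and_report` wrapper is only a hypothetical helper name, and the sketch assumes the CLI is already connected to the PROD cluster.

```shell
# Start a backup and, if the request was accepted, print the PBM status.
backup_and_report() {
  pbm backup && pbm status
}

# Run only where the PBM CLI is installed.
if command -v pbm >/dev/null 2>&1; then
  backup_and_report
fi
```

The backup runs asynchronously, so `pbm status` may need to be repeated until the snapshot is reported as complete.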

Configuring the PBM Space on a DEV/TEST Environment

All right, now my PROD has a proper backup routine configured. I will move one step forward and configure my PBM space but this time in a Dev/Test environment – named here as DEV.

Once the configuration is applied, PBM reports that the backup list resync from the store has started.

Note that the S3 bucket is the same one where PROD stores its backups, but with a different prefix. If I run the status command, I will see that the storage is configured but that no snapshots are available yet.

Lastly, note that the replica set name is exactly the same as in PROD. If this were a sharded cluster rather than a non-sharded replica set, all the replica set names would have to match in the target cluster. PBM is guided by the replica set name, and if my DEV environment had a different one, it would not be possible to load backup metadata from PROD to DEV.
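One way to confirm the names match before going further is to ask a member of each replica set for its name. This is a sketch, not part of the PBM workflow itself: it assumes `mongosh` is installed, the connection strings are hypothetical, and `rs_name` is a helper name introduced here for illustration.

```shell
# Print the replica set name reported by the member at the given URI.
rs_name() {
  mongosh "$1" --quiet --eval 'rs.status().set'
}

# Hypothetical connection strings; replace with your own hosts and credentials.
PROD_URI="mongodb://prod-host:27017"
DEV_URI="mongodb://dev-host:27017"

# Run only where mongosh is installed.
if command -v mongosh >/dev/null 2>&1; then
  prod_name=$(rs_name "$PROD_URI")
  dev_name=$(rs_name "$DEV_URI")
  if [ -n "$prod_name" ] && [ "$prod_name" = "$dev_name" ]; then
    echo "replica set names match: $prod_name"
  else
    echo "replica set names differ or could not be read"
  fi
fi
```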

Transferring the Desired Backup Files

The next step is to transfer the backup files from the PROD prefix to the target prefix. I will use the AWS CLI for that, but there is one important thing to settle in advance: which files belong to a given backup set (snapshot). Let’s go back to the PBM status output taken in PROD previously.

The PBM snapshots are named with the timestamp from when the backup started. If we check the S3 prefix where the snapshot is stored, we will see that the file names contain that timestamp.

So it is now easy to tell which files I have to copy.
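The transfer can be sketched with the AWS CLI as below. The bucket, the two prefixes, and the snapshot name are placeholders, and `copy_snapshot` is a hypothetical helper. The sketch relies on the snapshot name being the leading part of each object name under the prefix, as described above.

```shell
# Placeholders: the bucket, prefixes, and snapshot name are hypothetical.
BUCKET="my-pbm-backups"
SRC_PREFIX="prod"
DST_PREFIX="dev"
SNAPSHOT="2021-01-01T00:00:00Z"   # use the real name from `pbm status` on PROD

# Copy only the objects under the PROD prefix whose names start with the
# given snapshot name into the DEV prefix of the same bucket.
copy_snapshot() {
  aws s3 cp "s3://${BUCKET}/${SRC_PREFIX}/" "s3://${BUCKET}/${DST_PREFIX}/" \
    --recursive --exclude "*" --include "$1*"
}

# Run only where the AWS CLI is installed.
if command -v aws >/dev/null 2>&1; then
  # Files that belong to this snapshot on the PROD prefix.
  aws s3 ls "s3://${BUCKET}/${SRC_PREFIX}/" | grep -F "$SNAPSHOT"
  copy_snapshot "$SNAPSHOT"
  # Confirm the files arrived on the DEV prefix.
  aws s3 ls "s3://${BUCKET}/${DST_PREFIX}/"
fi
```

The `--exclude "*"` filter drops everything first, and the later `--include` filter re-admits only the objects of the chosen snapshot, so other backups under the PROD prefix are left alone.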

Checking the DEV prefix shows that the files are already there, and PBM has automatically loaded their metadata into the DEV PBM collections.

Finally – Restoring It

Believe it or not, now comes the easiest part: the restore. It takes a single command and nothing else.
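A sketch of the restore on DEV follows. The snapshot name is a placeholder and `restore_if_listed` is a hypothetical helper. It adds one precaution around the single `pbm restore` command: it first checks that the snapshot appears in `pbm list`.

```shell
# Placeholder; use a snapshot name that `pbm list` shows on DEV.
SNAPSHOT="2021-01-01T00:00:00Z"

# Restore the named snapshot, but only if PBM on DEV already knows about it.
restore_if_listed() {
  if pbm list | grep -qF -- "$1"; then
    pbm restore "$1"
  else
    echo "snapshot $1 is not listed; try: pbm config --force-resync" >&2
    return 1
  fi
}

# Run only where the PBM CLI is installed.
if command -v pbm >/dev/null 2>&1; then
  restore_if_listed "$SNAPSHOT"
fi
```

Without the precaution, the restore itself is just `pbm restore` followed by the snapshot name.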

Refreshing Dev/Test environments with PROD data is a very common and required task in corporations worldwide. I hope this article helps to clarify the practical questions regarding using PBM for it!

1 Comment
Sami Ahlroos

Just a quick note. If “pbm status” is not showing any backups on the destination server after copying the files, running “pbm config --force-resync” to re-read the backup list from storage should help.