MongoDB Sharded Cluster

If you have been working in the database field for some time, you have likely come across the need to create a new database based on an existing one. The most common example I can think of is to create a copy of the production database for testing purposes.

In the case of MongoDB sharded clusters, the official documentation covers the procedure to restore a sharded cluster from a backup. But what if we want to restore the dataset to different hosts, and also rename the shards and/or replicasets? There are some mentions of metadata renaming in the documentation, but the steps are not complete.

I am providing the detailed steps for MongoDB 4.2, although I should warn you to use them at your own risk since this is not a supported procedure. I won’t cover the backup part in detail here, but normally we can stop the balancer, and then shut down one secondary member from each shard, as well as one config server, and create snapshots to seed members of the new cluster. You can also check Corrado’s blog Percona Backup for MongoDB in Action for another way to get a consistent backup of a sharded cluster with no downtime.
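For reference, stopping the balancer before taking the snapshots is done from a mongos:

```
// Connect to a mongos and disable the balancer before snapshotting
sh.stopBalancer()

// Confirm it is no longer running
sh.getBalancerState()
```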

Overview of Restoring a MongoDB Sharded Cluster

In this case, I am assuming we want to clone a cluster and rename its components along the way. The source and target names below are purely illustrative; substitute your own replica set names and hostnames wherever they appear:
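Source cluster: config server replica set cfgrs; shards rs0 and rs1.
Target cluster: config server replica set newcfgrs (hosts newcfg1-3.example.com); shards newrs0 and newrs1 (hosts newhost1-3.example.com).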

One caveat here is that the target cluster needs to have the same number of shards as the source.

Depending on the number of shards, the metadata editing can be time-consuming, so you should use some kind of automation to help. At least use a terminal multiplexer so you can broadcast the commands you type in a single tab to other tabs (one per shard).

Restoring the Config Servers

The first thing to do is to restore the backup of the config server to the new hardware. Once we have that, follow these steps:

1. Start the config server in standalone mode. I am taking the opportunity to rename the config server replica set in the configuration file:
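A minimal sketch, assuming the backup was restored under /data/configdb (paths and ports are illustrative): comment out the replication and sharding sections so the node comes up standalone, and leave the new name in place for the final restart:

```
# /etc/mongod.conf on the restored config server
storage:
  dbPath: /data/configdb
net:
  port: 27019
# Commented out so the node starts standalone while we edit metadata:
#replication:
#  replSetName: newcfgrs
#sharding:
#  clusterRole: configsvr
```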

2. Drop the local database
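From a mongo shell connected to the standalone node:

```
use local
db.dropDatabase()
```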

3. Update the shard metadata in config.shards collection
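Since the shard name is the document's _id and cannot be updated in place, one approach is to re-insert each shard document under its new name. A sketch with the illustrative names, repeated per shard:

```
use config

// rs0 -> newrs0; repeat for rs1 -> newrs1, and so on
var doc = db.shards.findOne({ _id: "rs0" });
db.shards.remove({ _id: "rs0" });
doc._id = "newrs0";
doc.host = "newrs0/newhost1.example.com:27018,newhost2.example.com:27018,newhost3.example.com:27018";
db.shards.insert(doc);
```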

4. Modify the primary shard name for non-shared collections
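This is the primary field in config.databases; for example:

```
use config

// Repeat for each shard rename
db.databases.updateMany({ primary: "rs0" }, { $set: { primary: "newrs0" } })
```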

5. Modify each chunk metadata (this can take a while if you have lots of chunks)
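Something like the following, repeated per shard:

```
use config

// With many chunks this may run for a while
db.chunks.updateMany({ shard: "rs0" }, { $set: { shard: "newrs0" } })
```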

6. Modify chunk history metadata
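Each chunk document also carries a history array with shard names in it; an arrayFilters update can rewrite those entries (again, repeat per shard):

```
use config

db.chunks.updateMany(
  { "history.shard": "rs0" },
  { $set: { "history.$[h].shard": "newrs0" } },
  { arrayFilters: [ { "h.shard": "rs0" } ] }
)
```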

7. Start the config server with the new name and initiate its replica set
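Uncomment the replication and sharding sections from step 1 (keeping replSetName: newcfgrs), restart mongod, and initiate the replica set with a single member:

```
// From a mongo shell connected to the restarted config server
rs.initiate({
  _id: "newcfgrs",
  configsvr: true,
  members: [ { _id: 0, host: "newcfg1.example.com:27019" } ]
})
```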

At this point, it is safe to add the other members of the config server replicaset.
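For example, with the illustrative hostnames:

```
// Once the snapshot has been seeded to the remaining config hosts
rs.add("newcfg2.example.com:27019")
rs.add("newcfg3.example.com:27019")
```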

Next, we have to work on the shards themselves.

Restoring the Shards

Once you have restored the backup of each shard, do the following:

1. Start the member of each shard you restored in standalone mode:
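As with the config server, leave out the replication and sharding options so the node comes up standalone (paths and ports are illustrative):

```
mongod --dbpath /data/shard --port 27018
```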

2. Create a temporary user with the __system role. This is needed to edit the system collections.
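The user name and password below are placeholders:

```
use admin
db.createUser({
  user: "tmp_restore",
  pwd: "changeme",
  roles: [ "__system" ]
})
```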

3. Drop the local database
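Same as on the config server:

```
use local
db.dropDatabase()
```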

4. Drop the document that stores oplog recovery information (we want to avoid any kind of recovery on startup and just keep the data as-is).
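On a shard, this is the minOpTimeRecovery document in admin.system.version:

```
use admin

// Remove the oplog recovery marker so the node starts with the data as-is
db.system.version.deleteOne({ _id: "minOpTimeRecovery" })
```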

5. Update shard identity metadata
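The shardIdentity document also lives in admin.system.version; it needs the new shard name and the new config server connection string (illustrative values again):

```
use admin

db.system.version.updateOne(
  { _id: "shardIdentity" },
  { $set: {
      shardName: "newrs0",
      configsvrConnectionString: "newcfgrs/newcfg1.example.com:27019,newcfg2.example.com:27019,newcfg3.example.com:27019"
  } }
)
```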

6. Remove all documents in the cached metadata collections. Basically, any collection whose name starts with cache. in the shard's config database (referenced as db.cache.* from the shell) has to go:
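```
use config

// Drop every cached metadata collection (config.cache.*)
db.getCollectionNames().forEach(function(name) {
  if (name.indexOf("cache.") === 0) {
    db.getCollection(name).drop();
  }
});
```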

7. Restart each shard as a single-node replica set and add the other members. In this case, I am adding them as non-voting, as we don't want them to become primary by accident (at least until they finish syncing from the node that has the actual data).
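After restarting mongod with the replication and sharding options enabled (replSetName: newrs0, clusterRole: shardsvr), something like:

```
rs.initiate({
  _id: "newrs0",
  members: [ { _id: 0, host: "newhost1.example.com:27018" } ]
})

// Non-voting, zero-priority members cannot be elected primary
rs.add({ host: "newhost2.example.com:27018", votes: 0, priority: 0 })
rs.add({ host: "newhost3.example.com:27018", votes: 0, priority: 0 })
```

Once they have caught up, a rs.reconfig() can give them back their votes and priority.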

8. Finally, don’t forget to remove the privileged user we had created.
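Assuming the placeholder user from step 2:

```
use admin
db.dropUser("tmp_restore")
```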

Final Words

While the procedure above works, it would be nice if MongoDB had some built-in script to help with this tedious task. Hopefully, future versions include something in this regard. If you get a chance to use this procedure, please leave a comment in the section below and let me know how it goes.

Comments
Kay Agahd

I had to do the same procedure in a different context and found it quite tedious due to the lack of official MongoDB documentation. So thank you for the great write-up!

Anoop V

Thanks for the procedure. Found it very helpful. Since the cluster had authentication enabled, we had to go through a few extra checks as well.