Migration of a MongoDB Replica Set to a Sharded ClusterIn this blog post, we will discuss how can we migrate from a replica set to sharded cluster. 

Before moving to migration let me briefly explain Replication and Sharding and why do we need to shard a replica Set.

Replication: It creates additional copies of data and allows for automatic failover to another node in case Primary went down. It also helps to scale our reads if the application is fine to read data that may not be the latest.

Sharding: It allows horizontal scaling of data writes by allowing data partition in multiple servers by using a shard key. Here, we should understand that a shard key is very important to distribute the data evenly across multiple servers. 

Why Do We Need a Sharded Cluster?

We need sharding due to the below reasons:

  1. By adding shards, we can reduce the number of operations each shard manages. 
  2. It increases the Read/Write capacity by distributing the Reads/Writes across multiple servers. 
  3. It also gives high availability as we deploy the replicas for the shards, config servers, and multiple MongoS.

Sharded cluster will include two more components which are Config Servers and Query routers i.e. MongoS.

Config Servers: It keeps metadata for the sharded cluster. The metadata comprises a list of chunks on each shard and the ranges that define the chunks. The metadata indicates the state of all the data and its components within the cluster. 

Query Routers(MongoS): It caches metadata and uses it to route the read or write operations to the respective shards. It also updates the cache when there are any metadata changes for the sharded cluster like Splitting of chunks or shard addition etc. 

Note: Before starting the migration process it’s recommended that you perform a full backup (if you don’t have one already).

The Procedure of Migration:

  1. Initiate at least a three-member replica set for the Config Server ( another member can be included as a hidden node for the backup purpose).
  2. Perform necessary OS, H/W, and disk-level tuning as per the existing Replica set.
  3. Setup the appropriate clusterRole for the Config servers in the mongod config file.
  4. Create at least two more nodes for the Query routers ( MongoS )
  5. Set appropriate configDB parameters in the mongos config file.
  6. Repeat step 2 from above to tune as per the existing replica set.
  7. Apply proper SELinux policies on all the newly configured nodes of Config server and MongoS.
  8. Add clusterRole parameter into existing replica set nodes in a rolling fashion.
  9. Copy all the users from the replica set to any MongoS.
  10. Connect to any MongoS and add the existing replica set as Shard. 

Note: Do not enable sharding on any database until the shard key is finalized. If it’s finalized then we can enable the sharding.

Detailed Migration Plan:

Here, we are assuming that a Replica set has three nodes (1 primary, and 2 secondaries)

  1. Create three servers to initiate a 3-member replica set for the Config Servers.

Perform necessary OS, H/W, and disk-level tuning. To know more about it, please visit our blog on Tuning Linux for MongoDB.

  1. Install the same version of Percona Server for MongoDB as the existing replica set from here.
  2. In the config file of the config server mongod, add the parameter clusterRole: configsvr and port: 27019  to start it as config server on port 27019.
  3. If SELinux policy is enabled then set the necessary SELinux policy for dbPath, keyFile, and logs as below.

Start all the Config server mongod instances and connect to any one of them. Create a temporary user on it and initiate the replica set.

Create a role anyResource with action anyAction as well and assign it to “tempUser“.

Now our Config server replica set is ready, let’s move to deploying Query routers i.e. MongoS.

  1. Create two instances for the MongoS and tune the OS, H/W, and disk. To do it follow our blog Tuning Linux for MongoDB or point 1 from the above Detailed migration.
  2. In mongos config file, adjust the configDB parameter and include only non-hidden nodes of Config servers ( In this blog post, we have not mentioned starting hidden config servers).
  3. Apply SELinux policies if it’s enabled, then follow step 4 and keep the same keyFile and start the MongoS on port 27017.
  4. Add the below parameter in mongod.conf on the Replica set nodes. Make sure the services are restarted in a rolling fashion i.e. start with the Secondaries then step down the existing Primary and restart it with port 27018.

Login to any MongoS and authenticate using “tempUser” and add the existing replica set as a shard.

Verify it with:

Connect to the Primary of the replica set and copy all the users and roles. To authenticate/authorize mention the replica set user.

  1.  Connect to any MongoS and verify copied users on it. 
  2.  Shard the database if shardKey is finalized (In this post, we are not sharing this information as it’s related to migration of Replica set to Sharded cluster only).

Shard the database:

Shard the collection with hash-based shard key:

Shard the collection with range based shard key:

Conclusion

Migration of a MongoDB replica set to a sharded cluster is very important to scale horizontally, increase the read/write operations, and also reduce the operations each shard manages.

We encourage you to try our products like Percona Server for MongoDB, Percona Backup for MongoDB, or Percona Operator for MongoDB. You can also visit our site to know “Why MongoDB Runs Better with Percona”.

Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments