In this blog post, we will walk through a first replica-set configuration for MySQL DBAs. We will map as many MongoDB terms as possible to their MySQL equivalents and compare how the two databases work.

Replica-sets are the most common MongoDB deployment nowadays. One of the most frequent questions is: How do you deploy a replica-set? In this post, we compare the MongoDB replica-set with standard MySQL master-slave replication that does not use GTID.

Replica-Set configuration

The replica-set usually consists of 3+ instances on different hosts that communicate with each other through both dedicated connections and heartbeat packets. The heartbeats check the other instances’ health in order to keep the replica-set highly available. The names are slightly different: “primary” corresponds to “master” in MySQL, and “secondary” corresponds to “slave.” A MongoDB replica-set has only a single primary, unlike MySQL, which can have more than one master depending on how you set it up.

Master-Slave

Unlike MySQL, MongoDB does not use files (such as binary log or relay log files) to replicate. All the operations that should be replicated are stored in the oplog.rs collection. This is a capped collection, which means it has a fixed maximum size. Therefore, when it becomes full, new entries replace the oldest documents. The time span covered by the entries that oplog.rs currently holds is called the “oplog window,” and it is measured in seconds. If a secondary node falls behind by more than the oplog window, a new initial sync is needed. The same happens in MySQL when a slave tries to read binary logs that have been deleted.
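To make the oplog window concrete, here is a minimal sketch that runs under Node.js. The helper names `oplogWindowSeconds` and `needsInitialSync` are hypothetical and not part of MongoDB, and the timestamps are invented for illustration.

```javascript
// Hypothetical helpers (not MongoDB APIs) that illustrate the oplog window.

// Seconds between the oldest and the newest entry currently held in the oplog.
function oplogWindowSeconds(oldestEntry, newestEntry) {
  return (newestEntry.getTime() - oldestEntry.getTime()) / 1000;
}

// A secondary can only catch up if the last operation it applied is still
// present in the oplog. Otherwise it has to start over with an initial sync.
function needsInitialSync(secondaryLastApplied, oldestEntry) {
  return secondaryLastApplied.getTime() < oldestEntry.getTime();
}

// Invented timestamps: the oplog covers six hours of writes.
const oldest = new Date("2016-10-01T00:00:00Z");
const newest = new Date("2016-10-01T06:00:00Z");

console.log(oplogWindowSeconds(oldest, newest)); // 21600

// A secondary that stopped one hour before the oldest entry cannot catch up.
console.log(needsInitialSync(new Date("2016-09-30T23:00:00Z"), oldest)); // true
```

In the shell itself, rs.printReplicationInfo() reports the configured oplog size and the time span between its first and last entries.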

When the replica-set is initialized, all the inserts, updates and deletes are saved in the oplog.rs collection of a database called “local.” Initializing the replica-set can be compared to enabling binary logs in the MySQL configuration.

Now let’s point out the most important differences between these databases: the way they handle replication, and how they keep high availability.

For a standard MySQL replication we need to enable the binlog in the config file, perform a backup, note the binlog position, restore this backup on a server with a different server id, and finally start the slave threads on the slave. In MongoDB, on the other hand, you only need a primary that has been previously configured with the replSet parameter, and then add the new secondaries started with the same replSet parameter. Each new secondary copies the data through an automatic initial sync: no backup needed, no restore needed, no oplog position needed.
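As an illustration of how little MongoDB needs, the sketch below builds the configuration document that describes a three-member set. It is a sketch under assumptions: the helper `replicaSetConfig` is hypothetical, the name rs01 and the localhost ports are example values, and every member is assumed to be running already with the same replSet name and an empty data directory.

```javascript
// Hypothetical helper: builds a replica-set configuration document.
function replicaSetConfig(name, hosts) {
  return {
    _id: name, // must match the replSet name every mongod was started with
    members: hosts.map((host, i) => ({ _id: i, host: host })),
  };
}

// Example values only: three local instances on consecutive ports.
const rsConfig = replicaSetConfig("rs01", [
  "localhost:27017",
  "localhost:27018",
  "localhost:27019",
]);

console.log(JSON.stringify(rsConfig, null, 2));

// In the shell, connected to one of the members, the document would be
// passed to the real helper like this:
//   rs.initiate(rsConfig)
```

Passing the whole member list to rs.initiate() is an alternative to initiating one node and adding the others one at a time. Both end with the same configuration.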

Unlike standard MySQL replication, MongoDB is capable of electing a new primary when the current primary fails, without human intervention. This process is called an election: each instance votes for a new primary based on how up-to-date the candidates are. For a secondary to become primary it needs the majority of votes, so in a three-member set at least two out of three votes are required. This is why at least three instances are necessary for a reliable production replica-set. We can also have an arbiter, a member dedicated to voting only: it does not hold any data and simply casts a vote for one of the secondaries. Most drivers are capable of following a primary change as long as we pass the replica-set name in the connection string. With this information, drivers discover the members and track which one is primary on the fly, using the topology that each member reports through the isMaster command (hello in recent versions).
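The vote arithmetic and the connection string can be sketched as follows. The functions `majority`, `faultTolerance` and `replicaSetUri` are hypothetical helpers written for this post, and the hosts and the rs01 name are example values. The `replicaSet` option itself is part of the standard MongoDB connection string format.

```javascript
// Votes a candidate needs to win an election: a strict majority.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Voting members that can be lost while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Connection string that lists the members and names the replica-set,
// so the driver can find the current primary by itself.
function replicaSetUri(hosts, replicaSetName) {
  return "mongodb://" + hosts.join(",") +
    "/?replicaSet=" + encodeURIComponent(replicaSetName);
}

console.log(majority(3), faultTolerance(3)); // 2 1
console.log(majority(2), faultTolerance(2)); // 2 0

console.log(
  replicaSetUri(["localhost:27017", "localhost:27018", "localhost:27019"], "rs01")
);
// mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=rs01
```

With two members the majority is still two, so losing either one leaves the set without a primary. A third voter, even a data-less arbiter, is what allows one failure to be survived.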

Note: There are a few tools capable of emulating this behavior in MySQL. One example is: https://www.percona.com/blog/2016/09/02/mha-quickstart-guide/

Maintaining Replica-sets

After deploying a replica-set, we should monitor it. A few commands report not only the available hosts but also the replication status, and others let us change the replication configuration.

The command rs.status() will show all the details of the replication, such as the replica-set name, all the hosts that belong to this replica-set, and their status. This command is similar to “show slave hosts” in MySQL.

In addition, the command rs.printSlaveReplicationInfo() shows how far behind the primary each secondary is. (Newer shells call it rs.printSecondaryReplicationInfo().) It can be compared to “show slave status” in MySQL.
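As a rough sketch of what these commands report, the function below derives each secondary's lag from a document shaped like the output of rs.status(). The helper `secondaryLagSeconds` is hypothetical, and the sample document is invented and trimmed to the three fields the helper reads (`name`, `stateStr`, `optimeDate`).

```javascript
// Hypothetical helper: lag of every secondary relative to the primary,
// computed from a document shaped like the output of rs.status().
function secondaryLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no primary right now, e.g. during an election
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      lagSeconds: (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000,
    }));
}

// Invented sample, trimmed to the fields used above.
const sample = {
  set: "rs01",
  members: [
    { name: "localhost:27017", stateStr: "PRIMARY", optimeDate: new Date("2016-10-01T12:00:10Z") },
    { name: "localhost:27018", stateStr: "SECONDARY", optimeDate: new Date("2016-10-01T12:00:10Z") },
    { name: "localhost:27019", stateStr: "SECONDARY", optimeDate: new Date("2016-10-01T12:00:04Z") },
  ],
};

// Reports 0 seconds for localhost:27018 and 6 seconds for localhost:27019.
console.log(secondaryLagSeconds(sample));

// Inside mongosh the same helper can read the live status (read-only).
if (typeof rs !== "undefined") {
  printjson(secondaryLagSeconds(rs.status()));
}
```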

Replica-sets can be managed online: rs.config() returns the current configuration, and rs.reconfig() applies a modified one. Passing the replica-set name as a parameter to the mongod process, or in the config file, is the only thing needed to start mongod as a replica-set member. All the other settings can be changed online through the configuration document.
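A typical online change is adjusting a member's priority. The sketch below only manipulates a plain configuration document. The helper `withPriority` is hypothetical, and the sample document is an invented, trimmed version of what rs.config() returns.

```javascript
// Hypothetical helper: returns a copy of a replica-set configuration
// with the priority of one member changed. The input is not modified.
function withPriority(config, host, priority) {
  return Object.assign({}, config, {
    members: config.members.map((m) =>
      m.host === host ? Object.assign({}, m, { priority: priority }) : m
    ),
  });
}

// Invented, trimmed example of a configuration document.
const current = {
  _id: "rs01",
  version: 1,
  members: [
    { _id: 0, host: "localhost:27017", priority: 1 },
    { _id: 1, host: "localhost:27018", priority: 1 },
    { _id: 2, host: "localhost:27019", priority: 1 },
  ],
};

// A member with priority 0 never becomes primary.
const updated = withPriority(current, "localhost:27019", 0);

console.log(updated.members[2].priority); // 0
console.log(current.members[2].priority); // 1
```

In the shell, connected to the primary, the same idea would read `rs.reconfig(withPriority(rs.config(), "localhost:27019", 0))`.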

Step-by-Step How to Start Your First Replica-Set:

Follow the instructions below to start testing a replica-set with three nodes, using all the commands we’ve talked about.

For a production installation, please follow instructions on how to use our repositories here.

Download Percona Server for MongoDB:

Create folders:

Generate the config files:

(This is a simple config file, and almost all parameters are the default, so please edit the database directory first.)
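The original config listing is not reproduced here. As a stand-in, this Node.js sketch prints one minimal YAML config per node. The helper `mongodConfig`, the /data/rs01 directories, the ports and the rs01 name are all assumptions to adapt. The option names (storage.dbPath, systemLog, net, processManagement.fork, replication.replSetName) are standard mongod configuration file options.

```javascript
// Hypothetical helper: text of a minimal mongod YAML config for one node.
function mongodConfig(replSetName, port, baseDir) {
  return [
    "storage:",
    `  dbPath: ${baseDir}/db`,
    "systemLog:",
    "  destination: file",
    `  path: ${baseDir}/mongod.log`,
    "  logAppend: true",
    "net:",
    `  port: ${port}`,
    "  bindIp: 127.0.0.1",
    "processManagement:",
    "  fork: true",
    "replication:",
    `  replSetName: ${replSetName}`,
    "",
  ].join("\n");
}

// Example layout only: three nodes on one host, one directory per node.
[27017, 27018, 27019].forEach((port, i) => {
  const baseDir = `/data/rs01/node${i + 1}`;
  console.log(`# ${baseDir}/mongod.conf`);
  console.log(mongodConfig("rs01", port, baseDir));
});
```

Each file is then passed to its own instance with `mongod --config <file>`. The three files differ only in port and directories, while the replica-set name is identical in all of them.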

Start the MongoDB instances:

  • Before starting any MongoDB instance, confirm that the config files exist:

  • Then start the mongod process, and repeat for the other instances:

Initializing a replica-set:

  • Connect to the first MongoDB:

  • Add a new member

  • Check replication lag:

  • Start an election:
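The shell commands behind the steps above can be listed in order with a small Node.js sketch. The helper `firstReplicaSetSteps` and the localhost ports are assumptions for a local three-node test. The printed commands are the rs helpers discussed earlier, plus rs.stepDown(), which asks the current primary to step down and so triggers an election.

```javascript
// Hypothetical helper: the shell commands for the steps above, in order.
// They are meant to be typed one by one into a shell connected to the
// first instance.
function firstReplicaSetSteps(secondaryHosts) {
  return [
    // Turns the first node into a one-member set. Wait until the prompt
    // shows PRIMARY before adding members.
    "rs.initiate()",
    // Each added member starts its initial sync automatically.
    ...secondaryHosts.map((host) => `rs.add("${host}")`),
    // Replication lag (rs.printSecondaryReplicationInfo() in newer shells).
    "rs.printSlaveReplicationInfo()",
    // Run on the primary to force an election.
    "rs.stepDown()",
  ];
}

firstReplicaSetSteps(["localhost:27018", "localhost:27019"]).forEach((command) =>
  console.log(command)
);
```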

Shut down instances:

Hopefully, this was helpful. Please post any questions in the comments section.

7 Comments
shakirajane

Wow that’s an complete survey of mano db

Wagner Bianchi

Excellent blog Mr. Tonete! Congrats man!

Jerry

Excellent! Adamo

Silas Mendes

Great post Adamo! Thanks for sharing!

Mohit

I am trying to set it up on remote machine and getting below error.

rs01:PRIMARY> rs.add("panther2.domain.com:27017")
{
"ok" : 0,
"errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 172.31.0.28:27017; the following nodes did not respond affirmatively: panther2.domain.com:27017 failed with Couldn't get a connection within the time limit",
"code" : 74
}

Mohit

Thank you Adamo. It works after setting up the correct hostname under /etc/hostname. Long live percona. 🙂