In this blog post, we will work through a first replica-set configuration, aimed at MySQL DBAs. We will map as many terms as possible between the two databases and compare how they work.
Replica-sets are the most common MongoDB deployment nowadays. One of the most frequent questions is: how do you deploy a replica-set? In this blog, we’ll compare the MongoDB replica-set to standard MySQL master-slave replication without GTID.
A replica-set usually consists of three or more instances on different hosts that communicate with each other through both dedicated connections and heartbeat packets; the heartbeats check the other instances’ health in order to keep the replica-set highly available. The names are slightly different: “primary” corresponds to “master” in MySQL, and “secondary” corresponds to “slave.” MongoDB supports only a single primary — unlike MySQL, which can have more than one master depending on how you set it up.
Unlike MySQL, MongoDB does not replicate through files (such as binary log or relay log files). All the operations that should be replicated are written to the oplog.rs collection. This is a capped collection, which means it has a fixed size: when it becomes full, new documents overwrite the oldest ones. The amount of time the oplog.rs can cover is called the “oplog window,” and it is measured in seconds. If a secondary node falls behind for longer than the oplog window, a new initial sync is needed. The same happens in MySQL when a slave tries to read binary logs that have already been deleted.
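The capped-collection behavior can be sketched in Python (a toy simulation of the idea, not MongoDB code — the five-slot buffer and operation documents are made up for illustration):

```python
from collections import deque

# A toy model of a capped collection: fixed capacity, oldest entries evicted first.
# A capacity of 5 documents stands in for the oplog's fixed size on disk.
oplog = deque(maxlen=5)

for op_id in range(8):  # write 8 operations into a 5-slot oplog
    oplog.append({"ts": op_id, "op": "insert"})

# Operations 0-2 have been overwritten; a secondary that still needs them
# would require a full initial sync (like a MySQL slave whose binlogs were purged).
oldest_available = oplog[0]["ts"]
print(oldest_available)  # 3
```

The effective oplog window is simply how far back the oldest surviving entry reaches: the faster you write, the shorter the window for a given oplog size.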
When the replica-set is initialized, all inserts, updates and deletes are recorded in the oplog.rs collection of a database called “local.” Initializing the replica-set can be compared to enabling binary logs in the MySQL configuration.
Now let’s point out the most important differences between these databases: the way they handle replication, and how they keep high availability.
For standard MySQL replication we need to enable the binlog in the config file, perform a backup, take note of the binlog position, restore the backup on a server with a different server_id, and finally start the slave threads on the slave. In MongoDB, on the other hand, you only need a primary that has been started with the replSet parameter, and then you add the new secondaries with the same replSet parameter. No backup, no restore, and no oplog position needed.
Unlike MySQL, MongoDB is capable of electing a new primary when the current primary fails. This process is called an election, and the instances vote for a new primary based on how up-to-date they are, without human intervention. This is why at least three instances are necessary for a reliable production replica-set. The election is based on votes, and to become primary a secondary needs the majority of them — at least two out of three votes/boxes. We can also have an arbiter dedicated to voting only: it holds no data and only takes part in electing a new primary. Most drivers can follow a primary change automatically: we only need to pass the replica-set name in the connection string, and with this information the driver maps primaries and secondaries on the fly using the result of rs.config().
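The majority rule behind elections is simple arithmetic, sketched here in Python (an illustrative calculation, not driver or server code):

```python
def majority(voting_members: int) -> int:
    """Votes required to win an election: floor(n/2) + 1."""
    return voting_members // 2 + 1

def tolerated_failures(voting_members: int) -> int:
    """How many members can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)

for n in (2, 3, 5):
    print(f"{n} members: majority {majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

This is why two data-bearing nodes plus an arbiter is a popular minimal layout: a two-member set needs both votes and so tolerates no failures, while three voting members (even if one is an arbiter) tolerate one.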
Note: There are a few tools capable of emulating this behavior in MySQL. One example is: https://www.percona.com/blog/2016/09/02/mha-quickstart-guide/
Maintaining Replica-sets
After deploying a replica-set, we should monitor it. There are a couple of commands that identify not only the available hosts but also the replication status, and some of them can modify the replication configuration online as well.
The command rs.status() will show all the details of the replication, such as the replica-set name, all the hosts that belong to this replica-set, and their status. This command is similar to “show slave hosts” in MySQL.
In addition, the command rs.printSlaveReplicationInfo() shows how far behind the secondaries are. It can be compared to the Seconds_Behind_Master field of “show slave status” in MySQL.
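The delay reported per secondary is essentially the difference between the primary’s and the secondary’s last applied oplog timestamps. A rough sketch of that calculation in Python (the timestamps here are hypothetical, not taken from a live server):

```python
from datetime import datetime

# Hypothetical "syncedTo" optimes, as reported for the primary and one secondary.
primary_optime = datetime(2016, 11, 10, 17, 40, 5)
secondary_optime = datetime(2016, 11, 10, 17, 39, 35)

# Replication lag is the gap between the two last applied operations.
lag = (primary_optime - secondary_optime).total_seconds()
print(f"{int(lag)} secs behind the primary")  # 30 secs behind the primary
```

If this lag grows past the oplog window discussed earlier, the secondary can no longer catch up incrementally and needs a full initial sync.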
Replica-sets can be managed online: rs.config() shows the current configuration, and rs.reconfig() applies a new one. Passing the replica-set name as a parameter to the mongod process, or in the config file, is the only action necessary to start a replica-set; all the other settings can be managed through these commands.
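The replica-set configuration is just a document: in the mongo shell you fetch it with rs.config(), edit it, and apply it back with rs.reconfig(). The shape of that edit can be sketched in Python (a mock of the config document with simplified fields, not a live connection):

```python
# Mock of the document returned by rs.config() (simplified fields).
cfg = {
    "_id": "rs01",
    "version": 1,
    "members": [
        {"_id": 0, "host": "mongo32:27017", "priority": 1},
        {"_id": 1, "host": "mongo32:27018", "priority": 1},
        {"_id": 2, "host": "mongo32:27019", "priority": 1},
    ],
}

# Prefer member 0 in elections by raising its priority.
cfg["members"][0]["priority"] = 2

# Each applied reconfiguration increments the config version
# (visible as "configVersion" in rs.status() output).
cfg["version"] += 1
print(cfg["version"])  # 2
```

In the real shell the equivalent would be along the lines of `cfg = rs.config(); cfg.members[0].priority = 2; rs.reconfig(cfg)`.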
Step-by-Step How to Start Your First Replica-Set:
Follow the instructions below to start a test replica-set with three nodes, using all the commands we’ve talked about.
For a production installation, please follow instructions on how to use our repositories here.
Download Percona Server for MongoDB:
```shell
$ cd ~
wget https://www.percona.com/downloads/percona-server-mongodb-3.2/percona-server-mongodb-3.2.10-3.0/binary/tarball/percona-server-mongodb-3.2.10-3.0-trusty-x86_64.tar.gz
tar -xvzf percona-server-mongodb-3.2.10-3.0-trusty-x86_64.tar.gz
mv percona-server-mongodb-3.2.10-3.0 mongodb
```
Create folders:
```shell
cd mongodb/bin
mkdir data1 data2 data3
```
Generate the config files:
(This is a simple config file; almost all parameters are left at their defaults, so please check the database directory first.)
```shell
for i in {1..3}; do
echo 'storage:
  dbPath: "'$(pwd)'/data'$i'"
systemLog:
  destination: file
  path: "'$(pwd)'/data'$i'/mongodb.log"
  logAppend: true
processManagement:
  fork: true
net:
  port: '$(( 27017 + $i - 1 ))'
replication:
  replSetName: "rs01"' > config$i.cfg
done
```
Starting the mongod instances:
- Before initializing any MongoDB instance, confirm that the config files exist:
```shell
percona@mongo32:~/mongodb/bin$ ls -lah *.cfg
config1.cfg
config2.cfg
config3.cfg
```
- Then start the mongod process and repeat for the other two:
```shell
percona@mongo32:~/mongodb/bin$ ./mongod -f config1.cfg
2016-11-10T16:56:12.854-0200 I STORAGE  [main] Counters: 0
2016-11-10T16:56:12.855-0200 I STORAGE  [main] Use SingleDelete in index: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1263
child process started successfully, parent exiting
percona@mongo32:~/mongodb/bin$ ./mongod -f config2.cfg
2016-11-10T16:56:21.992-0200 I STORAGE  [main] Counters: 0
2016-11-10T16:56:21.993-0200 I STORAGE  [main] Use SingleDelete in index: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1287
child process started successfully, parent exiting
percona@mongo32:~/mongodb/bin$ ./mongod -f config3.cfg
2016-11-10T16:56:24.250-0200 I STORAGE  [main] Counters: 0
2016-11-10T16:56:24.250-0200 I STORAGE  [main] Use SingleDelete in index: 0
about to fork child process, waiting until server is ready for connections.
forked process: 1310
child process started successfully, parent exiting
```
Initializing a replica-set:
- Connect to the first MongoDB:
```shell
$ ./mongo
> rs.initiate()
{
	"info2" : "no configuration specified. Using a default configuration for the set",
	"me" : "mongo32:27017",
	"ok" : 1
}
```
- Add the new members:
```shell
rs01:PRIMARY> rs.add('mongo32:27018') // replace with your hostname; localhost is not allowed
{ "ok" : 1 }
rs01:PRIMARY> rs.add('mongo32:27019')
{ "ok" : 1 }
rs01:PRIMARY> rs.status()
{
	"set" : "rs01",
	"date" : ISODate("2016-11-10T19:40:08.190Z"),
	"myState" : 1,
	"term" : NumberLong(1),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"members" : [
		{
			"_id" : 0,
			"name" : "mongo32:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 2636,
			"optime" : {
				"ts" : Timestamp(1478806805, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"electionTime" : Timestamp(1478804218, 2),
			"electionDate" : ISODate("2016-11-10T18:56:58Z"),
			"configVersion" : 3,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "mongo32:27018",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 44,
			"optime" : {
				"ts" : Timestamp(1478806805, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:40:07.129Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:40:05.132Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "mongo32:27017",
			"configVersion" : 3
		},
		{
			"_id" : 2,
			"name" : "mongo32:27019",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 3,
			"optime" : {
				"ts" : Timestamp(1478806805, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:40:07.130Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:40:06.239Z"),
			"pingMs" : NumberLong(0),
			"configVersion" : 3
		}
	],
	"ok" : 1
}
```
- Check replication lag:
```shell
$ mongo
rs01:PRIMARY> rs.printSlaveReplicationInfo()
source: mongo32:27018
	syncedTo: Thu Nov 10 2016 17:40:05 GMT-0200 (BRST)
	0 secs (0 hrs) behind the primary
source: mongo32:27019
	syncedTo: Thu Nov 10 2016 17:40:05 GMT-0200 (BRST)
	0 secs (0 hrs) behind the primary
```
- Start an election:
```shell
$ mongo
rs01:PRIMARY> rs.stepDown()
2016-11-10T17:41:27.271-0200 E QUERY    [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27017' :
DB.prototype.runCommand@src/mongo/shell/db.js:135:1
DB.prototype.adminCommand@src/mongo/shell/db.js:153:16
rs.stepDown@src/mongo/shell/utils.js:1182:12
@(shell):1:1
2016-11-10T17:41:27.274-0200 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2016-11-10T17:41:27.275-0200 I NETWORK  [thread1] reconnect 127.0.0.1:27017 (127.0.0.1) ok
rs01:SECONDARY>
rs01:SECONDARY> rs.status()
{
	"set" : "rs01",
	"date" : ISODate("2016-11-10T19:41:39.280Z"),
	"myState" : 2,
	"term" : NumberLong(2),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"members" : [
		{
			"_id" : 0,
			"name" : "mongo32:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 2727,
			"optime" : {
				"ts" : Timestamp(1478806805, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"configVersion" : 3,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "mongo32:27018",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 135,
			"optime" : {
				"ts" : Timestamp(1478806805, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2016-11-10T19:40:05Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:41:37.155Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:41:37.155Z"),
			"pingMs" : NumberLong(0),
			"configVersion" : 3
		},
		{
			"_id" : 2,
			"name" : "mongo32:27019",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 94,
			"optime" : {
				"ts" : Timestamp(1478806897, 1),
				"t" : NumberLong(2)
			},
			"optimeDate" : ISODate("2016-11-10T19:41:37Z"),
			"lastHeartbeat" : ISODate("2016-11-10T19:41:39.151Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-10T19:41:38.354Z"),
			"pingMs" : NumberLong(0),
			"electionTime" : Timestamp(1478806896, 1),
			"electionDate" : ISODate("2016-11-10T19:41:36Z"),
			"configVersion" : 3
		}
	],
	"ok" : 1
}
rs01:SECONDARY> exit
```
Shut down instances:
```shell
$ killall mongod
```
Hopefully, this was helpful. Please post any questions in the comments section.
I am trying to set it up on a remote machine and getting the error below.
```shell
rs01:PRIMARY> rs.add("panther2.domain.com:27017")
{
	"ok" : 0,
	"errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 172.31.0.28:27017; the following nodes did not respond affirmatively: panther2.domain.com:27017 failed with Couldn't get a connection within the time limit",
	"code" : 74
}
```
Hello Mohit,
This can happen due to the number of instances you have in the replica-set, but in your case it looks like a network/communication issue. Please reach out to me on Twitter at @AdamoTonete.
Regards,
Thank you Adamo. It works after setting up the correct hostname under /etc/hostname. Long live percona. 🙂