MongoDB is one of the most admired NoSQL databases and one of the easiest to set up. Developers want to spend their time building application features, and MongoDB lets them build quickly on well-supported infrastructure with high availability and automatic failover.
In this blog post, we will discuss the top five things that MongoDB does better than anyone else.
Ease of Setup
First and foremost, MongoDB is very easy to install and deploy, so a developer can start writing application code almost immediately. Installation is simple on Windows, macOS, and Linux. On Linux or macOS, one can download the tarball, extract it, configure the data and log paths, and start the server. Percona offers “Percona Server for MongoDB”, an enhanced, open source, and highly scalable database that is a fully compatible, drop-in replacement for MongoDB Community Edition. For more details on installing Percona Server for MongoDB (PSMDB) on various operating systems, please visit the installation section. One can also run MongoDB on Kubernetes, and Percona provides a Kubernetes Operator for Percona Server for MongoDB.
Flexible Schema
One of MongoDB’s great features is its flexible schema. Collections do not enforce a schema by default, so a developer is not locked into a predefined structure: there is no need to declare field data types before inserting data, and a field’s data type can differ across the documents in a collection. Documents from an employee collection in MongoDB may look like:
```javascript
{ "emp_name" : "XYZ", "city" : "NYC" }
{ "emp_name" : "XYZ", "city" : "NYC", "country" : "US" }
```
Even if you have to change the structure of a document in a collection, you only need to update the document with the new structure. Consider the following document from a phone collection:
```javascript
{
  "_id" : ObjectId("5f8d175127f5862e567f676c"),
  "model_name" : "iphone12",
  "features" : {
    "5G_support" : true,
    "display" : "OLED with Ceramic Shield"
  }
}
```
Let’s say you want to add a new field, “screen_size”, to it. This takes a single update, with no need to declare the type of screen_size (a column in relational terms; in MongoDB it’s a key).
```javascript
// Add screen_size to the matching document; no type declaration is needed.
// updateOne() is used here because update() is deprecated in mongosh.
db.phone.updateOne(
  { model_name: "iphone12" },
  { $set : { "screen_size" : 6.1 } }
)

db.phone.find({ "model_name" : "iphone12" }).pretty()

// Output:
{
  "_id" : ObjectId("5f8d175127f5862e567f676c"),
  "model_name" : "iphone12",
  "features" : {
    "5G_support" : true,
    "display" : "OLED with Ceramic Shield"
  },
  "screen_size" : 6.1
}
```
MongoDB also lets you model related data either by embedding it in a single document or by storing a reference to another document:
Embedded single document
```javascript
{
  "_id" : ObjectId("5f8d175127f5862e567f676c"),
  "model_name" : "iphone12",
  "features" : {
    "5G_support" : true,
    "display" : "OLED with Ceramic Shield"
  },
  "screen_size" : 6.1
}
```
A document with reference
```javascript
// Customer collection’s document
{
  _id : 1211,
  name : "Apple Solomon Pond Mall",
  address : "601 Donald Lynch Blvd, Marlborough, MA 01752, United States"
}

// Review collection’s document; cust_id references the customer’s _id
{
  _id : 442321,
  review : "iphone SE is the cheapest iphone with a similar look like iPhone 8 but better internals with A13 bionic chip. However, it’s camera is not upto mark...",
  cust_id : 1211
}
```
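A reference like the cust_id shown above is resolved by a second lookup. The sketch below is only an in-memory illustration in plain JavaScript, not MongoDB driver code: the arrays stand in for the two collections, the review text is a placeholder, and `findCustomerForReview` is a hypothetical helper. Against a real deployment, the same idea would typically be a second query on the customer collection or a `$lookup` aggregation stage.

```javascript
// In-memory stand-ins for the customer and review collections
// (assumption: plain arrays, used only to show how a reference is followed).
const customers = [
  { _id: 1211, name: "Apple Solomon Pond Mall" },
];
const reviews = [
  { _id: 442321, review: "placeholder review text", cust_id: 1211 },
];

// Hypothetical helper: follow review.cust_id to the referenced customer.
// Returns null when the reference does not resolve to any customer.
function findCustomerForReview(review, customerDocs) {
  return customerDocs.find((c) => c._id === review.cust_id) ?? null;
}

const customer = findCustomerForReview(reviews[0], customers);
console.log(customer.name);
```

Embedding avoids this extra lookup, while referencing avoids duplicating the customer data in every review.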
Fault Tolerance
MongoDB has built-in replication that provides high availability and redundancy. Because copies of the data are kept on multiple servers, replication gives a layer of fault tolerance against the loss of a database server. Keeping copies of the data in different regions increases availability and improves data locality for reads, at the cost of potentially stale reads. Data locality for writes can also be improved with zoned shards.
In a MongoDB replica set, fault tolerance is the number of members that can become unavailable while still leaving enough members to elect a new primary.
The right fault tolerance level depends on both business requirements and budget. To achieve replication with a fault tolerance of one, we need a minimum of three nodes. That way, if one node goes down, a majority of nodes is still available to elect a new primary.
The table below shows the fault tolerance achieved for a given number of nodes.
| Number of nodes | Majority Required to Elect a New Primary | Fault Tolerance |
|---|---|---|
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
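The rows of this table follow from simple arithmetic: a majority is more than half of the members, and fault tolerance is whatever is left over. A minimal sketch, assuming every member is a voting member (the function names are illustrative, not part of any MongoDB API):

```javascript
// Votes needed to elect a primary: a strict majority of the members.
function majorityRequired(members) {
  return Math.floor(members / 2) + 1;
}

// Members that can be unavailable while a majority is still reachable.
function faultTolerance(members) {
  return members - majorityRequired(members);
}

// Print one line per replica set size: nodes, majority, fault tolerance.
for (const n of [3, 4, 5, 6]) {
  console.log(n, majorityRequired(n), faultTolerance(n));
}
```

Note that going from three to four nodes, or from five to six, raises the majority but not the fault tolerance, which is why odd member counts are the usual choice.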
Replica sets can also increase the number of queries served to the application, because clients can send read requests to the secondaries of the replica set. For example, a client can set its readPreference to read from a secondary, from the “nearest” member, or from members matching a tag set. However, reading from secondary nodes comes with a tradeoff: clients may see stale data.
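One common place to set the read preference is the connection string. The sketch below assembles such a string in plain JavaScript. The helper `buildReplicaSetUri`, the host names, and the replica set name `rs0` are placeholders for illustration; `replicaSet`, `readPreference`, and `maxStalenessSeconds` are standard connection string options.

```javascript
// Hypothetical helper that assembles a MongoDB connection string.
// Host names and the replica set name passed in below are placeholders.
function buildReplicaSetUri(hosts, replicaSet, readPreference, maxStalenessSeconds) {
  const params = new URLSearchParams({ replicaSet, readPreference });
  // Optional bound on how stale a secondary may be before it is skipped.
  if (maxStalenessSeconds !== undefined) {
    params.set("maxStalenessSeconds", String(maxStalenessSeconds));
  }
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const uri = buildReplicaSetUri(
  ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
  "rs0",
  "secondaryPreferred", // read from a secondary, fall back to the primary
  120
);
console.log(uri);
```

Using `secondaryPreferred` instead of `secondary` keeps reads working when no secondary is available, and a staleness bound limits how far behind the chosen secondary may be.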
Scalability
Scalability is one of the key features of MongoDB. It is built on a scale-out architecture which enables it to sustain a high volume of data and traffic.
In any database system, growth can be managed in two ways: vertical and horizontal scaling. Vertical scaling means increasing the capacity of a single server, for example by adding a more powerful CPU, more RAM, or more disk space. Horizontal scaling means dividing the dataset across multiple smaller machines, without requiring code changes at the application level.
MongoDB supports horizontal scaling through sharding. It is cost-effective, and because the load is distributed across the shards, more data can be written or read back as necessary. As the dataset grows, a new shard can be added at any time and MongoDB will automatically migrate data to it.
In MongoDB, sharding happens at the collection level, and each document has a shard key whose value decides which shard the document lives on. The application doesn’t send requests to the shards directly. Instead, it sends them to mongos (the query router), which routes each read or write request to the appropriate shards using metadata cached from the config servers.
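The routing idea can be sketched in a few lines of plain JavaScript. This is a deliberately simplified illustration, not how the query router is actually implemented: the chunk table, shard names, and `routeByShardKey` are all made up, and a numeric shard key with range-based chunks is assumed.

```javascript
// Simplified stand-in for the routing metadata cached from the config
// servers: each chunk covers [min, max) of the shard key and lives on
// one shard. Shard names and ranges are invented for this example.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Hypothetical router: pick the shard whose chunk range contains the key.
function routeByShardKey(shardKeyValue, chunkTable) {
  const chunk = chunkTable.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) throw new Error(`no chunk covers ${shardKeyValue}`);
  return chunk.shard;
}

// A request for shard key value 1211 is sent to the shard owning that range.
console.log(routeByShardKey(1211, chunks));
```

Because the router only needs the chunk table, the application stays unaware of how many shards exist or where a given document lives.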
Performance
Database performance depends on many factors, such as database design, application queries, and load. MongoDB can handle large volumes of unstructured data because it lets users model and query data in the way that best suits their workload. Retrieving a single document that embeds related data is generally faster than joining data across multiple collections.
To get better performance, one also needs to make sure that the working set fits in RAM. All data is persisted on disk, except when using the in-memory storage engine, but during query execution MongoDB reads data from RAM when it is already cached there. It is also important to have the right indexes and enough RAM in place to take advantage of MongoDB’s performance.
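A rough way to reason about whether the working set fits in memory is to compare it with the storage engine cache. The sketch below assumes the WiredTiger storage engine with its default cache size, which is the larger of 50% of (RAM minus 1 GiB) and 256 MiB. The function names and the `hotDataFraction` input are assumptions for illustration; the fraction of data that is "hot" has to be estimated for each workload.

```javascript
const GIB = 1024 ** 3;
const MIB = 1024 ** 2;

// Default WiredTiger cache size: the larger of
// 50% of (RAM - 1 GiB) and 256 MiB.
function defaultWiredTigerCacheBytes(ramBytes) {
  return Math.max(0.5 * (ramBytes - GIB), 256 * MIB);
}

// Rough check: do the indexes plus the frequently accessed ("hot") part
// of the data fit in the cache? hotDataFraction is a workload estimate.
function workingSetFitsInCache({ dataSizeBytes, indexSizeBytes, hotDataFraction, ramBytes }) {
  const workingSetBytes = indexSizeBytes + dataSizeBytes * hotDataFraction;
  return workingSetBytes <= defaultWiredTigerCacheBytes(ramBytes);
}

// Example: 20 GiB of data, 2 GiB of indexes, 20% hot data, 16 GiB of RAM.
console.log(
  workingSetFitsInCache({
    dataSizeBytes: 20 * GIB,
    indexSizeBytes: 2 * GIB,
    hotDataFraction: 0.2,
    ramBytes: 16 * GIB,
  })
);
```

This is only a back-of-the-envelope estimate: it ignores the filesystem cache and the difference between on-disk and in-cache data sizes, so real sizing should be confirmed by monitoring.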
Conclusion
MongoDB is feature-rich and an easy way to get started with NoSQL databases. It has a flexible data model, an expressive and easy-to-learn query syntax, and automatic failover with replica sets, and it scales well. It also has good documentation, which makes a developer’s life a lot easier.
Percona Server for MongoDB offers all of the functionality of MongoDB Enterprise edition with a non-licensed model. This means there is no need to worry about purchasing licenses for production or non-production environments. By using non-licensed, open source software, you can ensure consistent deployment across all environments while still meeting the security standards required by your organization. And if support is what you need, Percona has you covered there as well.
MongoDB has both Community and Enterprise editions. The Community edition is source-available and free to use under the terms of the SSPL, while Enterprise is available as part of the MongoDB Enterprise subscription, which includes MongoDB-provided support for your deployment.
To learn more about what Percona Server for MongoDB covers, please visit the blog “Why Pay for Enterprise When Open Source Has You Covered?”.
Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.
Well, I expect a little more of details. I feel disappointed.
Thanks Pablo for going through the blog and your comment. I really felt sad that you found this disappointed. Well, this blog was specifically intended to mention top 5 features only.