A comparison between popular NoSQL databases: MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB
As an entrepreneur or an enterprise IT project manager, you likely think about databases to use in your application development projects. Many projects require NoSQL databases in addition to relational databases. Which NoSQL database should you use? Our MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB comparison can help you to decide.
We first provide overviews of all of these NoSQL databases. After explaining their pros and cons, we compare them. We then recommend when to use which one. Read on.
An introduction to MongoDB
MongoDB is one of the most popular NoSQL databases. It’s a general-purpose database that’s document-based. As a document database, MongoDB stores data in JSON-like documents.
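To make the document model concrete, here is a minimal sketch in plain Python. It doesn't use a MongoDB driver or talk to a server, and the collection, field names, and values are made up for illustration. It only shows what "JSON-like documents" with nested data and differing fields can look like.

```python
import json

# Two hypothetical documents from the same collection. Field names are
# illustrative only; nothing here talks to a real MongoDB server.
customers = [
    {
        "_id": 1,
        "name": "Asha",
        "emails": ["asha@example.com"],
        "address": {"city": "Pune", "country": "IN"},  # nested sub-document
    },
    {
        "_id": 2,
        "name": "Ben",
        "loyalty_points": 120,  # a field the first document does not have
    },
]

# Each document serializes to JSON-like text on its own.
serialized = [json.dumps(doc, sort_keys=True) for doc in customers]

# Documents in one collection need not share the same set of fields.
shared_fields = set(customers[0]) & set(customers[1])
```

Only `_id` and `name` appear in both documents here, and each document still stands on its own.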
10gen, a software company, started developing MongoDB in 2007. The company transitioned to an open-source development model in 2009 and launched MongoDB in the same year. 10gen changed its name to MongoDB Inc. in 2013.
This database went through several iterations of development. At the time of writing, its latest stable release is 4.4.3, which came in December 2020.
The advantages of MongoDB
MongoDB offers the following advantages:
- Flexibility: MongoDB doesn’t enforce a fixed schema. Documents in the same collection can have different fields, which offers flexibility.
- Sharding: MongoDB supports sharding, which is a useful technique for partitioning. It allows you to store large sets of data. You can just distribute them to different servers.
- Speed: MongoDB allows you to index documents. This enables faster access to documents, and speed is a key advantage of MongoDB.
- Availability: MongoDB offers very useful features like replication and load balancing, and these features provide high availability. You can also use MongoDB as a file system via “GridFS” (Grid File System), which stores large files in the database.
- Scalability: MongoDB allows horizontal scaling. This helps you to deal with large data sets since you can distribute them to multiple servers.
- Querying capabilities: MongoDB is a feature-rich database management system. It provides query support, which includes ad-hoc queries.
- Support: MongoDB provides excellent documentation and robust technical support. There’s community support available too.
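The sharding idea from the list above can be sketched in a few lines of Python. This is a simplified illustration, not MongoDB's actual algorithm: the `shard_for` and `distribute` helpers, the `user_id` shard key, and the shard count of 4 are all assumptions made for the example. A real cluster manages chunks of key ranges and rebalances them for you.

```python
import hashlib


def shard_for(shard_key: str, shard_count: int) -> int:
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(documents, key_field, shard_count):
    """Place each document on the shard chosen by its shard key."""
    shards = [[] for _ in range(shard_count)]
    for doc in documents:
        index = shard_for(str(doc[key_field]), shard_count)
        shards[index].append(doc)
    return shards


# Hypothetical data set: 1000 small documents keyed by "user_id".
docs = [{"user_id": f"user-{i}", "score": i} for i in range(1000)]
shards = distribute(docs, "user_id", shard_count=4)
sizes = [len(shard) for shard in shards]
```

Because the hash is stable, the same `user_id` always lands on the same shard, so a lookup by shard key only needs to visit one server.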
The disadvantages to using MongoDB
MongoDB has a few disadvantages, which are as follows:
- Limited support for transactions: Are you creating an application that requires transactions? Do you plan to update multiple documents or collections using transactions? MongoDB added multi-document ACID transactions only in version 4.0 (and in version 4.2 for sharded clusters). Earlier versions guarantee atomicity for single-document operations only, and multi-document transactions carry a performance cost.
- Immature and inadequate “Join” capabilities: MongoDB lacks a mature “Join” capability. The “$lookup” aggregation stage performs a left outer join, but you need to write your own code for anything more complex, and this can adversely impact the performance.
- A high degree of memory usage: As we discussed, MongoDB provides limited support for “Joins”, which causes data redundancy. This DBMS (Database Management System) also stores the key names alongside the values in every document. These factors combine to increase the usage of memory.
- Requires high expertise: Do you want excellent performance from MongoDB? You need the right expertise to implement the indexes well. Failing to do so can considerably degrade the performance.
- Duplication of data: MongoDB isn’t an RDBMS (Relational Database Management System). It doesn’t have well-defined relations, which causes duplication of data.
- Limitations concerning document size: You can’t have documents in MongoDB that are larger than 16 MB.
- Limitations concerning nesting: You can nest documents up to 100 levels only.
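The two limits above are easy to check before you send a document to the database. The sketch below is an approximation under stated assumptions: it measures the JSON-encoded size rather than the BSON size that MongoDB actually stores, and the way it counts nesting levels (the top-level document counts as level 1) is our own convention, so the server's count may differ slightly. The helper names are hypothetical.

```python
import json

MAX_DOC_BYTES = 16 * 1024 * 1024  # 16 MB document size limit
MAX_NESTING = 100                  # nesting depth limit


def nesting_depth(value) -> int:
    """Depth of nested dicts/lists; a scalar value has depth 0."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def check_document(doc: dict) -> list:
    """Return a list of human-readable problems; empty means the doc looks fine."""
    problems = []
    # Assumption: JSON size is used as a rough stand-in for BSON size.
    approx_size = len(json.dumps(doc).encode("utf-8"))
    if approx_size > MAX_DOC_BYTES:
        problems.append(f"document is about {approx_size} bytes, over the 16 MB limit")
    depth = nesting_depth(doc)
    if depth > MAX_NESTING:
        problems.append(f"document nests {depth} levels deep, over the limit of 100")
    return problems
```

For example, `check_document({"name": "ok", "tags": ["a", "b"]})` returns an empty list, while a document nested 101 levels deep returns one problem.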
What is MongoDB used for
Companies use MongoDB when they need scalability and caching. MongoDB aids in real-time analytics. Organizations use it to build mobile apps, content management apps, IoT (Internet of Things) apps, and real-time analytics apps.
An introduction to Cassandra
Apache Cassandra, or Cassandra as we commonly call it, is a popular open-source NoSQL database. This distributed database can handle a large volume of data. This scalable database offers high availability, and it doesn’t have a “Single Point of Failure” (SPoF).
Two developers at Facebook created Cassandra. One was Avinash Lakshman, who was also one of the creators of Amazon’s Dynamo storage system. The other was Prashant Malik. They developed it using Java.
Facebook offered Cassandra as an open-source project in 2008. Several developers continued its development and enhancement. At the time of writing, the latest stable release of Cassandra is 3.11.9. This was released in November 2020.
The Apache Software Foundation oversees the development and support of Cassandra. This open-source database is available under the Apache License 2.0.
The advantages of Cassandra
Apache Cassandra provides the following advantages:
- Decentralization: Cassandra doesn’t have a “Single Point of Failure”. Every node of Cassandra can equally service any request. This helps in crafting effective replication strategies. You can retrieve data from other nodes even if one node fails. This ensures high availability.
- Flexibility concerning data storage: You can store structured, semi-structured, or unstructured data on a Cassandra database.
- Flexibility vis-a-vis data distribution: You can set up Cassandra to use multiple data centers. This facilitates the distribution of data.
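- Transactional guarantees: Cassandra offers atomicity, isolation, and durability for writes within a single partition, along with tunable consistency. It doesn’t provide full ACID (“Atomicity”, “Consistency”, “Isolation”, and “Durability”) transactions across multiple rows or tables in the way that RDBMSs do.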
- The ability to handle massive data: Apache Cassandra can handle very large volumes of data, which is an advantage.
- Performance: Cassandra can handle a large amount of “write” requests at high speed without adversely impacting the “read” requests.
- Scalability: You can add more hardware easily to accommodate more data and requests. This horizontal scalability doesn’t require you to shut the database down, and you don’t need any major adjustments. The linear scalability of Cassandra ensures a quick response.
- Ease of use: Cassandra provides CQL (Cassandra Query Language), an alternative to SQL. You can use this easy-to-use interface to access Cassandra.
- Hadoop integration: Cassandra has Hadoop integration, and it supports MapReduce, Apache Pig, and Apache Hive.
- Consistency: Cassandra provides “eventual consistency” of “Reads”, “Updates”, “Inserts”, and “Deletes”. You can tune the level of consistency.
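Tunable consistency usually comes down to simple arithmetic. With a replication factor RF, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > RF. The sketch below illustrates that rule in plain Python. The function names are ours, not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a majority (the QUORUM level)."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)

# QUORUM writes with QUORUM reads overlap on at least one replica.
strong = reads_see_latest_write(rf, write_replicas=q, read_replicas=q)

# Writing and reading at ONE is faster but only eventually consistent.
eventual = reads_see_latest_write(rf, write_replicas=1, read_replicas=1)
```

With a replication factor of 3, `strong` is `True` and `eventual` is `False`. Lower levels trade that guarantee for lower latency.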
The disadvantages to using Cassandra
Cassandra has the following disadvantages:
- Lacks the defining features of RDBMSs: Apache Cassandra isn’t an RDBMS. It doesn’t have referential integrity, and it doesn’t support “JOINS” or subqueries. It supports “GROUP BY” and “ORDER BY” only in restricted forms that are tied to the partition key and clustering columns.
- Duplication of data: Database designers model the data based on expected queries in Cassandra. This is diametrically opposite to what happens in the case of RDBMSs. This approach can result in duplication of data in Cassandra.
- Can’t meet strong ACID requirements: Cassandra won’t suit you if you have strong ACID requirements, such as transactions that span multiple tables.
- “READ” operations can be slow: Cassandra provides excellent speed for “WRITE” operations. However, “READ” operations can be slow, and too many “READ” requests can cause latency.
- Can’t support aggregates: Cassandra won’t suit you if you need many aggregate operations.
- Limited querying capabilities: Cassandra provides limited querying capabilities for data retrieval.
- Limitations of the Cassandra community: The Apache Cassandra community has very capable developers, and it can sometimes produce brilliant results. However, the standard of support can sometimes disappoint you.
What is Cassandra used for
You can use Cassandra if you need to store and manage massive data across multiple servers. If you can’t afford to lose data, then Cassandra is a good choice. Organizations that need their databases up even if one server goes down should use Cassandra.
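Why does the data survive when one server goes down? The sketch below models the idea with a toy token ring in Python. It's a simplified illustration under stated assumptions: one token per node, MD5 as the hash, and a replication factor of 3. Cassandra itself uses its own partitioner, virtual nodes, and configurable replication strategies. The `Ring` class and node names are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Stable position on the ring for a node name or a partition key."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self.tokens = sorted((token(node), node) for node in nodes)
        self.down = set()  # nodes currently marked as failed

    def replicas(self, key: str):
        """Walk clockwise from the key's token and take the next RF nodes."""
        positions = [t for t, _ in self.tokens]
        start = bisect.bisect_right(positions, token(key))
        count = min(self.replication_factor, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]

    def readable_from(self, key: str):
        """Replicas for the key that are still up."""
        return [node for node in self.replicas(key) if node not in self.down]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("order:42")

# Simulate a failure of the first replica; two copies remain reachable.
ring.down.add(owners[0])
survivors = ring.readable_from("order:42")
```

Every key lives on three different nodes in this setup, so losing any single node still leaves two copies that can serve the request.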
The following are examples of well-known organizations that use Cassandra:
- Best Buy
An introduction to Redis
Redis, an open-source NoSQL database, enjoys considerable popularity. It’s an in-memory data structure store. You can use it as a cache, message broker, or database.
Redis supports various data structures. You can use it to store strings, hashes, lists, sets, sorted sets, bitmaps, geospatial indexes, etc. This distributed database can be used as an in-memory key-value database.
Salvatore Sanfilippo started developing Redis using Tcl. He soon transitioned the project to C due to the obvious advantages of this programming language. Sanfilippo launched Redis as an open-source project in 2009.
Redis Labs has sponsored this project since 2015. After multiple rounds of development, the latest stable release of Redis at the time of writing is 6.0.10, which came in January 2021. Redis is available under a BSD 3-clause license.
The advantages of Redis
Redis offers the following advantages:
- Excellent for caching: If your primary requirement is caching, then Redis fits your requirements very well. Redis works faster than other caching solutions.
- Advanced data structures: As we stated, Redis supports a wide variety of data structures like strings, hashes, lists, sets, etc. Note that Redis uses its own hashing mechanism called the “Redis Hashing”.
- Ease of use: You can set up Redis easily, and programmers can learn and use it easily.
- Persistence: Redis offers persistence, which further enhances its reputation as a caching solution.
- Flexibility: You can store key and value pairs as large as 512 MB on Redis.
- Scalability: You can scale a Redis database easily without downtime or performance degradation.
- Support: You can get community support. Redis Labs offers premium support for the paid plans.
The disadvantages to using Redis
Redis has a few disadvantages, e.g.:
- Large-scale cloud deployment of Redis can be hard.
- Redis doesn’t offer a mature clustering solution.
- You can’t implement full “Role-Based Access Control” (RBAC) with Redis. Version 6 added per-user access control lists (ACLs), but it has no concept of roles.
- Redis lacks built-in encryption for data at rest.
- Redis has a few limitations vis-a-vis the “Master-Slave” model. Failover can’t happen if a “Master” doesn’t have at least one “Slave”. There’s a limitation concerning sharding too. Redis assigns hash-slots to “Masters”, and it implements sharding based on that. If a “Master” holding a set of hash-slots goes down without a “Slave” to take over, then the data in those slots becomes unavailable and can be lost.
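The hash-slot mechanism mentioned in the last point can be sketched in Python. Redis Cluster maps every key to one of 16384 slots using a CRC16 checksum, and it hashes only the part inside `{...}` when a key contains a non-empty hash tag. The slot computation below follows that rule, using the standard library's `binascii.crc_hqx`. The three "master" names and their slot ranges are hypothetical and only illustrate an even split.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Slot for a key: CRC16 of the key (or of its hash tag) modulo 16384."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag such as "{user1000}" replaces the full key.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


# Hypothetical cluster layout: three masters, each owning a range of slots.
SLOT_OWNERS = {
    "master-1": range(0, 5461),
    "master-2": range(5461, 10923),
    "master-3": range(10923, HASH_SLOTS),
}


def owner_of(key: str) -> str:
    """Name of the master that holds the slot for this key."""
    slot = key_slot(key)
    for node, slots in SLOT_OWNERS.items():
        if slot in slots:
            return node
    raise LookupError(f"no master owns slot {slot}")
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, always land in the same slot. If the master that owns that slot is down and has no replica, every key in its range is affected at once.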
What is Redis used for
Are you planning to implement a robust session cache mechanism in your app? Redis works well in such use cases. You can also use it for implementing full page cache (FPC) and queues. Redis suits applications dealing with increments and decrements, and you can build social networking apps using it.
An introduction to Memcached
Memcached, a general-purpose memory caching system, is a well-known NoSQL database. It’s free and open-source, and you can use this distributed database to make database-driven websites faster. Memcached achieves this by caching data and objects in RAM. This reduces the number of external database/API “Read” operations.
Brad Fitzpatrick from Danga Interactive developed Memcached in 2003. He developed it for the website LiveJournal, and he used Perl for this. Anatoly Vorobey, another developer on the LiveJournal project team, then rewrote Memcached in C.
Memcached went through several rounds of developments and enhancements. At the time of writing, its latest stable release is 1.6.9. This release came in November 2020. Memcached is available under a revised BSD license. A vibrant developer community now supports Memcached.
The advantages of Memcached
Apart from being a free and open-source solution, Memcached offers the following advantages:
- Caching: Memcached is a robust caching solution since it keeps data in RAM.
- Speed: As a caching solution, Memcached offers the speed that you need. Its quick response time makes it a suitable caching solution for high-traffic websites.
- Scalability: Memcached supports multi-threading. This makes it a good choice to develop scalable apps.
- Maturity: Memcached has gone through considerable developments and enhancements since it was launched in 2003. It’s a mature and stable solution.
- Support: A vibrant developers’ community provides good support for Memcached.
The disadvantages to using Memcached
Memcached has the following disadvantages:
- Competition: It competes with Redis, which is versatile. Both Memcached and Redis do very well as caching solutions. However, Memcached is primarily a caching solution. Redis offers much more. If you need more than just a caching solution, then you will likely choose Redis.
- Suitable for simple use cases only: The internal memory management solutions in Memcached use less memory for metadata. That makes it suitable for simple use cases. However, it also presents limitations for complex use cases. If you plan to deal with dynamic data sizes, then the internal memory management processes of Memcached don’t work that well.
- Limitations concerning large data sets: Large data sets commonly contain serialized data. You need more space to store them. This poses a challenge in the case of Memcached. It loses data in the case of a restart, and rebuilding cache consumes plenty of resources.
- Limitations concerning key and value sizes: Memcached limits keys to 250 bytes, and it limits values to 1 MB by default.
- Memcached doesn’t support replication.
What is Memcached used for
Are you developing a high-traffic website? Consider using Memcached to speed it up. You can use Memcached to cache comparatively small and static data. Memcached supports multi-threading, so you can use it to create scalable apps.
An introduction to DynamoDB
Unlike the free and open-source NoSQL databases that we talked about, Amazon DynamoDB is a licensed offering from Amazon. Amazon manages it fully and offers it as a part of the AWS portfolio. This NoSQL DBMS supports key-value and document data structures.
Note that you can’t deploy Amazon DynamoDB on-premises or on hybrid cloud. You can use it only on the AWS cloud platform. You typically don’t call the DynamoDB APIs directly; rather, you embed an AWS SDK into your app.
A team of developers at Amazon published a white paper in 2007, which later came to be known as the “Dynamo white paper”. This laid the groundwork. Amazon created the Dynamo database for internal use only.
Amazon DynamoDB is different though since it’s a “Database-as-a-Service” (DBaaS) for external customers. Amazon positions it as addressing availability, scalability, and durability needs. The company launched it in 2012.
The advantages of DynamoDB
Amazon DynamoDB offers the following advantages:
- The advantages of relying on the mature cloud capabilities of AWS: DynamoDB is cloud-native, and you can deploy it only on the AWS cloud. Therefore, it uses the advanced cloud capabilities of AWS. This delivers numerous advantages in the areas of performance, scalability, availability, reliability, durability, security, etc.
- Ease of use: You can set up, use, and administer DynamoDB easily. AWS offers a user-friendly SDK.
- Documentation and support: You can use the comprehensive documentation offered by DynamoDB. Amazon offers premium support.
- Cost savings: The pricing plans of DynamoDB can considerably reduce your costs.
- Support for streams: DynamoDB supports streams, so you can create systems that react to changes in data.
- Integration: DynamoDB integrates very well with AWS Lambda and Amazon API Gateway.
- Fully-managed: You don’t need to worry about back-up, replication, and provisioning.
The disadvantages to using DynamoDB
Apart from the fact that DynamoDB isn’t open-source, it has the following disadvantages:
- You might find the cost model hard to understand at times. This can make it hard to predict costs.
- DynamoDB doesn’t offer sufficiently powerful querying capabilities.
- You can’t deploy DynamoDB anywhere outside the AWS cloud platform.
- DynamoDB doesn’t offer “JOINS” and foreign keys since it’s not an RDBMS.
- You can’t use any server-side scripts with DynamoDB.
- DynamoDB uses a “throughput” model for provisioning and pricing, which has a limitation. If you don’t know your expected “READ”/“WRITE” volumes, then you might underestimate the throughput you need. Requests beyond the provisioned throughput get throttled, and batch processes can fail as a result. You can work around this problem by using the on-demand pricing model.
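Estimating provisioned throughput is mostly arithmetic. One read capacity unit covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one write capacity unit covers one write per second of an item up to 1 KB. The sketch below applies those rules. The function names and the sample workload are our own assumptions, and the result is a rough estimate rather than a pricing calculation.

```python
import math


def read_capacity_units(item_kb: float, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """Estimate RCUs: items are billed in 4 KB steps per read."""
    units_per_read = math.ceil(item_kb / 4)
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_kb: float, writes_per_second: int) -> int:
    """Estimate WCUs: items are billed in 1 KB steps per write."""
    return math.ceil(item_kb) * writes_per_second


# Hypothetical workload: 6 KB items read 10 times per second,
# 2.5 KB items written 5 times per second.
rcu_strong = read_capacity_units(6, 10)
rcu_eventual = read_capacity_units(6, 10, strongly_consistent=False)
wcu = write_capacity_units(2.5, 5)
```

For this workload the estimate is 20 read units with strong consistency, 10 with eventual consistency, and 15 write units. A batch job that suddenly doubles the write rate would exceed that provision, which is where the failures described above come from.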
What is DynamoDB used for
Are you already using the AWS cloud computing platform? If you need a NoSQL database, then DynamoDB stands as an obvious choice. Game developers use DynamoDB widely, and so do many organizations working in the “Internet of Things” (IoT) space.
The following are examples of well-known companies that use DynamoDB:
- New York Times
MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB: A comparison
We now compare these popular NoSQL DBMSs, and we start with a table summarizing the similarities and differences.
| | MongoDB | Cassandra | Redis | Memcached | DynamoDB |
|---|---|---|---|---|---|
| What it is | A popular document database that you can use on the cloud as well as deploy on-premises | Wide-column data store that uses characteristics of BigTable and Dynamo | In-memory data structure store that you can use as cache, database, and message broker | In-memory key-value store, which was designed originally for caching | Hosted database service offered by Amazon that uses the AWS cloud to store data |
| Who developed it | MongoDB, Inc. | Apache Software Foundation | Salvatore Sanfilippo | Danga Interactive | Amazon |
| When was it first launched | 2009 | 2008 | 2009 | 2003 | 2012 |
| License | Open source | Open source (Apache License 2.0) | Open source (BSD 3-clause) | Open source (revised BSD) | Commercial license |
| Primary database model | Document store | Wide column store | Key-value store | Key-value store | Document store; Key-value store |
| Secondary database model | Search engine | None listed | Document store; Graph DBMS; Search engine; Time Series DBMS | None listed | None listed |
| Does it support cloud and on-premise deployment? | Yes, supports both cloud and on-premise deployment | Yes, supports both cloud and on-premise deployment | Yes, supports both cloud and on-premise deployment | Yes, supports both cloud and on-premise deployment | No, it supports cloud-based deployment only |
| Created using which programming language | C++ | Java | C | C | Java |
| Server operating systems supported | Linux, OS X, Solaris, Windows | BSD, Linux, OS X, Windows | BSD, Linux, OS X, Windows | FreeBSD, Linux, OS X, Unix, Windows | AWS cloud hosts DynamoDB, which makes this question irrelevant |
| Supports secondary indexes? | Yes | Restricted | Yes | No | Yes |
| Supports SQL? | Supports read-only SQL queries via MongoDB connector for BI | Supports “Cassandra Query Language” (CQL) with SELECT, DML, and DDL statements similar to SQL | Doesn’t support SQL | Doesn’t support SQL | Doesn’t support SQL |
| Access methods including APIs | Proprietary protocols using JSON | Proprietary protocols | Proprietary protocols | Proprietary protocols | RESTful API |
| Supports foreign keys? | No | No | No | No | No |

Programming languages supported by Memcached include C, C++, .Net, Java, Python, Perl, Ruby, PHP, and more.
Performance comparison between MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB
The performance comparison works out as follows:
- Cassandra delivers great performance when you need a high “Write” throughput.
- MongoDB offers very good performance if you need very high read concurrency.
- DynamoDB can be a good choice for performance if you don’t mind a licensed NoSQL DBMS exclusively on the AWS cloud.
- If you need to handle huge keys and objects, then Redis performs well.
- Memcached performs well for caching small and static data.
Scalability comparison between MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB
The comparison of scalability between the above-mentioned NoSQL DBMSs goes as follows:
- Among multi-purpose open-source NoSQL DBMSs, MongoDB and Cassandra offer high scalability.
- Among licensed databases, DynamoDB utilizes the cloud capabilities of AWS. It offers high scalability.
- For requirements involving caching small datasets, Memcached scales well. Its multithreading capabilities become handy here.
- Redis scales well too. It scales without any performance degradation or downtime.
User-friendliness comparison between MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB
The MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB comparison concerning ease of use goes as follows:
- You can easily learn, install and use Cassandra, MongoDB, and Redis.
- Since Amazon offers DynamoDB as a fully-managed DBaaS (Database-as-a-Service), it’s the easy-to-use option among the licensed NoSQL DBMSs.
The comparison of the unique features between MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB
Which unique features do these popular NoSQL DBMSs offer? This comparison plays out as follows:
- Memcached offers unique memory management capabilities that make it an excellent caching solution for data that’s small and static.
- Redis supports many data structures like strings, hashes, lists, sets, etc.
- Redis allows you to store large key-value pairs.
- Cassandra offers decentralization where each node can service any request. If one node fails, you can retrieve the data from the other nodes.
- Cassandra offers its unique query language called the “Cassandra Query Language” (CQL).
- MongoDB’s “GridFS” (Grid File System) lets you use the database as a file system for large files. The stored files benefit from the replication and load balancing that MongoDB provides, so you get high availability.
- Amazon offers AWS SDK for DynamoDB. You integrate it in your app, and you access DynamoDB via this SDK.
When to use MongoDB vs Cassandra vs Redis vs Memcached vs DynamoDB
Consider the following decision-making factors to choose a NoSQL DBMS for your project:
- Want the cloud capabilities of AWS and don’t mind paying for a licensed NoSQL DBMS? Amazon DynamoDB stands as the obvious choice. It uses the cloud capabilities of AWS, and it offers very good performance, scalability, security, and availability.
- Prefer ease-of-use and deep integration with AWS offerings like AWS Lambda? You should use DynamoDB.
- Do you have a use case where data must be recovered even if a server fails? Choose Cassandra since it offers decentralization and fault tolerance.
- Do you foresee a high “Write” throughput and comparatively fewer “Reads”? Use Cassandra.
- Use Cassandra if you need a multi-datacenter deployment or Hadoop integration.
- Use MongoDB if you need to scale up rapidly.
- If you need to store large documents, then MongoDB works well for you.
- Need to store large videos/images/media files? Consider using MongoDB.
- Do you need high read concurrency? Use MongoDB.
- Want a scalable caching solution for small, static data? You should use Memcached.
- Need a caching solution for large data elements? Use Redis.
- Use Redis if you need to store varied data structures like strings, hashes, lists, sets, etc.
- Need a NoSQL database for messaging queues? You should use Redis.
We talked about some of the most popular NoSQL DBMSs. After discussing the pros and cons of MongoDB, Cassandra, Redis, Memcached, and DynamoDB, we compared them. Finally, we talked about when to use which NoSQL DBMS. Analyze your project requirements carefully to choose the right DBMS.
Are you looking to get your App built? Contact us at email@example.com or visit our website Devathon to find out how we can breathe life into your vision with beautiful designs, quality development, and continuous testing.