An introduction to CockroachDB
If you are an entrepreneur or an enterprise IT leader, then you need to plan the technology stack for your software development project. You need to choose the right database for your project. If you are developing a heavy-duty transaction processing system or a high-demand analytics system, you will likely use an RDBMS (Relational Database Management System). You can choose one from several popular RDBMSs.
However, do you want an assurance that your data in an RDBMS will survive even large-scale failures in application systems and servers? This narrows the choice. This is where CockroachDB becomes important. This relatively new RDBMS offers a high degree of survivability to your data. In this article, we review CockroachDB and its features. We assess its pros and cons. Subsequently, we review its use cases. Finally, we compare CockroachDB with some of the popular databases. Read on.
The background: Why CockroachDB?
Let’s assume that you want to build an important OLTP (Online Transaction Processing) or analytics system. We also assume that this system will have high visibility.
You can’t afford to lose data even if there’s a failure in a database server. Furthermore, you need data integrity. When talking about popular RDBMS solutions, you will probably first think of MySQL, PostgreSQL, etc.
However, they are popular mainly because of their functional capabilities. E.g., PostgreSQL supports complex SQL queries very well. Developers and data modelers know that it’s one of the most advanced RDBMS solutions.
You want a differentiated RDBMS solution that offers survivability and data integrity as its main value proposition. The RDBMS should keep your data safe even if one server fails. It should have sufficient data redundancy.
Furthermore, you don’t want corrupted data in your RDBMS even if a database operation fails. You need a database that maintains data integrity. It should keep the data consistent.
You want an RDBMS solution that focuses on these value propositions over everything else. That brings us to review CockroachDB.
What is CockroachDB?
CockroachDB is a distributed SQL database. Its distributed design ensures data redundancy. CockroachDB keeps your data safe even if one database server fails. It can survive disk, machine, rack, and even data center failures. This ability to survive is the reason why the developers of this database chose the name “CockroachDB”.
CockroachDB also maintains data integrity even if a database operation fails. To achieve this, it prioritizes consistency and builds on a strongly consistent key-value store.
It’s built to support transaction processing systems, SoRs (Systems of Record), etc. This open-source RDBMS (Relational Database Management System) scales horizontally.
A brief history of CockroachDB
Spencer Kimball, Peter Mattis, and Ben Darnell founded Cockroach Labs in 2015, and this company built CockroachDB. All three had previously worked at Google.
They got the inspiration to build CockroachDB after reviewing database products built by Google. These are as follows:
- Spanner: A globally distributed database;
- F1: A fault-tolerant distributed RDBMS that supported the ad business of Google.
However, the developers also reviewed other popular databases like MySQL, PostgreSQL, Cassandra, and AWS SimpleDB. They wanted to build a database composed of symmetric nodes, with no external dependencies. They wanted this database to spread itself naturally across availability zones. Survivability was a key requirement.
Cockroach Labs launched CockroachDB in 2015. The company first offered it as open-source software. However, it changed the license in 2019.
Cockroach Labs now offers CockroachDB under a BSL (Business Source License). Users can still do most tasks for free, much as they could under the earlier open-source license. If a company wants to offer a commercial version of CockroachDB, such as a hosted database service, then it needs to buy a license.
A brief overview of CockroachDB architecture
The developers of CockroachDB wanted to offer a database that’s strongly consistent, scalable, and survivable. The architecture they designed rests on the following building blocks:
- Cluster: This refers to your CockroachDB deployment. You can think of it as a single logical application.
- Node: A “node” refers to one machine running CockroachDB. Multiple nodes make up one cluster.
- Range: CockroachDB stores all data in a giant sorted map of key-value pairs. This covers all data belonging to users like tables, indexes, etc. CockroachDB stores all system data in this way too. This DBMS (Database Management System) divides this keyspace into ranges, i.e., contiguous chunks of the keyspace.
- Replica: CockroachDB replicates each range. It stores the replicas on different nodes. By default, CockroachDB keeps 3 replicas of each range.
- “Lease-holder”: One of the replicas for each range receives and coordinates the “read” and “write” requests for that range. We call this replica the “lease-holder” for that range.
- “Raft” consensus protocol: The “Raft” consensus protocol is an algorithm. It works to ensure that the data is safely stored on multiple machines, and these machines agree on the current state.
- “Raft log”: A “Raft log” is a time-ordered log for each range. This contains the details of “write” operations that impact a range. The replicas of the range should agree on these “write” operations.
- “Raft leader”: Each of the ranges in CockroachDB has one replica that acts as the “leader” for the “write” requests. This replica uses the “Raft” consensus protocol to ensure that the majority of the replicas reach an agreement before committing a “write” operation. This process uses the “Raft” log.
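The range and quorum ideas above can be sketched in a few lines of Python. This is a toy model and not CockroachDB's implementation: the split points in `RANGE_STARTS` and the function names are assumptions made purely for illustration.

```python
from bisect import bisect_right

# Hypothetical split points. Range i covers keys from RANGE_STARTS[i]
# (inclusive) up to RANGE_STARTS[i + 1] (exclusive); "" is the minimum key.
RANGE_STARTS = ["", "g", "n", "t"]


def range_for_key(key: str) -> int:
    """Return the index of the contiguous range that contains `key`."""
    return bisect_right(RANGE_STARTS, key) - 1


def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def can_commit(acks: int, replica_count: int) -> bool:
    """A write commits once a majority of the replicas have accepted it."""
    return acks >= majority(replica_count)
```

With 3 replicas, `majority(3)` is 2, so a range can keep accepting writes while one replica is unavailable.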
A brief note about the ACID guidelines
We take a few moments to talk about the ACID (“Atomicity”, “Consistency”, “Isolation”, and “Durability”) guidelines before we talk about the features of CockroachDB. Note that the ACID guidelines aren’t exclusive to CockroachDB.
The ACID guidelines are intended to make the processing of database transactions reliable. By the phrase “database transaction”, we refer to a unit of work made up of one or more database operations. These could include creating a new record or updating an existing one.
The ACID guidelines consist of the following:
- “Atomicity”: The database system can commit an entire operation successfully. Alternatively, it rolls back the entire operation when faced with a failure. You don’t have a middle path.
- “Consistency”: A database transaction must maintain the data integrity rules. The database system rolls back the entire transaction if even one part of it fails to maintain data integrity.
- “Isolation”: One “Read” or “Write” operation can’t have an impact on other “Read” or “Write” operations.
- “Durability”: If the database system successfully commits a transaction, then the change will remain permanently.
The RDBMS solutions that conform strongly with the ACID guidelines ensure data integrity and survivability. CockroachDB strongly conforms to these guidelines.
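To make atomicity and consistency concrete, here is a minimal sketch of an all-or-nothing transfer. It uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server; it is not CockroachDB, and the `accounts` table and `transfer` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True if the transaction committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # If this debit breaks the CHECK rule, the credit above is undone too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
    except sqlite3.IntegrityError:
        return False
    return True


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A transfer that would overdraw the source account fails as a whole: the credit that was already applied inside the transaction is rolled back, so no partial change is left behind.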
Key features of CockroachDB
CockroachDB offers the following features:
- The durability of your data: A distributed RDBMS, CockroachDB supports replication and automated repairs. This offers survivability. Your data remains even if one database server fails.
- Your data remains true and consistent: As we discussed, CockroachDB strongly conforms to the ACID guidelines. This maintains data integrity. CockroachDB doesn’t let your data become corrupted. CockroachDB offers strong consistency. This database guarantees serializable SQL transactions. That’s the highest level of isolation defined by SQL standards. CockroachDB uses the Raft consensus algorithm for “write” operations. It uses a custom time-based synchronization algorithm for “read” operations. These factors allow CockroachDB to offer strong consistency.
- Ease of use: You can deploy and use CockroachDB easily.
- Supports SQL: CockroachDB supports SQL, so developers can use schemas, tables, rows, columns, and indexes. Underneath, CockroachDB is a distributed, transactional key-value store.
- Supports distributed transactions: CockroachDB supports distributed transactions across a cluster. Whether you have a few servers in one location or many servers across multiple locations, CockroachDB can support distributed transactions.
- Scalability: You can scale CockroachDB horizontally by adding nodes, with little operational overhead.
- Availability: CockroachDB offers high availability by design.
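Serializable isolation has a practical consequence for application code: when concurrent transactions conflict, the database may reject one of them with SQLSTATE `40001` (serialization failure), and the client is expected to retry it. The sketch below shows one way such a retry loop might look. `TransactionError` is a hypothetical stand-in for whatever exception your database driver raises, and the backoff values are arbitrary assumptions.

```python
import random
import time

RETRYABLE_SQLSTATE = "40001"  # serialization_failure in the SQL standard


class TransactionError(Exception):
    """Hypothetical stand-in for a driver error that carries a SQLSTATE code."""

    def __init__(self, sqlstate):
        super().__init__(f"transaction failed with SQLSTATE {sqlstate}")
        self.sqlstate = sqlstate


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call txn_fn until it succeeds, retrying only serialization failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except TransactionError as err:
            # Give up on non-retryable errors or when attempts are exhausted.
            if err.sqlstate != RETRYABLE_SQLSTATE or attempt == max_attempts:
                raise
            # Exponential backoff with jitter before the next attempt.
            sleep(base_delay * (2 ** (attempt - 1)) * random.random())
```

In real code, `txn_fn` would open a transaction, run its statements, and commit. It has to be safe to run more than once, because every retry executes the whole function again.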
A brief note on CockroachDB performance
Cockroach Labs uses the “TPC-C” benchmark to measure the performance of CockroachDB, and it uses other tests too. TPC-C is an established industry-standard benchmark to measure the performance of OLTP (Online Transaction Processing) systems.
TPC-C includes measuring the performance of CRUD (“Create”, “Read”, “Update”, and “Delete”) operations, basic joins, and other relevant SQL statements. This benchmark uses a metric called tpmC (transactions per minute) to express throughput, measured while the transactions stay within the benchmark’s response-time limits.
CockroachDB publishes its performance metrics. This database can process 1.68M tpmC with 140,000 warehouses, which results in an efficiency score of 95%.
As far as latency is concerned, CockroachDB can process a single-row “read” operation in 1 ms. It can process a single-row “write” operation in 2 ms. CockroachDB can achieve this within one availability zone.
Pros and cons of CockroachDB
We now evaluate the pros and cons of CockroachDB.
Advantages of CockroachDB
CockroachDB offers the following advantages:
- Great customer support: The Cockroach Labs team provides responsive, prompt, and effective customer support. This is true even for the free version. As you can expect, the Cockroach Labs team provides even better support for the paid version.
- Cost savings: You can use “CockroachDB Core”, the open-source version of CockroachDB, for most purposes. It’s free. The license cost per CPU of a fully-supported CockroachDB Enterprise version is lower than that of Oracle RAC.
- Flexibility: You can deploy CockroachDB on-premises. You can deploy it on public and private clouds. CockroachDB supports deployment on containers, VMs, and bare-metal servers. Cockroach Labs offers CockroachCloud, a Database-as-a-Service on AWS and Google Cloud.
- The durability of your data: As we discussed, CockroachDB ensures that your data survives severe crashes.
- Data integrity: As we discussed, CockroachDB conforms to the ACID (“Atomicity”, “Consistency”, “Isolation”, and “Durability”) guidelines. It maintains data integrity. That’s an advantage for SoRs and OLTP systems.
- Availability: CockroachDB offers a high degree of availability.
- Performance: As we stated earlier, CockroachDB performs well on the TPC-C benchmark.
- Scalability: CockroachDB offers cloud-native and horizontal scalability. It’s a highly scalable RDBMS.
- Ease of use: You can deploy and use CockroachDB easily. You can use the available CockroachDB tutorials for this.
- Supports PostgreSQL: PostgreSQL is a highly popular open-source RDBMS. CockroachDB supports the SQL dialect of PostgreSQL. You can use the PostgreSQL client libraries to connect to CockroachDB.
- Other useful features: CockroachDB supports active dynamic schema changes. It supports datacenter-aware applications, thanks to the geo-partitioning support.
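Because CockroachDB speaks the PostgreSQL wire protocol, a client typically connects with an ordinary PostgreSQL connection URL. The helper below is a small sketch that only builds such a URL and does not open a connection. The `cockroach_url` function, host name, and user are hypothetical; `26257` and `defaultdb` are CockroachDB's default SQL port and default database.

```python
from urllib.parse import quote, urlencode


def cockroach_url(user, host, database="defaultdb", port=26257,
                  password=None, sslmode="verify-full"):
    """Build a PostgreSQL-style connection URL for a CockroachDB node."""
    auth = quote(user, safe="")
    if password is not None:
        # Percent-encode the password so characters like "@" or "/" are safe.
        auth += ":" + quote(password, safe="")
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}?{query}"


# A PostgreSQL driver that accepts connection URLs could be given this string.
url = cockroach_url("app_user", "db.example.com")
```

Keeping the URL construction in one place makes it easier to switch between a local cluster and a managed one without touching the rest of the application.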
An overview of CockroachDB limitations
CockroachDB has a few limitations. These are as follows:
- CockroachDB can’t support some complex database transactions. It isn’t very suitable for applications that use complex SQL “JOIN” statements. It’s not the ideal database for heavy analytics or OLAP.
- You need to pay for the advanced features if you use CockroachDB. Some of the powerful features that users will prefer in a distributed database are available in the enterprise edition only.
- The hiring lead time for CockroachDB developers could be higher. CockroachDB is relatively new, so it can’t match the popularity of MySQL or PostgreSQL yet. You might not find as many developers skilled in CockroachDB as in MySQL or PostgreSQL.
A summary of CockroachDB use cases
When should you use CockroachDB? CockroachDB is useful in the following use cases:
- Building a distributed OLTP (Online Transaction Processing) system: Financial transaction processing systems, retail sales transaction processing applications, and CRM (Customer Relationship Management) systems are examples of OLTP systems. These systems require fast responses and high availability. The distributed model of CockroachDB offers high performance, consistency, and availability.
- Systems requiring geo-partitioning for low latency and regulatory compliance: Some applications need to store data in particular regions due to regulatory requirements. Many applications need to store data close to their users to keep latency low. The distributed model of CockroachDB lets you place nodes, and the data they hold, in the locations where they are needed.
- Building an SoR (System of Record): An SoR application should provide confidence to users about the authenticity of data stored by it. It might have data from different sources. The data might come at different schedules and the processes to extract data might vary. Such applications need strongly consistent databases. Such a database should comply with ACID properties. CockroachDB meets these requirements.
- The need to embrace cloud computing without abandoning the relational database management model: Organizations that want to use an RDBMS even after they transition to the cloud can use CockroachDB. CockroachDB is a distributed RDBMS. It’s cloud-agnostic. This RDBMS can scale to support an enterprise-level implementation.
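The geo-partitioning use case can be pictured as a rule that maps each row to the region that must hold it. The following toy sketch shows only that idea. It is not how CockroachDB is configured, and the country-to-region mapping, region names, and row layout are assumptions.

```python
from collections import defaultdict

# Hypothetical rule: which region must store rows for a given country.
REGION_FOR_COUNTRY = {
    "DE": "eu-west",
    "FR": "eu-west",
    "US": "us-east",
    "IN": "ap-south",
}


def partition_rows(rows, default_region="us-east"):
    """Group row ids by the region that should store them."""
    placement = defaultdict(list)
    for row in rows:
        region = REGION_FOR_COUNTRY.get(row["country"], default_region)
        placement[region].append(row["id"])
    return dict(placement)


customers = [
    {"id": 1, "country": "DE"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": "FR"},
    {"id": 4, "country": "BR"},  # no rule for BR, so it falls back to the default
]
placement = partition_rows(customers)
```

Rows for German and French customers end up together in the European region, which is the property that both a data-residency rule and a low-latency requirement would ask for.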
Who uses CockroachDB?
The following are examples of well-known companies that use CockroachDB:
- JPMorgan Chase
- Centene Corporation
A few CockroachDB alternatives
The top open-source RDBMS alternatives to CockroachDB include MySQL and PostgreSQL, both of which we compare with CockroachDB later in this article.
The top paid RDBMS alternatives to CockroachDB are as follows:
- Microsoft SQL Server
NoSQL databases have a few key differences with RDBMSs. We will shortly talk about them. Within that constraint, the top open-source NoSQL alternatives to CockroachDB include MongoDB and Cassandra.
A comparison of CockroachDB vs popular SQL and NoSQL databases
We now compare CockroachDB vs popular SQL and NoSQL databases. The following table summarizes the key differences:
|Name||CockroachDB||MySQL||PostgreSQL||MongoDB||Cassandra|
|Description||Distributed SQL database||Popular open-source RDBMS||Popular open-source RDBMS||Popular open-source NoSQL document database||Popular wide-column store based NoSQL database|
|Primary database model||RDBMS||RDBMS||RDBMS||Document store||Wide column store|
|Secondary database model||–||Document store, Spatial DBMS||Document store, Spatial DBMS||Spatial DBMS, Search engine||–|
|Developer||Cockroach Labs||Oracle||PostgreSQL Global Development Group||MongoDB, Inc||Apache Software Foundation|
|License||Open-source, BSL|| || || || |
|Only cloud-based?||Supports both on-premises and cloud-based deployment||Supports both on-premises and cloud-based deployment||Supports both on-premises and cloud-based deployment||Supports both on-premises and cloud-based deployment||Supports both on-premises and cloud-based deployment|
|Implementation language||Golang||C and C++||C||C++||Java|
|Server operating systems||Linux, macOS, Windows||FreeBSD, Linux, OS X, Solaris, Windows||FreeBSD, HP-UX, Linux, NetBSD, OpenBSD, OS X, Solaris, Unix, Windows||Linux, OS X, Solaris, Windows||BSD, Linux, OS X, Windows|
|Supports SQL?||Supports SQL||Supports SQL||Supports SQL||Read-only SQL queries via the MongoDB Connector for BI||Supports CQL (Cassandra Query Language), which has some similarities with SQL|
|Programming languages supported||C#, …, Ruby, and more||…, Ruby, and more||…, Python, and more||…, Ruby, and more||…, Ruby, and more|
CockroachDB vs MySQL
CockroachDB and MySQL have the following similarities:
- Both of them are RDBMSs. You use schemas. You create tables with rows and columns. Both allow you to store structured data in rows and columns.
- CockroachDB and MySQL are ACID-compliant.
- Both offer high performance and availability.
- You can get robust support whether you use CockroachDB or MySQL.
- Both CockroachDB and MySQL support on-premises and cloud-based deployment models.
The differences between CockroachDB and MySQL are as follows:
- Latency: Since CockroachDB is a distributed SQL database, it will generally not be as low-latency as a single-node SQL database. This holds in the case of CockroachDB-vs-MySQL comparison too.
- SQL capabilities: CockroachDB can’t support some of the complex SQL operations like complex “JOINs”.
- Fitment for Analytics/OLAP: CockroachDB is less suitable for analytics/OLAP than MySQL.
- Popularity: MySQL has been around for much longer than CockroachDB, and it enjoys higher popularity.
- Scalability: CockroachDB scales much better for OLTP workloads than MySQL.
- Operational simplicity: You can start a new CockroachDB cluster by executing just a few commands. CockroachDB offers more operational simplicity than MySQL.
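The latency difference has a simple cause: a replicated write has to wait for a majority of replicas to acknowledge it, while a single-node database only waits for itself. The sketch below estimates that wait from network round-trip times alone. It is a simplified model that ignores disk and processing time, and the function name and sample numbers are assumptions.

```python
def quorum_write_latency_ms(follower_rtts_ms):
    """Time until a leader hears back from enough followers for a majority.

    The leader counts as one vote, so with n replicas in total it needs
    n // 2 follower acknowledgements.
    """
    replicas = len(follower_rtts_ms) + 1  # followers plus the leader
    needed = replicas // 2
    if needed == 0:
        return 0.0  # a single node has nobody to wait for
    # The write commits when the `needed`-th fastest follower has answered.
    return float(sorted(follower_rtts_ms)[needed - 1])
```

With 3 replicas, the leader waits only for the faster of its two followers, so a replica in a distant region does not slow down writes as long as another replica is close by. If every follower is far away, each write pays at least one long round trip.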
CockroachDB vs PostgreSQL
CockroachDB and PostgreSQL have the following similarities:
- Both of them are RDBMSs. Both of them support schemas, SQL statements, tables, rows, columns, indexes, etc.
- PostgreSQL and CockroachDB are ACID-compliant.
- Both of them offer high performance and availability.
- You can get great support for both PostgreSQL and CockroachDB.
- CockroachDB and PostgreSQL support cloud-based as well as on-premises deployment.
The CockroachDB vs PostgreSQL differences are as follows:
- Popularity: PostgreSQL enjoys higher popularity than CockroachDB. PostgreSQL has been around for much longer than CockroachDB.
- SQL capabilities: PostgreSQL is a highly advanced open-source RDBMS. Its SQL capabilities are stronger than those of CockroachDB. CockroachDB can’t support some of the complex SQL operations that PostgreSQL supports.
- Fitment for OLAP/Analytics: PostgreSQL is more suitable than CockroachDB for analytics/OLAP applications.
- Scalability: Adding new nodes to a CockroachDB cluster is easy. It provides better scalability than PostgreSQL.
- Operational simplicity: Operating a CockroachDB cluster is simpler than operating PostgreSQL at a comparable scale.
CockroachDB vs MongoDB
CockroachDB has a few similarities with MongoDB. These are as follows:
- Both CockroachDB and MongoDB offer high performance and availability.
- You can get robust support for both CockroachDB and MongoDB.
- Both CockroachDB and MongoDB allow the on-premises and cloud-based deployment models.
- CockroachDB and MongoDB support distributed “read” operations.
The differences between CockroachDB and MongoDB are as follows:
- CockroachDB is an RDBMS, whereas MongoDB is a NoSQL database. CockroachDB supports tables, rows, columns, and indexes. On the other hand, MongoDB stores documents in a JSON-like format. MongoDB doesn’t support SQL natively, although read-only SQL queries are possible via the MongoDB Connector for BI.
- CockroachDB is a strongly consistent database. On the other hand, MongoDB offers eventual consistency.
- CockroachDB supports online, dynamic, and active changes to the database schema. MongoDB supports offline changes.
- CockroachDB supports geo-partitioning at the row level. MongoDB doesn’t support geo-partitioning.
- MongoDB is more popular than CockroachDB. MongoDB has been around for much longer than CockroachDB.
CockroachDB vs Cassandra
CockroachDB and Cassandra have the following similarities:
- Cassandra supports distributed workloads, and CockroachDB is a distributed database.
- You can get robust support whether you use Cassandra or CockroachDB.
- CockroachDB and Cassandra offer high performance and availability.
- You can use both the on-premises and cloud-based deployment models for Cassandra and CockroachDB.
The differences between CockroachDB and Cassandra are as follows:
- CockroachDB is an RDBMS, and Cassandra is a NoSQL database. CockroachDB supports tables, rows, columns, indexes, and SQL. Cassandra doesn’t support the relational model. Cassandra has CQL (Cassandra Query Language), which is similar to SQL. It’s not SQL though.
- CockroachDB is strongly consistent and Cassandra isn’t.
- Cassandra is scalable. However, CockroachDB offers better scalability.
- CockroachDB offers more operational simplicity than Cassandra.
- Cassandra is a well-established database and it has been around for a while. It’s more popular than CockroachDB.
We reviewed CockroachDB and how it works. We looked at its architecture, features, advantages, disadvantages, and use cases. Finally, we compared CockroachDB with some of the popular SQL and NoSQL databases. Analyze your project requirements carefully before you choose a database.
Are you looking to get your App built? Contact us at firstname.lastname@example.org or visit our website Devathon to find out how we can breathe life into your vision with beautiful designs, quality development, and continuous testing.