Wednesday, February 17, 2021

Benchmark (YCSB) numbers for Redis, MongoDB, Couchbase, Yugabyte and BangDB

This is a guest post by Sachin Sinha, who is passionate about data, analytics and machine learning at scale. He is the author and founder of BangDB.

This article reports the YCSB benchmark results in detail for five NoSQL databases, namely Redis, MongoDB, Couchbase, Yugabyte and BangDB, and compares the results side by side. I used the latest version of each database and followed the recommendations for running each one in optimized conditions. I also used the six default test scenarios (workloads) defined by the YCSB framework, and limited each test to 10M records. Readers can run the benchmark with as many records as they find practical.

About YCSB

The following configurations were used for the evaluation.

Each of these workload tests runs in two steps: 1. Load and 2. Run. The Load step loads the data, and the Run step executes the test against that data. I ran each test on a clean database so that the numbers are reported fairly.
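The two steps can be illustrated with a small sketch. This is not YCSB itself (YCSB is a Java tool); it is a minimal Python stand-in that uses an in-memory dict as the "database". The names load_phase and run_phase, the key format and the record layout are all assumptions made for illustration.

```python
import random
import time


def load_phase(store, record_count):
    """Load step: insert record_count records into an empty store."""
    for i in range(record_count):
        store[f"user{i}"] = {"field0": f"value{i}"}


def run_phase(store, operation_count, read_proportion, seed=0):
    """Run step: issue a read/update mix against the already loaded store."""
    rng = random.Random(seed)
    keys = list(store)
    counts = {"read": 0, "update": 0}
    start = time.perf_counter()
    for _ in range(operation_count):
        key = rng.choice(keys)
        if rng.random() < read_proportion:
            _ = store[key]  # read
            counts["read"] += 1
        else:
            store[key] = {"field0": "updated"}  # update
            counts["update"] += 1
    elapsed = time.perf_counter() - start
    return counts, elapsed


store = {}  # start from a clean store for each test
load_phase(store, record_count=1000)
counts, elapsed = run_phase(store, operation_count=5000, read_proportion=0.5)
print(counts, f"{elapsed:.4f}s")
```

Starting from an empty store before every load mirrors the clean-database rule described above, so one test cannot benefit from data left behind by another.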

Summary

To summarize, here is the high-level report of the tests, covering both the load and run phases.


Load performance is consistent for all dbs across all tests, as expected, since this phase only loads the data. The run phase is where each db is tested under different conditions.

Workload A: Update heavy workload

This workload has a mix of 50/50 reads and writes. An application example is a session store recording recent actions.

The first graph shows the throughput (ops/sec) for the 10M records, and the second chart shows how quickly each db completed the test.
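Throughput and completion time are two views of the same measurement: for a fixed number of operations, completion time is the operation count divided by the throughput. A minimal sketch follows; the throughput value is a made-up example, not one of the measured results.

```python
def runtime_seconds(operation_count, throughput_ops_per_sec):
    """Time needed to finish operation_count operations at a steady throughput."""
    if throughput_ops_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return operation_count / throughput_ops_per_sec


# Hypothetical figure for illustration only: 10M operations at 50,000 ops/sec.
print(runtime_seconds(10_000_000, 50_000))  # 200.0
```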

We note that MongoDB's update latency is very low compared to the other dbs (lower is better); however, its read latency is on the higher side.

Workload B: Read mostly workload

This workload has a 95/5 read/write mix. Application example: photo tagging; adding a tag is an update, but most operations are to read tags.


The latency table shows that the 99th percentile latency for Yugabyte is quite high compared to the others (lower is better).
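For readers less familiar with percentile latencies, the sketch below shows why a 99th percentile figure can be far higher than the average: a few slow operations dominate it. It uses the nearest-rank definition, which is one common convention and not necessarily the exact method YCSB uses; the sample latencies are invented.

```python
import math


def percentile(latencies, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not latencies:
        raise ValueError("no samples")
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in microseconds: 98 fast operations and 2 slow ones.
samples_us = [100] * 98 + [5000, 9000]
print(sum(samples_us) / len(samples_us))  # 238.0 (mean)
print(percentile(samples_us, 99))         # 5000 (99th percentile)
```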

Workload C: Read only

This workload is 100% read. Application example: user profile cache, where profiles are constructed elsewhere (e.g., Hadoop).


The latency table for test C follows. Since this is a read-only test, there is no update latency figure. Again, Yugabyte's latency is quite high.

Workload D: Read latest workload

In this workload, new records are inserted, and the most recently inserted records are the most popular. Application example: user status updates; people want to read the latest.
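The "most recent is most popular" access pattern can be sketched as follows. This is a simplified illustration using an exponential skew toward the newest record; it is not the generator YCSB actually uses, and the function name and skew parameter are assumptions.

```python
import random


def pick_recent_index(record_count, rng, skew=0.1):
    """Return a record index biased toward record_count - 1 (the newest record)."""
    # An offset of 0 means "the newest record"; small offsets are the most likely.
    offset = int(rng.expovariate(skew))
    return max(record_count - 1 - offset, 0)


rng = random.Random(42)
picks = [pick_recent_index(1000, rng) for _ in range(1000)]
recent = sum(1 for p in picks if p >= 900)
print(f"{recent} of {len(picks)} reads hit the newest 100 records")
```

With this kind of skew, a db that keeps recently written records readily available should see most reads served from a small, hot part of the data set.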


The latency table for test D is below. Here Redis and Yugabyte have higher latencies, and Yugabyte performs poorly for both insert and read in this test.

Workload E: Short ranges

In this workload, short ranges of records are queried, instead of individual records. Application example: threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id).
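A short-range scan can be sketched as "find a start key, then read the next few records in key order". The example below assumes keys are kept sorted and clustered by thread id; the key format and the scan helper are hypothetical and only illustrate the access pattern.

```python
import bisect


def scan(sorted_keys, store, start_key, count):
    """Return up to count (key, value) pairs starting at the first key >= start_key."""
    start = bisect.bisect_left(sorted_keys, start_key)
    return [(k, store[k]) for k in sorted_keys[start:start + count]]


# Hypothetical posts clustered by thread id through the key prefix.
posts = {f"thread{t}:post{p:03d}": f"text {t}-{p}" for t in (1, 2) for p in range(5)}
sorted_keys = sorted(posts)

result = scan(sorted_keys, posts, "thread2:", 3)
print([k for k, _ in result])
# ['thread2:post000', 'thread2:post001', 'thread2:post002']
```

Each scan touches several records instead of one, which is consistent with scans being more expensive per operation than single-record reads.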


This workload requires a lot of scans, so throughput decreases for all dbs. Redis is the slowest, understandably so, and this is also reflected in the latency table below. Yugabyte performs very well in this test.

Workload F: Read-modify-write

In this workload, the client will read a record, modify it, and write back the changes. Application example: user database, where user records are read and modified by the user or to record user activity.
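The read-modify-write cycle can be sketched as a single logical operation made of a read followed by a write of the changed record. The helper below is a minimal illustration against an in-memory dict; the function name and the record fields are assumptions.

```python
def read_modify_write(store, key, modify):
    """Read a record, apply modify to a copy of it, and write the result back."""
    record = store[key]             # read
    updated = modify(dict(record))  # modify a copy, leaving the original untouched
    store[key] = updated            # write back
    return updated


def record_login(record):
    """Example modification: count one more login for the user."""
    record["logins"] = record.get("logins", 0) + 1
    return record


users = {"user1": {"name": "alice", "logins": 2}}
print(read_modify_write(users, "user1", record_login))
# {'name': 'alice', 'logins': 3}
```

Because every such operation includes both a read and a write, slow reads or slow updates on a db each add to the time the whole cycle takes.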


Here Yugabyte has high latency, especially at the 99th percentile, for update and read-modify-write. However, the very high read latency of MongoDB makes it the last db to finish the task.

Conclusion

While each database has been designed for different goals and use cases, the YCSB test provides something of a common ground for benchmarking. The numbers shown in this document can therefore help developers and users select the db best suited to their requirements. All of these dbs are available free of cost to download and install, and it should be fairly straightforward to run these tests in your own environment for further analysis. The tests can be modified or extended to cover specific scenarios as needed.

Reader Comments (6)

Good post and info to see. Will checkout BangDB
Do we have python binding for the db?
Or Is REST api available?

February 17, 2021 | Unregistered CommenterRaghu

How about comparison with Aerospike? num looks pretty good for bangdb
How would it compare with Aerospike? do you have some data on that too?

February 17, 2021 | Unregistered CommenterMike L

Looked at few use cases of BangDB sounds interesting - Where and how can I download?

February 17, 2021 | Unregistered CommenterDatNerd

11 seconds latency to read from MongoDB?
What indexes were used to support this workload?!
We have a 12 TB MongoDB cluster and retrieve random key-value lookups in avg 1 second??

February 18, 2021 | Unregistered CommenterSander

looks like it's time for big companies to switch to bang

what kind of metrics are there for a long stream of transactions over a period of constant running? (days, weeks, etc.)

February 18, 2021 | Unregistered Commentergerry

@Sander It's 11ms for MongoDB and not 11 seconds.

February 19, 2021 | Unregistered CommenterMike L
