
This is a guest post by Sachin Sinha, who is passionate about data, analytics and machine learning at scale. He is the author and founder of BangDB.

This article reports the YCSB benchmark results in detail for five NoSQL databases (Redis, MongoDB, Couchbase, Yugabyte and BangDB) and compares them side by side. I used the latest version of each database and followed the recommendations for running each one under optimized conditions. I used the six default test scenarios defined by the YCSB framework and limited each test to 10M records, although you can run the benchmark with as many records as you find practical.

About YCSB

The following configuration was used for the evaluation.

Redis server: 5.07, x86_64
MongoDB server: 4.4.2, x86_64
YugabyteDB: 2.5.0, x86_64
Couchbase: 7.0 Beta, x86_64
BangDB server: 2.0.0, x86_64
Number of records: 10M
RAM: 32GB, Cores: 16
YCSB workloads: see github.com/brianfrankcooper/YCSB/wiki/Core-Workloads

Each workload test runs in two steps: load and run. The load step loads the data, and the run step executes the test against that data. I ran each test against a clean database so that the numbers are fair.
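As a rough illustration of the two steps, the sketch below builds the YCSB command line for each phase. The `ycsb_command` helper is hypothetical, and the Redis host, port and operation count are assumptions; only the 10M record count comes from this test setup.

```python
from typing import Dict, List, Optional


def ycsb_command(phase: str, binding: str, workload: str,
                 props: Optional[Dict[str, object]] = None) -> List[str]:
    """Build the argument list for one YCSB phase ("load" or "run")."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    # -s prints periodic status, -P points at the workload property file.
    cmd = ["./bin/ycsb", phase, binding, "-s", "-P", f"workloads/{workload}"]
    # Each -p overrides a single property from the workload file.
    for key, value in (props or {}).items():
        cmd += ["-p", f"{key}={value}"]
    return cmd


# Assumed settings: a local Redis instance and 10M operations in the run phase.
props = {
    "recordcount": 10_000_000,
    "operationcount": 10_000_000,
    "redis.host": "127.0.0.1",
    "redis.port": 6379,
}

load_cmd = ycsb_command("load", "redis", "workloada", props)
run_cmd = ycsb_command("run", "redis", "workloada", props)
print(" ".join(load_cmd))
print(" ".join(run_cmd))
```

Running the load command once and then the run command, each time against a freshly cleaned database, matches the procedure described above.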

Summary

Here is a high-level summary of the results for both the load and run phases.

Load performance is consistent for each database across all tests, as expected, because this phase only loads the data. The run phase is where each database is exercised under different test conditions.
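The test conditions differ mainly in their operation mix. The sketch below lists the mix of each core workload as YCSB property names; the proportions for workloads D, E and F are taken from the stock YCSB workload files rather than from this article, and the `to_properties` helper is hypothetical.

```python
from typing import Dict

# Operation mix per core workload; each set of proportions sums to 1.0.
CORE_WORKLOADS: Dict[str, Dict[str, float]] = {
    "workloada": {"readproportion": 0.5, "updateproportion": 0.5},
    "workloadb": {"readproportion": 0.95, "updateproportion": 0.05},
    "workloadc": {"readproportion": 1.0},
    "workloadd": {"readproportion": 0.95, "insertproportion": 0.05},
    "workloade": {"scanproportion": 0.95, "insertproportion": 0.05},
    "workloadf": {"readproportion": 0.5, "readmodifywriteproportion": 0.5},
}


def to_properties(name: str, recordcount: int = 10_000_000) -> str:
    """Render one workload's mix as key=value property lines."""
    lines = [f"recordcount={recordcount}"]
    for key, value in CORE_WORKLOADS[name].items():
        lines.append(f"{key}={value}")
    return "\n".join(lines)


print(to_properties("workloada"))
```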

Workload A: Update heavy workload

This workload has a mix of 50/50 reads and writes. An application example is a session store recording recent actions.

The first graph shows the throughput (ops/sec) for the 10M records, while the second chart shows how quickly each database completed the test.
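The two charts are closely related: for a fixed number of operations, completion time is roughly the operation count divided by the average throughput. A minimal sketch follows, using made-up throughput figures (not results from this benchmark) and assuming 10M operations per run.

```python
def completion_seconds(operations: int, ops_per_sec: float) -> float:
    """Approximate wall-clock time to finish a run at a given throughput."""
    if ops_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return operations / ops_per_sec


OPERATIONS = 10_000_000  # assumed operation count for the run phase

# Hypothetical databases and throughputs, for illustration only.
hypothetical_throughput = {"db_x": 50_000.0, "db_y": 20_000.0}

for name, throughput in hypothetical_throughput.items():
    seconds = completion_seconds(OPERATIONS, throughput)
    print(f"{name}: {throughput:.0f} ops/sec -> {seconds:.0f} s")
```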

Note that MongoDB's update latency is very low compared to the other databases (lower is better), but its read latency is on the higher side.

Workload B: Read mostly workload

This workload has a 95/5 read/write mix. Application example: photo tagging; adding a tag is an update, but most operations read tags.

The latency table shows that the 99th percentile latency for Yugabyte is quite high compared to the others (lower is better).
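For readers less familiar with percentile latency, here is a small sketch of how a 99th percentile can be computed from raw samples using the nearest-rank method. The sample values are invented for illustration and the exact method YCSB uses internally may differ.

```python
import math
from typing import List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample covering pct% of the data."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in microseconds: 98 fast requests and 2 slow ones.
latencies_us = [200.0] * 98 + [5000.0] * 2

mean_us = sum(latencies_us) / len(latencies_us)
p50_us = percentile(latencies_us, 50)
p99_us = percentile(latencies_us, 99)
print(f"mean={mean_us:.0f}us p50={p50_us:.0f}us p99={p99_us:.0f}us")
```

A handful of slow requests barely moves the median but dominates the 99th percentile, which is why the tail figure is worth comparing alongside the average.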

Workload C: Read only

This workload is 100% read. Application example: user profile cache, where profiles are constructed elsewhere (e.g., Hadoop).

The latency table for test C follows. Because this is a read-only test, there is no update latency figure. Again, Yugabyte's latency is quite high.

Workload D: Read latest workload

In this workload, new records are inserted, and the most recently inserted records are the most popular. Application example: user status updates; people want to read the latest.

The latency table for test D is below. Redis and Yugabyte have higher latencies here, and Yugabyte performs poorly for both inserts and reads in this test.

Workload E: Short ranges

In this workload, short ranges of records are queried, instead of individual records. Application example: threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id).
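To make the access pattern concrete, the sketch below implements a short range scan over an in-memory store that keeps its keys sorted. The `SortedStore` class and the `thread<id>:post<n>` key format are hypothetical and are not how any of the tested databases store data.

```python
import bisect
from typing import Dict, List, Tuple


class SortedStore:
    """Toy key-value store that keeps keys sorted to support range scans."""

    def __init__(self) -> None:
        self._keys: List[str] = []
        self._values: Dict[str, str] = {}

    def insert(self, key: str, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep the key list ordered
        self._values[key] = value

    def scan(self, start_key: str, count: int) -> List[Tuple[str, str]]:
        """Return up to `count` records with key >= start_key, in key order."""
        start = bisect.bisect_left(self._keys, start_key)
        return [(k, self._values[k]) for k in self._keys[start:start + count]]


store = SortedStore()
for thread in (7, 8):
    for post in range(5):
        store.insert(f"thread{thread}:post{post:03d}", f"text {thread}/{post}")

# Fetch the first three posts of thread 7 with a single scan.
first_posts = store.scan("thread7:", 3)
print([key for key, _ in first_posts])
```

Because posts of one thread share a key prefix, they sit next to each other in key order and one scan returns them without separate lookups.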

This workload requires a lot of scans, so throughput decreases for all databases. Redis is the slowest, understandably, and this is also reflected in the latency table below. Yugabyte performs very well in this test.

Workload F: Read-modify-write

In this workload, the client will read a record, modify it, and write back the changes. Application example: user database, where user records are read and modified by the user or to record user activity.
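A minimal sketch of the read-modify-write cycle, using a plain dictionary as a stand-in for the database. The `read_modify_write` helper and the sample record are hypothetical.

```python
from typing import Callable, Dict

Record = Dict[str, str]


def read_modify_write(store: Dict[str, Record], key: str,
                      modify: Callable[[Record], Record]) -> Record:
    """Read a record, apply a client-side change, and write it back."""
    current = dict(store[key])   # read: copy the stored record
    updated = modify(current)    # modify: change it on the client
    store[key] = updated         # write: store the whole record again
    return updated


users: Dict[str, Record] = {"user42": {"name": "Ada", "last_action": "login"}}

# Record a new user activity on an existing profile.
read_modify_write(users, "user42",
                  lambda record: {**record, "last_action": "edit_profile"})
print(users["user42"])
```

Each cycle involves both a read and a write against the store, so a database that is slow at either one tends to be slow at this workload as a whole.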

Here Yugabyte has high latency, especially at the 99th percentile for update and read-modify-write. However, the very high read latency of MongoDB makes it the last database to finish the task.

Conclusion

While each database has been designed for different goals and use cases, the YCSB test provides a reasonably common ground for benchmarking, so the numbers shown here can help developers and users select the database suited to their requirements. All of these databases are available free of cost to download and install, and it should be fairly straightforward to run these tests in your own environment for further analysis. The tests can be modified or extended to cover specific scenarios as needed.