High Scalability -

Recommend Datanet: a New CRDT Database that Let's You Do Bad Bad Things to Distributed Data (Email)

This action will generate an email recommending this article to the recipient of your choice. Note that your email address and your recipient's email address are not logged by this system.

Email Article Link

The email sent will contain a link to this article, the article title, and an article excerpt (if available). For security reasons, your IP address will also be included in the sent email.

Article Excerpt:

We've had databases targeting consistency. These are your typical RDBMSs. We've had databases targeting availability. These are your typical NoSQL databases.

If you're using your CAP decoder ring you know what's next...what databases do we have that target making concurrency a first class feature? That promise to thrive and continue to function when network partitions occur?

No many, but we have a brand new concurrency oriented database: Datanet - a P2P replication system that utilizes CRDT algorithms to allow multiple concurrent actors to modify data and then automatically & sensibly resolve modification conflicts.

Datanet is the creation of Russell Sullivan. Russell spent over three years hidden away in his mad scientist layer researching, thinking, coding, refining, and testing Datanet. You may remember Russell. He has been involved with several articles on HighScalability and he wrote AlchemyDB, a NoSQL database, which was acquired by Aerospike.

So Russell has a feel for what's next. When he built AlchemyDB he was way ahead of the pack and now he thinks practical, programmer friendly CRDTs are what's next. Why?

Concurrency and data locality. To quote Russell:

Datanet lets you ship data to the spot where the action is happening. When the action happens it is processed locally, your system's reactivity is insanely quick. This is pretty much the opposite of the non-concurrent case where you need to go to a specific machine in the cloud to modify a piece of data regardless of where the action takes place. As your system grows, the concurrent approach is superior.

We have been slowly moving away from transactions towards NoSQL for reasons of scalability, availability, robustness, etc. Datanet continues this evolution by taking the next step and moving towards extreme distribution: supporting tons of concurrent writers.

The shift is to more distribution in computation. We went from one app-server & one DB to app-server-clusters and clustered-DBs, to geographically distributed data-centers, and now we are going much further with Datanet, data is distributed anywhere you need it to a local cache that functions as a database master.

How does Datanet work?

In Datanet, the same piece of data can simultaneously exist as a write-able entity in many many places in the stack. Datanet is a different way of looking at data: Datanet more closely resembles an internet routing protocol than a traditional client-server database ... and this mirrors the current realities that data is much more in flight than it used to be.

What bad bad things can you do to your distributed data? Here's an amazing video of how Datanet recovers quickly, predictably, and automatically from Chaos Monkey level extinction events. It's pretty slick.

Here's an email interview I did with Russell. He goes into a lot more detail about Datanet and what it's all about. I think you will find it interesting.

Let's start with your name and a little of your background?

Article Link:

Your Name:

Your Email:

Recipient Email:

Message: