Entries in nosql (54)

Monday
Jul092012

Data Replication in NoSQL Databases

This is the third guest post (part 1, part 2) of a series by Greg Lindahl, CTO of blekko, the spam-free search engine. Previously, Greg was Founder and Distinguished Engineer at PathScale, where he was the architect of the InfiniPath low-latency InfiniBand HCA, used to build tightly-coupled supercomputing clusters.

blekko's home-grown NoSQL database was designed from the start to support a web-scale search engine, with 1,000s of servers and petabytes of disk. Data replication is a very important part of keeping the database up and serving queries. Like many NoSQL database authors, we decided to keep R=3 copies of each piece of data in the database, and not use RAID to improve reliability. The key goal we were shooting for was a database which degrades gracefully when there are many small failures over time, without needing human intervention.
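As a rough illustration of what R=3 placement can look like, here's a short Python sketch using rendezvous hashing to pick three servers in distinct racks. This is an assumption-laden toy, not blekko's actual placement logic; the server names and the rack-awareness rule are made up.

```python
import hashlib

# Toy R=3 replica placement via rendezvous (highest-random-weight) hashing.
# This is an illustrative sketch, not blekko's actual scheme.
R = 3

def pick_replicas(key, servers):
    """servers: list of (hostname, rack). Prefer distinct racks so a single
    rack failure can't take out all R copies of a piece of data."""
    def weight(host):
        return hashlib.md5((key + "/" + host).encode()).hexdigest()
    ordered = sorted(servers, key=lambda s: weight(s[0]), reverse=True)
    chosen, used_racks = [], set()
    # First pass: one replica per rack; second pass: fill any remaining slots.
    for require_new_rack in (True, False):
        for host, rack in ordered:
            if len(chosen) == R:
                return chosen
            if host in chosen:
                continue
            if require_new_rack and rack in used_racks:
                continue
            chosen.append(host)
            used_racks.add(rack)
    return chosen

replicas = pick_replicas("url:example.com/page",
                         [("s1", "r1"), ("s2", "r1"), ("s3", "r2"),
                          ("s4", "r2"), ("s5", "r3")])
```

Because the placement is deterministic per key, any node can recompute where the copies live, and losing one server invalidates only that server's share of the replicas rather than a whole RAID group.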

Why don't we like RAID for big NoSQL databases?

Click to read more ...

Monday
Apr302012

Masstree - Much Faster than MongoDB, VoltDB, Redis, and Competitive with Memcached

The EuroSys 2012 system conference has an excellent live blog summary of their talks for: Day 1, Day 2, Day 3 (thanks Henry at the Paper Trail blog). Summaries for each of the accepted papers are here.

One of the more interesting papers from a NoSQL perspective was Cache Craftiness for Fast Multicore Key-Value Storage, a wonderfully detailed description of the low level techniques used to implement Masstree:

A storage system specialized for key-value data in which all data fits in memory, but must persist across server restarts. It supports arbitrary, variable-length keys. It allows range queries over those keys: clients can traverse subsets of the database, or the whole database, in sorted order by key. On a 16-core machine Masstree achieves six to ten million operations per second on parts A–C of the Yahoo! Cloud Serving Benchmark, more than 30 times as fast as VoltDB [5] or MongoDB [2].

If you are looking for innovative detailed high performance design, this paper is for you. An example from the section on writer-writer coordination: 

Masstree writers coordinate using per-node spinlocks. A node’s lock is stored in a single bit in its version counter. Any modification to a node’s keys or values requires holding the node’s lock. Some data is protected by other nodes’ locks, however. A node’s parent pointer is protected by its parent’s lock, and a border node’s prev pointer is protected by its previous sibling’s lock. This minimizes the simultaneous locks required by split operations; when an interior node splits, for example, it can assign its children’s parent pointers without obtaining their locks.
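To make the version-counter trick concrete, here's a toy, single-threaded Python sketch of a lock bit embedded in a node's version word. The bit layout and names are hypothetical; the real Masstree is C++ and manipulates this word with atomic compare-and-swap.

```python
# Toy sketch of a per-node version word whose low bit doubles as a spinlock.
# Bit layout and names are hypothetical; real Masstree uses atomic CAS.
LOCK_BIT      = 1 << 0    # held while a writer is modifying the node
SPLITTING_BIT = 1 << 1    # set while the node is being split
COUNT_SHIFT   = 2         # higher bits: count of completed modifications

class Node:
    def __init__(self):
        self.version = 0  # one machine word in the real implementation

    def try_lock(self):
        # Stand-in for an atomic compare-and-swap on self.version.
        if self.version & LOCK_BIT:
            return False          # another writer holds the lock: spin
        self.version |= LOCK_BIT
        return True

    def unlock(self):
        # Bump the modification count so optimistic readers that sampled
        # the version before our write can detect the change and retry,
        # then release the lock bit.
        count = (self.version >> COUNT_SHIFT) + 1
        self.version = count << COUNT_SHIFT  # lock/splitting bits now clear
```

Packing the lock into the version word means readers and writers contend on a single cache line per node, which is the kind of cache craftiness the paper's title refers to.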

Here's the live blog writeup:

Click to read more ...

Monday
Feb202012

Berkeley DB Architecture - NoSQL Before NoSQL was Cool

After the filesystem and simple library packages like dbm, Berkeley DB was the original luxury embedded database, widely used by applications as their core database engine. NoSQL before NoSQL was cool. The hidden secret making complex applications sing. If you want to dispense with all the network overhead of a server-based system, it's still a good choice.

There's a great writeup of the architecture behind Berkeley DB in the book The Architecture of Open Source Applications. If you want to understand more about how a database works, or if you are pondering how to build your own, it's rich in detail, explanations, and lessons. Here's the Berkeley DB chapter from the book. It covers topics like: Architectural Overview; The Access Methods: Btree, Hash, Recno, Queue; The Library Interface Layer; The Buffer Manager: Mpool; Write-ahead Logging; The Lock Manager: Lock; The Log Manager: Log; The Transaction Manager: Txn.
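For a feel of the embedded, no-server model, here's a minimal sketch using the bsddb3 Python bindings to the C library. The file and key names are made up, and the exact binding API can vary by version; treat this as illustrative rather than definitive.

```python
# Minimal embedded-database sketch using the bsddb3 bindings to Berkeley DB.
# No server process, no network round trips: the database is a local file
# opened in-process. Names are illustrative; binding details may vary.
from bsddb3 import db

d = db.DB()
d.open("app.db", dbtype=db.DB_BTREE, flags=db.DB_CREATE)  # Btree access method
d.put(b"user:42", b"alice")
assert d.get(b"user:42") == b"alice"

# With the Btree access method, a cursor walks keys in sorted order.
cur = d.cursor()
rec = cur.first()
while rec is not None:
    key, value = rec
    print(key, value)
    rec = cur.next()
cur.close()
d.close()
```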

Click to read more ...

Tuesday
Feb072012

Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection

This is a guest post by Doug Judd, original creator of Hypertable and the CEO of Hypertable, Inc.

Hypertable delivers 2X better throughput in most tests -- HBase fails the 41 billion and 167 billion record insert tests, overwhelmed by garbage collection -- both systems deliver similar results for the random read uniform test

We recently conducted a test comparing the performance of Hypertable (@hypertable) version 0.9.5.5 to that of HBase (@HBase) version 0.90.4 (CDH3u2) running Zookeeper 3.3.4.  In this post, we summarize the results and offer explanations for the discrepancies. For the full test report, see Hypertable vs. HBase II.

Introduction

Hypertable and HBase are both open source, scalable databases modeled after Google's proprietary Bigtable database.  The primary difference between the two systems is that Hypertable is written in C++, while HBase is written in Java.  We modeled this test after the one described in section 7 of the Bigtable paper and tuned both systems for maximum performance.  The test was run on a total of sixteen machines connected together with gigabit Ethernet.  The machines had the following configuration:

Click to read more ...

Tuesday
Jan242012

The State of NoSQL in 2012

This is a guest post by Siddharth Anand, a senior member of LinkedIn's Distributed Data Systems team. 

Preamble Ramble

If you’ve been working in the online (i.e., internet) space over the past 3 years, you are no stranger to terms like “the cloud” and “NoSQL”.

In 2007, Amazon published a paper on Dynamo. The paper detailed how Dynamo, employing a collection of techniques to solve several problems in fault-tolerance, provided a resilient solution to the online shopping cart problem. A few years go by while engineers at AWS toil in relative obscurity, standing up their public cloud.

It’s December 2008 and I am a member of Netflix’s Software Infrastructure team. We’ve just been told that there is something called the “CAP theorem” and because of it, we are to abandon our datacenter in hopes of leveraging Cloud Computing.

Huh?

Click to read more ...

Thursday
Jan052012

Shutterfly Saw a Speedup of 500% With Flashcache

In the "should I or shouldn't I" debate around deploying SSD, it always helps to have real-world data. Fiesta! with a live-blog summary of a presentation by Kenny Gorman on Shutterfly on MongoDB Performance Tuning.

What if you still need more performance after doing all of this tuning? One option is to use SSDs. Shutterfly uses Facebook’s flashcache: kernel module to cache data on SSD. Designed for MySQL/InnoDB. SSD in front of a disk, but exposed as a single mount point. This only makes sense when you have lots of physical I/O. Shutterfly saw a speedup of 500% w/ flashcache. A benefit is that you can delay sharding: less complexity.
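As a back-of-the-envelope check on why an SSD cache pays off only when you are bound by physical I/O, here's a small Python sketch. The latency figures are illustrative assumptions, not Shutterfly's measurements.

```python
# Effective read latency with an SSD cache in front of a disk.
# Latency figures are illustrative assumptions, not Shutterfly's numbers.
disk_ms, ssd_ms = 10.0, 0.2   # per random read

def effective_latency_ms(hit_ratio):
    return hit_ratio * ssd_ms + (1.0 - hit_ratio) * disk_ms

for hr in (0.0, 0.5, 0.9, 0.95):
    eff = effective_latency_ms(hr)
    print(f"hit ratio {hr:4.0%}: {eff:5.2f} ms  "
          f"({disk_ms / eff:4.1f}x faster than disk alone)")
# Around a 90% hit ratio the cache alone gives a speedup in the same
# ballpark as the 500% figure above, with no application changes.
```

If the working set already fits in RAM there is little physical I/O to accelerate, which is why the tuning steps come first and the SSD comes last.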

The whole series of posts has a lot of great information and is worth a longer look, especially if you are considering using MongoDB. 

Click to read more ...

Tuesday
Nov012011

Finding the Right Data Solution for Your Application in the Data Storage Haystack

The InfoQ article Finding the Right Data Solution for Your Application in the Data Storage Haystack makes a series of concrete recommendations for a user who wants to find the right storage solution for his application.  

A few years back, SQL RDBMSs were the solution for almost all storage needs, but we all know how scaling came along and shattered that perfect dream. Then NoSQL happened, and now we have ended up with a haystack of solutions. For example, local memory, relational databases, files, distributed caches, column-family storage, document storage, name-value pairs, graph DBs, service registries, queues, and tuple spaces are some classes of such solutions.

We often discuss how to find the right storage solution, and we make these choices whenever we design a system. But when it comes to describing how to select the right one, we usually end up giving only very high-level guidelines. The article argues that the way to make more concrete recommendations is to drill down into a bit more detail and consider the choices case by case.

To that end, the article takes four parameters that describe an application or use case (Scale, Consistency, Type of Data, and Queries needed), enumerates some 40+ cases that arise from different combinations of those parameter values, and makes one or more concrete recommendations on the right storage solution for each case, as sketched below.
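Here's a hypothetical Python sketch of that approach, treating the four parameters as a lookup key. The specific mappings below are illustrative stand-ins, not the article's actual table of cases.

```python
from typing import NamedTuple

# Hypothetical sketch of the article's method: four parameters form a case,
# and each case maps to a recommendation. These mappings are illustrative,
# not the article's actual 40+ case table.
class Case(NamedTuple):
    scale: str        # "single-node" | "cluster" | "web-scale"
    consistency: str  # "strong" | "eventual"
    data_type: str    # "structured" | "semi-structured" | "unstructured"
    queries: str      # "key-lookup" | "range" | "ad-hoc"

RECOMMENDATIONS = {
    Case("single-node", "strong", "structured", "ad-hoc"):
        "relational database",
    Case("web-scale", "eventual", "semi-structured", "key-lookup"):
        "key-value or column-family store",
    Case("web-scale", "eventual", "unstructured", "key-lookup"):
        "distributed file/blob storage",
}

def recommend(case: Case) -> str:
    return RECOMMENDATIONS.get(case, "no single answer: drill down further")

print(recommend(Case("web-scale", "eventual", "semi-structured", "key-lookup")))
```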

What follows are the four parameters and potential values they can take and the recommendations for structured, semi-structured, and unstructured data: 

Click to read more ...

Tuesday
Sep132011

Must see: 5 Steps to Scaling MongoDB (Or Any DB) in 8 Minutes

Jared Rosoff concisely, effectively, entertainingly, and convincingly gives an 8 minute tutorial on scaling MongoDB at Scale Out Camp. The ideas aren't just limited to MongoDB; they work for most any database: Optimize your queries; Know your working set size; Tune your file system; Choose the right disks; Shard. Below is a quick taste of the first strategy, followed by an explanation of all 5:
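As a taste of "optimize your queries", here's a small pymongo sketch of checking a query plan and adding the index it calls for. The collection and field names are made up for illustration.

```python
# Sketch of "optimize your queries" with pymongo: inspect the plan, then
# add the index the query needs. Collection and field names are made up.
from pymongo import ASCENDING, MongoClient

coll = MongoClient()["shop"]["orders"]

plan = coll.find({"customer_id": 42}).explain()
print(plan)  # a collection scan here means every document is examined

coll.create_index([("customer_id", ASCENDING)])
# Re-running explain() should now show an index scan instead.
```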

Click to read more ...

Tuesday
Sep062011

Big Data Application Platform

It's time to think about the architecture and application platforms surrounding "Big Data" databases. Big Data is often centered around new database technologies, mostly from the emerging NoSQL world. The main challenge these databases solve is how to handle massive amounts of data at a reasonable cost and without poor performance. Distributed databases emerged to address this challenge, and today we're seeing high adoption rates and quite impressive success stories, such as Netflix's use of the Cassandra/DataStax solution. All of this indicates how quickly this market is evolving.

The need for a Big Data Application Platform

Click to read more ...

Thursday
Aug042011

Jim Starkey is Creating a Brave New World by Rethinking Databases for the Cloud

Jim Starkey, founder of NuoDB, in this thread on the Cloud Computing group, delivers a masterful post on why he thinks the relational model is the best overall compromise amongst the different options, why NewSQL can free itself from the limitations of legacy SQL architectures, and how this creates a brave new lock-free world...

I'll [Jim Starkey] go into more detail later in the post for those who care, but the executive summary goes like this:  Network latency is relatively high and human attention span is relatively low.  So human-facing computer systems have to perform their work in a small number of trips between the client and the database server.  But the human condition leads inexorably to data complexity.  There are really only two strategies to manage this problem. One is to use coarse granularity storage, glomming together related data into a single blob and letting intelligence on the client make sense of it.  The other is storing fine granularity data on the server and using intelligence on the server to aggregate data to be returned to the client.

NoSQL uses the former for a variety of reasons...
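Here's a small sketch of the two strategies with a shopping cart as the running example; the names and schema are illustrative, not anything from NuoDB.

```python
# The two strategies from the quote, sketched with a shopping cart.
# Names and schema are illustrative.

# Strategy 1: coarse granularity. Store the cart as one blob; one round
# trip fetches it, and intelligence on the client makes sense of it.
cart_blob = {"user": 42,
             "items": [{"sku": "A1", "qty": 2, "price": 9.99},
                       {"sku": "B7", "qty": 1, "price": 4.50}]}
total = sum(i["qty"] * i["price"] for i in cart_blob["items"])  # client-side

# Strategy 2: fine granularity. Store items as individual rows; one round
# trip asks the server to aggregate and return only the answer.
AGGREGATE_SQL = """
SELECT SUM(qty * price) AS total
FROM cart_items
WHERE user_id = 42;
"""
```

Either way the work fits in one client-server round trip; the difference is whether the intelligence that makes sense of the data lives on the client or on the server.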

Click to read more ...