This is a guest post by Brian Bulkowski, CTO and co-founder of Aerospike, a leading clustered NoSQL database, who has worked in the area of high performance commodity systems since 1989.
Why flash rules for databases
The economics of flash memory are staggering. If you’re not using SSD, you are doing it wrong.
Not quite true, but close. Some small applications fit entirely in memory – less than 100GB – great for in-memory solutions. There’s a place for rotational drives (HDD) in massive streaming analytics and petabytes of data. But for the vast space between, flash has become the only sensible option.
For example, the Samsung 840 costs $180 for 250GB. The manufacturer rates this drive at 96,000 random 4K read IOPS and 61,000 random 4K write IOPS. The Samsung 840 is not alone at this price-performance point. A 300GB Intel 320 is $450. An OCZ Vertex 4 256GB is $235. The Intel is rated the slowest of the three, but our internal testing shows solid performance. Most datacenter chassis will accommodate four data drives, and adding four Samsung 840s creates a system with 1TB of storage, 384,000 random read IOPS, and 244,000 random write IOPS, for a storage street cost of $720 and an extra 0.3 watts added to the server’s power draw.
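The four-drive arithmetic is easy to check for yourself. Here is a minimal sketch that multiplies out the per-drive figures quoted above. It assumes IOPS scale perfectly linearly across drives, which a real controller or software stack may not deliver, and the function and constant names are just for illustration.

```python
# Per-drive figures for the Samsung 840 (250GB), as quoted in the text.
DRIVE_PRICE_USD = 180
DRIVE_CAPACITY_GB = 250
DRIVE_READ_IOPS = 96_000
DRIVE_WRITE_IOPS = 61_000


def array_totals(drive_count):
    """Return naive aggregate figures for drive_count identical drives.

    Assumption: capacity, IOPS, and cost all scale linearly with drive count.
    """
    return {
        "capacity_gb": drive_count * DRIVE_CAPACITY_GB,
        "read_iops": drive_count * DRIVE_READ_IOPS,
        "write_iops": drive_count * DRIVE_WRITE_IOPS,
        "cost_usd": drive_count * DRIVE_PRICE_USD,
    }


totals = array_totals(4)
for name, value in totals.items():
    print(f"{name}: {value:,}")
```

Change the drive count or swap in the numbers for another drive to see how the totals move.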
If you have a dataset under 10TB, and you’re still using rotational drives, you’re doing it wrong. The new low cost of flash makes rotational drives useful only for the lightest of workloads.
Most operational non-analytic workloads require only a few IOPS per transaction. A good database should require just one.
HDDs have a price of about $0.10 per GB, 10x cheaper than flash, but each spindle supports only about 200 IOPS, which is the number of seeks per second. Until the recent advent of flash, databases were IOPS-limited, requiring large arrays to reach high performance. Estimating cost per IOPS is difficult, as smaller drives provide the same performance for lower cost. But achieving performance similar to the 96,000 IOPS of a $180 Samsung 840 would require over 400 HDDs at a price of hundreds of thousands of dollars.
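To put a number on "over 400", here is a rough sketch of the spindle count needed to match one SSD, using only the figures quoted above. It assumes every spindle delivers its full 200 IOPS and ignores RAID overhead, controllers, and enclosures, so it is a floor rather than a sizing guide. The names are illustrative.

```python
import math

HDD_IOPS_PER_SPINDLE = 200   # approximate seeks per second, from the text
SSD_READ_IOPS = 96_000       # Samsung 840 rated random 4K read IOPS
SSD_PRICE_USD = 180


def spindles_needed(target_iops, iops_per_spindle=HDD_IOPS_PER_SPINDLE):
    """Smallest number of spindles whose combined IOPS reach target_iops."""
    return math.ceil(target_iops / iops_per_spindle)


hdd_count = spindles_needed(SSD_READ_IOPS)
ssd_cost_per_iops = SSD_PRICE_USD / SSD_READ_IOPS

print(f"HDD spindles to match one SSD: {hdd_count}")
print(f"SSD cost per read IOPS: ${ssd_cost_per_iops:.5f}")
```

The same function works for transaction planning: at one IOPS per transaction, the target IOPS is simply your target transactions per second.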
Let’s compare the economics of memory. Dell is currently (December 2012) charging $20 per GB for DRAM (16GB DIMM at $315), and a fully loaded R720 with RDIMMs tops out at 384GB for $13,000, or $33 per GB, fully loaded. Memory doesn’t have IOPS, and main memory databases have been measured at over 1M transactions per second. Memory is faster, but we’ll see that for most use cases, network bottlenecks will overcome RAM’s performance advantage.
Step back: $33 per GB for RAM, $1 per GB for flash. A high-density 12TB flash solution can be built with the current Dell R720 for about the same price ($13K/server) as a high-density 384GB memory system. RAM’s power draw tips the equation even further.
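The per-GB figures above are rounded. As a quick sanity check, this sketch recomputes them from the prices and capacities quoted in this post (December 2012 street prices). Nothing here is measured; the dictionary labels are just for illustration.

```python
def cost_per_gb(price_usd, capacity_gb):
    """Price divided by capacity, in dollars per GB."""
    return price_usd / capacity_gb


# (price in USD, capacity in GB), all taken from the figures in the text.
options = {
    "DRAM, 16GB DIMM": (315, 16),
    "DRAM, fully loaded R720": (13_000, 384),
    "Flash, Samsung 840": (180, 250),
    "Flash, Intel 320": (450, 300),
    "Flash, OCZ Vertex 4": (235, 256),
}

per_gb = {name: cost_per_gb(price, gb) for name, (price, gb) in options.items()}

for name, dollars in per_gb.items():
    print(f"{name}: ${dollars:.2f}/GB")

# How many times more a GB costs in a loaded server than on the cheapest SSD.
ratio = per_gb["DRAM, fully loaded R720"] / per_gb["Flash, Samsung 840"]
print(f"Loaded-server RAM vs Samsung 840: about {ratio:.0f}x")
```

The three flash drives land between roughly $0.70 and $1.50 per GB, which is where the $1 rule of thumb comes from.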
Flash storage provides random access capabilities, which means your application developers spend less time optimizing query patterns. All the queries go fast. That fast random access results in architectural flexibility, and allows you to change your data patterns and applications rapidly. That’s priceless.
The lure of main memory databases