Hazelcast 2.0: Big Data In-Memory
As noted in the recent article "Google: Taming the Long Latency Tail - When More Machines Equals Worse Results", latency variability has a greater impact in larger-scale clusters, where a typical request is composed of multiple distributed/parallel requests. The overall response time degrades dramatically if the latency of each individual request is not consistently low.
In dynamically scalable partitioned storage systems, whether a NoSQL database, a filesystem or an in-memory data grid, changes in the cluster (adding or removing a node) can trigger large data moves over the network to re-balance the cluster. Re-balancing is needed for both primary and backup data on the affected nodes. If a node crashes, for example, the dead node's data has to be re-owned (become primary) on other nodes, and a new backup of that data has to be taken immediately so the cluster is fail-safe again. Shuffling MBs of data around has a negative effect on the cluster, as it consumes valuable resources such as network, CPU and RAM, and it can also lead to higher operation latency during that period.
With the 2.0 release, Hazelcast, an open-source clustering and highly scalable data distribution platform written in Java, focuses on latency and makes it easier to cache/share/operate on terabytes of data in-memory. Storing terabytes of data in-memory is not a problem, but avoiding GC pauses to achieve predictable, low latency and staying resilient to crashes are big challenges. By default, Hazelcast stores your distributed data (map entries, queue items) on the Java heap, which is subject to garbage collection. As your heap gets bigger, garbage collection can pause your application for tens of seconds, badly affecting performance and response times. Elastic Memory is Hazelcast's off-heap memory storage, designed to avoid those GC pauses. Even with terabytes of cache in-memory and lots of updates, GC will have almost no effect, resulting in more predictable latency and throughput.
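To give a feel for how this is switched on, here is a minimal sketch of enabling Elastic Memory programmatically. The property names are assumed from the 2.x configuration reference and should be checked against the documentation for your release; the map name and sizes are just placeholders.

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class ElasticMemoryConfigExample {
    public static void main(String[] args) {
        Config config = new Config();
        // Property names assumed from the 2.x docs; verify against your release.
        config.setProperty("hazelcast.elastic.memory.enabled", "true");
        config.setProperty("hazelcast.elastic.memory.total.size", "40G"); // off-heap storage per JVM
        config.setProperty("hazelcast.elastic.memory.chunk.size", "1K");  // block size, 1KB is the default

        // The JVM also needs enough direct memory, e.g. -XX:MaxDirectMemorySize=41G.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        hz.getMap("cache").put("key", "value");
    }
}
```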
The Elastic Memory implementation uses NIO DirectByteBuffers and doesn't require any defragmentation. Here is how it works: the user defines the amount of off-heap storage per JVM in GB, say 40GB. Hazelcast then creates 40 DirectByteBuffers, each with 1GB capacity. If you have, say, 100 nodes, you have a total of 4TB of off-heap storage capacity. Each buffer is divided into configurable chunks (blocks); the default chunk size is 1KB. Hazelcast keeps a queue of available (writable) blocks. A 3KB value, for example, is stored across 3 blocks. When the value is removed, those blocks are returned to the available-blocks queue so they can be reused for another value.
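The block bookkeeping can be illustrated with a toy allocator. This is not Hazelcast's internal code, just a sketch of the idea: one direct buffer carved into fixed-size blocks, with a queue of free block indices handed out on write and recycled on remove.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy illustration of the block scheme; not Hazelcast's actual implementation. */
class OffHeapBlockStore {
    private final ByteBuffer buffer;                 // one off-heap buffer (Hazelcast creates one per GB)
    private final int blockSize;                     // chunk size, e.g. 1KB
    private final Queue<Integer> freeBlocks = new ArrayDeque<Integer>();

    OffHeapBlockStore(int capacityBytes, int blockSize) {
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
        this.blockSize = blockSize;
        for (int i = 0; i < capacityBytes / blockSize; i++) {
            freeBlocks.offer(i);                     // initially every block is writable
        }
    }

    /** Stores a value across as many blocks as it needs; returns the block indices used. */
    int[] put(byte[] value) {
        int blocksNeeded = (value.length + blockSize - 1) / blockSize;  // e.g. a 3KB value -> 3 blocks
        int[] used = new int[blocksNeeded];
        for (int i = 0; i < blocksNeeded; i++) {
            Integer block = freeBlocks.poll();
            if (block == null) throw new IllegalStateException("off-heap storage full");
            used[i] = block;
            int offset = i * blockSize;
            int length = Math.min(blockSize, value.length - offset);
            buffer.position(block * blockSize);
            buffer.put(value, offset, length);
        }
        return used;
    }

    /** Removing a value simply recycles its blocks, so no defragmentation is needed. */
    void remove(int[] blocks) {
        for (int block : blocks) {
            freeBlocks.offer(block);
        }
    }
}
```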
With the new backup implementation, the data owned by a node is divided into chunks and evenly backed up by all the other nodes. In other words, every node takes equal responsibility for backing up every other node. This leads to better memory usage and less disruption to the cluster when you add or remove nodes.
To demonstrate the capabilities of Elastic Memory, the Hazelcast team ran a demo on 100 EC2 m2.4xlarge instances, using the SimpleMapTest.java that ships with the distribution. Initially the application loads the grid with a total of 500M entries, each with a 4KB value. The redundancy level is 2 by default, so there are two copies of each entry in the cluster. That makes a total of 1B entries, taking up 4TB in memory.
After loading the 500M entries, it performs 95% gets and 5% puts against random keys. Later on, an instance is terminated to show that no data is lost thanks to the backups, and that key ownership remains well balanced. The total throughput of the cluster was over 1.3M distributed operations per second.
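For readers who want the shape of that workload, here is a stripped-down sketch of a load-then-read-mostly loop. It is loosely modeled on what SimpleMapTest does, not the actual class; the entry count, value size and 95/5 split are simply the numbers quoted above, and in the real demo the load and traffic are spread across 100 nodes rather than one JVM.

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import java.util.Random;

public class ReadMostlyBenchmark {
    static final int ENTRY_COUNT = 500 * 1000 * 1000;  // 500M entries, as in the demo
    static final int VALUE_SIZE = 4 * 1024;            // 4KB values

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(new Config());
        IMap<Integer, byte[]> map = hz.getMap("default");
        byte[] value = new byte[VALUE_SIZE];
        Random random = new Random();

        // Phase 1: load the grid (the real demo spreads this over 100 nodes).
        for (int i = 0; i < ENTRY_COUNT; i++) {
            map.put(i, value);
        }

        // Phase 2: 95% gets and 5% puts against random keys.
        while (true) {
            int key = random.nextInt(ENTRY_COUNT);
            if (random.nextInt(100) < 5) {
                map.put(key, value);
            } else {
                map.get(key);
            }
        }
    }
}
```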
Reader Comments (8)
Having actually used several versions of Hazelcast 1.X over a 2 year period I can tell you I will never give them another chance.
:(
This off-heap Elastic Memory feature is not available for the open-source version of the software.
JD - Could you please specify what issues you faced?
@JD could you please point me to the issue(s) you have created?
@Andy yes Elastic Memory is not open source. But all other goodies like distributed backup and new connection layer are part of community edition as well. I should have mentioned that loudly.
I am just wondering how Elastic Memory will work if the key-value size is << 1KB? You fight external fragmentation by allocating multiple chunks per object, but it seems that internal fragmentation is still a big issue in your implementation.
::Disclaimer: I work for Hazelcast::
@JD Every good thing can be misused. Hazelcast is in production in many financial, gaming and telecom companies. Note that we also offer consulting service to make sure Hazelcast is used in the most optimized and proper way.
@Vladimir keys are not stored off the heap, only values are stored off-heap. You are right that if the value size is < 1KB you will have more memory space wasted; that is why we don't recommend Elastic Memory for tiny values. We have customers storing values averaging 30KB, for example. Perfect match, right. Also note that you can define the storage type (on-heap or off-heap) per map, so you can configure tiny objects to be stored on-heap and big ones off-heap.
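A rough sketch of that per-map configuration, assuming the 2.x MapConfig storage-type API (check the configuration reference for the release you run; the map names here are just examples):

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;

public class PerMapStorageExample {
    public static void main(String[] args) {
        Config config = new Config();

        // Tiny objects stay on the heap.
        MapConfig smallObjects = new MapConfig("small-objects");
        smallObjects.setStorageType(MapConfig.StorageType.HEAP);

        // Large values go off-heap (requires Elastic Memory to be enabled).
        MapConfig largeObjects = new MapConfig("large-objects");
        largeObjects.setStorageType(MapConfig.StorageType.OFFHEAP);

        config.addMapConfig(smallObjects);
        config.addMapConfig(largeObjects);

        Hazelcast.newHazelcastInstance(config);
    }
}
```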
Doesn't pulling large values from off-heap have a big latency penalty for deserialization?
@MDS Hazelcast keeps your values in byte[] form. In a 10-node cluster, for example, 90% of your reads will be from a remote node. Since values are kept as byte[], the remote node simply sends the byte[] over the wire to the caller node, and the value is deserialized on the caller side. When values are stored off-heap, for each read we pay the extra cost of pulling the byte[] from off-heap and copying it onto the heap; the rest is the same.