Entries by Fuad Malikov (2)

Tuesday, Apr 03, 2012

Hazelcast 2.0: Big Data In-Memory

As noted in the recent article "Google: Taming the Long Latency Tail - When More Machines Equals Worse Results", latency variability has a greater impact in larger clusters, where a typical request is composed of multiple distributed, parallel sub-requests. The overall response time degrades dramatically unless the latency of every sub-request is consistently low.
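To make the fan-out effect concrete, here is a minimal sketch. It assumes sub-requests are independent and that each one has the same small chance of being slow. Both the function name and the 1% figure are illustrative assumptions, not numbers taken from the article.

```python
def prob_slow_request(p_slow: float, fanout: int) -> float:
    """Chance that at least one of `fanout` independent sub-requests is slow.

    The parent request waits for all sub-requests, so a single slow
    one is enough to make the whole request slow.
    """
    return 1.0 - (1.0 - p_slow) ** fanout


if __name__ == "__main__":
    # Hypothetical: each sub-request is slow 1% of the time.
    for fanout in (1, 10, 100):
        print(fanout, round(prob_slow_request(0.01, fanout), 3))
```

Under these assumptions, a request that fans out to 100 nodes is slow roughly 63% of the time, even though each individual node is slow only 1% of the time.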

In dynamically scalable partitioned storage systems, whether a NoSQL database, a filesystem, or an in-memory data grid, changes in the cluster (adding or removing a node) can move large amounts of data across the network to re-balance the cluster. Re-balancing is needed for both the primary and the backup data on the affected nodes. If a node crashes, for example, the dead node's data has to be re-owned (become primary) by other nodes, and new backups of that data have to be taken immediately so that the cluster is fail-safe again. Shuffling megabytes of data around has a negative effect on the cluster because it consumes valuable resources such as network bandwidth, CPU, and RAM. It can also lead to higher latency for your operations during that period.
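The cost of a crash can be sketched with a toy partition table. This is not Hazelcast's actual algorithm: the round-robin placement, the partition count of 12, and the names `assign` and `crash` are all assumptions chosen for illustration. The sketch only counts how many partition copies must cross the network to restore one backup per partition.

```python
from collections import Counter

PARTITIONS = 12  # hypothetical partition count, kept small for readability


def assign(nodes):
    """Toy round-robin placement: primary on one node, backup on the next."""
    return {
        p: (nodes[p % len(nodes)], nodes[(p + 1) % len(nodes)])
        for p in range(PARTITIONS)
    }


def crash(table, dead, nodes):
    """Re-own the dead node's primaries and re-create the lost backups.

    Returns the new table and the number of partition copies that must
    be sent over the network. Requires at least two surviving nodes.
    """
    survivors = [n for n in nodes if n != dead]
    load = Counter()
    for pair in table.values():
        for n in pair:
            if n != dead:
                load[n] += 1

    new_table, copies = {}, 0
    for p, (primary, backup) in table.items():
        if dead not in (primary, backup):
            new_table[p] = (primary, backup)  # untouched by the crash
            continue
        # Promoting a surviving backup to primary is local: no data moves.
        owner = backup if primary == dead else primary
        # The new backup goes to the least-loaded other survivor.
        target = min((n for n in survivors if n != owner),
                     key=lambda n: (load[n], n))
        load[target] += 1
        new_table[p] = (owner, target)
        copies += 1  # one full partition copy crosses the network
    return new_table, copies


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    _, copies = crash(assign(nodes), "b", nodes)
    print(copies, "of", PARTITIONS, "partitions need a new backup copy")
```

In this toy setup, losing one of four nodes forces a new copy for half of the partitions: the three the node owned as primary and the three it held as backup.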


Tuesday, Apr 12, 2011

Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast

Mozilla processes terabytes of Firefox crash reports daily using HBase, Hadoop, Python, and the Thrift protocol. The project, called Socorro, is a system for collecting, processing, and displaying crash reports from clients. Today the Socorro application stores about 2.6 million crash reports per day, and during peak traffic it receives about 2,500 crashes per minute.

In this article we demonstrate a proof of concept showing how Mozilla could integrate Hazelcast into Socorro to cache and process 2TB of crash reports in memory with a 50-node Hazelcast cluster. The video for the demo is available here.
