
This is a guest post by Psi Mankoski (email), Principal Engineer, Aerospike.
When a customer’s business really starts gaining traction and their web traffic ramps up in production, they know to expect increased server resource load. But what do you do when memory usage still keeps on growing beyond all expectations? Have you found a memory leak in the server? Or else is memory perhaps being lost due to fragmentation? While you may be able to throw hardware at the problem for a while, DRAM is expensive, and real machines do have finite address space. At Aerospike, we have encountered these scenarios along with our customers as they continue to press through the frontiers of high scalability.
In the summer of 2013 we faced exactly this problem: big-memory (192 GB RAM) server nodes were running out of memory and crashing again within days of being restarted. We wrote an innovative memory accounting instrumentation package, ASMalloc [13], which revealed there was no discernable memory leak. We were being bitten by fragmentation.
This article focuses specifically on the techniques we developed for combating memory fragmentation, first by understanding the problem, then by choosing the best dynamic memory allocator for the problem, and finally by strategically integrating the allocator into our database server codebase to take best advantage of the disparate life-cycles of transient and persistent data objects in a heavily multi-threaded environment. For the benefit of the community, we are sharing our findings in this article, and the relevant source code is available in the Aerospike server open source GitHub repo. [12]
Executive Summary
-
Memory fragmentation can severely limit scalability and stability by wasting precious RAM and causing server node failures.
-
Aerospike evaluated memory allocators for its in-memory database use-case and chose the open source JEMalloc dynamic memory allocator.
-
Effective allocator integration must consider memory object life-cycle and purpose.
-
Aerospike optimized memory utilization by using JEMalloc extensions to create and manage per-thread (private) and per-namespace (shared) memory arenas.
-
Using these techniques, Aerospike saw substantial reduction in fragmentation, and the production systems have been running non-stop for over 1.5 years.
Introduction
Click to read more ...