Strategy: Using Lots of RAM Often Cheaper than Using a Hadoop Cluster
Wednesday, April 24, 2013 at 9:25AM
Solving problems while saving money is always a win. In Nobody ever got fired for using Hadoop on a cluster, the authors give some counter-intuitive advice, showing that a big-memory server may provide better performance per dollar than a cluster:
- For jobs where the input data is multi-terabyte or larger, a Hadoop cluster is the right solution.
- For smaller problems, memory has reached a GB/$ ratio where it is technically and financially feasible to use a single server with 100s of GB of DRAM rather than a cluster. Given that the majority of analytics jobs do not process huge data sets, a cluster doesn't need to be your first option. Scaling up RAM saves programmer time, reduces programmer effort, improves accuracy, and reduces hardware costs.
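The GB/$ argument above can be sketched as a back-of-envelope comparison. The prices and sizes below are illustrative assumptions for the sake of the arithmetic, not figures from the paper:

```python
# Rough cost comparison: one big-memory server vs. a small Hadoop
# cluster, for an analytics job whose working set fits in RAM.
# All prices are placeholder assumptions, not figures from the paper.

def server_cost(ram_gb, price_per_gb=10.0, base_price=3000.0):
    """Approximate cost of one server: chassis/CPU plus DRAM."""
    return base_price + ram_gb * price_per_gb

def cluster_cost(nodes, node_price=2500.0):
    """Approximate cost of a cluster of commodity nodes."""
    return nodes * node_price

dataset_gb = 200  # most analytics jobs are well under a terabyte

big_server = server_cost(ram_gb=512)  # data fits entirely in DRAM
cluster = cluster_cost(nodes=16)      # a typical small Hadoop cluster

print(f"512 GB server:  ${big_server:,.0f}")
print(f"16-node cluster: ${cluster:,.0f}")
print("single server is cheaper" if big_server < cluster
      else "cluster is cheaper")
```

Under these assumed numbers the single server comes out well ahead, and it also avoids the network shuffles and job-coordination overhead that a distributed job pays even when it fits on one machine.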
Reader Comments (1)
Interesting perspective when it comes to watching your spend. I would ask: what happens when that server fails? I'm not sure that using a single server allows you to fail gracefully.