Cost Analysis: TripAdvisor and Pinterest costs on the AWS cloud
This is a guest post by Ali Khajeh-Hosseini, Technical Lead at PlanForCloud.com.
I recently read a blog post about TripAdvisor's experiment with AWS, in which they attempted to process 700K HTTP requests per minute on a replica of their live site. There was also an interesting blog post on Pinterest's massive growth on AWS. These posts highlighted exactly the types of questions we're interested in, mainly:
- How much would it cost to deploy System X on Cloud Y? e.g. how much would it cost to host TripAdvisor on the AWS US-East cloud?
- Would it be cheaper to use deployment option X or Y? e.g. would it be cheaper to use reserved instances, different types of instances, or different cloud providers?
- What happens to costs when the system grows? e.g. Pinterest has around 410TB of data on S3; what if that keeps growing at a rate of 25% every month, as it has over the last 10 months?
1. TripAdvisor's "700K requests/minute" deployment on the AWS US-East cloud - how much would it cost?
What does the deployment look like?
- 270 x Hi-Memory XLarge (m2.xlarge) running 24 hours/day as front end servers.
- 70 x Hi-Memory XLarge (m2.xlarge) running 24 hours/day as back end servers.
- 32 x Hi-Memory XLarge (m2.xlarge) running 24 hours/day as memcache servers.
Databases
- 5 x Cluster Compute 8XLarge (cc2.8xlarge) running 24 hours/day as database servers, each having a 1TB EBS volume with 100 IOPS attached.
Storage
- 5TB of EBS with snapshots to S3 every month for backups of all databases.
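The inventory above can be turned into instance-hours, which is the quantity any hourly price gets multiplied by. The following is a minimal sketch, not the PlanForCloud model: the role names, the 365-day year, and the helper functions are my own assumptions, and hourly rates are deliberately left as caller-supplied inputs rather than hard-coded, since the post does not list them.

```python
# Fleet sizes are taken from the deployment description above.
# Assumption: a 365-day year with every instance running 24 hours/day.
HOURS_PER_YEAR = 24 * 365

FLEET = {
    "front end (m2.xlarge)": 270,
    "back end (m2.xlarge)": 70,
    "memcache (m2.xlarge)": 32,
    "database (cc2.8xlarge)": 5,
}


def instance_hours_per_year(fleet):
    """Return the yearly instance-hours for each role in the fleet."""
    return {role: count * HOURS_PER_YEAR for role, count in fleet.items()}


def requests_per_second_per_front_end(requests_per_minute, front_ends):
    """Spread a per-minute request rate evenly across the front end servers."""
    return requests_per_minute / 60 / front_ends


def yearly_compute_cost(fleet, hourly_rate_by_role):
    """Yearly on-demand compute cost in USD.

    hourly_rate_by_role must be supplied by the caller (USD per instance-hour,
    keyed by the same role names as the fleet); no real prices are assumed here.
    """
    return sum(
        count * HOURS_PER_YEAR * hourly_rate_by_role[role]
        for role, count in fleet.items()
    )


if __name__ == "__main__":
    for role, hours in instance_hours_per_year(FLEET).items():
        print(f"{role}: {hours:,} instance-hours/year")
    load = requests_per_second_per_front_end(700_000, FLEET["front end (m2.xlarge)"])
    print(f"Load per front end: {load:.1f} req/s")
```

Keeping the rates outside the fleet definition makes it easy to re-run the same inventory against a different price list, which is the kind of "option X or Y" comparison the post is about. EBS, S3 snapshots, and data transfer are not covered by this sketch.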
Data Transfers
2. How much would be saved if TripAdvisor's deployment used AWS Reserved Instances?
3. What would happen to Pinterest's storage cost if they keep growing at a rate of 25% every month?
Pinterest has around 410TB of user data on AWS S3, which costs around $39K per month on S3 Standard Storage. If the amount of data grows by 25% every month, as it has for the last year, Pinterest would have to pay around $470,000/month one year from now if they keep using S3 Standard Storage. The same amount of user data would cost around $319,000/month if they use S3 Reduced Redundancy storage.
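The growth figure is plain compounding, and it is worth seeing how large the data set gets. The sketch below uses only the numbers quoted above (410TB, $39K/month, 25% monthly growth); the function name and the flat-rate comparison are my own illustration, not the pricing model behind the $470,000 figure.

```python
def project_storage_tb(start_tb, monthly_growth, months):
    """Compound a starting volume by a fixed monthly growth rate."""
    return start_tb * (1 + monthly_growth) ** months


start_tb = 410          # current data volume, from the post
start_cost = 39_000     # current monthly S3 Standard bill in USD, from the post
growth = 0.25           # 25% per month

tb_in_a_year = project_storage_tb(start_tb, growth, 12)
growth_factor = tb_in_a_year / start_tb

# Naive estimate: assume the price per TB never changes as volume grows.
flat_rate_cost = start_cost * growth_factor

if __name__ == "__main__":
    print(f"Data after 12 months: {tb_in_a_year:,.0f} TB ({growth_factor:.2f}x)")
    print(f"Bill at today's blended rate: ${flat_rate_cost:,.0f}/month")
```

Twelve months at 25% multiplies the data by about 14.55, to roughly 5,966TB. Scaling today's bill by the same factor gives about $567K/month, noticeably above the $470,000 quoted above. The gap is presumably explained by cheaper per-GB pricing at higher volume tiers, although that is my reading rather than something the post states: the quoted figures work out to roughly $95 per TB today versus roughly $79 per TB a year out.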
The payoffs in these cases are as follows:
- For TripAdvisor (and any company that is thinking of running long-term projects/sites on AWS), they should take reserved instances seriously. For example, the TripAdvisor 700K system would save around $800K/year if it used reserved instances instead of on-demand (I know this was a short-lived experimental system, but I'm just using it as an example). This requires no change to the architecture or the system itself; it's simply a matter of doing some forecasting to ensure you can handle the upfront charges in your cashflow. Companies can also buy their reserved instances in stages to stagger the upfront costs.
- For Pinterest (and any company that is looking to grow their systems on the cloud), they should use growth patterns to do "what-if" style analysis to a) understand the costs of elasticity and b) compare deployment options. For example, Rackspace released their cloud block storage services; their SSD option is similar to AWS EBS Provisioned IOPS but it's much more expensive.
Either way, like with any other purchasing decision, it's good to compare your options and have a rough idea of your monthly bills.
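The reserved-versus-on-demand trade-off mentioned above comes down to one break-even calculation: an upfront fee buys a lower hourly rate, so the reservation pays off once the instance has run for enough hours. Here is a minimal sketch of that arithmetic. The function names are mine, and the rates in the usage example are hypothetical placeholders, not AWS prices.

```python
def on_demand_cost(hourly_rate, hours):
    """Cost of running one instance on demand for the given hours."""
    return hourly_rate * hours


def reserved_cost(upfront_fee, hourly_rate, hours):
    """Cost of one reserved instance: upfront fee plus discounted hourly usage."""
    return upfront_fee + hourly_rate * hours


def break_even_hours(upfront_fee, on_demand_rate, reserved_rate):
    """Hours of use after which the reserved instance becomes cheaper."""
    if on_demand_rate <= reserved_rate:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront_fee / (on_demand_rate - reserved_rate)


if __name__ == "__main__":
    # Hypothetical placeholder prices for illustration only.
    upfront, od_rate, ri_rate = 1000.0, 0.50, 0.25
    hours = break_even_hours(upfront, od_rate, ri_rate)
    print(f"Break-even after {hours:,.0f} hours")
    year = 24 * 365
    saving = on_demand_cost(od_rate, year) - reserved_cost(upfront, ri_rate, year)
    print(f"Saving over a full year of 24x7 use: ${saving:,.2f}")
```

With real prices plugged in, an instance that runs 24 hours/day passes the break-even point well within the reservation term if the discount is meaningful, which is why always-on fleets like the one analysed here are the obvious candidates for reservations.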
Reader Comments (7)
That is a lot of kit for that number of requests, no? Like 40 req/s per frontend instance?
Have you factored in the additional costs associated with performance related issues on AWS, both due to instability and unpredictability?
Excellent post!! Lots of food for thought.
"For example, Rackspace released their cloud block storage services; their SSD option is similar to AWS EBS Provisioned IOPS but it's much more expensive."
I'm not so sure.
The Provisioned IOPS EBS volumes are non-local and not SSD, and the two offerings are subtly different. The AWS offering appears to target reduced network latency, whereas Rackspace is focused on disk I/O.
Also, Rackspace charges more for their offering but I/O is free, whereas AWS is initially cheaper but pricing is based on utilization.
Basically, I think that it's very, very hard to compare apples and oranges here. I do believe that the Rackspace option is less virtualized, hence less interesting but also probably offers more performance consistency, and I DEFINITELY recommend lots of perf testing on both if your application is highly I/O sensitive.
I haven't compared numbers yet either, but my feeling is that the Rackspace offering is probably faster out of the box and cheaper on high-utilization apps, but OTOH striping across several EBS (fairly simple with an md device) could be a very interesting comparison indeed.
Excellent article!!
@Matt, that does indeed seem like a lot of kit now that you mention it. Those boxes should get at least 500 req/s and probably more like quadruple that, depending on how horrible your code is, to say nothing of awesome performance on static media (which is probably CDN'ed out.)
A little effort in speeding up each box's response can reap HUGE cost benefits, and the cost savings get even higher as the network gets larger.
@anonymous. ouch. :)
@matt and @jamieson 700K requests/min = 11,666 req/sec
@anonymous, this analysis did not include potential performance variations; it's difficult to include them in such a cost analysis upfront without running a proof-of-concept system (which might be why TripAdvisor decided to run the POC). The problem with standard performance benchmarks is that they don't tell you much about your overall application performance, e.g. just because the CPU on instance type X on Cloud Y was better doesn't mean that your application is going to benefit.
@Jamieson, thanks! We're in the process of doing more analysis on Rackspace's new storage offerings (will publish it on http://blog.planforcloud.com when it's ready), but as you hinted, they do claim better performance, SLAs and support.
The TripAdvisor case is interesting - but I think it's missing a dimension. Specifically - what's the use case? If it's a production site, then yes - looking at it from the outside, certainly reserved instances would help. However, there may be other, more powerful ways to reduce an organization's cloud costs (without upfront investment):
1. If an organization also plans to use the cloud for development and test, and wants to replicate their production environment, a VM consolidation technology will substantially reduce cost. Especially since typical test/dev workloads have low utilization, one can achieve high consolidation ratios - and thereby save substantially.
2. If an organization deploys their application with the help of a cloud neutral infrastructure, switching to a different cloud would be easy - and the organization could benefit from differential cloud rates.
Thanks,
Navin
--
Navin R. Thadani
@ravellosystems
@navinthadani