Use Google For Throughput, Amazon And Azure For Low Latency
Which cloud should you use? It may depend on what you need to do with it. What Zach Bjornson needs to do is process large amounts scientific data as fast as possible, which means reading data into memory as fast as possible. So, he made benchmark using Google's new multi-cloud PerfKitBenchmarker, to figure out which cloud was best for the job.
The results are in a very detailed article: AWS S3 vs Google Cloud vs Azure: Cloud Storage Performance. Feel free to datamine the results for more insights, but overall his conclusions are:
- Amazon and Azure provide the lowest latency, while Google provides the highest throughput, for both uploads and downloads. This means that AWS and Azure excel for smaller files, while GCE excels for larger files, and this highlights the importance of benchmarking with data that are comparable in size to what your application uses.
- The substantial limitations on AWS EC2 network throughput must be taken into consideration when designing high-speed data processing systems.
- Google's unique multi-region buckets keep costs down when working with data from multiple datacenters in the same region (e.g. continent).
- Object storage scales automatically to provide high aggregate throughput.
- Finally, note that I’m only showing data from API access (which is the exact same boto code for AWS and Google), and I have unsurprisingly observed substantial differences in performance from different clients (the vendor-specific CLIs, node.js API package, cURL’ing URLs, etc.).
All benchmark caveats apply or course. Zach discusses more about how the benchmarks were made:
I actually ran the tests quite a few times (~30 times for GCS, ~15 times for AWS, less for Azure) over the course of about four months, and in the blog I focused on the consistent aspects:
- The ranking of the medians were stable, e.g. in every repeat of the benchmarks, Google had the highest throughput and S3 and Azure always had the lowest latency.
- S3 appears to have a throughput cap at ~91 MB/s (figure 1-right where the line becomes perfectly flat).
- AWS has relatively low caps on VM network throughput, which puts an effective cap on S3 throughput as well.
For what its worth, the numerical throughput and latency values were surprisingly stable. The benchmarks take about 20 minutes to run, so there's also a fair amount of averaging going right there.
Related Articles
- On HackerNews
- Good discussion of the benchmark on Reddit and on Reddit and on Reddit.
- Network Performance
- Understanding Latency versus Throughput
- Happy New Year from Google Cloud Platform - still the price/performance leader in public cloud!
- Google is winning the cloud pricing and performance war
- A look inside Google’s Data Center Networks / ONS 2015: Wednesday Keynote - Amin Vahdat
Reader Comments (1)
These results cannot be extrapolated to compute instances. So please change the title, it's misleading.
Googles storage is, by default, cross regions. So it makes sense that initial latencies are higher. This is notably not the case with compute instances, at least not provably by this benchmark.