Entries from February 24, 2019 - March 2, 2019

Friday
Mar 01, 2019

Stuff The Internet Says On Scalability For March 1st, 2019

Wake up! It's HighScalability time:

 

10 years of AWS architecture: increasing simplicity or increasing complexity? (Michael Wittig)

 

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. Know anyone who needs cloud? I wrote Explain the Cloud Like I'm 10 just for them. It has 39 mostly 5-star reviews. They'll learn a lot and love you forever.

 

  • 1.3 billion: npm package downloads per day; 20: honeybee communication signals used to coordinate thousands of workers; 71: average global life expectancy; 120K: max inflight SQS messages; 80%: shared code between iOS, Android, the web; 1 TB: microSD card; 20%: increase in value of wind energy using ML; 64%: respondents cite optimizing cloud spend as the top initiative; 250: drones augmenting small military units; 35,880: record robots shipped to North American companies; 50K: aerial photos of the UK; 119%: increase in demand for AI talent; 18TB: MAMR hard drive; $20 million: Pinterest paid more than expected for AWS; 100,000: MySQL connections; 19%: all requests come from Bots, APIs, and search engine crawlers

  • Quotable Quotes:
    • @evazhengll: A surgeon in #China performed world’s 1st remote operation using '#5G Surgery' on animal, removing its liver, through controlling robotic arms in a location 30 miles away. It was made possible by using a low latency of 0.1 seconds, the lower the latency, the more responsive the robot
    • @AWSonAir: .@McDonalds uses Amazon ECS to scale to support 20,000 orders per second. #AWSSummit
    • @antoniogm: Know why the European startup scene sucks? Because American startups have a huge, high-GDP, early-adopter market from day one, and they internationalize AFTER scaling. Euros have to internationalize IN ORDER TO scale, and most die in the process. GDPR makes this *worse*.
    • Ivan Ivanitskiy: Even though blockchain does not allow for modification of data, it cannot ensure such data is correct.
    • @kelseyhightower: Kubernetes is for people building platforms. If you are a developer building your own platform (AppEngine, Cloud Foundry, or Heroku clone), then Kubernetes is for you.
    • @adrianco: I think the main thing cloud native apps do that datacenter apps don’t do is scale elastically (even down to zero in some cases) and maintain high utilization, so you stop paying when you stop using the resource.
    • @kellabyte: Also almost every mention of SEDA is incorrect IMO. If you read the paper the goal of the paper was to dynamically adjust CPU resources by *CHAINED* queues where thread pools can move threads between stages so that stages who needed more compute time got more threads.
    • So much more...
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Feb 27, 2019

Give Meaning to 100 Billion Events a Day — The Shift to Redshift

This is a guest post by Alban Perillat-Merceroz, from the Analytics team at Teads.

In part one, we described our Analytics data ingestion pipeline, with BigQuery sitting as our data warehouse. However, having our analytics events in BigQuery is not enough. Most importantly, data needs to be served to our end-users.

TL;DR — Teads Analytics big picture

In this article, we will detail:

  • Why we chose Redshift to store our data marts,
  • How it fits into our serving layer,
  • Key learnings and optimization tips to make the most out of it,
  • Orchestration workflows,
  • How our data visualization apps (Chartio, web apps) benefit from this data.

Data is in BigQuery, now what?

Click to read more ...

Monday
Feb 25, 2019

Design Of A Modern Cache—Part Deux

This is a guest post by Benjamin Manes, who did engineery things for Google and is now doing engineery things as CTO of Vector.

The previous article described the caching algorithms used by Caffeine, in particular the eviction and concurrency models. Since then we’ve made improvements to the eviction algorithm and explored a new approach towards expiration.

Eviction Policy

Window TinyLFU (W-TinyLFU) splits the policy into three parts: an admission window, a frequency filter, and the main region. By using a compact popularity sketch, the historic frequencies are cheap to retain and look up. This allows for quickly discarding new arrivals that are unlikely to be used again, guarding the main region from cache pollution. The admission window provides a small region for recency bursts to avoid consecutive misses when an item is building up its popularity.
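
To make the three parts concrete, here is a minimal Python sketch of the idea. It is not Caffeine's implementation: the class names, the `window_fraction` parameter and its default, and the plain-LRU main region are all assumptions made for illustration, and the sketch leaves out refinements such as counter aging. It only shows how a small admission window, a count-min style frequency filter, and a main region might fit together.

```python
from collections import OrderedDict


class FrequencySketch:
    """Count-min style popularity sketch (illustrative; counters never age here)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        # One counter position per row, derived from a per-row seed.
        return [hash((seed, key)) % self.width for seed in range(len(self.rows))]

    def increment(self, key):
        for row, i in zip(self.rows, self._indexes(key)):
            row[i] += 1

    def estimate(self, key):
        # The minimum across rows limits the damage done by hash collisions.
        return min(row[i] for row, i in zip(self.rows, self._indexes(key)))


class WTinyLFUCache:
    """Hypothetical, simplified W-TinyLFU: LRU window + frequency filter + LRU main."""

    def __init__(self, capacity, window_fraction=0.01):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        # window_fraction is an assumed knob; the window is kept small.
        self.window_size = max(1, int(capacity * window_fraction))
        self.main_size = capacity - self.window_size
        self.window = OrderedDict()  # admission window, LRU order
        self.main = OrderedDict()    # main region, plain LRU in this sketch
        self.sketch = FrequencySketch()

    def get(self, key):
        self.sketch.increment(key)  # record popularity even on a miss
        for region in (self.window, self.main):
            if key in region:
                region.move_to_end(key)
                return region[key]
        return None

    def put(self, key, value):
        self.sketch.increment(key)
        if key in self.main:
            self.main[key] = value
            self.main.move_to_end(key)
            return
        # New arrivals always enter the admission window first.
        self.window[key] = value
        self.window.move_to_end(key)
        if len(self.window) > self.window_size:
            candidate, cand_value = self.window.popitem(last=False)
            self._admit(candidate, cand_value)

    def _admit(self, candidate, value):
        if len(self.main) < self.main_size:
            self.main[candidate] = value
            return
        victim = next(iter(self.main))  # least recently used main entry
        # Frequency filter: admit only if the candidate looks more popular.
        if self.sketch.estimate(candidate) > self.sketch.estimate(victim):
            del self.main[victim]
            self.main[candidate] = value
        # Otherwise the candidate is dropped and the main region is untouched.
```

With this sketch, a long run of one-off keys passes through the window and is rejected by the filter, so frequently read entries in the main region survive the scan. A key that has been requested often enough before it is inserted can still displace a colder main-region entry.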

 

 

This structure works surprisingly well for many important workloads like database, search, and analytics. These cases are frequency-biased, where a small admission window is desirable to filter aggressively...

Click to read more ...