Stuff The Internet Says On Scalability For March 17th, 2021
Hey, HighScalability is here again!
Reverse engineering an ancient analog computer is a detective story worth reading. A Model of the Cosmos in the ancient Greek Antikythera Mechanism.
Do you love this Stuff? Without your encouragement on Patreon this Stuff won't stuffin’ stuff.
Know someone who needs to know the cloud? I wrote Explain the Cloud Like I'm 10 just for them. On Amazon it has 262 mostly 5 star reviews. Here's a review that is not on the block chain:
Number Stuff:
- 100+ trillion: objects in S3 with peaks of tens of millions of requests per second. Did you know AWS has a VP of Block and Object Storage?
- 2022: first trillion-dollar year for eCommerce. 2020 saw a 42% growth over 2019.
- 5 meters: the Perseverance Rover’s landing distance from the actual target location. All thanks to its Terrain Relative Navigation (TRN) system which provides a map-relative position fix that can be used to accurately target specific landing points on the surface and avoid hazards.
- 90 milligrams: Newton’s law of gravity still holds.
- $7k: in extra costs because AWS has this horrible lifecycle rule approach for changing storage class. Then there’s Citibank, which just got a $500 million lesson in the importance of UI design.
- ~10%: of the top 10,000 websites track you using CNAME tracking, cloaking, and subdomain collusion.
- $100: ad revenue earned for 1 million page views.
- 85 million: items for sale on Etsy
- 70%: cut in GTA Online loading time by reverse engineering a binary and improving a broken JSON parser. Bravo!
- $120 billion: spent on Shopify’s platform in 2020. Double 2019. 40% of Amazon.
- 30%: chip demand above supply. 20% year-over-year growth. Also Living Through Chipageddon.
- 7 trillion: Kafka events per day processed at LinkedIn. 1 exabyte of data.
- 2: fighter pilots saved by Automatic Ground Collision Avoidance System software.
- 37 billion: data records leaked in 2020. Growth of 140% YoY.
- 7: top companies in the S&P 500 index are all tech companies.
- 1,041.66: certs generated every minute by Let’s Encrypt.
- $5.4 billion: startup funding for 29 companies in the EVs, quantum software, photonics, and 5G/6G.
- 30,000+: Google certificates were found by Google to be forged by Symantec CAs.
- 14.1%: US Internet subscribers now consume over 1TB of data per month, up 94 percent from 2019.
- $1 trillion+: worldwide cost of cybercrime. ~1% of global GDP. Up by more than 50% compared to 2018.
- 10: particles of space dust land on each square meter of Earth’s surface every year.
- 10,000%+: AWS markup on bandwidth costs.
- 35%: increase in the sale of physical books after digitization.
- 250 terabits per second: Google’s subsea cable is ready and pumping bits across the Atlantic Ocean using a 12 fiber pair space-division multiplexing (SDM) design.
- 15 minutes: time to compile 1 billion lines of C++ code using a 64 core AMD 3990X Threadripper. The machine cost $3989.
- 196%: more strawberries by weight on average produced by data scientists compared with traditional farmers. Increased ROI 75.5%
- 0.93%: Annualized Failure Rate for Backblaze’s drives in 2020, down 50%.
- 256 million: builds run by GitLab CI in December.
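Some of the rates above are easier to feel after converting units. A minimal sketch, assuming the figures are simple averages over a full day (the helper names are made up for illustration, and real traffic will peak well above the average):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
MINUTES_PER_DAY = 24 * 60       # 1,440


def per_second(per_day: float) -> float:
    """Convert a per-day count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


def per_day_from_minute(per_minute: float) -> float:
    """Convert a per-minute rate into a per-day count."""
    return per_minute * MINUTES_PER_DAY


# 7 trillion Kafka events per day at LinkedIn: roughly 81 million per second.
kafka_events_per_second = per_second(7e12)

# 1,041.66 certs per minute at Let's Encrypt: roughly 1.5 million per day.
certs_per_day = per_day_from_minute(1041.66)

print(f"{kafka_events_per_second:,.0f} Kafka events/s on average")
print(f"{certs_per_day:,.0f} certs/day")
```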
Quotable Stuff:
- @tommchenry: Star Trek story where they invent the replicator, then scramble to invent new tech to add to the replicator to ensure it chokes the infinite plenty with artificial self-imposed scarcity so the social relations of political and economic power aren’t threatened.
- REvil’s Unknown: Yes, as a weapon [ransomware] can be very destructive. Well, I know at the very least that several affiliates have access to a ballistic missile launch system, one to a U.S. Navy cruiser, a third to a nuclear power plant, and a fourth to a weapons factory. It is quite feasible to start a war. But it’s not worth it—the consequences are not profitable.
- @vzverovich: 500,000 lines of code to control the Mars rover entry, descent and landing. Almost as big as std::tuple implementation!
- @axboe: Personal goal achieved: IOPS=3004074, IOS/call=31/31, inflight=128 (128) which is > 3M IOPS on a single CPU core, using io_uring in polled mode and an Optane2 device. 512b random reads. I like moving goal posts, so next milestone is 4M per core.
- Karen Hao: I pressed him [Joaquin Quiñonero Candela, a director of AI at Facebook] one more time. Certainly he couldn’t believe that algorithms had done absolutely nothing to change the nature of these issues, I said. “I don’t know,” he said with a halting stutter. Then he repeated, with more conviction: “That’s my honest answer. Honest to God. I don’t know.”
- @jamesurquhart: [With the new API destinations feature of AWS EventBridge] AWS has basically coopted the entire HTTP API ecosystem into its event-driven applications story. It's a brilliant move.
- @TrungTPhan: State of Cloud Wars • AWS has plateaued at 33% market share for past few years • Microsoft has doubled share (10-20%) in the last 4 years • Google (9%) and Alibaba (6%) follow while IBM (never mind)
- peter_d_sherman: One of the most interesting things which will happen in the near future in computing is the adoption of serial memory interfaces. For pretty much the entire history of modern computing, RAM has been attached to a system via a high-speed parallel interface. Making parallel interfaces fast is hard and requires extremely rigorous control of timing skew between pins, therefore the routing of PCB traces between a CPU and RAM slots must be done with great precision. At the speeds of modern parallel RAM interfaces like DDR4, what is theoretically a digital interface in practice must be viewed as practically analog (to the point that part of a CPU DDR4 controller is called the “PHY”). Moreover, the maximum distance between a CPU and its RAM slots is extremely tight. The positioning of RAM slots on a motherboard is largely constrained by these physics considerations. For these reasons you have never seen anything like the flexibility with RAM attachment that you can get with, for example, PCIe or SAS. PCIe and SAS are serial architectures which support cabling and even switching, allowing entire additional chassis of PCIe and SAS devices to be attached to a system via cables.
- jxf: The main advantage of Kubernetes isn't really Kubernetes anymore — it's the ecosystem of stuff around it and the innumerable vendors you can pay to make your problems someone else's problems.
- Roger Lee: Subscriptions are more important than box office.
- @chrismunns: “Buttt Daaaaaaddddd, I wanna build my own traffic routerrrrrr.” “Sweetie, when you are 30% of the internet’s traffic you can do that. Until then loadbalancers and DNS are fine for you and your other friends” “Ugh I have no fun” “You can have fun with loadbalancers”
- @mbushong: A little surprised at people dunking on OVH. Stuff happens, and I’d think a little empathy is a stronger starting position than some of what I have seen. Reasoned discussions about availability and architecture are obviously positive outcomes. But someone is having a bad day :(
- @jesterxl: 1. not having to do a TON of work & maintenance, easier to build things 2a. designing back-ends not in a monolith, adopting functional programming and allowing AWS to own state & side effects 2b. S3's (used to) eventual consistency 2c: active/active architectures - no AMI refreshes - no Docker security finding updates - no rehydrations - no log configuration (Splunk & ELK, wtf is fluentd?) The above seems trite, but dude, MONTHS of slow, boring, pain
- @chantastic: My 85 year old grandmother passed today. This is the woman who taught me how to program BASIC on an ATARI. She helped us get our first computer (Intel 386) and told me to "break it". I did and she was the first human to debug my mistakes. I am because she was
- @haysstanford: Ask a programmer to review 20 lines of code, they'll find 7 issues. Ask them to review 500 lines & they'll find 0 issues.
- qmarchi: Some interesting facts to know for those who don't dig into it. Walmart: - has 80+ internal apps, mostly variants but still unique - runs k8s inside of Distribution Centers - maintains a fleet of >180k mobile devices in the US alone - has a half-dozen data centers in the US - has most International infrastructure separate from US Stores'
- tothrowaway: I'm at OVH as well (in the BHS datacenter, fortunately). I run my entire production system on one beefy machine. The apps and database are replicated to a backup machine hosted with Hetzner (in their Germany datacenter). I also run a tiny VM at OVH which proxies all traffic to Hetzner. I use a failover IP to point at the big rig at OVH. If the main machine fails, I move the failover IP to the VM, which sends all traffic to Hetzner.
- ev1: It is kind of interesting that on the US side everyone is in disbelief, or like "why not use AWS" - while most of the European market knows of OVH, Hetzner, etc. My own reason for using OVH? It's affordable and I would not have gotten many projects (and the gaming community I help out with) off the ground otherwise. I can rent bare metal with NVMe, and several terabytes of RAM for less than my daily wage for the whole month, and not worry about per-GB billing or attacks. In the gaming world you generally do not ever want to use usage based billing - made the mistake of using Cloudfront and S3 once and banned script kiddies would wget-loop the largest possible file from the most expensive region botnet repeatedly in a money-DoS. I legitimately wouldn't have been able to do my "for-fun-and-learning" side projects (no funding, no accelerator credits, ...) without someone like them. The equivalent of a digitalocean $1000/m VM is about $100 on OVH.
- @_KarenHao: Reporting this thoroughly convinced me that self-regulation does not, cannot work. Facebook has only ever moved on issues because or in anticipation of external regulation. If we have any hope of fixing FB's problems, we can no longer afford to wait for it to do so itself.
- @brintly: Limited sample set, I have worked with a few that were going Azure because either the fear of Amazon or Wal-Mart was one of their biggest customers and they felt pressure to move away from AWS. I’ve worked with more retailers that are on/going to AWS than other platforms.
- @colmmacc: Also: it's striking that VC relies on pitch decks. I can never decide if this is either A) terrible, and the industry is fundamentally lazy or B) genius, because it focuses almost exclusively on founders' ability to sell and that's all that matters.
- Stef Shrader: over twice as many of the drivers who weren't experienced with Level 2 automation but used it for the test didn't remember the bear at all compared to either of the other groups in the test.
- Orbital Index: “This little demonstration EST [Enormous Space Telescopes], with a total mass less than 20 kg, including optics that would be positioned along or suspended from the tether at the parabola focal point, would have four times the light gathering capacity of Webb (about thirty times that of Hubble), while costing on the order of 1/1,000th as much.”
- Jeff Lawson: By 2013 [at Twilio], because of the growth of the codebase and the complexity of the tests and builds, the process was sometimes taking as long as 12 hours! Not only that, but the build would actually fail a substantial number of times — at worst, up to 50 percent of the time — and the developer would have to start over again. We regularly lost days of productivity just getting code out. This was the opposite of moving fast. Writing the code wasn’t the hard part. Wrangling our antiquated systems was. Talk about a self-inflicted wound. As a result, our best engineers started quitting, frustrated at the inability to do their jobs. At first it was a few, and before we knew it, nearly half of our engineers had quit. Half!
- @Foone: FUN FACT: the original RFC defining how domain names work doesn't use COM/ORG/NET as the first example. It uses COLORS, FLAVORS, and TRUTH. So your domains are GREEN.COLORS, TRUTH., and CHOCOLATE.NATURAL.FLAVORS
- @QuinnyPig: This tells us something about their architecture. ~500GB transferred out, ~620GB transferred in, and ~10TB transferred between AZs (it gets billed twice). Why are they taking data in, then moving it back and forth this much? Kafka, Cassandra, or Kubernetes is likely...Found the bastard! EKS without Fargate costs indicates that they're running Kubernetes. That'd speak to the data transfer cross-AZ charges. Does the newest version respect topology yet? This one doesn't.
- ed25519FUUU: To put it in perspective, a single bitcoin transaction can power the average american's household electricity needs for... an entire month.
- AgentK20: I'm the CTO of a moderately sized gaming community, Hypixel Minecraft, who operates about 700 rented dedicated machines to service 70k-100k concurrent players. We push about 4PB/mo in egress bandwidth, something along the lines of 32gbps 95th-percentile. The big cloud providers have repeatedly quoted us an order of magnitude more than our entire fleet's cost....JUST in bandwidth costs. Even if we bring our own ISPs and cross-connect to just use cloud's compute capacity, they still charge stupid high costs to egress to our carriers. Even if bandwidth were completely free, at any timescale above 1-2 years purchasing your own hardware, LTO-ing, or even just renting will be cheaper. Cloud is great if your workload is variable and erratic and you're unable to reasonably commit to year+ terms, or if your team is so small that you don't have the resources to manage infrastructure yourself, but at a team size of >10 your sysadmins running on bare metal will pay their own salaries in cloud savings.
- spenczar5: Cool! Some thoughts from a former Twitch engineer: - Probably the hardest part of running these things is managing outbound bandwidth costs. You'll need to either limit inbound bitrate or transcode video down to a manageable rate, or else you'll quickly spend a lot of money on shipping 4k video for people. - Right now, your nginx hosts both do ingest and playback, if I understand it right. You might want to separate the two. It makes maintenance easier, and it lets you scale much better - right now, a single stream probably maxes out on viewership based on the CPU capacity of the single nginx host that is ingesting the stream, transcoding it, and delivering it. If you have multiple nginx hosts which could deliver the already-transcoded stream, you could scale much better.
- @rafrasenberg: My current AWS full-stack set: ⚒️ Backend: TypeScript GraphQL AppSync DynamoDB Lambda AWS CDK ⚒️ Front-end: TypeScript Svelte S3 API Gateway Cloudfront ⚒️ Additional: Cognito Amplify
- fmajid: IIRC at one point Facebook was a 1GB+ executable transpiled from PHP to C++ using HipHop, and that certainly fits any reasonable definition of a monolith, so yes, monoliths can be scaled to an absurd degree.
- Joe Jacobsen: none of us were briefed on the original design and most aspects were delegated to just a small number of Boeing … engineers for approval.
- Tim Wu: Recall that telephone technology was at the time both primitive and a luxury. For that reason, it is possible that Western Union thought it wasn’t such a big deal to let Bell establish a phone service, imagining it was simply letting Bell run a complementary but unrelated monopoly.
- @tmclaughbos: This morning had a brief slack exchange with an engineer about my troubles with split stack AWS APIG. They said, “It was easier when we just added routes in our spring boot app.” And there I realized as ops and dev people we’re coming to serverless with very different experiences. My immediate reaction was, “Hold on! I don’t miss the days of orchestrating load balancers, nginx proxies, and not to mention making deploying your application reliably into something you don’t think about.” And later I realized they never had to think about that before. Coming from an ops and infrastructure background I both see and love the progress serverless brings for infrastructure management and the problems I don’t have to solve. But many developers are being exposed to this layer of their application for the first time.
- Edward Teller: I have no hope of clearing my conscience. The things we are working on are so terrible that no amount of protesting or fiddling with politics will save our souls.
- @adrianco: Netflix was using commercial CDNs and outgrew them. There used to be spare CDN capacity in the evenings, but when Netflix peak exceeded daytime peak, they had to deploy their own system, and could get deeper into ISPs with open connect devices.
- @iann0036: Just got mind blown by @QuinnyPig on Twitch with the mention of a technique of spinning up a separate account to buy RIs / Savings Plans as they apply to all accounts but the support cost % is per account Exploding head
- Forrest Brazeal: Containers are about repackaging the past. Serverless is about reimagining the future.
- Neil C. Thompson: We do mean that the economic cycle that has led to the usage of a common computing platform, underpinned by rapidly improving universal processors, is giving way to a fragmentary cycle, where economics push users toward divergent computing platforms driven by special purpose processors. This fragmentation means that parts of computing will progress at different rates. This will be fine for applications that move in the 'fast lane,' where improvements continue to be rapid, but bad for applications that no longer get to benefit from field-leaders pushing computing forward, and are thus consigned to a 'slow lane' of computing improvements. This transition may also slow the overall pace of computer improvement, jeopardizing this important source of economic prosperity.
- _fat_santa: This might be an unpopular opinion around here but I believe Leetcode is a terrible metric for assessing developer performance[...]As a programmer, your performance is based upon what you can build. You're never going to have a manager or customer go "well this looks good, but I noticed that your link list function is 3 lines vs 1 line". Leetcode at the end of the day is semantics of programming, no one besides the most hardcore programmers give a damn about it. Sure you can reverse a linked list in a one line of code, but can you build a piece of working functioning software?
- mfer: There are numerous schools of thought on programming. For example, I've seen those who are interested in leetcode, algos, and the like. I remember this one time where management wanted to have JS/front-end devs answer questions about C and b-trees. They couldn't find anyone to make it through the whole interview process. The problem was that people who could handle the C and b-trees couldn't cut it at the JS questions that came later. The JS devs never got past the C/b-tree questions. There is a culture of elite knowledge and a club around that. Some are into the school people have degrees from and that kind of thing. There is another side of it that's about the ability to use code to problem solve. I remember meeting this senior engineer that customers used to constantly request by name. He was one of the most senior levels at the company. I later learned that he had no degree. He had a ton of hands on knowledge and understood the technology from years of working. He learned it like a skilled trade and he was valuable to everyone involved.
- Geoff Huston: This is not a new problem by any means. It appears that the common theme of the Internet’s growth over the past thirty years has been one where the capabilities of the infrastructure is the limiting factor, while the underlying dynamics of demand continue to completely outpace the delivery capacity of these platforms. There is no reason to suspect that this will change anytime soon.
- Geoff Huston: A more abstract view of the dilemma in security by design was provided by Russ White in his presentation on security by design. As Bruce Schneier pointed out: “The Internet and all the systems we build today are getting more complex at a rate that is faster than we are capable of matching. Security in reality is actually improving, but the target is constantly shifting. As complexity grows, we are losing ground.” The consequent question is: “Can we contain the complexity of these systems?” Russ’ answer is not all that encouraging: “Reducing complexity locally almost always leads to increasing complexity globally. Optimizing locally almost always leads to decreasing optimisation globally.” Oh dear!
- Exxact: For CS nerds, it’s [DeepMind’s AlphaFold & the Protein Folding Problem] like trying to find a minimum description length with O(n) complexity for an O(n^3) algorithm, as compared to going from O(n^3) to O(n^2).
- Guy Meynants: The cameras on Perseverance have three main improvements over those that flew on Curiosity, say the JPL scientists in the Space Science Reviews paper. Firstly, the Cmosis CMV20000 sensors are colour chips, which gives better contextual imaging capabilities than the monochrome predecessors. The second improvement is that the cameras have a wider field of view – 90° x 70° as opposed to 45° x 45° – which means only five overlapping images are needed to create a 360° panoramic view (Curiosity needed 10 images to achieve the same effect). The third improvement is that the 20-megapixel sensors can resolve greater detail than the older model.
- Paul Ratner: Qin developed this algorithm to predict the orbits of planets in the solar system, training it on data of Mercury, Venus, Earth, Mars, Ceres, and Jupiter orbits. The data is "similar to what Kepler inherited from Tycho Brahe in 1601," as Qin writes in his newly-published paper on the subject. From this data, a "serving algorithm" can correctly predict other planetary orbits in the solar system, including parabolic and hyperbolic escaping orbits. What's remarkable, it can do so without having to be told about Newton's laws of motion and universal gravitation. It can figure those laws out for itself from the numbers.
- Unfortunately, I’m unable to discover where I found this related quote: You can imagine a universe where Newton's laws are never discovered because the human need for abstraction isn't there.
- Daesol: Life is an accumulation of bets we’ve made (as well as bets made by our parents, and the society we grew up in etc). The future is also just a series of bets. A non-ergodic path-dependent sequence...So it’s worth putting in more conscious effort into assessing the favourability of a bet. Especially for higher stakes.
- Chip Overclock: As you might expect, using the guaranteed delivery Transmission Control Protocol (TCP) instead of UDP solves this problem. And indeed, my initial implementation years ago used TCP. But what I found during these kinds of real-time visualizations, especially in locations with spotty cellular reception, is that the display could lag many seconds behind real-time as the TCP/IP stack retransmitted lost packets. And once behind, it never caught up. In fact, it sometimes got progressively worse. It was far better to have the map pointer jump forward than to permanently and increasingly lag behind what was happening in meat-space.
- Pat Helland: Unlike the olden days, we continue to get more and more memory, persistent storage (SSDs and NVMe), network bandwidth, and CPU. Hard disk drive (HDD) storage capacity continues to increase but it is getting colder (e.g., less access bandwidth relative to its capacity)...With all these wonderful improvements, the amount of time waiting to get to something else has become the bottleneck. Latency is the design point.
- Benedict Evans: Newspaper revenue really started to collapse well over a decade ago, and we've been discussing what to do about it for almost as long.
- anchochilis: I work on a 3-person DevOps team that just finished migrating ~20 services from GCE vms running docker-compose to GKE. It's taken us a little over a year. Partly because K8s has a steep learning curve, but also because safely transitioning services without disrupting product teams adds a lot of overhead. The investment is already yielding great returns. Developers are happy. Actual quote: "Kubernetes is the biggest quality-of-life improvement I've experienced in my career."...1. Reliable rolling deployments. 2. Seamless horizontal scale-out. 3. GitOps/ArgoCD.
- mynameisash: One morning when I came in and sat down at my desk, all of the old-timers were having coffee and discussing the fiasco. I was very happy to hear all of them talk about how mistakes happen, and the last person to be blamed for such an outage is the poor guy or gal that hit the ENTER button. Rather, blame falls (to various degrees) on: the engineers in their orbit who should be backing them up; the managers helping to onboard them; the chain of command; the entire system that is in place to prevent inappropriate access.
- Lyft: Consider fetching only the fields you need and sorting by _doc (if possible) in Elasticsearch Scroll requests while making use of _routing and terminate_after in Elasticsearch Count requests. These simple changes yielded performance improvements that ultimately helped us reduce cluster resources while ensuring we maintain SLAs. Since Elasticsearch performance is largely based on a variety of factors (document size, search operation rate, document structure, index size, etc.), it is recommended you test with tools like JMeter to accurately measure performance and tune to your needs.
- @simoncrosby: Repeat after me: "store then analyze" is a 2005 mindset that doesn't meet the needs of 2020s data-driven apps. AKA: Big data, data lakes... you're about to drown
- @mathowie: September 2000: Me and @ev jumped in my car in SF and drove to the Palo Alto Fry's to buy a $500 Celeron-powered HP home computer that was on sale. We booted it up back in the office, installed Apache, and it started serving up every early *.blogspot.com site that evening.
- @Scott_Wiener: MAJOR WIN FOR NET NEUTRALITY! The federal court just rejected the effort by telecom & cable companies to block enforcement of the net neutrality law I authored, #SB822! The court ruled that California has the authority to protect net neutrality. SB 822 can now be enforced!
- Werner Vogels: I think one of the tenets up front was don't lock yourself into your architecture, because two or three orders of magnitude of scale and you will have to rethink it. Some of the things we did early on in thinking hard about what an evolvable architecture would be—something that we could build on in the future when we would be adding functionality to S3—were revolutionary. We had never done that before.
- Werner Vogels: There's one other thing that I want to point out. One of the big differences between Amazon the Retailer and AWS in terms of technology is that in retail, you can experiment the hell out of things, and if customers don't like it, you can turn it off. In AWS you can't do that. Customers are going to build their businesses on top of you, and you can't just pull the plug on something because you don't like it anymore or think that something else is better.
- @giltene: The core thing you are probably wrestling with is how to encourage timely release of referred-from-heap resources that are not part of the GC’ed heap. Triggering GC based on trending and thresholding of those things is usually the answer. E.g. JVMs trigger on a native memory use.
- jhurliman: I had the opportunity to go down to JPL and speak with team members about this design decision. The space hardened processors are not fast enough to do real time sensor fusion and flight control, so they were forced to move to the faster snapdragon. This processor will have bit flips on Mars, possibly up to every few minutes. Their solution is to hold two copies of memory and double check operations as much as possible, and if any difference is detected they simply reboot. Ingenuity will start to fall out of the sky, but it can go through a full reboot and come back online in a few hundred milliseconds to continue flying. In the far future where robots are exploring distant planets, our best tech troubleshooting tool is to turn it off and turn it on again.
- gresrun: 5+ yrs @ Google, Google is my 5th company. Google has all the building blocks for great backend services and front-end development and, if you know where to look and have some experience with them, you can build a rock-solid product in <6mos, also assuming you have a team that can execute and the political will to ship it. Politics/consensus building is where the real roadblocks lie in Google, and presumably other large companies. Trying to make high-level product & technical decisions when you have 10 stakeholders with 3 VPs, all in different orgs, is a serious exercise in patience; months of emails & meetings await you.
- @NorminalNews: BREAKING: SpaceX reports that they accidentally uploaded Starship SN10 flight code to Falcon B1059, the F9 booster broke up shortly after the reentry burn as it attempted to transition itself to a bellyflop maneuver.
- @greglinden: Dirty secret of cloud computing, lots of inefficiency (most are overprovisioning, lots of idle servers, complexity, switching costs): "subscription mode soon gets soured as the rising monthly bills come in for services nobody knows where and when they are being used"
- Bill Joy (1984): These editors tend to last too long - almost a decade for vi now. Ideas aren't advancing very quickly, are they?
- TruthWillHurt: Here's mine - We were running on Cloud Foundry, had one DevOps person that mostly dealt with Jenkins, paid for 32-64GB RAM. Decided to move to K8s (Azure AKS), Three months later we have 4-6 DevOps people dealing with networking, cross-az replication, cluster size and autoscaling, And we're paying thousands of $$$ pm for a minimum of 6 64GB VMs. FAIL. Corporate decided to stop trying to compete with cloud vendors and shut down our in-house Cloud Foundry hosting. Also Microsoft sales folks worked client decision makers pretty hard.
- Randolph Nesse: Why so many false alarms? An optimal system generates many false alarms. When information is limited the cost of defense is less than the cost of no defense.
- Brian Bailey: Perhaps the biggest change is that we need to start teaching a new generation of software engineers who are not constrained by the notions of single-threaded execution, by the notion of a single, contiguous, almost limitless amount of memory, and who accept that what they do consumes power and that waste is expensive. Today, indirectly, software engineers are responsible for about 10% of worldwide power consumption, and that number is rapidly rising. It has to stop.
- Benedict Evans: Part of the promise of the internet is that you can take things that only worked in big cities and scale them everywhere. In the off-line world, you could never take that unique store in London or Milan and scale it nationally or globally - you couldn’t get the staff, and there wasn’t the density of the right kind of customer (and that’s setting aside the problem that scaling it might make people less interested anyway).
- Brent Ozar: Oracle – massively expensive. Microsoft SQL Server – pretty doggone expensive. AWS RDS Postgres and Aurora – inexpensive to mildly expensive. Postgres – free to inexpensive, depending on support
- Wayfair: From our testing, it’s clear that the geographic latency impact of switching to Google Cloud Spanner would be significant, especially when compared to similar timings from on-prem SQL Server infrastructure. In Spanner’s best case (nam6 with client in us-central1), read and write timings are double that of SQL Server and in its worst case (nam-eur-asia1 with client in europe-west3), latency is up to 15 times greater.
- @dustinmoris: The more I work with @GCPcloud and @Azure at the same time - doing pretty much the same stuff across both clouds for different projects/work - the more I'm astonished by how much better GCP is than Azure. It's on so many levels better, that it's even hard to explain.
- Jonathan Brooks: You’re never too old, never too experienced, and never too practiced at what you do to learn. Even though I’ve been doing this for centuries, I’ve discovered that there is always something more to learn–you just need to know where to look. So, keep learning, and someday you might be as good or better than I am.
- Stef Shrader: Farmers would rather not deal with this black market at all, so they've become some of the loudest voices in the fight to enshrine a formal right to repair act that would guarantee access to the tools and diagnostic systems necessary to fix their own stuff. At least 20 states including farm-heavy Nebraska have introduced right to repair legislation, per Freethink.
- M.G. Siegler: As with many of the things Amazon bakes into Prime, Apple is starting to understand the value of creating the illusion of value.
- @cpswan: I asked somebody at GCP about this a little while ago. Seems that egress pricing is viewed as a digital moat keeping data on their services (whoever 'they' are).
- @SeanMcTex: As a 50 year old human doing software development for a living, I wonder about age's effects on what is sometimes seen as a young person's game. I feel like I've continued to get better at it over the years; good to see research supporting this:
- @changeinside: As an early customer of cloudcheckr during a period of fast cloud expansion I can confirm - a great tool, but there’s a point where 0.5% of annual spend is better invested in staff to reduce waste than a tool to track it
- @slava_pestov: If you’re a senior engineer, it’s important to understand that nobody wants to “finish up” your hacked up, half-assed “prototype” implementation of anything. Either do the job properly, or let someone else tackle it
- @qhardy: Google Cloud Revenue up 47% year on year Backlog $30 billion, up from $19 billion in the previous quarter Deals over $250 million up 3x Multicloud, real-time analytics, meaningful applied Machine Learning - stuff the others don't have - will continue to differentiate.
- Chris Fields: We suggest a developmental explanation for this evolutionary phenomenon: obligate gametic reproduction is the result of germline stem cells winning a winner-take-all competition with non-germline stem cells for control of reproduction and hence lineage survival. We develop this suggestion by extending Hamilton’s rule, which factors the relatedness between parties into the cost/benefit analysis that underpins cooperative behaviors, to include similarity of cellular state. We show how coercive or deceptive cell-cell signaling can be used to make costly cooperative behaviors appear less costly to the cooperating party. We then show how competition between stem-cell lineages can render an ancestral combination of vegetative reproduction with facultative sex unstable, with one or the other process driven to extinction. The increased susceptibility to cancer observed in obligately-sexual lineages is, we suggest, a side-effect of deceptive signaling that is exacerbated by the loss of whole-body regenerative abilities.
- Paul Vixie: Engineering economics requires that the cost in CPU, memory bandwidth, and memory storage of any new state added for rate limiting be insignificant compared with an attacker's effort.
- hallenworld: RISC-V is a great soft-core for FPGAs. I no longer have to use vendor cores or SDKs for this.
- @randybias: So there seems to be a state of affairs where "DevOps" is basically: operators deploy and manage the k8s clusters and maybe some shared app infra services and devs manage the micro services (by team) they deploy on top. Everyone uses the same tools to see the deployment.
- Jack Dangermond: I went to design school, first environmental design school and then landscape architecture and then city planning. And in that progression, I came to understand very clearly the idea of problem-solving, because that's what design really is about, you see a problem and you come up, creatively, with something that solves the problem.
- Viviane Callier: But why would metamorphosis be better than having two specialized proteins? The scientists theorize in their paper about a couple of linked possibilities. If a single protein can do double duty, it spares the cell from transcribing, translating and maintaining more than one gene. But the more compelling advantage may be that the protein’s ability to transform may give the body a more dynamic way to control its defenses against bacteria.
- Project Zero: This blog post discussed three improvements in iOS 14 affecting iMessage security: the BlastDoor service, resliding of the shared cache, and exponential throttling. Overall, these changes are probably very close to the best that could’ve been done given the need for backwards compatibility, and they should have a significant impact on the security of iMessage and the platform as a whole. It’s great to see Apple putting aside the resources for these kinds of large refactorings to improve end users’ security. Furthermore, these changes also highlight the value of offensive security work: not just single bugs were fixed, but instead structural improvements were made based on insights gained from exploit development work.
Useful Stuff:
- In very late breaking news some private was found to blame for the attack on Pearl Harbor. While sipping mint juleps in the Officers Club, Generals lambasted the private’s Pacific Theater defence plan, saying it was "poorly thought out and even more poorly implemented." No responsible parties could be found to comment. Former SolarWinds CEO blames intern for 'solarwinds123' password leak. Also Talk At Berkeley's Information Access Seminar.
- Stack Overflow says Best practices can slow your application down:
- if you’re faced with a well-defined problem, you should probably stick to those best practices. But in building the codebase for our public Stack Overflow site, we didn’t always follow them. Best practices are not required practices.
- Our scaling strategy was to scale up, not scale out. Stack Overflow: The Architecture - 2016 Edition.
- We use a lot of static methods and fields so as to minimize allocations whenever we have to. By minimizing allocations and making the memory footprint as slim as possible, we decrease the application stalls due to garbage collection.
- To make sure regularly accessed data is faster, we use both memoization and caching. Memoization means we store the results of expensive operations; if we get the same inputs, we return the stored values instead of running the function again. We use a lot of caching (in different levels, both in-process and external, with Redis) as some of the SQL operations can be slow, while Redis is fast. Translating from relational data in SQL to object oriented data in any application can be a performance bottleneck, so we built Dapper, a high performance micro-ORM that suits our performance needs.
- Things like polymorphism and dependency injection have been replaced with static fields and service locators. Those are harder to replace for automated testing, but save us some precious allocations in our hot paths
- Similarly, we don’t write unit tests for every new feature. The thing that hinders our ability to unit test is precisely the focus on static structures. Static methods and properties are global, harder to replace at runtime, and therefore, harder to “stub” or “mock.”
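The memoization idea in the bullets above can be sketched in a few lines. This is only an illustration in Python (Stack Overflow's own code is .NET), and `expensive_lookup` is a hypothetical stand-in for a slow SQL query, not anything from their codebase:

```python
from functools import lru_cache

# Counts how often the slow path actually runs (illustration only).
CALLS = {"count": 0}

@lru_cache(maxsize=1024)
def expensive_lookup(user_id: int) -> str:
    """Hypothetical stand-in for a slow SQL query."""
    CALLS["count"] += 1
    return f"profile-{user_id}"

first = expensive_lookup(42)   # runs the function body
second = expensive_lookup(42)  # same input, served from the in-process cache
```

The second call never touches the "database". An external cache such as Redis plays the same role across processes, at the cost of a network hop.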
- How do you uplift digital data into the physical realm? Troy Hunt with another creative use of Cloudflare. All the details you need to build your own. Creating a LaMetric App with Cloudflare Workers and KV: I had this idea out of nowhere the other day that I should have a visual display somewhere in my office showing how many active Have I Been Pwned (HIBP) subscribers I presently have.
- 2021 FOSDEM (free and open source software) videos are now available.
- How Post Content is Stored on Tumblr. Adding new attributes to a relational model is hard. The next step is to move to a schemaless JSON-like format so you don’t have to mess with all that uptight relational model rigidity. It’s flexible, but it’s hard to query. The next next step is often to turn the JSON back into a relational model so you can make sense of your data. It’s the endless data model circle of life. Also Uber on Evolving Schemaless into a Distributed SQL Database. Also also Pinterest on Manas Two-stage Retrieval — The efficient architecture for hierarchical documents.
- Seeing how services can be built is always great, but I have to agree, this is spendy. The culprit is no doubt the very low concurrent connection limit. Low cost requires high utilization. Building a high-scale chat server on Cloud Run:
- In this blog, I will show you how to use WebSockets support to build a fleet of serverless containers that make up a chatroom server that can scale a high number of concurrent connections (250,000 clients). Any Cloud Run service, by default, can scale up to 1,000 instances. Currently 250 clients can connect to a single container instance.
- Cloud Run runs and scales any containerized service application. Based on the load (connected clients), it will add more container instances or shut down unused ones. Therefore, our chat server has to be stateless.
- To synchronize data between the dynamic fleet of container instances behind a Cloud Run service, we will use the Redis PubSub protocol simply because it delivers new messages to any connected client over a persistent TCP connection.
- Serverless is by design more expensive than pre-provisioned VM-based compute. If you deploy this app with 128MB RAM and 1 vCPU today, it will cost (0.00002400 + (0.00000250/8)) * 60 * 60 = $0.0875 per hour per instance. This means if you have 1,000 instances actively running and serving 250K clients, it will cost $87/hour, which is $62.6K/month.
- thesandlord: I think some people are missing the point when comparing this to a traditional VM setup. Yes it is way more expensive, but it lets you deploy something that works in 10 minutes vs messing with VMs and auto-scaling groups and all that jazz. If you are a GCP or AWS or bare metal expert that can set this thing up in their sleep, that's great but the majority of people can really benefit from a PaaS like GCR. Because Cloud Run uses vanilla Docker containers, once you have validated the idea you can move to GKE or VMs or a server under your desk or whatever. And if it never takes off, that's fine too because you didn't spend a ton of time investing in making it work.
- emilsedgh: Cloud Run has been fantastic. I had a worker service running on Heroku. Very CPU intensive. The traffic pattern was extremely low throughout the day, but had completely unexpected surges. On Heroku, my choices were: Paying $3k (basically paying for peak surge throughout the month) or having a lot of slow/failed responses during the surge. Moved to Cloud Run very easily. Just a normal dockerized 12 factor app. Now I pay ~$50/m and it automatically scales up when I need more workers.
- reilly3000: This estimate misses the entire benefit of Cloud Run which is metered compute billing. Nobody in their right mind would run a serverless stack if they had 250K connections 24/7.
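To make the cost estimate above easier to poke at, here is a small sketch of the arithmetic. The per-second prices are the figures quoted in the post, not current list prices. The 30-day month and the function names are assumptions made for illustration:

```python
# Per-second prices as quoted in the post (assumed, not current list prices).
VCPU_PER_SECOND = 0.00002400     # USD per vCPU-second
MEM_GIB_PER_SECOND = 0.00000250  # USD per GiB-second
MEMORY_GIB = 128 / 1024          # 128MB is 1/8 of a GiB

def hourly_cost_per_instance() -> float:
    """Cost of one always-busy instance with 1 vCPU and 128MB for an hour."""
    per_second = VCPU_PER_SECOND + MEM_GIB_PER_SECOND * MEMORY_GIB
    return per_second * 60 * 60

def instances_needed(clients: int, clients_per_instance: int = 250) -> int:
    """Ceiling division: instances required at the quoted connection limit."""
    return -(-clients // clients_per_instance)

def monthly_cost(instances: int, hours: int = 24 * 30) -> float:
    """Assumes every instance is active for the whole (30-day) month."""
    return hourly_cost_per_instance() * instances * hours

fleet = instances_needed(250_000)
print(f"{fleet} instances, ${monthly_cost(fleet):,.0f}/month")
```

Without rounding, this comes to roughly $63.0K per month; the post's $62.6K follows from rounding the hourly fleet cost down to $87 first. As reilly3000 points out, the always-on assumption is the worst case for metered billing, so lowering `hours` to the time instances are actually busy changes the picture quickly.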
- Microsoft Azure and AWS combined control more than 50% of worldwide cloud infrastructure services spend. “No easy task” is a phrase used often in Kinsta’s article comparing AWS vs Azure in 2021. But if you want a short comparison of the two services that is still very long then this is a good source. The conclusion is much as you would expect: while AWS might look better overall, your own research for your business may lead you to decide on Azure as the best choice for you. Also Google Cloud vs AWS Onboarding Comparison.
- Serverless Doesn't Make Sense Or does it?
- Using Google Cloud Functions to dynamically resize images was too slow, but then again, nothing was just right. The cause: dreaded cold starts.
- Azure is just as slow.
- Lambda is way faster, so he ported the code to Lambda in 5 minutes and then spent the next 10 weeks configuring API gateway. Clever line. API gateway is complex, but there’s always ramp up time when learning new tools. He probably should have used HTTP API instead.
- Is synchronously resizing images a good use of serverless? Probably not. Resize them once and serve them from a CDN. Cloudflare workers have limitations too. And if running your own k8s cluster is a potential solution then serverless is probably not a great fit.
- Also Ben Awad, Serverless Makes Sense Now. Good comparison of different pricing options.
- Also also It's TIIIIIIME - LIVE Main Event: Ready, Set... Serverless vs. Containers where you’ll hear Forrest Brazeal give an impassioned defense of serverless. Though I’m not quite sure how packaging serverless using containers is any knock on serverless. It’s just a packaging format. Who cares? Oh wait, Forrest says that about a second after I just wrote the same thing. It’s the serverless mindset that is eating the world as value moves up the stack. Not packaging technologies. Not FaaS. Ignore all that. The important idea is: own less; build more (pay more?). It’s not just about cloud provider services. It’s using Twilio, Zapier, Stripe, Auth0, and other SaaSs. He’s talking about a variant of the older API economy idea with serverless as the universal glue layer so you can build your true point of product differentiation.
- Also also also Google admits Kubernetes container tech is so complex, it's had to roll out an Autopilot feature to do it all for you.
- Also also also also AWS re:Invent 2020 Breakout Sessions | Serverless.
- Witness the serverless-spirit animating Forrest Brazeal in one of the classic pro-anything rants of all time. 3 Counter-Intuitive Reasons Why Serverless is the Awesomest:
- Serverless Violates the Second Law of Thermodynamics. Services transparently get better, faster, cheaper over time with no effort by you. Example: DynamoDB. One day on-demand billing was introduced so you pay for what you use. It saved someone $120k per year by just checking the on-demand capacity box.
- Serverless is Expensive. It’s a good thing. Serverless is a forcing function directing you to use managed services that yes, are expensive, but are work you don’t have to do—which gives you time to focus on your core business. Opportunities to efficiently trade dollars for engineering time are rare.
- Serverless Lock-in is Good. You’re always locked-in. It’s better to lock-in on something good.
- @ben11kehoe: Also: managed services are compression algorithms for experience. Literally what serverless is about.
- How Wix improved website performance by evolving their infrastructure:
- Thanks to leveraging industry standards, cloud providers, and CDN capabilities, combined with a major rewrite of our website runtime, the percentage of Wix sites reaching good 75th percentile scores on all Core Web Vitals (CWV) metrics more than tripled year over year
- Wix adopted a performance-oriented culture, and further improvements will continue rolling out to users.
- We adjusted all our monitoring and internal discussions to include industry standard metrics such as Web Vitals, which include: LCP (Largest Contentful Paint), FID (First Input Delay), CLS (Cumulative Layout Shift)
- They switched from client side rendering to server-side rendering. The cycle continues. Why? This approach improved the visibility experience, especially on slower devices/connections, and opened the door for further performance optimizations. However, it also meant that for each web page request, a unique HTML response was generated on the fly, which is far from optimal. So they turned to caching and something that seems very client side: We carefully migrated this data and cookies to a new endpoint, which is called on each page load, but returns a slim JSON, which is required only for the hydration process, to reach full page interactivity.
- ~13% of HTML requests served directly from the browser cache, saving much bandwidth and reducing loading times for repeat views
- HTTP/2 was enabled for all user domains, reducing both the amount of connections required and the overhead that comes with each new connection.
- 21 - 25% reduction of median file transfer size using brotli instead of gzip.
- Recently, we integrated with a solution by our DNS provider, to automatically select the best performing CDN according to the client's network and origin. This enables us to serve the static files from the best location for each visitor, and avoid availability issues on a certain CDN.
- We are currently integrating with various CDN providers to support serving the entire Wix site directly from CDN locations to improve the distribution of our servers across the globe and thus further improve response times.
- Good overview. Covers a lot of ground beyond the basics. The Big Little Guide to Message Queues. Also FOQS: Scaling a distributed priority queue.
- They actually use both Google Cloud and AWS for reals. Mux Is an API Based Platform That Lets You Process and Stream Videos:
- It uses Phoenix, Elixir and Go to handle billions of video views a month. It’s hosted on AWS and GCP with Kubernetes and has been up and running since early 2016. Some of Mux’s customers have millions of concurrent video views
- About 45 people work at Mux and half are involved with engineering
- Their main public API is an out of the box Phoenix app
- They have a real-time dashboard that is powered by websockets and channels
- Prometheus is used for metrics but it’s not hooked into Elixir Telemetry (yet)
- Kubernetes and Docker drive their production infrastructure.
- Buildkite is used for their CI / CD pipeline
- SendGrid is used for transactional emails, Sentry for errors and Opsgenie for paging
- All payments go through Stripe, including the metered billing which they hand rolled
- The Elixir app has a PostgreSQL billing DB and also uses ClickHouse (SQL based). – ClickHouse lets them store billions of rows and access everything quickly
- The Elixir API runs on AWS with an AWS load balancer sitting in front of it all
- The video infrastructure runs on Google Cloud
- Smoke tests and various alarms help detect issues in production (they use Flink)
- Let’s Encrypt on Preparing to Issue 200 Million Certificates in 24 Hours:
- a new generation of database server from Dell featuring dual AMD EPYC 7542 CPUs, 64 physical cores in total. These machines have 2TB of faster RAM. Much faster CPUs and double the memory is great, but the really interesting thing about these machines is that the EPYC CPUs provide 128 PCIe4 lanes each. This means we could pack in 24 6.4TB NVME drives for massive I/O performance. There is no viable hardware RAID for NVME, so we’ve switched to ZFS to provide the data protection we need.
- We originally looked into upgrading to 10G, but learned that upgrading to 25G fiber wasn’t much more expensive. Cisco ended up generously donating most of the switches and equipment we needed for this upgrade, and after replacing a lot of server NICs Let’s Encrypt is now running on a 25G fiber network!
- Thales generously donated new HSMs with about 10x the performance - approximately 10,000 signing operations per second, 20,000 between the pair. That means we can now perform 864,000,000 signing operations in 24 hours from a single data center.
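The signing-capacity figure above is simple arithmetic, sketched here for anyone who wants to vary the inputs. The one-signature-per-certificate assumption in the headroom calculation is a deliberate simplification for illustration, not something the post states:

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
SIGNS_PER_SECOND = 10_000        # approximate rate quoted in the post
TARGET_CERTS = 200_000_000       # the 24-hour issuance goal in the title

daily_capacity = SIGNS_PER_SECOND * SECONDS_PER_DAY

# Simplifying assumption: one signing operation per certificate.
headroom = daily_capacity / TARGET_CERTS

print(f"{daily_capacity:,} signing ops/day, {headroom:.2f}x the target")
```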
- Two Views of Lambda Diverged in a Yellow Wood…: Instead of the ~23 seconds it took when I made the complaint, it now comfortably and reliably returns in less than a second every time. The fix, as helpfully suggested by Randall Hunt, was dead simple: Stop trying to do this for every request. Instead, have a Lambda function fire off once a minute, perform the transform, stuff that into an S3 bucket behind CloudFront without caching, and call it a day.
- Lenskart on Serving millions of users on a budget:
- 2.5M monthly active users; 153M requests per month; 10k rpm; 34 microservices; 1.2B db ops per month; 40TB of data; 10x scale in 3 years
- $1000 per month
- Firebase Functions, Firestore, Hosting, and Memorystore
- Firebase: Easy accessibility for mobile engineers; Simple setup, quick deployments.
- Liked the Minimal investment in DevOps and Infra. Low maintenance effort.
- Event Driven Autoscaling Architecture. beNX: Architecture Changes to Handle Massive Peak Traffic
- Big Hit Entertainment handles ~100x traffic spikes within minutes from global fans with a highly scalable architecture on AWS. 7M+ global users from 229 countries.
- When a message comes in that has to go to millions of users, a Lambda function increases the auto scaling group.
- A CloudWatch rule is triggered periodically to check if traffic is nearly normal and reduce the auto scaling group accordingly.
- 70% of instances in the auto scaling group are spot instances which saves 50% of the cost. Using spot instances removed cold start latency delays.
- Goployer (https://goployer.dev/) dynamically provisions auto scaling groups.
- Considering using EKS and Fargate in the future.
- The talk is more product pitch than architecture trends, but there are a few interesting bits. Adrian Cockcroft’s architecture trends and topics for 2021:
- AWS is systematically addressing objections to a serverless first world. Better portability with new container support. Scalability with new 10GB/6 vCPU. Build more complex configurations using Proton. AWS Proton is the first fully managed application deployment service for container and serverless applications. Using EFS you can bundle more code with each lambda function. Serverless and container build pipelines can be shared.
- 50% of all new AWS services are being built on top of Lambda.
- Chaos engineering is experimenting to ensure that the impact of failure is mitigated. AWS Fault Injection Simulator lets you run experiments to test if availability and performance impacts of failure are being mitigated. Failure happens, so detect it fast and respond fast.
- Amazon DevOps Guru does alert correlation by looking at logs. Why Amazon? No idea.
- Retry storms. You want to prevent work amplification by: reducing retries to zero except at subsystem entry and exit points; reducing timeouts to drop orphaned requests; routing calls within the same zone.
- Get multi-zone failover solid before attempting multi-region.
- I bet Wardley who has been talking about Wardley Maps forever would find it amusing his maps are being used by early adopters.
- There are now machines and clusters with huge amounts of CPU, RAM, memory bandwidth, and network bandwidth. For example, the UltraCluster has 4.4 petabytes of memory. How can you take advantage of huge memory systems? That’s still the question. In-memory analytics. Large graph databases. Use shared memory to avoid serialization overhead. Put pods all on one node. Fast replication for persistence.
- Some projects to watch: Twizzler: An Operating System for Next-Generation Memory Hierarchies; MemVerge - distributed memory objects; brytlyt - a database that runs completely inside the GPU; SQream - GPU accelerated data warehouse.
- We’ve gone through these phases—shared memory, hold everything in memory, custom hardware, etc—before. The problem is commodity approaches using commodity hardware have always won out. If AWS really wants any of these to succeed, it must champion them.
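The retry-storm bullet above is easier to appreciate with numbers. Here is a minimal worst-case model, assuming every call fails and every layer in a call chain exhausts its retries. The function names are made up for illustration and this is not taken from the talk:

```python
def worst_case_attempts(layers: int, retries_per_layer: int) -> int:
    """Attempts reaching the bottom service when every layer retries.

    Each layer makes 1 original call plus `retries_per_layer` retries,
    and each of those fans out again at the layer below.
    """
    return (1 + retries_per_layer) ** layers

def entry_only_attempts(retries_at_entry: int) -> int:
    """Attempts reaching the bottom when only the entry point retries."""
    return 1 + retries_at_entry

print(worst_case_attempts(layers=4, retries_per_layer=3))
print(entry_only_attempts(retries_at_entry=3))
```

Under these assumptions a four-deep call chain with three retries per layer turns one failing request into 256 attempts at the bottom service, versus 4 when retries happen only at the entry point.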
- I’ve wondered how to do something similar. Love this approach. How screen scraping and TinyML can turn any dial into an API:
- This image shows a traditional water meter that’s been converted into a web API, using a cheap ESP32 camera and machine learning to understand the dials and numbers. I expect there are going to be billions of devices like this deployed over the next decade, not only for water meters but for any older device that has a dial, counter, or display.
- There’s a massive amount of information out in the real world that can’t be remotely monitored or analyzed over time, and a lot of it is displayed through dials and displays. Waiting for all of the systems involved to be replaced with connected versions could take decades, which is why I’m so excited about this incremental approach.
- jomjol/AI-on-the-edge-device
- All software evolves until it performs company specific tasks like magic. The Netflix Cosmos Platform:
- We have found that the programming model of “microservices that trigger workflows that orchestrate serverless functions” to be a powerful paradigm.
- Moving from a large distributed application to a “platform plus applications” was a major paradigm shift. Everyone had to change their mindset. Application developers had to give up a certain amount of flexibility in exchange for consistency, reliability, etc.
- Plato is the glue that ties everything together in Cosmos by providing a framework for service developers to define domain logic and orchestrate stateless functions/services. Plato is a forward chaining rule engine which lends itself to the asynchronous and compute-intensive nature of our algorithms. Unlike a procedural workflow engine like Netflix’s Conductor, Plato makes it easy to create workflows that are “always on.” For example, as we develop better encoding algorithms, our rules-based workflows automatically manage updating existing videos without us having to trigger and manage new workflows.
- Cosmos has two axes of separation. On the one hand, logic is divided between API, workflow and serverless functions. On the other hand, logic is separated between application and platform. The platform API provides media-specific abstractions to application developers while hiding the details of distributed computing. For example, a video encoding service is built of components that are scale-agnostic: API, workflow, and functions. They have no special knowledge about the scale at which they run.
- The subsystems all communicate with each other asynchronously via Timestone, a high-scale, low-latency priority queuing system.
- When a new title arrives from a production studio, it triggers a Tapas workflow which orchestrates requests to perform inspections, encode video (multiple resolutions, qualities, and video codecs), encode audio (multiple qualities and codecs), generate subtitles (many languages), and package the resulting outputs (multiple player formats). Thus, a single request to Tapas can result in hundreds of requests to other Cosmos services and thousands of Stratum function invocations.
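For readers who haven't met a forward chaining rule engine, here is a toy sketch of the general technique. It is not Plato's API; the fact names and rules are invented, loosely following the title-ingest steps described above:

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (facts that must hold, fact to add). Purely illustrative.
Rule = Tuple[FrozenSet[str], str]

def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Fire rules repeatedly until no rule can add a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for required, conclusion in rules:
            if required <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules, not Netflix's.
RULES: List[Rule] = [
    (frozenset({"title_arrived"}), "inspected"),
    (frozenset({"inspected"}), "video_encoded"),
    (frozenset({"inspected"}), "audio_encoded"),
    (frozenset({"video_encoded", "audio_encoded"}), "packaged"),
]

result = forward_chain({"title_arrived"}, RULES)
```

The engine starts from known facts and keeps deriving new ones, so adding a fact or a rule later re-triggers whatever now applies. That data-driven behavior is one way to read the "always on" property the post describes.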
- Scaling Reporting at Reddit:
- Druid is a columnar database designed to ingest high volumes of raw events with ingestion-time aggregation and perform sub-second query-time aggregates across those events, making it an excellent fit for our use-case.
- A Reddit client fires events (view, click, etc.) to our server. These events get placed onto Kafka as they are received.
- A Spark job validates incoming events and places them into Amazon S3 as parquet files.
- Another Spark job performs minor transformations on these parquet files to make them appropriate for Druid to ingest.
- The reporting service responds to UI requests by querying Druid.
- The schema used in our Druid data source is simple. We have a column for each dimension that we need to group-by and a single column indicating the event type. We can now easily solve the use-case above where the advertiser wants to generate a six-month report for all of their ads using a single SQL query: SELECT COUNT(*) FROM reporting_events WHERE advertiser_id = $advertiser_id AND event_type = 'click' AND date >= $start_date AND date <= $end_date GROUP BY date, ad_id;
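The shape of that report query can be tried out locally. In this sketch SQLite stands in for Druid purely to show the query shape, since it is a row store with none of Druid's ingestion-time aggregation. The sample rows and the helper name are invented, and the grouping columns are added to the SELECT so each count is labeled:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reporting_events ("
    "advertiser_id INTEGER, ad_id INTEGER, event_type TEXT, date TEXT)"
)
# Invented sample events: (advertiser_id, ad_id, event_type, date)
rows = [
    (1, 10, "click", "2021-01-01"),
    (1, 10, "click", "2021-01-01"),
    (1, 10, "view", "2021-01-01"),
    (1, 11, "click", "2021-01-02"),
    (2, 20, "click", "2021-01-01"),
]
conn.executemany("INSERT INTO reporting_events VALUES (?, ?, ?, ?)", rows)

def clicks_per_ad_per_day(advertiser_id, start_date, end_date):
    """Count clicks per (date, ad) for one advertiser in a date range."""
    query = (
        "SELECT date, ad_id, COUNT(*) FROM reporting_events "
        "WHERE advertiser_id = ? AND event_type = 'click' "
        "AND date >= ? AND date <= ? "
        "GROUP BY date, ad_id ORDER BY date, ad_id"
    )
    return conn.execute(query, (advertiser_id, start_date, end_date)).fetchall()

report = clicks_per_ad_per_day(1, "2021-01-01", "2021-06-30")
```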
- Interesting design problem: how do you coordinate all parts of a system? Andrew Huberman: Sleep, Dreams, Creativity & the Limits of the Human Mind. Our body uses temperature. The master clock that coordinates our whole circadian rhythm is controlled by body temperature. Genius. All tissues can measure temperature locally so they can make global decisions based on local information in a timely manner. I can’t imagine a message passing approach working as reliably.
- Light on details, but I’m sure it was crafty. How Etsy Prepared for Historic Volumes of Holiday Traffic in 2020:
- Cyber Monday, which is typically our busiest day of the year. Throughput on our sharded MySQL infrastructure peaked around 1.5 million queries per second. Memcache throughput peaked over 20M requests per second. Our internal http API cluster served over 300k requests per second. More than 1000 hosts.
- we discourage, for a few weeks, deploying changes that might be expected to disrupt sellers, support, or infrastructure. Necessary changes can still get pushed, but the standards are higher for preparation and communication. We call this period “Slush” because it’s not a total freeze.
- We look at historical trends and system resource usage and communicate our infrastructure projections to GCP many weeks ahead of time to ensure they will have the right types of machines in the right locations. For this purpose, we share upper-bound estimates, because this may become the maximum available to us later. We err on the side of over-communicating with GCP.
- We decided to try a multi-team game day with macro load testing and see what we could learn. We wanted an open-ended test that could help expose bottlenecks as early as possible. All teams had at least one common deadline: request quota increases from GCP for their projects by early September. We confirmed our many scaling tools worked as intended. We exposed bottlenecks in the configuration of some components, for example some Memcache clusters and our StatsD relays.
- Wayfair on A Dynamic Journey to Performance:
- We now think about performance in terms of percentiles rather than averages
- bumped up the default logging level threshold from DEBUG to WARN and saw a 3ms performance improvement
- removing DataDog APM from our deployment, we saw another 2-3ms performance improvement.
- Our dependency resolution layer turned out to be our biggest performance bottleneck. Once we changed our stateless class registration to single instances, we saw a 15-20ms performance improvement. We brought down our p50 from 27ms to 6 ms and our p99 below our previous observed p50 during the initial PDP rollout.
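A quick illustration of why percentiles tell you more than averages, as the first bullet above suggests. The latency values are made up to show a long tail; they are not Wayfair's measurements:

```python
import statistics

# Hypothetical latencies in ms: most requests are fast, a few are very slow.
latencies_ms = [5] * 95 + [200] * 5

mean_ms = statistics.mean(latencies_ms)

# quantiles(n=100) returns the 99 cut points p1..p99.
cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
p50_ms, p99_ms = cuts[49], cuts[98]

print(f"mean={mean_ms} p50={p50_ms} p99={p99_ms}")
```

The mean lands at a latency that no request in this sample actually experienced, while p50 and p99 describe the typical request and the slow tail separately.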
- 12 requests per second:
- In reality most of the "super-fast" benchmarks mean very little except for a few niche use-cases. If you look at the code in detail, you'll see that they're either simple "Hello, World!" or echo servers and all of them spend most of their time calling hand-crafted C code with Python bindings. As soon as you introduce any actual Python work into the code you'll see those numbers plunge.
- What we've seen in the benchmarks is that schema design, database choice and architecture will be the bottlenecks. Going with one of the new fully async frameworks purely for speed will probably not be as effective as just using PyPy and an async Gunicorn worker. I also found it gave me a kind of decision paralysis, asking many more questions like: if we can keep our latency low, is it more or less performant to use a synchronous Foo client written in C, or an async one written in pure Python?
- Building an Instagram for DJs with FaunaDB with Keston Crandall:
- Why Azure and Azure Functions? Came from a .NET background. Good client libraries for C#.
- Azure Functions has an HTTP REST layer built-in by default and it takes no time to configure.
- You can build your entire project with one button in Visual Studio and ship it up for easy deployment. You don't need API Gateway. It’s easier and takes less time.
- Didn't go with CosmosDB because it did not have cross collection transactions. In FaunaDB everything is inside a transaction.
- CosmosDB didn't have a serverless consumption based plan. Wanted a truly serverless plan that would scale with usage. The FaunaDB C# client library can serialize classes and create it in the database without writing code. Databases can be hosted in particular regions to reduce data ingress charges.
- FaunaDB will be more graph oriented. Infinite scalability. Can add all the indexes you want. Cheap for development.
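The dependency-resolution item earlier in this list (stateless classes that were rebuilt on every resolve, then switched to single instances) is easier to picture with a toy container. This is only a minimal Python sketch: `Container`, `register`, `resolve`, and `PriceFormatter` are hypothetical names made up for illustration, and the quoted post does not show its code or name its DI framework.

```python
from typing import Any, Callable, Dict


class Container:
    """Toy dependency container (hypothetical, not the quoted team's code)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}
        self._is_singleton: Dict[str, bool] = {}
        self._instances: Dict[str, Any] = {}

    def register(self, name: str, factory: Callable[[], Any],
                 singleton: bool = False) -> None:
        self._factories[name] = factory
        self._is_singleton[name] = singleton

    def resolve(self, name: str) -> Any:
        if not self._is_singleton[name]:
            # Transient: pay the construction cost on every resolve.
            return self._factories[name]()
        if name not in self._instances:
            # Singleton: build once, then hand out the same object.
            self._instances[name] = self._factories[name]()
        return self._instances[name]


class PriceFormatter:
    """Stateless helper, so sharing one instance is safe."""

    constructed = 0  # counts how many times the class was instantiated

    def __init__(self) -> None:
        PriceFormatter.constructed += 1

    def format(self, cents: int) -> str:
        return f"${cents / 100:.2f}"


container = Container()
container.register("transient_formatter", PriceFormatter)
container.register("shared_formatter", PriceFormatter, singleton=True)

for _ in range(1000):
    container.resolve("transient_formatter")
transient_builds = PriceFormatter.constructed

for _ in range(1000):
    container.resolve("shared_formatter")
shared_builds = PriceFormatter.constructed - transient_builds

print(transient_builds, shared_builds)
```

The sharing is only safe because the class holds no per-request state; anything that carries request data would still need a transient or scoped registration.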
Soft Stuff:
- suborbital/atmo: Building web services should be simple. Atmo makes it easy to create a powerful server application without needing to worry about scalability, infrastructure, or complex networking.
- adlrocha: The basic idea is that if every peer in a decentralized network includes a common runtime, and all functions and data are uniquely identified in the network, you can run anything, anywhere. And the fact that content-addressed networks give a CDN-by-default capability would allow an IPFS-based Atmo to scale seamlessly as long as there is a peer with available resources to run your bundle. This would enable a global serverless infrastructure and a seamless developer experience (no more worrying about what cloud provider to choose).
- Also @adlrocha - Building a scalable monolith
- CondensationDB/Condensation: a zero-trust distributed database that ensures data ownership and data security. Inspired by the blockchain system, the email system, and git versioning, Condensation's architecture is a unique solution to develop scalable and modern applications, excelling at synchronization.
- AbstractMachinesLab/lam (article): a lightweight, universal actor-model vm for writing scalable and reliable applications that run natively and on WebAssembly. It is inspired by Erlang and Lua, and it is compatible with the Erlang VM.
- bastion-rs/bastion: a highly-available, fault-tolerant runtime system with a dynamic, dispatch-oriented, lightweight process model. It supplies actor-model-like concurrency with a lightweight process implementation and utilizes all of the system resources efficiently while guaranteeing at-most-once message delivery.
- An In-Depth Study of Correlated Failures in Production SSD-Based Data Centers.
- We present an in-depth data-driven analysis on the correlated failures in the SSD-based data centers at Alibaba. We study nearly one million SSDs of 11 drive models based on a dataset of SMART logs, trouble tickets, physical locations, and applications.
- A non-negligible fraction of SSD failures belong to intra-node and intra-rack failures (12.9% and 18.3% in our dataset, respectively). Also, the intra-node and intra-rack failure group size can exceed the tolerable limit of some typical redundancy protection schemes.
- The likelihood of having an additional intra-node (intra-rack) failure in an intra-node (intra-rack) failure group depends on the already existing intra-node (intra-rack) failures.
- The relative percentages of intra-node and intra-rack failures vary across drive models. Putting too many SSDs from the same drive model in the same nodes (racks) leads to a high percentage of intra-node (intra-rack) failures. Also, the AFR and environmental factors (e.g., temperature) affect the relative percentages of intra-node and intra-rack failures.
- MLC SSDs with higher densities generally have lower relative percentages of intra-node and intra-rack failures.
- The relative percentages of intra-node and intra-rack failures increase with age.
- The SMART attributes have limited correlations with intra-node and intra-rack failures.
- Write-dominant workloads lead to more SSD failures overall, but are not the only impacting factor on the AFRs.
- Erasure coding shows higher reliability than replication based on the failure patterns in our dataset.
- Redundancy schemes that are sufficient for tolerating independent failures may be insufficient for tolerating the correlated failures as shown in our dataset.
- virtualagc/virtualagc (video): the previously lost Apollo 10 LM software, as flown (also known as Luminary 69 Rev 2)
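The SSD study's point that a correlated failure group can exceed what a redundancy scheme tolerates can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the failure records are made up, the 30-minute grouping window is arbitrary rather than the paper's threshold, and the comparison is a worst case that pretends every drive in a group belongs to the same replica set or stripe. The tolerances themselves are the standard ones (n-way replication survives n-1 losses, RS(k, m) survives m).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Number of simultaneous drive losses each scheme can absorb.
TOLERANCE: Dict[str, int] = {"3-way replication": 2, "RS(6,3)": 3}


def failure_groups(failures: List[Tuple[str, int]],
                   window_s: int) -> List[Tuple[str, int]]:
    """Group failures on the same node whose gaps are within window_s seconds.

    Returns (node, group_size) pairs. The windowing rule is an assumption.
    """
    by_node: Dict[str, List[int]] = defaultdict(list)
    for node, timestamp in failures:
        by_node[node].append(timestamp)

    groups: List[Tuple[str, int]] = []
    for node, stamps in by_node.items():
        stamps.sort()
        size = 1
        for prev, cur in zip(stamps, stamps[1:]):
            if cur - prev <= window_s:
                size += 1          # still part of the same correlated burst
            else:
                groups.append((node, size))
                size = 1           # gap too large, start a new group
        groups.append((node, size))
    return groups


# Made-up (node, failure time in seconds) records, not data from the paper.
failures = [
    ("node-a", 0), ("node-a", 600), ("node-a", 900),
    ("node-b", 50), ("node-b", 90_000),
    ("node-c", 10),
]

groups = failure_groups(failures, window_s=1800)
largest = max(size for _, size in groups)
exceeded = {scheme: largest > limit for scheme, limit in TOLERANCE.items()}

print(groups)
print(largest, exceeded)
```

With this toy input the largest intra-node group has three drives, which is one more than 3-way replication can lose but still within what RS(6,3) can absorb, under the worst-case placement assumption above.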
Pub Stuff:
- Foundational distributed systems papers: here is my compilation of foundational papers in the distributed systems area. (I focused on the core distributed systems area, and did not cover networking, security, distributed ledgers, verification work etc. I even left out distributed transactions, I hope to cover them at a later date.)
- A network analysis on cloud gaming: Stadia, GeForce Now and PSNow: We find that GeForce Now and Stadia use the RTP protocol to stream the multimedia content, with the latter relying on the standard WebRTC APIs. They are bandwidth hungry and consume up to 45 Mbit/s, depending on the network and video quality. PS Now instead uses only undocumented protocols and never exceeds 13 Mbit/s.
- Cloud Native Transformation: How do you serve your customers faster, better, smarter? That’s an easy one: with Cloud Native technology, culture, and strategy. But how do you get started in moving your organisation toward the cloud? That’s not so easy—the choices are many, risk shadows every decision, and the complexity of the whole thing grows as you move forward.
- HHVM Jump-Start: Boosting Both Warmup and Steady-State Performance at Scale: In this paper, we argue for HHVM’s Jump-Start approach, describe it in detail, and present steady-state optimizations built on top of it. Running the Facebook website, we demonstrate that Jump-Start effectively solves the warmup problem in HHVM, reducing the server capacity loss during warmup by 54.9%, while also improving steady-state performance by 5.4%.
- FirePlace: Placing FireCracker virtual machines with hindsight imitation: We see that in production traffic from Amazon Web Services (AWS), µVM resource use is spiky and short lived, and that forecasting algorithms are not useful. We evaluate Reinforcement Learning (RL) approaches for this task, but find that off-the-shelf RL algorithms are not always performant. We present a forecasting-free algorithm, called FirePlace, that learns the placement decision using a variant of hindsight optimization, which we call hindsight imitation. We evaluate our approach using a production traffic trace of µVM usage from AWS Lambda. FirePlace improves upon baseline algorithms by 10% when placing 100K µVMs.
- Silent Data Corruptions at Scale: We [Facebook] provide a high-level overview of the mitigations to reduce the risk of silent data corruptions within a large production fleet. In our large-scale infrastructure, we have run a vast library of silent error test scenarios across hundreds of thousands of machines in our fleet. This has resulted in hundreds of CPUs detected for these errors, showing that SDCs are a systemic issue across generations. We have monitored SDCs for a period longer than 18 months. Based on this experience, we determine that reducing silent data corruptions requires not only hardware resiliency and production detection mechanisms, but also robust fault-tolerant software architectures.
- Apple Platform Security: This documentation provides details about how security technology and features are implemented within Apple platforms. It also helps organizations combine Apple platform security technology and features with their own policies and procedures to meet their specific security needs.
- Algorithms: This web page contains a free electronic version of my self-published textbook Algorithms, along with other lecture notes I have written for various theoretical computer science classes at the University of Illinois, Urbana-Champaign since 1998.
- Scalable Statistical Root Cause Analysis on App Telemetry: In this paper, we propose Minesweeper, a technique for RCA that moves towards automatically identifying the root cause of bugs from their symptoms. The method is based on two key aspects: (i) a scalable algorithm to efficiently mine patterns from telemetric information that is collected along with the reports, and (ii) statistical notions of precision and recall of patterns that help point towards root causes. We evaluate Minesweeper on its scalability and effectiveness in finding root causes from symptoms on real world bug and crash reports from Facebook's apps. Our evaluation demonstrates that Minesweeper can perform RCA for tens of thousands of reports in less than 3 minutes, and is more than 85% accurate in identifying the root cause of regressions.
- Reading and Writing the Morphogenetic Code: We focus on the morphogenetic code: the mechanisms and information structures by which cellular networks internally represent the target morphology, and compute the cell activities needed at each time point to bring the body closer to that morphology.
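The Minesweeper abstract above mentions "statistical notions of precision and recall of patterns". One plausible reading, sketched below in Python, treats a pattern as a set of telemetry features and scores it against a regression group and a baseline group. The definitions used here (precision as the share of matching reports that are regressions, recall as the share of regression reports that match) are an assumption, not taken from the paper, and the feature strings and reports are invented.

```python
from typing import FrozenSet, List, Tuple

Report = FrozenSet[str]  # one report = the set of features observed with it


def precision_recall(pattern: Report,
                     regression: List[Report],
                     baseline: List[Report]) -> Tuple[float, float]:
    """Score a feature pattern against regression and baseline reports."""
    hits_regression = sum(1 for report in regression if pattern <= report)
    hits_baseline = sum(1 for report in baseline if pattern <= report)
    total_hits = hits_regression + hits_baseline

    # Precision: of all reports matching the pattern, how many are regressions?
    precision = hits_regression / total_hits if total_hits else 0.0
    # Recall: of all regression reports, how many does the pattern cover?
    recall = hits_regression / len(regression) if regression else 0.0
    return precision, recall


# Invented telemetry, for illustration only.
regression = [
    frozenset({"os=14", "screen=checkout", "locale=en"}),
    frozenset({"os=14", "screen=checkout", "locale=fr"}),
    frozenset({"os=14", "screen=checkout", "locale=en"}),
    frozenset({"os=13", "screen=feed", "locale=en"}),
]
baseline = [
    frozenset({"os=14", "screen=feed", "locale=en"}),
    frozenset({"os=13", "screen=checkout", "locale=en"}),
    frozenset({"os=14", "screen=checkout", "locale=de"}),
    frozenset({"os=13", "screen=feed", "locale=fr"}),
]

narrow = precision_recall(frozenset({"os=14", "screen=checkout"}), regression, baseline)
broad = precision_recall(frozenset({"os=14"}), regression, baseline)

print(narrow, broad)
```

In this toy data the narrower pattern covers the same regression reports as the broader one but matches fewer baseline reports, so it scores higher precision at equal recall, which is the kind of signal that would point toward a root cause.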
If you've made it this far, I did a major rewrite of a novella I wrote a few years ago: The Strange Trial of Ciri: An AI Twist on the Pinocchio Story. It has a completely new ending exploring how to create a sentient AI through social interaction, how an AI mind differs from a human mind, and how a world spanning social network run by an AI can be used to change the world. If those themes interest you, then I think you'll like the book. Please give it a try and tell me what you think. Thanks.
Reader Comments (1)
Thanks for the quote about TCP vs. UDP in my article on real-time geolocation while making use of OpenStreetMap. The big downside to UDP is that encryption and authentication with UDP is mostly an unsolved problem. True, there are solutions, but they tend to eliminate the very qualities of UDP that I find useful. These days I'm starting to look at QUIC, Google's low latency UDP-based stream protocol that is currently undergoing IETF standardization.