Stuff The Internet Says On Scalability For May 3rd, 2019
Friday, May 3, 2019 at 9:07AM
HighScalability Team in hot links
Wake up! It's HighScalability time:
Event horizon? Nope. It's a close up of a security hologram. Makes one think.
Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. I wrote Explain the Cloud Like I'm 10 for people who need to understand the cloud. And who doesn't these days? On Amazon it has 45 mostly 5 star reviews (105 on Goodreads). They'll learn a lot and hold you in awe.
Number Stuff:
- $1 trillion: Microsoft is the most valuable company in the world (for now)
- 20%: global enterprises will have deployed serverless computing technologies by 2020
- 390 million: paid Apple subscriptions, revenue from the services business climbed from $9.9 billion to $11.5 billion, services now account for “one-third” of the company’s gross profits
- 1011: CubeSat missions
- $326 billion: USA farm expenses in 2017
- 61%: increase in average cyber attack losses from $229,000 last year to $369,000 this year, a figure exceeding $700,000 for large firms versus just $162,000 in 2018.
- $550: can yield 20x profit on the sale of compromised login credentials
Quotable Stuff:
- Robert Lightfoot~ Protecting against risk and being safe are not the same thing. Risk is just simply a calculation of likelihood and consequence. Would we have ever launched Apollo in the environment we’re in today? Would Buzz and Neil have been able to go to the moon in the risk posture we live in today? Would we have launched the first shuttle with a crew? We must move from risk management to risk leadership. From a risk management perspective, the safest place to be is on the ground. From a risk leadership perspective, I believe that’s the worst place this nation can be.
- Paul Kunert: In dollar terms, Jeff Bezos's cloud services wing grew 41 per cent year on year to $7.6bn, figures from Canalys show. Microsoft was up 75 per cent to $3.4bn and Google grew a whopping 83 per cent to $2.3bn.
- @codinghorror: 1999 "MIT - We estimate that the puzzle will require 35 years of continuous computation to solve" 2019 "🌎- LOL" https://www.csail.mit.edu/news/programmers-solve-mits-20-year-old-cryptographic-puzzle …
- @dvassallo: TIL what EC2's "Up to" means. I used to think it simply indicates best effort bandwidth, but apparently there's a hard baseline bottleneck for most EC2 instance types (those with an "up to"). It's significantly smaller than the rating, and it can be reached in just a few minutes. This stuff is so obscure that I bet 99% of Amazon SDEs that use EC2 daily inside Amazon don't know about these limits. I only noticed this by accident when I was benchmarking S3 a couple of weeks ago
- @Adron: 1997: startup requires about a million $ just to get physical infra setup for a few servers. 2007: one can finally run stuff online and kind of skip massive hardware acquisitions just to run a website. 2017: one can scale massively & get started for about $10 bucks of infra.
- Wired: [An executive describes] Nadella’s approach as “subtle shade.” He never explicitly eighty-sixed a division or cut down a product leader, but his underlying intentions were always clear. His first email to employees ran more than 1,000 words—and made no mention of Windows. He later renamed the cloud offering Microsoft Azure. “Satya doesn’t talk shit—he just started omitting ‘Windows’ from sentences,” this executive says. “Suddenly, everything from Satya was ‘cloud, cloud, cloud!’ ”
- @ThreddyTheTrex: My approach to side projects has evolved. Beginning of my career: “I will build everything from scratch using C and manage my own memory and I don’t mind if it takes 3 years.” Now: “I will use existing software that takes no more than 15 minutes to configure.”
- btown: The software wouldn't have crashed if the user didn't click these buttons in a weird order. The bug was only one factor in a chain of events that led to the segfault.
- @Tjido: There are more than 10,000 data lakes in AWS. @strataconf #datalakes #stratadata
- Nicolas Kemper: Accretive projects are everywhere: Museums, universities, military bases – even neighborhoods and cities. Key to all accretive projects is that they house an institution, and key to all successful institutions is mission. Whereas scope is a detailed sense of both the destination and the journey, a mission must be flexible and adjust to maximum uncertainty across time. In the same way, an institution and a building are often an odd pair, because whereas the building is fixed and concrete, finished or unfinished, an institution evolves and its work is never finished.
- @markmadsen: Your location-identified tweets plus those of two friends on twitter predict your location to within 100m 77% of the time. Location data is PII and must be treated as such #StrataData
- Backblaze: The Annualized Failure Rate (AFR) for Q1 is 1.56%. That’s as high as the quarterly rate has been since Q4 2017 and it’s part of an overall upward trend we’ve seen in the quarterly failure rates over the last few quarters. Let’s take a closer look.
- Theron Mohamed: Google's advertising revenue rose by 15% to $30.72 billion, a sharp slowdown from 24% growth a year ago, according to its earnings report for the first quarter of 2019. Paid clicks rose 39%, a significant decrease from 59% year-on-year growth in the first quarter of 2018. Cost-per-click also fell 19%, after sliding 19% in the same period of 2018.
- @ajaynairthinks: “It was what we know to do so it was faster” -> this is the key challenge. Right now, the familiar path is not easy/effective in the long term, and the effective path is not familiar in the short term. We need to make this gap visible, and we need to make the easy things familiar.
- @NinjaEconomics: "For the first time ever there are now more people in the world older than 65 than younger than 5."
- Filipe Oliveira: with the new AWS public cloud C5n Instances designed for compute-heavy applications and to deliver performance that is just about indistinguishable from bare metal, with 100 Gbps Networking along with a higher ceiling on packets per second, we should be able to deliver at least the same 50 Million operations per second below 1 millisecond with fewer VM nodes
- Nima Khajehnouri: Snap’s monetization algorithms have the single biggest impact to our advertisers and shareholders
- Carmen Bambach: He is an artist of his time and one that transcends his time. He is very ambitious. It’s important to remember that although Leonardo was a “disciple of experience,” as he called himself, he is also paying great attention to the sources of his time. After having devoured and looked at and bought many books, he realizes he can do better. He really wants to write books, but it’s a very steep learning curve. The way we should look at his notebooks and the manuscripts is that they are essentially the raw material for what he had intended to produce as treatises. His great contribution is being able to visualize knowledge in a way that had not been done before.
- Charlie Demerjian: The latest Intel roadmap leak blows a gaping hole in Intel’s 10nm messaging. SemiAccurate has said all along that the process would never work right and this latest info shows that 10nm should have never been released.
- @mipsytipsy: Abuse and misery pile up when you are building and running large software systems without understanding them, without good feedback loops. Feedback loops are not a punishment. They mature you into a wise elder engineer. They give you agency, mastery, autonomy, direction. And that is why software engineers, management, and ops engineers should all feel personally invested in empowering software engineers to own their own code in production.
- Skip: Serverless has made it possible to scale Skip with a small team of engineers. It’s also given us a programming model that lets us tackle complexity early on, and gives us the ability to view our platform as a set of fine-grained services we can spread across agile teams.
- seanwilson: Imagine having to install Trello, Google Docs, Slack etc. manually everywhere you wanted to use it, deal with updates yourself and ask people you wanted to collaborate with to do the same. That makes no sense in terms of ease of use.
- Darryl Campbell: The slick PR campaign masked a design and production process that was stretched to the breaking point. Designers pushed out blueprints at double their normal pace, often sending incorrect or incomplete schematics to the factory floor. Software engineers had to settle for re-creating 40-year-old analog instruments in digital formats, rather than innovating and improving upon them. This was all done for the sake of keeping the Max within the constraints of its common type certificate.
- Stripe: We have seen such promising results from our remote engineers that we are greatly increasing our investment in remote engineering. We are formalizing our Remote engineering hub. It is coequal with our physical hubs, and will benefit from some of our experience in scaling engineering organizations.
- Joel Hruska: According to Intel in its Q1 2019 conference call, NAND price declines were a drag on its earnings, falling nearly twice the expected amount. This boom and bust cycle is common in the DRAM industry, where it drove multiple players to exit the market over the past 18 years. This is one reason we’re effectively down to just three DRAM manufacturers — Samsung, SK Hynix, and Micron. There are still a few more players in the NAND market, though we’ve seen consolidations there as well.
- Alastair Edwards: The cloud infrastructure market is moving into a new phase of hybrid IT adoption, with businesses demanding cloud services that can be more easily integrated with their on-premises environment. Most cloud providers are now looking at ways to enter customers’ existing data centres, either through their own products or via partnerships
- Paul Johnston: And yes I can absolutely see how the above company could have done this whole solution better as a Serverless solution but they don’t have the money for rearchitecting their back end (I don’t imagine) and what would be the value anyway? It’s up and running, with paying clients. The value at this point doesn’t seem valuable. Additional features may be a good fit for a Serverless approach, but not the whole thing if it’s all working. The pain of migrating to a new backend database, the pain of server migrations even at this level of simplicity, the pain of having to coordinate with other teams on something that seems so trivial, but never is that trivial has been really hard.
- @rseroter: In serverless ... Functions are not the point. Managed services are not the point. Ops is not the point. Cost is not the point. Technology is not the point. The point is focus on customer value. @ben11kehoe laying it all out. #deliveragile2019
- @jessitron: Serverless is a direction, not a destination. There is no end state. @ben11kehoe Keep moving technical details out of the team’s focus, in favor of customer value. #deliverAgile
- @ondayd: RT RealGeneKim "RT jessitron: When we rush development, skip tests and refactoring, we get “Escalating Risk.” Please give up the “technical debt” description; it gives businesspeople a very wrong impression of the tradeoffs. From Janellekz #deliverAgile "
- @ben11kehoe: Good points in here about event-driven architectures. I do think the "bounded context" notions from microservices are still applicable, and that we don't have good enough tools for establishing contracts for events and dynamic routing for #serverless yet.
- Riot Games: We use MapReduce, a common cluster computing model, to calculate data in a distributed fashion. Below is an example of how we calculate the cosine similarity metric - user data is mapped to nodes, the item-item metric is calculated for each user, and the results are shuffled and sent to a common node so they can be aggregated together in the reduce stage. It takes approximately 1000 compute hours to carry out the entire offer generation process, from snapshotting data to running all of the distributed algorithms. That’s 50 machines running for 20 hours each.
- Will Knight: Sze’s hardware is more efficient partly because it physically reduces the bottleneck between where data is stored and where it’s analyzed, but also because it uses clever schemes for reusing data. Before joining MIT, Sze pioneered this approach for improving the efficiency of video compression while at Texas Instruments.
- Hersleb hypothesis~ coding is a socio-technical process where code and humans interact. According to what we call the Hersleb hypothesis, the following anti-pattern is a strong predictor for defects: • If two code sections communicate... • But the programmers of those two sections do not... • Then that code section is more likely to be buggy
- Joel Hruska: But the adoption of chiplets is also the engineering acknowledgment of constraints that didn’t used to exist. We didn’t used to need chiplets. When companies like TSMC publicly predict that their 5nm node will deliver much smaller performance and power improvements than previous nodes did, it’s partly a tacit admission that the improvements engineers have gotten used to delivering from process nodes will now have to be gained in a different fashion. No one is particularly sure how to do this, and analyses of how effectively engineers boost performance without additional transistors to throw at the problem have not been optimistic.
- Bryan Meyers: To some extent I think we should view chiplets as a stop-gap until other innovations come along. They solve the immediate problems of poor yields and reticle limits in exchange for a slight increase in integration complexity, while opening the door to more easily integrating application-specific accelerators cost-effectively. But it's also not likely that CPU sockets will get much larger. We'll probably hit the limit of density when chiplet-based SoC's start using as much power as high-end GPUs. So really we're waiting on better interconnects (e.g. photonics or wireless NoC) or 3D integration to push much farther. Both of which I think are still at least a decade away.
- Olsavsky: And that will be a constant battle between growth, geographic expansion in AWS, and also efficiencies to limit how much we actually need. I think we are also getting much better at adding capacity faster, so there is less need to build it six to twelve months in advance.
- Malith Jayasinghe: We noticed that a non-blocking system was able to handle a large number of concurrent users while achieving higher throughput and lower latency with a small number of threads. We then looked at how the number of processing threads impacts the performance. We noticed a minimal impact on throughput and average latency on the number of threads. However, as the number of threads increases, we see a significant increase in the tail latencies (i.e. latency percentiles) and load average.
- Paul Berthaux: We [Algolia] run multiple LBs for resiliency - the LB selection is made through round robin DNS. For now this is fine, as the LBs are performing very simple tasks in comparison to our search API servers, so we do not need an even load balancing across them. That said, we have some very long term plans to move from round-robin DNS to something based on Anycast routing. The detection of upstream failures as well as retries toward different upstreams is embedded inside NGINX/OpenResty. I use the log_by_lua directive from OpenResty with some custom Lua code to count the failures and trigger the removal of the failing upstream from the active Redis entry and alert the lb-helper after 10 failures in a row. I set up this failure threshold to avoid lots of unnecessary events in case of short self resolving incidents like punctual packet loss. From there the lb-helper will probe the failing upstream FQDN and put it back in Redis once it recovers.
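The "10 failures in a row" rule from the Algolia quote above can be sketched roughly as follows. This is an illustration in Python, not Algolia's Lua code. The class `UpstreamTracker`, its method names, the in-memory set standing in for the active Redis entry, and the `alerts` list standing in for the lb-helper notification are all assumptions made for the example. Only the threshold of 10 consecutive failures comes from the quote.

```python
class UpstreamTracker:
    """Counts consecutive failures per upstream (hypothetical sketch)."""

    def __init__(self, upstreams, threshold=10):
        self.active = set(upstreams)  # stand-in for the active Redis entry
        self.threshold = threshold    # 10 in the quote
        self.failures = {}            # upstream -> consecutive failure count
        self.alerts = []              # stand-in for alerting the lb-helper

    def record(self, upstream, ok):
        """Record one request outcome for an upstream."""
        if ok:
            # Any success resets the streak, so short blips never trip the threshold.
            self.failures[upstream] = 0
            return
        self.failures[upstream] = self.failures.get(upstream, 0) + 1
        if self.failures[upstream] >= self.threshold and upstream in self.active:
            self.active.discard(upstream)
            self.alerts.append(upstream)

    def recover(self, upstream):
        """Called once a probe of the upstream succeeds again."""
        self.failures[upstream] = 0
        self.active.add(upstream)
```

Resetting the counter on every success is what makes this a "failures in a row" threshold rather than a total failure count, which matches the stated goal of ignoring short, self-resolving incidents.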
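The map, shuffle, and reduce stages in the Riot Games quote above can be illustrated with a small single-process sketch of item-item cosine similarity. This is not Riot's code. The function names, the input shape (a dict of user to item scores), and the choice to emit squared scores alongside pair products are assumptions made for the example, and a real cluster job would distribute these stages across nodes.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def map_user(scores):
    """Map stage: emit partial sums from one user's {item: score} dict."""
    for item, score in scores.items():
        yield (item, item), score * score          # contributes to the item's norm
    for a, b in combinations(sorted(scores), 2):
        yield (a, b), scores[a] * scores[b]        # contributes to the dot product


def shuffle(pairs):
    """Shuffle stage: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_similarity(grouped):
    """Reduce stage: sum the partials and compute cosine similarity per item pair."""
    sums = {key: sum(values) for key, values in grouped.items()}
    sims = {}
    for (a, b), dot in sums.items():
        if a == b:
            continue
        norm = sqrt(sums[(a, a)]) * sqrt(sums[(b, b)])
        if norm > 0:  # skip items whose scores are all zero
            sims[(a, b)] = dot / norm
    return sims


def item_similarity(users):
    """Run all three stages over {user: {item: score}}."""
    pairs = (kv for scores in users.values() for kv in map_user(scores))
    return reduce_similarity(shuffle(pairs))
```

Because each user's contribution is computed independently in `map_user`, the map stage is the part that parallelizes across machines. Only the grouped sums need to meet on a common node.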
Useful Stuff:
Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.