Friday
Apr192019

Stuff The Internet Says On Scalability For April 19th, 2019

Wake up! It's HighScalability time:

 

Spirit? Smoke? Lightning? Nope. It's a gorgeous LIDAR image showing 1500 years of Willamette River movement (@Blacky_Himself)

 

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. I wrote Explain the Cloud Like I'm 10 for people who need to understand the cloud. And who doesn't these days? On Amazon it has 44 mostly 5 star reviews (102 on Goodreads). They'll learn a lot and hold you in awe.

 

  • 536: IRS tax return submissions per second; 400,000: drone planted trees in a day; 200 million: smart speaker observers installed by year end; 54 million: GoT pirated in first 24 hours; 123,052: kg of crashed human spacecraft on the surface of the moon; 610 pounds: 128 kilobytes of  IBM S/360 core memory; 33%: per account month over month Lambda function growth; $100,000: Netflix bug bounty payout; $1 million: Shopify bug bount payout; ~$2300: cost to transfer 23TB from S3 to Backblaze B2 in 7 hours; 14%: Netflix users share passwords; 88%: believe P != NP; 30-90: minutes saved by StackOverflow per week; 83%: US teens have an iPhone; $2m: Microsoft bug bounty payout; $1 million: made by Colin Cowherd on 331 million Facebook page views, moving more to Instagram; 0: lines of code in Pong; ~95%: redis is slower when GDPR complient;

  • Quotable Quotes:
    • melodysheep: The universe has only just begun.
    • @matthew_d_green: I spent the year before Heartbleed visiting important people in DC trying to convince them OpenSSL was a mess, and they should fund it as “critical infrastructure”. They laughed and told me that term referred to dams and power plants.
    • Tim Cook: No
    • @asymco: Among 8,000 U.S. high school students surveyed, 83% have an iPhone, 9% Android. 86% plan their next phone to be an iPhone. -Piper Jaffray Taking Stock With Teens survey
    • @dialtone: Do you know S3 throughput is higher than a SATA 3 controller and almost as much as PCIe 4.0 2x? Depending on the usecase, mounting S3 as a filesystem, not only makes sense, but it saves a LOT of money (and time) as well. You just need to use it a special kind of filesystem.
    • Steven Melendez: By 2022, automation will displace about 75 million jobs worldwide. On the other hand, they will create an estimated 133 million new jobs. The predictions come from extrapolating from surveys sent to more than 300 major employers around the world. 
    • So many more quotes...
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Apr162019

MySQL High Availability Framework Explained โ€“ Part III: Failover Scenarios

MySQL High Availability Framework Explained – Part III: Failover Scenarios

In this three-part blog series, we introduced a High Availability (HA) Framework for MySQL hosting in Part I, and discussed the details of MySQL semisynchronous replication in Part II. Now in Part III, we review how the framework handles some of the important MySQL failure scenarios and recovers to ensure high availability.

MySQL Failover Scenarios

Scenario 1 – Master MySQL Goes Down

  • The Corosync and Pacemaker framework detects that the master MySQL is no longer available. Pacemaker demotes the master resource and tries to recover with a restart of the MySQL service, if possible.
  • At this point, due to the semisynchronous nature of the replication, all transactions committed on the master have been received by at least one of the slaves.
  • Pacemaker waits until all the received transactions are applied on the slaves and lets the slaves report their promotion scores. The score calculation is done in such a way that the score is ‘0’ if a slave is completely in sync with the master, and is a negative number otherwise.
  • Pacemaker picks the slave that has reported the 0 score and promotes that slave which now assumes the role of master MySQL on which writes are allowed.
  • After slave promotion, the Resource Agent triggers a DNS rerouting module. The module updates the proxy DNS entry with the IP address of the new master, thus, facilitating all application writes to be redirected to the new master.
  • Pacemaker also sets up the available slaves to start replicating from this new master.

Thus, whenever a master MySQL goes down (whether due to a MySQL crash, OS crash, system reboot, etc.), our HA framework detects it and promotes a suitable slave to take over the role of the master. This ensures that the system continues to be available to the applications.

Scenario 2 – Slave MySQL Goes Down

  • The Corosync and Pacemaker framework detects that the slave MySQL is no longer available.
  • Pacemaker tries to recover the resource by trying to restart MySQL on the node. If it comes up, it is added back to the current master as a slave and replication continues.
  • If recovery fails, Pacemaker reports that resource as down – based on which alerts or notifications can be generated. If necessary, the ScaleGrid support team will handle the recovery of this node.
  • In this case, there is no impact on the availability of MySQL services.

Scenario 3 – Network Partition – Network Connectivity Breaks Down Between Master and Slave Nodes

This is a classical problem in any distributed system where each node thinks the other nodes are down, while in reality, only the network communication between the nodes is broken. This scenario is more commonly known as split-brain scenario, and if not handled properly, can lead to more than one node claiming to be a master MySQL which in turn leads to data inconsistencies and corruption.

Let’s use an example to review how our framework deals with split-brain scenarios in the cluster. We assume that due to network issues, the cluster has partitioned into two groups – master in one group and 2 slaves in the other group, and we will denote this as [(M), (S1,S2)].

  • Corosync detects that the master node is not able to communicate with the slave nodes, and the slave nodes can communicate with each other, but not with the master.
  • The master node will not be able to commit any transactions as the semisynchronous replication expects acknowledgement from at least one of the slaves before the master can commit. At the same time, Pacemaker shuts down MySQL on the master node due to lack of quorum based on the Pacemaker setting ‘no-quorum-policy = stop’. Quorum here means a majority of the nodes, or two out of three in a 3-node cluster setup. Since there is only one master node running in this partition of the cluster, the no-quorum-policy setting is triggered leading to the shutdown of the MySQL master.
  • Now, Pacemaker on the partition [(S1), (S2)] detects that there is no master available in the cluster and initiates a promotion process. Assuming that S1 is up to date with the master (as guaranteed by semisynchronous replication), it is then promoted as the new master.
  • Application traffic will be redirected to this new master MySQL node and the slave S2 will start replicating from the new master.

Thus, we see that the MySQL HA framework handles split-brain scenarios effectively, ensuring both data consistency and availability in the event the network connectivity breaks between master and slave nodes.

This concludes our 3-part blog series on the MySQL High Availability (HA) framework using semisynchronous replication and the Corosync plus Pacemaker stack. At ScaleGrid, we offer highly available hosting for MySQL on AWS and MySQL on Azure that is implemented based on the concepts explained in this blog series. Please visit the ScaleGrid Console for a free trial of our solutions.

Tuesday
Apr162019

Sponsored Post: PerfOps, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

Who's Hiring? 


  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • Join Etleap, an Amazon Redshift ETL tool to learn the latest trends in designing a modern analytics infrastructure. Learn what has changed in the analytics landscape and how to avoid the major pitfalls which can hinder your organization from growth. Watch a demo and learn how Etleap can save you on engineering hours and decrease your time to value for your Amazon Redshift analytics projects. Register for the webinar today.

  • Advertise your event here!

Cool Products and Services

  • PerfOps is a data platform that digests real-time performance data for CDN and DNS providers as measured by real users worldwide. Leverage this data across your monitoring efforts and integrate with PerfOps’ other tools such as Alerts, Health Monitors and FlexBalancer – a smart approach to load balancing. FlexBalancer makes it easy to manage traffic between multiple CDN providers, API’s, Databases or any custom endpoint helping you achieve better performance, ensure the availability of services and reduce vendor costs. Creating an account is Free and provides access to the full PerfOps platform.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net
  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorialStream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.
  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


Make Your Job Search O(1) — not O(n)

Triplebyte is unique because they're a team of engineers running their own centralized technical assessment. Companies like Apple, Dropbox, Mixpanel, and Instacart now let Triplebyte-recommended engineers skip their own screening steps.

We found that High Scalability readers are about 80% more likely to be in the top bracket of engineering skill.

Take Triplebyte's multiple-choice quiz (system design and coding questions) to see if they can help you scale your career faster.


The Solution to Your Operational Diagnostics Woes

Scalyr gives you instant visibility of your production systems, helping you turn chaotic logs and system metrics into actionable data at interactive speeds. Don't be limited by the slow and narrow capabilities of traditional log monitoring tools. View and analyze all your logs and system metrics from multiple sources in one place. Get enterprise-grade functionality with sane pricing and insane performance. Learn more today


If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Friday
Apr122019

Stuff The Internet Says On Scalability For April 12th, 2019

Out sick today, but start saving up for that ticket to Mars (more):

 

Wednesday
Apr102019

Sponsored Post: PerfOps, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

Who's Hiring? 


  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • Join Etleap, an Amazon Redshift ETL tool to learn the latest trends in designing a modern analytics infrastructure. Learn what has changed in the analytics landscape and how to avoid the major pitfalls which can hinder your organization from growth. Watch a demo and learn how Etleap can save you on engineering hours and decrease your time to value for your Amazon Redshift analytics projects. Register for the webinar today.

  • Advertise your event here!

Cool Products and Services

  • PerfOps is a data platform that digests real-time performance data for CDN and DNS providers as measured by real users worldwide. Leverage this data across your monitoring efforts and integrate with PerfOps’ other tools such as Alerts, Health Monitors and FlexBalancer – a smart approach to load balancing. FlexBalancer makes it easy to manage traffic between multiple CDN providers, API’s, Databases or any custom endpoint helping you achieve better performance, ensure the availability of services and reduce vendor costs. Creating an account is Free and provides access to the full PerfOps platform.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net
  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorialStream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.
  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


Make Your Job Search O(1) — not O(n)

Triplebyte is unique because they're a team of engineers running their own centralized technical assessment. Companies like Apple, Dropbox, Mixpanel, and Instacart now let Triplebyte-recommended engineers skip their own screening steps.

We found that High Scalability readers are about 80% more likely to be in the top bracket of engineering skill.

Take Triplebyte's multiple-choice quiz (system design and coding questions) to see if they can help you scale your career faster.


The Solution to Your Operational Diagnostics Woes

Scalyr gives you instant visibility of your production systems, helping you turn chaotic logs and system metrics into actionable data at interactive speeds. Don't be limited by the slow and narrow capabilities of traditional log monitoring tools. View and analyze all your logs and system metrics from multiple sources in one place. Get enterprise-grade functionality with sane pricing and insane performance. Learn more today


If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Monday
Apr082019

From bare-metal to Kubernetes

Friday
Apr052019

Stuff The Internet Says On Scalability For April 5th, 2019

Wake up! It's HighScalability time:

 

How unhappy do you have to be as a customer to take so much joy in end-of-lifing a product?

 

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. I wrote Explain the Cloud Like I'm 10 for people who need to understand the cloud. And who doesn't these days? On Amazon it has 44 mostly 5 star reviews (100 on Goodreads). They'll learn a lot and love you for the hookup.

 

  • $40 million: Fortnite World Cup prize money; 89%: of people who like Go say they like Go; 170 million: paid iCloud accounts; 533: days bacteria lived on the outside of ISS; 95%: BTC volume is fake; 51: LTE vulnerabilities found by fuzzing; 13,000: CRISPR edits in a single cell; 5G: 762Mbps down and a 19ms ping; 17,000: awesome Historic Blues & Folk Recordings; 3,236: Amazon broadband LEO satellite network; 5.1 million: emails sent during 10 day spam campaign; 

  • Quoteable Quotes:
    • @KimZetter: Crashed Tesla vehicles sold at junk yards and auctions found to contain invasive personal info on driver, including phonebook and calendar info from drivers' paired mobile devices; also found unencrypted video showing what happened just before accident
    • Pete Warden: There are 150 billion embedded processors out there in the world, that’s more than twenty each for every man, woman, and child on earth! Not only did that number amaze me when I first came across it, but the growth rate is an astonishing 20% annually, with no signs of slowing down. That’s much faster than smartphone usage, which is almost flat, or the growth in the number of internet users, which is in the low single digits these days...My objection to the internet of things is that the majority of embedded devices are not connected to any network, and as I’ll discuss in a bit, it’s unlikely that they ever will be, at least more than intermittently.
    • @fharper: But I’m also hurt because the people I cared about, treated me like if I was no one, like if I didn’t deserve to have someone from @npmjs tell me the bad news. People I loved, who at the moment all my accounts were turned off (which was during the meeting), never looked back.
    • Gustav Kuhn: It is only once you start thinking about some of the huge day-to-day challenges our visual system constantly faces that the true wonders of the brain start to emerge. Our brain uses a really clever and almost science-fictional trick that prevents us from living in the past: we look into the future. Our visual system is continuously predicting the future, and the world that you are now perceiving is the world that your visual system has predicted to be the present in the past
    • Young Girl: I’ve thought about deleting instagram, but then I think—what would I do with my life?
    • What are you waiting for?
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Apr032019

2019 PostgreSQL Trends Report: Private vs. Public Cloud, Migrations, Database Combinations & Top Reasons Used

2019 PostgreSQL Trends Report: Private vs. Public Cloud, Migrations, Database Combinations & Top Reasons Used

PostgreSQL is an open source object-relational database system that has soared in popularity over the past 30 years from its active, loyal, and growing community. For the 2nd year in a row, PostgreSQL has kept the title of #1 fastest growing database in the world according to the DBMS of the Year report by the experts at DB-Engines. So what makes PostgreSQL so special, and how is it being used today? We found the answers at the Postgres Conference in March where we surveyed PostgreSQL users, contributors, and SQL and NoSQL database administrators alike. In this free PostgreSQL Trends Report, we break down PostgreSQL hosting use across public cloud vs. private cloud vs. hybrid cloud, most popular cloud providers, migration trends, database combinations with Postgres, and why PostgreSQL is preferred over popular RDBMS alternatives.

Private Cloud vs. Public Cloud vs. Hybrid Cloud

Click to read more ...

Friday
Mar292019

Stuff The Internet Says On Scalability For March 29th, 2019

Wake up! It's HighScalability time:

 

Uber's microservice Graph. Thousands of microservices. Crazy like a fox? Or just crazy? (@msuriar)

 

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. I wrote Explain the Cloud Like I'm 10 for people who need to understand the cloud. And who doesn't these days? On Amazon it has 42 mostly 5 star reviews (100 on Goodreads). They'll learn a lot and love you for the hookup.

 

  • 1.5 billion: monthly What's App users; 80 billion: docker downloads in 6 years; 1 billion: players on the App Store. 300,000 games; 13.5 billion: Voyager 1 miles from earth; 11 years: Teeny-Tiny Bluetooth Transmitter; 500 million: Airbnb guests; 12.5 million bits: information learned by average adult; 7.7%: Amazon's share of US retail sales; 100 million: Stack Overflow monthly visitors; $156B: Consumer spending in apps across iOS and Google Play by 2023; 

  • Quotable Quotes:
    • John C. Lilly: When I say we may be our programs, nothing more, nothing less, I mean the substrate, the basic substratum under all else of our metaprograms is our programs. All we are as humans is what is built-in, what has been acquired, and what we make of both of these. So we are the result of the program substrate—the self-metaprogrammer.
    • @andrewhurstdog: I caused a Gmail outage so big it made the national news, by forgetting to dereference a pointer...Right, this comes from blameless postmortems. Firing a person doesn't fix the problem, it removes a person that understands the problem.
    • The more quotes the merrier...
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Friday
Mar222019

Stuff The Internet Says On Scalability For March 22nd, 2019

Wake up! It's HighScalability time:

 

Van Gogh? Nope. A satellite image of phytoplankton populations or algae blooms in the Baltic Sea. (NASA)

 

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. Know anyone who needs cloud? I wrote Explain the Cloud Like I'm 10 just for them. It has 41 mostly 5 star reviews. They'll learn a lot and love you even more.

 

  • .5 billion: weekly visits to Apple App store; $500m: new US exascale computer; $1.7 billion: EU's newest fine on Google; flat: 2018 global box office; up: digital rentals, sales, and subscriptions to streaming services; 43%: Dutch say let AI run the country; 2 min: time from committing an AWS key to Github to first attack; 40: bugs in blockchain platforms found in 30 days; 100%: Norwegian government 2025 EV target; 

  • Quotable Quotes:
    • @raganwald: You know what would really help restore confidence? Have members of the executive leadership team, and Boeing’s board of directors, fly on 737 Max’s, every day, for a month. Let them dogfood the software patch. I am ๐Ÿ’ฏ serious.
    • skamille: I worry that the cloud is just moving us back to a world of proprietary software. Hell, many of these providers are just providing open source API compatibility with custom-built backends! What happens when no new open source comes out of the smaller companies, and the big-3 decide they don't really need or want to play nice anymore?
    • @adriancolyer: "eRPC (efficient RPC) is a new general-purpose remote procedure call (RPC) library that offers performance comparable to specialized systems, while running on commodity CPUs in traditional datacenter networks based on either lossy Ethernet or lossless fabrics… We port a production grade implementation of Raft state machine replication to eRPC without modifying the core Raft source code. We achieve 5.5 µs of replication latency on lossy Ethernet, which is faster than or comparable to specialized replication systems that use programmable switches, FPGAs, or RDMA."
    • @matthewstoller: I just looked at Netflix’s 10K. The company is burning through cash. $3B this year, $4B next year. Debt skyrocketing. It all looks good when capital is cheap I guess.
    • @BrentToderian: What city went from 14% of all trips by bike in 2001, to 22% by 2012, then leaped to 30% in 3 years by 2015, & 35% by 2018? Thanks to #Ghent, Belgium’s Deputy Mayor @filipwatteeuw for sharing the secrets of their urban biking transformation (with jumps in walking & transit too). 
    • @slobodan_: "It is serverless the same way WiFi is wireless. At some point, the e-mail I send over WiFi will hit a wire, of course"
    • Yep, there are more quotes...
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...