
Tuesday
Apr 16, 2019

MySQL High Availability Framework Explained – Part III: Failover Scenarios

In this three-part blog series, we introduced a High Availability (HA) Framework for MySQL hosting in Part I, and discussed the details of MySQL semisynchronous replication in Part II. Now in Part III, we review how the framework handles some of the important MySQL failure scenarios and recovers to ensure high availability.

MySQL Failover Scenarios

Scenario 1 – Master MySQL Goes Down

  • The Corosync and Pacemaker framework detects that the master MySQL is no longer available. Pacemaker demotes the master resource and tries to recover with a restart of the MySQL service, if possible.
  • At this point, due to the semisynchronous nature of the replication, all transactions committed on the master have been received by at least one of the slaves.
  • Pacemaker waits until all the received transactions are applied on the slaves and lets the slaves report their promotion scores. The score calculation is done in such a way that the score is ‘0’ if a slave is completely in sync with the master, and is a negative number otherwise.
  • Pacemaker picks the slave that has reported a score of ‘0’ and promotes it. The promoted slave now assumes the role of the master MySQL, on which writes are allowed.
  • After slave promotion, the Resource Agent triggers a DNS rerouting module. The module updates the proxy DNS entry with the IP address of the new master, thus redirecting all application writes to the new master.
  • Pacemaker also sets up the available slaves to start replicating from this new master.

Thus, whenever a master MySQL goes down (whether due to a MySQL crash, OS crash, system reboot, etc.), our HA framework detects it and promotes a suitable slave to take over the role of the master. This ensures that the system continues to be available to the applications.
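
To make the score calculation described above concrete, here is a minimal Python sketch of how a promotion score along these lines could be computed and used to pick the new master. The GTID sets and names are illustrative stand-ins; the actual resource agent queries MySQL directly for its replication positions.

    # A sketch of the promotion-score idea: score is 0 for a fully
    # in-sync slave and negative otherwise, so the freshest slave wins.
    def promotion_score(received_gtids: set, applied_gtids: set) -> int:
        lag = len(received_gtids - applied_gtids)
        return -lag  # 0 when all received transactions are applied

    # Pacemaker-style selection: promote the slave with the highest score.
    scores = {
        "S1": promotion_score({"t1", "t2", "t3"}, {"t1", "t2", "t3"}),  # 0
        "S2": promotion_score({"t1", "t2", "t3"}, {"t1", "t2"}),        # -1
    }
    new_master = max(scores, key=scores.get)  # "S1" is promoted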

Scenario 2 – Slave MySQL Goes Down

  • The Corosync and Pacemaker framework detects that the slave MySQL is no longer available.
  • Pacemaker tries to recover the resource by restarting MySQL on the node. If it comes up, it is added back to the current master as a slave, and replication continues.
  • If recovery fails, Pacemaker reports that resource as down – based on which alerts or notifications can be generated. If necessary, the ScaleGrid support team will handle the recovery of this node.
  • In this case, there is no impact on the availability of MySQL services.

Scenario 3 – Network Partition – Network Connectivity Breaks Down Between Master and Slave Nodes

This is a classical problem in any distributed system where each node thinks the other nodes are down, while in reality only the network communication between the nodes is broken. This scenario is commonly known as a split-brain scenario and, if not handled properly, can lead to more than one node claiming to be the master MySQL, which in turn leads to data inconsistencies and corruption.

Let’s use an example to review how our framework deals with split-brain scenarios in the cluster. We assume that due to network issues, the cluster has partitioned into two groups – the master in one group and the 2 slaves in the other – which we will denote as [(M), (S1,S2)].

  • Corosync detects that the master node is not able to communicate with the slave nodes, and the slave nodes can communicate with each other, but not with the master.
  • The master node will not be able to commit any transactions, as semisynchronous replication expects acknowledgement from at least one of the slaves before the master can commit. At the same time, Pacemaker shuts down MySQL on the master node due to lack of quorum, based on the Pacemaker setting ‘no-quorum-policy = stop’. Quorum here means a majority of the nodes – two out of three in a 3-node cluster setup (see the sketch after this list). Since only one node is running in the master’s partition of the cluster, the no-quorum-policy setting is triggered, leading to the shutdown of the MySQL master.
  • Now, Pacemaker on the partition (S1, S2) detects that there is no master available in the cluster and initiates a promotion process. Assuming that S1 is up to date with the master (as guaranteed by semisynchronous replication), it is promoted as the new master.
  • Application traffic will be redirected to this new master MySQL node and the slave S2 will start replicating from the new master.
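
Here is a minimal sketch of the majority-quorum rule referenced above. It is an illustration only; Corosync and Pacemaker implement quorum tracking internally.

    # A partition keeps quorum only if it can see a strict majority
    # of the cluster's nodes.
    def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
        return visible_nodes > cluster_size // 2

    # 3-node cluster split into [(M), (S1, S2)]:
    print(has_quorum(1, 3))  # False -> master partition stops MySQL
    print(has_quorum(2, 3))  # True  -> slave partition may promote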

Thus, we see that the MySQL HA framework handles split-brain scenarios effectively, ensuring both data consistency and availability in the event the network connectivity breaks between master and slave nodes.

This concludes our 3-part blog series on the MySQL High Availability (HA) framework using semisynchronous replication and the Corosync plus Pacemaker stack. At ScaleGrid, we offer highly available hosting for MySQL on AWS and MySQL on Azure that is implemented based on the concepts explained in this blog series. Please visit the ScaleGrid Console for a free trial of our solutions.

Wednesday
Apr 3, 2019

2019 PostgreSQL Trends Report: Private vs. Public Cloud, Migrations, Database Combinations & Top Reasons Used

PostgreSQL is an open source object-relational database system that has soared in popularity over the past 30 years thanks to its active, loyal, and growing community. For the 2nd year in a row, PostgreSQL has kept the title of #1 fastest growing database in the world according to the DBMS of the Year report by the experts at DB-Engines. So what makes PostgreSQL so special, and how is it being used today? We found the answers at the Postgres Conference in March, where we surveyed PostgreSQL users, contributors, and SQL and NoSQL database administrators alike. In this free PostgreSQL Trends Report, we break down PostgreSQL hosting use across public cloud vs. private cloud vs. hybrid cloud, the most popular cloud providers, migration trends, database combinations with Postgres, and why PostgreSQL is preferred over popular RDBMS alternatives.

Private Cloud vs. Public Cloud vs. Hybrid Cloud

Click to read more ...

Tuesday
Feb 19, 2019

Intro to Redis Cluster Sharding – Advantages, Limitations, Deploying & Client Connections

Redis Cluster is the native sharding implementation available within Redis that allows you to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. At ScaleGrid, we recently added support for Redis Clusters on our platform through our fully managed Redis hosting plans. In this post, we’re going to introduce you to Redis Cluster sharding, discuss its advantages and limitations, cover when you should deploy it, and explain how to connect to your Redis Cluster.
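
As background for the discussion, Redis Cluster assigns every key to one of 16,384 hash slots using CRC16(key) mod 16384, and related keys can be forced into the same slot with hash tags (the {...} portion of a key). The Python sketch below illustrates that mapping; it follows the published cluster specification but is for illustration only.

    # Compute the Redis Cluster hash slot for a key (CRC16-XMODEM mod 16384).
    def crc16_xmodem(data: bytes) -> int:
        crc = 0
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else crc << 1
                crc &= 0xFFFF
        return crc

    def key_slot(key: str) -> int:
        # If the key contains a non-empty {...} hash tag, only the tag
        # is hashed, so related keys land in the same slot (and node).
        start = key.find("{")
        if start != -1:
            end = key.find("}", start + 1)
            if end > start + 1:
                key = key[start + 1:end]
        return crc16_xmodem(key.encode()) % 16384

    # Same hash tag -> same slot, enabling multi-key operations:
    assert key_slot("user:{42}:profile") == key_slot("user:{42}:cart")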

Sharding with Redis Cluster

Click to read more ...

Tuesday
Jan 8, 2019

Slow MySQL Start Time in GTID mode? Binary Log File Size May Be The Issue

Have you been experiencing slow MySQL startup times in GTID mode? We recently ran into this issue on one of our MySQL hosting deployments and set out to solve the problem. In this blog, we break down the issue that could be slowing down your MySQL restart times, show how to diagnose it in your deployment, and explain what you can do to decrease your start time and improve your understanding of GTID-based replication.
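
As a quick first check along these lines, a sketch like the one below tallies the number and total size of the binary log files MySQL reads at startup. The data directory path and file naming are assumptions; they depend on your datadir and log_bin settings.

    import glob, os

    # Hypothetical datadir; adjust for your deployment.
    datadir = "/var/lib/mysql"
    binlogs = [f for f in sorted(glob.glob(os.path.join(datadir, "mysql-bin.*")))
               if not f.endswith(".index")]  # skip the index file itself
    total = sum(os.path.getsize(f) for f in binlogs)
    print(f"{len(binlogs)} binlog files, {total / 2**30:.1f} GiB total")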

How We Found The Problem

Click to read more ...

Monday
Nov 6, 2017

Birth of the NearCloud: Serverless + CRDTs @ Edge is the New Next Thing


Kuhiro 10X Faster than Amazon Lambda

 

This is a guest post by Russell Sullivan, founder and CTO of Kuhirō.

Serverless is an emerging Infrastructure-as-a-Service solution poised to become an Internet-wide ubiquitous compute platform. In 2014, Amazon Lambda started the Serverless wave, and a few years later Serverless has extended to the CDN-Edge and beyond the last mile to mobile, IoT, and storage.

This post examines recent innovations in Serverless at the CDN Edge (SAE). SAE is a sea change and a really big deal: it marks the beginning of moving business logic from a single Cloud region out to the edges of the Internet, which may eventually penetrate as far as servers running inside cell phone towers. When 5G arrives, SAE will be only a few milliseconds away from billions of devices, and the Internet will be transformed into a global-scale real-time compute platform.

The journey of being a founder and then selling a NoSQL company, architecting three different NoSQL data stores along the way, led me to realize that computation is currently confined to either the data center or the device: the vast space between the two is largely untapped. So I teamed up with some smart people and we created the startup Kuhirō: a company dedicated to incrementally pushing pieces of the Cloud out to the edge, gradually creating a decentralized cloud very close to end users, a NearCloud.

We decided the foundations of this NearCloud would be compute and data, so we are beginning with a stateful SAE system that will serve as a springboard for subsequent offerings (e.g. ML inference, real-time analytics, etc.). At many CDN edges, we run customer business logic as functions that read and write real-time customer data. We put in the effort to build a CRDT-based data layer that (for the first time ever) delivers low-latency dynamic web processing on shared global data from the CDN edge. Kuhirō enables customers to move the dynamic, latency-sensitive parts of their apps from the cloud to the edge; customer apps become global-scale real-time applications, with Kuhirō handling the operations and scaling.
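
As a taste of what "CRDT-based" means in practice, here is a minimal Python sketch of one classic CRDT, a grow-only counter. It illustrates the general technique only, not Kuhirō's actual data layer.

    # G-Counter: each node increments only its own entry; merging takes
    # the element-wise max, so merges are commutative, associative, and
    # idempotent, and all replicas converge regardless of delivery order.
    class GCounter:
        def __init__(self, node_id: str):
            self.node_id = node_id
            self.counts = {}  # per-node increment totals

        def increment(self, n: int = 1) -> None:
            self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

        def value(self) -> int:
            return sum(self.counts.values())

        def merge(self, other: "GCounter") -> None:
            for node, count in other.counts.items():
                self.counts[node] = max(self.counts.get(node, 0), count)

    # Two edge locations update concurrently, then exchange state:
    a, b = GCounter("edge-east"), GCounter("edge-west")
    a.increment(3); b.increment(2)
    a.merge(b); b.merge(a)
    assert a.value() == b.value() == 5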

Serverless at the Edge Architecture

Click to read more ...

Monday
Oct 17, 2016

Datanet: a New CRDT Database that Lets You Do Bad Bad Things to Distributed Data

 

We've had databases targeting consistency. These are your typical RDBMSs. We've had databases targeting availability. These are your typical NoSQL databases.

If you're using your CAP decoder ring, you know what's next... what databases do we have that target making concurrency a first-class feature? That promise to thrive and continue to function when network partitions occur?

Not many, but we have a brand new concurrency-oriented database: Datanet, a P2P replication system that utilizes CRDT algorithms to allow multiple concurrent actors to modify data and then automatically and sensibly resolve modification conflicts.

Datanet is the creation of Russell Sullivan. Russell spent over three years hidden away in his mad scientist lair researching, thinking, coding, refining, and testing Datanet. You may remember Russell: he has been involved with several articles on HighScalability, and he wrote AlchemyDB, a NoSQL database that was acquired by Aerospike.

So Russell has a feel for what's next. When he built AlchemyDB, he was way ahead of the pack, and now he thinks practical, programmer-friendly CRDTs are what's next. Why?

Concurrency and data locality. To quote Russell:

Datanet lets you ship data to the spot where the action is happening. When the action happens, it is processed locally, so your system's reactivity is insanely quick. This is pretty much the opposite of the non-concurrent case, where you need to go to a specific machine in the cloud to modify a piece of data regardless of where the action takes place. As your system grows, the concurrent approach is superior.

We have been slowly moving away from transactions towards NoSQL for reasons of scalability, availability, robustness, etc. Datanet continues this evolution by taking the next step and moving towards extreme distribution: supporting tons of concurrent writers.

The shift is to more distribution in computation. We went from one app server and one DB, to app-server clusters and clustered DBs, to geographically distributed data centers, and now we are going much further with Datanet: data is distributed anywhere you need it, to a local cache that functions as a database master.

How does Datanet work?

In Datanet, the same piece of data can simultaneously exist as a write-able entity in many, many places in the stack. Datanet is a different way of looking at data: it more closely resembles an internet routing protocol than a traditional client-server database ... and this mirrors the current reality that data is much more in flight than it used to be.
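
To make "automatically and sensibly resolve modification conflicts" concrete, here is a sketch of another classic CRDT rule, a last-writer-wins register. It illustrates the flavor of CRDT conflict resolution, not Datanet's actual algorithms.

    import time

    # LWW register: the write with the larger (timestamp, node_id) pair
    # wins on every replica, so all replicas converge to the same value.
    class LWWRegister:
        def __init__(self, node_id: str):
            self.node_id = node_id
            self.value = None
            self.stamp = (0.0, node_id)  # node_id breaks timestamp ties

        def set(self, value) -> None:
            self.value = value
            self.stamp = (time.time(), self.node_id)

        def merge(self, other: "LWWRegister") -> None:
            if other.stamp > self.stamp:
                self.value, self.stamp = other.value, other.stamp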

What bad bad things can you do to your distributed data? Here's an amazing video of how Datanet recovers quickly, predictably, and automatically from Chaos Monkey level extinction events. It's pretty slick. 

 

Here's an email interview I did with Russell. He goes into a lot more detail about Datanet and what it's all about. I think you will find it interesting. 

Let's start with your name and a little of your background?

Click to read more ...

Wednesday
Sep 7, 2016

Code Generation: The Inner Sanctum of Database Performance

This is a guest post by Drew Paroski, architect and engineering manager at MemSQL. Previously, he worked at Facebook, where he developed HHVM, the popular real-time PHP compiler used across the company’s web-scale applications.

Achieving maximum software efficiency through native code generation can bring superior scaling and performance to any database. And making code generation a first-class citizen of the database, from the beginning, enables a rich set of speed improvements that provide benefits throughout the software architecture and end-user experience.

If you decide to build a code generation system, you need to clearly understand the costs and benefits, which we detail in this article. If you are willing to go all the way in the name of performance, we also detail an approach that saves you time by leveraging existing compiler tools and frameworks, such as LLVM, in a proven and robust way.
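
To illustrate the basic idea (in Python rather than the native code MemSQL emits via LLVM), the sketch below generates and compiles a specialized predicate for a filter once, then reuses it for every row, instead of re-interpreting an expression tree per row. The function names here are hypothetical.

    # Toy query compilation: build specialized source for a filter,
    # compile it once, then evaluate it cheaply per row.
    def compile_filter(column: str, op: str, constant):
        src = f"def _pred(row): return row[{column!r}] {op} {constant!r}"
        namespace = {}
        exec(src, namespace)       # one-time compilation cost
        return namespace["_pred"]  # cheap per-row evaluation

    rows = [{"price": 5}, {"price": 42}, {"price": 7}]
    pred = compile_filter("price", ">", 6)
    print([r for r in rows if pred(r)])  # [{'price': 42}, {'price': 7}]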

Code Generation Basics

Click to read more ...

Thursday
Feb 25, 2016

When Should Approximate Query Processing Be Used?

This is a guest repost by Barzan Mozafari, an assistant professor at the University of Michigan and an advisor to a new startup, snappydata.io, that recently launched an open source OLTP + OLAP database built on Spark.

The growing market for Big Data has created a lot of interest around approximate query processing (AQP) as a means of achieving interactive response times (e.g., sub-second latencies) when faced with terabytes and petabytes of data. At the same time, there is a lot of misinformation about this technology and what it can or cannot do.

Having been involved in building a few academic prototypes and industrial engines for approximate query processing, I have heard many interesting statements about AQP and/or sampling techniques (from both DB vendors and end-users):

Myth #1. Sampling is only useful when you know your queries in advance
Myth #2. Sampling misses out on rare events or outliers in the data
Myth #3. AQP systems cannot handle join queries
Myth #4. It is hard for end-users to use approximate answers
Myth #5. Sampling is just like indexing
Myth #6. Sampling will break the BI tools
Myth #7. There is no point approximating if your data fits in memory

Although there is a grain of truth behind some of these myths, none of them are actually accurate. There are many different forms of sampling, approximation, and error quantification, and their nuances are missed by these blanket statements. In other words, many of these impressions are simply based on wrong assumptions and/or misunderstanding of basic AQP terminology.
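
To ground the terminology, here is a minimal Python sketch of one AQP building block: estimating an average from a uniform random sample and quantifying the error with a confidence interval. Real AQP engines use far more sophisticated sampling and error-estimation machinery; the data here is synthetic.

    import random, statistics

    # Pretend this is a huge table column; here it is just synthetic data.
    population = [random.gauss(100, 15) for _ in range(1_000_000)]

    # Scan a 1% uniform sample instead of the full data set.
    sample = random.sample(population, 10_000)
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / len(sample) ** 0.5
    print(f"avg ~ {mean:.2f} +/- {1.96 * stderr:.2f} (95% CI)")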

Anyhow, instead of going over each of these statements and explaining why they are categorically wrong, in this post I’d like to answer the positive question: When can (and should) one use approximate answers? Note that by asking this question, I am implicitly giving away that I don’t think approximate answers are always useful. A perfect example where you don’t want to use approximation is in billing departments. (Although every time I look at my own Internet bill, I start to think that even this example has its own exceptions. I’m too afraid to mention my Internet provider’s name here but I am sure you can guess).

Anyhow, let’s discuss the key reasons and use-cases for approximate answers.

1. Use AQP when you care about interactive response times

Click to read more ...

Wednesday
Sep 23, 2015

How will new memory technologies impact in-memory databases?

This is a guest post by Yiftach Shoolman, Co-founder & CTO of Redis Labs. Will 3D XPoint change everything? Not as much as you might hope...

Recently, investors, analysts, partners and customers have asked me how the announcement from Intel and Micron about their new 3D XPoint memory technology will affect the in-memory databases market. In these discussions, a common question was “Who needs an in-memory database if all the non in-memory databases will achieve similar performance with 3D XPoint technology?” Well, I think that's a valid question so I've decided to take a moment to describe how we think this technology will influence our market.

First, a little background...

The motivation of Intel and Micron is clear -- DRAM is expensive and hasn’t changed much during the last few years (as shown below). In addition, there are currently only three major makers of DRAM on the planet (Samsung Electronics, Micron, and SK Hynix), which means the competition between them is not as cutthroat as it was several years ago, when there were four or five major manufacturers.

DRAM Price Trends

Click to read more ...

Wednesday
May 20, 2015

Database Scaling Redefined: Scaling Demanding Queries, High Velocity Data Modifications and Fast Indexing All At Once for Big Data

This is a guest post by Cihan Biyikoglu, Director of Product Management at Couchbase.

Question: A few million people are out looking for a setup to efficiently live and interact. What is the most optimized architecture they can use?

  1. Build one giant high-rise for everyone,
  2. Build many single-family homes OR
  3. Build something in between?

Schools, libraries, retail stores, corporate HQs, and homes are all there to optimize a variety of interactions. Sizes of groups and types of exchange vary drastically… It turns out that what we have chosen to do is build all of the above. To optimize different interactions, different architectures make sense.

While high-rises can be effective for interactions among a high density of people on a small amount of land, it is impractical to build 500-story buildings. It is also hard to add or remove floors as you need them. So high-rises feel awfully like scaling up: a cluster of processors communicating over fast memory to compute fast, but with a limited scale ceiling and limited elasticity.

As your home, the single-family architecture works great. A nice backyard to play in and private space for family dinners... You may need to get in your car to interact with other families, BUT it is easy to build more single-family houses: easy elasticity and scale. The single-family structure feels awfully like scaling out, doesn't it? A cluster of commodity machines that communicate over slower networks and come with great elasticity.

“How does this all relate to database scalability?” you ask…

Click to read more ...