Self-Healing Feature in Percona Distribution for MySQL Operator

In the previous release of our Percona Distribution for MySQL Operator, we implemented an interesting feature which can be seen as “self-healing”: https://jira.percona.com/browse/K8SPXC-564.

I do not think it got enough attention, so I want to write more about this.

As is well known, a 3-node cluster can survive the crash of one node (or pod, in Kubernetes terminology), and the cluster handles this case well on its own. However, if there is a problem with two nodes at the same time, the scenario becomes problematic for Percona XtraDB Cluster. Let’s see why.

First, let’s review what happens if one node goes offline:

 

[Diagram: MySQL Operator Node]

In this case, the cluster can continue working, because Node 1 and Node 2 figure out that they can still form a majority, so they establish a two-node cluster and continue operations.

Now, let’s take a look at what happens if Node 2 also becomes unresponsive.

 

[Diagram: MySQL Kubernetes Operator]

When this happens, Node 1 loses communication with both Node 2 and Node 3, and it has no choice but to declare itself offline in order to prevent a split-brain situation. Node 1 has no way to know whether Node 2 crashed or whether the problem is only in the network link between Node 1 and Node 2; and if it is only the network link, then Node 2 could in theory still be accepting user updates over a different link.

So in this case Node 1 goes offline, and the only way to resolve the situation is human intervention. This is how our standard deployments of Percona XtraDB Cluster behave, and it was also the case for the Kubernetes Operator up until version 1.7.0.

Let’s see how the problem manifests itself in Percona Distribution for MySQL Operator 1.6.0. Assume we have a functioning cluster with all three pods up and running.
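A quick way to verify that, assuming the default cluster1 deployment in the current namespace, is to list the pods:

kubectl get pods

All three cluster1-pxc-N pods should show as Running and Ready at this point.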

Now, to emulate the crash of two nodes, I kill mysqld in one pod and then, about 10 seconds later, in a second one.
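One way to do this (a sketch, not necessarily the exact commands; pod and container names assume the default cluster1-pxc StatefulSet, and pidof is assumed to be available in the image) is:

kubectl exec cluster1-pxc-1 -c pxc -- bash -c 'kill -9 $(pidof mysqld)'
# wait about 10 seconds, then kill mysqld in the second pod
kubectl exec cluster1-pxc-2 -c pxc -- bash -c 'kill -9 $(pidof mysqld)'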

So two pods are killed, and in the logs of cluster1-pxc-0 we can see the remaining node lose quorum and switch to the NON-PRIMARY state. It will stay NON-PRIMARY until we manually do something to resolve it.
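You can confirm this state directly from MySQL on the surviving node (a quick check, assuming the default pod name and that you have the root credentials at hand):

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -p -e "SHOW STATUS LIKE 'wsrep_cluster_status'"

On a healthy node this variable reports Primary; here it reports non-Primary.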

Auto-Recovery in Percona Distribution for MySQL Operator 1.7.0

In version 1.7.0, we decided to improve how the Operator handles this kind of crash and make the recovery less manual and more automatic. After all, let’s recall the goal of the Operator pattern in general (a quote from https://kubernetes.io/docs/concepts/extend-kubernetes/operator/#motivation):

“The Operator pattern aims to capture the key aim of a human operator who is managing a service or set of services. Human operators who look after specific applications and services have deep knowledge of how the system ought to behave, how to deploy it, and how to react if there are problems.

People who run workloads on Kubernetes often like to use automation to take care of repeatable tasks. The Operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides.”

So we decided to add more automation and make the full cluster crash less painful.

How did we achieve this? On one hand, it may look like more of a brute force approach, but on the other, we added sophistication to make sure there is no data loss.

The brute force part: we forcefully reboot all cluster pods. The sophisticated part: after the restart, the Operator finds the pod with the most advanced GTID position, which means it has all the most recent data; this pod starts first (it works as the bootstrap node), and the other pods re-join the cluster, using IST or SST if necessary. A rough sketch of the selection idea follows.
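This is only an illustration of the logic, not the Operator’s actual code; recovered_seqno is a hypothetical helper, and in a real recovery the position would come from each node’s "mysqld --wsrep-recover" output:

best_pod=""
best_seqno=-1
for pod in cluster1-pxc-0 cluster1-pxc-1 cluster1-pxc-2; do
  # recovered_seqno is a hypothetical helper for this illustration
  seqno=$(recovered_seqno "$pod")
  # keep the pod with the highest recovered sequence number
  if [ "$seqno" -gt "$best_seqno" ]; then
    best_seqno=$seqno
    best_pod=$pod
  fi
done
echo "bootstrap the cluster from $best_pod (seqno $best_seqno)"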

Let’s see what happens in the new Operator (I use the 1.8.0 version).

The setup and the steps to get into the non-operational state are the same as before.

But let’s take a look at the logs of the cluster1-pxc-0 pod. They show exactly what I described: the Operator enforced the full cluster crash and then performed the recovery starting from the pod with the most recent data.
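If you want to follow along, the recovery can be watched with standard kubectl commands (pod and container names again assume the default deployment):

kubectl logs -f cluster1-pxc-0 -c pxc   # follow the crash-recovery messages
kubectl get pods -w                     # watch the pods restart and become Ready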

And we can see that the cluster recovered from the non-operational state. Happy Clustering with our Percona Distribution for MySQL Operator!
