Cluster Statuses in Percona Kubernetes Operators

In Kubernetes, all resources have a status field that is separate from their spec. The status field is an interface through which both humans and applications can read the perceived state of the resource.

When you deploy one of our Percona Kubernetes Operators (Percona Operator for MongoDB or Percona Operator for MySQL) in your Kubernetes cluster, you create a custom resource (CR for short), and it has its own status, too. Since Kubernetes operators mimic a human operator and aim to have the expertise required to run software in a Kubernetes cluster, the status of the custom resource should be smart.

You can get the cluster status with kubectl or through the Kubernetes API. For Percona Operator for MySQL, run kubectl get pxc <cluster-name> -o yaml; for Percona Operator for MongoDB, run kubectl get psmdb <cluster-name> -o yaml.
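The same lookup can be scripted. The following is a rough sketch rather than an official client: it shells out to kubectl, asks for JSON, and reads the state. It assumes the state is stored at status.state in the CR and that the pxc and psmdb short names are available in your cluster; the function names are made up for illustration.

```python
import json
import subprocess


def extract_state(cr_json: str) -> str:
    """Return the state of a custom resource serialized as JSON."""
    cr = json.loads(cr_json)
    # Assumption: the state lives at .status.state. A CR that has not been
    # reconciled yet may have no status at all, so fall back to "unknown".
    return cr.get("status", {}).get("state", "unknown")


def get_cluster_state(kind: str, name: str) -> str:
    """Fetch a CR with kubectl and return its state.

    kind is the resource short name, "pxc" or "psmdb".
    """
    result = subprocess.run(
        ["kubectl", "get", kind, name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return extract_state(result.stdout)
```

Calling get_cluster_state("psmdb", "<cluster-name>") with a real cluster name would then return one of the state values described in the sections below.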

The output contains several fields: conditions, cluster size, number of ready cluster members, statuses and versions of different components, and the “state”. In the following sections, we’ll take a look at every possible value of the state field.

Initializing

While the cluster is progressing to readiness, the CR status is “initializing”. This covers creating the cluster, scaling it up or down, and updating the CR in a way that triggers a rolling restart of pods (for instance, changing memory limits in Percona Operator for MySQL).

Percona Operator for MongoDB also reconfigures the replica set config if necessary (for instance, it adds new pods as members of the replica set or removes terminated ones). A replica set in MongoDB is a set of servers that implements replication and automatic failover. Although it shares the name, it is different from the Kubernetes replica set. While this reconfiguration is happening, or if an unknown or unexpected error occurs during it, the status is also “initializing”.
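The membership part of that reconfiguration boils down to a set difference between the current replica set members and the pods that actually exist. A minimal sketch, not the operator's actual code; the function name and the host strings are hypothetical:

```python
def plan_replset_changes(members, running_pods):
    """Return (to_add, to_remove) host lists for a replica set config."""
    current = set(members)
    desired = set(running_pods)
    to_add = sorted(desired - current)     # new pods not yet in the replica set
    to_remove = sorted(current - desired)  # members whose pods are gone
    return to_add, to_remove


# Hypothetical hosts: pod rs0-2 was terminated and pod rs0-3 was added.
to_add, to_remove = plan_replset_changes(
    members=["rs0-0:27017", "rs0-1:27017", "rs0-2:27017"],
    running_pods=["rs0-0:27017", "rs0-1:27017", "rs0-3:27017"],
)
print(to_add, to_remove)
```

Until both lists are empty and the new config has been applied, the CR would stay in “initializing”.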

Since version 1.7.0, the Percona Operator for MySQL can handle full crash recovery if necessary. If a pod waits for the recovery, the cluster status is “initializing”.

Ready

The operator keeps track of the status of each component in the cluster. Percona Operator for MongoDB has the following components:

  1. mongod stateful set
  2. configsvr stateful set if sharding is enabled
  3. mongos deployment if sharding is enabled

Percona Operator for MySQL components:

  1. PXC stateful set
  2. HAProxy stateful set if enabled
  3. ProxySQL stateful set if enabled

All components need to be in “ready” status for the CR to be “ready”. When the number of ready pods controlled by the stateful set reaches the desired number, the operator marks the component as ready. The readiness of the pods is tracked by Kubernetes using readiness probes for each container in the pod. For example, for a Percona XtraDB Cluster container to be ready, “wsrep_cluster_status” needs to be “Primary” and “wsrep_local_state” should be “Synced” or “Donor”. For a Percona Server for MongoDB container to be ready, accepting TCP connections on port 27017 is enough.
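The two checks described above can be written down as small predicates. This is an illustrative sketch only: the real readiness probe runs inside the container and the real component check lives in the operator, and both function names are made up here.

```python
def pxc_container_ready(wsrep_cluster_status: str, wsrep_local_state: str) -> bool:
    """Readiness rule for a Percona XtraDB Cluster container, as described in the text."""
    return (
        wsrep_cluster_status == "Primary"
        and wsrep_local_state in ("Synced", "Donor")
    )


def component_ready(ready_pods: int, desired_pods: int) -> bool:
    """A component is ready once its ready pods reach the desired count."""
    return ready_pods >= desired_pods


print(pxc_container_ready("Primary", "Synced"))
print(component_ready(ready_pods=2, desired_pods=3))
```

With these inputs the first call reports a ready container, while the second reports a component that is still waiting for its third pod.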

But “ready” as the CR status means more than that. It means the cluster (Percona Server for MongoDB or Percona XtraDB Cluster) is up and running and ready to receive traffic. So, even if all components are ready, the cluster status can still be “initializing”. In the Percona Operator for MongoDB, the replica set needs to be initialized and its config needs to be up to date. Also, starting with the 1.9.0 release of both operators, the load balancer needs to be ready if the cluster is exposed with exposeType: LoadBalancer.
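Put together, the decision between “ready” and “initializing” looks roughly like the sketch below. It is a simplification under stated assumptions: the function and parameter names are hypothetical, and the extra conditions default to True for setups where they do not apply (no MongoDB replica set to configure, no LoadBalancer exposure).

```python
def cluster_state(components_ready, replset_ok=True, load_balancer_ready=True):
    """Return "ready" only if every condition described in the text holds."""
    if all(components_ready.values()) and replset_ok and load_balancer_ready:
        return "ready"
    return "initializing"


components = {"mongod": True, "configsvr": True, "mongos": True}

# All components are ready, but the replica set config is not up to date yet.
print(cluster_state(components, replset_ok=False))
# Everything is in place.
print(cluster_state(components))
```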

Stopping

Version 1.9.0 introduced two new statuses:

  1. Stopping
  2. Paused

“Stopping” means the cluster is being paused or deleted and its pods are terminating right now.

If you run kubectl delete psmdb <cluster-name> or kubectl delete pxc <cluster-name>, the resource can be deleted so quickly that you never get a chance to see the “stopping” status. If the CR has finalizers (for example, “delete-pxc-pods-in-order” in Percona Operator for MySQL), deletion is blocked until the finalizer list is exhausted, and you can observe the “stopping” status in the meantime.

Paused

Once the cluster is paused and all pods are terminated, the CR status becomes “paused”.
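The relationship between “stopping” and “paused” can be summarized in a few lines. This is only a sketch of the behavior described here, with hypothetical names; it returns None when neither status applies.

```python
from typing import Optional


def pause_or_delete_state(
    pause_requested: bool, delete_requested: bool, pods_left: int
) -> Optional[str]:
    """Map a pause or delete request to "stopping", "paused", or None."""
    if pods_left > 0 and (pause_requested or delete_requested):
        return "stopping"  # pods are still terminating
    if pause_requested and pods_left == 0:
        return "paused"    # pause requested and every pod is gone
    return None


print(pause_or_delete_state(pause_requested=True, delete_requested=False, pods_left=2))
print(pause_or_delete_state(pause_requested=True, delete_requested=False, pods_left=0))
```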

To pause the cluster: kubectl patch <psmdb|pxc> <cluster-name> --type=merge -p '{"spec": {"pause": true}}'
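If you pause clusters from a script, building the patch with a JSON library avoids shell quoting mistakes. A small sketch that only assembles the kubectl arguments shown above; the helper name is made up, and passing pause=False to resume the cluster is an assumption rather than something covered in this post.

```python
import json


def pause_command(kind: str, name: str, pause: bool = True) -> list:
    """Build the kubectl argument list for toggling spec.pause on a CR."""
    if kind not in ("psmdb", "pxc"):
        raise ValueError("kind must be 'psmdb' or 'pxc'")
    patch = json.dumps({"spec": {"pause": pause}})
    return ["kubectl", "patch", kind, name, "--type=merge", "-p", patch]


cmd = pause_command("pxc", "cluster1")  # "cluster1" is a placeholder name
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to apply the patch.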

Keep in mind that when the cluster is paused and exposeType is LoadBalancer, the load balancers are still there and you continue to pay for them.

Error

Before 1.9.0, “error” status could mean two different things:

  1. An error occurred in the operator during the reconciliation of the CR
  2. One or more pods in a component are not schedulable

With 1.9.0, the “error” status indicates operator errors only. If there is an unschedulable pod, the cluster’s status will be “initializing”. If the cluster is stuck in “initializing” for too long, check the operator logs to investigate.
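The change in meaning can be expressed as a version check. A hedged sketch of the rule as described in this section, not the operator's code; the function name is hypothetical and versions are represented as tuples so they compare correctly.

```python
from typing import Optional, Tuple


def problem_state(
    version: Tuple[int, int, int], operator_error: bool, unschedulable_pod: bool
) -> Optional[str]:
    """Return the state a problem maps to, or None if there is no problem."""
    if operator_error:
        return "error"
    if unschedulable_pod:
        # Before 1.9.0 an unschedulable pod was also reported as "error".
        return "error" if version < (1, 9, 0) else "initializing"
    return None


print(problem_state((1, 8, 0), operator_error=False, unschedulable_pod=True))
print(problem_state((1, 9, 0), operator_error=False, unschedulable_pod=True))
```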

You can try the new statuses in version 1.9.0 of both Percona Operator for MongoDB and Percona Operator for MySQL. Percona Operator for MongoDB 1.9.0 was released in June, and the Percona Operator for MySQL release is on the way.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.
