Galera Cache (gcache) is finally recoverable on restart

This post describes how to recover Galera Cache (or gcache) on restart.

Recently Codership introduced (with Galera 3.19) a very important and long awaited feature. Now users can recover Galera cache on restart.

Need

If you gracefully shutdown cluster nodes one after another, with some lag time between nodes, then the last node to shutdown holds the latest data. Next time you restart the cluster, the last node shutdown will be the first one to boot. Any followup nodes that join the cluster after the first node will demand an SST.

Why SST, when these nodes already have data and only few write-sets are missing? The DONOR node caches missing write-sets in Galera cache, but on restart this cache is wiped clean and restarted fresh. So the DONOR node doesn’t have a Galera cache to donate missing write-sets.

This painful set up made it necessary for users to think and plan before gracefully taking down the cluster. With the introduction of this new feature, the user can retain the Galera cache.

How does this help ?

On restart, the node will revive the galera-cache. This means the node can act as a DONOR and service missing write-sets (facilitating IST, instead of using SST). This option to retain the galera-cache is controlled by an option named gcache.recover=yes/no. The default is NO (Galera cache is not retained). The user can set this option for all nodes, or selective nodes, based on disk usage.

gcache.recover in action

The example below demonstrates how to use this option:

Let’s say the user has a three node cluster (n1, n2, n3), with all in sync.
The user gracefully shutdown n2 and n3.
n1 is still up and running, and processes some workload, so now n1 has latest data.
n1 is eventually shutdown.
Now the user decides to restart the cluster. Obviously, the user needs to start n1 first, followed by n2/n3.
n1 boots up, forming an new cluster.
n2 boots up, joins the cluster, finds there are missing write-sets and demands IST but given that n1 doesn’t have a gcache, it falls back to SST.

n2 (JOINER node log):

2016-11-18 13:11:06 3277 [Note] WSREP: State transfer required: 
 Group state: 839028c7-ad61-11e6-9055-fe766a1886c3:4680
 Local state: 839028c7-ad61-11e6-9055-fe766a1886c3:3893

2016-11-18 13:11:06 3277 [Note] WSREP: State transfer required:

Group state: 839028c7-ad61-11e6-9055-fe766a1886c3:4680

Local state: 839028c7-ad61-11e6-9055-fe766a1886c3:3893

n1 (DONOR node log), gcache.recover=no:

2016-11-18 13:11:06 3245 [Note] WSREP: IST request: 839028c7-ad61-11e6-9055-fe766a1886c3:3893-4680|tcp://192.168.1.3:5031
2016-11-18 13:11:06 3245 [Note] WSREP: IST first seqno 3894 not found from cache, falling back to SST

1 2	2016-11-18 13:11:06 3245 [Note] WSREP: IST request: 839028c7-ad61-11e6-9055-fe766a1886c3:3893-4680\|tcp://192.168.1.3:5031 2016-11-18 13:11:06 3245 [Note] WSREP: IST first seqno 3894 not found from cache, falling back to SST

Now let’s re-execute this scenario with gcache.recover=yes.

n2 (JOINER node log):

2016-11-18 13:24:38 4603 [Note] WSREP: State transfer required: 
 Group state: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:1495
 Local state: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:769
....
2016-11-18 13:24:41 4603 [Note] WSREP: Receiving IST: 726 writesets, seqnos 769-1495
....
2016-11-18 13:24:49 4603 [Note] WSREP: IST received: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:1495

2016-11-18 13:24:38 4603 [Note] WSREP: State transfer required:

Group state: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:1495

Local state: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:769

....

2016-11-18 13:24:41 4603 [Note] WSREP: Receiving IST: 726 writesets, seqnos 769-1495

....

2016-11-18 13:24:49 4603 [Note] WSREP: IST received: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:1495

n1 (DONOR node log):

2016-11-18 13:24:38 4573 [Note] WSREP: IST request: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:769-1495|tcp://192.168.1.3:5031
2016-11-18 13:24:38 4573 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

1 2	2016-11-18 13:24:38 4573 [Note] WSREP: IST request: ee8ef398-ad63-11e6-92ed-d6c0646c9f13:769-1495\|tcp://192.168.1.3:5031 2016-11-18 13:24:38 4573 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

You can also validate this by checking the lowest write-set available in gcache on the DONOR node.

mysql> show status like 'wsrep_local_cached_downto';
+---------------------------+-------+
| Variable_name | Value |
+---------------------------+-------+
| wsrep_local_cached_downto | 1 |
+---------------------------+-------+
1 row in set (0.00 sec)

mysql> show status like 'wsrep_local_cached_downto';

+---------------------------+-------+

| Variable_name | Value |

+---------------------------+-------+

| wsrep_local_cached_downto | 1 |

+---------------------------+-------+

1 row in set (0.00 sec)

So as you can see, gcache.recover could restore the cache on restart and help service IST over SST. This is a major resource saver for most of those graceful shutdowns.

gcache revive doesn’t work if . . .

If gcache pages are involved. Gcache pages are still removed on shutdown, and the gcache write-set until that point also gets cleared.

Again let’s see and example:

Let’s assume the same configuration and workflow as mentioned above. We will just change the workload pattern.
n1, n2, n3 are in sync and an average-size workload is executed, such that the write-set fits in the gcache. (seqno=1-x)
n2 and n3 are shutdown.
n1 continues to operate and executes some average size workload followed by a huge transaction that results in the creation of a gcache page. (1-x-a-b-c-h) [h represent transaction seqno]
Now n1 is shutdown. During shutdown, gcache pages are purged (irrespective of the keep_page_sizes setting).
The purge ensures that all the write-sets that has seqno smaller than gcache-page-residing write-set are purged, too. This effectively means (1-h) everything is removed, including (a,b,c).
On restart, even though n1 can revive the gcache it can’t revive anything, as all the write-sets are purged.
When n2 boots up, it requests IST, but n1 can’t service the missing write-set (a,b,c,h). This causes SST to take place.

Summing it up

Needless to say, gcache.recover is a much needed feature, given it saves SST pain. (Thanks Codership.) It would be good to see if the feature can be optimized to work with gcache pages.

And yes, Percona XtraDB Cluster inherits this feature in its upcoming release.

5 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Shrinivasa

7 years ago

Hi Krunal,

As per documentation, majority of nodes should be up for cluster to function. But how come both node 1 is able to serve request when other 2 nodes are down.

Krunal Bauskar

Author

7 years ago

You can have cluster with single node too. When you first boot the node you have cluster with single node.
If you have 2 node cluster and node-2 leaves the cluster gracefully (user shutdown) that is not treated as split-brain as before going off node-2 communicate its graceful shutdown status to node-1.

Raghupradeep

5 years ago

Does bootstrapping the first node clears the gcache ?

Krunal Bauskar

Reply to Raghupradeep

5 years ago

By bootstrapping I presume it is bootstrapping the cluster. Gcache will be cleared if gcache.recover=no.

trangtriqc

3 years ago

I am wondering whether Gcache always save the last data which fit gcache.size whenever database has new data?

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Galera Cache (gcache) is finally recoverable on restart

Need

How does this help ?

gcache.recover in action

gcache revive doesn’t work if . . .

Summing it up

Related

Related Blog Articles

RECOMMENDED ARTICLES

High Availability: Choosing the Right Option for Your Percona Monitoring and Management

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Galera Cache (gcache) is finally recoverable on restart

Need

How does this help ?

gcache.recover in action

gcache revive doesn’t work if . . .

Summing it up

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

High Availability: Choosing the Right Option for Your Percona Monitoring and Management

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation