More and more companies are adopting Kubernetes. For some, it is about being cutting-edge; for others, it is part of a well-defined strategy and a business transformation. Developers and operations teams all over the world are struggling to move applications that aren’t cloud-native to containers and Kubernetes.
Migrating databases is always a challenge that comes with risks and downtime for businesses. Today I’m going to show how easy it is to migrate a PostgreSQL database to Kubernetes with minimal downtime using Percona Distribution for PostgreSQL Operator.
Goal
To perform the migration I’m going to use the following setup:
- PostgreSQL database deployed on-prem or somewhere in the cloud. It will be the Source.
- Google Kubernetes Engine (GKE) cluster where Percona Operator deploys and manages PostgreSQL cluster (the Target) and pgBackRest Pod
- PostgreSQL backups and Write Ahead Logs are uploaded to some Object Storage bucket (GCS in my case)
- pgBackRest Pod reads the data from the bucket
- pgBackRest Pod restores the data continuously to the PostgreSQL cluster in Kubernetes
The data should be continuously synchronized. In the end, I want to shut down PostgreSQL running on-prem and only keep the cluster in GKE.
Migration
Prerequisites
To replicate the setup you will need the following:
- PostgreSQL (v 12 or 13) running somewhere
- pgBackRest installed
- Google Cloud Storage or any S3 bucket. My examples will be about GCS.
- Kubernetes cluster
Configure The Source
I have Percona Distribution for PostgreSQL version 13 running on some Linux machines.
1. Configure pgBackRest
```ini
# cat /etc/pgbackrest.conf
[global]
log-level-console=info
log-level-file=debug
start-fast=y

[db]
pg1-path=/var/lib/postgresql/13/main
repo1-type=gcs
repo1-gcs-bucket=sp-test-1
repo1-gcs-key=/tmp/gcs.key
repo1-path=/on-prem-pg
```
- pg1-path should point to the PostgreSQL data directory
- repo1-type is set to gcs, as we want our backups to go there
- The key is in the /tmp/gcs.key file. The key can be obtained through the Google Cloud UI. Read more about it here.
- The backups are going to be stored in the on-prem-pg folder of the sp-test-1 bucket
2. Edit postgresql.conf to enable archiving through pgBackRest
```ini
archive_mode = on
archive_command = 'pgbackrest --stanza=db archive-push %p'
```
A restart of PostgreSQL is required after changing the configuration.
3. The Operator requires a postgresql.conf file in the data directory. An empty file is enough:
```shell
touch /var/lib/postgresql/13/main/postgresql.conf
```
4. A primaryuser must be created on the Source so that replication can be correctly set up by the Operator.
```sql
create user primaryuser with encrypted password '<PRIMARYUSER PASSWORD>' replication;
```
Configure The Target
1. Deploy Percona Distribution for PostgreSQL Operator on Kubernetes. Read more about it in the documentation here.
```shell
# create the namespace
kubectl create namespace pgo

# clone the git repository
git clone -b v0.2.0 https://github.com/percona/percona-postgresql-operator/
cd percona-postgresql-operator

# deploy the operator
kubectl apply -f deploy/operator.yaml
```
2. Edit main custom resource manifest – deploy/cr.yaml.
- I’m going to keep the default cluster name, cluster1
- The cluster is going to operate in Standby mode, which means it will sync the data from the GCS bucket. Set spec.standby to true.
- Configure GCS itself. The spec.backup section should look like this (bucket and repoPath are the same as in the pgBackRest configuration above):
```yaml
backup:
  ...
  repoPath: "/on-prem-pg"
  ...
  storages:
    my-s3:
      type: gcs
      endpointUrl: https://storage.googleapis.com
      region: us-central1-a
      uriStyle: path
      verifyTLS: false
      bucket: sp-test-1

  storageTypes: [ "gcs" ]
```
- I would like to have at least one Replica in my PostgreSQL cluster. Set spec.pgReplicas.hotStandby.size to 1.
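Taken together, the standby-related changes to deploy/cr.yaml boil down to the following fields (a sketch; the field paths follow the names used above, and everything else in cr.yaml stays at its defaults):

```yaml
spec:
  standby: true        # sync from the GCS bucket instead of accepting writes
  pgReplicas:
    hotStandby:
      size: 1          # one hot-standby replica in addition to the primary
```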
3. The Operator should be able to authenticate with GCS. To do that, we need to create a Secret object called <CLUSTERNAME>-backrest-repo-config with gcs-key in its data. It should be the same key we used on the Source. See the example of this Secret here.
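For a cluster named cluster1, the gcs.yaml manifest could look like this (a sketch; the gcs-key value must be the base64-encoded contents of the same key file used on the Source, and the pgo namespace is assumed from the deployment step above):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
  namespace: pgo
type: Opaque
data:
  gcs-key: <base64-encoded contents of /tmp/gcs.key>
```

Alternatively, `kubectl create secret generic cluster1-backrest-repo-config --from-file=gcs-key=/tmp/gcs.key -n pgo` creates the same Secret without encoding the key by hand.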
```shell
kubectl apply -f gcs.yaml
```
4. Create the users by creating Secret objects: postgres and primaryuser (the one we created on the Source). See the examples of the user Secrets here. The passwords should be the same as on the Source.
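Assuming the Secret naming convention <CLUSTERNAME>-<username>-secret with username and password data keys (check the linked examples for the exact format your Operator version expects), users.yaml could look like this:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster1-postgres-secret
  namespace: pgo
type: Opaque
stringData:
  username: postgres
  password: <POSTGRES PASSWORD>
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster1-primaryuser-secret
  namespace: pgo
type: Opaque
stringData:
  username: primaryuser
  password: <PRIMARYUSER PASSWORD>
```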
```shell
kubectl apply -f users.yaml
```
5. Now let’s deploy our cluster on Kubernetes by applying the cr.yaml:
```shell
kubectl apply -f deploy/cr.yaml
```
Verify and Troubleshoot
If everything is done correctly you should see the following in the Primary Pod logs:
```
kubectl -n pgo logs -f --tail=20 cluster1-5dfb96f77d-7m2rs

2021-07-30 10:41:08,286 INFO: Reaped pid=548, exit status=0
2021-07-30 10:41:08,298 INFO: establishing a new patroni connection to the postgres cluster
2021-07-30 10:41:08,359 INFO: initialized a new cluster
Fri Jul 30 10:41:09 UTC 2021 INFO: PGHA_INIT is 'true', waiting to initialize as primary
Fri Jul 30 10:41:09 UTC 2021 INFO: Node cluster1-5dfb96f77d-7m2rs fully initialized for cluster cluster1 and is ready for use
2021-07-30 10:41:18,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:18,810 INFO: no action. i am the standby leader with the lock
2021-07-30 10:41:28,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:28,832 INFO: no action. i am the standby leader with the lock
```
Change some data on the Source and ensure that it is properly synchronized to the Target cluster.
Common Issues
The following error message indicates that you forgot to create the postgresql.conf file in the data directory:
```
FileNotFoundError: [Errno 2] No such file or directory: '/pgdata/cluster1/postgresql.conf' -> '/pgdata/cluster1/postgresql.base.conf'
```
It is also easy to forget to create the primaryuser, in which case you will see the following in the logs:
```
psycopg2.OperationalError: FATAL: password authentication failed for user "primaryuser"
```
Wrong or missing object store credentials will trigger the following error:
```
WARN: repo1: [CryptoError] unable to load info file '/on-prem-pg/backup/db/backup.info' or '/on-prem-pg/backup/db/backup.info.copy':
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218529960] wrong tag
      HINT: is or was the repo encrypted?
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218595386] nested asn1 error
      HINT: is or was the repo encrypted?
      HINT: backup.info cannot be opened and is required to perform a backup.
      HINT: has a stanza-create been performed?
ERROR: [075]: no backup set found to restore
Fri Jul 30 10:54:00 UTC 2021 ERROR: pgBackRest standby Creation: pgBackRest restore failed when creating standby
```
Cutover
Everything looks good and it is time to perform the cutover. In this blog post, I cover only the database side but do not forget that your application should be reconfigured to point to the correct PostgreSQL cluster. It might be a good idea to stop the application before the cutover.
1. Stop the source PostgreSQL cluster to ensure no more data is written:
```shell
systemctl stop postgresql
```
2. Promote the Target cluster to primary. To do that, remove spec.backup.repoPath, change spec.standby to false in deploy/cr.yaml, and apply the changes:
```shell
kubectl apply -f deploy/cr.yaml
```
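For reference, after the edit the standby-related part of deploy/cr.yaml would look roughly like this (a sketch; repoPath is removed from spec.backup rather than set to an empty value):

```yaml
spec:
  standby: false       # the cluster now accepts writes as the primary
  backup:
    # repoPath removed
    ...
```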
PostgreSQL will be restarted automatically and you will see the following in the logs:
```
2021-07-30 11:16:20,020 INFO: updated leader lock during promote
2021-07-30 11:16:20,025 INFO: Changed archive_mode from on to True (restart might be required)
2021-07-30 11:16:20,025 INFO: Changed max_wal_senders from 10 to 6 (restart might be required)
2021-07-30 11:16:20,027 INFO: Reloading PostgreSQL configuration.
server signaled
2021-07-30 11:16:21,037 INFO: Lock owner: cluster1-5dfb96f77d-n4c79; I am cluster1-5dfb96f77d-n4c79
2021-07-30 11:16:21,132 INFO: no action. i am the leader with the lock
```
Conclusion
Deploying and managing database clusters is not an easy task. The recently released Percona Distribution for PostgreSQL Operator automates day-1 and day-2 operations and turns running PostgreSQL on Kubernetes into a smooth and pleasant journey.
With Kubernetes becoming the default control plane, one of the most common tasks for developers and operations teams is migrating existing applications to it, which usually turns into a complex project. This blog post shows that a database migration can be an easy task with minimal downtime.
We encourage you to try out our Operator. See our GitHub repository and check out the documentation.
Found a bug or have a feature idea? Feel free to submit it in JIRA.
For general questions please raise the topic in the community forum.
Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a pull request.
Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.
Download Percona Distribution for PostgreSQL Today!