Backup management is the backbone of any reliable database administration strategy. pgBackRest simplifies it, but by default it couples two operations: after every successful backup, it runs the expire command to remove backups that fall outside the retention policy. With slow network bandwidth, very large databases, or other resource constraints, expiring (that is, deleting) older backups can take longer than taking the backup itself. Separating the two operations addresses this. Here, we’ll dig into how to break free from the conventional approach by decoupling backup and expiry operations in pgBackRest.
Why separate backups and expiry?
Separating backups and expiry provides several benefits:
- Faster backups: Backups can be performed more frequently without being slowed down by the potentially time-consuming expiration process.
- Improved flexibility: One can customize the scheduling of backups and expiry independently to meet specific requirements.
- Reduced backup window: By running backups more frequently and expiring backups less frequently, the impact on the system during backup operations can be minimized.
Steps to separate backups and expiry
1. Configuration adjustment
Open the pgbackrest.conf file, usually located in the pgBackRest configuration directory. Adjust the retention settings to match the backup strategy: comment out the retention setting for backups (repo1-retention-full) and set only the retention type for archives.
    [global]
    repo1-retention-archive-type=full
In this example, no backup retention is configured, so backups are kept indefinitely. Archive retention is tied to full backups, so the WAL archives needed to restore a full backup consistently are kept and restores do not fail for lack of archives.
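Before relying on this setup, it can help to confirm that the backup retention option really is commented out. The following is a minimal sketch, not part of pgBackRest: `retention_is_set` is a hypothetical helper that only checks for an uncommented `option=value` line in the file it is given.

```shell
# Hypothetical helper (not a pgBackRest command).
# retention_is_set FILE OPTION
# Returns 0 if OPTION appears as an uncommented "option=value" line in FILE,
# and 1 otherwise (including when the line is commented out with '#').
retention_is_set() {
    file="$1"
    option="$2"
    grep -Eq "^[[:space:]]*${option}[[:space:]]*=" "$file"
}

# Usage (config path taken from the examples in this article):
#   if retention_is_set /etc/pgbackrest/pgbackrest.conf repo1-retention-full; then
#       echo "repo1-retention-full is set: the automatic expire will remove backups"
#   else
#       echo "repo1-retention-full is not set: backups are kept until expire is run by hand"
#   fi
```

The pattern is anchored at the start of the line, so `repo1-retention-full-type` does not count as a match for `repo1-retention-full`.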
2. Execute backups
Run the backup command to perform a backup.
    $ pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf --log-level-console=info backup --type=full
    2024-01-31 09:13:01.375 P00 INFO: backup command begin 2.48: --config=/etc/pgbackrest/pgbackrest.conf --exec-id=18737-d1ca38f0 --log-level-console=info --pg1-path=/var/lib/postgresql/16/main --repo1-path=/var/lib/pgbackrest --repo1-retention-archive-type=full --stanza=main --start-fast --type=full
    WARN: option 'repo1-retention-full' is not set for 'repo1-retention-full-type=count', the repository may run out of space
    HINT: to retain full backups indefinitely (without warning), set option 'repo1-retention-full' to the maximum.
    2024-01-31 09:13:02.086 P00 INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
    2024-01-31 09:13:02.587 P00 INFO: backup start archive = 000000060000000100000057, lsn = 1/57000028
    2024-01-31 09:13:02.587 P00 INFO: check archive for prior segment 000000060000000100000056
    2024-01-31 09:13:26.215 P00 INFO: execute non-exclusive backup stop and wait for all WAL segments to archive
    2024-01-31 09:13:26.422 P00 INFO: backup stop archive = 000000060000000100000057, lsn = 1/57000138
    2024-01-31 09:13:26.426 P00 INFO: check archive for segment(s) 000000060000000100000057:000000060000000100000057
    2024-01-31 09:13:26.542 P00 INFO: new backup label = 20240131-091301F
    2024-01-31 09:13:26.598 P00 INFO: full backup size = 512.3MB, file total = 1195
    2024-01-31 09:13:26.599 P00 INFO: backup command end: completed successfully (25227ms)
    2024-01-31 09:13:26.599 P00 INFO: expire command begin 2.48: --config=/etc/pgbackrest/pgbackrest.conf --exec-id=18737-d1ca38f0 --log-level-console=info --repo1-path=/var/lib/pgbackrest --repo1-retention-archive-type=full --stanza=main
    2024-01-31 09:13:26.608 P00 INFO: option 'repo1-retention-archive' is not set - archive logs will not be expired
    2024-01-31 09:13:26.608 P00 INFO: expire command end: completed successfully (9ms)
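The warning about repo1-retention-full is expected, because that option is deliberately left unset. Notice also that the expire command still runs at the end of the backup, but with no retention configured it has nothing to remove and finishes in 9 ms.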
3. Execute expiry
Let’s check the backup info and see how many backups are available in the repo before expiring the backups:
    $ pgbackrest info
    stanza: main
        status: ok
        cipher: none
        db (current)
            wal archive min/max (16): 000000060000000100000057/00000006000000010000005F
            full backup: 20240131-091301F
                timestamp start/stop: 2024-01-31 09:13:01+00 / 2024-01-31 09:13:26+00
                wal start/stop: 000000060000000100000057 / 000000060000000100000057
                database size: 512.3MB, database backup size: 512.3MB
                repo1: backup set size: 59MB, backup size: 59MB
            full backup: 20240131-093948F
                timestamp start/stop: 2024-01-31 09:39:48+00 / 2024-01-31 09:40:12+00
                wal start/stop: 000000060000000100000059 / 000000060000000100000059
                database size: 512.3MB, database backup size: 512.3MB
                repo1: backup set size: 59MB, backup size: 59MB
            diff backup: 20240131-093948F_20240131-094022D
                timestamp start/stop: 2024-01-31 09:40:22+00 / 2024-01-31 09:40:23+00
                wal start/stop: 00000006000000010000005B / 00000006000000010000005B
                database size: 512.3MB, database backup size: 8.3KB
                repo1: backup set size: 59MB, backup size: 440B
                backup reference list: 20240131-093948F
            diff backup: 20240131-093948F_20240131-094033D
                timestamp start/stop: 2024-01-31 09:40:33+00 / 2024-01-31 09:40:35+00
                wal start/stop: 00000006000000010000005C / 00000006000000010000005D
                database size: 512.3MB, database backup size: 8.3KB
                repo1: backup set size: 59MB, backup size: 442B
                backup reference list: 20240131-093948F
            full backup: 20240131-094044F
                timestamp start/stop: 2024-01-31 09:40:44+00 / 2024-01-31 09:41:07+00
                wal start/stop: 00000006000000010000005F / 00000006000000010000005F
                database size: 512.3MB, database backup size: 512.3MB
                repo1: backup set size: 59MB, backup size: 59MB
Run the expire command, passing the retention on the command line, to remove outdated backups.
    $ pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf --log-level-console=detail expire --repo1-retention-full=1
    2024-01-31 09:43:01.502 P00 INFO: expire command begin 2.48: --config=/etc/pgbackrest/pgbackrest.conf --exec-id=19111-dc5f56ed --log-level-console=detail --repo1-path=/var/lib/pgbackrest --repo1-retention-archive-type=full --repo1-retention-full=1 --stanza=main
    2024-01-31 09:43:01.504 P00 INFO: repo1: expire full backup 20240131-091301F
    2024-01-31 09:43:01.504 P00 INFO: repo1: expire full backup set 20240131-093948F, 20240131-093948F_20240131-094022D, 20240131-093948F_20240131-094033D
    2024-01-31 09:43:01.513 P00 INFO: repo1: remove expired backup 20240131-093948F_20240131-094033D
    2024-01-31 09:43:01.513 P00 INFO: repo1: remove expired backup 20240131-093948F_20240131-094022D
    2024-01-31 09:43:01.513 P00 INFO: repo1: remove expired backup 20240131-093948F
    2024-01-31 09:43:01.548 P00 INFO: repo1: remove expired backup 20240131-091301F
    2024-01-31 09:43:01.595 P00 DETAIL: repo1: 16-1 archive retention on backup 20240131-094044F, start = 00000006000000010000005F
    2024-01-31 09:43:01.596 P00 INFO: repo1: 16-1 remove archive, start = 000000060000000100000057, stop = 00000006000000010000005E
    2024-01-31 09:43:01.596 P00 INFO: expire command end: completed successfully (96ms)
Notice that we did not pass any option for expiring differential backups. When a full backup expires, pgBackRest automatically removes the differential backups that depend on it.
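When expiry runs unattended, it is useful to record exactly which backups it removed. As a rough sketch, the hypothetical helper below (`removed_backups` is not a pgBackRest command) filters expire console output for the "remove expired backup" lines seen in the log above and prints only the labels. It assumes the message wording matches the 2.48 output shown in this article.

```shell
# Hypothetical helper (not a pgBackRest command).
# removed_backups: read pgBackRest expire output on stdin and print the label
# of every backup reported as removed, one per line.
removed_backups() {
    sed -n 's/.*remove expired backup \([^[:space:]]*\).*/\1/p'
}

# Usage (command taken from this article; 2>&1 also captures anything
# written to stderr):
#   pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf \
#       --log-level-console=info expire --repo1-retention-full=1 2>&1 | removed_backups
```

If the helper prints nothing, the expire run did not remove any backup.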
Now let’s again check the backup information to find out which backup sets were expired and which were retained.
    $ pgbackrest info
    stanza: main
        status: ok
        cipher: none
        db (current)
            wal archive min/max (16): 00000006000000010000005F/000000060000000100000061
            full backup: 20240131-094044F
                timestamp start/stop: 2024-01-31 09:40:44+00 / 2024-01-31 09:41:07+00
                wal start/stop: 00000006000000010000005F / 00000006000000010000005F
                database size: 512.3MB, database backup size: 512.3MB
                repo1: backup set size: 59MB, backup size: 59MB
            diff backup: 20240131-094044F_20240131-094124D
                timestamp start/stop: 2024-01-31 09:41:24+00 / 2024-01-31 09:41:25+00
                wal start/stop: 000000060000000100000061 / 000000060000000100000061
                database size: 512.3MB, database backup size: 8.3KB
                repo1: backup set size: 59MB, backup size: 442B
                backup reference list: 20240131-094044F
As expected, one full backup was kept, as requested by the option passed to the expire command. The differential backup that depends on that full backup was retained as well, together with the WAL archived from the start of the full backup onward, so PITR remains possible within the range covered by the retained backup and archives.
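The same check can be scripted rather than read by eye. This is a small sketch built on the text layout of the info output shown above; `count_backups` is a hypothetical helper, and it assumes each backup is listed on a line beginning with its type followed by "backup:".

```shell
# Hypothetical helper (not a pgBackRest command).
# count_backups TYPE
# Read "pgbackrest info" text output on stdin and print how many backups of
# TYPE (full, diff, or incr) it lists.
count_backups() {
    type="$1"
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the helper's exit status at 0 in that case.
    grep -c "^[[:space:]]*${type} backup: " || true
}

# Usage:
#   pgbackrest --stanza=main info | count_backups full
```

Because the pattern is anchored to the start of the line, lines such as "database backup size" or "backup reference list" are not counted.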
4. Schedule backups and expiry
Create separate cron jobs or scheduled tasks for backups and expiry based on the preferred frequency.
    # Example backup cron job (run diff backups on weekdays)
    0 3 * * 1-5 pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf backup --type=diff

    # Example backup cron job (run full backups on Saturday)
    0 3 * * 6 pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf backup --type=full

    # Example expiry cron job (run less frequently, e.g., on Sunday)
    0 4 * * 0 pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf expire --repo1-retention-full=1
In this example, differential backups run on weekdays at 3 AM, a full backup runs on Saturdays at 3 AM, and expiry runs weekly on Sundays at 4 AM. Notice that the expire command passes --repo1-retention-full=1, which keeps one full backup. pgBackRest automatically expires the dependent differential backups along with each expired full backup.
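Since the retention count now lives on a command line rather than in the config file, a typo in the cron entry could have unintended effects. One possible safeguard, sketched here under stated assumptions, is to call expire through a small wrapper that rejects anything other than a positive integer and reports how long the run took. `valid_retention` and `run_expire` are hypothetical names; the stanza and config path are the ones used in this article.

```shell
# Hypothetical wrapper functions (not part of pgBackRest).

# valid_retention VALUE
# Returns 0 only if VALUE is a positive integer with no leading zero.
valid_retention() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

# run_expire RETAIN
# Validate RETAIN, run pgbackrest expire with it, and print the elapsed time.
# Returns 2 on invalid input, otherwise the exit status of pgbackrest.
run_expire() {
    retain="$1"
    if ! valid_retention "$retain"; then
        echo "refusing to expire: retention must be a positive integer, got '$retain'" >&2
        return 2
    fi
    start=$(date +%s)
    rc=0
    pgbackrest --stanza=main --config=/etc/pgbackrest/pgbackrest.conf \
        expire --repo1-retention-full="$retain" || rc=$?
    end=$(date +%s)
    echo "expire finished with status $rc after $((end - start))s"
    return "$rc"
}

# Usage, e.g. from a script called by the Sunday cron entry:
#   run_expire 1
```

Logging the elapsed time also gives a simple way to track how expiry duration compares with backup duration over time.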
Conclusion
This approach keeps backup operations swift and responsive, even as the database grows. I saw this issue with one of our clients, whose backup repository was in the cloud: expiry took three to four times longer than the backup, which disrupted several other tasks. We also used the file bundle option of pgBackRest, which reduced the time taken for the backup, but expiry still took longer. Decoupling backup and expiry operations with pgBackRest isn’t just a configuration tweak; it’s a strategic move toward a more resilient and adaptable database management approach.