Percona Backup for MongoDB v1.2 is out! In a nutshell it offers (much) faster compression choices, backup cancellation, and assorted convenience features such as a universal backup deletion command and progress logging.
And, for the DBAs out there in the field, we’ve added a whole extra binary (pbm-speed-test) that gives you a quick way to investigate your speed limit: is it the remote storage’s write speed? The compression algorithm? A network bottleneck?
The New Convenience Features
pbm delete-backup
Whether you’re using S3, GCS, MinIO, or a remote filesystem server as your backup storage, you can now drop a backup snapshot with the same command. E.g.
pbm delete-backup 2020-04-20T13:45:59Z
The above will drop only the single backup snapshot named “2020-04-20T13:45:59Z”. (Use pbm list to double-check what those are).
For backup retention you’ll typically want to, say, delete all backups older than 30 days from a unix cron job. (Classic unix crond, systemd timer units, etc.)
To make this easier to script, the command also supports an --older-than argument. E.g.
pbm delete-backup --older-than 2020-04-21
pbm delete-backup --older-than 2020-04-20T13:45:00
The two date-string formats supported are YYYY-mm-dd and YYYY-mm-ddTHH:MM:SS, as shown above. UTC time is always used.
Example crontab
The following (one-line) crontab entry will drop backups more than 30 days old for a MongoDB cluster hosted on the msvr_abc* servers at 23:55 each night. $(date -u -d '-30 days' +%Y-%m-%d) generates the YYYY-mm-dd date string for 30 days ago.
55 23 * * * export PBM_MONGODB_URI="mongodb://pbmuser:secret@msvr_abc_cfg1.domain:27019,msvr_abc_cfg2.domain:27019,msvr_abc_cfg3.domain:27019/"; /usr/bin/pbm delete-backup --older-than $(date -u -d '-30 days' +%Y-%m-%d)
Remember: crontab won’t load the environment variables from your normal user terminal session, so PBM_MONGODB_URI will need to be set explicitly, or sourced. And remember: in a cluster, run “pbm” CLI commands with a connection to the configsvr replicaset (documentation). To avoid exposing the user credentials in the crontab itself, source them from an include file.
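A minimal sketch of that include-file approach (the /etc/pbm/env.sh path and its contents are my own placeholders, not a PBM convention):

```shell
# Hypothetical include file /etc/pbm/env.sh, chmod 600, owned by the backup user:
#   export PBM_MONGODB_URI="mongodb://pbmuser:secret@msvr_abc_cfg1.domain:27019,.../"
#
# The crontab line then sources it instead of embedding the password:
#   55 23 * * * . /etc/pbm/env.sh; /usr/bin/pbm delete-backup --older-than $(date -u -d '-30 days' +%Y-%m-%d)

# The cutoff computation itself can be checked on its own (GNU date):
CUTOFF=$(date -u -d '-30 days' +%Y-%m-%d)
echo "$CUTOFF"
```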
Backup progress logging
Got a huge backup and you’re trying PBM for the first time? Maybe you’re using an object store instead of a filesystem for the first time, and/or this isn’t the production environment and you don’t know how good the servers are.
If you’d like to see the upload MB/s rate, you can now follow the pbm-agent node’s logs to watch progress. See “Backup progress logs” for a sample.
Compression Algorithm Choice
PBM <= 1.1 used gzip, as mongodump/mongorestore do. Earlier versions had other compression options inside the code base, but we suppressed them for conformity with mongodump. Eventually it became clear that gzip’s performance wasn’t sufficient and we needed to expose other choices.
We experimented with all the compression libraries shown below. We found that the “s2” library that parallelizes snappy compression was the fastest, so we’ve made that the default. (“s2” is no relation to AWS S3.)
| Compression family | Single thread | Parallelized |
|---|---|---|
| zlib | gzip ~50MB/s | pgzip ~250MB/s (4 cores), ~750MB/s (8 cores) |
| lz4 | lz4 * ~300MB/s | |
| snappy | snappy ~500MB/s | s2 ~1.25GB/s (4 cores), ~2.5GB/s (8 cores) |
| “none” | ∞ | |
Tip: if you want to throttle the upload speed, e.g. to avoid saturating the network, try one of the single-thread versions.
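Assuming the --compression flag that comes with this release (values follow the table above; confirm with pbm backup --help on your version), picking an algorithm looks like:

```shell
pbm backup --compression=s2       # the new default: parallelized snappy
pbm backup --compression=snappy   # single-thread variant, gentler on the network
pbm backup --compression=none     # no CPU cost, maximum bytes over the wire
```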
* lz4 wasn’t explored much because we haven’t identified a parallelized library for it. I believe it could have been a bit faster than single-threaded snappy if tuned to a different (lower-compression) level. Best-case examples in other projects have been reported at ~1GB/s.
pbm-speed-test
A backup can have any of the following as its I/O bottleneck. You’ll only go as fast as the slowest of them, whichever it is.
- W: The remote storage’s write speed
- N: Network bandwidth between here and the storage
- C: Compression algorithm speed (for the CPUs available)
- R: The speed to read from the local mongod
To enable you to figure out which of W, N, C or R it is we’ve created an extra program: pbm-speed-test.
In a nutshell:
- Use pbm-speed-test compression to determine C
- Use pbm-speed-test storage --compression=none to determine what min(W, N) is. (It’s probably W that is the slower; you’ll have to measure network bandwidth another way if you’re suspicious of N.)
- It’s unlikely to be R, but if you want to be thorough run mongodump <connection args> --archive > /dev/null to test how quickly you can read without any possible bottleneck on the output side. Determine the total db size separately, and divide it by the elapsed time.
Note the speed will differ a lot when it’s dumping document data already in cache vs. data being fetched cold from disk, so don’t be tempted to dump only for 1 min or so.
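The arithmetic for that R measurement can be sketched like this (the mongodump step is commented out, and the numbers are illustrative placeholders, not real measurements):

```shell
# 1. Time a dump to /dev/null so the output side can't be a bottleneck:
#      time mongodump <connection args> --archive > /dev/null
# 2. Get the total data size separately, e.g. db.stats().dataSize in the shell.
# 3. Divide bytes by elapsed seconds:
ELAPSED_SECS=120          # placeholder: pretend the dump took 2 minutes
DATA_BYTES=53687091200    # placeholder: 50 GiB of document data
awk -v b="$DATA_BYTES" -v s="$ELAPSED_SECS" \
    'BEGIN { printf "read throughput ~%.0f MB/s\n", b / s / 1048576 }'
# -> read throughput ~427 MB/s
```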
Whichever of those three tests has the lowest throughput will be the bottleneck for your pbm backups.
Hi Akira, thanks for nice article. For PBM incremental backup its default interval is 10min, can we set it as per requirement like incremental everyday and fulllbackup once in a week?
Hi Aaayushi!
The incremental backup interval is 10 mins, yes. As of the current version (1.4.0) the variable in question is “PITRdefaultSpan” and it isn’t configurable yet. (https://github.com/percona/percona-backup-mongodb/blob/11598d05c6562183b46d7e8a81fb741ce78ebc96/pbm/pitr.go#L20)
This period is how long the automatic process that fetches and stores oplog slices waits before making the next slice. Put another way: 24 * 6 = 144 oplog slices are made every day.
> like incremental everyday and fulllbackup once in a week?
Oh, yes, you can do this. There is no internal scheduler for the snapshots (the full backups); you must still launch those by making a “pbm backup” (full snapshot) request from outside of PBM. If you A) run “pbm backup” once a week and B) enable PITR the oplog slices between each weekly backup will continue to be collected.
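A sketch of that weekly-full-plus-PITR setup (the schedule and the include-file path are placeholders of mine; check the pitr.enabled config key against your version’s docs):

```shell
# Turn on continuous oplog slicing once:
#   pbm config --set pitr.enabled=true
#
# Then schedule the weekly full snapshot externally, e.g. Sundays 01:00 UTC:
#   0 1 * * 0  . /etc/pbm/env.sh; /usr/bin/pbm backup
```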
So it is perfectly possible to have weekly full backups and only oplog capture for PITR in between. If you have a large data size (say a TB or more) but low write volume (say tens of GB per day), this is a suitable decision i.m.o.
Just for the curious: there is already a feature request to add an internal scheduler for full backups: https://jira.percona.com/browse/PBM-530.