The PostgreSQL Write-Ahead Log (WAL) is a recording location within the Postgres cluster, capturing all modifications made to the cluster’s data files before being written to the heap. During crash recovery, the WAL contains sufficient data for Postgres to restore its state to the point of the last committed transaction.
Use cases
There may arise circumstances where it becomes necessary to determine the numerical difference between the WAL files. For instance, when recovering from a significant delay or while configuring replication on a sizable database, the recovery process can be time-consuming as new WAL files are replayed. Initially, when setting up replication, the server may not permit login access. In such cases, calculating the disparity in the number of WAL files can provide an estimation of the recovery time, allowing for an assessment of the lag.
Another practical application for calculating the difference between WAL files is in the context of the archiver process. Determining the variance between WAL files makes it possible to estimate the number of remaining files that are yet to be archived.
To calculate the difference between two WAL files, let’s understand the WAL file name format.
The name format for PostgreSQL Write-Ahead Logs (WAL) files is TTTTTTTTXXXXXXXXYYYYYYYY, a 24-character hexadecimal representation of the LSN (Log Sequence Number) associated with the WAL record. The LSN is a unique identifier for each WAL record.
In format TTTTTTTTXXXXXXXXYYYYYYYY, ‘T’ is the timeline, ‘X’ is the high 32-bits from the LSN(Segment number), and ‘Y’ is the low 32-bits of the LSN.
For example, a WAL file name might look like this: “0000000100001234000000AB”.
Here’s a breakdown of the components in the example:
– “00000001”: This represents the timeline ID. It is usually 1 for the default timeline.
– “00012340”: This represents the WAL file number, indicating the sequential order of the WAL file within the timeline.
– “000000AB”: This is the hexadecimal representation of the segment file number.
We can determine the numerical difference between two WAL files with the below sql:
1 2 3 4 | SELECT ABS(('x' || SUBSTRING(current_wal, 1, 8))::bit(32)::int - ('x' || SUBSTRING(old_wal, 1, 8))::bit(32)::int) + ABS(('x' || SUBSTRING(current_wal, 9, 8))::bit(32)::int*256 - ('x' || SUBSTRING(old_wal, 9, 8))::bit(32)::int*256) + ABS(('x' || SUBSTRING(current_wal, 17))::bit(32)::int - ('x' || SUBSTRING(old_wal, 17))::bit(32)::int) FROM (select '000000330000006900000047' current_wal, '0000003200000069000000AB' old_wal) as wal_segs; |
Let’s create a PostgreSQL function that facilitates the calculation of the numerical difference between two WAL files, making it more convenient to use to determine the variance between them.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | CREATE OR REPLACE FUNCTION public.get_walfile_diff(current_wal text, old_wal text) RETURNS numeric LANGUAGE plpgsql AS $function$ DECLARE v_wal_diff numeric; BEGIN IF length(current_wal) != 24 OR length(old_wal) != 24 THEN RAISE EXCEPTION 'Invalid wal file length'; END IF; v_wal_diff := ( ABS(('x' || SUBSTRING(current_wal, 1, 8))::bit(32)::int - ('x' || SUBSTRING(old_wal, 1, 8))::bit(32)::int) + ABS(('x' || SUBSTRING(current_wal, 9, 8))::bit(32)::int*256 - ('x' || SUBSTRING(old_wal, 9, 8))::bit(32)::int*256) + ABS(('x' || SUBSTRING(current_wal, 17))::bit(32)::int - ('x' || SUBSTRING(old_wal, 17))::bit(32)::int) ); RETURN v_wal_diff; END; $function$; |
Examples
1 2 3 4 5 6 7 8 9 10 | postgres=# SELECT public.get_walfile_diff('000000330000006900000047','000000330000006900000048'); get_walfile_diff ------------------ 1 (1 row) postgres=# SELECT public.get_walfile_diff('000000330000006900000047','0000003300000069000000AB'); get_walfile_diff ------------------ 100 (1 row) |
Overall, being able to calculate the numerical difference between WAL files contributes to effective management and understanding of Postgres database recovery/archiver processes.
Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community in a single distribution, designed and tested to work together.
for wal files 0000000100000002000000B6 and 000000010000000200000086 it gave answer 32 but actual wal generated was 52. any suggestion
The diff is actually 48,
postgres=# select get_walfile_diff(‘0000000100000002000000B6′,’000000010000000200000086’);
get_walfile_diff
——————
48
(1 row)
All the 49 WAL filenames between these 2 files are as below,
000000010000000200000086
000000010000000200000087
000000010000000200000088
000000010000000200000089
00000001000000020000008A
00000001000000020000008B
00000001000000020000008C
00000001000000020000008D
00000001000000020000008E
00000001000000020000008F
000000010000000200000090
000000010000000200000091
000000010000000200000092
000000010000000200000093
000000010000000200000094
000000010000000200000095
000000010000000200000096
000000010000000200000097
000000010000000200000098
000000010000000200000099
00000001000000020000009A
00000001000000020000009B
00000001000000020000009C
00000001000000020000009D
00000001000000020000009E
00000001000000020000009F
0000000100000002000000A0
0000000100000002000000A1
0000000100000002000000A2
0000000100000002000000A3
0000000100000002000000A4
0000000100000002000000A5
0000000100000002000000A6
0000000100000002000000A7
0000000100000002000000A8
0000000100000002000000A9
0000000100000002000000AA
0000000100000002000000AB
0000000100000002000000AC
0000000100000002000000AD
0000000100000002000000AE
0000000100000002000000AF
0000000100000002000000B0
0000000100000002000000B1
0000000100000002000000B2
0000000100000002000000B3
0000000100000002000000B4
0000000100000002000000B5
0000000100000002000000B6