Comments on: The 1-2-3 for PostgreSQL Logical Replication Using an RDS Snapshot https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/ Fri, 10 Nov 2023 08:05:30 +0000 hourly 1 https://wordpress.org/?v=6.5.2 By: Shivam https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10974204 Fri, 10 Nov 2023 08:05:30 +0000 https://www.percona.com/blog/?p=75310#comment-10974204 Hi David, This is very-well explained. I recently used this method to perform a major RDS version upgrade. I’ve one doubt though. How can we get the LSN in case we’ve multiple databases in a single RDS instance. Will they be same for all the databases(from step 8)? Thanks

]]>
By: Fabio Hisamoto https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973914 Tue, 07 Feb 2023 21:51:05 +0000 https://www.percona.com/blog/?p=75310#comment-10973914 Hi David,

What lsn I use if there are more than one redo done in the logs?

]]>
By: Rajesh https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973760 Tue, 20 Sep 2022 05:19:16 +0000 https://www.percona.com/blog/?p=75310#comment-10973760 did you validate if there was any data loss ?
Instead of the restart or advance with lsn that was collected, you can use ‘invalid lsn’ ..

advance your subscription from invalid lsn and it should work.

]]>
By: Shantanu Gupta https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973642 Wed, 08 Jun 2022 14:47:48 +0000 https://www.percona.com/blog/?p=75310#comment-10973642 I tried to setup logical replication using the above steps at least 3 times. It didn’t work ever. The issue is with LSN approach mentioned in the article, may be I didn’t understand correctly or I am doing something wrong here.

Fix for me was to use restart_lsn of publisher on subscriber. I ran “SELECT pg_replication_origin_advance(‘pg_57414’, restart_lsn_of_publisher) ;” query on subscriber. restart_lsn_of_publisher value was obtained from publisher by running “select restart_lsn from pg_replication_slots;” on publisher.

]]>
By: David Gonzalez https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973626 Tue, 24 May 2022 13:58:19 +0000 https://www.percona.com/blog/?p=75310#comment-10973626 Hi Gerret, the goal of that step is to ensure our subscription is going to start applying changes just at the point the recovery from the snapshot was done, in some highly transactional environments there might be a number of changes between the replication slot creation and the point in time the snapshot process kicks in, so the subscription might start applying duplicated changes.

]]>
By: Gerret Sanders https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973623 Fri, 20 May 2022 14:53:27 +0000 https://www.percona.com/blog/?p=75310#comment-10973623 Hi David, could you tell me what’s the purpose of advancing the subscription to the LSN from the logs?

I tried your process on a small test database while skipping that step and it seemed to work fine.

]]>
By: David Gonzalez https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973562 Mon, 07 Mar 2022 14:47:07 +0000 https://www.percona.com/blog/?p=75310#comment-10973562 Hi Dan, the idea behind this blog post was to provide some insight how we can avoid the initial copy when using logical replication for an upgrade process, this is particularly beneficial when the dataset size is big so we might avoid the extra load in the source and save some network time. However, the logical replication is a very flexible approach we can adjust.
If you already have identified the tables related to those long running transaction, you might change the approach from a single publication for ALL TABLES, from step 3, to a multiple publications, so you can exclude the tables for the long transactions from the “main” publication, follow the approach for the rest of the tables and then just clean (truncate) the excluded tables in the target side and configure the logical replication as usual, adding the tables in the source to a different publication and create the corresponding subscription in the target (with the copy_data option enabled) and let the initial copy to run. Keep in mind for this approach you might need to address any foreign keys violations or unwanted trigger executions than may change the data. Hopefully you can overcome the issue :).

]]>
By: Dan https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973560 Sat, 05 Mar 2022 03:43:13 +0000 https://www.percona.com/blog/?p=75310#comment-10973560 I’ve tried the instructions a few times. First I did a proof of concept taking a snapshot from production, creating a new database from it to use as source, then following the steps as described. Worked well.

Going from 11.8 to 14.1. Upgrading the RDS snapshot from 11.8 to 11.14 and then to 14.1.

However, I’ve now tried it twice with production as the source and it fails on logical replication streaming each time. It barely applies any changes, but I attribute that to AWS RDS lazy loading of blocks after the database is restored from snapshot. I had to increase the wal_sender_timeout and wal_receiver_timeout, both to 10min. But, it does apply changes is spurts. So something is getting through. But, after a couple of hours I get an error about a request WAL segment has already been removed. And oddly, it’s a WAL from before where it says it is at in recovery.

2022-03-05 03:08:44 UTC:10.0.1.15(37418):repluser@db:[938]:LOG: starting logical decoding for slot “pglogical_upgrade01”
2022-03-05 03:08:44 UTC:10.0.1.15(37418):repluser@db:[938]:DETAIL: Streaming transactions committing after 16DE6/CF902580, reading WAL from 16DE6/93F45A70.
2022-03-05 03:08:44 UTC:10.0.1.15(37418):repluser@db:[938]:ERROR: requested WAL segment 0000000100016DE600000024 has already been removed

That wal log is well before where the origin was advanced to, before enabling (step 8):
SELECT pg_replication_origin_advance(‘pg_4113386′, ’16DE6/6816C3B8’) ;

I’m not sure what is happening here. I’m wondering if it has to do with long running transactions. But, short of terminating any open transactions, I’m not sure what I could change to mitigate that. If I wait until all the transactions that were from before the publisher was created, then the pg_replication_origin_advance would be even farther in the future. Maybe I increase wal_keep_segments up to a substantial number so that it keeps hours of wal on disk. Any suggestions?

]]>
By: David Gonzalez https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973559 Fri, 04 Mar 2022 16:09:50 +0000 https://www.percona.com/blog/?p=75310#comment-10973559 Hi there, from the database logs in RDS are you able to get some message with details about what the subscription is doing? I wonder if could be something related with LSN position.

]]>
By: Rocky https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973555 Fri, 04 Mar 2022 02:22:47 +0000 https://www.percona.com/blog/?p=75310#comment-10973555 Ive tried this to migrate v10.16 to v13.5. I upgraded snapshot twice to reach at 13.5 before performing restore. Per document, after advancing LSN at subscriber RDS and enabling subscription, it never works.
In pg_replication_slots, it says active but data doesn’t get replicated.

]]>
By: Daniel Guzman Burgos https://www.percona.com/blog/postgresql-logical-replication-using-an-rds-snapshot/#comment-10973284 Sun, 01 Aug 2021 12:54:04 +0000 https://www.percona.com/blog/?p=75310#comment-10973284 Worth to mention that this doesn’t work on PostgreSQL 9.6 since there’s no PUBLICATION on it. In that case you can use the pgLogical extension (available on RDS)

]]>