postgres_fdw Enhancement in PostgreSQL 14

Note: PostgreSQL 14 has been released. Please visit PostgreSQL 14 – Performance, Security, Usability, and Observability for additional information.

It’s exciting times in the PostgreSQL world with the version 14 beta released a few days ago. It’s now time to look under the hood and find out what has changed from version 13, see what has improved, and what behaviors are changed that we should be aware of. Putting it all in a single blog would take weeks to write and days to read, so this one will focus solely on the changes expected in the Foreign Data Wrapper in the GA release of PostgreSQL version 14.

Foreign Data Wrappers (FDWs) provide a mechanism by which regular SQL queries can be used to access data that resides outside PostgreSQL. There are many different FDWs available, however, and PostgreSQL comes with a “File FDW” and a “PostgreSQL FDW”. PostgreSQL FDW may seem counterintuitive, but, it is an extremely useful feature. And there have been some very useful updates to this FDW.

So, let’s start understanding what has changed.

Performance Feature

If you are already using PostgreSQL FDW for any use case, take note of the performance improvements.

1 – Parallel /Async Foreign Scans

(Allow a query referencing multiple foreign tables to perform foreign table scans in parallel)

Remote aggregations and remote joins might have been a performance nightmare when performed across multiple servers. The performance benefit comes from the parallelization of ForeignScan which can now be executed in parallel asynchronously. The sequential execution previously was very slow, and in some cases, too slow. For this, a new server option is added “async_capable” which allows for parallel planning and execution of ForeignScan.

Create Servers and User-Mappings

-- Create foreign server 1.
CREATE SERVER postgres_svr1
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '127.0.0.1', async_capable "true");

-- Create foreign server 2.
CREATE SERVER postgres_svr2
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '127.0.0.1', async_capable "true");

CREATE USER MAPPING FOR vagrant
SERVER postgres_svr1
OPTIONS (user 'postgres', password 'pass');

CREATE USER MAPPING FOR vagrant
SERVER postgres_svr2
OPTIONS (user 'postgres', password 'pass');

-- Create foreign server 1.

CREATE SERVER postgres_svr1

FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (host '127.0.0.1', async_capable "true");

-- Create foreign server 2.

CREATE SERVER postgres_svr2

FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (host '127.0.0.1', async_capable "true");

CREATE USER MAPPING FOR vagrant

SERVER postgres_svr1

OPTIONS (user 'postgres', password 'pass');

CREATE USER MAPPING FOR vagrant

SERVER postgres_svr2

OPTIONS (user 'postgres', password 'pass');

Create Local Tables

CREATE TABLE parent_local (a INTEGER, b CHAR, c TEXT, d VARCHAR(255)) PARTITION BY RANGE (a);
CREATE TABLE child_local1 (a INTEGER, b CHAR, c TEXT, d VARCHAR(255));
CREATE TABLE child_local2 (a int, b CHAR, c text, d VARCHAR(255));
GRANT ALL ON child_local1 to postgres;
GRANT ALL ON child_local2 to postgres;

CREATE TABLE parent_local (a INTEGER, b CHAR, c TEXT, d VARCHAR(255)) PARTITION BY RANGE (a);

CREATE TABLE child_local1 (a INTEGER, b CHAR, c TEXT, d VARCHAR(255));

CREATE TABLE child_local2 (a int, b CHAR, c text, d VARCHAR(255));

GRANT ALL ON child_local1 to postgres;

GRANT ALL ON child_local2 to postgres;

Create Foreign TABLES

CREATE FOREIGN TABLE parent_remote1
PARTITION OF parent_local VALUES FROM 1000 TO 2000
SERVER postgres_svr1 OPTIONS table_name 'child_local1');

CREATE FOREIGN TABLE parent_remote2
PARTITION OF parent_local
FOR VALUES FROM 2000 TO 3000
SERVER postgres_svr2 OPTIONS table_name 'child_local2');

CREATE FOREIGN TABLE parent_remote1

PARTITION OF parent_local VALUES FROM 1000 TO 2000

SERVER postgres_svr1 OPTIONS table_name 'child_local1');

CREATE FOREIGN TABLE parent_remote2

PARTITION OF parent_local

FOR VALUES FROM 2000 TO 3000

SERVER postgres_svr2 OPTIONS table_name 'child_local2');

Now try that, and see the plan tree, now you can see two Async Foreign plans in the tree.

CREATE TABLE sample_table (a INTEGER, b CHAR, c TEXT, d VARCHAR(255));

EXPLAIN (VERBOSE, COSTS OFF)
INSERT INTO sample_table SELECT * FROM parent_local WHERE a % 100 = 0;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Insert on public.sample_table
   ->  Append
         ->  Async Foreign Scan on public.parent_remote1 parent_local_1
               Output: parent_local_1.a, parent_local_1.b, parent_local_1.c, parent_local_1.d
               Remote SQL: SELECT a, b, c, d FROM public.child_local1 WHERE (((a % 100) = 0))
         ->  Async Foreign Scan on public.parent_remote2 parent_local_2
               Output: parent_local_2.a, parent_local_2.b, parent_local_2.c, parent_local_2.d
               Remote SQL: SELECT a, b, c, d FROM public.child_local2 WHERE (((a % 100) = 0))
(8 rows)

CREATE TABLE sample_table (a INTEGER, b CHAR, c TEXT, d VARCHAR(255));

EXPLAIN (VERBOSE, COSTS OFF)

INSERT INTO sample_table SELECT * FROM parent_local WHERE a % 100 = 0;

QUERY PLAN

----------------------------------------------------------------------------------------------

Insert on public.sample_table

-> Append

-> Async Foreign Scan on public.parent_remote1 parent_local_1

Output: parent_local_1.a, parent_local_1.b, parent_local_1.c, parent_local_1.d

Remote SQL: SELECT a, b, c, d FROM public.child_local1 WHERE (((a % 100) = 0))

-> Async Foreign Scan on public.parent_remote2 parent_local_2

Output: parent_local_2.a, parent_local_2.b, parent_local_2.c, parent_local_2.d

Remote SQL: SELECT a, b, c, d FROM public.child_local2 WHERE (((a % 100) = 0))

(8 rows)

2 – Bulk Insert

(Allow postgres_fdw to INSERT rows in bulk.)

Now bulk insert functionality has been added to Foreign Data Wrapper, and postgres_fdw is now supporting that function. There is an opportunity for other foreign data wrappers to implement bulk insert. A complete blog can be seen here.

Functional Features

1 – TRUNCATE Command

Allow TRUNCATE to operate on foreign tables.

Foreign Data Wrapper is enhanced to support TRUNCATE command on the foreign table as the target. That means it issues a TRUNCATE command shipped to the foreign server and executes on the table. Fortunately, this functionality is implemented in postgres_fdw. Here is an example of that.

CREATE SERVER postgres_svr 
       FOREIGN DATA WRAPPER postgres_fdw 
       OPTIONS (host '127.0.0.1');

CREATE USER MAPPING FOR vagrant
       SERVER postgres_svr
       OPTIONS (user 'postgres', password 'pass');

CREATE FOREIGN TABLE foo_remote (a INTEGER,
                                 b CHAR,
                                 c TEXT,
                                 d VARCHAR(255))
       SERVER postgres_svr
       OPTIONS(table_name 'foo_local');

CREATE SERVER postgres_svr

FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (host '127.0.0.1');

CREATE USER MAPPING FOR vagrant

SERVER postgres_svr

OPTIONS (user 'postgres', password 'pass');

CREATE FOREIGN TABLE foo_remote (a INTEGER,

b CHAR,

c TEXT,

d VARCHAR(255))

SERVER postgres_svr

OPTIONS(table_name 'foo_local');

Now a foreign table can be truncated using the TRUNCATE command.

postgres=# TRUNCATE foo_remote;
TRUNCATE TABLE

1 2	postgres=# TRUNCATE foo_remote; TRUNCATE TABLE

2 – LIMIT TO Child Partitioning

(Allow postgres_fdw to import table partitions if specified by IMPORT FOREIGN SCHEMA … LIMIT TO.)

The postgres_fdw does not allow the import of table partitions, because data can be accessed using the root partition. But in case the user wants to import the partitioned table partitioning, PostgreSQL 14 added a new option call “LIMIT TO”. Create a new schema on a remote machine and add a parent table “foo_schema.foo_table_parent” and one child table “foo_schema.foo_table_child”.

postgres=# \d+ foo_schema.*
                                   Table "foo_schema.foo_table_child"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              | 
Partition of: foo_schema.foo_table_parent FOR VALUES FROM (0) TO (10)

Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 10))
Access method: heap

                            Partitioned table "foo_schema.foo_table_parent"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              | 
Partition key: RANGE (a)
Partitions: foo_schema.foo_table_child FOR VALUES FROM (0) TO (10)

postgres=# \d+ foo_schema.*

Table "foo_schema.foo_table_child"

--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------

Partition of: foo_schema.foo_table_parent FOR VALUES FROM (0) TO (10)

Partition constraint: ((a IS NOT NULL) AND (a >= 0) AND (a < 10))

Access method: heap

Partitioned table "foo_schema.foo_table_parent"

--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------

Partition key: RANGE (a)

Partitions: foo_schema.foo_table_child FOR VALUES FROM (0) TO (10)

Import schema without specifying the LIMIT TO, you can see only the parent table is imported.

IMPORT FOREIGN SCHEMA foo_schema FROM SERVER postgres_svr INTO bar_schema;
postgres=# \d+ bar_schema.*
                                 Foreign table "bar_schema.foo_table_parent"
 Column |  Type   | Collation | Nullable | Default |    FDW options    | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+-------------------+---------+--------------+-------------
 a      | integer |           |          |         | (column_name 'a') | plain   |              | 
Server: postgres_svr
FDW options: (schema_name 'foo_schema', table_name 'foo_table_parent')

IMPORT FOREIGN SCHEMA foo_schema FROM SERVER postgres_svr INTO bar_schema;

postgres=# \d+ bar_schema.*

Foreign table "bar_schema.foo_table_parent"

--------+---------+-----------+----------+---------+-------------------+---------+--------------+-------------

Server: postgres_svr

FDW options: (schema_name 'foo_schema', table_name 'foo_table_parent')

If you explicitly specified partitioned table into LIMIT TO clause then it will import that table.

postgres=# IMPORT FOREIGN SCHEMA foo_schema LIMIT TO (foo_table_parent, foo_table_child)
            FROM SERVER loopback INTO bar_schema;
IMPORT FOREIGN SCHEMA

postgres=# \d+ bar_schema.*
                                  Foreign table "bar_schema.foo_table_child"
 Column |  Type   | Collation | Nullable | Default |    FDW options    | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+-------------------+---------+--------------+-------------
 a      | integer |           |          |         | (column_name 'a') | plain   |              | 
Server: loopback
FDW options: (schema_name 'foo_schema', table_name 'foo_table_child')

                                 Foreign table "bar_schema.foo_table_parent"

 Column |  Type   | Collation | Nullable | Default |    FDW options    | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+-------------------+---------+--------------+-------------
 a      | integer |           |          |         | (column_name 'a') | plain   |              | 
Server: loopback
FDW options: (schema_name 'foo_schema', table_name 'foo_table_parent')

postgres=# IMPORT FOREIGN SCHEMA foo_schema LIMIT TO (foo_table_parent, foo_table_child)

FROM SERVER loopback INTO bar_schema;

IMPORT FOREIGN SCHEMA

postgres=# \d+ bar_schema.*

Foreign table "bar_schema.foo_table_child"

--------+---------+-----------+----------+---------+-------------------+---------+--------------+-------------

Server: loopback

FDW options: (schema_name 'foo_schema', table_name 'foo_table_child')

Foreign table "bar_schema.foo_table_parent"

--------+---------+-----------+----------+---------+-------------------+---------+--------------+-------------

Server: loopback

FDW options: (schema_name 'foo_schema', table_name 'foo_table_parent')

3 – Active and Valid Connection List

(Add postgres_fdw function postgres_fdw_get_connections to report open foreign server connections)

A new function postgres_fdw_get_connections() is added. The function returns the open connection names local session to the foreign servers of postgres_fdw. It also outputs the validity of the connection.

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
    server_name    | valid 
-------------------+-------
 postgres_svr      | t
 postgres_svr_bulk | t
(2 rows)

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;

server_name | valid

-------------------+-------

postgres_svr | t

postgres_svr_bulk | t

(2 rows)

Now disconnect all connections connection and try the query again.

postgres=# SELECT 1 FROM postgres_fdw_disconnect_all();
 ?column? 
----------
        1
(1 row)

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
 server_name | valid 
-------------+-------
(0 rows)

postgres=# SELECT 1 FROM postgres_fdw_disconnect_all();

?column?

----------

(1 row)

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;

server_name | valid

-------------+-------

(0 rows)

4 – Keep Connections

(Add postgres_fdw functions to discard cached connections)

A new option keep_connections is added to keep the connections alive so that subsequent queries can reuse them. By default, this option is on, but when turned off, the connections will be discarded as the transactions end.

Set the option off

ALTER SERVER loopback OPTIONS (keep_connections 'off');

1	ALTER SERVER loopback OPTIONS (keep_connections 'off');

Establish the connection using the remote query.

postgres=# BEGIN;
BEGIN

postgres=*# select * from foo_remote;
 a | b | c | d 
---+---+---+---
(0 rows)

postgres=*# END;
COMMIT

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
 server_name | valid 
-------------+-------
(0 rows)

postgres=# BEGIN;

BEGIN

postgres=*# select * from foo_remote;

a | b | c | d

---+---+---+---

(0 rows)

postgres=*# END;

COMMIT

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;

server_name | valid

-------------+-------

(0 rows)

Set the keep_connections option on

ALTER SERVER postgres_svr options (set keep_connections 'on');

1	ALTER SERVER postgres_svr options (set keep_connections 'on');

postgres=# BEGIN;
BEGIN
postgres=*# select * from foo_remote;
 a | b | c | d 
---+---+---+---
(0 rows)

-- Establish the connection using the remote query.

postgres=*# END;
COMMIT

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
 server_name  | valid 
--------------+-------
 postgres_svr | t
(1 row)

postgres=# BEGIN;

BEGIN

postgres=*# select * from foo_remote;

a | b | c | d

---+---+---+---

(0 rows)

-- Establish the connection using the remote query.

postgres=*# END;

COMMIT

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;

server_name | valid

--------------+-------

postgres_svr | t

(1 row)

5 – Reestablish Brocken Connection

(Allow postgres_fdw to reestablish foreign server connections if necessary)

Previously when the remote server restarted and the postgres_fdw connection was broken, then the error was thrown because the cached connection is no longer available. This is fixed in PostgreSQL, and, in any case, the connection is broken and no longer exists in the cache, and postgres_fdw will establish the connection.

Conclusion

It is quite promising that the Foreign Data Wrapper API is expanding over every release, but PostgreSQL 14 provides some user-centric new features. The performance-related improvements give another reason to use FDWs for many relevant use cases. The functionality will surely be added in the next few versions, making these more performant and easy to use.

2 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

John Mudd

2 years ago

I have 1,000 stores. Each has it’s own Postgres server and database. The tables are the same at each store. Can I use postgres_fdw to create a single db that has access to the cumulative data of all stores? Ideally I want each table to have an extra field added that will identify which store provided the data.

jfbaro

2 years ago

That’s amazing. I am surprised by the small number of companies using FDW today, as it is such an advanced and powerful tool. One question though, let’s say I have 35 microservices, each one running a remote database and one of these databases changes its schema. Is there a way of syncing this to the MASTER PostgreSQL so that it can refresh the schema on its side? Thanks!!

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

postgres_fdw Enhancement in PostgreSQL 14

Performance Feature

1 – Parallel /Async Foreign Scans

2 – Bulk Insert

Functional Features

1 – TRUNCATE Command

2 – LIMIT TO Child Partitioning

3 – Active and Valid Connection List

4 – Keep Connections

5 – Reestablish Brocken Connection

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Valkey/Redis: Not-So-Good Practices

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

postgres_fdw Enhancement in PostgreSQL 14

Performance Feature

1 – Parallel /Async Foreign Scans

2 – Bulk Insert

Functional Features

1 – TRUNCATE Command

2 – LIMIT TO Child Partitioning

3 – Active and Valid Connection List

4 – Keep Connections

5 – Reestablish Brocken Connection

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Valkey/Redis: Not-So-Good Practices

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation