My last blog introduced the issues one can face when upgrading PostGIS and PostgreSQL at the same time. The purpose of this blog is to walk through the steps with an example.
For our purposes, we will confine ourselves to working with the community versions of 9.6 and 11 respectively, and use LXD in order to create a working environment prototyping the steps, and profiling the issues.
Creating the Demo Environment Using LXD
The first step is creating a template container with the requisite packages and configurations. This template is a basic distribution of Ubuntu 18.04, which has already been installed in the development environment.
1 2 3 4 | # creating the template container lxc cp template-ubuntu1804 template-BetterView lxc start template-BetterView lxc exec template-BetterView bash |
These packages install the necessary supporting packages, installing PostgreSQL from the community repository:
1 | apt install -y wget gnupg2 |
These steps are copied from the community download page for Ubuntu distributions:
1 2 | echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - |
The aforementioned repository is now updated thus making it possible to install our two versions of PostgreSQL (i.e. 9.6 and 11, respectively). Installing pg_repack pulls in the requisite packages while installing this very useful package at the same time too:
1 2 | apt update apt install -y postgresql-11-repack postgresql-9.6-repack |
These next packages are useful. Midnight Commander, mc, is a terminal-based shell navigator and file manager while the other package installs utilities, such as netstat, to monitor the status of all network-based services on the localhost:
1 | apt install -y mc net-tools |
This last step merely updates the man pages database and the mlocate database. It makes it easier to locate files on the host. Beware this can be a security risk if used on a production host.
1 | mandb && updatedb |
This little snippet of code creates our simulated production host. Creating the instance from a template container makes it much easier to try different variations in quick order:
1 2 3 4 5 | # creating the POC upgrade container lxc rm --force pg1-BV lxc cp template-BetterView pg1-BV lxc start pg1-BV lxc exec pg1-BV bash |
As per our scenario, upgrading PostGIS requires two different versions to be installed on the host. Notice that PostgreSQL version 9.6 has the older version of PostGIS, while version 11 has the newer one.
For our purposes, this presentation assumes upgrading both PostgreSQL and PostGIS is the method to be used.
ATTENTION: Executing this upgrade operation into two distinct phases is preferred. Either upgrade PostgreSQL and then upgrade PostGIS or upgrade PostGIS on the old version to match the new version on PostgreSQL and then upgrade the PostgreSQL data cluster.
The underlying assumption is that application code can break between PostGIS version upgrades therefore pursuing an incremental process can mitigate potential issues.
https://PostGIS.net/docs/PostGIS_Extensions_Upgrade.html
https://PostGIS.net/workshops/PostGIS-intro/upgrades.html
1 | apt install -y postgresql-9.6-postgis-2.4 postgresql-11-postgis-3 |
About PostGIS
Available versions of PostGIS, as per the community repository at the time of this blog’s publication:
- 9.6:
- postgresql-9.6-postgis-2.4
- postgresql-9.6-postgis-2.5
- postgresql-9.6-PostGIS-3
- 11:
- postgresql-11-postgis-2.5
- postgresql-11-postgis-3
- PostGIS supported versions matrix
ATTENTION: Azure supports only PostgreSQL 9.6 with PostGIS 2.3.2.
Before You Upgrade
About
This query lists all user-defined functions that have been installed in your database. Use it to summarize not only what you’ve created but the entire suite of PostGIS function calls:
1 2 3 4 5 6 7 8 | -- -- get list of all PostGIS functions -- select nspname, proname from pg_proc join pg_namespace on pronamespace=pg_namespace.oid where nspname not in ('pg_catalog','information_schema') order by 1,2; |
In order to validate your functions, you need to know which ones are being used, therefore tracking the functions prior to the upgrade process will identify them. Please note there are two settings i.e. pl, all. Out of an abundance of caution, it is suggested initially using all for an extended period of time:
1 2 3 4 5 6 | -- -- postgresql.conf -- track_functions = none # none, pl, all -- alter system set track_functions=all; select pg_reload_conf(); |
This view collects all the statistics related to function calls:
1 2 3 4 5 6 7 8 9 10 11 12 | <span style="font-size: 11px;">-- -- track function activity -- View "pg_catalog.pg_stat_user_functions" Column | Type | Collation | Nullable | Default ------------+------------------+-----------+----------+--------- funcid | oid | | | schemaname | name | | | funcname | name | | | calls | bigint | | | total_time | double precision | | | self_time | double precision | | |</span> |
Example
This is a simple example demonstrating tracking function call usage. Note there are two function calls and one of them is invoked in the other:
1 2 3 4 5 6 7 8 9 10 11 12 | CREATE OR REPLACE FUNCTION f1 ( in a integer, out b integer ) AS $$ BEGIN raise notice 'function f1 is called'; perform pg_sleep(1); b = a+1; END $$ LANGUAGE plpgsql; |
1 2 3 4 5 6 7 8 9 10 11 12 13 | CREATE OR REPLACE FUNCTION f2 ( in c integer, out d integer ) as $$ BEGIN raise notice 'function f2 is called'; perform f1(c); raise notice 'returning from f2'; d := 0; END $$ language plpgsql; |
This SQL statement resets all statistics being tracked in the PostgreSQL database. Please note there are other functions that can be used to reset specific statistics while preserving others:
1 | select * from pg_stat_reset(); |
And here’s our functions’ invocations:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | <span style="font-size: 10px;">db01=# select * from f1(4); NOTICE: function f1 is called b --- 5 db01=# select * from f2(4); NOTICE: function f2 is called NOTICE: function f1 is called NOTICE: returning from f2 d --- 0 db01=# select * from pg_stat_user_functions; funcid | schemaname | funcname | calls | total_time | self_time --------+------------+----------+-------+------------+----------- 17434 | public | f1 | 2 | 2002.274 | 2002.274 17437 | public | f2 | 1 | 1001.126 | 0.599 </span> |
An Upgrade Example Using pg_upgrade
SYNOPSIS
There are two discrete upgrades:
- pg_upgrade: pg 9.6 -> pg 11
- PostGIS upgrade: postgis-2.4 -> postgis2.5 -> postgis-3
HOUSE CLEANING
An Ubuntu-based upgrade requires removing the target data cluster because installing PostgreSQL packages onto a Debian-based distro always includes creating a data cluster:
1 2 3 4 5 6 | <span style="font-size: 11px;">pg_lsclusters Ver Cluster Port Status Owner Data directory 9.6 main 5432 online postgres /var/lib/postgresql/9.6/main 11 main 5434 online postgres /var/lib/postgresql/11/main pg_dropcluster --stop 11 main</span> |
For our purposes we are simply adding the extension, no user-defined functions have been included:
1 2 3 4 | su - postgres createdb -p 5432 db01 psql -p 5432 db01 -c "create extension PostGIS" exit |
Shutting down the source data cluster is the last step before the upgrade process can begin:
1 | systemctl stop postgresql@9.6-main |
Debian based distros provide a convenient CLI, making upgrades easy:
1 2 | <span style="font-size: 10px;"># /usr/bin/pg_upgradecluster [OPTIONS] <old version> <cluster name> [<new data directory>] pg_upgradecluster -v 11 9.6 main</span> |
It’s important to check the upgrade logs before starting PostgreSQL version 11. This is a one-way process and once it’s active the old PostgreSQL 9.6 cluster is no longer available and must be destroyed:
1 2 | systemctl start postgresql@11-main pg_dropcluster --stop 9.6 main |
Here’s confirmation of the PostgreSQL and PostGIS upgrade respectively:
1 2 3 4 5 6 7 8 9 10 | <span style="font-size: 10px;">su - postgres psql -p 5432 db01 show server_version; server_version ------------------------------------ 11.14 (Ubuntu 11.14-1.pgdg18.04+1) select * from PostGIS_version(); PostGIS_version --------------------------------------- 3.1 USE_GEOS=1 USE_PROJ=1 USE_STATS=1</span> |
PostGIS Function API, Inspection, and Review
This is critical; the process validates that the application logic works or that it must be updated.
METHOD: inspect each function call used between all versions:
- from 2.4 -> 2.5
- from 2.5 -> 3.0
- from 3.0 -> 3.1
TIP: 3.1 documentation encapsulates all previous versions i.e. section 9.12
REFERENCES:
- https://PostGIS.net/documentation/
- https://PostGIS.net/docs/manual-2.4/
- https://PostGIS.net/docs/manual-2.5
- https://PostGIS.net/docs/manual-3.0
- https://PostGIS.net/docs/manual-3.1/
Regression Testing
- In the current setup, pg 9.6
- Identify all functions used in PostGIS
- Execute a simple function call with every type of parameter typically used in your environment
- Collect, record all variables returned
- In the target setup, pg 11 or pg 13
- Execute a simple function call with every type of parameter typically used in your environment
- Collect, record all variables returned
- Analysis
- Compare values: similar values mean you don’t have a problem
Working With Cloud Provider Technology
Be advised, cloud environments are not ideal upgrade candidates. The aforementioned process is quite detailed and will facilitate a successful upgrade process.
- AZURE
- pg 9.6: PostGIS 2.3.2
- pg 11: PostGIS 2.5.1
- AMAZON
- pg 11, 13: PostGIS 3.1.4
- pg 9.6.*: PostGIS 2.3.[0247], 2.5.[25]
References:
https://learn.microsoft.com/en-us/azure/postgresql/single-server/concepts-extensions
As more companies look at migrating away from Oracle or implementing new databases alongside their applications, PostgreSQL is often the best option for those who want to run on open source databases.
Read Our New White Paper:
Why Customers Choose Percona for PostgreSQL