Enter pg_stat_monitor: this extension, created here at Percona, has been developed as an advanced replacement of pg_stat_statement, providing new capabilities in addition to the standard fare.
As you may recall, PostgreSQL’s pg_stat_statements extension provides a means of tracking execution statistics of all SQL statements executed by the server. But sometimes just having the basics is not enough as the queries, and their generated metrics, are presented as one big “clump” of data. It neither provides aggregated statistics nor histogram information. This makes it difficult to identify problem queries during peak loading times versus periods of normal loading versus report generation during off-peak times. In order to trend server performance, one is therefore either forced to manually generate the needed data and calculate the aggregate, which can become quite the chore, or use a 3rd party monitoring solution presenting the results on a dashboard and which brings its own overhead requirements (and headaches?).
Download/Compile/Install
Depending upon your circumstances, I’m going to share with you three (3) methods one of which you can use to obtain and try out pg_stat_monitor. Although I’m limiting the installation instructions to Ubuntu, one can, of course, install it on the Redhat/CENTOS distros too.
Method 1: The Percona Distribution For PostgreSQL
The easiest way, of course, is downloading and installing the extension from our own Percona repository. The following instructions are performed as root on your OS.
First things first, update your distribution packages:
1 2 | apt update apt upgrade -y |
Install the Percona repository and download the latest Percona release package:
1 2 | apt install -y wget gnupg2 lsb-release curl wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb |
The package percona-release_latest.generic_all.deb installs a binary you will use to install the packages of interest:
1 | dpkg -i percona-release_latest.generic_all.deb |
And now you install both pg_stat_monitor and Percona’s own bundled version of PostgreSQL at the same time!
1 2 | percona-release setup ppg-12 apt install -y percona-postgresql-12 percona-pg-stat-monitor12 |
Attention: At this time, the only real difference between Percona’s and the community version of PostgreSQL is that it is located under a different path. However, over time we will be adding additional capabilities in the form of extensions. This is a work in progress, stay tuned!
Method 2: Compile And Install (Community PostgreSQL Repository)
Suppose you aren’t using the Percona Distribution For PostgreSQL, shame on you, but instead, the community version downloaded from postgresql.org. Although a little more detailed, this set of instructions will help you install the extension with your current version of PostgreSQL. In this case, one downloads the source code for pg_stat_monitor and compiles using the community PostgreSQL development packages.
Download the source code directly from our GIT HUB:
1 | git clone https://github.com/percona/pg_stat_monitor |
For demonstration purposes, we use version 12, although any version newer than 11 will work just fine. Be warned; there are a lot of packages! My tests, while writing this blog, required 500MB of additional space:
1 | apt install -y postgresql-server-dev-12 git make gcc |
Now it’s an easy matter of compiling; execute the following as root:
1 2 3 | cd pg_stat_monitor make USE_PGXS=1 make USE_PGXS=1 install |
Method 3: Roll Your Own Packages
The method lends itself well for production environments by bundling your own package, whether it be DEB or RPM, using the FPM package management system.
FPM is a command-line program designed to help you build packages.
Author’s note: I won’t go into details using FPM due to the complexity involved in getting and configuring it … unless you’d like me to, and in which case, I’d be very happy to write a blog about this really cool tool. 🙂
Building a package might look something like this:
1 fpm -s <source type> -t <target type> [list of sources]…
“Source type” is what your package is coming from; a directory (dir), a rubygem (gem), an rpm (rpm), a python package (python), a PHP pear module (pear), etc.
“Target type” is what your output package form should be, such as RPM and DEB.
Method 4: Using PGXN
pg_stat_monitor is released on PGXN and the latest stable version is available there. It is easily downloadable using pgxn utility.
1 | pgxn install pg_star_monitor. |
Create Extension “pg_stat_monitor”
Once compiled and installed, this next step is straightforward. Update the PostgreSQL runtime parameters so it sees the extension’s module:
1 2 | -- there's more than one way to configure this parameter alter system set shared_preload_libraries = 'pg_stat_monitor'; |
Restart the server:
1 | systemctl restart postgresql |
You can install this on any database as this is a data cluster-wide extension:
1 | create extension pg_stat_monitor; |
There are two views:
- The first view is pg_stat_monitor which is similar to pg_stat_statements in that you can view generated metrics in real-time.
- The second view, pg_stat_monitor_settings, returns the entire suite of parameters defining and controlling this extension’s behavior. One can edit these parameters using ALTER SYSTEM.
1 2 3 4 5 | List of relations Schema | Name | Type | Owner -------+--------------------------+------+---------- public | pg_stat_monitor | view | postgres public | pg_stat_monitor_settings | view | postgres |
1 2 3 4 5 6 7 8 9 10 11 12 | db01=# \d pg_stat_monitor_settings View "public.pg_stat_monitor_settings" Column | Type | Collation | Nullable | Default ---------------+---------+-----------+----------+--------- name | text | | | value | integer | | | default_value | integer | | | description | text | | | minimum | integer | | | maximum | integer | | | restart | integer | | | |
Using pg_stat_monitor
Let’s generate some activity using pgbench:
1 2 | pgbench -i db01 pgbench -c 4 -j 2 -T 300 -b tpcb-like db01 > /dev/null 2>&1 & |
Now query the view pg_stat_monitor, returning the top ten results for all operations on the current database:
1 2 3 4 5 6 7 8 9 10 | select application_name, userid::regrole AS user_name, datname AS database_name, substr(query,0, 50) AS query, calls, client_ip from pg_stat_monitor, pg_database where dbid = oid order by calls desc, application_name limit 10; |
And here’s our results, notice pg_stat_monitor returns information similarly to, but not quite, to pg_stat_statements:
1 2 3 4 5 6 7 8 9 10 11 12 | application_name | user_name | db | query | calls | client_ip ------------------+-----------+-------+---------------------------------------------------+-------+----------- pgbench | postgres | db01 | UPDATE pgbench_branches SET bbalance = bbalance + | 3523 | 127.0.0.1 pgbench | postgres | db01 | END | 3523 | 127.0.0.1 pgbench | postgres | db01 | UPDATE pgbench_tellers SET tbalance = tbalance + | 3523 | 127.0.0.1 pgbench | postgres | db01 | SELECT abalance FROM pgbench_accounts WHERE aid = | 3523 | 127.0.0.1 pgbench | postgres | db01 | UPDATE pgbench_accounts SET abalance = abalance + | 3523 | 127.0.0.1 pgbench | postgres | db01 | INSERT INTO pgbench_history (tid, bid, aid, delta | 3523 | 127.0.0.1 pgbench | postgres | db01 | BEGIN | 3523 | 127.0.0.1 pgbench | postgres | db01 | END | 3257 | 127.0.0.1 pgbench | postgres | db01 | INSERT INTO pgbench_history (tid, bid, aid, delta | 3257 | 127.0.0.1 pgbench | postgres | db01 | UPDATE pgbench_branches SET bbalance = bbalance + | 3256 | 127.0.0.1 |
This query highlights the key difference between pg_stat_monitor and pg_stat_statements, i.e. aggregating performance over a time interval using buckets:
1 2 3 4 5 6 7 8 | postgres=# SELECT bucket, bucket_start_time, application_name, datname AS database_name, substr(query,0, 50) AS query, calls FROM pg_stat_monitor LIMIT 10; |
bucket | bucket_start_time | application_name | database_name | query | calls
——–+———————+——————+—————+—————————————————+——-
5 | 2021-01-14 14:49:10 | pgbench | db01 | vacuum analyze pgbench_history | 1
5 | 2021-01-14 14:49:10 | pgbench | db01 | select count(*) from pgbench_branches | 1
5 | 2021-01-14 14:49:10 | pgbench | db01 | UPDATE pgbench_accounts SET abalance = abalance + | 1375
5 | 2021-01-14 14:49:10 | pgbench | db01 | begin | 1
5 | 2021-01-14 14:49:10 | pgbench | db01 | UPDATE pgbench_branches SET bbalance = bbalance + | 1372
5 | 2021-01-14 14:49:10 | pgbench | db01 | copy pgbench_accounts from stdin | 1
5 | 2021-01-14 14:49:10 | pgbench | db01 | create table pgbench_branches(bid int not null,bb | 1
5 | 2021-01-14 14:49:10 | pgbench | db01 | alter table pgbench_accounts add primary key (aid | 1
5 | 2021-01-14 14:49:10 | pgbench | db01 | INSERT INTO pgbench_history (tid, bid, aid, delta | 1372
5 | 2021-01-14 14:49:10 | pgbench | db01 | alter table pgbench_branches add primary key (bid | 1
(10 rows)
Updating pg_stat_monitor_settings
View pg_stat_monitor_settings returns those parameters controlling the metrics that gather the data.
This example query returns the list of runtime parameters that can be edited:
1 | select name,description from pg_stat_monitor_settings; |
As you can see, one has substantially more parameters allowing for targeted investigations and analysis:
1 2 3 4 5 6 7 8 9 10 11 12 13 | name | description -----------------------------------------------+---------------------------------------------------------------------------------------------------------- pg_stat_monitor.pgsm_max | Sets the maximum size of shared memory in (MB) used for statement's metadata tracked by pg_stat_monitor. pg_stat_monitor.pgsm_query_max_len | Sets the maximum length of query. pg_stat_monitor.pgsm_enable | Enable/Disable statistics collector. pg_stat_monitor.pgsm_track_utility | Selects whether utility commands are tracked. pg_stat_monitor.pgsm_normalized_query | Selects whether save query in normalized format. pg_stat_monitor.pgsm_max_buckets | Sets the maximum number of buckets. pg_stat_monitor.pgsm_bucket_time | Sets the time in seconds per bucket. pg_stat_monitor.pgsm_respose_time_lower_bound | Sets the time in millisecond. pg_stat_monitor.pgsm_respose_time_step | Sets the response time steps in millisecond. pg_stat_monitor.pgsm_query_shared_buffer | Sets the maximum size of shared memory in (MB) used for query tracked by pg_stat_monitor. (10 rows) |
Altering the runtime parameters is as simple as executing ALTER SYSTEM:
1 2 3 4 5 | -- the default length is 1,024 characters show pg_stat_monitor.pgsm_query_max_len; -- increase the query length to 2,048 characters alter system set pg_stat_monitor.pgsm_query_max_len = 2048; |
Restarting the server updates the parameter:
1 | systemctl restart postgresql |
Here is the updated value:
1 2 3 4 5 6 7 8 | SELECT name, value FROM pg_stat_monitor_settings where name = 'pg_stat_monitor.pgsm_query_max_len'; name | value ------------------------------------+------- pg_stat_monitor.pgsm_query_max_len | 2048 |
Error Monitoring
pg_stat_monitor not only records the successful queries but all the ERROR and WARNINGS too.
1 2 3 4 5 6 | SELECT decode_error_level(elevel) AS level, query, message FROM pg_stat_monitor WHERE elevel != 0; elevel | sqlcode | query | message --------+----------+-- ---------------------+---------------------------------- ERROR | 16908420 | SELECT * FROM pg_foo; | relation "pg_foo" does not exist ERROR | 33816706 | SELECT 1/0; | division by zero |
Tell us what you think; your feedback is important!
How does one log bugs with this cool utility? On the github page there is no issues tab.