We have been writing blog posts about how to write simple extensions in C language and a little more complex one by Ibrar which were well received by PostgreSQL user community. Then we observed that many PostgreSQL users create simple triggers for small auditing requirements, and then feel the pain of trigger on transactions. So we were discussing how simple/lightweight and faster a trigger function is when written in C. Generally, Trigger functions are written in high-level languages like PlpgSQL, but it has a higher overhead during execution and it can impact the transactions – and thereby application performance.
This blog post is an attempt to create a simple trigger function to address one of the common use-cases of triggers, which is to update auditing columns in a table.
In this post, we are going to introduce SPI (Server Programming Interface) functions for novice users. Towards the end of the blog, we share some of the quick benchmark results for understanding the benefits.
Example of Audit timestamp
Let’s proceed with taking up a case and assume that we have a table to hold transaction details. But auditing requirements say that there should be a timestamp on each tuple when the tuple is inserted and when it was last updated.
1 2 3 4 5 6 7 8 | CREATE TABLE transdtls( transaction_id int, cust_id int, amount int, ... insert_ts timestamp, update_ts timestamp ); |
For demonstration purpose, let’s remove and trim the other columns and create a table with only 3 essential columns.
1 2 3 4 5 | CREATE TABLE transdtls( transaction_id int, insert_ts timestamp, update_ts timestamp ); |
Developing Trigger Function
The trigger function can also be developed and packaged as an extension, which we discussed in s previous blog post here. So we are not going to repeat those steps here. The difference is that file names are named as “trgr” instead of “addme” in the previous blog. Makefile is also modified to refer “trgr” files. This need not be same as the function name “trig_test” in the C source detailed below.
In the end, the following files are available in the development folder:
1 2 | $ ls Makefile trgr--0.0.1.sql trgr.c trgr.control |
The trgr.c is the main source files with the following content:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | #include <stdio.h> #include <time.h> #include "postgres.h" #include "utils/rel.h" #include "executor/spi.h" #include "commands/trigger.h" #include "utils/fmgrprotos.h" #ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif extern Datum trig_test(PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(trig_test); Datum trig_test(PG_FUNCTION_ARGS) { TriggerData *trigdata = (TriggerData *) fcinfo->context; //TupleDesc tupdesc; HeapTuple tuple; HeapTuple rettuple; int attnum = 0; Datum datumVal; //Get the structure of the tuple in the table. //tupdesc = trigdata->tg_relation->rd_att; //Make sure that the function is called from a trigger if (!CALLED_AS_TRIGGER(fcinfo)) elog(ERROR, "are you sure you are calling from trigger manager?"); //If the trigger is part of an UPDATE event if (TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event)) { //attnum = SPI_fnumber(tupdesc,"update_ts"); attnum = 3; tuple = trigdata->tg_newtuple; } else //If the trigger is part of INSERT event { //attnum = SPI_fnumber(tupdesc,"insert_ts"); attnum = 2; tuple = trigdata->tg_trigtuple; } //Get the current timestamp using "now" datumVal = DirectFunctionCall3(timestamp_in, CStringGetDatum("now"), ObjectIdGetDatum(InvalidOid), Int32GetDatum(-1)); //Connect to Server and modify the tuple SPI_connect(); rettuple = SPI_modifytuple(trigdata->tg_relation, tuple, 1, &attnum, &datumVal, NULL); if (rettuple == NULL) { if (SPI_result == SPI_ERROR_ARGUMENT || SPI_result == SPI_ERROR_NOATTRIBUTE) elog(ERROR, "SPI_result failed! SPI_ERROR_ARGUMENT or SPI_ERROR_NOATTRIBUTE"); elog(ERROR, "SPI_modifytuple failed!"); } SPI_finish(); /* don't forget say Bye to SPI mgr */ return PointerGetDatum(rettuple); } |
and trgr--0.0.1.sql with the following content:
1 2 3 | CREATE OR REPLACE FUNCTION trig_test() RETURNS trigger AS 'MODULE_PATHNAME','trig_test' LANGUAGE C STRICT; |
Now it is a matter of building, installing, and creating the extension.
1 2 3 | $ make $ sudo make install psql> create extension trgr; |
In case you don’t want to develop it as an extension, you may compile it to generate a shared object file (.so) file. Copy the same to the library folder of PostgreSQL binaries, which on my Ubuntu laptop is : /usr/lib/postgresql/11/lib/, and then define the function. You can even specify the full path of the shared object file like this:
1 2 3 | CREATE FUNCTION trig_test() RETURNS trigger AS '/usr/lib/postgresql/11/lib/trgr.so' LANGUAGE C; |
Using Trigger Function
Usage of trigger function is not different from regular PLpgSQL functions. You just need to attach the function to the table for all INSERT and UPDATE events.
1 2 3 | CREATE TRIGGER transtrgr BEFORE INSERT OR UPDATE ON public.transdtls FOR EACH ROW EXECUTE PROCEDURE public.trig_test(); |
Benchmarking
For a fair comparison with trigger function written in PLpgSQL, a similar function is created as follows:
1 2 3 4 5 6 7 8 9 10 11 | CREATE OR REPLACE FUNCTION transtrgr_pl() RETURNS TRIGGER AS $$ BEGIN if (TG_OP = 'UPDATE') then NEW.update_ts = now(); else NEW.insert_ts = now(); end if; RETURN NEW; END; $$ language 'plpgsql'; |
The number of lines and the readability of the code is in favor of PLpgSQL. The development and debugging time required is much less.
Regarding the performance benchmarking, three cases are compared.
- PostgreSQL client/application providing the audit timestamp, so that trigger can be avoided.
- Trigger function in C language.
- Trigger function in PLpgSQL.
Here are the performance numbers in milliseconds for 1 million bulk inserts, obviously a smaller number is better.
Caveats
- The first case where there is no trigger on the database side, it takes less time. But the application and network need to take up the extra load, which is not considered in this test.
- The C function is bit hardcoded with an attribute number like attnum = 3; and if we want a generic trigger function which looks for specific column name, we can use SPI_fnumber function like attnum = SPI_fnumber(tupdesc,"update_ts"); . Such a generic trigger function can be used in multiple tables. Obviously, this involves more processing. Those lines are commented out in the source code. On repeated tests, the average time of execution increases to 1826.722 ms. Still, we can see that it is considerably faster than the PLpgSQL trigger function.