Comments on: Scaling PostgreSQL using Connection Poolers and Load Balancers for an Enterprise Grade environment

By: Erikas Neverdauskas

Erikas Neverdauskas — Sun, 22 Mar 2020 12:37:54 +0000

In reply to Jobin Augustine. thanks for quick answer i see you point regarding reporting service. if anyone would be interested how to have behavior as i described: while browsing some articles i found one article which shows how to routing as i described using HAProxy: https://severalnines.com/database-blog/postgresql-load-balancing-using-haproxy-keepalived (section "HAProxy Configuration") health check for slave alias uses regex so any node (master or standby) status is ok this way selects would be redirected to all 3 nodes.

By: Jobin Augustine

Jobin Augustine — Sun, 22 Mar 2020 10:01:49 +0000

In reply to Erikas Neverdauskas. pgpool2 can do statement routing. Blindly redirecting all SELECTs to standby may not be a good idea. we may have SELECTs which is part of transactions and different isolation level requirements relative to other concurrent transactions. I would prefer SELECTs from specific modules like Reporting to be redirected to standby.

By: Erikas Neverdauskas

Erikas Neverdauskas — Sun, 22 Mar 2020 07:48:31 +0000

Hi,

thanks for article,
i may be wrong, but for me it seems that there is no possibility to utilize MASTER node for SELECTS in case app choose role (connection string) with slave alias.

Ideally master node should receive ANY TYPE of query and stand by nodes just SELECTS.
But in case i have 2 nodes cluster then such solution won’t load balance anything because:
if i choose alias MASTER i will be redirected to MASTER node only
if i choose alias SLAVE i will be redirected to single STANDBY node only
but cluster of 2 nodes is able theoretically handle SELECTS from both nodes

can you comment ? or i misundesrtood something ?

By: Jobin Augustine

Jobin Augustine — Mon, 11 Nov 2019 05:15:38 +0000

In reply to dafalla omda. Connection routing is preferred rather than individual statement routing. Application module decides whether it really need a primary connection (read-write) or any hot standby connection (read-only) is sufficient. For example, reporting modules with complex queries can request for read-only connection and proxy can route that connection to any of the hot standby.

By: dafalla omda

dafalla omda — Sat, 09 Nov 2019 21:14:06 +0000

Who distinguishes the Query thus for read and that for write?

By: Jobin Augustine

Jobin Augustine — Tue, 05 Nov 2019 10:32:02 +0000

In reply to Enzo. Application need to have clear understanding of its own requirement. whether it needs a Primary connection or any standby connection is sufficient. This is similar to Oracle's role-based service. A primary connection always it is available in HAProxy's Port 5002 and Standby connection (Read-only) will be available on HAProxy's port 5003 which will remain unaffected by database switchovers/failovers. So the explicit connection routing happens at HAProxy layer. PGbouncer just adds a connection pooling on the top of it.

By: Mike

Mike — Tue, 05 Nov 2019 10:12:24 +0000

In reply to Enzo. It won't. The article is written in a way that makes you think it would, i.e. an SQL statement hits pgbouncer, pgbouncer recognizes that it is either a read or write and basing on that directs the statement to either "master" or "slave" databases configured in [databases] section, but pgbouncer does not have such functionality AFAIK.

By: avivallarapu

avivallarapu — Wed, 30 Oct 2019 20:38:27 +0000

In reply to revanth. Such Crashes can happen to any services inside the Application. Not just pgBouncer, be it HAProxy or some other service being used by App may fail. This is why we need to have Application level failover as well. You will usually see Active and Passive App servers in neat setups and Apps that are scaled enough to handle failover in case any of the behaviour changes.

By: revanth

revanth — Tue, 29 Oct 2019 20:08:10 +0000

What if the pgbouncer get’s crashed ?!

By: Enzo

Enzo — Sun, 05 May 2019 09:21:05 +0000

Nice post. However I still don’t understand how pgbouncer will redirect write queries to the master and read-only to the slaves.

By: Avinash Vallarapu

Avinash Vallarapu — Wed, 03 Oct 2018 15:14:30 +0000

In reply to Andy. Thank You for reading through the blog. pgBouncer still maintains persistent connections and behave no different. Many applications with native connection poolers use HA Proxy similar way. Using pgBouncer before HAProxy still does the job of a connection pooler as it does when it directly connects to DB

By: Andy

Andy — Wed, 03 Oct 2018 15:01:29 +0000

Why would you put HAProxy behind PgBouncer? The point of PgBouncer is to maintain a pool of PG connections that can be reused. Now that you put HAProxy in there, PgBouncer can’t even connect directly with Postgresql. What’s the point?

By: Avinash Vallarapu

Avinash Vallarapu — Wed, 03 Oct 2018 14:27:33 +0000

In reply to Alexander. Hi Alexander, Correct, this method is simpler too. Both the methods get the information from Patroni and should give same information. Do you see a situation where the leader key is with another node than the actual master ?

By: Jobin Augustine

Jobin Augustine — Wed, 03 Oct 2018 13:42:18 +0000

In reply to Alexander. Thank you very much for this detailed expert comment. We hope all our readers will benefit from this information. The HAProxy configuration template which uses http-check rather than tcp-check is much better and simpler. confd can can make the configuration much more dynamic also.

By: Alexander

Alexander — Wed, 03 Oct 2018 09:14:52 +0000

> tcp-check expect string “role”:\ “master”

This is considered to be a bad practice, you shouldn’t do that!
In some situations it might happen that role is master (because postgres is not running in recovery), but the leader key is owned by some other node.

Patroni provides a few REST API endpoints for master-replica health checks, which will return http status codes 200 or 503. Haproxy must rely only on http status codes.

“/master” — will return http status code 200 only if the node is holding the leader key in DCS. In all other cases it will return 503.
“/replica” — will return code 200 if node is running as replica and it is allows load balancing (noloadbalance tags is not set)

Besides that it is better to use OPTIONS request method instead of GET:
option httpchk OPTIONS /master
http-check expect status 200

In order to automate everything, one can use confd (https://github.com/kelseyhightower/confd), and Patroni provides you an example of haproxy template: https://github.com/zalando/patroni/blob/master/extras/confd/templates/haproxy.tmpl#L19-L32
Confd will take care about generating the new haproxy config and reloading/restarting haproxy if the list of Patroni nodes registered in DCS is changing.