A few days ago, Heikki Linnakangas posted a message on the PostgreSQL mailing list titled “Let’s make PostgreSQL multi-threaded.” The topic generated quite a discussion on Hacker News, too.
A poll I ran on Twitter shows great interest in this topic and overwhelming support for such an effort!
Should PostgreSQL become multi-threaded?
— Peter Zaitsev (@PeterZaitsev) June 12, 2023
I am very excited to see this discussion finally happening!
From a technical standpoint, I think a multi-threaded architecture is superior: a context switch between processes is considerably more expensive than a switch between threads of the same process, largely because switching processes also means switching address spaces. PostgreSQL performs very well despite this, but moving to threads is surely an opportunity to improve performance, especially for some workloads.
It is a big change, though. PostgreSQL and its plugins have relied on a multi-process rather than multi-threaded architecture for years, and if the PostgreSQL Global Development Group embarks on this effort, it will not be quick. Heikki himself acknowledges it would take multiple major PostgreSQL releases for the work to be complete.
The downside of embarking on such an effort is opportunity cost: scarce development resources could be spent on something else. The change could also affect the stability of PostgreSQL and, even more so, of the plugin ecosystem.
In my mind, though, there is no question this work will need to be done at some point. A database is performance-critical software; to get the best performance possible, it must use the most efficient operating system primitives available. The work is also unlikely to get easier, as the codebase only grows and becomes more complicated through the years (unless, of course, AI becomes capable of performing such a major re-architecting task).
So is it time to bite the bullet and make PostgreSQL multi-threaded? What do you think?
There isn’t really much stomach for this in the community, and this is a topic that has come up on the lists more than once over the years. There is just too much investment in making the existing model reliable. It would also break so much stuff in the wild.
One item that would help is if PostgreSQL were to pre-fork its processes (à la Apache).
I’m not expecting the change to be easy, and safety concerns may push the team to stick to the status quo. While pre-forking can help with the overhead of creating a process, it does nothing for the cost of switching between processes versus the cost of switching between threads.
You can, of course, imagine a process-based architecture where such switches do not happen, such as a fixed number of processes, each bound to a specific CPU core/thread and managing multiple queries per process, but a change to such an architecture is likely to be even harder.
Do you mean an event-based architecture?
That would be even more optimal.
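To make the event-based idea from this thread a bit more concrete, here is a minimal sketch in Python of a single worker that interleaves several queries cooperatively instead of relying on the operating system to switch between processes or threads. It is only an illustration under stated assumptions, not how PostgreSQL works or is planned to work: `run_query`, `worker`, and `serve` are hypothetical names, the "queries" are just counted steps, and pinning the worker to a CPU core is left out.

```python
import asyncio


async def run_query(name, steps, log):
    """Hypothetical query: each step ends at a point where real code would wait on I/O."""
    for step in range(steps):
        log.append((name, step))
        # Cooperative yield point: hands control back to the event loop,
        # standing in for a disk or network wait in a real server.
        await asyncio.sleep(0)
    return f"{name}: done"


async def worker(queries):
    """One worker (assumed to be one process per core) multiplexing many queries."""
    log = []
    results = await asyncio.gather(
        *(run_query(name, steps, log) for name, steps in queries)
    )
    return results, log


def serve(queries):
    # Run the event loop for this worker. In the design discussed above there
    # would be a fixed number of such workers, each bound to a CPU core.
    return asyncio.run(worker(queries))


results, log = serve([("q1", 2), ("q2", 2)])
print(results)
print(log)
```

The log shows the two queries taking turns inside one process, with the switching done in user space by the event loop. The trade-off is that every query has to yield voluntarily, which is part of why retrofitting such a design onto an existing codebase is likely to be hard.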