PostgreSQL Connection Pooling: Part 1 – Pros & Cons
Friday, October 18, 2019 at 8:51AM
Kristi Anderson in Connection Pool, Connection Pooler, Connection Pooling, Database, DevOps, Forking, Language Library, Latency, Linux, Multithreaded, Performance, Postgres, Postgres, Web Applications, administrator, application, architecture, cluster, connection-pool, database, database performance online website, deployment, developer, failure, multilanguage, multithreading, open source, postgresql, postgresql, sql, sql

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

A long time ago, in a galaxy far far away, ‘threads’ were a programming novelty rarely used and seldom trusted. In that environment, the first PostgreSQL developers decided forking a process for each connection to the database is the safest choice. It would be a shame if your database crashed, after all.

Since then, a lot of water has flown under that bridge, but the PostgreSQL community has stuck by their original decision. It is difficult to fault their argument – as it’s absolutely true that:

However, in modern web applications, clients tend to open a lot of connections. Developers are often strongly discouraged from holding a database connection while other operations take place. “Open a connection as late as possible, close a connection as soon as possible”. But that causes a problem with PostgreSQL’s architecture – forking a process becomes expensive when transactions are very short, as the common wisdom dictates they should be. In this post, we cover the pros and cons of PostgreSQL connection pooling.

PostgreSQL Architecture DiagramThe PostgreSQL Architecture | Source

The Connection Pool Architecture

Using a modern language library does reduce the problem somewhat – connection pooling is an essential feature of most popular database-access libraries. It ensures ‘closed’ connections are not really closed, but returned to a pool, and ‘opening’ a new connection returns the same ‘physical connection’ back, reducing the actual forking on the PostgreSQL side.

Visual Representation of a Connection PoolThe architecture of a generic connection-pool

However, modern web applications are rarely monolithic, and often use multiple languages and technologies. Using a connection pool in each module is hardly efficient:

As a result, popular middlewares have been developed for PostgreSQL. These sit between the database and the clients, sometimes on a seperate server (physical or virtual) and sometimes on the same box, and create a pool that clients can connect to. These middleware are:

PostgreSQL Connection Pooler Cons

A connection pooler is an almost indispensable part of a production-ready PostgreSQL setup. While there is plenty of well-documented benefits to using a connection pooler, there are some arguments to be made against using one:

Should You Use a PostgreSQL Connection Pooler?

However, all of these problems are well-discussed in the PostgreSQL community, and mitigation strategies ensure the pros of a connection pooler far exceed their cons. Our tests show that even a small number of clients can significantly benefit from using a connection pooler. They are well worth the added configuration and maintenance effort.

In the next post, we will discuss one of the most popular connection poolers in the PostgreSQL world – PgBouncer, followed by Pgpool-II, and lastly a performance test comparison of these two PostgreSQL connection poolers in our final post of the series.

Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.