« Stuff The Internet Says On Scalability For January 10th, 2020 | Main | Sponsored Post: Fauna, Sisu, Educative, PA File Sight, Etleap, PerfOps, Triplebyte, Stream »
Wednesday
Jan082020

PostgreSQL Connection Pooling: Part 2 – PgBouncer

PostgreSQL Connection Pooling: Part 2 – PgBouncer

When it comes to connection pooling in the PostgreSQL world, PgBouncer is probably the most popular option. It’s a very simple utility that does exactly one thing – it sits between the database and the clients and speaks the PostgreSQL protocol, emulating a PostgreSQL server. A client connects to PgBouncer with the exact same syntax it would use when connecting directly to PostgreSQL – PgBouncer is essentially invisible.

PgBouncer is supported by almost every PostgreSQL DBaaS vendor, and widely used across the community. In this blog post, we’ll explain how PgBouncer works, the pros and cons of using it, and how to setup the connection pooler. If you’d like to know more about connection pooling in general, or wondering if it’s right for your deployment, check out our PostgreSQL Connection Pooling: Part 1 – Pros & Cons post.

How Does PgBouncer Work?

When PgBouncer receives a client connection, it first performs authentication on behalf of the PostgreSQL server. PgBouncer supports all the authentication mechanisms that PostgreSQL server supports, including a host-based-access configuration (note: we cannot route replication connections through PgBouncer). If a password is provided, the authentication can be done in two ways:

  1. PgBouncer first checks the userslist.txt file – this file specifies a set of (username, md5 encrypted passwords) tuples. If the username exists in this file, the password is matched against the given value. No connection to PostgreSQL server is made.
  2. If passthrough authentication is setup, and the user is not found in the userslist.txt file, PgBouncer searches for an auth_query. It connects to PostgreSQL as a predefined user (whose password must be present in the userslist.txt file) and executes the auth-query to find the user’s password and matches it to the provided value.

Once the authentication succeeds:

  1. PgBouncer checks for a cached connection, with the same username+database combination.
  2. If a cached connection is found, it returns the connection to the client.
  3. If a cached connection is not found, it creates a new connection, provided creating a new connection does not:
    • Increase the number of connections to > pool_size
    • Increase the number of connections from the client to > max_client_connections
    • Increase the number of connections to the database to > max_db_connections
    • Increase the number of connections from the user to > max_user_connections
  4. All of these values can be defined in the PgBouncer settings.
  5. If creating a new connection would violate any of the settings, PgBouncer queues the connection until a new one can be created, except if it violates the max_client_connections restriction.
    Note - The timing of post-authentication steps differ slightly based on PgBouncer mode. Under transaction or statement pooling mode, the post-authentication steps are executed only when the client starts executing a transaction/statement. We discuss more about the pooling modes below.
  6. If it violates the max_client_connections restriction, it aborts the connection.

PgBouncer Architecture DiagrampgBouncer Architecture | Source

Based on the pooling mode, PgBouncer waits for an opportunity to return the connection back to the database:

  • In session pooling mode, a connection is returned to the pool only when a client closes the session.
  • In transaction pooling mode, a connection is returned to the pool only when a client completes a transaction (typically either a rollback or a commit is executed). As a result, session-based features are not supported in this mode. There is no guarantee two transactions ran on the same client PgBouncer connection will run on the same PgBouncer server connection.
  • In statement pooling mode, a connection is returned to the pool as soon as a statement is executed. Here, autocommit is always on.

Before returning the connection back to the database, PgBouncer runs a reset query to strip it of all session information – this makes it safe to share connections between clients. It is possible to configure this query based on the needs of the application.

The transaction pooling mode is used most often, though the session pooling mode might be useful for particular workloads. You can read more about PgBouncer on their Wiki page.

PostgreSQL Connection Pooling: Part 2 – PgBouncerCLICK TO TWEET  

Why Choose PgBouncer?

There are many reasons PgBouncer is the most popular choice when it comes to connection pooling in PostgreSQL. Here are some of the best features and pros PgBouncer offers:

  • Pooling Modes – By giving users the power to decide when a connection is returned to the pool, PgBouncer is able to support a vast range of use cases. And, since this setup is at a pool level, you could use transaction mode (better performance) for your usual database connections, and session mode only when you require features like prepared statements!
  • Easy Setup & Use – PgBouncer is one of the easiest PostgreSQL connection poolers to setup out of the box, and it also requires no client-side code changes.
  • Passthrough Authentication – PgBouncer is one of the few “middleware” connection poolers that can securely authenticate a user without having access to their passwords (in plaintext or encrypted form). This makes PgBouncer more secure and much easier to maintain – you need not update PgBouncer every time a user updates their password.
  • Lightweight – It is a single process, and all commands from the client and responses from the server passthrough PgBouncer without any processing. So, it does not need to ‘see’ the entire contents at once, and hence, maintains a very small memory footprint.
  • Scalability & Performance – As we will discuss in more detail in the final part of our series, PgBouncer can significantly improve the transactions per second that your PostgreSQL server can support, and it scales very well to a very large number of clients.

What Doesn’t PgBouncer Do?

PgBouncer, while a great connection pooler, does not support automated load balancing or high-availability. It recommends using other common linux tools like HAProxy to create an architecture which does support these features.

Take a look at the sample PostgreSQL architecture for load-balanced reads below:

PgBouncer Architecture to support load balanced readsPgBouncer architecture for load balanced reads | Source

Note – The master node (that all these slaves would be replicating from) is not shown in the diagram.

How To Set Up PgBouncer

If you have a ScaleGrid PostgreSQL deployment, you can set up PgBouncer in a few clicks. Go to the details view of your PostgreSQL cluster and click on the PgBouncer icon. Once you select “Enable PgBouncer” you will be presented with configuration options to customize your pooling mode and pool size – you can accept the defaults (don’t worry, you can change them anytime with no downtime), and click Enable!

Enable PgBouncer - PostgreSQL Hosting at ScaleGrid DBaaS

And that’s it! You are good to go.

If you have a non-ScaleGrid deployment, PgBouncer is distributed as part of the PostgreSQL repository and can be installed using the respective package managers. For more detailed instructions, or to build from source, you can follow the instructions from their blog.

Once installed, PgBouncer only requires you to set up a few configuration parameters to get up and running:

  1. A list of (username, md5 encrypted password) to authenticate clients or a passthrough authentication setup for a more secure deployment.
  2. Interfaces/IP:ports to listen for incoming connections.
  3. Pool definitions. A ‘pool’ is a name that clients use as a database-name when connecting to PgBouncer – it can be mapped to a full connection string (host, port, dbname and user). The simplest definition is of the form:
    * = host=
    This will create dynamic pools for each dbname+user combination, and connect to the defined host using the port, dbname and username provided by the user.

And that is it! You can be up and running very quickly with PgBouncer. However, there are many more settings which must be tuned for any production distribution – those are beyond the scope of this blog post, but you can read more about them in this PgBouncer configurations overview.

PgBouncer, however, is not the only option for PostgreSQL connection pooling – in our next post, we will discuss Pgpool-II, which is probably the main competitor to PgBouncer. Stay tuned for our fourth post in this four-part series where we compare PgBouncer vs. Pgpool-II.

Reader Comments (1)

hi Kristi, I am searching and looking around for part 3 and 4 on the site. Have those been written yet?

May 8, 2020 | Unregistered Commenterjf

PostPost a New Comment

Enter your information below to add a new comment.
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>