Step-by-Step Guide: Setting up a High Availability PostgreSQL Cluster with Load Balancing and Automatic Fail-over on AWS EC2 using Pgpool-II and Watchdog
Created By
Daniyal Khawaja
Type
Walk-Through
Created
Welcome to this step-by-step guide on setting up a high-availability Postgres cluster with load balancing and automatic failover on AWS EC2 using Pgpool and Watchdog. This guide walks you through the process of creating a highly available database cluster on AWS that provides improved reliability, automatic failover, and load balancing for improved performance.
Before we begin, note that this guide assumes that you have some basic knowledge of AWS, Postgres, Pgpool, and Watchdog. If you are new to any of these technologies, it is recommended that you review the documentation and tutorials provided by their respective vendors before proceeding.
With that said, let's get started!
Step 1: Setting up the Ec2 instances
Log in to the AWS Management Console and navigate to the EC2 dashboard.
Click the "Launch Instance" button to begin creating a new EC2 instance.
Choose the Ubuntu Server AMI (Amazon Machine Image) from the list of available images.
Select the instance type for your EC2 instances. For this setup, it's recommended to use at least medium instances with 2 vCPUs, 4 GB RAM, and 10 GB storage.
Configure the network settings. Make sure to select the same sub-net for all three instances to ensure that they are on the same network.
Review the instance details, and then click "Launch".
Create a new key pair or select an existing one. This will allow you to access your instances using SSH.
Launch your instances and wait for them to become available.
Once your instances are up and running, use SSH to connect to each instance.
Step 2: Installing PostgreSQL and Pgpool-II.
💡
In this guide, we will be using version 15 of PostgreSQL.
💡
The following steps need to be replicated in each instance:
Once you're logged in to each instance, run the following command to update and upgrade the system packages:
sudo apt-get update && sudo apt-get upgrade
Next, you'll need to download and install Postgres. Start by adding the Postgres repository to the sources list:
These changes include enabling the PostgreSQL server to listen on all network interfaces by setting listen_addresses to '*' and setting the port to 5432.
The shared_buffers parameter has been set to 1GB, which controls the amount of memory that PostgreSQL uses for caching data. The work_mem parameter has been set to 2621kB, which controls the amount of memory used for sorting and other operations. The maintenance_work_mem parameter has been set to 256MB, which controls the amount of memory used for maintenance operations such as VACUUM and REINDEX.
The wal_level has been set to replica to enable replication. The archive_mode parameter has been set to on to enable archiving of WAL files. The archive_command and restore_command parameters specify how the archived WAL files are copied to and restored from the archive directory.
The max_wal_senders, max_replication_slots, wal_keep_size, synchronous_standby_names, wal_receiver_create_temp_slot, and wal_receiver_status_interval parameters are all related to synchronous replication and are set according to the specific needs of the cluster.
Finally, various other parameters such as random_page_cost, effective_cache_size, and default_statistics_target have been set to improve performance.
💡
You can always change these settings according to your needs.
To start the primary server, you need to first stop the PostgreSQL service if it is already running. This can be done with the command:
sudo systemctl stop postgresql.service
Next, you need to make sure that the /var/run/postgresql/
directory has the necessary permissions. You can do this with the command:
sudo chmod a+rw /var/run/postgresql/
After that, create a directory for storing the archived WAL files and give it the necessary permissions. You can use the following commands for this:
Once the server is running, you can connect to it using the psql
command-line tool. Use the following command to create a new user with replication privileges:
psql -h localhost -U postgres
####after entering psql, run these commands
CREATE USER <username> WITH PASSWORD '<password>';
ALTER USER <username> WITH REPLICATION;
ALTER USER <username> WITH SUPERUSER;
To confirm that the user was created successfully, you can run the following command:
SELECT usename, usecreatedb, usesuper, userepl, passwd IS NOT NULL AS haspassword FROM pg_user;
Next, you need to create replication slots for the replica databases. To create a physical replication slot, run the following command:
To view the replication slots, you can run the following command:
SELECT slot_name, slot_type, active, wal_status FROM pg_replication_slots;
By following these steps, you should now have a primary PostgreSQL server running with the necessary permissions, a user with replication privileges, and replication slots created for the replica databases.
After creating the replication slots for replica databases, the next step is to add the replication servers to the main server's pg_hba.conf file.
Suppose the replication servers have IP addresses ip1 and ip2. To add them to the pg_hba.conf file, run the following commands:
First, enable password-less SSH from every node to every other node. To do this, run ssh-keygen on every node and copy its ~/.ssh/id_rsa.pub file to every other node's ~/.ssh/authorized_keys file, including its own.
Now start the standby servers using the following command:
Running the command psql -h <standby-ip> -U <username> on the standby server and then executing SELECT * FROM pg_stat_wal_receiver; will display the status of the WAL receiver process. This will confirm whether the standby server is successfully receiving the WAL data from the primary server and is ready to serve as a standby server.
Configuring Pgpool-II with Watchdog and Load Balancing on the Main Server.
After setting up streaming replication, the next step is to configure pgpool and enable high availability and load balancing. The following steps should be taken on the main server:
Change into the “/usr/local/etc” directory
Move the sample files by running the following commands:
To configure Watchdog for Virtual IP, first allocate an Elastic IP to the primary server through AWS. Then, use that IP in the delegate IP configuration .
Your output should be in the log file in the following format:
✅
If u see this output then you can go on ahead and start pgpool on the standby servers too, and observer which one is the master by using:
”pcp_watchdog_status -U username”
After completing the installation and configuration of the PostgreSQL cluster with pgpool and watchdog on AWS, you can check the replication statistics of the primary server using the following command:
You can test if fail-over is working correctly by purposely failing the primary node and observing the cluster's behavior through logs and the following command:
To test if streaming replication is working fine, you can create a table from the primary and test if you can add or delete any value from it from the standby nodes.
To test for load balancing, you can use pgbench. Please refer to the documentation or Google for more information.
Remember to keep your system up to date with the latest security patches and to regularly back up your data to avoid any data loss. If you need further assistance or want to explore more advanced configurations, feel free to consult the official PostgreSQL documentation or seek help from the community.