Tuesday, January 13, 2015

Configuring Automatic failover using Replication Manager 2.0 on PostgreSQL 9.3.5

In PostgreSQL high availability (Streaming Replication/Hot Standby), one thing still requires human intervention and has no automation: in the event of a master database crash, someone has to initiate the failover procedure (trigger file creation or the pg_ctl promote command) on the standby. Core PostgreSQL has no built-in functionality to detect a master failure and notify the standby. Thus, we either need an automated tool to take over the manual failover work, or we have to dive into scripting land and write our own script to do it.
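For reference, the manual promotion that such tools automate boils down to one of these two actions on the standby (the data directory path and trigger file name below are just this post's setup and an illustrative example):

/opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data_slave promote

or, if recovery.conf defines trigger_file = '/tmp/postgresql.trigger.5433':

touch /tmp/postgresql.trigger.5433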

Today, we have very good external tools to handle automatic failover, like Replication Manager (repmgr), EDB Failover Manager (EFM), pgHA and HandyRep. Thanks to all of them for filling the automatic-failover gap in PostgreSQL.

In this post, I am demonstrating Replication Manager on a single node (localhost) running RHEL 6.5 and PostgreSQL 9.3.5. To keep the concept easy to follow, I have compiled repmgr against the EnterpriseDB One Click Installer (a pre-built binary package) instead of PostgreSQL source.

To compile repmgr, we need to install a few mandatory dependency packages (gcc, postgresql-devel, libxslt-devel, pam-devel, libopenssl-devel, krb5-devel and libedit-devel) using yum or rpm. After installing the dependencies, download repmgr 2.0 from here, put pg_config in your PATH and start compiling.
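On RHEL the dependency installation might look something like the command below (treat it as a sketch; exact package names can vary between distributions and repositories):

yum install gcc postgresql-devel libxslt-devel pam-devel libopenssl-devel krb5-devel libedit-devel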
[postgres@localhost:/opt/PostgreSQL/9.3~]$ type pg_config
pg_config is hashed (/opt/PostgreSQL/9.3/bin/pg_config)

export PATH=/opt/PostgreSQL/9.3/bin:$PATH
export LD_LIBRARY_PATH=/opt/PostgreSQL/9.3/lib:$LD_LIBRARY_PATH

tar xvzf repmgr-2.0.tar.gz
cd repmgr
make USE_PGXS=1
make USE_PGXS=1 install
Mostly, repmgr compiles smoothly without any hiccups if all the dependency packages are installed. However, since I am compiling against PG pre-built binaries, the library versions shipped with the installer and with the rpm packages can differ, which may throw compilation errors like this one:
/lib64/libldap_r-2.4.so.2: undefined reference to `ber_sockbuf_io_udp'
collect2: ld returned 1 exit status
make: *** [repmgrd] Error 1
To fix it, look at the library the linker is complaining about, /lib64/libldap_r-2.4.so.2.
[root@localhost repmgr-2.0]# cd /lib64/
[root@localhost lib64]# ls -l libldap*
lrwxrwxrwx. 1 root root     20 Dec  8 09:23 libldap-2.4.so.2 -> libldap-2.4.so.2.5.6
-rwxr-xr-x. 1 root root 317200 Apr 29  2013 libldap-2.4.so.2.5.6
lrwxrwxrwx. 1 root root     22 Dec  8 09:23 libldap_r-2.4.so.2 -> libldap_r-2.4.so.2.5.6
-rwxr-xr-x. 1 root root 335264 Apr 29  2013 libldap_r-2.4.so.2.5.6
Ok, there are two copies; retain one and unlink the other.
[root@localhost lib64]# unlink libldap_r-2.4.so.2
Now clean the previous compilation with "make USE_PGXS=1 clean", repeat the build from step 1, and it will compile without any issues. Other library-related errors can be fixed in a similar way. After installation you can find the repmgr binaries, libraries and .sql files in the usual PostgreSQL locations.
[postgres@localhost:/opt/PostgreSQL/9.3/bin~]$ ls repmgr*     (Two utility commands)
repmgr  repmgrd

[postgres@localhost:/opt/PostgreSQL/9.3/lib/postgresql~]$ ls rep*  
repmgr_funcs.so

[postgres@localhost:/opt/PostgreSQL/9.3/share/postgresql/contrib~]$ ls
repmgr_funcs.sql  repmgr.sql  uninstall_repmgr_funcs.sql  uninstall_repmgr.sql
We are all set to set up automatic failover with the super-simple Replication Manager. As a first step we need streaming replication configured (refer to the wiki), which I have already done on my localhost between two instances on port 5432 (Master) and 5433 (Standby); let's use them. You can also try building the standby using the repmgr STANDBY CLONE command, as sketched below. Refer to the repmgr documentation for more details.
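If you want repmgr to build the standby for you, the clone on this single-host setup would look roughly like the following (a sketch only; check the repmgr 2.0 documentation for the exact options, and remember that on one host you must change port to 5433 in the cloned postgresql.conf before starting it):

repmgr -D /opt/PostgreSQL/9.3/data_slave -d postgres -p 5432 -U postgres --verbose standby clone 127.0.0.1
/opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data_slave start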

Step 1. Enable the repmgr library on both PostgreSQL instances; it is required for repmgr's backend functions.
Master Data Directory : /opt/PostgreSQL/9.3/data
Standby Data Directory: /opt/PostgreSQL/9.3/data_slave

Edit $PGDATA/postgresql.conf 
shared_preload_libraries = 'repmgr_funcs'

[postgres@localhost:/opt/PostgreSQL/9.3~]$ /opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data start

[postgres@localhost:/opt/PostgreSQL/9.3~]$ psql -p 5432 -c "show shared_preload_libraries;"
 shared_preload_libraries
--------------------------
 repmgr_funcs
(1 row)
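The standby instance (port 5433) needs the same shared_preload_libraries setting and a restart as well; on this setup that would be something like:

/opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data_slave restart
psql -p 5433 -c "show shared_preload_libraries;"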
Step 2. Repmgr needs a repmgr.conf file for each node; since we are working on localhost, we have to keep each node's repmgr.conf in a separate directory.
[postgres@localhost:/opt/PostgreSQL/9.3~]$ pwd
/opt/PostgreSQL/9.3
[postgres@localhost:/opt/PostgreSQL/9.3~]$ mkdir -p repmgr/master repmgr/slave
Step 3. Create a repmgr.conf file for the Master (5432) and the Standby (5433) in the directories we created in Step 2.
[postgres@localhost:/opt/PostgreSQL/9.3/repmgr/master~]$ pwd
/opt/PostgreSQL/9.3/repmgr/master
[postgres@localhost:/opt/PostgreSQL/9.3/repmgr/master~]$ more repmgr.conf
cluster=test
node=1
node_name=master
conninfo='host=127.0.0.1 port=5432 dbname=postgres'
pg_bindir=/opt/PostgreSQL/9.3/bin
master_response_timeout=60 
reconnect_attempts=6
reconnect_interval=10
failover=automatic
promote_command='/opt/PostgreSQL/9.3/repmgr/auto_failover.sh'


[postgres@localhost:/opt/PostgreSQL/9.3/repmgr/slave~]$ pwd
/opt/PostgreSQL/9.3/repmgr/slave
[postgres@localhost:/opt/PostgreSQL/9.3/repmgr/slave~]$ more repmgr.conf
cluster=test
node=2
node_name=slave
conninfo='host=127.0.0.1 port=5433 dbname=postgres'
pg_bindir=/opt/PostgreSQL/9.3/bin
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic
loglevel=DEBUG
promote_command='/opt/PostgreSQL/9.3/repmgr/auto_failover.sh'
The main thing to observe in repmgr.conf is "master_response_timeout", the total number of seconds to wait before declaring that the master has disappeared. Within that window, 6 reconnect attempts are made at 10-second intervals. If the master does not respond within "master_response_timeout", automatic failover takes place via the promote_command script. The script contains the standby promotion steps; the one I created for this setup is shown below.
[postgres@localhost:/opt/PostgreSQL/9.3~]$ more repmgr/auto_failover.sh
#!/bin/bash
echo "Promoting Standby at `date '+%Y-%m-%d %H:%M:%S'`" >>/tmp/repsetup.log
/opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data_slave promote >>/tmp/repsetup.log
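The script above is the bare minimum. A slightly more defensive variant might look like the sketch below (the paths and log file are from this setup; the extra check is just an illustration):

#!/bin/bash
# Promote the standby and confirm it actually left recovery
LOG=/tmp/repsetup.log
PGBIN=/opt/PostgreSQL/9.3/bin
PGDATA=/opt/PostgreSQL/9.3/data_slave

echo "Promoting Standby at `date '+%Y-%m-%d %H:%M:%S'`" >>$LOG
$PGBIN/pg_ctl -D $PGDATA promote >>$LOG 2>&1

# Give the server a moment, then log whether it is still in recovery
sleep 5
$PGBIN/psql -p 5433 -Atc "select pg_is_in_recovery();" >>$LOG 2>&1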
Step 4. Register the Master and Standby nodes with repmgr using the "repmgr" utility.
[postgres@localhost:/opt/PostgreSQL/9.3~]$ repmgr -f repmgr/master/repmgr.conf --verbose master register >/tmp/repsetup.log 2>&1
Master register logs (/tmp/repsetup.log):
[2015-01-12 01:28:55] [INFO] repmgr connecting to master database
[2015-01-12 01:28:55] [INFO] repmgr connected to master, checking its state
[2015-01-12 01:28:55] [INFO] master register: creating database objects inside the repmgr_test schema
[2015-01-12 01:28:55] [DEBUG] master register: CREATE SCHEMA repmgr_test
[2015-01-12 01:28:55] [DEBUG] master register: CREATE TABLE repmgr_test.repl_nodes (
[2015-01-12 01:28:55] [DEBUG] master register: CREATE TABLE repmgr_test.repl_monitor (
[2015-01-12 01:28:55] [DEBUG] master register: CREATE VIEW repmgr_test.repl_status AS
[2015-01-12 01:28:55] [DEBUG] master register: CREATE INDEX idx_repl_status_sort ON repmgr_test.repl_monitor (last_monitor_time, standby_node)
[2015-01-12 01:28:55] [DEBUG] master register: INSERT INTO repmgr_test.repl_nodes (id, cluster, name, conninfo, priority) VALUES (1, 'test', 'master', 'host=127.0.0.1 port=5432 dbname=postgres', 0)
[2015-01-12 01:28:55] [NOTICE] Master node correctly registered for cluster test with id 1 (conninfo: host=127.0.0.1 port=5432 dbname=postgres)
Opening configuration file: repmgr/master/repmgr.conf
Standby:
[postgres@localhost:/opt/PostgreSQL/9.3~]$ repmgr -f repmgr/slave/repmgr.conf --verbose standby register >>/tmp/repsetup.log 2>&1
Standby register logs (/tmp/repsetup.log):
[2015-01-12 01:30:37] [INFO] repmgr connecting to standby database
[2015-01-12 01:30:37] [INFO] repmgr connected to standby, checking its state
[2015-01-12 01:30:37] [DEBUG] standby register: SELECT 1 FROM pg_namespace WHERE nspname = 'repmgr_test'
[2015-01-12 01:30:37] [INFO] repmgr connecting to master database
[2015-01-12 01:30:37] [INFO] finding node list for cluster 'test'
[2015-01-12 01:30:37] [INFO] checking role of cluster node 'host=127.0.0.1 port=5432 dbname=postgres'
[2015-01-12 01:30:37] [INFO] repmgr connected to master, checking its state
[2015-01-12 01:30:37] [INFO] repmgr registering the standby
[2015-01-12 01:30:37] [DEBUG] standby register: INSERT INTO repmgr_test.repl_nodes(id, cluster, name, conninfo, priority) VALUES (2, 'test', 'slave', 'host=127.0.0.1 port=5433 dbname=postgres', 0)
[2015-01-12 01:30:37] [INFO] repmgr registering the standby complete
[2015-01-12 01:30:37] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=127.0.0.1 port=5433 dbname=postgres)
Opening configuration file: repmgr/slave/repmgr.conf
Looking at the logs you can easily see that repmgr creates its own schema in the database, named "repmgr_$CLUSTER", with a few tables, a view and functions in it, and that it records a replication-lag row in the repl_monitor table. Repmgr also ships a daemon utility called "repmgrd", run from the standby node, which monitors master availability; it is a management and monitoring daemon that watches the cluster status and can trigger standby promotion. We need to start the "repmgrd" daemon after registering the master/standby nodes with repmgr.
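If you want to inspect those objects directly, a couple of quick queries against the schema created for this cluster (repmgr_test, since cluster=test) show what was registered and what is being monitored:

psql -p 5432 -d postgres -c "SELECT * FROM repmgr_test.repl_nodes;"
psql -p 5432 -d postgres -c "SELECT standby_node, last_monitor_time, replication_lag FROM repmgr_test.repl_status;"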

Step 5. Let's start the repmgrd daemon process to watch the master. In our case the standby is not running on a separate node, so we start the daemon with the repmgr/slave/repmgr.conf file.
repmgrd -f repmgr/slave/repmgr.conf --verbose --monitoring-history >>/tmp/repsetup.log 2>&1 &

Logs (/tmp/repsetup.log)

[2015-01-12 01:42:13] [INFO] repmgrd Connecting to database 'host=127.0.0.1 port=5433 dbname=postgres'
[2015-01-12 01:42:13] [INFO] repmgrd Connected to database, checking its state
[2015-01-12 01:42:13] [INFO] repmgrd Connecting to primary for cluster 'test'
[2015-01-12 01:42:13] [INFO] finding node list for cluster 'test'
[2015-01-12 01:42:13] [INFO] checking role of cluster node 'host=127.0.0.1 port=5432 dbname=postgres'
[2015-01-12 01:42:13] [INFO] repmgrd Checking cluster configuration with schema 'repmgr_test'
[2015-01-12 01:42:13] [INFO] repmgrd Checking node 2 in cluster 'test'
[2015-01-12 01:42:13] [INFO] Reloading configuration file and updating repmgr tables
[2015-01-12 01:42:13] [INFO] repmgrd Starting continuous standby node monitoring
[2015-01-12 01:42:14] [DEBUG] standby_monitor: INSERT INTO repmgr_test.repl_monitor VALUES(1, 2, '2015-01-12 09:42:14.457287+00'::timestamp with time zone,  '2015-01-12 09:42:13.950172+00'::timestamp with time zone, '2/C84DAB08', '2/C84DAB08',  0, 0)
Step 6. Monitor the nodes registered with repmgr and the running daemon process.
[postgres@localhost:/opt/PostgreSQL/9.3~]$ psql -p 5432 -d postgres -xc "SELECT * FROM repmgr_test.repl_status"
-[ RECORD 1 ]-------------+------------------------------
primary_node              | 1
standby_node              | 2
standby_name              | slave
last_monitor_time         | 2015-01-12 09:42:29.514056+00
last_wal_primary_location | 2/C84DB7A8
last_wal_standby_location | 2/C84DB7A8
replication_lag           | 0 bytes
replication_time_lag      | 00:04:01.960772
apply_lag                 | 0 bytes
communication_time_lag    | 00:03:59.45349

[postgres@localhost:/opt/PostgreSQL/9.3~]$ repmgr -f repmgr/master/repmgr.conf cluster show
[2015-01-12 01:47:00] [INFO] repmgr connecting to database
Role      | Connection String
* master  | host=127.0.0.1 port=5432 dbname=postgres
  standby | host=127.0.0.1 port=5433 dbname=postgres
Step 7. Let's simulate a failover scenario by taking the master down. I am doing it the crude way, by killing the postmaster process. Please don't do the same in production if you want to stay in the same company for a long time :).
[postgres@localhost:/opt/PostgreSQL/9.3~]$ kill `head -n1 data/postmaster.pid`
Or, you can take the master down safely:
[postgres@localhost:/opt/PostgreSQL/9.3~]$ pg_ctl -D /opt/PostgreSQL/9.3/data stop -mf
Now check the logs to see how repmgr promoted the standby:
[2015-01-12 02:14:11] [WARNING] Can't stop current query: PQcancel() -- connect() failed: Connection refused
[2015-01-12 02:14:11] [WARNING] repmgrd: Connection to master has been lost, trying to recover... 10 seconds before failover decision
...
[2015-01-12 02:14:21] [ERROR] repmgrd: We couldn't reconnect for long enough, exiting...
[2015-01-12 02:14:21] [DEBUG] repmgrd: there are 2 nodes registered
[2015-01-12 02:14:21] [DEBUG] repmgrd: node=1 conninfo="host=127.0.0.1 port=5432 dbname=postgres" witness=false
[2015-01-12 02:14:21] [ERROR] Connection to database failed: could not connect to server: Connection refused
        Is the server running on host "127.0.0.1" and accepting
        TCP/IP connections on port 5432?
[2015-01-12 02:14:21] [DEBUG] repmgrd: node=2 conninfo="host=127.0.0.1 port=5433 dbname=postgres" witness=false
[2015-01-12 02:14:21] [DEBUG] Total nodes counted: registered=2, visible=1
[2015-01-12 02:14:21] [DEBUG] XLog position of node 2: log id=2 (2), offset=3360733048 (C850B778)
[2015-01-12 02:14:21] [DEBUG] Last XLog position of node 2: log id=2 (2), offset=3360733048 (C850B778)
[2015-01-12 02:14:26] [INFO] repmgrd: This node is the best candidate to be the new primary, promoting...
[2015-01-12 02:14:26] [DEBUG] promote command is: "/opt/PostgreSQL/9.3/repmgr/auto_failover.sh"
Promoting Standby at 2015-01-12 02:14:26
server promoting
[2015-01-12 02:14:29] [INFO] repmgrd Checking cluster configuration with schema 'repmgr_test'
[2015-01-12 02:14:29] [INFO] repmgrd Checking node 2 in cluster 'test'
[2015-01-12 02:14:29] [INFO] Reloading configuration file and updating repmgr tables
[2015-01-12 02:14:29] [INFO] repmgrd Starting continuous primary connection check
Perfect. The repmgr daemon recognised the master failure and, before promoting the standby via the "auto_failover.sh" script, it properly verified the standby's WAL location as well. Let's verify whether the standby was actually promoted:
[postgres@localhost:/opt/PostgreSQL/9.3~]$ psql -p 5433 -c "select pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 f
(1 row)
Incredible: without any human intervention, the manual failover procedure was taken care of by repmgr. You can rebuild the dead master as a standby either with repmgr itself (see the sketch below) or by following the switchback procedure shown in my previous post. The same setup works across two nodes; just execute the standby steps on the standby node instead of localhost.
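To rebuild the dead master as a standby with repmgr itself, the rough idea is to re-clone its data directory from the newly promoted master and start it back up. Something along these lines should work, but treat it as a sketch and check the repmgr 2.0 documentation for the exact options:

# On the old master, after making sure its instance is stopped
repmgr -D /opt/PostgreSQL/9.3/data -d postgres -p 5433 -U postgres --force --verbose standby clone 127.0.0.1
# Set port back to 5432 in the cloned postgresql.conf, then start it
/opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data start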

Thanks for reading.
Raghav

10 comments :

Anonymous said...

But does the client application also connect to the new master? And how does a client know where it can find the new master database?

Raghavendra said...

Repmgr handles only the automatic failover, not the application connection swap. For that you need to depend on middleware like pgpool or pgbouncer, and repoint it to the new master from the failover script at failover time. One smooth way I found was pgpool: if you configure pgpool in load-balancing mode, then when failover happens pgpool sends all connections to the standby as soon as it finds the master unavailable. This way the application, with slight rollbacks, immediately connects to the newly promoted master via pgpool.

Anonymous said...

Hi Raghavendra,

I have setup the load balancing and failover for High Availability in pgpool. I have used the Master-slave replication (stream mode).

DB version PPAS 9.4AS(EnterpriseDB)

pgpool version 3.3.4

The following is my failover command in pgpool.conf

failover_command = '/usr/local/etc/failover.sh %d %P %H %R'

Following is the failover script.

#!/bin/bash -x
FALLING_NODE=$1 # %d
OLDPRIMARY_NODE=$2 # %P
NEW_PRIMARY=$3 # %H
PGDATA=$4 # %R

if [ $FALLING_NODE = $OLDPRIMARY_NODE ]; then
    if [ $UID -eq 0 ]; then
        su enterprisedb -c "ssh -T enterprisedb@$NEW_PRIMARY touch $PGDATA/trigger"
    else
        ssh -T enterprisedb@$NEW_PRIMARY touch $PGDATA/trigger
    fi
    exit 0;
fi;
exit 0;

To check the failover scenario I stopped (kill -9) the master db server process. According to the script the present slave should take over the role of master, but that was not happening in my case.

From pgpool end

show pool_nodes;

o/p:

edb=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role
---------+------------+------+--------+-----------+---------
0 | 10.0.0.149 | 5444 | 3 | 0.500000 | standby
1 | 10.0.0.158 | 5444 | 2 | 0.500000 | standby
(2 rows)


Please help me in fixing this issue. Please provide me a working failover script if you have one.

Thanks for your time.

Anonymous said...

[ERROR] failed to connect to local node, node marked as failed and terminating!
Got this error before executing the script

Anonymous said...

Hi Raghavendra,

First of all, thanks for this article; I followed it and it has helped me a lot.
I have setup the load balancing and failover for High Availability in pgpool. I have used the Master-slave replication (streaming replication).
After successful execution, a few days ago when I monitored it I got the below results:

$ repmgr -f /backups/repmgr/Primary/repmgr.conf cluster show
Role | Connection String
* master | host=IPAddress1 dbname=postgres port=5432 user=postgres
standby | host=IPAddress2 dbname=postgres port=5432 user=postgres

This will show perfectly master and standby with respective information.
Also,

postgres=# select * from repmgr_shadow.repl_nodes ;
id | cluster | name | conninfo | priority | witness
----+---------+---------------+------------------------------------------------------------+----------+---------
1 | shadow | IPAddress1 | host=IPAddress1 dbname=postgres port=5432 user=postgres | 0 | f
2 | shadow | IPAddress2 | host=IPAddress2 dbname=postgres port=5432 user=postgres | 0 | f
(2 rows)

But when I run the status command I got nothing, i.e. zero results.
postgres=# select * from repmgr_shadow.repl_status ;
(0 rows)

Can you help me debug it, how it happened and what I should do to get out of it?

Abdul Rahim said...

Hi,

Do you have any idea about Failover Manager using Streaming Replication? I had it configured, but I am getting an error.

I have setup a Failover Manager in a 3 Node Cluster(1 Master DB, 1 Stand by DB & 1 witness Node). I have used Streaming Replication for the same. Streaming Replication is working perfectly fine.
I have started the EFM agent in all the nodes and it was started in all the three nodes without any failure.
When I query the cluster status I got the following messages. On the master it couldn't detect the standby node, and on the standby node it could not detect the master node.

Can anybody please help to sort out the same.

In Master Node
------------------

efm cluster-status efm
Cluster Status: efm

Agent Type Address Agent DB Info
--------------------------------------------------------------
Master 192.168.192.120 UP UP

Allowed node host list:
192.168.192.120 192.168.1.121

Standby priority host list:
192.168.1.121

Promote Status:

DB Type Address XLog Loc Info
--------------------------------------------------------------
Master 192.168.192.120 0/14033218

No standby databases were found.

In Stand By Node
---------------------

efm cluster-status efm
Cluster Status: efm

Agent Type Address Agent DB Info
--------------------------------------------------------------
Standby 192.168.192.121 UP UP

Allowed node host list:
192.168.192.121

Standby priority host list:
192.168.192.121

Promote Status:

DB Type Address XLog Loc Info
--------------------------------------------------------------
Standby 192.168.192.121 0/140333C8

No master database was found.
[root@bdnode02 efm-2.0]#


In witness Node
===============

[root@bdnode04 efm-2.0]# efm cluster-status efm
Cluster Status: efm

Agent Type Address Agent DB Info
--------------------------------------------------------------
Witness 192.168.192.123 UP N/A

Allowed node host list:
192.168.192.123

Standby priority host list:
(List is empty.)

Promote Status:

Did not find XLog location for any nodes.

Bastiaan Welmers said...

I would advise storing the PostgreSQL data directory somewhere in /var rather than /opt, since /opt, like /usr, is meant for static program data only on many systems. Something like /var/opt/PostgreSQL/9.3/data would be a logical choice.

Unknown said...

Hi Raghavendra,

This blog is really helpful for me, and I have a doubt.
On the master server both the database and the application are installed, and on two slaves hot_standby replication has been done, each with a different IP address. My query is: if I follow the above steps with the replication manager (external tool), will it automatically promote one slave as the new master and keep the other as a slave at the time of failover from the existing master?

For example: my application address is 192.168.2.10 (this is the master server). Now this server fails; according to your blog, if I follow these steps one slave server will become the new master, which means my application address will change to 192.168.2.20. In this case, how will the client know the new IP address?


Kindly help me for this issue.

Unknown said...

Hi Raghavendra,

I had another doubt please clarify me,

I have set up hot_standby and replicated data to the slave server. Right now the master server handles read and write operations, but the slave server is read-only. If failover occurs on the master server the slave will automatically become the new master; in this case, since the slave was read-only, how does it get read-write access?

Unknown said...

Hi Ragav, is automatic failover possible for PostgreSQL master/slave instances on Windows Server?

