-
Notifications
You must be signed in to change notification settings - Fork 97
Description
Using pgpool 4.7.0:
root@dlss-azanella-db-01:~# pgpool --version
pgpool-II version 4.7.0 (tasukiboshi)
I have a healthy pgpool cluster:
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:39:31
dlss-azanella-db-02.stanford.edu 5432 1 0.500000 waiting up standby standby 0 streaming async 2026-01-30 11:39:47
root@dlss-azanella-db-01:~# su - postgres -c 'psql -c "show checkpoint_timeout;"'
checkpoint_timeout
--------------------
5min
(1 row)
After the checkpoint_timeout, I perform a switchover:
root@dlss-azanella-db-01:~# date
Fri Jan 30 11:47:32 AM PST 2026
root@dlss-azanella-db-01:~# pcp_promote_node -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -n 1 --switchover -w
pcp_promote_node -- Command Successful
The follow_primary script succeeds:
2026-01-30 11:47:49.972: follow_child pid 353841: LOG: === Starting follow primary command for node 0 ===
2026-01-30 11:47:49.972: follow_child pid 353841: LOG: execute command: /etc/pgpool2/follow_primary.sh 0 dlss-azanella-db-01.stanford.edu 5432 /var/lib/postgresql/16/main 1 dlss-azanella-db-02.stanford.edu 0 0 5432 /var/lib/postgresql/16/main
+ NODE_ID=0
+ NODE_HOST=dlss-azanella-db-01.stanford.edu
+ NODE_PORT=5432
+ NODE_PGDATA=/var/lib/postgresql/16/main
+ NEW_PRIMARY_NODE_ID=1
+ NEW_PRIMARY_NODE_HOST=dlss-azanella-db-02.stanford.edu
+ OLD_MAIN_NODE_ID=0
+ OLD_PRIMARY_NODE_ID=0
+ NEW_PRIMARY_NODE_PORT=5432
+ NEW_PRIMARY_NODE_PGDATA=/var/lib/postgresql/16/main
+ PGHOME=/usr/lib/postgresql/16
+ REPLUSER=repl
+ PCP_USER=pgpool
+ PGPOOL_PATH=/sbin
+ PCP_PORT=9898
++ tr -- -. _
++ echo dlss-azanella-db-01.stanford.edu
+ REPL_SLOT_NAME=dlss_azanella_db_01_stanford_edu
+ POSTGRESQL_STARTUP_USER=postgres
+ SSH_KEY_FILE=id_rsa
+ SSH_OPTIONS='-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/id_rsa'
+ PGCONFIG_DIR=/etc/postgresql/16/main
+ echo follow_primary.sh: start: Standby node 0
follow_primary.sh: start: Standby node 0
+ /usr/lib/postgresql/16/bin/pg_isready -h dlss-azanella-db-01.stanford.edu -p 5432
2026-01-30 11:47:49.983: sr_check_worker pid 351959: LOG: worker process received restart request
2026-01-30 11:47:49.984: watchdog pid 175581: LOG: received the failover indication from Pgpool-II on IPC interface
2026-01-30 11:47:49.984: watchdog pid 175581: LOG: watchdog is informed of failover end by the main process
2026-01-30 11:47:49.984: main pid 175576: LOG: === Failover done. shutdown host dlss-azanella-db-01.stanford.edu(5432) ===
+ '[' 0 -ne 0 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i '~/.ssh/id_rsa' postgres@dlss-azanella-db-02.stanford.edu ls /tmp
Warning: Permanently added 'dlss-azanella-db-02.stanford.edu' (ED25519) to the list of known hosts.
+ '[' 0 -ne 0 ']'
++ /usr/lib/postgresql/16/bin/psql -V
++ awk '{print $3}'
++ sed 's/\..*//'
++ sed 's/\([0-9]*\)[a-zA-Z].*/\1/'
+ PGVERSION=16
+ '[' 16 -ge 12 ']'
+ RECOVERYCONF=/var/lib/postgresql/16/main/myrecovery.conf
+ echo follow_primary.sh: pg_rewind for node 0
follow_primary.sh: pg_rewind for node 0
+ /usr/lib/postgresql/16/bin/psql -h dlss-azanella-db-02.stanford.edu -p 5432 postgres -c 'checkpoint;'
CHECKPOINT
+ /usr/lib/postgresql/16/bin/psql -h dlss-azanella-db-02.stanford.edu -p 5432 postgres -c 'SELECT pg_create_physical_replication_slot('\''dlss_azanella_db_01_stanford_edu'\'');'
+ '[' 0 -ne 0 ']'
++ echo /var/lib/postgresql/16/main/myrecovery.conf
++ sed -e 's/\//\\\//g'
++ echo /var/lib/postgresql/16/main/myrecovery.conf
++ sed -e 's/\//\\\//g'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i '~/.ssh/id_rsa' postgres@dlss-azanella-db-01.stanford.edu '
set -o errexit
/usr/bin/sudo /usr/bin/systemctl stop postgresql@16-main
# /usr/lib/postgresql/16/bin/pg_ctl -w -m f -D /var/lib/postgresql/16/main stop
/usr/lib/postgresql/16/bin/pg_rewind -D /var/lib/postgresql/16/main --source-server="user=postgres host=dlss-azanella-db-02.stanford.edu port=5432 dbname=postgres"
[ -d "/var/lib/postgresql/16/main" ] && rm -rf /var/lib/postgresql/16/main/pg_replslot/*
cat > /var/lib/postgresql/16/main/myrecovery.conf << EOT
primary_conninfo = '\''host=dlss-azanella-db-02.stanford.edu port=5432 user=repl application_name=dlss-azanella-db-01.stanford.edu passfile='\'''\''/var/lib/postgresql/.pgpass'\'''\'''\''
recovery_target_timeline = '\''latest'\''
primary_slot_name = '\''dlss_azanella_db_01_stanford_edu'\''
EOT
if [ 16 -ge 12 ]; then
sed -i -e "\$ainclude_if_exists = '\''\/var\/lib\/postgresql\/16\/main\/myrecovery.conf'\''" -e "/^include_if_exists = '\''\/var\/lib\/postgresql\/16\/main\/myrecovery.conf'\''/d" /etc/postgresql/16/main/postgresql.conf
touch /var/lib/postgresql/16/main/standby.signal
else
echo "standby_mode = '\''on'\''" >> /var/lib/postgresql/16/main/myrecovery.conf
fi
/usr/bin/sudo /usr/bin/systemctl start postgresql@16-main
# /usr/lib/postgresql/16/bin/pg_ctl -l /dev/null -w -D /var/lib/postgresql/16/main start
'
Warning: Permanently added 'dlss-azanella-db-01.stanford.edu' (ED25519) to the list of known hosts.
pg_rewind: servers diverged at WAL location 0/7E000148 on timeline 73
pg_rewind: rewinding from last common checkpoint at 0/7E000098 on timeline 73
2026-01-30 11:47:50.984: pcp_main pid 351958: LOG: restart request received in pcp child process
2026-01-30 11:47:50.986: main pid 175576: LOG: PCP child 351958 exits with status 0 in failover()
2026-01-30 11:47:50.987: main pid 175576: LOG: fork a new PCP child pid 353904 in failover()
2026-01-30 11:47:50.987: pcp_main pid 353904: LOG: PCP process: 353904 started
2026-01-30 11:47:50.988: sr_check_worker pid 353905: LOG: process started
pg_rewind: Done!
+ '[' 0 -eq 0 ']'
+ /sbin/pcp_attach_node -w -h localhost -U pgpool -p 9898 -n 0
2026-01-30 11:47:53.793: pcp_child pid 353914: LOG: received failback request for node_id: 0 from pid [353914]
2026-01-30 11:47:53.793: watchdog pid 175581: LOG: watchdog received the failover command from local pgpool-II on IPC interface
2026-01-30 11:47:53.793: watchdog pid 175581: LOG: watchdog is processing the failover command [FAILBACK_REQUEST] received from local pgpool-II on IPC interface
2026-01-30 11:47:53.793: watchdog pid 175581: LOG: The failover request does not need quorum to hold
2026-01-30 11:47:53.793: watchdog pid 175581: DETAIL: proceeding with the failover
2026-01-30 11:47:53.793: watchdog pid 175581: HINT: REQ_DETAIL_CONFIRMED
2026-01-30 11:47:53.793: pcp_child pid 353914: LOG: signal_user1_to_parent_with_reason(0)
2026-01-30 11:47:53.793: main pid 175576: LOG: Pgpool-II parent process received SIGUSR1
2026-01-30 11:47:53.793: main pid 175576: LOG: Pgpool-II parent process has received failover request
pcp_attach_node -- Command Successful
2026-01-30 11:47:53.793: watchdog pid 175581: LOG: received the failover indication from Pgpool-II on IPC interface
2026-01-30 11:47:53.793: watchdog pid 175581: LOG: watchdog is informed of failover start by the main process
+ '[' 0 -ne 0 ']'
2026-01-30 11:47:53.794: main pid 175576: LOG: === Starting fail back. reconnect host dlss-azanella-db-01.stanford.edu(5432) ===
2026-01-30 11:47:53.794: main pid 175576: LOG: Node 1 is not down (status: 1)
+ echo follow_primary.sh: end: follow primary command is completed successfully
follow_primary.sh: end: follow primary command is completed successfully
+ exit 0
2026-01-30 11:47:53.794: main pid 175576: LOG: Do not restart children because we are failing back node id 0 host: dlss-azanella-db-01.stanford.edu port: 5432 and we are in streaming replication mode and not all backends were down
2026-01-30 11:47:53.794: main pid 175576: LOG: find_primary_node_repeatedly: follow primary is ongoing. return current primary: 1
2026-01-30 11:47:53.794: main pid 175576: LOG: failover: set new primary node: 1
2026-01-30 11:47:53.794: main pid 175576: LOG: failover: set new main node: 0
2026-01-30 11:47:53.794: watchdog pid 175581: LOG: received the failover indication from Pgpool-II on IPC interface
2026-01-30 11:47:53.794: watchdog pid 175581: LOG: watchdog is informed of failover end by the main process
2026-01-30 11:47:53.794: sr_check_worker pid 353905: LOG: worker process received restart request
2026-01-30 11:47:53.794: main pid 175576: LOG: === Failback done. reconnect host dlss-azanella-db-01.stanford.edu(5432) ===
2026-01-30 11:47:53.794: follow_child pid 353841: LOG: === Follow primary command for node 0 ended ===
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up standby standby 0 streaming async 2026-01-30 11:47:53
dlss-azanella-db-02.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:47:49
Also, I can perform a switchover in the other direction right away:
root@dlss-azanella-db-01:~# pcp_promote_node -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -n 0 --switchover -w
pcp_promote_node -- Command Successful
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:52:01
dlss-azanella-db-02.stanford.edu 5432 1 0.500000 waiting up standby standby 0 streaming async 2026-01-30 11:52:05
I can also successfully simulate a failover and perform a recovery in both directions:
root@dlss-azanella-db-01:~# systemctl stop postgresql@16-main
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 3 0.500000 down down standby unknown 0 none none 2026-01-30 11:55:22
dlss-azanella-db-02.stanford.edu 5432 2 0.500000 up up primary primary 0 none none 2026-01-30 11:55:22
root@dlss-azanella-db-01:~# pcp_recovery_node -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -n 0 -w
pcp_recovery_node -- Command Successful
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up standby standby 0 streaming async 2026-01-30 11:56:03
dlss-azanella-db-02.stanford.edu 5432 2 0.500000 up up primary primary 0 none none 2026-01-30 11:55:22
root@dlss-azanella-db-02:~# systemctl stop postgresql@16-main
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:57:05
dlss-azanella-db-02.stanford.edu 5432 3 0.500000 down down standby unknown 0 none none 2026-01-30 11:57:05
root@dlss-azanella-db-01:~# pcp_recovery_node -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -n 1 -w
pcp_recovery_node -- Command Successful
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:57:05
dlss-azanella-db-02.stanford.edu 5432 1 0.500000 waiting up standby standby 0 streaming async 2026-01-30 11:57:31
But if I do not wait for the next checkpoint_timeout, a pcp_promote_node –switchover says the command succeeds, but follow_primary fails:
root@dlss-azanella-db-01:~# pcp_promote_node -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -n 1 --switchover -w
pcp_promote_node -- Command Successful
2026-01-30 11:59:10.048: follow_child pid 355683: LOG: === Starting follow primary command for node 0 ===
2026-01-30 11:59:10.048: follow_child pid 355683: LOG: execute command: /etc/pgpool2/follow_primary.sh 0 dlss-azanella-db-01.stanford.edu 5432 /var/lib/postgresql/16/main 1 dlss-azanella-db-02.stanford.edu 0 0 5432 /var/lib/postgresql/16/main
+ NODE_ID=0
+ NODE_HOST=dlss-azanella-db-01.stanford.edu
+ NODE_PORT=5432
+ NODE_PGDATA=/var/lib/postgresql/16/main
+ NEW_PRIMARY_NODE_ID=1
+ NEW_PRIMARY_NODE_HOST=dlss-azanella-db-02.stanford.edu
+ OLD_MAIN_NODE_ID=0
+ OLD_PRIMARY_NODE_ID=0
+ NEW_PRIMARY_NODE_PORT=5432
+ NEW_PRIMARY_NODE_PGDATA=/var/lib/postgresql/16/main
+ PGHOME=/usr/lib/postgresql/16
+ REPLUSER=repl
+ PCP_USER=pgpool
+ PGPOOL_PATH=/sbin
+ PCP_PORT=9898
++ echo dlss-azanella-db-01.stanford.edu
++ tr -- -. _
+ REPL_SLOT_NAME=dlss_azanella_db_01_stanford_edu
+ POSTGRESQL_STARTUP_USER=postgres
+ SSH_KEY_FILE=id_rsa
+ SSH_OPTIONS='-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/id_rsa'
+ PGCONFIG_DIR=/etc/postgresql/16/main
+ echo follow_primary.sh: start: Standby node 0
follow_primary.sh: start: Standby node 0
+ /usr/lib/postgresql/16/bin/pg_isready -h dlss-azanella-db-01.stanford.edu -p 5432
2026-01-30 11:59:10.058: sr_check_worker pid 355388: LOG: worker process received restart request
2026-01-30 11:59:10.058: watchdog pid 175581: LOG: received the failover indication from Pgpool-II on IPC interface
2026-01-30 11:59:10.058: watchdog pid 175581: LOG: watchdog is informed of failover end by the main process
2026-01-30 11:59:10.059: main pid 175576: LOG: === Failover done. shutdown host dlss-azanella-db-01.stanford.edu(5432) ===
+ '[' 0 -ne 0 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i '~/.ssh/id_rsa' postgres@dlss-azanella-db-02.stanford.edu ls /tmp
Warning: Permanently added 'dlss-azanella-db-02.stanford.edu' (ED25519) to the list of known hosts.
+ '[' 0 -ne 0 ']'
++ /usr/lib/postgresql/16/bin/psql -V
++ awk '{print $3}'
++ sed 's/\..*//'
++ sed 's/\([0-9]*\)[a-zA-Z].*/\1/'
+ PGVERSION=16
+ '[' 16 -ge 12 ']'
+ RECOVERYCONF=/var/lib/postgresql/16/main/myrecovery.conf
+ echo follow_primary.sh: pg_rewind for node 0
follow_primary.sh: pg_rewind for node 0
+ /usr/lib/postgresql/16/bin/psql -h dlss-azanella-db-02.stanford.edu -p 5432 postgres -c 'checkpoint;'
CHECKPOINT
+ /usr/lib/postgresql/16/bin/psql -h dlss-azanella-db-02.stanford.edu -p 5432 postgres -c 'SELECT pg_create_physical_replication_slot('\''dlss_azanella_db_01_stanford_edu'\'');'
+ '[' 0 -ne 0 ']'
++ echo /var/lib/postgresql/16/main/myrecovery.conf
++ sed -e 's/\//\\\//g'
++ echo /var/lib/postgresql/16/main/myrecovery.conf
++ sed -e 's/\//\\\//g'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i '~/.ssh/id_rsa' postgres@dlss-azanella-db-01.stanford.edu '
set -o errexit
/usr/bin/sudo /usr/bin/systemctl stop postgresql@16-main
# /usr/lib/postgresql/16/bin/pg_ctl -w -m f -D /var/lib/postgresql/16/main stop
/usr/lib/postgresql/16/bin/pg_rewind -D /var/lib/postgresql/16/main --source-server="user=postgres host=dlss-azanella-db-02.stanford.edu port=5432 dbname=postgres"
[ -d "/var/lib/postgresql/16/main" ] && rm -rf /var/lib/postgresql/16/main/pg_replslot/*
cat > /var/lib/postgresql/16/main/myrecovery.conf << EOT
primary_conninfo = '\''host=dlss-azanella-db-02.stanford.edu port=5432 user=repl application_name=dlss-azanella-db-01.stanford.edu passfile='\'''\''/var/lib/postgresql/.pgpass'\'''\'''\''
recovery_target_timeline = '\''latest'\''
primary_slot_name = '\''dlss_azanella_db_01_stanford_edu'\''
EOT
if [ 16 -ge 12 ]; then
sed -i -e "\$ainclude_if_exists = '\''\/var\/lib\/postgresql\/16\/main\/myrecovery.conf'\''" -e "/^include_if_exists = '\''\/var\/lib\/postgresql\/16\/main\/myrecovery.conf'\''/d" /etc/postgresql/16/main/postgresql.conf
touch /var/lib/postgresql/16/main/standby.signal
else
echo "standby_mode = '\''on'\''" >> /var/lib/postgresql/16/main/myrecovery.conf
fi
/usr/bin/sudo /usr/bin/systemctl start postgresql@16-main
# /usr/lib/postgresql/16/bin/pg_ctl -l /dev/null -w -D /var/lib/postgresql/16/main start
'
Warning: Permanently added 'dlss-azanella-db-01.stanford.edu' (ED25519) to the list of known hosts.
2026-01-30 11:59:11.059: pcp_main pid 355387: LOG: restart request received in pcp child process
2026-01-30 11:59:11.061: main pid 175576: LOG: PCP child 355387 exits with status 0 in failover()
2026-01-30 11:59:11.061: main pid 175576: LOG: fork a new PCP child pid 355743 in failover()
2026-01-30 11:59:11.061: pcp_main pid 355743: LOG: PCP process: 355743 started
2026-01-30 11:59:11.062: sr_check_worker pid 355744: LOG: process started
pg_rewind: servers diverged at WAL location 0/84000060 on timeline 77
pg_rewind: error: could not open file "/var/lib/postgresql/16/main/pg_wal/0000004D0000000000000083": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/83000138
+ '[' 1 -eq 0 ']'
+ /usr/lib/postgresql/16/bin/psql -h dlss-azanella-db-02.stanford.edu -p 5432 postgres -c 'SELECT pg_drop_replication_slot('\''dlss_azanella_db_01_stanford_edu'\'');'
+ '[' 0 -ne 0 ']'
+ echo ERROR: follow_primary.sh: end: follow primary command failed
ERROR: follow_primary.sh: end: follow primary command failed
+ exit 1
2026-01-30 11:59:11.133: follow_child pid 355683: LOG: === Follow primary command for node 0 ended ===
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 3 0.500000 down down standby unknown 0 none none 2026-01-30 11:59:10
dlss-azanella-db-02.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:59:10
Although follow_primary failed, the failover promotion succeeded:
failover.sh: end: new_main_node_id=1 on dlss-azanella-db-02.stanford.edu was successfully promoted to primary
02 is the primary, but there is no replication to 01, even though logs show follow_primary executed the checkpoint. At this point I can recover 01 as the standby and end up with a healthy cluster again:
root@dlss-azanella-db-01:~# pcp_recovery_node -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -n 0 -w
pcp_recovery_node -- Command Successful
root@dlss-azanella-db-01:~# pcp_node_info -h dlss-azanella-pgpool.stanford.edu -p 9898 -U pgpool -w
dlss-azanella-db-01.stanford.edu 5432 1 0.500000 waiting up standby standby 0 streaming async 2026-01-30 12:07:04
dlss-azanella-db-02.stanford.edu 5432 1 0.500000 waiting up primary primary 0 none none 2026-01-30 11:59:10
And at this point, unless I perform a manual checkpoint, or wait until the next checkpoint_timeout has expired a switchover will do the same thing: pcp_promote_node –switchover will say it succeeded, but logs and pcp_node_info will show a promoted node without replication. If I perform a manual checkpoint or wait until the next checkpoint_timeout, a pcp_promote_node –switchover will say it succeeded, and logs will show follow_primary succeeded, and pcp_node_info will show a promoted node with streaming replication.
I guess pcp_promote_node does succeed, in the sense that I end up with a promoted node. But as the switchover does not succeed, I would have expected pcp_promote_node –switchover to error out when the logs show follow_primary does not succeed, and when pcp_node_info does not show a switchover. Unless that is the expected behavior, or I have something misconfigured?