
HOWTO: Execute a PostgreSQL Failover with Patroni

Purpose: Execute a controlled or emergency PostgreSQL failover in a Patroni-managed cluster with structured run-record capture.

Difficulty: Advanced

Track: Disaster Recovery Automation


Overview

Both paths, planned switchover and emergency failover, produce the same run record: Patroni cluster state before and after the change, replication lag at the point of promotion, and confirmation that applications reconnected. Capture each item at the step where it occurs rather than reconstructing it afterwards.


1. Pre-Failover State Capture

  • Patroni cluster status snapshot.
  • Replication lag baseline.
  • Application connection pool state.
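
A minimal sketch of the pre-failover snapshot, assuming the Patroni REST API is reachable on its default port 8008 and that GET /cluster returns a "members" list with "name", "role", "state" and (for replicas) "lag" in bytes. Role names and the lag field can differ between Patroni versions, so treat the field handling as an assumption to verify against your cluster. The pool_state argument is a hypothetical placeholder for whatever your connection pooler reports; it is stored unchanged.

```python
import json
import time
import urllib.request


def fetch_cluster(base_url="http://localhost:8008"):
    """GET /cluster from a Patroni REST API endpoint and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/cluster", timeout=5) as resp:
        return json.load(resp)


def summarize_cluster(cluster, pool_state=None, captured_at=None):
    """Reduce a /cluster document to the fields the run record needs."""
    members = cluster.get("members", [])
    # Assumption: the primary is reported with role "leader".
    leader = next((m["name"] for m in members if m.get("role") == "leader"), None)
    replicas = {
        m["name"]: {"state": m.get("state"), "lag_bytes": m.get("lag")}
        for m in members
        if m.get("role") != "leader"
    }
    return {
        "captured_at": time.time() if captured_at is None else captured_at,
        "leader": leader,
        "replicas": replicas,
        "pool_state": pool_state,  # opaque, supplied by the caller
    }
```

Typical use would be summarize_cluster(fetch_cluster()) taken immediately before the switchover or as soon as a failure is suspected.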

2. Controlled Switchover

  • Pre-switchover health check.
  • Execution: patronictl -c /etc/patroni/config.yml switchover, with start and end times recorded.
  • Post-switchover state confirmation.
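
The switchover steps could be scripted roughly as follows. This is a sketch under stated assumptions: the cluster document has the shape returned by Patroni's GET /cluster endpoint, the 1 MiB lag threshold is purely illustrative, and the --leader, --candidate and --force options are those of recent patronictl releases (older releases used --master). With no cluster name on the command line, patronictl is assumed to take the scope from the config file.

```python
import subprocess
import time

CONFIG = "/etc/patroni/config.yml"  # path taken from this HOWTO


def candidate_is_healthy(cluster, candidate, max_lag_bytes=1048576):
    """Pre-switchover check: candidate is a live replica within the lag limit."""
    for m in cluster.get("members", []):
        if m.get("name") == candidate:
            lag = m.get("lag")
            return (
                m.get("role") != "leader"
                and m.get("state") in ("running", "streaming")
                and isinstance(lag, int)  # lag may be missing or non-numeric
                and lag <= max_lag_bytes
            )
    return False


def switchover_command(leader, candidate, config=CONFIG):
    """Build a non-interactive patronictl switchover invocation."""
    return [
        "patronictl", "-c", config, "switchover",
        "--leader", leader, "--candidate", candidate, "--force",
    ]


def run_timed(cmd, runner=subprocess.run):
    """Run a command and return its result together with timing data."""
    started_at = time.time()
    t0 = time.monotonic()
    result = runner(cmd, capture_output=True, text=True)
    duration_s = time.monotonic() - t0
    return {
        "cmd": cmd,
        "returncode": result.returncode,
        "stdout": result.stdout,
        "started_at": started_at,
        "duration_s": duration_s,
    }
```

The runner parameter exists only so the timing wrapper can be exercised without a live cluster; in normal use it stays at its default.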

3. Emergency Failover

  • Detecting primary failure: Patroni leader election log.
  • Understanding the DCS-based election process.
  • Manual trigger if automatic election is inhibited.
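
Patroni keeps a leader lock in the DCS with a TTL; when the primary stops renewing it, the lock expires and the replicas compete to acquire it. If that automatic election does not happen, an operator can trigger a failover by hand. The sketch below is an operator-side helper, not Patroni's internal election logic: it assumes a cluster document shaped like the output of GET /cluster, picks the live replica with the lowest reported lag, and builds a patronictl failover command using the --candidate and --force options.

```python
CONFIG = "/etc/patroni/config.yml"  # path taken from this HOWTO


def has_leader(cluster):
    """True if any member currently reports the leader role."""
    return any(m.get("role") == "leader" for m in cluster.get("members", []))


def pick_failover_candidate(cluster):
    """Return the live replica with the lowest numeric lag, or None."""
    eligible = [
        m for m in cluster.get("members", [])
        if m.get("role") != "leader"
        and m.get("state") in ("running", "streaming")
        and isinstance(m.get("lag"), int)  # skip missing or non-numeric lag
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda m: m["lag"])["name"]


def failover_command(candidate, config=CONFIG):
    """Build a non-interactive patronictl failover invocation."""
    return [
        "patronictl", "-c", config, "failover",
        "--candidate", candidate, "--force",
    ]
```

Record the time at which has_leader first returned False; it is the natural starting point for the RTO measurement.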

4. Post-Failover Validation

  • New primary confirmed in Patroni cluster list.
  • Old primary rejoined as a replica and streaming from the new primary.
  • Application reconnection confirmed.
  • pgBackRest reconfigured for new primary.
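
The cluster-level checks above can be expressed as a comparison of the before and after snapshots. This sketch assumes both snapshots are cluster documents shaped like the output of Patroni's GET /cluster endpoint; the check names are hypothetical. Application reconnection and the pgBackRest configuration depend on the local setup and are deliberately left out.

```python
def leader_of(cluster):
    """Name of the member reporting the leader role, or None."""
    for m in cluster.get("members", []):
        if m.get("role") == "leader":
            return m.get("name")
    return None


def validate_failover(before, after, expected_primary=None):
    """Compare before/after cluster documents and return named checks."""
    old = leader_of(before)
    new = leader_of(after)
    old_member = next(
        (m for m in after.get("members", []) if m.get("name") == old), None
    )
    checks = {
        # A different member now holds the leader role.
        "leader_changed": new is not None and new != old,
        # The new leader is the one we asked for, if we asked for one.
        "expected_primary": expected_primary is None or new == expected_primary,
        # The old primary is back in the cluster as a live replica.
        "old_primary_rejoined": (
            old_member is not None
            and old_member.get("role") != "leader"
            and old_member.get("state") in ("running", "streaming")
        ),
    }
    checks["ok"] = all(checks.values())
    return checks
```

In an emergency failover the old primary may stay down for some time, so "old_primary_rejoined" can legitimately be False until the node is repaired; record that state rather than treating it as a failed drill.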

5. Run Record Capture

  • Storing before/after cluster state under <runtime-root>/logs/dr/postgresql-failover/.
  • Timing data for RTO measurement.
  • Linking to the DR drill run record.
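
One possible way to persist the run record, as a sketch only. The runtime root is passed in by the caller because this HOWTO leaves <runtime-root> unspecified. The file name pattern, the record field names and the drill_run_id parameter are all hypothetical. RTO is computed here as the interval between failure detection (or switchover start) and confirmed service restoration, both given as Unix timestamps.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_run_record(before, after, failure_detected_at, service_restored_at,
                     drill_run_id=None):
    """Assemble the run record as a plain dict."""
    return {
        "before": before,
        "after": after,
        "failure_detected_at": failure_detected_at,
        "service_restored_at": service_restored_at,
        "rto_seconds": round(service_restored_at - failure_detected_at, 3),
        "drill_run_id": drill_run_id,  # link to the DR drill run record
    }


def write_run_record(runtime_root, record):
    """Write the record under <runtime-root>/logs/dr/postgresql-failover/."""
    out_dir = Path(runtime_root) / "logs" / "dr" / "postgresql-failover"
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.fromtimestamp(
        record["failure_detected_at"], tz=timezone.utc
    ).strftime("%Y%m%dT%H%M%SZ")
    path = out_dir / f"failover-{stamp}.json"  # hypothetical naming scheme
    path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return path
```

Keeping the record as one JSON file per event makes it straightforward to attach to, or reference from, the drill's own run record.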


License: MIT-0 for code, CC-BY-4.0 for documentation unless otherwise stated.