
2025-08-05: Apdex SLO violation in patroni's rails_primary_sql component on main stage

Apdex SLO violation in patroni's rails_primary_sql component on main stage (Severity 3 (Medium))

Problem: The Apdex score for SQL transactions in the Patroni service on the 'main' stage degraded significantly because of database lock waits and contention.

Impact: The performance degradation impacted the rails_primary_sql SLI Apdex for 50 minutes.

Causes: Investigations have shown that the degradation was caused by heavy contention on a database LWLock, triggered by a recent ALTER TABLE migration with a foreign key constraint to the p_duo_workflows_checkpoints table, introduced in db/migrate/20250701233451_create_p_duo_workflows_checkpoints.rb.

Response strategy: To resolve this, we have aligned on replacing the FK constraint with our home-grown 'loose FK' mechanism, which uses delete logging and async propagation to avoid locking issues.
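
For readers unfamiliar with the idea, the "delete logging and async propagation" pattern can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not the actual loose FK implementation: the class LooseForeignKeyDemo, its methods, and the batch size are all hypothetical, and plain Ruby hashes stand in for database tables.

```ruby
# Toy in-memory model of a "loose foreign key": no database-level constraint,
# just a delete log plus a separate cleanup pass.
# All class and method names here are hypothetical, for illustration only.
class LooseForeignKeyDemo
  attr_reader :parents, :children, :deleted_log

  def initialize
    @parents = {}      # parent_id => attributes
    @children = {}     # child_id  => { parent_id: ... }
    @deleted_log = []  # ids of deleted parents awaiting cleanup
  end

  def add_parent(id)
    @parents[id] = {}
  end

  def add_child(id, parent_id)
    @children[id] = { parent_id: parent_id }
  end

  # Deleting a parent only records the deletion. In this model the
  # child rows are not touched at delete time.
  def delete_parent(id)
    return false unless @parents.delete(id)

    @deleted_log << id
    true
  end

  # Stand-in for a background worker: drains the log in small batches
  # and removes children that reference each deleted parent.
  # Returns the number of child rows removed.
  def process_deleted_log(batch_size: 100)
    removed = 0
    until @deleted_log.empty?
      batch = @deleted_log.shift(batch_size)
      before = @children.size
      @children.delete_if { |_id, row| batch.include?(row[:parent_id]) }
      removed += before - @children.size
    end
    removed
  end
end

demo = LooseForeignKeyDemo.new
demo.add_parent(1)
demo.add_parent(2)
demo.add_child(10, 1)
demo.add_child(11, 1)
demo.add_child(12, 2)

demo.delete_parent(1)
orphans_before = demo.children.size  # children remain until cleanup runs
removed = demo.process_deleted_log(batch_size: 1)
```

The trade-off in this sketch is that child rows are briefly orphaned between the parent delete and the cleanup pass, so consistency is eventual rather than enforced at write time.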


This ticket was created to track INC-3118, by incident.io 🔥