Add monitoring/reporting for wallet transaction mismatches between PostgreSQL and ClickHouse
Overview
Create weekly monitoring and reporting to identify subscriptions where sum(postgres consumption) does not match sum(clickhouse consumed). This will provide visibility into wallet transaction discrepancies and help us stay on top of these issues proactively.
Requirements
Phase 1: Weekly Mismatch Report
Generate weekly reports for all overaged subscriptions that identify:
- Overaged subscriptions with mismatches in sums (beyond rounding errors)
- Mismatch delta per subscription
- Unique transaction IDs with mismatches, along with delta per transaction ID
- If no mismatches found for the week, report that as well
Phase 2: Notification & Monitoring Integration
- Integrate with existing usage billing job monitoring for notifications
- Assign weekly monitoring responsibility to the on-call team member
- Reuse existing monitoring infrastructure where possible
Phase 3: Remediation Steps
- Identify and document remediation steps for when mismatches are found
- Create action items based on debugging findings
Context
- Currently, wallet_transactions are not easily accessible in ClickHouse
- Comparative analysis requires querying two separate sources (PostgreSQL and ClickHouse)
- Access to production console is limited, making manual validation difficult
- Starting with overaged subscriptions; can expand to all subscriptions later if needed
Future Considerations
- Eventually migrate PostgreSQL wallet transactions and consumer/subscription_wallet tables to ClickHouse (potentially via Siphon)
- This would enable seamless dashboard/query monitoring within ClickHouse
- Would eliminate the need for cross-source analysis
Related Issue
Closes #15772 - Investigation: Credit Mismatch Between ClickHouse and PostgreSQL Wallet Transactions