Trigger AI usage data backfill when ClickHouse is enabled
What does this MR do and why?
When ClickHouse (CH) is enabled in the application settings, we trigger a backfill of existing data from PostgreSQL (PG) to CH. Temporary duplicates can occur if the backfill is retriggered by re-enabling CH in the application settings. This applies to Dedicated and self-managed (SM) instances.
This change adds functionality to backfill AI events data (specifically DuoChat and CodeSuggestion events) into ClickHouse when analytics with ClickHouse is enabled. The implementation includes:
- A new feature flag ai_events_backfill_to_ch that controls this functionality
- Event subscribers that trigger backfill workers when ClickHouse for analytics is enabled
- Updated code to check application settings directly instead of using ClickHouse's global enablement
- Comprehensive tests to verify the backfill process works correctly, including cases where the feature is enabled/disabled and when settings change
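The subscriber behaviour described in the list above can be sketched in plain Ruby. This is a minimal illustration, not the code from this MR: the `Sketch` module, `SettingsChangedEvent`, `BackfillWorker`, `BackfillSubscriber`, the `FLAGS` hash, and the model names are all hypothetical stand-ins. Only the flag name ai_events_backfill_to_ch and the setting name use_clickhouse_for_analytics come from the description. The sketch also assumes that the backfill is enqueued only on a disabled-to-enabled transition.

```ruby
# Hypothetical stand-ins; none of these class names are taken from the MR diff.
module Sketch
  # Simulated feature-flag store (stands in for a real feature-flag check).
  FLAGS = { ai_events_backfill_to_ch: false }

  # Event published when application settings change (assumed shape).
  SettingsChangedEvent = Struct.new(:previous, :current, keyword_init: true)

  # Stand-in for a background worker; it only records what it was asked to backfill.
  class BackfillWorker
    def self.jobs
      @jobs ||= []
    end

    def self.perform_async(model_name)
      jobs << model_name
    end
  end

  class BackfillSubscriber
    # Assumed names for the two event types mentioned in the description.
    MODELS = %w[DuoChatEvent CodeSuggestionEvent].freeze

    def handle(event)
      # Do nothing while the feature flag is off.
      return unless FLAGS[:ai_events_backfill_to_ch]
      # Enqueue only when the setting flips from disabled to enabled.
      return unless newly_enabled?(event)

      MODELS.each { |model_name| BackfillWorker.perform_async(model_name) }
    end

    private

    def newly_enabled?(event)
      !event.previous[:use_clickhouse_for_analytics] &&
        event.current[:use_clickhouse_for_analytics]
    end
  end
end
```

Under these assumptions, toggling the setting off and on again would enqueue the workers a second time, which is the retrigger scenario that can produce temporary duplicates.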
The change ensures that when an admin enables ClickHouse for analytics, existing AI event data is transferred to ClickHouse. The backfill is intended to run once per enablement to avoid duplicate data. This supports the analytics infrastructure by making historical AI usage data available in ClickHouse for reporting and analysis.
Feature flag for .com
For now, the feature should stay disabled on .com because it can produce duplicate records for AI Analytics. After !186988 (merged) is deployed, we can enable the flag and/or remove it. The backfill behind this flag isn't useful for .com at the moment: it covers the case where ClickHouse was not configured before, and ClickHouse has been configured on .com for a long time.
How to set up and validate locally
- Add some Duo Chat activity by using the chat.
- Set up ClickHouse and change the use_clickhouse_for_analytics setting to true (Admin area -> Settings -> General -> Analytics).
- Verify that the CH table duo_chat_events has data for your Duo Chat activity.
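One way to perform the last check is to count rows through ClickHouse's HTTP interface. The following is a rough sketch under stated assumptions: the helper names are hypothetical, and the URL `http://localhost:8123` and database name `gitlab_clickhouse_development` are guesses about a local development setup that may not match yours.

```ruby
require 'net/http'
require 'uri'

# Builds a count query for a ClickHouse table.
# Rejects anything that is not a plain lowercase identifier.
def row_count_query(table)
  unless table.match?(/\A[a-z_][a-z0-9_]*\z/)
    raise ArgumentError, "unexpected table name: #{table.inspect}"
  end

  "SELECT count() FROM #{table}"
end

# Sends the query to ClickHouse's HTTP interface and returns the count as an Integer.
# base_url and database are assumptions about a local setup; adjust them as needed.
def clickhouse_row_count(table,
                         base_url: 'http://localhost:8123',
                         database: 'gitlab_clickhouse_development')
  uri = URI(base_url)
  uri.query = URI.encode_www_form(database: database)

  # The query text is sent as the POST body.
  response = Net::HTTP.post(uri, row_count_query(table))
  unless response.is_a?(Net::HTTPSuccess)
    raise "ClickHouse returned HTTP #{response.code}: #{response.body}"
  end

  response.body.strip.to_i
end
```

Calling `clickhouse_row_count('duo_chat_events')` before and after enabling the setting should show the count growing once the backfill workers have run, provided the connection details match your environment.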
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #528662