Redis/RedisHLL events don't get triggered
Problem
Redis(HLL) metrics on GitLab.com have not been updated with events from the frontend since 15th February.
Original description from when the issue was created:
1. The `user_visited_dashboard` event doesn't seem to get triggered when entering an analytics dashboard: for example, here, after opening the `network` tab in Chrome DevTools and clicking on a dashboard, there's no `/track_event` call visible.
2. The same is true for the `user_viewed_dashboard_list` event, which should get triggered when entering the dashboard list view.

Judging by our metrics, it looks like this bug started some time during the last 2 weeks, probably at the same point in time for both of the metrics.
For the current week, we have 0 triggers for `user_visited_dashboard` and 1 for `user_viewed_dashboard_list` (tested on the production console):
[ gprd ] production> Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'user_visited_dashboard', start_date: Date.current, end_date: Date.current + 1.week, property_name: :user)
=> 0
[ gprd ] production> Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'user_viewed_dashboard_list', start_date: Date.current, end_date: Date.current + 1.week, property_name: :user)
=> 1
(Please note that `start_date: Date.current` actually checks data for the whole week, not just for today.)
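The weekly behaviour noted above can be illustrated with a minimal sketch. It assumes counters are bucketed per ISO week (Monday to Sunday); `week_range` and `weekly_key` are hypothetical helpers written for illustration, and the key format is invented here, not taken from the GitLab codebase.

```ruby
require 'date'

# Hypothetical helper: the Monday..Sunday range that contains `date`
# (assumes ISO weeks, which is an assumption and not confirmed by the issue).
def week_range(date)
  monday = date - (date.cwday - 1)
  monday..(monday + 6)
end

# Hypothetical key format: one bucket per event per ISO week.
def weekly_key(event_name, date)
  format('%s-%d-%02d', event_name, date.cwyear, date.cweek)
end

tuesday = Date.new(2024, 2, 27)
friday  = Date.new(2024, 3, 1)

# Both days fall into the same weekly bucket, so querying with either
# date as the start reads the data for the whole week.
puts week_range(tuesday)
puts weekly_key('user_visited_dashboard', tuesday)
puts weekly_key('user_visited_dashboard', friday)
```

Under this assumption, a query that starts on any day of a week reads the same bucket as a query that starts on that week's Monday.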
Detection
@michold detected the problem while implementing another feature. Automatic detection only happened later through https://getmontecarlo.com/incidents/26977dc9-8572-4af2-a53f-e61bc0a82f3d?utm_source=slack
Impact
The following event counters for GitLab.com did not get incremented from the frontend between 2024-02-15 and 2024-02-27:
Redis
- diff_searches
- users_clicking_registration_features_offer
- users_visiting_testing_license_compliance_full_report
- users_visiting_testing_manage_license_compliance
- users_clicking_license_testing_visiting_external_website
RedisHLL
- i_code_review_user_searches_diff
- users_visiting_testing_license_compliance_full_report
- users_visiting_testing_manage_license_compliance
- users_clicking_license_testing_visiting_external_website
- i_testing_metrics_report_widget_total
- i_testing_load_performance_widget_total
- incident_management_issuable_resource_link_visited
- i_testing_full_code_quality_report_total
- users_visiting_pipeline_security
- i_testing_group_code_coverage_project_click_total
- i_testing_group_code_coverage_visit_total
- g_analytics_ci_cd_release_statistics
- g_analytics_ci_cd_deployment_frequency
- g_analytics_ci_cd_lead_time
- g_analytics_ci_cd_time_to_restore_service
- g_analytics_ci_cd_change_failure_rate
- i_code_review_post_merge_click_cherry_pick
- i_code_review_post_merge_click_revert
- i_code_review_post_merge_delete_branch
- users_visiting_security_configuration_threat_management
- p_analytics_ci_cd_time_to_restore_service
- p_analytics_ci_cd_change_failure_rate
- i_code_review_user_create_mr_from_issue
- i_code_review_click_diff_view_setting
- i_code_review_diff_view_inline
- i_code_review_diff_view_parallel
- i_code_review_click_file_browser_setting
- i_code_review_file_browser_tree_view
- i_code_review_file_browser_list_view
- i_code_review_click_whitespace_setting
- i_code_review_diff_show_whitespace
- i_code_review_diff_hide_whitespace
- i_code_review_click_single_file_mode_setting
- i_code_review_diff_single_file
- i_code_review_diff_multiple_files
- design_action
- users_expanding_testing_license_compliance_report
InternalEvents
- user_viewed_cluster_configuration
- i_code_review_saved_replies_create
- merge_request_click_start_review_on_changes_tab
- merge_request_click_add_to_review_on_changes_tab
- p_analytics_ci_cd_pipelines
- p_analytics_ci_cd_deployment_frequency
- p_analytics_ci_cd_lead_time
- i_code_review_saved_replies_use
- i_code_review_saved_replies_use_in_mr
- i_code_review_saved_replies_use_in_other
- user_viewed_dashboard
- user_created_custom_dashboard
- user_edited_custom_dashboard
- user_viewed_dashboard_designer
- user_viewed_custom_dashboard
- user_viewed_builtin_dashboard
- user_viewed_visualization_designer
- user_created_custom_visualization
- exclude_anonymised_users
- user_viewed_dashboard_list
- value_streams_dashboard_metric_link_clicked
- value_streams_dashboard_change_failure_rate_link_clicked
- value_streams_dashboard_contributor_count_link_clicked
- value_streams_dashboard_cycle_time_link_clicked
- value_streams_dashboard_deployment_frequency_link_clicked
- value_streams_dashboard_deploys_link_clicked
- value_streams_dashboard_issues_completed_link_clicked
- value_streams_dashboard_issues_link_clicked
- value_streams_dashboard_lead_time_for_changes_link_clicked
- value_streams_dashboard_lead_time_link_clicked
- value_streams_dashboard_merge_request_throughput_link_clicked
- value_streams_dashboard_time_to_restore_service_link_clicked
- value_streams_dashboard_vulnerability_critical_link_clicked
- value_streams_dashboard_vulnerability_high_link_clicked
- i_analytics_dev_ops_adoption
- i_analytics_dev_ops_score
- insights_chart_item_clicked
- insights_issue_chart_item_clicked
- insights_merge_request_chart_item_clicked
- user_viewed_instrumentation_directions
- g_project_management_users_epic_issue_added_from_epic
- click_create_confidential_mr_issues_list
- click_create_mr_issues_list
- click_new_merge_request_list
- click_new_merge_request_empty_list
- user_edited_cluster_configuration
7d metrics that depend on these counters won't be correct until 2024-03-05, and 28d metrics won't be correct until 2024-03-26. This only applies to metrics that depend on the Redis value and do not have a Snowplow equivalent.
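The two recovery dates follow from simple date arithmetic: a trailing window stops overlapping the affected period once the last affected day has aged out of it. The sketch below reproduces the stated dates; `first_correct_date` is a hypothetical helper, and how the real metrics define their window boundaries is an assumption.

```ruby
require 'date'

# Last day on which frontend events were not counted, per the impact period.
LAST_AFFECTED_DAY = Date.new(2024, 2, 27)

# Hypothetical helper: the first date on which a trailing window of
# `window_days` no longer includes the last affected day.
def first_correct_date(last_affected_day, window_days)
  last_affected_day + window_days
end

puts first_correct_date(LAST_AFFECTED_DAY, 7)   # 2024-03-05
puts first_correct_date(LAST_AFFECTED_DAY, 28)  # 2024-03-26
```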
Additional information
Checklist
- Assigned severity tags based on this guidance
- Assigned to PM and EM of group analytics instrumentation
- Posted link to incident in g_analyze_analytics_instrumentation and tagged both PM and EM of the group
<---- TO BE FILLED BY ASSIGNEE / RESOLUTION DRI---->
Root Cause
The issue was caused by the incomplete removal of the `usage_data_api` feature flag. It was removed from the backend in this MR, but it was not removed from the frontend. As a result, Redis/RedisHLL and InternalEvents API requests were blocked from being sent.
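This failure mode can be modelled in a few lines. The sketch is written in Ruby for consistency with the console snippets, although the real check lived in frontend JavaScript. The `TrackingClient` class and the way flags are passed in are hypothetical, and it assumes the frontend treated a flag that the backend no longer exposes as disabled.

```ruby
# Hypothetical model of a client that gates tracking calls on a feature flag.
class TrackingClient
  attr_reader :sent

  def initialize(enabled_flags)
    @enabled_flags = enabled_flags
    @sent = []
  end

  def track(event_name)
    # Stale guard: once the backend stops exposing the flag, this check
    # never passes and every event is silently dropped.
    return false unless @enabled_flags.include?(:usage_data_api)

    @sent << event_name
    true
  end
end

before_removal = TrackingClient.new([:usage_data_api])
after_removal  = TrackingClient.new([]) # flag gone from the backend

before_removal.track('user_visited_dashboard')
after_removal.track('user_visited_dashboard')

puts before_removal.sent.length # 1
puts after_removal.sent.length  # 0
```

In this model no error is raised when the flag disappears, so a drop like this would only show up in the counters, which is consistent with the delayed automatic detection.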
Resolution
@ankit.panchal removed the feature flag from the frontend as well in !145690 (merged).