
Monitor and fix unhandled Sidekiq job errors (Ai::RepositoryXray::ScanDependenciesWorker)

Context

In #476177 (closed), we introduced the Ai::Context::Dependencies::ConfigFiles::Base class, with the intention that each dependency manager config file type is represented by a child class. Each child class contains its parsing logic in extract_libs, which returns a list of libraries and their versions from the file content. extract_libs is executed when the config file parser (Ai::Context::Dependencies::ConfigFileParser) runs .parse! on each config file object.

The Sidekiq worker Ai::RepositoryXray::ScanDependenciesWorker runs Ai::RepositoryXray::ScanDependenciesService, which executes the config file parser.
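To make the structure above concrete, here is a minimal, self-contained sketch of the base-class/child-class pattern. It is not the actual implementation: the module ContextSketch, the Lib struct, the PipRequirements class, and the requirements.txt-style format are all hypothetical stand-ins chosen for illustration. Only the names parse! and extract_libs come from the description above.

```ruby
# Hypothetical, simplified stand-ins for the classes described above.
# The real Ai::Context::Dependencies classes will differ.
module ContextSketch
  Lib = Struct.new(:name, :version)

  # Stand-in for the shared base class.
  class ConfigFileBase
    attr_reader :content, :libs

    def initialize(content)
      @content = content
      @libs = []
    end

    # Entry point the parser calls on each config file object.
    def parse!
      @libs = extract_libs
    end

    private

    # Each child class supplies its own parsing logic here.
    def extract_libs
      raise NotImplementedError
    end
  end

  # Illustrative child class for a "name==version" per-line format.
  class PipRequirements < ConfigFileBase
    private

    def extract_libs
      content.each_line.filter_map do |line|
        line = line.strip
        next if line.empty? || line.start_with?('#')

        name, version = line.split('==', 2)
        Lib.new(name, version)
      end
    end
  end
end

file = ContextSketch::PipRequirements.new("requests==2.31.0\n# comment\nflask\n")
file.parse!
file.libs.map(&:to_a) # => [["requests", "2.31.0"], ["flask", nil]]
```

In this sketch a line without a pinned version yields a nil version, which is exactly the kind of edge case that each child class has to decide how to handle.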

Problem

The parsing logic in extract_libs sometimes misses edge cases in the file content. When it encounters an unexpected data type or value, it raises an exception that bubbles up as a Sidekiq job error. These are unhandled exceptions and should either be fixed or be caught and re-raised as a known ParsingError instead.
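One possible shape for the "catch and re-raise" option is sketched below. This is an assumption about how it could look, not the existing code: the XraySketch module, the JsonConfigFile class, the JSON "dependencies" layout, and this ParsingError definition are hypothetical. The idea is that any unexpected StandardError raised inside extract_libs is converted into a single known error type, so callers only need to handle ParsingError.

```ruby
require 'json'

# Hypothetical sketch; class names and the JSON layout are assumptions.
module XraySketch
  ParsingError = Class.new(StandardError)

  class JsonConfigFile
    def initialize(content)
      @content = content
    end

    def parse!
      extract_libs
    rescue ParsingError
      # Already a known error: pass it through unchanged.
      raise
    rescue StandardError => e
      # Unexpected data type or value: re-raise as the known error type.
      # Ruby records the original exception as #cause automatically.
      raise ParsingError, "unexpected error while parsing: #{e.class}"
    end

    private

    def extract_libs
      parsed = JSON.parse(@content)
      # Assumes "dependencies" is a hash of name => version. A string,
      # null, or missing key raises here and is wrapped by parse!.
      parsed.fetch('dependencies').map do |name, version|
        { name: name, version: version }
      end
    end
  end
end

begin
  XraySketch::JsonConfigFile.new('{"dependencies": "oops"}').parse!
rescue XraySketch::ParsingError => e
  puts "#{e.message} (cause: #{e.cause.class})"
end
```

A blanket rescue like this keeps unknown edge cases out of the Sidekiq error count, but it can also hide genuine bugs, so the original exception class is kept in the message and in #cause to make the logs useful for follow-up fixes.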

Typically, the rate of these unhandled errors is quite low compared to the success rate (see Grafana worker detail). They are therefore considered low-priority "bugs", but they should still be addressed for code completeness and to avoid impacting our error budget.

References

Proposal

We should periodically monitor Kibana logs for these failed Sidekiq jobs and fix or handle the underlying errors as needed. This needs to be a continuous process throughout the lifetime of the X-Ray service because new config file classes may be added over time. New edge cases may also be uncovered as the service becomes more widely used.

UPDATE [2024-02-04]

Per #500575 (comment 2303465333), we have promoted this to an epic (&16680 (closed)) and will create child issues as needed.

Edited by Leaminn Ma