Evaluate and Implement Tools for Gathering Container Metrics to Report Component Performance
As part of our ongoing effort to implement component-level performance testing, we need to evaluate and select appropriate tools for gathering metrics from containers. This is a crucial step in shifting performance testing left, as outlined in the related issue Create AI Gateway component for Performance tes... (#3304 - closed).
Objectives:
- Research and identify potential tools for container metric collection
- Evaluate each tool based on the defined evaluation criteria
- Select the most suitable tool(s) for our needs
Tasks:
- Research potential tools:
  - cAdvisor
  - Telegraf
  - Prometheus Node Exporter
  - Others as identified during research
- Go through the runner configs for the QA runners and determine whether these tools need to be installed as part of the runner setup or can be run separately as a sidecar container next to the component.
- Define evaluation criteria:
  - Ease of integration with our current CI/CD pipeline
  - Ease of sending metrics to external InfluxDB
  - Compatibility with InfluxDB
  - Compatibility with Docker and Kubernetes environments
  - Ability to collect key metrics (CPU, memory, network, disk I/O)
  - Performance overhead
  - Data export capabilities (especially to InfluxDB, as used in the PoC)
  - Community support and documentation
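
To make the InfluxDB-related criteria above more concrete, here is a minimal sketch of what a single container sample looks like once serialized to InfluxDB line protocol, which is the text format InfluxDB accepts on its write endpoints. It is not taken from the PoC. The measurement name `container_metrics`, the tag names, the runner name, and the field names are all hypothetical and chosen only for illustration. Whichever tool is selected would need to produce, or be adapted to produce, output of roughly this shape.

```python
import math
from typing import Mapping, Union

FieldValue = Union[int, float, bool, str]


def _escape_key(value: str) -> str:
    """Escape commas, equals signs, and spaces (tag keys, tag values, field keys)."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value: FieldValue) -> str:
    """Render one field value using line protocol type conventions."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("NaN and infinity cannot be written as field values")
        return repr(value)
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, FieldValue],
    timestamp_ns: int,
) -> str:
    """Serialize one sample as: measurement,tag=val field=val timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    # Tags are sorted by key, which InfluxDB recommends for write performance.
    tag_part = "".join(
        f",{_escape_key(k)}={_escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape_key(k)}={_format_field(v)}" for k, v in fields.items()
    )
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    # Hypothetical sample for illustration only; the values are not measurements.
    print(
        to_line_protocol(
            "container_metrics",
            {"component": "ai-gateway", "runner": "qa-runner-1"},
            {"cpu_percent": 12.5, "memory_bytes": 268435456},
            1700000000000000000,
        )
    )
```

A tool that can emit this format natively (or through a built-in InfluxDB output) would likely score better on the "Ease of sending metrics to external InfluxDB" criterion than one that needs a custom conversion step like the one sketched here.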
Edited by Vishal Patel