Telemetry: Fix `app_server_type` attribute
What does this MR do?
- Depends on:
- Closes: #219114 (closed)
For Usage Ping we have been collecting the `app_server_type` for a while now, but that data is always wrong because it is evaluated based on the client runtime, which will always be a Sidekiq worker, not a Rails app server.
In order to reliably know which app server (Puma or Unicorn) is running on which node in any given Omnibus installation, we need to push this logic out of the application runtime and down to Prometheus. Fortunately, we sort of already have this data: since both Unicorn and Puma export metrics to Prometheus, the mere presence of these will tell us what is running.
The metric in question is added as a recording rule in this MR: omnibus-gitlab!4374 (merged)
Here, we are making the client-side changes that involve an additional query for the new metric being recorded; it carries a `server` label indicating which app server (`puma` or `unicorn`) is running. Moreover, via the `instance` and `job` labels we can then associate this with a node and submit it alongside the existing data in the `topology` Usage Ping.
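The label-based association described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Ruby code in this MR: the function names, the `JOB_TO_SERVICE` mapping, the job name `gitlab-rails`, and the sample addresses are all hypothetical. The only thing assumed about Prometheus is the shape of an instant-vector result, where each sample carries its label set under `metric`.

```python
from collections import defaultdict

# Hypothetical mapping from Prometheus job names to topology service names.
JOB_TO_SERVICE = {"gitlab-rails": "web"}


def strip_port(instance):
    """Drop a trailing ':port' so samples from one host group together."""
    host, sep, port = instance.rpartition(":")
    return host if sep and port.isdigit() else instance


def app_servers_by_node(samples):
    """Group the `server` label by node and service.

    `samples` is the `result` list of a Prometheus instant-vector response;
    each item has a `metric` dict holding the label set.
    """
    nodes = defaultdict(dict)
    for sample in samples:
        labels = sample["metric"]
        service = JOB_TO_SERVICE.get(labels.get("job"))
        server = labels.get("server")
        if service is None or server is None or "instance" not in labels:
            continue  # skip samples we cannot attribute to a node and service
        nodes[strip_port(labels["instance"])][service] = server
    return dict(nodes)


# Hypothetical sample data for two web nodes.
samples = [
    {"metric": {"instance": "10.0.0.1:8080", "job": "gitlab-rails", "server": "puma"},
     "value": [1590000000, "1"]},
    {"metric": {"instance": "10.0.0.2:8080", "job": "gitlab-rails", "server": "unicorn"},
     "value": [1590000000, "1"]},
]
print(app_servers_by_node(samples))
# {'10.0.0.1': {'web': 'puma'}, '10.0.0.2': {'web': 'unicorn'}}
```

Keying on the host part of `instance` is one plausible way to merge the result with per-node data collected from other exporters, which typically listen on different ports of the same host.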
Example
I pulled this from an Omnibus container:
"topology": {
"application_requests_per_hour": 266,
"nodes": [
{
"node_memory_total_bytes": 33269903360,
"node_cpus": 16,
"node_services": [
{
"name": "web",
"process_count": 16,
"process_memory_rss": 732653824,
"process_memory_uss": 110505792,
"process_memory_pss": 148698496,
"server": "puma"
},
{
"name": "sidekiq",
"process_count": 3,
"process_memory_rss": 734683591,
"process_memory_uss": 716128711,
"process_memory_pss": 718348174
},
{
"name": "node-exporter",
"process_count": 1,
"process_memory_rss": 15460352
},
{
"name": "redis",
"process_count": 1,
"process_memory_rss": 13308928
},
{
"name": "postgres",
"process_count": 1,
"process_memory_rss": 16097280
},
{
"name": "workhorse",
"process_count": 1,
"process_memory_rss": 31762432
},
{
"name": "gitaly",
"process_count": 1,
"process_memory_rss": 32832512
}
]
}
],
"duration_s": 0.032179404002818046,
"failures": [
]
}
The new element here is the `"server": "puma"` entry for `web`. Every `web` node will have this entry now.
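For anyone consuming the payload, a minimal Python sketch of reading the new entry might look like this. The helper name `web_servers` is hypothetical, and the embedded payload is a trimmed-down version of the example above; it only assumes the `nodes` / `node_services` / `name` / `server` keys shown there.

```python
import json


def web_servers(topology):
    """Return the `server` value of the `web` service on each node.

    Nodes without a `web` service are skipped; a `web` service that lacks
    the `server` key yields None.
    """
    servers = []
    for node in topology.get("nodes", []):
        services = node.get("node_services", [])
        web = next((s for s in services if s.get("name") == "web"), None)
        if web is not None:
            servers.append(web.get("server"))
    return servers


# Trimmed-down version of the example payload.
payload = json.loads("""
{
  "topology": {
    "nodes": [
      {
        "node_services": [
          {"name": "web", "process_count": 16, "server": "puma"},
          {"name": "sidekiq", "process_count": 3}
        ]
      }
    ]
  }
}
""")
print(web_servers(payload["topology"]))
# ['puma']
```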
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry: Usage Ping changes do not need to be announced in changelogs because they are covered by our Privacy Policy.
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- [-] Database guides
- [-] Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
- Test in Omnibus