
Add per DB counters to logs and refactor

Dylan Griffith requested to merge 336838-log-db-config-name-for-db-count into master

What does this MR do?

This MR adds several new per-database metrics to our logs when the multiple_database_counters feature flag is enabled.

+ "db_primary_primary_cached_count": 0,
+ "db_primary_primary_count": 0,
+ "db_primary_primary_duration_s": 0,
+ "db_replica_primary_cached_count": 0,
+ "db_replica_primary_count": 0,
+ "db_replica_primary_duration_s": 0,
+ "db_primary_primary_wal_count": 0,
+ "db_primary_primary_wal_cached_count": 0,
+ "db_replica_primary_wal_cached_count": 0,
+ "db_replica_primary_wal_count": 0,

These new fields indicate how many queries went to the database named primary, which happens to be the default name Rails gives to the database connection described in config/database.yml. This is unfortunate because it clashes with the primary/replica language we use to distinguish read/write servers from read-only ones (hence primary_primary in the key). This will change once we explicitly name the 2 databases we configure main and ci. The feature flag exists to limit this confusion, as described in !66515 (merged), since only group::sharding needs these fields for testing in the meantime.
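To make the naming scheme concrete, here is a minimal sketch of how such a key is composed. The helper name db_log_key is hypothetical and is not taken from this MR; only the resulting key format comes from the fields listed above.

```ruby
# Hypothetical helper (not the MR's actual code): composes a log key from the
# server role (primary or replica), the database config name from
# config/database.yml, and a metric suffix.
def db_log_key(role, db_config_name, metric)
  "db_#{role}_#{db_config_name}_#{metric}"
end

# With the Rails default config name, the role and the name collide:
puts db_log_key(:primary, 'primary', 'count')
# Once the databases are named main and ci, the keys become unambiguous:
puts db_log_key(:replica, 'main', 'duration_s')
```

The first segment after db_ is always the role and the second is the config name, which is why primary appears twice today.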

Prior to this change we already logged duration_s per database, but not the query count. While implementing the query count it became obvious that it was much easier to implement these metrics the same way as all the other metrics, in particular by logging them as 0 even when no query matched. This led to quite a lot of refactoring because the list of database_config_names is dynamic: it is whatever is configured in config/database.yml, and we had a tonne of tests that needed the full list of keys and values being logged.
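The "always log 0" behaviour with a dynamic list of config names could be sketched roughly as below. This is an illustration under assumptions, not the code in this MR: the constant and method names are hypothetical, and the metric suffixes are taken from the keys visible in the log sample in this description.

```ruby
# Hypothetical sketch: pre-populate every per-database key with a zero value
# so the log payload has the same shape whether or not a query ran.
ROLES = %w[primary replica].freeze
COUNTER_SUFFIXES = %w[count cached_count wal_count wal_cached_count].freeze

def zeroed_db_counters(db_config_names)
  db_config_names.each_with_object({}) do |name, counters|
    ROLES.each do |role|
      COUNTER_SUFFIXES.each { |suffix| counters["db_#{role}_#{name}_#{suffix}"] = 0 }
      counters["db_#{role}_#{name}_duration_s"] = 0.0
    end
  end
end

# Observed values overwrite the zero defaults; untouched keys stay at zero.
def db_counters(db_config_names, observed)
  zeroed_db_counters(db_config_names).merge(observed)
end

payload = db_counters(%w[primary ci], 'db_primary_primary_count' => 109)
puts payload['db_primary_primary_count']
puts payload['db_replica_ci_wal_count']
```

Because the key set depends on the configured names, specs can build the expected payload from the same list of names instead of hard-coding every key.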

This MR also replaces the existing multiple_database_metrics feature flag with an env var. This is more efficient because feature flags are read from the database and this check happens very frequently.
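A rough sketch of an env-var-based toggle follows. The variable name MULTIPLE_DATABASE_METRICS and the module name are assumptions made for illustration; this description does not state the actual env var name.

```ruby
# Hypothetical sketch: an env var check needs no database round trip,
# unlike a feature flag lookup, so it is cheap on a hot path.
module DbMetricsToggle
  ENV_VAR = 'MULTIPLE_DATABASE_METRICS' # assumed name, not from the MR
  TRUTHY = %w[1 true yes].freeze

  # Accepts any Hash-like object so it can be exercised without touching ENV.
  def self.enabled?(env = ENV)
    TRUTHY.include?(env[ENV_VAR].to_s.strip.downcase)
  end
end

puts DbMetricsToggle.enabled?('MULTIPLE_DATABASE_METRICS' => 'true')
puts DbMetricsToggle.enabled?({})
```

The trade-off is that an env var needs a process restart to change, whereas a feature flag can be toggled at runtime.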

Screenshots or Screencasts (strongly suggested)

{"method":"GET","path":"/","format":"html","controller":"RootController","action":"index","status":200,"time":"2021-08-04T07:16:03.713Z","params":[],"remote_ip":"127.0.0.1","user_id":1,"username":"root","ua":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0","correlation_id":"01FC807N53P4BHYXEM0XQ5W8BT","meta.user":"root","meta.caller_id":"RootController#index","meta.remote_ip":"127.0.0.1","meta.feature_category":"projects","meta.client_id":"user/1","gitaly_calls":81,"gitaly_duration_s":0.637408,"redis_calls":290,"redis_duration_s":0.073479,"redis_read_bytes":912,"redis_write_bytes":1271839,"redis_cache_calls":285,"redis_cache_duration_s":0.071713,"redis_cache_read_bytes":907,"redis_cache_write_bytes":1271486,"redis_shared_state_calls":5,"redis_shared_state_duration_s":0.001766,"redis_shared_state_read_bytes":5,"redis_shared_state_write_bytes":353,"db_count":114,"db_write_count":6,"db_cached_count":2,"db_replica_count":5,"db_replica_cached_count":0,"db_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_count":109,"db_primary_cached_count":2,"db_primary_wal_count":0,"db_primary_wal_cached_count":0,"db_replica_primary_count":5,"db_replica_primary_cached_count":0,"db_replica_primary_wal_count":0,"db_replica_primary_wal_cached_count":0,"db_primary_primary_count":109,"db_primary_primary_cached_count":2,"db_primary_primary_wal_count":0,"db_primary_primary_wal_cached_count":0,"db_replica_ci_count":0,"db_replica_ci_cached_count":0,"db_replica_ci_wal_count":0,"db_replica_ci_wal_cached_count":0,"db_primary_ci_count":0,"db_primary_ci_cached_count":0,"db_primary_ci_wal_count":0,"db_primary_ci_wal_cached_count":0,"db_primary_duration_s":0.67,"db_replica_duration_s":0.031,"db_primary_primary_duration_s":0.67,"db_replica_primary_duration_s":0.031,"db_primary_ci_duration_s":0.0,"db_replica_ci_duration_s":0.0,"cpu_s":25.974111,"queue_duration_s":0.306508,"db_duration_s":0.45327,"view_duration_s":37.67823,"duration_s":67.45431}
{"method":"GET","path":"/-/peek/results","format":"json","controller":"Peek::ResultsController","action":"show","status":200,"time":"2021-08-04T07:16:07.871Z","params":[{"key":"request_id","value":"01FC807N53P4BHYXEM0XQ5W8BT"}],"remote_ip":"127.0.0.1","user_id":1,"username":"root","ua":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0","correlation_id":"01FC809ZC5RMTN6NEP3V83JN10","meta.user":"root","meta.caller_id":"Peek::ResultsController#show","meta.remote_ip":"127.0.0.1","meta.client_id":"user/1","redis_calls":4,"redis_duration_s":0.004751,"redis_read_bytes":1230611,"redis_write_bytes":2036,"redis_cache_calls":2,"redis_cache_duration_s":0.004264,"redis_cache_read_bytes":1230430,"redis_cache_write_bytes":1244,"redis_shared_state_calls":2,"redis_shared_state_duration_s":0.000487,"redis_shared_state_read_bytes":181,"redis_shared_state_write_bytes":792,"db_count":1,"db_write_count":0,"db_cached_count":0,"db_replica_count":1,"db_replica_cached_count":0,"db_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_count":0,"db_primary_cached_count":0,"db_primary_wal_count":0,"db_primary_wal_cached_count":0,"db_replica_primary_count":1,"db_replica_primary_cached_count":0,"db_replica_primary_wal_count":0,"db_replica_primary_wal_cached_count":0,"db_primary_primary_count":0,"db_primary_primary_cached_count":0,"db_primary_primary_wal_count":0,"db_primary_primary_wal_cached_count":0,"db_replica_ci_count":0,"db_replica_ci_cached_count":0,"db_replica_ci_wal_count":0,"db_replica_ci_wal_cached_count":0,"db_primary_ci_count":0,"db_primary_ci_cached_count":0,"db_primary_ci_wal_count":0,"db_primary_ci_wal_cached_count":0,"db_primary_duration_s":0.0,"db_replica_duration_s":0.005,"db_primary_primary_duration_s":0.0,"db_replica_primary_duration_s":0.005,"db_primary_ci_duration_s":0.0,"db_replica_ci_duration_s":0.0,"cpu_s":0.152964,"queue_duration_s":0.152699,"db_duration_s":0.0,"view_duration_s":0.00021,"duration_s":0.01234}
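In the sample log lines above, the per-database counters for each role add up to the existing aggregate counter (for example db_primary_count equals db_primary_primary_count plus db_primary_ci_count). A small check along these lines can be handy when testing locally. The helper name is hypothetical, and the JSON below is a trimmed copy of fields from the first sample line.

```ruby
require 'json'

# Trimmed copy of fields from the first sample log line in this description.
SAMPLE_LINE = '{"db_primary_count":109,"db_primary_primary_count":109,' \
              '"db_primary_ci_count":0,"db_replica_count":5,' \
              '"db_replica_primary_count":5,"db_replica_ci_count":0}'
SAMPLE_ENTRY = JSON.parse(SAMPLE_LINE).freeze

# Hypothetical helper: sums one metric across the given database config names
# for a role, treating missing keys as zero.
def per_db_sum(entry, role, db_config_names, metric)
  db_config_names.sum { |name| entry.fetch("db_#{role}_#{name}_#{metric}", 0) }
end

%w[primary replica].each do |role|
  aggregate = SAMPLE_ENTRY.fetch("db_#{role}_count")
  per_db = per_db_sum(SAMPLE_ENTRY, role, %w[primary ci], 'count')
  puts "#{role}: aggregate=#{aggregate} per_db=#{per_db}"
end
```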

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #336838 (closed) #332945 (closed)

Edited by Dylan Griffith
