Detect and log cross-database modifications in production

Closed Dylan Griffith requested to merge 341783-log-cross-db-modifications into master

What does this MR do and why?

This adds a Rack middleware and Sidekiq middleware for detecting "Cross-database modification" violations. These occur when you query 2 different databases in the context of a single transaction. Previously we only wrapped our RSpec code with this detection logic but we found that it was often causing lots of false positives in RSpec (in test-only code) which made it hard to sort through the real issues. In addition we want extra validation by running this in production and therefore detecting code that our specs may be missing.

The middlewares use a feature flag detect_cross_database_modification which we intend to enable for a small percentage of time.

The middlewares can also be disabled by setting the env var DISABLE_CROSS_DATABASE_MODIFICATION_DETECTION=true and restarting the application. This environment variable is an extra safety net if the middlewares are causing problems.

Screenshots or screen recordings

$ tail -f log/exceptions_json.log

{"severity":"ERROR","time":"2021-11-02T06:13:08.284Z","correlation_id":"01FKFMF9SFB8YGZ9ERYZEA1ZQ7","exception.class":"Gitlab::Database::PreventCrossDatabaseModification::CrossDatabaseModificationAcros
sUnsupportedTablesError","exception.message":"Cross-database data modification of 'gitlab_main, gitlab_ci' were detected within a transaction modifying the 'uploads, routes, namespaces, keys, project_a
uthorizations, events, releases, ci_builds, ci_pipelines' tables.Please refer to https://docs.gitlab.com/ee/development/database/multiple_databases.html#removing-cross-database-transactions for details
 on how to resolve this exception.","exception.backtrace":["lib/gitlab/database/prevent_cross_database_modification.rb:116:in `prevent_cross_database_modification!'","lib/gitlab/database/prevent_cross_
database_modification.rb:24:in `block in with_cross_database_modification_prevented'","app/services/users/destroy_service.rb:68:in `execute'","ee/app/services/ee/users/destroy_service.rb:10:in `execute'","app/workers/delete_user_worker.rb:17:in `perform'","lib/gitlab/sidekiq_middleware/detect_cross_database_modification.rb:11:in `block in call'","lib/gitlab/database/prevent_cross_database_modification.rb:30:in `with_cross_database_modification_prevented'","lib/gitlab/sidekiq_middleware/detect_cross_database_modification.rb:10:in `call'","lib/gitlab/sidekiq_middleware/duplicate_jobs/strategies/until_executing.rb:16:in `perform'","lib/gitlab/sidekiq_middleware/duplicate_jobs/duplicate_job.rb:58:in `perform'","lib/gitlab/sidekiq_middleware/duplicate_jobs/server.rb:8:in `call'","lib/gitlab/sidekiq_middleware/worker_context.rb:9:in `wrap_in_optional_context'","lib/gitlab/sidekiq_middleware/worker_context/server.rb:17:in `block in call'","lib/gitlab/application_context.rb:74:in `block in use'","lib/gitlab/application_context.rb:74:in `use'","lib/gitlab/application_context.rb:27:in `with_context'","lib/gitlab/sidekiq_middleware/worker_context/server.rb:15:in `call'","lib/gitlab/sidekiq_status/server_middleware.rb:7:in `call'","lib/gitlab/sidekiq_versioning/middleware.rb:9:in `call'","lib/gitlab/sidekiq_middleware/admin_mode/server.rb:14:in `call'","lib/gitlab/sidekiq_middleware/instrumentation_logger.rb:9:in `call'","lib/gitlab/sidekiq_middleware/batch_loader.rb:7:in `call'","lib/gitlab/sidekiq_middleware/extra_done_log_metadata.rb:7:in `call'","lib/gitlab/sidekiq_middleware/request_store_middleware.rb:10:in `block in call'","lib/gitlab/with_request_store.rb:17:in `enabling_request_store'","lib/gitlab/with_request_store.rb:10:in `with_request_store'","lib/gitlab/sidekiq_middleware/request_store_middleware.rb:9:in `call'","lib/gitlab/sidekiq_middleware/server_metrics.rb:66:in `block in call'","lib/gitlab/sidekiq_middleware/server_metrics.rb:89:in `block in instrument'","lib/gitlab/metrics/background_transaction.rb:33:in `run'","lib/gitlab/sidekiq_middleware/server_metrics.rb:89:in `instrument'","lib/gitlab/sidekiq_middleware/server_metrics.rb:65:in `call'","lib/gitlab/sidekiq_middleware/monitor.rb:8:in `block in call'","lib/gitlab/sidekiq_daemon/monitor.rb:49:in `within_job'","lib/gitlab/sidekiq_middleware/monitor.rb:7:in `call'","lib/gitlab/sidekiq_middleware/size_limiter/server.rb:13:in `call'","lib/gitlab/sidekiq_logging/structured_logger.rb:19:in `call'"],"user.username":"root","tags.program":"sidekiq","tags.locale":"en","tags.feature_category":"authentication_and_authorization","tags.correlation_id":"01FKFMF9SFB8YGZ9ERYZEA1ZQ7","extra.sidekiq":{"retry":3,"queue":"default","backtrace":true,"version":0,"class":"DeleteUserWorker","args":["[FILTERED]","[FILTERED]","{\"hard_delete\"=\u003e\"true\"}"],"jid":"4d5f265653b9c51ae7884c5d","created_at":1635833587.9075613,"meta.user":"root","meta.caller_id":"Admin::UsersController#destroy","meta.remote_ip":"127.0.0.1","meta.feature_category":"authentication_and_authorization","meta.client_id":"user/1","correlation_id":"01FKFMF9SFB8YGZ9ERYZEA1ZQ7","worker_data_consistency":"always","idempotency_key":"resque:gitlab:duplicate:default:4a342bc1b7c3a9bed0dbe6366209705d403939742a0a49e59527ba126197923d","size_limiter":"validated","enqueued_at":1635833587.9133484},"extra.gitlab_schemas":["gitlab_main","gitlab_ci"],"extra.tables":["uploads","routes","namespaces","keys","project_authorizations","events","releases","ci_builds","ci_pipelines"],"extra.query":"UPDATE \"ci_pipelines\" SET \"user_id\" = NULL, \"lock_version\" = COALESCE(\"lock_version\", 0) + 1 WHERE \"ci_pipelines\".\"user_id\" = 66 /*application:sidekiq,correlation_id:01FKFMF9SFB8YGZ9ERYZEA1ZQ7,jid:4d5f265653b9c51ae7884c5d,endpoint_id:DeleteUserWorker,db_config_name:main,line:/home/dylan/.rvm/gems/ruby-2.7.4/gems/marginalia-1.10.0/lib/marginalia/comment.rb:25:in `block in construct_comment'*/"}

How to set up and validate locally

  1. Feature.enable(:detect_cross_database_modification)
  2. Go to Admin > Users
  3. Delete a user and contributions
  4. Check for an error in log/exceptions_json.log

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #341783 (closed)

Edited by Dylan Griffith