
Consolidate DB pool size settings into initializer

Matthias Käppler requested to merge 35170-consolidate-db-pool-config into master

What does this MR do?

Part 1/2 of #35170 (closed): this MR only covers DB pool settings, not Redis.

I tried to make this a refactoring with as few semantic changes as possible, but I realized that this was somewhat at odds with the main goal: consolidating these settings into a single file. It meant broadening the database_config initializer to also account for Sidekiq settings and removing them from the Sidekiq-specific initializer.

The problem I hit was that, for Puma only, we had introduced a database_config initializer which sets the DB pool size for the app server based on the following logic: pool_size = max(user_specified_value, max_threads).
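
Roughly, the existing Puma-only logic looks like this (a simplified sketch, not the exact code; the config lookup is abbreviated):

```ruby
# config/initializers/database_config.rb (simplified sketch of the existing Puma-only logic)
if defined?(::Puma) && Puma.respond_to?(:cli_config)
  db_config   = Rails.application.config.database_configuration[Rails.env]
  max_threads = Puma.cli_config.options[:max_threads]

  # Never run with fewer connections than Puma threads, but keep a larger
  # user-specified pool size from database.yml if one was given.
  db_config['pool'] = [db_config['pool'].to_i, max_threads].max

  ActiveRecord::Base.establish_connection(db_config)
end
```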

However, we did not do this for Sidekiq; there we currently ignore the pool setting from database.yml entirely and force the pool size to be max_threads (called concurrency in Sidekiq).
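
The Sidekiq-side behaviour is, roughly, the following (again a simplified sketch):

```ruby
# config/initializers/sidekiq.rb (simplified sketch of the old behaviour)
Sidekiq.configure_server do |_config|
  db_config = Rails.application.config.database_configuration[Rails.env]

  # The pool size from database.yml is discarded and replaced with the
  # Sidekiq concurrency, i.e. the number of worker threads.
  db_config['pool'] = Sidekiq.options[:concurrency]

  ActiveRecord::Base.establish_connection(db_config)
end
```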

So this is the dilemma. We have several options:

  1. Current impl: Apply the max(x, y) logic to both Puma and Sidekiq, making Sidekiq connection pooling behave differently from before (a user-specified pool size would now take precedence if it's greater than Sidekiq concurrency). This should be relatively harmless, since connections are created lazily and the pool size is merely a theoretical maximum; in fact, it should be impossible for more connections to actually be in use than there are Sidekiq threads. This is likely the safest of the three options.
  2. Drop the max(x, y) logic and force max_threads on both Puma and Sidekiq, making Puma connection pooling behave differently from before (a user-specified pool size greater than Puma max threads would now be ignored). This might be surprising to an admin, but should also be relatively harmless since, as above, it should not be possible for more connections to be in use than there are threads.
  3. Change the pool sizing strategy entirely, as outlined in #36377 (closed), since it is confusing that user settings are sometimes ignored and sometimes not. This would make both Puma and Sidekiq connection pooling behave differently from before. As mentioned above and in the issue description, I would very much prefer not to do this in this MR.

UPDATE: We decided on option 1.
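
For illustration, the consolidated version could look roughly like this (helper names such as Gitlab::Runtime.puma? / sidekiq? are approximations of what the initializer relies on, not necessarily the exact implementation):

```ruby
# config/initializers/database_config.rb (sketch of the consolidated option-1 logic)
max_threads =
  if Gitlab::Runtime.sidekiq?                # assumed runtime-detection helper
    Sidekiq.options[:concurrency]
  elsif Gitlab::Runtime.puma?
    Puma.cli_config.options[:max_threads]
  end

if max_threads
  db_config = Rails.application.config.database_configuration[Rails.env]

  # Option 1: the same max(user_value, max_threads) rule for both runtimes.
  # A user-specified pool size now wins if it exceeds the thread count.
  db_config['pool'] = [db_config['pool'].to_i, max_threads].max

  ActiveRecord::Base.establish_connection(db_config)
end
```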

Note that I also retained the behavior that the Geo DB pool size is only adjusted under Sidekiq, not Puma. I don't know why it was done that way; presumably that code pre-dates Puma. But I would like to keep the blast radius of this change as small as possible, since it was meant to be merely a refactor, not an optimization of how we size connection pools. I broke out a new issue for this: #195163

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Depending on which of the strategies outlined above we choose, we might see changes in database connection performance for Sidekiq, Puma, or both. If we choose option 1, I don't actually expect anything to change in practice.

  • Test this on the branch-specific deployment
  • Post-launch, keep an eye open for connection congestion
