Skip to content

Force ApplicationRecord#with_fast_statement_timeout to use replicas

What does this MR do?

Closes #322133 (closed)

Although we already have a load balancer layer to redirect read-only queries to the replicas, the effectiveness of this load balancing layer is not as high as we expected. On Monday 15 March 2021, about 8.4% of read-only GET requests queried the primary database. The number slightly increases comparing to 7.4% in the previous week.

After debugging, a major of the requests to primary comes from the following endpoints:

Endpoint Sum of db_primary_count Sum of db_primary_duration_s
Projects::MergeRequestsController#index 12,090,304 24,792.892
Projects::IssuesController#index 7,438,682 20,488.362

In Projects::MergeRequestsController#index, there is one call to Gitlab::IssueablesCountForState#perform_count: https://gitlab.com/gitlab-org/gitlab/blob/e2dcd5e26a01d7d23685d6cc03592133dc1610f6/lib/gitlab/issuables_count_for_state.rb#L81. This method triggers this line of code:

ApplicationRecord.with_fast_statement_timeout do
  finder.count_by_state
end

Surprisingly, all inside statements are wrapped inside a transaction

(0.2ms)  BEGIN
SQL (0.2ms)  SET LOCAL statement_timeout = 5000 
User Load (2.6ms)  SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT 1
(0.2ms)  COMMIT

As a result, all the following queries of the request are redirected to the primary. This MR is to fix that issue

Solution

Right now we are treating transaction as a sticky statement that force all the following statements to the primary DB. However, it is false negative in the scenario that all statements inside a transaction are all read-only. We already have user_primary to force primary redirections. Adding use_replica_if_possible seems reasonable.

def with_fast_read_read_statement_timeout
  Gitlab::LoadBalancing.use_replica_if_possible do
    transaction(requires_new: true) do
      connection.exec_query("SET LOCAL statement_timeout = 5000")

      yield
    end
  end
end

After this MR, we may be interested in forcing some other queries to replica as well.

Screenshots (strongly suggested)

Before

Screen_Shot_2021-03-15_at_15.59.46

Screen_Shot_2021-03-15_at_16.00.38

After

Screen_Shot_2021-03-15_at_16.13.17

Screen_Shot_2021-03-15_at_16.13.34

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Quang-Minh Nguyen

Merge request reports