
`/api/:version/repos/:namespace/:project/branches` should not query the primary databases frequently


Even though it is a read-only GET endpoint, `/api/:version/repos/:namespace/:project/branches` is one of the top endpoints that frequently query the primary database. Around 500k (38%) of the 1.3M total requests to this endpoint end up querying the primary. Notably, all of the affected requests have the JIRA DVCS user agent.

It turns out that in the API controller, we update the feature usage of the integrations:

# Blah blah
        def update_project_feature_usage_for(project)
          # Prevent errors on GitLab Geo not allowing
          # UPDATE statements to happen in GET requests.
          return if Gitlab::Database.read_only?

          project.log_jira_dvcs_integration_usage(cloud: jira_cloud?)
        end

        get ':namespace/:project/branches' do
          user_project = find_project_with_access(params)

          update_project_feature_usage_for(user_project) # !!!

          branches = ::Kaminari.paginate_array(user_project.repository.branches.sort_by(&:name))

          present paginate(branches), with: ::API::Github::Entities::Branch, project: user_project
        end

# Blah blah

Inside ProjectFeatureUsage, the timestamp update is wrapped in a transaction. That makes the session sticky, so the subsequent queries are redirected to the primary.

  def log_jira_dvcs_integration_usage(cloud: true)
    transaction(requires_new: true) do
      save unless persisted?
      touch(self.class.jira_dvcs_integration_field(cloud: cloud))
    end
  rescue ActiveRecord::RecordNotUnique
    reset
    retry
  end
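
To make the stickiness effect concrete, here is a minimal toy model in plain Ruby. It is only a sketch: `ToySession` and `ToyBalancer` are hypothetical names invented for illustration, not GitLab's actual load-balancing classes, and the SQL strings are placeholders. It shows a single write flipping the session so that every later read in the same session goes to the primary.

```ruby
# Toy model only: ToySession and ToyBalancer are hypothetical names,
# not GitLab's real load-balancing classes.
class ToySession
  def initialize
    @use_primary = false
  end

  # Called whenever a write (or a transaction) is observed.
  def write!
    @use_primary = true
  end

  def use_primary?
    @use_primary
  end
end

class ToyBalancer
  attr_reader :log

  def initialize(session)
    @session = session
    @log = []
  end

  # Reads go to a replica until the session has become sticky.
  def read(sql)
    host = @session.use_primary? ? :primary : :replica
    @log << [host, sql]
    host
  end

  # Writes always hit the primary and make the session sticky.
  def write(sql)
    @session.write!
    @log << [:primary, sql]
    :primary
  end
end

session = ToySession.new
balancer = ToyBalancer.new(session)

balancer.read('SELECT project')                  # served by a replica
balancer.write('UPDATE project_feature_usages')  # primary, session becomes sticky
balancer.read('SELECT branches')                 # primary, because of stickiness

balancer.log.each { |host, sql| puts "#{host}: #{sql}" }
```

In this model the branch listing, which is the expensive part of the request, lands on the primary only because a small usage-tracking write happened first.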

Solution

In !56849 (merged), we introduced `::Gitlab::Database::LoadBalancing::Session.without_sticky_writes` to prevent session stickiness after a write. It solves this issue; we just need to wrap the usage-logging call in it:

        def update_project_feature_usage_for(project)
          # Prevent errors on GitLab Geo not allowing
          # UPDATE statements to happen in GET requests.
          return if Gitlab::Database.read_only?

          ::Gitlab::Database::LoadBalancing::Session.without_sticky_writes do
            project.log_jira_dvcs_integration_usage(cloud: jira_cloud?)
          end
        end
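
As a rough illustration of what such a helper could do, the sketch below models a session that ignores writes made inside a block. `ToySession` and its `@ignore_writes` flag are assumptions made for this example, not the real `::Gitlab::Database::LoadBalancing::Session` implementation; see !56849 for the actual code.

```ruby
# Toy model only: ToySession is a hypothetical stand-in, not the real
# ::Gitlab::Database::LoadBalancing::Session.
class ToySession
  def initialize
    @use_primary = false
    @ignore_writes = false
  end

  def use_primary?
    @use_primary
  end

  # Record a write; it only sticks when writes are not being ignored.
  def write!
    @use_primary = true unless @ignore_writes
  end

  # Run the block with write stickiness disabled, restoring the
  # previous flag afterwards even if the block raises.
  def without_sticky_writes
    previous = @ignore_writes
    @ignore_writes = true
    yield
  ensure
    @ignore_writes = previous
  end
end

session = ToySession.new

# A write inside the block (e.g. the usage timestamp update)
# does not make the session sticky.
session.without_sticky_writes do
  session.write!
end
puts session.use_primary?

# A write outside the block still sticks as usual.
session.write!
puts session.use_primary?
```

The point of the design is that the write still goes to the primary, but it no longer drags the rest of the request's reads along with it.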

The transaction in the model looks redundant. It can be refactored as:

  def log_jira_dvcs_integration_usage(cloud: true)
    assign_attributes(
      self.class.jira_dvcs_integration_field(cloud: cloud) => Time.now
    )
    save
  rescue ActiveRecord::RecordNotUnique
    reset
    retry
  end
Edited by Quang-Minh Nguyen