N+1 queries in NotificationRecipientService
From https://gitlab.com/gitlab-org/gitlab-ce/issues/45526, we see that NotificationRecipientService
has a number of N+1 queries, particularly when retrieving notification settings from each participant:
QueryRecorder
reveals this with this spec:
require 'spec_helper'
describe NotificationRecipientService do
let(:service) { described_class }
let(:assignee) { create(:user) }
let(:project) { create(:project, :private) }
describe '#build_new_note_recipients' do
let(:issue) { create(:issue, project: project, assignees: [assignee]) }
let(:note) { create(:note_on_issue, noteable: issue, project_id: issue.project_id) }
def create_participant
participant = create(:user)
project.add_master(participant)
issue.subscriptions.create(user: participant, project: project, subscribed: true)
end
it 'avoids N+1 queries' do
create_participant
service.build_new_note_recipients(note)
control_count = ActiveRecord::QueryRecorder.new do
service.build_new_note_recipients(note)
end
create_participant
expect { service.build_new_note_recipients(note) }.not_to exceed_query_limit(control_count)
end
end
end
Results:
1) NotificationRecipientService#build_new_note_recipients avoids N+1 queries
Failure/Error: expect { service.build_new_note_recipients(note) }.not_to exceed_query_limit(control_count)
Expected a maximum of 18 queries, got 23:
Extra queries:
[2] SELECT "notification_settings"."user_id" FROM "notification_settings" WHERE "notification_settings"."level" = $1 AND "notification_settings"."user_id" IN (5, 1, 2) AND "notification_settings"."source_type" IS NULL
[1] SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT 1
[1] SELECT "notification_settings".* FROM "notification_settings" WHERE "notification_settings"."user_id" IN (1, 5)
[2] SELECT 1 AS one FROM "users" INNER JOIN "project_authorizations" ON "users"."id" = "project_authorizations"."user_id" WHERE "project_authorizations"."project_id" = $1 AND "users"."id" = $2 LIMIT 1
[1] SELECT 1 AS one FROM "users" INNER JOIN "issue_assignees" ON "users"."id" = "issue_assignees"."user_id" WHERE "issue_assignees"."issue_id" = $1 AND "users"."id" = $2 LIMIT 1
[1] SELECT "subscriptions".* FROM "subscriptions" WHERE "subscriptions"."subscribable_id" = $1 AND "subscriptions"."subscribable_type" = $2 AND "subscriptions"."user_id" = $3 LIMIT 1
The issue here is that Participable#raw_participants
builds a set of users and doesn't provide a way to preload attributes.
Designs
- Show closed items
Activity
-
Newest first Oldest first
-
Show all activity Show comments only Show history only
- Contributor
@stanhu I think this is basically the same as the issue described in https://gitlab.com/gitlab-org/gitlab-ce/issues/43106#note_64018144? Sorry, the other issue doesn't have an obvious title, but I am working on it.
- Contributor
Actually, I will use the other issue to just move the queries to a Sidekiq worker (so at least the controller is faster), and we can keep this one for the general problem, as it's not isolated to approvals.
1 - Sean McGivern added backend database ~18332 labels
- Sean McGivern changed milestone to %11.0
changed milestone to %11.0
- Sean McGivern mentioned in issue #43106 (closed)
mentioned in issue #43106 (closed)
- Contributor
@crollison I hope that this is fairly straightforward (adding a
preload
) somewhere. You can find a more detailed description at https://gitlab.com/gitlab-org/gitlab-ce/issues/43106#note_64018144, but to reproduce this you can:- Create a public project.
- As various users (you can impersonate users from the admin section, if you're logged in as admin):
- Subscribe to that project with the Watch notification level.
- Subscribe to several other projects with any notification level.
- Invoke
NotificationRecipientService
from a console (see the examples in https://gitlab.com/gitlab-org/gitlab-ce/blob/master/app/services/notification_service.rb). - Notice that we run several queries for each user set up in step 2.
Please let me know if any of that isn't clear!
- Sean McGivern assigned to @crollison
assigned to @crollison
@crollison looked at this one together. This one is a big tricky:
-
In many places (e.g. reference extraction, project subscribers, etc.),
NotificationRecipientService
loads users as anArray
, not as anActiveRecord::Relation
, so we can't just addpreload
orincludes
to the list. The call inNotificationRecipientService#add_recipients
isn't able to run theincludes
(https://gitlab.com/gitlab-org/gitlab-ee/blob/74f694421761b70ed93788681155de14a3d61d64/app/services/notification_recipient_service.rb#L57-59) because of that. -
We could track down every case where we return an
Array
(e.g. https://gitlab.com/gitlab-org/gitlab-ee/blob/master/app/models/concerns/subscribable.rb#L31-35) and ensure we eager load the associations, but this is incredibly painful. This would affect the Markdown reference extractor and their relevant parsers, which doesn't support preloading additional associations. -
There is an undocumented call in the Rails API (https://apidock.com/rails/ActiveRecord/Associations/Preloader) that allows you to do something like this:
ActiveRecord::Associations::Preloader.new.preload(users, :notification_settings)
But the Rails team discourages people from using this internal API: https://github.com/rails/rails/issues/11719
Edited by Stan Hu-
https://gitlab.com/gitlab-org/gitlab-ce/issues/43106#note_64018144 is definitely one source of N+1 queries when you have many users with multiple notification settings per recipient. I think the test in the issue description doesn't exercise this, but you can see it in the sample trace attachment. We should probably address this first. @crollison attempted the first suggestion in https://gitlab.com/gitlab-org/gitlab-ce/issues/43106#note_64018144, but it wasn't obvious why that didn't appear to help.
However, the test mentioned in the issue description does expose possible other N+1 queries. You can see by using
QUERY_RECORDER_DEBUG=1
via https://docs.gitlab.com/ee/development/query_recorder.html#finding-the-source-of-the-query:Subscribable#subscribers
SQL:
SELECT "subscriptions".* FROM "subscriptions" WHERE "subscriptions"."subscribable_id" = $1 AND "subscriptions"."subscribable_type" = $2 AND ("subscriptions"."project_id" IS NULL OR "subscriptions"."project_id" = 1) AND "subscriptions"."subscribed" = $3
--> /Users/stanhu/gdk/gitlab/app/models/concerns/subscribable.rb:34:in `subscribers' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:161:in `add_subscribed_users' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:322:in `build!' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:84:in `notification_recipients' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:18:in `build_new_note_recipients'
NotificationRecipient#unsubscribed?
SQL:
SELECT "subscriptions".* FROM "subscriptions" WHERE "subscriptions"."subscribable_id" = $1 AND "subscriptions"."subscribable_type" = $2 AND "subscriptions"."user_id" = $3 LIMIT 1
--> /Users/stanhu/gdk/gitlab/app/models/notification_recipient.rb:67:in `unsubscribed?' --> /Users/stanhu/gdk/gitlab/app/models/notification_recipient.rb:39:in `notifiable?' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:32:in `select!' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:32:in `filter!' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:85:in `notification_recipients' --> /Users/stanhu/gdk/gitlab/app/services/notification_recipient_service.rb:18:in `build_new_note_recipients'
NotificationRecipient#has_access? (team_member?)
SQL:
SELECT 1 AS one FROM "users" INNER JOIN "project_authorizations" ON "users"."id" = "project_authorizations"."user_id" WHERE "project_authorizations"."project_id" = $1 AND "users"."id" = $2 LIMIT 1
--> /Users/stanhu/gdk/gitlab/app/policies/project_policy.rb:386:in `team_member?' --> /Users/stanhu/gdk/gitlab/app/policies/project_policy.rb:40:in `block in <class:ProjectPolicy>' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:21:in `instance_eval' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:21:in `compute' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:42:in `block in pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:280:in `cache' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:42:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/rule.rb:79:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/step.rb:79:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:89:in `block in run' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:177:in `block in steps_by_score' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:146:in `loop' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:146:in `steps_by_score' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:79:in `run' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:57:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `block in allowed?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `each' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `all?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `allowed?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:207:in `can?' --> /Users/stanhu/gdk/gitlab/app/models/ability.rb:70:in `allowed?' --> /Users/stanhu/gdk/gitlab/app/models/user.rb:694:in `can?' --> /Users/stanhu/gdk/gitlab/app/models/notification_recipient.rb:95:in `block in has_access?'
NotificationRecipient#has_access? (Issue#assignee_or_author?)
SQL:
SELECT 1 AS one FROM "users" INNER JOIN "issue_assignees" ON "users"."id" = "issue_assignees"."user_id" WHERE "issue_assignees"."issue_id" = $1 AND "users"."id" = $2 LIMIT 1
--> /Users/stanhu/gdk/gitlab/app/models/issue.rb:150:in `assignee_or_author?' --> /Users/stanhu/gdk/gitlab/app/policies/issuable_policy.rb:9:in `block in <class:IssuablePolicy>' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:21:in `instance_eval' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:21:in `compute' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:42:in `block in pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:280:in `cache' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/condition.rb:42:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/rule.rb:79:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/step.rb:79:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:89:in `block in run' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:177:in `block in steps_by_score' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:146:in `loop' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:146:in `steps_by_score' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:79:in `run' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/runner.rb:57:in `pass?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `block in allowed?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `each' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `all?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:215:in `allowed?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/base.rb:207:in `can?' --> /Users/stanhu/gdk/gitlab/app/models/ability.rb:70:in `allowed?' --> /Users/stanhu/gdk/gitlab/app/models/user.rb:694:in `can?' --> /Users/stanhu/gdk/gitlab/app/models/notification_recipient.rb:95:in `block in has_access?' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/preferred_scope.rb:7:in `with_preferred_scope' --> /Users/stanhu/gdk/gitlab/lib/declarative_policy/preferred_scope.rb:21:in `subject_scope' --> /Users/stanhu/gdk/gitlab/app/models/notification_recipient.rb:85:in `has_access?' --> /Users/stanhu/gdk/gitlab/app/models/notification_recipient.rb:28:in `notifiable?'
Edited by Stan HuFor example, rewriting
app/models/concerns/subscribable.rb
in this way preserves theActiveRecord::Relation
and thus eliminates one N+1 query:diff --git a/app/models/concerns/subscribable.rb b/app/models/concerns/subscribable.rb index f478c8ede1..394a61e3a4 100644 --- a/app/models/concerns/subscribable.rb +++ b/app/models/concerns/subscribable.rb @@ -29,9 +29,11 @@ module Subscribable end def subscribers(project) - subscriptions_available(project) - .where(subscribed: true) - .map(&:user) + relation = subscriptions_available(project) + .where(subscribed: true) + .select(:user_id) + + User.where(id: relation) end
The SQL becomes:
SELECT "users".* FROM "users" WHERE "users"."id" IN (SELECT "subscriptions"."user_id" FROM "subscriptions" WHERE "subscriptions"."subscribable_id" = $1 AND "subscriptions"."subscribable_type" = $2 AND ("subscriptions"."project_id" IS NULL OR "subscriptions"."project_id" = 1) AND "subscriptions"."subscribed" = $3)
This change also fixes the N+1 issue for
NotificationRecipient#has_access?
:diff --git a/app/services/notification_recipient_service.rb b/app/services/notification_recipient_service.rb index 5658699664..7783caef38 100644 --- a/app/services/notification_recipient_service.rb +++ b/app/services/notification_recipient_service.rb @@ -67,6 +67,8 @@ module NotificationRecipientService end def make_recipient(user, type, reason) + target = target.preload(:issue_assignees) if target.is_a?(Issue) + NotificationRecipient.new( user, type, reason: reason,
- Maintainer
@crollison @stanhu thanks for pushing this forward! This turned out to be much bigger task than expected. We might need to re-evaluate this or splitting this into smaller pieces.
1 - Jan Provaznik assigned to @jprovaznik
assigned to @jprovaznik
- Sean McGivern mentioned in issue #45493 (moved)
mentioned in issue #45493 (moved)
- Sean McGivern unassigned @jprovaznik
unassigned @jprovaznik
- Contributor
@stanhu thanks for the update!
But the Rails team discourages people from using this internal API: https://github.com/rails/rails/issues/11719
We already use that in https://gitlab.com/gitlab-org/gitlab-ce/blob/master/app/controllers/concerns/renders_notes.rb#L23. I also think it's a pretty soft discouragement. It says:
But advanced users can still use it, but we don't guarantee it will not change between releases. It is the same case of arel.
So if this API is the only way to fix this, and it works in Rails 4 and Rails 5, I'd say that's probably good enough for now.
@crollison attempted the first suggestion in https://gitlab.com/gitlab-org/gitlab-ce/issues/43106#note_64018144, but it wasn't obvious why that didn't appear to help.
Hmm, that works for me on production:
# First, paste in https://gitlab.com/gitlab-org/gitlab-ce/blob/master/spec/support/helpers/query_recorder.rb *** PRODUCTION *** production> merge_request = MergeRequest.find(8845120) => #<MergeRequest id:8845120 gitlab-org/gitlab-ce!18157> *** PRODUCTION *** production> user = User.find_by_username('smcgivern') => #<User id:443319 @smcgivern> *** PRODUCTION *** production> ActiveRecord::QueryRecorder.new { NotificationRecipientService.build_recipients(merge_request, user, action: "resolve_all_discussions") }.count => 1527 # Paste the method below *** PRODUCTION *** production> ActiveRecord::QueryRecorder.new { NotificationRecipientService.build_recipients(merge_request, user, action: "resolve_all_discussions") }.count => 327
class User def notification_settings_for(source) if notification_settings.loaded? notification_settings.find do |notification| notification.source_type == source.class.to_s && notification.source_id == source.id end else notification_settings.find_or_initialize_by(source: source) end end end
Now that 327 is still very high, which is (as you rightly say) due to other issues.
I think that it doesn't fix your spec, because your spec doesn't actually exercise the (quite subtle) problem we see in production. I probably did a bad job of explaining that problem above, so perhaps this diff to the spec will clarify:
diff --git a/spec/services/notification_recipient_service_spec.rb b/spec/services/notification_recipient_service_spec.rb index b97d970519e..340d4585e0c 100644 --- a/spec/services/notification_recipient_service_spec.rb +++ b/spec/services/notification_recipient_service_spec.rb @@ -3,20 +3,24 @@ require 'spec_helper' describe NotificationRecipientService do let(:service) { described_class } let(:assignee) { create(:user) } - let(:project) { create(:project, :private) } + let(:project) { create(:project, :public) } + let(:other_projects) { create_list(:project, 5, :public) } describe '#build_new_note_recipients' do let(:issue) { create(:issue, project: project, assignees: [assignee]) } let(:note) { create(:note_on_issue, noteable: issue, project_id: issue.project_id) } - def create_participant - participant = create(:user) - project.add_master(participant) - issue.subscriptions.create(user: participant, project: project, subscribed: true) + def create_watcher + watcher = create(:user) + create(:notification_setting, project: project, user: watcher, level: :watch) + + other_projects.each do |other_project| + create(:notification_setting, project: other_project, user: watcher, level: :watch) + end end it 'avoids N+1 queries' do - create_participant + create_watcher service.build_new_note_recipients(note) @@ -24,7 +28,7 @@ describe NotificationRecipientService do service.build_new_note_recipients(note) end - create_participant + create_watcher expect { service.build_new_note_recipients(note) }.not_to exceed_query_limit(control_count) end
Before my patch above, it fails like this:
Failures: 1) NotificationRecipientService#build_new_note_recipients avoids N+1 queries Failure/Error: expect { service.build_new_note_recipients(note) }.not_to exceed_query_limit(control_count) Expected a maximum of 22 queries, got 29: Extra queries: [1] SELECT "users".* FROM "users" WHERE "users"."id" IN (16, 1) [1] SELECT "notification_settings".* FROM "notification_settings" WHERE "notification_settings"."user_id" IN (1, 16) [6] SELECT "projects".* FROM "projects" WHERE "projects"."id" = $1 LIMIT 1 [1] SELECT "subscriptions".* FROM "subscriptions" WHERE "subscriptions"."subscribable_id" = $1 AND "subscriptions"."subscribable_type" = $2 AND "subscriptions"."user_id" = $3 LIMIT 1 # ./spec/services/notification_recipient_service_spec.rb:33:in `block (3 levels) in <top (required)>' # -e:1:in `<main>'
And after:
Failures: 1) NotificationRecipientService#build_new_note_recipients avoids N+1 queries Failure/Error: expect { service.build_new_note_recipients(note) }.not_to exceed_query_limit(control_count) Expected a maximum of 16 queries, got 17: Extra queries: [1] SELECT "users".* FROM "users" WHERE "users"."id" IN (16, 1) [1] SELECT "notification_settings".* FROM "notification_settings" WHERE "notification_settings"."user_id" IN (1, 16) [1] SELECT "subscriptions".* FROM "subscriptions" WHERE "subscriptions"."subscribable_id" = $1 AND "subscriptions"."subscribable_type" = $2 AND "subscriptions"."user_id" = $3 LIMIT 1 # ./spec/services/notification_recipient_service_spec.rb:33:in `block (3 levels) in <top (required)>' # -e:1:in `<main>'
Now, I don't know whether that patch is a good idea or not. I feel like it's piling micro-optimisation on top of micro-optimisation, and stepping back and making a more holistic change might be useful. I just wanted to explain the concrete value of that change that I proposed; it's a weird case that isn't natural to write a spec for, but leads to a huge amount of queries being run on GitLab.com for MRs to this project. (Partly because of your notification settings
)The other cases that you've mentioned are also extremely valid. I just think this one is especially bad because the N is not the number of users, but the number of notification settings those users have. (It's actually an upper bound, because we break out of the loop early when we find the right one.)
Sorry for the wall of text, but does that make sense to you?
Edited by Sean McGivern - Sean McGivern assigned to @smcgivern and unassigned @crollison
assigned to @smcgivern and unassigned @crollison
- Sean McGivern mentioned in merge request !19535 (merged)
mentioned in merge request !19535 (merged)
- Sean McGivern mentioned in issue #47496 (closed)
mentioned in issue #47496 (closed)
- Contributor
I created https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/19535 to address those specific points. I opened https://gitlab.com/gitlab-org/gitlab-ce/issues/47496 to follow-up on some of the other N+1s, because there are a lot.
- Rémy Coutable closed via merge request !19535 (merged)
closed via merge request !19535 (merged)
- Rémy Coutable mentioned in commit 5024c4aa
mentioned in commit 5024c4aa
- Contributor
@smcgivern Nice work!
- Maintainer
Awesome!
- 🤖 GitLab Bot 🤖 added devopsplan label
added devopsplan label