Improve NOT IssuableFinder via memoization
What does this MR do?
Share the memoized values of IssuableFinder with the inner instantiated class if they're present.
This minimizes the need of the inner class to do the same queries already performed by the outer class.
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Refs #198324 (closed)
Merge request reports
Activity
changed milestone to %12.9
added databasereview pending label
Reviewer roulette
Changes that require review have been detected! A merge request is normally reviewed by both a reviewer and a maintainer in its primary category (e.g. frontend or backend), and by a maintainer in all other categories.
To spread load more evenly across eligible reviewers, Danger has randomly picked a candidate for each review slot. Feel free to override this selection if you think someone else would be better-suited, or the chosen person is unavailable.
To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines.
Once you've decided who will review this merge request, mention them as you normally would! Danger does not (yet?) automatically notify them for you.
Category Reviewer Maintainer database Krasimir Angelov ( @krasio
)Adam Hegyi ( @ahegyi
)backend Jonathan Schafer ( @jschafer
)Douglas Barbosa Alexandre ( @dbalexandre
)Generated by
🚫 DangerEdited by 🤖 GitLab Bot 🤖- Resolved by Adam Hegyi
@acroitor could I bother you for an initial look here? This is not a full fix for #198324 (closed), but rather one of many I'll be pushing. This attempts to minimize extra SQL calls in the internal
IssuableFinder
object that gets instantiated inside#by_negation
by pushing in the memoized values on the class.There are some pitfalls that I had to work around, however.
@krasio I'd appreciate if you took a look as well, since the bot says your team owns the finders :)
assigned to @acroitor and @krasio and unassigned @mdelaossa
- Resolved by Alexandru Croitor
103 instance_variable_set(:"@#{param}", memoized_params[param]) unless memoized_params[param].nil? 104 end 105 end 106 107 def memoized_params 108 @memo = MEMOIZED_PARAMS.map do |param| 109 next unless instance_variable_defined?(:"@#{param}") && !skip_memoized_param?(param) 110 111 val = instance_variable_get(:"@#{param}") 112 [param, val] unless val.nil? 113 end.compact.to_h 114 end 115 116 # We want to avoid passing the memoized groups/projects to the negated self.class if we're passing groups or projects 117 # into as NOT params, otherwise the search just doesn't work. 118 def skip_memoized_param?(param) changed this line in version 5 of the diff
91 93 end 92 94 end 93 95 94 def initialize(current_user, params = {}) 96 def initialize(current_user, params = {}, memoized_params = {}) 95 97 @current_user = current_user 96 98 @params = params 99 ## If these params are set, we're a class instantiated inside itself, so let's share pre-loaded memoized values 99 ## If these params are set, we're a class instantiated inside itself, so let's share pre-loaded memoized values 99 # If `memoized_params` are present, then this is an instance of IssuableFinder instantiated from within itself(see .by_negation method) to get negated items, so we can use pre-loaded values from `memoized_params` Edited by Alexandru Croitorchanged this line in version 5 of the diff
92 94 end 93 95 94 def initialize(current_user, params = {}) 96 def initialize(current_user, params = {}, memoized_params = {}) 95 97 @current_user = current_user 96 98 @params = params 99 ## If these params are set, we're a class instantiated inside itself, so let's share pre-loaded memoized values 100 return unless memoized_params.present? 101 102 MEMOIZED_PARAMS.each do |param| 103 instance_variable_set(:"@#{param}", memoized_params[param]) unless memoized_params[param].nil? 104 end 105 end 106 107 def memoized_params 108 @memo = MEMOIZED_PARAMS.map do |param| changed this line in version 5 of the diff
91 93 end 92 94 end 93 95 94 def initialize(current_user, params = {}) 96 def initialize(current_user, params = {}, memoized_params = {}) 95 97 @current_user = current_user 96 98 @params = params 99 ## If these params are set, we're a class instantiated inside itself, so let's share pre-loaded memoized values 100 return unless memoized_params.present? 101 102 MEMOIZED_PARAMS.each do |param| 103 instance_variable_set(:"@#{param}", memoized_params[param]) unless memoized_params[param].nil? I think we already check that we do not set
nil
as a value inmemoized_params
method !26832 (diffs)changed this line in version 5 of the diff
99 ## If these params are set, we're a class instantiated inside itself, so let's share pre-loaded memoized values 100 return unless memoized_params.present? 101 102 MEMOIZED_PARAMS.each do |param| 103 instance_variable_set(:"@#{param}", memoized_params[param]) unless memoized_params[param].nil? 104 end 105 end 106 107 def memoized_params 108 @memo = MEMOIZED_PARAMS.map do |param| 109 next unless instance_variable_defined?(:"@#{param}") && !skip_memoized_param?(param) 110 111 val = instance_variable_get(:"@#{param}") 112 [param, val] unless val.nil? 113 end.compact.to_h 114 end WDYT about ?
107 def memoized_params 108 @memo = MEMOIZED_PARAMS.map do |param| 109 next unless instance_variable_defined?(:"@#{param}") && !skip_memoized_param?(param) 110 111 val = instance_variable_get(:"@#{param}") 112 [param, val] unless val.nil? 113 end.compact.to_h 114 end 107 def memoized_params_hash 108 MEMOIZED_PARAMS.map do |param| 109 instance_value = instance_variable_get(:"@#{param}") 110 next if !instance_value.present? || skip_memoized_param?(param) 111 112 [param, instance_value] 113 end.compact.to_h 114 end thanks, that reads much nicer! However I'm keeping
@memo
until you answer !26832 (comment 302736646) :)changed this line in version 5 of the diff
@mdelaossa I've left mostly nitpicks for somewhat better readability, but I would like to get more context on the change to understand what we are doing here.
From a few glances it seems that for negation we generate a query to pull items and then use that as a subquery under
NOT IN (subquery)
form, which does not sound too optimal🤔 Yeah sorry, I should have explained how the negation works currently!
Right now, for each negatable param, we create an inner
IssuableFinder
that returns a non-executed SQL query that then getswhere.not(negation)
called on it. So yes, currently they're used as subqueries in aNOT IN (subquery)
form.Because of the chance of creating multiple
IssuableFinder
s (for example if you're negating both the author and some labels at the same time) passing in these memoized values is worthwhile.right, I am asking myself why are we doing subqueries and not just negating the params within original query ? what is the challenge in doing so? Obviously I am not aware of the whole picture, just want to understand why it would not be possible or difficult ?
I'll take a scenario here, lets say we have the search field with following params on issues list:
label = 'X' label != 'Y' assignee != 'UserX'
so in my understanding we end up having a query like:
select * from issues inner join labels where label = 'X' and issues.id not in (select id from issues where label = 'Y') AND issues.id NOT IN (select id from issues where assignee = 'UserX')
So we are basically joining the issues table as many times as many negation arguments we have. This is obviously not going to scale well. And this change is not going to add any visible improvement. I think we need to radically change what we are doing
🤔 Why can we not do:
select * from issues inner join labels where label = 'X' and label != 'Y' assignee != 'UserX' ?
What is the challenge in doing so ?
here is a sample sql generated right now: https://gitlab.com/snippets/1952338
Initially this was done this way due to speed. We have a large amount of parameters that need negation and a finder that already knew how it all worked, so negating it all was a quick win.
I do agree that negating each param directly in the main SQL will be our biggest improvement, but that's out of scope for this particular MR. That IS something we should do however
Yes, most of the problem is SQL timing out. This MR was an incidental improvement, I have other improvements in the works where I was trying to get the internal subqueries to not have extraneous authorization checks that the external query doesn't have. That would have improved the actual slow queries themselves
assigned to @mdelaossa
91 93 end 92 94 end 93 95 94 def initialize(current_user, params = {}) 96 def initialize(current_user, params = {}, memoized_params = {}) 95 97 @current_user = current_user 96 98 @params = params 99 ## If these params are set, we're a class instantiated inside itself, so let's share pre-loaded memoized values 100 return unless memoized_params.present? What about we extract this in its own method and change the param name, somethning like
memoize_params(params_to_memoize)
This way we'll also avoid confusion with the
memoized_params
method - !26832 (comment 302736636).changed this line in version 5 of the diff