"Improve performance of searching by BuildNames"
What does this MR do and why?
This MR improves the performance of the BuildNameFinder by rewriting the accessing of Builds by filtering by BuildName Thanks to @ahegyi for building the foundation and @mfanGitLab for improving upon it. Here, we are wiring our ideas together, wrote new specs and removed old specs that make no sense anymore. We have removed the ability to sort or add a before cursor due to how the filtering works now. Since the removed features where used nowhere, we shouldn't consider this as a breaking change.
Related to #463875
See this comment for where it originated from: !151465 (comment 1923039706)
Database Performance Plans
Previously we had a very naive approach to this: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/31411/commands/97508 this was not performant as it did a join without proper indexes
The current approach: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/32859/commands/101567 Although much more complicated, optimizes what we want with a better performance time
How to get the database plans locally
I did this in a very spartanic way
-
Open
gdk rails consolelocally after checking out this branch. -
Copy and paste
query = ::Ci::BuildNameFinder.new(relation: Ci::Build.where(status: 'success'), name: 'rspec', project: OpenStruct.new(id: 278964), params: {cursor_id: nil}).execute.to_sql(We are using the OpenStruct mock here, since it is unlikely that we have a local project with the id 278964 and since 278964 is a large project, this should test our query quite nicely :) -
Type
puts queryto get an unescaped version of the query or just execute theBuildNameFinderwith parameters that are tied to your installation of GitLab to get reasonable results.The steps below are completely optional - you have to do them only if you want to find out more for yourself about the query plan and execution times. You can see a query plan for the given parameters here: https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/32859/commands/101567
-
Now that you have the query, you can look through it to find out, what is done or create the index
exec create index build_name_index on p_ci_build_names (project_id, name, build_id, partition_id);on postgres.ai for theciinstance. -
After that, you can paste the query that you
puts-ed out (what a word) and useexplainas a prefix for the query to get the query plan and execution times.
How to validate locally
- Ensure FF is enabled
populate_and_use_build_names_table - You can either filter jobs manually in the UI - http://gdk.test:3000/root/normal/-/jobs?statuses=FAILED&name=test
- or use graphql
{
project(fullPath: "root/normal") {
jobs(name: "test", first: 2, statuses: MANUAL) {
pageInfo {
endCursor
hasNextPage
hasPreviousPage
startCursor
}
nodes {
id
name
}
}
}
}