Update Catalog resource search logic to use Postgres Full Text Search
What does this MR do and why?
This updates the search logic in Ci::Catalog::Resource
to use PostgreSQL Full Text Search capability instead of ILIKE
(fuzzy_search
) and trigram indexes. We are doing this for performance reasons and to reduce the number of indices required to be used with visibility_level
in #429056 (closed). See #430889 (closed) for details.
This MR:
- Adds a migration for a new
search_vector
TSVector column that is automatically generated with data fromname
anddescription
. - Removes the trigram indices on
name
anddescription
as they are no longer needed. - Updates the
search
scope onCi::Catalog::Resource
to use the new Full Text Search functionality.
NOTE:
- Indices on
search_vector
will be added after denormalizingvisibility_level
in #429056 (closed). - The
search
scope is already tested in resource_spec.rb, listing_spec.rb, and resources_resolver_spec.rb. The behaviour hasn't changed so the tests pass as expected and additional specs are not required.
Resolves #430889 (closed).
How to set up and validate locally
We can validate that the GraphQl ciCatalogResources
search functionality still works the same way as before.
- Under the same parent group, create 2 projects with README files named
Orange Project
andApple Project
. (We want to set up a situation where only one or both of these projects are returned by the search filter.) - Update the description of
Orange Project
toFoo
; update the description ofApple Project
toOrange
. - In the console, mark both projects as catalog resources.
Ci::Catalog::Resource.create!(project: Project.find_by(name: 'Orange Project'))
Ci::Catalog::Resource.create!(project: Project.find_by(name: 'Apple Project'))
- Go to
http://gdk.test:3000/
and test the following query (update the group name as necessary):
query {
ciCatalogResources(projectPath: "group-a/orange-project", search: "orange") {
nodes {
id
name
}
}
}
Observe that both projects from Step 1 are in the response.
- If you change the search term to
"apple"
, then onlyApple Project
will show.
Query plan
Up
main: == [advisory_lock_connection] object_id: 184040, pg_backend_pid: 62227
main: == 20231107193342 AddSearchVectorToCatalogResources: migrating ================
main: -- execute("ALTER TABLE catalog_resources\n ADD COLUMN search_vector tsvector\n GENERATED ALWAYS AS\n (setweight(to_tsvector('english', COALESCE(name, '')), 'A') ||\n setweight(to_tsvector('english', COALESCE(description, '')), 'B')) STORED;\n")
main: -> 0.0097s
main: == 20231107193342 AddSearchVectorToCatalogResources: migrated (0.0140s) =======
main: == [advisory_lock_connection] object_id: 184040, pg_backend_pid: 62227
ci: == [advisory_lock_connection] object_id: 184240, pg_backend_pid: 62229
ci: == 20231107193342 AddSearchVectorToCatalogResources: migrating ================
ci: -- execute("ALTER TABLE catalog_resources\n ADD COLUMN search_vector tsvector\n GENERATED ALWAYS AS\n (setweight(to_tsvector('english', COALESCE(name, '')), 'A') ||\n setweight(to_tsvector('english', COALESCE(description, '')), 'B')) STORED;\n")
ci: -> 0.0082s
ci: == 20231107193342 AddSearchVectorToCatalogResources: migrated (0.0184s) =======
ci: == [advisory_lock_connection] object_id: 184240, pg_backend_pid: 62229
main: == [advisory_lock_connection] object_id: 184400, pg_backend_pid: 62232
main: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: migrating
main: -- transaction_open?(nil)
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0955s
main: -- indexes(:catalog_resources)
main: -> 0.0035s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0002s
main: -- remove_index(:catalog_resources, {:algorithm=>:concurrently, :name=>"index_catalog_resources_on_name_trigram"})
main: -> 0.0014s
main: -- execute("RESET statement_timeout")
main: -> 0.0004s
main: -- transaction_open?(nil)
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0007s
main: -- indexes(:catalog_resources)
main: -> 0.0020s
main: -- remove_index(:catalog_resources, {:algorithm=>:concurrently, :name=>"index_catalog_resources_on_description_trigram"})
main: -> 0.0012s
main: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: migrated (0.1218s)
main: == [advisory_lock_connection] object_id: 184400, pg_backend_pid: 62232
ci: == [advisory_lock_connection] object_id: 184520, pg_backend_pid: 62234
ci: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: migrating
ci: -- transaction_open?(nil)
ci: -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci: -> 0.0007s
ci: -- indexes(:catalog_resources)
ci: -> 0.0030s
ci: -- execute("SET statement_timeout TO 0")
ci: -> 0.0001s
ci: -- remove_index(:catalog_resources, {:algorithm=>:concurrently, :name=>"index_catalog_resources_on_name_trigram"})
ci: -> 0.0008s
ci: -- execute("RESET statement_timeout")
ci: -> 0.0001s
ci: -- transaction_open?(nil)
ci: -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci: -> 0.0004s
ci: -- indexes(:catalog_resources)
ci: -> 0.0024s
ci: -- remove_index(:catalog_resources, {:algorithm=>:concurrently, :name=>"index_catalog_resources_on_description_trigram"})
ci: -> 0.0019s
ci: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: migrated (0.0397s)
ci: == [advisory_lock_connection] object_id: 184520, pg_backend_pid: 62234
Down
main: == [advisory_lock_connection] object_id: 183220, pg_backend_pid: 67116
main: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: reverting
main: -- transaction_open?(nil)
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0775s
main: -- index_exists?(:catalog_resources, :name, {:name=>"index_catalog_resources_on_name_trigram", :using=>:gin, :opclass=>{:name=>:gin_trgm_ops}, :algorithm=>:concurrently})
main: -> 0.0023s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0002s
main: -- add_index(:catalog_resources, :name, {:name=>"index_catalog_resources_on_name_trigram", :using=>:gin, :opclass=>{:name=>:gin_trgm_ops}, :algorithm=>:concurrently})
main: -> 0.0026s
main: -- execute("RESET statement_timeout")
main: -> 0.0001s
main: -- transaction_open?(nil)
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0003s
main: -- index_exists?(:catalog_resources, :description, {:name=>"index_catalog_resources_on_description_trigram", :using=>:gin, :opclass=>{:description=>:gin_trgm_ops}, :algorithm=>:concurrently})
main: -> 0.0015s
main: -- add_index(:catalog_resources, :description, {:name=>"index_catalog_resources_on_description_trigram", :using=>:gin, :opclass=>{:description=>:gin_trgm_ops}, :algorithm=>:concurrently})
main: -> 0.0007s
main: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: reverted (0.0987s)
main: == [advisory_lock_connection] object_id: 183220, pg_backend_pid: 67116
ci: == [advisory_lock_connection] object_id: 184460, pg_backend_pid: 67536
ci: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: reverting
ci: -- transaction_open?(nil)
ci: -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci: -> 0.0995s
ci: -- index_exists?(:catalog_resources, :name, {:name=>"index_catalog_resources_on_name_trigram", :using=>:gin, :opclass=>{:name=>:gin_trgm_ops}, :algorithm=>:concurrently})
ci: -> 0.0027s
ci: -- execute("SET statement_timeout TO 0")
ci: -> 0.0001s
ci: -- add_index(:catalog_resources, :name, {:name=>"index_catalog_resources_on_name_trigram", :using=>:gin, :opclass=>{:name=>:gin_trgm_ops}, :algorithm=>:concurrently})
ci: -> 0.0029s
ci: -- execute("RESET statement_timeout")
ci: -> 0.0002s
ci: -- transaction_open?(nil)
ci: -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci: -> 0.0004s
ci: -- index_exists?(:catalog_resources, :description, {:name=>"index_catalog_resources_on_description_trigram", :using=>:gin, :opclass=>{:description=>:gin_trgm_ops}, :algorithm=>:concurrently})
ci: -> 0.0017s
ci: -- add_index(:catalog_resources, :description, {:name=>"index_catalog_resources_on_description_trigram", :using=>:gin, :opclass=>{:description=>:gin_trgm_ops}, :algorithm=>:concurrently})
ci: -> 0.0009s
ci: == 20231115172623 RemoveNameDescriptionTrigramIndexesFromCatalogResources: reverted (0.1294s)
ci: == [advisory_lock_connection] object_id: 184460, pg_backend_pid: 67536
main: == [advisory_lock_connection] object_id: 184880, pg_backend_pid: 67975
main: == 20231107193342 AddSearchVectorToCatalogResources: reverting ================
main: -- remove_column(:catalog_resources, :search_vector)
main: -> 0.0015s
main: == 20231107193342 AddSearchVectorToCatalogResources: reverted (0.0050s) =======
main: == [advisory_lock_connection] object_id: 184880, pg_backend_pid: 67975
ci: == [advisory_lock_connection] object_id: 184180, pg_backend_pid: 68406
ci: == 20231107193342 AddSearchVectorToCatalogResources: reverting ================
ci: -- remove_column(:catalog_resources, :search_vector)
ci: -> 0.0015s
ci: == 20231107193342 AddSearchVectorToCatalogResources: reverted (0.0098s) =======
ci: == [advisory_lock_connection] object_id: 184180, pg_backend_pid: 68406
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #430889 (closed)
Edited by Leaminn Ma