
Support all projects indexed in elastic rake

Terri Chu requested to merge 428070-rake-task-updates-to-index-all-projects into master

Background

Related to #428070 (closed)

This change only affects instances where Elasticsearch limiting is enabled and configured to limit indexing to specific namespaces or projects. Note: this is how Advanced search is set up on GitLab.com. The MR only changes the indexing process and does not touch any code related to search.

This MR is broken into three parts (all behind a disabled feature flag):

| Merge request | What it does |
| --- | --- |
| !134456 (merged) | Allow all projects being indexed |
| !137308 (merged) | Support all projects being indexed when performing deletes and transfers to another namespace |
| !138098 (merged) | Support all projects indexed in elastic rake |

What does this MR do and why?

This MR adds support for indexing all projects to the gitlab:elastic:index and gitlab:elastic:index_projects rake tasks, through the Search::RakeTaskExecutorService class. Indexing all projects is currently behind the search_index_all_projects feature flag, which is disabled by default. The main changes are:

  • Work through projects in batches. This saves the database query that would otherwise be needed to find all the projects again.
  • If limiting is enabled, check each project to see whether it is maintaining Elasticsearch (this check uses the search_index_all_projects feature flag).
  • Rename the method elastic_enabled_projects to projects_maintaining_indexed_associations to describe more accurately what it returns.
  • Update specs.
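
The batching change can be pictured with a minimal sketch. This is not the code from Search::RakeTaskExecutorService: SketchProject, ids_to_index, and the batch_size parameter are hypothetical names, and each_slice stands in for whatever batching the rake task actually uses. The point it illustrates is that each batch is filtered while its records are already in memory, so no second lookup of the projects is needed.

```ruby
# Hypothetical stand-in for a project row; not GitLab's Project model.
SketchProject = Struct.new(:id, :maintaining_elasticsearch, keyword_init: true)

# Walks the projects in fixed-size batches and collects the IDs to index.
# limiting_enabled mirrors the "limit indexing" setting: when it is off,
# every project is selected without a per-project check.
def ids_to_index(projects, limiting_enabled:, batch_size: 1000)
  projects.each_slice(batch_size).flat_map do |batch|
    selected = limiting_enabled ? batch.select(&:maintaining_elasticsearch) : batch
    selected.map(&:id)
  end
end

projects = [
  SketchProject.new(id: 1, maintaining_elasticsearch: true),
  SketchProject.new(id: 2, maintaining_elasticsearch: false),
  SketchProject.new(id: 3, maintaining_elasticsearch: true)
]

puts ids_to_index(projects, limiting_enabled: true, batch_size: 2).inspect
puts ids_to_index(projects, limiting_enabled: false, batch_size: 2).inspect
```

With limiting on, only the projects flagged as maintaining Elasticsearch are selected; with limiting off, the per-project check is skipped entirely.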

Screenshots or screen recordings

N/A

How to set up and validate locally

  1. Set up Elasticsearch in GDK.
  2. Enable advanced search and Elasticsearch indexing restrictions. I chose one group and one project.
  3. Create a new project in a non-indexed namespace and add at least 1 issue (or merge request or comment) to it.
  4. Disable the feature flag in the Rails console:
    echo "Feature.disable(:search_index_all_projects)" | gdk rails c
  5. Reindex everything from scratch:
    bundle exec rake gitlab:elastic:index
  6. Verify that only the expected projects exist in the index. In my case that is 3 projects (2 under the selected group, and 1 selected project).
  7. Enable the feature flag for the root namespace of a project that is not indexed:
    echo "Feature.enable(:search_index_all_projects, Project.last.root_namespace)" | gdk rails c
  8. Reindex everything from scratch:
    bundle exec rake gitlab:elastic:index
  9. Verify that only the expected projects exist in the index. In my case that is 4 projects (3 from limiting + the new project).
  10. Search and verify that no issue records exist in the index for the new project.
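
The outcome these steps check for can be written down as a small model. This is a sketch of my reading of the steps, not GitLab code: ValidationProject, expected_index_state, and the namespace names are hypothetical, and the rule that associated records such as issues are indexed only for projects selected by limiting is an assumption taken from the final validation step.

```ruby
# Hypothetical model of a project and the root namespace it belongs to.
ValidationProject = Struct.new(:name, :root_namespace, keyword_init: true)

# limited_namespaces / limited_projects: what the indexing restrictions select.
# flag_namespaces: root namespaces with search_index_all_projects enabled.
def expected_index_state(projects, limited_namespaces:, limited_projects:, flag_namespaces:)
  projects.to_h do |project|
    in_limits = limited_namespaces.include?(project.root_namespace) ||
                limited_projects.include?(project.name)
    flagged = flag_namespaces.include?(project.root_namespace)
    # Assumption: the flag adds the project document only, not its issues.
    [project.name, { project_indexed: in_limits || flagged, issues_indexed: in_limits }]
  end
end

projects = [
  ValidationProject.new(name: "a", root_namespace: "selected-group"),
  ValidationProject.new(name: "b", root_namespace: "selected-group"),
  ValidationProject.new(name: "c", root_namespace: "other"),
  ValidationProject.new(name: "new", root_namespace: "unindexed")
]
limits = { limited_namespaces: ["selected-group"], limited_projects: ["c"] }

before = expected_index_state(projects, **limits, flag_namespaces: [])
after = expected_index_state(projects, **limits, flag_namespaces: ["unindexed"])

puts before.count { |_, state| state[:project_indexed] }
puts after.count { |_, state| state[:project_indexed] }
puts after["new"].inspect
```

The two counts correspond to the project totals checked after each reindex, and the last line shows the new project indexed without its issues.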

Elasticsearch queries

The queries below can be modified and used to verify the indexed data. Replace 19 with the ID of the project you want to check.

Projects
```
curl --request GET \
  --url http://localhost:9200/gitlab-development-projects/_search \
  --header 'Content-Type: application/json' \
  --data '{
	"query": {
		"bool": {
			"must": [
				{
					"term": {
						"id": {
							"value": 19
						}
					}
				}
			]
		}
	}
}'
```
Issues
```
curl --request POST \
  --url http://localhost:9200/gitlab-development-issues/_search \
  --header 'Content-Type: application/json' \
  --data '{
	"query": {
		"bool": {
			"must": [
				{
					"term": {
						"project_id": 19
					}
				}
			]
		}
	}
}'
```
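
For repeated checks, the same term queries can be built from Ruby with the standard library. This is only a sketch: term_query, build_search_request, and hit_count are hypothetical helper names, ES_URL assumes the local development cluster used in the curl commands, and reading hits.total.value assumes an Elasticsearch 7 or later response format.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: same local cluster as in the curl commands.
ES_URL = "http://localhost:9200"

# Builds a bool/must query with a single term clause (short form of the term query).
def term_query(field, value)
  { query: { bool: { must: [{ term: { field => value } }] } } }
end

# Prepares a POST to <index>/_search without sending it.
def build_search_request(index, field, value)
  uri = URI.join(ES_URL, "/#{index}/_search")
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(term_query(field, value))
  [uri, request]
end

# Sends the request and returns the number of matching documents.
# Assumption: the response exposes hits.total.value (Elasticsearch 7+).
def hit_count(index, field, value)
  uri, request = build_search_request(index, field, value)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  JSON.parse(response.body).dig("hits", "total", "value")
end
```

With Elasticsearch running, hit_count("gitlab-development-issues", "project_id", 19) returns the number of issue documents for that project, and hit_count("gitlab-development-projects", "id", 19) does the same for the project document.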

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu
