Some groups and projects are not being fully indexed

BLOCKED

Blocked: this issue cannot proceed until we add reporting and better understand a systemic issue with index integrity.

Summary

This appears to be related to an issue we previously had where projects were not being fully indexed on GitLab.com: #259721 (closed)

An MR largely fixed the issues we were seeing, but the problem seems to have resurfaced. It presents the same way, though it may in fact be a different problem.

Support has received at least 3 requests to re-index whole groups, the earliest in January 2022.

Performing the steps outlined here resolves the issue, but we were not able to capture any data about the state of the index before re-indexing.

In the cases I have seen:

  • Advanced Search is enabled.
  • Searching either works partially or produces no results (but never returns a 500 or any other error).
  • Re-indexing immediately restores the expected search results.

Steps to reproduce

We have not been able to reproduce this. It only happens on certain groups.

Example Project

See the internal-requests issues referenced above.

What is the current bug behavior?

Advanced Search is enabled, but searching returns no results or only minimal results.

What is the expected correct behavior?

Projects using Advanced Search should be fully indexed.

Relevant logs and/or screenshots

Output of checks

This bug happens on GitLab.com

Possible fixes

Possible workaround

Groups and projects can be manually reindexed, but information about the current state should be captured first for troubleshooting purposes.

Information to capture (pulled from #360579 (comment 926504925) and https://gitlab.com/gitlab-com/support/internal-requests/-/issues/15327#note_1027291153 (internal)):

p.index_status
p.last_repository_updated_at
p.repository_state
p.repository.size
p.commit.sha
p.commit.committed_date
p.use_elasticsearch?
p.mirror?
p.import?
GitlabSubscriptionHistory.where(namespace_id: 00000).sort_by(&:id)
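
To make it harder to forget a field before re-indexing, the per-project calls above could be gathered into a single record. The following is only a sketch: `capture_index_state` is a hypothetical helper, not part of GitLab, and the `Stub*` structs with their values are made up so the snippet runs outside a Rails console. It assumes the project object responds to the same methods as `p` above and does not cover the `GitlabSubscriptionHistory` query.

```ruby
require 'json'

# Hypothetical helper (not part of GitLab): collects the fields listed above
# into one hash so the pre-reindex state can be saved as a single record.
# `project` is assumed to respond to the same methods as `p` in the list.
def capture_index_state(project)
  commit = project.commit # may be nil, e.g. for an empty repository
  {
    index_status: project.index_status,
    last_repository_updated_at: project.last_repository_updated_at,
    repository_state: project.repository_state,
    repository_size: project.repository.size,
    commit_sha: commit&.sha,
    commit_committed_date: commit&.committed_date,
    use_elasticsearch: project.use_elasticsearch?,
    mirror: project.mirror?,
    import: project.import?
  }
end

# Stand-in objects with made-up values, only so the sketch is runnable here.
StubCommit = Struct.new(:sha, :committed_date)
StubRepository = Struct.new(:size)
StubProject = Struct.new(:index_status, :last_repository_updated_at,
                         :repository_state, :repository, :commit,
                         :elasticsearch, :mirror, :import) do
  def use_elasticsearch?
    elasticsearch
  end

  def mirror?
    mirror
  end

  def import?
    import
  end
end

stub = StubProject.new(nil, '2022-01-10', nil,
                       StubRepository.new(1.5),
                       StubCommit.new('abc123', '2022-01-09'),
                       true, false, false)

state = capture_index_state(stub)
puts JSON.pretty_generate(state)
```

In a Rails console, the same helper would be called with the project already bound to `p`, and the printed JSON pasted into the relevant issue before the re-index is started.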

For internal team members, please reach out to the search team via Slack so we can try to capture logs for these.

A workaround is described in #259721 (closed).

Edited by Cleveland Bledsoe Jr