Some groups and projects are not being fully indexed
### BLOCKED

**Blocked:** this is blocked by the need to add reporting and better understand a systemic issue with index integrity.

- [ ] [Advanced Search: Come up with solution for index integrity](https://gitlab.com/groups/gitlab-org/-/epics/7978)
- [x] [Add a way to temporarily store logs for projects deleted out of the Elasticsearch index for several months.](https://gitlab.com/gitlab-org/gitlab/-/issues/366444) 15.2
- [x] [Improve gitlab-elasticsearch-indexer logging and traceability](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer/-/issues/101) 15.3

<!---
Please read this!

Before opening a new issue, make sure to search for keywords in the issues
filtered by the "regression" or "type::bug" label:

- https://gitlab.com/gitlab-org/gitlab/issues?label_name%5B%5D=regression
- https://gitlab.com/gitlab-org/gitlab/issues?label_name%5B%5D=type::bug

and verify the issue you're about to submit isn't a duplicate.
--->

### Summary

This appears to be related to an earlier issue where projects were not being fully indexed on GitLab.com: https://gitlab.com/gitlab-org/gitlab/-/issues/259721

An [MR](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/59457) largely fixed the issues we were seeing. The same symptoms have since resurfaced, although the underlying cause may be a different problem.

Support has received at least 3 requests to re-index whole groups, the earliest in January 2022:

* https://gitlab.com/gitlab-com/support/internal-requests/-/issues/12601
* https://gitlab.com/gitlab-com/support/internal-requests/-/issues/13097
* https://gitlab.com/gitlab-com/support/internal-requests/-/issues/14192

Performing the steps outlined [here](https://gitlab.com/gitlab-org/gitlab/-/issues/259721#workaround-for-a-whole-group) resolves the issue, but we were not able to capture any data about the current state before re-indexing.

In the cases I have seen:

- `Advanced Search is enabled` is present.
- Searching either works partially or produces no results (but never 500s or errors).
- Re-indexing immediately restores the expected search results.

### Steps to reproduce

We have not been able to reproduce this. It only happens on certain groups.

### Example Project

See the internal-requests issues linked above.

### What is the current *bug* behavior?

Advanced Search is enabled, but searches return no results or only a few results.

### What is the expected *correct* behavior?

Projects using Advanced Search should be fully indexed.

### Relevant logs and/or screenshots

<!-- Paste any relevant logs - please use code blocks (```) to format console output, logs, and code as it's tough to read otherwise. -->

### Output of checks

This bug happens on GitLab.com

#### Results of GitLab environment info

<!-- Input any relevant GitLab environment information if needed. -->

<details>
<summary>Expand for output related to GitLab environment info</summary>

<pre>

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

</pre>
</details>

#### Results of GitLab application Check

<!-- Input any relevant GitLab application check information if needed. -->

<details>
<summary>Expand for output related to the GitLab application check</summary>

<pre>

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:check SANITIZE=true`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)

(we will only investigate if the tests are passing)

</pre>
</details>

### Possible fixes

<!-- If you can, link to the line of code that might be responsible for the problem.
-->

## Possible workaround

Groups and projects can be manually reindexed, but information about the current state should be captured first for troubleshooting purposes.

Information to capture (pulled from https://gitlab.com/gitlab-org/gitlab/-/issues/360579#note_926504925 and https://gitlab.com/gitlab-com/support/internal-requests/-/issues/15327#note_1027291153 (internal)):

```ruby
# `p` is the affected project record, loaded for example with:
#   p = Project.find_by_full_path('group/project')
p.index_status
p.last_repository_updated_at
p.repository_state
p.repository.size
p.commit.sha
p.commit.committed_date
p.use_elasticsearch?
p.mirror?
p.import?
# Replace 00000 with the relevant namespace ID.
GitlabSubscriptionHistory.where(namespace_id: 00000).sort_by(&:id)
```

For internal team members, please reach out to the search team via Slack so we can try to capture logs for these.

The reindexing workaround itself is described in https://gitlab.com/gitlab-org/gitlab/-/issues/259721#workaround-in-the-meantime
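
To make the capture step harder to skip, the per-project calls listed above could be wrapped in a small helper that gathers everything into one hash before reindexing. This is only a sketch: `capture_index_state` and `INDEX_STATE_READERS` are hypothetical names that do not exist in GitLab, and the helper assumes nothing beyond the project methods already listed in this issue. Each call is rescued individually so that one failure (for example, a project with no commits) does not lose the other values.

```ruby
# Hypothetical helper (not part of GitLab): reads each attribute listed in this
# issue from a project-like object and records the value, or the error class
# if the call raises.
INDEX_STATE_READERS = {
  index_status:               ->(project) { project.index_status },
  last_repository_updated_at: ->(project) { project.last_repository_updated_at },
  repository_state:           ->(project) { project.repository_state },
  repository_size:            ->(project) { project.repository.size },
  commit_sha:                 ->(project) { project.commit.sha },
  commit_committed_date:      ->(project) { project.commit.committed_date },
  use_elasticsearch:          ->(project) { project.use_elasticsearch? },
  mirror:                     ->(project) { project.mirror? },
  import:                     ->(project) { project.import? }
}.freeze

def capture_index_state(project)
  INDEX_STATE_READERS.each_with_object({}) do |(name, reader), state|
    state[name] = begin
      reader.call(project)
    rescue StandardError => e
      # Keep going so a single failing call does not discard the rest.
      "error: #{e.class}"
    end
  end
end
```

In a Rails console this might be used as `pp capture_index_state(p)`, with the output pasted into the issue before the group or project is reindexed. The `GitlabSubscriptionHistory` query is left out of the helper because it is keyed by namespace rather than by project.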