Advanced Search: Optimize Group-level Searches for Wiki blobs & blobs

Overview

Searches across an entire group can be slow when the group contains a large number of projects, because the query must include the ID of every project that belongs to the selected group.

Adding the group hierarchy to the indexed document allows the query to filter on the group level and eliminates the need to include a large list of project IDs in the query.
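
As a rough illustration of the difference, the sketch below builds both filter shapes as plain Elasticsearch query bodies. The field name project_id, the example IDs, and the encoding of namespace_ancestry_ids (ancestor IDs listed root-first, each followed by "-") are assumptions made for the example, not details confirmed by this issue.

```python
import json


def project_id_filter(project_ids):
    """Current shape: the filter lists every project ID in the group."""
    return {"bool": {"filter": [{"terms": {"project_id": list(project_ids)}}]}}


def ancestry_string(traversal_ids):
    """Assumed encoding: ancestor IDs root-first, each followed by '-'."""
    return "".join(f"{group_id}-" for group_id in traversal_ids)


def namespace_ancestry_filter(group_ancestry):
    """Proposed shape: one prefix match on the group hierarchy."""
    return {
        "bool": {
            "filter": [{"prefix": {"namespace_ancestry_ids": group_ancestry}}]
        }
    }


# Hypothetical group with 5000 projects, nested under root group 9970.
old_query = project_id_filter(range(1, 5001))
new_query = namespace_ancestry_filter(ancestry_string([9970, 42]))

# The first body grows with the number of projects; the second does not.
print(len(json.dumps(old_query)), len(json.dumps(new_query)))
```

With this assumed encoding, the trailing "-" keeps a prefix for group 4 from also matching documents under group 42.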

Additionally, wikis have mapping needs that may differ from those of repository blobs, and the indexer may need additional rules to handle them. Problems that have been reported with indexing wikis include:

  • Titles don't get indexed when a project is moved.
  • File contents are being indexed even though these file types are supposed to be omitted from the index.
  • We should confirm that file size limits are applied correctly.
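
One way to pin down the last two points is to express the expected rule as a small predicate and test it at the boundaries. The sketch below is only an illustration: the size limit, the list of omitted extensions, and the function name are hypothetical and are not taken from the indexer.

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration; the real limits are configured elsewhere.
MAX_INDEXED_BYTES = 1024 * 1024
OMITTED_EXTENSIONS = {".png", ".jpg", ".pdf"}


def should_index_content(path, size_bytes):
    """Return True if the file contents (not just the name) should be indexed."""
    extension = PurePosixPath(path).suffix.lower()
    if extension in OMITTED_EXTENSIONS:
        return False  # omitted file types never have their contents indexed
    return size_bytes <= MAX_INDEXED_BYTES  # the size limit is inclusive here


# Boundary checks of the kind a verification step could run.
checks = [
    ("home.md", 10),
    ("diagram.PNG", 10),
    ("big.md", MAX_INDEXED_BYTES),
    ("big.md", MAX_INDEXED_BYTES + 1),
]
for path, size in checks:
    print(path, size, should_index_content(path, size))
```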

Steps

  • Add namespace_ancestry_ids to the Wiki blobs index mapping (MR Changes, 2nd Iteration)
  • Populate namespace_ancestry_ids for new/updated Wiki blob documents, and add a flag that controls whether the namespace_ancestry_ids field is sent (make changes in the indexer) (MR Changes)
  • Release the new indexer version 3.1.1 (MR Link)
  • Send the namespace_ancestry_ids flag to the indexer if the migration has been applied (MR Link)
  • Backfill namespace_ancestry for blobs (MR example, MR Link). Note: we want to avoid hitting Gitaly, so we'll probably need to use painless scripts.
  • Backfill namespace_ancestry for wiki_blobs (MR example, MR Link)
  • Use namespace_ancestry_ids for blobs (MR example, MR Link)
  • Use namespace_ancestry_ids for wiki_blobs (MR example, MR Link)
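
Two of the steps above can be sketched briefly: sending the field only when the flag is set, and backfilling existing documents with a painless script so that repository data never has to be re-read through Gitaly. The helper names, the project_id field, and the ancestry value are assumptions for illustration; the request body follows the general shape accepted by the Elasticsearch update-by-query API and is not the migration actually used.

```python
def wiki_blob_document(base_fields, ancestry, send_ancestry):
    """Build a document, adding the new field only when the flag is set."""
    document = dict(base_fields)
    if send_ancestry:
        document["namespace_ancestry_ids"] = ancestry
    return document


def backfill_request(project_id, ancestry):
    """Body for an update-by-query call that fills in the missing field."""
    return {
        "query": {
            "bool": {
                # Only documents of this project (assumed field name) ...
                "filter": [{"term": {"project_id": project_id}}],
                # ... that do not have the new field yet.
                "must_not": [{"exists": {"field": "namespace_ancestry_ids"}}],
            }
        },
        "script": {
            "lang": "painless",
            "source": (
                "ctx._source.namespace_ancestry_ids = "
                "params.namespace_ancestry_ids"
            ),
            "params": {"namespace_ancestry_ids": ancestry},
        },
    }


# Hypothetical project 7 under groups 9970 > 42.
doc = wiki_blob_document({"project_id": 7, "file_name": "home.md"}, "9970-42-", True)
body = backfill_request(7, "9970-42-")
print(doc)
print(body["script"]["source"])
```

Because the script copies the value from params, the ancestry can be computed from the database for each project and the update runs entirely inside Elasticsearch.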

Next Steps for this issue

Validation track

Build track

  • workflow::planning breakdown - @JohnMcGuire
    • Well-scoped MVC issues
      • Issues are the SSOT for all feature development.
      • Refine issues into something that can be delivered within a single milestone
      • Open follow on issues to track work that is de-prioritized
      • Promote existing issues to Epics and open implementation issues for the upcoming milestone
      • Review feature issues with contributors
      • Consider scheduling a POC or engineering investigation issue
      • Make scope tradeoffs to reach for a right-sized MVC
      • Request an issue review to ensure that communication is clear and that the right iteration plan has been proposed to execute on the solution.
  • Prioritized in Milestone
    • The team should understand which issues should be delivered during the next milestone
  • workflow::ready for development - @JohnMcGuire
  • type::bug type::feature type::maintenance - @JohnMcGuire
  • Deliverable - @changzhengliu and @nickbrandt
  • Add to Planning Issue - @JohnMcGuire
  • Defined Quality Plan - @ebanks
  • workflow::refinement - @changzhengliu
    • as needed, refine the aspects of the original feature
  • workflow::in dev - @changzhengliu
    • Applied by the engineer after work (including documentation) has begun on the issue. An MR is typically linked to the issue at this point.
  • workflow::in review - Engineering
    • Applied by an engineer to indicate that all MRs required to close an issue are in review.
  • workflow::blocked - Engineering
    • Applied if the issue becomes blocked at any time during development. For example: technical issue, open question to PM or PD, cross-group dependency.
  • workflow::verification - Engineering
    • After the MRs in the issue have been merged, this label is applied to signal that the issue needs to be verified in staging or production.
  • workflow::awaiting security release - Engineering
    • Applied by an engineer after the security issue has passed verification. This label signals that the fix is ready but awaiting the next monthly security release.
  • Close the Issue - Once available in production
  • Feature is available to GitLab.com hosted customers - Developer
  • Feature is available to self-managed customers - Developer
    • Code is included in the self-managed release (depending upon the cut-off).
  • Stakeholders of a feature will know it's available in production - Developer
    • After the feature is deployed to production and any needed verification in production is completed, the development team will close the issue.
    • Prior to the issue being closed, the development team may set the workflow label to workflow::verification or workflow::production for tracking purposes.
    • Product Manager may follow up with individual stakeholders to let them know the feature is available.
  • Customers will be informed about major changes - @JohnMcGuire
Edited by Siddharth Dungarwal