How do we keep the main DB responsive during the first import from License DB?
Topic to Evaluate
On GitLab instances that import data from License DB for the first time, PackageMetadata::SyncWorker
inserts millions of rows into the main Postgres database using INSERT queries. This causes significant performance issues:
- This heavily degrades the initial user experience.
- It makes QA tests fail due to performance slowdowns in the main DB. #396649 (comment 1319465309)
We need to evaluate options to solve this performance problem.
Tasks to Evaluate
For each option considered in this issue:
- Check feasibility
- Check compatibility with all supported deployments
- Assess cost of implementation
- Measure performance gain
Then,
- Compare options, and choose a first step
- Create implementation issues
Risks and Implementation Considerations
Team
/cc @gonzoyumo @sam.white @brytannia @stanhu
🤖 Auto-Summary
Discoto Usage
Points
Discussion points are declared by headings, list items, and single lines that start with the text "point:" (case-insensitive). For example, the following are all valid points:
#### POINT: This is a point
* point: This is a point
+ Point: This is a point
- pOINT: This is a point
point: This is a **point**
Note that any markdown used in the point text will also be propagated into the topic summaries.
Topics
Topics can be stand-alone and contained within an issuable (epic, issue, MR), or can be inline.
Inline topics are defined by creating a new thread (discussion) where the first line of the first comment is a heading that starts with "topic:" (case-insensitive). For example, the following are all valid topics:
# Topic: Inline discussion topic 1
## TOPIC: **{+A Green, bolded topic+}**
### tOpIc: Another topic
Quick Actions
| Action | Description |
|---|---|
| /discuss sub-topic TITLE | Create an issue for a sub-topic. Does not work in epics |
| /discuss link ISSUABLE-LINK | Link an issuable as a child of this discussion |
Last updated by this job
- TOPIC Import using COPY FROM #397670 (comment 1320233823)
- TOPIC Import to partition, attach #397670 (comment 1320273682)
- TOPIC Licenses as arrays of IDs #397670 (comment 1320310058)
- TOPIC Use separate DB #397670 (comment 1320320460)
- TOPIC Improve bulk upsert #397670 (comment 1320324486)
- TOPIC Allow duplicates #397670 (comment 1320393315)
- TOPIC Admins enable the sync of each package type #397670 (comment 1320405899)
- TOPIC Get package metadata via an API #397670 (comment 1320425477)
- TOPIC Throttle database requests #397670 (comment 1320815585)
  - Following the normal workflow #397670 (comment 1320838568)
- TOPIC Testing #397670 (comment 1320819134)
- TOPIC Allow importing the top N most popular packages of each PURL type as an option #397670 (comment 1321230965)
- TOPIC Compress data using ranges of versions #397670 (comment 1322371276)
- TOPIC Compress the export files to save on space and network transfer time #397670 (comment 1324805831)
Discoto Settings
---
summary:
  max_items: -1
  sort_by: created
  sort_direction: ascending
See the settings schema for details.
Implementation Plan
- Modify PackageMetadata::SyncService to add a simple DB request throttle after ingest via sleep:
  https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/package_metadata/sync_service.rb#L45
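
The throttle described in this plan could look roughly like the sketch below. It is only an illustration of the "sleep after each ingest" idea, not the actual PackageMetadata::SyncService code: the class name `ThrottledIngester`, the `throttle_seconds` and `batch_size` parameters, and the injectable `sleeper` are all hypothetical names introduced for this example, and the default pause of 0.5 seconds is an arbitrary placeholder.

```ruby
# Hypothetical sketch: none of these names come from the GitLab codebase.
class ThrottledIngester
  # throttle_seconds: pause after each batch so the DB can serve other queries.
  # sleeper: callable used to pause; injectable so tests can avoid real sleeps.
  def initialize(throttle_seconds: 0.5, sleeper: Kernel.method(:sleep))
    @throttle_seconds = throttle_seconds
    @sleeper = sleeper
  end

  # Splits rows into slices of batch_size, yields each slice to the caller's
  # ingest block, then pauses before moving on to the next slice.
  # Returns the number of batches processed.
  def each_batch(rows, batch_size:)
    batches = 0
    rows.each_slice(batch_size) do |slice|
      yield slice                       # caller performs the actual ingest
      batches += 1
      @sleeper.call(@throttle_seconds)  # throttle after ingest
    end
    batches
  end
end
```

A caller would wrap its existing ingest call in the block, for example `ThrottledIngester.new.each_batch(rows, batch_size: 1000) { |slice| ingest(slice) }`, where `ingest` stands in for whatever performs the bulk upsert. A fixed sleep is the simplest possible throttle: it trades a longer total import time for lower sustained load on the main DB, and it does not adapt to how busy the database actually is.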