Upgrading on-prem to 16.3.3 or 16.3.4: tables pm_package_versions, pm_package_version_licenses, and pm_packages fill up
:warning: **FIXED in GitLab 16.3.5**. If you have upgraded to 16.3.3 or 16.3.4, please read [how to fix this](https://gitlab.com/gitlab-org/gitlab/-/issues/425971#how-to-fix-this).
### Summary
After upgrading my on-prem installation to 16.3.3, the tables `public.pm_package_versions`, `public.pm_package_version_licenses`, and `public.pm_packages` started filling up: 5 GB was inserted automatically within 3 days. Upgrading to 16.3.4 doesn't change the behaviour, and the tables continue to grow, by 1.7 GB/day in my case.
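To gauge how urgent the growth is, here is a minimal sketch that projects how many days remain before the database volume is full. It assumes the growth stays linear, which may not hold in practice. The 1.7 GB/day rate comes from this report; the helper name `days_until_full` and the 50 GB of free space are hypothetical values chosen for illustration.

```ruby
# Rough projection of the time left before the database volume fills up.
# Assumes linear growth; both arguments are in GB.
def days_until_full(free_gb, growth_gb_per_day)
  raise ArgumentError, "growth must be positive" unless growth_gb_per_day.positive?

  # Whole days remaining, rounded down.
  (free_gb / growth_gb_per_day.to_f).floor
end

# Hypothetical 50 GB of free space at the 1.7 GB/day reported above.
puts days_until_full(50, 1.7) # => 29
```

Substitute the free space and daily growth measured on your own instance.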
### Steps to reproduce
1. Have an on-prem installation of GitLab 16.3.2 with the 7.3.2 chart (with PostgreSQL 15.3).
2. Upgrade to GitLab 16.3.3.
3. Wait and watch the size of the tables increase.
### What is the current *bug* behavior?
Tables are getting filled automatically and database space is filling up.
### What is the expected *correct* behavior?
The tables should not fill up automatically after the upgrade.
### Relevant logs and/or screenshots
```
gitlabhq_production=> SELECT
gitlabhq_production-> table_schema || '.' || table_name AS table_full_name,
gitlabhq_production-> pg_size_pretty(pg_total_relation_size('"' || table_schema || '"."' || table_name || '"')) AS size
gitlabhq_production-> FROM information_schema.tables
gitlabhq_production-> ORDER BY
gitlabhq_production-> pg_total_relation_size('"' || table_schema || '"."' || table_name || '"') DESC;
table_full_name | size
-----------------------------------------------------------------------------------+------------
public.pm_package_versions | 2475 MB
public.pm_package_version_licenses | 2422 MB
public.pm_packages | 2021 MB
public.ci_builds | 21 MB
public.merge_request_diff_files | 19 MB
public.ci_builds_metadata | 13 MB
pg_catalog.pg_attribute | 10 MB
public.ci_pipelines | 7064 kB
...
```
```
gitlabhq_production=> select * from public.pm_packages limit 10;
id | purl_type | name | created_at | updated_at | licenses
--------+-----------+--------------------------------+-------------------------------+-------------------------------+--------------------------------
112811 | 1 | ee01/php-html-parser | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:17.986373+00 | [[11], "1.5.1", "2.0.2.1", []]
112901 | 1 | ee-objects/config | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], "0.1.1", "0.1.1", []]
112915 | 1 | effiana/phpunit-test-generator | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], "0.1.0", "0.1.1", []]
112921 | 1 | efiku/locamon | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], null, null, []]
112922 | 1 | efipeek/prestaconsole | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[24], null, null, []]
112930 | 1 | efrane/phar-builder-bundle | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], "0.0.1", "0.5.0", []]
112931 | 1 | efrane/tinkr | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], "0.5.0", "0.5.4", []]
112936 | 1 | efriandika/laravel-settings | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], "1.0.0", "1.2.5", []]
112951 | 1 | eftec/securityonemysql | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[10], "0.14.0", "1.5.3", []]
112950 | 1 | eftec/projectone | 2023-09-18 11:26:11.705752+00 | 2023-09-20 07:50:18.878443+00 | [[11], "0.1.0", "0.1.0", []]
(10 rows)
```
```
gitlabhq_production=> select count(*) from public.pm_package_versions;
count
----------
14209278
gitlabhq_production=> select * from public.pm_package_versions limit 10;
id | pm_package_id | version | created_at | updated_at
----+---------------+------------+-------------------------------+-------------------------------
1 | 1 | v1.0.2 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
2 | 1 | v1.0.3 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
3 | 1 | v1.0.4 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
4 | 1 | dev-main | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
5 | 1 | v1.0.0 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
6 | 1 | v1.0.1 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
7 | 7 | dev-master | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
8 | 4 | dev-master | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
9 | 4 | 1.0.2 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
10 | 4 | 1.0.1 | 2023-09-20 05:46:22.163674+00 | 2023-09-20 05:46:22.163674+00
```
```
gitlabhq_production=> select * from public.pm_package_version_licenses limit 10 offset 10000;
pm_package_version_id | pm_license_id | created_at | updated_at | id
-----------------------+---------------+-------------------------------+-------------------------------+-------
9939 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10001
9940 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10002
9941 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10003
9942 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10004
9943 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10005
9944 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10006
9945 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10007
9946 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10008
9947 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10009
9948 | 11 | 2023-09-20 05:47:08.405175+00 | 2023-09-20 05:47:08.405175+00 | 10010
```
#### Results of GitLab environment info
<details>
<summary>Expand for output related to GitLab environment info</summary>
<pre>
current helm chart 7.3.4
GitLab v16.3.4-ee
GitLab Shell 14.26.0
GitLab Workhorse v16.3.4
GitLab API v4
GitLab KAS v16.3.0
Ruby 3.0.6p216
Rails 7.0.6
PostgreSQL (main) 15.3
PostgreSQL (ci) 15.3
Redis 6.2.7
</pre>
</details>
### How to fix this?
1. Disable the feature flag `package_metadata_synchronization`.
- Using the Rails console: `Feature.enabled?(:package_metadata_synchronization) && Feature.disable(:package_metadata_synchronization)`
2. Clean up the data.
- Using the Rails console:
```
> ActiveRecord::Base.connection.execute('SET statement_timeout TO 0')
> PackageMetadata::PackageVersionLicense.delete_all
> PackageMetadata::PackageVersion.delete_all
```
- Then run `VACUUM FULL` to free up space in the embedded database. :warning: This locks the tables and may cause downtime.
```
echo "SET statement_timeout TO 0;\
VACUUM FULL pm_package_version_licenses, pm_package_versions;"\
| sudo gitlab-psql -h /var/opt/gitlab/postgresql -d gitlabhq_production
```
- Or use a `truncate` statement instead (this does not require a vacuum; see why [here](https://www.cybertec-postgresql.com/en/postgresql-delete-vs-truncate/)):
```
truncate public.pm_package_versions cascade;
truncate public.pm_package_version_licenses cascade;
```
#### What if I don't have rails console access?
As a last resort, you can disable the sync for each package type by changing the admin setting in `GitLab > Admin Area > Settings > Security and Compliance`.
Please note that doing so will prevent the sync of all package metadata, which is necessary for our [License Compliance feature](https://docs.gitlab.com/ee/user/compliance/license_scanning_of_cyclonedx_files/). This approach should be a temporary workaround until Rails console access is provided to disable the feature flag or the instance is upgraded to a fixed version.