Draft: Add pgvector to CI
What does this MR do and why?
Adds the embedding database to RSpec CI runs. The database must have the pgvector extension installed so that it can store vectors.
Screenshots or screen recordings
N/A
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
N/A
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Activity
assigned to @dgruzd
2 Warnings
feature::addition and feature::enhancement merge requests normally have a documentation change. Consider adding a documentation update or confirming the documentation plan with the Technical Writer counterpart.
For more information, see:
- The Handbook page on merge request types.
- The definition of done documentation.
Most of the time, merge requests should target master. Otherwise, please set the relevant Pick into X.Y label.
2 Messages
CHANGELOG missing: If you want to create a changelog entry for GitLab FOSS, add the Changelog trailer to the commit message you want to add to the changelog. If you want to create a changelog entry for GitLab EE, also add the EE: true trailer to your commit message. If this merge request doesn't need a CHANGELOG entry, feel free to ignore this message.
This merge request adds or changes files that require a review from the Database team. This merge request requires a database review. To make sure these changes are reviewed, take the following steps:
- Ensure the merge request has database and database::review pending labels. If the merge request modifies database files, Danger will do this for you.
- Prepare your MR for database review according to the docs.
- Assign and mention the database reviewer suggested by Reviewer Roulette.
The following files require a review from the Database team:
ee/db/embedding/structure.sql
lib/gitlab/database/gitlab_schema.rb
Pipeline Changes
This merge request contains changes to the pipeline configuration for the GitLab project.
Please consider the effect of the changes in this merge request on the following:
- Effects on different pipeline types
- Effects on non-canonical projects:
  - gitlab-foss
  - security
  - dev
  - personal forks
- Effects on pipeline performance
Please consider communicating these changes to the broader team following the communication guideline for pipeline changes
Reviewer roulette
Changes that require review have been detected!
Please refer to the table below for assigning reviewers and maintainers suggested by Danger in the specified category:
| Category | Reviewer | Maintainer |
| --- | --- | --- |
| backend | Eugenia Grieff (@egrieff) (UTC+2, same timezone as @dgruzd) | Mark Chao (@lulalala) (UTC+8, 6 hours ahead of @dgruzd) |
| database | Tianwen Chen (@tianwenchen) (UTC+10, 8 hours ahead of @dgruzd) | Dylan Griffith (@DylanGriffith) (UTC+10, 8 hours ahead of @dgruzd) |
| maintenance::workflow / maintenance::pipelines for CI, Danger | Amparo Luna (@a_luna) (UTC-5, 7 hours behind @dgruzd) | David Dieulivol (@ddieulivol) (UTC+2, same timezone as @dgruzd) |

To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot, based on their timezone. Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.
To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.
Once you've decided who will review this merge request, assign them as a reviewer! Danger does not automatically notify them for you.
If needed, you can retry the danger-review job that generated this comment.
Generated by Danger
- Resolved by 🤖 GitLab Bot 🤖
Proper labels assigned to this merge request. Please ignore me.
@dgruzd - please see the following guidance and update this merge request.
1 Error
Please add type::bug, type::feature, or type::maintenance label to this merge request.
Edited by 🤖 GitLab Bot 🤖
- A deleted user
added backend label
requested review from @terrichu
changed milestone to %16.0
added 2 commits
Allure report
allure-report-publisher generated test report!
e2e-review-qa: test report for a117cfe7
+-----------------------------------------------------------------------+
|                            suites summary                             |
+------------------+--------+--------+---------+-------+-------+--------+
|                  | passed | failed | skipped | flaky | total | result |
+------------------+--------+--------+---------+-------+-------+--------+
| Create           | 28     | 0      | 1       | 0     | 29    | ✅     |
| Plan             | 49     | 0      | 1       | 0     | 50    | ✅     |
| Data Stores      | 22     | 0      | 0       | 0     | 22    | ✅     |
| Govern           | 24     | 0      | 0       | 0     | 24    | ✅     |
| Manage           | 8      | 0      | 3       | 0     | 11    | ✅     |
| Verify           | 10     | 0      | 0       | 0     | 10    | ✅     |
| Package          | 0      | 0      | 1       | 0     | 1     | ➖     |
| Monitor          | 4      | 0      | 0       | 0     | 4     | ✅     |
| Framework sanity | 9      | 0      | 1       | 0     | 10    | ✅     |
+------------------+--------+--------+---------+-------+-------+--------+
| Total            | 154    | 0      | 7       | 0     | 161   | ✅     |
+------------------+--------+--------+---------+-------+-------+--------+
e2e-package-and-test: test report for a117cfe7
+-----------------------------------------------------------------------+
|                            suites summary                             |
+------------------+--------+--------+---------+-------+-------+--------+
|                  | passed | failed | skipped | flaky | total | result |
+------------------+--------+--------+---------+-------+-------+--------+
| Release          | 30     | 0      | 0       | 20    | 30    | ❗     |
| Manage           | 199    | 2      | 33      | 80    | 234   | ❌     |
| Create           | 738    | 0      | 105     | 150   | 843   | ❗     |
| Verify           | 270    | 0      | 20      | 225   | 290   | ❗     |
| Data Stores      | 181    | 0      | 3       | 55    | 184   | ❗     |
| Plan             | 304    | 0      | 0       | 160   | 304   | ❗     |
| Secure           | 20     | 0      | 40      | 20    | 60    | ❗     |
| Govern           | 231    | 0      | 0       | 230   | 231   | ❗     |
| Monitor          | 54     | 0      | 1       | 50    | 55    | ❗     |
| Fulfillment      | 12     | 0      | 110     | 2     | 122   | ❗     |
| Growth           | 0      | 0      | 10      | 0     | 10    | ➖     |
| ModelOps         | 0      | 0      | 5       | 0     | 5     | ➖     |
| Package          | 126    | 0      | 59      | 0     | 185   | ✅     |
| Configure        | 1      | 0      | 15      | 0     | 16    | ✅     |
| Analytics        | 11     | 0      | 0       | 10    | 11    | ❗     |
| Framework sanity | 0      | 0      | 5       | 0     | 5     | ➖     |
| Systems          | 19     | 0      | 0       | 0     | 19    | ✅     |
| GitLab Metrics   | 2      | 0      | 1       | 0     | 3     | ✅     |
+------------------+--------+--------+---------+-------+-------+--------+
| Total            | 2198   | 2      | 407     | 1002  | 2607  | ❌     |
+------------------+--------+--------+---------+-------+-------+--------+
mentioned in merge request !117914 (closed)
The only workaround I can see for setting the postgres version in pgvector is by building the image and passing a build argument. Each image has a hardcoded PG_VERSION and the latest is pg15. Tags for other pg versions exist:
But we need at least v0.4.0 because that's where they changed the max dimensions from 1024, otherwise this error occurs:
ERROR: dimensions for type vector cannot exceed 1024
LINE 5: embedding vector(1536) NOT NULL,
Steps to build the image correctly:
git clone https://github.com/pgvector/pgvector.git
# be sure to be on master, other tags don't have PG_MAJOR as ARG
cd pgvector
docker build --build-arg PG_MAJOR=13 -t pgvector_pg_13 .
docker run -e POSTGRES_HOST_AUTH_METHOD=trust pgvector_pg_13
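Once a container from the image built above is running, a quick smoke test can confirm that a 1536-dimension column is accepted. This is only a sketch: the `smoke_sql` helper and the `smoke_embeddings` table name are hypothetical, and the `docker exec` line in the comment assumes the default `postgres` superuser and a container name you supply yourself.

```shell
#!/bin/sh
# Hypothetical helper: prints SQL that exercises the pgvector extension
# with the same column definition that failed in the error above.
smoke_sql() {
  cat <<'SQL'
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE smoke_embeddings (
  id bigserial PRIMARY KEY,
  embedding vector(1536) NOT NULL
);
DROP TABLE smoke_embeddings;
SQL
}

# Print the SQL. To run it against a container (name is an assumption):
#   smoke_sql | docker exec -i <container> psql -U postgres -v ON_ERROR_STOP=1
smoke_sql
```

If the extension is older than v0.4.0, the `CREATE TABLE` statement should fail with the "cannot exceed 1024" error quoted above.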
Do we have a container registry for CI pipelines? If that's a possibility, we could create images for pg12, pg13, and pg14, put them in the registry, and then point the CI script to those images.
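The idea of one image per PostgreSQL version could be scripted roughly as follows. This is a sketch under assumptions: the registry path and the `pgvector_<major>:<tag>` naming are modelled on the image name that appears later in this thread, and `REGISTRY`, `PGVECTOR_TAG`, `DRY_RUN`, `image_ref`, and `run` are hypothetical names introduced here. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Assumed defaults; override via environment variables as needed.
REGISTRY="${REGISTRY:-registry.gitlab.com/gitlab-org/enablement-section/tanuki-bot}"
PGVECTOR_TAG="${PGVECTOR_TAG:-v0.4.1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, anything else = execute

# Compose the full image reference for a PostgreSQL major version ($1).
image_ref() {
  printf '%s/pgvector_%s:%s\n' "$REGISTRY" "$1" "$PGVECTOR_TAG"
}

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run from a pgvector checkout whose Dockerfile accepts PG_MAJOR as a build arg.
for major in 12 13 14; do
  ref=$(image_ref "$major")
  run docker build --build-arg "PG_MAJOR=$major" -t "$ref" .
  run docker push "$ref"
done
```

Setting `DRY_RUN=0` would execute the builds and pushes, which requires Docker and push access to the registry. Note that the image tag here is only a label; it does not check out a matching pgvector release.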
Edited by Madelein van Niekerk
Devin had the same issue and got around it by running make install manually, which is not an option here either. Maybe building the image can solve for both cases?
This seems possible given our Dependency Proxy
I'm going to play with https://gitlab.com/gitlab-org/gitlab-build-images and see if we can add a new image. Maybe first even build it manually to unblock CI
Thanks, I built the image manually and am uploading to https://gitlab.com/gitlab-org/enablement-section/tanuki-bot/container_registry/4120284 but it's taking more than an hour to upload the last layer
I think the manual approach might be good for starters and then we can do a checked-in version.
@maddievn wow. I've just pushed registry.gitlab.com/gitlab-org/enablement-section/tanuki-bot/pgvector_14:v0.4.1, let me also push other versions and use them in CI
FYI: I've opened gitlab-build-images!675 (merged) to build these images properly. After this is merged, my plan is to use one postgres image in CI instead of two.
mentioned in merge request !118195 (merged)
removed review request for @terrichu
I believe we can close this MR since we've merged the changes into !117695 (merged)