feat: postgres indexer (!832) · Merge requests · GitLab.org / gitlab-elasticsearch-indexer

What does this MR do and why?

Implements a PostgreSQL adapter that stores code chunks in a PostgreSQL database. It provides an alternative to the Elasticsearch and OpenSearch adapters while maintaining the same indexing functionality.

Changes

PostgreSQL Client (internal/mode/chunk/client/postgresql/postgresql.go): Manages database connections and operations
PostgreSQL Indexer (internal/mode/chunk/indexer/postgresql/indexer.go): Implements chunk indexing operations
Integration (internal/mode/chunk/chunk.go): Orchestrates the indexing workflow

Key Features

Batched Operations: Buffers up to 1000 chunks (configurable) before flushing, to reduce transaction overhead
Partition Support: All operations filter by partition_id for data isolation between partitions
Orphan Cleanup: When a file is modified, automatically removes previously indexed chunks that the file no longer contains
Incremental Reindexing: Supports the reindexing flag workflow for efficient incremental updates
Transaction Safety: Uses transactions for atomic operations
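
A minimal sketch of the batching behaviour described above, not the MR's actual code. The names Chunk, Buffer, FlushFunc and NewBuffer are hypothetical; the flush callback stands in for whatever runs the upserts inside a single transaction.

```go
package main

import "fmt"

// Chunk is a hypothetical stand-in for an indexed code chunk.
type Chunk struct {
	ID      string
	Path    string
	Content string
}

// FlushFunc persists one batch. In a real adapter this would run the
// upserts inside one database transaction. It must not retain the slice,
// because the buffer reuses its backing array after a successful flush.
type FlushFunc func(batch []Chunk) error

// Buffer accumulates chunks and flushes once the batch size is reached.
type Buffer struct {
	size    int
	pending []Chunk
	flush   FlushFunc
}

// NewBuffer falls back to 1000 (the default named in the description)
// when size is not positive.
func NewBuffer(size int, flush FlushFunc) *Buffer {
	if size <= 0 {
		size = 1000
	}
	return &Buffer{size: size, pending: make([]Chunk, 0, size), flush: flush}
}

// Add queues a chunk and flushes automatically when the buffer is full.
func (b *Buffer) Add(c Chunk) error {
	b.pending = append(b.pending, c)
	if len(b.pending) >= b.size {
		return b.Flush()
	}
	return nil
}

// Flush writes any pending chunks. On failure the chunks stay buffered
// so the caller can retry.
func (b *Buffer) Flush() error {
	if len(b.pending) == 0 {
		return nil
	}
	if err := b.flush(b.pending); err != nil {
		return err
	}
	b.pending = b.pending[:0]
	return nil
}

func main() {
	var batches []int
	buf := NewBuffer(2, func(batch []Chunk) error {
		batches = append(batches, len(batch))
		return nil
	})
	for i := 0; i < 5; i++ {
		_ = buf.Add(Chunk{ID: fmt.Sprintf("c%d", i)})
	}
	_ = buf.Flush() // flush the final partial batch
	fmt.Println(batches) // [2 2 1]
}
```

Keeping the persistence step behind a callback means the buffering logic can be exercised without a database.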

Operations Implemented

Index: Upserts chunks with batching and orphan cleanup
DeletePaths: Removes all chunks for specified file paths
Delete: Removes all chunks for a project in a partition
ResolveReindexing: Completes the incremental reindexing workflow
Flush: Executes buffered upsert operations in a transaction
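
The orphan cleanup performed by Index can be thought of as a set difference: chunks previously stored for a file, minus the chunks produced by the latest indexing pass. The sketch below illustrates only that idea, under the assumption that chunks are identified by string IDs; orphanIDs is a hypothetical helper and does not reflect how the MR computes orphans in SQL.

```go
package main

import (
	"fmt"
	"sort"
)

// orphanIDs returns the IDs in previous that are absent from current,
// sorted for deterministic output.
func orphanIDs(previous, current []string) []string {
	keep := make(map[string]struct{}, len(current))
	for _, id := range current {
		keep[id] = struct{}{}
	}
	var orphans []string
	for _, id := range previous {
		if _, ok := keep[id]; !ok {
			orphans = append(orphans, id)
		}
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	previous := []string{"a1", "b2", "c3"} // chunks stored before the file changed
	current := []string{"b2", "d4"}        // chunks produced by the latest pass
	fmt.Println(orphanIDs(previous, current)) // [a1 c3]
}
```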

The tests mock database calls, since integration tests will live in Rails, where a database is already configured in CI.
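
One common way to mock database calls in Go is to hide the database handle behind a small interface and substitute a recording fake in tests. The sketch below assumes that approach; the execer interface, fakeDB type, deleteProject function, and the code_chunks table and its columns are all hypothetical and are not taken from this MR.

```go
package main

import "fmt"

// execer is a hypothetical, minimal interface over the database handle.
type execer interface {
	Exec(query string, args ...any) error
}

// fakeDB records statements instead of talking to PostgreSQL.
type fakeDB struct {
	queries []string
	args    [][]any
}

func (f *fakeDB) Exec(query string, args ...any) error {
	f.queries = append(f.queries, query)
	f.args = append(f.args, args)
	return nil
}

// deleteProject removes a project's chunks within one partition.
// The table and column names are illustrative assumptions.
func deleteProject(db execer, partitionID, projectID int64) error {
	const q = "DELETE FROM code_chunks WHERE partition_id = $1 AND project_id = $2"
	return db.Exec(q, partitionID, projectID)
}

func main() {
	db := &fakeDB{}
	if err := deleteProject(db, 7, 1000000); err != nil {
		panic(err)
	}
	// The fake lets a test assert on the statement and its arguments.
	fmt.Println(len(db.queries), db.args[0]) // 1 [7 1000000]
}
```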

How to set up and validate locally

Check out the ActiveContext Postgres indexer support (gitlab!216987) branch in Rails
Check out this branch in the indexer and run make
Run PostgreSQL

docker run -p 5432:5432 --name pgvector17 -e POSTGRES_PASSWORD=password pgvector/pgvector:pg17

Create the vector extension

psql -h localhost -p 5432 -U postgres
CREATE EXTENSION vector;

Create a PostgreSQL connection in the Rails console

connection = Ai::ActiveContext::Connection.create!(
  name: "postgres",
  options: { host: 'localhost', port: 5432, username: 'postgres', password: 'password' },
  adapter_class: "ActiveContext::Databases::Postgresql::Adapter"
)
connection.activate!

Run the migration worker repeatedly

::Ai::ActiveContext::MigrationWorker.new.perform

Create enabled namespaces

Ai::ActiveContext::Code::SchedulingWorker.new.perform("create_enabled_namespace")

Trigger indexing for a project

::Ai::ActiveContext::Code::AdHocIndexingWorker.new.perform(1000000)

Note that the repo files were chunked and indexed
Update a file and note that the stored chunks reflect the new file contents (orphaned chunks are deleted)
Run the deleter and note that the chunks were deleted

Ai::ActiveContext::Code::Deleter.run!(Ai::ActiveContext::Code::Repository.find_by(project_id: 1000000))

Edited Dec 18, 2025 by Madelein van Niekerk
