Code creation pipeline

Mikołaj Wawrzyniak requested to merge mwaw/add_code_creation_pipeline into main

What does this merge request do and why?

This MR adds a new pipeline. It is designed to work with the MBPP dataset and to test the code generation feature built by the Code Creation group. The new pipeline integrates with the GitLab code suggestions API (https://docs.gitlab.com/ee/api/code_suggestions.html), which is built following the details described at https://handbook.gitlab.com/handbook/engineering/development/dev/create/code-creation/engineering_overview/#code-suggestions-technical-overview
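For reviewers unfamiliar with the endpoint, a request to the code suggestions API might be built roughly as follows. This is a minimal sketch based on the public API documentation linked above, not the client code added in this MR. The helper names `build_completion_payload` and `build_request`, the `gitlab.example.com` base URL, the `solution.py` file name, and the example prompt are all assumptions for illustration.

```python
import json
import urllib.request

# Path of the completions endpoint as given in the public API docs.
COMPLETIONS_PATH = "/api/v4/code_suggestions/completions"


def build_completion_payload(prompt: str, file_name: str = "solution.py") -> dict:
    """Build the JSON body for a code generation request.

    The prompt (for example an MBPP task description) is sent as the
    content above the cursor. The file name is a hypothetical default.
    """
    return {
        "current_file": {
            "file_name": file_name,
            "content_above_cursor": prompt,
            "content_below_cursor": "",
        },
        # The documented values for intent are "completion" and "generation".
        "intent": "generation",
    }


def build_request(
    prompt: str,
    token: str,
    base_url: str = "https://gitlab.example.com",  # assumed placeholder host
) -> urllib.request.Request:
    """Prepare, but do not send, the POST request."""
    body = json.dumps(build_completion_payload(prompt)).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + COMPLETIONS_PATH,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The sketch stops before that so it can be inspected without a token or network access.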

Besides being runnable against the GitLab API, the new pipeline can also be run with other LLM providers, to support cross-comparison between the GitLab offering and the broader domain. To compare with different providers, add additional answering_models entries to the config file.

The new pipeline supports the --test-run --sample-size 3 flags, which allow local execution with a limited subset of test cases.
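To illustrate what limiting a run to a small subset can look like, here is a minimal sketch of seeded sampling. It is not the pipeline's implementation: the function name `select_test_cases`, the fixed seed, and the choice of random rather than first-N selection are assumptions.

```python
import random
from typing import Optional, Sequence


def select_test_cases(
    cases: Sequence[dict],
    sample_size: Optional[int] = None,
    seed: int = 0,
) -> list:
    """Return at most sample_size cases, chosen reproducibly.

    With sample_size=None, or a size at least as large as the data set,
    every case is returned in its original order.
    """
    if sample_size is None or sample_size >= len(cases):
        return list(cases)
    # A dedicated Random instance keeps the selection deterministic
    # without touching the global random state.
    rng = random.Random(seed)
    return rng.sample(list(cases), sample_size)
```

A fixed seed means two local runs with the same sample size evaluate the same test cases, which makes results comparable across providers.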

The new pipeline can be run against a locally running GDK instance. To do so, adjust the base_url param in answering_models to point to the GDK endpoint.
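The base_url adjustment can be sketched as a small helper that rewrites a loaded config. The config shape used here (a top-level `answering_models` list of objects, each with a `base_url` key), the function name `repoint_base_url`, and the GDK address `http://127.0.0.1:3000` are assumptions for illustration. Check the actual config file and your GDK setup for the real values.

```python
import copy


def repoint_base_url(config: dict, old_url: str, new_url: str) -> dict:
    """Return a copy of config with matching base_url values replaced.

    Only answering_models entries whose base_url equals old_url are
    changed, so entries for other providers are left untouched.
    """
    updated = copy.deepcopy(config)
    for model in updated.get("answering_models", []):
        if model.get("base_url") == old_url:
            model["base_url"] = new_url
    return updated


# Hypothetical config fragment; real entries may carry more fields.
example_config = {
    "answering_models": [
        {"base_url": "https://gitlab.com"},
        {"base_url": "https://other-provider.example.com"},
    ]
}

local_config = repoint_base_url(
    example_config,
    old_url="https://gitlab.com",
    new_url="http://127.0.0.1:3000",  # assumed local GDK address
)
```

Working on a deep copy leaves the original config intact, so the same file can still be used for runs against the hosted API.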

How to set up and validate locally

  1. Ensure the GCP environment variables are set up.

  2. Check out this merge request's branch.

  3. Run the following command to kick off the pipeline.

    GOOGLE_APPLICATION_CREDENTIALS=data/secrets/dev-ai-research-0e2f8974-a2b7a104f34b.json GITLAB_TOKEN=$GITLAB_AI_API_TOKEN poetry run promptlib code-creation eval --test-run --sample-size 3 --config-file data/config/code_generation_eval_mbpp_config.json

Merge request checklist

  • I've run the affected pipeline(s) to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Susie Bitters
