Resolve: Separate `prefix` into true `prefix` and `prompt`

Dylan Bernardi requested to merge db/adding-developer-input-variable into main

What does this merge request do and why?

This MR uses prompt_transformed to populate a new data field, prompt_final (the name is not yet settled), which captures the final version of the prompt that is sent to the model. It also changes prompt_prefix to hold the original prompt (code_before), which does not undergo any transformations. The field descriptions have been updated to match.
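
To make the split between the two fields concrete, here is a minimal sketch of how a record could carry both values. It is not the promptlib implementation: `PromptRecord`, `build_record`, and `add_file_header` are hypothetical names introduced only for illustration, and the real transformations (function_signatures, imports, file_name_and_language) are replaced by a stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical transformation type: takes the current prompt text and
# returns the rewritten prompt text.
Transformation = Callable[[str], str]


@dataclass
class PromptRecord:
    """Illustrative record; field names follow the MR description."""

    prompt_prefix: str  # original code_before, never transformed
    prompt_final: str   # prompt after all transformations, as sent to the model


def build_record(code_before: str, transformations: List[Transformation]) -> PromptRecord:
    """Apply each transformation in order, keeping the untouched original."""
    prompt = code_before
    for transform in transformations:
        prompt = transform(prompt)
    return PromptRecord(prompt_prefix=code_before, prompt_final=prompt)


def add_file_header(prompt: str) -> str:
    """Stand-in transformation (an assumption, not a promptlib function)."""
    return "# file: example.py (python)\n" + prompt


record = build_record("def add(a, b):\n", [add_file_header])
print(record.prompt_prefix)
print(record.prompt_final)
```

Keeping the original text in its own field means a downstream comparison can always tell what the transformations added, whichever set of transformations was selected for a run.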

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

An example pipeline:

poetry run promptlib code-suggestions \
    --runner DirectRunner \
    --project unreview-poc-390200e5 \
    --region us-central1 \
    --temp-location "gs://unreview-dataflow/tmp/" \
    --save-main-session \
    eval \
    --input-bq-table unreview-poc-390200e5.gl_gitlab_codebase.input_testcases_v1 \
    --output-bq-table unreview-poc-390200e5:gl_gitlab_experiments.db_test_prompt_final_field_v1 \
    --include-suffix \
    --throttle-sec 0.01 \
    --batch-size 24 \
    --min-length 25 \
    --language python \
    --transformation function_signatures \
    --transformation imports \
    --transformation file_name_and_language \
    --model code-gecko@001

An example output table produced by the pipeline above can be viewed in BigQuery. The table name is:

unreview-poc-390200e5.gl_gitlab_experiments.db_test_prompt_final_field_v1
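
One way to spot-check the new field is to compare it side by side with the untransformed prefix. The sketch below only builds the query text; `build_comparison_query` is a hypothetical helper, and the column name prompt_final is an assumption, since the MR description leaves the final name open. Running the query would need a BigQuery client and access to the project.

```python
def build_comparison_query(table: str, limit: int = 10) -> str:
    """Build a BigQuery standard SQL query comparing the two prompt fields.

    The column names are assumptions taken from the MR description.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # Backticks are required because the project ID contains hyphens.
    return (
        "SELECT prompt_prefix, prompt_final\n"
        f"FROM `{table}`\n"
        f"LIMIT {limit}"
    )


query = build_comparison_query(
    "unreview-poc-390200e5.gl_gitlab_experiments.db_test_prompt_final_field_v1",
    limit=5,
)
print(query)
```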

IMPORTANT: The changes made here reflect a schema change. These changes will need to be approved in the Code-Suggestions Schema Issue before being merged.

Merge request checklist

  • I've run the affected pipeline(s) to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Dylan Bernardi