Skip to content

Turn symbol column into string

Hongtao Yang requested to merge json-column-type into main

What does this merge request do and why?

This MR change the symbol column into STRING type and write the symbol count to it as json-serialized string. This way we can avoid table schema change when there are new symbols.

Ref: https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/prompt-library/-/issues/95

How to set up and validate locally

Example table: https://console.cloud.google.com/bigquery?project=unreview-poc-390200e5&ws=!1m10!1m4!1m3!1sunreview-poc-390200e5!2sbquxjob_6d6c39ef_18b3d89fc28!3sUS!1m4!4m3!1sunreview-poc-390200e5!2sgl_gitlab_experiments!3shyang_json_symbol

Merge request checklist

  • I've ran the eval_codebase.py pipeline to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Hongtao Yang

Merge request reports