FY24 Q1 OKR: Code Suggestions Open Beta
# Overview

Beyond our gated customer release (Closed Beta), we would like to iterate toward Open Beta.

# Functionalities (in priority order)

1. Monitoring and infra support for Code Suggestions
   - Scaling and optimizations
1. Model and backend enhancements post Closed Beta
   - Prompt and feedback implementation
1. New Web IDE integration
   - A developer can access Code Suggestions from the (new) GitLab Web IDE
1. Multiple suggestions and developer UX improvements
   - Explore code recommendations versus the current code completion (based on internal testing feedback). This came up in feedback comparing us with competitors' offerings: instead of providing a single completion for a prompt, offer multiple recommendations that the developer can choose from or reject.
1. Request routing to multiple models for improved results
   - Focus on merging or serving models for two purposes: more languages and middle-of-code recommendations. CodeGen works fairly well with Python, SQL, and Go. Based on the top 10 languages for Ultimate customers, use multiple models for multi-language support. This work should also help with end-to-end recommendations, including in the middle of a code block. We have been researching PolyCoder and Facebook InCoder.
   - We plan to develop an ensemble of models using codegen-multi as the base model. We will fine-tune codegen-multi on our target languages (Python, Go, JavaScript, GitLab YAML, etc.) to obtain a suite of models that each perform better on their respective language.
   - For infilling, we will also fine-tune codegen-multi on prompts similar to those used by Facebook InCoder. We want to stay with codegen-multi as the base model because it is easy to deploy with NVIDIA Triton.
1. Exploration of Code Suggestions for the GitLab `.gitlab-ci.yml` file (MVC)
   - An additional use case that can help pave the way for future training on customer code and provide a GitLab-unique value proposition.
   - For GitLab YAML, we first want to try the simplest solution: fine-tune the model the same way we fine-tune on any other language. The next step would be to explore a better tokenizer tailored to YAML.

Associated OKR: https://gitlab.com/gitlab-com/gitlab-OKRs/-/work_items/122439728

**Note:** _We are in the process of adding more details_

cc @tmccaslin @nkhalwadekar
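
To make the "Request Routing to Multiple Models" item more concrete, here is a minimal sketch of how a request might be routed to a per-language fine-tuned model, with codegen-multi as the fallback. Everything beyond the base model name is an assumption for illustration: the fine-tuned model names, the `SuggestionRequest` shape, the extension map, and the rule that any request with trailing code goes to a dedicated infill model are all hypothetical, not the planned design.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Base model named in this epic; used whenever no specialised model applies.
BASE_MODEL = "codegen-multi"

# Hypothetical names for the fine-tuned suite (assumptions, not real deployments).
LANGUAGE_MODELS = {
    "python": "codegen-multi-python",
    "go": "codegen-multi-go",
    "javascript": "codegen-multi-javascript",
    "gitlab-yaml": "codegen-multi-gitlab-yaml",
}
INFILL_MODEL = "codegen-multi-infill"  # hypothetical infilling variant

# Illustrative extension map; a real service would likely use a richer detector.
EXTENSION_TO_LANGUAGE = {".py": "python", ".go": "go", ".js": "javascript"}


@dataclass(frozen=True)
class SuggestionRequest:
    """Hypothetical request shape: file path plus the code around the cursor."""
    file_path: str
    prefix: str       # code before the cursor
    suffix: str = ""  # code after the cursor (non-empty for middle-of-code requests)


def detect_language(file_path: str) -> Optional[str]:
    """Guess the language from the file name; return None when unknown."""
    path = PurePosixPath(file_path)
    if path.name == ".gitlab-ci.yml":
        return "gitlab-yaml"
    return EXTENSION_TO_LANGUAGE.get(path.suffix.lower())


def route(request: SuggestionRequest) -> str:
    """Pick a model name for the request, falling back to the base model."""
    # Assumption: middle-of-code requests go to the infill model regardless of language.
    if request.suffix.strip():
        return INFILL_MODEL
    language = detect_language(request.file_path)
    return LANGUAGE_MODELS.get(language, BASE_MODEL)
```

For example, `route(SuggestionRequest("app/main.py", "def f("))` would select the hypothetical Python model, while a file in an unlisted language would fall back to `codegen-multi`. Keeping the routing table as plain data means languages can be added or removed without touching the routing logic.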
epic