Add manifests to deploy the v2 model

Tan Le requested to merge deploy-model-v2 into main

This MR adds the k8s YAML manifests to deploy model v2, which was fine-tuned for 7 additional languages.

Steps to reproduce

  1. Create a new disk to store models in the k8s cluster

    gcloud compute disks create --size=250GB --zone=us-central1-c nfs-code-suggestions-models-disk
  2. Get cluster credentials

    gcloud container clusters get-credentials ai-assist --zone us-central1-c --project unreview-poc-390200e5
    kubectl config set-context --current --namespace fauxpilot
  3. Deploy the service account JSON key as a secret

    export GOOGLE_APPLICATION_CREDENTIALS=<path to gcp application credentials>
    kubectl create secret generic gcp-storage-credentials \
        --from-file=key.json="$GOOGLE_APPLICATION_CREDENTIALS"
  4. Deploy the NFS server so the model is accessible across the cluster. Note: we deploy an NFS server to support the ReadWriteMany access mode, which lets us scale up replicas safely when pods are scheduled onto different nodes.

    kubectl apply -f ./manifests/fauxpilot/v2/models-nfs-server.yaml
  5. Create the persistent volume, persistent volume claim, and start the model loader k8s job

    kubectl apply -f ./manifests/fauxpilot/v2/models-persistense-volumes.yaml
    kubectl apply -f ./manifests/fauxpilot/v2/model-loader.yaml
    kubectl wait --for=condition=complete --timeout=30m job/model-loader-job-v2
  6. Deploy the Triton server

    kubectl apply -f ./manifests/fauxpilot/v2/model-triton.yaml
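
For context, the persistent-volume manifests in step 5 follow the usual NFS-backed ReadWriteMany pattern. The sketch below is illustrative only — the resource names, the NFS Service DNS name, and the mount path are assumptions; the authoritative definitions are in `./manifests/fauxpilot/v2/`:

```yaml
# Assumed sketch: an NFS-backed PersistentVolume plus a claim with
# ReadWriteMany, so every replica can mount the shared model directory.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: models-pv                 # assumed name
spec:
  capacity:
    storage: 250Gi                # matches the disk created in step 1
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs-server.fauxpilot.svc.cluster.local  # assumed Service DNS
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models-pvc                # assumed name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""            # bind to the pre-provisioned PV above
  resources:
    requests:
      storage: 250Gi
```

ReadWriteMany is the key property here: a GCE persistent disk alone only supports ReadWriteOnce, which is why the disk is fronted by an NFS server in step 4.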
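
Once step 6 completes, a quick sanity check against a live cluster might look like the following. The Service name `triton` and port `8000` are assumptions — adjust them to whatever `model-triton.yaml` actually creates; the `/v2/health/ready` endpoint is Triton's standard HTTP readiness probe:

```shell
# Confirm the Triton pods reached Running state
kubectl get pods -n fauxpilot

# Port-forward the (assumed) Service and probe Triton's readiness endpoint
kubectl port-forward svc/triton 8000:8000 &
curl -sf http://localhost:8000/v2/health/ready && echo "triton ready"
```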

Ref: https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/82

Edited by Alexander Chueshev
