🚀 Migration Schedule for Daily Runs to Staging Environment
📋 Overview
This overview issue tracks the migration schedule for daily runs as we transition to the staging environment. It also captures related details such as accounts, error limits, and the areas the team is investigating.
The schedule covers RCA, Chat, and Vulnerability Explanation. There was no daily run for Code Suggestion, but we will add one with staging. Interim production runs will continue until the full migration is completed. Details of the migration are in #346 (closed).
🏗️ System Design with Token Usage for Daily Runs
- Currently, the Eval judges for both Prod and Staging use the Anthropic Eval account.
- The red blocks mark the areas currently under troubleshooting.
- The Prod account carries the feature token usage; details and estimates are in the table below.
graph LR
DailyRuns["Daily Runs"]
CEF
subgraph Stg
direction LR
LS(*GraphQL Limit)
CFS[Cloudflare]
LBS[Load Balancer]
GLS[GitLab]
GWS[AI Gateway]
LS --> CFS --> LBS --> GLS --> GWS
end
subgraph Prod
direction LR
LP(*GraphQL Limit)
CFP[Cloudflare]
LBP[Load Balancer]
GLP[GitLab]
GWP[AI Gateway]
LP --> CFP --> LBP --> GLP --> GWP
end
subgraph Environment
direction TB
Stg
Prod
end
CEF --> Stg
CEF --> Prod
DailyRuns["Daily Runs"] --> CEF
Stg --> AE[Anthropic Eval]
AE --> RCAS[RCA]
Prod --> AP[Anthropic Prod]
AP --> RCA[RCA]
AP --> ETV[ETV]
AP --> CS[Code Suggestion]
AP --> D[Duo Feature]
CEF --> EJ[*Eval judge]
EJ --> AE
classDef ap fill:#74992e,stroke:#333,stroke-width:2px
classDef ae fill:#4287f5,stroke:#333,stroke-width:2px
classDef at fill:#f54242,stroke:#333,stroke-width:2px
class AP ap
class AE ae
class LS,EJ at
🚧 Current Progress and limitations on Staging
For staging, the current work covers the red blocks in the diagram above.
- 🛠️ DRI: infrastructure
  - We are hitting a GraphQL limit error because the staging and production environments differ, and we need Infra support to resolve it. The limit could sit in any of the staging components: Cloudflare, the load balancer, or GitLab.
- 🧠 DRI: groupai model validation
  - We are also working on the robustness of the judge, which is returning null values: https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/issues/429
- 👥 Feature Teams:
  - a) ✅ RCA seeding data is completed.
  - b) 💬 The chat team is working on seeding data.
  - c) 🔒 We also want to validate the Vulnerability seeding data.
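
One common way to harden a judge against null results is to retry a bounded number of times and to track how often nulls still get through. The sketch below is only an illustration under assumptions, not the prompt-library implementation: `judge` is a hypothetical callable that returns a score or `None`, and `max_attempts` is an arbitrary illustrative default.

```python
from typing import Callable, Iterable, Optional


def score_with_retry(
    judge: Callable[[str], Optional[float]],  # hypothetical judge interface
    prompt: str,
    max_attempts: int = 3,  # illustrative default, not a project setting
) -> Optional[float]:
    """Call the judge until it returns a non-null score or attempts run out."""
    for _ in range(max_attempts):
        score = judge(prompt)
        if score is not None:
            return score
    # Still null after all attempts; the caller decides how to count it.
    return None


def null_rate(scores: Iterable[Optional[float]]) -> float:
    """Fraction of judge results that are null (0.0 for an empty run)."""
    scores = list(scores)
    if not scores:
        return 0.0
    return sum(s is None for s in scores) / len(scores)
```

Tracking the null rate per run would make it visible whether a change to the judge actually reduces null values, independent of the retry wrapper.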
📅 Interim Recommendation for production run schedule for features as we migrate to staging
Note: All production runs use the Anthropic Eval account for the evaluator judges and the production account for feature inference.
| Feature | Daily Run Production Schedule till Staging Migration | Feature Request and Token Usage | Task for Migration to Staging | Priority for Staging | Staging Migration Estimation Date |
|---|---|---|---|---|---|
| Root Cause Analysis | 900 prompts/day (Monday, Wednesday, Friday) | Max requests/min: 20. Token usage: 35K tokens × 16 batch × 2.5 batches/min = 1.4M tokens/min (17.5% of the production limit). Rough calculation; feature teams are not tracking this. | Post GraphQL error support from Infra. Estimated date: Aug 15 | | |
| Duo Chat | Full dataset (Tuesday, Thursday, Saturday) | Max requests/min: 50. Is it tracked by the feature team? | Post GraphQL error support from Infra and seeding data. Estimated date: Aug 17 | | |
| Vulnerability Explanation | On hold till staging | Is it tracked by the feature team? | Post GraphQL error support from Infra and seeding data review. Estimated date: Aug 21 | | |
| Vulnerability Resolution | On hold till staging | | Post GraphQL error support from Infra and seeding data review. Estimated date: Aug 21 | | |
| Code Suggestion | Build post migration to staging | TBD | | | |
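
The RCA token estimate and the interim weekday schedule in the table can be checked with a few lines of Python. This is a rough sketch under assumptions: the reading of each factor (tokens per prompt, batch size, batches per minute) is inferred from the table, the production limit is not stated in this issue and is only back-calculated from the 17.5% figure, and all names are hypothetical.

```python
import datetime

# Figures from the Root Cause Analysis row; the meaning of each factor is an
# assumption based on the labels in the table.
TOKENS_PER_PROMPT = 35_000    # "35Ktoken"
BATCH_SIZE = 16               # "16Batch"
BATCHES_PER_MINUTE = 2.5      # "2.5Batch/Min"
SHARE_OF_PROD_LIMIT = 0.175   # "17.5% of the Production limit"


def tokens_per_minute(tokens_per_prompt, batch_size, batches_per_minute):
    """Estimated token throughput for one feature's daily run."""
    return tokens_per_prompt * batch_size * batches_per_minute


def implied_limit(usage, share):
    """Back-calculate the account limit from usage and its share of the limit."""
    return usage / share


# Interim production schedule from the table; keys follow
# datetime.date.weekday(), where 0 is Monday.
INTERIM_SCHEDULE = {
    0: "Root Cause Analysis",
    1: "Duo Chat",
    2: "Root Cause Analysis",
    3: "Duo Chat",
    4: "Root Cause Analysis",
    5: "Duo Chat",
}


def feature_for(day: datetime.date):
    """Return the feature scheduled for a given day, or None if there is no run."""
    return INTERIM_SCHEDULE.get(day.weekday())


rca_usage = tokens_per_minute(TOKENS_PER_PROMPT, BATCH_SIZE, BATCHES_PER_MINUTE)
print(f"RCA usage: {rca_usage / 1e6:.1f}M tokens/min")
print(f"Implied production limit: "
      f"{implied_limit(rca_usage, SHARE_OF_PROD_LIMIT) / 1e6:.1f}M tokens/min")
```

Keeping the factors as named constants makes it easy to rerun the estimate once the feature teams start tracking real token usage.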