🚀 Migration Schedule for Daily Runs to Staging Environment

📋 Overview

This overview issue tracks the Migration Schedule for Daily Runs as we transition to the staging environment. It also captures related details such as accounts, error limits, and the areas the team is investigating.

The schedule will include RCA, Chat, and Vulnerability Explanation. There was no daily run for Code Suggestion, but we will add one with staging. Interim production runs will continue until the full migration is completed. Details of the migration are here: #346 (closed)

🏗️ System Design with token usage for daily runs

  1. Currently, the Eval judges for both Prod and Staging use the Anthropic Eval account.
  2. The red blocks mark areas that are still being troubleshot.
  3. Feature token usage is charged to the Prod account; details and estimates are in the table below.
graph LR
    DailyRuns["Daily Runs"]
    CEF

    subgraph Stg
        direction LR
        LS(*GraphQL Limit)
        CFS[Cloudflare]
        LBS[Load Balancer]
        GLS[GitLab]
        GWS[AI Gateway]

        LS --> CFS --> LBS --> GLS --> GWS
    end

    subgraph Prod
        direction LR
        LP(*GraphQL Limit)
        CFP[Cloudflare]
        LBP[Load Balancer]
        GLP[GitLab]
        GWP[AI Gateway]

        LP --> CFP --> LBP --> GLP --> GWP
    end

    subgraph Environment
        direction TB
        Stg
        Prod
    end
 
    CEF --> Stg
    CEF --> Prod
    DailyRuns["Daily Runs"] --> CEF
    
    Stg --> AE[Anthropic Eval]
    AE -->  RCAS[RCA]

    Prod --> AP[Anthropic Prod]
    AP -->  RCA[RCA]
    AP -->  ETV[ETV]
    AP -->  CS[Code Suggestion]
    AP -->  D[Duo Feature]
 
    CEF --> EJ[*Eval judge]
    EJ --> AE
    classDef ap fill:#74992e,stroke:#333,stroke-width:2px
    classDef ae fill:#4287f5,stroke:#333,stroke-width:2px
    classDef at fill:#f54242,stroke:#333,stroke-width:2px
    class AP ap
    class AE ae
    class LS,EJ at

🚧 Current Progress and limitations on Staging

For staging, current work focuses on the red blocks in the diagram above.

  1. 🛠️ DRI: infrastructure
    • We hit a GraphQL limit error on staging because the staging and production environments differ, and we need Infra support to resolve it. The limit could come from any of the staging components: Cloudflare, the load balancer, or GitLab.
  2. 🧠 DRI: groupai model validation
  3. 👥 Feature Teams:
    • a) RCA seeding data is completed.
    • b) 💬 We have the chat team working on seeding data.
    • c) 🔒 We also want to validate the Vulnerability seeding data.

📅 Interim Recommendation for production run schedule for features as we migrate to staging

Note: In all production runs, the evaluator judges use the Anthropic Eval account, and feature inference uses the production account.
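
The account routing described in the note and in the diagram can be sketched as a small lookup. This is only an illustrative sketch: the enum names, the account identifiers, and the `account_for` helper are hypothetical, and the staging feature-inference route is read off the diagram (Stg to Anthropic Eval) rather than stated in the text.

```python
from enum import Enum


class Env(Enum):
    STAGING = "staging"
    PROD = "prod"


class Role(Enum):
    EVAL_JUDGE = "eval_judge"
    FEATURE_INFERENCE = "feature_inference"


# Hypothetical identifiers for the two Anthropic accounts named in the issue.
ANTHROPIC_EVAL = "anthropic-eval"
ANTHROPIC_PROD = "anthropic-prod"


def account_for(env: Env, role: Role) -> str:
    """Return the account a daily-run request would be billed to."""
    # Eval judges use the Anthropic Eval account in both environments.
    if role is Role.EVAL_JUDGE:
        return ANTHROPIC_EVAL
    # Feature inference: production account on Prod; on Staging the diagram
    # routes traffic to the Anthropic Eval account (assumption from the diagram).
    return ANTHROPIC_PROD if env is Env.PROD else ANTHROPIC_EVAL


if __name__ == "__main__":
    for env in Env:
        for role in Role:
            print(f"{env.value:8} {role.value:18} -> {account_for(env, role)}")
```

Keeping the mapping in one function would make it easy to update once the staging accounts are finalized.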

| Feature | Daily Run Production Schedule till Staging Migration | Feature Request and Token Usage | Task for Migration to Staging | Priority for Staging | Staging Migration Estimation Date |
|---|---|---|---|---|---|
| Root Cause Analysis | 900 prompts/day (Monday, Wednesday, Friday) | Max requests/min: 20. Token usage: 35K tokens x 16 batch x 2.5 batches/min = 1.4M tokens/min (17.5% of the Production limit; rough calculation, feature teams not tracking) | Data seeding completed | priority1 | Post GraphQL error support from Infra. Estimated date: Aug 15 |
| Duo Chat | Full dataset (Tuesday, Thursday, Saturday) | Max requests/min: 50. Is it tracked by the feature team? | Data seeding in progress | priority2 | Post GraphQL error support from Infra and seeding data. Estimated date: Aug 17 |
| Vulnerability Explanation | On hold till staging | Is it tracked by the feature team? | Data seeding in review | priority2 | Post GraphQL error support from Infra and seeding data review. Estimated date: Aug 21 |
| Vulnerability Resolves | On hold till staging | | Data seeding in review | priority2 | Post GraphQL error support from Infra and seeding data review. Estimated date: Aug 21 |
| Code Suggestion | Build post migration to Staging | | Not yet started | priority3 | TBD |
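
The rough RCA token estimate in the table can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the three factors are read as tokens per prompt, prompts per batch, and batches per minute, and the production limit is not given in the issue, so the value below is only what the quoted 17.5% share would imply.

```python
# Figures quoted in the RCA row; the variable names are an interpretation.
TOKENS_PER_PROMPT = 35_000   # "35K tokens"
PROMPTS_PER_BATCH = 16       # "x 16 batch"
BATCHES_PER_MINUTE = 2.5     # "x 2.5 batches/min"
QUOTED_SHARE = 0.175         # "17.5% of the Production limit"

# Estimated token throughput of the RCA daily run.
tokens_per_minute = TOKENS_PER_PROMPT * PROMPTS_PER_BATCH * BATCHES_PER_MINUTE

# Production limit implied by the quoted share (derived, not stated in the issue).
implied_limit = tokens_per_minute / QUOTED_SHARE

print(f"RCA usage: {tokens_per_minute:,.0f} tokens/min")
print(f"Implied production limit: {implied_limit:,.0f} tokens/min")
```

The first figure matches the 1.4M tokens/min in the table. Since the feature teams are not tracking usage, both numbers should be treated as estimates.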
Edited by Tan Le