Separate read and write operations for external service queues
## Problem

Currently, queues for external services (Zuora, Salesforce, GitLab.com) mix read and write operations with the same priority:

**Current queues**:

- `zuora` (weight 4): Contains both critical writes (subscription creation) and routine reads (data sync)
- `salesforce` (weight 4): Contains both urgent writes (opportunity creation) and routine reads (data fetching)
- `gitlab` (weight 4): Contains both critical writes (provisioning) and routine reads (status checks)

**Issues**:

1. Routine read/sync operations can delay critical write operations
2. Can't prioritize customer-blocking writes over background syncs
3. Difficult to throttle read operations during API rate limit issues
4. No clear distinction between critical and routine external service calls

## Proposal

Split external service queues into separate read and write queues with different priorities.

### Benefits

1. **Better prioritization**: Critical writes processed before routine reads
2. **Rate limit management**: Can throttle reads without impacting writes
3. **Improved reliability**: Customer-blocking operations get priority
4. **Clearer intent**: Obvious which operations are critical
5. **Better monitoring**: Can track read vs. write performance separately

## Proposed Queue Structure

### Zuora Queues

**`zuora_write` (weight 8-9)**

- Critical subscription operations
- Customer-blocking writes
- Jobs:
  - Subscription creation/updates
  - Order processing
  - Payment method updates
  - Amendment creation

**`zuora_read` (weight 4-5)**

- Data synchronization
- Non-blocking reads
- Jobs:
  - `Zuora::RefreshLocalSubscriptionsJob`
  - `Zuora::SyncResourceJob` (for reads)
  - Product catalog sync
  - Account data sync

**Keep `zuora_callback` (weight 9)**

- Real-time Zuora callouts (already separate)
- `ZuoraCallbackJob`
- `Zuora::SyncFailedCalloutJob`

### Salesforce Queues

**`salesforce_write` (weight 7-8)**

- Critical CRM operations
- Customer-blocking writes
- Jobs:
  - `Salesforce::CreateOpportunityJob`
  - `Salesforce::CreateAccountJob`
  - `Salesforce::CreateLeadJob`
  - `Salesforce::CreateQuoteForReconciliationJob`

**`salesforce_read` (weight 4-5)**

- Data fetching and sync
- Non-blocking reads
- Jobs:
  - Account lookups
  - Opportunity status checks
  - Data synchronization

### GitLab.com Queues

**`gitlab_write` (weight 9)**

- Critical provisioning operations
- Customer-blocking writes
- Jobs:
  - `UpdateGitlabPlanInfoJob`
  - Subscription provisioning
  - Namespace updates

**`gitlab_read` (weight 4-5)**

- Status checks and sync
- Non-blocking reads
- Jobs:
  - Namespace lookups
  - Usage data fetching
  - Status verification

## Implementation Steps

1. **Audit current jobs**:
   - List all jobs in `zuora`, `salesforce`, `gitlab` queues
   - Categorize as read or write
   - Identify customer-blocking vs. background operations

2. **Create base job classes**:

   ```ruby
   # app/jobs/zuora/write_base_job.rb
   module Zuora
     class WriteBaseJob < ApplicationJob
       queue_as :zuora_write
       # Configuration for critical writes
     end
   end

   # app/jobs/zuora/read_base_job.rb
   module Zuora
     class ReadBaseJob < ApplicationJob
       queue_as :zuora_read
       # Configuration for routine reads
     end
   end
   ```

3. **Update job classes**:

   ```ruby
   # Before
   class Zuora::SyncResourceJob < ApplicationJob
     queue_as :zuora
   end

   # After (if it's a read operation)
   class Zuora::SyncResourceJob < Zuora::ReadBaseJob
   end

   # Or (if it's a write operation)
   class Zuora::CreateSubscriptionJob < Zuora::WriteBaseJob
   end
   ```

4. **Update `config/sidekiq.yml`**:

   ```yaml
   :queues:
     # Critical writes
     - [gitlab_write, 9]
     - [zuora_callback, 9]
     - [zuora_write, 8]
     - [salesforce_write, 8]
     # ... other queues ...
     # Routine reads
     - [zuora_read, 5]
     - [salesforce_read, 5]
     - [gitlab_read, 5]
   ```

5. **Add rate limiting** (optional):

   ```ruby
   # Can throttle read operations without impacting writes
   module Zuora
     class ReadBaseJob < ApplicationJob
       # Rate limit read operations
       sidekiq_throttle threshold: { limit: 100, period: 1.minute }
     end
   end
   ```

6. **Update monitoring**:
   - Track read vs. write queue depths separately
   - Monitor API rate limit usage by queue
   - Alert on write queue depth (more critical)

## Decision Criteria: Read vs. Write

**Write operations** (higher priority):

- Creates or updates external resources
- Customer is waiting for the operation
- Failure directly impacts customer experience
- Time-sensitive (must complete quickly)

**Read operations** (lower priority):

- Fetches or syncs data
- Background operation
- Failure can be retried later
- Can tolerate delays

## Success Criteria

- All external service jobs categorized as read or write
- Write queues have higher priority than read queues
- Can throttle read operations independently
- Clear documentation for future job classification
- Improved customer experience for critical operations

## Related

- Parent epic: gitlab-org&19587
- Related: #14268 (weight granularity)
- Related: #14270 (user-facing vs internal)
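
To make the read vs. write decision criteria easier to apply during the job audit, the rules could be captured in a small helper. The following is a minimal sketch, not existing code: `JobTraits`, `external_queue_for`, and `EXTERNAL_SERVICES` are hypothetical names, and it assumes one possible reading of the criteria, namely that a job goes to the write queue if it either changes an external resource or has a customer waiting on it.

```ruby
# Hypothetical helper illustrating the read vs. write decision criteria.
# None of these names exist in the codebase; they are for illustration only.

# Services whose queues are proposed to be split into read and write.
EXTERNAL_SERVICES = %i[zuora salesforce gitlab].freeze

# Traits recorded for each job during the audit step.
JobTraits = Struct.new(:mutates_external_resource, :customer_waiting, keyword_init: true)

# Returns the proposed queue name (for example :zuora_write) for a job.
# Assumption: either trait alone is enough to classify a job as a write.
def external_queue_for(service, traits)
  unless EXTERNAL_SERVICES.include?(service)
    raise ArgumentError, "unknown service: #{service}"
  end

  write = traits.mutates_external_resource || traits.customer_waiting
  :"#{service}_#{write ? 'write' : 'read'}"
end

# Example: a background sync versus a customer-facing subscription creation.
background_sync = JobTraits.new(mutates_external_resource: false, customer_waiting: false)
create_subscription = JobTraits.new(mutates_external_resource: true, customer_waiting: true)

external_queue_for(:zuora, background_sync)     # => :zuora_read
external_queue_for(:zuora, create_subscription) # => :zuora_write
```

A helper like this would only support the audit and documentation; the actual queue assignment would still come from the base job classes described in the implementation steps.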