Separate read and write operations for external service queues
## Problem
Queues for external services (Zuora, Salesforce, GitLab.com) currently mix read and write operations at the same priority, so a critical write can wait behind routine reads in the same queue.
**Current queues**:
- `zuora` (weight 4): Contains both critical writes (subscription creation) and routine reads (data sync)
- `salesforce` (weight 4): Contains both urgent writes (opportunity creation) and routine reads (data fetching)
- `gitlab` (weight 4): Contains both critical writes (provisioning) and routine reads (status checks)
**Issues**:
1. Routine read/sync operations can delay critical write operations
2. Can't prioritize customer-blocking writes over background syncs
3. Difficult to throttle read operations during API rate limit issues
4. No clear distinction between critical and routine external service calls
## Proposal
Split external service queues into separate read and write queues with different priorities.
### Benefits
1. **Better prioritization**: Critical writes processed before routine reads
2. **Rate limit management**: Can throttle reads without impacting writes
3. **Improved reliability**: Customer-blocking operations get priority
4. **Clearer intent**: Obvious which operations are critical
5. **Better monitoring**: Can track read vs. write performance separately
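To make the prioritization benefit concrete, here is a rough sketch of how weights could translate into polling share. It assumes Sidekiq's weighted queue mode, where a queue is checked with probability roughly proportional to its weight. `first_pick_share` is a hypothetical helper, and the weights are example values taken from the ranges proposed in this issue.

```ruby
# Hypothetical helper: share of polls in which each queue is checked first,
# assuming that chance is proportional to the queue's weight.
def first_pick_share(weights)
  total = weights.values.sum
  weights.transform_values { |weight| Rational(weight, total) }
end

# Example weights only; other queues in the real config would dilute these shares.
proposed = { zuora_write: 8, zuora_read: 5 }

first_pick_share(proposed).each do |queue, share|
  percent = (share * 100).to_f
  puts format("%-12s checked first in about %.0f%% of polls", queue, percent)
end
```

With a single shared queue, by contrast, jobs are processed in arrival order regardless of how urgent they are, which is the problem described above.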
## Proposed Queue Structure
### Zuora Queues
**`zuora_write` (weight 8-9)**
- Critical subscription operations
- Customer-blocking writes
- Jobs:
  - Subscription creation/updates
  - Order processing
  - Payment method updates
  - Amendment creation
**`zuora_read` (weight 4-5)**
- Data synchronization
- Non-blocking reads
- Jobs:
  - `Zuora::RefreshLocalSubscriptionsJob`
  - `Zuora::SyncResourceJob` (for reads)
  - Product catalog sync
  - Account data sync
**Keep `zuora_callback` (weight 9)**
- Real-time Zuora callouts (already separate)
- `ZuoraCallbackJob`
- `Zuora::SyncFailedCalloutJob`
### Salesforce Queues
**`salesforce_write` (weight 7-8)**
- Critical CRM operations
- Customer-blocking writes
- Jobs:
  - `Salesforce::CreateOpportunityJob`
  - `Salesforce::CreateAccountJob`
  - `Salesforce::CreateLeadJob`
  - `Salesforce::CreateQuoteForReconciliationJob`
**`salesforce_read` (weight 4-5)**
- Data fetching and sync
- Non-blocking reads
- Jobs:
  - Account lookups
  - Opportunity status checks
  - Data synchronization
### GitLab.com Queues
**`gitlab_write` (weight 9)**
- Critical provisioning operations
- Customer-blocking writes
- Jobs:
  - `UpdateGitlabPlanInfoJob`
  - Subscription provisioning
  - Namespace updates
**`gitlab_read` (weight 4-5)**
- Status checks and sync
- Non-blocking reads
- Jobs:
  - Namespace lookups
  - Usage data fetching
  - Status verification
## Implementation Steps
1. **Audit current jobs**:
   - List all jobs in the `zuora`, `salesforce`, and `gitlab` queues
   - Categorize each job as read or write
   - Identify customer-blocking vs. background operations
2. **Create base job classes**:
```ruby
# app/jobs/zuora/write_base_job.rb
module Zuora
  class WriteBaseJob < ApplicationJob
    queue_as :zuora_write
    # Configuration for critical writes
  end
end

# app/jobs/zuora/read_base_job.rb
module Zuora
  class ReadBaseJob < ApplicationJob
    queue_as :zuora_read
    # Configuration for routine reads
  end
end
```
3. **Update job classes**:
```ruby
# Before
class Zuora::SyncResourceJob < ApplicationJob
  queue_as :zuora
end

# After (if it's a read operation): the queue is inherited from the read base class
class Zuora::SyncResourceJob < Zuora::ReadBaseJob
end

# A write job inherits from the write base class instead
class Zuora::CreateSubscriptionJob < Zuora::WriteBaseJob
end
```
4. **Update `config/sidekiq.yml`**:
```yaml
:queues:
  # Critical writes
  - [gitlab_write, 9]
  - [zuora_callback, 9]
  - [zuora_write, 8]
  - [salesforce_write, 8]

  # ... other queues ...

  # Routine reads
  - [zuora_read, 5]
  - [salesforce_read, 5]
  - [gitlab_read, 5]
```
5. **Add rate limiting** (optional):
```ruby
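# Reads can be throttled without impacting writes.
# Assumes the sidekiq-throttled gem, which provides `sidekiq_throttle`; the
# gem's job mixin must be available on this class for the macro to exist.
# Reopens the ReadBaseJob from step 2, so `queue_as :zuora_read` still applies.
module Zuora
  class ReadBaseJob < ApplicationJob
    # Allow at most 100 read jobs per minute (placeholder limit, not a measured quota)
    sidekiq_throttle threshold: { limit: 100, period: 1.minute }
  end
end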
```
6. **Update monitoring**:
   - Track read vs. write queue depths separately
   - Monitor API rate limit usage by queue
   - Alert on write queue depth (more critical)
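The monitoring step could start from something like the sketch below. It is self-contained, so queue depths are passed in as a plain hash; in the application they would presumably come from Sidekiq's API (for example `Sidekiq::Queue.new(name).size`). The thresholds and the `queues_over_threshold` helper are hypothetical.

```ruby
# Hypothetical alert thresholds: write queues alert at a much lower depth
# than read queues because their jobs are customer-blocking.
DEPTH_THRESHOLDS = { write: 50, read: 1_000 }.freeze

# Returns the names of queues whose depth exceeds the threshold for their kind.
# Queues that are neither *_write nor *_read (e.g. zuora_callback) are ignored.
def queues_over_threshold(depths, thresholds = DEPTH_THRESHOLDS)
  over = depths.select do |name, depth|
    kind = name.to_s[/_(write|read)\z/, 1]
    kind && depth > thresholds.fetch(kind.to_sym)
  end
  over.keys
end

depths = { "zuora_write" => 75, "zuora_read" => 400, "gitlab_write" => 3 }
p queues_over_threshold(depths) # => ["zuora_write"]
```

Keeping separate thresholds per kind is what makes "alert on write queue depth" possible without read backlogs triggering the same alert.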
## Decision Criteria: Read vs. Write
**Write operations** (higher priority):
- Creates or updates external resources
- Customer is waiting for the operation
- Failure directly impacts customer experience
- Time-sensitive (must complete quickly)
**Read operations** (lower priority):
- Fetches or syncs data
- Background operation
- Failure can be retried later
- Can tolerate delays
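During the audit, the criteria above could be applied mechanically with a small helper like the following sketch. `JobProfile` and `queue_for` are hypothetical names, and treating either "changes external resources" or "customer is waiting" as sufficient for the write queue is an assumption; the team may prefer a stricter rule.

```ruby
# Hypothetical description of a job, filled in during the audit step.
JobProfile = Struct.new(:name, :mutates_external_state, :customer_waiting, keyword_init: true)

# A job goes to the write queue when it changes external resources or a
# customer is blocked on it; otherwise it is treated as a routine read.
def queue_for(service, profile)
  kind = (profile.mutates_external_state || profile.customer_waiting) ? :write : :read
  :"#{service}_#{kind}"
end

sync = JobProfile.new(
  name: "Zuora::RefreshLocalSubscriptionsJob",
  mutates_external_state: false,
  customer_waiting: false
)
create = JobProfile.new(
  name: "Salesforce::CreateOpportunityJob",
  mutates_external_state: true,
  customer_waiting: true
)

p queue_for(:zuora, sync)        # => :zuora_read
p queue_for(:salesforce, create) # => :salesforce_write
```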
## Success Criteria
- All external service jobs categorized as read or write
- Write queues have higher priority than read queues
- Can throttle read operations independently
- Clear documentation for future job classification
- Improved customer experience for critical operations
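The second criterion could be checked automatically, for instance in a spec. The sketch below parses a sample config shaped like the proposed `config/sidekiq.yml` and verifies that every `*_write` queue outweighs every `*_read` queue. `writes_outrank_reads?` is a hypothetical helper and the sample weights are illustrative.

```ruby
require "yaml"

# Sample config mirroring the proposed config/sidekiq.yml; weights are illustrative.
SAMPLE_CONFIG = <<~YAML
  :queues:
    - [gitlab_write, 9]
    - [zuora_callback, 9]
    - [zuora_write, 8]
    - [salesforce_write, 8]
    - [zuora_read, 5]
    - [salesforce_read, 5]
    - [gitlab_read, 5]
YAML

# Hypothetical check: every *_write queue must outweigh every *_read queue.
def writes_outrank_reads?(yaml_text)
  config = YAML.safe_load(yaml_text, permitted_classes: [Symbol])
  queues = config[:queues] || config.fetch("queues")
  write_weights = queues.select { |name, _| name.end_with?("_write") }.map(&:last)
  read_weights  = queues.select { |name, _| name.end_with?("_read") }.map(&:last)
  # Without both kinds of queue there is nothing meaningful to compare.
  return false if write_weights.empty? || read_weights.empty?

  write_weights.min > read_weights.max
end

puts writes_outrank_reads?(SAMPLE_CONFIG)
```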
## Related
- Parent epic: gitlab-org&19587
- Related: #14268 (weight granularity)
- Related: #14270 (user-facing vs internal)