Fix verification -> secret token migration in Group HTTP Destinations
What does this MR do and why?
- The previous migration (https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/background_migration/fix_incomplete_external_audit_destinations.rb) assumed that verification_token was encrypted, which it is not: has_secure_token simply generates a string. The model itself does not call attr_encrypted, unlike the instance_model.
This MR came out of testing another change and noticing that the frontend was not loading some destinations when the feature flag use_consolidated_audit_event_stream_dest_api was enabled.
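The point about has_secure_token can be illustrated with a small plain-Ruby sketch. It is not the Rails implementation: `generate_plain_token` is a hypothetical helper, and `SecureRandom.alphanumeric` stands in for the generator Rails uses. It only shows that such a token is an ordinary 24-character string with no encryption step involved.

```ruby
require 'securerandom'

# Hypothetical stand-in for a has_secure_token-style generator.
# Assumption: the token is 24 characters long, matching the sample
# token used in the validation script of this MR.
def generate_plain_token(length = 24)
  SecureRandom.alphanumeric(length)
end

token = generate_plain_token
puts "token:     #{token}"
puts "bytes:     #{token.bytesize}"
puts "printable: #{token.match?(/\A[A-Za-z0-9]+\z/)}"
```

Because the value is plain text, copying it into a column that expects ciphertext yields data that cannot be decrypted.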
Root Cause
The original migration from legacy to streaming destinations had a bug where plaintext tokens (24 bytes) were stored directly in the encrypted_secret_token field instead of being properly encrypted (~40 bytes). This happened because:
- Legacy destinations store verification_token as plaintext
- The migration incorrectly handled the encryption process
- Plaintext data ended up in fields expecting encrypted data
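To see where the 24-byte and ~40-byte figures can come from, here is a minimal sketch using Ruby's OpenSSL bindings directly. It assumes the encrypted column holds the AES-256-GCM ciphertext followed by a 16-byte authentication tag; that layout is an assumption about how the encrypted attribute is stored, and `encrypt_with_tag` is a hypothetical helper, not code from the migration.

```ruby
require 'openssl'
require 'securerandom'

TAG_BYTES = 16 # GCM authentication tag length (assumed layout: ciphertext + tag)

# Encrypts plaintext with AES-256-GCM and appends the authentication tag.
def encrypt_with_tag(plaintext, key, iv)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_data = ''
  ciphertext = cipher.update(plaintext) + cipher.final
  ciphertext + cipher.auth_tag(TAG_BYTES)
end

key = SecureRandom.random_bytes(32) # throwaway key, for illustration only
iv = SecureRandom.random_bytes(12)
token = 'JVRRHZVhaLFiiqnijh5yFg2F'   # sample token from the validation script

encrypted = encrypt_with_tag(token, key, iv)
puts "plaintext bytes: #{token.bytesize}"
puts "encrypted bytes: #{encrypted.bytesize}"
```

GCM does not pad, so the ciphertext is as long as the plaintext, and the tag adds 16 bytes. A 24-byte value in encrypted_secret_token is therefore a strong hint that the plaintext was stored as-is.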
The Fix
This migration:
- Detects corrupted destinations by attempting decryption
- Retrieves the original plaintext token from legacy destinations
- Properly encrypts tokens using a temporary model instance
- Updates the encrypted columns directly to avoid validation conflicts
- Generates new tokens when legacy data is missing
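The steps above can be sketched in plain Ruby without Rails. This is an illustration under stated assumptions, not the migration itself: rows are modelled as hashes, `gcm_encrypt`, `gcm_decrypt`, `corrupted?` and `repair!` are hypothetical helpers, the key is a throwaway value, and the stored layout is assumed to be ciphertext followed by a 16-byte GCM tag. The real migration encrypts through a temporary model instance and writes with update_columns.

```ruby
require 'openssl'
require 'securerandom'

TAG_LEN = 16 # assumed GCM tag length appended to the ciphertext
IV_LEN = 12  # IV length used for AES-256-GCM in this sketch

# Returns [ciphertext + tag, iv] for the given plaintext.
def gcm_encrypt(plaintext, key)
  iv = SecureRandom.random_bytes(IV_LEN)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_data = ''
  ciphertext = cipher.update(plaintext) + cipher.final
  [ciphertext + cipher.auth_tag(TAG_LEN), iv]
end

# Raises OpenSSL::Cipher::CipherError when the data is not valid ciphertext.
def gcm_decrypt(data, iv, key)
  raise OpenSSL::Cipher::CipherError, 'value too short' if data.bytesize <= TAG_LEN

  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = data.byteslice(-TAG_LEN, TAG_LEN)
  cipher.auth_data = ''
  cipher.update(data.byteslice(0, data.bytesize - TAG_LEN)) + cipher.final
end

# Step 1: detect corruption by attempting decryption.
def corrupted?(row, key)
  gcm_decrypt(row[:encrypted_secret_token], row[:encrypted_secret_token_iv], key)
  false
rescue OpenSSL::Cipher::CipherError
  true
end

# Steps 2-5: reuse the legacy token when present, otherwise generate a new one,
# then overwrite both encrypted columns together.
def repair!(row, legacy_token, key)
  return row unless corrupted?(row, key)

  token = legacy_token || SecureRandom.alphanumeric(24)
  row[:encrypted_secret_token], row[:encrypted_secret_token_iv] = gcm_encrypt(token, key)
  row
end

key = SecureRandom.random_bytes(32)
legacy_token = 'JVRRHZVhaLFiiqnijh5yFg2F'
row = {
  encrypted_secret_token: legacy_token.b, # plaintext stored where ciphertext belongs
  encrypted_secret_token_iv: SecureRandom.random_bytes(IV_LEN)
}

puts "corrupted before repair: #{corrupted?(row, key)}"
repair!(row, legacy_token, key)
puts "corrupted after repair:  #{corrupted?(row, key)}"
```

Writing the token and the IV together matters: a fresh IV is generated for each encryption, so updating only one of the two columns would leave the row undecryptable.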
References
Screenshots or screen recordings
Queries:
- Identification of bad rows: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/40990/commands/125989
How to set up and validate locally
This script mimics both the original (buggy) migration and the migration in this MR, then runs their methods in a Rails console:
ActiveRecord::Base.logger = nil
puts "="*80
puts "DEMO: HTTP Streaming Destinations Migration Bug Fix"
puts "Problem: 313/324 HTTP destinations are corrupted and cause GraphQL errors"
puts "="*80
puts "\n🧹 Cleaning up existing test data..."
AuditEvents::Group::ExternalStreamingDestination.where(name: ['Demo Broken Migration', 'Demo Production Corruption']).destroy_all
AuditEvents::ExternalAuditEventDestination.where(name: ['Demo Legacy Destination']).destroy_all
puts "\n" + "="*80
puts "STEP 1: DEMONSTRATING THE ORIGINAL MIGRATION BUG"
puts "="*80
module BuggyMigration
  class GroupStreamingDestination < ApplicationRecord
    include Gitlab::EncryptedAttribute
    self.table_name = 'audit_events_group_external_streaming_destinations'
    attr_accessor :secret_token
    attr_encrypted :secret_token,
      mode: :per_attribute_iv, key: :db_key_base_32, algorithm: 'aes-256-gcm',
      encode: false, encode_iv: false
    belongs_to :group, class_name: '::Group', optional: true
  end
end
puts "\n📦 Creating destination with buggy migration..."
dest = AuditEvents::Group::ExternalStreamingDestination.create!(
  name: "Demo Broken Migration",
  category: :http,
  config: {
    'url' => 'https://example.com/demo',
    'headers' => { 'Demo-Header' => { 'value' => 'demo', 'active' => true } }
  },
  group_id: Group.first.id
)
buggy_dest = BuggyMigration::GroupStreamingDestination.find(dest.id)
buggy_dest.secret_token = "OriginalToken123"
buggy_dest.save!
puts "✅ Buggy migration completed - token set to: OriginalToken123"
puts "\n🔍 Testing if normal model can read the 'migrated' data..."
normal_dest = AuditEvents::Group::ExternalStreamingDestination.find(dest.id)
begin
  decrypted = normal_dest.secret_token
  puts "✅ SUCCESS: Normal model can read token: #{decrypted}"
  puts " (This means the bug didn't reproduce in local environment)"
rescue => e
  puts "❌ FAILED: Normal model cannot read token: #{e.class} - #{e.message}"
  puts " (This would be the production issue)"
end
puts "\n" + "="*80
puts "STEP 2: SIMULATING ACTUAL PRODUCTION CORRUPTION"
puts "="*80
puts "\n📦 Creating destination with actual production corruption pattern..."
legacy_dest = AuditEvents::ExternalAuditEventDestination.create!(
  name: "Demo Legacy Destination",
  namespace_id: Group.first.id,
  destination_url: "https://example.com/legacy",
  verification_token: "JVRRHZVhaLFiiqnijh5yFg2F"
)
corrupted_dest = AuditEvents::Group::ExternalStreamingDestination.create!(
  name: "Demo Production Corruption",
  category: :http,
  config: {
    'url' => 'https://example.com/corrupted',
    'headers' => { 'Demo-Header' => { 'value' => 'demo', 'active' => true } }
  },
  group_id: Group.first.id,
  legacy_destination_ref: legacy_dest.id
)
production_encrypted_bytes = [190, 47, 109, 60, 24, 226, 1, 143, 37, 253, 140, 13, 253, 6, 242, 54, 224, 254, 5, 70, 24, 49, 215, 132]
production_iv_bytes = [63, 189, 96, 66, 199, 84, 253, 210, 53, 70, 109, 37]
corrupted_dest.update_columns(
  encrypted_secret_token: production_encrypted_bytes.pack('C*'),
  encrypted_secret_token_iv: production_iv_bytes.pack('C*')
)
puts "✅ Applied actual production corruption pattern"
puts " - Encrypted token length: #{corrupted_dest.encrypted_secret_token.length} bytes (should be ~40)"
puts " - This simulates the 313 broken destinations in production"
puts "\n🔍 Testing the corrupted destination..."
begin
  corrupted_dest.secret_token
  puts "❌ UNEXPECTED: Can decrypt corrupted data"
rescue => e
  puts "✅ CONFIRMED: Cannot decrypt corrupted data: #{e.class}"
  puts " This is exactly what's happening in production!"
end
puts "\n" + "="*80
puts "STEP 3: DEMONSTRATING THE MIGRATION FIX"
puts "="*80
puts "\n🔧 Applying the migration fix..."
original_token = legacy_dest.verification_token
puts "📋 Retrieved original legacy token: #{original_token}"
begin
  ApplicationRecord.transaction do
    temp_dest = AuditEvents::Group::ExternalStreamingDestination.new
    temp_dest.secret_token = original_token
    properly_encrypted_token = temp_dest.encrypted_secret_token
    properly_encrypted_iv = temp_dest.encrypted_secret_token_iv
    puts "🔐 Generated proper encryption:"
    puts " - New token length: #{properly_encrypted_token.length} bytes (was #{production_encrypted_bytes.length})"
    puts " - New IV length: #{properly_encrypted_iv.length} bytes"
    corrupted_dest.update_columns(
      encrypted_secret_token: properly_encrypted_token,
      encrypted_secret_token_iv: properly_encrypted_iv
    )
    puts "✅ Migration fix applied successfully!"
  end
rescue => e
  puts "❌ Migration fix failed: #{e.class} - #{e.message}"
end
puts "\n" + "="*80
puts "STEP 4: VERIFYING THE FIX"
puts "="*80
puts "\n🧪 Testing the fixed destination..."
begin
  fixed_token = corrupted_dest.reload.secret_token
  puts "✅ SUCCESS: Can now decrypt token: #{fixed_token}"
  puts "✅ Token matches original: #{fixed_token == original_token}"
  puts "\n📊 Before vs After:"
  puts " - Before: 24-byte corrupted data → OpenSSL::Cipher::CipherError"
  puts " - After: #{corrupted_dest.encrypted_secret_token.length}-byte proper encryption → ✅ Works!"
rescue => e
  puts "❌ Still cannot decrypt: #{e.class} - #{e.message}"
end
puts "\n" + "="*80
puts "SUMMARY"
puts "="*80
puts "\n🎯 What this migration fixes:"
puts " • 313 corrupted HTTP group streaming destinations"
puts " • OpenSSL::Cipher::CipherError in GraphQL API"
puts " • Broken audit event streaming functionality"
puts "\n⚙️ How the fix works:"
puts " 1. Detects destinations with corrupted 24-byte encrypted data"
puts " 2. Retrieves original tokens from legacy destinations"
puts " 3. Generates proper ~40-byte encrypted data"
puts " 4. Replaces corrupted fields with proper encryption"
puts " 5. Verifies each fix works correctly"
puts "\n📈 Expected impact:"
puts " • Before: 313/324 HTTP destinations broken"
puts " • After: 0/324 HTTP destinations broken"
puts " • GraphQL audit_events queries will work without errors"
puts "\n🧹 Cleanup:"
AuditEvents::Group::ExternalStreamingDestination.where(id: [dest.id, corrupted_dest.id]).destroy_all
AuditEvents::ExternalAuditEventDestination.where(id: legacy_dest.id).destroy_all
puts "\n" + "="*80
puts "✅ DEMO COMPLETE - Ready to fix production!"
puts "="*80
nil
Another small example: copy and paste the GroupStreamingDestination model definition into the Rails console.
Then re-encrypt the token of a destination whose stored token is corrupted:
original_token = SecureRandom.base64(18) # Or the broken token
temp_destination = GroupStreamingDestination.new
temp_destination.secret_token = original_token
properly_encrypted_token = temp_destination.encrypted_secret_token
properly_encrypted_iv = temp_destination.encrypted_secret_token_iv
destination = AuditEvents::Group::ExternalStreamingDestination.last # Or wherever you got the broken token from
destination.update_columns(encrypted_secret_token: properly_encrypted_token, encrypted_secret_token_iv: properly_encrypted_iv)
decrypted_token = destination.reload.secret_token # Should not error anymore