Fixing Gitlab/JsonSafeParse violations
What does this MR do and why?
Fixes Gitlab/JsonSafeParse on ee/app/services/ai/duo_workflows/start_workflow_service.rb .
Related to #586119
Data source identification
- workflow_metadata comes from
Gitlab::DuoWorkflow::Client.metadata()+ modelmetada fromGitlab::DuoWorkflow::DuoWorkflowService::Client.agent_platform_model_metadata() - All data comes from internal services with no user input
Example of data struture:
{
"extended_logging": true,
"is_team_member": false,
"modelMetadata": {
"provider": "ai-provider",
"name": "model-name",
"endpoint": "http://localhost:11434/v1",
"api_key": "service_token_xyz",
"identifier": "provider/some-model"
}
}
Giving it is internal data, can safe_parse be used?
Giving it is internal data and safe_parse is not actually needed, what would be the impact of using it, should parse be used instead?
- Data doesn't exceed the the
safe_parselimits:- Typical data size: 150-300 bytes, nesting dept ~3
- Profiling on next section
Performance Profiling
Comprehensive benchmark comparing parse() vs safe_parse() with 10,000 iterations:
| Test Case | Data Size | Parse Time | Safe Parse Time | Overhead |
|---|---|---|---|---|
| Small (basic metadata) | 48 bytes | 0.0025ms | 0.0081ms | 324% |
| Medium (typical) | 720 bytes | 0.0025ms | 0.0084ms | 236% |
| Large (full context) | 15,971 bytes | 0.0235ms | 0.0605ms | 157% |
Production Load Scenarios
At 10,000 requests/second (high-traffic scenario):
| Scenario | Parse Total | Safe Parse Total | Difference |
|---|---|---|---|
| Small (48B) | 0.025ms | 0.081ms | +0.056ms ✓ |
| Medium (720B) | 0.025ms | 0.084ms | +0.059ms ✓ |
| Large (15.9KB) | 0.235ms | 0.605ms | +0.370ms ✓ |
Source: benchmark.rb
#!/usr/bin/env ruby
# frozen_string_literal: true
# Condensed Performance Benchmark: parse vs safe_parse for StartWorkflowService
#
# This script compares the performance impact of switching from Gitlab::Json.parse()
# to Gitlab::Json.safe_parse() in StartWorkflowService#workflow_metadata
#
# Usage: bundle exec ruby benchmark_json_parse_compact.rb
require 'benchmark'
require 'json'
require_relative 'config/environment'
# Test data representing typical workflow_metadata sizes
TEST_DATA = {
small: {
extended_logging: true,
is_team_member: false
}.to_json,
medium: {
extended_logging: true,
is_team_member: false,
modelMetadata: {
provider: 'ai-provider',
name: 'model-name',
endpoint: 'http://localhost:11434/v1',
api_key: 'token' * 100,
identifier: 'provider/some-model'
}.to_json
}.to_json,
large: {
extended_logging: true,
is_team_member: false,
modelMetadata: {
provider: 'ai-provider',
name: 'model-name',
endpoint: 'http://localhost:11434/v1',
api_key: 'token' * 100,
identifier: 'provider/some-model',
additional_context: { data: Array.new(50) { 'context_value' } }
}.to_json,
additional_context: Array.new(100) { |i| { category: "cat_#{i}", content: 'x' * 100 } }.to_json
}.to_json
}
ITERATIONS = 10_000
puts "=" * 90
puts "Performance Benchmark: Gitlab::Json.parse vs Gitlab::Json.safe_parse"
puts "For: StartWorkflowService#workflow_metadata"
puts "=" * 90
puts
results = {}
TEST_DATA.each do |name, data|
puts "\n#{name.to_s.capitalize} (#{data.bytesize} bytes, #{ITERATIONS} iterations)"
puts "-" * 90
Benchmark.bm(35) do |x|
parse_time = x.report("parse()") do
ITERATIONS.times { Gitlab::Json.parse(data) }
end
safe_parse_time = x.report("safe_parse()") do
ITERATIONS.times { Gitlab::Json.safe_parse(data) }
end
overhead_ms = (safe_parse_time.real - parse_time.real) / ITERATIONS * 1000
overhead_pct = ((safe_parse_time.real - parse_time.real) / parse_time.real) * 100
results[name] = {
parse_ms: parse_time.real / ITERATIONS * 1000,
safe_parse_ms: safe_parse_time.real / ITERATIONS * 1000,
overhead_ms: overhead_ms,
overhead_pct: overhead_pct
}
end
end
Conclusion
Using safe_parse adds some overhead (~0.059ms), but considering this operation is done only once per workflow I believe it is acceptable to be used in this case, unless the responsible disagrees.
References
- Contributor guidelines were added on https://gitlab.com/gitlab-org/gitlab/-/blob/master/.rubocop_todo/gitlab/json_safe_parse.yml.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.