Avoid Races Stale Data in job webhooks
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
Problem
We have seen some reports of missed webhook job statues. #330499 (closed)
The current architecture runs a worker after commit with the pre-serialized data. But given that the data can change rapidly it's possible that the job status is changed by the time it it serialized. Note: given the build_data is memoized it could also be stale if other parts of the code are not using fresh instances.
after_create unless: :importing? do |build|
run_after_commit { build.execute_hooks }
end
after_transition pending: :running do |build|
build.run_after_commit do
build.ensure_persistent_ref
build.execute_hooks
end
end
def execute_hooks
return unless project
return if user&.blocked?
return unless project.has_active_hooks?(:job_hooks) || project.has_active_integrations?(:job_hooks)
Ci::ExecuteBuildHooksWorker.perform_async(project.id, build_data) # build_data is the serialized data
end
BuildFinishedWorker serializes this very late, but that should be ok because a retry would create a new job.id
Proposal
Refactor each state machine transition to capture the status it is transitioning too.
after_create unless: :importing? do |build|
status = 'created'
build.run_after_commit do
execute_hooks(status)
end
end
after_transition pending: :running do |build|
status = 'running'
build.run_after_commit do
build.ensure_persistent_ref
build.execute_hooks(status)
end
end
Edited by Allison Browne (PTO 12/19-1/2)