Resolve user_details duplicate entry errors on registration

Summary

Saving user.user_details.onboarding_status fails intermittently, averaging 15 failures per day, as seen by this log search. The failure occurs when we try to save an update to the user_details table here.

Because this is not a hard failure, we believe the cause is duplicate user.user_details records being built by this lazy loading user_detail override. When searching for what could cause more than one record to be built in the same request cycle, we found that Users::RefreshAuthorizedProjectsService makes an update on this line, setting user.user_details.project_authorizations_recalculated_at, when there are projects that need to be authorized for a user.

Since the worker runs in the background, it is conceivable that we are hitting a race condition in which our update to user.user_details.onboarding_status collides with the Users::RefreshAuthorizedProjectsService update to the same table, especially when a user is invited to a project and then registers. Both are likely building a new user.user_details record and then attempting to save it to the database. Only one wins; the other raises the exception and fails. Users::RefreshAuthorizedProjectsService will likely retry and succeed because it is called from a background worker with retries. The user.user_details.onboarding_status update, however, will not be retried.
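The suspected interleaving can be sketched in plain Ruby without Rails. This is only an illustration of the theory, not GitLab code: FakeUserDetailsTable, Writer, DuplicateEntryError, and simulate_race are hypothetical names, and the hash-backed table merely imitates a unique constraint on user_id. Both writers check for an existing row before either one saves, so both decide to insert.

```ruby
# Hypothetical in-memory stand-in for the user_details table; not GitLab code.
class DuplicateEntryError < StandardError; end

class FakeUserDetailsTable
  def initialize
    @rows = {}
  end

  def exists?(user_id)
    @rows.key?(user_id)
  end

  # Imitates a unique index on user_id: a second insert for the same user fails.
  def insert(user_id, attrs)
    raise DuplicateEntryError, "duplicate user_details for user #{user_id}" if @rows.key?(user_id)

    @rows[user_id] = attrs.dup
  end

  def update(user_id, attrs)
    @rows.fetch(user_id).merge!(attrs)
  end
end

# Each writer decides between insert and update at load time, which is
# roughly what a lazily built record does (an assumption for illustration).
class Writer
  def initialize(table, user_id)
    @table = table
    @user_id = user_id
    @new_record = nil
  end

  def load_details
    @new_record = !@table.exists?(@user_id)
  end

  def save(attrs)
    if @new_record
      @table.insert(@user_id, attrs)
    else
      @table.update(@user_id, attrs)
    end
  end
end

def simulate_race
  table = FakeUserDetailsTable.new
  registration = Writer.new(table, 1) # stands in for the onboarding_status update
  worker = Writer.new(table, 1)       # stands in for the background service

  # Both load before either saves, so both see "no row yet".
  registration.load_details
  worker.load_details

  worker.save(project_authorizations_recalculated_at: Time.now)

  begin
    registration.save(onboarding_status: { step: 1 })
    :saved
  rescue DuplicateEntryError
    :duplicate_entry
  end
end
```

In this sketch the worker's insert wins and the registration save raises, which matches the behavior described above where the side without a retry is the one that fails.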

We have further validated this theory by looking at logs:

Screenshot_2024-05-14_at_1.42.43_PM

Screenshot_2024-05-14_at_1.41.51_PM

Screenshot_2024-05-14_at_1.33.07_PM

Screenshot_2024-05-14_at_1.35.37_PM

For more details, see the troubleshooting thread: https://gitlab.com/gitlab-org/gitlab/-/issues/454680#note_1903095776

Steps to reproduce

Note: this is a transient error, so it is unlikely to reproduce reliably.

  1. Invite someone by email who doesn't yet have a GitLab.com user account.
  2. The invited user registers on GitLab.com with that email.
  3. An error shows in the Kibana logs, and the user may be unable to progress past an onboarding screen or may have to re-submit many times.

What is the current bug behavior?

The database raises a uniqueness error, and the user has a degraded onboarding experience as described above.

What is the expected correct behavior?

No database errors occur, and the user has the expected onboarding experience and is able to finish it.

Relevant logs and/or screenshots

https://log.gprd.gitlab.net/app/r/s/ADBqr

Possible fixes/Plan

  • Initialize a user.user_details record on user creation so that it exists and never has to be lazily built by this method - !152995 (merged)
  • If the above is successful, open follow-up issues to explore adding this to every user build.
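A minimal sketch of why creating the record up front could avoid the collision, under the assumption that both writers then find an existing row and update it rather than insert. It is not the code from the merge request: DetailsStore, DuplicateEntryError, run_interleaved, and the with_details flag are hypothetical names, and the hash-backed store only imitates a unique constraint on user_id.

```ruby
# Hypothetical in-memory model; names are illustrative, not GitLab code.
class DuplicateEntryError < StandardError; end

# Stand-in for user_details with a unique index on user_id.
class DetailsStore
  def initialize
    @rows = {}
  end

  # with_details: true models creating the details row together with the user.
  def create_user(user_id, with_details:)
    @rows[user_id] = {} if with_details
  end

  # Returns a handle that remembers whether the row existed at load time.
  def load(user_id)
    { user_id: user_id, new_record: !@rows.key?(user_id) }
  end

  def save(handle, attrs)
    id = handle[:user_id]
    if handle[:new_record]
      raise DuplicateEntryError, "duplicate entry for user #{id}" if @rows.key?(id)

      @rows[id] = attrs.dup
    else
      @rows[id].merge!(attrs)
    end
  end

  def row(user_id)
    @rows[user_id]
  end
end

# Runs the same interleaving in both modes: both writers load, then both save.
def run_interleaved(with_details:)
  store = DetailsStore.new
  store.create_user(1, with_details: with_details)

  registration = store.load(1)
  worker = store.load(1)

  store.save(worker, project_authorizations_recalculated_at: "t1")
  store.save(registration, onboarding_status: "step_1")
  store.row(1)
rescue DuplicateEntryError
  :duplicate_entry
end
```

Calling run_interleaved(with_details: false) ends in the duplicate entry error, while run_interleaved(with_details: true) lets both updates land on the same row. The sketch does not model column-level write conflicts or transactions, which a real fix would still need to consider.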

See other attempts to fix this in #333245 (closed) that were rolled back.

Edited by Doug Stull