
Fix 500 errors during ActionCable connect

Manoj M J requested to merge mmj-actioncable-500-fix into master

What does this MR do and why?

Background

This MR fixes the 500 errors showing up in Sentry: Link

There are 35k events in the last 24 hours, and 318k events in the last 30 days, which is a lot.

Screenshot_2024-05-08_at_1.13.44_PM

I came across this error while going through the Kibana logs for a different, specific 500 error. It was extremely hard to sift through the 500 logs because they are swamped by this one error coming from ActionCable. Link

"exception.backtrace": [
        "lib/gitlab/auth/auth_finders.rb:267:in `find_oauth_access_token'",
        "ee/lib/ee/gitlab/auth/auth_finders.rb:37:in `find_oauth_access_token'",
        "lib/gitlab/auth/auth_finders.rb:242:in `block in access_token'",
        "gems/gitlab-utils/lib/gitlab/utils/strong_memoize.rb:34:in `strong_memoize'",
        "lib/gitlab/auth/auth_finders.rb:228:in `access_token'",
        "lib/gitlab/auth/auth_finders.rb:129:in `find_user_from_access_token'",
        "ee/lib/ee/gitlab/auth/auth_finders.rb:42:in `find_user_from_access_token'",
        "lib/gitlab/auth/auth_finders.rb:74:in `find_user_from_bearer_token'",
        "app/channels/application_cable/connection.rb:13:in `connect'"
        ....
]

Context

  • In c0057a9d we introduced a way for ActionCable connections to authenticate via access tokens/OAuth tokens.
      def find_oauth_access_token
        token = parsed_oauth_token
        return unless token

        # PATs with OAuth headers are not handled by OauthAccessToken
        return if matches_personal_access_token_length?(token)

        # Expiration, revocation and scopes are verified in `validate_access_token!`
        oauth_token = OauthAccessToken.by_token(token)
        raise UnauthorizedError unless oauth_token

        oauth_token.revoke_previous_refresh_token!
        oauth_token
      end

Above is the code for finding and verifying the OAuth access token present in the request.

It turns out that any string passed in a well-formed header of the form 'Authorization' => "Bearer some_token" goes through this path. This means that even if I pass an arbitrary string, say "my_cat", as

'Authorization' => "Bearer my_cat"

it runs through the find_oauth_access_token method, and finally

oauth_token = OauthAccessToken.by_token(token)
raise UnauthorizedError unless oauth_token

will be executed, and since my_cat does not exist in the OauthAccessToken table, it raises UnauthorizedError, which is the cause of this giant stream of 500s over the last 5 months.
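To make the failure mode concrete, here is a minimal standalone sketch of that path. It is not GitLab's code: `AuthSketch`, `FakeTokenStore` and the token values are hypothetical stand-ins, and the PAT length check and refresh token revocation from the quoted method are left out. Only the control flow (return nil without a header, raise on an unknown token) mirrors the method above.

```ruby
# Hypothetical stand-ins for illustration; not GitLab's actual classes.
module AuthSketch
  class UnauthorizedError < StandardError; end

  # Plays the role of OauthAccessToken.by_token: nil for unknown tokens.
  class FakeTokenStore
    KNOWN = { "valid-oauth-token" => :oauth_token_record }.freeze

    def self.by_token(token)
      KNOWN[token]
    end
  end

  # Pulls the token out of an "Authorization: Bearer <token>" header value.
  def self.parsed_oauth_token(headers)
    value = headers["Authorization"]
    return unless value&.start_with?("Bearer ")

    value.delete_prefix("Bearer ")
  end

  # Same shape as the quoted method: a missing header returns nil,
  # but any unknown bearer string raises instead of returning nil.
  def self.find_oauth_access_token(headers)
    token = parsed_oauth_token(headers)
    return unless token

    oauth_token = FakeTokenStore.by_token(token)
    raise UnauthorizedError unless oauth_token

    oauth_token
  end
end
```

In this sketch a request without the header is handled quietly, while `"Bearer my_cat"` raises, and it is that exception which shows up as a 500 when nothing further up the stack handles it.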

This should ideally have been affecting our error budgets, but since this is on the core ApplicationCable::Connection controller, it is not attributed to any specific team using meta.feature_category.

This 500 seems to be produced every time ActionCable tries to connect, so the volume of errors is huge and it makes searching the logs for other legitimate errors difficult, which is why I decided to fix this.

Fix

The fix is simple: we rescue the UnauthorizedError that is raised, so that a 500 error is not registered.
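The shape of such a rescue can be sketched without Rails as follows. This is an illustration under assumptions, not the diff in this MR: `RescueSketch`, its `Connection` class, `reject_connection` and the token values are all hypothetical, and the finder is a stub that raises for any token it does not recognise.

```ruby
# Hypothetical, Rails-free model of a connection; all names are illustrative.
module RescueSketch
  class UnauthorizedError < StandardError; end

  class Connection
    attr_reader :current_user, :rejected

    def initialize(bearer_token: nil, session_user: nil)
      @bearer_token = bearer_token
      @session_user = session_user
      @current_user = nil
      @rejected = false
    end

    # Try the bearer token first, then fall back to the session user.
    def connect
      @current_user = find_user_from_bearer_token || @session_user
    rescue UnauthorizedError
      # A bad token becomes a rejected connection, not an unhandled exception.
      reject_connection
    end

    private

    # Stub finder: nil without a token, raises for an unrecognised one.
    def find_user_from_bearer_token
      return unless @bearer_token
      raise UnauthorizedError unless @bearer_token == "valid-oauth-token"

      :token_user
    end

    def reject_connection
      @rejected = true
    end
  end
end
```

In a real ActionCable connection the rescue branch would presumably do something like call `reject_unauthorized_connection`; whether this MR does exactly that is not shown in the description.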

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Manoj M J
