
Sporadic Bad Gateway 502 user-specific errors in global code search

Summary

While searching gitlab-org/gitlab on gitlab.com, I came across 500 and 502 errors that persisted across multiple requests, refreshes, and search attempts with the same user, but could not be reproduced with other users. After a brief discussion with @changzhengliu and @terrichu, we decided to create this issue for future reference.

https://gitlab.com/search?search=%22web_url%22+%22namespace%22&nav_source=navbar&project_id=2009901&group_id=9970&search_code=true&repository_ref=master
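
For readability, the query string of the URL above can be decoded with Ruby's standard `uri` library. This is only a convenience sketch for reading the parameters; it does not contact gitlab.com.

```ruby
require "uri"

# URL copied verbatim from the report.
url = "https://gitlab.com/search?search=%22web_url%22+%22namespace%22" \
      "&nav_source=navbar&project_id=2009901&group_id=9970" \
      "&search_code=true&repository_ref=master"

# decode_www_form turns "+" into a space and "%22" into a double quote.
params = URI.decode_www_form(URI.parse(url).query).to_h

params.each { |key, value| puts "#{key}: #{value}" }
```

The decoded `search` parameter is the two quoted terms `"web_url" "namespace"`, scoped to project `2009901` in group `9970` on `master`.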

Steps to reproduce

TBD

What is the current bug behavior?

Application

At the application level, this surfaces as a generic HTTP 500 error page.


Logs

At the Rails and logging level, however, we can reliably observe a 502 error followed by a 500 error with a Zoekt stack trace.


502 Bad Gateway

<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n

In some cases, the HTML body of the 502 response appears to have been handed to the JSON parser, which failed on the first character:

unexpected character (after ) at line 1, column 1 [parse.c:804] in '<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n
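
The parse failure is easy to reproduce in isolation. The sketch below feeds the same nginx body to Ruby's standard `json` library, which stands in here for whatever adapter `Gitlab::Json` actually uses; the helper name `parses_as_json?` is made up for illustration, and the exact error message will differ from the one logged above.

```ruby
require "json"

# Body copied from the 502 response logged above (nginx error page).
BAD_GATEWAY_BODY = "<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n" \
                   "<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n" \
                   "<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n"

# Hypothetical helper: true if the body is valid JSON, false otherwise.
def parses_as_json?(body)
  JSON.parse(body)
  true
rescue JSON::ParserError
  false
end

puts parses_as_json?(BAD_GATEWAY_BODY) # the HTML error page is rejected
puts parses_as_json?('{"Result": {}}') # a JSON object is accepted
```

This only shows that an HTML error page can never parse as JSON; it does not confirm where in the request path the 502 body originated.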


500 Failed to open TCP connection to 10.216.8.36:443

Errno::ECONNREFUSED
Failed to open TCP connection to 10.216.8.36:443 (Connection refused - connect(2) for \"10.216.8.36\" port 443)
        "lib/gitlab/json.rb:107:in `rescue in adapter_load'",
        "lib/gitlab/json.rb:102:in `adapter_load'",
        "lib/gitlab/json.rb:28:in `parse'",
        "ee/lib/gitlab/search/zoekt/client.rb:165:in `parse_response'",
        "ee/lib/gitlab/search/zoekt/client.rb:50:in `search'",
        "ee/lib/gitlab/search/zoekt/client.rb:15:in `search'",
        "ee/lib/gitlab/zoekt/search_results.rb:152:in `zoekt_search_and_wrap'",
        "ee/lib/gitlab/zoekt/search_results.rb:118:in `search_as_found_blob'",
        "ee/lib/gitlab/zoekt/search_results.rb:96:in `block in blobs'",
        "ee/lib/gitlab/zoekt/search_results.rb:95:in `blobs'",
        "ee/lib/gitlab/zoekt/search_results.rb:37:in `blobs_count'",
        "ee/lib/gitlab/zoekt/search_results.rb:33:in `formatted_count'",
        "app/controllers/search_controller.rb:102:in `block in count'",
        "app/models/application_record.rb:73:in `block (2 levels) in with_fast_read_statement_timeout'",
        "app/models/concerns/cross_database_modification.rb:92:in `block in transaction'",
        "lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `public_send'",
        "lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `block in read_using_load_balancer'",
        "lib/gitlab/database/load_balancing/load_balancer.rb:63:in `read'",
        "lib/gitlab/database/load_balancing/connection_proxy.rb:110:in `read_using_load_balancer'",
        "lib/gitlab/database/load_balancing/connection_proxy.rb:75:in `transaction'",
        "lib/gitlab/database.rb:359:in `block in transaction'",
        "lib/gitlab/database.rb:358:in `transaction'",
        "app/models/concerns/cross_database_modification.rb:83:in `transaction'",
        "app/models/application_record.rb:70:in `block in with_fast_read_statement_timeout'",
        "lib/gitlab/database/load_balancing/session.rb:95:in `fallback_to_replicas_for_ambiguous_queries'",
        "app/models/application_record.rb:69:in `with_fast_read_statement_timeout'",
        "app/controllers/search_controller.rb:101:in `count'",
        "app/controllers/application_controller.rb:547:in `block in allow_gitaly_ref_name_caching'",
        "lib/gitlab/gitaly_client.rb:457:in `allow_ref_name_caching'",
        "app/controllers/application_controller.rb:546:in `allow_gitaly_ref_name_caching'",
        "ee/lib/gitlab/ip_address_state.rb:10:in `with'",
        "ee/app/controllers/ee/application_controller.rb:45:in `set_current_ip_address'",
        "app/controllers/application_controller.rb:498:in `set_current_admin'",
        "lib/gitlab/session.rb:11:in `with_session'",
        "app/controllers/application_controller.rb:489:in `set_session_storage'",
        "lib/gitlab/i18n.rb:114:in `with_locale'",
        "lib/gitlab/i18n.rb:120:in `with_user_locale'",
        "app/controllers/application_controller.rb:480:in `set_locale'",
        "app/controllers/application_controller.rb:473:in `set_current_context'",
        "ee/lib/omni_auth/strategies/group_saml.rb:41:in `other_phase'",
        "lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'",
        "lib/gitlab/middleware/memory_report.rb:13:in `call'",
        "lib/gitlab/middleware/speedscope.rb:13:in `call'",
        "lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'",
        "lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'",
        "lib/gitlab/etag_caching/middleware.rb:21:in `call'",
        "lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'",
        "lib/gitlab/metrics/web_transaction.rb:46:in `run'",
        "lib/gitlab/metrics/rack_middleware.rb:16:in `call'",
        "lib/gitlab/middleware/go.rb:20:in `call'",
        "lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'",
        "lib/gitlab/database/query_analyzer.rb:37:in `within'",
        "lib/gitlab/middleware/query_analyzer.rb:11:in `call'",
        "lib/gitlab/middleware/multipart.rb:173:in `call'",
        "lib/gitlab/middleware/read_only/controller.rb:50:in `call'",
        "lib/gitlab/middleware/read_only.rb:18:in `call'",
        "lib/gitlab/middleware/same_site_cookies.rb:27:in `call'",
        "lib/gitlab/middleware/path_traversal_check.rb:25:in `call'",
        "lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'",
        "lib/gitlab/middleware/basic_health_check.rb:25:in `call'",
        "lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'",
        "lib/gitlab/middleware/request_context.rb:15:in `call'",
        "lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'",
        "config/initializers/fix_local_cache_middleware.rb:11:in `call'",
        "lib/gitlab/middleware/compressed_json.rb:44:in `call'",
        "lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'",
        "lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'",
        "lib/gitlab/metrics/requests_rack_middleware.rb:79:in `call'",
        "lib/gitlab/middleware/release_env.rb:13:in `call'"
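
The trace shows the failure surfacing inside JSON parsing (`parse_response` calling `Gitlab::Json.parse`). One way to make such failures easier to diagnose would be to check the HTTP status before parsing. The following is a minimal sketch under stated assumptions, not the actual client code: `UpstreamError`, `Response`, and `parse_search_response` are hypothetical names, and the standard `json` library stands in for `Gitlab::Json`.

```ruby
require "json"

# Hypothetical error class; not part of the GitLab codebase.
class UpstreamError < StandardError
  attr_reader :status

  def initialize(status, body)
    @status = status
    # Include only a short prefix of the body to keep log lines small.
    super("upstream returned HTTP #{status}: #{body[0, 80].inspect}")
  end
end

# Hypothetical stand-in for an HTTP response: anything with #code and #body.
Response = Struct.new(:code, :body)

# Parse the body only when the status is 2xx; otherwise raise an error
# that names the status instead of a confusing JSON parse failure.
def parse_search_response(response)
  status = response.code.to_i
  unless (200..299).cover?(status)
    raise UpstreamError.new(status, response.body.to_s)
  end

  JSON.parse(response.body)
end
```

With a guard like this, a 502 from the proxy would be reported as an upstream failure with its status code rather than as `unexpected character ... at line 1, column 1`.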

What is the expected correct behavior?

The search completes without errors and returns results:

https://gitlab.com/search?search=%22web_url%22+%22namespace%22&nav_source=navbar&project_id=2009901&group_id=9970&search_code=true&repository_ref=master

Relevant logs and/or screenshots

See jdsalaro_zoekt_500s.json as exported from https://log.gprd.gitlab.net/app/discover#/?_g=h@e5dc602&_a=h@590ff3b

Possible fixes

TBD. The 502 and connection-refused errors suggest a temporary condition arising at the Kubernetes/network level.
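
If the condition does turn out to be transient, one possible mitigation is to retry the request a small number of times with exponential backoff. This is a sketch of that general technique only, not a proposal for the actual client: `with_retries`, `TRANSIENT_ERRORS`, and the `sleeper` parameter are hypothetical, and whether retrying is appropriate for search requests is an open question.

```ruby
# Errors treated as transient in this sketch (assumption: only
# connection-refused, as seen in the logs above).
TRANSIENT_ERRORS = [Errno::ECONNREFUSED].freeze

# Runs the block, retrying on transient errors with exponential backoff.
# The sleeper is injectable so callers (and tests) can avoid real sleeps.
def with_retries(attempts: 3, base_delay: 0.1, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *TRANSIENT_ERRORS
    raise if tries >= attempts # give up and re-raise the last error

    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end
```

A caller would wrap the outgoing request in `with_retries { ... }`. Note that this would not explain or fix the user-specific nature of the errors described in the summary.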

Edited by Jayson Salazar Rodriguez