web_hook_logs Data too long for column 'request_data' with MySQL
Summary
Having upgraded to 9.3.9 over the weekend we are seeing sidekiq errors when commits with a large number of changes, or a long message, are being logged to web_hook_logs.
Doesn't look like there are any code changes around this for 9.4 but haven't had a chance to test this yet.
Steps to reproduce
- Install GitLab with MySQL
- Create a repository or pick and existing one.
- Add a web hook to the repository, I have a system hook for all commits.
- Clone locally and cd to cloned directory
- git checkout -b lots-of-files
- mkdir test cd test for i in {1..10000}; do echo "Lorem ipsum dolor sit amet " > $i.txt; done
- git add *.txt git commit -m "10000 files"
- git push --set-upstream origin lots-of-files
What is the current bug behavior?
Error is logged by sidekiq when trying to execute the web hook, Mysql2::Error: Data too long for column 'request_data' at row 1: INSERT INTO
web_hook_logs``
What is the expected correct behavior?
Webhook log entry added without any errors.
Relevant logs and/or screenshots
2017-08-01_14:19:26.99394 2017-08-01T14:19:26.993Z 2427 TID-ory28lvcg WARN: {"context":"Job raised exception","job":{"class":"WebHookWorker","args":[254,{"object_kind":"push","event_name":"push","before":"bfa2643ad330441a8da5fb5cb471e981e4cec2e2","after":"ff1687c9649e0484063452808837b5338494ef96","ref":"refs/heads/lots-of-files","checkout_sha":"ff1687c9649e0484063452808837b5338494ef96","message":null,"user_id":2,"user_name":"Steve Norman","user_username":"steve.norman","user_email":"steve.norman@thomsonreuters.com","user_avatar":null,"project_id":79,"project":{"name":"automation-test","description":"","web_url":"https://redacted/steve.norman/automation-test","avatar_url":null,"git_ssh_url":"git@redacted:steve.norman/automation-test.git","git_http_url":"https://redacted/steve.norman/automation-test.git","namespace":"steve.norman","visibility_level":10,"path_with_namespace":"steve.norman/automation-test","default_branch":"master","homepage":"https://redacted/steve.norman/automation-test","url":"git@ redacted:steve.norman/automation-test.git","ssh_url":"git@ redacted:steve.norman/automation-test.git","http_url":"https://redacted/steve.norman/automation-test.git"},"commits":[{"id":"ff1687c9649e0484063452808837b5338494ef96","message":"More files, some updated\n","timestamp":"2017-07-31T17:25:19+01:00","url":"https://redacted/steve.norman/automation-test/commit/ff1687c9649e0484063452808837b5338494ef96","author":{"name":"Steve Norman","email":"steve.norman@thomsonreuters.com"},"added":["test/10000.txt","test/2001.txt","test/2002.txt","test/2003.txt","test/2004.txt","test/2005.txt","test/2006.txt","test/2007.txt","test/2008.txt","test/2009.txt","test/2010.txt","test/2011.txt","test/2012.txt","test/2013.txt","test/2014.txt","test/2015.txt","test/2016.txt","test/2017.txt","test/2018.txt","test/2019.txt","test/2020.txt","test/2021.txt","test/2022.txt","test/2023.txt","test/2024.txt","test/2025.txt","test/2026.txt","test/2027.txt","test/2028.txt","test/2029.txt","test/2030.txt","test/2031.txt","test/2032.txt","test/2033.txt","test/2034.txt","test/2035.txt","test/2036.txt","test
snipped for brevity, this resulted in the following error, again snipped:
2017-08-01_14:19:26.99822 2017-08-01T14:19:26.998Z 2427 TID-ory28lvcg WARN: ActiveRecord::StatementInvalid: Mysql2::Error: Data too long for column 'request_data' at row 1: INSERT INTO `web_hook_logs` (`web_hook_id`, `trigger`, `url`, `execution_duration`, `response_body`, `response_status`, `request_headers`, `request_data`, `response_headers`, `created_at`, `updated_at`) VALUES (254, 'system_hooks', 'http://localhost:19091/', 0.89843911, '', '200', '---\nContent-Type: application/json\nX-Gitlab-Event: System Hook\n', '--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess\nobject_kind: push\nevent_name: push\nbefore: bfa2643ad330441a8da5fb5cb471e981e4cec2e2\nafter: ff1687c9649e0484063452808837b5338494ef96\nref: refs/heads/lots-of-files\ncheckout_sha: ff1687c9649e0484063452808837b5338494ef96\nmessage: \nuser_id: 2\nuser_name: Steve Norman\nuser_username: steve.norman\nuser_email: steve.norman@thomsonreuters.com\nuser_avatar: \nproject_id: 79\nproject: !ruby/hash:ActiveSupport::HashWithIndifferentAccess\n name: automation-test\n description: \'\'\n web_url: https://redacted/steve.norman/automation-test\n avatar_url: \n git_ssh_url: git@ redacted:steve.norman/automation-test.git\n git_http_url: https://redacted/steve.norman/automation-test.git\n namespace: steve.norman\n visibility_level: 10\n path_with_namespace: steve.norman/automation-test\n default_branch: master\n homepage: https://redacted/steve.norman/automation-test\n url: git@ redacted:steve.norman/automation-test.git\n ssh_url: git@ redacted:steve.norman/automation-test.git\n http_url: https://redacted/steve.norman/automation-test.git\ncommits:\n- !ruby/hash:ActiveSupport::HashWithIndifferentAccess\n id: ff1687c9649e0484063452808837b5338494ef96\n message: \'More files, some updated\n\n\'\n timestamp: \'2017-07-31T17:25:19+01:00\'\n url: https://redacted/steve.norman/automation-test/commit/ff1687c9649e0484063452808837b5338494ef96\n author: !ruby/hash:ActiveSupport::HashWithIndifferentAccess\n name: Steve Norman\n email: steve.norman@thomsonreuters.com\n added:\n - test/10000.txt\n - test/2001.txt\n - test/2002.txt\n - test/2003.txt\n - test/2004.txt\n - test/2005.txt\n - test/2006.txt\n - test/2007.txt\n - test/2008.txt\n - test/2009.txt\n - test/2010.txt\n - test/2011.txt\n - test/2012.txt\n - test/2013.txt\n - test/2014.txt\n - test/2015.txt\n - test/2016.txt\n - test/2017.txt\n - test/2018.txt\n - test/2019.txt\n - test/2020.txt\n - test/2021.txt\n - test/2022.txt\n - test/2023.txt\n - test/2024.txt\n - test/2025.txt\n - test/2026.txt\n - test/2027.txt\n - test/2028.txt\n - test/2029.txt\n - test/2030.txt\n - test/2031.txt\n - test/2032.txt\n - test/2033.txt\n - test/2034.txt\n - test/2035.txt\n - test/2036.txt\n - test/2037.txt\n - test/2038.txt\n - test/2039.txt\n - test/2040.txt\n - test/2041
Possible fixes
I think there probably needs to be a migration for MySQL to extend the request_data
and response_body
fields to be largetext
similar to what has been done in https://gitlab.com/gitlab-org/gitlab-ce/blob/master/db/migrate/merge_request_diff_file_limits_to_mysql.rb
I do wonder if response_headers
could be large as well? In theory the response shouldn't be that large as by default there is a 10 second timeout for web hooks but I'm not sure if that timeout is only for no response and if it does get a valid response, or start of one, is the timeout ignored until the request completes?