Indexing string columns is broken on some MySQL setups
Summary
As reported by @rtripault in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14414#note_41248492
Running the migration introduced in that MR results in this error on some MySQL databases:
Mysql2::Error: Specified key was too long; max key length is 767 bytes: CREATE INDEX `index_projects_on_repository_storage` ON `projects` (`repository_storage`)
The issue is the collation used by the database. a 4-byte collation breaks indexing the column. 3 bytes (e.g., utf8) is fine. Apparently, we advised people to switch to utf8mb4 at some point in teh past.
However, other columns are affected too. In total:
-
merge_requests.merge_commit_shaabcfdd5c 2017-09-28 -
projects.repository_storage3ce81ec6 2017-09-21 -
user_custom_attributes.keye9eae3eb 2017-09-28 -
user_custom_attributes.valuee9eae3eb 2017-09-28
Steps to reproduce
Set the database collation to utf8mb4, run the migration
What is the current bug behavior?
Adding the index fails
What is the expected correct behavior?
The indexing operation should succeed.
Possible fixes
We could include a predated migration to reduce the column size so it can be indexed? 128 characters should be more than enough for a repository storage name.
We could implement https://gitlab.com/gitlab-org/gitlab-ce/issues/38238 to normalize the data, turning this column into a foreign key shard_id (in time for %10.1?)
We could specify a list of supported collations and exclude utf8mb4? https://gitlab.com/gitlab-org/gitlab-ce/blob/master/config/database.yml.mysql is quite confused about which collation to use.
/cc @yorickpeterse