Skip to content

Indexing string columns is broken on some MySQL setups

Summary

As reported by @rtripault in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14414#note_41248492

Running the migration introduced in that MR results in this error on some MySQL databases:

Mysql2::Error: Specified key was too long; max key length is 767 bytes: CREATE INDEX `index_projects_on_repository_storage` ON `projects` (`repository_storage`)

The issue is the collation used by the database. a 4-byte collation breaks indexing the column. 3 bytes (e.g., utf8) is fine. Apparently, we advised people to switch to utf8mb4 at some point in teh past.

However, other columns are affected too. In total:

  • merge_requests.merge_commit_sha abcfdd5c 2017-09-28
  • projects.repository_storage 3ce81ec6 2017-09-21
  • user_custom_attributes.key e9eae3eb 2017-09-28
  • user_custom_attributes.value e9eae3eb 2017-09-28

Steps to reproduce

Set the database collation to utf8mb4, run the migration

What is the current bug behavior?

Adding the index fails

What is the expected correct behavior?

The indexing operation should succeed.

Possible fixes

We could include a predated migration to reduce the column size so it can be indexed? 128 characters should be more than enough for a repository storage name.

We could implement https://gitlab.com/gitlab-org/gitlab-ce/issues/38238 to normalize the data, turning this column into a foreign key shard_id (in time for %10.1?)

We could specify a list of supported collations and exclude utf8mb4? https://gitlab.com/gitlab-org/gitlab-ce/blob/master/config/database.yml.mysql is quite confused about which collation to use.

/cc @yorickpeterse

Edited by Nick Thomas