MySQL: Use of utf8_bin collation prevents use of emojis and normal Unicode characters
Fields that are SAText have overrides for MySQL which set <size> COLLATE utf8_bin. But utf8 is problematic: in MySQL it is actually 3-byte UTF-8 (utf8mb3), not standard 4-byte UTF-8, which MySQL calls utf8mb4. Using utf8 means these fields can't store emojis or any other characters outside the Basic Multilingual Plane.
These fields should be switched to use the utf8mb4_bin collation, which means they'll have a utf8mb4 charset. Using utf8mb4 is already the recommended config as of e6e0a10a.
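To illustrate the limitation, here is a minimal Python sketch that checks whether a string would fit in a 3-byte-per-character column. The helper name fits_in_utf8mb3 is hypothetical and not part of this project; it only mirrors the byte-length limit described above and does not talk to MySQL.

```python
def fits_in_utf8mb3(text: str) -> bool:
    """Return True if every character needs at most 3 bytes in UTF-8.

    Hypothetical helper for illustration only: MySQL's legacy utf8
    charset stores at most 3 bytes per character, so any character
    that needs 4 bytes (such as most emojis) cannot be stored.
    """
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


# Accented Latin and common CJK characters need 2 or 3 bytes each.
print(fits_in_utf8mb3("héllo 日本語"))

# U+1F600 needs 4 bytes, so this display name would not fit.
print(fits_in_utf8mb3("Alice \U0001F600"))
```

A display name like the second one is the kind of value that fails on a utf8_bin column but would be accepted with utf8mb4_bin.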
Wikimedia's downstream task is https://phabricator.wikimedia.org/T282271, where someone had an emoji in their display name, causing the import to fail.