Automatic project slug creation should project Unicode into ASCII
Problem to solve
Automatic project slug generation could improve dramatically by adopting a "unidecode" strategy: using a specialized library to project Unicode strings into ASCII, that is, to produce a reasonable ASCII representation of a Unicode string, such as converting "ó" into "o" or "ü" into "u".
Intended users
Any user creating a new project.
Further details
- The current behavior of GitLab's automatic project slug creation seems to be projecting any non-ASCII character into a dash ("-") character.
- This is bad UX for non-English-speaking users worldwide.
- As a Spanish speaker, I became aware of this issue when I tried to name a project "Programación desde cero" and got the slug "programaci-n-desde-cero" when I was expecting something along the lines of "programacion-desde-cero".
- For non-Latin script users the situation is almost ridiculous: you type something like 演習 (Enshū), meaning "exercise" in Japanese, which could easily be projected to the slug "enshu", but you get no real suggestion at all, only "my-awesome-project" as placeholder text.
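The behavior described above can be modelled roughly as follows. This is only a sketch of the observed symptoms, not GitLab's actual implementation; the function name `naive_slug` and the exact regular expression are assumptions made for illustration.

```python
import re


def naive_slug(title: str) -> str:
    """Hypothetical model of the observed behavior (not GitLab's real code)."""
    # Lowercase, then replace every run of characters outside a-z and 0-9
    # with a single dash. Accented letters fall into that category.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Trim dashes left at either end.
    return slug.strip("-")


print(naive_slug("Programación desde cero"))  # programaci-n-desde-cero
print(repr(naive_slug("演習")))                # '' (nothing left to suggest)
```

Under this model, a title written entirely in a non-Latin script collapses to an empty string, which would explain why only the placeholder text is shown.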
Proposal
Use a server-side Ruby Unidecode module or a client-side JavaScript module to properly project Unicode text into ASCII during automatic slug creation.
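To illustrate the idea, here is a minimal sketch in Python that uses only the standard library. It is not the Ruby or JavaScript code the proposal asks for, and the function name `ascii_slug` is hypothetical. It covers only characters that decompose into an ASCII base letter plus combining marks; a real Unidecode-style library relies on transliteration tables and would be needed for non-Latin scripts.

```python
import re
import unicodedata


def ascii_slug(title: str) -> str:
    """Sketch: strip diacritics before slugifying (hypothetical helper)."""
    # NFKD splits "ó" into "o" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: anything still non-ASCII is discarded, since this sketch
    # has no transliteration table.
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


print(ascii_slug("Programación desde cero"))  # programacion-desde-cero
print(ascii_slug("Über uns"))                 # uber-uns
print(repr(ascii_slug("演習")))                # '' (needs a real transliteration library)
```

The last call shows the limit of this approach: input such as 演習 still yields nothing, and what a dedicated library returns for it depends on that library's tables.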
Permissions and Security
- Not really aware of the permissions required for this. Please, someone, fill in this information.
- Regarding consistency across different UIs like the REST API, I guess this feature only makes sense in the web-based UI or any desktop/mobile UIs available.
Documentation
Sorry, I don't feel able to fill in this section, or willing to research how to do it, at this very moment.
Testing
- I don't see a relevant risk here other than non-Latin script language users ending up with slugs that they don't like. Since the slug is just a suggestion that you can edit yourself, I think projecting to ASCII is a reasonable compromise.
- Regarding testing, this should not be particularly tricky to test at the unit, integration, system or acceptance levels.
What does success look like, and how can we measure that?
- Success metrics: among projects whose titles include non-ASCII Unicode characters, the rate that keep the slug generated automatically via ASCII projection. The higher the better.
- Acceptance criteria: Automatic slug creation uses a capable Unicode-to-ASCII projection library.
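The success metric above could be computed along these lines. This is a sketch under stated assumptions: the `ProjectRecord` fields and the function name `suggestion_retention_rate` are hypothetical, and it presumes that both the suggested slug and the final slug are recorded for each project.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ProjectRecord:
    # Hypothetical record; field names are assumptions for illustration.
    title: str
    suggested_slug: str
    final_slug: str


def suggestion_retention_rate(records: Iterable[ProjectRecord]) -> Optional[float]:
    """Share of non-ASCII-titled projects that kept the suggested slug."""
    # Only titles containing at least one non-ASCII character count.
    relevant = [r for r in records if not r.title.isascii()]
    if not relevant:
        return None  # metric is undefined without any qualifying project
    kept = sum(1 for r in relevant if r.final_slug == r.suggested_slug)
    return kept / len(relevant)


records = [
    ProjectRecord("Programación desde cero", "programacion-desde-cero", "programacion-desde-cero"),
    ProjectRecord("Über uns", "uber-uns", "about-us"),
    ProjectRecord("Plain title", "plain-title", "plain-title"),
]
print(suggestion_retention_rate(records))  # 0.5
```

Projects with ASCII-only titles are excluded from the denominator, since the proposed change does not affect them.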
Links / references
- Projecting Unicode to ASCII
- Unidecode NPM module
- Original Text::Unidecode Perl module
- Unidecode library for Ruby
- Unidecode library for Python
~feature ~UX