Update support for Unicode 15.1
Upgrade to start using Unicode 15.1 emojis. Note that while 16.0 was just released, noto-emoji doesn't yet support it. Various things start breaking without the proper fallback images, and it's better to wait until 16 is a little more prevalent.
- load the Unicode emoji-test.txt file as the main basis for emoji data
- load the gemojione data, adding as aliases
- load any additional aliases
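As a rough illustration of the first step in the list above, the sketch below parses data lines in the emoji-test.txt format and derives a shortcode from the emoji name. `LINE`, `parse_emoji_test_line`, and `alpha_code_for` are hypothetical names made up for this example and are not taken from the gem; the shortcode derivation in particular is only an assumption about how a name could map to a shortcode.

```ruby
# Hypothetical sketch; not the TanukiEmoji implementation.
# Matches data lines of emoji-test.txt, for example:
#   1F404 ; fully-qualified # 🐄 E1.0 cow
LINE = /\A(?<codepoints>[0-9A-F ]+?)\s*;\s*(?<status>[a-z-]+)\s*#\s*(?<emoji>\S+)\s+E(?<version>\d+\.\d+)\s+(?<name>.+)\z/

# Returns a hash for a data line, or nil for comments,
# group headers, and blank lines.
def parse_emoji_test_line(line)
  match = LINE.match(line.strip)
  return nil unless match

  {
    codepoints: match[:codepoints].split.map(&:downcase).join('-'),
    status: match[:status],
    emoji: match[:emoji],
    version: match[:version],
    name: match[:name]
  }
end

# Assumed derivation: lowercase the name and join its words with underscores.
def alpha_code_for(name)
  ":#{name.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_+|_+\z/, '')}:"
end

row = parse_emoji_test_line('1F42E ; fully-qualified # 🐮 E0.6 cow face')
puts alpha_code_for(row[:name]) # prints ":cow_face:"
```

Lines that do not match the pattern return `nil`, so a caller can simply skip them while reading the file.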
Closes #28 (closed)
Activity
added Category:Markdown devops::plan group::knowledge section::dev labels
assigned to @digitalmoksha
mentioned in issue #4 (closed)
Gemojione shortcodes are added as aliases to the Unicode emojis. This provides backward compatibility. However, there are some minor differences between the shortcodes of gemojione and the Unicode emojis.
For example, :cow: in gemojione is 🐮, while in Unicode it is 🐄, which is :cow2: in gemojione. Now :cow_face: will give 🐮.
I think this difference is acceptable. It's worth a minor change in emoji to better align with the Unicode CLDR names. And the emojis are not vastly different.
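The aliasing behaviour described above can be sketched in miniature. This assumes that a Unicode-derived shortcode always wins and that a gemojione shortcode is only added as an alias when it is not already taken; `Emoji` and `build_index` are hypothetical names for this illustration, not the gem's API.

```ruby
# Hypothetical miniature index; names and structure are assumptions.
Emoji = Struct.new(:char, :alpha_code, :aliases)

def build_index(unicode_emojis, legacy_shortcodes)
  by_code = {}
  by_char = {}

  # Unicode-derived data is loaded first and is authoritative.
  unicode_emojis.each do |char, alpha_code|
    emoji = Emoji.new(char, alpha_code, [])
    by_code[alpha_code] = emoji
    by_char[char] = emoji
  end

  # Legacy shortcodes become aliases unless the shortcode is already in use.
  legacy_shortcodes.each do |shortcode, char|
    emoji = by_char[char]
    next if emoji.nil?              # emoji not present in the Unicode data
    next if by_code.key?(shortcode) # Unicode-derived shortcode wins

    emoji.aliases << shortcode
    by_code[shortcode] = emoji
  end

  by_code
end

index = build_index(
  { '🐄' => ':cow:', '🐮' => ':cow_face:' },
  { ':cow:' => '🐮', ':cow2:' => '🐄' }
)

puts index[':cow:'].char      # prints 🐄
puts index[':cow2:'].char     # prints 🐄
puts index[':cow_face:'].char # prints 🐮
```

Under these assumptions the legacy `:cow:` mapping to 🐮 is dropped because `:cow:` already belongs to 🐄, which matches the `:cow:` entry in the differences listed in this comment.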
Differences generated using a temporary spec:
    it 'check alpha_codes', :aggregate_failures do
      db = File.open(TanukiEmoji::Db::Gemojione::DATA_FILE, 'r:UTF-8') do |file|
        JSON.parse(file.read, symbolize_names: true)
      end

      db.each do |emoji_name, emoji_data|
        emoji = TanukiEmoji.find_by_alpha_code(emoji_data[:shortname])

        if emoji.codepoints != emoji_data[:moji] && !emoji.codepoints_alternates.include?(emoji_data[:moji])
          puts "#{emoji.codepoints} (#{emoji.alpha_code}) VS #{emoji_data[:moji]} (#{emoji_data[:shortname]})"
          puts emoji.inspect
          puts emoji_data
          puts
        end

        # expect(emoji.codepoints).to eq emoji_data[:moji]
      end
    end
differences

📅 (:calendar:) VS 📆 (:calendar:)
#<TanukiEmoji::Character: 📅 (1f4c5) :calendar: aliases: [":date:"]>
{:unicode=>"1F4C6", :unicode_alternates=>[], :name=>"tear-off calendar", :shortname=>":calendar:", :category=>"objects", :aliases=>[], :aliases_ascii=>[], :keywords=>["schedule", "object", "office"], :moji=>"📆"}

🐪 (:camel:) VS 🐫 (:camel:)
#<TanukiEmoji::Character: 🐪 (1f42a) :camel: aliases: [":dromedary_camel:"]>
{:unicode=>"1F42B", :unicode_alternates=>[], :name=>"bactrian camel", :shortname=>":camel:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "hot", "nature", "bactrian", "camel", "hump", "desert", "central asia", "heat", "water", "hump day", "wednesday", "sex", "wildlife"], :moji=>"🐫"}

🐈 (:cat:) VS 🐱 (:cat:)
#<TanukiEmoji::Character: 🐈 (1f408) :cat: aliases: [":cat2:"]>
{:unicode=>"1F431", :unicode_alternates=>[], :name=>"cat face", :shortname=>":cat:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "meow", "halloween", "vagina", "cat"], :moji=>"🐱"}

🐄 (:cow:) VS 🐮 (:cow:)
#<TanukiEmoji::Character: 🐄 (1f404) :cow: aliases: [":cow2:"]>
{:unicode=>"1F42E", :unicode_alternates=>[], :name=>"cow face", :shortname=>":cow:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "beef", "ox"], :moji=>"🐮"}

🐕 (:dog:) VS 🐶 (:dog:)
#<TanukiEmoji::Character: 🐕 (1f415) :dog: aliases: [":dog2:"]>
{:unicode=>"1F436", :unicode_alternates=>[], :name=>"dog face", :shortname=>":dog:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "friend", "nature", "woof", "dog", "pug"], :moji=>"🐶"}

🐎 (:horse:) VS 🐴 (:horse:)
#<TanukiEmoji::Character: 🐎 (1f40e) :horse: aliases: [":racehorse:"]>
{:unicode=>"1F434", :unicode_alternates=>[], :name=>"horse face", :shortname=>":horse:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "brown", "wildlife"], :moji=>"🐴"}

🐁 (:mouse:) VS 🐭 (:mouse:)
#<TanukiEmoji::Character: 🐁 (1f401) :mouse: aliases: [":mouse2:"]>
{:unicode=>"1F42D", :unicode_alternates=>[], :name=>"mouse face", :shortname=>":mouse:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "nature"], :moji=>"🐭"}

✏️ (:pencil:) VS 📝 (:pencil:)
#<TanukiEmoji::Character: ✏️ (270f-fe0f) :pencil: aliases: [":pencil2:"]>
{:unicode=>"1F4DD", :unicode_alternates=>[], :name=>"memo", :shortname=>":pencil:", :category=>"objects", :aliases=>[":memo:"], :aliases_ascii=>[], :keywords=>["documents", "paper", "station", "write", "work", "office"], :moji=>"📝"}

🐖 (:pig:) VS 🐷 (:pig:)
#<TanukiEmoji::Character: 🐖 (1f416) :pig: aliases: [":pig2:"]>
{:unicode=>"1F437", :unicode_alternates=>[], :name=>"pig face", :shortname=>":pig:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "oink"], :moji=>"🐷"}

🐇 (:rabbit:) VS 🐰 (:rabbit:)
#<TanukiEmoji::Character: 🐇 (1f407) :rabbit: aliases: [":rabbit2:"]>
{:unicode=>"1F430", :unicode_alternates=>[], :name=>"rabbit face", :shortname=>":rabbit:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "nature", "wildlife"], :moji=>"🐰"}

🛰️ (:satellite:) VS 📡 (:satellite:)
#<TanukiEmoji::Character: 🛰️ (1f6f0-fe0f) :satellite: aliases: [":satellite_orbital:"]>
{:unicode=>"1F4E1", :unicode_alternates=>[], :name=>"satellite antenna", :shortname=>":satellite:", :category=>"objects", :aliases=>[], :aliases_ascii=>[], :keywords=>["communication", "object"], :moji=>"📡"}

☃️ (:snowman:) VS ⛄ (:snowman:)
#<TanukiEmoji::Character: ☃️ (2603-fe0f) :snowman: aliases: [":snowman2:"]>
{:unicode=>"26C4", :unicode_alternates=>["26C4-FE0F"], :name=>"snowman without snow", :shortname=>":snowman:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["christmas", "cold", "season", "weather", "winter", "xmas", "holidays", "snow"], :moji=>"⛄"}

🐅 (:tiger:) VS 🐯 (:tiger:)
#<TanukiEmoji::Character: 🐅 (1f405) :tiger: aliases: [":tiger2:"]>
{:unicode=>"1F42F", :unicode_alternates=>[], :name=>"tiger face", :shortname=>":tiger:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "wildlife", "roar", "cat"], :moji=>"🐯"}

🚆 (:train:) VS 🚋 (:train:)
#<TanukiEmoji::Character: 🚆 (1f686) :train: aliases: [":train2:"]>
{:unicode=>"1F68B", :unicode_alternates=>[], :name=>"Tram Car", :shortname=>":train:", :category=>"travel", :aliases=>[], :aliases_ascii=>[], :keywords=>["tram", "rail", "transportation", "travel", "train"], :moji=>"🚋"}

☂️ (:umbrella:) VS ☔ (:umbrella:)
#<TanukiEmoji::Character: ☂️ (2602-fe0f) :umbrella: aliases: [":umbrella2:"]>
{:unicode=>"2614", :unicode_alternates=>["2614-FE0F"], :name=>"umbrella with rain drops", :shortname=>":umbrella:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["rain", "weather", "sky", "cold"], :moji=>"☔"}

🐋 (:whale:) VS 🐳 (:whale:)
#<TanukiEmoji::Character: 🐋 (1f40b) :whale: aliases: [":whale2:"]>
{:unicode=>"1F433", :unicode_alternates=>[], :name=>"spouting whale", :shortname=>":whale:", :category=>"nature", :aliases=>[], :aliases_ascii=>[], :keywords=>["animal", "nature", "ocean", "sea", "wildlife", "tropical", "whales"], :moji=>"🐳"}
added 6 commits
- 24b3fe7d...c6d8be74 - 3 commits from branch main
- f559cacc - Add Unicode 16.0 files
- 960b9c0f - Base data off of emoji-text.txt
- 034d3451 - work in progress
4 Warnings
- This merge request is definitely too big (59224 lines changed), please split it into multiple merge requests.
- cc86c389: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines.
- 1e89c66d: The commit subject must contain at least 3 words. For more information, take a look at our Commit message guidelines.
- a24a59a6: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines.
Reviewer roulette
Changes that require review have been detected! A merge request is normally reviewed by both a reviewer and a maintainer in its primary category and by a maintainer in all other categories.
To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot. Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.
To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.
Once you've decided who will review this merge request, mention them as you normally would! Danger does not automatically notify them for you.
Reviewer: No reviewer available
Maintainer: @kerrizor (UTC-6, 1 hour behind @digitalmoksha)
If needed, you can retry the danger-review job that generated this comment.
Generated by Danger
Resolved by 🤖 GitLab Bot 🤖
Proper labels assigned to this merge request. Please ignore me.
@digitalmoksha - please see the following guidance and update this merge request.
1 Error
Please add type::bug, type::feature, or type::maintenance label to this merge request.
Edited by 🤖 GitLab Bot 🤖
added 2 commits
added feature::enhancement type::feature labels
added 2 commits
requested review from @brodock
Current version of tanuki_emoji uses gemojione's 3.3.0 index. By adding its shortcodes as aliases, we maintain backward compatibility. gemojione is no longer being maintained, and the data it's based on, emojione, has now been moved to emoji-toolkit, which has different licenses.
Their 8.0 license does say non-artwork is under the MIT license:
JoyPixels Non-Artwork
Applies to the Javascript, JSON, PHP, CSS, HTML files, and everything else not covered under the artwork license above, found in both the emoji-toolkit and emoji-assets repos.
License: MIT
Complete Legal Terms: https://opensource.org/license/mit/
However, it's better at this time not to tie ourselves to that data. So we will not be using gemojione's final index data.
added 6 commits
changed milestone to %17.5
added emoji label
added 10 commits
- d8233551 - Add support for Unicdoe 15.1 emojis
- bf5b4f85 - Update to noto-emoji v2.042 for Unicode 15.1
- 1882e952 - Use Unicode emoji-test.tx as primary data
- 91131998 - Sort by all indexd emojis rather than main emoji
- 7cacd43a - Don’t add missing gemojione emojis
- 4b76b356 - Update README
- 83284eae - Add changed aliases from 13.1 and 14.0
- 2e26e46a - Remove unicode versioning dataset
- 91df8b55 - Add gemojione spec file
- f51227e2 - Add additional specs
added 7 commits
mentioned in merge request !30 (closed)
mentioned in merge request !68 (closed)
mentioned in issue gitlab-org/gitlab#465351
- lib/tanuki_emoji/db/emoji_test_parser.rb 0 → 100644
    # frozen_string_literal: true

    require 'strscan'
    require 'date'
    require 'i18n'
    require_relative 'emoji_data'

    module TanukiEmoji
      module Db
        # Reads and extract content from emoji-test.txt
        class EmojiTestParser

      require 'strscan'
      require 'date'
    + require_relative 'emoji_data'

      module TanukiEmoji
        module Db
          # Reads and extract content from emoji-data.txt and its metadata
          class EmojiDataParser
    -       DATA_FILE = 'vendor/unicode/emoji-data.txt'

- lib/tanuki_emoji/db/emoji_test_parser.rb 0 → 100644
    # frozen_string_literal: true

    require 'strscan'
    require 'date'
    require 'i18n'
    require_relative 'emoji_data'

    module TanukiEmoji
      module Db
        # Reads and extract content from emoji-test.txt
        class EmojiTestParser
          DATA_FILE = "#{::TanukiEmoji::Db::UNICODE_DATA_DIR}/emoji-test.txt"

          # https://www.unicode.org/reports/tr51/#Versioning
          EMOJI_UNICODE_VERSION = {

                JSON.parse(file.read, symbolize_names: true)
              end

    -         db.each do |emoji_name, emoji_data|
    -           emoji = Character.new(emoji_name.to_s,
    -                                 codepoints: emoji_data[:moji],
    -                                 alpha_code: emoji_data[:shortname],
    -                                 description: emoji_data[:name],
    -                                 category: emoji_data[:category])
    +         db.each do |_emoji_name, emoji_data|

Resolved by Brett Walker
@kerrizor I was wondering if I can also get your eyes on this.
requested review from @kerrizor
added 16 commits
- cc86c389...9a793efe - 7 commits from branch main
- 3d1f30ab - Add support for Unicdoe 15.1 emojis
- 7790e61a - Update to noto-emoji v2.042 for Unicode 15.1
- ccac90b7 - Use Unicode emoji-test.tx as primary data
- eb8b836a - Sort by all indexd emojis rather than main emoji
- 40527130 - Don’t add missing gemojione emojis
- 45bb4143 - Update README
- 26b64474 - Remove unicode versioning dataset
- 78b2b8f3 - Add gemojione spec file
- 3a4dcda9 - Add additional specs
4 Warnings
- This merge request is definitely too big (59225 lines changed), please split it into multiple merge requests.
- ae84f8f7: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines.
- 45bb4143: The commit subject must contain at least 3 words. For more information, take a look at our Commit message guidelines.
- ccac90b7: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines.
Reviewer roulette
Reviewer: No reviewer available
Maintainer: @kerrizor (UTC-7, 2 hours behind @digitalmoksha)
If needed, you can retry the danger-review job that generated this comment.
Generated by Danger
mentioned in issue #28 (closed)
mentioned in commit ba7d4cd1
mentioned in issue #17 (closed)
mentioned in merge request gitlab-org/gitlab!170017 (closed)
mentioned in merge request !73 (merged)
mentioned in merge request gitlab-org/gitlab!170967 (closed)
mentioned in merge request gitlab-com/www-gitlab-com!137182 (merged)