umlauts in picture filename break relative-path images in AsciiDoc rendering
Summary
In the AsciiDoc rendering on the GitLab web interface, images that are referenced by a relative path and have certain (or any?) non-ASCII characters in their filename aren't displayed.
Steps to reproduce
1. Begin in a new empty directory:
       cd $(mktemp -d)
2. Download https://d33wubrfki0l68.cloudfront.net/dbfc383d23401ccbed7262a1822dba9babecb949/69a10/images/sunset.jpg and rename it to sunsët.jpg:
       wget https://d33wubrfki0l68.cloudfront.net/dbfc383d23401ccbed7262a1822dba9babecb949/69a10/images/sunset.jpg
       mv sunset.jpg sunsët.jpg
   (Should also work with any other image file and with file names with other non-ASCII characters.)
3. Create an AsciiDoc file README.adoc alongside the image file that references the image file as a picture by a relative path:
       cat <<eof > README.adoc
       = Example document

       image::sunsët.jpg[placeholder text]
       eof
   (Should also work with other file placements and names, as long as the relative path is correct.)
4. (optional) Verify that this works for asciidoctor:
       asciidoctor-pdf README.adoc
       evince README.pdf & # make sure the picture is shown in the PDF
5. Create a new repo and add and commit the image file and the AsciiDoc file:
       git init
       git add sunsët.jpg README.adoc
       git commit -m'bug reproduction'
6. Push to a new GitLab project.
Example Project
- Created with the above reproduction instructions: das-g/non-ascii-image-name-in-asciidoc
- More thorough examples: das-g/asciidoctor-relative-image-path-vs.-gitlab
What is the current bug behavior?
Preview of the AsciiDoc document on GitLab displays the placeholder text "placeholder text" instead of the picture.
Interesting observations:
- The link around the missing picture still leads to the image file. I.e., if you click on the placeholder text, the browser will display the image (only the picture, without the AsciiDoc document or GitLab UI).
- The src URL is just sunsët.jpg, which the browser resolves to https://gitlab.com/das-g/suns%C3%ABt.jpg (on the project's file overview) or https://gitlab.com/das-g/non-ascii-image-name-in-asciidoc/-/blob/master/suns%C3%ABt.jpg (when viewing the README.adoc on master), while it should probably be /das-g/non-ascii-image-name-in-asciidoc/-/raw/master/suns%C3%ABt.jpg, which the browser would resolve to https://gitlab.com/das-g/non-ascii-image-name-in-asciidoc/-/raw/master/suns%C3%ABt.jpg
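The URL resolution described in the last observation can be reproduced outside the browser. The following is a minimal sketch in plain Ruby, not GitLab code: resolve_src is a hypothetical helper, and the blob-view page URL ending in README.adoc is an assumption about which page is being viewed. The helper percent-encodes the file name first, because Ruby's URI class only accepts ASCII input.

```ruby
require "uri"

# Assumed page URLs: the project's file overview and the blob view of README.adoc.
OVERVIEW  = "https://gitlab.com/das-g/non-ascii-image-name-in-asciidoc"
BLOB_VIEW = "https://gitlab.com/das-g/non-ascii-image-name-in-asciidoc/-/blob/master/README.adoc"

# Hypothetical helper: resolve an image src against the page URL,
# roughly the way a browser would.
def resolve_src(page_url, src)
  # Percent-encode each path segment (as UTF-8 bytes), keeping the slashes.
  encoded = src.split("/", -1)
               .map { |segment| URI.encode_www_form_component(segment) }
               .join("/")
  URI.join(page_url, encoded).to_s
end

# Unrewritten relative src, as currently emitted:
puts resolve_src(OVERVIEW, "sunsët.jpg")
puts resolve_src(BLOB_VIEW, "sunsët.jpg")

# Root-relative src pointing at the raw file, as probably intended:
puts resolve_src(OVERVIEW, "/das-g/non-ascii-image-name-in-asciidoc/-/raw/master/sunsët.jpg")
```

Only the third call yields a URL under /-/raw/, which is the one that serves the image bytes. The first resolves outside the project, and the second points at the blob page.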
What is the expected correct behavior?
Preview of the AsciiDoc document on GitLab displays the picture.
Relevant logs and/or screenshots
- README.pdf from optional step 4 of the Steps to reproduce above
- Placeholder text shown instead of picture:
Output of checks
This bug happens on GitLab.com
Possible fixes
If you can, link to the line of code that might be responsible for the problem.
I don't know (yet) which specific line of code is responsible, but I'm pretty sure that the problem (and/or the potential fix) is, or should be, in the class Banzai::Filter::RepositoryLinkFilter.
Here's a new automated test that I believe reproduces the problem.
Failures:
1) Banzai::Filter::RepositoryLinkFilter with a valid commit rebuilds relative URL for an image with Umlaut in the repo
Failure/Error:
expect(doc.at_css('img')['src'])
.to eq "/#{project_path}/-/raw/#{ref}/files/images/logo-bläck.png"
expected: "/namespace263/project1130/-/raw/markdown/files/images/logo-bläck.png"
got: "files/images/logo-bläck.png"
(compared using ==)
Shared Example Group: :valid_repository called from ./spec/lib/banzai/filter/repository_link_filter_spec.rb:375
# ./spec/lib/banzai/filter/repository_link_filter_spec.rb:262:in `block (3 levels) in <top (required)>'
# ./spec/spec_helper.rb:329:in `block (3 levels) in <top (required)>'
# ./spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'
# ./spec/spec_helper.rb:320:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:316:in `block (3 levels) in <top (required)>'
# ./spec/spec_helper.rb:316:in `block (2 levels) in <top (required)>'
2) Banzai::Filter::RepositoryLinkFilter with a valid ref rebuilds relative URL for an image with Umlaut in the repo
Failure/Error:
expect(doc.at_css('img')['src'])
.to eq "/#{project_path}/-/raw/#{ref}/files/images/logo-bläck.png"
expected: "/namespace293/project1160/-/raw/markdown/files/images/logo-bläck.png"
got: "files/images/logo-bläck.png"
(compared using ==)
Shared Example Group: :valid_repository called from ./spec/lib/banzai/filter/repository_link_filter_spec.rb:382
# ./spec/lib/banzai/filter/repository_link_filter_spec.rb:262:in `block (3 levels) in <top (required)>'
# ./spec/spec_helper.rb:329:in `block (3 levels) in <top (required)>'
# ./spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'
# ./spec/spec_helper.rb:320:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:316:in `block (3 levels) in <top (required)>'
# ./spec/spec_helper.rb:316:in `block (2 levels) in <top (required)>'
Finished in 13 minutes 33 seconds (files took 57.39 seconds to load)
2331 examples, 2 failures
Failed examples:
rspec './spec/lib/banzai/filter/repository_link_filter_spec.rb[1:12:15]' # Banzai::Filter::RepositoryLinkFilter with a valid commit rebuilds relative URL for an image with Umlaut in the repo
rspec './spec/lib/banzai/filter/repository_link_filter_spec.rb[1:13:15]' # Banzai::Filter::RepositoryLinkFilter with a valid ref rebuilds relative URL for an image with Umlaut in the repo
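One thing that may be worth checking while looking for the responsible line (an unverified guess, not something confirmed in this report): Ruby's standard URI parser rejects strings that contain non-ASCII characters. If the filter parses the src with it and rescues the resulting error, the src would be left unchanged, which would match the "got" values in the failures above. The sketch below only demonstrates the behaviour of Ruby's URI library; parseable? is a hypothetical helper and not part of GitLab.

```ruby
require "uri"

# Hypothetical helper (not GitLab code): does Ruby's URI parser accept this string?
def parseable?(src)
  URI.parse(src)
  true
rescue URI::InvalidURIError
  # Raised for non-ASCII input, among other malformed URIs.
  false
end

puts parseable?("files/images/logo-black.png")       # ASCII-only relative path
puts parseable?("files/images/logo-bläck.png")       # same path with an umlaut
puts parseable?("files/images/logo-bl%C3%A4ck.png")  # umlaut percent-encoded
```

If this turns out to be the mechanism, percent-encoding the path before parsing it would be one possible direction for a fix.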
Edited by Raphael Das Gupta
