Long load times rendering large (~2.5MB) Markdown files in Wiki

Summary

Opening a large* Markdown file in the Wiki leads to very long load times (in excess of 30 seconds) and sometimes 500 and 502 errors.

(* - large = ~2.5 MB, ~11,000 lines)

Reloading/refreshing the page does not lead to a shorter load time.

Steps to reproduce

  1. Browse to the Wiki for a new or existing project
  2. Create a new page in the wiki with the Large.md described below as the content of the new page
  3. Observe the "Wiki was successfully updated." notice
  4. Navigate away from the newly created Wiki page
  5. Browse to the Wiki page that you just created
  6. Observe that the page takes a long time to load (30-50 seconds) and occasionally returns a 500 or 502 error
  7. Reload the page
  8. Observe that reloading the page does not improve the load time
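The timing in steps 6 to 8 can be measured with a small script. This is a minimal sketch and not part of the original report: `fetch_page` and `time_loads` are hypothetical helper names, and it assumes the wiki page is readable without logging in (a private project would need an authenticated session instead).

```python
import time
import urllib.request
from typing import Callable, List


def fetch_page(url: str, timeout: float = 120.0) -> int:
    """Fetch a URL and return the HTTP status code (hypothetical helper)."""
    # urllib raises urllib.error.HTTPError for 4xx/5xx responses,
    # so a 500 or 502 surfaces as an exception with a .code attribute.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # read the full body so the whole response is timed
        return response.status


def time_loads(fetch: Callable[[], object], attempts: int = 3) -> List[float]:
    """Call `fetch` repeatedly and return the wall-clock seconds for each call."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations
```

For example, `time_loads(lambda: fetch_page(url))`, with `url` set to the address of the large wiki page, returns one duration per attempt. If reloading does not help, the later attempts should be about as slow as the first.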

Example Project

https://gitlab.com/gitlab-gold/briecarranza/issues/large-file-render-wiki/-/wikis/home

Browse to the wiki for this project and click large page or browse directly to the large page.

Example large Markdown file: Large.md

The source of the example Large.md that I have been testing with is available for additional testing.
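If the original Large.md is not at hand, a file of similar size can be generated for a rough reproduction. This is only a sketch under stated assumptions: `build_markdown` is a hypothetical helper, the filler text is arbitrary, and the content is not the original file. Rendering time probably depends on what kind of Markdown the file contains, so results with this stand-in may differ from those in this report.

```python
def build_markdown(target_bytes: int = 2_500_000, lines: int = 11_000) -> str:
    """Build filler Markdown of roughly `target_bytes` spread over `lines` lines."""
    per_line = max(2, target_bytes // lines)  # bytes per line, newline included
    filler = "lorem ipsum dolor sit amet "
    repeats = per_line // len(filler) + 1
    out = []
    for i in range(lines):
        # ASCII only, so one character is one byte.
        body = (f"- item {i}: " + filler * repeats)[: per_line - 1]
        out.append(body + "\n")
    return "".join(out)


if __name__ == "__main__":
    # Writes the stand-in file next to the script; the filename is arbitrary.
    with open("Large-synthetic.md", "w", encoding="ascii") as handle:
        handle.write(build_markdown())
```

The defaults mirror the size described in the summary (about 2.5 MB and about 11,000 lines).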

What is the current bug behavior?

The page takes a long time (30-50 seconds) to be displayed.

What is the expected correct behavior?

The page should load in a more timely manner.

Relevant logs and/or screenshots

  • This behavior was first brought to our attention through a support ticket. GitLab team members with access to ZenDesk can learn more by reading the ticket.

I am wondering whether gitlab-org/gitlab-foss#64827 is possibly related.

Additional Information

This long load time happens both on GitLab.com and on various self-managed GitLab instances. Using multiple browsers and multiple devices, the large Wiki page takes between 34 and 42 seconds to load on GitLab.com.

On my self-managed instance, I see log entries like this one when I load a Wiki page with the contents of Large.md:

Completed 200 OK in 49800ms (Views: 49629.4ms | ActiveRecord: 15.8ms | Elasticsearch: 0.0ms | Allocations: 43963218)
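To make the breakdown in that log entry explicit, the following sketch parses the line and reports each component's share of the total request time. The helper names `parse_completed_line` and `share_of_total` are hypothetical, and the regular expressions only assume the line format shown above.

```python
import re
from typing import Dict

LOG_LINE = (
    "Completed 200 OK in 49800ms (Views: 49629.4ms | ActiveRecord: 15.8ms | "
    "Elasticsearch: 0.0ms | Allocations: 43963218)"
)


def parse_completed_line(line: str) -> Dict[str, float]:
    """Extract the total and per-component durations (in ms) from a 'Completed' line."""
    total = re.search(r"Completed \d+ .*? in (\d+(?:\.\d+)?)ms", line)
    if total is None:
        raise ValueError("not a 'Completed' log line")
    timings = {"Total": float(total.group(1))}
    # Components look like 'Views: 49629.4ms'; Allocations has no 'ms' and is skipped.
    for name, value in re.findall(r"(\w+): (\d+(?:\.\d+)?)ms", line):
        timings[name] = float(value)
    return timings


def share_of_total(timings: Dict[str, float]) -> Dict[str, float]:
    """Return each component's percentage of the total request time."""
    total = timings["Total"]
    return {
        name: 100.0 * value / total
        for name, value in timings.items()
        if name != "Total"
    }


timings = parse_completed_line(LOG_LINE)
shares = share_of_total(timings)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}% of {timings['Total']:.0f}ms")
```

For this entry, view rendering accounts for well over 99% of the request, while ActiveRecord is a small fraction of one percent.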

We observed that the response times from Gitaly and Postgres are very reasonable. With @wchandler, we decided to profile the request in order to better understand the cause and nature of the delay. A screenshot of the (potentially) interesting areas is below (from 6.69% to 4.90%):

[Screenshot: Screen_Shot_2020-11-03_at_8.32.02_PM]

The entire profile report is too large to attach, but it is available for download in the sample project as root-wiki-large-file-execution.html.

Output of checks

This bug happens on GitLab.com and on various self-managed GitLab instances.

Results of GitLab environment info


(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

Results of GitLab application Check


(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:check SANITIZE=true`)

I have been reproducing this on an Omnibus GitLab instance:

 sudo gitlab-rake gitlab:check SANITIZE=true
Checking GitLab subtasks ...

Checking GitLab Shell ...

GitLab Shell: ... GitLab Shell version >= 13.11.0 ? ... OK (13.11.0)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Gitaly ...

Gitaly: ... default ... OK

Checking Gitaly ... Finished

Checking Sidekiq ...

Sidekiq: ... Running? ... yes
Number of Sidekiq processes ... 1

Checking Sidekiq ... Finished

Checking Incoming Email ...

Incoming Email: ... Reply by email is disabled in config/gitlab.yml

Checking Incoming Email ... Finished

Checking LDAP ...

LDAP: ... LDAP is disabled in config/gitlab.yml

Checking LDAP ... Finished

Checking GitLab App ...

Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... yes
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
Projects have namespace: ...
2/1 ... yes
1/2 ... yes
2/3 ... yes
14/4 ... yes
1/5 ... yes
1/6 ... yes
18/7 ... yes
2/8 ... yes
1/9 ... yes
Redis version >= 4.0.0? ... yes
Ruby version >= 2.5.3 ? ... yes (2.6.6)
Git version >= 2.24.0 ? ... yes (2.28.0)
Git user has default SSH configuration? ... yes
Active users: ... 10
Is authorized keys file accessible? ... yes
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes
Elasticsearch version 6.x - 7.x? ... skipped (elasticsearch is disabled)

Checking GitLab App ... Finished


Checking GitLab subtasks ... Finished

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)

(we will only investigate if the tests are passing)

Possible fixes