
Provide a sitemap.xml file for public projects


Problem statement

GitLab.com does not provide a sitemap.xml resource.
Google is currently picking up links to repositories via the https://gitlab.com/explore/projects endpoint.
Many of these URLs result in error pages (e.g. https://gitlab.com/explore/projects?page=11961 returned error 500 as of 2019-11-20), and crawling the paginated explore listing is not an optimal way to provide discovery of projects.

Job to be done

  1. Block the /explore/ section in the robots.txt file
  2. Generate one or more sitemap.xml files containing a list of all the public repositories, according to the specification: https://www.sitemaps.org/protocol.html
    1. The sitemaps should be updated once every 24 hours
  3. Ensure the file(s) are linked in the robots.txt file
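
The three steps above can be sketched roughly as follows. This is only an illustration in Python, not how GitLab would necessarily implement it: the function names `build_urlset` and `robots_txt`, and the sitemap location passed in, are assumptions made for the example. Only the `/explore/` path and the sitemaps.org namespace come from the issue and the linked specification.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(urls):
    """Return a sitemap <urlset> document as a string (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' in the URL text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def robots_txt(sitemap_url):
    """Return robots.txt content that blocks /explore/ and links the sitemap.

    sitemap_url is an assumed location; the issue does not specify one.
    """
    lines = [
        "User-agent: *",
        "Disallow: /explore/",
        f"Sitemap: {sitemap_url}",
    ]
    return "\n".join(lines) + "\n"
```

A daily job (step 2.1) would call `build_urlset` with the current list of public project URLs and write the result to wherever the `Sitemap:` line points.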

Notes

  1. The maximum number of URLs per sitemap file is 50,000. If multiple files are needed, please follow this approach: https://support.google.com/webmasters/answer/75712?hl=en
  2. https://support.google.com/webmasters/answer/183668?hl=en
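
To illustrate note 1, here is a minimal sketch of splitting a long URL list into files of at most 50,000 entries and tying them together with a sitemap index file. The file naming scheme (`sitemap-1.xml`, `sitemap-2.xml`, ...) and the helper names are assumptions for the example and are not taken from the issue.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # limit stated in note 1


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of urls, each with at most `size` items."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def plan_files(urls, base_url):
    """Map each (hypothetical) sitemap file URL to the project URLs it lists."""
    return {
        f"{base_url}/sitemap-{i}.xml": part
        for i, part in enumerate(chunk(urls), start=1)
    }


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

With this layout, robots.txt would only need to reference the index file, and the index in turn lists every per-chunk sitemap.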