Provide a sitemap.xml file for public projects
Problem statement
GitLab.com does not provide a sitemap.xml resource.
Google is currently picking up links to repositories via the https://gitlab.com/explore/projects endpoint.
Many of these URLs return error pages (e.g. https://gitlab.com/explore/projects?page=11961 returned a 500 error as of 2019-11-20), and this endpoint is not the best way to provide discovery of projects.
Job to be done
- Block the /explore/ section in the robots.txt file
- Generate a sitemap.xml file(s) containing a list of all the public repositories according to the specification: https://www.sitemaps.org/protocol.html
- The sitemaps should be regenerated at least once every 24 hours
- Ensure the file(s) are linked in the robots.txt file
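
To make the items above concrete, here is a minimal Python sketch of the two artifacts involved: a sitemap file in the sitemaps.org format and the robots.txt lines that block /explore/ and link the sitemap. It is only an illustration, not how GitLab would implement this. The function names, the example project URL and the sitemap location https://gitlab.com/sitemap.xml are assumptions, and a real robots.txt would contain other rules as well.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_urlset(project_urls):
    """Return a sitemap <urlset> document (as a string) for the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in project_urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(entry, "loc").text = url
    return XML_DECLARATION + ET.tostring(urlset, encoding="unicode")


def build_robots_txt(sitemap_url):
    """Return robots.txt content that blocks /explore/ and links the sitemap."""
    lines = [
        "User-agent: *",
        "Disallow: /explore/",
        f"Sitemap: {sitemap_url}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(build_urlset(["https://gitlab.com/gitlab-org/gitlab"]))
    print(build_robots_txt("https://gitlab.com/sitemap.xml"))
```

The Disallow rule is a path-prefix match, so it also covers paginated URLs such as /explore/projects?page=11961.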
Notes
- The maximum number of URLs per sitemap file is 50000. If multiple files are needed, please follow this approach: https://support.google.com/webmasters/answer/75712?hl=en
- https://support.google.com/webmasters/answer/183668?hl=en
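
As a rough illustration of the 50000-URL limit, the sketch below splits a list of project URLs into chunks that each fit in one sitemap file, and builds a sitemap index that points at the resulting files. The helper names and the file naming scheme (https://gitlab.com/sitemaps/projects-N.xml) are hypothetical and not part of this issue.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'
MAX_URLS_PER_FILE = 50_000  # limit stated in the note above


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of `urls`, each holding at most `size` items."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document listing the individual sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return XML_DECLARATION + ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical project URLs, for illustration only.
    urls = [f"https://gitlab.com/group/project-{i}" for i in range(120_001)]
    chunks = list(chunk_urls(urls))
    # Hypothetical naming scheme for the generated sitemap files.
    files = [
        f"https://gitlab.com/sitemaps/projects-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    print(len(chunks), "sitemap files needed")
    print(build_sitemap_index(files))
```

With this layout, robots.txt would link the index file rather than each individual sitemap.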