sitemap.xml for SEO

This adds a sitemap.xml file to the site, which is basically just a list of all the pages on the site, so search engines don't have to rely on finding them through links. Having one is usually the first piece of SEO advice. I looked at the jekyll-sitemap plugin, but it doesn't play well with the jekyll-polyglot plugin that's handling translations. It's a simple enough format to generate in a template though.

  • add a sitemap.xml template
  • add it to the exclude_from_localization list
  • link to the sitemap from the robots.txt
  • update the canonical urls to use the un-prefixed url (vs. each translation pointing to itself as the canonical url)
  • update the "hreflang" alternate links to use un-prefixed urls for all languages (we serve the correct language based on headers)
  • update the prod deploy to ping Google and IndexNow (for Bing, Yandex, Seznam, DuckDuckGo, and Yahoo)
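
For reviewers who haven't looked at the format before, here is a rough sketch of what the generated file boils down to. This is not the Jekyll template from this MR, just a Python illustration of the same structure; `build_sitemap` and the example page URLs are made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as bytes, with one <url><loc> entry per URL.

    Hypothetical helper for illustration; the MR itself generates the
    file from a template instead.
    """
    # Emit the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Example un-prefixed URLs, chosen only for illustration.
    pages = ["https://f-droid.org/", "https://f-droid.org/packages/"]
    print(build_sitemap(pages).decode("utf-8"))
```

Only the un-prefixed URLs are passed in here, matching the canonical URL change in the list above.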

Google says the order of the links doesn't matter, so we're not favoring what comes first over what comes last. Once it's published, it should get picked up through the robots.txt, but additionally you can send a GET request to https://www.google.com/ping?sitemap=https://f-droid.org/sitemap.xml according to their docs. Bing has a similar one: http://www.bing.com/ping?sitemap=https://f-droid.org/sitemap.xml. I can look into other search engines if we want to kickstart the process instead of waiting for them to find the robots.txt link too, but I imagine that will happen pretty quickly.
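
A minimal sketch of the ping step, assuming the two endpoints quoted above. The function names `ping_urls` and `send_pings` are hypothetical and this is not the actual deploy script. Unlike the URLs written out above, the sketch percent-encodes the sitemap URL in the query string.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SITEMAP_URL = "https://f-droid.org/sitemap.xml"

# Ping endpoints as quoted in this description.
PING_ENDPOINTS = [
    "https://www.google.com/ping",
    "http://www.bing.com/ping",
]


def ping_urls(sitemap_url=SITEMAP_URL, endpoints=PING_ENDPOINTS):
    """Build one ping URL per endpoint, with the sitemap URL percent-encoded."""
    query = urlencode({"sitemap": sitemap_url})
    return [f"{endpoint}?{query}" for endpoint in endpoints]


def send_pings(timeout=10):
    """Send a GET request to each ping URL and return the HTTP status codes.

    Not called on import; a deploy job would call this after publishing.
    """
    statuses = []
    for url in ping_urls():
        with urlopen(url, timeout=timeout) as response:
            statuses.append(response.status)
    return statuses


if __name__ == "__main__":
    for url in ping_urls():
        print(url)
```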

Google says the file should be under 50MB, and the full one I generated locally is only 9MB, so we have a lot of room to grow before having to worry about splitting it up. Update: since we're using only un-prefixed urls in the sitemap, it's even smaller.
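
If we ever want a guard for this in CI, something along these lines would do. It is only a sketch: `fits_in_one_sitemap` is a hypothetical helper, and it assumes the 50MB limit is counted as 50 * 1024 * 1024 bytes of uncompressed XML.

```python
# Assumed reading of the 50MB limit: 50 MiB of uncompressed XML.
MAX_SITEMAP_BYTES = 50 * 1024 * 1024


def fits_in_one_sitemap(size_bytes, limit=MAX_SITEMAP_BYTES):
    """Return True if a sitemap of this size stays within the limit."""
    return size_bytes <= limit


def headroom_fraction(size_bytes, limit=MAX_SITEMAP_BYTES):
    """Return the share of the limit still unused, between 0.0 and 1.0."""
    return max(0.0, 1.0 - size_bytes / limit)


if __name__ == "__main__":
    # The 9MB figure is the locally generated full sitemap mentioned above.
    size = 9 * 1024 * 1024
    print(fits_in_one_sitemap(size), round(headroom_fraction(size), 2))
```

In a real check the size would come from something like `os.path.getsize` on the built sitemap.xml.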

Edited by Steven McDonald
