When GitLab Pages sites are served from object storage, does the whole site get downloaded as a zip archive, or are the individual files downloaded separately? Is there a way to cache this locally?
The zip archive is not downloaded into Pages. Instead, we make multiple requests. For example, the first time we access a domain:
- Create a resource in memory by fetching 1 byte, to ensure we support ranged requests. This also gives us the total size of the archive.
- Load the zip metadata.
- Load all archive file names into memory.
- Get the requested file's offset, for example for `index.html`.
- Get the actual bytes of the file.
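The steps above can be sketched with Go's standard library (this is a standalone illustration, not the actual Pages code): an `io.ReaderAt` that translates every read into an HTTP `Range` request, an `httptest` server standing in for object storage, and `archive/zip` reading the central directory through it.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// httpReaderAt issues one Range request per ReadAt call, mimicking how
// Pages reads a remote zip archive without downloading it whole.
type httpReaderAt struct {
	url      string
	requests int // stands in for requests made to object storage
}

func (r *httpReaderAt) ReadAt(p []byte, off int64) (int, error) {
	r.requests++
	req, err := http.NewRequest(http.MethodGet, r.url, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, off+int64(len(p))-1))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return io.ReadFull(resp.Body, p)
}

// fetchIndex walks through the steps above and returns the contents of
// index.html plus the number of ranged requests it took.
func fetchIndex() (string, int) {
	// A tiny zip archive standing in for the deployment in object storage.
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, _ := zw.Create("index.html")
	w.Write([]byte("<h1>hello</h1>"))
	zw.Close()

	// http.ServeContent handles the Range header for us.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		http.ServeContent(w, req, "site.zip", time.Now(), bytes.NewReader(buf.Bytes()))
	}))
	defer srv.Close()

	// Step 1: fetch a single byte to confirm ranged requests are supported
	// and learn the archive's total size from the Content-Range header.
	req, _ := http.NewRequest(http.MethodGet, srv.URL, nil)
	req.Header.Set("Range", "bytes=0-0")
	resp, _ := http.DefaultClient.Do(req)
	resp.Body.Close()
	var size int64
	fmt.Sscanf(resp.Header.Get("Content-Range"), "bytes 0-0/%d", &size)

	// Steps 2-3: zip.NewReader reads the central directory (metadata and
	// all file names) through ranged ReadAt calls.
	ra := &httpReaderAt{url: srv.URL}
	zr, err := zip.NewReader(ra, size)
	if err != nil {
		panic(err)
	}

	// Steps 4-5: the offset of index.html is now known; read its bytes.
	for _, zf := range zr.File {
		if zf.Name == "index.html" {
			rc, _ := zf.Open()
			body, _ := io.ReadAll(rc)
			rc.Close()
			return string(body), ra.requests
		}
	}
	return "", ra.requests
}

func main() {
	body, n := fetchIndex()
	fmt.Printf("index.html: %s (%d ranged requests)\n", body, n)
}
```

The key point is that only small slices of the archive ever cross the network: the size probe, the central directory at the end of the zip, and the bytes of the one file requested.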
The resource (zip archive) is now cached. If we want to load additional files within the cached period:
- For `another_file.html`:
  1. Get the file offset.
  2. Get the bytes.
- For `index.html` again: the file offset is cached, so we just need one request to fetch the bytes.
The test `TestOpenCached` shows this in action.
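The effect of the cache can be sketched with `archive/zip` alone (again an illustration, not the Pages implementation): once the central directory is in memory, opening any file touches only that file's bytes. Here `countingReaderAt` stands in for the cached remote resource, counting each read as if it were a request to object storage.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// countingReaderAt counts ReadAt calls; each call stands in for a ranged
// request to object storage.
type countingReaderAt struct {
	r     *bytes.Reader
	calls int
}

func (c *countingReaderAt) ReadAt(p []byte, off int64) (int, error) {
	c.calls++
	return c.r.ReadAt(p, off)
}

// openFromCache opens every file in an archive whose metadata is already
// loaded, returning how many extra reads each file's contents cost.
func openFromCache() (names []string, extraReads []int) {
	// Build an archive with two pages, standing in for a deployment.
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range []string{"index.html", "another_file.html"} {
		w, _ := zw.Create(name)
		w.Write([]byte("content of " + name))
	}
	zw.Close()

	cra := &countingReaderAt{r: bytes.NewReader(buf.Bytes())}
	// Reading the central directory costs a few calls up front...
	zr, err := zip.NewReader(cra, int64(buf.Len()))
	if err != nil {
		panic(err)
	}

	// ...but afterwards every offset is cached in zr.File, so opening a
	// file reads only that file's header and data.
	for _, zf := range zr.File {
		before := cra.calls
		rc, _ := zf.Open()
		io.Copy(io.Discard, rc)
		rc.Close()
		names = append(names, zf.Name)
		extraReads = append(extraReads, cra.calls-before)
	}
	return names, extraReads
}

func main() {
	names, extra := openFromCache()
	for i, name := range names {
		fmt.Printf("%s: %d additional read(s)\n", name, extra[i])
	}
}
```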
An extra refresh request might happen if the resource is accessed during the `zip_cache_refresh` period. If the content changed (for example, after a new Pages deployment), the resource in memory needs to be updated, triggering the five requests again (it's treated as a new resource).
If you would like to reduce the number of requests made to object storage, you could extend the cache timings. This might be a good solution if your Pages content doesn't change too often.
For example, you could increase `zip_cache_expiration` and reduce `zip_cache_refresh`. If the archive expires in 10 minutes and the refresh period is only 10s, we would always serve the cached content without a refresh for the first 9m50s; a request that comes in at, say, 9m55s triggers a refresh. The risk of doing this is that you may see old content, potentially indefinitely, because a refresh only extends the resource's lifetime in memory rather than re-fetching it.
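On an Omnibus GitLab install, that tuning would look roughly like the following in `/etc/gitlab/gitlab.rb` (the setting names here mirror the Pages `zip_cache_*` flags; verify them against the Pages administration docs for your GitLab version before applying, then run `gitlab-ctl reconfigure`):

```ruby
# Assumed Omnibus setting names -- check your version's docs.
gitlab_pages['zip_cache_expiration'] = "600s" # keep a cached archive for 10 minutes
gitlab_pages['zip_cache_refresh']    = "10s"  # only refresh if accessed in the last 10s before expiry
```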