Limit concurrent operations in content aggregator
Currently, the content aggregator clones/fetches and scans all repositories at once. When a large number of repositories is involved, this can put a heavy burden on the network and local resources, and may even lead to the git client being throttled or blocked by the git server. It also means that all the packfile and index files get loaded into memory at once, which can push the Node.js process to the limits of its memory.
This happens because the array of async function calls that load and process each repository is resolved using a standard Promise.all call, which lets all the functions run at once. We can replace this call with one that enforces a concurrency limit (analogous to a fixed thread pool in other languages) so that only a fixed number of operations can proceed at any one time.
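A minimal sketch of such a limiter is shown below. It assumes the repository operations are passed as functions that return promises (thunks) rather than as already-started promises; the name promiseAllWithLimit is hypothetical, and a library such as p-limit provides equivalent behavior:

// Minimal sketch of a concurrency-limited replacement for Promise.all.
// tasks is an array of functions that each return a promise; limit is the
// maximum number of tasks allowed to run at once.
function promiseAllWithLimit (tasks, limit) {
  let next = 0
  const results = new Array(tasks.length)
  async function worker () {
    // Each worker pulls the next unstarted task until none remain.
    while (next < tasks.length) {
      const current = next++
      results[current] = await tasks[current]()
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker())
  return Promise.all(workers).then(() => results)
}

With a helper like this in place, the call site changes from Promise.all(operations) to promiseAllWithLimit(operations, limit), where each entry in operations is deferred as a function instead of being invoked eagerly.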
We'll need to allow the user to configure this limit through the playbook. Since it relates to git operations, it may fit best under the git category:
git:
  concurrency_limit: 15
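How the aggregator reads this key is an implementation detail, but a sketch might look like the following (it assumes the playbook builder exposes the key as playbook.git.concurrencyLimit and that 15 is a sensible default, neither of which is settled here):

// Hypothetical wiring: resolve the limit from the playbook, falling back
// to an assumed default when the key is absent or invalid.
const DEFAULT_CONCURRENCY_LIMIT = 15

function getConcurrencyLimit (playbook) {
  const limit = playbook.git && playbook.git.concurrencyLimit
  return Number.isInteger(limit) && limit > 0 ? limit : DEFAULT_CONCURRENCY_LIMIT
}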
While changing this loop, there's another concern to address. Currently, if a clone/fetch operation fails, it terminates the Node.js process, which can leave repositories that are still being cloned or fetched in an inconsistent state. While Antora handles this situation gracefully by cloning the repository anew on the next run, it still invalidates the cache of those repositories. The loop should defer failures so that all clone/fetch operations are settled (resolved or rejected) before the error is thrown. This change will maintain the integrity of the repository cache.
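One way to defer failures, sketched below under the same assumptions as the limiter above, is to wrap each task so it settles to an outcome object instead of rejecting, then rethrow the first recorded error only after every task has finished (allSettledThenThrow is a hypothetical name):

// Hypothetical helper: let every clone/fetch task settle before surfacing
// the first error, so no repository is left in a half-written state.
async function allSettledThenThrow (tasks, limit) {
  const outcomes = await promiseAllWithLimit(
    tasks.map((task) => () => task().then(
      (value) => ({ value }),
      (error) => ({ error })
    )),
    limit
  )
  const failure = outcomes.find((outcome) => 'error' in outcome)
  if (failure) throw failure.error
  return outcomes.map((outcome) => outcome.value)
}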