Configurable retries for network-related tasks

Sometimes a build fails simply because of a network timeout or a similar transient error. This is especially likely when one build has failed and we have suspended all tasks to debug it: an ongoing Fetch task often fails when it is resumed.

We should:

  • Have a configuration option for the maximum number of retries for network-related activities
  • Reschedule failed network-related tasks until the maximum number of retries is reached
  • Have the frontend not suspend tasks when the failure that occurs is in fact being retried
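The first two points could be sketched roughly as follows. This is only an illustration, not the project's scheduler: `NetworkError`, `RetryPolicy`, `max_retries`, `run_with_retries` and `on_retry` are hypothetical names, and the sketch assumes that network-related failures can be told apart from other failures by exception type.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class NetworkError(Exception):
    """Hypothetical marker for transient, network-related failures."""


@dataclass
class RetryPolicy:
    # Hypothetical configuration value: how many times a failed
    # network-related task may be rescheduled after its first attempt.
    max_retries: int = 2


def run_with_retries(
    task: Callable[[], str],
    policy: RetryPolicy,
    on_retry: Optional[Callable[[int, Exception], None]] = None,
) -> str:
    """Run task, rescheduling it on NetworkError until retries run out."""
    retries = 0
    while True:
        try:
            return task()
        except NetworkError as err:
            if retries >= policy.max_retries:
                # Retries exhausted: let the failure propagate as a real one.
                raise
            retries += 1
            if on_retry is not None:
                # Give the caller a chance to report that a retry is happening.
                on_retry(retries, err)
```

With `max_retries=2`, a task that times out twice and then succeeds completes normally; only a third consecutive failure is raised to the caller. Any other exception type is never retried.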

To avoid suspending on a failure that is being retried, the frontend must be able to tell such a failure apart: either the failure message for a retryable task is a special message, or the Message object gains a retry field. Otherwise the frontend will suspend tasks and try to handle the failure.
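A minimal sketch of the second option, a retry field on the message, might look like this. The `Message`, `MessageType` and `Frontend` classes below are simplified stand-ins written for illustration, not the project's actual classes, and the `retry` field and `suspended` flag are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    INFO = "info"
    FAIL = "fail"


@dataclass
class Message:
    message_type: MessageType
    text: str
    # Hypothetical field: True when the failed task will be rescheduled.
    retry: bool = False


class Frontend:
    """Simplified stand-in for a frontend that suspends tasks on failure."""

    def __init__(self) -> None:
        self.suspended = False
        self.log = []

    def handle(self, message: Message) -> None:
        # Every message is still logged, including retried failures.
        self.log.append(message.text)
        # Only suspend for failures that are not going to be retried.
        if message.message_type is MessageType.FAIL and not message.retry:
            self.suspended = True


frontend = Frontend()
frontend.handle(Message(MessageType.FAIL, "fetch timed out, retrying", retry=True))
suspended_after_retry = frontend.suspended
frontend.handle(Message(MessageType.FAIL, "fetch failed, retries exhausted"))
suspended_after_final = frontend.suspended
```

Here the retried failure is logged without suspending anything, and only the final failure, sent with `retry` left at its default, triggers the suspend-and-handle path.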