Keep going when a connection issue occurs on a mutation

Problem

API Security makes thousands of requests against an operation during testing. Sometimes a single request times out or fails. When this happens, API Security ends the test with an error, even though in some cases it would be possible to continue testing.

This has come up during customer trials of our tool. So far it has typically been an issue with the customer application, but that will not always be the case.

Proposal

Improve our handling of request failures so that minor issues will not stop the test from completing.

Quick notes on the logic to implement. These need refinement.

If we can connect and send the request but the response is invalid or times out:
1. Call the operation without modification to check whether the application is still responding.
  1. If this call also fails, fail the test.
2. Keep a list of failed requests and report them at the end of testing.
3. Keep going until a certain number of requests fail in a row (5? 10?), then:
  1. Fail the test.
  2. Report the requests that are failing (should we include the full request in the console?).
  3. Include a troubleshooting message (link to doc?).

If we cannot connect to the application port:
1. Retry 3 times.
  1. If the retries also fail, stop the test and report that the application became inaccessible and the customer should investigate.
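
The notes above can be sketched roughly as follows. This is only an illustration of the proposed control flow, not the actual API Security implementation. The names `Outcome`, `ScanAborted`, `run_mutations`, `send`, and `baseline` are all hypothetical, and the thresholds (5 consecutive failures, 3 retries) are placeholders for the values still under discussion.

```python
from enum import Enum, auto
from typing import Callable, Iterable, List


class Outcome(Enum):
    OK = auto()
    BAD_RESPONSE = auto()   # connected and sent, but response invalid or timed out
    NO_CONNECTION = auto()  # could not connect to the application port


class ScanAborted(Exception):
    """Raised when the test must stop; carries the failed requests for reporting."""

    def __init__(self, reason: str, failed: List[str]):
        super().__init__(reason)
        self.failed = failed


def run_mutations(
    mutations: Iterable[str],
    send: Callable[[str], Outcome],   # hypothetical: sends one mutated request
    baseline: Callable[[], Outcome],  # hypothetical: calls the unmodified operation
    max_consecutive: int = 5,         # placeholder threshold (5? 10?)
    connect_retries: int = 3,
) -> List[str]:
    """Run every mutation and return the requests that failed along the way."""
    failed: List[str] = []
    consecutive = 0

    for mutation in mutations:
        outcome = send(mutation)

        # Cannot connect: retry a fixed number of times, then stop the test.
        retries = 0
        while outcome is Outcome.NO_CONNECTION and retries < connect_retries:
            outcome = send(mutation)
            retries += 1
        if outcome is Outcome.NO_CONNECTION:
            raise ScanAborted(
                "application became inaccessible; please investigate", failed
            )

        if outcome is Outcome.OK:
            consecutive = 0
            continue

        # Invalid response or timeout: record it, then probe the application
        # with the unmodified operation to see whether it still responds.
        failed.append(mutation)
        consecutive += 1
        if baseline() is not Outcome.OK:
            raise ScanAborted("application stopped responding", failed)
        if consecutive >= max_consecutive:
            raise ScanAborted(
                f"{consecutive} requests failed in a row; see troubleshooting docs",
                failed,
            )

    return failed  # reported at the end of testing
```

In this sketch a single bad response does not end the test: the request is recorded and testing continues as long as the unmodified operation still succeeds. Only a failed baseline call, a run of consecutive failures, or an unreachable port raises `ScanAborted`.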

Edited Sep 07, 2022 by Michael Eddington