Clarify Manual Testing Expectations
Context
Recently, I have seen a few regressions that I believe could have been caught by the multiple reviewers and maintainers who reviewed the merge request. When I went back and read the merge request comments for those regressions, the review feedback was steered towards DRYing up code, pointing to standards, simplifying the code, and offering institutional knowledge about existing methods. I believe that manually testing the changes could have prevented some or all of these regressions; even if manual testing had been done and the regressions had still shipped, we would at least know that multiple sets of eyes were on the problem and it still went unnoticed.
Why isn't it enough for the author to be solely responsible?
The author of the merge request is absolutely the first line of defense against a regression and should be testing their work manually. However, it is all too common for us to be heads-down solving the one problem we are aware of, which blinds us to the problems we are overlooking. One example was a backend change: the engineer worked diligently on the data the code produced and tested it manually, but they overlooked testing the response from the user's perspective, and a regression occurred.
Who should be manually testing, if not the author?
My proposal is that reviewers, and potentially maintainers, should manually test the work. I understand that this adds overhead, but I see several options that could keep it manageable:
- Author provides written steps to test the work - Risk: The steps may be too narrowly scoped, but this would be a great first iteration and should streamline the testing
- Similarly, the author must begin writing or modifying feature specs that exercise the change (see the sketch after this list)
- Author provides the expected result in the description, ideally copied from the issue description - Risk: This requires the reviewer to work out how to test the change themselves, but it also lets reviewers exercise a different set of "test steps"
- Software Engineers in Test provide test scripts for particular issues - Risk: I do not think they have the capacity to write these, and it detaches ownership from the engineers; it also leaves gaps, since the examples I have seen recently would likely not have been "worthy" of a test script
- Only the first reviewer manually tests - Risk: Fewer eyes and fewer creative approaches to testing, but it is quicker, should "spread the workload out," and should cover at least some cases
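To make the feature-spec option concrete, here is a minimal sketch of the kind of spec an author might add alongside their change. It assumes an RSpec/Capybara setup with FactoryBot; the `user` factory, `sign_in` helper, paths, and labels are hypothetical placeholders, not references to any existing suite:

```ruby
# spec/features/profile_update_spec.rb
#
# Minimal sketch, assuming RSpec + Capybara + FactoryBot. The `user`
# factory, `sign_in` helper, and all paths/labels are hypothetical.
require "rails_helper"

RSpec.describe "Updating a profile", type: :feature do
  let(:user) { create(:user) }

  before { sign_in(user) }

  it "shows the updated name from the user's perspective" do
    visit edit_profile_path

    fill_in "Name", with: "New Name"
    click_button "Save"

    # Assert on what the user actually sees, not just the underlying
    # data, so a backend-only check cannot pass while the UI regresses.
    expect(page).to have_content("Profile updated")
    expect(page).to have_content("New Name")
  end
end
```

The point of the sketch is that the spec asserts on what the user sees, which is exactly the perspective the backend example above missed.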
Surely there are more options than those I have listed. Do others see this as a problem? I would love to hear different perspectives.